Reliable Multicasting with JGroups


Bela Ban, Jan 2004
belaban@yahoo.com
http://www.jgroups.org

EBIG, Oakland Jan 21 2004 2

Overview

API, architecture
Protocols
Building Blocks
Performance
Future, Conclusion

What Is It?

Toolkit for reliable multicasting
  Fragmentation
  Message retransmission
  Ordering
  Group membership, membership change notification
LAN or WAN based

License

JGroups is a toolkit (JAR), to be linked against an application
Open Source under LGPL
  Commercial products can use JGroups without having to LGPL their code
  Modifications to JGroups itself need to be LGPL'ed (if distributed)
Dual licensing in the future

API

Channel: similar to java.net.MulticastSocket plus group membership, reliability
Operations:
  Create a channel with a set of properties
  Connect to a group X. Everyone that connects to X will see each other
  Send a message to all members of X
  Send a message to a single member

API

Receive a message
Retrieve membership
Be notified when members join, leave (including crashes)
Disconnect from the group
Close the channel


API

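import org.jgroups.JChannel;
import org.jgroups.Message;

// Create a channel from an XML stack definition and join the group
JChannel channel=new JChannel("file:///home/bela/default.xml");
channel.connect("demo-group");
System.out.println("members are: " + channel.getView().getMembers());

// A null destination means: send to all members of the group
Message msg=new Message(null, null, "Hello world");
channel.send(msg);

// Block (timeout 0) until something is received; assumed here to be a Message
Message m=(Message)channel.receive(0);
System.out.println("received msg from " + m.getSrc() + ": " + m.getObject());

// Leave the group, then release the channel's resources
channel.disconnect();
channel.close();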


Group topology

Architecture of JGroups

[Figure: three group members attached to a common Network. Each member consists of an Application, Building Blocks, a Channel, and a protocol stack containing GMS, UNICAST, NAKACK, FD and UDP.]

Demo

Draw
ReplicatedTree: shared state

Stats

JGroups has ~ 90KLOC
  30KLOC protocols
  45KLOC main + building blocks
  15KLOC unit tests
~ 90 protocols shipped with JGroups
Set of well-tested stacks (in XML files)

Available protocols I

Transport
  UDP, TCP, TCP_NIO, TUNNEL, JMS, LOOPBACK
Discovery
  PING, TCPPING, TCPGOSSIP, UDPPING
Group membership
Reliable delivery & FIFO
  NAKACK, SMACK, UNICAST

Available protocols II

Failure detection
  FD, FD_SOCK, FD_PID, FD_SIMPLE, FD_PROB, VERIFY_SUSPECT
Security
  ENCRYPT, SSL ConnectionTable (n/a)
Fragmentation (FRAG)
State transfer (STATE_TRANSFER)

Available protocols III

Ordering
  FIFO, CAUSAL, TOTAL, TOTAL_TOKEN
Virtual Synchrony
  FLUSH, QUEUE, VIEW_ENFORCER
Probabilistic Broadcast
  PBCAST
Merging: MERGE(2), MERGEFAST

Available protocols IV

Distributed message garbage collection
  STABLE
Debugging
  PERF, TRACE, PRINTOBJS, SIZE, BSH
Simulation
  SHUFFLE, DELAY, DISCARD, DEADLOCK, LOSS, PARTITIONER

Available protocols V

Dynamic configuration
  AUTOCONF
Flow control
  FLOW_CONTROL, FC
Misc
  PIGGYBACK, COMPRESS

Transport

Task
  Send messages from above to all members in the group, or to a single member
  Receive messages from NW, pass up stack
UDP: multicast and multiple UDP unicast
TCP: mcast done by multiple TCP unicasts
TUNNEL: send to external router, e.g. through firewall

Discovery

Task
  Initial discovery of members
  Used by GMS to determine coordinator to send JOIN request to
Each member returns its own addr, plus the addr of the coordinator
  Typical response ({A,A}, {B,A}, {C,A})
Wait for n milliseconds or m responses
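How a joiner might turn a set of discovery responses into a coordinator can be sketched as follows. This is a minimal illustration, not the JGroups implementation: the class `DiscoverySketch`, the type `PingRsp` and the majority-vote rule are assumptions made for the example, and the waiting step (n milliseconds or m responses) is left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: picks the coordinator from discovery responses. */
class DiscoverySketch {

    /** One response: the responder's own address and the coordinator it knows. */
    static final class PingRsp {
        final String ownAddr;
        final String coordAddr;

        PingRsp(String ownAddr, String coordAddr) {
            this.ownAddr = ownAddr;
            this.coordAddr = coordAddr;
        }
    }

    /**
     * Returns the coordinator named by most responses, or localAddr when
     * nobody answered (the caller is then the first member).
     */
    static String determineCoordinator(List<PingRsp> responses, String localAddr) {
        if (responses.isEmpty()) {
            return localAddr;
        }
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestVotes = 0;
        for (PingRsp rsp : responses) {
            int n = votes.merge(rsp.coordAddr, 1, Integer::sum);
            if (n > bestVotes) {
                bestVotes = n;
                best = rsp.coordAddr;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // The typical response from the slide: ({A,A}, {B,A}, {C,A})
        List<PingRsp> rsps = Arrays.asList(
                new PingRsp("A", "A"), new PingRsp("B", "A"), new PingRsp("C", "A"));
        System.out.println("send JOIN to " + determineCoordinator(rsps, "D"));
    }
}
```

Voting is only one possible rule. Since the responses need not be accurate, any answer that names a live coordinator would be enough to send the JOIN request.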

Discovery - UDP

Multicast discovery request
Each member responds with a unicast UDP datagram (local-addr, coord-addr), back to the sender

Discovery - TCPGOSSIP

Can be used by both UDP and TCP
External GossipServer
  org.jgroups.stack.GossipServer
  Maintains table of <group, members>
  Each member registers (groupname, own addr)
  Lease based - members have to periodically renew registration
  Multiple GossipServers possible
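The lease idea can be illustrated with a small in-memory table. This is a sketch under assumptions, not the code of org.jgroups.stack.GossipServer: the class `GossipTableSketch`, its method names and the explicit `nowMillis` parameter (used to keep the example deterministic) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical lease-based registry; not the real GossipServer. */
class GossipTableSketch {
    private final long leaseMillis;
    // group name -> (member address -> time of last registration)
    private final Map<String, Map<String, Long>> groups = new HashMap<>();

    GossipTableSketch(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    /** Registers a member, or renews its lease if it is already registered. */
    void register(String group, String addr, long nowMillis) {
        groups.computeIfAbsent(group, g -> new HashMap<>()).put(addr, nowMillis);
    }

    /** Returns the members whose lease is still valid and drops the expired ones. */
    List<String> getMembers(String group, long nowMillis) {
        Map<String, Long> members = groups.get(group);
        if (members == null) {
            return Collections.emptyList();
        }
        members.values().removeIf(registeredAt -> nowMillis - registeredAt > leaseMillis);
        List<String> result = new ArrayList<>(members.keySet());
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        GossipTableSketch table = new GossipTableSketch(1000);
        table.register("demo-group", "A", 0);
        table.register("demo-group", "B", 0);
        table.register("demo-group", "A", 900);   // A renews, B does not
        System.out.println(table.getMembers("demo-group", 1500));
    }
}
```

A member that crashes simply stops renewing, so its entry disappears without any explicit deregistration.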


Discovery - TCPGOSSIP

To obtain initial membership for a given group, TCPGOSSIP contacts the GossipServer

Membership info does not need to be accurate - only goal is to determine coord to send JOIN request to

Discovery - TCPPING

Give a set of well known members
For discovery, those members are pinged
If at least 1 responds, we can find the coordinator
Does not require additional process

Group Membership

Task
  Maintain a list of members
  Notify members when a new member joins, or an existing member leaves (or crashes)
Each member has the same ordered list
  List can be retrieved by Channel.getView()
  First (= oldest) member is coordinator
  If coord crashes, 2nd oldest takes over
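The rule that the oldest member is the coordinator falls out of keeping the list in join order. The sketch below models only that rule. `MembershipSketch` and its methods are hypothetical names, and it does not use the real view class returned by Channel.getView().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the ordered membership list. */
class MembershipSketch {
    // Ordered by join time: index 0 is the oldest member
    private final List<String> members = new ArrayList<>();

    /** A new member is appended, so it never displaces an older one. */
    void join(String addr) {
        if (!members.contains(addr)) {
            members.add(addr);
        }
    }

    /** Used for a graceful leave and for a crash alike: the member is dropped. */
    void remove(String addr) {
        members.remove(addr);
    }

    /** The first (oldest) member is the coordinator; null if the group is empty. */
    String coordinator() {
        return members.isEmpty() ? null : members.get(0);
    }

    /** A copy of the current member list. */
    List<String> view() {
        return new ArrayList<>(members);
    }

    public static void main(String[] args) {
        MembershipSketch group = new MembershipSketch();
        group.join("A");
        group.join("B");
        group.join("C");
        group.remove("A");   // coordinator crashes
        System.out.println("view " + group.view() + ", coord " + group.coordinator());
    }
}
```

Because every member holds the same ordered list, each one can work out the new coordinator locally, without an election round.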

Group Membership - JOIN

New member uses discovery to find coord
  If first member -> become coord
  Else: sends JOIN to coord
Coord adds new member to list, multicasts new view (member list) to all members
If 2 initial members are started at the same time, MERGE protocol merges them into a single group

Group Membership - LEAVE

Member sends LEAVE to coord
Coord multicasts new view to all members

Group Membership - CRASH

Failure detection protocol sends up SUSPECT event
VERIFY_SUSPECT double checks
GMS multicasts new view (not containing crashed member)
If member resurfaces, it will be shunned
  Has to leave and rejoin group

Failure detection

Task
  Detect if a member has crashed and send SUSPECT event up the stack (to be handled by GMS)
Logical ring over membership
Each member pings its neighbor to the right
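Choosing the neighbor to the right in a logical ring amounts to index arithmetic on the ordered member list. The following sketch shows only that step. The class and method names are made up for the example, and the pinging itself (timeouts, retries, raising SUSPECT) is not modelled.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for ring-based failure detection. */
class RingFailureDetectionSketch {

    /**
     * Returns the neighbor to the right of localAddr in the ordered member
     * list, wrapping around at the end, or null if there is nobody to ping.
     */
    static String pingTarget(List<String> members, String localAddr) {
        int index = members.indexOf(localAddr);
        if (index < 0 || members.size() < 2) {
            return null;
        }
        return members.get((index + 1) % members.size());
    }

    public static void main(String[] args) {
        List<String> view = Arrays.asList("A", "B", "C");
        for (String member : view) {
            System.out.println(member + " pings " + pingTarget(view, member));
        }
    }
}
```

With this layout each member monitors exactly one other member, so the monitoring load per member stays constant as the group grows. When a new view is installed, the target is recomputed from the new list.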


Failure detection - FD

Reliable delivery & FIFO

Lossless and FIFO delivery for multicast and unicast messages
  Multicast: NAK and ACK
  Unicast: ACK
Missing messages (gaps) are retransmitted
  Sender resends or
  Receiver requests retransmission
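The receiver side of gap detection can be sketched with per-sender sequence numbers. This is an illustration under assumptions, not the NAKACK code: the class `NakReceiverSketch`, the use of seqnos starting at 1 and the `missing()` method are all invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical receiver window for messages from a single sender. */
class NakReceiverSketch {
    private long nextToDeliver = 1;                                // lowest seqno not yet delivered
    private final TreeMap<Long, String> buffer = new TreeMap<>();  // out-of-order messages

    /** Accepts a message and returns everything now deliverable in FIFO order. */
    List<String> receive(long seqno, String payload) {
        List<String> deliverable = new ArrayList<>();
        if (seqno < nextToDeliver) {
            return deliverable;                     // duplicate of a delivered message
        }
        buffer.put(seqno, payload);
        while (buffer.containsKey(nextToDeliver)) {
            deliverable.add(buffer.remove(nextToDeliver));
            nextToDeliver++;
        }
        return deliverable;
    }

    /** Seqnos below the highest buffered one that have not arrived: the gaps to NAK. */
    List<Long> missing() {
        List<Long> gaps = new ArrayList<>();
        if (buffer.isEmpty()) {
            return gaps;
        }
        for (long s = nextToDeliver; s < buffer.lastKey(); s++) {
            if (!buffer.containsKey(s)) {
                gaps.add(s);
            }
        }
        return gaps;
    }

    public static void main(String[] args) {
        NakReceiverSketch rx = new NakReceiverSketch();
        rx.receive(1, "a");
        rx.receive(3, "c");                         // 2 was lost
        System.out.println("request retransmission of " + rx.missing());
        System.out.println("delivered after resend: " + rx.receive(2, "b"));
    }
}
```

Messages that arrive ahead of a gap are held back rather than delivered, which is what preserves FIFO order once the retransmitted message fills the gap.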

Encryption

Uses public/private key encryption to join new member and get shared group key
Shared key is used to encrypt all messages
Group key is recomputed on joins/leaves
SSL ConnectionTable
  As alternative, to be used in TCP
  Uses SSLSocket rather than Socket

Properties configuration

Plain string format

"UDP(mcast_addr=228.8.8.8;mcast_port=45566;ip_ttl=32;" +
"mcast_send_buf_size=64000;mcast_recv_buf_size=64000):" +
"PING(timeout=2000;num_initial_members=3):" +
"MERGE2(min_interval=5000;max_interval=10000):" +
"FD_SOCK:" +
"VERIFY_SUSPECT(timeout=1500):" +
"pbcast.NAKACK(max_xmit_size=8096;gc_lag=50;retransmit_timeout=600,1200,2400):" +
"UNICAST(timeout=600,1200,2400,4800):" +
"pbcast.STABLE(desired_avg_gossip=20000):" +
"FRAG(frag_size=8096;down_thread=false;up_thread=false):" +
"pbcast.GMS(join_timeout=5000;join_retry_timeout=2000;" +
"shun=false;print_local_addr=true)"

URL / XML
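The plain string format is regular enough to parse by hand: protocols are separated by ':' and each may carry 'key=value' pairs separated by ';' in parentheses. The sketch below reads such a string into a map. JGroups has its own configurator for this, so `StackConfigSketch` is purely illustrative, and it assumes well-formed input with no ':' inside a property value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for the plain string format shown above. */
class StackConfigSketch {

    /** Returns protocol name -> properties, in the order the protocols are listed. */
    static Map<String, Map<String, String>> parse(String spec) {
        Map<String, Map<String, String>> stack = new LinkedHashMap<>();
        for (String protocol : spec.split(":")) {
            Map<String, String> props = new LinkedHashMap<>();
            String name = protocol.trim();
            int open = name.indexOf('(');
            if (open >= 0) {
                // The text between '(' and the last ')' holds the key=value pairs
                String body = name.substring(open + 1, name.lastIndexOf(')'));
                for (String pair : body.split(";")) {
                    int eq = pair.indexOf('=');
                    if (eq < 0) {
                        continue;               // skip empty or malformed entries
                    }
                    props.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
                }
                name = name.substring(0, open);
            }
            stack.put(name, props);
        }
        return stack;
    }

    public static void main(String[] args) {
        String spec = "UDP(mcast_addr=228.8.8.8;mcast_port=45566):"
                + "FD_SOCK:"
                + "UNICAST(timeout=600,1200,2400,4800)";
        parse(spec).forEach((name, props) -> System.out.println(name + " " + props));
    }
}
```

A protocol without parentheses, such as FD_SOCK, ends up with an empty property map, meaning it runs with its defaults.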

Advantages of protocol stacks

Each property is implemented by one protocol
  Fragmentation, retransmission, ordering
Protocols are assembled into a stack
Stack has exactly the properties needed by the appl / required by the network
Can't get this with java.net.Socket, always comes with full TCP/IP
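The idea of assembling single-purpose protocols into a stack can be shown with a toy model in which a message travels down through every layer when sent and up through the same layers in reverse when received. Everything here is hypothetical (`ProtocolStackSketch`, the `Protocol` interface, string messages and the header layer). Real JGroups protocols pass events and message objects rather than strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy protocol stack; index 0 of the layer list is the top of the stack. */
class ProtocolStackSketch {

    /** One layer with a single job; both hooks return the transformed message. */
    interface Protocol {
        String down(String msg);   // towards the network
        String up(String msg);     // towards the application
    }

    private final List<Protocol> layers = new ArrayList<>();

    ProtocolStackSketch add(Protocol p) {
        layers.add(p);
        return this;
    }

    /** Passes the message through all layers, top to bottom. */
    String send(String msg) {
        for (Protocol p : layers) {
            msg = p.down(msg);
        }
        return msg;
    }

    /** Passes the message through all layers, bottom to top. */
    String receive(String wire) {
        for (int i = layers.size() - 1; i >= 0; i--) {
            wire = layers.get(i).up(wire);
        }
        return wire;
    }

    /** Toy layer: prepends a header on the way down and strips it on the way up. */
    static Protocol header(final String tag) {
        return new Protocol() {
            public String down(String msg) { return tag + "|" + msg; }
            public String up(String msg)   { return msg.substring(tag.length() + 1); }
        };
    }

    public static void main(String[] args) {
        ProtocolStackSketch stack = new ProtocolStackSketch()
                .add(header("FRAG")).add(header("UDP"));
        String wire = stack.send("hello");
        System.out.println(wire + " -> " + stack.receive(wire));
    }
}
```

Swapping or removing a layer changes what the stack does without touching the application code that calls send and receive.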

Advantages of protocol stacks

Small scope: a protocol does just one job, but does it well
Protocol stacks are fashionable:
  Servlet 2.3 filters
  Interceptors (CORBA, JBoss)
  AOP: separation of concerns, e.g. fragmentation should not be an application concern


Benefits

Same application code, different protocol stacks (deployment issue)

Application requirements reflected in protocol stack specification

App focuses on domain specific issues

Building Blocks

Replicated Cache
NotificationBus
Group RPC

Replicated Cache

Shared state across a group
Any change is replicated to all members
New members acquire initial state from coord
Structures supported
  Tree
  Hashmap
  Queues
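The two rules above (every change reaches all members, and a newcomer copies the coordinator's state) can be modelled in a single process. The sketch uses plain method calls where the real building blocks would multicast. `ReplicatedMapSketch` and its nested `Group` class are hypothetical and stand in for the channel and the membership.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-process model of a replicated hashmap; no networking involved. */
class ReplicatedMapSketch {

    /** Stands in for the group: holds the members in join order. */
    static class Group {
        final List<ReplicatedMapSketch> members = new ArrayList<>();
    }

    private final Group group;
    private final Map<String, String> state = new HashMap<>();

    ReplicatedMapSketch(Group group) {
        this.group = group;
        if (!group.members.isEmpty()) {
            // State transfer: copy the state of the coordinator (the first member)
            state.putAll(group.members.get(0).state);
        }
        group.members.add(this);
    }

    /** Applies the change on every member, standing in for a multicast. */
    void put(String key, String value) {
        for (ReplicatedMapSketch member : group.members) {
            member.state.put(key, value);
        }
    }

    /** Reads are purely local. */
    String get(String key) {
        return state.get(key);
    }

    public static void main(String[] args) {
        Group group = new Group();
        ReplicatedMapSketch first = new ReplicatedMapSketch(group);
        first.put("color", "blue");
        ReplicatedMapSketch late = new ReplicatedMapSketch(group);
        System.out.println("late joiner sees color=" + late.get("color"));
    }
}
```

Since reads never leave the local copy, the cost of replication is paid only on writes.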

NotificationBus

Thin layer on Channel
Notifications sent to all members
Callback when notification is received
Hook for state sharing

Group RPC

Invoke a method call in all members
Get a list of responses
Wait for all responses, a majority, the first, or none (optional timeout)
Handles crashed members correctly (no blocking)
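The four waiting modes, and the reason a crashed member does not block the caller, can be sketched with a small response collector. This is not the JGroups RPC building block: `ResponseCollectorSketch`, its `Mode` enum and its callbacks are assumptions made for the example, and the optional timeout is left to the caller, who would simply stop waiting once it elapses.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical collector for the responses to one group method call. */
class ResponseCollectorSketch {
    enum Mode { ALL, MAJORITY, FIRST, NONE }

    private final Mode mode;
    private final int groupSize;
    private final Set<String> pending;                          // members still waited for
    private final Map<String, Object> responses = new HashMap<>();

    ResponseCollectorSketch(Mode mode, Set<String> members) {
        this.mode = mode;
        this.groupSize = members.size();
        this.pending = new HashSet<>(members);
    }

    /** Records a response; answers from unknown or suspected members are ignored. */
    void onResponse(String member, Object value) {
        if (pending.remove(member)) {
            responses.put(member, value);
        }
    }

    /** A crashed member will never answer, so it is no longer waited for. */
    void onSuspect(String member) {
        pending.remove(member);
    }

    boolean isDone() {
        switch (mode) {
            case NONE:     return true;
            case FIRST:    return !responses.isEmpty() || pending.isEmpty();
            case MAJORITY: return responses.size() > groupSize / 2 || pending.isEmpty();
            default:       return pending.isEmpty();            // ALL
        }
    }

    Map<String, Object> results() {
        return new HashMap<>(responses);
    }

    public static void main(String[] args) {
        Set<String> members = new HashSet<>(Arrays.asList("A", "B", "C"));
        ResponseCollectorSketch call = new ResponseCollectorSketch(Mode.ALL, members);
        call.onResponse("A", 1);
        call.onSuspect("B");                                    // B crashed
        call.onResponse("C", 3);
        System.out.println("done=" + call.isDone() + " results=" + call.results());
    }
}
```

Feeding suspicions from failure detection into the collector is what lets a call in ALL mode complete even though one member never replies.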

Serverless JMS

JMS based on JGroups
Peer-to-peer architecture rather than C/S
Client publishing to a topic
  Instead of sending msg to server, and server distributes to multiple clients: publisher multicasts message
JMS Server just another member
  Handles persistent messages (DB)

Serverless JMS

[Figure: two diagrams, each with a Publisher, a JMS Server and three Subscribers. Client/Server Model: cost is 4 unicasts. Serverless Model: the Publisher multicasts and each receiver either accepts or discards the message; cost is 1 multicast.]

Serverless JMS

Clients are still able to publish even when server is down
Caveat: works in scenario where client and server are in same multicast-reachable NW
Status
  Topics/Queues available
  No TX/XA, no durable subscriptions, no persistent messages
  Download (standalone) beta at jboss.org

Where is JGroups used?

JBoss
  Clustering
    Replication of entity beans, SLSBs and SFSBs
    HA-JNDI
    Cache invalidation
    Session repl (integrated Tomcat, Jetty)
  Serverless JMS
  Cache
    Replicated transactional clustered cache

Where is JGroups used?

Jonas appserver (clustering)
GroupPac (FT-CORBA impl)
GCT: port to .NET
Replicated Caching
  OpenSymphony OSCache
  Jakarta Turbine's JCS
  Swarmcache

Where is JGroups used?

Session replication
  Jetty
  Tomcat 4.x
  Work in progress on plugin architecture for Tomcat 5.x
Unofficial ones...

Performance

4 nodes, 1 or 2 senders
750MHz SunBlade 1000 512MB, 100MB switched ethernet
JGroups 2.1
8000 10K msgs, in 200 bursts of 20 (2 senders), sleep after burst = 5ms
  451 msgs/s == 4.5MB/s throughput
  Resident heap size 35MB max (-Xmx128m)

Performance

1.4 billion messages total
4 nodes, 2 senders
Message size = 10K
Average msgs/s: 350
Max resident mem: 35M (-Xmx128m)
Tests available as part of JG distro
  Includes gnuplot scripts to generate graphs

Current and future projects

JBossCache, Serverless JMS
Port to J2ME (first version available on www.jgroups-me.org)
hsqldb (HyperSonic) database replication
JCache JSR 107 compliant impl (JBoss Cache)
Potential work on GroupComm JSR
  jcluster project on dev.java.net

Links

www.jgroups.org
  "Papers and Articles": link to IBM devworks

Questions?
