Brian Mitchell ([email protected]) - Distributed Systems
Data Communications: RPC & Group Communications
[Figure: Distributed Systems: Processor 1..N, Cache 1..N, Memory 1..N, and Directory 1..N connected by an Interconnection Network]
Source: Distributed Systems - Drexel CCI · bmitchell/course/mcs721/datacomm.pdf
• To enable a distributed system, one processor must communicate with another processor
• We can do this by building our own communication programs using sockets
• With sockets we are responsible for:
  – Encoding all data
  – Having the client locate the server
  – Byte ordering
  – Client and server processes must agree on the format of exchanged buffers
  – Client and server processes must agree on handshaking for exchange of data
  – Server must include all support code for handling multiple clients concurrently
• Summary: With sockets the programmer must be intimately familiar with the details of communications programming
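To make the encoding, byte-ordering, and buffer-format items concrete, here is a minimal sketch of what a socket programmer has to write by hand. The wire format (a 4-byte length prefix, a 2-byte opcode, a 4-byte signed integer, then UTF-8 text) and the function names are assumptions made for illustration; they do not come from the slides.

```python
import struct

# Hypothetical wire format (an assumption for this sketch):
#   4-byte length | 2-byte opcode | 4-byte signed int | UTF-8 text
# "!" selects network (big-endian) byte order with no padding.
HEADER = struct.Struct("!I")
BODY_PREFIX = struct.Struct("!Hi")


def encode_request(opcode: int, value: int, text: str) -> bytes:
    """Build one length-prefixed frame that both sides agree on."""
    body = BODY_PREFIX.pack(opcode, value) + text.encode("utf-8")
    return HEADER.pack(len(body)) + body


def decode_request(frame: bytes) -> tuple[int, int, str]:
    """Reverse of encode_request; rejects frames shorter than advertised."""
    (length,) = HEADER.unpack_from(frame, 0)
    body = frame[HEADER.size:HEADER.size + length]
    if len(body) != length:
        raise ValueError("truncated frame")
    opcode, value = BODY_PREFIX.unpack_from(body, 0)
    text = body[BODY_PREFIX.size:].decode("utf-8")
    return opcode, value, text


frame = encode_request(7, -42, "hello")
print(frame.hex())
print(decode_request(frame))
```

Both processes must use exactly the same layout; locating the server, handshaking, and serving several clients at once would still have to be written on top of this.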
• The goal is for a remote procedure call to support both call-by-reference and call-by-value conventions
• Problem: traditional language syntax is not descriptive enough to tell whether a call is by value or by reference
• Problem: how do we handle call by reference in RPCs, where the address space of the called procedure is different from that of the calling procedure?
  – Passing a pointer will not work
  – If we use a pointer, how much storage are we pointing to?
• Result: We need additional knowledge in RPCs to properly support parameter exchange between the calling and called procedures
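One common way to approximate call by reference across address spaces is copy-in/copy-out (also called copy/restore): the stub sends the value the reference points to, and copies the modified value back into the caller's object when the reply arrives. The sketch below assumes this technique; the slides do not say which approach the course uses. JSON stands in for the marshalling format, a direct function call stands in for the network, and all names are hypothetical.

```python
import json


def server_sort_in_place(buf: list) -> None:
    """The 'remote' procedure: it mutates its argument, as a by-reference call would."""
    buf.sort()


def server_dispatch(request: bytes) -> bytes:
    """Server stub: unmarshal into the server's address space, call, marshal the result."""
    args = json.loads(request)
    server_sort_in_place(args["buf"])
    return json.dumps({"buf": args["buf"]}).encode("utf-8")


def client_stub_sort(buf: list) -> None:
    """Client stub: looks like a local in-place sort to the caller."""
    # Copy-in: send the data itself, never the local pointer.
    request = json.dumps({"buf": buf}).encode("utf-8")
    reply = server_dispatch(request)  # stands in for a network round trip
    # Copy-out: overwrite the caller's object with the server's modified copy.
    buf[:] = json.loads(reply)["buf"]


data = [3, 1, 2]
client_stub_sort(data)
print(data)
```

The marshaller knows how much storage to send only because a list carries its own length. For a raw pointer, that size is the additional knowledge an interface description would have to supply.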
• RPCs are not as reliable as local procedure calls (LPCs) because the client may not be able to locate the server
• Server may be down
• What is the best way to handle this situation, considering that the RPC is designed to look like an LPC?
• An LPC is only able to return a single value, so how do we notify the caller that the server is down or that we are unable to establish a connection with the server?
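In languages with exceptions, one possible answer is for the client stub to raise an error that a local call could never produce, which leaves the single return value free for the real result. This is a sketch of that idea only. `RpcUnavailable`, `make_stub`, and the two transports are hypothetical names, and the transports are plain functions standing in for a real network layer.

```python
class RpcUnavailable(Exception):
    """Raised by the stub when the server cannot be reached (hypothetical name)."""


def make_stub(transport):
    """Wrap a transport in a stub that looks like a local add(a, b)."""
    def add(a: int, b: int) -> int:
        try:
            return transport("add", a, b)
        except (ConnectionError, TimeoutError) as exc:
            # Translate the communication failure into an RPC-level error.
            raise RpcUnavailable(f"add: server unreachable: {exc}") from exc
    return add


def healthy_transport(op, a, b):
    """Stand-in for a working server."""
    return a + b


def down_transport(op, a, b):
    """Stand-in for a server that is down."""
    raise ConnectionRefusedError("connection refused")


add = make_stub(healthy_transport)
print(add(2, 3))

try:
    make_stub(down_transport)(2, 3)
except RpcUnavailable as err:
    print("call failed:", err)
```

The caller still has to handle a failure mode that no local call has, so the RPC cannot be made fully transparent.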
• At least once semantics: Client retries the request once the server reboots. The RPC has been carried out at least once, but possibly more than once
• At most once semantics: Client detects the server crash and does not retry. Result: the operation has been carried out at most one time, but perhaps not at all
• Exactly once semantics: Client request is guaranteed to be executed exactly one time. Typically this is not possible unless reliable (persistent) message queues are used
  – Example: IBM MQ Series
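The difference between these semantics shows up as soon as an operation is not idempotent. The sketch below simulates a client that retries when a reply is lost (at-least-once behaviour), first against a plain server and then against one that filters duplicates by request ID. The class, the request IDs, and the loss model are assumptions for illustration. The duplicate table lives in memory, so it would not survive a server crash, which is one reason exactly-once needs persistent storage.

```python
class CounterServer:
    """Server with a non-idempotent operation (hypothetical example)."""

    def __init__(self, dedupe: bool):
        self.value = 0
        self.dedupe = dedupe
        self.replies = {}  # request_id -> reply already sent

    def increment(self, request_id: str) -> int:
        if self.dedupe and request_id in self.replies:
            return self.replies[request_id]  # duplicate: replay the old reply
        self.value += 1                      # the actual side effect
        self.replies[request_id] = self.value
        return self.value


def call_with_retry(server, request_id, lose_first_n_replies, max_attempts=5):
    """At-least-once client: resend until a reply actually arrives."""
    for attempt in range(max_attempts):
        reply = server.increment(request_id)  # the request reaches the server
        if attempt < lose_first_n_replies:
            continue                          # the reply is lost, so retry
        return reply
    raise TimeoutError("no reply received")


plain = CounterServer(dedupe=False)
call_with_retry(plain, "req-1", lose_first_n_replies=2)
print(plain.value)     # the increment ran once per attempt

filtered = CounterServer(dedupe=True)
call_with_retry(filtered, "req-1", lose_first_n_replies=2)
print(filtered.value)  # the increment ran once despite the retries
```

An at-most-once client would simply give up after the first lost reply instead of looping.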
• Client sends a request to a server, the server performs the operation on the client's behalf, and the client crashes before it can process the response from the server
• This situation results in an orphan RPC call
• Potential solutions:
  – Client logs all RPC requests so that they can be handled after a reboot
  – After a reboot a client broadcasts that it is online so that servers that have orphaned clients can respond
  – When a server detects an orphan it waits a specified period of time (for the client to reboot) and then sends a reply
• All of the above solutions require the client to be able to respond to an out-of-scope request
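The first solution, client-side logging, can be sketched as an append-only file that records each request before it is sent and marks it done once the reply has been processed. After a reboot, whatever is still marked as sent is a candidate orphan. The file format, class name, and method names below are assumptions, not part of the slides.

```python
import json
import os
import tempfile


class RequestLog:
    """Append-only log of outstanding RPC requests (hypothetical design)."""

    def __init__(self, path):
        self.path = path

    def _append(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the record to disk before continuing

    def sent(self, request_id, op):
        self._append({"id": request_id, "op": op, "state": "sent"})

    def done(self, request_id):
        self._append({"id": request_id, "state": "done"})

    def outstanding(self):
        """Replay the log and return requests that never completed."""
        pending = {}
        if not os.path.exists(self.path):
            return pending
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                if rec["state"] == "sent":
                    pending[rec["id"]] = rec["op"]
                else:
                    pending.pop(rec["id"], None)
        return pending


path = os.path.join(tempfile.mkdtemp(), "rpc.log")
log = RequestLog(path)
log.sent("r1", "debit")
log.done("r1")
log.sent("r2", "credit")  # the client "crashes" before processing this reply

# A fresh object stands in for the rebooted client.
recovered = RequestLog(path).outstanding()
print(recovered)
```

What the client does with the recovered entries (re-request the reply, or announce itself to the servers) is the part that requires it to handle a request outside the scope of the original call.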
• Sometimes the clients and servers in an RPC environment need to exchange a lot of data
• Stop and wait protocol: Client breaks down its request into packets, sends each packet, and waits for an acknowledgement for each packet that is sent
  – Advantages: If a packet is lost only the lost packet needs to be resent
  – Disadvantages: Performance: each packet must be acknowledged
• Blast protocol: Client sends all of its packets and waits only for a single acknowledgement covering all of them
  – Advantages: Performance: only one acknowledgement is needed for all packets
  – Disadvantages: A lost packet requires all
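The trade-off between the two protocols can be made visible by counting transmissions in a small simulation. The sketch below assumes the simplest blast variant, in which any loss causes the whole burst to be resent; real implementations may retransmit selectively. Losses are given as a fixed set of (sequence number, attempt) pairs so that the run is repeatable. The function names are hypothetical.

```python
def stop_and_wait(n_packets, lost):
    """Return (packets sent, acks received); lost holds dropped (seq, attempt) pairs."""
    sent = acks = 0
    for seq in range(n_packets):
        attempt = 0
        while True:
            sent += 1
            if (seq, attempt) not in lost:
                acks += 1     # one acknowledgement per packet
                break
            attempt += 1      # timeout: resend only this packet
    return sent, acks


def blast(n_packets, lost):
    """Same interface; the whole burst is resent if any packet in it was lost."""
    sent = acks = 0
    attempt = 0
    while True:
        sent += n_packets     # send every packet back to back
        if not any((seq, attempt) in lost for seq in range(n_packets)):
            acks += 1         # a single acknowledgement covers the burst
            return sent, acks
        attempt += 1          # assumed policy: resend everything


no_loss = set()
one_loss = {(3, 0)}           # packet 3 is dropped on its first transmission

print(stop_and_wait(8, no_loss), blast(8, no_loss))
print(stop_and_wait(8, one_loss), blast(8, one_loss))
```

With no loss, blast needs one acknowledgement where stop and wait needs eight. With a single lost packet, stop and wait sends one extra packet while this blast variant sends the burst twice.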
• The RPC stubs, along with the communication device drivers, must properly manage flow control
• Network interface cards often cannot send and receive information over a network as fast as the information is generated by the processor
• Network interface cards have buffers to provide some relief
• However, flow control techniques are often needed to govern the amount of information being sent by one processor and received by another processor (and vice versa)
• If flow is not carefully managed, overrun errors can occur because the buffers on the network cards overflow
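A small simulation shows how an overrun arises and how one flow control technique avoids it. The sketch assumes credit-based flow control, where the receiver grants one credit per free buffer slot and the sender transmits only while it holds a credit. The slides do not name a specific technique, and the tick-based timing model is a simplification.

```python
from collections import deque


def simulate(n_messages, buffer_size, drain_every, use_credits):
    """Return (delivered, overruns).

    Each tick the sender tries to send one message; the receiving side
    removes one message from its buffer every `drain_every` ticks.
    """
    buffer = deque()
    credits = buffer_size   # free slots advertised by the receiver
    to_send = n_messages
    delivered = overruns = tick = 0
    while to_send > 0 or buffer:
        tick += 1
        if to_send > 0 and (not use_credits or credits > 0):
            to_send -= 1
            if use_credits:
                credits -= 1
            if len(buffer) < buffer_size:
                buffer.append(tick)
            else:
                overruns += 1   # buffer full: the message is dropped
        if tick % drain_every == 0 and buffer:
            buffer.popleft()
            delivered += 1
            if use_credits:
                credits += 1    # the receiver hands a credit back
    return delivered, overruns


print(simulate(20, buffer_size=4, drain_every=3, use_credits=False))
print(simulate(20, buffer_size=4, drain_every=3, use_credits=True))
```

Without credits the sender outruns the buffer and messages are lost; with credits the sender stalls instead, so every message is delivered at the cost of a longer transfer.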