
Jul 07, 2019


CSE 124 Networked Services
Fall 2009: Lecture 15

B. S. Manoj, Ph.D

http://cseweb.ucsd.edu/classes/fa09/cse124

Some of these slides are adapted from various sources/individuals including but not limited to the images and text from the text book by Kurose and Ross and publications from the IEEE/ACM digital libraries. Use of these slides other than for pedagogical purpose for CSE 124, may require explicit permissions from the respective sources.

11/12/2009 CSE 124 Network Services FA 2009


Announcements
• Programming Project-2:
  – 1-pager submission
    • Deadline: 11/13/2009
    • Project title
    • Team members
    • 1-2 paragraph description of the project
  – Progress report due
    • Deadline: 11/20/2009
    • Progress made so far
    • What further to do
  – Project final presentation
    • Schedule: 12/03/2009
    • Final report submission: any time during the finals week
• Paper discussion session
  • Paper will be posted soon
• Guest lecture by experts from Sun Microsystems
  – Tentatively scheduled for 11/24/2009


Electronic Mail: SMTP [RFC 2821]

• uses TCP to reliably transfer email message from client to server, port 25

• direct transfer: sending server to receiving server

• three phases of transfer

– handshaking (greeting)

– transfer of messages

– closure

• command/response interaction

– commands: ASCII text

– response: status code and phrase

• messages must be in 7-bit ASCII


Scenario: Alice sends message to Bob

1) Alice uses UA to compose message and “to” [email protected]

2) Alice’s UA sends message to her mail server; message placed in message queue

3) Client side of SMTP opens TCP connection with Bob’s mail server

4) SMTP client sends Alice’s message over the TCP connection

5) Bob’s mail server places the message in Bob’s mailbox

6) Bob invokes his user agent to read message

[Figure: Alice’s user agent, her mail server, Bob’s mail server, and Bob’s user agent, with the steps above marked 1 to 6]


Sample SMTP interaction

S: 220 hamburger.edu

C: HELO crepes.fr

S: 250 Hello crepes.fr, pleased to meet you

C: MAIL FROM: <[email protected]>

S: 250 [email protected]... Sender ok

C: RCPT TO: <[email protected]>

S: 250 [email protected] ... Recipient ok

C: DATA

S: 354 Enter mail, end with "." on a line by itself

C: Do you like ketchup?

C: How about pickles?

C: .

S: 250 Message accepted for delivery

C: QUIT

S: 221 hamburger.edu closing connection
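The client side of the exchange above can be sketched as a function that lists the lines the client sends, in order. This is only an illustration: `smtp_client_lines` and `reply_code` are hypothetical helper names, the addresses in the usage note are placeholders, and no network connection is made. The dot-stuffing step (doubling a leading "." in a body line so it is not mistaken for the end-of-message marker) is standard SMTP behavior that the slide does not show.

```python
def smtp_client_lines(helo_domain, sender, recipient, body_lines):
    """Return the lines an SMTP client sends, in order, without the trailing CRLF."""
    lines = [
        f"HELO {helo_domain}",
        # Written without a space after the colon (the stricter form);
        # the slide shows one.
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
    ]
    for line in body_lines:
        # Dot-stuffing: a body line that starts with "." gets an extra "."
        # so the server does not read it as the end of the message.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")      # a lone "." on its own line ends the message
    lines.append("QUIT")
    return lines


def reply_code(server_line):
    """Extract the three-digit status code from a server reply line."""
    return int(server_line[:3])
```

For example, `smtp_client_lines("crepes.fr", "alice@crepes.fr", "bob@hamburger.edu", ["Do you like ketchup?"])` yields the C: lines of the sample, and `reply_code("354 Enter mail, end with \".\" on a line by itself")` gives the code the client checks before sending the body.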


SMTP: final words

• SMTP uses persistent connections

• SMTP requires message (header & body) to be in 7-bit ASCII

• SMTP server uses CRLF.CRLF to determine end of message

Comparison with HTTP:

• HTTP: pull

• SMTP: push

• both have ASCII command/response interaction, status codes

• HTTP: each object encapsulated in its own response msg

• SMTP: multiple objects sent in multipart msg


Mail message format

SMTP: protocol for exchanging email msgs

RFC 822: standard for text message format:

• header lines, e.g.,

– To:

– From:

– Subject:

different from SMTP commands!

• body

– the “message”, ASCII characters only

[Figure: message layout with header, blank line, and body]
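The header/blank line/body layout can be illustrated with a small parser. This is a minimal sketch, not a full RFC 822 parser: `split_message` is a hypothetical name, the sample addresses are placeholders, and folded (multi-line) or repeated header fields are not handled.

```python
def split_message(text):
    """Split a message into a dict of header fields and the body string.

    The first blank line separates the header lines from the body.
    """
    head, _, body = text.partition("\n\n")
    headers = {}
    for line in head.split("\n"):
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers, body


sample = (
    "From: alice@example.org\n"
    "To: bob@example.org\n"
    "Subject: Lunch\n"
    "\n"
    "Do you like ketchup?\n"
)
headers, body = split_message(sample)
```

Note that `To:`, `From:`, and `Subject:` here are lines inside the message itself; they are carried after the SMTP `DATA` command and are separate from the `MAIL FROM` and `RCPT TO` commands.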


Mail access protocols

• SMTP: delivery/storage to receiver’s server

• Mail access protocol: retrieval from server

– POP: Post Office Protocol [RFC 1939]

• authorization (agent <-->server) and download

– IMAP: Internet Message Access Protocol [RFC 1730]

• more features (more complex)

• manipulation of stored msgs on server

– HTTP: Gmail, Hotmail, Yahoo! Mail, etc.

[Figure: user agent, sender’s mail server, receiver’s mail server, user agent; the first two hops are labeled SMTP and the last hop is labeled access protocol]


P2P email
• Email – a very important communication medium
• Desired features
  – High availability
  – Privacy or confidentiality
  – Ease of operation
• Client-server email
  – Dominant in today’s Internet
  – Server receives, stores, and provides inbox access to users
• E-mail servers
  – Storage stress
    • Large attachments sent to large numbers of people
  – Processor stress
    • Due to additional processing load for virus checking, spam filtering, etc.
  – Clustering is done for reliability
    • Moderate improvement in fault tolerance
  – Expensive at large scales
• P2P email may become an alternative
  – High reliability
  – No single server
  – Relatively low system complexity


P2P Email

• Retrieved messages are local

• Inbox and new messages are stored in the DHT substrate

– Many peers may store your messages


P2P Email

• P2P Email runs over DHT substrate
  – DHT may be any one of CAN, Chord, or Pastry
• Resilient to
  – Faults
  – Disasters
  – Attacks
• Scalability
  – Reduced storage and processor stress
    • Spam filtering and virus checking are done by the User Agent
• Better security and privacy
  – Inbox is stored by User Agent


Basic Components of P2P Email
• User
  – Each user has an email address which is public
    • E.g., [email protected]
  – Certificate from an external certificate authority
    • Email address certificates that bind email addresses to public keys
  – Alice has her private key to decrypt messages encrypted using her public key
  – Inbox contains only notifications of unread messages
• Nodes or peers
  – System nodes
    • Nodes in the DHT substrate
    • Objective is to provide persistent delivery of incoming messages
  – User Agents
    • Run email reader software
    • Access the messages using system nodes
  – System nodes and User Agents can coexist
    • Otherwise, the UA must have the IP address of at least one of the system nodes to access the inbox and messages
• DHT substrate space stores
  – Email address certificates
  – Email message bodies
  – Inboxes
• DHT lookup service
  – Uses one of the existing lookup services with a single hash function


Service Primitives of P2P Email
• UAs call service primitives to perform higher-level system functions
  – Sending
  – Retrieving
  – Deleting
• Each service is directly conducted by a UA on a peer node
  – IP address of the peer is looked up by the UA
• Five primitives
  – Store, Fetch, Delete, Append-inbox, and Read-inbox
• Store
  – Used to store email message bodies and email address certificates
  – Arguments
    • An object, the object’s ID, and a set of email address certificates with permission to delete the object
  – Message object: ID is an RFC 2822 message ID
  – Email certificate object: ID is the email address with “-certificate” appended
  – Stores the object in k replicas
  – Lookup provides the k closest peers to the object ID
  – UA requests each peer to store the message object
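The Store primitive can be sketched as follows. Everything here is an assumption made for illustration: SHA-1 stands in for the single hash function, "closest" is taken as numeric distance between integer IDs (a real DHT such as Chord or Pastry defines its own metric), peers are modeled as in-memory dictionaries, and the names `key_of`, `k_closest`, and `store` are hypothetical.

```python
import hashlib


def key_of(name):
    """Map a string to an integer key with a single hash function (SHA-1 here)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def k_closest(peer_ids, key, k):
    """Stand-in for the DHT lookup: the k peer IDs numerically closest to key."""
    return sorted(peer_ids, key=lambda pid: abs(pid - key))[:k]


def store(peers, object_id, obj, deleters, k):
    """Store obj on the k peers closest to the key of object_id.

    peers maps a peer ID to a dict that acts as that peer's local object store.
    deleters is the set of email addresses allowed to delete the object.
    Returns the IDs of the peers that now hold a replica.
    """
    key = key_of(object_id)
    targets = k_closest(peers.keys(), key, k)
    for pid in targets:
        # Each replica keeps its own copy of the delete-permission set.
        peers[pid][object_id] = {"object": obj, "deleters": set(deleters)}
    return targets


# Ten simulated peers, three replicas of one encrypted message body.
peers = {key_of(f"peer-{i}"): {} for i in range(10)}
holders = store(peers, "msg-1@example.org", b"ciphertext", {"bob@example.org"}, 3)
```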


Service primitives of P2P Email

• Delete service primitive
  – Used by a requester to reclaim storage space
  – Arguments: object ID, requester’s email address
  – Receiver of the request
    • Locates the requester’s certificate list
    • Authenticates the requester
    • Removes the requester’s certificate from the object’s list
    • If the resulting certificate list is empty, the object is discarded
  – Sometimes, the k closest nodes of the object may not contain the object
    • New nodes join the DHT with IDs closer to the key
    • UA can widen the search to include older nodes if the k nodes do not have the object
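The receiver-side handling of a delete request can be sketched like this. It is an illustration under stated assumptions: the peer's storage is a plain dictionary, the permission list is a set of email addresses rather than real certificates, and `authenticate` is a hypothetical callback standing in for certificate-based authentication.

```python
def handle_delete(local_store, object_id, requester, authenticate):
    """Process one delete request on a peer.

    Returns True only when the object was actually discarded, i.e. when the
    requester was the last entry on the object's permission list.
    """
    entry = local_store.get(object_id)
    if entry is None:
        return False  # this peer does not hold the object
    if requester not in entry["deleters"] or not authenticate(requester):
        return False  # unknown or unauthenticated requester: no change
    entry["deleters"].discard(requester)
    if not entry["deleters"]:
        # Nobody else still needs the object, so the space is reclaimed.
        del local_store[object_id]
        return True
    return False
```

With two recipients on the list, the first recipient's delete only removes that recipient from the list; the object disappears when the second recipient deletes it as well.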


Service primitives of P2P Email
• Append-Inbox
  – Used by a sender to append an email message header to a receiver’s inbox
  – Inbox consistency is not guaranteed
    • Inbox is maintained by multiple peers and messages are delivered asynchronously
    • As the user reads emails, the inbox gets updated
• Fetch service
  – Used to retrieve stored objects
  – Arguments: object ID
  – Returns the object (encrypted message or email certificate)
• Read-Inbox
  – UA calls Read-Inbox to retrieve the message notifications in the inbox
  – Arguments: email address and certificate
  – Returns email notifications for the receiver’s inbox
• Garbage collection process
  – Every peer maintains the objects sorted by key
  – A node can check whether it is among the k closest to the object’s original location
    • Otherwise, it can delete the object
    • Or the least recently used object can be replaced


P2P Email message creation
• Alice’s User Agent, A, creates the message
  – Appends “-certificate” to Bob’s email address and maps this to a key
  – Uses the lookup service to obtain the list of k nodes closest to the key
  – Fetches Bob’s certificate from one of these nodes
  – Extracts Bob’s public key
  – Generates a session key and uses it to encrypt the e-mail message body
  – Generates an RFC 822 message ID that will be used to identify the message body
• User Agent A stores the message
  – Maps the message ID to a key
  – Uses the lookup service to obtain the k nodes closest to the key
  – Invokes the store operation on each of them
    • Using the message ID as identifier and the encrypted message body as object
• User Agent A constructs the e-mail message headers and updates the inbox
  – A creates message headers
    • Includes the session key, the message ID header, and a digest of the message
    • Encrypts them with Bob’s public key
    • Maps Bob’s e-mail address to a key
  – User Agent A updates Bob’s inbox
    • Obtains from the lookup service the k nodes closest to the key
    • Invokes the append-inbox function on each of them with Bob’s e-mail address and the encrypted message headers
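The three DHT keys used during message creation, plus the digest placed in the headers, can be sketched as below. This covers only the key-mapping part of the procedure; the encryption steps are left out. SHA-1 as the hash function, computing the digest over the encrypted body, and the names `dht_key` and `creation_keys` are all assumptions made for this example.

```python
import hashlib


def dht_key(text):
    """Map a string to a DHT key (hex string) using one hash function."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def creation_keys(recipient, message_id, encrypted_body):
    """Collect the keys a sending UA needs while creating one message."""
    return {
        # Where the recipient's certificate is fetched from.
        "certificate_key": dht_key(recipient + "-certificate"),
        # Where the encrypted message body is stored.
        "message_key": dht_key(message_id),
        # Where the encrypted headers are appended (the recipient's inbox).
        "inbox_key": dht_key(recipient),
        # Digest carried in the headers so the reader can verify the body.
        "body_digest": hashlib.sha1(encrypted_body).hexdigest(),
    }


keys = creation_keys("bob@example.org", "msg-1@example.org", b"ciphertext")
```

Because the certificate, the message body, and the inbox hash to different keys, they generally end up on different sets of k peers.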


P2P Email: Reading the Inbox
• Bob wants to read his emails
  – Retrieve inbox
    • User Agent B obtains the k closest nodes for Bob’s email address
      – Senders append message notifications to these nodes
    • B invokes the read-inbox action on all k nodes
      – To provide a consistent inbox
    • Creates a superset (union) of all k sets of message notifications
    • Each of the k nodes will delete its copy of the notifications
  – Retrieve messages
    • UA B looks up the k closest nodes to the message ID
    • UA invokes the fetch operation on one of the k nodes
    • Verifies the message body by comparing it against the digest that the sender placed in the message headers
    • If the message body is not valid, UA B proceeds to another of the k nodes
      – Repeats until B gets a satisfactory message
    • UA B decrypts the object and peers delete the object
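The two reading steps can be sketched as follows: merging the notifications held by the k inbox replicas, then fetching a body replica by replica until one matches the digest. The data model is an assumption for illustration (each replica's inbox is a set of notification strings, a missing body is `None`, and SHA-1 is the digest); `read_inbox` and `fetch_verified` are hypothetical names.

```python
import hashlib


def read_inbox(replica_inboxes):
    """Merge the notifications from all k replicas, then clear each replica."""
    merged = set()
    for inbox in replica_inboxes:
        merged |= inbox   # union: a notification missing on one node is not lost
        inbox.clear()     # each node drops its copy once it has been read
    return merged


def fetch_verified(replica_bodies, expected_digest):
    """Try replicas in turn until one body matches the digest from the headers."""
    for body in replica_bodies:
        if body is not None and hashlib.sha1(body).hexdigest() == expected_digest:
            return body
    return None  # no replica returned a valid copy
```

Taking the union is what compensates for append-inbox not guaranteeing consistency: a notification only has to reach one of the k nodes to show up in the merged inbox.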


Persistence of Data/Operation

• Probability of finding all k peers down

– p is the up probability of a peer

• Probability of success of peer retrieving an object (reading a message)
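Assuming the k peers that hold the replicas are up or down independently of one another, each up with probability p, the two probabilities named above can be written as follows. This is a sketch under that independence assumption; the symbols P_down and P_success are introduced here and are not necessarily the slide's notation.

```latex
\[
P_{\text{down}} = (1 - p)^{k},
\qquad
P_{\text{success}} = 1 - P_{\text{down}} = 1 - (1 - p)^{k}
\]
```

An object can be retrieved as long as at least one of its k replicas is on a peer that is up, which is why success is the complement of all k peers being down.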


Persistency of Operation in P2P Email

• Use of the system requires multiple operations, n

– E.g., read inbox, read message

• Probability of successful transaction over n operations
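Assuming each operation reaches its object with probability 1 - (1 - p)^k (p is the probability that a peer is up, k the number of replicas, peers independent), and that the n operations succeed or fail independently of one another, the probability for the whole transaction can be sketched as below. The symbol P_transaction is introduced here for illustration.

```latex
\[
P_{\text{transaction}} = \left( 1 - (1 - p)^{k} \right)^{n}
\]
```

Every one of the n operations has to succeed, so the per-operation probability is raised to the n-th power; larger n therefore calls for larger k to keep the same reliability.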


Persistency of Operation in P2P Email

• What is the optimal k for a certain persistency of operation (reliability)

– p- probability of node up

– n- number of operations per transaction

– k- number of peers for replicating the data
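One way to explore this question numerically is to search for the smallest k that reaches a chosen reliability. The sketch below assumes independent peers and independent operations, so that a transaction succeeds with probability (1 - (1 - p)^k)^n; the `target` reliability is a parameter chosen by the caller, not a value given on the slide, and the function names are hypothetical.

```python
def transaction_success(p, k, n):
    """Probability that all n operations succeed with k replicas per object.

    Assumes peers are up independently with probability p and that the
    n operations are independent of one another.
    """
    return (1.0 - (1.0 - p) ** k) ** n


def smallest_k(p, n, target, k_max=1000):
    """Smallest k whose transaction success probability reaches target.

    Returns None if no k up to k_max is sufficient.
    """
    for k in range(1, k_max + 1):
        if transaction_success(p, k, n) >= target:
            return k
    return None
```

Calling `smallest_k` for a fixed target with different values of p and n shows how the required replication grows as peers become less available or transactions become longer.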


P2P Email: behavior of k

• n=2, p=0.9, k=5 or 6; n=2, p=0.5, k=15

• n=100, p=0.9, k=5 or 6; n=100, p=0.5, k>20


P2P Search

• Challenges
  – Volume of data
  – P2P may not scale well as of today
• Google indexes only >2-3 billion pages
  – Of a total of >550 billion pages
• New players to the search market will face tough challenges
• Will P2P search have sufficient capacity?


P2P Search

• Two popular candidates for search basis
  – Flood queries over some or all peers
    • E.g., Gnutella
  – Use DHT-based substrates
    • P2P text search
    • Tested up to 100,000 documents
• P2P search has an estimated >500 million documents
  – Mainly music-file-related keywords
  – Search is based on the title, authors, and small keywords of objects
    • Much smaller workload than the real Web!


P2P search
• Fundamental constraints
  – Google: 3 billion docs x 1000 words/doc
  – Google: 1000 queries per second
  – Total index size of Web: 6x10^13 bytes
• Storage constraints
  – Each peer has very limited storage
    • Assume 1 GB per peer
  – To index 6x10^13 bytes
    • 60,000 peer nodes are required
• Communication constraints
  – Assume search could consume 10% of Internet’s capacity
  – Assuming Internet bisection backbone bandwidth is 100 Gbps
  – 1000 queries/second
  – Per-query bandwidth budget is 10% x 100 Gbps / 1000 = 10 Mbps ≈ 1 MBps, i.e. about 1 MB per query
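The storage and communication figures above can be reproduced with a few lines of arithmetic. This is a sketch using the slide's numbers, with 1 GB taken as 10^9 bytes (an assumption that matches the 60,000-peer result); the function names are hypothetical.

```python
def peers_needed(index_bytes, bytes_per_peer):
    """Number of peers required to hold the index (rounded up)."""
    return -(-index_bytes // bytes_per_peer)


def per_query_budget_bits(backbone_bps, share_percent, queries_per_second):
    """Bits available to each query if search may use share_percent of the backbone."""
    return backbone_bps * share_percent // 100 // queries_per_second


peers = peers_needed(6 * 10**13, 10**9)                       # 60,000 peers
budget_bits = per_query_budget_bits(100 * 10**9, 10, 1000)    # 10,000,000 bits
budget_bytes = budget_bits // 8                               # 1,250,000 bytes
```

The budget works out to 10 Mb, or 1.25 MB, per query, which the slide rounds to roughly 1 MB.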


P2P search approaches

• Partition by document

– Docs are divided up among the hosts

– Each peer maintains a local index of the docs it is responsible for

– Query must be broadcast or flooded to all peers

– Peer returns most highly ranked documents

• Gnutella and KaZaA utilize partition by document

– Flooding a query to 60,000 peers would require 60,000 packets, each of size at least 100 bytes

• Communication cost: 6 MB per query, 6x higher than the 1 MB budget
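The flooding cost can be checked with simple arithmetic. The sketch below uses the slide's figures of 60,000 peers and 100 bytes per packet and compares the result with a budget of about 1 MB per query; the function name is hypothetical.

```python
def flooding_cost_bytes(num_peers, packet_bytes):
    """Bytes sent when one query packet is delivered to every peer."""
    return num_peers * packet_bytes


cost = flooding_cost_bytes(60_000, 100)   # 6,000,000 bytes per query
budget = 1_000_000                        # about 1 MB per query
ratio = cost / budget                     # how far over budget flooding is
```

This counts only the outgoing query packets at their minimum size; replies carrying ranked results would add to the total.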


P2P search approaches

• Partition by keyword
  – Each peer is responsible for a set of keywords that appear in the documents
  – Each peer stores the posting list of the words that it is responsible for
  – DHT is used to map a keyword to the peer that is responsible
  – Multiple keywords require multiple queries
• A test of 81,000 queries (at the search engine for mit.edu) shows that
  – 40% carry one term
  – 35% carry two terms
  – 25% carry three or more terms
  – Queries move about 300,000 bytes for MIT with 1.7 million web pages
    • For 3 billion web pages: 530 MBps [(3B/1.7M)*300KB]
  – An expected 530x improvement is required
    • Some keywords are particularly expensive: the, who, etc. [4000x improvement required]
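The scaling estimate above is a linear extrapolation from the MIT measurement to the size of the Web, and can be reproduced as follows. This sketch assumes, as the bracketed formula does, that the bytes moved per query grow in proportion to the number of indexed pages; the function name is hypothetical.

```python
def scaled_bytes_per_query(measured_bytes, measured_pages, target_pages):
    """Linearly scale the per-query traffic from one corpus size to another."""
    return measured_bytes * target_pages / measured_pages


estimate = scaled_bytes_per_query(300_000, 1_700_000, 3_000_000_000)
improvement = estimate / 1_000_000   # relative to a budget of about 1 MB per query
```

The estimate comes out at a little over 529 million bytes per query, which the slide rounds to 530 MB, hence the roughly 530x reduction needed to fit the budget.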


P2P search optimization strategies

• Caching and precomputation

• Compression

– Bloom filters

– Gap compression

– Adaptive Set Intersection

– Clustering

• Compromises

– Compromising result quality

– Compromising P2P structure


Summary

• Reading material:

– IEEE Papers to be listed on the website
