1
Electronic Mail (SMTP, POP, IMAP, MIME)
We will work through the handout from Tanenbaum’s book “Computer Networks.”
Overview:
The message will be constructed under RFC 822, then passed to SMTP (RFC 821) for transmission.
Internet E-mail standards were published in two parts in 1982:
RFC 822: STANDARD FOR THE FORMAT OF
ARPA INTERNET TEXT MESSAGES
by David H. Crocker
RFC 821: SIMPLE MAIL TRANSFER PROTOCOL by Jonathan B. Postel
(Updated as RFC 2822 and 2821 (April, 2001).)
2
7.4.3 Message Formats
RFC 822 messages consist of lines of ASCII text, each ending with <CR><LF>
and at most 1000 characters long
Messages are divided into three sections:
■ header fields
■ a blank line (a line with nothing except <CR><LF>)
■ optionally, the message body.
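The three-part layout above can be illustrated with a minimal Python sketch. The function name split_message and the example addresses are made up for illustration, and the sketch ignores folded (continuation) header lines and repeated header fields.

```python
def split_message(raw: bytes):
    """Split an RFC 822-style message into (headers, body).

    Simplified sketch: no folded header lines, no repeated fields.
    """
    # The first blank line (<CR><LF> on its own) separates headers from body.
    head, _, body = raw.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n"):
        # Each header line has the form <keyword> : <value>.
        keyword, _, value = line.partition(b":")
        headers[keyword.strip().decode("ascii")] = value.strip().decode("ascii")
    return headers, body.decode("ascii")


# Hypothetical message; the addresses are placeholders.
raw = (b"From: alice@example.com\r\n"
       b"To: bob@example.com\r\n"
       b"Subject: Test\r\n"
       b"\r\n"
       b"Hello, Bob.\r\n")

headers, body = split_message(raw)
print(headers)
print(repr(body))
```

If the body is absent, the same call simply returns an empty body string.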
3
Headers
■ contain readable text (ASCII – no control characters)
■ are divided into lines
■ each line of form <keyword> : <value>
Under RFC 822, Date, From and at least one destination field (To, cc or bcc) are required; others are optional.
4
Some other RFC 822 header fields not involved in transport:
5
RFC 822 states that the message can consist
only of ASCII text and SMTP (RFC 821) expects this.
ASCII is a 7-bit code, which is transmitted right-adjusted in an 8-bit byte, leaving binary 0 in the
high-order position.
6
MIME – Multipurpose Internet Mail Extensions (RFC 1521, 1993)
In the body of the message we would like to be able to include items such as:
■ Messages not containing any kind of text (image, audio and video)
■ Messages in languages without alphabets (Chinese and Japanese)
■ Messages in non-Latin alphabets (Arabic, Russian, Hebrew)
■ Messages in languages with accents
Such material may contain arbitrary sequences of binary digits. There is no reason for the high-order bit of each byte to be zero.
To send non-ASCII information (an arbitrary binary string) we must “disguise” it as ASCII
7
Questions:
■ how does the sender disguise the binary string as ASCII?
■ when recipient receives the “ASCII” how does she
retrieve the binary string?
■ when recipient retrieves the binary string, how does
she know what it is?
8
Questions:
■ how do we disguise the binary string as ASCII?
9
10
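[Figure: the ASCII characters U, A, B written out as bits and regrouped six bits at a time; the first group, 010101, is sent as the character V]
In this example, disguise is not necessary, since ‘UAB’ is already ASCII text!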
11
Second Question:
■ when recipient receives the “ASCII” how does she retrieve the binary string?
Receiver sees the Content-Transfer-Encoding header, then knows how to reverse the encoding to retrieve the original binary string.
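The disguise and its reversal can be tried out in Python. This is a sketch assuming the encoding in the ‘UAB’ example is base64, which the 6-bit group 010101 becoming ‘V’ suggests; the helper six_bit_groups is a made-up name used only to show the regrouping.

```python
import base64


def six_bit_groups(data: bytes):
    """Write the bytes as one bit string and cut it into 6-bit groups."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return [bits[i:i + 6] for i in range(0, len(bits), 6)]


original = b"UAB"                      # three 8-bit bytes = 24 bits
groups = six_bit_groups(original)      # 24 bits = four 6-bit groups
encoded = base64.b64encode(original)   # what the sender transmits
decoded = base64.b64decode(encoded)    # what the receiver recovers

print(groups)    # ['010101', '010100', '000101', '000010']
print(encoded)   # b'VUFC'
print(decoded)   # b'UAB'
```

Each 6-bit group selects one of 64 printable characters, so the high-order bit of every transmitted byte stays zero regardless of the input.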
12
Third question:
■ when recipient retrieves the binary string, how does
she know what it is?
13
14
Body
Section boundary
Required blank line
RFC 822 Headers
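The parts labelled above (RFC 822 headers, required blank line, section boundary, body) and the answer to the third question can be seen by building a small MIME message with Python's standard email package. This is only a sketch: the addresses, subject, file name and byte values are invented for illustration.

```python
from email.message import EmailMessage

# Arbitrary binary data; some bytes have the high-order bit set.
binary = bytes([0x00, 0xFF, 0x55, 0x41, 0x42])

msg = EmailMessage()
msg["From"] = "alice@example.com"      # placeholder address
msg["To"] = "bob@example.com"          # placeholder address
msg["Subject"] = "Binary attachment"
msg.set_content("See the attached bytes.")
msg.add_attachment(binary, maintype="application",
                   subtype="octet-stream", filename="data.bin")

# The text that would be handed to SMTP: headers, blank line,
# then body sections separated by a boundary string.
wire_text = msg.as_string()
print(wire_text)

# The receiver reads Content-Type to learn what each section is, and
# Content-Transfer-Encoding to learn how to undo the disguise.
text_part, attachment = list(msg.iter_parts())
print(attachment.get_content_type())
print(attachment["Content-Transfer-Encoding"])
recovered = attachment.get_content()
```

The library chooses the boundary string itself, so it differs from run to run.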
15
7.4.4 Message Transfer
This is RFC 821, “Simple Mail Transfer Protocol.”
SMTP is a simple ASCII protocol, running on top of TCP.
First, the client establishes a TCP connection to port 25 of the server
(this would have involved a preliminary access to the DNS system to discover a type MX resource record for the destination domain).
Overview:
This message has been constructed under RFC 822, and will be passed to SMTP (RFC 821) for transmission.
We will illustrate the client/server exchange by considering transmission of the message in figure 7-46.
16
TCP connection from client abc.com to port 25 on Mail Exchanger for xyz.com already established.
RFC821 (SMTP) Dialog
RFC 822 message
End marker added by SMTP client
17
The e-mail message as seen on user screen:
Subject: Test II
From: Anthony Barnard <[email protected]>
Date: Fri, 20 Jul 2007 11:59:23 -0500
To: "Anthony (work) Barnard" [email protected]

The following two lines have a period in the first position:
.
.
The following two lines have periods in the first two positions:
..
..
end test
What if the 822 message itself has a period alone in the first position?
Will SMTP server see this and terminate the message prematurely?
18
Wireshark trace of sending message:
Frame 22 (588 bytes on wire, 588 bytes captured)
Internet Protocol, Src: 192.168.2.99, Dst: 207.69.189.206
Transmission Control Protocol, Src Port: 3693 (3693), Dst Port: smtp (25),
Simple Mail Transfer Protocol
    Message: Message-ID: <[email protected]>\r\n
    Message: Date: Fri, 20 Jul 2007 11:59:23 -0500\r\n
    Message: From: Anthony Barnard <[email protected]>\r\n
    Message: User-Agent: Thunderbird 1.5.0.12 (Windows/20070509)\r\n
    Message: MIME-Version: 1.0\r\n
    Message: To: "Anthony (work) Barnard" <[email protected]>\r\n
    Message: Subject: Test II\r\n
    Message: Content-Type: text/plain; charset=ISO-8859-1; format=flowed\r\n
    Message: Content-Transfer-Encoding: 7bit\r\n
    Message: \r\n    [the blank line]
    Message: The following two lines have a period in the first position:\r\n
    Message: ..\r\n
    Message: ..\r\n
    Message: The following two lines have periods in the first two positions:\r\n
    Message: ...\r\n
    Message: ...\r\n
    Message: end test\r\n
    Message: .\r\n    [the end-of-message marker appended by SMTP client]
[In each line that begins with a period, an extra period has been “stuffed” in by the SMTP client]
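The period stuffing seen in the trace can be sketched in a few lines of Python. The function names are made up for illustration, and lines are handled as a list of strings without their <CR><LF> endings.

```python
def smtp_data_lines(lines):
    """Client side: stuff an extra period onto every line that starts
    with one, then append the lone-period end-of-message marker."""
    stuffed = ["." + line if line.startswith(".") else line for line in lines]
    return stuffed + ["."]


def smtp_unstuff(wire_lines):
    """Server side: stop at the lone period, and strip one leading
    period from every other line that starts with one."""
    body = []
    for line in wire_lines:
        if line == ".":
            break
        body.append(line[1:] if line.startswith(".") else line)
    return body


body = [
    "The following two lines have a period in the first position:",
    ".",
    ".",
    "The following two lines have periods in the first two positions:",
    "..",
    "..",
    "end test",
]

wire = smtp_data_lines(body)
print(wire)
print(smtp_unstuff(wire) == body)
```

Because every body line beginning with a period gains one more, a period alone on a line can only be the end marker, so the server does not terminate the message prematurely.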
19
Introduction to the World Wide Web
Since we are coming off a study of E-mail, it may be helpful to note the influence that it had on the WWW protocols. Both separate the specification of the message from its transmission.
►RFC822/MIME govern format of E-mail messages
HTML governs format of WWW pages
►RFC821/SMTP and RFC1939/POP3 govern transmission of E-mail messages
HTTP governs transmission of WWW pages
Like SMTP and POP3, HTTP is an “ASCII protocol” that can be easily read and understood by humans.
However, the correspondence is only loose: HTML looks very different from RFC 822/MIME, whereas HTTP draws from both RFC 822/MIME and RFC 821/SMTP
20
An HTML document!
We will revisit this!
21
Chapter 27 – World Wide Web
Skim sections 27.1 – 27.5
22
27.6 Hypertext Transfer Protocol (HTTP)
► Application Level
► Request/Response
► Stateless
► Bi-directional Transfer
► Capability Negotiation
► Support for Caching
► Support for Intermediaries (proxies)
23
27.7 HTTP GET Request
Using Comer’s example
http://www.cs.purdue.edu/people/comer/
once the TCP connection to the HTTP server www.cs.purdue.edu has been made, the browser sends the command
GET /people/comer/ HTTP/1.1
Host: www.cs.purdue.edu    (required request header; see later)
27.8 Error Messages
Not much to say!
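The GET request of section 27.7 can be assembled from the URL with a short Python sketch. The function name build_get_request is made up, and only the request line and the Host header are produced; a real browser sends more headers.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {parts.hostname}",   # required request header in HTTP/1.1
        "",                          # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_get_request("http://www.cs.purdue.edu/people/comer/")
print(request.decode("ascii"))
```

The bytes returned are what would be written to the TCP connection once it has been opened to the server.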
24
27.9 Persistent Connections
HTTP/1.0 followed the FTP paradigm, using one TCP connection per data transfer – create data connection, transfer one file, close data connection.
Default in HTTP/1.1 is persistent connection
► Disadvantage:
need to identify beginning and end of each item
can’t reserve a bit pattern as “sentinel”
have to use content-length response header
► Advantage:
reduced overhead
pipelining
25
27.10 Data Length and Program Output
It may not be convenient or even possible for the server to know the length of an item before sending.
In this case we cannot use a persistent connection.
The HTTP server reverts to closing the connection after sending a single file (as in HTTP/1.0)
The server tells the client about this by sending a Connection: close header (HTTP headers in next section).
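From the client's side, the two cases can be sketched as follows in Python. The function read_body is a made-up name, the headers are assumed to be already parsed into a dict with lower-case keys, and io.BytesIO stands in for the TCP connection. The sketch covers only the two framing methods described here.

```python
import io


def read_body(stream, headers):
    """Return (body, connection_still_usable).

    With Content-Length, read exactly that many bytes and leave the
    connection open. Without it, read until the server closes.
    """
    length = headers.get("content-length")
    if length is not None:
        return stream.read(int(length)), True
    return stream.read(), False


# Persistent case: the next response follows on the same connection.
conn = io.BytesIO(b"helloNEXT-RESPONSE")
print(read_body(conn, {"content-length": "5"}))
print(conn.read())

# Unknown length: the server announced Connection: close.
conn = io.BytesIO(b"output of a program")
print(read_body(conn, {"connection": "close"}))
```

On a real socket a single read may return fewer bytes than requested, so the fixed-length case would need a loop.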
26
27.11 Length Encoding and Headers
After the first line of a request or response:
“..HTTP borrows the basic format from e-mail, using the 822 format and MIME extensions. Like a standard 822 message, each HTTP transmission contains a header, a blank line, and the item being sent. Furthermore each line in the header contains a keyword, a colon, and information.”
Figure 27.1
Some headers:
27
Hypertext Transfer Protocol
    GET /barnard/old_home.html HTTP/1.1\r\n
    Host: www.cis.uab.edu\r\n
    User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3)
When Submit order button is clicked the system first assembles the input information into a string.
41
Every form needs at least one submit button!
The ACTION and method parameters specify what should happen next after the submit order button is clicked.
42
What happens when the submit order button is clicked?
1. Make TCP connection to widget.com, port 80
2. Use HTTP to POST the string to script widgetorder in directory cgi-bin
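Assembling the form input into a string and wrapping it in a POST request might look like the Python sketch below. The field names and values are invented, since the form itself is not reproduced here, and build_post_request is a made-up helper. Only the host widget.com and the script cgi-bin/widgetorder come from the slide.

```python
from urllib.parse import urlencode


def build_post_request(host: str, script: str, fields: dict) -> bytes:
    """Encode the form fields as one string and build a POST request."""
    body = urlencode(fields).encode("ascii")   # name=value pairs joined by &
    head = (
        f"POST {script} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


# Hypothetical form input.
fields = {"customer": "John Doe", "product": "cheap"}
request = build_post_request("widget.com", "/cgi-bin/widgetorder", fields)
print(request.decode("ascii"))
```

Step 1, the TCP connection to port 80, is left out so that the sketch runs without network access.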
43
7.3 THE WORLD WIDE WEB
7.3.1 Architectural Overview
    Statelessness and Cookies
******** content *********
7.3.2 Static Web Documents
    HTML – The Hypertext Markup Language
    Forms
7.3.3 Dynamic Web Documents
    Server-Side Dynamic Page Generation
44
7.3.3 Dynamic Web Documents
Not all WWW pages can be prepared in advance.
Server-side Dynamic Web Page Generation
Example of the need for a server to build a page dynamically:
You have several items in your shopping cart and have clicked on the PROCEED TO CHECKOUT button.
The server needs to build a page showing your purchases, for your confirmation.
45
7.3.3 Dynamic Web Documents – continued
Common Gateway Interface (CGI)
Standard interface allows WWW servers to talk to back-end servers.
Scripts are usually stored in directory cgi-bin
Recall ACTION parameter in figure 7-29(a) :
“3-tier system”
46
7.3 THE WORLD WIDE WEB
7.3.1 Architectural Overview
    Statelessness and Cookies
******** content *********
7.3.2 Static Web Documents
    HTML – The Hypertext Markup Language
    Forms
7.3.3 Dynamic Web Documents
    Server-Side Dynamic Page Generation
******** transmission across internet *******
7.3.4 HTTP – The HyperText Transfer Protocol
    Connections
    Methods
    Message Headers
    Example HTTP Usage
47
7.3.4 HTTP – The Hypertext Transfer Protocol
Each interaction consists of one ASCII request, followed by one RFC 822 MIME-like response.
48
7.3.4 HTTP – The Hypertext Transfer Protocol
Connections (recall from Comer section 27.9)
“In HTTP 1.0 after the connection was established, a single request was sent over and a single response was sent back. Then the TCP connection was released.”
HTTP 1.1 default is persistent connections – can send numerous requests and get numerous responses over the same TCP connection.
49
7.3.4 HTTP – The Hypertext Transfer Protocol – continued
Requests
“Each request consists of one or more lines of ASCII text, with the first word on the first line being the name of the method requested.”
Example:
GET filename HTTP/1.1
50
7.3.4 HTTP – The Hypertext Transfer Protocol – continued
Responses
“Every request gets a response, consisting of a status line and possibly additional information (e.g. all or part of a WWW page).”
51
7.3.4 HTTP – The Hypertext Transfer Protocol – continued
Message Headers
After the first line (request or response), HTTP messages follow the pattern of E-mail messages: one or more headers, followed by a blank line, optionally followed by the message body.
The MIME rules apply to the body and some of the MIME headers are used (e.g. content-type and content-encoding).
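A response can be taken apart along exactly these lines. The Python sketch below uses a made-up function name and a hand-written sample response; it does not handle repeated headers.

```python
def parse_response(raw: bytes):
    """Split an HTTP response into (status, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")      # blank line ends headers
    lines = head.decode("ascii").split("\r\n")
    version, status, reason = lines[0].split(" ", 2)   # the status line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")        # keyword, colon, value
        headers[name.strip().lower()] = value.strip()
    return int(status), reason, headers, body


# Hand-written sample response, not a captured trace.
raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/html\r\n"
       b"Content-Length: 13\r\n"
       b"\r\n"
       b"<html></html>")

status, reason, headers, body = parse_response(raw)
print(status, reason)
print(headers["content-type"])
print(body)
```

Header names are lower-cased on the way in because HTTP header names are case-insensitive.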
52
7.3.4 HTTP – The Hypertext Transfer Protocol – continued
Message headers - Request
Message headers - Response
53
7.3.4 HTTP – The Hypertext Transfer Protocol – continued
Example HTTP usage
“Because HTTP is an ASCII protocol, it is quite easy for a person at a terminal (as opposed to a browser) to talk directly to Web servers. All that is needed is a TCP connection to port 80 on the server. Readers are encouraged to try this scenario personally.”
User keys in:
telnet www.ietf.org 80 > log
GET /rfc.html HTTP/1.1
Host: www.ietf.org    (required request header)
.
.
.
close
54
7.3.4 HTTP – The Hypertext Transfer Protocol - continued
Figure 7-44
telnet www.ietf.org 80 > log
GET /rfc.html HTTP/1.1
Host: www.ietf.org
[blank line to signal end]
Blank line in response
[More HTML]
55
Back to Cookies!
This treatment is based on RFC 2109/2965, HTTP State Management Mechanism
TERMINOLOGY
► user agent – the client that initiates a request, usually a browser.
► origin server – the server on which a given resource resides (“origin” to distinguish it from any proxy servers involved)
► Host domain name (HDN)
► request-host
► request-URI
URL: www.mylab.org/cgi-bin/sampleform
request-host: www.mylab.org
request-URI: /cgi-bin/sampleform
56
TERMINOLOGY - continued
► domain-match
Host A’s name domain-matches host B’s if
► their names or IP addresses match exactly, or
► A is a HDN string and has the form NB, where N is a non-empty name string, B has the form .B́ and B́ is a HDN
Examples:
► www.amazon.com domain-matches .amazon.com (N = www, B = .amazon.com)
► www.amazon.com does not domain-match amazon.com
► pda-as.amazon.com domain-matches .amazon.com
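The domain-match rule can be written as a short Python function. This is a simplified sketch: the name domain_match is made up, names are compared case-insensitively, and the function does not check that B́ is a well-formed HDN.

```python
def domain_match(a: str, b: str) -> bool:
    """Return True if host name a domain-matches b."""
    a, b = a.lower(), b.lower()
    if a == b:                       # names (or IP addresses) match exactly
        return True
    # A = N + B, with N non-empty and B beginning with a dot.
    return b.startswith(".") and a.endswith(b) and len(a) > len(b)


print(domain_match("www.amazon.com", ".amazon.com"))      # True
print(domain_match("www.amazon.com", "amazon.com"))       # False
print(domain_match("pda-as.amazon.com", ".amazon.com"))   # True
```

The leading dot is what matters: without it, only an exact match counts.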
57
Definition of HTTP session
Informally: a session might include access to a catalog, selection of purchase items into a shopping cart, checkout, and acknowledgement of purchase.
1. Each session has a beginning and an end.
2. Each session is relatively short-lived.
3. Session is started by the origin server
4. Either the user-agent or the origin server may terminate a session
5. The session is implicit in the exchange of state information (there is no special message to start or stop a session).
An HTTP session may contain several TCP sessions
58
OUTLINE
Origin server sends state information (cookie) to the user agent
User agent returns state information to origin server.
59
The Role of the Origin Server
►The origin server (surprising!) initiates an HTTP session, if it so desires.
► To initiate a session, the origin server sends a message with an extra response header to the client, Set-Cookie
► The origin server may include multiple Set-Cookie headers in a response.
► Servers may send a Set-Cookie header with any response (not necessarily with every response, but Amazon sends the same cookies repeatedly – see Lab session 8).
► To identify themselves, user agents should send Cookie request headers (subject to other rules detailed below) with every request.
60
Set-Cookie Syntax
set-cookie = "Set-Cookie:" cookies
cookies    = 1#cookie    (at least one cookie)
cookie     = NAME "=" VALUE *(";" cookie-av)    (zero or more attribute-value pairs)
cookie-av  = "Comment" "=" value
           | "Domain" "=" value
           | "Expires" "=" value
           | "Path" "=" value
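To make the grammar concrete, the Python sketch below splits the value of one Set-Cookie header into NAME, VALUE and its attribute-value pairs. The function name and the sample header value are invented for illustration. It handles a single cookie only and does not deal with quoted values.

```python
def parse_set_cookie(value: str):
    """Parse one cookie: NAME "=" VALUE *(";" cookie-av)."""
    first, *rest = [part.strip() for part in value.split(";")]
    name, _, cookie_value = first.partition("=")
    attributes = {}
    for av in rest:
        if not av:
            continue
        key, _, attr_value = av.partition("=")
        attributes[key.strip().lower()] = attr_value.strip()
    return name.strip(), cookie_value.strip(), attributes


# Made-up header value, not taken from a real trace.
header_value = ("session-id=abc123; Path=/; Domain=.amazon.com; "
                "Expires=Tue, 01 Jan 2036 08:00:01 GMT")

name, cookie_value, attributes = parse_set_cookie(header_value)
print(name, cookie_value)
print(attributes)
```

Note that an Expires date contains a comma, which is one reason a header carrying several comma-separated cookies cannot be split naively on commas.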
61
Example: Wireshark trace of response to user keying in www.amazon.com (from Lab session 8)