HTTP Hypertext Transfer Protocol

Hypertext Transfer Protocol (HTTP): a communications protocol used to transfer or convey information.

Dec 26, 2015

Samuel Park

Page 1: HTTP

Hypertext Transfer Protocol

Page 2: Hypertext Transfer Protocol (HTTP)

- A communications protocol
  - Used to transfer or convey information on the World Wide Web
  - Original purpose was to provide a way to publish and retrieve HTML hypertext pages
- Development of HTTP was coordinated by
  - W3C (World Wide Web Consortium)
  - IETF (Internet Engineering Task Force)
- Culminated in the publication of a series of RFCs
  - Most notably RFC 2616 (June 1999)
  - Defines HTTP/1.1, the version of HTTP in common use today

Page 3: Hypertext Transfer Protocol (HTTP)

- HTTP is a request/response protocol between clients and servers
- Client makes an HTTP request
  - Referred to as the user agent
  - A web browser, spider, or other end-user tool
- Server responds
  - Called the origin server
  - Stores or creates resources such as HTML files and images
- In between the user agent and origin server may be several intermediaries
  - Proxies, gateways, tunnels, etc.

Page 4: Hypertext Transfer Protocol (HTTP)

- HTTP does not need to use TCP/IP or its supporting layers
- HTTP:
  - Can be implemented on top of any other protocol on the Internet, or on other networks
  - Only presumes a reliable transport
    - Any protocol that provides such guarantees can be used

Page 5: Hypertext Transfer Protocol (HTTP)

- An HTTP client initiates a request by establishing a Transmission Control Protocol (TCP) connection to a particular port on a host
  - Port 80 by default
- An HTTP server listening on that port waits for the client to send a request message
- Upon receiving the request, the server sends back
  - A status line
    - E.g. "HTTP/1.1 200 OK"
  - A message of its own
    - Body of which is perhaps the requested file, an error message, or some other information
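
The status line described on this slide can be taken apart with a few lines of Python. This is only a minimal sketch: the helper name parse_status_line is hypothetical, and it assumes the line is well formed.

```python
def parse_status_line(line):
    """Split a status line such as 'HTTP/1.1 200 OK' into (version, code, reason)."""
    # Drop the trailing CRLF, then split at most twice: the reason phrase
    # (for example "Not Found") may itself contain spaces.
    parts = line.rstrip("\r\n").split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


print(parse_status_line("HTTP/1.1 200 OK\r\n"))
```

Returning the code as an integer keeps it machine-readable, while the reason phrase stays a plain string meant for people.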

Page 6: Hypertext Transfer Protocol (HTTP)

- Resources to be accessed by HTTP are identified using Uniform Resource Identifiers (URIs)
  - Or, more specifically, URLs
  - Using the http: or https: URI schemes
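
As an illustration of how a URL identifies a resource, the sketch below splits one into the pieces an HTTP client needs, using Python's standard urllib.parse module. The helper name describe is hypothetical. Port 80 is the default given earlier in these slides; 443 for https is the usual default and is an addition here.

```python
from urllib.parse import urlsplit

# Default ports per scheme (443 for https is not stated in the slides).
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe(url):
    """Return (scheme, host, port, path) for an http: or https: URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port, parts.path or "/"


print(describe("http://www.example.com/images/logo.gif"))
```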

Page 7: Request message

- The request message consists of the following:
  - Request line
    - E.g. GET /images/logo.gif HTTP/1.1
    - Requests the file logo.gif from the /images directory
  - Headers
    - E.g. Accept-Language: en
  - An empty line
  - An optional message body

Page 8: Request message

- The request line and headers must all end with CRLF
  - A carriage return followed by a line feed
  - ASCII code 13 followed by ASCII code 10
- An empty line must consist of only CRLF and no other whitespace
- In the HTTP/1.1 protocol, all headers except Host are optional
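
The layout rules for a request message can be sketched as a small serializer. The function name build_request is hypothetical, and the sketch assumes header names and values are plain ASCII.

```python
CRLF = "\r\n"  # ASCII 13 followed by ASCII 10


def build_request(method, path, headers, body=b""):
    """Serialize an HTTP/1.1 request: request line, headers, empty line, body."""
    if "Host" not in headers:
        raise ValueError("HTTP/1.1 requires a Host header")
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends with CRLF; one extra CRLF forms the empty line.
    head = CRLF.join(lines) + CRLF + CRLF
    return head.encode("ascii") + body


request = build_request(
    "GET", "/images/logo.gif",
    {"Host": "www.example.com", "Accept-Language": "en"},
)
print(request)
```

Rejecting a request that has no Host header mirrors the rule that Host is the only mandatory header in HTTP/1.1.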

Page 9: HTTP Methods

- HTTP/1.1 defines eight methods
  - Each indicates the desired action to be performed on the identified resource
  - Sometimes referred to as "verbs"

Page 10: Request methods

- HEAD
  - Asks for a response identical to the one that would correspond to a GET request, but without the response body
  - Useful for retrieving meta-information written in response headers, without having to transport the entire content
- GET
  - Requests a representation of the specified resource
  - By far the most common method used on the Web today
  - Should not be used for operations that cause side effects
    - Using it for actions in web applications is a common misuse
    - See "safe methods" below
- POST
  - Submits data to be processed (e.g. from an HTML form) to the identified resource
  - The data is included in the body of the request
  - May result in the creation of a new resource, the update of existing resources, or both

Page 11: Request methods

- PUT
  - Uploads a representation of the specified resource
- DELETE
  - Deletes the specified resource
- TRACE
  - Echoes back the received request
    - So that a client can see what intermediate servers are adding or changing in the request
- OPTIONS
  - Returns the HTTP methods that the server supports
  - Can be used to check the functionality of a web server
- CONNECT
  - Converts the request connection to a transparent TCP/IP tunnel
  - Usually to facilitate SSL-encrypted communication (HTTPS) through an unencrypted HTTP proxy

Page 12: Request methods

- HTTP servers are supposed to implement at least:
  - GET and HEAD methods
  - OPTIONS method
    - Whenever possible

Page 13: Request methods

- Safe methods
  - Some methods (e.g. HEAD or GET) are defined as safe, which means they are intended only for information retrieval and should not change the state of the server
    - In other words, they should not have side effects
  - Unsafe methods (such as POST, PUT and DELETE) should be displayed to the user in a special way
    - Typically as buttons rather than links
    - Make the user aware of possible obligations
      - Such as a button that causes a financial transaction
  - Despite the required safety of GET requests, they can cause changes on the server
    - For example, a Web server may use the retrieval through a simple hyperlink to initiate deletion of a domain database record, thus causing a change of the server's state as a side effect of a GET request
    - This is discouraged, because it can cause problems for Web caching, search engines and other automated agents, which can make unintended changes on the server
    - Another case is that a GET request may cause the server to create a cache space

Page 14: Request methods

- Idempotent methods and Web applications
  - Methods GET, HEAD, PUT and DELETE are defined to be idempotent
    - Multiple identical requests should have the same effect as a single request
  - Methods OPTIONS and TRACE, being safe, are inherently idempotent
  - The RFC allows a user agent, such as a browser, to assume that any idempotent request can be retried without informing the user
    - This is done to improve the user experience when connecting to unresponsive or heavily loaded web servers
  - However, note that idempotence is not assured by the protocol or web server
    - It is perfectly possible to write a web application in which (e.g.) a database insert or update is triggered by a GET request; this would be a very normal example of what the spec refers to as "a change in server state"
    - This misuse of GET can combine with the retry behavior above to produce erroneous transactions
    - For this reason GET should be avoided for anything transactional and used, as intended, for document retrieval only
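
The safe and idempotent categories from these slides can be written down as two sets, which a client could consult before retrying a request on its own. This is a sketch only; the names SAFE_METHODS, IDEMPOTENT_METHODS and may_retry_silently are hypothetical.

```python
# Safe methods: intended for retrieval only, no side effects expected.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE"})

# Idempotent methods: every safe method, plus PUT and DELETE.
IDEMPOTENT_METHODS = SAFE_METHODS | {"PUT", "DELETE"}


def is_safe(method):
    return method.upper() in SAFE_METHODS


def may_retry_silently(method):
    """True if a user agent may repeat the request without asking the user."""
    return method.upper() in IDEMPOTENT_METHODS


for m in ("GET", "PUT", "POST"):
    print(m, is_safe(m), may_retry_silently(m))
```

The sets describe what the protocol promises, not what a particular application actually does: a GET handler that writes to a database would still be retried silently.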

Page 15: HTTP versions

- HTTP has evolved into multiple, mostly backwards-compatible protocol versions
  - RFC 2145 describes the use of HTTP version numbers
  - The client states the version it uses at the beginning of the request, and the server uses the same or an earlier version in the response
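
The "same or earlier version" rule can be sketched as follows. The sketch mirrors the rule as worded on this slide rather than the full text of RFC 2145, and the names parse_version, response_version and server_max are hypothetical.

```python
def parse_version(token):
    """Turn 'HTTP/1.1' into the tuple (1, 1)."""
    name, _, number = token.partition("/")
    if name != "HTTP":
        raise ValueError(f"not an HTTP version: {token!r}")
    major, _, minor = number.partition(".")
    return int(major), int(minor)


def response_version(request_line, server_max=(1, 1)):
    """Pick the response version: the client's own, or an earlier one."""
    # The version is the last space-separated token of the request line.
    client = parse_version(request_line.rsplit(" ", 1)[1])
    major, minor = min(client, server_max)
    return f"HTTP/{major}.{minor}"


print(response_version("GET /index.html HTTP/1.1"))
print(response_version("GET /index.html HTTP/1.1", server_max=(1, 0)))
```

Comparing versions as integer tuples avoids the trap of comparing them as strings or decimals.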

Page 16: HTTP versions

- 0.9
  - Deprecated
  - Supports only one command, GET, which does not specify the HTTP version
  - Does not support headers
  - Since this version does not support POST, the client can't pass much information to the server
- HTTP/1.0 (May 1996)
  - This is the first protocol revision to specify its version in communications
  - Still in wide use, especially by proxy servers
- HTTP/1.1 (June 1999)
  - Current version; persistent connections are enabled by default and it works well with proxies
  - Supports request pipelining
    - Allows multiple requests to be sent without waiting for each response
    - Allows the server to prepare for the workload and potentially transfer the requested resources more quickly to the client
- HTTP/1.2
  - The initial 1995 working drafts of "an Extension Mechanism for HTTP", which proposed the Protocol Extension Protocol (abbreviated PEP), were prepared by the W3C and submitted to the IETF
  - PEP was originally intended to become a distinguishing feature of HTTP/1.2
  - In later PEP working drafts, however, the reference to HTTP/1.2 was removed
  - The experimental RFC 2774, HTTP Extension Framework, largely subsumed PEP
    - It was published in February 2000

Page 17: Status codes

- In HTTP/1.0 and since, the first line of the HTTP response is called the status line
  - Includes a
    - Numeric status code (such as "404")
    - Textual reason phrase (such as "Not Found")
- The way the user agent handles the response primarily depends on
  1. the code
  2. the response headers
- Custom status codes can be used
  - If the user agent encounters a code it does not recognize, it can use the first digit of the code to determine the general class of the response

Page 18: Status codes

- Standard reason phrases are only recommendations
  - Can be replaced with "local equivalents" at the web developer's discretion
- If the status code indicates a problem
  - The user agent might display the reason phrase to the user to provide further information about the nature of the problem
- The standard also allows the user agent to attempt to interpret the reason phrase
  - This might be unwise since the standard explicitly specifies that
    - Status codes are machine-readable
    - Reason phrases are human-readable

Page 19: Status codes

- 1xx Informational
- 2xx Success
- 3xx Redirection
- 4xx Client Error
- 5xx Server Error
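
Because the first digit determines the class, a client can handle a status code it has never seen before. A minimal sketch, with the hypothetical helper name status_class:

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Map any status code to its general class using only the first digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


# 449 is a non-standard code, yet it is still recognizably a client error.
print(status_class(449))
```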

Page 20: 1xx Informational

- Request received, continuing process
- This class of status code indicates a provisional response
  - Consists only of the Status-Line and optional headers
  - Terminated by an empty line
- Since HTTP/1.0 did not define any 1xx status codes, servers MUST NOT send a 1xx response to an HTTP/1.0 client except under experimental conditions

Page 21: 1xx Informational

- 100 Continue
  - The server has received the request headers
  - The client should proceed to send the request body (in the case of a request for which a body needs to be sent; for example, a POST request)
  - If the request body is large, sending it to a server when the request has already been rejected based upon inappropriate headers is inefficient
  - To have a server check whether the request could be accepted based on the request's headers alone, a client must
    - Send Expect: 100-continue as a header in its initial request (see RFC 2616 §14.20: Expect header)
    - Check whether a 100 Continue status code is received in response before continuing, or
    - Receive 417 Expectation Failed and not continue
- 101 Switching Protocols
- 102 Processing (WebDAV)
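
The client side of the 100 Continue exchange can be sketched as two small helpers: one that prepares the initial headers, and one that decides whether to send the body once the server's first reply arrives. Both function names are hypothetical, and the decision is deliberately simplified: anything other than 100 is treated as "do not send".

```python
def initial_headers(host, body_length):
    """Headers for a request whose body is held back until the server agrees."""
    return {
        "Host": host,
        "Content-Length": str(body_length),
        "Expect": "100-continue",
    }


def should_send_body(status_code):
    """Decide what to do after the server's first reply to the headers."""
    if status_code == 100:  # 100 Continue: go ahead and send the body
        return True
    # 417 Expectation Failed, or any other final status: keep the body back.
    return False


print(initial_headers("www.example.com", 1048576))
print(should_send_body(100), should_send_body(417))
```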

Page 22: 2xx Success

- The action was successfully received, understood, and accepted
  - This class of status code indicates that the client's request was successfully received, understood, and accepted

Page 23: 2xx Success

- 200 OK
  - Standard response for successful HTTP requests
- 201 Created
  - Request has been fulfilled and resulted in a new resource being created
- 202 Accepted
  - Request has been accepted for processing
  - The processing has not been completed
    - Request might or might not eventually be acted upon
    - It might be disallowed when processing actually takes place
- 203 Non-Authoritative Information (since HTTP/1.1)
- 204 No Content
- 205 Reset Content
- 206 Partial Content
  - Signals that only part of the resource is being returned
  - Used by tools like wget to enable resuming of interrupted downloads, or to split a download into multiple simultaneous streams
- 207 Multi-Status (WebDAV)
  - The message body that follows is an XML message and can contain a number of separate response codes, depending on how many sub-requests were made

Page 24: 3xx Redirection

- The client must take additional action to complete the request
- This class of status code indicates that further action needs to be taken by the user agent in order to fulfill the request
  - The action required MAY be carried out by the user agent without interaction with the user if and only if the method used in the second request is GET or HEAD
  - A user agent SHOULD NOT automatically redirect a request more than 5 times, since such redirections usually indicate an infinite loop

Page 25: 3xx Redirection

- 300 Multiple Choices
  - Indicates multiple options for the URI that the client may follow. Can be used to present different format options for video, list files with different extensions, or word sense disambiguation.
- 301 Moved Permanently
  - This and all future requests should be directed to the given URI.
- 302 Found
  - Most popular redirect code, but also an example of industrial practice contradicting the standard. The HTTP/1.0 specification (RFC 1945) required the client to perform a temporary redirect (the original describing phrase was "Moved Temporarily"), but popular browsers implemented it as a 303 See Other. Therefore, HTTP/1.1 added status codes 303 and 307 to disambiguate between the two behaviors. However, the majority of Web applications and frameworks still use the 302 status code as if it were the 303.
- 303 See Other (since HTTP/1.1)
  - The response to the request can be found under another URI using a GET method.
- 304 Not Modified
  - Indicates that the requested resource has not been modified since it was last requested. Typically, the HTTP client provides a header such as If-Modified-Since to supply a time against which to compare. Using this saves bandwidth and reprocessing on both the server and client.
- 305 Use Proxy (since HTTP/1.1)
  - Many HTTP clients (such as Mozilla and Internet Explorer) don't correctly handle responses with this status code, primarily for security reasons.
- 306 Switch Proxy
  - No longer used.
- 307 Temporary Redirect (since HTTP/1.1)
  - In this case, the request should be repeated with another URI, but future requests can still be directed to the original URI.
  - In contrast to 303, the request method should not be changed when reissuing the original request. For instance, a POST request must be repeated using another POST request.
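
The redirection rules from the last two slides (automatic redirects only for GET or HEAD, at most 5 of them, 303 switching to GET, 307 keeping the method) can be sketched without any network access. Here the responses dictionary is a stand-in for a real server, and all names are hypothetical. The sketch follows the standard's reading of 302, not the browser behavior described above.

```python
MAX_REDIRECTS = 5
REDIRECT_CODES = (301, 302, 303, 307)


def next_method(status, method):
    """303 See Other switches to GET; other redirects keep the method."""
    return "GET" if status == 303 else method


def follow(url, method, responses):
    """Follow redirects; responses maps url -> (status, location or None)."""
    redirects = 0
    while True:
        status, location = responses[url]
        if status not in REDIRECT_CODES:
            return status, url, method
        method = next_method(status, method)
        if method not in ("GET", "HEAD"):
            # Not allowed to proceed automatically: hand back to the caller.
            return status, url, method
        if redirects == MAX_REDIRECTS:
            raise RuntimeError("too many redirects, probably a loop")
        redirects += 1
        url = location


site = {"/form": (303, "/done"), "/done": (200, None)}
print(follow("/form", "POST", site))
```

A POST answered with 307 is returned to the caller untouched, since repeating it would need the user's confirmation.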

Page 26: 4xx Client Error

- The request contains bad syntax or cannot be fulfilled
- The 4xx class of status code is intended for cases in which the client seems to have erred
  - Except when responding to a HEAD request, the server SHOULD include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition
  - These status codes are applicable to any request method
  - User agents SHOULD display any included entity to the user

Page 27: 4xx Client Error

- 400 Bad Request
  - The request contains bad syntax or cannot be fulfilled
- 401 Unauthorized
  - Similar to 403 Forbidden, but specifically for use when authentication is possible but has failed or has not yet been provided
- 402 Payment Required
  - Original intention was that this code might be used as part of some form of digital cash or micropayment scheme
  - This has not happened, and the code has never been used
- 403 Forbidden
  - Request was a legal request, but the server is refusing to respond to it
  - Unlike a 401 Unauthorized response, authenticating will make no difference
- 404 Not Found
- 405 Method Not Allowed
  - Request made to a URL using a request method not supported by that URL
    - E.g. using GET on a form which requires data to be presented via POST
    - E.g. using PUT on a read-only resource
- 406 Not Acceptable
- 407 Proxy Authentication Required
- 408 Request Timeout
- 409 Conflict

Page 28: 4xx Client Error

- 410 Gone
  - Indicates that the resource requested is no longer available and will not be available again
  - Should be used when a resource has been intentionally removed
  - In practice, a 404 Not Found is often issued instead
- 411 Length Required
- 412 Precondition Failed
- 413 Request Entity Too Large
- 414 Request-URI Too Long
- 415 Unsupported Media Type
- 416 Requested Range Not Satisfiable
- 417 Expectation Failed

Page 29: 4xx Client Error

- 422 Unprocessable Entity (WebDAV)
  - Request was well-formed but could not be followed due to semantic errors
- 423 Locked (WebDAV)
  - The resource that is being accessed is locked
- 424 Failed Dependency (WebDAV)
  - The request failed due to failure of a previous request (e.g. a PROPPATCH)
- 425 Unordered Collection
  - Defined in drafts of WebDAV Advanced Collections
  - Not present in "Web Distributed Authoring and Versioning (WebDAV) Ordered Collections Protocol"
- 426 Upgrade Required
  - The client should switch to TLS/1.0
- 449 Retry With
  - A Microsoft extension: the request should be retried after doing the appropriate action

Page 30: 5xx Server Error

- The server failed to fulfill an apparently valid request
- Response status codes beginning with the digit "5" indicate cases in which the server is aware that it has erred or is incapable of performing the request
  - Except when responding to a HEAD request, the server SHOULD include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition
  - User agents SHOULD display any included entity to the user
  - These response codes are applicable to any request method

Page 31: 5xx Server Error

- 500 Internal Server Error
- 501 Not Implemented
- 502 Bad Gateway
- 503 Service Unavailable
- 504 Gateway Timeout
- 505 HTTP Version Not Supported
- 506 Variant Also Negotiates
- 507 Insufficient Storage (WebDAV)
- 509 Bandwidth Limit Exceeded
  - Not an official HTTP status code
  - Still used by many servers
- 510 Not Extended (RFC 2774)

Page 32: Persistent connections

- In HTTP/0.9 and 1.0, the connection is closed after a single request/response pair
- In HTTP/1.1 a keep-alive mechanism was introduced, where a connection could be reused for more than one request
  - Such persistent connections reduce lag perceptibly, because the client does not need to re-negotiate the TCP connection after the first request has been sent
- Version 1.1 of the protocol also introduced:
  - Chunked transfer encoding, to allow content on persistent connections to be streamed rather than buffered
  - HTTP pipelining, which allows clients to send some types of requests before the previous response has been received, further reducing lag
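
To make chunked transfer encoding concrete, the sketch below encodes and decodes a body in the chunk format of RFC 2616 (each chunk is its size in hexadecimal, CRLF, the data, CRLF; a zero-size chunk ends the body). The format details come from the RFC rather than from these slides, the function names are hypothetical, and trailers are not handled.

```python
def encode_chunked(chunks):
    """Encode an iterable of byte strings as a chunked message body."""
    out = bytearray()
    for chunk in chunks:
        if chunk:  # an empty chunk would be mistaken for the terminator
            out += f"{len(chunk):X}\r\n".encode("ascii") + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk (size 0) followed by the final CRLF
    return bytes(out)


def decode_chunked(data):
    """Reassemble the original body from a chunked message body."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Ignore any chunk extension after ';' on the size line.
        size = int(data[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            return bytes(body)
        body += data[pos:pos + size]
        pos += size + 2  # skip the data and its trailing CRLF


encoded = encode_chunked([b"Wiki", b"pedia"])
print(encoded)
print(decode_chunked(encoded))
```

Because each chunk announces its own length, the sender does not need to know the total size before it starts transmitting, which is what lets a persistent connection stream content.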

Page 33: HTTP session state

- HTTP can occasionally pose problems for Web developers and applications since HTTP is stateless
  - The advantage of a stateless protocol is that hosts do not need to retain information about users between requests
  - This forces the use of alternative methods for maintaining users' state
    - E.g. when a host would like to customize content for a user who has visited before
- One common method for solving this problem involves sending and requesting cookies
- Other methods include
  - Server-side sessions
  - Hidden variables
    - When the current page is a form
  - URL-encoded parameters
    - Such as /index.php?userid=3
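
Two of these techniques, cookies and URL-encoded parameters, can be illustrated with Python's standard library. This is a sketch only: the helper names are hypothetical, and the cookie name userid is borrowed from the URL example on this slide.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit


def user_id_from_url(url):
    """Read the userid parameter from a URL such as /index.php?userid=3."""
    values = parse_qs(urlsplit(url).query).get("userid")
    return values[0] if values else None


def make_set_cookie(user_id):
    """Build the response header a server would send to store the id."""
    cookie = SimpleCookie()
    cookie["userid"] = user_id
    return cookie.output(header="Set-Cookie:")


def user_id_from_cookie(cookie_header):
    """Read the id back from the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return cookie["userid"].value if "userid" in cookie else None


print(user_id_from_url("/index.php?userid=3"))
print(make_set_cookie("3"))
print(user_id_from_cookie("userid=3"))
```

In both cases the client carries the state on every request, so the server itself can stay stateless.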

Page 34: Secure HTTP

- There are currently two methods of establishing a secure HTTP connection:
  - The https URI scheme
  - The HTTP/1.1 Upgrade header
    - Introduced by RFC 2817
- Browser support for the Upgrade header is nearly non-existent
  - The https URI scheme is still the dominant method of establishing a secure HTTP connection

Page 35: Secure HTTP

- https URI scheme
  - A URI scheme syntactically identical to the http: scheme used for normal HTTP connections
  - Signals the browser to use an added encryption layer of SSL/TLS to protect the traffic
    - SSL – Secure Sockets Layer
    - TLS – Transport Layer Security
  - SSL is especially suited for HTTP since it can provide some protection even if only one side of the communication is authenticated
    - In the case of HTTP transactions over the Internet, typically only the server side is authenticated

Page 36: Secure HTTP

- HTTP/1.1 Upgrade header
  - HTTP/1.1 introduced support for the Upgrade header
  - In the exchange
    - The client begins by making a clear-text request, which is later upgraded to TLS
    - Either the client or the server may request (or demand) that the connection be upgraded
  - The most common usage is a clear-text request by the client followed by a server demand to upgrade the connection, which looks like this:

    Client:
      GET /encrypted-area HTTP/1.1
      Host: www.example.com

    Server:
      HTTP/1.1 426 Upgrade Required
      Upgrade: TLS/1.0, HTTP/1.1
      Connection: Upgrade

  - The server returns a 426 status code because 400-level codes indicate a client failure
    - Correctly alerts legacy clients that the failure was client-related
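
A client could recognize the server's demand in the exchange above roughly as follows. This is a sketch under simplifying assumptions: required_upgrades is a hypothetical helper, and the headers are assumed to arrive as a dictionary keyed by the exact names shown on the slide.

```python
def required_upgrades(status, headers):
    """Return the protocols the server demands, or [] if it demands none."""
    connection = {
        token.strip().lower()
        for token in headers.get("Connection", "").split(",")
    }
    if status != 426 or "upgrade" not in connection:
        return []
    # The Upgrade header is a comma-separated list of protocol tokens.
    return [p.strip() for p in headers.get("Upgrade", "").split(",") if p.strip()]


reply_headers = {"Upgrade": "TLS/1.0, HTTP/1.1", "Connection": "Upgrade"}
print(required_upgrades(426, reply_headers))
```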

Page 37: Secure HTTP

- Benefits of using this method for establishing a secure connection are:
  - Removes messy and problematic redirection and URL rewriting on the server side
  - Allows virtual hosting (single IP, multiple domain names) of secured websites
  - Reduces user confusion by providing a single way to access a particular resource
- A weakness of this method is:
  - The requirement for secure HTTP cannot be specified in the URI
  - In practice, the (untrusted) server will thus be responsible for enabling secure HTTP, not the (trusted) client

Page 38: Sample

- Following is a sample conversation between an HTTP client and an HTTP server running on www.example.com, port 80

Page 39: Sample Client Request

- Client request
  - Followed by a blank line
    - Request ends with a double newline
    - Each newline is a carriage return followed by a line feed
- The "Host" header
  - Distinguishes between various DNS names sharing a single IP address
    - Allows name-based virtual hosting
  - Optional in HTTP/1.0, mandatory in HTTP/1.1

    GET /index.html HTTP/1.1
    Host: www.example.com

Page 40: Sample Server Response

- Server response
  - Followed by a blank line and the text of the requested page
- The ETag (entity tag) header is used to determine whether a cached copy of the resource is identical to the current version on the server
- Content-Type specifies the Internet media type of the data conveyed by the HTTP message
- Content-Length indicates its length in bytes
- The web server publishes its ability to respond to requests for certain byte ranges of the document by setting the header Accept-Ranges: bytes
  - This is useful if the connection was interrupted before the data was completely transferred to the client
- Connection: close
  - States that the web server will close the TCP connection immediately after the transfer of this response

    HTTP/1.1 200 OK
    Date: Mon, 23 May 2005 22:38:34 GMT
    Server: Apache/1.3.27 (Unix) (Red-Hat/Linux)
    Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
    Etag: "3f80f-1b6-3e1cb03b"
    Accept-Ranges: bytes
    Content-Length: 438
    Connection: close
    Content-Type: text/html; charset=UTF-8
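
The sample response head can be read into a dictionary with a short hand-written parser. This is a sketch for illustration only: parse_response_head is a hypothetical helper, header names are lower-cased because they are case-insensitive, and repeated or folded headers are not handled.

```python
SAMPLE_HEAD = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 23 May 2005 22:38:34 GMT\r\n"
    "Server: Apache/1.3.27 (Unix) (Red-Hat/Linux)\r\n"
    "Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT\r\n"
    'Etag: "3f80f-1b6-3e1cb03b"\r\n'
    "Accept-Ranges: bytes\r\n"
    "Content-Length: 438\r\n"
    "Connection: close\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "\r\n"
)


def parse_response_head(head):
    """Return (status_line, headers) from the text before the message body."""
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        if not line:  # the empty line ends the header section
            break
        # Split on the first colon only: values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers


status_line, headers = parse_response_head(SAMPLE_HEAD)
print(status_line)
print(int(headers["content-length"]), headers["connection"])
```

With the headers parsed, a client knows to read exactly 438 bytes of body and to expect the server to close the connection afterwards.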