INF 123 SW ARCH, DIST SYS & INTEROP LECTURE 5 Prof. Crista Lopes
Dec 19, 2015
Objectives
Web history competency
Thorough understanding of HTTP
Recap
Distributed System
“Collection of interacting components hosted on different computers that are connected through a computer network”
[Figure: Host 1, Host 2, Host 3, … connected through a network; each host runs Component 1 … Component n on top of a network OS and hardware]
The Origins of the Internet
Heterogeneous computers
Decentralized control
Many interested players
OSI Model
[Figure: OSI model layers. Image courtesy of The Abdus Salam International Centre for Theoretical Physics]
OSI Model in Action
[Figure: path of a request across the Internet: your laptop, DBH wireless router, UCI routers, Google routers, Google server]
The Internet
Large-scale infrastructure consisting of hundreds of thousands of routers, cables, and wireless links, and millions of hosts.
Traffic through the network consists of small data packets.
Software in each node follows, roughly, the OSI model.
Main “contract” between nodes: Internet Protocol (IP)
  IP addresses (v4 and now v6)
  Packets don’t contain routing information
  Routers forward packets according to their final destination, but depending on the local context of the router
  Each packet is routed independently of others
Context, 1985-1990
Full decade of Internet usage
Foundation: TCP/IP [and UDP]
  Enabled Client-Server architectures
Application: Telnet
  Virtual terminal (login to remote machine)
  Can be used to ‘talk’ to *any* TCP/IP server
Application: Email
  SMTP: See example next page
  POP
  IMAP
Application: News
  NNTP (before it, Usenet and UUCP)
Application: Instant Messaging
  Unix’s Talk program
  Popularized by AOL
Application: File sharing
  FTP
Client-Server over TCP/IP
Server opens TCP [server] socket, binds to port, listens for connection requests
Client opens TCP [client] socket, connects to server host/port
Server accepts connection, initiates dedicated full-duplex “virtual circuit”
  Eventually spawns a thread for it
  Main thread goes back to listening for other connections
Client and server send each other messages (byte streams)
  TCP implementation takes care of protocol details
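The steps above can be sketched with Python's standard socket module. This is a minimal illustration, not a production server: the upper-casing "protocol", the function names, and the use of the loopback address with an OS-chosen port are all assumptions made for the example.

```python
import socket
import threading


def handle(conn: socket.socket) -> None:
    """Serve one client: echo each chunk back upper-cased (hypothetical protocol)."""
    with conn:
        while data := conn.recv(1024):   # b"" means the client finished sending
            conn.sendall(data.upper())


def accept_loop(server_sock: socket.socket) -> None:
    """Main loop: accept a connection, hand it to a thread, go back to listening."""
    while True:
        try:
            conn, _addr = server_sock.accept()   # dedicated full-duplex connection
        except OSError:
            return                               # listening socket was closed
        threading.Thread(target=handle, args=(conn,), daemon=True).start()


def start_server(host: str = "127.0.0.1") -> socket.socket:
    """Open a server socket, bind it to a port, and listen for connection requests."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind((host, 0))                  # port 0: let the OS pick a free port
    server_sock.listen()
    threading.Thread(target=accept_loop, args=(server_sock,), daemon=True).start()
    return server_sock


def ask(host: str, port: int, message: bytes) -> bytes:
    """Client side: connect, send a byte stream, read the reply until the server closes."""
    with socket.create_connection((host, port)) as client:
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)          # signal that we are done sending
        chunks = []
        while chunk := client.recv(1024):
            chunks.append(chunk)
        return b"".join(chunks)
```

Calling `start_server()` and then `ask(*server.getsockname(), b"hello")` exercises the whole sequence on one machine; TCP itself handles ordering, retransmission, and flow control underneath these calls.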
Example: SMTP over TCP/IP
tagus: crista$ telnet smtp.ics.uci.edu 25
Trying 128.195.1.219...
Connected to smtp.ics.uci.edu.
Escape character is '^]'.
220 david-tennant-v0.ics.uci.edu ESMTP mailer ready at Mon, 5 Apr 2010 17:15:01 -0700'
HELO smtp.ics.uci.edu
250 david-tennant-v0.ics.uci.edu Hello barbara-wright.ics.uci.edu [128.195.1.137], pleased to meet you
MAIL FROM:<[email protected]>
250 2.1.0 <[email protected]>... Sender ok
RCPT TO:<[email protected]>
250 2.1.5 <[email protected]>... Recipient ok
DATA
354 Enter mail, end with "." on a line by itself
test
.
250 2.0.0 o360F1Mo029280 Message accepted for delivery
QUIT
221 2.0.0 david-tennant-v0.ics.uci.edu closing connection
Connection closed by foreign host.
Origins of the Web
CERN
  Conseil Européen pour la Recherche Nucléaire (European Laboratory for Particle Physics; Geneva, Switzerland)
  Tim Berners-Lee & Robert Cailliau
Originally a system for sharing documents among scientists
Once made publicly available, the first implementation quickly became very popular in universities & research institutions
NCSA Mosaic browser made it popular across the board
Main Design Principles, originally
Client requests a text document from the server
Server sends back the text document
Text document may contain retrieval references (hyperlinks) to other text documents on that or other servers
  HyperText Markup Language (HTML)
Client may also send text documents for the server to store
Requests/Responses sent over TCP, but
  Client makes connection, sends, receives, connection is closed
  Connection is not maintained among interactions
Requests are self-contained, do not rely on past interactions
  “Stateless”
(Notice the story based on “text document”; it quickly became apparent that it needed generalization)
Generalization
Document → Resource
  Document: “page” with markup
  Resource: actual document (many types), or a program generating the resource
Uniform Resource Identifier (URI; “Universal” in early Web documents)
  Abstract concept
  Concrete realization: Uniform Resource Locator (URL)
    Provides a method for finding the resource
    http://, file://, ftp://, mailto:, etc.
HTTP URLs
Syntax: http://<host>[:<port>][/<path>][?<query>]
Examples
  Hosts: www.ics.uci.edu, 127.0.0.1
  Ports: a number (80 if omitted)
  Paths: /wifi/admin/users
  Queries: first=John&last=Smith
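As a quick illustration of the syntax, the sketch below splits an HTTP URL into its parts with Python's urllib.parse. The helper name `describe` and the port 8080 in the usage note are assumptions for the example; defaulting to port 80 is only correct for the http scheme.

```python
from urllib.parse import parse_qs, urlsplit


def describe(url: str) -> dict:
    """Break an http URL into host, port, path, and query components."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or 80,          # assumption: http, whose default port is 80
        "path": parts.path or "/",         # an empty path means the root resource
        "query": parse_qs(parts.query),    # each key maps to a list of values
    }
```

For example, `describe("http://127.0.0.1:8080/wifi/admin/users?first=John&last=Smith")` yields host `127.0.0.1`, port `8080`, path `/wifi/admin/users`, and the query as a dictionary with keys `first` and `last`.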
HyperText Transfer Protocol (HTTP)
Methods: GET, PUT, DELETE, HEAD, OPTIONS, TRACE, POST, CONNECT
  Idempotent methods: GET, PUT, DELETE, HEAD, OPTIONS, TRACE
  Not idempotent: POST, CONNECT
Spec
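Idempotent means that repeating the same request leaves the server in the same state as sending it once. The sketch below illustrates the difference with a hypothetical in-memory `Store` class standing in for a server; it is not how any particular Web server is implemented.

```python
class Store:
    """Hypothetical in-memory stand-in for the resources held by a server."""

    def __init__(self) -> None:
        self.resources: dict[str, str] = {}
        self.next_id = 1

    def put(self, path: str, body: str) -> None:
        """PUT: set the resource at this path. Repeating it changes nothing more."""
        self.resources[path] = body

    def delete(self, path: str) -> None:
        """DELETE: remove the resource. Repeating it leaves it just as absent."""
        self.resources.pop(path, None)

    def post(self, collection: str, body: str) -> str:
        """POST: create a new resource each time, so repeats are NOT idempotent."""
        path = f"{collection}/{self.next_id}"
        self.next_id += 1
        self.resources[path] = body
        return path
```

Calling `put` twice with the same arguments leaves one resource, while calling `post` twice with the same arguments leaves two. This is why a client or proxy can safely retry an idempotent request after a network failure but should not blindly retry a POST.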
HTTP Request Syntax
<OPERATION> <ARGS> <VERSION>
[<HEADER_1_NAME>: <HEADER_1_VALUE>
 …
 <HEADER_N_NAME>: <HEADER_N_VALUE>]
<blank line>
[<DATA>]
HTTP Response Syntax
<VERSION> <CODE> <EXPLANATION>
[<HEADER_1_NAME>: <HEADER_1_VALUE>
 …
 <HEADER_N_NAME>: <HEADER_N_VALUE>]
<blank line>
[<DATA>]
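To make the two syntaxes concrete, here is a small sketch that builds a request and parses a response by hand. On the wire, lines end with CRLF (`\r\n`). The function names are hypothetical, and the parser assumes a well-formed message with a non-empty explanation; it also keeps only the last value of a repeated header, which a real client must not do for headers such as Set-Cookie.

```python
def build_request(method: str, target: str, headers: dict[str, str],
                  body: bytes = b"", version: str = "HTTP/1.1") -> bytes:
    """Serialize <OPERATION> <ARGS> <VERSION>, headers, blank line, optional data."""
    lines = [f"{method} {target} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"       # the blank line ends the header section
    return head.encode("ascii") + body


def parse_response(raw: bytes) -> tuple[str, int, str, dict[str, str], bytes]:
    """Split a raw response into version, code, explanation, headers, and data."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, explanation = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # header names are case-insensitive
    return version, int(code), explanation, headers, body
```

`build_request("GET", "/index.html", {"Host": "ics.uci.edu"})` produces the bytes a client would write to the TCP connection for a minimal GET request.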
HTTP Example
GET /index.html HTTP/1.1
Host: ics.uci.edu
(blank line here)

HTTP/1.1 200 OK
Date: Fri, 09 Apr 2010 19:48:36 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Fri, 19 Feb 2010 22:01:21 GMT
ETag: "238003-64-47ffb39422e40"
Accept-Ranges: bytes
Content-Length: 100
Connection: close
Content-Type: text/html; charset=UTF-8

<html><head><meta HTTP-EQUIV="REFRESH" content="0; URL=http://www.ics.uci.edu/"></head></html>
(show live)
HTTP Headers
Request headers
Response headers
Spec
HTTP Status Codes
Informational 1xx
  E.g. 100 Continue
Successful 2xx
  E.g. 200 OK, 201 Created
Redirection 3xx
  E.g. 300 Multiple Choices, 301 Moved Permanently
Client error 4xx
  E.g. 400 Bad Request, 404 Not Found
Server error 5xx
  E.g. 500 Internal Server Error, 503 Service Unavailable
Complete list
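The first digit of a status code determines its class, so a client can react sensibly even to a code it does not recognize. A minimal sketch using Python's http.HTTPStatus enumeration (the helper name `describe_status` and the output format are assumptions for the example):

```python
from http import HTTPStatus

CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe_status(code: int) -> str:
    """Return the code, its standard phrase if registered, and its class."""
    status_class = CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:                 # code is not in Python's registry
        phrase = "(unregistered code)"
    return f"{code} {phrase}: {status_class}"
```

For instance, `describe_status(404)` reports a client error, while an unregistered code such as 299 is still classified as successful from its first digit.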
Another Example
GET /index.html HTTP/1.1
Host: cnn.com
(blank line here)

HTTP/1.1 301 Moved Permanently
Date: Fri, 09 Apr 2010 20:32:14 GMT
Server: Apache
Location: http://www.cnn.com/index.html
Vary: Accept-Encoding
Content-Length: 294
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head><title>301 Moved Permanently</title></head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="http://www.cnn.com/index.html">here</a>.</p>
<hr><address>Apache Server at cnn.com Port 80</address>
</body></html>
(show live)
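In the exchange above, the client is expected to repeat the request at the URL given in the Location header. A minimal sketch of that decision follows; the helper name is hypothetical, the headers are assumed to be a dictionary with lower-cased names, and the set of redirect codes is limited to the common ones that carry a Location.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(request_url: str, status: int, headers: dict[str, str]):
    """Return the URL to request next, or None if the response is not a redirect."""
    if status in REDIRECT_CODES and "location" in headers:
        # Location may be relative, so resolve it against the URL just requested.
        return urljoin(request_url, headers["location"])
    return None
```

With the response shown above, `redirect_target("http://cnn.com/index.html", 301, {"location": "http://www.cnn.com/index.html"})` returns the www.cnn.com URL, which the browser then fetches with a new GET request.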
Web Caches
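[Figure, left: Clients … connect through a Proxy to the Internet; the proxy caches content from the Internet]
[Figure, right: the Internet connects through a Reverse Proxy to Servers …; the reverse proxy caches content from the servers]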
Web Caches
Reduce bandwidth
Reduce server load
Reduce lag
Cache content from idempotent methods (GET mostly)
Web Caches: Why you need to know about them
github.com demo
Web Cache Control
“Cache-Control” header in responses
  E.g. Cache-Control: no-cache
“Expires” header in responses
  E.g. Expires: Fri, 09 Apr 2010 16:00:00 GMT
“Last-Modified” header in responses
  Proxy can use the If-Modified-Since header in a request; server may respond 304 Not Modified
If a subsequent POST, PUT, or DELETE targets the same URL, the cache should be invalidated
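The rules above can be sketched as a toy cache. This is a deliberately simplified illustration, not a compliant HTTP cache: the class and function names are hypothetical, header names are assumed to be lower-cased, `max-age` and other directives are ignored, and a response without an Expires header is conservatively treated as stale.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

UNSAFE_METHODS = {"POST", "PUT", "DELETE"}


def is_fresh(headers: dict[str, str], now: datetime) -> bool:
    """Decide whether a stored response may be reused without asking the server."""
    directives = {d.strip().lower() for d in headers.get("cache-control", "").split(",")}
    if "no-cache" in directives or "no-store" in directives:
        return False
    expires = headers.get("expires")
    if expires is None:
        return False                       # assumption: no explicit lifetime, so revalidate
    try:
        return now < parsedate_to_datetime(expires)
    except (TypeError, ValueError):        # invalid dates such as "-1" count as expired
        return False


class Cache:
    """Toy cache keyed by URL; stores (headers, body) pairs."""

    def __init__(self) -> None:
        self.entries: dict[str, tuple[dict[str, str], bytes]] = {}

    def store(self, url: str, headers: dict[str, str], body: bytes) -> None:
        self.entries[url] = (headers, body)

    def revalidation_headers(self, url: str) -> dict[str, str]:
        """Headers for a conditional GET; the server may answer 304 Not Modified."""
        entry = self.entries.get(url)
        if entry and "last-modified" in entry[0]:
            return {"If-Modified-Since": entry[0]["last-modified"]}
        return {}

    def note_request(self, method: str, url: str) -> None:
        """Invalidate the stored entry when a state-changing request passes through."""
        if method.upper() in UNSAFE_METHODS:
            self.entries.pop(url, None)
```

Passing `now` as a timezone-aware datetime (for example `datetime.now(timezone.utc)`) keeps the comparison with the parsed Expires value well defined.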
Cookies
Text data sent from the server to the client meant to be sent back in subsequent requests from the client to the same server
Added to the Netscape browser (then called Mosaic Netscape) and Web servers in 1994
Uses
  Session management
  Personalization
  Tracking
Setting and Using Cookies
Client → Server
GET /index.html HTTP/1.1
Host: www.google.com

Server → Client
HTTP/1.1 200 OK
Date: Sat, 10 Apr 2010 14:35:22 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=1bb89b81c47c05fb:TM=1270910122:LM=1270910122:S=YQ3wzhShOas9UStn; expires=Mon, 09-Apr-2012 14:35:22 GMT; path=/; domain=.google.com
Set-Cookie: NID=33=CeVJK2EKVB5kcCiguCD1OjG3g5UKlPq78SXCibOjYQOU46P6SMaAKqAhw2hEVPqqnKfFlTzmC-w4Ol5ZwKQqnjyla1DZcS6ZYmb1lLHe2zNuEVnXJRtd4lMrr6gA4o8m; expires=Sun, 10-Oct-2010 14:35:22 GMT; path=/; domain=.google.com; HttpOnly
Server: gws
Transfer-Encoding: chunked
…
Setting and Using Cookies
Client → Server
GET /index.html HTTP/1.1
Host: www.google.com
Cookie: PREF=ID=1bb89b81c47c05fb:TM=1270910122:LM=1270910122:S=YQ3wzhShOas9UStn
Etc.
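The round trip above can be sketched with Python's http.cookies module: the client parses each Set-Cookie value it receives, then sends back only the name=value pairs in a Cookie header. The helper names are hypothetical, and the sketch ignores the expiry, path, and domain checks a real browser performs before sending a cookie.

```python
from http.cookies import SimpleCookie


def parse_set_cookie(values: list[str]) -> SimpleCookie:
    """Collect the cookies from the Set-Cookie header values of one response."""
    jar = SimpleCookie()
    for value in values:
        jar.load(value)        # attributes (expires, path, domain) are kept per cookie
    return jar


def cookie_header(jar: SimpleCookie) -> str:
    """Build the Cookie request header: name=value pairs only, no attributes."""
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())
```

Feeding the PREF value from the response above into `parse_set_cookie` and then `cookie_header` reproduces the Cookie line of the follow-up request; the attributes stay on the client and are never sent back to the server.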
Uses
Session Management
  User logs in, server sends cookie
  Subsequent requests include that cookie
Personalization
  User visits, server sends cookie
  User changes preferences, all with cookie
  Future visits include cookie, server “remembers” preferences
Tracking within same site
  Cookie + path + date/time
Tracking inter-site
  Referer + Cookie
  (Privacy concerns)