Lecture 10, 20-755: The Internet, Summer 1999 1 20-755: The Internet Lecture 10: Web Services III David O’Hallaron School of Computer Science and Department of Electrical and Computer Engineering Carnegie Mellon University Institute for eCommerce, Summer 1999

Dec 22, 2015


20-755: The Internet

Lecture 10: Web Services III

David O’Hallaron

School of Computer Science and

Department of Electrical and Computer Engineering

Carnegie Mellon University

Institute for eCommerce, Summer 1999


Today’s lecture

• Anatomy of a simple Web server (40 min)

• Break (10 min)

• Advanced server features (45 min)


Anatomy of Tiny: A simple Web server

#!/usr/local/bin/perl5 -w
use IO::Socket;

#
# tiny.pl - The Tiny HTTP server
#


Tiny: configuration

#
# Configuration
#
$port    = 8000;                  # the port we listen on
$htmldir = "./html/";             # the base html directory
$cgidir  = "./cgi-bin/";          # the base cgi directory
$server  = "Tiny Web server 1.0"; # server info


Tiny: error messages

#
# Error messages
#

# Terse error messages go in the response header
%terse_errors = (
    "403", "Forbidden",
    "404", "Not Found",
    "501", "Not Implemented",
);

# Verbose error messages go in the response message body
%verbose_errors = (
    "403", "You are not allowed to access this item",
    "404", "Tiny couldn't find the requested item on the server",
    "501", "Tiny does not support the given request type",
);


Tiny: create a listening socket

#
# Create a TCP listening socket file descriptor
#
# LocalPort: listen on port $port
# Type     : use TCP
# Reuse    : reuse address right away
# Listen   : buffer at most 10 requests
#
$listenfd = IO::Socket::INET->new(LocalPort => $port,
                                  Type      => SOCK_STREAM,
                                  Reuse     => 1,
                                  Listen    => 10)
    or die "Couldn't listen on port $port: $@\n";


Tiny: main loop structure

#
# Loop forever waiting for HTTP requests
#
while (1) {
    # Wait for a connection request from a client
    $connfd = $listenfd->accept();

    # Determine the domain name and IP address of this client
    # Parse the request line (after stripping the newline)
    # Parse the URI
    # Parse the request headers
    # OPTIONS method
    # HEAD method
    # GET method
    # misc: POST, PUT, DELETE, and TRACE methods
}


Tiny: error procedure

#
# error - send an error message back to the client
# $_[0]: the error number
# $_[1]: the method or URI that caused the error
#
sub error {
    local($errno) = $_[0];
    local($errmsg) = "$errno $terse_errors{$errno}";

    print $connfd <<EndOfMessage;
HTTP/1.1 $errmsg
Content-type: text/html

<HTML>
<HEAD><TITLE>$errmsg</TITLE></HEAD>
<BODY bgcolor="#ffffff">
<H1>$errmsg</H1>
$verbose_errors{$errno}:
<PRE>$_[1]</PRE>
<HR>
The Tiny Web Server
</BODY>
</HTML>
EndOfMessage
}


Tiny: get client’s name and address

# Determine the domain name and IP address of this client
$client_sockaddr = getpeername($connfd);
($client_port, $client_iaddr) = unpack_sockaddr_in($client_sockaddr);
$client_port = $client_port;    # so -w won't complain
$client_name = gethostbyaddr($client_iaddr, AF_INET);
($a1, $a2, $a3, $a4) = unpack('C4', $client_iaddr);
print "Opened connection with $client_name ($a1.$a2.$a3.$a4)\n";
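The address formatting step above can also be done with the Socket module's inet_ntoa, which turns a packed 4-byte address into dotted-decimal text. The following is a minimal sketch, not part of Tiny; the helper name dotted_quad is made up for illustration.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Convert a packed 4-byte IPv4 address to dotted-decimal text.
# inet_ntoa does the same job as unpack('C4', ...) followed by a join.
# (dotted_quad is a hypothetical helper name, not part of Tiny.)
sub dotted_quad {
    my ($packed_addr) = @_;
    return inet_ntoa($packed_addr);
}

# Example: pack an address with inet_aton, then convert it back
my $packed = inet_aton("128.2.218.2");
print dotted_quad($packed), "\n";
```

In Tiny, the value to pass would be $client_iaddr as returned by unpack_sockaddr_in.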


Tiny: parsing the request line

# Parse the request line (after stripping the newline)
chomp($line = <$connfd>);
($method, $uri, $version) = split(/\s+/, $line);
print "received $line\n";
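Tiny trusts the client to send a well-formed request line. A slightly more defensive version might look like the sketch below, which also strips a trailing CR and rejects lines that do not have exactly three fields. The sub name parse_request_line is an assumption for illustration; it is not in tiny.pl.

```perl
use strict;
use warnings;

# Split an HTTP request line into (method, uri, version).
# Returns an empty list when the line does not have exactly three fields.
# (parse_request_line is a hypothetical name, not part of Tiny.)
sub parse_request_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                # strip a trailing LF or CRLF
    my @fields = split /\s+/, $line;
    return unless @fields == 3;
    return @fields;
}

my ($method, $uri, $version) = parse_request_line("GET /index.html HTTP/1.1\r\n");
print "method=$method uri=$uri version=$version\n";
```

A caller could respond with an error whenever the returned list is empty, rather than continuing with undefined values.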


Tiny: parsing the URI

#
# Parse the URI
#

# Either the URI refers to a CGI program...
if ($uri =~ m:^/cgi-bin/:) {
    $is_static = 0;

    # extract the program name and its arguments
    ($filename, $cgiargs) = split(/\?/, $uri);
    if (!defined($cgiargs)) {
        $cgiargs = "";
    }

    # replace /cgi-bin with the default cgi directory
    $filename =~ s:^/cgi-bin/:$cgidir:o;
}


Tiny: parsing the URI (cont)

# ... or the URI refers to a file
else {
    $is_static = 1;    # static content
    $cgiargs = "";

    # replace the first / with the default html directory
    $filename = $uri;
    $filename =~ s:^/:$htmldir:o;

    # use index.html for the default file
    $filename =~ s:/$:/index.html:;
}

# debug statements like this will help you a lot
print "parsed URI: is_static=$is_static, filename=$filename, cgiargs=$cgiargs\n";
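The two URI cases above can be collected into one function that is easy to test without a socket. This is a sketch under the same rules as Tiny; the sub name parse_uri and the sample script adder.pl are assumptions. One deliberate difference: split is given a limit of 2, so a query string that itself contains "?" is kept whole.

```perl
use strict;
use warnings;

# Map a request URI to (is_static, filename, cgiargs), following the
# same rules as Tiny. The directory defaults are the values from
# Tiny's configuration section. (parse_uri is a hypothetical name.)
sub parse_uri {
    my ($uri, $htmldir, $cgidir) = @_;
    $htmldir = "./html/"    unless defined $htmldir;
    $cgidir  = "./cgi-bin/" unless defined $cgidir;

    if ($uri =~ m{^/cgi-bin/}) {
        # Dynamic content: split off the query string at the first "?"
        my ($filename, $cgiargs) = split /\?/, $uri, 2;
        $cgiargs = "" unless defined $cgiargs;
        $filename =~ s{^/cgi-bin/}{$cgidir};
        return (0, $filename, $cgiargs);
    }

    # Static content: the leading "/" maps to the html directory, and
    # a trailing "/" selects index.html
    my $filename = $uri;
    $filename =~ s{^/}{$htmldir};
    $filename =~ s{/$}{/index.html};
    return (1, $filename, "");
}

# adder.pl is a made-up script name used only for this example
for my $uri ("/", "/pics/logo.gif", "/cgi-bin/adder.pl?1&2") {
    my ($is_static, $filename, $cgiargs) = parse_uri($uri);
    print "$uri -> is_static=$is_static, filename=$filename, cgiargs=$cgiargs\n";
}
```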


Tiny: parsing the request headers

#
# Parse the request headers
#
$content_length = 0;
$content_type = "text/html";
while (<$connfd>) {    # read request header into $_

    # delete all CR and LF chars from $_
    s/\n|\r//g;

    # Determine the length of the message body
    # search for "Content-Length:" at beginning of string $_
    # ignore the case
    if (/^Content-Length: (\S*)/i) {
        $content_length = $1;
    }


Tiny: parsing the request headers (cont)

    # determine the type of content (if any) in msg body
    # search for "Content-Type:" at beginning of string $_
    # ignore the case
    if (/^Content-Type: (\S*)/i) {
        $content_type = $1;
    }

    # If $_ was a blank line, exit the loop
    if (length == 0) {
        last;
    }
}
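Tiny only looks for two headers. A more general variant, sketched below, collects every header into a hash so that later code (for example, Cookie or Referer handling) can look fields up by name. The sub name parse_headers is an assumption, and the example reads from an in-memory filehandle rather than a socket so that it can run on its own.

```perl
use strict;
use warnings;

# Read header lines from a filehandle until the blank line that ends
# the header section. Returns a hash ref keyed by lowercased field name.
# (parse_headers is a hypothetical name, not part of Tiny.)
sub parse_headers {
    my ($fh) = @_;
    my %headers;
    while (my $line = <$fh>) {
        $line =~ s/[\r\n]+\z//;          # strip CRLF or LF
        last if $line eq "";             # blank line: end of headers
        if ($line =~ /^([^:\s]+):\s*(.*)$/) {
            $headers{lc $1} = $2;
        }
    }
    return \%headers;
}

# Demonstrate on an in-memory filehandle instead of a socket
my $request = "Host: localhost:8000\r\nContent-Length: 11\r\n"
            . "Content-Type: text/plain\r\n\r\nhello world";
open(my $fh, '<', \$request) or die "Couldn't open in-memory file: $!";
my $headers = parse_headers($fh);

# Same defaults as Tiny when a header is absent
my $content_length = $headers->{'content-length'} || 0;
my $content_type   = $headers->{'content-type'}   || "text/html";
print "length=$content_length type=$content_type\n";
```

As in Tiny, the loop stops at the blank line, so the message body is still unread on the filehandle afterwards.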


Tiny: OPTIONS

#
# OPTIONS method
#
if ($method eq "OPTIONS") {
    $today = gmtime()." GMT";
    $connfd->print("$version 200 OK\n");
    $connfd->print("Date: $today\n");
    $connfd->print("Server: $server\n");
    $connfd->print("Content-length: 0\n");
    $connfd->print("Allow: OPTIONS, HEAD, GET\n");    # methods are comma-separated
    $connfd->print("\n");
}


Tiny: HEAD

#
# HEAD method
#
elsif ($method eq "HEAD") {
    # we're disallowing HEAD methods on scripts
    if (!$is_static) {
        error(403, $filename);
    }
    else {
        $today = gmtime()." GMT";
        head_method($filename, $uri, $today, $server);
    }
}


Tiny: HEAD (cont)

#
# process the HEAD method on static content
# $_[0] : the file to be processed
# $_[1] : the uri
# $_[2] : today's date
# $_[3] : server name
#
sub head_method {
    local ($filename) = $_[0];
    local ($uri) = $_[1];
    local ($today) = $_[2];
    local ($server) = $_[3];
    local $modified;
    local $filesize;
    local $filetype;


Tiny: HEAD (cont)

    # make sure the requested file exists
    if (!(-e $filename)) {
        error(404, $uri);
    }

    # make sure the requested file is readable
    elsif (!(-r $filename)) {
        error(403, $uri);
    }


Tiny: HEAD (cont)

    # serve the response header but not the file
    else {
        # determine file modification date
        $modified = gmtime((stat($filename))[9])." GMT";

        # determine filesize in bytes
        $filesize = (stat($filename))[7];

        # determine filetype (default is text)
        if ($filename =~ /\.html$/) {
            $filetype = "text/html";
        }
        elsif ($filename =~ /\.gif$/) {
            $filetype = "image/gif";
        }
        elsif ($filename =~ /\.jpg$/) {
            $filetype = "image/jpeg";
        }
        else {
            $filetype = "text/plain";
        }


Tiny: HEAD (cont)

        # print the response header
        $connfd->print("HTTP/1.1 200 OK\n");
        $connfd->print("Date: $today\n");
        $connfd->print("Server: $server\n");
        $connfd->print("Last-modified: $modified\n");
        $connfd->print("Content-length: $filesize\n");
        $connfd->print("Content-type: $filetype\n");
        $connfd->print("\n");    # blank line required by the HTTP standard to end the header
    } # end of else
} # end of procedure
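The if/elsif chain that picks the content type grows by one branch for every new file type. One alternative is a lookup table keyed by file extension, sketched below with the same three types and the same text/plain default as Tiny. The names %mime_types and file_type are assumptions for illustration.

```perl
use strict;
use warnings;

# Extension-to-type table with the same three entries Tiny recognizes.
# (%mime_types and file_type are hypothetical names, not part of Tiny.)
my %mime_types = (
    html => "text/html",
    gif  => "image/gif",
    jpg  => "image/jpeg",
);

# Return the content type for a filename, defaulting to text/plain
sub file_type {
    my ($filename) = @_;
    my ($ext) = $filename =~ /\.([^.\/]+)$/;   # text after the last "."
    return "text/plain" unless defined $ext;
    return $mime_types{lc $ext} || "text/plain";
}

for my $name ("./html/index.html", "./html/logo.gif", "./html/notes") {
    print "$name: ", file_type($name), "\n";
}
```

Unlike the original chain, this version lowercases the extension, so LOGO.GIF and logo.gif get the same type.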


Some Tiny issues

• How would you serve static and dynamic content with GET?

• How would you serve dynamic content with POST?

• How safe are your CGI scripts?

– Hint: consider the impact of allowing “..” in URIs.
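As a starting point for the last question: because Tiny pastes the URI onto its base directory, a request such as /cgi-bin/../../bin/sh would name a file outside the cgi directory. One possible guard, sketched here under the assumption that the server does not decode percent-escapes (Tiny does not), is to reject any URI whose path has a ".." segment. The sub name is_safe_uri is made up for this example.

```perl
use strict;
use warnings;

# Return 1 when no segment of the URI path is "..", and 0 otherwise.
# (is_safe_uri is a hypothetical helper, not part of Tiny.)
sub is_safe_uri {
    my ($uri) = @_;
    my ($path) = split /\?/, $uri, 2;    # ignore the query string
    return 0 unless defined $path;
    for my $segment (split m{/}, $path) {
        return 0 if $segment eq "..";
    }
    return 1;
}

for my $uri ("/index.html", "/cgi-bin/../../bin/sh", "/a..b/page.html") {
    print "$uri: ", (is_safe_uri($uri) ? "ok" : "rejected"), "\n";
}
```

A server that does decode percent-escapes would need to run this check after decoding, since %2e%2e would otherwise slip through.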


Break time!

Fish


Today’s lecture

• Anatomy of a simple Web server (40 min)

• Break (10 min)

• Advanced server features (45 min)


Cookies

• An HTTP session is a sequence of request and response messages between a client and a server.

• Regular HTTP sessions are stateless

– Each request/response pair is independent of the others

• Cookies are a mechanism for creating stateful sessions (RFC 2109)

– Allows servers and CGI scripts to maintain state information (e.g., which items are in a shopping cart) during a session.

• Based on HTTP Set-Cookie (server->client) and Cookie (client->server) headers.


Cookies

• Request 1 (client -> server): the client initiates a request to the server.

• Response 1 (server -> client, with Set-Cookie): the server includes a Set-Cookie header in the HTTP response that contains info (the cookie) that identifies the user.

• The client stores the cookie on disk.


Cookies

• Request 2 (client -> server, with Cookie): the next time the client sends a request to the server, it includes the cookie as a Cookie header in the HTTP request message.

• Response 2 (server -> client, with Set-Cookie): the server incorporates any relevant new info from request 2 into the Set-Cookie header in response 2.


Cookie example (from RFC 2109)

• Initially the client has no stored cookies.

• Client -> server

– POST /acme/login HTTP/1.1

– [form data]

– user identifies self in form data

• Server -> client

– HTTP/1.1 200 OK

– Set-Cookie: Customer="WILY_COYOTE"; path="/acme"

– cookie identifies user

– client stores cookie for the next request to this server


Cookie example (cont)

• Client -> server

– POST /acme/pickitem HTTP/1.1

– Cookie: Customer="WILY_COYOTE"; $Path="/acme"

– [form data]

– User selects an item for a “shopping basket”

• Server -> client

– HTTP/1.1 200 OK

– Set-Cookie: Part_Number="Rocket_Launcher_0001"; path="/acme"

– Server remembers that shopping basket contains an item


Cookie example (cont)

• Client -> server

– POST /acme/shipping HTTP/1.1

– Cookie: Customer="WILY_COYOTE"; $Path="/acme"; Part_Number="Rocket_Launcher_0001"; $Path="/acme"

– [form data]

– user selects a shipping method from form

• Server -> client

– HTTP/1.1 200 OK

– Set-Cookie: Shipping="FedEx"; path="/acme"


Cookie example (cont)

• Client -> server

– POST /acme/process HTTP/1.1

– Cookie: Customer="WILY_COYOTE"; $Path="/acme"; Part_Number="Rocket_Launcher_0001"; $Path="/acme"; Shipping="FedEx"; $Path="/acme"

– [form data]

– user chooses to process order

• Server -> client

– HTTP/1.1 200 OK

– transaction complete
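On the server side, a CGI script has to pull the individual cookies back out of the Cookie request header. The sketch below shows one simple way to do that for headers shaped like the ones in this example. The sub name parse_cookie_header is an assumption, and the parser is deliberately naive: it splits on ";" and ",", so it would mishandle a quoted value that contains either character.

```perl
use strict;
use warnings;

# Parse the value of a Cookie request header into a hash of
# cookie name => value. Attributes whose names start with "$"
# (such as $Path) describe the preceding cookie and are skipped here.
# (parse_cookie_header is a hypothetical name.)
sub parse_cookie_header {
    my ($value) = @_;
    my %cookies;
    for my $pair (split /\s*[;,]\s*/, $value) {
        my ($name, $val) = $pair =~ /^\s*([^=\s]+)\s*=\s*(.*?)\s*$/
            or next;                     # skip anything that is not name=value
        next if $name =~ /^\$/;          # skip $Path, $Version, ...
        $val =~ s/^"(.*)"$/$1/;          # drop surrounding quotes
        $cookies{$name} = $val;
    }
    return %cookies;
}

my %basket = parse_cookie_header(
    'Customer="WILY_COYOTE"; $Path="/acme"; '
  . 'Part_Number="Rocket_Launcher_0001"; $Path="/acme"; '
  . 'Shipping="FedEx"; $Path="/acme"'
);
print "$_ = $basket{$_}\n" for sort keys %basket;
```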


Cookies

• Cookies are grouped by the URI pathname in the request headers (in this case /acme)

• The server adds cookies to the client in the response headers.

• The server can implicitly delete cookies by setting an expiration date in the Set-Cookie header (not shown in the previous example)
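To illustrate the last point, here is a sketch of a helper that builds a Set-Cookie header and can ask the client to discard a cookie. It uses the Max-Age attribute defined in RFC 2109 (a lifetime in seconds, where 0 means discard immediately) rather than an explicit expiration date, and it includes the Version attribute that RFC 2109 requires but the slides omit. The sub name set_cookie_header is an assumption.

```perl
use strict;
use warnings;

# Build a Set-Cookie header in the RFC 2109 style. Passing a Max-Age
# of 0 asks the client to discard the cookie right away.
# (set_cookie_header is a hypothetical name.)
sub set_cookie_header {
    my ($name, $value, $path, $max_age) = @_;
    my $header = sprintf('Set-Cookie: %s="%s"; Version="1"; Path="%s"',
                         $name, $value, $path);
    $header .= "; Max-Age=$max_age" if defined $max_age;
    return $header;
}

# Set the Shipping cookie, then delete it again
print set_cookie_header("Shipping", "FedEx", "/acme"), "\n";
print set_cookie_header("Shipping", "", "/acme", 0), "\n";
```

To delete a cookie, the name and path in the second header must match the ones used when the cookie was set.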


Applications and implications of cookies

• Click tracking

– can be used to correlate a user’s activity at many different sites.

– Doubleclick.com pays a web site to place an <img src=> tag on the site’s page.

– Causes an advertising banner and a cookie from Doubleclick.com to be loaded into the client when the site’s page is referenced.

– Firms like Doubleclick maintain a unique id per client machine, but have no way to determine the user’s name or other info unless the user supplies it.


Applications of cookies

• Content customization

– Cookies can be used to remember user preferences and customize content to suit those preferences.

– Firms like Doubleclick can record a user’s past browsing patterns and target advertising based on those patterns and on the page the user is currently browsing.


Refer links

• A user looking at page www.cs.cmu.edu/~droh/755/foo.html clicks a link to kittyhawk.cmcl.cs.cmu.edu/bar.html

• The browser sends a Referer (sic) header to identify the source page of the request:

GET /bar.html HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, */*
Referer: http://www.cs.cmu.edu/~droh/755/foo.html
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)
Host: kittyhawk.cmcl.cs.cmu.edu:8000
Connection: Keep-Alive


Applications of refer links

• Allows advertisers to gauge the effectiveness of ads they place on other sites.

• Enables third-party referral businesses such as BeFree.com.
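As a rough illustration of the first application, a site could tally the Referer values it receives by referring host to see which sites send it the most traffic. The sketch below assumes the Referer values have already been collected into a list; the sub names and the second sample URL are made up for this example.

```perl
use strict;
use warnings;

# Extract the host name from a Referer URL, or return undef.
# (referer_host and count_referrals are hypothetical names.)
sub referer_host {
    my ($url) = @_;
    return undef unless defined $url;
    return $url =~ m{^https?://([^/:]+)}i ? lc $1 : undef;
}

# Count how many requests each referring host sent us
sub count_referrals {
    my (@referers) = @_;
    my %count;
    for my $url (@referers) {
        my $host = referer_host($url);
        $count{$host}++ if defined $host;
    }
    return %count;
}

my %count = count_referrals(
    "http://www.cs.cmu.edu/~droh/755/foo.html",
    "http://www.cs.cmu.edu/~droh/755/index.html",   # hypothetical page
    "http://www.ecom.cmu.edu/people/faculty/",
);
print "$_: $count{$_}\n" for sort keys %count;
```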


Log files

extissnj1.foo.com - - [14/Jul/1999:20:14:38 -0400] "GET /people/faculty/dohallaron HTTP/1.0" 301 375 "http://www.ecom.cmu.edu/people/faculty/" "Mozilla/4.05 [en] (WinNT; I)"
inet-fw1-o.foo.com - - [15/Jul/1999:02:58:10 -0400] "GET /people/faculty/dohallaron HTTP/1.0" 301 375 "http://www.ecom.cmu.edu/people/faculty/" "Mozilla/4.06 [en] (WinNT; U)"
internet5.foo.com - - [15/Jul/1999:16:35:59 -0400] "GET /people/faculty/dohallaron HTTP/1.0" 301 375 "http://www.ecom.cmu.edu/people/faculty/" "Mozilla/4.04 [en]C-c32f404p (Win95; I)"
tmpce001.foo.com - - [16/Jul/1999:16:04:18 -0400] "GET /people/faculty/dohallaron HTTP/1.0" 301 375 "http://www.ecom.cmu.edu/people/faculty/" "Mozilla/4.06 [en] (Win95; I)"
hqinbh2.foo.com - - [22/Jul/1999:16:03:51 -0400] "GET /people/faculty/dohallaron/droh.quake.gif HTTP 1.0" 200 14336 "http://www.ecom.cmu.edu/people/faculty/dohallaron/" "Mozilla/4.6C-CCK-MCD [en] (X\
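Each log entry above has the same nine fields: client host, two identity fields (here "-"), timestamp, request line, status code, bytes sent, referring page, and browser. A sketch of a parser for lines in this layout follows; the sub name parse_log_line and the hash keys are assumptions chosen for this example.

```perl
use strict;
use warnings;

# Parse one log line in the layout shown above into a hash ref,
# or return undef if the line does not match.
# (parse_log_line and the field names are hypothetical.)
sub parse_log_line {
    my ($line) = @_;
    my @fields = $line =~
        m/^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"/
        or return;
    my %entry;
    @entry{qw(host ident user time request status bytes referer agent)} = @fields;
    return \%entry;
}

my $entry = parse_log_line(
    'extissnj1.foo.com - - [14/Jul/1999:20:14:38 -0400] '
  . '"GET /people/faculty/dohallaron HTTP/1.0" 301 375 '
  . '"http://www.ecom.cmu.edu/people/faculty/" "Mozilla/4.05 [en] (WinNT; I)"'
);
print "$entry->{host} requested '$entry->{request}' ",
      "(status $entry->{status}) from $entry->{referer}\n";
```

Once the entries are hashes, it takes only a few lines to answer questions such as which pages are requested most often or which hosts visit most.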


Implications of logs

• Contain a great deal of personal information about the browsing patterns of people inside and outside a site.

• Important issues:

– Who has access to logs?

– How is the log information being used?


Virtual hosting

• Virtual hosting allows one web server to serve requests for multiple domains.

• Allows ISPs to provide customers with their own “vanity” sites.

– Each eCommerce student has their own virtual Web server running at <andrewid>.student.ecom.cmu.edu.

– e.g., http://zak.student.ecom.cmu.edu

– equivalent to http://euro.ecom.cmu.edu/~zak


Virtual hosting: how it works

• Configure DNS so that all virtual hosts have the same IP address

» e.g., each eCommerce student site has the IP address 128.2.218.2 (same as euro.ecom)

» verify this yourself with nslookup

• Server maintains a list of (domain name, directory tree) pairs in a hash.

• Server sets base html and cgi directories according to the target domain name.
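The last two steps can be sketched as a hash lookup keyed by the domain name in the request's Host header. The directory paths, the default root, and the sub name dirs_for_host below are all assumptions for illustration; the actual layout on the eCommerce server is not given in these notes.

```perl
use strict;
use warnings;

# (domain name, directory tree) pairs. The paths are hypothetical.
my %vhosts = (
    "zak.student.ecom.cmu.edu"    => "/home/zak/www",
    "elenak.student.ecom.cmu.edu" => "/home/elenak/www",
);
my $default_root = "/usr/local/www";     # hypothetical fallback tree

# Return the (html directory, cgi directory) pair for a Host header.
# (dirs_for_host is a hypothetical name.)
sub dirs_for_host {
    my ($host_header) = @_;
    my $host = lc(defined $host_header ? $host_header : "");
    $host =~ s/:\d+$//;                  # drop an optional ":port"
    my $root = exists $vhosts{$host} ? $vhosts{$host} : $default_root;
    return ("$root/html/", "$root/cgi-bin/");
}

my ($htmldir, $cgidir) = dirs_for_host("zak.student.ecom.cmu.edu:8000");
print "htmldir=$htmldir cgidir=$cgidir\n";
```

In a server like Tiny, the two returned values would replace the fixed $htmldir and $cgidir settings for the duration of one request.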


Virtual hosting

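[Figure: requests to 128.2.218.2 for domains such as zak.student.ecom.cmu.edu and elenak.student.ecom.cmu.edu all arrive at one server. The server maps each domain to a per-student directory tree (~zak, ~elenak, ~mansoo), each with a www directory containing cgi-bin and html.]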


Server-side includes

• Server mechanism that inserts dynamic or static content directly into an HTML document.

some html
<!--#INCLUDE VIRTUAL="message.txt"-->
some more html

some html
<!--#INCLUDE VIRTUAL="cgi-bin/printenv.pl"-->
some more html
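To make the mechanism concrete, here is a sketch of how a server might expand include directives before sending a page. It handles only the INCLUDE VIRTUAL form shown above, and it takes a callback that supplies the content for each name, so the same code could read a static file or run a CGI script. The sub name expand_includes and the in-memory %documents table are assumptions for this example.

```perl
use strict;
use warnings;

# Replace each <!--#INCLUDE VIRTUAL="name"--> directive in $html with
# the text returned by $fetch->(name).
# (expand_includes is a hypothetical name.)
sub expand_includes {
    my ($html, $fetch) = @_;
    $html =~ s{<!--\s*#INCLUDE\s+VIRTUAL\s*=\s*"([^"]*)"\s*-->}{$fetch->($1)}gie;
    return $html;
}

# Stand-in for the server's documents; a real server would read the
# file or run the script named by the directive
my %documents = ("message.txt" => "Hello from the server");

my $page = 'some html <!--#INCLUDE VIRTUAL="message.txt"--> some more html';
print expand_includes($page, sub {
    my ($name) = @_;
    return exists $documents{$name} ? $documents{$name} : "[missing: $name]";
}), "\n";
```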