Lecture 10, 20-755: The Internet, Summer 1999 1
20-755: The Internet
Lecture 10: Web Services III
David O’Hallaron
School of Computer Science and
Department of Electrical and Computer Engineering
Carnegie Mellon University
Institute for eCommerce, Summer 1999
Today’s lecture
• Anatomy of a simple Web server (40 min)
• Break (10 min)
• Advanced server features (45 min)
Anatomy of Tiny: A simple Web server
#!/usr/local/bin/perl5 -w
use IO::Socket;

#
# tiny.pl - The Tiny HTTP server
#
Tiny: configuration
#
# Configuration
#
$port    = 8000;                   # the port we listen on
$htmldir = "./html/";              # the base html directory
$cgidir  = "./cgi-bin/";           # the base cgi directory
$server  = "Tiny Web server 1.0";  # server info
Tiny: error messages
#
# Error messages
#

# Terse error messages go in the response header
%terse_errors = (
    "403", "Forbidden",
    "404", "Not Found",
    "501", "Not Implemented",
);

# Verbose error messages go in the response message body
%verbose_errors = (
    "403", "You are not allowed to access this item",
    "404", "Tiny couldn't find the requested item on the server",
    "501", "Tiny does not support the given request type",
);
Tiny: create a listening socket
#
# Create a TCP listening socket file descriptor
#
#   LocalPort: listen on port $port
#   Type     : use TCP
#   Reuse    : reuse address right away
#   Listen   : buffer at most 10 requests
#
$listenfd = IO::Socket::INET->new(LocalPort => $port,
                                  Type      => SOCK_STREAM,
                                  Reuse     => 1,
                                  Listen    => 10)
    or die "Couldn't listen on port $port: $@\n";
Tiny: main loop structure
#
# Loop forever waiting for HTTP requests
#
while(1) {

    # Wait for a connection request from a client
    $connfd = $listenfd->accept();

    # Determine the domain name and IP address of this client
    # Parse the request line (after stripping the newline)
    # Parse the URI
    # Parse the request headers
    # OPTIONS method
    # HEAD method
    # GET method
    # misc: POST, PUT, DELETE, and TRACE methods
}
Tiny: error procedure
#
# error - send an error message back to the client
#   $_[0]: the error number
#   $_[1]: the method or URI that caused the error
#
sub error {
    local($errno) = $_[0];
    local($errmsg) = "$errno $terse_errors{$errno}";

    # a blank line separates the response header from the message body
    print $connfd <<EndOfMessage;
HTTP/1.1 $errmsg
Content-type: text/html

<HTML>
<HEAD><TITLE>$errmsg</TITLE></HEAD>
<BODY bgcolor="#ffffff">
<H1>$errmsg</H1>
$verbose_errors{$errno}:
<PRE>
$_[1]
</PRE>
<HR>
The Tiny Web Server
</BODY>
</HTML>
EndOfMessage
}
Tiny: get client’s name and address
    # Determine the domain name and IP address of this client
    $client_sockaddr = getpeername($connfd);
    ($client_port, $client_iaddr) = unpack_sockaddr_in($client_sockaddr);
    $client_port = $client_port;   # so -w won't complain
    $client_name = gethostbyaddr($client_iaddr, AF_INET);
    ($a1, $a2, $a3, $a4) = unpack('C4', $client_iaddr);
    print "Opened connection with $client_name ($a1.$a2.$a3.$a4)\n";
Tiny: parsing the request line
    # Parse the request line (after stripping the newline)
    chomp($line = <$connfd>);
    ($method, $uri, $version) = split(/\s+/, $line);
    print "received $line\n";
Tiny: parsing the URI
    #
    # Parse the URI
    #
    # Either the URI refers to a CGI program...
    if ($uri =~ m:^/cgi-bin/:) {
        $is_static = 0;

        # extract the program name and its arguments
        ($filename, $cgiargs) = split(/\?/, $uri);
        if (!defined($cgiargs)) {
            $cgiargs = "";
        }

        # replace /cgi-bin with the default cgi directory
        $filename =~ s:^/cgi-bin/:$cgidir:o;
    }
Tiny: parsing the URI (cont)
    # ... or the URI refers to a file
    else {
        $is_static = 1;   # static content
        $cgiargs = "";

        # replace the first / with the default html directory
        $filename = $uri;
        $filename =~ s:^/:$htmldir:o;

        # use index.html for the default file
        $filename =~ s:/$:/index.html:;
    }

    # debug statements like this will help you a lot
    print "parsed URI: is_static=$is_static, filename=$filename, cgiargs=$cgiargs\n";
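To see what the URI parsing produces without starting the server, the same logic can be wrapped in a function and run on a few sample URIs. This is only a sketch: parse_uri is a hypothetical helper that is not part of tiny.pl, the sample URIs (including the adder program name) are made up for illustration, and the directory arguments stand in for $htmldir and $cgidir from the configuration slide.

```perl
use strict;
use warnings;

# Hypothetical helper (not in tiny.pl): the slide's URI-parsing logic
# wrapped in a function so it can be tried without a socket.
sub parse_uri {
    my ($uri, $htmldir, $cgidir) = @_;
    my ($is_static, $filename, $cgiargs);

    if ($uri =~ m{^/cgi-bin/}) {
        # dynamic content: split the program name from its arguments
        # (the limit of 2 keeps any later "?" inside the argument string)
        $is_static = 0;
        ($filename, $cgiargs) = split(/\?/, $uri, 2);
        $cgiargs = "" unless defined $cgiargs;
        $filename =~ s{^/cgi-bin/}{$cgidir};
    }
    else {
        # static content: map the URI onto the html directory
        $is_static = 1;
        $cgiargs   = "";
        ($filename = $uri) =~ s{^/}{$htmldir};
        $filename =~ s{/$}{/index.html};   # default file for a directory
    }
    return ($is_static, $filename, $cgiargs);
}

# Sample URIs, invented for illustration
for my $uri ("/", "/docs/", "/pic.gif", "/cgi-bin/adder?1&2") {
    my ($is_static, $filename, $cgiargs) =
        parse_uri($uri, "./html/", "./cgi-bin/");
    print "$uri -> is_static=$is_static, filename=$filename, cgiargs=$cgiargs\n";
}
```

With these arguments, "/" maps to ./html/index.html and "/cgi-bin/adder?1&2" maps to the program ./cgi-bin/adder with the argument string 1&2.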
Tiny: parsing the request headers
    #
    # Parse the request headers
    #
    $content_length = 0;
    $content_type = "text/html";
    while (<$connfd>) {   # read request header into $_

        # Delete CR and NL chars from $_
        s/\n|\r//g;

        # Determine the length of the message body:
        # search for "Content-Length:" at beginning of string $_,
        # ignoring the case
        if (/^Content-Length: (\S*)/i) {
            $content_length = $1;
        }
Tiny: parsing the request headers (cont)
        # Determine the type of content (if any) in msg body:
        # search for "Content-Type:" at beginning of string $_,
        # ignoring the case
        if (/^Content-Type: (\S*)/i) {
            $content_type = $1;
        }

        # If $_ was a blank line, exit the loop
        if (length == 0) {
            last;
        }
    }
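The header loop can be exercised without a network connection by reading from an in-memory filehandle. The sketch below is an assumption-laden variation, not the lecture's code: read_headers is a hypothetical helper that collects every header into a hash rather than picking out only Content-Length and Content-Type, and the sample request text is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not in tiny.pl): read request headers from any
# filehandle until the blank line that ends them, and return a hash
# keyed by the lowercased header name.
sub read_headers {
    my ($fh) = @_;
    my %headers;
    while (my $line = <$fh>) {
        $line =~ s/[\r\n]+$//;        # strip the trailing CRLF (or LF)
        last if $line eq "";          # blank line: end of the headers
        if ($line =~ /^([^:\s]+):\s*(.*)$/) {
            $headers{lc $1} = $2;
        }
    }
    return %headers;
}

# Try it on an in-memory request instead of a socket (sample data).
my $request = "Host: www.cs.cmu.edu\r\n"
            . "Content-Type: text/plain\r\n"
            . "Content-Length: 11\r\n"
            . "\r\n"
            . "hello world";
open(my $fh, '<', \$request) or die "Couldn't open in-memory file: $!";
my %h = read_headers($fh);
print "content-length=$h{'content-length'}, content-type=$h{'content-type'}\n";
```

Lowercasing the names plays the same role as the /i modifier in the slide's regular expressions: header names are matched without regard to case.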
    #
    # HEAD method
    #
    elsif ($method eq "HEAD") {
        # we're disallowing HEAD methods on scripts
        if (!$is_static) {
            error(403, $filename);
        }
        else {
            $today = gmtime()." GMT";
            head_method($filename, $uri, $today, $server);
        }
    }
Tiny: HEAD (cont)
#
# process the HEAD method on static content
#   $_[0] : the file to be processed
#   $_[1] : the uri
#   $_[2] : today's date
#   $_[3] : server name
#
sub head_method {
    local ($filename) = $_[0];
    local ($uri) = $_[1];
    local ($today) = $_[2];
    local ($server) = $_[3];
    local $modified;
    local $filesize;
    local $filetype;
Tiny: HEAD (cont)
    # make sure the requested file exists
    if (!(-e $filename)) {
        error(404, $uri);
    }

    # make sure the requested file is readable
    elsif (!(-r $filename)) {
        error(403, $uri);
    }
Tiny: HEAD (cont)
    # serve the response header but not the file
    else {
        # determine file modification date
        $modified = gmtime((stat($filename))[9])." GMT";

        # determine filesize in bytes
        $filesize = (stat($filename))[7];

        # determine filetype (default is text)
        if ($filename =~ /\.html$/) {
            $filetype = "text/html";
        }
        elsif ($filename =~ /\.gif$/) {
            $filetype = "image/gif";
        }
        elsif ($filename =~ /\.jpg$/) {
            $filetype = "image/jpeg";
        }
        else {
            $filetype = "text/plain";
        }
Tiny: HEAD (cont)
        # print the response header
        # (each line ends in CRLF, as required by the HTTP standard)
        $connfd->print("HTTP/1.1 200 OK\r\n");
        $connfd->print("Date: $today\r\n");
        $connfd->print("Server: $server\r\n");
        $connfd->print("Last-modified: $modified\r\n");
        $connfd->print("Content-length: $filesize\r\n");
        $connfd->print("Content-type: $filetype\r\n");
        # the blank line that ends the header must go to the client
        # socket too, not to STDOUT
        $connfd->print("\r\n");
    } # end of else
} # end of procedure
Some Tiny issues
• How would you serve static and dynamic content with GET?
• How would you serve dynamic content with POST?
• How safe are your CGI scripts?
– hint: consider the impact of allowing “..” in URIs.
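As one possible starting point for the “..” hint, a server could refuse any URI whose path contains a ".." segment before mapping it to a file. The sketch below is an assumption, not part of Tiny: uri_is_safe is a hypothetical helper, and decoding %XX escapes before the check is a conservative choice, since Tiny as shown does not decode them itself.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Tiny): returns 1 if the URI path can
# be mapped into the document tree, 0 if it tries to climb out of it
# with a ".." path segment.
sub uri_is_safe {
    my ($uri) = @_;
    my ($path) = split(/\?/, $uri, 2);      # ignore any CGI arguments
    return 0 unless defined $path && $path =~ m{^/};

    # decode %XX escapes first so that "%2e%2e" cannot hide a ".."
    $path =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;

    foreach my $segment (split(m{/}, $path)) {
        return 0 if $segment eq "..";
    }
    return 1;
}

# Sample URIs, invented for illustration
foreach my $uri ("/index.html", "/cgi-bin/../../bin/sh", "/%2e%2e/etc/passwd") {
    print "$uri: ", (uri_is_safe($uri) ? "ok" : "rejected"), "\n";
}
```

In Tiny's main loop, a check like this could run right after the request line is parsed, answering unsafe URIs with error(403, $uri). Without it, a URI such as /cgi-bin/../../bin/sh would be rewritten to a path outside the cgi directory.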
Break time!
Today’s lecture
• Anatomy of a simple Web server (40 min)
• Break (10 min)
• Advanced server features (45 min)
Cookies
• An HTTP session is a sequence of request and response messages between a client and a server.
• Regular HTTP sessions are stateless
– Each request/response pair is independent of the others
• Cookies are a mechanism for creating stateful sessions (RFC 2109)
– Allows servers and CGI scripts to maintain state information (e.g., which items are in a shopping cart) during a session.
• Based on HTTP Set-Cookie (server->client) and Cookie (client->server) headers.
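A minimal sketch of the two headers, assuming the RFC 2109 syntax: the server sends name="value" pairs with attributes such as Version and Path in Set-Cookie, and the client returns them in Cookie with the attributes prefixed by "$". Both function names are hypothetical, and the Customer cookie is the example used in RFC 2109 itself.

```perl
use strict;
use warnings;

# Hypothetical helper: build a Set-Cookie response header line
# (server -> client) in RFC 2109 style.
sub set_cookie_header {
    my ($name, $value, $path) = @_;
    return "Set-Cookie: $name=\"$value\"; Version=\"1\"; Path=\"$path\"\r\n";
}

# Hypothetical helper: split the value of a Cookie request header
# (client -> server) into a hash of cookie name => value.
sub parse_cookie_header {
    my ($value) = @_;
    my %cookies;
    foreach my $pair (split(/;\s*/, $value)) {
        my ($name, $val) = split(/=/, $pair, 2);
        next unless defined $name && length $name;
        next if $name =~ /^\$/;        # skip attributes such as $Version, $Path
        $val = "" unless defined $val;
        $val =~ s/^"(.*)"$/$1/;        # values may be quoted
        $cookies{$name} = $val;
    }
    return %cookies;
}

print set_cookie_header("Customer", "WILE_E_COYOTE", "/acme");

my %cookies = parse_cookie_header('$Version="1"; Customer="WILE_E_COYOTE"; $Path="/acme"');
print "Customer is $cookies{Customer}\n";
```

A server or CGI script would read the Cookie header in the same loop that parses the other request headers, then use the recovered values (for example, the contents of a shopping cart) to build its response.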
Cookies
client -> server: request 1
Client initiates request to server.

server -> client: response 1 (Set-Cookie)
Server includes a Set-Cookie header in the HTTP response that contains info (the cookie) that identifies the user.
The client stores the cookie on disk.
Cookies
client -> server: request 2 (Cookie)
Next time the client sends a request to the server, it includes the cookie as a Cookie header in the HTTP request message.

server -> client: response 2 (Set-Cookie)
The server incorporates any relevant new info from request 2 into the Set-Cookie header in response 2.
• Cookies are grouped by the URI pathname in the request headers (in this case /acme)
• The server adds cookies to the client in the response headers.
• The server can implicitly delete cookies by setting an expiration date in the Set-Cookie header (not shown in previous example)
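Deleting a cookie by expiration might look like the following sketch. Both function names are hypothetical. The first uses the RFC 2109 Max-Age attribute, where a value of zero asks the client to discard the cookie immediately. The second uses the older Netscape-style expires attribute with a date in the past. Which form a given browser honors is left as an assumption here.

```perl
use strict;
use warnings;

# Hypothetical helper: RFC 2109 form. Max-Age=0 asks the client to
# discard the named cookie right away.
sub expire_cookie_rfc2109 {
    my ($name, $path) = @_;
    return "Set-Cookie: $name=\"\"; Version=\"1\"; Path=\"$path\"; Max-Age=0\r\n";
}

# Hypothetical helper: Netscape-style form. An expires date that has
# already passed has the same effect.
sub expire_cookie_netscape {
    my ($name, $path) = @_;
    return "Set-Cookie: $name=; path=$path; expires=Thu, 01-Jan-1970 00:00:01 GMT\r\n";
}

print expire_cookie_rfc2109("Customer", "/acme");
print expire_cookie_netscape("Customer", "/acme");
```

The name and path must match those of the cookie being removed, since clients group cookies by path.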
Applications and implications of cookies
• Click tracking
– can be used to correlate a user’s activity at many different sites.
– Doubleclick.com pays a web site to place an <img src=> tag on the site’s page.
– Causes an advertising banner and a cookie from Doubleclick.com to be loaded into the client when the site’s page is referenced.
– Firms like Doubleclick maintain a unique id per client machine, but have no way to determine the user’s name or other info unless the user supplies it.
Applications of cookies
• Content customization
– Cookies can be used to remember user preferences and customize content to suit those preferences.
– Firms like Doubleclick can record past browsing patterns and target advertising based on the reference pattern and where they are currently browsing.
Refer links
• User looking at page www.cs.cmu.edu/~droh/755/foo.html clicks a link to kittyhawk.cmcl.cs.cmu.edu/bar.html
• Browser sends a referer (sic) header to identify the source page of the request