Page 1: Webbasics

WEB BASICS

Page 2: Webbasics

URI (Uniform Resource Identifier)

A Uniform Resource Identifier (URI) is a general term for a string that identifies a resource on the Internet or a private intranet.

URIs were originally defined as two types:

a) Uniform Resource Locators (URLs), which are addresses with network locations. Syntax of a URL:

protocol://host[:port]/url-path

Example of a URL:

http://www.javafasttrack.com/index.html

b) Uniform Resource Names (URNs), which are persistent names that are address independent. A URN is basically used for aliasing a URL.

Two types of URL:

a) An absolute URL designates the protocol, host, path and name of a resource.

Ex: <A HREF="http://www.awl.com/index.html">Absolute URL to AWL</A>

b) A relative URL is not fully qualified; it inherits the protocol, host and path information from its parent document.

Ex: <A HREF="link.html">Relative URL to document called link.html</A>
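As a rough illustration of the URL syntax and of how a relative URL inherits from its parent document, here is a minimal Python sketch using the standard urllib.parse module. The explicit port 80 and the parent page under /docs/ are assumptions added for the example; they are not taken from the slides.

```python
from urllib.parse import urlsplit, urljoin

# Split an absolute URL into the parts named in the syntax
# protocol://host[:port]/url-path. The ":80" is added here for illustration.
parts = urlsplit("http://www.javafasttrack.com:80/index.html")
print(parts.scheme)    # the protocol
print(parts.hostname)  # the host
print(parts.port)      # the port (None when the URL omits it)
print(parts.path)      # the url-path

# Resolve a relative URL against its parent document.
# The parent URL below is hypothetical.
parent = "http://www.awl.com/docs/index.html"
resolved = urljoin(parent, "link.html")
print(resolved)
```

The relative reference keeps the parent's protocol, host and directory and only replaces the final document name.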

Definitions

Page 3: Webbasics

URL Encoding

URL encoding involves replacing all unsafe and non-printable characters with a percent sign followed by 2 hexadecimal digits corresponding to the character's ASCII value.

Unsafe characters are those that may be altered by network hardware or software.

Assume that we would like to reference a resource called sun's java.html stored on the server java.sun.com. According to the rule, the apostrophe (') and the space are unsafe and must be encoded. The encoded URL looks like this:

http://java.sun.com/sun%27s%20java.html
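The encoding of this file name can be reproduced with Python's standard urllib.parse module. This is only a sketch: quote() writes the space as %20, while quote_plus() writes it as "+", the convention used for HTML form data.

```python
from urllib.parse import quote, quote_plus, unquote

name = "sun's java.html"

# Percent-encode every unsafe character in the file name.
encoded = quote(name, safe="")
# Form-style encoding: the space becomes "+" instead of "%20".
form_style = quote_plus(name)

url = "http://java.sun.com/" + encoded
print(url)
print(form_style)

# Decoding restores the original name.
print(unquote(encoded))
```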

What characters need to be encoded and why?

ASCII Control characters   

Why: These characters are not printable.

Non-ASCII characters   

Why: These are by definition not legal in URLs since they are not in the ASCII set.

Definitions

Page 4: Webbasics

"Reserved characters"

Why: URLs use some characters for special use in defining their syntax. When these characters are not used in their special role inside a URL, they need to be encoded.

concepts

Page 5: Webbasics

concepts

"Unsafe characters"

Why: Some characters present the possibility of being misunderstood within URLs for various reasons. These characters should also always be encoded.

Page 6: Webbasics

• How are characters URL encoded?

The URL encoding of a character consists of a "%" symbol followed by the character's two-digit hexadecimal code.

Ex:

Space = 32 as a decimal code point.

32 decimal = 20 in hexadecimal.

The URL-encoded representation is therefore "%20".
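The same arithmetic can be written as a short Python sketch. The function name percent_encode_char is hypothetical, and the sketch deliberately handles ASCII characters only.

```python
def percent_encode_char(ch: str) -> str:
    """Return the %XX form of a single ASCII character."""
    code_point = ord(ch)  # e.g. 32 for a space
    if code_point > 127:
        raise ValueError("this sketch only handles ASCII characters")
    # Format the code point as two uppercase hexadecimal digits.
    return f"%{code_point:02X}"


print(ord(" "))                  # decimal code point of the space
print(percent_encode_char(" "))  # its URL-encoded form
print(percent_encode_char("'"))  # the apostrophe
```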

concepts

Page 7: Webbasics

Web Browser

A web browser lets the user request something from the server and shows the user the result of the request. Browser software interprets the markup of files in HTML, formats them into Web pages, and displays them to the user.

Currently 2 web browsers are widely used:

(a) Mozilla Firefox
(b) Microsoft Internet Explorer

Definition

Page 8: Webbasics

Web Server

1) A web server is essentially a computer program which is responsible for handling HTTP requests. A browser requests a page; the server accepts the request and returns the page, which the browser then displays.

2) From a general point of view we consider web servers as the storage area for files which are available on the web. So in order for any page to be viewable on the web it must be loaded on to the web server. A web server is usually a dedicated piece of hardware and software used to allow a website to be displayed on the net.

Some of the web servers are :

a) Microsoft Internet Information Server.
b) Apache Web Server.
c) Macromedia JRun Web Server.

Definitions

Page 9: Webbasics

Browser/Server Communication

The message passed from the browser to the web server is known as an HTTP request. When the web server receives this request, it checks its stores to find the appropriate page. If the web server finds the page, it packages the HTML in an HTTP response addressed to the browser and sends it back across the network (over TCP). If the web server cannot find the requested page, it issues a page containing an error message (e.g. Error 404: Page Not Found) and dispatches that page to the browser instead. The message sent from the web server to the browser is known as the HTTP response.
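The exchange described above can be tried locally with Python's standard library. The following is a minimal sketch, not a production server: the PAGES dictionary stands in for the server's "stores", and the page content and paths are made up for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page store: request path -> HTML bytes.
PAGES = {"/index.html": b"<html><body>Hello</body></html>"}


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            # Page not found: answer with an error page.
            self.send_error(404, "Page Not Found")
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(port, path):
    """Send one GET request and return (status code, body bytes)."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", path)
    response = conn.getresponse()
    result = (response.status, response.read())
    conn.close()
    return result


server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

found_status, found_body = fetch(server.server_port, "/index.html")
missing_status, _ = fetch(server.server_port, "/missing.html")

server.shutdown()
server.server_close()
print(found_status, missing_status)
```

The first request finds its page and gets status 200; the second asks for a page that is not in the store and gets the 404 error page.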

concepts

Page 10: Webbasics

Browser / Server Communication

Diagram

Page 11: Webbasics

Common Gateway Interface(CGI)

• The Common Gateway Interface (CGI) is a specification defined by the World Wide Web Consortium (W3C), defining how a program interacts with a Hyper Text Transfer Protocol (HTTP) server. CGI provides the middleware between WWW servers and external databases and information sources. CGI applications perform specific information processing, retrieval, and formatting tasks on behalf of WWW servers.

• CGI predates servlets. The drawback of CGI is that each request runs in its own separate process, so memory consumption is higher.
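To make the CGI model more concrete, here is a small Python sketch of a CGI-style program. Under CGI the server starts the program for each request, passes request details in environment variables such as QUERY_STRING, and forwards whatever the program prints (headers, a blank line, then the body) to the browser. The parameter called name and the greeting page are hypothetical.

```python
import html
import os
from urllib.parse import parse_qs


def handle_cgi_request(environ) -> str:
    """Build the CGI output (headers, blank line, body) for one request."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical query parameter used for illustration.
    name = query.get("name", ["world"])[0]
    body = f"<html><body>Hello, {html.escape(name)}!</body></html>"
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When run by a web server as a CGI program, the process environment
    # holds the request data and standard output goes back to the browser.
    print(handle_cgi_request(os.environ), end="")
```

Because the server launches a fresh process like this one for every request, the per-request memory cost noted above follows directly.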

concepts

Page 12: Webbasics

HTTP

• HTTP is the protocol used for information exchange on the WWW. HTTP defines how messages are formatted and transmitted, and what actions an HTTP server and an HTTP client (which in most cases is a browser) should take in response to various messages. HTTP uses a reliable, connection-oriented transport service such as TCP. HTTP is a stateless protocol: each request is interpreted independently, without any knowledge of the requests that came before it. HTTP follows a request/response model: in the communication between browser and server, the client makes a request, the server responds to it, and the connection is closed.

A simple Web transaction has 4 steps:

1. The client opens the connection to the server. The client opens a TCP/IP connection to the server. Since the TCP/IP connection is established down in the transport layer of the protocol stack, there is not much HTTP activity at this stage.

2. The client makes a request to the server. This is where we see the first HTTP syntax. Let's assume that the web browser makes a very basic request to retrieve an HTML file. The URL entered into the browser might look like: http://www.awl.com/index.html. The HTTP request sent by the browser to the server would look something like this:

GET /index.html HTTP/1.0

This request can be broken into 3 parts: the request method, the resource name and the protocol. GET is an HTTP method requesting that the server send the file.

concepts

Page 13: Webbasics


3. The server responds to the request. The server responds with a status code, various header fields and, if possible, the content of the requested file.

If the file requested in step 2 is available and the client has proper authorization, the server response may look something like this:

HTTP/1.0 200 OK
Server: Netscape-Enterprise/2.01
Content-Type: text/html
Content-Length: 87

The first line of the response indicates the server's protocol and returns a status code stating that the request was fulfilled successfully. The OK message on this line provides a human-readable description of status code 200.

4. The connection is closed. The TCP connection may be closed by either the server or the client. Usually, it is the server that terminates the connection after the response has been sent.
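The request line of step 2 and the response head of step 3 can be sketched in Python as plain text handling. The helper names build_request and parse_response_head are hypothetical, and the sample response simply reuses the header values from the example above.

```python
def build_request(method: str, resource: str, protocol: str = "HTTP/1.0") -> bytes:
    """Build a request line (method, resource name, protocol) plus a blank line."""
    return f"{method} {resource} {protocol}\r\n\r\n".encode("ascii")


def parse_response_head(raw: bytes):
    """Split a response into protocol, status code, reason phrase and headers."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    protocol, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return protocol, int(code), reason, headers


request = build_request("GET", "/index.html")

# Sample response head using the values from the example above.
sample_response = (
    b"HTTP/1.0 200 OK\r\n"
    b"Server: Netscape-Enterprise/2.01\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 87\r\n"
    b"\r\n"
)
protocol, code, reason, headers = parse_response_head(sample_response)
print(request)
print(protocol, code, reason)
print(headers["content-type"], headers["content-length"])
```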

Page 14: Webbasics

Connectionless Protocol

• Using a connectionless protocol, the client opens a connection with the server, sends the request, receives the response and closes the connection.

Connectionless describes communication between two network end points in which a message can be sent from one end point to another without prior arrangement. The device at one end of the communication transmits data to the other, without first ensuring that the recipient is available and ready to receive the data. The device sending a message simply sends it addressed to the intended recipient. As such there are more frequent problems with transmission than with connection-oriented protocols and it may be necessary to resend the data several times.

The advantage of a connectionless protocol is:

- It holds the connection open only long enough to service the request, so very few server resources are required.

The drawback of a connectionless protocol is that a connection has to be established for every request.

Page 15: Webbasics

Stateless Protocol

• A protocol is said to be stateless if it has no memory of prior connections and cannot distinguish one client's requests from another's. HTTP is a stateless protocol. Because of its stateless nature, HTTP consumes fewer server resources and can support more simultaneous users, since there are no client credentials or connections to maintain.

This does not require the user to start an instance of the application as all access to the application is via a web browser. The application, which is running under a web server, knows nothing about any particular client until it receives an HTTP request. When it receives a request the following happens:

• The web server creates and activates a child process to deal with the request, or it may reactivate an existing process which is currently "sleeping".

• The child process deals with the request and generates a response, which the web server will send back to the client device.

• The child process then dies, or puts itself in a "sleep" condition.

Definitions

Page 16: Webbasics

The GET method

• The GET method is so called because the browser uses the HTTP GET command to submit the data. The GET command sends a URL to the Web server. If the form's data is sent to the Web server using the GET command, the browser must include all data in the URL. The key feature of the GET method of data submission is that the values of all the fields are concatenated and appended to the URL. An example of a GET request is as follows:

GET /login.html?username=dustin&password=servlets HTTP/1.0
User-Agent: Mozilla/4.02 [en] (Win95;)
Accept: image/gif, image/jpeg, image/pjpeg

- The first line of the example shows that the GET method is requesting the file login.html using the HTTP/1.0 protocol and passes two parameters called username and password, with the values dustin and servlets.

- The User-Agent header field conveys the type of browser that initiated the request. In this case, the browser is the English version of Netscape Navigator 4.02 (which identifies itself as Mozilla/4.02) running on Windows 95.

- The Accept header field indicates the file types supported by the client.

The information can be passed to the server by manually appending the query string to the URL or by an HTML form. The above GET request may be generated by the browser in response to the user requesting the URL:

http://www.sourcestream.com/login.html?username=dustin&password=servlets
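As a small illustration of how the query string carries the field values, the following Python sketch splits the URL above into its path and parameters and then rebuilds the request line. It uses only the standard urllib.parse module.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

url = "http://www.sourcestream.com/login.html?username=dustin&password=servlets"

# Everything after "?" is the query string holding the name=value pairs.
parts = urlsplit(url)
params = parse_qs(parts.query)
print(parts.path)
print(params)

# Going the other way: concatenate the field values and append them to the path.
query = urlencode({"username": "dustin", "password": "servlets"})
request_line = f"GET {parts.path}?{query} HTTP/1.0"
print(request_line)
```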

Page 17: Webbasics

The POST method

• In the POST method of data submission, the Web browser uses the POST command to submit the data to the server and includes the form's data in the body of that command. The POST method can handle any amount of data, because the browser sends the data as a separate stream. The POST method should be used to send potentially large amounts of data to a web server. A POST request is typically generated by the browser in response to the user clicking the submit button on an HTML form that uses the POST method. A POST request looks like the following:

POST /login.html HTTP/1.0
User-Agent: Mozilla/4.02 [en] (win 95;i)
Accept: image/gif, image/jpeg, */*
Content-Length: 34

username=dustin&passwords=servlets

Here the user name and password are passed in the body of the request rather than in the method statement (the request line). This is the primary difference between the GET and POST methods.
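To show where the form data sits in a POST request, here is a minimal Python sketch that assembles one by hand. The helper name build_post_request is hypothetical, the Content-Type header is an addition not shown in the slide's example, and Content-Length is computed from the body rather than copied from the slide.

```python
from urllib.parse import urlencode


def build_post_request(path: str, fields: dict) -> bytes:
    """Assemble a POST request whose body carries the form fields."""
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {path} HTTP/1.0\r\n"
        # Assumption: the usual content type for HTML form submissions.
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


request = build_post_request(
    "/login.html", {"username": "dustin", "password": "servlets"}
)
# The blank line separates the request head from the body.
head, _, body = request.partition(b"\r\n\r\n")
print(head.decode("ascii"))
print(body.decode("ascii"))
```

Unlike the GET case, the request line contains only the path; the name=value pairs travel after the blank line.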

concepts

Page 18: Webbasics

MIME (Multipurpose Internet Mail Extensions)

• MIME is an extension to the electronic mail format. MIME provides 3 Internet services:

a) It provides a standard mechanism for encoding binary data into the ASCII format.

b) It defines a standard for specifying the type of content stored in the body of a message.

c) It describes a standard method for defining a multipart mail message containing heterogeneous body parts.
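The three services can be seen together in a short Python sketch built on the standard base64 and email packages. The subject line, the sample bytes and the file name data.bin are made up for the example.

```python
import base64
from email.message import EmailMessage

# (a) Encode binary data as ASCII text using Base64.
binary = bytes([0, 255, 16, 128])
ascii_text = base64.b64encode(binary).decode("ascii")
print(ascii_text)

# (b) and (c) Build a multipart message whose parts declare their content types.
message = EmailMessage()
message["Subject"] = "MIME sketch"          # hypothetical subject
message.set_content("Plain-text part.")     # becomes a text/plain part
message.add_attachment(
    binary,
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",                    # hypothetical file name
)

# Walk the message tree and list the content type of every part.
content_types = [part.get_content_type() for part in message.walk()]
print(content_types)
```

Adding the binary attachment turns the message into a multipart container holding a text part and a Base64-encoded binary part.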

concepts

Page 19: Webbasics

HTML Forms

• HTML forms provide a simple way to prompt the user for input via a formatted HTML page and allow the user to submit information to the server. Forms were the first mechanism to allow true two-way interaction on the web.

<form> tag

• The form element creates a form for user input. A form can contain text fields, checkboxes, radio buttons and more. Forms are used to pass user data to a specified URL.

• Syntax

<form action="actionpage.cfm" method="post"> ... </form>

The method attribute

The two possible methods are GET and POST.

concepts

Page 20: Webbasics

The action attribute

The action attribute specifies the action that is to be performed when the user submits the form. Usually the action attribute indicates the URL of the script or servlet that will process the user's input.

Ex:

<FORM METHOD="POST" ACTION="mailto:[email protected]">

<INPUT> tag

The <input> tag defines the start of an input field where the user can enter data.

<form action="form_action.asp" method="get">
First name: <input type="text" name="fname" value="Mickey" /><br />
Last name: <input type="text" name="lname" value="Mouse" /><br />
<input type="submit" value="Submit" />
</form>

concepts
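When a form like the one with the fname and lname fields is submitted with the GET method, the browser turns the named inputs into name=value pairs and appends them to the action URL. A rough Python sketch of that step, using the field names and default values from that form, might look like this:

```python
from urllib.parse import urlencode

# Named inputs of the form and the values the user left in them.
# The submit button has no name attribute, so it contributes nothing.
fields = {"fname": "Mickey", "lname": "Mouse"}

# Each field becomes name=value; pairs are joined with "&".
encoded = urlencode(fields)
get_url = "form_action.asp?" + encoded
print(get_url)

# A value containing a space is written with "+" in form encoding.
spaced = urlencode({"fname": "Mickey M"})
print(spaced)
```

With method="post" the same encoded string would be sent in the request body instead of the URL.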

Page 21: Webbasics

The <TEXTAREA> tag

• This tag displays a multiline text box that is useful for collecting large amounts of text from the client.

• Syntax:

<TEXTAREA NAME="name" COLS="x" ROWS="y"> Default Text </TEXTAREA>

concepts