UFCEKG-20-2 Data, Schemas & Applications
Lecture 2: Introduction to the WWW, URLs, HTTP, Services and Mashups
N. H. N. D. de Silva (Slides adapted from Prakash Chatterjee, UWE)

Jul 27, 2020



Suppose all the information stored on computers everywhere were linked, I thought. Suppose I could program my computer to create a space in which anything could be linked to anything. All the bits of information in every computer at CERN, and on the planet, would be available to me and to anyone else. There would be a single, global information space.

Tim Berners-Lee, Weaving the Web

Jan 2013, N. H. N. D. de Silva


WWW: definition

The World Wide Web (abbreviated as WWW or W3, commonly known as the Web) is a system of interlinked hypertext documents accessed via the Internet. With a web browser, one can view web pages that may contain text, images, videos, and other multimedia, and navigate between them via hyperlinks.

Wikipedia : World Wide Web

Concept originally proposed by Sir Tim Berners-Lee (1989), based on earlier hypertext systems. Berners-Lee and Belgian computer scientist Robert Cailliau proposed in 1990 to use hypertext "to link and access information of various kinds as a web of nodes in which the user can browse at will", and they publicly introduced the project in December of the same year.


Mashups

o Originally a term for the sampling and mixing of two pieces of music together. Here the term refers to web applications which combine data from multiple sources to create added-value sites.
o Mashup (on Wikipedia)
o Programmable Web, run by John Musser, tracks the emerging collection of mashups.
o Here we review the basic mechanisms for integration. Next week we will cover the basics of XML, one of the data formats widely used for integration and configuration.


Mashup pre-requisites

o HTTP protocol
o client-server interaction
o URI scheme
o HTML/HTML Forms - the simplest Mashup technique
o media type (MIME type, Content-Type)
o URL encoding
o Character encoding


HTTP

o Request
   Query string, attached files and information about the client. The server can access all this data to determine the appropriate response.

o Response
   Document formatted by the server, with a wrapper (the response headers) which identifies the kind of content via Content-Type.

o GET
   Query string appended to the URI - limited length, exposes the parameter names, easy to edit, useful for development. Formally only for requests which read data and don't update.

o POST
   Query string passed in the HTTP request body - no length limit imposed by the URI, keeps the parameters out of the URI, use for sending data to the server for update.

o PUT
   Add a resource to the remote store (or replace the one at that URI).

o DELETE
   Delete a resource.

Often authentication is required - username/password passed in the HTTP header.
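As a rough illustration of the difference between the methods, the sketch below builds (but never sends) requests with Python's standard `urllib.request`. The endpoint, the `year` parameter and the login `student:secret` are made-up placeholders, not part of the lecture material.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://www.example.com/modules/dsa/index.html"  # hypothetical endpoint
params = {"year": "2012"}                                # hypothetical parameter

# GET: the query string is appended to the URI and stays visible in it.
get_request = Request(BASE + "?" + urlencode(params))

# POST: the same parameters travel in the request body (as bytes).
post_request = Request(
    BASE,
    data=urlencode(params).encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

# PUT and DELETE have to be named explicitly.
delete_request = Request(BASE, method="DELETE")

# HTTP Basic authentication: base64("user:password") in the Authorization header.
credentials = base64.b64encode(b"student:secret").decode("ascii")  # made-up login
get_request.add_header("Authorization", "Basic " + credentials)

# Nothing is sent over the network; we only inspect what would be sent.
for request in (get_request, post_request, delete_request):
    print(request.get_method(), request.full_url, request.data)
```

Passing `data` is what switches `Request` from GET to POST; actually sending any of these would need `urllib.request.urlopen`.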


HTTP interaction

o This sequence diagram explains the main processes in the HTTP protocol. It is the foundation for much of the interaction on the web.

[Figure: Client-server interaction with HTTP]


HTTP interaction

o We can think of an HTTP request/response as a remote procedure call (RPC). There are other, more low-level mechanisms for RPC which are useful in special circumstances, but the web is built around HTTP.

o Applications built on HTTP interaction are often called RESTful. REST is an abbreviation which stands for Representational State Transfer.

o Strictly, this is a well-defined architectural style in which the HTTP operations are used in a specific restricted sense, and unique URIs identify each resource in the application.

o Informally, it refers to any interface to a site in which all data is requested and transmitted via HTTP without any additional layers such as those found in SOAP and Web Services, and the state of the interaction is passed in the request.
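To make the "strict" sense a little more concrete, here is a toy in-memory sketch in Python. It is not a real server: the `store` dictionary, the `handle` function and the status codes chosen are illustrative assumptions. It shows each URI path naming exactly one resource and each HTTP method having one restricted meaning.

```python
# In-memory stand-in for a server: each URI path identifies one resource.
store = {}

def handle(method, path, body=None):
    """Return (status_code, body) for a single self-contained request."""
    if method == "GET":       # read only, never changes the store
        if path in store:
            return (200, store[path])
        return (404, None)
    if method == "PUT":       # create or replace the resource at this URI
        created = path not in store
        store[path] = body
        return (201 if created else 200, body)
    if method == "DELETE":    # remove the resource
        if path in store:
            del store[path]
            return (204, None)
        return (404, None)
    return (405, None)        # method not supported in this sketch

print(handle("PUT", "/modules/dsa", "Data, Schemas & Applications"))
print(handle("GET", "/modules/dsa"))
print(handle("DELETE", "/modules/dsa"))
print(handle("GET", "/modules/dsa"))
```

Every call carries everything needed to process it (method, URI, body), which is the sense in which the state of the interaction travels in the request.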


Three essential technologies: URI, HTML & HTTP

1. a system of globally unique identifiers for resources on the Web and elsewhere, the Universal Document Identifier (UDI), later known as Uniform Resource Locator (URL) and Uniform Resource Identifier (URI);

2. the publishing language HyperText Markup Language (HTML);

3. the Hypertext Transfer Protocol (HTTP).


Anatomy of a URI

o Uniform Resource Identifier (more general than URL). The structure of a URI is defined by the URI scheme. URIs are case-sensitive in general, except for the scheme and the hostname, which are case-insensitive.

http://www.example.com/modules/dsa/index.html?year=2012
<scheme> : <hierarchical part> [ ? <query> ] [ # <fragment> ]

o http (scheme)
o //www.example.com/modules/dsa/index.html (hierarchical part)
   o user info - terminated by @
   o hostname - www.example.com
   o port - :80
   o path - /modules/dsa/index.html
o year=2012 (query parameter)
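The same decomposition can be tried out with `urlsplit` from Python's standard library. The sketch below extends the slide's example with a user name (`alice`), an explicit port and a fragment (`week2`); those three additions are assumptions made only so that every component shows up.

```python
from urllib.parse import urlsplit

# The slide's example URI, extended with hypothetical user info, port and fragment.
uri = "http://alice@www.example.com:80/modules/dsa/index.html?year=2012#week2"
parts = urlsplit(uri)

print("scheme   :", parts.scheme)     # http
print("user info:", parts.username)   # alice
print("hostname :", parts.hostname)   # www.example.com
print("port     :", parts.port)       # 80
print("path     :", parts.path)       # /modules/dsa/index.html
print("query    :", parts.query)      # year=2012
print("fragment :", parts.fragment)   # week2
```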


URI Scheme names

o http - The most common scheme name - Hypertext Transfer Protocol. Typically web pages are requested and delivered using this protocol.
o https - secure HTTP
o mailto - an email address - usually handled by the browser handing responsibility to another application
o file - read a local file (but do not execute it)
o ftp - file transfer

any others?


URI hierarchical part

o user info - e.g. [email protected]

o hostname - www.gmail.com - converted by DNS to an IP address - 74.125.230.214

o port - e.g. :80 - default http port

o path ‐ /modules/dsa/index.html 


Query String

o Parameters passed to the script. Multiple parameters are passed in several common forms:

o delimited values are positional and delimited by a special character such as ";"
   o /modules/dsa/index.html;2
   o where the two parameters are /modules/dsa/index.html and 2

o keyword/value pairs: each parameter value is passed as a keyword=value pair, with pairs separated by &. This is the form used by HTML forms. The order of the parameters is not significant.
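Both forms can be explored with Python's `urllib.parse`. This is only a sketch: the parameter names `year`, `module` and `term` are invented for the example. It builds a keyword=value query string (with URL encoding applied to the values), parses it back, and checks that parameter order does not matter.

```python
from urllib.parse import parse_qs, urlencode

# Keyword/value form, as produced by an HTML form (hypothetical parameters).
query = urlencode({"year": 2012, "module": "Data, Schemas & Applications"})
print(query)  # spaces become "+", and "," and "&" inside a value are percent-encoded

# Parsing gives back the original values; each keyword maps to a list of values.
print(parse_qs(query))

# Order of keyword/value pairs is not significant.
print(parse_qs("year=2012&term=1") == parse_qs("term=1&year=2012"))

# Delimited (positional) form from the slide: split on the special character.
positional = "/modules/dsa/index.html;2".split(";")
print(positional)
```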


Fragment

o address a place within a document - place marked as
   <a name="fragid">


Uses of URIs

o Destination of HTTP request
   o a link in an HTML document body
      <a href="http://en.wikipedia.org/wiki/URI">URI</a>
   o a link in an HTML document head
      <link rel="alternate" type="application/rss+xml" href="news.rss"/>
   o typed into the location bar in a browser - or editing an existing URI
   o created in a browser by JavaScript
      document.location = "http://en.wikipedia.org/wiki/" + term
   o used by the JavaScript AJAX technique to add interactivity to a web page
   o created by a server script e.g. PHP
      $x = file("http://en.wikipedia.org/wiki/$term");

o Unique id for a resource
   o XML namespaces - http://www.w3.org/1999/xhtml
   o semantic web resource id - http://www.cems.uwe.ac.uk/rdffold/moduleRun/UFIEKG-20-2
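When a URI is assembled from a user-supplied term, as in the JavaScript and PHP one-liners on this slide, the term normally needs URL encoding first. A minimal Python sketch of the same idea follows; the helper name `wikipedia_uri` is invented for illustration.

```python
from urllib.parse import quote

def wikipedia_uri(term):
    """Build a URI for a term, percent-encoding characters that are not URI-safe."""
    return "http://en.wikipedia.org/wiki/" + quote(term)

print(wikipedia_uri("World Wide Web"))  # spaces are percent-encoded
print(wikipedia_uri("AT&T"))            # "&" would otherwise look like a separator
print(wikipedia_uri("Zürich"))          # non-ASCII text is encoded as UTF-8 bytes
```

The last call also touches on character encoding: `quote` encodes the text as UTF-8 by default before percent-encoding the bytes.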


URI re-write

o URIs are often re-written by the server, e.g. using Apache mod_rewrite, to map to a different internal location.

o http://www.cems.uwe.ac.uk/rdffold/module/UFIEKG-20-2

o re-written to http://fold.cems.uwe.ac.uk:8080/exist/servlet/db/fold1/rdf/rdf.xq?p=module/UFIEKG-20-2


o This allows the actual server, file locations and script languages to be changed while providing a stable resource identifier.

"Any software problem can be solved by adding another layer of indirection."

Steve Bellovin of AT&T Labs
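The re-write idea can be imitated in a few lines of Python. This is only a sketch of the mapping between the two URIs quoted in these slides, using a regular expression. It is not the server's actual mod_rewrite configuration, and the names `INTERNAL_BASE`, `RULE` and `rewrite` are made up for the example.

```python
import re
from urllib.parse import urlsplit

# Assumed internal location, taken from the re-written URI shown in the slides.
INTERNAL_BASE = "http://fold.cems.uwe.ac.uk:8080"
# Everything after /rdffold/ becomes the value of the p parameter.
RULE = re.compile(r"^/rdffold/(.+)$")

def rewrite(public_uri):
    """Map a stable public URI to its (changeable) internal location."""
    path = urlsplit(public_uri).path
    match = RULE.match(path)
    if match is None:
        return public_uri  # no rule applies: leave the URI untouched
    return f"{INTERNAL_BASE}/exist/servlet/db/fold1/rdf/rdf.xq?p={match.group(1)}"

print(rewrite("http://www.cems.uwe.ac.uk/rdffold/module/UFIEKG-20-2"))
```

If the internal server or script ever changes, only `INTERNAL_BASE` and the replacement string need editing; the public URI stays the same.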


Form interface to create URI

o The simplest way to reuse another application is to create a new form to create the appropriate URIs. To understand the application is to understand the interface, the scripts, the parameters to scripts and the range and meaning of parameter values.
o Here the example is a site in the US run by NOAA which gathers data on weather observation stations at sea.
   o UK buoys
   o Buoy near Pembroke
   o Wind speed


HTML

o Hypertext Markup Language (HTML) is the language of the Web
   o Hypertext because the Web is a hypermedia system
   o Markup because documents are encoded using text
   o Language because HTML is used for communications

o Markup Languages are different from most file formats
   o many computer formats are binary encoded and not just text
   o markup allows structured documents to be encoded as just text

o Web data formats use markup as well as other encodings
   o HTML and XML are markup languages
   o JavaScript is also exchanged textually (but it's not markup)
   o images and other multimedia content are encoded as binary files


Text

o <h1>-<h6> are different levels of headings
o <p> contains paragraph text
   o whitespace and line wrapping are ignored
   o paragraphs are set as boxes containing a number of lines

o Text inside paragraphs can use additional markup (phrase markup)
   o <em> for emphasized text
   o <strong> for text with a strong emphasis
   o <sub> for subscript text
   o <sup> for superscript text
   o <q> for quoted text (try nesting quotes)
   o <code> for code examples

o rendering of all these elements is built into the browser
   o more sophisticated issues probably are more browser-dependent


More Advanced Text

o Quotations can be explicitly marked up as such
   o blockquote for block-level quotations
   o q for inline quotations (part of a block)
   o cite provides support for pointing to the source

o Preformatted text allows text formatting in the HTML source
   o pre leaves whitespace intact and usually uses monospaced fonts
   o word wrapping may be turned off by default


Lists and Tables

o HTML supports three kinds of lists
   o <ul> for unordered lists containing li
   o <ol> for ordered lists containing li
   o <dl> for definition lists containing dt/dd

o Tables are the most complex visual structure in HTML
   o <table> represents a table as a sequence of rows
   o <tr> represents a table row as a sequence of cells
   o <td> represents a table cell containing table data
   o <th> is a special cell containing header data
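The row/cell nesting described above can be seen by generating a small table from data. The Python sketch below is only illustrative: the helper `render_table` and the sample rows are invented. It uses `html.escape` so that cell text containing `<` or `&` is not mistaken for markup.

```python
from html import escape

def render_table(header, rows):
    """Render a header row (<th>) and data rows (<td>) as an HTML table."""
    lines = ["<table>"]
    lines.append("  <tr>" + "".join(f"<th>{escape(h)}</th>" for h in header) + "</tr>")
    for row in rows:
        cells = "".join(f"<td>{escape(str(cell))}</td>" for cell in row)
        lines.append("  <tr>" + cells + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)

# Hypothetical sample data: two rows describing list elements.
table_html = render_table(["Element", "Kind of list"], [["ul", "unordered"], ["ol", "ordered"]])
print(table_html)
```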


Images

o The Web is an open hypermedia system
   o hyper refers to the term hypertext for linked content
   o media refers to the fact that multiple media types are supported

o For a long time, the Web only supported text and images
   o images can be used in a variety of formats (GIF, JPEG, PNG)
   o audio and video are possible today, but not part of the Web

o Images are not part of a Web page, they are included by markup
   o img is an empty element for including images
   o src is a URI pointing to the image (often a relative URI)
      <img src="../img/portrait.png" alt="Portrait">


Links

o Links are the most important feature of the Web
   o conceptually, the Web is one large hypermedia document
   o links are based on Web identifiers, the Uniform Resource Identifier (URI)

o <a> is a link anchor and links to a URI (the link target)
   <a href="http://www.cems.uwe.ac.uk" title="CSCT UWE">CSCT</a>

o URIs can have various forms
   o http: points to resources available on Web servers
   o https: is the same but uses encrypted connections
   o URIs can use a variety of other URI Schemes

o URIs can be relative (in the same way as file names)
   o relative URIs are evaluated relative to the URI of their occurrence
   o relative URIs can use path segments such as / and ..
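How a relative URI is evaluated against the URI of the document it occurs in can be tried out with `urljoin` from Python's standard library. In this sketch the base URI is the example.com address used earlier in the lecture; `notes.html` and `/about.html` are made-up link targets, and `../img/portrait.png` is the image path from the Images slide.

```python
from urllib.parse import urljoin

# URI of the document in which the relative links occur.
base = "http://www.example.com/modules/dsa/index.html"

# A plain file name resolves within the same directory as the base document.
same_dir = urljoin(base, "notes.html")
# ".." climbs one level up the path before descending again.
parent = urljoin(base, "../img/portrait.png")
# A leading "/" restarts from the root of the same server.
root = urljoin(base, "/about.html")

print(same_dir)  # http://www.example.com/modules/dsa/notes.html
print(parent)    # http://www.example.com/modules/img/portrait.png
print(root)      # http://www.example.com/about.html
```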
