XML What is XML? XML v.s. HTML XML Components Well-formed and Valid Document Type Definition (DTD) Extensible Style Language (XSL) SAX and DOM

Dec 29, 2015

Hester Rogers

XML

What is XML?

XML vs. HTML

XML Components

Well-formed and Valid

Document Type Definition (DTD)

Extensible Stylesheet Language (XSL)

SAX and DOM

What is XML?

Extensible Markup Language (XML) is a meta-language that describes the content of a document (self-describing data).

It derives from SGML.

It is interoperable with both HTML and SGML.

XML vs. HTML

Markup languages generally combine two distinct functions of representing text (a document): the ‘look’ and the ‘structure’.

HTML and XML have different goals. HTML was designed to display data and hence focuses on the ‘look’ of the data, while XML was designed to describe and carry data and hence focuses on ‘what the data is’.

XML vs. HTML

HTML is about displaying data and XML is about describing data. HTML and XML are complementary to each other.

HTML explicitly defines a set of legal tags, for example <TABLE>…</TABLE>

XML allows any tags to be used; you can create new tags, for example <BOOK>…</BOOK>

XML Components

Prolog

Defines the XML version, entity definitions, and DOCTYPE

Components of the document

• Tags and attributes

• CDATA (character data)

• Entities

• Processing instructions

• Comments

XML Prolog

XML files should start with a prolog

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>

The version attribute is required in the XML declaration

The encoding identifies the character set (default UTF-8)

The standalone value indicates whether an external document is referenced for DTD or entity definitions

The prolog can contain entity and DTD definitions

Prolog Example

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE authors [
<!ELEMENT authors (name)*>
<!ELEMENT name (firstname, lastname)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
]>
<authors>
  <name>
    <firstname>James</firstname>
    <lastname>Gosling</lastname>
  </name>
  …
</authors>
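
As a rough illustration, a document with an internal DTD like the one above can be handed to a parser. The sketch below uses Python's standard-library xml.etree.ElementTree, which reads the internal DTD subset but does not validate against it. The elided entries are simply left out, and the helper name list_authors is made up for this example.

```python
import xml.etree.ElementTree as ET

# The prolog example from the slide, with the elided entries left out.
DOCUMENT = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE authors [
<!ELEMENT authors (name)*>
<!ELEMENT name (firstname, lastname)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
]>
<authors>
  <name>
    <firstname>James</firstname>
    <lastname>Gosling</lastname>
  </name>
</authors>"""


def list_authors(xml_text):
    """Return (firstname, lastname) pairs for each <name> under the root."""
    root = ET.fromstring(xml_text)
    return [
        (name.findtext("firstname"), name.findtext("lastname"))
        for name in root.findall("name")
    ]


print(list_authors(DOCUMENT))
```

The parser accepts the DOCTYPE block but never checks the content against the ELEMENT declarations; that would need a validating parser.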

XML DOCTYPE

Document Type Declarations

Specifies the location of the DTD defining the syntax and structure of elements in the document

Common forms:

<!DOCTYPE root [DTD]>
<!DOCTYPE root SYSTEM URL>
<!DOCTYPE root PUBLIC FPI-identifier URL>

The root identifies the starting element (root element) of the document

The DTD can be external to the XML document, referenced by a SYSTEM or PUBLIC URL

• PUBLIC URL refers to a DTD intended for public use

• SYSTEM URL refers to a private DTD (located on the local file system or an HTTP server)

DOCTYPE Examples

<!DOCTYPE book SYSTEM "book.dtd">

book must be the root element

DTD located in the same directory as the XML document

<!DOCTYPE book SYSTEM "http://vishnu.cs.lamar.edu/~jingw/book.dtd">

DTD located on the HTTP server vishnu.cs.lamar.edu

XML DOCTYPE

Specifying a PUBLIC DTD

<!DOCTYPE root PUBLIC FPI-identifier URL>

The Formal Public Identifier (FPI) has four parts:

1. Connection of the DTD to a formal standard

   -  if you are defining the DTD yourself
   +  if a nonstandards body has approved the DTD
   ISO  if approved by a formal standards committee

2. Group responsible for the DTD

3. Description and type of document

4. Language used in the DTD

PUBLIC DOCTYPE Example

<!DOCTYPE Book
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<!DOCTYPE CWP
  PUBLIC "-//Prentice Hall//DTD Core Series 1.0//EN"
  "http://www.prenticehall.com/DTD/Core.dtd">

XML Root Element

Required for XML-aware applications to recognize the beginning and end of the document; it is the first element. All other elements must be nested within this root element.

Example:

<?xml version="1.0" ?>
<book>
  <title>123</title>
  …
</book>

XML Tags

Tag names:

• Case sensitive

• Start with a letter or underscore

• After the first character, numbers, - and . are allowed

• Cannot contain whitespace

• Avoid use of the colon except for indicating namespaces

Tags can have attributes

<message to="[email protected]" from="[email protected]">
  <priority/>
  <text> what did you do ?</text>
</message>

All XML elements must have closing tags (an empty element such as <priority/> closes itself).

Document CDATA

CDATA (character data) is not parsed

<?xml version="1.0" encoding="UTF-8"?>
<server>
  <port status="accept">
    <![CDATA[8001 <= port < 9000]]>
  </port>
</server>
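
To make the effect of a CDATA section concrete, here is a minimal sketch using Python's standard-library xml.etree.ElementTree. It reads the port rule back as plain text and shows that the same text without the CDATA wrapper is rejected. The helper names port_rule and parses are illustrative only.

```python
import xml.etree.ElementTree as ET

SERVER_XML = """<server>
  <port status="accept"><![CDATA[8001 <= port < 9000]]></port>
</server>"""


def port_rule(xml_text):
    """Return the status attribute and the raw text of the <port> element."""
    port = ET.fromstring(xml_text).find("port")
    return port.get("status"), port.text


def parses(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(port_rule(SERVER_XML))
# Without CDATA the bare '<' characters are treated as markup and rejected.
print(parses("<port>8001 <= port < 9000</port>"))
```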

Document Entities

Entities refer to a data item, typically text

General entity references start with & and end with ;

The entity reference is replaced by its true value when parsed

The characters < > & ' " require entity references to avoid conflicts with the XML application

&lt; &gt; &amp; &quot; &apos;

Entities are user definable

<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE book [
<!ELEMENT book (title)>
<!ELEMENT title (#PCDATA)>
<!ENTITY copyright "2001, Prentice Hall">
]>
<book>
  <title>web programming, &copyright;</title>
</book>
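
The two kinds of entities can be tried out with Python's standard library. This is a sketch under stated assumptions: xml.sax.saxutils.escape handles the predefined entities (quotes only when passed an extra mapping), and xml.etree.ElementTree expands an entity declared in the internal DTD subset. The helper names to_xml_text and book_title are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# escape() covers &, < and > by default; quotes need an explicit mapping.
QUOTES = {'"': "&quot;", "'": "&apos;"}


def to_xml_text(raw):
    """Replace the five special characters with their entity references."""
    return escape(raw, QUOTES)


BOOK_XML = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE book [
<!ELEMENT book (title)>
<!ELEMENT title (#PCDATA)>
<!ENTITY copyright "2001, Prentice Hall">
]>
<book><title>web programming, &copyright;</title></book>"""


def book_title(xml_text):
    """Return the title text with the user-defined entity expanded."""
    return ET.fromstring(xml_text).findtext("title")


print(to_xml_text('a < b & "c"'))
print(book_title(BOOK_XML))
```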

Processing Instructions

Application-specific instruction to the XML processor

<?processor-instruction?>

Example

<?xml version="1.0" ?>
<?xml-stylesheet type="text/xml" href="orders.xsl" ?>
<orders>
  <order>
    <count>37</count>
    <price>49.99</price>
    <book>
      <isbn>0130896789</isbn>
      <author>Marty Hall </author>
    </book>
  </order>
</orders>
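
An application can read processing instructions back from the parsed document. The following sketch uses Python's standard-library xml.dom.minidom and a shortened version of the orders document; the function name processing_instructions is an assumption for illustration.

```python
from xml.dom import minidom

ORDERS_XML = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xml" href="orders.xsl"?>
<orders>
  <order><count>37</count><price>49.99</price></order>
</orders>"""


def processing_instructions(xml_text):
    """Return (target, data) for each processing instruction before the root."""
    doc = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    ]


print(processing_instructions(ORDERS_XML))
```

The XML declaration on the first line is not reported as a processing instruction; only the xml-stylesheet line is.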

XML Comments

Comments are the same as HTML comments

<!-- This is an xml and html comment -->

Well-formed versus Valid

An XML document is well-formed if it follows the basic syntax rules.

An XML document is valid if its structure matches a Document Type Definition (DTD) and it is well-formed.
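
A well-formedness check can be sketched with Python's standard-library xml.etree.ElementTree, which raises ParseError on syntax violations. Note the assumption: this parser is non-validating, so it cannot tell whether a document is valid against a DTD; that would require a validating parser, which is not shown here. The function name is_well_formed is illustrative.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on any syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<book><title>123</title></book>"))  # matching tags
print(is_well_formed("<book><title>123</tilte></book>"))  # mismatched close tag
print(is_well_formed("<book/><book/>"))                   # two root elements
```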

Document Type Definition (DTD)

Defines Structure of the Document

• Allowable tags and their attributes

• Attribute value constraints

• Nesting of tags

• Number of occurrences for tags

• Entity definitions

DTD Example

<?xml version="1.0" encoding="UTF-8" ?>

<!ELEMENT TVSCHEDULE (CHANNEL+)>
<!ELEMENT CHANNEL (BANNER, DAY+)>
<!ELEMENT BANNER (#PCDATA)>
<!ELEMENT DAY ((DATE, HOLIDAY) | (DATE, PROGRAMSLOT+))+>
<!ELEMENT HOLIDAY (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT PROGRAMSLOT (TIME, TITLE, DESCRIPTION?)>
<!ELEMENT TIME (#PCDATA)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT DESCRIPTION (#PCDATA)>

<!ATTLIST TVSCHEDULE NAME CDATA #REQUIRED>
<!ATTLIST CHANNEL CHAN CDATA #REQUIRED>
<!ATTLIST PROGRAMSLOT VTR CDATA #IMPLIED>
<!ATTLIST TITLE RATING CDATA #IMPLIED>
<!ATTLIST TITLE LANGUAGE CDATA #IMPLIED>

Defining Elements

<!ELEMENT name definition/type>

<!ELEMENT CHANNEL (BANNER, DAY+)>
<!ELEMENT BANNER (#PCDATA)>
<!ELEMENT DAY ((DATE, HOLIDAY) | (DATE, PROGRAMSLOT+))+>

Types

ANY: any well-formed XML data

EMPTY: element cannot contain any text or child elements

PCDATA: character data only (should not contain markup)

Elements: list of legal child elements (no character data)

Mixed: may contain character data and/or child elements (cannot constrain order and number of child elements)

Defining Elements

Cardinality

[none]  Default (one and only one instance)

?  0, 1

*  0, 1, …, n

+  1, 2, …, n

List Operators

,  Sequence (in order)

<!ELEMENT book (title, price, author)>

|  Choice (one of several)

<!ELEMENT classroom (teacher | student)>
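
The cardinality and list operators behave much like regular-expression operators. The sketch below is not how a DTD validator is implemented; it is a hand-written approximation that translates three content models from these slides into Python regular expressions and checks the order of an element's children. The MODELS table and the matches_model helper are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Each child is written as "name," so DTD operators map onto regex operators:
#   (TIME, TITLE, DESCRIPTION?)  ->  TIME,TITLE,(DESCRIPTION,)?
#   (teacher | student)          ->  (teacher,|student,)
#   (name)*                      ->  (name,)*
MODELS = {
    "PROGRAMSLOT": r"TIME,TITLE,(DESCRIPTION,)?",
    "classroom": r"(teacher,|student,)",
    "authors": r"(name,)*",
}


def matches_model(element):
    """Check the element's child sequence against its hand-written model."""
    children = "".join(child.tag + "," for child in element)
    return re.fullmatch(MODELS[element.tag], children) is not None


slot = ET.fromstring("<PROGRAMSLOT><TIME/><TITLE/></PROGRAMSLOT>")
print(matches_model(slot))  # DESCRIPTION is optional
room = ET.fromstring("<classroom><teacher/><student/></classroom>")
print(matches_model(room))  # a choice allows only one of the two
```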

Defining Attributes

<!ATTLIST element attrName type modifier>

Example

<!ELEMENT Customer (#PCDATA)>

<!ATTLIST Customer id CDATA #IMPLIED>

<!ELEMENT Product (#PCDATA)>
<!ATTLIST Product
    cost CDATA #FIXED "200"
    id CDATA #REQUIRED>

Attribute Types

CDATA

Essentially anything; simply unparsed data

<!ATTLIST Customer id CDATA #IMPLIED>

Enumeration

Attribute (value1|value2|value3) [Modifier]

Eight other attribute types

ID, IDREF, IDREFS, NMTOKEN, NMTOKENS, ENTITY, ENTITIES, NOTATION

Attribute Modifiers

#IMPLIED

Attribute is not required

<!ATTLIST Customer id CDATA #IMPLIED>

#REQUIRED

Attribute must be present

<!ATTLIST Customer id CDATA #REQUIRED>

#FIXED "value"

Attribute is present and always has this value

<!ATTLIST Product cost CDATA #FIXED "200">

Default value (applies to enumeration)

<!ATTLIST car color (red|white|blue) "white">

Defining Entities

Specify entity reference resolution in a DTD using the ENTITY keyword.

<!ENTITY name "replacement">

<!ENTITY copyright "Copyright 2001">

Limitations of DTDs

DTD itself is not in XML format – more work for parsers

Does not express data types (weak data typing)

No namespace support

Document can override external DTD definitions

No DOM support

XML Schema is intended to resolve these issues but … DTDs are going to be around for a while

Namespace

Namespaces identify collections of element type declarations so that they do not conflict with other element type declarations with the same name created by other programmers

The xml prefix is predefined and xmlns is reserved; other prefixes, such as xsl, are conventional and must be declared before use.

You can create your own namespaces

Example:

<subject>English</subject>
<subject>Thrombosis</subject>

can be differentiated by using namespaces, as in

<school:subject>English</school:subject>
<medical:subject>Thrombosis</medical:subject>
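
In a parser, a prefixed name resolves to the namespace URI bound to that prefix. The sketch below uses Python's standard-library xml.etree.ElementTree, which writes qualified names as {uri}localname. The two namespace URIs, the catalog wrapper element, and the subjects helper are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; any unique URI strings would work.
SCHOOL = "http://example.org/school"
MEDICAL = "http://example.org/medical"

CATALOG_XML = f"""<catalog xmlns:school="{SCHOOL}" xmlns:medical="{MEDICAL}">
  <school:subject>English</school:subject>
  <medical:subject>Thrombosis</medical:subject>
</catalog>"""


def subjects(xml_text, namespace_uri):
    """Return the text of every <subject> element in the given namespace."""
    root = ET.fromstring(xml_text)
    qualified_name = f"{{{namespace_uri}}}subject"
    return [element.text for element in root.iter(qualified_name)]


print(subjects(CATALOG_XML, SCHOOL))
print(subjects(CATALOG_XML, MEDICAL))
```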

XSL - Extensible Stylesheet Language

• Defines the layout of an XML document; an XSL style sheet provides the rules for displaying an XML document.

• XSLT is XSL Transformations.

• XML -> XSLT -> HTML

• In the XML document, include:

<?xml-stylesheet type="text/xsl" href="myXSL.xsl"?>

XSL Example

<?xml version="1.0" encoding="big5"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="/">
    ........ HTML....
  </xsl:template>
</xsl:stylesheet>

What is SAX?

SAX is the Simple API for XML, originally a Java-only API. SAX was the first widely adopted API for XML in Java, and is a “de facto” standard.

SAX is an event-based API. The application implements handlers to deal with the different events, much like handling events in a graphical user interface.
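
The event-based style can be sketched with Python's standard-library xml.sax module, which offers a SAX-like interface. The handler below records each event it receives; the class name EventRecorder and the helper sax_events are made up for this example.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Collect start, text, and end events in the order they arrive."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # Skip whitespace-only chunks between elements.
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        self.events.append(("end", name))


def sax_events(xml_text):
    """Parse the text and return the recorded event list."""
    handler = EventRecorder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


for event in sax_events("<book><title>123</title></book>"):
    print(event)
```

The parser never builds a tree; the application sees only the callbacks, so it must keep any state it needs itself.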

What is the Document Object Model (DOM)?

A platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure, and style of documents.

Provides APIs that let you create nodes, modify them, delete them, and rearrange them, so it is relatively easy to build a DOM tree.

A W3C-recommended tree-based API for XML and HTML documents.
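
The create, modify, delete, and rearrange operations can be sketched with Python's standard-library xml.dom.minidom, a lightweight DOM implementation. The element names follow the book example from earlier slides; the text values, the note element, and the function name build_book are hypothetical.

```python
from xml.dom import minidom


def build_book():
    """Build a small DOM tree, then modify, rearrange, and prune it."""
    doc = minidom.Document()
    book = doc.createElement("book")
    doc.appendChild(book)

    # Create nodes.
    title = doc.createElement("title")
    title.appendChild(doc.createTextNode("123"))
    book.appendChild(title)

    price = doc.createElement("price")
    price.appendChild(doc.createTextNode("49.99"))
    book.appendChild(price)

    note = doc.createElement("note")
    book.appendChild(note)

    # Modify: change the text inside <title>.
    title.firstChild.data = "Web Programming"
    # Rearrange: move <price> in front of <title>.
    book.insertBefore(price, title)
    # Delete: drop the temporary <note> element.
    book.removeChild(note)
    return doc


print(build_book().documentElement.toxml())
```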

DOM/SAX Processing

DOM is a standard. It yields a tree in memory.

SAX yields a sequence of events corresponding to XML input.

Both generally destroy attribute ordering, insignificant white space, insignificant namespace aspects, …

Verification of a signature based on DOM/SAX requires serialization to a byte stream of the DOM tree or the SAX event stream.
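
Because attribute order and similar details are lost, two byte streams that differ only in those details need to be brought to one agreed serialization before they can be compared or hashed. One way to illustrate this, which the slides do not name, is canonical XML. The sketch assumes Python 3.8 or later, where xml.etree.ElementTree.canonicalize is available; the helper same_canonical_form is made up for the example.

```python
import xml.etree.ElementTree as ET


def same_canonical_form(xml_a, xml_b):
    """Compare two documents after serializing both to canonical XML."""
    return ET.canonicalize(xml_a) == ET.canonicalize(xml_b)


# Same information, different attribute order, quoting, and empty-tag style.
FIRST = '<port status="accept" id="1"/>'
SECOND = "<port id='1'   status='accept'></port>"
CHANGED = '<port status="deny" id="1"/>'

print(same_canonical_form(FIRST, SECOND))
print(same_canonical_form(FIRST, CHANGED))
```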

Summary

XML is a meta-language for self-describing data

DOCTYPE defines the root element and location of DTD

Document Type Definition(DTD) defines the grammar of the document

Required to validate the document

Constrains grouping and cardinality of elements

XSL is defined as a language for expressing stylesheets

Is a language for transforming XML documents

Is an XML vocabulary for specifying the formatting of XML documents

DOM and SAX are the two most common low-level APIs; both are standardized in some form (SAX as a de facto standard, DOM by the W3C)

XML Resources

• XML 1.0 Specification http://www.w3.org/TR/REC-xml

• WWW Consortium’s Home Page on XML http://www.w3.org/XML/

• Sun Page on XML and Java http://java.sun.com/xml/

• Apache XML Project http://xml.coverpages.org/

• XML Resource Collection http://xml.coverpages.org/

• O’Reilly XML Resource Center http://www.xml.com/