XML Instructor: Charles Moen CSCI/CINF 4230

Jan 04, 2016


Lynne Benson

XML

Instructor: Charles Moen

CSCI/CINF 4230


XML

Extensible Markup Language

A set of rules that allow you to create your own markup language

Designed for delivering data over the Web in text files that are self-describing and readable both by computer programs and by humans

The XML specification has been maintained by the World Wide Web Consortium (W3C) since 1998

XML (Spainhour, Ray, W3Schools)

Example of an XML File

<?xml version="1.0" encoding="UTF-8" ?>
<sandwiches>
  <sandwich name="Shrimp Poorboy">
    <price>5.99</price>
  </sandwich>
  <sandwich name="Grilled Burger">
    <price>4.99</price>
  </sandwich>
</sandwiches>

The XML declaration is optional, but when it is present it must be the first thing in the document.

XML uses markup tags, like HTML, but developers can invent their own tag names.

As long as the tags follow the XML syntax rules, we can invent whatever tags and attributes are needed to describe our data.
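As a rough illustration of how a program might read such a file, the sketch below parses the sandwich example with Python's standard xml.etree.ElementTree module. The names MENU_XML and read_menu are invented for this example and are not part of the slides.

```python
import xml.etree.ElementTree as ET

# The sandwich document from the slide, with each <sandwich> properly closed.
MENU_XML = b"""<?xml version="1.0" encoding="UTF-8" ?>
<sandwiches>
  <sandwich name="Shrimp Poorboy">
    <price>5.99</price>
  </sandwich>
  <sandwich name="Grilled Burger">
    <price>4.99</price>
  </sandwich>
</sandwiches>"""


def read_menu(xml_bytes):
    """Return (name, price) pairs for every <sandwich> under the root."""
    root = ET.fromstring(xml_bytes)
    return [
        (sandwich.get("name"), float(sandwich.findtext("price")))
        for sandwich in root.findall("sandwich")
    ]


menu = read_menu(MENU_XML)
for name, price in menu:
    print(name, price)
```

The parser knows nothing about sandwiches; the program decides what the invented tag and attribute names mean.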

XML (Spainhour, Ray, W3Schools)

Problem with HTML

<h1>Beginning ASP.NET 3.5 in C# 2008</h1><h2>Matthew MacDonald</h2>

It’s difficult to get the meaning of this data by looking at the HTML elements.

XML (Yue)

HTML provides the structure of a Web page, but not the semantic meaning of its content.

<book>
  <title>Beginning ASP.NET 3.5 in C# 2008</title>
  <author>Matthew MacDonald</author>
</book>

XML can provide the semantic meaning through its markup tags.
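One way to see the benefit: because the tag names carry the meaning, a program can turn the book record into labeled fields without guessing which heading level holds which value. This is only a sketch; BOOK_XML and describe are names made up for the example.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
  <title>Beginning ASP.NET 3.5 in C# 2008</title>
  <author>Matthew MacDonald</author>
</book>"""


def describe(xml_text):
    """Map each child tag name to its text content."""
    record = ET.fromstring(xml_text)
    return {child.tag: child.text for child in record}


book = describe(BOOK_XML)
print(book["title"], "by", book["author"])
```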


XML is Portable Data

XML files are plain text files that contain markup tags

Any software that can process plain text can read XML
• Hardware independent
• Software independent
• XML can be used to exchange data between incompatible systems

XML-aware applications
• Can process XML data as long as the application “knows” the meaning of the tags
• Meaning of the tags depends on the application

XML (Ding, W3Schools)


XML Technologies

XML

XML Namespaces

DTD (Document Type Definition)
• For describing your markup language

XML Schema
• An XML-based method of describing your markup language

XSL (Extensible Stylesheet Language)
• For displaying and transforming XML documents

DOM (Document Object Model)
• Object library for manipulating an XML document as a tree
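To give a feel for the DOM item in this list, here is a small sketch that uses xml.dom.minidom, the DOM implementation in Python's standard library. The one-sandwich document is a made-up sample, not one of the slide files.

```python
from xml.dom import minidom

# A made-up, single-sandwich document used only for this illustration.
DOC = minidom.parseString(
    '<sandwiches><sandwich name="BLT"><price>5.99</price></sandwich></sandwiches>'
)

# The document is exposed as a tree of nodes that can be walked in any direction.
root_name = DOC.documentElement.tagName
price_node = DOC.getElementsByTagName("price")[0]
price_text = price_node.firstChild.data                    # text node under <price>
parent_name = price_node.parentNode.getAttribute("name")   # step up to <sandwich>

print(root_name, parent_name, price_text)
```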

XML (Yue)


XML documents must be well-formed

An XML document that conforms to the minimal XML syntax rules is well-formed

• Elements must always have a closing tag (or use the empty-element form, such as <br />)
• Tag names and attribute names are case-sensitive
• Elements must be properly nested
• All attributes must have a value
• All attribute values must be surrounded with quotes or apostrophes
• The XML declaration, if present, is on the first line
• The document has a single root element
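A parser enforces these rules for you. The sketch below, which assumes Python's standard xml.etree.ElementTree module, treats any parse error as "not well-formed"; the helper name is_well_formed is invented for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True when the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


samples = [
    "<a><b>x</b></a>",    # properly nested
    "<a><b>x</a></b>",    # improperly nested
    "<A></a>",            # tag names are case-sensitive
    "<a checked></a>",    # attribute without a value
    "<a href=x></a>",     # attribute value without quotes
    "<a/><b/>",           # more than one root element
]
for sample in samples:
    print(sample, is_well_formed(sample))
```

Only the first sample is accepted; each of the others breaks one rule from the list.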

XML (Spainhour, Ray, W3Schools)

Root Element

<?xml version="1.0" encoding="UTF-8" ?>
<sandwiches>
  <sandwich name="Shrimp Poorboy">
    <price>5.99</price>
  </sandwich>
  <sandwich name="Grilled Burger">
    <price>4.99</price>
  </sandwich>
</sandwiches>

The root element in this example is <sandwiches>.

XML (Spainhour, Ray, W3Schools)

The top-level element
• Only one
• All other elements must be nested within it

In an XHTML document, the root element is <html>


Tag Names

There are no predefined tag names; you must invent your own (or use tags that another developer invented)

Should be descriptive, so that the document can be self-describing

Should be short and concise

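Can contain letters, digits, hyphens, underscores, and periods

Must start with a letter or an underscore, not with a digit or a punctuation character such as the dollar sign, caret, percent symbol, or semicolon

Must not start with the letters “xml” in any mix of upper and lower case, because those names are reserved

Cannot contain spaces

Should not contain “.”, even though it is allowed, or “:”, which is reserved for namespaces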
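The naming rules can be approximated with a regular expression. The sketch below is an ASCII-only simplification (the XML specification also permits many non-ASCII letters), and it rejects the colon outright because that character is reserved for namespaces. NAME_PATTERN and is_acceptable_tag_name are names invented for this example.

```python
import re

# ASCII-only approximation: a letter or underscore first, then letters,
# digits, underscores, periods, or hyphens. This is narrower than the
# full XML name grammar.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def is_acceptable_tag_name(name):
    """Apply the slide's rules: legal characters and no reserved 'xml' prefix."""
    if NAME_PATTERN.fullmatch(name) is None:
        return False
    return not name.lower().startswith("xml")


for candidate in ["price", "phone-number", "_id", "2fast", "first name", "$cost", "XMLdata"]:
    print(candidate, is_acceptable_tag_name(candidate))
```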

XML (Spainhour, Ray, W3Schools)


Element Content

The text between the start tag and end tag

Content can be any of the following:

• Empty, without content: <br />

• Nested elements: <sandwich name="Shrimp Poorboy"> <price>5.99</price> </sandwich>

• Character data

• Character entities: &lt; &gt; &amp; &quot; &apos;

• Processing instructions: <?xml-stylesheet type="text/xsl" href="simple.xsl"?>

• Comments: <!-- This is a comment -->

• CDATA sections

XML (Spainhour, Ray, W3Schools)


<?xml version="1.0" encoding="UTF-8" ?>
<sandwiches>
  <sandwich name="BLT">
    <price>5.99</price>
    <ingredients>
      <![CDATA[ Bacon, lettuce, & tomato ]]>
    </ingredients>
  </sandwich>
</sandwiches>

CDATA

Can be inserted anywhere that character data can occur

All characters within a CDATA section are treated as a literal part of the character data

Begins with the characters <![CDATA[

All characters within are treated as literals and are not parsed as XML

XML (Spainhour, Ray, W3Schools)

Ends with the characters ]]>
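To see the effect of a CDATA section, the sketch below parses the same ingredient text three ways with Python's standard xml.etree.ElementTree module: inside a CDATA section, with a bare ampersand, and with the ampersand written as an entity reference. The constant and function names are invented for the example.

```python
import xml.etree.ElementTree as ET

WITH_CDATA = "<ingredients><![CDATA[Bacon, lettuce, & tomato]]></ingredients>"
WITHOUT_CDATA = "<ingredients>Bacon, lettuce, & tomato</ingredients>"
ESCAPED = "<ingredients>Bacon, lettuce, &amp; tomato</ingredients>"


def text_or_error(xml_text):
    """Return the element's text, or a marker string if parsing fails."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return "not well-formed"


for label, sample in [("CDATA", WITH_CDATA), ("bare &", WITHOUT_CDATA), ("&amp;", ESCAPED)]:
    print(label, "->", text_or_error(sample))
```

The CDATA and entity versions yield the same text; only the bare ampersand is rejected.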


<sandwich name="Poorboy"/>

Attributes

In this example, name="Poorboy" is an attribute

Name-value pair that describes a property of the element

Can be included in the start tag or an empty tag

A particular attribute can appear only once in the same tag
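A short sketch of these points, assuming Python's standard xml.etree.ElementTree module: attributes arrive as name-value pairs, and a tag that repeats an attribute is rejected by the parser. The helper name parse_attributes is invented for the example.

```python
import xml.etree.ElementTree as ET


def parse_attributes(xml_text):
    """Return the attributes of the outermost element as a dictionary."""
    return dict(ET.fromstring(xml_text).attrib)


print(parse_attributes('<sandwich name="Poorboy"/>'))

try:
    # The same attribute appears twice in one tag, which is not well-formed.
    parse_attributes('<sandwich name="Poorboy" name="BLT"/>')
except ET.ParseError as error:
    print("rejected:", error)
```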

XML (Spainhour, Ray, W3Schools)


Validation

A DTD describes your XML markup language
• Which tags can be used
• What each element can contain

A document can be tested with the DTD, and if it passes then it is valid
• Must be well-formed
• Must be free of mistakes
  ‒ No misspelled tag names
  ‒ No improper nesting
  ‒ No missing elements

Important when the document is used by software that expects a particular structure, and when separate groups of people need to agree on a common language for data exchange

XML (Spainhour, Ray, W3Schools)


DTD

Defines the structure or grammar of an XML document by describing your markup language

Used to test whether the XML document is valid

Can be internal or external

Can contain the following types of markup declarations

• ELEMENT – the XML elements

• ATTLIST – attributes of the elements

• ENTITY – characters referenced using the “&...;” syntax

• NOTATION – description of the data format

• Processing instructions

• Comments

XML (Yue, Spainhour, Ray, W3Schools)


DTD Example

If we want to maintain a phone list as an XML document, the DTD might look like the following:

XML (Yue, Spainhour, Young, W3Schools)

<!ELEMENT phonelist (person)*>

<!ELEMENT person (name,phonenumber)>

<!ELEMENT name (#PCDATA)>

<!ELEMENT phonenumber (areacode,number)>

<!ELEMENT areacode (#PCDATA)>

<!ELEMENT number (#PCDATA)>

This DTD defines a phone list that contains the name, area code and phone number of each person in the list.


Element Declarations

ELEMENT declarations define the elements, which are the “building blocks” of an XML document.


The first line declares that a “phonelist” element has element content, and it can contain zero or more “person” child elements.

<!ELEMENT phonelist (person)*>

• <!ELEMENT begins the element declaration
• phonelist is the tag name of this element
• (person)* means the content can be zero or more “person” elements
• > ends the element declaration

These three characters can be used to specify the number of elements:
• * Zero or more
• + One or more
• ? Zero or one
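The three occurrence characters behave like the same quantifiers in regular expressions. The sketch below leans on that analogy: each content model is translated by hand into a pattern over a list of child tag names. It is an illustration of the counting rules only, not how a DTD validator works, and children_match and the three constants are invented names.

```python
import re


def children_match(model_regex, child_tags):
    """Check a list of child tag names against a hand-translated content model.

    In model_regex every tag name is followed by exactly one space, and the
    child tags are joined the same way before matching.
    """
    sequence = "".join(tag + " " for tag in child_tags)
    return re.fullmatch(model_regex, sequence) is not None


ZERO_OR_MORE = r"(person )*"  # DTD: (person)*
ONE_OR_MORE = r"(person )+"   # DTD: (person)+
ZERO_OR_ONE = r"(person )?"   # DTD: (person)?

for children in ([], ["person"], ["person", "person"]):
    print(
        len(children),
        children_match(ZERO_OR_MORE, children),
        children_match(ONE_OR_MORE, children),
        children_match(ZERO_OR_ONE, children),
    )
```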

XML (Yue, Spainhour, Young, W3Schools)


Element Declarations


The second line declares that a “person” element has element content, and it must contain exactly one of each of the elements “name” and “phonenumber,” in that order.

<!ELEMENT person (name,phonenumber)>

“person” is the tag name of this element

When there are multiple child elements with commas separating the names, then the child elements must appear in that specific sequence

XML (Yue, Spainhour, Young, W3Schools)


Element Declarations


The third line declares that the content of the “name” element is simple character data.

<!ELEMENT name (#PCDATA)>

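“name” is the tag name of this element

“PCDATA” stands for “parsed character data,” text that will be parsed by the XML parser. Entity references in the text are expanded, and a “<” is read as the start of markup. An element declared as (#PCDATA) may hold only text, so child elements inside it would make the document invalid. The content can also be empty.

XML (Ding, Yue, Young, W3Schools)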

Element Declarations

<!ELEMENT phonelist (person)*>

<!ELEMENT person (name,phonenumber)>

<!ELEMENT name (#PCDATA)>

<!ELEMENT phonenumber (areacode,number)>

<!ELEMENT areacode (#PCDATA)>

<!ELEMENT number (#PCDATA)>

What can you say about the next three declarations?

XML (Yue, Spainhour, Young, W3Schools)

Element Declarations

<!ELEMENT phonelist (person)*>

<!ELEMENT person (name,phonenumber)>

<!ELEMENT name (#PCDATA)>

<!ELEMENT phonenumber (areacode,number)>

<!ELEMENT areacode (#PCDATA)>

<!ELEMENT number (#PCDATA)>

Is the following XML document valid, according to this DTD?

<?xml version="1.0" encoding="UTF-8"?>
<phonelist>
  <person>
    <name>Charles Moen</name>
    <phonenumber>
      <areacode>281</areacode>
      <number>283-3848</number>
    </phonenumber>
  </person>
</phonelist>
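One way to reason about the question is to check each element's children against its declaration. Python's standard library has no validating parser, so the sketch below does not read the DTD; it hand-translates the six declarations into a lookup table and compares child sequences against it. It ignores stray text between child elements and any attributes, and every name in it (ELEMENT_CONTENT, TEXT_ONLY, conforms, matches_phonelist_dtd) is invented for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of the element declarations. In each pattern a tag
# name is followed by exactly one space.
ELEMENT_CONTENT = {
    "phonelist": r"(person )*",          # (person)*
    "person": r"name phonenumber ",      # (name,phonenumber)
    "phonenumber": r"areacode number ",  # (areacode,number)
}
TEXT_ONLY = {"name", "areacode", "number"}  # (#PCDATA)


def conforms(element):
    """Recursively compare an element's children with the table above."""
    child_tags = "".join(child.tag + " " for child in element)
    if element.tag in TEXT_ONLY:
        return child_tags == ""  # text only, no child elements
    pattern = ELEMENT_CONTENT.get(element.tag)
    if pattern is None or re.fullmatch(pattern, child_tags) is None:
        return False
    return all(conforms(child) for child in element)


def matches_phonelist_dtd(xml_text):
    """Parse the text and check the root name and every content model."""
    root = ET.fromstring(xml_text)
    return root.tag == "phonelist" and conforms(root)


SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<phonelist>
  <person>
    <name>Charles Moen</name>
    <phonenumber>
      <areacode>281</areacode>
      <number>283-3848</number>
    </phonenumber>
  </person>
</phonelist>"""

print(matches_phonelist_dtd(SAMPLE))
```

Under these simplifications the sample document satisfies every declaration; swapping <name> and <phonenumber> would fail the sequence check for “person”.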

XML (Yue, Spainhour, Young, W3Schools)

Using an External DTD

phonelist.dtd

<!ELEMENT phonelist (person)*>
<!ELEMENT person (name,phonenumber)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT phonenumber (areacode,number)>
<!ELEMENT areacode (#PCDATA)>
<!ELEMENT number (#PCDATA)>

Use the DOCTYPE declaration to connect the XML document with an external DTD

phonelist.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE phonelist SYSTEM "phonelist.dtd">
<phonelist>
  <person>
    <name>Charles Moen</name>
    <phonenumber>
      <areacode>281</areacode>
      <number>283-3848</number>
    </phonenumber>
  </person>
</phonelist>

XML (Yue, Spainhour, Young, W3Schools)

<!DOCTYPE phonelist SYSTEM "phonelist.dtd">

• phonelist is the root element
• SYSTEM can be either SYSTEM or PUBLIC (if PUBLIC, then it must be followed by both a name and a URI)
• "phonelist.dtd" describes the location of the DTD, and can be relative or fully qualified, such as "http://sce.uhcl.edu/moenc/dtds/phonelist.dtd"


Using an Internal DTD

An internal DTD is placed inside the DOCTYPE declaration of the XML document.

phonelist.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE phonelist [
  <!ELEMENT phonelist (person)*>
  <!ELEMENT person (name,phonenumber)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT phonenumber (areacode,number)>
  <!ELEMENT areacode (#PCDATA)>
  <!ELEMENT number (#PCDATA)>
]>
<phonelist>
  <person>
    <name>Charles Moen</name>
    <phonenumber>
      <areacode>281</areacode>
      <number>283-3848</number>
    </phonenumber>
  </person>
</phonelist>

XML (Yue, Spainhour, Young, W3Schools)


More about Element Declarations

ELEMENT content can be specified in several forms.

<!ELEMENT phonelist (listitem)*>

<!ELEMENT listitem (person | department)>

<!ELEMENT department (name,phonenumber)>

<!ELEMENT person (name,phonenumber)>

<!ELEMENT name (#PCDATA)>

<!ELEMENT phonenumber (areacode,number)>

<!ELEMENT areacode (#PCDATA)>

<!ELEMENT number (#PCDATA)>

The “choice” form, as in (person | department), specifies a series of possible child elements, exactly one of which appears

The “sequence” form, as in (name,phonenumber), specifies a required sequence of child elements

<!ELEMENT misc ANY>

The “ANY” keyword means the element can have any legal content, in any order.

<!ELEMENT br EMPTY>

The “EMPTY” keyword means the element must have no content.

XML (Yue, Spainhour, Young, W3Schools)


Attribute-List Declarations

All attributes must be explicitly declared with an “ATTLIST” declaration.

<!ELEMENT phonelist (listitem)*>

<!ELEMENT listitem (person | department)>

<!ELEMENT department (name,phonenumber)>

<!ELEMENT person (name,phonenumber)>

<!ATTLIST person title CDATA #REQUIRED>

<!ELEMENT name (#PCDATA)>

<!ELEMENT phonenumber (areacode,number)>

<!ELEMENT areacode (#PCDATA)>

<!ELEMENT number (#PCDATA)>

Here, the “title” attribute is required and its value must be CDATA (character data). The keyword is written in upper case as #REQUIRED, and a required attribute cannot also declare a default value.

XML (Yue, Spainhour, Young, W3Schools)

<!ATTLIST person title (Dr|Ms|Mr) "Dr">

Here, the “title” attribute is not required; it must be one of the three values that are enumerated; and it defaults to “Dr”.

XML Namespaces

<?xml version="1.0" encoding="UTF-8"?>
<uhcl:courses xmlns:uhcl="http://www.uhcl.edu/ns">
  <uhcl:course>
    <uhcl:title>Charles Moen</uhcl:title>
    <uhcl:rubric>CSCI/CINF</uhcl:rubric>
    <uhcl:number>4230</uhcl:number>
  </uhcl:course>
</uhcl:courses>

XML (Yue, Spainhour, Young, W3Schools)

We can be sure that there is no conflict with element names by using a namespace.

The namespace must be declared before using it, and the declaration is often in the root element.

The identifier must be unique, and is usually a URL. (The URL does not have to be a valid URL of a Web page.)

The qualified element name consists of the namespace prefix, followed by a colon, followed by the local name.
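The sketch below shows how a namespace-aware parser treats the courses document, assuming Python's standard xml.etree.ElementTree module. The prefix is only shorthand: the parser replaces it with the namespace identifier, so a program may search with any prefix that maps to the same identifier. The prefix "u" and the variable names are arbitrary choices for this example.

```python
import xml.etree.ElementTree as ET

COURSES_XML = """<uhcl:courses xmlns:uhcl="http://www.uhcl.edu/ns">
  <uhcl:course>
    <uhcl:title>Charles Moen</uhcl:title>
    <uhcl:rubric>CSCI/CINF</uhcl:rubric>
    <uhcl:number>4230</uhcl:number>
  </uhcl:course>
</uhcl:courses>"""

root = ET.fromstring(COURSES_XML)

# ElementTree expands each qualified name to "{namespace identifier}local name".
expanded_root_tag = root.tag

# The prefix used for searching is local to this dictionary; "u" maps to
# the same identifier that "uhcl" maps to inside the document.
NAMESPACES = {"u": "http://www.uhcl.edu/ns"}
course_number = root.findtext("u:course/u:number", namespaces=NAMESPACES)

print(expanded_root_tag)
print(course_number)
```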


Just for Fun

XSL


XML

An XSL (Extensible Stylesheet Language) document can be used to transform the data in an XML document to an HTML document, or a document in some other format.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="simple.xsl"?>
<phonelist>
  <person>
    <name>Charles Moen</name>
    <phonenumber>
      <areacode>281</areacode>
      <number>283-3848</number>
    </phonenumber>
  </person>
</phonelist>

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <head>
        <title>Demo XSL</title>
      </head>
      <body>
        <h1>Phone List</h1>
        <table border="1" cellspacing="0" cellpadding="5" width="480">
          <tr><th>Name</th><th>Phone number</th></tr>
          <xsl:apply-templates select="phonelist/person"/>
        </table>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="person">
    <tr>
      <td><xsl:value-of select="@title"/>&#160;<xsl:value-of select="name"/></td>
      <td>
        (<xsl:value-of select="phonenumber/areacode"/>)&#160;<xsl:value-of select="phonenumber/number"/>
      </td>
    </tr>
  </xsl:template>
</xsl:stylesheet>


The XSL must be linked to the XML with the xml-stylesheet processing instruction


References

Ding, Wei. “XML.” UHCL lecture slides, 2008.

Ray, Erik T. Learning XML. O'Reilly, 2001.

Spainhour, Stephen, and Robert Eckstein. Webmaster in a Nutshell, 3rd Edition. O'Reilly, 2002.

W3Schools Online Web Tutorials. “DTD Tutorial.” [Online]. Available: http://www.w3schools.com/dtd/default.asp

W3Schools Online Web Tutorials. “XML Tutorial.” [Online]. Available: http://www.w3schools.com/xml/default.asp

Young, Michael J. XML Step by Step. Microsoft Press, 2000.

Yue, Kwok-Bun. “An Introduction to XML.” UHCL lecture notes, 2001.