
XML Programming: Introduction to XML (ITC570)

Dec 20, 2015



LEARNING OBJECTIVES

Be able to:

Understand XML technologies and their roles.

Understand different components of an XML document.

Create a well-formed XML document.


XML stands for eXtensible Markup Language

HTML is used to mark up text so it can be displayed to users

XML is used to mark up data so it can be processed by computers

HTML describes both structure (e.g. <p>, <h2>, <em>) and appearance (e.g. <br>, <font>, <i>)

XML describes only content, or “meaning”

HTML uses a fixed, unchangeable set of tags

In XML, you make up your own tags


HTML and XML look similar, because they are both SGML languages (SGML = Standard Generalized Markup Language)

Both HTML and XML use elements enclosed in tags (e.g. <body>This is an element</body>)

Both use tag attributes (e.g., <font face="Verdana" size="+1" color="red">)

Both use entities (&lt;, &gt;, &amp;, &quot;, &apos;)

More precisely,

HTML is defined in SGML

XML is a (very small) subset of SGML


HTML is for humans

HTML describes web pages

You don’t want to see error messages about the web pages you visit

Browsers ignore and/or correct as many HTML errors as they can, so HTML is often sloppy

XML is for computers

XML describes data

The rules are strict and errors are not allowed

In this way, XML is like a programming language

Current versions of most browsers can display XML

However, browser support of XML is spotty at best


<?xml version="1.0"?>
<weatherReport>
  <date>7/14/97</date>
  <city>North Place</city>, <state>NX</state> <country>USA</country>
  High Temp: <high scale="F">103</high>
  Low Temp: <low scale="F">70</low>
  Morning: <morning>Partly cloudy, Hazy</morning>
  Afternoon: <afternoon>Sunny &amp; hot</afternoon>
  Evening: <evening>Clear and Cooler</evening>
</weatherReport>

From: XML: A Primer, by Simon St. Laurent


SOME TECHNOLOGIES WE MAY COVER

HTML

Java

HTML Forms, JavaScript

XHTML & CSS

But underneath... HTTP, TCP/IP, Sockets

maybe RMI

Java servlets

JSP

Perl, PHP

SQL

XML, DTD, XML Schemas

RELAX NG

XSL, XSLT, XPath

CSS

Java SAX, DOM

JAXP

Java JDBC

Apache Tomcat

Ajax


WHY XML?

Distributed applications need to share data.

XML is plain text.

The structure and the meaning of the data are tightly defined.

Delivery of data to multiple devices.

Separation of data and presentation.


XML DOCUMENT – AN EXAMPLE

<bookshop>
  <book>
    <title>Harry Potter and the Sorcerer’s Stone</title>
    <author>
      <initials>J.K</initials>
      <surname>Rowling</surname>
    </author>
    <price value="$16.95"></price>
  </book>
  …
</bookshop>

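[Tree diagram of the bookshop document: bookshop contains book elements; a book contains title, author (with initials and surname), and price (with the attribute value).]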


DTD (Document Type Definition) and XML Schemas are used to define legal XML tags and their attributes for particular purposes

CSS (Cascading Style Sheets) describe how to display HTML or XML in a browser

XSLT (eXtensible Stylesheet Language Transformations) and XPath are used to translate from one form of XML to another

DOM (Document Object Model), SAX (Simple API for XML), and JAXP (Java API for XML Processing) are all APIs for XML parsing


XML PARSER

Required to read and manipulate XML documents.

Reads the XML document as plain text and transforms it into a data structure, typically a tree, in memory.

Applications, such as a web browser, access the data structure and process the data according to their objectives.

Example: msxml
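
As a minimal sketch of what a parser does, the snippet below uses Python's standard xml.etree.ElementTree module (an assumption made for illustration; the slide itself names msxml). It reads XML text, builds an in-memory tree, and lets the application pull values out of that tree. The sample text is a trimmed copy of the bookshop example.

```python
import xml.etree.ElementTree as ET

# A small document, handed to the parser as plain text.
xml_text = """<bookshop>
  <book>
    <title>Harry Potter and the Sorcerer's Stone</title>
    <author><initials>J.K</initials><surname>Rowling</surname></author>
    <price value="$16.95"></price>
  </book>
</bookshop>"""

# The parser turns the text into a tree of Element objects.
root = ET.fromstring(xml_text)

# The application works with the tree, not with the raw text.
book = root.find("book")
title = book.find("title").text
surname = book.find("author/surname").text
price = book.find("price").get("value")

print(root.tag, title, surname, price)
```

The application never scans for angle brackets itself; it only navigates the tree the parser produced.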


XML DOCUMENT – BASIC COMPONENTS

Elements.

Attributes.

Character and Entity References.

Character Data (CDATA).

Processing Instruction.

Comments.


ELEMENTS

Root Element (compulsory)

Branch Elements

Leaf Element

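[Tree diagram of the bookshop document: bookshop is the root element; book and author are branch elements; title, initials, surname and price are leaf elements; value is an attribute of price.]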


ELEMENT

The basic building block of XML markup.

It may contain: text, other elements (child elements),

attributes, character data, and other markup, e.g. comments.

Delimited by a start-tag and an end-tag.

An element can be empty.

The end-tag CANNOT be omitted, as it can be in HTML.

Each tag must contain a valid element type name.


ELEMENT’S NAME

Element’s Name (Tag’s name) is CASE SENSITIVE.

<BOOK> ≠ <Book> ≠ <book>

Trailing space is legal but will be ignored

<BOOK > = <BOOK>
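
Both naming rules can be checked with a parser. This is a small sketch using Python's xml.etree.ElementTree; the helper name parses is made up for the example.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Start-tag and end-tag names differ only in case: rejected.
mismatched_case = parses("<BOOK>text</book>")

# A trailing space inside the tag is ignored, so the names still match.
trailing_space = parses("<BOOK >text</BOOK>")
tag_name = ET.fromstring("<BOOK >text</BOOK>").tag

print(mismatched_case, trailing_space, tag_name)
```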


EMPTY ELEMENT

Has no content.

May be associated with attribute.

Example:

<img src='logo.png'></img>

can be abbreviated into

<img src='logo.png'/>
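
A quick way to see that the two spellings mean the same thing is to parse both and compare the results. This sketch assumes Python's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<img src='logo.png'></img>")
short_form = ET.fromstring("<img src='logo.png'/>")

# Neither element has text or children; both keep the attribute.
for img in (long_form, short_form):
    print(img.tag, img.attrib, img.text, len(img))
```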


ATTRIBUTES

Information regarding the element.

“If elements are the ‘nouns’ of XML, then attributes are its ‘adjectives’.”

<tagname attribute_name="attribute_value">

<book>
  <title>Harry Potter</title>
</book>

<book title="Harry Potter">
</book>


ATTRIBUTES VS ELEMENT

The choice is determined by the semantic content.

Attributes are characteristics of an element.

<book>
  <title>Harry Potter</title>
</book>

<book title="Harry Potter">
</book>
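
The choice also changes how a program reads the value. In this sketch (Python's xml.etree.ElementTree, chosen for illustration) the same title is read once from a child element and once from an attribute.

```python
import xml.etree.ElementTree as ET

as_element = ET.fromstring("<book><title>Harry Potter</title></book>")
as_attribute = ET.fromstring('<book title="Harry Potter"></book>')

# A child element is located in the tree and its text is read.
title_from_child = as_element.find("title").text

# An attribute is looked up by name on the element itself.
title_from_attribute = as_attribute.get("title")

print(title_from_child, title_from_attribute)
```

A child element can later grow its own children or attributes; an attribute value stays a single string.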


CHARACTER REFERENCES

Used to enter characters that are not supported by the input device (keyboard).

Example: entering £ using a US-ASCII keyboard.

Format: &#NNNNN; or &#xXXXX;

N = decimal digit

X = hexadecimal digit

Example: $ => &#36; OR &#x24;
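
The sketch below (Python's xml.etree.ElementTree, used here as an assumption) shows that the decimal and hexadecimal forms resolve to the same character, and how both forms can be derived from a character's code point.

```python
import xml.etree.ElementTree as ET

# The parser replaces each reference with the character it names.
decimal_ref = ET.fromstring("<price>&#36;16.95</price>").text
hex_ref = ET.fromstring("<price>&#x24;16.95</price>").text
pound = ET.fromstring("<price>&#163;10</price>").text

# Going the other way: build the references from a code point.
code_point = ord("$")
as_decimal = f"&#{code_point};"
as_hex = f"&#x{code_point:X};"

print(decimal_ref, hex_ref, pound, as_decimal, as_hex)
```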


ENTITY REFERENCES

Entities may be defined and used for:

Representing characters used in markup

&lt; == "<"    &amp; == "&"

Strings

&IR; == Information Retrieval

Predefined entities: &lt;, &gt;, &amp;, &quot;, &apos;
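
A short sketch of both uses, assuming Python's xml.etree.ElementTree and xml.sax.saxutils. The element names t and course are made up for the example, and the IR entity is declared in an internal DTD subset because a user-defined entity must be declared before it is used.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The five predefined entities expand to markup characters.
predefined = ET.fromstring("<t>&lt; &gt; &amp; &quot; &apos;</t>").text

# A user-defined entity that stands for a string.
doc = """<!DOCTYPE course [
  <!ENTITY IR "Information Retrieval">
]>
<course>&IR;</course>"""
expanded = ET.fromstring(doc).text

# escape() replaces &, < and > so text can be placed in a document.
escaped = escape("Sunny & hot <today>")

print(predefined, expanded, escaped)
```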


CHARACTER DATA

Used to escape blocks of text containing characters that would otherwise be recognized as markup.

<![CDATA[…]]>

<![CDATA[<greeting>Hello, world!</greeting>]]>


CHARACTER DATA (2)

<example>
<![CDATA[&Warn;-&Disclaimer;&lt;&copy 2001; &PM;&gt;]]>
</example>

is equivalent to

<example>
&amp;Warn;-&amp;Disclaimer;&amp;lt;&amp;copy 2001; &amp;PM;&amp;gt;
</example>
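
The equivalence of a CDATA section and the same text with every & escaped can be checked by parsing both. This is a sketch that assumes Python's xml.etree.ElementTree; surrounding whitespace is left out so the two strings compare exactly.

```python
import xml.etree.ElementTree as ET

# Inside CDATA nothing is treated as markup, not even & or <.
with_cdata = ET.fromstring(
    "<example><![CDATA[&Warn;-&Disclaimer;&lt;&copy 2001; &PM;&gt;]]></example>"
)

# Without CDATA every & has to be written as &amp;.
with_escapes = ET.fromstring(
    "<example>&amp;Warn;-&amp;Disclaimer;&amp;lt;&amp;copy 2001; "
    "&amp;PM;&amp;gt;</example>"
)

print(with_cdata.text)
print(with_cdata.text == with_escapes.text)
```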


PROCESSING INSTRUCTION (PI)

Processing instructions (PIs) allow documents to contain instructions for applications.

<?target … instruction … ?>

Target is used to identify the application or other object to which the PI is directed.

<?xml-stylesheet href="mystyle.css" type="text/css"?>
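
To show how an application sees a processing instruction, this sketch parses one with Python's xml.dom.minidom (chosen here because it keeps PIs in the tree). The bookshop root element is only a stand-in.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<?xml-stylesheet href="mystyle.css" type="text/css"?><bookshop/>'
)

# The PI is the first node of the document, before the root element.
pi = doc.firstChild
is_pi = pi.nodeType == minidom.Node.PROCESSING_INSTRUCTION_NODE

# target names the application; data is passed to it unparsed.
print(is_pi, pi.target, pi.data)
```

Note that the data part is a plain string; the href and type pairs only look like attributes and are interpreted by the target application.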


COMMENTS

Syntax:

<!-- comment text -->

Comments cannot be used within element tags.

<tag>… some content … <tag <!-- it is illegal -->>

Comments may never be nested.

<!-- Comments cannot <!-- be nested --> like this -->
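
These comment rules are enforced by the parser. The sketch below uses Python's xml.etree.ElementTree with a made-up helper named parses; the test strings are small variations on the examples above.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# A comment between tags is fine.
plain = parses("<tag><!-- comment text -->content</tag>")

# A comment inside a tag is rejected.
inside_tag = parses("<tag>content</tag <!-- it is illegal -->>")

# A nested comment is rejected: "--" may not appear inside a comment.
nested = parses("<tag><!-- Comments cannot <!-- be nested --> like this --></tag>")

print(plain, inside_tag, nested)
```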


Names (as used for tags and attributes) must begin with a letter or underscore, and can consist of:

Letters, both Roman (English) and foreign

Digits, both Roman and foreign

. (dot)

- (hyphen)

_ (underscore)

: (colon) should be used only for namespaces

Combining characters and extenders (not used in English)
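
A parser applies these naming rules when it reads a tag. As a rough check, assuming Python's xml.etree.ElementTree and a made-up helper named parses, the names below are tried as empty elements.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Names that follow the rules: start with a letter or underscore.
valid_names = ["_note", "book.title", "book-title", "book2", "préface"]

# Names that start with a digit, hyphen or dot.
invalid_names = ["2book", "-book", ".book"]

accepted = {name: parses(f"<{name}/>") for name in valid_names}
rejected = {name: parses(f"<{name}/>") for name in invalid_names}

print(accepted)
print(rejected)
```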


Start with <?xml version="1.0"?>

XML is case sensitive

You must have exactly one root element that encloses all the rest of the XML

Every element must have a closing tag

Elements must be properly nested

Attribute values must be enclosed in double or single quotation marks

There are only five predeclared entities: &lt;, &gt;, &amp;, &quot;, &apos;


<novel>
  <foreword>
    <paragraph>This is the great American novel.</paragraph>
  </foreword>
  <chapter number="1">
    <paragraph>It was a dark and stormy night.</paragraph>
    <paragraph>Suddenly, a shot rang out!</paragraph>
  </chapter>
</novel>


An XML document represents a hierarchy; a hierarchy is a tree

novel
  foreword
    paragraph: This is the great American novel.
  chapter number="1"
    paragraph: It was a dark and stormy night.
    paragraph: Suddenly, a shot rang out!
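
The tree view can be produced mechanically from the document. The following is a minimal sketch assuming Python's xml.etree.ElementTree; the function name outline is made up for the example.

```python
import xml.etree.ElementTree as ET

novel_xml = """<novel>
  <foreword>
    <paragraph>This is the great American novel.</paragraph>
  </foreword>
  <chapter number="1">
    <paragraph>It was a dark and stormy night.</paragraph>
    <paragraph>Suddenly, a shot rang out!</paragraph>
  </chapter>
</novel>"""

def outline(element, depth=0):
    """Return one line per element, indented by its depth in the tree."""
    attrs = "".join(f' {k}="{v}"' for k, v in element.attrib.items())
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + attrs + (f": {text}" if text else "")
    lines = [line]
    for child in element:  # iterating an element yields its children
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(ET.fromstring(novel_xml))
print("\n".join(tree_lines))
```

The recursion mirrors the nesting of the elements: each level of indentation is one step down the hierarchy.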


You can make up your own XML tags and attributes, but...

...any program that uses the XML must know what to expect!

A DTD (Document Type Definition) defines what tags are legal and where they can occur in the XML

An XML document does not require a DTD

XML is well-formed if it follows the rules given earlier

In addition, XML is valid if it declares a DTD and conforms to that DTD

A DTD can be included in the XML, but is typically a separate document

Errors in XML documents will stop XML programs

Some alternatives to DTDs are XML Schemas and RELAX NG


An element may contain other elements, plain text, or both

An element containing only text: <name>David Matuszek</name>

An element (<name>) containing only elements: <name><first>David</first><last>Matuszek</last></name>

An element containing both: <class>CIT597 <time>10:30-12:00 MW</time></class>

An element that contains both text and other elements is said to have mixed content

Mixed content is legal, but bad

Mixed content makes it much harder to define valid XML

Mixed content is more complicated to use in a program

Mixed content adds no power to XML--it is never needed for anything
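
To illustrate why mixed content is more awkward in a program, this sketch reads the two examples above with Python's xml.etree.ElementTree (an assumption made for illustration). In that API, loose text ends up in the .text and .tail fields rather than under a named child.

```python
import xml.etree.ElementTree as ET

only_elements = ET.fromstring(
    "<name><first>David</first><last>Matuszek</last></name>"
)
mixed = ET.fromstring("<class>CIT597 <time>10:30-12:00 MW</time></class>")

# Element-only content: every value is the text of a named child.
first = only_elements.findtext("first")
last = only_elements.findtext("last")

# Mixed content: the course code is loose text before the first child,
# so it sits in .text of the parent and has to be trimmed by hand.
course = mixed.text.strip()
time = mixed.findtext("time")

# Text after a child would sit in that child's .tail instead.
tail = mixed.find("time").tail

print(first, last, course, time, tail)
```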


<?xml version="1.0"?>
<weatherReport>
  <date>7/14/97</date>
  <place>
    <city>North Place</city>
    <state>NX</state>
    <country>USA</country>
  </place>
  <temperatures>
    <high scale="F">103</high>
    <low scale="F">70</low>
  </temperatures>
  <forecast>
    <time>Morning</time>
    <predict>Partly cloudy, Hazy</predict>
  </forecast>
  <forecast>
    <time>Afternoon</time>
    <predict>Sunny &amp; hot</predict>
  </forecast>
  <forecast>
    <time>Evening</time>
    <predict>Clear and Cooler</predict>
  </forecast>
</weatherReport>
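
With every value inside its own element, a program can read the report without any text scraping. This is a sketch assuming Python's xml.etree.ElementTree, with the report shortened to the parts that are read.

```python
import xml.etree.ElementTree as ET

weather_xml = """<?xml version="1.0"?>
<weatherReport>
  <temperatures>
    <high scale="F">103</high>
    <low scale="F">70</low>
  </temperatures>
  <forecast><time>Morning</time><predict>Partly cloudy, Hazy</predict></forecast>
  <forecast><time>Afternoon</time><predict>Sunny &amp; hot</predict></forecast>
  <forecast><time>Evening</time><predict>Clear and Cooler</predict></forecast>
</weatherReport>"""

root = ET.fromstring(weather_xml)

# Repeated <forecast> elements map naturally onto a dictionary.
forecasts = {f.findtext("time"): f.findtext("predict")
             for f in root.iter("forecast")}

# The temperature is element text; its scale is an attribute.
high = root.find("temperatures/high")
high_temp = int(high.text)
scale = high.get("scale")

print(forecasts, high_temp, scale)
```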


XML is designed to be processed by computer programs, not to be displayed to humans

Nevertheless, almost all current browsers can display XML documents

They don’t all display it the same way

They may not display it at all if it has errors

For best results, update your browsers to the newest available versions

Remember: HTML is designed to be viewed, XML is designed to be used


STRUCTURE OF XML DOCUMENT

An XML document has to be well-formed.

Conform to syntax requirements

Conform to a simple container structure

Common structure of XML document:

Prolog

Body

Epilog


PROLOG

Includes:

XML Declaration

<?xml version="1.0" encoding="utf-8" standalone="yes"?>

version is mandatory; encoding and standalone are optional

Document Type Declaration

<!DOCTYPE

It is not the same thing as a DTD (Document Type Definition)

A simple well-formed XML does not need it.

Schema declaration


BODY & EPILOG

Body

Contains 1 or more elements

The “contents”

Epilog

Hardly used

Can be used to identify end of document


WELL-FORMED XML DOCUMENT


Every element must have both a start tag and an end tag, e.g. <name> ... </name>

But empty elements can be abbreviated: <break />.

XML tags are case sensitive

XML tags may not begin with the letters xml, in any combination of cases

Elements must be properly nested, e.g. not <b><i>bold and italic</b></i>

Every XML document must have one and only one root element

The values of attributes must be enclosed in single or double quotes, e.g. <time unit="days">

Character data cannot contain < or &
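
Most of these rules can be tried out directly. The sketch below assumes Python's xml.etree.ElementTree, whose parser rejects any document that is not well-formed; the helper name well_formed and the sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET

def well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)

samples = {
    "start and end tag": "<name>David</name>",
    "abbreviated empty element": "<break />",
    "missing end tag": "<name>David",
    "bad nesting": "<b><i>bold and italic</b></i>",
    "two root elements": "<a/><b/>",
    "unquoted attribute value": "<time unit=days/>",
    "bare ampersand": "<t>Sunny & hot</t>",
    "bare less-than": "<t>1 < 2</t>",
}

results = {label: well_formed(text)[0] for label, text in samples.items()}

for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

The parser stops at the first violation, which matches the earlier point that errors in XML documents stop XML programs.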


XML Process

An XML document is created in an editor. The XML parser reads the document and converts it into a tree of elements.

The parser passes the tree to the browser that displays it.


Summary

XML is a meta-markup language that enables the creation of markup languages for particular documents and domains.

XML documents are created in an editor, read by a parser, and displayed by a browser.

Be careful. XML isn’t completely finished. It will change and expand, and you will encounter bugs in current XML software.