Introduction to XML

Introduction to XML. What is XML? Extensible Markup Language XML 1.0 1998 Easier-to-use subset of SGML (Standard Generalized Markup Language) XML is a.

Dec 29, 2015

Rhoda Lane

Introduction to XML

What is XML?

• Extensible Markup Language, XML 1.0 (1998)
• Easier-to-use subset of SGML (Standard Generalized Markup Language)
• XML is a text-based markup language
• Standard for data interchange on the web
• Set of rules for designing semantic tags
• Meta-markup language to define other languages
• XML 1.0 Specification: http://www.w3.org/TR/REC-xml

XML File Sample

<?xml version="1.0"?>
<dining-room>
  <manufacturer>The Wood Shop</manufacturer>
  <table type="round" wood="maple">
    <price>$199.99</price>
  </table>
  <chair wood="maple">
    <quantity>6</quantity>
    <price>$39.99</price>
  </chair>
</dining-room>
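As a minimal sketch of how a program might read this sample, the following uses Python's standard xml.etree.ElementTree module. The choice of Python and the variable names are assumptions made for illustration; the slides themselves do not prescribe a parser.

```python
import xml.etree.ElementTree as ET

# The dining-room sample from the slide, held in a string for this sketch.
SAMPLE = """<?xml version="1.0"?>
<dining-room>
  <manufacturer>The Wood Shop</manufacturer>
  <table type="round" wood="maple">
    <price>$199.99</price>
  </table>
  <chair wood="maple">
    <quantity>6</quantity>
    <price>$39.99</price>
  </chair>
</dining-room>"""

root = ET.fromstring(SAMPLE)

# Element content is reached through child elements ...
manufacturer = root.findtext("manufacturer")
chair = root.find("chair")
chair_quantity = int(chair.findtext("quantity"))

# ... while attributes are read from the element that carries them.
table = root.find("table")
table_type = table.get("type")
table_wood = table.get("wood")

print(root.tag, manufacturer, table_type, table_wood, chair_quantity)
```

Element names and attribute names come straight from the document, which is what makes the data self-describing.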

HTML Example

<DL>
  <DT>Mambo
  <DD>by Enrique Garcia
</DL>
<UL>
  <LI>Producer: Enrique Garcia
  <LI>Publisher: Sony Music Entertainment
  <LI>Length: 3:46
  <LI>Written: 1991
  <LI>Artist: Azucar Moreno
</UL>

XML Describes Structure and Semantics, Not Format

<SONG>
  <TITLE>Mambo</TITLE>
  <COMPOSER>Enrique Garcia</COMPOSER>
  <PRODUCER>Enrique Garcia</PRODUCER>
  <PUBLISHER>Sony Music Entertainment</PUBLISHER>
  <LENGTH>3:46</LENGTH>
  <YEAR>1991</YEAR>
  <ARTIST>Azucar Moreno</ARTIST>
</SONG>

Self-Describing Data

<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
  <GREETING>Hello from XML</GREETING>
  <MESSAGE>Welcome to Programming XML in Java</MESSAGE>
</DOCUMENT>

Structured and Integrated Data

<?xml version="1.0"?>
<SCHOOL>
  <CLASS type="seminar">
    <CLASS_TITLE>XML In The Real World</CLASS_TITLE>
    <CLASS_NUMBER>6.031</CLASS_NUMBER>
    <SUBJECT>XML</SUBJECT>
    <START_DATE>6/1/2002</START_DATE>
    <STUDENTS>
      <STUDENT status="attending">
        <FIRST_NAME>Edward</FIRST_NAME>
        <LAST_NAME>Samson</LAST_NAME>
      </STUDENT>
      <STUDENT status="withdrawn">
        <FIRST_NAME>Ernestine</FIRST_NAME>
        <LAST_NAME>Johnson</LAST_NAME>
      </STUDENT>
    </STUDENTS>
  </CLASS>
</SCHOOL>

Creating XML Documents

• HTML has a fixed set of about 100 elements
• In XML, you define your own elements
• HTML browsers try to fix bad HTML code
• XML processors do not guess at the structure of the document
• A well-formed XML document is the minimal requirement
• A valid XML document also conforms to a DTD or an XML Schema

What is a well-formed XML Doc?

• A textual object is a well-formed XML document if:
  – Taken as a whole, it matches the production labeled document
  – It meets all the well-formedness constraints given in the specification: http://www.w3.org/TR/REC-xml
  – Each of the parsed entities referenced directly or indirectly within the document is well-formed
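One practical way to test well-formedness is to hand the text to a non-validating parser and see whether it is accepted. The sketch below assumes Python's standard xml.etree.ElementTree module (which is built on the expat parser); the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as a complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested elements under a single root are accepted.
nested_ok = is_well_formed("<a><b>ok</b></a>")

# Overlapping tags, a second root element, and a missing end tag are rejected.
overlapping = is_well_formed("<a><b>overlap</a></b>")
two_roots = is_well_formed("<a/><b/>")
unclosed = is_well_formed("<a>unclosed")

print(nested_ok, overlapping, two_roots, unclosed)
```

Unlike an HTML browser, the parser makes no attempt to repair the rejected inputs; it simply reports an error.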

document ::= prolog element Misc*

• Prolog:
  – XML declaration: <?xml version="1.0"?>
  – Comments: <!-- This is a Comment -->
  – Processing instructions:
    <?xml-stylesheet href="JavaXML.html.xsl" type="text/xsl"?>
    <?xml-stylesheet href="greeting.css" type="text/css"?>
• Element:
  – Root element contains more elements
  – Exactly one root element
• Misc:
  – Comments
  – Processing instructions
  – Whitespace
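To see the prolog, the root element, and trailing Misc items as separate pieces, a DOM parser can list the top-level nodes of a document. This is a sketch using Python's standard xml.dom.minidom; the sample document is assembled from fragments shown on the slides, and the trailing comment is an addition made only for illustration.

```python
from xml.dom import minidom

# Adjacent string literals are concatenated, so there is no whitespace
# between the top-level items in this sample.
DOC = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet href="greeting.css" type="text/css"?>'
    '<!-- This is a Comment -->'
    '<DOCUMENT><GREETING>Hello from XML</GREETING></DOCUMENT>'
    '<!-- trailing comment (hypothetical, part of Misc*) -->'
)

dom = minidom.parseString(DOC)

# Names of the nodes that sit directly under the document node.
top_level = [node.nodeName for node in dom.childNodes]

# Only element nodes count toward the "exactly one root element" rule.
root_elements = [
    node for node in dom.childNodes if node.nodeType == node.ELEMENT_NODE
]

print(top_level)
print(dom.documentElement.tagName)
```

The XML declaration itself does not appear as a node; the processing instruction and the comments do.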

Entities

• Part of an XML Document

• Hold text or binary data

• May refer to other entities

• Parsed entities are character data

• Unparsed entities are binary data

Tags and Elements

• An XML element consists of a start tag and an end tag: <document> ... </document>
• Tag names
  – Start with a letter (<document>), an underscore (<_record>), or a colon (avoid using a colon)
  – Subsequent characters may be letters, digits, underscores, hyphens, periods, and colons (but no whitespace)
  – XML processors are case sensitive, so these are different tags: <document>, <DOCUMENT>, <Document>
  – Empty elements have only one tag:
    HTML: <img>, <br>, <hr>
    XHTML: <img/>, <br/>, <hr/>
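The naming and case rules above can be tried out against a parser. The following is a small sketch with Python's standard xml.etree.ElementTree; the helper name parses and the sample tag names other than those on the slide are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if the text is accepted as a well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Start and end tags must match exactly, including case.
same_case = parses("<document></document>")
mixed_case = parses("<document></DOCUMENT>")

# A name may start with an underscore and continue with digits,
# hyphens, and periods, but it may not start with a digit.
underscore_name = parses("<_record-1.a/>")
digit_first = parses("<1record/>")

# An empty-element tag is equivalent to a start tag followed by an end tag.
short_form = ET.fromstring("<hr/>")
long_form = ET.fromstring("<hr></hr>")
same_serialization = ET.tostring(short_form) == ET.tostring(long_form)

print(same_case, mixed_case, underscore_name, digit_first, same_serialization)
```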

Attributes

• Name-value pairs: {STATUS, "Good credit"}
• Specify additional data in start tags: <CUSTOMER STATUS="Good credit">
• Attribute names follow the same rules as tag names
• Attribute values are strings enclosed in quotation marks

Too many attributes make documents hard to read:

<CUSTOMER LAST_NAME="Smith" FIRST_NAME="Sam" DATE="October 15, 2001" PURCHASE="Tomatoes" PRICE="$1.25" NUMBER="8" />

The same data is easier to read as child elements:

<CUSTOMER>
  <NAME>
    <LAST_NAME>Smith</LAST_NAME>
    <FIRST_NAME>Sam</FIRST_NAME>
  </NAME>
  <DATE>October 15, 2001</DATE>
  <ORDERS>
    <ITEM>
      <PRODUCT>Tomatoes</PRODUCT>
      <NUMBER>8</NUMBER>
      <PRICE>$1.25</PRICE>
    </ITEM>
  </ORDERS>
</CUSTOMER>
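The two styles also differ in how a program reads them. As a rough sketch (Python's standard xml.etree.ElementTree, variable names chosen for illustration), the attribute form holds one value per name, while the element form can repeat ITEM to hold several orders.

```python
import xml.etree.ElementTree as ET

# Attribute-heavy form: every value is an attribute of one element.
flat = ET.fromstring(
    '<CUSTOMER LAST_NAME="Smith" FIRST_NAME="Sam" DATE="October 15, 2001" '
    'PURCHASE="Tomatoes" PRICE="$1.25" NUMBER="8" />'
)

# Element form: the same data as nested child elements.
nested = ET.fromstring("""<CUSTOMER>
  <NAME>
    <LAST_NAME>Smith</LAST_NAME>
    <FIRST_NAME>Sam</FIRST_NAME>
  </NAME>
  <DATE>October 15, 2001</DATE>
  <ORDERS>
    <ITEM>
      <PRODUCT>Tomatoes</PRODUCT>
      <NUMBER>8</NUMBER>
      <PRICE>$1.25</PRICE>
    </ITEM>
  </ORDERS>
</CUSTOMER>""")

# An attribute name can appear only once, so there is a single purchase.
flat_product = flat.get("PURCHASE")

# ITEM may repeat, so the nested form yields a list of products.
nested_products = [item.findtext("PRODUCT") for item in nested.iter("ITEM")]
nested_last_name = nested.findtext("NAME/LAST_NAME")

print(flat_product, nested_products, nested_last_name)
```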

CDATA

• Hold character data that remains unparsed by the XML Processor

• Start a CDATA section: <![CDATA[

• End a CDATA section: ]]>

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/tr/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Using The if Statement In JavaScript</title>
  </head>
  <body>
    <script language="javascript">
      <![CDATA[
        var budget
        budget = 234.77
        if (budget < 0) {
          document.writeln("Uh oh.")
        }
      ]]>
    </script>
    <center>
      <h1>Using The if Statement In JavaScript</h1>
    </center>
  </body>
</html>
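The reason the script above sits in a CDATA section is the "<" in the comparison, which would otherwise be read as the start of a tag. The sketch below, using Python's standard xml.etree.ElementTree on a shortened, hypothetical version of the script element, compares three ways of writing the same text.

```python
import xml.etree.ElementTree as ET

SCRIPT = 'if (budget < 0) { document.writeln("Uh oh.") }'

# Inside a CDATA section the "<" is plain character data.
with_cdata = ET.fromstring("<script><![CDATA[" + SCRIPT + "]]></script>")

# Without CDATA the "<" has to be escaped as "&lt;".
escaped = ET.fromstring(
    "<script>" + SCRIPT.replace("<", "&lt;") + "</script>"
)

# Left unescaped, the "<" makes the document ill-formed.
try:
    ET.fromstring("<script>" + SCRIPT + "</script>")
    raw_is_well_formed = True
except ET.ParseError:
    raw_is_well_formed = False

print(with_cdata.text == SCRIPT, escaped.text == SCRIPT, raw_is_well_formed)
```

Both accepted forms give the application the same text; CDATA only changes how the text is written in the file.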

Document Type Definition:

• Specifies the structure and syntax of an XML document

<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>

<?xml version="1.0"?>
<!DOCTYPE BOOK [
  <!ELEMENT BOOK (P*)>
  <!ELEMENT P (#PCDATA)>
]>
<BOOK>
  <P>chapter 1 - Intro</P>
  <P>chapter 2 - Conclusion</P>
  <P>Index</P>
</BOOK>

Creating Document Type Declarations

• <!DOCTYPE rootname [DTD]>
• <!DOCTYPE rootname SYSTEM URL>
• <!DOCTYPE rootname SYSTEM URL [DTD]>
• <!DOCTYPE rootname PUBLIC identifier URL>
• <!DOCTYPE rootname PUBLIC identifier URL [DTD]>

Element Definition

• <!ELEMENT direction (left, right, top?)>
• <!ELEMENT CHAPTER (INTRODUCTION, (P | QUOTE | NOTE)*, DIV*)>
• <!ELEMENT HR EMPTY>
• <!ELEMENT p (#PCDATA | I)* >
• <!ELEMENT %title; %content; >
• <!ELEMENT DOCUMENT ANY>

Content Model

• ANY
  – Any type of content: elements or PCDATA
    <!ELEMENT DOCUMENT ANY>
• Child element lists
  – Names of elements in parentheses
    <!ELEMENT direction (left, right, top?)>
• #PCDATA (Parsed Character Data)
  – Non-markup text
    <!ELEMENT First_Name (#PCDATA)>

Example 1:
<!ELEMENT PRODUCT (#PCDATA | PRODUCT_ID)*>

<PRODUCT>Tomatoes</PRODUCT>

<PRODUCT>
  <PRODUCT_ID>124829548702121</PRODUCT_ID>
</PRODUCT>

Example 2:
<!ELEMENT p (#PCDATA | b)*>
<!ELEMENT b (#PCDATA)>

<p>This is <b>bold</b> text</p>
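Mixed content such as Example 2 interleaves text and child elements, and a parser has to report both. As a sketch, Python's standard xml.etree.ElementTree stores the text before the first child in text and the text after a child in that child's tail; the variable names here are illustrative.

```python
import xml.etree.ElementTree as ET

p = ET.fromstring("<p>This is <b>bold</b> text</p>")
b = p.find("b")

# text: character data before the first child element.
# tail: character data that follows an element's end tag.
pieces = [p.text, b.text, b.tail]

# itertext() walks the text in document order, ignoring the markup.
full_text = "".join(p.itertext())

print(pieces)     # ['This is ', 'bold', ' text']
print(full_text)  # This is bold text
```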

Entities

• XML's way of referring to a data item
• Text or binary data
• General entity
  – Used in the content of an XML document
  – References start with '&' and end with ';'
• Parameter entity
  – Used in a DTD
  – References start with '%' and end with ';'
• Internal entity: defined in the XML document
• External entity: defined in an external source (file, URI)

<!ENTITY name definition>

Example 1:
<!ELEMENT DATE (#PCDATA)>
<!ENTITY TODAY "February 7, 2001">

<DATE>&TODAY;</DATE>

Example 2:
<!ENTITY NAME "John Punin">
<!ENTITY CNAME "&NAME; Palacios">
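To show what the parser does with these references, the sketch below places both examples in an internal DTD subset and parses the result with Python's standard xml.etree.ElementTree. The DOCUMENT wrapper and the AUTHOR element are hypothetical additions so that both entities can be referenced in one document.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE DOCUMENT [
  <!ELEMENT DOCUMENT (DATE, AUTHOR)>
  <!ELEMENT DATE (#PCDATA)>
  <!ELEMENT AUTHOR (#PCDATA)>
  <!ENTITY TODAY "February 7, 2001">
  <!ENTITY NAME "John Punin">
  <!ENTITY CNAME "&NAME; Palacios">
]>
<DOCUMENT>
  <DATE>&TODAY;</DATE>
  <AUTHOR>&CNAME;</AUTHOR>
</DOCUMENT>"""

root = ET.fromstring(DOC)

# Internal general entities are replaced by their definitions during parsing.
date_text = root.findtext("DATE")

# CNAME refers to NAME, so the replacement is expanded recursively.
author_text = root.findtext("AUTHOR")

print(date_text)
print(author_text)
```

The application never sees "&TODAY;" or "&CNAME;"; it receives the replacement text as ordinary character data.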

Namespaces

• XML namespaces provide a simple method for qualifying element and attribute names used in Extensible Markup Language documents by associating them with namespaces identified by URI references.

• Definition: A namespace (or more precisely, a namespace binding) is declared using a family of reserved attributes. Such an attribute's name must either be xmlns or begin xmlns:. These attributes, like any other XML attributes, may be provided directly or by default.

XML Document with one namespace

• A namespace is declared with an xmlns:prefix attribute
• The prefix then stands for the namespace
• The value of the xmlns:prefix attribute is a URI. A Uniform Resource Identifier (URI) is a string of characters that identifies an Internet resource.
• Every tag in the namespace is prefaced with the prefix

<?xml version="1.0"?>
<!-- the bk prefix is available throughout the bk:book element -->
<bk:book xmlns:bk='http://www.books.org/books'>
  <bk:title>Programming XML in Java</bk:title>
</bk:book>

Using Namespaces

This XML document carries information in a table:

<h:table xmlns:h="http://www.w3.org/TR/html4/">
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>

This XML document carries information about a piece of furniture:

<f:table xmlns:f="http://www.w3schools.com/furniture">
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>

In addition to the prefixes, each <table> tag carries an xmlns attribute that binds its prefix to a namespace, so the two table elements have different qualified names.
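A namespace-aware parser makes the difference between the two table elements visible. In this sketch, Python's standard xml.etree.ElementTree reports each tag as {namespace URI}local-name; the enclosing root element and the prefix map passed to the search calls are assumptions added so both tables can live in one document.

```python
import xml.etree.ElementTree as ET

HTML_NS = "http://www.w3.org/TR/html4/"
FURNITURE_NS = "http://www.w3schools.com/furniture"

# Hypothetical wrapper element holding both tables from the slide.
DOC = f"""<root xmlns:h="{HTML_NS}" xmlns:f="{FURNITURE_NS}">
  <h:table>
    <h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>
  </h:table>
  <f:table>
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>"""

root = ET.fromstring(DOC)

# The prefix is replaced by the namespace URI it was bound to.
tags = [child.tag for child in root]

# Searches use a prefix map; the prefixes here need not match the document's.
ns = {"h": HTML_NS, "f": FURNITURE_NS}
cells = [td.text for td in root.findall("h:table/h:tr/h:td", ns)]
furniture_name = root.findtext("f:table/f:name", namespaces=ns)

print(tags)
print(cells, furniture_name)
```

Both children have the local name table, yet their full names differ because the namespace URI is part of the name.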

XML Schema

• To define a "class" of XML Documents

• "instance document" - XML document that conforms to a particular schema

• An XML alternative to DTDs

A Simple XML Document

Look at this simple XML document called "note.xml":

<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

A DTD File

The following example is a DTD file called "note.dtd" that defines the elements of the XML document above ("note.xml"):

<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

The first line defines the note element to have four child elements: "to, from, heading, body".

Lines 2-5 define the to, from, heading, and body elements to be of type "#PCDATA".

An XML Schema

The following example is an XML Schema file called "note.xsd" that defines the elements of the XML document above ("note.xml"):

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

The note element is a complex type because it contains other elements. The other elements (to, from, heading, body) are simple types because they do not contain other elements.
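Because an XML Schema is itself an XML document, it can be read with the same tools as any instance document. The sketch below uses Python's standard xml.etree.ElementTree to list the child elements that note.xsd declares. It only inspects the schema; it does not validate note.xml, which would need a schema-aware library that is not shown here.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

SCHEMA = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(SCHEMA)

# The top-level xs:element declaration is the complex type "note".
note_decl = schema.find(f"{{{XS}}}element")

# Its nested xs:element declarations are the simple-typed children, in order.
children = [
    (decl.get("name"), decl.get("type"))
    for decl in note_decl.iter(f"{{{XS}}}element")
    if decl is not note_decl
]

print(note_decl.get("name"))
print(children)
```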