XML (3) Extensible Markup Language
Acknowledgements and copyrights: these slides are a result of combination of notes and slides with contributions from: Michael Kiffer, Arthur Bernstein, Philip Lewis, Hanspeter Mössenböck, Wolfgang Beer, Dietrich Birngruber, Albrecht Wöss, Mark Sapossnek, Bill Andreopoulos, Divakaran Liginlal, Anestis Toptsis, Addison Wesley, Microsoft AA. They serve for teaching purposes only and only for the students that are registered in CSE4413 and should not be published as a book or in any form of commercial product, unless written permission is obtained from each of the above listed names and/or organizations.
2
Document Type Definition (DTD)
• A DTD is a grammar specification for an XML document
• DTDs are optional – don’t need to be specified
• If specified, DTD can be part of the document (at the top); or it can be given as a URL
• A document that conforms (i.e., parses) w.r.t. its DTD is said to be valid
3
XML Data + DTD

DTD:
<!ELEMENT a (b+, c?)>
<!ELEMENT b (#PCDATA)>
<!ELEMENT c (#PCDATA)>

Not valid (two c elements, but c? allows at most one):
<!-- XML Data -->
<a>
  <b> Some </b>
  <c> 100 </c>
  <c> 101 </c>
</a>

Valid:
<!-- XML Data -->
<a>
  <b> Some </b>
  <b> Thing </b>
</a>
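A content model such as (b+, c?) behaves like a regular expression over the sequence of child element names, which is one way to see why one of the two documents on this slide is rejected. The following is a minimal Python sketch, assuming the content model is translated into a regular expression by hand; the helper name children_match is hypothetical, and the check covers only the children of the root, so it is not a full validator.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of the content model (b+, c?):
# one or more <b> children followed by at most one <c>.
A_MODEL = re.compile(r"(b )+(c )?")

def children_match(xml_text):
    """Return True if the root's children satisfy A_MODEL."""
    root = ET.fromstring(xml_text)
    # Encode the child sequence as a string such as "b c c ".
    sequence = "".join(child.tag + " " for child in root)
    return A_MODEL.fullmatch(sequence) is not None

doc_two_c = "<a><b> Some </b><c> 100 </c><c> 101 </c></a>"
doc_two_b = "<a><b> Some </b><b> Thing </b></a>"

print(children_match(doc_two_c))  # False: a second <c> is not allowed
print(children_match(doc_two_b))  # True: b+ allows repeated <b>, c? may be absent
```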
4
What is a DTD ?
• Document Type Definition (DTD)
• Defines the syntax, grammar and semantics
• Defines the document structure
  – What Elements, Attributes, Entities, etc. are permitted?
  – How are the document elements related and structured?
• Referenced by or defined in XML documents, but it’s not XML!
• Enables validation of XML documents using an XML Parser
• Can be referenced by more than one XML document
• DTDs may reference other DTDs
5
Schemas: DTD Example
• DTD schema:
<!DOCTYPE BOOK [
<!ELEMENT BOOK (TITLE+, AUTHOR)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT AUTHOR (#PCDATA)>
]>
• XML document (that conforms to the DTD above):
<BOOK>
  <TITLE>All About XML</TITLE>
  <AUTHOR>Joe Developer</AUTHOR>
</BOOK>
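To make the idea of "which elements are permitted" concrete, the sketch below reads the element declarations of the BOOK example with Python's standard-library expat parser and compares them with the elements the document actually uses. This is an illustration under stated assumptions, not a validator: expat is non-validating, and the variable and function names here are hypothetical.

```python
import xml.parsers.expat

doc = """<!DOCTYPE BOOK [
<!ELEMENT BOOK (TITLE+, AUTHOR)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT AUTHOR (#PCDATA)>
]>
<BOOK><TITLE>All About XML</TITLE><AUTHOR>Joe Developer</AUTHOR></BOOK>"""

declared = []   # element names declared in the DTD, in declaration order
used = set()    # element names that occur in the document body

def on_element_decl(name, model):
    # Called once per <!ELEMENT ...> declaration; the model is ignored here.
    declared.append(name)

def on_start_element(name, attrs):
    used.add(name)

parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.StartElementHandler = on_start_element
parser.Parse(doc, True)

# Elements used in the document but never declared in the DTD.
undeclared = used - set(declared)
print(declared)
print(undeclared)
```

An empty set of undeclared names is necessary for validity but not sufficient, since the order and number of children must also match each content model.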
6
DTD By Diagram
[Diagram: tree structure of a CustomerOrder document; node labels include Customer, Person, FName, LName, Address, Orders, OrderNo, ProductNo]
7
DTD By Example
http://www.myco.com/dtd/order.dtd
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE CustomerOrder [
Examples: DTD - Attribute Declaration
<!ATTLIST product Name CDATA #REQUIRED>
• A product must have a Name, and it is a character data string (the type of the attribute is CDATA).
<!ATTLIST product Life CDATA #IMPLIED>
• A product may have a specified Life.
<!ATTLIST product Life CDATA #FIXED "Not Known">
• The product's Life is always "Not Known" and is constant. If the attribute Life is absent, it is still assumed to have this value.
<!ATTLIST product Life CDATA "Not Known">
• By default the product's Life is "Not Known". If a value is specified, then the specified value is used.
Divakaran Liginlal
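The effect of a default value can be observed with Python's standard-library parser, which applies defaults declared in the internal DTD subset even though it does not validate. This is a minimal sketch: the catalog wrapper element and the product names are hypothetical, and because the parser is non-validating, a missing #REQUIRED attribute would not be reported.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: only the ATTLIST declarations mirror the slide.
doc = """<?xml version="1.0"?>
<!DOCTYPE catalog [
  <!ATTLIST product Name CDATA #REQUIRED>
  <!ATTLIST product Life CDATA "Not Known">
]>
<catalog>
  <product Name="Lamp"/>
  <product Name="Battery" Life="2 years"/>
</catalog>"""

root = ET.fromstring(doc)

# The first product omits Life, so the declared default is supplied;
# the second product specifies Life, so its own value is kept.
lives = [p.get("Life") for p in root.findall("product")]
print(lives)
```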
14
Attribute Types – CDATA & ID
<!ATTLIST elementname attrname Type DEFAULT>

Type: CDATA
Meaning: Character string. CDATA indicates that an attribute contains a simple character string of text.
E.g.: <!ATTLIST Idea name CDATA #REQUIRED>
      <Idea name='Bug Killer'>
      <!ATTLIST education school CDATA #REQUIRED>
      <education school="York University">

Type: ID
Meaning: Unique name (identifier) inside the document. Only one attribute of type ID can be assigned to a given element type. The value of the attribute (i.e., the ID) must be unique throughout the same XML document, i.e., it uniquely identifies an element in the XML document.
Format of an ID value: (Letter | '_' | ':') (NameChar)*
E.g.: <!ATTLIST Idea id ID #REQUIRED>
      <Idea id='L1013'>
      <!ATTLIST school id ID #REQUIRED>
      <!ATTLIST Course CrsCode ID #REQUIRED>
      <Course CrsCode="CSE4413">
      (A purely numeric value such as "4413" would not be a valid ID, because an ID must start with a letter, '_' or ':'.)
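The two constraints on ID values (name format and uniqueness within the document) can be checked with a short Python sketch. It assumes the caller already knows which attribute is declared as type ID, and it uses an ASCII-only approximation of the XML name rule; check_ids is a hypothetical helper, not part of any library.

```python
import re
import xml.etree.ElementTree as ET

# ASCII-only approximation of the XML name rule for ID values.
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9._:-]*")

def check_ids(xml_text, id_attr):
    """Return a list of problems found among the values of id_attr."""
    problems, seen = [], set()
    for elem in ET.fromstring(xml_text).iter():
        value = elem.get(id_attr)
        if value is None:
            continue
        if NAME.fullmatch(value) is None:
            problems.append("not a valid name: " + value)
        if value in seen:
            problems.append("duplicate ID: " + value)
        seen.add(value)
    return problems

# Hypothetical document with one duplicate and one badly formed ID.
doc = """<Ideas>
  <Idea id="L1013"/>
  <Idea id="L1013"/>
  <Idea id="4413"/>
</Ideas>"""

print(check_ids(doc, "id"))
# ['duplicate ID: L1013', 'not a valid name: 4413']
```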
15
Attribute Types IDREF
<!ATTLIST elementname attrname Type DEFAULT>

Type: IDREF
Meaning: Reference to an element with an ID attribute having the same value (but not necessarily the same name!) as this IDREF attribute, i.e., the value of the IDREF attribute must match the value of an ID attribute elsewhere in the same XML document.
E.g.: Indirect reference to an ID type:
      <!ATTLIST CrsTaken CrsCode IDREF #REQUIRED>
      <CrsTaken CrsCode='CSE4413'>
      CrsCode refers to an element (Course) that has an ID attribute CrsCode with value 'CSE4413'. (Like an ID, an IDREF value must start with a letter, '_' or ':', so a purely numeric value such as '4413' would not be valid.)
16
Attribute Types IDREFS
<!ATTLIST elementname attrname Type DEFAULT>

Type: IDREFS
Meaning: Series of IDREFs delimited by whitespace.
E.g.: <!ATTLIST ClassRoster Members IDREFS #IMPLIED>
      <ClassRoster Members="cs123456 cs234567 cs345678">
      The values are the ids of the students who are registered in some class. For the document to be valid, each of these students must be listed as an element whose ID attribute is declared with <!ATTLIST Student StudId ID #REQUIRED>, e.g.:
      <Student StudId="cs123456">
      <Student StudId="cs234567">
      <Student StudId="cs345678">
Divakaran Liginlal
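The referential check behind IDREF and IDREFS can be sketched in a few lines of Python: split the attribute value on whitespace and look up each token among the declared IDs (a single IDREF is the one-token case). The Class wrapper element is hypothetical, and the sketch assumes we already know that StudId is the ID attribute and Members the IDREFS attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the third roster member has no matching Student.
doc = """<Class>
  <Student StudId="cs123456"/>
  <Student StudId="cs234567"/>
  <ClassRoster Members="cs123456 cs234567 cs345678"/>
</Class>"""

root = ET.fromstring(doc)

# All ID values present in the document.
student_ids = {s.get("StudId") for s in root.iter("Student")}

# An IDREFS value is a whitespace-separated list of references.
members = root.find("ClassRoster").get("Members").split()

# References that do not match any ID would make the document invalid.
missing = [m for m in members if m not in student_ids]
print(missing)  # ['cs345678']
```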
17
Limitations of DTDs
• DTDs do not support namespaces. All element names are global: can't have one Name type for people and another for companies:
    <!ELEMENT Name (Last, First)>
    <!ELEMENT Name (#PCDATA)>
  both cannot be in the same DTD
• Very limited assortment of data types (just strings)
• Cannot express unordered contents conveniently. For example,
    <!ELEMENT Report (Students, Classes, Courses)>
  determines that Students, Classes, Courses should appear in the order specified and not in any other order.
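To see why unordered content is inconvenient, note that the only way to accept the three children in any order is to spell out every ordering as alternatives. The Python sketch below generates such a content model; unordered_model is a hypothetical helper, and the result is factored by first child so that no two alternatives start with the same element name.

```python
def unordered_model(names):
    """Content model accepting each of the given children once, in any order."""
    if len(names) == 1:
        return names[0]
    alternatives = []
    for i, first in enumerate(names):
        rest = names[:i] + names[i + 1:]
        # "first" followed by the remaining children in any order.
        alternatives.append("(" + first + ", " + unordered_model(rest) + ")")
    return "(" + " | ".join(alternatives) + ")"

print(unordered_model(["A", "B"]))  # ((A, B) | (B, A))

model = unordered_model(["Students", "Classes", "Courses"])
print("<!ELEMENT Report " + model + ">")
```

The number of orderings grows as n! for n children, so the declaration quickly becomes unmanageable.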
18
DTD validation
• Once you have a DTD, you can create an XML document from that DTD.
• Then you (may) want to validate the document against the DTD.
• To do so you can write a program that parses the document and tries to match it against the DTD (Difficult!), or
• Can use a DTD validation tool.
19
DTD validation - tools
• XSV validator (W3C):
  – Free
  – http://www.w3.org/2001/03/webdata/xsv
• Brown University's STG (Scholarly Technology Group) validator.
  – Free
  – http://www.stg.brown.edu/service/xmlvalid/
etc. (defined in the XMLSchema namespace, http://www.w3.org/TR/xmlschema-2/#built-in-datatypes)
  – string – string type
  – boolean – boolean type
  – integer, decimal, float, double – number types
  – time, date, month, year, century, etc. – date and time types
All of the above are used in the form xsd:type, e.g., xsd:integer. E.g.:
  <xsd:element name="name" type="xsd:string"/>
  <xsd:attribute name="retired" type="xsd:boolean"/>
• Besides minInclusive and maxInclusive, you can also use minExclusive and maxExclusive, and minLength and maxLength (for strings). These are usually called "facets" in XSD. There are also the facets precision and scale (named totalDigits and fractionDigits in the final W3C Recommendation), which control how many digits are allowed in decimal numbers.
• restriction is used to restrict the range of values.
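The meaning of the range and length facets can be illustrated outside XSD with a small Python sketch. This only mimics what a schema processor checks for a restricted type; the function satisfies and the dictionary-based facet description are hypothetical, and the sample bounds are made up for illustration.

```python
# Each facet name maps to the comparison it imposes on a value.
CHECKS = {
    "minInclusive": lambda value, bound: value >= bound,
    "maxInclusive": lambda value, bound: value <= bound,
    "minExclusive": lambda value, bound: value > bound,
    "maxExclusive": lambda value, bound: value < bound,
    "minLength": lambda value, bound: len(value) >= bound,
    "maxLength": lambda value, bound: len(value) <= bound,
}

def satisfies(value, facets):
    """Return True if value meets every facet in the facets dictionary."""
    return all(CHECKS[name](value, bound) for name, bound in facets.items())

# Hypothetical restriction of an integer type: 18 <= value < 65.
working_age = {"minInclusive": 18, "maxExclusive": 65}
print(satisfies(18, working_age))  # True: the lower bound is inclusive
print(satisfies(65, working_age))  # False: the upper bound is exclusive

# Hypothetical restriction of a string type: 1 to 2 characters.
print(satisfies("abc", {"minLength": 1, "maxLength": 2}))  # False
```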