2002 Prentice Hall, Inc. All rights reserved.
Chapter 6 – Document Type Definition (DTD)
Outline
6.1 Introduction
6.2 Parsers, Well-formed and Valid XML Documents
6.3 Document Type Declaration
6.4 Element Type Declarations
   6.4.1 Sequences, Pipe Characters and Occurrence Indicators
   6.4.2 EMPTY, Mixed Content and ANY
6.5 Attribute Declarations
   6.5.1 Attribute Defaults (#REQUIRED, #IMPLIED, #FIXED)
6.6 Attribute Types: strings (CDATA), tokenized, enumerated
   6.6.1 Tokenized Attribute Type (ID, IDREF, ENTITY, NMTOKEN)
   6.6.2 Enumerated Attribute Types
6.7 Conditional Sections
6.8 Whitespace Characters
6.9 Case Study: Writing a DTD for the Day Planner Application
6.1 Introduction
• Document Type Definitions (DTDs)
  – Define structure of XML document
    • i.e., what elements, attributes, etc. are permitted in document
  – XML document not required to have DTD
    • Usually recommended for document conformity
  – Use Extended Backus-Naur Form (EBNF) grammar
    • Wikipedia definition: The Backus–Naur form (also known as BNF, the Backus–Naur formalism, Backus normal form, or Panini–Backus Form) is a metasyntax used to express context-free grammars: that is, a formal way to describe formal languages.
6.5.1 Attribute Defaults (#REQUIRED, #IMPLIED, #FIXED)
• Attribute defaults
  – #IMPLIED
    • Attribute is optional; use (application’s) default value if attribute value not specified
    • [from w3schools: Use the #IMPLIED keyword if you don't want to force the author to include an attribute, and you don't have an option for a default value.]
  – #REQUIRED
    • Attribute must appear in element
    • Document is not valid if attribute is missing
  – #FIXED
    • Attribute value is constant
    • Attribute value cannot differ in XML document
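The three defaults can be seen side by side in one small document. This is a minimal sketch; the element name message and the attribute names id, priority and encoding are hypothetical and chosen only for illustration.

```xml
<?xml version = "1.0"?>

<!-- Hypothetical example: names are illustrative only -->
<!DOCTYPE message [
   <!ELEMENT message ( #PCDATA )>
   <!ATTLIST message id       CDATA #REQUIRED>
   <!ATTLIST message priority CDATA #IMPLIED>
   <!ATTLIST message encoding CDATA #FIXED "plain">
]>

<!-- id must be present; priority may be omitted; -->
<!-- encoding, if written, must have the value "plain" -->
<message id = "m1">Welcome to XML!</message>
```

Removing id from the message element should make the document invalid, while writing encoding = "rich" should be rejected because the value differs from the fixed one.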
6.6 Attribute Types
• Attribute types
  – Strings (CDATA)
    • No constraints on attribute values
      – Except that < and & (and the quote character that delimits the value) must be escaped
  – Tokenized attributes
    • Constraints on permissible characters for attribute values
  – Enumerated attributes
    • Most restrictive
    • Take only one value listed in attribute declaration
6.6 Attribute Types (cont.)
Type         Value          Explanation
String       CDATA          The value is character data
Enumerated   (en1|en2|…)    The value must be one from an enumerated list
Tokenized    ID             The value is a unique id
             IDREF          The value is the id of another element
             IDREFS         The value is a list of other ids
             NMTOKEN        The value is a valid XML name token
             NMTOKENS       The value is a list of valid XML name tokens
             ENTITY         The value is the name of an unparsed entity
             ENTITIES       The value is a list of unparsed entity names
             NOTATION       The value is a name of a notation
             xml:           The value is a predefined xml value
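The list-valued types (IDREFS, NMTOKENS, ENTITIES) take space-separated tokens. The following is a minimal sketch of IDREFS; the element content and the choice of IDREFS for shippedBy are assumptions made for illustration.

```xml
<?xml version = "1.0"?>

<!-- Hypothetical example: one book refers to two shipping elements -->
<!DOCTYPE bookstore [
   <!ELEMENT bookstore ( shipping+, book+ )>
   <!ELEMENT shipping EMPTY>
   <!ATTLIST shipping shipID ID #REQUIRED>
   <!ELEMENT book ( #PCDATA )>
   <!ATTLIST book shippedBy IDREFS #IMPLIED>
]>

<bookstore>
   <shipping shipID = "s1"/>
   <shipping shipID = "s2"/>
   <book shippedBy = "s1 s2">XML How to Program</book>
</bookstore>
```

Each token in the shippedBy list must match the value of some ID attribute in the document for the document to be valid.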
6.6.1 Tokenized Attribute Type (ID, IDREF, ENTITY, NMTOKEN)
• Tokenized attribute types
  – Restrict attribute values
  – ID
    • Uniquely identifies an element
  – IDREF
    • Points to an element by the value of its ID attribute
Fig. 6.8 XML document with ID and IDREF attribute types.
Each shipping element has a unique identifier (shipID)
Attribute shippedBy points to shipping element by matching shipID attribute
Declare book elements with attribute shippedBy

 1  <?xml version = "1.0"?>
 2
 3  <!-- Fig. 6.8: IDExample.xml -->
 4  <!-- Example for ID and IDREF values of attributes -->
 5
 6  <!DOCTYPE bookstore [
 7     <!ELEMENT bookstore ( shipping+, book+ )>
 8     <!ELEMENT shipping ( duration )>
 9     <!ATTLIST shipping shipID ID #REQUIRED>
10     <!ELEMENT book ( #PCDATA )>
11     <!ATTLIST book shippedBy IDREF #IMPLIED>
12     <!ELEMENT duration ( #PCDATA )>
13  ]>
14
15  <bookstore>
16     <shipping shipID = "s1">
17        <duration>2 to 4 days</duration>
18     </shipping>
19
20     <shipping shipID = "s2">
21        <duration>1 day</duration>
22     </shipping>
23
24     <book shippedBy = "s2">
25        Java How to Program 3rd edition.
26     </book>
27
28     <book shippedBy = "s2">
29        C How to Program 3rd edition.
30     </book>
31
32     <book shippedBy = "s1">
33        C++ How to Program 3rd edition.
34     </book>
35  </bookstore>
Fig. 6.9 Error displayed by XML Validator when an invalid ID is referenced.
Assign shippedBy (line 28) value "s3"
6.6.1 Tokenized Attribute Type (ID, IDREF, ENTITY, NMTOKEN) (cont.)
• ENTITY tokenized attribute type
  – Indicates that attribute has entity for its value
  – Entity declaration
      <!ENTITY digits "0123456789">
  – Entity may be used as follows:
      <useAnEntity>&digits;</useAnEntity>
  – Entity reference &digits; replaced by its value
      <useAnEntity>0123456789</useAnEntity>
Fig. 6.10 XML document that contains an ENTITY attribute type.
Declare entity city that refers to external document tour.html
NDATA indicates that external-entity content is not XML
Attribute tour for element company requires ENTITY attribute type

 1  <?xml version = "1.0"?>
 2
 3  <!-- Fig. 6.10: entityExample.xml -->
 4  <!-- ENTITY and ENTITY attribute types -->
 5
 6  <!DOCTYPE database [
 7     <!NOTATION html SYSTEM "iexplorer">
 8     <!ENTITY city SYSTEM "tour.html" NDATA html>
 9     <!ELEMENT database ( company+ )>
10     <!ELEMENT company ( name )>
11     <!ATTLIST company tour ENTITY #REQUIRED>
12     <!ELEMENT name ( #PCDATA )>
13  ]>
14
15  <database>
16     <company tour = "city">
17        <name>Deitel &amp; Associates, Inc.</name>
18     </company>
19  </database>
Fig. 6.11 Error generated by XML Validator when a DTD contains a reference to an undefined entity.
Replace line 16
   <company tour = "city">
with
   <company tour = "country">
6.6.1 Tokenized Attribute Type (ID, IDREF, ENTITY, NMTOKEN) (cont.)
• NMTOKEN tokenized attribute type
  – “Name token”
  – Value consists of letters, digits, periods, underscores, hyphens and colon characters (i.e., cannot contain spaces)
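A short sketch of an NMTOKEN attribute follows. The element name sandwich and attribute name bread are hypothetical.

```xml
<?xml version = "1.0"?>

<!-- Hypothetical example: names are illustrative only -->
<!DOCTYPE sandwich [
   <!ELEMENT sandwich EMPTY>
   <!ATTLIST sandwich bread NMTOKEN #REQUIRED>
]>

<!-- Valid: the value contains only name characters. -->
<!-- A value such as "whole wheat" would be invalid because of the space. -->
<sandwich bread = "whole-wheat"/>
```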
6.6.2 Enumerated Attribute Types
• Enumerated attribute types
  – Declare list of possible values for attribute
      <!ATTLIST person gender ( M | F ) "F">
    • Attribute gender can have either value M or F
    • F is default value
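The declaration above can be placed in a complete document as follows. This is a sketch; the root element people and the person content are assumptions added for illustration.

```xml
<?xml version = "1.0"?>

<!-- Hypothetical example built around the gender declaration -->
<!DOCTYPE people [
   <!ELEMENT people ( person+ )>
   <!ELEMENT person ( #PCDATA )>
   <!ATTLIST person gender ( M | F ) "F">
]>

<people>
   <!-- Explicit value taken from the enumerated list -->
   <person gender = "M">Bob</person>

   <!-- Attribute omitted, so the default value F applies -->
   <person>Sue</person>
</people>
```

A value outside the list, such as gender = "X", should be reported as a validity error.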
6.7 Conditional Sections
• Conditional sections
  – Include declarations
    • Keyword INCLUDE
  – Exclude declarations
    • Keyword IGNORE
  – Often used with entities
    • Parameter entities
      – Preceded by percent character (%)
      – Creates entities specific to DTD
      – Can be used only inside DTD in which they are declared
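A sketch of how parameter entities can switch conditional sections on and off. It is written as an external DTD file, since conditional sections are permitted only in the external subset. The element names message, approved and signature are hypothetical, and this is not a reproduction of the chapter's own figure.

```xml
<!-- Hypothetical external DTD file: names are illustrative only -->

<!-- Parameter entities that hold the two keywords -->
<!ENTITY % accept "INCLUDE">
<!ENTITY % reject "IGNORE">

<!ELEMENT message ( approved, signature )>
<!ELEMENT approved EMPTY>
<!ATTLIST approved flag ( true | false ) "false">

<!-- Included: this declaration of signature is used -->
<![ %accept; [
   <!ELEMENT signature ( #PCDATA )>
]]>

<!-- Ignored: this alternative declaration is skipped -->
<![ %reject; [
   <!ELEMENT signature EMPTY>
]]>
```

Swapping the replacement text of accept and reject would select the other declaration of signature without editing the sections themselves.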
Fig. 6.12 Conditional sections in a DTD.
Entities accept and reject represent strings INCLUDE and IGNORE, respectively
6.8 Whitespace Characters
Attribute cdata requires CDATA, which preserves whitespace
Other attributes normalize (do not preserve) whitespace
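The declarations these notes describe are not part of this transcript. The fragment below is a sketch of declarations consistent with the element and attribute names used in Fig. 6.14 (Part 2). The content models, the #REQUIRED defaults and the enumerated values are assumptions.

```xml
<!-- Hypothetical DTD fragment: content models and defaults are assumed -->
<!ELEMENT whitespace ( hasCDATA, hasID, hasNMTOKEN, hasEnumeration, hasMixed )>

<!-- CDATA attribute: leading and trailing spaces are kept -->
<!ELEMENT hasCDATA EMPTY>
<!ATTLIST hasCDATA cdata CDATA #REQUIRED>

<!-- Tokenized and enumerated attributes: values are normalized -->
<!ELEMENT hasID EMPTY>
<!ATTLIST hasID id ID #REQUIRED>

<!ELEMENT hasNMTOKEN EMPTY>
<!ATTLIST hasNMTOKEN nmtoken NMTOKEN #REQUIRED>

<!ELEMENT hasEnumeration EMPTY>
<!ATTLIST hasEnumeration enumeration ( true | false ) #REQUIRED>

<!-- Mixed content: text and hasCDATA elements in any order -->
<!ELEMENT hasMixed ( #PCDATA | hasCDATA )*>
```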
Fig. 6.14 Processing whitespace in an XML document. (Part 2)
Whitespace preserved
Whitespace normalized

<whitespace>

   <hasCDATA cdata = " simple cdata "/>

   <hasID id = " i20"/>

   <hasNMTOKEN nmtoken = " hello"/>

   <hasEnumeration enumeration = " true"/>

   <hasMixed>
      This is text.
      <hasCDATA cdata = " simple cdata"/>
      This is some additional text.
   </hasMixed>

</whitespace>
Output from Fig. 6.14
Whitespace preserved
Whitespace normalized

>java Tree yes whitespace.xml
URL: file:C:/Examplesps/Files/deleted/ch09/Tree/whitespace.xml
[ document root ]
+-[ element : whitespace ]
   +-[ ignorable ]
   +-[ ignorable ]
   +-[ ignorable ]
   +-[ element : hasCDATA ]
      +-[ attribute : cdata ] " simple cdata "
   +-[ ignorable ]
   +-[ ignorable ]
   +-[ ignorable ]
   +-[ element : hasID ]
      +-[ attribute : id ] "i20"
   +-[ ignorable ]
   +-[ ignorable ]
   +-[ ignorable ]
   +-[ element : hasNMTOKEN ]
      +-[ attribute : nmtoken ] "hello"
   +-[ ignorable ]
   +-[ ignorable ]
   +-[ ignorable ]
   +-[ element : hasEnumeration ]
      +-[ attribute : enumeration ] "true"
   +-[ ignorable ]
   +-[ ignorable ]
   +-[ ignorable ]
   +-[ element : hasMixed ]
      +-[ text ] ""
      +-[ text ] " This is text."
      +-[ text ] "
"
      +-[ text ] " "
      +-[ element : hasCDATA ]
         +-[ attribute : cdata ] " simple cdata"
      +-[ text ] ""
      +-[ text ] " This is some additional text."
      +-[ text ] ""
      +-[ text ] " "
   +-[ ignorable ]
   +-[ ignorable ]
[ document end ]
6.9 Case Study: Writing a DTD for the Day Planner Application
• Continue case study from Chapter 5
  – External subset of DTD for day planner
Fig. 6.15 DTD for planner.xml.
Root element planner contains any number of (optional) year elements
Element year contains one or more date elements
Element year contains attribute value that has character data
Element date contains one or more note elements
Element date contains attributes month and day, which contain character data
Element note contains parsed character data and optional attribute time

<!-- Fig. 6.15: planner.dtd -->
<!-- DTD for day planner -->

<!ELEMENT planner ( year* )>

<!ELEMENT year ( date+ )>
<!ATTLIST year value CDATA #REQUIRED>

<!ELEMENT date ( note+ )>
<!ATTLIST date month CDATA #REQUIRED>
<!ATTLIST date day CDATA #REQUIRED>

<!ELEMENT note ( #PCDATA )>
<!ATTLIST note time CDATA #IMPLIED>
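A sketch of an instance document that should validate against this DTD. It assumes planner.dtd sits in the same directory as the document; the dates and note text are made up for illustration and are not taken from the case study.

```xml
<?xml version = "1.0"?>

<!-- Hypothetical instance document: dates and notes are made up -->
<!DOCTYPE planner SYSTEM "planner.dtd">

<planner>
   <year value = "2002">
      <date month = "7" day = "15">
         <!-- time is optional (#IMPLIED) -->
         <note time = "1430">Doctor's appointment</note>
         <note>Buy groceries</note>
      </date>
   </year>
</planner>
```

Because month, day and value are declared as CDATA, the DTD does not check that they hold sensible numbers; a value such as month = "13" would still be valid.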