QUALITY CONTROL WITH SCHEMAS
CSC1310 Fall 2009

Jan 18, 2016


Logan Sutton

BASIC CONCEPTS

• A schema is a pass-or-fail test for a document.
• A schema is a minimum set of requirements for a document, meant to prevent anomalous processing or to formalize an application.
• Validation is testing a document against a schema. It can check:
– Structure: use and placement of markup elements and attributes.
– Data typing: patterns of character data.
– Integrity: the status of links between nodes and resources.
– Business rules: spelling checks, checksum results, and so on.

DOCUMENT TYPE DEFINITIONS (DTDS)

• DTD is the oldest and most widely supported schema language.
• A DTD declares a set of allowed elements (vocabulary).
• A DTD defines a content model for each element (grammar).
• A DTD declares a set of allowed attributes for each element: name, data type, default value, and behavior (for example, required or optional).
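
As a minimal sketch of these three roles, the document below carries its DTD in an internal subset. The element names (memo, to, body) and the priority attribute are invented for illustration and do not come from the slides.

```xml
<?xml version="1.0"?>
<!DOCTYPE memo [
  <!-- Vocabulary: the only elements allowed are memo, to and body -->
  <!-- Grammar: a memo holds one or more recipients, then one body -->
  <!ELEMENT memo (to+, body)>
  <!ELEMENT to   (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!-- Attribute: name, data type (an enumeration) and default value -->
  <!ATTLIST memo priority (low | normal | high) "normal">
]>
<memo priority="high">
  <to>Registrar</to>
  <body>Room change for Friday.</body>
</memo>
```

A validating parser should reject this document if, for example, body came before to or priority held a value outside the list.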

DOCUMENT PROLOG FOR DTD

• All external parsed entities (including the DTD) should begin with a text declaration.
• A text declaration looks like an XML declaration, except that it explicitly excludes the standalone property.

<?xml version="1.0" encoding="character set"?>

• The encoding declared in a DTD does not automatically carry over to the XML documents that use the DTD.
• External parsed entities (including the DTD) must not contain a document type declaration.
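
A small sketch of an external DTD file that follows these rules might look like this. The file name note.dtd and the note element are assumptions made for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical external DTD file, for example note.dtd.
     The text declaration above carries no standalone property,
     and the file contains no DOCTYPE declaration of its own. -->
<!ELEMENT note (#PCDATA)>
```

A document that points at this file with <!DOCTYPE note SYSTEM "note.dtd"> still states its own encoding in its own XML declaration.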

DECLARATIONS

• A DTD is a set of rules (declarations).
• Each declaration adds a new element, set of attributes, entity, or notation.
• If an entity is declared more than once, the first declaration that appears takes precedence and the others are ignored.

An element declaration gives one of these kinds of content:
• EMPTY: no content (special tags like <br>).
• ANY: any content.
• #PCDATA: parsed character data. (CDATA is the matching keyword for attribute values, not for element content.)
• With children: a parent-child relationship (including the order of the children).
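
The four kinds of content can be sketched in one small DTD. The page, title, para and br elements are invented for this example.

```xml
<?xml version="1.0"?>
<!DOCTYPE page [
  <!-- With children: one title, then any number of para or br elements -->
  <!ELEMENT page  (title, (para | br)*)>
  <!-- Character data only -->
  <!ELEMENT title (#PCDATA)>
  <!-- ANY: text mixed with any declared elements -->
  <!ELEMENT para  ANY>
  <!-- EMPTY: no content at all -->
  <!ELEMENT br    EMPTY>
]>
<page>
  <title>Schedule</title>
  <para>Lab moves to <br/>room 12.</para>
  <br/>
</page>
```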

USE OF CHILDREN

There are several ways that child elements can be defined in a DTD file:
• One occurrence only (no indicator)
• Minimum of one occurrence (+)
• Zero or more occurrences (*)
• Zero or one occurrence (?)
• Either/or occurrences ( | )
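
The indicators above can be combined in a single content model. The sketch below uses an invented recipe vocabulary; none of these element names appear in the slides.

```xml
<?xml version="1.0"?>
<!DOCTYPE recipe [
  <!-- title: exactly one occurrence (no indicator) -->
  <!-- author?: zero or one occurrence -->
  <!-- ingredient+: one or more occurrences -->
  <!-- note*: zero or more occurrences -->
  <!-- (steps | link): exactly one of the two -->
  <!ELEMENT recipe (title, author?, ingredient+, note*, (steps | link))>
  <!ELEMENT title      (#PCDATA)>
  <!ELEMENT author     (#PCDATA)>
  <!ELEMENT ingredient (#PCDATA)>
  <!ELEMENT note       (#PCDATA)>
  <!ELEMENT steps      (#PCDATA)>
  <!ELEMENT link       (#PCDATA)>
]>
<recipe>
  <title>Toast</title>
  <ingredient>bread</ingredient>
  <ingredient>butter</ingredient>
  <steps>Toast the bread, then butter it.</steps>
</recipe>
```

The instance omits author and note, which is allowed, but it could not omit ingredient or supply both steps and link.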

ATTRIBUTES

There are four value options:
• Value: the default value of the attribute, surrounded by quotes (" ").
• #IMPLIED: the attribute is optional.
• #FIXED: the attribute has a fixed value.
• #REQUIRED: the attribute is required when the element is used.
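
A short sketch showing one attribute for each of the four options follows. The task element and its attribute names are invented for this example.

```xml
<?xml version="1.0"?>
<!DOCTYPE task [
  <!ELEMENT task (#PCDATA)>
  <!-- status:  default value "open" is used when the attribute is omitted -->
  <!-- owner:   #IMPLIED, may be left out with no default supplied -->
  <!-- version: #FIXED, always "1.0"; any other value is an error -->
  <!-- id:      #REQUIRED, must be written on every task -->
  <!ATTLIST task
    status  (open | done) "open"
    owner   CDATA         #IMPLIED
    version CDATA         #FIXED "1.0"
    id      ID            #REQUIRED>
]>
<task id="t1">Grade homework 3</task>
```

Only id is written in the instance; a parser that reads the DTD supplies status="open" and version="1.0" on its own.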

TYPES OF ATTRIBUTE

• CDATA: the value is character data.
• (en1|en2|...): the value is one of an enumerated list.
• ID: the value is a unique id.
• IDREF: the value is the id of another element.
• IDREFS: the value is a list of other ids.
• NMTOKEN: the value is a valid XML name token (built from name characters, but unlike a name it may start with a digit).
• NMTOKENS: the value is a list of valid XML name tokens.
• ENTITY: the value is the name of an unparsed entity.
• ENTITIES: the value is a list of entity names.
• NOTATION: the value is the name of a notation.
• xml: the attribute is one of the predefined xml: attributes, such as xml:lang or xml:space.
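
To make the types concrete, here is a sketch that uses most of them on one element. The library and book vocabulary, the jpeg notation and the cover1 entity are all assumptions made for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE library [
  <!ELEMENT library (book+)>
  <!ELEMENT book (#PCDATA)>
  <!-- An ENTITY attribute must name an unparsed entity with a declared notation -->
  <!NOTATION jpeg SYSTEM "image/jpeg">
  <!ENTITY cover1 SYSTEM "cover1.jpg" NDATA jpeg>
  <!ATTLIST book
    id      ID            #REQUIRED
    sequel  IDREF         #IMPLIED
    related IDREFS        #IMPLIED
    shelf   NMTOKEN       #IMPLIED
    tags    NMTOKENS      #IMPLIED
    cover   ENTITY        #IMPLIED
    binding (hard | soft) "soft"
    note    CDATA         #IMPLIED>
]>
<library>
  <book id="b1" sequel="b2" shelf="3A" tags="xml dtd" cover="cover1">First</book>
  <book id="b2" related="b1" binding="hard" note="any text at all">Second</book>
</library>
```

The value 3A is accepted for shelf because a name token may begin with a digit, while an ID value such as b1 must be a name and so cannot.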

EXAMPLE

<!ELEMENT date (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>

EXAMPLE

<!ELEMENT address (street, city, country, zip)>
<!ELEMENT street (#PCDATA | unit)*>
<!ELEMENT city (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT zip (#PCDATA)>
<!ELEMENT unit (#PCDATA)>

EXAMPLE

<!ELEMENT person (name, age, gender)>
<!ELEMENT name (first, last, (junior | senior)?)>
<!ELEMENT age (#PCDATA)>
<!ELEMENT gender (#PCDATA)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT junior EMPTY>
<!ELEMENT senior EMPTY>
<!ATTLIST person
  pid ID #REQUIRED
  employed (fulltime | parttime) #IMPLIED>

TIPS FOR DESIGNING DTD

• Organize declarations into groups by their purpose.
– Blocks, hierarchical elements, parts of tables, lists, etc.
• Use whitespace.
– It makes the DTD more understandable and easier to navigate.
• Use comments.
– At the top of each DTD file: purpose, version number, contact information.
– Customization: original, authors, your changes.
– Label each section and subsection of the DTD.
• Track versions.
• Use parameter entities.
– They hold recurring parts of declarations so that these can be edited in one place.

PARAMETER ENTITIES

• In the external DTD, parameter entities can be used in:
– Element-type declarations, to hold element groups.
– Attribute list declarations, to hold attribute definitions.
• In the internal DTD, they can hold only complete declarations.

<!ENTITY % common.atts "id ID #IMPLIED class CDATA #IMPLIED">
<!ATTLIST foo %common.atts;>
<!ATTLIST bar %common.atts; extra CDATA #FIXED "blah">
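
The same technique works for element groups in content models. The fragment below is a sketch of part of an external DTD; the inline entity and the para, title, em and code elements are invented names.

```xml
<!-- Hypothetical fragment of an external DTD file -->
<!-- The group of inline elements is written once... -->
<!ENTITY % inline "em | code">
<!-- ...and reused wherever mixed content should allow it -->
<!ELEMENT para  (#PCDATA | %inline;)*>
<!ELEMENT title (#PCDATA | %inline;)*>
<!ELEMENT em    (#PCDATA)>
<!ELEMENT code  (#PCDATA)>
```

Adding a new inline element then means editing only the entity value, not every content model that uses it.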

IMPORTING MODULES

• The .mod extension means that a file contains declarations but should not be used as a DTD on its own.
• An external parameter entity imports all the text in a file.

<!ELEMENT catalog (title, metadata, front, entries+)>
<!ENTITY % basic.stuff SYSTEM "basics.mod">
%basic.stuff;
<!ENTITY % front.matter SYSTEM "front.mod">
%front.matter;
<!ENTITY % metadata SYSTEM "metadata.dtd">
%metadata;

CONDITIONAL SECTIONS

• A conditional section is a special form of markup in a DTD that marks a region for inclusion or exclusion.
• Conditional sections can be used only in external subsets.

<![INCLUDE [ DTD text ]]>
<![IGNORE [ DTD text ]]>

<![INCLUDE [<!ELEMENT blah (#PCDATA)>]]>

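OVERRIDING ELEMENT

In DTD:
<!ENTITY % default.polyhedron "INCLUDE">
<![%default.polyhedron;[
  <!ELEMENT polyhedron (side+, angle+)>
]]>

In XML:
<!DOCTYPE picture SYSTEM "shapes.dtd" [
  <!ENTITY % default.polyhedron "IGNORE">
  <!ELEMENT polyhedron (side, side, side+, angle, angle, angle+)>
]>

The internal subset is read before the external DTD, and the first declaration of an entity wins. The conditional section in shapes.dtd therefore becomes an IGNORE section, and the document's own polyhedron declaration is the one that applies.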

LIMITATIONS OF DTD

• A DTD describes how elements are arranged in a document, but says little about the content of the document.
• A DTD is not flexible about the order of children.
• Locked-down namespace: every element in a document must have a corresponding declaration in the DTD.
• A schema is a newer kind of validation system. It:
– contains rules that must all be satisfied for a document to be considered valid;
– is not built into the XML specification;
– comes in several languages: W3C XML Schema, RELAX NG, Schematron.
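
The point about child order can be illustrated with a small sketch: to accept two children in either order, a DTD content model has to spell out both orders. The sample names in the instance are invented.

```xml
<?xml version="1.0"?>
<!DOCTYPE name [
  <!-- Each permitted order is listed as an explicit alternative -->
  <!ELEMENT name ((first, last) | (last, first))>
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT last  (#PCDATA)>
]>
<name>
  <last>Byron</last>
  <first>Ada</first>
</name>
```

With three children there are already six orders to write out, which is why order-independent content is awkward to express in a DTD.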

NAMESPACES

• Namespaces are used to group elements and attributes.
• Declaration syntax: xmlns:namespace_prefix="namespace_identifier"
• An xmlns attribute without a prefix declares the implicit (default) namespace.

<part-catalog xmlns:nw="http://www.nutware.com"
              xmlns="http://www.bobco.com">
  <nw:entry nw:number="1327">
    <nw:description>hexnut</nw:description>
  </nw:entry>
  <part id="555">
    <name>type 4</name>
  </part>
</part-catalog>

W3C SCHEMA (2001)

• Schemas are XML documents themselves.

In DTD:
<!ELEMENT country (#PCDATA)>

In W3C Schema:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="country" type="xs:string"/>
</xs:schema>

WIDELY USED TYPES

• xs:string: any text
• xs:token: textual tokens separated by whitespace
• xs:decimal: any decimal number
• xs:integer: any integer number
• xs:float: floating-point number
• xs:ID, xs:IDREF: the same as ID, IDREF in DTD
• xs:boolean: "true"/"false" ("1"/"0")
• xs:time: time in format HH:MM:SS, with an optional time zone
• xs:date: date in format CCYY-MM-DD
• xs:dateTime: date/time combination in format CCYY-MM-DDTHH:MM:SS, with an optional time zone
• xs:QName: namespace-qualified name
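
As a rough illustration, the schema sketch below declares one element per type, with a sample of valid content in each comment. The element names are invented for this example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Each comment shows one value the element would accept -->
  <xs:element name="title"    type="xs:string"/>   <!-- Any text at all -->
  <xs:element name="quantity" type="xs:integer"/>  <!-- 42 -->
  <xs:element name="price"    type="xs:decimal"/>  <!-- 19.95 -->
  <xs:element name="inStock"  type="xs:boolean"/>  <!-- true -->
  <xs:element name="shipDate" type="xs:date"/>     <!-- 2009-10-15 -->
  <xs:element name="shipTime" type="xs:time"/>     <!-- 14:30:00-05:00 -->
  <xs:element name="stamp"    type="xs:dateTime"/> <!-- 2009-10-15T14:30:00-05:00 -->
</xs:schema>
```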

COMPLEX ELEMENT IN SCHEMA

<xs:element name="date">
  <xs:complexType>
    <xs:all>
      <xs:element ref="year"/>
      <xs:element ref="month"/>
      <xs:element ref="day"/>
    </xs:all>
  </xs:complexType>
</xs:element>
<xs:element name="year" type="xs:integer"/>
<xs:element name="month" type="xs:integer"/>
<xs:element name="day" type="xs:integer"/>
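
Note that xs:all accepts its children in any order, whereas the DTD content model (year, month, day) fixes the order. If the fixed order matters, a sketch of the closer equivalent uses xs:sequence instead; this variant is an alternative for comparison, not part of the slides.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="date">
    <xs:complexType>
      <!-- xs:sequence: children must appear in exactly this order -->
      <xs:sequence>
        <xs:element ref="year"/>
        <xs:element ref="month"/>
        <xs:element ref="day"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="year"  type="xs:integer"/>
  <xs:element name="month" type="xs:integer"/>
  <xs:element name="day"   type="xs:integer"/>
</xs:schema>
```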

FACETS

A facet is a way to control the range of a data type.

<xs:simpleType name="monthNum">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="1"/>
    <xs:maxInclusive value="12"/>
  </xs:restriction>
</xs:simpleType>
<xs:element name="month" type="monthNum"/>

Facets can create fixed values, constrain the length of strings, match patterns, and set allowed values.
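
String length can be constrained in the same way. The sketch below uses the length facets of XML Schema; the type names shortName and countryCode and the two elements are invented for the example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Between 1 and 30 characters -->
  <xs:simpleType name="shortName">
    <xs:restriction base="xs:string">
      <xs:minLength value="1"/>
      <xs:maxLength value="30"/>
    </xs:restriction>
  </xs:simpleType>
  <!-- Exactly 2 characters -->
  <xs:simpleType name="countryCode">
    <xs:restriction base="xs:token">
      <xs:length value="2"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="nickname" type="shortName"/>
  <xs:element name="country"  type="countryCode"/>
</xs:schema>
```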

FACETS EXAMPLE

List of allowed values:
<xs:simpleType name="genderType">
  <xs:restriction base="xs:token">
    <xs:enumeration value="female"/>
    <xs:enumeration value="male"/>
  </xs:restriction>
</xs:simpleType>

Pattern:
<xs:simpleType name="pcode">
  <xs:restriction base="xs:token">
    <xs:pattern value="[0-9]{3}[A-Z]{3}"/>
  </xs:restriction>
</xs:simpleType>

SCHEMA EXAMPLE

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="census-record">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="date"/>
        <xs:element ref="address"/>
        <xs:element ref="person" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute ref="taker"/>
    </xs:complexType>
  </xs:element>

SCHEMA EXAMPLE

  <xs:attribute name="taker">
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:minInclusive value="1"/>
        <xs:maxInclusive value="9999"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:attribute>

SCHEMA EXAMPLE

  <xs:element name="date" type="xs:date"/>
  <xs:element name="address">
    <xs:complexType>
      <xs:all>
        <xs:element ref="street"/>
        <xs:element ref="city"/>
        <xs:element ref="country"/>
        <xs:element ref="zip"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
  <xs:element name="street" type="xs:string"/>
  <xs:element name="city" type="xs:string"/>
  <xs:element name="country" type="xs:string"/>

SCHEMA EXAMPLE

  <xs:element name="zip">
    <xs:simpleType>
      <xs:restriction base="xs:token">
        <xs:pattern value="[0-9]{3}[A-Z]{3}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

SCHEMA EXAMPLE

  <xs:element name="person">
    <xs:complexType>
      <xs:all>
        <xs:element ref="name"/>
        <xs:element ref="age"/>
        <xs:element ref="gender"/>
      </xs:all>
      <xs:attribute ref="employed"/>
      <xs:attribute ref="pid"/>
    </xs:complexType>
  </xs:element>

SCHEMA EXAMPLE

  <xs:attribute name="employed">
    <xs:simpleType>
      <xs:restriction base="xs:token">
        <xs:enumeration value="fulltime"/>
        <xs:enumeration value="parttime"/>
        <xs:enumeration value="none"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:attribute>
  <xs:attribute name="pid">
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:minInclusive value="1"/>
        <xs:maxInclusive value="999999"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:attribute>

SCHEMA EXAMPLE

  <xs:element name="age">
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:minInclusive value="0"/>
        <xs:maxInclusive value="150"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
  <xs:element name="gender">
    <xs:simpleType>
      <xs:restriction base="xs:token">
        <xs:enumeration value="female"/>
        <xs:enumeration value="male"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

SCHEMA EXAMPLE

  <xs:element name="name">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="first"/>
        <xs:element ref="last"/>
        <xs:choice minOccurs="0">
          <xs:element ref="junior"/>
          <xs:element ref="senior"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

SCHEMA EXAMPLE

  <xs:element name="junior" type="emptyElem"/>
  <xs:element name="senior" type="emptyElem"/>
  <xs:complexType name="emptyElem"/>
</xs:schema>
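
For orientation, here is a sketch of an instance document that the census-record declarations appear intended to accept. It assumes that gender is a child element of person and that an optional junior or senior follows first and last inside name. All data values are invented, and the first and last elements are assumed to hold plain text since the slides do not show their declarations.

```xml
<?xml version="1.0"?>
<!-- Invented sample data for the census-record schema -->
<census-record taker="4821">
  <date>2009-10-15</date>
  <address>
    <street>12 Elm Street</street>
    <city>Springfield</city>
    <country>USA</country>
    <!-- Must match [0-9]{3}[A-Z]{3} -->
    <zip>123ABC</zip>
  </address>
  <person employed="fulltime" pid="1001">
    <name>
      <first>Ada</first>
      <last>Byron</last>
      <junior/>
    </name>
    <!-- Integer between 0 and 150 -->
    <age>34</age>
    <gender>female</gender>
  </person>
</census-record>
```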