
Dec 29, 2015


XML Structures For Existing Databases

Ref: http://www-106.ibm.com/developerworks/xml/library/x-struct/


XML Schema Validator

• The Sun Multi-Schema XML Validator (MSV) is a Java[tm] technology tool to validate XML documents against several kinds of XML schemata.

Download: http://wwws.sun.com/software/xml/developers/multischema/


Document Type Definitions

• DTDs are associated with the entire element tree via the document element.


Writing DTDs

Example1.dtd

<!ELEMENT skills (skill)*>

<!ELEMENT skill (name)>

<!ATTLIST skill level CDATA #REQUIRED>

<!ELEMENT name ( #PCDATA )>

Example1.xml

<?xml version = "1.0"?>

<!-- Example1.xml -->

<!DOCTYPE skills SYSTEM "example1.dtd">

<skills>

<skill level="1">

<name > XML How to write DTD </name>

</skill>

</skills>


Validating with the Sun Validator

• Invoking the validator:

java -jar C:\pathName\msv.jar Example1.dtd Example1.xml

Output:

start parsing a grammar.

validating Example1.xml

the document is valid.


ID Attribute Type

• Attributes using the ID type serve as unique identifiers for a given instance of an element.

• The value of an ID attribute must be a valid XML name and must be unique within the document. An ID attribute must be declared with the #IMPLIED or #REQUIRED default.

• #IMPLIED: No default value. The attribute is optional.

• #REQUIRED: No default value. The attribute must appear on every instance of the element.

• #FIXED: The attribute may be omitted, but when present it is constrained to the given value.

• An element type may have at most one ID attribute.
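The ID rules above can be checked mechanically. The following is a minimal Python sketch, not a replacement for a validating parser: the standard `xml.etree.ElementTree` module does not read attribute types from a DTD, so the set `ID_ATTRS` is a hand-maintained assumption, and the name check is a simplified ASCII-only approximation of the XML name rule.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: attribute names declared with type ID in the DTD, listed by hand
# because ElementTree does not read attribute types from a DTD.
ID_ATTRS = {"OrderId", "ItemId"}

# Simplified, ASCII-only approximation of an XML name
# (the real rule allows more characters).
NAME_RE = re.compile(r"^[A-Za-z_:][A-Za-z0-9_.:\-]*$")

def check_ids(xml_text, id_attrs=ID_ATTRS):
    """Return a list of problems found with ID attribute values."""
    problems = []
    seen = set()
    for elem in ET.fromstring(xml_text).iter():
        for attr, value in elem.attrib.items():
            if attr not in id_attrs:
                continue
            if not NAME_RE.match(value):
                problems.append(f"{value!r} is not a valid XML name")
            if value in seen:
                problems.append(f"{value!r} is used more than once")
            seen.add(value)
    return problems

# Hypothetical document with one duplicate ID and one ID that starts with a digit.
doc = """<PurchaseOrders>
  <Order OrderId="od123"><Item ItemId="I1"/><Item ItemId="I1"/></Order>
  <Order OrderId="123"/>
</PurchaseOrders>"""

for problem in check_ids(doc):
    print(problem)
```

A validating parser such as MSV reports the same kinds of violations directly from the DTD, so a check like this is only useful where no validator is available.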


Example 2

<!ELEMENT somethings (anything)*>

<!ELEMENT anything ANY>

<!ELEMENT one (#PCDATA)>

<!ELEMENT two (#PCDATA)>

<?xml version = "1.0"?>

<!-- Example2.xml -->

<!DOCTYPE somethings SYSTEM "example2.dtd">

<somethings>

<anything>

<two> Two </two>

<one> One </one>

</anything>

</somethings>


Example 3

<!ELEMENT PurchaseOrders (Order)*>
<!ELEMENT Order (Item+)>
<!ATTLIST Order
    OrderId ID #REQUIRED>
<!ELEMENT Item (ItemName,price)>
<!ATTLIST Item
    ItemId ID #REQUIRED>
<!ELEMENT ItemName (#PCDATA)>
<!ELEMENT price (#PCDATA)>

<?xml version = "1.0"?>
<!-- Example3.xml -->
<!DOCTYPE PurchaseOrders SYSTEM "example3.dtd">
<PurchaseOrders>
  <Order OrderId="od123">
    <Item ItemId="I1">
      <ItemName> Item1 </ItemName>
      <price> 20.00 </price>
    </Item>
    <Item ItemId="I2">
      <ItemName> Item2 </ItemName>
      <price> 20.00 </price>
    </Item>
  </Order>
  <Order OrderId="od456">
    <Item ItemId="I3">
      <ItemName> Item3 </ItemName>
      <price> 20.00 </price>
    </Item>
    <Item ItemId="I4">
      <ItemName> Item4 </ItemName>
      <price> 20.00 </price>
    </Item>
  </Order>
</PurchaseOrders>


Example 4

<!ELEMENT PurchaseOrders (Order,Manufacturer)*>
<!ELEMENT Order (Item+)>
<!ATTLIST Order
    OrderId ID #REQUIRED>
<!ELEMENT Item (ItemName)>
<!ATTLIST Item
    ItemId ID #REQUIRED
    Manf IDREF #REQUIRED>
<!ELEMENT ItemName (#PCDATA)>
<!ELEMENT Manufacturer EMPTY>
<!ATTLIST Manufacturer
    ManfId ID #REQUIRED>

<?xml version = "1.0"?>
<!-- Example4.xml -->
<!DOCTYPE PurchaseOrders SYSTEM "example4.dtd">
<PurchaseOrders>
  <Order OrderId="od123">
    <Item ItemId="I1" Manf="m444">
      <ItemName> Item1 </ItemName>
    </Item>
  </Order>
  <Manufacturer ManfId="m444"/>
</PurchaseOrders>


Example 5

<!ENTITY % ID_Req "ID #REQUIRED">
<!ELEMENT PurchaseOrders (Order,Manufacturer)*>
<!ELEMENT Order (Item+)>
<!ATTLIST Order
    OrderId ID #REQUIRED>
<!ELEMENT Item (ItemName,Comments)>
<!ATTLIST Item
    ItemId %ID_Req;
    Manf IDREF #REQUIRED>
<!ELEMENT ItemName (#PCDATA)>
<!ELEMENT Manufacturer EMPTY>
<!ATTLIST Manufacturer
    ManfId ID #REQUIRED>
<!ELEMENT Comments (#PCDATA)>
<!ENTITY Greetings "Hello,World">

<?xml version = "1.0"?>
<!-- Example5.xml -->
<!DOCTYPE PurchaseOrders SYSTEM "example5.dtd">
<PurchaseOrders>
  <Order OrderId="od123">
    <Item ItemId="I1" Manf="m444">
      <ItemName> Item1 </ItemName>
      <Comments>&Greetings;</Comments>
    </Item>
  </Order>
  <Manufacturer ManfId="m444"/>
</PurchaseOrders>


Modeling Relationships

Invoice
  invoiceId : Integer
  customerId : Integer
  orderDate : Date

LineItem
  lineItemId : Integer
  invoiceId : Integer
  itemDesc : String
  price : Real
  quantity : Integer

Relationship: Invoice (1) to LineItem (1..*)

Primary Keys: Invoice: invoiceId; LineItem: lineItemId

Foreign Keys: invoiceId in LineItem


Modeling Relationships

• One-to-one and one-to-many relationships are best represented by containment.

• In the following DTD, mostly attributes are used to represent the information in LineItem and Invoice.

<!ELEMENT Invoice (LineItem+)>
<!ATTLIST Invoice
    orderDate CDATA #REQUIRED
    customerId CDATA #REQUIRED>
<!ELEMENT LineItem EMPTY>
<!ATTLIST LineItem
    itemDesc CDATA #REQUIRED
    price CDATA #REQUIRED
    quantity CDATA #REQUIRED>
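To make the containment idea concrete, here is a minimal Python sketch that nests LineItem rows under their Invoice using the standard `xml.etree.ElementTree` module. The row dictionaries and their values are hypothetical stand-ins for the Invoice and LineItem tables; only the element and attribute names come from the DTD above.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows standing in for the Invoice and LineItem tables.
invoices = [
    {"invoiceId": 1, "customerId": 7, "orderDate": "2002-06-25"},
]
line_items = [
    {"lineItemId": 10, "invoiceId": 1, "itemDesc": "Cosmos", "price": "50.00", "quantity": 1},
    {"lineItemId": 11, "invoiceId": 1, "itemDesc": "XML", "price": "25.00", "quantity": 2},
]

def invoice_to_xml(invoice, line_items):
    """Build an Invoice element that contains its LineItem children."""
    inv = ET.Element("Invoice", {
        "orderDate": invoice["orderDate"],
        "customerId": str(invoice["customerId"]),
    })
    for row in line_items:
        # The foreign key decides which parent a row is nested under;
        # it is not written out, because nesting already expresses it.
        if row["invoiceId"] == invoice["invoiceId"]:
            ET.SubElement(inv, "LineItem", {
                "itemDesc": row["itemDesc"],
                "price": row["price"],
                "quantity": str(row["quantity"]),
            })
    return inv

root = invoice_to_xml(invoices[0], line_items)
print(ET.tostring(root, encoding="unicode"))
```

Note that the invoiceId foreign key disappears from the output: the parent-child nesting carries the same information.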


Example

Invoice
  invoiceId : Integer
  customerId : Integer
  orderDate : Date

LineItem
  lineItemId : Integer
  invoiceId : Integer
  itemId : Integer
  price : Real
  quantity : Integer

Item
  itemId : Integer
  itemName : String
  itemDesc : String

Relationships: Invoice (1) to LineItem (1..*); LineItem (1..*) to Item (1)


Modeling Relationships

• Many-to-many relationships can be represented using pointers, that is, ID/IDREF pairs.

• Using containment in this situation will result in redundancy of information.

• Example:

<!ELEMENT Orders (Invoice+,Item+)>
<!ELEMENT Invoice (LineItem+)>
<!ATTLIST Invoice
    orderDate CDATA #REQUIRED
    customerId CDATA #REQUIRED>
<!ELEMENT LineItem EMPTY>
<!ATTLIST LineItem
    itemIDREF IDREF #REQUIRED
    price CDATA #REQUIRED
    quantity CDATA #REQUIRED>
<!ELEMENT Item EMPTY>
<!ATTLIST Item
    itemId ID #REQUIRED
    itemName CDATA #REQUIRED
    itemDesc CDATA #REQUIRED>
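A consumer of such a document follows each IDREF back to the element that carries the matching ID. The sketch below shows one way to do that in Python with `xml.etree.ElementTree`; the instance document and its attribute values are hypothetical, and only the element and attribute names are taken from the DTD above.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document; the item "i2" is shared by two invoices
# but described only once.
doc = """<Orders>
  <Invoice orderDate="2002-06-25" customerId="7">
    <LineItem itemIDREF="i1" price="50.00" quantity="1"/>
    <LineItem itemIDREF="i2" price="25.00" quantity="2"/>
  </Invoice>
  <Invoice orderDate="2002-07-28" customerId="8">
    <LineItem itemIDREF="i2" price="25.00" quantity="1"/>
  </Invoice>
  <Item itemId="i1" itemName="Cosmos" itemDesc="Astronomy book"/>
  <Item itemId="i2" itemName="XML" itemDesc="XML book"/>
</Orders>"""

root = ET.fromstring(doc)

# Index every Item by its ID so that an IDREF can be followed like a pointer.
items_by_id = {item.get("itemId"): item for item in root.iter("Item")}

def item_names(invoice):
    """Follow each LineItem's itemIDREF to the Item it points at."""
    return [items_by_id[li.get("itemIDREF")].get("itemName")
            for li in invoice.iter("LineItem")]

for invoice in root.iter("Invoice"):
    print(invoice.get("orderDate"), item_names(invoice))
```

With containment, the name and description of the shared item would have to be repeated inside every LineItem that uses it.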


XML Design For Data

Some Issues To Consider

• Establish the scope of the document

• Identify the structures to model

• Identify the relationships between entities

• Identify data points that need to be associated with each structure


XML Design For Data

Some Issues To Consider

• Example: Let us take two purchase orders and model them in XML.

Books, Inc

Purchase Order

Order date: 6/25/2002

Shipping Date: 6/27/2002

Customer:

Mary Jones

500 Alameda

Santa Clara, Santa Clara, CA 95013

Shipping Co: UPS

Item No (ISBN)   Description   quantity   price   total
Q1234            Cosmos        1          50.00   50.00
Q555             XML           2          25.00   50.00
Total                                             $100.00


XML Design For Data

Some Issues To Consider

Books, Inc

Purchase Order

Order date: 7/28/2002

Shipping Date: 7/30/2002

Customer:

John Smith

555 Spring Ct

Santa Clara, Santa Clara, CA 95013

Shipping Co: fedex

Item No (ISBN)   Description   quantity   price   total
Q333             Java          5          30.00   150.00
Q555             XML           1          25.00   25.00
Total                                             $175.00


• Establish the scope:

– One XML document per PurchaseOrder

– One XML document to represent a number of PurchaseOrders.


• Identify the structures to model

• Orders

– PurchaseOrder

– Customer

– Item

– LineItem


Customer
  name : String
  stNumber : String
  stName : String
  City : String
  State : String
  zip : String

Orders
  startDate
  endDate

PurchaseOrder
  orderDate : Date
  shippingDate : Date
  shippingCo : String

LineItem
  quantity
  price

Item
  itemCode
  desc : String

[Class diagram: the associations between these classes carry multiplicities of 1 and 1..*.]


Creating The XML DTD

Start With the Structures and establish the Elements

<!ELEMENT Orders EMPTY>
<!ELEMENT PurchaseOrder EMPTY>
<!ELEMENT Customer EMPTY>
<!ELEMENT Item EMPTY>
<!ELEMENT LineItem EMPTY>


Add the Data points to the Elements

We will use attributes to represent the data points.

<!ELEMENT Orders EMPTY>
<!ATTLIST Orders
    StartDate CDATA #REQUIRED
    EndDate CDATA #REQUIRED>
<!ELEMENT PurchaseOrder EMPTY>
….
<!ELEMENT Customer EMPTY>
….
<!ELEMENT Item EMPTY>
….
<!ELEMENT LineItem EMPTY>
….


Model the relationships

• Use containment whenever possible.

• Relationships:

– Each Orders contains many PurchaseOrders.

– Each PurchaseOrder has one Customer.

– Each Customer may be associated with more than one PurchaseOrder.

– Each LineItem has one Item.

– Each Item may be in more than one LineItem.


• Each Orders contains many PurchaseOrders.

• Each PurchaseOrder contains many LineItems.

<!ELEMENT Orders (PurchaseOrder+)>
<!ATTLIST Orders
    StartDate CDATA #REQUIRED
    EndDate CDATA #REQUIRED>
<!ELEMENT PurchaseOrder (LineItem+)>
….
<!ELEMENT Customer EMPTY>
….
<!ELEMENT Item EMPTY>
….
<!ELEMENT LineItem EMPTY>
….


Modeling Relationships


Each PurchaseOrder has one Customer.

Each Customer may be associated with more than one PurchaseOrder.

Each LineItem has one Item.

Each Item may be in more than one LineItem

– To avoid the repetition of data for Customer and Item, we will use ID/IDREF to represent the one to many relationship.

– The elements Customer and Item are promoted to the document scope


<!ELEMENT Orders (PurchaseOrder+, Customer+, Item+)>
…
<!ELEMENT PurchaseOrder (LineItem+)>
<!ATTLIST PurchaseOrder
    orderDate CDATA #REQUIRED
    shippingDate CDATA #REQUIRED
    shippingCo (fedex | ups) #REQUIRED
    customerIDREF IDREF #REQUIRED>
….
<!ELEMENT Customer EMPTY>
<!ATTLIST Customer
    customerId ID #REQUIRED
    name CDATA #REQUIRED
    ….>


XML Structures For Existing Databases

• Migrating a Database To XML

• Scoping the XML Document

• Creating the Root Element

• Model the tables

• Model the non-foreign key values

• Adding ID attributes

• Handling Foreign Keys

• Adding the Relationships


Scoping the XML Document

• Choose the data to include in the XML document.

– Based on the business requirements the XML document will fulfill, decide which tables and columns need to be included in the XML document.


Creating The Root Element

• Create a root element for the document.

• Declare any attributes of that element that are required to hold additional semantic information.


Model the tables

• Content tables: Contain a set of records.

– E.g., Customer information

• Lookup tables: Contain a list of ID-description pairs that are used to further classify information.

– E.g., Shipping Company

• Relation tables: Express many-to-many relationships as separate tables. These will be treated as content tables.


Content Tables

• Create an element in the DTD for each content table.

• Model the non-foreign key values:

– As attributes: Declare each value in the ATTLIST associated with the element. Each attribute should have the type CDATA. If the column cannot take nulls in the database, declare the attribute #REQUIRED; otherwise, declare it #IMPLIED.

– As elements: If the column allows nulls in the database, use ? as the suffix in the content model; otherwise, use no suffix.
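The nullability rule above is mechanical enough to script. The following Python sketch generates either form of declaration from a list of columns; the function names, the Customer column list, and the choice of which columns are nullable are all hypothetical and serve only as illustration.

```python
def attlist_for(table, columns):
    """Return an ATTLIST declaration for a content table.

    columns: list of (column_name, nullable) pairs for non-foreign-key columns.
    """
    lines = [f"<!ATTLIST {table}"]
    for name, nullable in columns:
        # Nullable columns become optional attributes, the rest are required.
        default = "#IMPLIED" if nullable else "#REQUIRED"
        lines.append(f"    {name} CDATA {default}")
    return "\n".join(lines) + ">"

def content_model_for(table, columns):
    """Alternative: model the same columns as child elements."""
    # A "?" suffix marks a child element as optional.
    parts = [name + ("?" if nullable else "") for name, nullable in columns]
    return f"<!ELEMENT {table} ({', '.join(parts)})>"

# Hypothetical Customer columns; which ones are nullable is an assumption.
customer_columns = [("name", False), ("city", False), ("zip", True)]

print(attlist_for("Customer", customer_columns))
print(content_model_for("Customer", customer_columns))
```

When the element form is chosen, each child element (name, city, zip) still needs its own declaration, typically `(#PCDATA)`.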


Adding ID attributes

• Add an ID attribute to each of the Elements (with the exception of root element).

• Use the element name followed by ID for the name of the new attribute, watching for name collisions.

• Note that a unique ID (unique across all elements in the document) will need to be created for each instance of an element.

• If there are row-based primary keys, use them by prefixing them with the table name.
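As a small illustration of the last point, the Python sketch below builds ID values by prefixing a row's primary key with its table name. The helper name `make_id` and the sample keys are hypothetical; the sketch also assumes that table names begin with a letter, so that the result is a valid XML name.

```python
def make_id(table, primary_key):
    """Prefix the row's primary key with the table name.

    The prefix keeps IDs from different tables apart and also makes the
    value start with a letter, as an XML name must.
    """
    return f"{table}{primary_key}"

# Hypothetical rows: both tables happen to use the key 1.
ids = [make_id("Customer", 1), make_id("Item", 1), make_id("Item", 2)]
print(ids)

# Every value must be unique across the whole document, not just within a table.
assert len(ids) == len(set(ids))
```

If one table name is a prefix of another and ends in a digit, plain concatenation could still collide, so a separator such as an underscore may be a safer choice in practice.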


Handling Foreign Keys

• Foreign keys serve as glue to connect the different tables in a database.

• In XML, relationships between elements can be represented:

– Using containment (via nesting).

– Using ID/IDREF pairs.


Modeling Lookup Tables

• For each foreign key that we have chosen to include in our XML structures that references a lookup table:

– Create an attribute on the element representing the table in which the foreign key is found.

– Give the attribute the same name as the table referenced by the foreign key.

– Make the attribute of the enumerated list type.

– Example:

<!ELEMENT PurchaseOrder (LineItem+)>
<!ATTLIST PurchaseOrder
    shippingCo (fedex | ups) #REQUIRED>

ShippingCo
  shippingType : Integer
  desc : String

PurchaseOrder
  shippingType : Integer
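The enumerated list can be derived from the rows of the lookup table itself. Below is a minimal Python sketch; the function name and the two sample rows are hypothetical, and the sketch assumes every description is usable as an XML name token (no spaces or punctuation that DTD enumerations disallow).

```python
def enumerated_attribute(attr_name, lookup_rows, default="#REQUIRED"):
    """Build an enumerated attribute definition from a lookup table.

    lookup_rows: list of (id, description) pairs; the descriptions become
    the allowed values, so each must be a valid XML name token.
    """
    values = " | ".join(desc for _id, desc in lookup_rows)
    return f"{attr_name} ({values}) {default}"

# Hypothetical contents of the ShippingCo lookup table.
shipping_co_rows = [(1, "fedex"), (2, "ups")]

print("<!ATTLIST PurchaseOrder")
print("    " + enumerated_attribute("shippingCo", shipping_co_rows) + ">")
```

The integer key of the lookup table is dropped on purpose: the XML document carries the readable description instead of the shippingType number.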


• Add element content to the root element:

– Add a child element or elements to the allowable content of the root element for each table that models the content information in the document.


Modeling the relationships

• Walk the relationships between tables to add ID/IDREF where applicable.

• We walk the relationships in the direction that makes the most business sense, for example, from PurchaseOrder to LineItem.

• If the relationship is 1 to 1 or 1 to n, in the direction that is being navigated, and no other relationship leads to the child within the selected subset, then add the child element as element content of the parent element with the appropriate cardinality.


• Many-to-one or multiple-parent relationships: Identify each relationship that is many-to-one in the direction we have defined it, or whose child is a child in more than one relationship we have defined.

• For each of these relationships, add an IDREF or IDREFS attribute to the element on the parent side of the relationship, which points to the ID of the element on the child side of the relationship.
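The decision between containment and pointers described in these rules can be sketched as a small function. This is an illustrative Python sketch, not a complete migration tool: the relationship list and the cardinality labels ("1:1", "1:n", "n:1") are hypothetical notation for the purchase-order example, written in the direction the relationships are walked.

```python
from collections import Counter

# Hypothetical relationships, written in the direction they are walked:
# (parent, child, cardinality seen from the parent).
relationships = [
    ("Orders", "PurchaseOrder", "1:n"),
    ("PurchaseOrder", "LineItem", "1:n"),
    ("PurchaseOrder", "Customer", "n:1"),
    ("LineItem", "Item", "n:1"),
]

def choose_representation(relationships):
    """Map each (parent, child) pair to 'containment' or 'IDREF'."""
    times_as_child = Counter(child for _parent, child, _card in relationships)
    result = {}
    for parent, child, card in relationships:
        single_parent = times_as_child[child] == 1
        if card in ("1:1", "1:n") and single_parent:
            result[(parent, child)] = "containment"
        else:
            # Many-to-one, or a child reached through several relationships.
            result[(parent, child)] = "IDREF"
    return result

for pair, kind in choose_representation(relationships).items():
    print(pair, kind)
```

For the sample data, PurchaseOrder and LineItem end up nested, while Customer and Item are reached through IDREF attributes on PurchaseOrder and LineItem.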


• Add missing elements to the root element:

– For any element that is pointed to in the structure so far, add that element as allowable element content of the root element.

• Discard unreferenced ID attributes:

– Remove unwanted ID attributes, that is, ID attributes that are not referenced by IDREF or IDREFS attributes elsewhere in the XML structures.
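A DTD does not record which element an IDREF attribute is meant to point at, so this clean-up step relies on the designer's own notes. The Python sketch below assumes those notes are kept in two dictionaries; the dictionary contents and the ID attribute names are hypothetical, following the "element name followed by ID" convention.

```python
# Hypothetical design state: the ID attribute added to each element, and
# the element each IDREF attribute points at.
id_attributes = {
    "PurchaseOrder": "PurchaseOrderID",
    "LineItem": "LineItemID",
    "Customer": "CustomerID",
    "Item": "ItemID",
}
idref_targets = {
    ("PurchaseOrder", "customerIDREF"): "Customer",
    ("LineItem", "itemIDREF"): "Item",
}

def unreferenced_ids(id_attributes, idref_targets):
    """Return the elements whose ID attribute no IDREF points at."""
    referenced = set(idref_targets.values())
    return sorted(elem for elem in id_attributes if elem not in referenced)

print(unreferenced_ids(id_attributes, idref_targets))
```

In this sample, the ID attributes on PurchaseOrder and LineItem are candidates for removal, while those on Customer and Item must stay because IDREF attributes point at them.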