IS432: Semi-Structured Data
Dr. Azeddine Chikh
Dec 20, 2015
8. XML Relational Mapping
Introduction
• A common question in the XML community is how to map XML to databases.
• Two mappings are commonly used:
  – a table-based mapping
  – an object-relational (object-based) mapping
• Both mappings model the data in XML documents rather than the documents themselves.
• The table-based mapping cannot handle mixed content at all, and the object-relational mapping of mixed content is extremely inefficient.
• Both mappings are commonly used as the basis for software that transfers data between XML documents and databases, especially relational databases.
• Both mappings are bidirectional. That is, they can be used to transfer data both from XML documents to the DB and from the DB to XML documents.
Table-based Mapping
• There is an obvious mapping between the following XML document and table:
   <A>
      <B>
         <C>ccc</C>               Table A
         <D>ddd</D>               -------
         <E>eee</E>             C     D     E
      </B>                     ---   ---   ---
      <B>             <=>      ...   ...   ...
         <C>fff</C>            ccc   ddd   eee
         <D>ggg</D>            fff   ggg   hhh
         <E>hhh</E>            ...   ...   ...
      </B>
   </A>
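• A minimal sketch of the XML-to-table direction of this mapping, assuming a document shaped exactly like the example (root element = table, its children = rows, their children = column values). The function name `xml_to_rows` is hypothetical; only Python's standard `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET

DOC = """
<A>
  <B><C>ccc</C><D>ddd</D><E>eee</E></B>
  <B><C>fff</C><D>ggg</D><E>hhh</E></B>
</A>
"""

def xml_to_rows(xml_text):
    """Return (table_name, column_names, rows) for a table-shaped document."""
    root = ET.fromstring(xml_text.strip())
    columns = []
    rows = []
    for row_elem in root:              # each child of the root is one row
        row = {}
        for col_elem in row_elem:      # each grandchild is one column value
            if col_elem.tag not in columns:
                columns.append(col_elem.tag)
            row[col_elem.tag] = col_elem.text
        rows.append(row)
    return root.tag, columns, rows

table, columns, rows = xml_to_rows(DOC)
print(table, columns)
for row in rows:
    print(row)
```

Note that the element names of the rows (B) are not stored anywhere: the mapping keeps the data, not the document.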
• The table-based mapping views the document as a single table or a set of tables. The structure of the document must be either

   <Table>
      <Row>
         <Column_1>...</Column_1>
         ...
         <Column_n>...</Column_n>
      </Row>
      ...
      <Row>
         <Column_1>...</Column_1>
         ...
         <Column_n>...</Column_n>
      </Row>
   </Table>

• or

   <Tables>
      <Table_1>
         <Row>
            <Column_1>...</Column_1>
            ...
            <Column_n>...</Column_n>
         </Row>
         ...
      </Table_1>
      ...
      <Table_n>
         <Row>
            <Column_1>...</Column_1>
            ...
            <Column_m>...</Column_m>
         </Row>
         ...
      </Table_n>
   </Tables>
• The table-based mapping is commonly used by middleware to transfer data between XML documents and relational databases. It is also used in some Web application servers to return result set data as XML.
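• A minimal sketch of the database-to-XML direction, assuming an in-memory SQLite table named A with columns C, D, E. The helper `result_set_to_xml` and its `table_tag`/`row_tag` parameters are hypothetical, and skipping NULL values is an assumption rather than part of the mapping as described.

```python
import sqlite3
import xml.etree.ElementTree as ET

def result_set_to_xml(cursor, table_tag="Table", row_tag="Row"):
    """Serialize the rows of an executed cursor as <Table><Row>...</Row></Table>."""
    col_names = [d[0] for d in cursor.description]
    table = ET.Element(table_tag)
    for record in cursor:
        row = ET.SubElement(table, row_tag)
        for name, value in zip(col_names, record):
            if value is None:          # assumption: NULL columns are omitted
                continue
            ET.SubElement(row, name).text = str(value)
    return ET.tostring(table, encoding="unicode")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (C TEXT, D TEXT, E TEXT)")
conn.executemany("INSERT INTO A VALUES (?, ?, ?)",
                 [("ccc", "ddd", "eee"), ("fff", "ggg", "hhh")])
xml_text = result_set_to_xml(
    conn.execute("SELECT C, D, E FROM A ORDER BY rowid"),
    table_tag="A", row_tag="B")
conn.close()
print(xml_text)
```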
Generating Schema
1. Generating Relational Database Schema from DTDs
2. Generating DTDs from Database Schema
Generating Relational Database Schema from DTDs
• Relational schemas are generated by reading through the DTD and processing each element type:
  – Complex element types generate class tables with primary key columns.
  – Simple element types are ignored except when processing content models.
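• A minimal sketch of the first decision this process needs: telling simple (PCDATA-only) element types from complex ones. The regular expression and the function `classify` are hypothetical and only handle flat content models without nested groups or mixed content.

```python
import re

# Matches declarations of the form <!ELEMENT Name (content model)>
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+(\w+)\s+\(([^)]*)\)\s*>")

def classify(dtd_text):
    """Split element types into simple (PCDATA-only) and complex (element content)."""
    simple, complex_types = [], []
    for name, model in ELEMENT_DECL.findall(dtd_text):
        if model.strip() == "#PCDATA":
            simple.append(name)
        else:
            complex_types.append(name)
    return simple, complex_types

simple, complex_types = classify("""
<!ELEMENT Part (PartNum, Price)>
<!ELEMENT PartNum (#PCDATA)>
<!ELEMENT Price (#PCDATA)>
""")
print("simple:", simple)
print("complex:", complex_types)
```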
• The following example shows how this process works. Consider the following DTD:

   <!ELEMENT Order (OrderNum, Date, CustNum, Item*)>
   <!ELEMENT OrderNum (#PCDATA)>
   <!ELEMENT Date (#PCDATA)>
   <!ELEMENT CustNum (#PCDATA)>
   <!ELEMENT Item (ItemNum, Quantity, Part)>
   <!ELEMENT ItemNum (#PCDATA)>
   <!ELEMENT Quantity (#PCDATA)>
   <!ELEMENT Part (PartNum, Price)>
   <!ELEMENT PartNum (#PCDATA)>
   <!ELEMENT Price (#PCDATA)>
• In the first step, we generate tables for complex element types and primary keys for these tables:

   <!ELEMENT Order (OrderNum, Date, CustNum, Item*)>  ==> Table Order
   <!ELEMENT OrderNum (#PCDATA)>                             Column OrderPK
   <!ELEMENT Date (#PCDATA)>
   <!ELEMENT CustNum (#PCDATA)>
   <!ELEMENT Item (ItemNum, Quantity, Part)>          ==> Table Item
   <!ELEMENT ItemNum (#PCDATA)>                              Column ItemPK
   <!ELEMENT Quantity (#PCDATA)>
   <!ELEMENT Part (PartNum, Price)>                   ==> Table Part
   <!ELEMENT PartNum (#PCDATA)>                              Column PartPK
   <!ELEMENT Price (#PCDATA)>
• In the second step, we generate columns for references to simple element types:

   <!ELEMENT Order (OrderNum, Date, CustNum, Item*)>  ==> Table Order
                                                             Column OrderPK
   <!ELEMENT OrderNum (#PCDATA)>                             Column OrderNum
   <!ELEMENT Date (#PCDATA)>                                 Column Date
   <!ELEMENT CustNum (#PCDATA)>                              Column CustNum
   <!ELEMENT Item (ItemNum, Quantity, Part)>          ==> Table Item
                                                             Column ItemPK
   <!ELEMENT ItemNum (#PCDATA)>                              Column ItemNum
   <!ELEMENT Quantity (#PCDATA)>                             Column Quantity
   <!ELEMENT Part (PartNum, Price)>                   ==> Table Part
                                                             Column PartPK
   <!ELEMENT PartNum (#PCDATA)>                              Column PartNum
   <!ELEMENT Price (#PCDATA)>                                Column Price
• In the final step, we generate foreign keys for references to complex element types:

   <!ELEMENT Order (OrderNum, Date, CustNum, Item*)>  ==> Table Order
                                                             Column OrderPK
   <!ELEMENT OrderNum (#PCDATA)>                             Column OrderNum
   <!ELEMENT Date (#PCDATA)>                                 Column Date
   <!ELEMENT CustNum (#PCDATA)>                              Column CustNum
   <!ELEMENT Item (ItemNum, Quantity)>                ==> Table Item
                                                             Column ItemPK
   <!ELEMENT ItemNum (#PCDATA)>                              Column ItemNum
   <!ELEMENT Quantity (#PCDATA)>                             Column Quantity
                                                             Column OrderFK
                                                             Column PartFK
   <!ELEMENT Part (PartNum, Price, Item*)>            ==> Table Part
                                                             Column PartPK
   <!ELEMENT PartNum (#PCDATA)>                              Column PartNum
   <!ELEMENT Price (#PCDATA)>                                Column Price
• A generated schema is unlikely to match the schema a human designer would have written.
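• The three steps above can be sketched as follows, using the example Order DTD. This is a sketch under stated assumptions, not a definitive generator: the function `generate_schema` is hypothetical, only flat content models are parsed, and the foreign key placement rule (a repeated child such as Item* stores a key to its parent, a single child such as Part is referenced by a key in the parent) is an assumption chosen because it yields the OrderFK and PartFK columns shown for table Item.

```python
import re

DTD = """
<!ELEMENT Order (OrderNum, Date, CustNum, Item*)>
<!ELEMENT OrderNum (#PCDATA)>
<!ELEMENT Date (#PCDATA)>
<!ELEMENT CustNum (#PCDATA)>
<!ELEMENT Item (ItemNum, Quantity, Part)>
<!ELEMENT ItemNum (#PCDATA)>
<!ELEMENT Quantity (#PCDATA)>
<!ELEMENT Part (PartNum, Price)>
<!ELEMENT PartNum (#PCDATA)>
<!ELEMENT Price (#PCDATA)>
"""

DECL = re.compile(r"<!ELEMENT\s+(\w+)\s+\(([^)]*)\)\s*>")

def generate_schema(dtd_text):
    """Return {table_name: [column, ...]} for a DTD with flat content models."""
    # Parse each declaration: None for simple types, else [(child, occurrence)].
    models = {}
    for name, body in DECL.findall(dtd_text):
        if body.strip() == "#PCDATA":
            models[name] = None
        else:
            refs = []
            for token in body.split(","):
                token = token.strip()
                occ = token[-1] if token[-1] in "*+?" else ""
                refs.append((token.rstrip("*+?"), occ))
            models[name] = refs
    complex_types = [n for n, m in models.items() if m is not None]

    # Step 1: one table per complex element type, with a primary key column.
    tables = {n: [n + "PK"] for n in complex_types}

    # Step 2: one column per reference to a simple element type.
    for parent in complex_types:
        for child, _ in models[parent]:
            if models.get(child) is None:
                tables[parent].append(child)

    # Step 3: foreign keys for references to complex element types
    # (placement rule is an assumption, see the lead-in).
    for parent in complex_types:
        for child, occ in models[parent]:
            if models.get(child) is None:
                continue
            if occ in ("*", "+"):
                tables[child].append(parent + "FK")
            else:
                tables[parent].append(child + "FK")
    return tables

tables = generate_schema(DTD)
for name, cols in tables.items():
    print(name, cols)
```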
Generating DTDs from Database Schema
• DTDs are generated by starting from a single "root" table or set of root tables and processing each:
• Each root table generates an element type with element content in the form of a single sequence.
• Each data (non-key) column in the table generates an element type with PCDATA-only content and a reference in the sequence; nullable columns generate optional references.
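• A minimal sketch of these two rules for a single table. The function `table_to_element_decls` is hypothetical, and marking CustNum as nullable is an assumption made only to show an optional reference.

```python
def table_to_element_decls(table, columns):
    """columns: list of (name, nullable) pairs for the table's data columns."""
    # Nullable columns become optional references ("?") in the sequence.
    refs = [name + ("?" if nullable else "") for name, nullable in columns]
    decls = ["<!ELEMENT %s (%s)>" % (table, ", ".join(refs))]
    # Every data column becomes a PCDATA-only element type.
    decls += ["<!ELEMENT %s (#PCDATA)>" % name for name, _ in columns]
    return decls

decls = table_to_element_decls("Orders", [("Date", False), ("CustNum", True)])
print("\n".join(decls))
```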
• The following example shows how this process works. Consider the following database schema:
   Table Orders
      Column OrderNum
      Column Date
      Column CustNum
   Table Items
      Column OrderNum
      Column ItemNum
      Column Quantity
      Column PartNum
   Table Parts
      Column PartNum
      Column Price
• In our first step, we generate an element type for the root table (Orders):

   Table Orders          ==> <!ELEMENT Orders ()>
      Column OrderNum
      Column Date
      Column CustNum
   Table Items
      Column OrderNum
      Column ItemNum
      Column Quantity
      Column PartNum
   Table Parts
      Column PartNum
      Column Price
• Next, we generate PCDATA-only elements for the data columns (Date and CustNum) and add references to these elements to the content model of the Orders element:

   Table Orders          ==> <!ELEMENT Orders (Date, CustNum)>
      Column OrderNum
      Column Date            <!ELEMENT Date (#PCDATA)>
      Column CustNum         <!ELEMENT CustNum (#PCDATA)>
   Table Items
      Column OrderNum
      Column ItemNum
      Column Quantity
      Column PartNum
   Table Parts
      Column PartNum
      Column Price
• Now we generate a PCDATA-only element for the primary key (OrderNum) and add a reference to it to the content model:

   Table Orders          ==> <!ELEMENT Orders (Date, CustNum, OrderNum)>
      Column OrderNum        <!ELEMENT OrderNum (#PCDATA)>
      Column Date            <!ELEMENT Date (#PCDATA)>
      Column CustNum         <!ELEMENT CustNum (#PCDATA)>
   Table Items
      Column OrderNum
      Column ItemNum
      Column Quantity
      Column PartNum
   Table Parts
      Column PartNum
      Column Price
• And then add an element for the table (Items) to which the primary key is exported, as well as a reference to it in the content model:

   Table Orders              <!ELEMENT Orders (Date, CustNum, OrderNum, Items*)>
      Column OrderNum        <!ELEMENT OrderNum (#PCDATA)>
      Column Date            <!ELEMENT Date (#PCDATA)>
      Column CustNum         <!ELEMENT CustNum (#PCDATA)>
   Table Items           ==> <!ELEMENT Items ()>
      Column OrderNum
      Column ItemNum
      Column Quantity
      Column PartNum
   Table Parts
      Column PartNum
      Column Price
• We process the data and primary key columns in the remote (Items) table in the same way:

   Table Orders              <!ELEMENT Orders (Date, CustNum, OrderNum, Items*)>
      Column OrderNum        <!ELEMENT OrderNum (#PCDATA)>
      Column Date            <!ELEMENT Date (#PCDATA)>
      Column CustNum         <!ELEMENT CustNum (#PCDATA)>
   Table Items           ==> <!ELEMENT Items (ItemNum, Quantity)>
      Column OrderNum
      Column ItemNum         <!ELEMENT ItemNum (#PCDATA)>
      Column Quantity        <!ELEMENT Quantity (#PCDATA)>
      Column PartNum
   Table Parts
      Column PartNum
      Column Price
• And then add an element for the table (Parts) to which the foreign key corresponds:

   Table Orders              <!ELEMENT Orders (Date, CustNum, OrderNum, Items*)>
      Column OrderNum        <!ELEMENT OrderNum (#PCDATA)>
      Column Date            <!ELEMENT Date (#PCDATA)>
      Column CustNum         <!ELEMENT CustNum (#PCDATA)>
   Table Items               <!ELEMENT Items (ItemNum, Quantity, Parts)>
      Column OrderNum
      Column ItemNum         <!ELEMENT ItemNum (#PCDATA)>
      Column Quantity        <!ELEMENT Quantity (#PCDATA)>
      Column PartNum
   Table Parts           ==> <!ELEMENT Parts ()>
      Column PartNum
      Column Price
• Finally, we process the foreign key table (Parts):

   Table Orders              <!ELEMENT Orders (Date, CustNum, OrderNum, Items*)>
      Column OrderNum        <!ELEMENT OrderNum (#PCDATA)>
      Column Date            <!ELEMENT Date (#PCDATA)>
      Column CustNum         <!ELEMENT CustNum (#PCDATA)>
   Table Items               <!ELEMENT Items (ItemNum, Quantity)>
      Column OrderNum
      Column ItemNum         <!ELEMENT ItemNum (#PCDATA)>
      Column Quantity        <!ELEMENT Quantity (#PCDATA)>
      Column PartNum
   Table Parts           ==> <!ELEMENT Parts (PartNum, Price, Items*)>
      Column PartNum         <!ELEMENT PartNum (#PCDATA)>
      Column Price           <!ELEMENT Price (#PCDATA)>
• As was the case in the previous section, the generated DTD is not what a human would have created.
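• The walk from the root table can be sketched as follows. This is a sketch under stated assumptions: the `SCHEMA` dictionary, its `columns`/`fks` keys, and the function `generate_dtd` are hypothetical, and the key relationships (Items.OrderNum refers to Orders, Items.PartNum refers to Parts) are inferred from the column names. A table that imports the current table's key becomes a repeated reference (Items*); a table the current table points to becomes a single reference (Parts). The sketch therefore keeps the Parts reference inside Items and does not reproduce the variant of the last listing, where Items* appears under Parts.

```python
# Hypothetical description of the example schema. "columns" lists the
# non-foreign-key columns in the order they should appear in the sequence;
# "fks" maps a foreign key column to the table it refers to.
SCHEMA = {
    "Orders": {"columns": ["Date", "CustNum", "OrderNum"], "fks": {}},
    "Items": {"columns": ["ItemNum", "Quantity"],
              "fks": {"OrderNum": "Orders", "PartNum": "Parts"}},
    "Parts": {"columns": ["PartNum", "Price"], "fks": {}},
}

def generate_dtd(schema, root):
    """Return a list of element type declarations, starting at the root table."""
    decls = []
    visited = set()

    def process(table):
        visited.add(table)
        refs = list(schema[table]["columns"])
        pending = []
        # Tables to which this table's key is exported: repeated children.
        for other, info in schema.items():
            if other not in visited and table in info["fks"].values():
                refs.append(other + "*")
                pending.append(other)
                visited.add(other)
        # Tables this table's foreign keys point to: single children.
        for target in schema[table]["fks"].values():
            if target not in visited:
                refs.append(target)
                pending.append(target)
                visited.add(target)
        decls.append("<!ELEMENT %s (%s)>" % (table, ", ".join(refs)))
        for col in schema[table]["columns"]:
            decls.append("<!ELEMENT %s (#PCDATA)>" % col)
        for nxt in pending:
            process(nxt)

    process(root)
    return decls

dtd = generate_dtd(SCHEMA, "Orders")
print("\n".join(dtd))
```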