11.02.2009

XML Databases
13. Systems

Silke Eckstein
Andreas Kupfer
Institut für Informationssysteme
Technische Universität Braunschweig
http://www.ifis.cs.tu-bs.de
13.1 Introduction
13.2 Oracle
13.3 DB2
13.4 SQL Server
13.5 Tamino
13.6 Summary
13.X Overview and References
13. Systems
XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig
13.1 Introduction
• After discussing various aspects of XML and XML databases, we now take a closer look at some of the database systems.
• RDBMS with XML support
• Native XML DBMS
13.2 Oracle 11g Architecture [Tür08]
Figure taken from Oracle® XML Developer's Kit Programmer's Guide 11g Release 1 (11.1), April 2008
13.2 Oracle 11g Architecture (2) [Tür08]
Figure taken from Oracle® XML DB Developer’s Guide 11g Release 1 (11.1), October 2007
13.2 Oracle 11g [Tür08]
• Mapping variants from XML to databases
  – XML column approach: the column is based on the XML type
  – XML table approach: the table is based on the XML type
• Using the object-relational extensions of Oracle
  – XMLTYPE as a predefined object type with SQL/XML functions as methods
  – Intermedia Text package with full-text functions
  – DBMS_XMLDOM package with DOM methods
  – DBMS_XMLSCHEMA package with administration and generation methods
  – DBMS_XMLGEN package with methods to generate XML from SQL
13.2 Oracle 11g [Tür08]
• Storage options
  – Text-based (unstructured, as CLOB)
  – Binary (compact storage in a binary XML format)
  – Schema-based (object-relational storage; requires an XML Schema)
  – Hybrid (semistructured)
Figure taken from Oracle® XML DB Developer’s Guide 11g Release 1 (11.1), October 2007
13.2 Oracle 11g [Tür08]
Figure taken from Oracle® XML DB Developer’s Guide 11g Release 1 (11.1), October 2007
13.2 Oracle 11g [Tür08]
• XML column vs. XML table approach
  – Table with XML column
  – XML table
  – Inserting documents in both cases
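The difference between the two approaches can be illustrated outside Oracle. The following is only a rough sketch in Python with SQLite, not Oracle syntax: the table names `book_col` and `book_xml`, the sample document, and the helper `titles` are hypothetical, and plain TEXT columns stand in for XMLTYPE.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")

# XML column approach: relational columns plus one column holding the document.
conn.execute("CREATE TABLE book_col (isbn TEXT PRIMARY KEY, content TEXT)")
# XML table approach: every row of the table is itself one XML document.
conn.execute("CREATE TABLE book_xml (doc TEXT)")

# Made-up sample document, used for illustration only.
doc = "<Book><Title>Example Title</Title><Publisher>dpunkt</Publisher></Book>"

# Inserting a document looks the same in both cases; only the target differs.
conn.execute("INSERT INTO book_col VALUES (?, ?)", ("hypothetical-isbn", doc))
conn.execute("INSERT INTO book_xml VALUES (?)", (doc,))


def titles(table, column):
    """Parse the stored documents and return the text of their <Title> child."""
    rows = conn.execute(f"SELECT {column} FROM {table}").fetchall()
    return [ET.fromstring(row[0]).findtext("Title") for row in rows]
```

In the column variant the document sits next to ordinary relational attributes (here `isbn`), whereas in the table variant the document is the row.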
13.2 Oracle 11g [Tür08]
CREATE VIEW DpunktBooks OF XMLTYPE
  WITH OBJECT ID DEFAULT
  AS SELECT VALUE(b) FROM Book b
     WHERE EXISTSNODE(VALUE(b), '//Publisher[text()="dpunkt"]') = 1;
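The view selects exactly those documents in which the path //Publisher[text()="dpunkt"] matches at least one node. A minimal Python sketch of that filter follows, assuming the documents are available as strings; the function names `has_publisher` and `dpunkt_books` are hypothetical and this is not how Oracle evaluates the view.

```python
import xml.etree.ElementTree as ET


def has_publisher(doc, name):
    """True if any <Publisher> element in the document has exactly this text."""
    root = ET.fromstring(doc)
    # iter() visits the root and all descendants, like the // axis in the path.
    return any(p.text == name for p in root.iter("Publisher"))


def dpunkt_books(docs):
    """Keep only the documents the view would expose."""
    return [d for d in docs if has_publisher(d, "dpunkt")]
```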
13.2 Oracle 11g [Tür08]
• Export of database contents with XML syntax
  – Standard mapping SQL → XML:
    • Top-level elements result from columns
    • Simple types (with scalar values) become elements with PCDATA
    • Structured types and their attributes become elements with subelements for the attributes
    • Complex attributes become hierarchically nested elements
    • Collection types are mapped to lists of elements
    • Object references and referential integrity become ID/IDREF within the document
    • Table content is mapped to ROWSET elements
  – User-defined transformation from SQL to XML is possible with XSLT
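For flat tables with scalar columns, the standard mapping can be sketched as follows. This is a simplified Python illustration of the ROWSET/ROW shape, not Oracle's implementation; the function name `rows_to_rowset` is hypothetical, and omitting NULL values is an assumption made for this sketch.

```python
import xml.etree.ElementTree as ET


def rows_to_rowset(columns, rows):
    """Map a flat query result to <ROWSET><ROW>...</ROW></ROWSET>."""
    rowset = ET.Element("ROWSET")
    for values in rows:
        row = ET.SubElement(rowset, "ROW")
        for column, value in zip(columns, values):
            if value is None:
                continue  # assumption: NULL columns are simply left out
            # Each scalar column becomes an element with character data.
            ET.SubElement(row, column.upper()).text = str(value)
    return ET.tostring(rowset, encoding="unicode")
```

Structured types, collections, and references would need the nested and ID/IDREF mappings listed above, which this sketch does not cover.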
13.2 Oracle 11g [Tür08]
XML storage model       Extensible, object-relational
Schema definition       Validation possible
Storage type            Text-based or schema-based
Mapping DB → XML        By SQL/XML functions, schema generators, XML views
XML data type           Available
Value/function index    Available
Full-text index         Available
Path index              Available
Queries                 SQL/XML with XQuery support
Full-text search        With the Intermedia Text package
Manipulation            SQL methods with XPath
13.3 DB2 V9 [Tür08]
• IBM DB2
[Figure: architecture diagram with the labels Application, XML documents, Database, file system]
13.3 DB2 V9 [Tür08]
• Mapping XML data to relational databases
  – Variants:
    • XML column approach: based on the XML data type
    • XML collection approach: based on decomposing XML documents into database tables and attributes
  – Table with XML column:
    • Diverse XML data types:
      – XML: model-based / hierarchical storage (pureXML)
      – XMLCLOB: XML documents stored as CLOBs (XML Extender)
      – XMLVARCHAR: XML documents stored as VARCHAR (XML Extender)
      – XMLFILE: XML documents stored in the file system (XML Extender)
    • XML schema validation for data type XML only
    • In addition: materialized views
      – Extract selected XML content from documents
      – Materialize that content into so-called side tables
      – Side tables are defined in the Document Access Definition (DAD)
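The side-table idea can be sketched independently of DB2: selected content is extracted from each document on insert and materialized in an ordinary table that a standard index can cover. The Python/SQLite sketch below is only an illustration; the table names, the choice of the `Author` element, and the helper functions are hypothetical, and a real XML Extender setup would declare the side table in a DAD file instead.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
# Main table: the document is stored as a whole in one column.
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, content TEXT)")
# Side table: selected content extracted from the documents.
conn.execute("CREATE TABLE book_author_side (book_id INTEGER, author TEXT)")
# An ordinary value index on the side table.
conn.execute("CREATE INDEX idx_side_author ON book_author_side (author)")


def insert_book(doc):
    """Store the document and materialize its authors in the side table."""
    cur = conn.execute("INSERT INTO book (content) VALUES (?)", (doc,))
    book_id = cur.lastrowid
    for author in ET.fromstring(doc).iter("Author"):
        conn.execute(
            "INSERT INTO book_author_side VALUES (?, ?)", (book_id, author.text)
        )
    return book_id


def books_by_author(name):
    """Answer the lookup from the side table without parsing any document."""
    rows = conn.execute(
        "SELECT book_id FROM book_author_side WHERE author = ? ORDER BY book_id",
        (name,),
    )
    return [row[0] for row in rows]
```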
13.3 DB2 V9 [IBM06a]
• "pureXML and relational hybrid database"
13.3 DB2 V9 [IBM06b]
• Ways to put XML data into the database (pureXML)
13.3 DB2 V9 [IBM06b]
• Ways to get XML data out of the database (pureXML)
13.3 DB2 V9 [Tür08]
• pureXML: queries and indexes
  – Application of SQL in XQuery:
    XQUERY db2-fn:xmlcolumn('t1.xml1')
    • Delivers the value of column xml1 of table t1 as a node sequence (column must be of type XML)
    XQUERY db2-fn:sqlquery('SELECT xml1 FROM t1')
    • Delivers the XML value of the single-column table t1 as a node sequence (column must be of type XML)
  – Definition of a path index:
    CREATE INDEX Idx_Author_Path ON Book (Content)
      GENERATE KEY USING XMLPATTERN '//Author' AS SQL VARCHAR(50)
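Conceptually, an index on the pattern //Author maps every value found under that path to the documents containing it. The Python sketch below illustrates only this idea with an in-memory dictionary; the function name `build_path_index` and the document ids are hypothetical, and DB2's actual index structures are not modeled.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(docs, tag):
    """Map each text value found at //tag to the ids of documents containing it.

    docs is a dict {document id: XML string}.
    """
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        # iter() finds the element at any depth, like the // axis.
        for node in ET.fromstring(doc).iter(tag):
            index[(node.text or "").strip()].add(doc_id)
    return index
```

A lookup such as `index["A"]` then returns the matching document ids without touching the documents again.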
13.3 DB2 V9 [Tür08]
• XML Extender: mapping between XML and SQL
13.3 DB2 V9 [Tür08]
• XML Extender: tables with XML types
  – XML extension setup with the XML Extender Admin Wizard or the Command Window:
  – Definition of tables accepting XML documents:
    • Variant 1: create with the XML Extender Admin Wizard
    • Variant 2: SQL
  – Insertion of an XML document:
13.3 DB2 V9 [Tür08]
• Value index (B-tree, bitmap, etc.) on side tables (XML Extender)
• Full-text index (with Text Extender) on XML types
  – Extension of the full-text index for IR on XML
    • Path information included in the index
    • Support for path expressions
• Example (MODEL order names the retrieval model):
  SELECT Inhalt
  FROM Buchlob
  WHERE contains(dscrHandel, 'MODEL order SECTION(//Buch/Beschreibung) "Datenbank"') = 1
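The effect of restricting a full-text search to a section can be sketched as follows: the term only counts as a hit if it occurs inside the text of //Buch/Beschreibung. This Python sketch uses a plain case-insensitive substring test as a stand-in for a real retrieval model; the function name `contains_in_section` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def contains_in_section(doc, parent, child, term):
    """True if term occurs in the text of any //parent/child section."""
    root = ET.fromstring(doc)
    for parent_node in root.iter(parent):        # //parent
        for section in parent_node.findall(child):  # /child
            # itertext() also collects text nested in inline markup.
            text = "".join(section.itertext())
            if term.lower() in text.lower():
                return True
    return False
```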
13.3 DB2 V9 [Tür08]
• Summary: IBM DB2 XML support
XML storage model       Extensible, object-relational
Schema definition       Validation possible
Storage type            Model-based (pureXML), text-based or user-defined schema-based (XML Extender)
Mapping DB → XML        DAD (XML Extender)
XML data type           Available (pureXML)
Value/function index    Standard DBS indexes on side tables
Full-text index         With Text Extender
Path index              Available
Queries                 SQL/XML with XQuery support
Full-text search        With Text Extender
Manipulation            SQL functions with XPath
13.4 SQL Server [Tür08]
• Microsoft SQL Server architecture
[Figure: architecture diagram with the labels Application, XML documents, Database]
13.4 SQL Server [Tür08]
• Mapping XML data to relational databases
  – 4 storage variants:
    • Native (binary) storage
    • Text-based storage as CLOB
    • Model-based storage according to the EDGE approach
    • Schema-based storage via STORED queries
  – Data type XML with methods based on XQuery (method names are lowercase and case-sensitive)
    • query(): evaluates an XQuery and returns a value of type XML
    • value(): evaluates an XQuery and returns a scalar SQL value
    • exist(): returns true (1) if the XQuery result is not empty
    • modify(): updates a value of type XML
    • nodes(): returns the subtrees of the XML value selected by an XQuery, one per row
  – Integrated usage of SQL and XQuery
    • Access to SQL data in XQuery via sql:column() and sql:variable()
    • Evaluation of XQuery expressions in SQL via the XML methods above
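The division of labor between the read-only methods can be mimicked in a few lines. The following Python class is only an analogy, not the SQL Server API: `XmlValue` is a hypothetical name, ElementTree path expressions stand in for XQuery, and paths are evaluated relative to the root element of the value.

```python
import xml.etree.ElementTree as ET


class XmlValue:
    """Toy stand-in for a value of an XML data type."""

    def __init__(self, source):
        # Accept either XML text or an already parsed element.
        self.root = ET.fromstring(source) if isinstance(source, str) else source

    def query(self, path):
        """Return the matching fragments as XML text (result stays XML)."""
        return [ET.tostring(e, encoding="unicode") for e in self.root.findall(path)]

    def value(self, path):
        """Return the text of the first match as a scalar, or None."""
        return self.root.findtext(path)

    def exist(self, path):
        """Return 1 if the path selects at least one node, else 0."""
        return 1 if self.root.find(path) is not None else 0

    def nodes(self, path):
        """Return one XmlValue per matching subtree (one 'row' each)."""
        return [XmlValue(e) for e in self.root.findall(path)]
```

The sketch leaves out modify(), since in-place updates need an update language that ElementTree paths do not provide.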
    title     NVARCHAR(3000) './title',
    publisher NVARCHAR(200)  './publisher',
    isbn      NVARCHAR(15)   './isbn'
)
EXEC sp_xml_removedocument @hdoc
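The column list in the fragment above pairs each relational column with a path that is evaluated relative to a row node. A small Python sketch of that shredding step follows; the row pattern `./book`, the sample document shape, and the function name `shred` are assumptions, since the beginning of the original statement is not shown.

```python
import xml.etree.ElementTree as ET

# Column name -> path relative to each row node, as in the fragment above.
COLUMN_PATTERNS = {
    "title": "./title",
    "publisher": "./publisher",
    "isbn": "./isbn",
}


def shred(doc, row_pattern, column_patterns=COLUMN_PATTERNS):
    """Turn each node matched by row_pattern into one relational row (a dict)."""
    root = ET.fromstring(doc)
    return [
        # findtext() yields None when a column path has no match (like NULL).
        {column: node.findtext(path) for column, path in column_patterns.items()}
        for node in root.findall(row_pattern)
    ]
```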
13.4 SQL Server [Tür08]
• Mapping of databases to XML
  – Variant 1: standard transformation with SQL SELECT and the FOR XML clause
    • FOR XML RAW: each result row becomes a generic row element, with the columns as XML attributes
    • FOR XML AUTO:
      – Semantically rich XML element names
      – Foreign key relationships are transformed into hierarchies
    • FOR XML EXPLICIT: the user controls the XML assembly through metadata (EDGE)
  – Variant 2: user-defined XML view
    • Use of an (available) XML schema
    • Annotation of the schema with information about tables and columns
    • Access from the application to the XML view via:
      – IIS functionality
      – ADO (ActiveX Data Objects): middleware for DB access
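The RAW mode is the simplest of the three and can be sketched directly: one generic row element per result row, one attribute per non-NULL column. The Python function below is an illustration only; the name `for_xml_raw` is hypothetical and the sketch ignores data type formatting and escaping details of the real server.

```python
import xml.etree.ElementTree as ET


def for_xml_raw(columns, rows):
    """Serialize a flat query result as a sequence of <row .../> elements."""
    fragments = []
    for values in rows:
        # NULL values produce no attribute at all.
        attributes = {
            column: str(value)
            for column, value in zip(columns, values)
            if value is not None
        }
        element = ET.Element("row", attributes)
        fragments.append(ET.tostring(element, encoding="unicode"))
    return "".join(fragments)
```

Because the result is a sequence of elements without a common root, a consumer typically has to wrap it in a root element before parsing.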
13.4 SQL Server [Tür08]
• Updates
  – SQL Server does not offer functions to update XML documents stored as CLOBs
    • This heavily restricts the text-based approach
  – Updates for the schema-based approach are possible via so-called updategrams
    • Build on annotated XML schemas
    • Updates are specified as an XML document
    • New namespace: xmlns:updg="urn:schemas-microsoft-com:xml-updategram"
      – Element before: definition of the previous state (to be modified)
      – Element after: definition of the new state
    • Different update operations through varying element contents
      – Insert: before element remains empty
      – Delete: after element remains empty
      – Update: both elements have non-empty contents
    • Automatic execution of the necessary database operations
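The rule that the operation follows from which of the two elements is empty can be written down in a few lines. The Python sketch below only classifies the operations and executes nothing; the function name `classify` is hypothetical, and it assumes one before/after pair per sync block, which is a simplification.

```python
import xml.etree.ElementTree as ET

UPDG = "urn:schemas-microsoft-com:xml-updategram"


def classify(updategram):
    """Return 'insert', 'delete' or 'update' for each sync block."""
    root = ET.fromstring(updategram)
    operations = []
    for sync in root.iter(f"{{{UPDG}}}sync"):
        before = sync.find(f"{{{UPDG}}}before")
        after = sync.find(f"{{{UPDG}}}after")
        # An element counts as non-empty if it has child elements.
        has_before = before is not None and len(before) > 0
        has_after = after is not None and len(after) > 0
        if has_before and has_after:
            operations.append("update")
        elif has_after:
            operations.append("insert")
        elif has_before:
            operations.append("delete")
    return operations
```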
13.4 SQL Server [Tür08]
• Updates: updategram example
  – Update of publisher information
<Verlag> International Thomson Publishing </Verlag>
<Buch>
</updg:after>
</updg:sync>
</ROOT>
13.4 SQL Server [Tür08]
• Summary: SQL Server XML support
XML storage model       Relational
Schema definition       Inline DTD or XML schema
Storage type            Native: XML column
                        Text-based: CLOB column
                        Model-based: with OPENXML
                        User-defined schema-based: with OPENXML-STORED queries
Mapping DB → XML        Automatically: FOR XML clause
                        User-defined: XSD annotations
XML data type           Available
Value index             Available
Full-text index         No XML-specific functions
Path index              Available
Queries                 SQL extensions (query and value not compatible with SQL/XML), XQuery
Manipulation            XML method modify() and updategrams
13.5 Tamino [Tür08]
• Architecture

13.5 Tamino [Tür08]
• Architecture (2)
[Figure labels: Query (URL); XML objects, DTDs; XML output; data from external sources and/or internal data storage; data to external sources and/or internal data storage]
13.5 Tamino [Tür08]
• Storage structures: mapping of XML
  – Tamino uses "native" storage structures for XML data
  – Native storage is supplemented with diverse classical index types
    • B-tree index
    • Full-text index
    • Path index
  – Storage alternatives:
    • Storage of well-formed XML documents without schema
    • Storage of valid XML documents
      – Annotation of the schema definition with storage alternatives
  – Storage hierarchy:
    • Tier 1: Tamino
    • Tier 2: Collection
    • Tier 3: Document type (defined by a set of XML schema definitions)
    • Tier 4: Document instance
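The four tiers can be pictured as nested containers. The Python sketch below is a toy model, not Tamino's API: the names `database` and `store` are hypothetical, only well-formedness is checked (no schema validation), and choosing the document type by the root element name is an assumption made for this sketch.

```python
import xml.etree.ElementTree as ET

# Tier 1: the database; tier 2: collections; tier 3: document types;
# tier 4: the stored document instances.
database = {}


def store(collection, doc):
    """Store a well-formed document and return the document type it was filed under."""
    # fromstring() raises ET.ParseError if the document is not well-formed.
    root = ET.fromstring(doc)
    doctype = root.tag  # assumption: document type is chosen by the root element
    database.setdefault(collection, {}).setdefault(doctype, []).append(doc)
    return doctype
```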