May 8, 2007 9:20 a.m. – 10:20 a.m. Platform: DB2 for Linux, UNIX and Windows DB2 9: XML Evolution and Revolution Philip K. Gunning Gunning Technology Solutions, LLC Session: E05

May 8, 2007 9:20 a.m. – 10:20 a.m.

Platform: DB2 for Linux, UNIX and Windows

DB2 9: XML Evolution and Revolution

Philip K. Gunning
Gunning Technology Solutions, LLC

Session: E05


Outline

• XML in DB2 LUW before DB2 9
  • Shredding
  • CLOBs

• XML-only databases
  • TIMBER, Niagara, Natix

• Followed by bliss for several years…

• XML databases: fundamental differences with relational databases


Outline

• Then IBM shook up the database world with the DB2 9 hybrid data server

• Extensible Optimizer and DB2 9
• Why Native XML data type?
• pureXML™


Outline

• pureXML Implementation
• pureXML -- Key Enablers
  • SQL/XML
  • XPath/XDM
  • XQuery
  • Developer Workbench
  • XQuery Builder
  • Explain Facility and Visual Explain


Disclaimer

• DB2 9 is a registered trademark of IBM Corp.
• pureXML is a registered trademark of IBM Corp.
• DB2 9 sample queries and programs are copyrights of IBM Corp.
• DB2 for z/OS is a registered trademark of IBM Corp.
• Developer Workbench and Visual Explain are copyrights of IBM Corp.


Shredding

• Early implementations of XML support in databases shredded XML documents into columns of relational tables
• Mapping + Parsing = Overhead
• Retrieval of whole document or parts
• Entire document replaced if update required
• Lack of flexibility
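
The "Mapping + Parsing = Overhead" point can be illustrated with a minimal Python sketch. It is not DB2 code: it uses the standard-library `sqlite3` and `xml.etree.ElementTree` modules, and the `orders`/`items` tables and the order document are hypothetical names invented for the illustration. Every insert has to parse the document and map it onto tables, and every retrieval has to reassemble it from rows.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order document; element and attribute names are made up.
DOC = """<order id="1001">
  <customer>Acme</customer>
  <item sku="A1" qty="2"/>
  <item sku="B7" qty="1"/>
</order>"""


def shred(conn, xml_text):
    """Parse the document and map its pieces onto two relational tables."""
    root = ET.fromstring(xml_text)              # parsing cost
    order_id = int(root.get("id"))
    conn.execute(
        "INSERT INTO orders (id, customer) VALUES (?, ?)",
        (order_id, root.findtext("customer")),  # mapping cost
    )
    conn.executemany(
        "INSERT INTO items (order_id, sku, qty) VALUES (?, ?, ?)",
        [(order_id, i.get("sku"), int(i.get("qty"))) for i in root.iter("item")],
    )
    return order_id


def rebuild(conn, order_id):
    """Reassemble the document from rows, one lookup per table."""
    (customer,) = conn.execute(
        "SELECT customer FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    root = ET.Element("order", id=str(order_id))
    ET.SubElement(root, "customer").text = customer
    for sku, qty in conn.execute(
        "SELECT sku, qty FROM items WHERE order_id = ? ORDER BY rowid", (order_id,)
    ):
        ET.SubElement(root, "item", sku=sku, qty=str(qty))
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE items (order_id INTEGER, sku TEXT, qty INTEGER)")
oid = shred(conn, DOC)
print(rebuild(conn, oid))
```

Any change to the document shape (a new element, a repeated element that used to be single) means changing both the table design and the mapping code, which is the lack of flexibility the slide refers to.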


CLOBs

• Stored entire XML document as text
• High cost of retrieval
• Not buffered
• Poor search performance and parsing
• Lack of flexibility
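
The search cost of the text-column approach can be sketched in a few lines of Python. This is an illustration only, using the standard-library `sqlite3` module as a stand-in for a CLOB column; the `docs` table and the sample documents are hypothetical. Because the database sees only opaque text, every search has to read and reparse every document.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO docs (id, body) VALUES (?, ?)",
    [
        (1, "<order><customer>Acme</customer></order>"),
        (2, "<order><customer>Globex</customer></order>"),
    ],
)


def find_by_customer(conn, name):
    """Scan every row and reparse its text to evaluate one predicate."""
    hits = []
    for doc_id, body in conn.execute("SELECT id, body FROM docs ORDER BY id"):
        root = ET.fromstring(body)      # parsed again on every search
        if root.findtext("customer") == name:
            hits.append(doc_id)
    return hits


print(find_by_customer(conn, "Globex"))
```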


Key Factors in IBM Approach

• “XML and Relational data coexist and complement each other in enterprise solutions”

• “A successful XML repository requires much of the same infrastructure that already exists in a RDBMS system”

• “XML query languages have considerable conceptual and functional overlap with SQL”

Beyer, et al., "DB2 goes hybrid: Integrating native XML and XQuery with relational data and SQL," IBM Systems Journal, Vol. 45, No. 2, 2006


Revolutionary Approach: DB2 9 pureXML Framework

• DB2 Optimizer was extensible

• XML Native data type

• Enables XML data to be treated natively

• Native XML data type enables better performance (less overhead than legacy methods) via optimization and XML indexes

• Industry schemas supported


Fundamental Differences

• DB2 9 native XML data type takes advantage of years of relational database research
  • 20+ years of optimization advancements

• Extensive query rewrite plus new rewrites

• Uses underlying optimization and storage components

• Same or enhanced APIs


PureXML Framework Implementation

• Key Enablers
  • Extensible Optimizer
  • XML and SQL Integration
  • XQuery, XDM, XPath, SQL/XML
  • Development Tooling
    • Developer Workbench
    • XQuery Builder
    • Explain Support, including Visual Explain


Hybrid SQL/XQuery Compiler

[Figure: compiler diagram with boxes labeled SQL/XML Parser, XQuery Parser, Semantics Checking, Optimizer Phase, Rewrite Phase, Code Generation, Query Plan, and QGMX]


DB2 9 Hybrid Data Server Architecture

[Figure: architecture diagram with components labeled DB2 Client Application, SQL/XML, XQuery, Relational Interface, XML Interface, XSR/Catalogs, DB2 Engine, and DB2 Storage (XML, Relational)]


Tight Integration


XQuery Defined

• SQL is the query language for relational databases

• XQuery is the query language for XML as defined by the W3C organization

• Built-in support provided in DB2 9 by query compiler and built-in XQuery functions


INPUT FUNCTIONS


DB2 9 XML Input

• SQL INSERT Statement

• Input to the XML column must be a well-formed XML document
  • Well-formedness is defined in the XML specification

• Clients send XML documents in textual representation and DB2 uses a Simple API for XML (SAX) parser
  • Well-formedness check
  • Validation

• If the input already has the XML data type, DB2 parses it implicitly

• XMLPARSE function for non-XML data type
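
To make the well-formedness requirement concrete, here is a minimal Python sketch of a SAX-style check using the standard-library `xml.sax` module. It is an illustration of the parsing model only, not DB2's parser; `ElementCounter` and `is_well_formed` are hypothetical names. It checks well-formedness and does not validate against a schema.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Counts start tags as the parser streams through the document."""

    def __init__(self):
        super().__init__()
        self.elements = 0

    def startElement(self, name, attrs):
        self.elements += 1


def is_well_formed(xml_text):
    """Return (ok, element_count); ok is False on the first syntax error."""
    handler = ElementCounter()
    try:
        xml.sax.parseString(xml_text.encode("utf-8"), handler)
    except xml.sax.SAXParseException:
        return False, handler.elements
    return True, handler.elements


print(is_well_formed("<a><b/></a>"))    # well-formed
print(is_well_formed("<a><b></a>"))     # mismatched tag, rejected
```

A SAX parser reports events as it streams through the text, so a document can be checked without first building an in-memory tree.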


DB2 9 Annotated XML Schema Decomposition

• Data from XML documents is decomposed into relational and XML columns using annotated XML Schema decomposition
  • Stores data into columns according to annotations contained in XML schema documents

• XML Schema Repository (XSR) registration

• Schemas registered with a DB2-supplied stored procedure or via the Command Line Processor


DB2 9 XML Input -- IMPORT

• Import utility enhanced to support import of XML documents

• Validation optional

• Schema must be registered in DB2 XML Schema Repository (XSR) if validation performed


OUTPUT FUNCTIONS


DB2 9 XML Output Functions

• db2-fn:xmlcolumn function
  • Takes a string literal as input that identifies an XML column and returns an XML sequence that consists of all document nodes in the specified column


DB2 9 XML Output Functions

• db2-fn:sqlquery function
  • Used to restrict input to an XQuery by conditions placed on relational columns in the same or related tables
  • Returns a single column
  • Based on SQL fullselect


DB2 9 XML Output -- EXPORT

• EXPORT utility supports XML data type

• XML data stored separately from exported relational data

• Details about exported XML represented in main exported file by an XML data specifier (XDS)


XQuery Data Model (XDM)

• The XQuery Data Model (XDM) defines the values that XQuery expressions take as input and return as results

• An instance of the XDM is a sequence
  • A sequence is an ordered collection of zero or more items
  • An item is either an atomic value or a node

• Example sequences: 48, <car/>, (6,7,8), (48,<car/>,(6,7,8))
  • Also: () (an empty sequence), an XML document, 48
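
One property of XDM sequences worth making explicit is that they are flat: a sequence never contains another sequence, so (48,<car/>,(6,7,8)) denotes the same five items as (48,<car/>,6,7,8). A minimal Python sketch of that flattening rule follows; `xdm_sequence` is a hypothetical helper, and the string "<car/>" merely stands in for an element node.

```python
def xdm_sequence(*parts):
    """Build a flat tuple, splicing any nested tuple or list into its parent."""
    flat = []
    for part in parts:
        if isinstance(part, (tuple, list)):
            flat.extend(xdm_sequence(*part))   # nested sequences are spliced in
        else:
            flat.append(part)                  # atomic value or node stand-in
    return tuple(flat)


print(xdm_sequence(48, "<car/>", (6, 7, 8)))
print(xdm_sequence())                          # the empty sequence
```

In the XDM a single item such as 48 is also identical to the one-item sequence containing it; Python tuples do not model that part, so the sketch covers only the flattening.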


DATABASE DESIGN


Relational – XML

• Relational is highly structured

• Represented by well defined entities and relationships

• XML is hierarchical in form, unstructured and can be very complex
  • Represented in a tree format defined by the XPath W3C standard


Relational vs. XML Database Design

• Relational
  • Frequency of updates
  • Design is fixed
  • Maximum performance required
  • Stays relational
  • Meaning outside hierarchy
  • Specific attributes
  • Large fact and dimension tables
  • RI required

• XML
  • Design changes
  • Flexibility desired
  • Not used relationally downstream
  • Only hierarchical
  • Many attributes, of which only a subset is applicable
  • Small dimensions in a star schema


XML Indexes

• Value Indexes
  • Path-specific value indexes on XML columns
  • Elements and attributes used in predicates and cross-document joins

• Full-text indexes
  • Indexes can be defined on any native XML column
  • Documents can be fully or partially indexed
  • Enables just certain parts of documents to be subject to full-text search
  • Text index maintained asynchronously via “lazy” update

• Regions Indexes
  • Connects documents that span multiple pages
  • Created automatically by DB2
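
The idea behind a path-specific value index can be sketched with a toy Python dictionary. This is not how DB2 implements its indexes; `build_value_index` and the sample documents are hypothetical. Only values found at one chosen path are indexed, so a predicate on that path can be answered without touching the documents.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DOCS = {
    1: "<order><customer>Acme</customer><total>40</total></order>",
    2: "<order><customer>Globex</customer><total>15</total></order>",
    3: "<invoice><customer>Acme</customer></invoice>",
}


def build_value_index(docs, root_tag, child_tag):
    """Index only values found at /root_tag/child_tag; other paths are ignored."""
    index = defaultdict(list)
    for doc_id, text in sorted(docs.items()):
        root = ET.fromstring(text)
        if root.tag != root_tag:
            continue                    # path does not match this document
        for node in root.findall(child_tag):
            index[node.text].append(doc_id)
    return index


idx = build_value_index(DOCS, "order", "customer")
print(idx["Acme"])                      # document 3 is an invoice, so not indexed
```

Because the index covers only one path, the `total` values and the `invoice` document add nothing to it, which mirrors the slide's point that documents can be partially indexed.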


XML Storage

• Relational data stored in tables and columns

• XML data stored in hierarchical type-annotated tree format

• XML document stored separately outside of table

• XML Data Specifier (XDS) stored in table describes XML document


XML Storage

• Documents must be able to span disk pages
  • Single text node may be larger than a page

• Direct Node Access
  • Not feasible to traverse every node (could be a several-gigabyte document)

• Must support existing isolation levels, logging and recovery mechanisms


XML Storage

• DB2 uses a structured, type-annotated tree

• Stored in binary representation to avoid repeated parsing and validating of the document

• Digital signatures preserved

• Each node contains its type information

• Type information on the document level enables schema evolution
  • Each document in a column can conform to a different schema or to different versions of an evolving schema


XML Storage

• Each node contains pointers to parent and children
  • Supports efficient navigational queries

• Path expressions are evaluated directly on the native format in buffered pages, without copying or transforming the data

• Extra information stored with each node
  • Type annotation if validated
  • Each element node has a set of child slots for its associated attributes and ordered children


XML Storage

• Child slots have hints within them
  • Give indication of what the child represents
  • Enables fast navigation across a context node’s set of children without actually visiting each child node
  • Child node may be on a different page and require I/O

• A unique identifier gives each node logical and physical addressability
  • Can be used in indexing and query evaluation

• Large document trees may not fit on one page
  • Can be split into regions via region index
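
The role of child-slot hints can be sketched with a small in-memory tree in Python. This is a conceptual illustration under stated assumptions, not DB2's storage format: `Node`, `slots`, and the `visits` counter (a stand-in for a page access) are all hypothetical. Each slot carries a copy of the child's name, so a navigation step can skip non-matching children without opening them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    text: str = ""
    parent: Optional["Node"] = None
    # Each slot is (hint, child); the hint copies the child's name.
    slots: List[tuple] = field(default_factory=list)
    visits: int = 0                      # counts how often this node is opened

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.slots.append((child.name, child))
        return child


def children_named(context: Node, name: str) -> List[Node]:
    """Select matching children using slot hints, opening only the matches."""
    result = []
    for hint, child in context.slots:
        if hint == name:                 # decided from the slot alone
            child.visits += 1            # stand-in for a possible page access
            result.append(child)
    return result


root = Node("order")
root.add(Node("customer", "Acme"))
root.add(Node("item", "A1"))
root.add(Node("item", "B7"))

items = children_named(root, "item")
print([n.text for n in items])
print(root.slots[0][1].visits)           # the customer node was never opened
```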


BUILDING APPLICATIONS


Key DB2 9 XML Enablers

• Build with Developer Workbench

• Test with Developer Workbench

• Deploy and Maintain with Developer Workbench

• Replaces former Development Center
  • Migration support for existing documents

• Eclipse Framework based tool


Key DB2 9 XML Enablers

• Developer Workbench

• Separate download at http://www-306.ibm.com/software/data/db2/ad/


XML Sample Schema Definition


XML-XQuery SP


Visual Explain Support


XML Schema Definition


XPath Example


Summary

• pureXML™ Framework

• SQL/XML

• XQuery/XPath

• XDM and XSR

• XML Storage and XML Indexes

• Developer Workbench
  • Build, Test, Deploy and Maintain!

• Additional Features coming in DB2 9 for z/OS


Thanks!

Philip K. Gunning

Gunning Technology Solutions, LLC

[email protected]

Session: E5
DB2 9: XML Evolution and Revolution