
Silke Eckstein
Andreas Kupfer
Institut für Informationssysteme
Technische Universität Braunschweig
http://www.ifis.cs.tu-bs.de

XML Databases
10. XML Storage 1 – Overview

10.1 Motivation

10.2 Text-based storage

10.2.1 Index structures

10.3 Model-based storage

10.4 Schema-based storage

10.5 Conclusion

10.6 Overview and References

XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 2

10. XML Storage 1

10.1 Motivation

• Applications require different types of XML documents
  – Structure vs. content
  – Regular vs. irregular
• Thus, XML documents are
  – Data-centric
  – Document-centric
  – or somewhere in-between
• Questions
  – Storage of XML documents
  – Efficient processing of queries on the stored documents or data
• There are several methods for storage
  – 1st goal: Learn and understand methods
  – 2nd goal: Classify methods
    • Principles
    • Advantages and disadvantages
    • Usage

• Characterisation of XML documents:

– Data-centric documents

• Structured, regular

• E.g. product catalog, order, invoice

– Document-centric documents

• Unstructured, irregular

• E.g. scientific article, book, email, web page

– Semi-structured documents

• Data-centric and document-centric parts

• E.g. publications, Amazon, MS Press (example chapters)


• Requirements for the physical layer:

– Order preserving and lossless storage of XML documents

– Efficient access to XML documents or parts thereof

• Quick response time for

– Queries

– Update operations

• Indexing

• Transaction processing

• Support of XPath and XQuery

• Support of SAX and DOM for applications


• Storage approaches for XML documents

– Text-based

• Storage as character data

– Model-based

• Generic storage of the graph structure

• Storage of the DOM

– Schema-based

• Mapping to (object-)relational databases

– Deriving the database schema from the XML structure

– Using user defined mapping procedures


10.2 Text-based storage

• The whole XML document text is stored as character data
  – File in the file system
  – CLOB (Character Large OBject) in the DBS
• Operations on documents as a whole are very efficient
  – Reading and writing the whole document
  – But the content is monolithic and opaque with respect to the relational query engine (a query cannot inspect a fragment)
• Getting granular access requires additional support
  – Full text index
  – Path index
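The trade-off of text-based storage can be illustrated with a minimal Python sketch. It assumes an in-memory SQLite database standing in for the DBS and a hypothetical table `docs` with a plain `TEXT` column playing the role of the CLOB; none of these names come from the lecture.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical table: one row per document, the XML kept as opaque text (CLOB-like).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, content TEXT)")

xml_text = "<book><title>XML Storage</title><author>Benjamin</author></book>"
conn.execute("INSERT INTO docs (id, content) VALUES (?, ?)", (1, xml_text))

# Whole-document access is a single lookup and returns the text unchanged.
(stored,) = conn.execute("SELECT content FROM docs WHERE id = 1").fetchone()

# Fragment access: SQL cannot look inside the string, so the application
# has to parse the complete document before it can navigate to <author>.
author = ET.fromstring(stored).findtext("author")
print(author)
```

Reading the document back is cheap, whereas every fragment access pays for a full parse unless an additional index is maintained.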

10.2.1 Index structures

• Index structures for XML documents allow efficient access for specific queries
  – Different types of indexes are optimized for different types of queries
• Generate redundancy
  – The index has to be kept up to date by propagating data changes
• Index structures can be storage structures as well
  – They define the storage method

• Types of index structures
  – Value index
    • Indexes atomic values of an XML document, like element content or attribute values
    • Index format for structured parts of XML documents
    • Already known from databases (B-trees, hash index, …)
  – Full text index
    • Indexes single words from the full text
    • Index format for unstructured parts of XML documents
    • Already known from Information Retrieval (inverted lists, tries, suffix trees, …)
  – Path index
    • Indexes subtrees/paths in an XML document
    • Index format for semistructured parts of XML documents
    • Already known from object databases (access support relations, …)

• B-tree as value index for an XML document fragment [Tür08]
(Figure not reproduced)
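As a rough illustration of a value index, the following Python sketch indexes the `price` values of a small, made-up catalog. A sorted list searched with `bisect` stands in for the B-tree here (this is a simplification, not a B-tree implementation); the element names and the helper `products_cheaper_than` are assumptions for the example.

```python
import bisect
import xml.etree.ElementTree as ET

xml_text = """<catalog>
  <product id="p1"><price>12.50</price></product>
  <product id="p2"><price>4.99</price></product>
  <product id="p3"><price>8.00</price></product>
</catalog>"""

root = ET.fromstring(xml_text)

# Index entries: (atomic value, id of the owning element), kept sorted by value.
value_index = sorted(
    (float(p.findtext("price")), p.get("id")) for p in root.iter("product")
)

def products_cheaper_than(limit):
    """Range lookup: binary search for the first entry with value >= limit."""
    end = bisect.bisect_left(value_index, (limit, ""))
    return [product_id for _, product_id in value_index[:end]]

print(products_cheaper_than(10.00))
```

The lookup touches only the index entries, not the document; the price is that the index has to be updated whenever a price changes.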

• Full text index
  – Not limited to exact matches
    • Keyword-based search and boolean retrieval
    • Pattern search (with regular expressions)
  – Use of
    • Statistical, word-based methods
      – Stop word removal
      – Elimination of uncommon items
    • Linguistic methods
      – Normalization of words (e.g. capitalisation, hyphenation)
      – Word decomposition by rules (English) or dictionaries (German)
      – Stemming
    • Knowledge-based methods
      – Use of ontologies and thesauri to search for synonyms, hypernyms and hyponyms

• Inverted list as full text index for XML [Tür08]
(Figure not reproduced: inverted list with the columns word, occurrence and word position in the text)
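A minimal Python sketch of an inverted list with stop word removal is given below. The two sample documents, the tiny stop word list and the helper `search_all` are assumptions made for illustration; stemming and the other linguistic methods are left out.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

STOP_WORDS = {"the", "a", "of", "and"}  # hypothetical, tiny stop word list

documents = {
    1: "<article><p>The storage of XML documents</p></article>",
    2: "<article><p>Indexes and XML storage</p></article>",
}

# word -> list of (document id, word position in the text)
inverted = defaultdict(list)
for doc_id, xml_text in documents.items():
    text = "".join(ET.fromstring(xml_text).itertext())  # markup is ignored
    for position, word in enumerate(re.findall(r"\w+", text.lower())):
        if word not in STOP_WORDS:
            inverted[word].append((doc_id, position))

def search_all(*words):
    """Boolean AND retrieval: ids of documents containing every word."""
    postings = [{doc_id for doc_id, _ in inverted.get(w.lower(), [])} for w in words]
    return sorted(set.intersection(*postings)) if postings else []

print(search_all("XML", "documents"))
```

Because the markup is discarded before indexing, this kind of index answers keyword queries but cannot say in which element a word occurred; that is what a path index adds.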

• Path index
  – Structure information must be identifiable and reconstructable
    • Assigning the markup to the content as well as
    • Representing the hierarchical nesting and order of elements/attributes
  – Especially suited for keyword search with regard to structure or path expressions

for $b in //book
where contains($b/author, "Benjamin")
return $b

• Types of path indexes [Tür08]
  – Nested path index
    • Access to root node from every node
  – Multi-index
    • Accessing parent nodes
  – Join-index
    • Access parent and child nodes
  – Access Support Relations (ASR)
    • Generalization of indexes above, by listing all paths in a table
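The idea of listing all paths in a table can be sketched in Python as follows. This is only loosely in the spirit of access support relations, not an implementation of them; the sample document and the names `path_index` and `index_paths` are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

xml_text = """<library>
  <book><title>A</title><author>Benjamin</author></book>
  <book><title>B</title><author>Smith</author></book>
</library>"""

# path from the root -> elements reachable via that path, in document order
path_index = defaultdict(list)

def index_paths(element, prefix=""):
    """Record the element under its root path, then descend into the children."""
    path = f"{prefix}/{element.tag}"
    path_index[path].append(element)
    for child in element:
        index_paths(child, path)

index_paths(ET.fromstring(xml_text))

# A path expression is answered by one lookup instead of a tree traversal.
authors = [e.text for e in path_index["/library/book/author"]]
print(authors)
```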

• Conclusion
  – Efficient query processing on XML documents requires different types of index structures
  – Value index
    • For efficient access to structured parts
    • Keyword search, value search
  – Full text index
    • For efficient access to unstructured parts
  – Path index
    • Using the document structure
    • Navigating queries

• Summary text-based storage
  – Schema definition:
    • Not required
  – Document reconstruction:
    • Documents stay in their original format
  – Queries:
    • Information retrieval queries
    • Processing the markup of the queries
    • XML queries possible
  – Special features:
    • Full text functions
  – Efficiency:
    • Character string must be parsed on every access with XML processors → expensive
    • No concurrency on read or write → no parallel processing
  – Usage:
    • For document-centric XML applications
    • Suitable only to a limited extent for semi-structured applications

10.3 Model-based storage

• Idea: generic storage of the graph structure
  – XML elements, XML attributes, … are nodes of a graph
  – Nesting of elements defines edges
  – Nodes get an (internal) ID based on graph traversal
• Using relations or object classes to store elements and attributes
• Document structure can be restored completely
• Extension for data type adapted storage is possible

Elements:   ID | Element name | Value | Reference to preceding | Rank
Attributes: ID | Attribute name | Value | Reference to element

• The EDGE approach [FK99] [Tür08]
  – Variant BINARY: horizontal partition of EDGE based on label

• XML queries [Tür08]
  – XML queries (XPath, XQuery) are mapped to SQL queries (taking storage structures into account)
  – Result of XML query is generated from result of database query
    • "Labeling" of the result tuples
    • Result is in XML format

• Example: list bargain buys with their prices [Tür08]

SELECT a.content, b.content
FROM Edge a, Edge b
WHERE (a.label = 'price') AND (a.content < 10.00)
  AND (b.label = 'description')
  AND (b.parent = a.parent) AND (a.key = b.key)
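A generic edge table and the bargain query can be tried out with the following Python sketch. The exact Edge schema from the lecture figure is not reproduced in this transcript, so the columns used here (`id`, `parent`, `ord`, `label`, `content`), the sample catalog and the function `shred` are assumptions; SQLite stands in for the RDBMS.

```python
import sqlite3
import xml.etree.ElementTree as ET

xml_text = """<catalog>
  <item><description>Tea</description><price>7.50</price></item>
  <item><description>Teapot</description><price>25.00</price></item>
</catalog>"""

conn = sqlite3.connect(":memory:")
# Hypothetical generic schema: one row per element, linked to its parent.
conn.execute(
    "CREATE TABLE Edge (id INTEGER PRIMARY KEY, parent INTEGER, "
    "ord INTEGER, label TEXT, content TEXT)"
)

def shred(element, parent_id=None, ordinal=0):
    """Store one element as a row, then recurse into its children."""
    text = (element.text or "").strip() or None
    cur = conn.execute(
        "INSERT INTO Edge (parent, ord, label, content) VALUES (?, ?, ?, ?)",
        (parent_id, ordinal, element.tag, text),
    )
    node_id = cur.lastrowid  # internal ID assigned during the traversal
    for position, child in enumerate(element):
        shred(child, node_id, position)

shred(ET.fromstring(xml_text))

# Siblings are found through a self-join on the shared parent.
rows = conn.execute(
    "SELECT b.content, a.content FROM Edge a JOIN Edge b ON b.parent = a.parent "
    "WHERE a.label = 'price' AND CAST(a.content AS REAL) < 10.00 "
    "AND b.label = 'description'"
).fetchall()
print(rows)
```

Every additional step in a path expression adds another self-join on the same table, which is one reason path evaluation over a generic model tends to be expensive.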

• DOM-based storage [Tür08]
  – Information from the Document Object Model is stored in the database
  – Storage alternatives
    • (Object-)relational databases
    • Object-oriented databases
    • Developing a custom data structure

• DOM-based storage – example [Tür08]
(Figure not reproduced: stored DOM nodes with the node types ELEMENT, ATTRIBUTE and TEXT)
• XML queries
  – XML queries (DOM method invocations) are mapped to SQL queries (taking storage structures into account)
  – Result of method invocation is generated from result of database query

• Summary model-based storage
  – Schema definition:
    • Not required for storage
  – Document reconstruction:
    • Possible, but expensive
  – Queries:
    • XML queries possible
    • Adapted database queries
  – Special features:
    • Querying many elements/attributes is expensive
  – Efficiency:
    • Navigation from the given context is efficient
    • Restoring the document and evaluating path expressions is inefficient
  – Usage:
    • For data- and document-centric as well as for semi-structured XML applications

10.4 Schema-based storage

• Motivation
  – XML content shall be stored in a conventional database
  – Accepting the loss of native access
  – DB schema is derived from a DTD or an XML schema
• Problem
  – Generate DB schema automatically
  – Thereby use as much structure information as possible
• General approach for mapping from a DTD
  – Transform DTD into a tree representation
  – Nodes: element types, attributes, etc. (type layer!)
  – Edges: nesting relationships of element types and their restrictions
  – Traverse tree in order to transform nodes and edges into database tables (according to certain rules)

• Generating the DB schema for a DTD:
  – Rules to map element types:
    • XML element type → column of a table
    • Sequence of element types → columns of a table
    • Alternative of element types → column of a table
    • Element type with quantifier ? → column with null values
    • Element type with quantifier +, * → set/list of columns (SET OF, LIST OF)
    • Nested element types → TUPLE OF
  – Rules to map attributes:
    • XML attribute → column of a table
    • IMPLIED → null values allowed
    • REQUIRED → null values not allowed
    • Default value → DEFAULT constraint
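The attribute rules (IMPLIED, REQUIRED, default value) can be written down as a small Python function. This is a sketch under assumptions: the function name `attribute_to_column`, the sample attribute declarations and the chosen SQL types are invented for the example, and the SQL type is supplied by the caller because a DTD itself carries no such type information.

```python
def attribute_to_column(name, sql_type, default_decl):
    """Map one DTD attribute declaration to an SQL column definition.

    default_decl is '#IMPLIED', '#REQUIRED', or a literal default value.
    """
    if default_decl == "#IMPLIED":
        return f"{name} {sql_type} NULL"        # null values allowed
    if default_decl == "#REQUIRED":
        return f"{name} {sql_type} NOT NULL"    # null values not allowed
    escaped = default_decl.replace("'", "''")
    return f"{name} {sql_type} DEFAULT '{escaped}'"

# Hypothetical <!ATTLIST product ...> declarations
attributes = [
    ("id", "VARCHAR(20)", "#REQUIRED"),
    ("category", "VARCHAR(40)", "#IMPLIED"),
    ("currency", "CHAR(3)", "EUR"),
]
columns = ",\n  ".join(attribute_to_column(*a) for a in attributes)
ddl = f"CREATE TABLE product (\n  {columns}\n);"
print(ddl)
```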

• Mapping to relational databases
  – DTD is usually required
  – Queries use SQL functionality
  – RDBMS data types are used (e.g. prices are NUMERIC)
  – Problem: Mapping of collection types
    • Subdivide into additional relations
  – Example:

Comment:
Comment_ID  Customer_info  Feedback
44901       C0001          F0001

Customer_Info:
ID     Fname    Lname    Email
C0001  Charles  Sanchez  C.Sanchez@hotmail...

Feedback:
ID     Type     Content
F0001  opinion  Darjeeling Special…

• Mapping with STORED (Semistructured TO RElational Data)
  – Basic idea: Use data mining techniques on the XML structure to find a good mapping to tables [DFS99]
  – Input
    • XML documents (or an average sample of the collection)
    • Query workload
    • Restrictions of storage space, number of tables, …
    • No DTD or XML schema is required!
  – Output
    • Relational schema
    • STORED queries: Mapping instructions for XML documents to DB tables
  – Procedure
    • Determine the XML subtrees with the largest support in the collection and in the queries
    • These subtrees are materialised in tables
    • Irregular data is stored in overflow tables according to the EDGE approach

• Mapping with STORED – example [Tür08]
(Figure not reproduced: XML documents shown as tree structure, with the subtrees with high support marked)

• Mapping to object-relational databases
  – DTD is usually required
  – Queries use SQL functionality
  – "Natural" mapping to tuple types, collection types
  – In case of irregular document structure, the database contains many null values

  – Example [Tür08]:

Comment:
Comment_ID  <Customer_info> (Fname, Lname, Email)        <Feedback> (Type, Content)
44901       (Charles, Sanchez, C.Sanchez@hotmail...)     (opinion, Darjeeling Specia…)

• Mapping of recursive data definitions
  – DTDs can be recursive

<!ELEMENT book (front, body, references)>
<!ELEMENT references (book+)>

  – Infinite recursion is impossible on instance layer of a database
  – Procedure:
    • Marking the nodes
    • Subdividing into separate tables
    • Use primary and foreign keys in RDBMS
    • Use reference types in ORDBMS

• Mapping of element sequences
  – Sequence can be important
    • Use an additional attribute in these cases
  – Example:

<lecture>
  <lesson>Introduction</lesson>
  <lesson>XML basics</lesson>
</lecture>

⇓

Order  Lesson
1      Introduction
2      XML basics
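The additional order attribute can be tried out with a short Python sketch that shreds the lecture example into a table and rebuilds it. SQLite is used as a stand-in RDBMS, and the table name `lesson` and column name `ord` are assumptions chosen for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

xml_text = "<lecture><lesson>Introduction</lesson><lesson>XML basics</lesson></lecture>"

conn = sqlite3.connect(":memory:")
# ORDER is a reserved word in SQL, so the column is called ord here.
conn.execute("CREATE TABLE lesson (ord INTEGER, title TEXT)")
for position, lesson in enumerate(ET.fromstring(xml_text).iter("lesson"), start=1):
    conn.execute("INSERT INTO lesson VALUES (?, ?)", (position, lesson.text))

# Rows in a relation are unordered; ORDER BY on the extra column restores the sequence.
lecture = ET.Element("lecture")
for (title,) in conn.execute("SELECT title FROM lesson ORDER BY ord"):
    ET.SubElement(lecture, "lesson").text = title
rebuilt = ET.tostring(lecture, encoding="unicode")
print(rebuilt)
```

Without the `ord` column the two lessons could come back in either order, and the original document could not be reconstructed reliably.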

• Mapping of alternatives
  – XML allows alternatives to be specified
  – Example:

<!ELEMENT car (compactCar | sedan | van)*>

  – Three possible storage variants
    • Each alternative is stored as a separate table column
    • Subdivide alternatives into separate tables
    • Use a table column of type XML

• Variant 1 – all alternatives in one table [Tür08]
  – Problem: many null values (wasting storage space)
• Variant 2 – subdivided into multiple tables [Tür08]
  – For queries, combination of tables is needed
• Variant 3 – using column type XML [Tür08]
  – XML type allows XML queries or DOM methods

• Mapping of mixed content – example [Tür08]
(Figure not reproduced)
• Mapping of mixed content
  – Mapping to plain tables is ill-suited
  – Use variant 3 from above or
• Content model ANY is not representable at all
  – Arbitrary content, arbitrary element types
  – Often the fitting storage structure can only be decided on instance layer

• Schema-based storage with automatic mapping

– Advantages

• Queries, data types, aggregation functions, views

• Integration in other databases when storing structured data

– Disadvantages

• Large schema, sparsely filled databases (many null values)

• No flexible data types, storage of alternatives has problems

• Less flexible queries

– No information retrieval queries possible without additional extensions

– No full text operations for semi- or unstructured data

– Usually native access is not possible any more


• Mapping solutions with different specializations

– Algorithms, middleware, commercial applications, …

– Varying amount of required input or user decisions

– Many algorithms create different database schemas

• Two phases

– Mapping

• Assign a place for each node type in the DB

– Shredding

• Import the XML data as DB tuples

• Comparison of mapping algorithms and products [Bus08]
(Table not reproduced; columns: algorithm/product, based on (n/a, DTD, schema), restrictions (keys, cardinalities, types), DTD optimisation)

• The shredder can be part of the DB
  – Usually requires an XML schema
  – In the IBM Data Studio, the shredder is part of the "annotated XML schema decomposition"
  – Direct approach in DB2:
    • Register the XML schema and call the stored procedure:

register xmlschema http://our.org/custacc from dec_files/custacc.xsd as cust_schema ;
complete xmlschema cust_schema enable decomposition ;
call SYSPROC.XDBDECOMPXML ('VRODRIG', 'CUST_SCHEMA', ? , ?, 1, null, null, null)

• Shredding without XML schema in DB2
  – XMLTABLE function in combination with an INSERT
  – Source: http://www.ibm.com/developerworks/db2/library/techarticle/dm-0801ledezma/

INSERT INTO ENVELOPEXT (MAILFROM, MAILTO, MAILDATE, SUBJECT)

SELECT MAILFROM, MAILTO, MAILDATE, SUBJECT

FROM XMLTABLE(

XMLNAMESPACES('http://www.sal.com/mails' AS "email"),

'$doc/email:mails/mail' (: some xquery-expression :)

PASSING xml-source AS "doc"

COLUMNS

MAILFROM VARCHAR (100) PATH 'envelope/from',

MAILTO VARCHAR (100) PATH 'envelope/to',

MAILDATE VARCHAR (30) PATH 'envelope/email:Date',

SUBJECT VARCHAR (100) PATH 'envelope/Subject') AS T;
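For readers without a DB2 installation, the same shredding step can be approximated in Python. This is only an analogue of the XMLTABLE statement above, not DB2 behaviour: the sample mail document is invented, SQLite replaces DB2, and the dictionary `column_paths` plays the role of the COLUMNS ... PATH clauses.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input; the namespace URI is the one used in the statement above.
xml_source = """<email:mails xmlns:email="http://www.sal.com/mails">
  <mail><envelope>
    <from>a@example.org</from><to>b@example.org</to>
    <email:Date>2008-01-15</email:Date><Subject>Hello</Subject>
  </envelope></mail>
</email:mails>"""

ns = {"email": "http://www.sal.com/mails"}
column_paths = {  # column name -> path relative to each <mail> row element
    "MAILFROM": "envelope/from",
    "MAILTO": "envelope/to",
    "MAILDATE": "envelope/email:Date",
    "SUBJECT": "envelope/Subject",
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ENVELOPEXT (MAILFROM TEXT, MAILTO TEXT, MAILDATE TEXT, SUBJECT TEXT)"
)

root = ET.fromstring(xml_source)          # root is <email:mails>
for mail in root.findall("mail", ns):     # row-generating expression
    values = [mail.findtext(path, namespaces=ns) for path in column_paths.values()]
    conn.execute("INSERT INTO ENVELOPEXT VALUES (?, ?, ?, ?)", values)

rows = conn.execute("SELECT * FROM ENVELOPEXT").fetchall()
print(rows)
```

As in the SQL version, one expression selects the elements that become rows, and one relative path per column extracts the values; no XML schema is consulted.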

• Summary schema-based storage with automatic mapping
  – Schema definition:
    • Is usually required and analysed
    • Not required, e.g. for STORED
  – Document reconstruction:
    • Limited (requires logging of the mapping process)
  – Queries:
    • Database queries
    • XML queries possible, but lack the XPath horizontal axes, e.g. following, preceding-sibling
  – Special features:
    • Federation with existing databases is possible
  – Efficiency:
    • High efficiency by using the DB engine
  – Usage:
    • For data-centric XML applications, but with limited nesting

• User defined mapping
  – Idea
    • In all previously shown methods it is not possible to affect the storage in the DB
    • With user defined mappings the user defines the storage structure
    • The structure of XML documents and database schema can be designed independently from each other
    • Also possible: storing XML documents in existing databases
  – Annotation of DTD and XML schema, respectively
    • In many cases the mapping definition is combined with existing schema information
  – Only limited XML queries possible
    • Logging of the mapping process from XML documents to databases
    • For a given query all relevant data has to be stored (lossless mapping)

• Example [Tür08]
(Figure not reproduced: an XML document next to its mapping instruction)

• Mapping instruction
  – Example syntax for XML-DBMS (Ronald Bourret)

<ClassMap>
  <ElementType Name="sales:SalesOrder"/>
  <ToClassTable>
    <Table Name="Sales"/>
  </ToClassTable>
  <PropertyMap>
    <Attribute Name="SONumber"/>
    <ToColumn>
      <Column Name="Number"/>
    </ToColumn>
  </PropertyMap>
</ClassMap>

  – <ElementType> with <ToClassTable>: connection between elements and tables
  – <PropertyMap>: connection between elements/attributes and table columns
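How such a mapping instruction could drive the import is sketched below in Python. This is not the XML-DBMS implementation: the interpreter is a hypothetical, heavily reduced one that handles a single ClassMap with attribute-to-column PropertyMaps, and the `sales:` namespace prefix is dropped to keep the example short. The sample document is invented.

```python
import xml.etree.ElementTree as ET

mapping_xml = """<ClassMap>
  <ElementType Name="SalesOrder"/>
  <ToClassTable><Table Name="Sales"/></ToClassTable>
  <PropertyMap>
    <Attribute Name="SONumber"/>
    <ToColumn><Column Name="Number"/></ToColumn>
  </PropertyMap>
</ClassMap>"""

document_xml = '<Orders><SalesOrder SONumber="123"/><SalesOrder SONumber="124"/></Orders>'

class_map = ET.fromstring(mapping_xml)
element_type = class_map.find("ElementType").get("Name")
table = class_map.find("ToClassTable/Table").get("Name")
# XML attribute name -> table column name
columns = {
    pm.find("Attribute").get("Name"): pm.find("ToColumn/Column").get("Name")
    for pm in class_map.findall("PropertyMap")
}

# Turn every mapped element into a parameterised INSERT statement.
statements = []
for element in ET.fromstring(document_xml).iter(element_type):
    names = ", ".join(columns.values())
    marks = ", ".join("?" for _ in columns)
    params = [element.get(attr) for attr in columns]
    statements.append((f"INSERT INTO {table} ({names}) VALUES ({marks})", params))
print(statements)
```

The point of the sketch is the separation of concerns: the document structure and the table layout are independent, and only the mapping instruction ties them together.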

• Remarks
  – Many different mapping languages or schema annotations
    • Automatic mappings usually have an internal mapping language
  – Remember the mapping constructs from lectures 5 and 6. The SQL/XML annotations are a mapping language, too.
  – DB2 uses similar annotations as SQL/XML
    • On the next slide, the example from lecture 6 is shown with DB2 syntax

• Example: SQL/XML schema annotations in DB2 (the table is called rowSet) [Tür08]

CREATE TABLE Account
(
  Name CHAR(20),
  Balance NUMERIC(12,2)
);

Name  Balance
Joe   2000
Jim   3500

<ACCOUNT>
  <row>
    <NAME>Joe</NAME>
    <BALANCE>2000</BALANCE>
  </row>
  <row>
    <NAME>Jim</NAME>
    <BALANCE>3500</BALANCE>
  </row>
</ACCOUNT>

Mapping SQL table columns to XML elements:

<xsd:complexType xmlns:db2-xdb="http://www.ibm.com/xmlns/prod/db2/xdb1"
                 name="ROW.ACCOUNT">
  <xsd:sequence>
    <xsd:element name="NAME"
                 type="CHAR_20"
                 db2-xdb:rowSet="Account"
                 db2-xdb:column="Name"/>
    <xsd:element name="BALANCE"
                 type="NUMERIC_12_2"
                 db2-xdb:rowSet="Account"
                 db2-xdb:column="Balance"/>
  </xsd:sequence>
</xsd:complexType>

Mapping table rows to XML <row> elements:

<xsd:complexType name="TABLE.ACCOUNT">
  <xsd:sequence>
    <xsd:element name="row"
                 type="ROW.ACCOUNT"/>
  </xsd:sequence>
</xsd:complexType>

Mapping SQL tables:

<xsd:element name="ACCOUNT"
             type="TABLE.ACCOUNT"/>
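The table-to-XML direction of the Account example can be reproduced with a short Python sketch. It follows the three mapping rules named in the example (table to element, row to `<row>`, column to child element) but does not use DB2 or the annotated schema; SQLite and `xml.etree.ElementTree` are stand-ins chosen for the illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (Name CHAR(20), Balance NUMERIC(12,2))")
conn.executemany("INSERT INTO Account VALUES (?, ?)", [("Joe", 2000), ("Jim", 3500)])

cursor = conn.execute("SELECT Name, Balance FROM Account ORDER BY rowid")
column_names = [d[0].upper() for d in cursor.description]

table_element = ET.Element("ACCOUNT")                        # table  -> <ACCOUNT>
for row in cursor:
    row_element = ET.SubElement(table_element, "row")        # row    -> <row>
    for name, value in zip(column_names, row):
        ET.SubElement(row_element, name).text = str(value)   # column -> child element

xml_out = ET.tostring(table_element, encoding="unicode")
print(xml_out)
```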

• Summary schema-based storage with user defined mapping
  – Schema definition:
    • Depends on mapping language
  – Document reconstruction:
    • Not possible in most cases (requires logging of the mapping process)
  – Queries:
    • Database queries
    • XML queries in rare cases only!
  – Special features:
    • Integration with existing databases is possible
  – Efficiency:
    • High efficiency by using the DB engine
  – Usage:
    • For data-centric XML applications

10.5 Conclusion

• Different methods for storage of XML documents
  – Text-based
    • Storing whole XML documents as string
    • Can use full text index or path index
  – Model-based
    • Generic mapping of the tree structure
  – Schema-based
    • Detect and analyse the structure of the XML documents
    • Derive a DB schema from the structure
  – Hybrid approaches
    • A combination of some of those methods
  – No algorithm has the optimal solution for all kinds of XML documents
  – A reasonable solution is heavily dependent on the application

10.6 References

• [Tür08] "XML und Datenbanken"
  – Can Türker
  – Lecture, University of Zurich, 2008
• [KM03] "XML und Datenbanken"
  – M. Klettke, H. Meyer
  – dpunkt.verlag, 2003
• [Bus08] "Generierung eines adaptiven Datenbankschemas für datenzentrierte XML-Dokumente"
  – Carsten Busche
  – Diploma thesis, TU Braunschweig, 2008
• [FK99]
  – D. Florescu, D. Kossmann: Storing and Querying XML Data using an RDBMS. IEEE Data Engineering Bulletin (DEBU), Volume 22(3), pp. 27-34, 1999.
• [DFS99]
  – A. Deutsch, M.F. Fernández, D. Suciu: Storing Semistructured Data with STORED. Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data, pp. 431-442, ACM, 1999.

10.6 Overview

1. Introduction
2. XML Basics
3. Schema definition
4. XML query languages I
5. Mapping relational data to XML
6. SQL/XML
7. XML processing
8. XML query languages II – XQuery Data Model
9. XML query languages III – XQuery
10. XML storage I – Overview
11. XML storage II
12. Updates / Transactions
13. Systems

Questions, Ideas, Comments

• Now, or ...
• Room: IZ 232
• Office hour: Tuesday, 12:30 – 13:30 or by appointment
• Email: [email protected]