Realization of DBS

5. Storage Structures

Theo Härder
www.haerder.de

© 2011 AG DBIS

Realization of Database Systems – SS 2011

Main references:
Theo Härder, Erhard Rahm: Datenbanksysteme – Konzepte und Techniken der Implementierung, Springer, 2001, Chapter 6.

Jim Gray, Andreas Reuter: Transaction Processing – Concepts and Techniques, 5th printing, Morgan Kaufmann Publ., 1993, Chapter 14.

Storage Structures

Goal: design of
• Storage structures for records and complex objects
• Auxiliary structures such as free-placement administration, addressing, etc.

Free-placement administration

Disk-based record addressing
• TID, allocation table, indexing of tables

Memory-based record addressing
• Classification of the solution concepts, pointer swizzling methods

Mapping of records
• Fixed/variable fields, partitioning

Storage structures for complex objects
• List and set constructors, tuple constructor

Problems of large objects

Storage structures for LOBs
• Segments of fixed and variable size, access via B*-tree, pointer list, ...

Storage Structures (2)

Operations
• Insert <record> at <location> with <database-key>
• Retrieve <record> with <database-key>
• Add <entry> to <B*-tree>
• Retrieve <address-list> from <B*-tree> for <value>

Mapping functions
• record identifier → address
• attribute value → record-id list
• record identifier → record-id list
• address → {occupied, free}

FIX Pi, FIX Pj, UNFIX Pj, FIX Pk, UNFIX Pi, ...

Properties of the upper interface
• Non-volatile storage with "addressing assistance"
• Free-placement administration
• Addressing methods of and between physical records
• Access paths supporting content addressability

Free-Placement Administration

Free-placement administration (FPA) for
• External storage (allocation of files)
• Segments (allocation of internal records)
• Pages (administration of allocated/free entries)

For all pages of a segment:
• Insert/update → search for n free bytes
• Delete/update → release or allocation of storage space
• In general: search, allocation, and release of storage space in Sj

[Figure: segment Sj as a sequence of pages of length LP; each page Pi starts with a page header PH followed by records. In PH (page header): ID of Pi, free-placement info, type, org. data. The free-placement table F in Sj has one entry fi of length Lf per page Pi, with fi = # free bytes.]

Free-Placement Administration (2)

Size of F
• $k = \lfloor (L_P - L_{PH}) / L_f \rfloor$ entries per page of size LP
• With s = #pages in segment: $n = \lceil s / k \rceil$ pages for F

Location of F
• Begin of segment
• Equidistant distribution: pages i · k + 1 (i = 0, 1, 2, ...)
• End of segment

Kind of FPA
• Exact: Lf = 2 bytes
• Fuzzy: Lf = 1 byte (or less); fi is kept in units of LP/256, i.e., multiples of 16 bytes for LP = 4 KB

FPA within Pi
• Exact fi in PH
• Contiguous administration (displacements!)
• Free-storage chain (best-fit / first-fit)
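
The sizing rule and the fuzzy variant of the free-placement table can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: the function names, the 32-byte page header, and the first-fit search over the table are assumptions made for the example.

```python
import math

def fpa_table_pages(s, page_len=4096, header_len=32, entry_len=1):
    """Return (k, n): entries per table page and table pages needed for s pages.

    header_len=32 is an assumed page-header length, not a value from the slides.
    """
    k = (page_len - header_len) // entry_len
    n = math.ceil(s / k)
    return k, n

def fuzzy_entry(free_bytes, page_len=4096):
    """Encode free space in units of page_len/256, rounded down so that the
    table never promises more space than the page really has."""
    unit = page_len // 256
    return min(free_bytes // unit, 255)

def find_page(table, needed, page_len=4096):
    """First-fit search in a fuzzy table for a page with at least `needed` bytes."""
    unit = page_len // 256
    units_needed = math.ceil(needed / unit)
    for page_no, entry in enumerate(table):
        if entry >= units_needed:
            return page_no
    return None
```

With 1-byte entries, one 4 KB table page describes a few thousand data pages, which is why the fuzzy form is attractive; the price is that a page may be skipped although it would just barely fit the request.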

Disk-based Record Addressing

Problem statement
• Long-term storage of data records
• Avoidance of "technology dependencies"
• Support of migration, etc.

General form of a record address
• DBID, SID, TID and, if necessary, table selector (RID)
• If table completely stored in one segment: in record: TID; in DB catalog: DBID, SID
• If table in several segments: SID, TID

Goals of the addressing technique:
• Fast, possibly direct record access
• Stable against minor displacements (moves within a page without impact)
• Infrequent or no reorganizations

Addressing in segments
• Logically contiguous address space
• Direct addressing (logical byte address, RBA) → unstable under displacements
→ Indirect addressing is very important

Disk-based Record Addressing – Techniques

Addressing methods
• Direct addressing
  - Logical (relative) byte address in segment
• Indirect addressing
  - TID
  - Allocation table
  - Primary key

Variants at the leaves of the classification: DBK, DBK/PPP, PK/PPP, ARID/PPP

Record Addressing: TID Concept

TID (tuple identifier) consists of two components:
• Page number (3 bytes)
• Relative index position within page (1 byte)
Serves for addressing within a segment (e.g., SID = A)

Migration of a record into another page without TID change
• Creation of a proxy TID in the primary page
• Overflow chain: length <= 1

[Figure: segment A, page 123 with its index of TIDs; example TIDs (123, 3) and (123, 6); records stored in the page, and a proxy TID referring to an overflow record in another page.]
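
The proxy mechanism can be sketched as follows. This is a simplified in-memory model under stated assumptions: the classes `Page` and `Segment`, the slot dictionaries, and the access counter are hypothetical names for illustration and ignore byte-level page layout. It shows the point of the slide: a migrated record keeps its TID, and a repeated migration redirects the existing proxy, so the overflow chain never grows beyond one hop.

```python
class Page:
    def __init__(self):
        # index position -> ("rec", data) or ("proxy", (page_no, slot))
        self.slots = {}

class Segment:
    def __init__(self):
        self.pages = {}

    def page(self, no):
        return self.pages.setdefault(no, Page())

    def insert(self, page_no, slot, data):
        self.page(page_no).slots[slot] = ("rec", data)
        return (page_no, slot)              # the TID handed out to access paths

    def migrate(self, tid, new_page_no, new_slot):
        """Move a record to another page; its TID stays unchanged."""
        home_page, home_slot = tid
        kind, payload = self.pages[home_page].slots[home_slot]
        if kind == "proxy":
            # Already overflowed once: remove the old overflow record so the
            # chain length stays <= 1.
            old_page, old_slot = payload
            data = self.pages[old_page].slots.pop(old_slot)[1]
        else:
            data = payload
        self.page(new_page_no).slots[new_slot] = ("rec", data)
        self.pages[home_page].slots[home_slot] = ("proxy", (new_page_no, new_slot))

    def fetch(self, tid):
        """Return (data, number of page accesses)."""
        page_no, slot = tid
        kind, payload = self.pages[page_no].slots[slot]
        accesses = 1
        if kind == "proxy":
            kind, payload = self.pages[payload[0]].slots[payload[1]]
            accesses = 2
        return payload, accesses
```

A record that was never displaced costs one page access, a displaced one costs two, regardless of how often it has moved.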

Record Addressing via Allocation Table

Each record owns a unique logical identifier
• Database key (DBK)
• Allocation of DBKs done by the DBMS, in general
• System-internal references to records are exclusively made via DBK

Allocation table contains a page pointer (PP) for each DBK
• SID (1 byte), page number (3 bytes)

Hybrid method:
use of "probable page pointers" (PPP) in access paths possibly saves accesses to the allocation table

[Figure: the allocation table for record type Y maps DBK Y003 to PP (A, 123) and DBK Y006 to PP (A, 124). In segment A, page 123 holds record Y003 (xxx …) and page 124 holds record Y006 (zzz …). The index structure stores entries of the form (search criterion, DBK, PPP): (xxx, Y003, A123) and (zzz, Y006, A124).]
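
A lookup with a probable page pointer can be sketched like this. The class `Store`, its dictionaries, and the `table_lookups` counter are assumptions for illustration; real systems would verify the PPP against the fixed page in the DB buffer rather than a Python dict.

```python
class Store:
    def __init__(self):
        self.pages = {}          # (sid, page_no) -> {dbk: record}
        self.alloc = {}          # allocation table: dbk -> (sid, page_no)
        self.table_lookups = 0   # how often the allocation table was consulted

    def put(self, dbk, record, pp):
        """Store or move a record; only the allocation table is updated,
        PPPs held in index entries are left stale on purpose."""
        old = self.alloc.get(dbk)
        if old is not None:
            self.pages[old].pop(dbk, None)
        self.pages.setdefault(pp, {})[dbk] = record
        self.alloc[dbk] = pp

    def fetch(self, dbk, ppp=None):
        """Try the probable page pointer first, fall back to the table."""
        if ppp is not None and dbk in self.pages.get(ppp, {}):
            return self.pages[ppp][dbk]
        self.table_lookups += 1
        return self.pages[self.alloc[dbk]][dbk]
```

As long as the record has not moved, the index entry (xxx, Y003, A123) leads to it without touching the allocation table; after a move, the stale PPP costs one wasted page access plus the table lookup.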

Indexing of Tables

Storage of tables
• Unordered table: records (rows) are scattered across the pages of the segment (heap)
• Ordered table: records are embedded in a B*-tree (key-sequenced table); thereby, clustering is achieved
We denote such a table as index-organized table (IT)

[Figure: B*-tree ITTab(PK) with root page (25, 61), internal pages (8, 13), (33, 45), (77, 85), and leaf pages.]

Indexing of tables
• With secondary indexes for columns Ai: ITab(Ai)
• Use of different addressing methods
  - TID (physical)
  - DBK (indirect: logical/physical)
  - PK (primary key: logical)
  - Hybrid methods

Indexing of Tables (2)

How do addressing and table allocation play together?

Unordered table

[Figure: index ITab(A5) with an entry a57 | Ptr | Ptr | ... whose pointers refer to records in segment Tab, which are stored as a heap.]

• Records are not (or hardly) displaced in case of modification
• Addressing methods (Ptr): TID, DBK, and DBK/PPP are conceivable
• Index support for unordered tables in DB2, Sybase, MS SQL-Server, Oracle, ...

Indexing of Tables (3)

Index-organized table

[Figure: index ITab(A5) with an entry a57 | Ptr | Ptr | ... pointing into ITTab(PK), whose records have the form PK1 | a11 | ... | a57 | ...]

• Split in ITTab requires many address adjustments in ITab(Ai), when using
  - TID
  - DBK
  - DBK/PPP

• Improvement: logical addressing

[Figure: index ITab(A5) with an entry a57 | PK1 | ... | PKj referring to ITTab(PK) records PK1 | a11 | ... | a57 | ...]

• No maintenance of ITab(Ai) needed in case of splits/displacements in ITTab
• But: higher access costs for index scan, etc.

Indexing of Tables (4)

Use of a hybrid addressing method
• Reference has two components
  - Logical reference: PK
  - Physical reference: probable DB page (PPP, Guess DBA)
• Entry in index: attribute value (index key) | PK | PPP, where HRID = (PK, PPP) is the hybrid row identifier

[Figure: index ITab(A5) with an entry a57 | PK1 PPP1 | ... | PKj PPPj referring to ITTab(PK) records PK1 | a11 | ... | a57 | ...]

• Combined advantages of both methods
• What happens in case of long primary keys?

Indexing of Tables (5)

Optimization for long primary keys
• Example: table Order_Line of TPC-C benchmark:
  OL (ol-o-id, ol-w-id, ol-d-id, ol-number, ol-i-id, ...)
• Simplified notation: OL (A1, A2, A3, A4, A5, ...)
• Avoidance of PK storage in the index (solution of Oracle)
  - Use of a mapping table ATab
  - Reference to ATab by ARID

[Figure: index IOL(A5) with an entry a57 | ARID1 PPP1 | ... | ARIDj PPPj; mapping table ATab with entries PK1 PPP1, ..., PKj PPPj; index-organized table ITOL(A1, A2, A3, A4) with records a11, a21, a31, a41 | ARID1 | a57 | ...]

• If access via PPP fails, ATab is searched using ARID
• All IOL(Ai) use ATab
• Starting from ATab, access to ITOL can be performed via PPP or via PK

Memory-based Addressing

Task: programs should be enabled to transparently process transient and persistent data objects in main memory
• Exclusive usage of direct addresses in main memory (virtual addressing), i.e., in main memory, access to persistent objects is as efficient as access to transient objects
• No additional costs for programs only accessing transient objects
• Mapping costs for persistent objects should not be paid for each access

Mapping of persistent objects residing on external storage (ES) to objects in virtual storage (VS)
• Persistent addresses (e.g., SID, RID, TID) are long (e.g., 64 bit); in contrast, virtual addresses are shorter (e.g., 32 bit)
• Translation of pointers (pointer swizzling) from the long format (using indirect addressing) to the shorter format using an addressing method "as direct as possible"

Memory-based Addressing (2)

Goal: fast processing of pointer sequences in VS, e.g., 10^5 refs/sec
• Object processing: traversing sequences of references and navigation in meshed object structures
• Direct access in main memory is substantially cheaper than access via persistent addresses (localization of a page in the DB buffer and search of the object in the page)
• Additional access paths to support search in main memory, if necessary: B*-tree access requires h+1 direct pointer references

Dimensions of Pointer Swizzling*

[Figure: dimensions of pointer swizzling: full / partial / no swizzling; software / hardware; copy / in-place; direct / indirect; eager / lazy; uncaching / no uncaching.]

* White, S.J., DeWitt, D.J.: QuickStore: A High Performance Mapped Object Store, in: The VLDB Journal 4:4, Oct. 1995, pp. 629-674.

Pointer Swizzling

Classification of swizzling methods
• Most important criteria: location, point of time, and mode (orthogonal)

• Location:
  - In-Place Swizzling: retention of object formats and page structures
  - Copy Swizzling: copy of objects in a buffer and swizzling of pointers in the copies

• Point of time:
  - Eager Swizzling: swizzling of all pointers as soon as objects are placed in main memory
  - Lazy Swizzling: swizzling of pointers at the first reference or later (according to arbitrary criteria, e.g., magic number 3)

• Mode:
  - Direct Swizzling: use of the virtual address of the object; using this method, replacement of the object can become very difficult or even impossible during processing
  - Indirect Swizzling: use of the virtual address of the object descriptor

Pointer Swizzling (2)

Classification criterion: location
• In-place: objects O1, O2 stay in the DB buffer
• Copy: objects O1, O2 are copied into an object buffer (heap), which is typically allocated in the client computer

Classification criterion: point of time
• Eager: as soon as in main memory (avalanche effect)
• Lazy: many possibilities

Classification criterion: mode
• Direct: O1 references O2 directly
• Indirect: O1 references O2 via its descriptor (descriptor 1, descriptor 2)

Pointer Swizzling (3)

Direct and indirect swizzling – Principle

[Figure: a) Symmetric references among object 1, object 2, and object 3. b) Referencing of descriptors: object 1 and object 2 refer to descriptor 3 and descriptor 4; the figure also shows object 4.]

Pointer Swizzling (4)

Direct and indirect variant using Copy Swizzling

[Figure a) Direct swizzling in an object buffer: an allocation table OID/MM addr., accessed via h(OID1), locates object 1; objects 1 to 5 reference each other directly.]

[Figure b) Indirect swizzling in an object buffer: an allocation table OID/Handle, accessed via h(OID1), locates descriptor 1; objects 1 to 5 reference each other via descriptors 1 to 5.]

Pointer Swizzling (5)

Checks
• Lazy: check whether swizzling is already performed
• Eager: no check
• Indirect: check whether object is already / still there
• Direct: no check (but no uncaching of the object after swizzling or unswizzling)

Costs
• Eager/Direct: 0 checks (→ no uncaching)
• Eager/Indirect: 1 check (in the descriptor) (→ replacement using reference counter)
• Lazy/Direct: 1 check (+ cost of heuristics) (→ replacement using symmetric pointers)
• Lazy/Indirect: 2 checks (+ cost of heuristics (> 3 refs))
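
The two checks of the Lazy/Indirect combination can be made visible in a small sketch. All names (`Descriptor`, `Ref`, `ObjectBuffer`, the `checks` counter) are hypothetical, and a Python dict stands in for external storage; the heuristics mentioned on the slide (swizzle only after more than 3 references) are left out for brevity.

```python
class Descriptor:
    def __init__(self, oid):
        self.oid = oid
        self.obj = None            # None while the object is not cached

class Ref:
    """A reference field: holds an OID until swizzled, then a descriptor."""
    def __init__(self, oid):
        self.target = oid
        self.swizzled = False

class ObjectBuffer:
    def __init__(self, database):
        self.database = database   # oid -> object, stands in for external storage
        self.descriptors = {}      # allocation table: OID -> descriptor
        self.checks = 0

    def deref(self, ref):
        self.checks += 1           # check 1 (lazy): is the pointer swizzled yet?
        if not ref.swizzled:
            desc = self.descriptors.setdefault(ref.target, Descriptor(ref.target))
            ref.target, ref.swizzled = desc, True
        desc = ref.target
        self.checks += 1           # check 2 (indirect): is the object resident?
        if desc.obj is None:
            desc.obj = self.database[desc.oid]
        return desc.obj

    def uncache(self, oid):
        """Replace an object; swizzled references stay valid via the descriptor."""
        self.descriptors[oid].obj = None
```

Every dereference pays both checks, but in exchange the object can be uncached at any time without touching the references that point to it. An Eager/Direct scheme would drop both checks and store the object address in `ref.target` right away, at the price of not being able to replace objects.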

Pointer Swizzling (6)

Method     1         2         3         4         5      6      7      8
Location   In-Place  In-Place  In-Place  In-Place  Copy   Copy   Copy   Copy
Time       Eager     Eager     Lazy      Lazy      Eager  Eager  Lazy   Lazy
Mode       D         I         D         I         D      I      D      I

(D = direct, I = indirect)

Remarks:
• 1 + 5: swizzling of all pages/objects at checkout, no replacement (no uncaching)
• 2 + 4: cumbersome organization

Questions
- Which methods allow fastest processing (no consideration of swizzling cost)?
- Which methods enable object replacement (uncaching)?
- How can Lazy/Direct (3 + 7) be implemented?

Mapping of Records

Record manager
• Physical storage of records in pages
• Operations: read, insert, modify, delete

Record description
• Per attribute:
  - Metadata in catalog (DD): attribute name (First_Name), type (varchar), ...
  - Instance in record: length (5), attribute value (Xaver)
• Description of records and access paths in the catalog
• Special methods for storing values
  - Blank/null suppression
  - Character compression
  - Cryptographic encoding
  - Symbol for undefined values
• Table substitution for values: KL = Kaiserslautern

Organization
• n record types per segment
• m records of different types per page
• Record size < page size: RL ≤ LP - LPH

Storage Structures for Records

Design goals
• Space economy
• Fast location of the i-th field (to a large extent, computation using catalog information)
• Dynamic extension (ALTER TABLE ...)

Concatenation of fixed-length fields
Catalog: f5 | f8 | f80 | f6 | ... |
Record: RID (e.g., TID) | f5 | f8 | f80 | f6 | ...
• Space consuming
• Inflexible

Pointer in prefix
Catalog: f5 | v | v | f6 | ... |
Record: RID | prefix of pointers (2 bytes each) | ...
• Inflexible

Storage Structures for Records (2)

Embedded length field
Catalog: f5 | v | v | f6 | f2 | v
Record: RID | TL | val (f5) | L val | L val | val (f6) | val (f2) | L val
• Increased use of the catalog
• Dynamic extension possible

Optimization: embedded length fields using pointers
Catalog: f5 | v | v | f6 | f2 | v |
Record: RID | TL | FL | val (f5) | val (f6) | val (f2) | L val | L val | L val
• Address of the n-th attribute can be computed
• Dynamic extensibility
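
To make the idea of separating fixed-length and variable-length fields concrete, here is a small sketch for the catalog f5 | v | v | f6 | f2 | v. The byte layout is an assumption chosen for the example (2-byte little-endian TL and FL, FL taken as the offset where the variable part begins, RID omitted); the slides do not fix these details, and `encode` / `field` are hypothetical helper names.

```python
import struct

# Catalog entries: ("f", n) = fixed length n, ("v", None) = variable length
CATALOG = [("f", 5), ("v", None), ("v", None), ("f", 6), ("f", 2), ("v", None)]

def encode(values):
    """Pack a record as TL | FL | fixed fields | (L, val) per variable field."""
    fixed = b"".join(v.ljust(n)[:n]
                     for (k, n), v in zip(CATALOG, values) if k == "f")
    var = b"".join(struct.pack("<H", len(v)) + v
                   for (k, _), v in zip(CATALOG, values) if k == "v")
    fl = 4 + len(fixed)            # offset of the variable part (assumed meaning)
    tl = fl + len(var)             # total record length
    return struct.pack("<HH", tl, fl) + fixed + var

def field(record, i):
    """Return the i-th field (catalog order) of an encoded record."""
    kind, size = CATALOG[i]
    _tl, fl = struct.unpack_from("<HH", record, 0)
    if kind == "f":
        # Fixed fields: offset follows from catalog information alone.
        off = 4 + sum(n for k, n in CATALOG[:i] if k == "f")
        return record[off:off + size]
    # Variable fields: skip the preceding variable fields via their lengths.
    off = fl
    for k, _ in CATALOG[:i]:
        if k == "v":
            (length,) = struct.unpack_from("<H", record, off)
            off += 2 + length
    (length,) = struct.unpack_from("<H", record, off)
    return record[off + 2:off + 2 + length]
```

Fixed-length fields are located without reading any length field, while a variable-length field still requires stepping over its variable-length predecessors; adding a new variable-length column at the end does not disturb existing records.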

Storage Structures for Records (3)

Record mapping: evaluation of methods

[Table: the four methods (concatenation of fixed-length fields, pointer in prefix, embedded length fields, embedded length fields using pointers) are rated by space economy, access speed within a record, and extensibility; the ratings are not preserved in this transcript.]

Special storage requirements
• 200 attributes/record?
• RL LP - LPH
• Must fit for n relational DBMSs
• Indexing

[Figure: record with attributes First_Name, Name, Job; a single value (First_Name = Xaver) is referred to via OID.]

Storage Structures for Records (4)

Extreme solution: AOV (built-in schema evolution)
• Triple (A, OID, V), e.g., (First_Name, XYZ 0815, Xaver)

Mapping to n-ary relation DR
• DR (AID, OID, V1, V2, V3, V4, V5) with value columns of own types: INT, FLOAT, MONEY, VARCHAR, ...
• Example tuple: (A15, XYZ 0815, Xaver)
• Search of the entire record having 200 attributes? Via OID
• Index on all attributes

Search:
SELECT *
FROM Emp (DR)
WHERE First_Name = 'Xaver'
(OR) AND Job = 'Programmer'
(OR) AND Age > 50

How can this query be mapped onto DR?
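
One possible answer, sketched under simplifying assumptions: DR is modeled as a list of (A, OID, V) triples with made-up sample data, each predicate is evaluated separately to a set of OIDs, the sets are intersected for AND (or united for OR), and the full records are reassembled via OID. The helper names `oids_where` and `select_star` are hypothetical.

```python
from collections import defaultdict

# Made-up sample data in AOV form: (attribute, OID, value)
DR = [
    ("First_Name", 815, "Xaver"), ("Job", 815, "Programmer"), ("Age", 815, 52),
    ("First_Name", 816, "Xaver"), ("Job", 816, "Clerk"),      ("Age", 816, 61),
]

def oids_where(attr, pred):
    """OIDs of all objects whose attribute `attr` satisfies `pred`.
    With an index on all attributes this would be an index lookup."""
    return {oid for a, oid, v in DR if a == attr and pred(v)}

def select_star(oids):
    """Reassemble complete records for the given OIDs (access via OID)."""
    rows = defaultdict(dict)
    for a, oid, v in DR:
        if oid in oids:
            rows[oid][a] = v
    return dict(rows)

name_hits = oids_where("First_Name", lambda v: v == "Xaver")
job_hits = oids_where("Job", lambda v: v == "Programmer")
age_hits = oids_where("Age", lambda v: v > 50)

and_result = select_star(name_hits & job_hits & age_hits)   # AND variant
or_result = select_star(name_hits | job_hits | age_hits)    # OR variant
```

In SQL terms this corresponds to one self-join (or INTERSECT/UNION branch) of DR per predicate, followed by a final join on OID to collect all attributes, which illustrates why the AOV layout makes multi-attribute queries expensive.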

Storage Structures for Records (5)

Problem: dynamic growth / variable length
• Growth and shrinking in a page
• Overflow schemata, garbage collection
→ Methods of record storage introduced so far are to be combined with additional options

Strictly contiguous storage of records
• Numerous migrations needed in case of high update frequency
• Advantages for indirect addressing schemes

Splitting of the record
• Ordering according to reference frequencies
• Improvement of clustering
• Repeated overflow possible
• Is inevitable in case of storing attributes of type TEXT or IMAGE

[Figure: a record with fields F1 F2 F3 F4 F5 F6 F7 that is split into parts.]

Storage Structures for Complex Objects*

Complex objects are composed of
• Atomic values and, thereupon,
• recursively applied set, list, and tuple constructors

Model for complex objects (eNF2)

[Figure: the constructors set, list, and tuple can be applied to each other and to values.]

Storage strategy
• Orthogonality is important: no enumeration of all possibilities
• Filing of frequently accessed substructures (possibly shared) in a single or a few storage units
• Rarely accessed substructures should be separated
→ Application knowledge!

Performance aspects
• of complex objects/operations are essentially affected by the storage structures used
• Minimization of I/O → clustering, consideration of object growth

* Keßler, U., Dadam, P.: User-guided, flexible storage structures for complex objects, Proc. BTW'93, Braunschweig, 1993, pp. 206-225.

Storage Structures for Complex Objects (2)

Simple example
• complex_object Employee [. . .]
  set [. . .] of tuple (Emp_No [. . .] : integer,
                        Name [. . .] : string (30),
                        Salary [. . .] : real,
                        CV [. . .] : var_string)
  [. . .] denotes location of storage structure description

Degrees of freedom for physical storage structures
1. Choice of storage structures for the implementation of sets, lists, and tuples (constructor data structure)
2. In-line storage or referencing of the elements of a set or list resp. the attribute values of a tuple in the constructor data structure

Each constructor has a constructor data structure
• Example: SimpleSet {Emp_1, Emp_2, Emp_3}
• Variable-length array as constructor data structure

[Figure: materialized storage (in-line): the array contains Emp_1, Emp_2, Emp_3 themselves; referenced storage: the array contains pointers to Emp_1, Emp_2, Emp_3.]

Storage Structures for Complex Objects (3)

Twofold application of the set constructors
• { {Emp_1, Emp_2}, {Emp_3, Emp_4} }
• Pre-setting: variable-length arrays as constructor data structures

Four implementations

1. Elements of outer set: referenced
   Elements of inner set: referenced
   [Figure: Anchor_Rec refers to Structure_Rec records, which refer to the Emp_Rec records Emp_1, Emp_2, Emp_3, Emp_4.]

2. Elements of outer set: materialized
   Elements of inner set: referenced
   [Figure: Anchor_Rec contains the inner sets, which refer to the Emp_Rec records Emp_1, Emp_2, Emp_3, Emp_4.]

Storage Structures for Complex Objects (4)

Four implementations (cont.)

3. Elements of outer set: referenced
   Elements of inner set: materialized
   [Figure: Anchor_Rec refers to records that contain Emp_1, Emp_2 and Emp_3, Emp_4, respectively.]

4. Elements of outer set: materialized
   Elements of inner set: materialized
   [Figure: Anchor_Rec contains Emp_1, Emp_2, Emp_3, Emp_4.]

If, in addition, linked lists can be used as constructor data structures, 16 variants can be obtained in total
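
The count of 16 follows from two independent binary choices per constructor (data structure and element placement), applied to both the outer and the inner set. A short enumeration, using hypothetical string labels for the options, confirms the arithmetic:

```python
from itertools import product

structures = ["array", "linked_list"]
placements = ["inplace", "referenced"]

# 2 x 2 = 4 choices for a single set constructor
per_constructor = list(product(structures, placements))

# outer set x inner set: 4 x 4 = 16 variants
variants = list(product(per_constructor, repeat=2))

# Restricting both constructors to arrays leaves the four implementations
# shown on the slides.
array_only = [v for v in variants if all(s == "array" for s, _ in v)]
```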

Storage Structures for Set and List Constructors

Independent degrees of freedom
• Constructor data structure
  - Variable-length array
  - Linked list
  - ...
• Mode of element storage
  - Directly in constructor data structure
  - Referencing of the elements via pointers

Independent specification of these degrees of freedom requires two parameters (in a data definition language):

object_type = . . .
/* definition of a set */
set [implementation = implementation_type,
     element_placement = placement_type] of object_type ...
/* definition of a list */
list [implementation = implementation_type,
      element_placement = placement_type] of object_type ...

Parameter values
implementation_type = array | linked_list
placement_type = inplace | referenced (record_type_name)

Complete definition of the storage structure (case 1)
complex_object Set_of_Set_of_Emp [anchor_record_type=Anchor_Rec]
  set [implementation=array, element_placement=referenced (Structure_Rec)] of
  set [implementation=array, element_placement=referenced (Emp_Rec)] of Emp

Storage Structures for Tuple Constructors

Here, the same degrees of freedom exist, in principle
• Choice of a constructor data structure: allocation of the tuple to a record or to several records
• Materialized or referenced storage of the attribute values

New parameter: "location"
attribute_description = attribute_name [location = location_type,
                                        element_placement = placement_type]

For each attribute, "location" and "element_placement" can be separately specified

"Location" allows for the optimization of record access according to access frequencies of the individual attributes
location_type = primary | secondary (record_type_name)

The constructor data structure of a tuple can be divided into several records.

Using "primary", the related attribute is allocated in the primary block.

Storage Structures – Example

Instance of an Employee relation

Employee
Emp_No   Name      Salary   CV
77234    Roberts   4000     Mrs. Julia Roberts is born …
77235    Bond      5000     Mr. James Bond is born …

Definition of a related storage structure

1. Referenced Prim_Rec
complex_object Employee [anchor_record_type=Link_Rec]
  set [implementation=linked_list, element_placement=referenced (Prim_Rec)]
  of tuple
    (Emp_No [location=primary, element_placement=inplace] : integer,
     Name [location=primary, element_placement=inplace] : string (30),
     Salary [location=secondary (Sec_Rec),
             element_placement=inplace] : real,
     CV [location=secondary (Sec_Rec),
         element_placement=referenced (CV_Rec)] : var_string)

Storage Structures – Example (2)

Related storage structures for the Employee relation

1. Referenced Prim_Rec
[Figure: record types Link_Rec, Prim_Rec, Sec_Rec, CV_Rec. The Link_Rec records form a linked list terminated by nil; each refers to a Prim_Rec (77234 Roberts; 77235 Bond), which is connected to a Sec_Rec (4000; 5000), which in turn refers to a CV_Rec (Mrs. Julia Roberts is born …; Mr. James Bond is born …).]

2. Materialized Prim_Rec
complex_object Employee [anchor_record_type=Link_Rec]
  set [implementation=linked_list, element_placement=inplace] of . . .

[Figure: record types Link_Rec, Sec_Rec, CV_Rec. The Link_Rec records of the nil-terminated linked list contain the primary attributes themselves (77234 Roberts; 77235 Bond) and are connected to a Sec_Rec (4000; 5000), which refers to a CV_Rec (Mrs. Julia Roberts is born …; Mr. James Bond is born …).]

Large Objects

Requirements
• Ideally no size restriction
• General administration functions
• Tailor-made processing functions, . . .

Examples for large objects (today up to n (=4) TByte)
• Texts, CAD data
• Image data, audio sequences
• Videos, . . .

Principal possibilities of DB integration
• Storage as LOB in the DB (mostly indirect storage)
  [Figure: ORDBS server with table Employees (NAME, DEPT, PHOTO).]
• Storage using the DataLinks concept in external file servers
  [Figure: ORDBS server with table Employees (NAME, DEPT, PHOTOID); PHOTOID refers to an image server.]

Realizationof DBS

Disk-based record addressing

Free-placementadministration

Storage of Large Objects*

Representation of large storage objects• potentially consists of many pages or segments• is an uninterpreted byte sequence• Address (OID, object identifier) points to object header • OID is proxy in the record which the long field belongs to

R i d i fl ibili d i h d

Complex objects

Memory-based record addressing

Mapping of records

Large objects

• Required processing flexibility determines access paths and storage structure

Processing problems• Is object size known in advance?• Are many modifications anticipated during the life time of the object?• Is fast sequential access needed? . . .

Mapping onto external storage
• Page based
  - Unit of storage allocation: page, “scattered” collection of pages
• Segment based (several pages)
  - Segments of fixed size (Exodus), segments of variable size (EOS)
  - Segments with a fixed growth pattern (Starburst)
• Access structure to the object
  - Chain of segments/pages
  - List of entries (descriptors), B*-tree

* Biliris, A.: The Performance of Three Database Storage Structures for Managing Large Objects, Proc. ACM SIGMOD’92 Conf., San Diego, Calif., 1992, pp. 276-285


Long Fields in Exodus

Storage of long fields
• Data are kept in (small) segments of fixed size
• Choice of segment sizes adjusted to the processing characteristics
• Insertion of byte sequences is simple and possible anywhere
• Performance degradation in case of sequential access

B*-tree as access structure
• Leaves are segments of fixed size (here 4 pages of 100 bytes)
• Internal nodes and root represent an index for byte positions
• For each child node, internal nodes and root store entries of the form (page-#, counter)
  - Counter maintains the maximum byte number of the corresponding subtree (page entries on the left-hand side belong to the subtree)
  - Object size: counter in the right-most entry of the root


Long Fields in Exodus (2)

Representation of very long dynamic objects
• Up to n GBytes using three tree levels (even for small segments)
• Space occupancy typically ~ 80%

[Figure: B*-tree reached via the OID, with root, internal nodes (pages), and six leaf nodes (segments) holding 350, 250, 300, 400, 280, and 230 bytes]

• How to determine object position of byte 100 in the last page?

Special operations
• Search for a byte interval
• Insertion/deletion of a byte sequence at a given position
• Attachment of a byte sequence at the end of the long field
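The search via byte counters can be illustrated with a minimal Python sketch. It is not EXODUS code: the node layout (lists of (child, counter) pairs), the function name locate, and the 1-based byte numbering are assumptions. The counters follow the example tree with leaf sizes 350, 250, 300, 400, 280, and 230 bytes and are taken to be cumulative within each node.

```python
# Leaves are plain byte strings (dummy contents); inner nodes are lists of
# (child, counter) entries, counter = cumulative bytes within that node.
leaves = [bytes(n) for n in (350, 250, 300, 400, 280, 230)]

left = [(leaves[0], 350), (leaves[1], 600), (leaves[2], 900)]
right = [(leaves[3], 400), (leaves[4], 680), (leaves[5], 910)]
root = [(left, 900), (right, 1810)]   # right-most counter = object size


def locate(node, pos):
    """Return (leaf, offset_in_leaf) for the 1-based byte position pos."""
    if pos < 1:
        raise IndexError("byte positions start at 1")
    while isinstance(node, list):
        prev = 0                       # counter of the entry to the left
        for child, counter in node:
            if pos <= counter:
                pos -= prev            # make pos relative to the child
                node = child
                break
            prev = counter
        else:
            raise IndexError("position beyond object size")
    return node, pos
```

For the question on the slide, byte 100 of the last leaf is object byte 900 + 680 + 100 = 1680; calling locate(root, 1680) walks the same counters in the opposite direction and ends at offset 100 of the 230-byte leaf.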


Long Fields in Exodus* (3)

Support of versioned storage objects
• Labeling of the object header with a version number
• Copy and update only of those pages which differ in the new version (in update operations for which versioning is turned on)

[Figure: versions V1 and V2 of a long field sharing unchanged pages. V1: root (900, 1810), internal nodes (350, 600, 900) and (400, 680, 910), leaves of 350, 250, 300, 400, 280, and 230 bytes. Version-determining operation: deletion of 30 bytes at the end of V1. V2: root (900, 1780), internal node (400, 680, 880), last leaf of 200 bytes.]

* M.J. Carey, D.J. DeWitt, J.E. Richardson, E.J. Shekita: Object and File Management in the EXODUS Extensible Database System. Proc. 12th VLDB Conf., 1986, pp. 91-100
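The page sharing between versions can be pictured as path copying. The following Python sketch is an illustration under assumptions, not the EXODUS implementation: nodes are immutable tuples of (child, counter) pairs, leaves are byte strings, and the hypothetical helper truncate_tail only handles a deletion that fits inside the last leaf.

```python
def truncate_tail(node, nbytes):
    """Return a new version with nbytes removed from the end of the object.

    Only nodes on the right-most path are copied; all other nodes are
    shared with the old version.
    """
    if isinstance(node, bytes):                    # leaf segment
        assert 0 <= nbytes < len(node), "sketch: stay inside the last leaf"
        return node[:len(node) - nbytes]
    *rest, (child, counter) = node
    return tuple(rest) + ((truncate_tail(child, nbytes), counter - nbytes),)


# V1 as in the figure (leaf contents are dummies).
l = [bytes(n) for n in (350, 250, 300, 400, 280, 230)]
left = ((l[0], 350), (l[1], 600), (l[2], 900))
right = ((l[3], 400), (l[4], 680), (l[5], 910))
v1 = ((left, 900), (right, 1810))

v2 = truncate_tail(v1, 30)   # version-determining operation
```

After the call, v2 has the root counters (900, 1780) and a new right-hand node (400, 680, 880), while the left subtree is the very same object as in v1.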


Long Fields in Starburst

Enhanced requirements
• Efficient allocation and release of storage for fields of 100 MB up to 2 GB
• High I/O performance: write and read operations with raw-disk speed

Principal representation
• Descriptor containing list of segment specifications
• Long field consists of one or several segments
• Segments, also denoted as Buddy segments, are allocated using the Buddy method in large predefined extents of fixed size on external storage

[Figure: descriptor with the entries #segments = 5, first = 100, last = 310]

Segment allocation when object size is known in advance
• Object size G (in pages)
• G ≤ MaxSeg: a single segment is allocated
• G > MaxSeg: a sequence of maximum segments is allocated
• Last segment is reduced to remaining object size


Long Fields in Starburst* (2)

Segment allocation when object size is not known
• Growth pattern of segment sizes as shown: 1, 2, 4, ..., 2^n pages are aggregated to a Buddy segment; MaxSeg = 2048 for n = 11
• If MaxSeg is reached, further segments are allocated with size MaxSeg
• Last segment is reduced to remaining object size
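The two allocation rules (object size known in advance or not) can be sketched as a pair of small Python functions. The function names are hypothetical, sizes are counted in pages, and the growth pattern is assumed to start at one page as on the slide; a real long field manager would also have to obtain the segments from the Buddy spaces.

```python
MAX_SEG = 2048  # pages, 2**11 as stated on the slide


def segments_known(g):
    """Segment sizes for an object of g pages whose size is known in advance."""
    full, rest = divmod(g, MAX_SEG)
    # g <= MAX_SEG yields a single segment; otherwise a run of maximum
    # segments followed by a last segment reduced to the remaining size.
    return [MAX_SEG] * full + ([rest] if rest else [])


def segments_unknown(g):
    """Segment sizes when the object grows to g pages without prior knowledge."""
    sizes, size, left = [], 1, g
    while left > 0:
        take = min(size, left)           # the last segment is trimmed
        sizes.append(take)
        left -= take
        size = min(size * 2, MAX_SEG)    # 1, 2, 4, ..., then stay at MAX_SEG
    return sizes
```

For example, segments_unknown(10) gives [1, 2, 4, 3], and segments_known(5000) gives [2048, 2048, 904].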

Allocation of Buddy segments using the binary Buddy method

[Figure: binary Buddy hierarchy with the levels 2^0, 2^1, 2^2, ..., 2^n; blocks of size 2^0 are addressed by 000, 001, 010, 011, . . ., their aggregations by the prefixes 00*, 01*, . . . and 0*, . . .]

• Aggregation of two buddies of size 2^n → 2^(n+1) (n > 0)

* T. J. Lehman, B. G. Lindsay: The Starburst Long Field Manager. Proc. 15th VLDB Conf., 1989, pp. 375-383
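A compact way to see the binary Buddy method at work is the following Python sketch. It is a simplification under assumptions: free blocks are kept in one set per size class (the slides mention counts, pointers, and an allocation bit array instead), and the class name BuddySpace and its methods are hypothetical. The buddy of a block is found by flipping one bit of its start page, which corresponds to the address prefixes in the figure.

```python
class BuddySpace:
    """Toy buddy allocator over 2**order pages."""

    def __init__(self, order):
        self.order = order
        self.free = {k: set() for k in range(order + 1)}  # size class -> starts
        self.free[order].add(0)                           # one maximal block

    def alloc(self, k):
        """Allocate a block of 2**k pages and return its start page."""
        j = k
        while j <= self.order and not self.free[j]:
            j += 1                                # look for a larger free block
        if j > self.order:
            raise MemoryError("no free block large enough")
        start = min(self.free[j])
        self.free[j].remove(start)
        while j > k:                              # split: keep left, free right
            j -= 1
            self.free[j].add(start + (1 << j))
        return start

    def release(self, start, k):
        """Free a block of 2**k pages and merge it with free buddies."""
        while k < self.order:
            buddy = start ^ (1 << k)              # flip bit k of the address
            if buddy not in self.free[k]:
                break
            self.free[k].remove(buddy)            # aggregate 2**k -> 2**(k+1)
            start = min(start, buddy)
            k += 1
        self.free[k].add(start)
```

In a space of 8 pages, allocating two single pages and one 2-page block returns the start pages 0, 1, and 2; releasing all three merges the blocks step by step back into the single block of 8 pages.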

Processing properties
• Efficient support of sequential and random reads
• Simple attachment and removal of byte sequences at the end of object
• Difficult insertion and deletion of byte sequences within the object


Starburst: Storage Organization for Long Fields

[Figure: a tuple of the relation holds the Long Field Descriptor with the fields DBSpace #, Size (bytes), Number of BSEGS, Size of First, Size of Last, Offset #1, Offset #2, . . ., Offset #N; the offsets refer to Buddy segments in a DB Space. A Buddy Space is shown with its control information (counts, pointers, allocation bit array).]

Implementation of a long field
• Long field descriptor (< 316 bytes) is stored in relation
• Long field consists of one or several Buddy segments, which are allocated in large predefined Buddy Spaces of fixed size on disk
• Buddy segments contain only data and no control information and consist of 1, 2, 4, 8, ... or 2048 pages (max. segment size 2 MB when using 1 KB pages)
• Buddy Spaces are allocated in (even larger) DB files (DB Spaces). They are composed of control page (allocation page) and data area
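Because the descriptor records only the number of Buddy segments and the sizes of the first and last one, the sizes in between have to be derived. The Python sketch below assumes that intermediate segments double from the first size up to MaxSeg, as suggested by the growth pattern; the class and method names are hypothetical, the Offset fields are left out, and sizes are unit-neutral. With the values of the descriptor example (5 segments, first 100, last 310) this yields 100, 200, 400, 800, 310.

```python
from bisect import bisect_right
from dataclasses import dataclass
from itertools import accumulate

MAX_SEG = 2048


@dataclass
class LongFieldDescriptor:
    n_segments: int
    size_first: int
    size_last: int

    def segment_sizes(self):
        """Derive all segment sizes (assumption: doubling, capped at MAX_SEG)."""
        sizes, size = [], self.size_first
        for _ in range(self.n_segments - 1):
            sizes.append(size)
            size = min(size * 2, MAX_SEG)
        return sizes + [self.size_last]

    def locate(self, pos):
        """Map a 0-based position to (segment index, offset in that segment)."""
        ends = list(accumulate(self.segment_sizes()))
        if not 0 <= pos < ends[-1]:
            raise IndexError("position outside the long field")
        i = bisect_right(ends, pos)
        return i, pos - (ends[i - 1] if i else 0)
```

Such a lookup needs only the descriptor and no tree traversal, which is one way to read the claim that random reads are supported efficiently.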


Storage Allocation Using Variable Segments

Generalization of the approaches of Exodus and Starburst in Eos
• Object is stored in a sequence of segments of variable size
• Segment consists of pages allocated in physical contiguity on external storage
• Only the last page of a segment can contain free space

Principal representation
[Figure: root with the counters (1250, 1810) above two nodes with the counters (950, 1250) and (430, 560)]

The sizes of the various segments can widely differ

Processing properties
• The operational properties of both underlying approaches can be obtained
• Reorganization is possible, if adjacent segments become very small (page only)


Summary

Free-placement information at different levels required: device, segment (file), page

Goals of disk-based addressing
• Combination of direct-access speed and flexibility of indirection
• Record displacements in a page without side effects

TID, DBK (allocation table) or primary key

Indexing of tables
• Physical or hybrid methods in case of unordered tables
• Hybrid methods combined with primary key in case of ordered tables (index-organized tables)

Memory-based addressing (Pointer Swizzling)
• Transparent program access to persistent and transient objects
• Mapping of long disk addresses onto virtual addresses
• Orthogonal classification criteria: location, point of time, mode

Mapping of records
• Storing fields of variable length
• Dynamic extension possible
• Computation of field addresses


Summary (2)

Storage of complex objects
• Constructors for lists, sets, and tuples
• Application of constructors is orthogonal and recursive

Large objects need efficient DBMS support
• Tailor-made processing techniques & performance properties needed
• Transport to the application (minimization of copies needed)
• Query optimization, evaluation of LOB functions, synchronization, logging and recovery

Storage of large objects gains increasing importance
• B*-tree technique: flexible representation, moderate access speed
• Large segments (lists) of variable length: high I/O performance
• Choice of various techniques tailored to the processing characteristics

DB linkage for external files
• DB support desired for management, consistency control, and content-based search
• DataLinks concept provides referential integrity, access control, coordinated backup and recovery as well as transaction consistency


DB Linkage of External Data

Motivation
• Most data in an enterprise are stored in files
• Their number will keep increasing for a long time, and their volume grows as well
• Because many applications are based on files, file access has to be supported, too (uniform access to DBs and other data sources)

Properties
• File systems do not provide sufficient meta-data for search functions and integrity preservation
• DBMS support a wide spectrum of functions, but are currently not optimized for the storage of a large number of BLOBs (multimedia types)
• BLOBs need hierarchical storage management of powerful file systems (e.g., tertiary storage) which guarantee cost-effective processing of data for varying access patterns (frequent or rare changes)

Linkage of file systems and DBMSs should combine pros of both approaches!

Application examples
• CAD systems: synchronization of millions of components/files (proprietary format)
• Multimedia objects: management of libraries for images, documents, or videos
• HTML and XML files: DB support for the functionality of Web servers


DB Linkage of External Data* (2)

Storage model for DB linkage
[Figure: a DBMS with an SQL table whose entries URL1, URL2, . . . reference files in file system 1, . . ., file system n]

Which problems have to be solved?
• Referential integrity
• Access control
• Coordinated backup and recovery
• Transaction consistency
• Additionally: search via conventional data types, contents of external data
• Performance aspects in DB and file applications

Participating file systems need an additional control component which cooperates with the DBMS via special protocols

* Information Technology – Database Language SQL - Part 9: Management of External Data, International Standard, May 2001 (www.jtc1sc32.org)


DB Linkage of External Data (3)

DataLinks concept for the management of external data
[Figure: applications use the SQL API for the Emp table (Name, DNo, Photo), where Photo is of DataLink type (URL), and the file API for the images in external files]

DataLinks File System Filter (DLFF)
• Enforces referential integrity when files are renamed or deleted
• Enforces DB-centric access control when a file is opened
• File API remains unchanged – no changes in the applications
• DLFF does not reside in the read/write path for external files (performance!)

DataLinks File Manager (DLFM)
• Executes Link/UnLink operations under transaction protection
• Guarantees referential integrity
• Supports coordinated backup/recovery

DBMS manages/coordinates operations on external files
• Via referenced URLs
• Via DLFM API


DB Linkage of External Data (5)

DataLinks architecture
[Figure: components: application, DBMS using DataLinks extension, DB, DataLinks File Manager, standard file system (AIX, HP-UX, Solaris, Windows) with files, hierarchical storage management. Labeled flows: SQL, list of URLs, standard data access (direct takeover of data).]

Typical application
• Integration of unstructured and semi-structured data with applications based on DBMS use
• Reach: large number of files in computer networks
• Using function value indexing: files referenced via URLs remain unchanged
• User extracts features of images or videos and stores them in the DB to perform evaluations together with predicates on other DB data
• Query By Image Content (QBIC) supports extraction/search of such features.


DB Linkage of External Data (4)

Processing model from the viewpoint of the application
• SQL access to meta-data repository for external data
• Search is also possible via content of external data (function value indexing)
• List of references of searched objects
• Application references external data directly via file API

DataLinks data type in SQL:99 – example

    CREATE TABLE Emp (
        Name   VARCHAR (30),
        DNo    INTEGER,
        Photo  DATALINK (200)
               LINKTYPE URL
               FILE LINK CONTROL
               INTEGRITY all
               READ PERMISSION DB
               WRITE PERMISSION blocked
               RECOVERY yes
               ON UNLINK restore);

• DBMS control can be activated in a leveled way
• URL: http://server name/pathname/filename/
• Integrity: URLs are kept consistent as references
• Read Permission: either at the file system or is delegated to the DBMS. Authorization is embedded as a token in the URL
• Write Permission: either at the file system or is blocked
• Recovery: coordinated backup and recovery is only possible for option WRITE PERMISSION blocked
• On Unlink: file can be deleted or can be returned under file system control


Large Objects (2)

Principal possibilities of DB integration

Storage as LOB in the DB (mostly indirect storage)
• BLOB - Binary Large Object for audio, image data etc.
• CLOB - Character Large Object for text data
• DBCLOB - Double Byte Character Large Object (DB2) for special graphic data etc.
[Figure: ORDBS server with table Employees (NAME, DEPT, PHOTO)]

Storage using DataLinks concept in external file servers
[Figure: ORDBS server with table Employees (NAME, DEPT, PHOTOID); the photos reside in image files on a server]


Large Objects (3)

Creation of LOB columns*

LOB column definition

    column-name { BLOB | CLOB | DBCLOB } ( n [ K | M | G ] )
                [ LOGGED | NOT LOGGED ] [ NOT COMPACT | COMPACT ]

* The realization examples correspond to DB2 – Universal Database


Large Objects (4)

Examples

    CREATE TABLE Graduate
        (RunNo   Integer,
         Name    Varchar (50),
         . . .
         Photo   BLOB (5 M) NOT LOGGED COMPACT,      -- image
         CV      CLOB (16 K) LOGGED NOT COMPACT);    -- text

    CREATE TABLE Design
        (Pno             Char (18),
         Time_of_Update  Timestamp,
         Updated_By      Varchar (50),
         Drawing         BLOB (2 M) LOGGED NOT COMPACT);    -- graphic

    ALTER TABLE Graduate
        ADD COLUMN MasterThesis CLOB (500 K)
        LOGGED NOT COMPACT;


Large Objects (5)

Specification of LOBs requires care
• Maximum length
  - Reservation of an application buffer
  - Clustering and optimization using indirect storage allocation; descriptor in the tuple is dependent on the LOB size (72 bytes when < 1K up to 316 bytes for 2G)
  - For smaller LOBs (< page size), direct storage allocation possible
• Compact storage
  - COMPACT reserves no space for later growth
    What may happen in case of a LOB modification?
  - NOT COMPACT is default
• Logging
  - LOGGED: in case of updates, LOB column is treated like all other columns (ACID!)
    What does this mean for the log file?
  - NOT LOGGED: updates are not recorded in the log file. So-called shadow pages (shadowing) guarantee atomicity until Commit
    [Figure: Lob1 with pages 1 2 3 4 and its shadow version Lob1' with pages 1 2 3 4]
    What happens in case of a device failure?
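How shadowing keeps an update atomic without log records can be sketched in a few lines of Python. This is a toy model under assumptions, not DB2 internals: the class ShadowedLob, its page table, and the in-memory "disk" dictionary are hypothetical, and freeing the superseded pages is omitted.

```python
class ShadowedLob:
    """Toy shadow-page scheme for one LOB."""

    def __init__(self, pages):
        self.store = dict(enumerate(pages))   # page id -> contents ("disk")
        self.current = list(self.store)       # committed page table
        self.next_id = len(pages)

    def begin(self):
        """Start an update with a private copy of the page table."""
        return list(self.current)

    def write(self, table, slot, data):
        """Write to a fresh page; committed pages are never overwritten."""
        self.store[self.next_id] = data
        table[slot] = self.next_id
        self.next_id += 1

    def commit(self, table):
        """Switch to the new page table in a single step."""
        self.current = table

    def read(self):
        return [self.store[p] for p in self.current]
```

Until commit is called, read still returns the old contents, and an abort simply discards the private table. The sketch also hints at the question on the slide: since nothing is written to the log, a lost device cannot be recovered by replaying log records.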


Large Objects (6)

How are large objects processed?
• BLOB and CLOB are no types of the host language
• Special declaration of BLOB, CLOB, ... by SQL TYPE is required, because they use the same host language types. Furthermore, it can be guaranteed that the length to be expected by the DBMS can be exactly met.

Preparations required in the AP
• SQL TYPE IS CLOB (2 K) c1 (or BLOB (2 K))
  is translated by the C-precompiler into

    static struct c1_t
    {
        unsigned long length;
        char data [2048];
    } c1;

• Creation of a CLOB

    strcpy (c1.data, "Hello");            /* arrays cannot be assigned in C; needs <string.h> */
    c1.length = sizeof ("Hello") - 1;     /* length without the terminating '\0' */

  can be hidden by the use of macros (e.g., c1 = SQL_CLOB_INIT("Hello");)


Large Objects (7)

Insert, delete, and update can be performed similar to other types, if sufficiently large AP buffers exist

Fetch the data for Graduate having RunNo 17 into AP

    . . .
    SELECT Name, Photo, CV
    INTO :x, :y :yindicator, :z :zindicator
    FROM Graduate
    WHERE RunNo = 17;


Large Objects (8)

Which operations can be applied to LOBs?
• Comparison predicates: =, <>, <, <=, >, >=, IN, BETWEEN
• LIKE predicate
• Uniqueness or sequence for LOB values
  - PRIMARY KEY, UNIQUE, FOREIGN KEY
  - SELECT DISTINCT, . . ., COUNT (DISTINCT)
  - GROUP BY, ORDER BY
• Use of aggregate functions like MIN, MAX
• Operations
  - UNION, INTERSECT, EXCEPT
  - joins of LOB attributes
• Index structures across LOB columns


Large Objects (9)

How can LOBs be indexed?
• User-defined function assigns index values to LOBs
• Function value indexing
[Figure: the entry f(blob1) = x in the BLOB index refers to blob1]
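Function value indexing can be illustrated with a short Python sketch: the index is built over the values f(blob) rather than over the LOBs themselves. The helper build_index, the sample rows, and the choice of f (the first four bytes of the object) are assumptions made for the example; in a DBMS, f would be a registered user-defined function.

```python
from collections import defaultdict


def build_index(rows, f):
    """Map each function value f(blob) to the ids of the rows holding it."""
    index = defaultdict(list)
    for row_id, blob in rows.items():
        index[f(blob)].append(row_id)
    return index


# Hypothetical table contents: row id -> BLOB value.
rows = {
    1: b"\x89PNG....image one",
    2: b"GIF89a..image two",
    3: b"\x89PNG....image three",
}

# Assumed user-defined function: the first four bytes of the object.
index = build_index(rows, lambda blob: blob[:4])
```

A query on the function value, here index[b"\x89PNG"], then finds rows 1 and 3 without touching the LOBs again.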

Is direct processing of LOBs in AP realistic?

    Books (Title     Varchar (200),
           BNO       ISBN,
           Abstract  CLOB (32 K),
           Booktext  CLOB (20 M),
           Video     BLOB (2 G))

    EXEC SQL
        SELECT Abstract, Booktext, Video
        INTO :kilobuffer, :megabuffer, :gigabuffer
        FROM Books
        WHERE Title = 'American Beauty'


Large Objects (10)

Client/Server architecture
[Figure: the client runs the AP with its AP buffer; the server runs the DBMS with the DB buffer and the DB]

• Allocation of buffers?
• Transfer of an entire LOB into the AP?
• Should the transfer be performed via the DB buffer?
• “Piece-wise” processing of LOBs required by AP!
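Piece-wise processing can be pictured with the following Python sketch, in which the AP holds only a small handle while the LOB value stays on the server side. The class LobHandle, its methods, and the generator pieces are hypothetical names for illustration and do not reproduce a concrete DBMS interface.

```python
class LobHandle:
    """Stands for a LOB that remains on the server (toy model)."""

    def __init__(self, server_lob):
        self._lob = server_lob               # never copied as a whole

    def length(self):
        return len(self._lob)

    def substr(self, start, n):
        """Fetch n bytes starting at the 0-based position start."""
        return self._lob[start:start + n]


def pieces(handle, piece_size):
    """Yield the LOB piece by piece, so the AP buffer can stay small."""
    for start in range(0, handle.length(), piece_size):
        yield handle.substr(start, piece_size)
```

With a 10-byte value and a piece size of 4, the AP receives pieces of 4, 4, and 2 bytes and never needs a buffer for the entire object.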

Locator concept for the access to LOBs