Storage structures for LOBs
• Segments of fixed and variable size, access via B*-tree, pointer list, ...
Realization of DBS
Disk-based record addressing
Free-placement administration
Complex objects
Memory-based record addressing
Mapping of records
Large objects
Storage Structures (2)
Operations
• Insert <record> at <location> with <database-key>
• Retrieve <record> with <database-key>
• Add <entry> to <B*-tree>
• Retrieve <address-list> from <B*-tree> for <value>
Mapping functions
• record identifier → address
• attribute value → record-id.list
• record identifier → record-id.list
• address → {occupied, free}
Properties of the upper interface
• Non-volatile storage with “addressing assistance”
• Free-placement administration
• Addressing methods of and between physical records
• Access paths supporting content addressability
Free-Placement Administration
Free-placement administration (FPA) for
• External storage (allocation of files)
• Segments (allocation of internal records)
• Pages (administration of allocated/free entries)
For all pages of a segment:
• Insert/update → search for n free bytes
• Delete/update → release or allocation of storage space
• In general: search, allocation, and release of storage space in Sj
• Free-space entry fi per page Pi, kept in units (multiples) of LP/256, where LP is the page length; for LP = 4 KB, one unit is 16 bytes
FPA within Pi
• Exact fi in the page header (PH)
• Contiguous administration (displacements!)
• Free-storage chain (best-fit / first-fit)
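The search for n free bytes across the pages of a segment can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name `FreePlacementTable`, its methods, the first-fit policy across pages, and the cap of 255 units per fresh page are hypothetical choices, not taken from the slides.

```python
import math

class FreePlacementTable:
    """Approximate free-space entries f_i for the pages of one segment (illustrative)."""

    def __init__(self, num_pages, page_len=4096):
        self.unit = page_len // 256          # 16 bytes for LP = 4 KB
        # Assumption: a fresh page counts as 255 free units so that each
        # entry fits into one byte; page-header space is ignored here.
        self.f = [255] * num_pages

    def find_page(self, n_bytes):
        """First-fit search for a page with at least n_bytes free."""
        need = math.ceil(n_bytes / self.unit)
        for page_no, free_units in enumerate(self.f):
            if free_units >= need:
                return page_no
        return None                          # no page of the segment is suitable

    def allocate(self, page_no, n_bytes):
        """Record that n_bytes were taken from the page (rounded up to units)."""
        self.f[page_no] -= math.ceil(n_bytes / self.unit)

    def release(self, page_no, n_bytes):
        """Record that n_bytes were given back (rounded down to units)."""
        self.f[page_no] += n_bytes // self.unit
```

Because the table entries are rounded to units, they are only approximate; the exact value of fi would still be kept in the page itself, as the slide notes.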
Disk-based Record Addressing
Problem statement
• Long-term storage of data records
• Avoidance of “technology dependencies”
• Support of migration, etc.
General form of a record address
• DBID, SID, TID and, if necessary, table selector (RID)
• If the table is stored completely in one segment: TID in the record; DBID, SID in the DB catalog
• If the table is spread over several segments: SID, TID
Goals of the addressing technique:
• Fast, possibly direct, record access
TID (tuple identifier) consists of two components:
• Page number (3 bytes)
• Relative index position within page (1 byte)
• Serves for addressing within a segment (e.g., SID = A)
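As a small illustration of the two TID components, the following Python sketch packs a 3-byte page number and a 1-byte index position into one 4-byte value. The function names `make_tid` and `split_tid` and the choice to place the page number in the high-order bytes are assumptions made for this example.

```python
def make_tid(page_no, index):
    """Pack a 3-byte page number and a 1-byte index position into 4 bytes."""
    if not 0 <= page_no < 2**24 or not 0 <= index < 2**8:
        raise ValueError("page number needs 3 bytes, index position 1 byte")
    return (page_no << 8) | index

def split_tid(tid):
    """Return (page number, index position) of a packed TID."""
    return tid >> 8, tid & 0xFF
```

With this layout, a segment can address up to 2^24 pages with up to 256 index entries each.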
Migration of a record into another page without TID change
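One common way to migrate a record without changing its TID is to leave a forwarding address in the home page. The slide does not spell out the mechanism, so the Python sketch below is an assumption: the class `Segment`, the slot tags `"rec"`, `"fwd"`, and `"free"`, and all method names are hypothetical.

```python
class Segment:
    """Pages with slot arrays; a slot holds a record or a forwarding TID."""

    def __init__(self):
        self.pages = {}                          # page_no -> list of slots

    def insert(self, page_no, record):
        slots = self.pages.setdefault(page_no, [])
        slots.append(("rec", record))
        return (page_no, len(slots) - 1)         # TID = (page, index)

    def migrate(self, tid, new_page_no):
        """Move the record to another page; the TID handed out stays valid."""
        page_no, idx = tid
        kind, payload = self.pages[page_no][idx]
        if kind == "fwd":                        # migrated before: drop old copy
            o_page, o_idx = payload
            _, record = self.pages[o_page][o_idx]
            self.pages[o_page][o_idx] = ("free", None)
        else:
            record = payload
        new_tid = self.insert(new_page_no, record)
        self.pages[page_no][idx] = ("fwd", new_tid)  # home slot points to new place

    def read(self, tid):
        page_no, idx = tid
        kind, payload = self.pages[page_no][idx]
        if kind == "fwd":                        # at most one extra page access
            o_page, o_idx = payload
            kind, payload = self.pages[o_page][o_idx]
        return payload
```

Because a repeated migration updates the forwarding entry in the home page, a read never follows more than one forwarding step in this sketch.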
Each record owns a unique logical identifier
• Database key (DBK)
• Allocation of DBKs is done by the DBMS, in general
• System-internal references to records are made exclusively via DBK
Allocation table contains a page pointer (PP) for each DBK
• SID (1 byte), page number (3 bytes)
Hybrid method: use of “probable page pointers” (PPP) in access paths possibly saves accesses to the allocation table
• If access via PPP fails, ATab is searched using ARID
• All IOL(Ai) use ATab
• Starting from ATab, access to ITOL can be performed via PPP or via PK
[Figure: entries a11, a21, a31, a41, a57, ... with ARID1; access-path entries (PPP1, PK1), ..., (PPPj, PKj)]
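The interplay of allocation table and probable page pointers can be sketched in Python. This is a simplified illustration under assumptions: the class `AllocationTable`, its attributes, and the use of a plain DBK as lookup key (instead of the ARID mentioned above) are hypothetical.

```python
class AllocationTable:
    """ATab: maps each database key (DBK) to a page pointer (PP)."""

    def __init__(self):
        self.pp = {}                 # dbk -> page number
        self.pages = {}              # page number -> {dbk: record}
        self.atab_lookups = 0        # counts accesses that needed the ATab

    def store(self, dbk, page_no, record):
        old = self.pp.get(dbk)
        if old is not None:
            del self.pages[old][dbk]         # record migrates; DBK stays the same
        self.pages.setdefault(page_no, {})[dbk] = record
        self.pp[dbk] = page_no

    def fetch(self, dbk, ppp=None):
        """Try the probable page pointer first, fall back to the ATab."""
        if ppp is not None and dbk in self.pages.get(ppp, {}):
            return self.pages[ppp][dbk]
        self.atab_lookups += 1
        return self.pages[self.pp[dbk]][dbk]
```

As long as the record has not moved, the PPP stored in an access path leads directly to the right page; after a migration the stale PPP misses and the ATab is consulted.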
Memory-based Addressing
Task: Programs should be enabled to transparently process transient and persistent data objects in main memory
• Exclusive use of direct addresses in main memory (virtual addressing), i.e., in main memory, access to persistent objects is as efficient as access to transient objects
• No additional costs for programs only accessing transient objects
• Mapping costs for persistent objects should not be paid for each access
Mapping of persistent objects residing on external storage (ES) to objects in virtual storage (VS)
• Persistent addresses (e.g., SID, RID, TID) are long (e.g., 64 bits); in contrast, virtual addresses are shorter (e.g., 32 bits)
• Translation of pointers (pointer swizzling) from the long format (using indirect addressing) to the shorter format using an addressing method ‘as direct as possible’
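A minimal Python sketch of pointer swizzling follows. It shows one possible variant, in which a reference is translated on its first dereference and afterwards holds the in-memory object directly. The classes `PersistentStore` and `Ref`, the tuple used as persistent address, and the dictionary used as object cache are hypothetical stand-ins.

```python
class PersistentStore:
    """Stand-in for external storage: long persistent ids -> object state."""

    def __init__(self, objects):
        self.objects = objects
        self.loads = 0               # counts accesses to external storage

    def load(self, pid):
        self.loads += 1
        return dict(self.objects[pid])

class Ref:
    """Reference that is swizzled on first dereference."""

    def __init__(self, store, pid):
        self.store, self.pid, self.target = store, pid, None

    def deref(self, cache):
        if self.target is None:                  # still in the long format
            if self.pid not in cache:
                cache[self.pid] = self.store.load(self.pid)
            self.target = cache[self.pid]        # swizzle: keep direct reference
        return self.target
```

After the first call, `deref` returns the cached object without any address translation, so the mapping cost is paid once per reference rather than once per access.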
Memory-based Addressing (2)
Goal: Fast processing of pointer sequences in VS, e.g., 10^5 refs/sec
• Object processing: traversing sequences of references and navigation in meshed object structures
• Direct access in main memory is substantially cheaper than access via persistent addresses (localization of a page in the DB buffer and search of the object in the page)
• Additional access paths to support search in main memory, if necessary: B*-tree access requires h+1 direct pointer references
1 + 5: Swizzling of all pages/objects at Checkout, no replacement (no uncaching)
2 + 4: cumbersome organization
Questions
- Which methods allow the fastest processing (no consideration of swizzling cost)?
- Which methods enable object replacement (uncaching)?
- How can Lazy/Direct (3 + 7) be implemented?
Mapping of Records
Record manager
• Physical storage of records in pages
• Operations: read, insert, modify, delete
Record description
• Per attribute: attribute name, type, ..., length, attribute value (e.g., First_Name, varchar, ..., 5, Xaver)
• Metadata (attribute name, type, ...) in catalog (DD); instance (length, attribute value) in record
• Description of records and access paths in the catalog
• Special methods for storing values
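The split between catalog metadata and record instance can be illustrated with a small Python sketch. It assumes a hypothetical catalog entry `CATALOG` for a record type with the attributes First_Name, Job, and Age, a 2-byte length prefix for varchar values, and 4-byte integers; none of these encoding details come from the slides.

```python
import struct

# Catalog (DD) entry for the record type: attribute names and types only.
CATALOG = [("First_Name", "varchar"), ("Job", "varchar"), ("Age", "int")]

def encode(values):
    """Build the record instance; lengths of varchar values go into the record."""
    out = bytearray()
    for (name, typ), value in zip(CATALOG, values):
        if typ == "varchar":
            raw = value.encode("utf-8")
            out += struct.pack("<H", len(raw)) + raw
        else:
            out += struct.pack("<i", value)
    return bytes(out)

def decode(buf):
    """Interpret the bytes of a record with the help of the catalog entry."""
    pos, row = 0, {}
    for name, typ in CATALOG:
        if typ == "varchar":
            (n,) = struct.unpack_from("<H", buf, pos)
            row[name] = buf[pos + 2:pos + 2 + n].decode("utf-8")
            pos += 2 + n
        else:
            (row[name],) = struct.unpack_from("<i", buf, pos)
            pos += 4
    return row
```

The record itself carries no attribute names or types, so it can only be interpreted together with the catalog entry, and the address of a field behind a variable-length value has to be computed by walking the preceding fields.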
FROM Emp (DR)
WHERE First_Name = ’Xaver’ (OR)
AND Job = ’Programmer’ (OR)
AND Age > 50
How can this query be mapped onto DR?
Storage Structures for Records (5)
Problem: dynamic growth / variable length
• Growth and shrinking in a page
• Overflow schemata, garbage collection
Methods of record storage introduced so far are to be combined with additional options
Strictly contiguous storage of records
• Numerous migrations needed in case of high update frequency
• Advantages for indirect addressing schemes
• Ordering according to reference frequencies
• Improvement of clustering
• Repeated overflow possible
• Is inevitable in case of storing attributes of type TEXT or IMAGE
[Figure: record with fields F1 to F7]
Storage Structures for Complex Objects*
Complex objects composed of
• Atomic values and, thereupon,
• recursively applied set, list, and tuple constructors
Model for complex objects (eNF2)
[Figure: set, list, and tuple constructors applied to values]
Storage strategy
• Orthogonality is important: no enumeration of all possibilities
• Filing of frequently accessed substructures (possibly shared) in a single ...
If, in addition, linked lists can be used as constructor data structures, 16 variants can be obtained in total
Storage Structures for Set- and List Constructors
Independent degrees of freedom
• Constructor data structure
- Variable-length array
- Linked list
- ...
• Mode of element storage
- Directly in constructor data structure
- Referencing of the elements via pointers
Independent specification of these degrees of freedom requires two parameters (in a data definition language):
object_type = . . .
/* definition of a set */
set [implementation = implementation_type,
     element_placement = placement_type] of object_type ...
/* definition of a list */
list [implementation = implementation_type,
      element_placement = placement_type] of object_type ...
Representation of large storage objects
• potentially consists of many pages or segments
• is an uninterpreted byte sequence
• Address (OID, object identifier) points to object header
• OID is proxy in the record which the long field belongs to
• Required processing flexibility determines access paths and storage structure
Processing problems
• Is object size known in advance?
• Are many modifications anticipated during the lifetime of the object?
• Is fast sequential access needed? . . .
- Unit of storage allocation: page, “scattered” collection of pages
• Segment based (several pages)
- Segments of fixed size (Exodus), segments of variable size (EOS)
- Segments with a fixed growth pattern (Starburst)
• Access structure to the object
- Chain of segments/pages
- List of entries (descriptors), B*-tree
* Biliris, A.: The Performance of Three Database Storage Structures for Managing Large Objects, Proc. ACM SIGMOD’92 Conf., San Diego, Calif., 1992, pp. 276-285
Long Fields in Exodus
Storage of long fields
• Data are kept in (small) segments of fixed size
• Choice of segment sizes adjusted to the processing characteristics
• Insertion of byte sequences is simple and possible anywhere
• Performance degradation in case of sequential access
B*-tree as access structure
• Leaves are segments of fixed size (here 4 pages of 100 bytes)
• Internal nodes and root represent an index for byte positions
• For each child-node, internal nodes and root store entries of
• How to determine object position of byte 100 in the last page?
Special operations
• Search for a byte interval
• Insertion/deletion of a byte sequence at a given position
• Attachment of a byte sequence at the end of the long field
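Searching a byte position in such a tree can be sketched in Python. The sketch assumes that every entry of an internal node holds a cumulative byte count for its child, which is how the count values of the version figure for Exodus (350, 600, 900 and 400, 680, 910 below a root with 900, 1810) can be read; the classes `Leaf` and `Node` and the functions `locate` and `build_example` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

@dataclass
class Leaf:
    data: bytes                      # one fixed-size segment, partly filled

@dataclass
class Node:
    # One (count, child) entry per child; count is the number of bytes stored
    # in this child and in all children to its left (cumulative count).
    entries: List[Tuple[int, Union["Node", Leaf]]]

def locate(root, pos):
    """Map 1-based byte position pos of the long field to (leaf, offset)."""
    node = root
    while isinstance(node, Node):
        previous = 0
        for count, child in node.entries:
            if pos <= count:
                pos -= previous      # make the position relative to the child
                node = child
                break
            previous = count
        else:
            raise IndexError("position beyond end of long field")
    return node, pos

def build_example():
    """Tree with leaves of 350, 250, 300, 400, 280, and 230 bytes."""
    sizes = [350, 250, 300, 400, 280, 230]
    leaves = [Leaf(bytes(n)) for n in sizes]
    left = Node([(350, leaves[0]), (600, leaves[1]), (900, leaves[2])])
    right = Node([(400, leaves[3]), (680, leaves[4]), (910, leaves[5])])
    return Node([(900, left), (1810, right)]), leaves
```

In the example tree, `locate(root, 1680)` ends in the last leaf with offset 100, because 900 bytes precede the right subtree and 680 bytes precede its last leaf (900 + 680 + 100 = 1680). The same sums, read in the other direction, give the object position of a byte found in a leaf.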
Long Fields in Exodus* (3)
Support of versioned storage objects
• Labeling of the object header with a version number
• Copy and update only of those pages which differ in the new version (in update operations for which versioning is turned on)
[Figure: B*-trees of versions V1 and V2. V1: root (900, 1810), inner nodes (350, 600, 900) and (400, 680, 910), leaves of 350, 250, 300, 400, 280, and 230 bytes. V2: root (900, 1780), new inner node (400, 680, 880), new last leaf of 200 bytes.]
Version-determining operation: deletion of 30 bytes at the end of V1
* M.J. Carey, D.J. DeWitt, J.E. Richardson, E.J. Shekita: Object and File Management in the EXODUS Extensible Database System. Proc. 12th VLDB Conf., 1986, pp. 91-100
Long Fields in Starburst
Enhanced requirements
• Efficient allocation and release of storage for fields of 100 MB up to 2 GB
• High I/O performance: write and read operations with raw-disk speed
Principal representation
• Descriptor containing list of segment specifications
• Long field consists of one or several segments
• Segments, also denoted as Buddy segments, are allocated using the Buddy method in large predefined extents of fixed size on external storage
Segment allocation when object size is known in advance
• Object size G (in pages)
• G ≤ MaxSeg: a single segment is allocated
• G > MaxSeg: a sequence of maximum segments is allocated
• Last segment is reduced to remaining object size
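The allocation rule for a known object size can be written down in a few lines of Python. The function name `segments_known_size` is hypothetical, and the sketch assumes that a single segment for G ≤ MaxSeg is also trimmed to exactly G pages.

```python
def segments_known_size(g, max_seg=2048):
    """Segment sizes (in pages) for an object of g pages known in advance."""
    if g <= 0:
        return []
    full, rest = divmod(g, max_seg)
    # A run of maximum segments, followed by one reduced last segment if needed.
    return [max_seg] * full + ([rest] if rest else [])
```

For example, an object of 5000 pages with MaxSeg = 2048 would get two maximum segments and a last segment of 904 pages.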
Long Fields in Starburst* (2)
Segment allocation when object size is not known
• Growth pattern of segment sizes as shown: 1, 2, 4, ..., 2^n pages are aggregated to a Buddy segment; MaxSeg = 2048 for n = 11
• If MaxSeg is reached, further segments are allocated with size MaxSeg
• Last segment is reduced to remaining object size
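The growth pattern for an object of unknown size can be sketched in Python as follows. The function name `segments_unknown_size` is hypothetical, and the sketch computes the final segment list for an object that turned out to need `pages_written` pages.

```python
def segments_unknown_size(pages_written, max_seg=2048):
    """Segment sizes (in pages) produced by the doubling growth pattern."""
    sizes, remaining, size = [], pages_written, 1
    while remaining > 0:
        sizes.append(min(size, remaining))   # the last segment is trimmed
        remaining -= sizes[-1]
        size = min(size * 2, max_seg)        # double until MaxSeg is reached
    return sizes
```

An object of 10 pages ends up in segments of 1, 2, 4, and 3 pages; an object of 5000 pages uses the twelve segments 1, 2, ..., 2048 (4095 pages in total) and a last segment of 905 pages.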
Allocation of Buddy segments using the binary Buddy method
• Long field descriptor (< 316 bytes) is stored in relation
• Long field consists of one or several Buddy segments, which are allocated in large predefined Buddy Spaces of fixed size on disk
• Buddy segments contain only data and no control information and consist of 1, 2, 4, 8, ... or 2048 pages (max. segment size 2 MB when using 1 KB pages)
• Buddy Spaces are allocated in (even larger) DB files (DB Spaces). They are composed of control page (allocation page) and data area
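The binary Buddy method itself can be illustrated with a compact Python sketch. It is a generic buddy allocator under assumptions: the class `BuddySpace`, its free lists held as Python sets, and the method names are hypothetical, and the control page of a real Buddy Space is not modeled.

```python
class BuddySpace:
    """Binary buddy allocator over a space of 2**max_order pages (sketch)."""

    def __init__(self, max_order):
        self.max_order = max_order
        # free[k] holds the start pages of free blocks of size 2**k
        self.free = {k: set() for k in range(max_order + 1)}
        self.free[max_order].add(0)

    def allocate(self, order):
        """Return the start page of a free segment of 2**order pages."""
        k = order
        while k <= self.max_order and not self.free[k]:
            k += 1                               # look for a larger free block
        if k > self.max_order:
            raise MemoryError("no free buddy segment of the requested size")
        start = min(self.free[k])
        self.free[k].remove(start)
        while k > order:                         # split; upper halves stay free
            k -= 1
            self.free[k].add(start + (1 << k))
        return start

    def release(self, start, order):
        """Give a segment back and merge it with its free buddy if possible."""
        k = order
        while k < self.max_order:
            buddy = start ^ (1 << k)             # buddy differs in bit k only
            if buddy not in self.free[k]:
                break
            self.free[k].remove(buddy)
            start = min(start, buddy)
            k += 1
        self.free[k].add(start)
```

The buddy of a block is found by flipping a single bit of its start page, which is why both splitting and merging need no search.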
Storage Allocation Using Variable Segments
Generalization of the approaches of Exodus and Starburst in EOS
• Object is stored in a sequence of segments of variable size
• Segment consists of pages allocated in physical contiguity on external storage
• Only the last page of a segment can contain free space
The sizes of the various segments can widely differ
Processing properties
• The operational properties of both underlying approaches can be obtained
• Reorganization is possible, if adjacent segments become very small (page only)
Summary
Free-placement information at different levels required: device, segment (file), page
Goals of disk-based addressing
• Combination of direct-access speed and flexibility of indirection
• Record displacements in a page without side effects
• TID, DBK (allocation table) or primary key
Indexing of tables
• Physical or hybrid methods in case of unordered tables
• Hybrid methods combined with primary key in case of ordered tables
Memory-based addressing (swizzling)
• Transparent program access to persistent and transient objects
• Mapping of long disk addresses onto virtual addresses
• Orthogonal classification criteria: location, point of time, mode
Mapping of records
• Storing fields of variable length
• Dynamic extension possible
• Computation of field addresses
Summary (2)
Storage of complex objects
• Constructors for lists, sets, and tuples
• Application of constructors is orthogonal and recursive
Large objects need efficient DBMS support
• Tailor-made processing techniques & performance properties needed
• Transport to the application (minimization of copies needed)
• Query optimization, evaluation of LOB functions, synchronization, logging and recovery
Storage of large objects gains increasing importance
• B*-tree technique: flexible representation, moderate access speed
• Large segments (lists) of variable length: high I/O performance
• Choice of various techniques tailored to the processing characteristics
DB Linkage of External Data
Motivation
• Most data in an enterprise are stored in files
• These files will be around for a long time and will even grow in volume
• Because many applications are based on files, file access has to be supported, too (uniform access to DBs and other data sources)
Properties
• File systems do not provide sufficient meta-data for search functions and integrity preservation
• DBMS support a wide spectrum of functions, but are currently not optimized for the storage of a large number of BLOBs (multimedia types)
• BLOBs need hierarchical storage management of powerful file systems (e.g., tertiary storage) which guarantee cost-effective processing of data for varying access patterns (frequent or rare changes)
* Information Technology – Database Language SQL - Part 9: Management of External Data, International Standard, May 2001 (www.jtc1sc32.org)
• Coordinated backup and recovery
• Transaction consistency
• Additionally: search via conventional data types, contents of external data
• Performance aspects in DB and file applications
Participating file systems need an additional control component which cooperates with the DBMS via special protocols
DB Linkage of External Data (3)
DataLinks concept for the management of external data
[Figure: applications access the Emp table (Name, DNo, Photo) via the SQL API and the images in external files via the file API; Photo is of DataLink type (URL)]
DataLinks File System Filter (DLFF)
• Enforces referential integrity when files are renamed or deleted
• Enforces DB-centric access control when a file is opened
• File API remains unchanged – no changes in the applications
Typical application
• Integration of unstructured and semi-structured data with applications based on DBMS use
• Reach: large number of files in computer networks
• Using function value indexing: files referenced via URLs remain unchanged
• User extracts features of images or videos and stores them in the DB to perform evaluations together with predicates on other DB data
• Query By Image Content (QBIC) supports extraction/search of such features.
DB Linkage of External Data (4)
Processing model from the viewpoint of the application
• SQL access to meta-data repository for external data
• Search is also possible via content of external data → function value indexing
• List of references of searched objects
• Application references external data directly via file API
DataLinks data type in SQL:99 – example
CREATE TABLE Emp (
    Name VARCHAR (30),
    DNo INTEGER,
    Photo DATALINK (200)
    . . .
• URL: http://server name/pathname/filename/
• Integrity: URLs are kept consistent as references
• Read Permission: either at the file system or delegated to the DBMS. Authorization is embedded as a token in the URL
• Write Permission: either at the file system or blocked
• Recovery: coordinated backup and recovery is only possible for option WRITE PERMISSION blocked
• On Unlink: file can be deleted or can be returned under file system control
Large Objects (2)
Principal possibilities of DB integration
Storage as LOB in the DB (mostly indirect storage)
• BLOB - Binary Large Object: for audio, image data, etc.
• CLOB - Character Large Object: for text data
• DBCLOB - Double Byte Character Large Object (DB2): for special graphic data, etc.
[Figure: ORDBS server storing table Employees with columns NAME, ABT, PHOTO]
Storage using DataLinks concept in external file servers
- NOT LOGGED: updates are not recorded in the log file. So-called shadow pages (shadowing) guarantee atomicity until Commit
[Figure: pages 1 to 4 of Lob1 and pages 1 to 4 of its shadow version Lob1‘]
What happens in case of a device failure?
Large Objects (6)
How are large objects processed?
• BLOB and CLOB are not types of the host language
• Special declaration of BLOB, CLOB, ... by SQL TYPE is required, because they use the same host-language types. Furthermore, this guarantees that the length expected by the DBMS is met exactly.
Preparations required in the AP
• SQL TYPE IS CLOB (2 K) c1 (or BLOB (2 K))
is translated by the C-precompiler into
static struct c1_t {
    unsigned long length;
    char data[2048];
} c1;