Top Banner
Multimedia Information Systems CS 4570
29

Multimedia Information Systems

Jan 19, 2016

Download

Documents

Selin Kesebir

Multimedia Information Systems. CS 4570. Outlines. Introduction to DMBS Relational database and SQL B + - tree index structure. Database Management System (DBMS). Collection of interrelated data Set of programs to access the data DBMS contains information about a particular enterprise - PowerPoint PPT Presentation
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Multimedia Information Systems

Multimedia Information Systems

CS 4570

Page 2: Multimedia Information Systems

Outlines

• Introduction to DMBS

• Relational database and SQL

• B+- tree index structure

Page 3: Multimedia Information Systems

Database Management System (DBMS)

• Collection of interrelated data

• Set of programs to access the data

• DBMS contains information about a particular enterprise

• DBMS provides an environment that it both convenient and efficient to use

Page 4: Multimedia Information Systems

Purpose of Database Systems

• To overcome the disadvantages of the typical file–processing systems:– Data redundancy and inconsistency– Difficulty in accessing data– Data isolation multiple files and formats– Integrity problems– Atomicity of updates– Concurrent access by multiple users– Security problems

Page 5: Multimedia Information Systems

Data and Data Models

• Data is really the “given facts”• A data model is a collection of tools for describing:

– Data– Data relationships– Data semantics– Data constraints

• Object-based logical models:– object-oriented model, semantic model…

• Record-based logical models:– relational model (e.g. SQL, DB2)…

Page 6: Multimedia Information Systems

Entity-Relationship (ER) Model

• A database can be modeled as:– a collection of entities, – relationships among entities

Page 7: Multimedia Information Systems

Relational Model

Page 8: Multimedia Information Systems

Relational Databases

• A relational database is a database that is perceived be its users as a collection of tables. ( and nothing but tables ).

Page 9: Multimedia Information Systems
Page 10: Multimedia Information Systems

Relation Instance• The current values (relation instance) of a relation

are specified by a table.• An element t of r is a tuple; represented by a row

in a table.

tuple

attribute

Page 11: Multimedia Information Systems

Structured Query Language (SQL)

• SQL is based on set and relational operations with certain modifications and enhancements

• A typical SQL query has the form:

select A 1 , A 2 , ..., A n

from r 1 , r 2 , ..., r m

where P

– Ais represent attributes

– ris represent relations

– P is a predicate.• The result of an SQL query is a relation.

Page 12: Multimedia Information Systems

Banking Example

branch(branch-name, branch-city, assets)

customer(customer-name, customer-street, customer-city)

account(branch-name, account-number, balance)

loan(branch-name, loan-number, amount)

depositor(customer-name, account-number)

borrower(customer-name, loan-number)

Page 13: Multimedia Information Systems

The Select Clause

• The select clause corresponds to the projection operation of the relational algebra. It is used to list the attributes desired in the result of a query.

• Find the names of all branches in the loan relation

select branch-name

from loan

• An asterisk in the select clause denotes”all attributes”

select *

from loan

branch-name loan-number amount

Page 14: Multimedia Information Systems

The Select Clause(Cont.)

• SQL allows duplicates in relations as well as in query results.• To force the elimination of duplicates, insert the keyword

distinct after select.

Find the names of all branches in the loan relation, and remove duplicates

select distinct branch-name

from loan• The keyword all specifies that duplicates not be removed.

Select all branch-name

from loan

Page 15: Multimedia Information Systems

The Select Clause(Cont.)

• The select clause can contain arithmetic expressions involving the operators,+,-,*,/,and operating on constants or attributes of tuples.

• The query:

select branch-name,loan-number,amount * 100

from loan

would return a relation which is the same as the loan relation, except that the attribute amount is multiplied by 100

Page 16: Multimedia Information Systems

The from Clause

• The from clause corresponds to the Cartesian product operation of the relational algebra. It lists the relations to be scanned in the evaluation of the expression.

• Find the Cartesian product borrower loan

select *

from borrower, loan

• Find the name and loan number of all customers having a loan at the Perryridge branch.

select distinct customer-name,borrower.loan-number

from borrower,loan

where borrower.loan-number=loan.loan-number and

branch-name=“Perryridge”

borrower (customer-name, loan-number)

loan (branch-name, loan-number, amount)

borrower × loan

( borrower.customer-name, borrower.loan-number, loan.branch-name, loan.loan-number, loan.amount)

Page 17: Multimedia Information Systems

The where Clause

• The where clause corresponds to the selection predicate of the relational algebra. It consists of a predicate involving attributes of the relations that appear in the from clause.

• Find all loan numbers for loans made at the Perryridge branch with loan amounts greater than $1200.

select loan-number

from loan

where branch-name=“Perryridge” and amount>1200• SQL uses the logical connectives and, or, and not. It allows the

use of arithmetic expressions as operands to the comparison operators.

Page 18: Multimedia Information Systems

The where Clause(Cont.)

• SQL includes a between comparison operator in order to simplify where clauses that specify that a value be less than or equal to some value and greater than or equal to some other value.

• Find the loan number of those loans with loan amounts between $90,000 and $100,000(that is, $90,000 and $100,000)

select loan-number

from loan

where amount between 90000 and 100000

Page 19: Multimedia Information Systems

Example Query

• Find all branches that have greater assets than some branch located in Brooklyn.

select distinct T.branch-name

from branch as T, branch as S

where T.assets > S.assets and

S.branch-city =“Brooklyn”

Page 20: Multimedia Information Systems

B+ -Tree Index Files B+ -Tree indices are an alternative to indexed-

sequential files. • Disadvantage of indexed-sequential files:

performance degrades as file grows, both for index lookups and for sequential scans through the data.

• Advantage of B+ -Tree index files: automatically reorganizes itself with small, local, changes, in the face of insertions and deletions.

• Disadvantage of B+ -Trees: extra insertion and deletion overhead, space overhead.

• Advantages of B+ -Trees outweigh

disadvantages, and they are used extensively.

Page 21: Multimedia Information Systems

B+ -Tree Index Files (Cont.) A B+ -Tree is a rooted tree satisfying the

following properties:• All paths from root to leaf are of the same

length • Each node that is not a root or a leaf has

between n/ 2 and n children• A leaf node has between (n-1)/2 and n-1

values• Special cases:if the root is not a leaf it has at

least 2 children; if the root is a leaf (that is there are no other nodes in the tree) it can have between 0 and (n-1) values.

Page 22: Multimedia Information Systems

B+ -Tree Node Structure

• Typical node

– Ki are the search-key values

– Pi are pointers to children (for non-leaf nodes) or pointers to records or buckets of records (for leaf nodes).

• The search-keys in a node are ordered K1 < K2 < K3 < ...

Page 23: Multimedia Information Systems

Leaf Nodes in B+ -Trees Properties of a leaf node:

• For i = 1, 2, . . . , n-1, i either points to a file record with search-key value Ki , or to a bucket of pointers to file records, each record having search-key value Ki . Only need bucket structure if search-key does not form a

primary key. • If Li , Lj are leaf nodes and i < j , Li ‘s search-

key values are less than Lj ’s search-key values • Pn points to next leaf node in search-key order

Page 24: Multimedia Information Systems

Non-Leaf Nodes in B+ -Trees

• The nonleaf nodes form a multi-level sparse index on the leaf nodes. For a non-leaf node with n

pointers: – All the search-keys in the subtree to which

P1 points are less than K1

– For 2 i n-1 all the search-key in the subtree to which Pi points have values greater than or equal to Ki-1 and less than Ki

– All the search-keys in the subtree to which Pn points are greater than or equal to Kn-1

Page 25: Multimedia Information Systems

Example of a B+-tree

Page 26: Multimedia Information Systems

Queries on B+-Trees (Cont.)

• In processing a query, a path is traversed in the tree from the root to some leaf node.

• If there are K search-key values in the file, the path is no longer than log n/ 2 (K ).

• A node is generally the same size as a disk block, typically 4 kilobytes, and n is typically around 100 (40 bytes per index entry).

• With 1 million search key values and n = 100, at most log50 (1, 000, 000) = 4 nodes are accessed in a lookup.

• Contrast this with a balanced binary tree with 1 million search key values --- around 20 nodes are accessed in a lookup – above difference is significant since every node

access may need a disk I/O, costing around 30 millisecond!

Page 27: Multimedia Information Systems

Updates on B+-Trees: Insertion (Cont.)

• Splitting a node: – take the n (search-key value, pointer) pairs

(including the one being inserted) in sorted order. Place the first n/ 2 in the original node, and the rest in a new node.

– let the new node be p, and let k be the least key value in p. Insert (k, p) in the parent of the node being split. If the parent is full, split it and propagate the split further up.

• The splitting of nodes proceeds upwards till a node that is not full is found. In the worst case the root node may be split increasing the height of the tree

by 1.

Page 28: Multimedia Information Systems

Updates on B+-Trees: Insertion (Cont.)

Page 29: Multimedia Information Systems

Self-Practice

• Construct a B+-tree for the following set of key values: (2, 3, 5, 7, 11, 17, 19, 23, 29, 31) with n = 6