CS542 Schema Refinement Chapter 19 (part 1) Functional Dependencies

Schema Refinement

Jan 31, 2016


Niel

Transcript
Page 1: Schema Refinement

CS542 1

Schema Refinement

Chapter 19 (part 1)

Functional Dependencies

Page 2: Schema Refinement

The Evils of Redundancy

Redundancy is at the root of several problems associated with relational schemas: redundant storage and insert, delete, and update anomalies.

Main refinement technique: decomposition. Example: replacing ABCD with, say, AB and BCD.

Functional dependency constraints are used to identify schemas with such problems and to suggest refinements.

Page 3: Schema Refinement

Insert Anomaly

Student

sNumber  sName  pNumber  pName
s1       Dave   p1       prof1
s2       Greg   p2       prof2

Question: How do we insert a professor who has no students?

Insert anomaly: We are not able to insert some “valid” values (here, a professor without a student).

Page 4: Schema Refinement

Delete Anomaly

Student

sNumber  sName  pNumber  pName
s1       Dave   p1       MM
s2       Greg   p2       ER

Question: Can we delete a student who is the only student of a professor?

Delete anomaly: We are not able to perform a delete without losing some “valid” information (here, the professor).

Page 5: Schema Refinement

Update Anomaly

Student

sNumber  sName  pNumber  pName
s1       Dave   p1       MM
s2       Greg   p1       MM

Question: How do we update the name of a professor?

Update anomaly: To update a single value, we have to update multiple rows. Update anomalies are due to redundancy.

Page 6: Schema Refinement

Functional Dependencies (FDs)

A functional dependency X → Y holds over relation R if, for every allowable instance r of R:

$\forall t_1 \in r,\ \forall t_2 \in r:\quad \pi_X(t_1) = \pi_X(t_2) \implies \pi_Y(t_1) = \pi_Y(t_2)$

Given two tuples in r, if the X values agree, then the Y values must also agree.
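
The definition above can be tested against a single instance. The following is a minimal sketch, assuming rows are stored as Python dictionaries; the function name `fd_holds` and the sample data layout are illustrative choices, not part of the slides. Note that one instance can only refute an FD: an FD is a statement about every allowable instance, so a passing check does not prove that the FD holds over R.

```python
from typing import Dict, Hashable, Iterable, Sequence

Row = Dict[str, Hashable]


def fd_holds(rows: Iterable[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if this instance satisfies the FD lhs -> rhs."""
    seen = {}  # maps each X-value to the Y-value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # two tuples agree on X but differ on Y
    return True


# Student instance from the "Update Anomaly" slide.
student = [
    {"sNumber": "s1", "sName": "Dave", "pNumber": "p1", "pName": "MM"},
    {"sNumber": "s2", "sName": "Greg", "pNumber": "p1", "pName": "MM"},
]

print(fd_holds(student, ["pNumber"], ["pName"]))  # both p1 rows carry the same pName
print(fd_holds(student, ["pName"], ["sName"]))    # same pName, different sName
```

The first check passes on this instance, while the second is refuted by the two rows that share pName MM but differ on sName.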

Page 7: Schema Refinement

FD Example

Student

sNumber  sName  address
1        Dave   144FL
2        Greg   320FL

Suppose we have the FD sName → address

• for any two rows in the Student relation with the same value for sName, the value for address must be the same

• i.e., there is a function from sName to address

Page 8: Schema Refinement

Keys + Functional Dependencies

Assume K is a candidate key for R.

What does this imply about the FD between K and R?

It means that K → R!

Does K → R require K to be minimal?

No. Any superkey of R also functionally determines all attributes of R.

Page 9: Schema Refinement

Example: Constraints on Entity Set

Consider the relation obtained from Hourly_Emps:
Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked)

Notation: We denote this relation schema by its attributes: SNLRWH. This is really the set of attributes {S, N, L, R, W, H}.

Some FDs on Hourly_Emps:

• ssn is the key: S → SNLRWH

• rating determines hrly_wages: R → W

Page 10: Schema Refinement

Problems Caused by FD

Problems due to the example FD:

rating determines hrly_wages: R → W

Page 11: Schema Refinement

Example Problems due to R → W:

Update anomaly: Can we change W in just the 1st tuple of SNLRWH?

Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for their rating?

Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5!

Hourly_Emps

S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

rating (R) determines hrly_wages (W)
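
The update and deletion anomalies can be reproduced on the Hourly_Emps instance. This is a small sketch, assuming rows are plain tuples in SNLRWH order; the helper `wages_by_rating` and the replacement wage 12 are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hourly_Emps rows as (S, N, L, R, W, H), copied from the slide.
hourly_emps = [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]


def wages_by_rating(rows):
    """Map each rating R to the set of hourly wages W stored with it."""
    wages = defaultdict(set)
    for _s, _n, _l, r, w, _h in rows:
        wages[r].add(w)
    return dict(wages)


# Update anomaly: change W in only the first tuple (12 is a made-up value).
first = hourly_emps[0]
updated = [first[:4] + (12,) + first[5:]] + hourly_emps[1:]

# Deletion anomaly: remove every employee with rating 5.
remaining = [row for row in hourly_emps if row[3] != 5]

print(wages_by_rating(hourly_emps))  # one wage per rating, so R -> W holds here
print(wages_by_rating(updated))      # rating 8 now maps to two different wages
print(wages_by_rating(remaining))    # rating 5 and its wage are gone
```

After the partial update, rating 8 is stored with two wages, which violates R → W; after the deletion, nothing in the table records the wage for rating 5.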

Page 12: Schema Refinement

Same Example Problems due to R → W!

Hourly_Emps

S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

Hourly_Emps2

S            N          L   R  H
123-22-3666  Attishoo   48  8  40
231-31-5368  Smiley     22  8  30
131-24-3650  Smethurst  35  5  30
434-26-3751  Guldu      35  5  32
612-67-4134  Madayan    35  8  40

Wages

R  W
8  10
5  7

Solution: two smaller tables instead of one big one!
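
The decomposition on this slide can be sketched as two projections, followed by a join on R to confirm that the original table is recoverable. This is an illustrative sketch that assumes rows are tuples in SNLRWH order; the variable names are not from the slides.

```python
# Hourly_Emps rows as (S, N, L, R, W, H), copied from the slide.
hourly_emps = [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]

# Project onto SNLRH and onto RW; sets drop the repeated (R, W) pairs.
hourly_emps2 = {(s, n, l, r, h) for s, n, l, r, _w, h in hourly_emps}
wages = {(r, w) for _s, _n, _l, r, w, _h in hourly_emps}

# Natural join of the two projections on the shared attribute R.
rejoined = {
    (s, n, l, r, w, h)
    for s, n, l, r, h in hourly_emps2
    for r2, w in wages
    if r == r2
}

print(sorted(wages))                 # each rating's wage is stored exactly once
print(rejoined == set(hourly_emps))  # the join gives back the original rows
```

Each wage is now stored once in Wages, so changing the wage for a rating touches a single row, and the join reproduces exactly the original five tuples for this instance.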

Page 13: Schema Refinement

Same Example Problems due to R → W!

(Tables as on the previous slide: Hourly_Emps and its decomposition into Hourly_Emps2 and Wages.)

Will two smaller tables be better than one big one?

• Update anomaly: Can we change W in just the 1st tuple of SNLRWH?

• Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for their rating?

• Deletion anomaly: If we delete all employees with rating 5, do we lose the information about the wage for rating 5?

Page 14: Schema Refinement

Reasoning About FDs

Given some FDs, we can usually infer additional FDs:

ssn → did, did → lot implies ssn → lot

Page 15: Schema Refinement

Properties of FDs

Let A, B, C, Z be sets of attributes.

Armstrong’s Axioms:

• Reflexive (also trivial FD): if A ⊇ B, then A → B

• Transitive: if A → B and B → C, then A → C

• Augmentation: if A → B, then AZ → BZ

These are sound and complete inference rules for FDs!

Additional rules (that follow from Armstrong’s Axioms):

• Union: if A → B and A → C, then A → BC

• Decomposition: if A → BC, then A → B and A → C
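
One way to see that the two additional rules follow from Armstrong’s Axioms is the short derivation below. It is a sketch of a standard argument rather than a transcription of the slides, and it treats concatenated letters such as AB as set union, so AA = A.

```latex
\begin{align*}
\textbf{Union.}\quad
  & A \to B \;\Rightarrow\; A \to AB
    && \text{augment with } A \text{, using } AA = A \\
  & A \to C \;\Rightarrow\; AB \to BC
    && \text{augment with } B \\
  & A \to AB,\ AB \to BC \;\Rightarrow\; A \to BC
    && \text{transitivity} \\[6pt]
\textbf{Decomposition.}\quad
  & BC \to B,\quad BC \to C
    && \text{reflexivity, since } B \subseteq BC \text{ and } C \subseteq BC \\
  & A \to BC,\ BC \to B \;\Rightarrow\; A \to B
    && \text{transitivity} \\
  & A \to BC,\ BC \to C \;\Rightarrow\; A \to C
    && \text{transitivity}
\end{align*}
```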

Page 16: Schema Refinement

Closure of FDs

An FD f is implied by a set of FDs F if f holds whenever all FDs in F hold.

F+ = closure of F: the set of all FDs that are implied by F.

Computing the closure of a set of FDs can be expensive. The size of the closure is exponential in the number of attributes!
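
To get a feel for how large the closure becomes, the sketch below enumerates it by brute force for a three-attribute schema. It relies on the fact that X → Y is implied by F exactly when Y is contained in the attribute closure of X. The function names, the frozenset encoding, and the tiny example F = {A → B, B → C} are assumptions made for illustration.

```python
from itertools import combinations


def subsets(attrs):
    """Yield every subset of attrs as a frozenset, including the empty set."""
    items = sorted(attrs)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def attribute_closure(x, fds):
    """Attributes determined by x under fds, a list of (lhs, rhs) frozenset pairs."""
    closure = set(x)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return frozenset(closure)


def fd_closure(attrs, fds):
    """Enumerate the closure of fds as pairs (X, Y), each meaning X -> Y."""
    return {
        (x, y)
        for x in subsets(attrs)
        for y in subsets(attribute_closure(x, fds))
    }


# Hypothetical example: R(A, B, C) with F = {A -> B, B -> C}.
F = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
print(len(fd_closure("ABC", F)))   # size of the closure of F
print(len(fd_closure("ABC", [])))  # trivial FDs alone, with no FDs given
```

Even with no FDs given, the trivial FDs (Y ⊆ X) already number 3^n for n attributes, which is why computing the full closure is rarely practical.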

Page 17: Schema Refinement

Reasoning About FDs (Contd.)

Computing the full closure F+ of a set of FDs is too expensive.

Typically, we just need to know whether a given FD X → Y is in the closure of a set of FDs F.

Algorithm for an efficient check:

Compute the attribute closure of X (denoted X+) wrt F:

• The set of all attributes A such that X → A is in F+

• There is a linear time algorithm to compute this.

Check if Y is in X+. If yes, then X → Y is in F+; if not, it is not.

Page 18: Schema Refinement

Algorithm for Attribute Closure

Computing the closure of a set of attributes {A1, A2, …, An}, denoted {A1, A2, …, An}+:

1. Let X = {A1, A2, …, An}.
2. If there exists an FD B1, B2, …, Bm → C such that every Bi ∈ X, then X = X ∪ {C}.
3. Repeat step 2 until no more attributes can be added.
4. Output X; this is {A1, A2, …, An}+.
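
The four steps above can be sketched in Python as follows. This is a straightforward fixpoint loop, not the linear-time algorithm mentioned on the previous slide. The helper `fd`, the function `implies`, and the one-letter attribute encoding (S for ssn, D for did, L for lot) are assumptions introduced for this example.

```python
from typing import FrozenSet, Iterable, List, Tuple

FD = Tuple[FrozenSet[str], FrozenSet[str]]  # (left-hand side, right-hand side)


def fd(lhs: str, rhs: str) -> FD:
    """Hypothetical helper: fd("SD", "L") stands for SD -> L, one letter per attribute."""
    return frozenset(lhs), frozenset(rhs)


def attribute_closure(attrs: Iterable[str], fds: List[FD]) -> FrozenSet[str]:
    """Compute the closure of attrs under fds by repeating step 2 until nothing changes."""
    x = set(attrs)                      # step 1: start from the given attributes
    grew = True
    while grew:                         # step 3: repeat until no attribute is added
        grew = False
        for lhs, rhs in fds:            # step 2: find an FD whose left side lies in X
            if lhs <= x and not rhs <= x:
                x |= rhs
                grew = True
    return frozenset(x)                 # step 4: X is now the closure


def implies(fds: List[FD], lhs: str, rhs: str) -> bool:
    """Return True if lhs -> rhs is in the closure of fds (rhs is within the closure of lhs)."""
    return frozenset(rhs) <= attribute_closure(lhs, fds)


# ssn -> did and did -> lot, encoded as S -> D and D -> L.
F = [fd("S", "D"), fd("D", "L")]
print(sorted(attribute_closure("S", F)))
print(implies(F, "S", "L"))  # ssn -> lot follows from F
print(implies(F, "L", "S"))  # lot -> ssn does not
```

Allowing a set on the right-hand side, rather than a single attribute C, is a small generalization that is equivalent by the decomposition rule.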

Page 19: Schema Refinement

Another Example: Inferring FDs

Consider R (A, B, C, D, E) with FDs F = { A → B, B → C, CD → E }. Does A → E hold?

Rephrase as: Is A → E in the closure F+? Equivalently, is E in A+?

Let us compute {A}+:

{A}+ = {A, B, C}

Conclude: E is not in A+, therefore A → E is not implied by F.
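
The computation of {A}+ can be traced step by step. In the sketch below, the intermediate sets X_0, X_1, X_2 are names introduced here for the successive values of X in the attribute closure algorithm.

```latex
\begin{align*}
X_0 &= \{A\} && \text{start with the given attribute} \\
X_1 &= \{A, B\} && \text{apply } A \to B \text{, since } A \in X_0 \\
X_2 &= \{A, B, C\} && \text{apply } B \to C \text{, since } B \in X_1 \\
\{A\}^+ &= X_2 && CD \to E \text{ cannot be applied, since } D \notin X_2
\end{align*}
```

Because D never enters the set, E is never added, which is why E is not in A+.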

Page 20: Schema Refinement

Recap: So Far

Functional dependencies: relationships across attributes of relations.

Redundancy: arises due to certain relationships (FDs) holding.

So far: reasoning with FDs.

Approach: establish certain “normal forms” with respect to dependencies.