14-1 Chapter 14 Functional Dependencies and Normalization for Relational Database

Jan 18, 2016


Allison Joseph
Page 1: 14-1 Chapter 14 Functional Dependencies and Normalization for Relational Database.

14-1

Chapter 14

Functional Dependencies and Normalization for Relational Database


12-1 14-2

1 Informal Design Guidelines for Relational Databases
1.1 Semantics of the Relation Attributes
1.2 Redundant Information in Tuples and Update Anomalies
1.3 Null Values in Tuples
1.4 Generation of Spurious Tuples

2 Functional Dependencies (FDs)
2.1 Definition of FD
2.2 Inference Rules for FDs
2.3 Equivalence of Sets of FDs
2.4 Minimal Sets of FDs

3 Normal Forms Based on Primary Keys
3.1 Introduction to Normalization
3.2 First Normal Form
3.3 Second Normal Form
3.4 Third Normal Form

4 General Normal Form Definitions (For Multiple Keys)

5 BCNF (Boyce-Codd Normal Form)


12-2 14-3

1. Informal Design Guidelines for Relational Databases

• What is relational database design?
– The grouping of attributes to form “good” relation schemas

• Two levels of relation schemas:

– The logical “user view” level

– The storage “base relation” level

• Design is concerned mainly with base relations

• What are the criteria for “good” base relations?


12-2 14-4

1 Informal Design Guidelines for Relational Databases (Cont.)

• We first informally discuss guidelines for good relational design

• Then we discuss the formal concepts of functional dependencies and normal forms

– 1 NF (First Normal Form)

– 2 NF (Second Normal Form)

– 3 NF (Third Normal Form)

– BCNF (Boyce-Codd Normal Form)

• Additional types of dependencies, further normal forms, and relational design algorithms are discussed in Chapter 15


12-3 14-5

1.1 Semantics of the Relation Attributes

• Informally, each tuple should represent one entity or relationship instance

• Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be mixed in the same relation

• Only foreign keys should be used to refer to other entities (see Figure 14.1 on slide 14-7)

Informal measures:

• semantics of attributes
• reducing redundant values in tuples
• reducing null values in tuples
• disallowing spurious tuples


12-3 14-6

1.2 Redundant Information in Tuples and Update Anomalies

• Mixing attributes of multiple entities may cause problems

• Information is stored redundantly, which wastes storage (see 14-11)

• Problems with update anomalies:
– Insertion anomalies
– Deletion anomalies
– Modification anomalies


12-4 14-7

Figure 14.1 Simplified version of the COMPANY relational database schema


12-4a 14-8

Figure 14.2 Example relations for the schema of Figure 14.1


12-4a 14-9

Figure 14.2 Example relations for the schema of Figure 14.1 (Cont.)


12-5 14-10

GUIDELINE 1.

• Design a relation schema so that it is easy to explain its meaning.

• Do not combine attributes from multiple entity types and relationship types into a single relation

EMPLOYEE * DEPARTMENT

attributes from department

attributes from project


12-6 14-11


12-6 14-12


12-7 14-13

Insertion Anomalies

• To insert a new employee tuple into EMP_DEPT, we must include either the attribute values for the department that the employee works for, or nulls (if the employee does not work for a department).


12-7 14-14

Insertion Anomalies (Cont.)

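• It is difficult to insert a new department that has no employees yet into the EMP_DEPT relation
– Place null values in the employee attributes? SSN is the primary key, so it cannot be null
– Such a placeholder tuple would no longer be needed once the first employee is assigned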


12-8 14-15

Deletion Anomalies

• If we delete from EMP_DEPT an employee tuple that happens to represent the last employee working for a particular department, the information concerning that department is lost.


12-8 14-16

Modification Anomalies

• In EMP_DEPT, if we change the value of one of the attributes of a particular department, we must update the tuples of all employees who work in that department; otherwise the database becomes inconsistent.
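
The deletion and modification anomalies can be reproduced on a toy EMP_DEPT instance. This is only an illustrative sketch: the sample rows and the helper name `department_names` are assumptions, not data from the slides.

```python
# Toy EMP_DEPT instance: employee and department attributes mixed in one relation.
# The rows are made-up sample data for illustration.
emp_dept = [
    {"SSN": "111", "ENAME": "Ann", "DNUMBER": 5, "DNAME": "Research", "DMGRSSN": "333"},
    {"SSN": "222", "ENAME": "Bob", "DNUMBER": 5, "DNAME": "Research", "DMGRSSN": "333"},
    {"SSN": "444", "ENAME": "Cid", "DNUMBER": 4, "DNAME": "Admin", "DMGRSSN": "555"},
]


def department_names(rows):
    """Return the set of department names still recorded in the relation."""
    return {row["DNAME"] for row in rows}


# Deletion anomaly: removing the last employee of department 4
# also removes every trace of that department.
after_delete = [row for row in emp_dept if row["SSN"] != "444"]
lost = department_names(emp_dept) - department_names(after_delete)

# Modification anomaly: changing the manager in only one tuple leaves
# department 5 with two different manager values.
partially_updated = [dict(row) for row in emp_dept]
partially_updated[0]["DMGRSSN"] = "999"
managers_of_5 = {row["DMGRSSN"] for row in partially_updated if row["DNUMBER"] == 5}

print(lost)           # departments that disappeared with the deleted employee
print(managers_of_5)  # inconsistent manager values for one department
```

With separate EMPLOYEE and DEPARTMENT relations, the department row would survive the deletion and the manager would be stored exactly once.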


12-8 14-17

GUIDELINE 2.

Design the base relation schemas so that no insertion, deletion, or modification anomalies are present.

Cost: a join is needed to recombine the information (view definition)


12-9 14-18

1.3 Null Values in Tuples

• Relations should be designed so that their tuples have as few NULL values as possible.

• Attributes that are frequently NULL could be placed in separate relations (with the primary key)

• Possible meanings of NULL: not applicable; unknown; known but absent

• Problems with NULLs: wasted space; joins; aggregate functions (COUNT, SUM, AVG)

• Example: office numbers (only ~10% of employees have one) can be moved to EMP_OFFICES (ESSN, OFFICE_NUMBER)


12-9 14-19

1.4 Spurious Tuples

• Bad designs for a relational database may result in erroneous results for certain JOIN operations

• The “lossless join” property is used to guarantee meaningful results for join operations

• The relations should be designed to satisfy the lossless join condition

• Discussed in Chapter 15


12-10 14-20


12-10 14-21

EMP_LOC = Π_{ENAME, PLOCATION}(EMP_PROJ) (see 14-22)

EMP_PROJ1 = Π_{SSN, PNUMBER, HOURS, PNAME, PLOCATION}(EMP_PROJ)


12-11 14-22


12-12 14-23

EMP_PROJ1 * EMP_LOC


12-12 14-24

GUIDELINE 4.

• Design relation schemas so that they can be joined with equality conditions on attributes that are either primary keys or foreign keys, in a way that guarantees that no spurious tuples are generated.
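
To see how spurious tuples arise, the following sketch projects a tiny EMP_PROJ instance onto the two schemas used on the preceding slides and joins them back on PLOCATION, which is neither a primary key nor a foreign key. The sample rows and the helper names `project` and `natural_join` are assumptions made for illustration.

```python
# Made-up EMP_PROJ rows; the two projects share the location "Houston".
ATTRS = ("SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOCATION")
emp_proj = [
    ("111", 1, 10.0, "Ann", "ProductX", "Houston"),
    ("222", 2, 20.0, "Bob", "ProductY", "Houston"),
]


def project(rows, attrs, keep):
    """Relational projection: keep the named attributes and drop duplicates."""
    idx = [attrs.index(a) for a in keep]
    return {tuple(row[i] for i in idx) for row in rows}


def natural_join(r1, attrs1, r2, attrs2):
    """Natural join on all attributes the two relations have in common."""
    common = [a for a in attrs1 if a in attrs2]
    extra = [a for a in attrs2 if a not in attrs1]
    out = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[attrs1.index(a)] == t2[attrs2.index(a)] for a in common):
                out.add(t1 + tuple(t2[attrs2.index(a)] for a in extra))
    return out, tuple(attrs1) + tuple(extra)


loc_attrs = ("ENAME", "PLOCATION")
proj1_attrs = ("SSN", "PNUMBER", "HOURS", "PNAME", "PLOCATION")
emp_loc = project(emp_proj, ATTRS, loc_attrs)
emp_proj1 = project(emp_proj, ATTRS, proj1_attrs)

joined, joined_attrs = natural_join(emp_proj1, proj1_attrs, emp_loc, loc_attrs)
original = project(emp_proj, ATTRS, joined_attrs)  # same column order as the join
spurious = joined - original

print(len(original), len(joined), len(spurious))
```

Every original tuple is still in the join result, but the join also pairs each employee name with the other employee's project, so the result no longer says who actually works on what.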


12-13 14-25

2 Functional Dependencies

• Functional dependencies (FDs) are used to specify formal measures of the ‘goodness’ of relational designs

• FDs and keys are used to define normal forms for relations

• FDs are constraints that are derived from the meaning and interrelationships of the data attributes


12-13 14-26

2.1 Definition of FD

• A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y

• Written as X→ Y; can be displayed graphically on a relation schema as in Figure 14.3 (see 14-11)

• Specifies a constraint on all relation instances r(R)


12-13/14 14-27

2.1 Definition of FD (Cont.)

• For any two tuples t1 and t2 in any relation instance r(R): if t1[X] = t2[X], then t1[Y] = t2[Y]

• X is a candidate key of R ⇒ X → Y for any subset Y of R

• X→ Y holds if whenever two tuples have the same value for X, they must have the same value for Y

• FDs are derived from the real-world constraints on the attributes


12-14 14-28

Examples of FD constraints:

• Social security number determines employee name SSN → ENAME

• Project number determines project name and location PNUMBER →{PNAME, PLOCATION}


12-14 14-29

Examples of FD constraints: (Cont.)

• Employee SSN and project number determine the hours per week that the employee works on the project: {SSN, PNUMBER} → HOURS

• An FD is a property of the attributes in the schema R

• The constraint must hold on every relation instance r(R)

• If K is a key of R, then K functionally determines all attributes in R (since we never have two distinct tuples with t1[K] = t2[K])

TEACH

TEACHER | COURSE | TEXT
Smith | D.S. | Bartram
Smith | D.M. | Al-Nour
Hall | Compilers | Hoffman
Brown | D.S. | Augenthaler

• TEACHER → COURSE does not hold (Smith teaches both D.S. and D.M.)
• COURSE → TEXT does not hold (D.S. appears with both Bartram and Augenthaler)
• TEXT → COURSE possibly holds (P): no pair of tuples in this instance violates it
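
The definition of an FD translates directly into a check over pairs of tuples. The sketch below applies it to the TEACH instance; it assumes the three columns on the slide line up row by row, and the function name `satisfies_fd` is hypothetical.

```python
from itertools import combinations

# Rows read off the TEACH table on the slide: (TEACHER, COURSE, TEXT).
# Assumption: the columns align positionally, one row per teacher entry.
ATTRS = ("TEACHER", "COURSE", "TEXT")
teach = [
    ("Smith", "D.S.", "Bartram"),
    ("Smith", "D.M.", "Al-Nour"),
    ("Hall", "Compilers", "Hoffman"),
    ("Brown", "D.S.", "Augenthaler"),
]


def satisfies_fd(rows, attrs, lhs, rhs):
    """True if no pair of rows agrees on lhs but differs on rhs."""
    li = [attrs.index(a) for a in lhs]
    ri = [attrs.index(a) for a in rhs]
    for t1, t2 in combinations(rows, 2):
        same_lhs = all(t1[i] == t2[i] for i in li)
        same_rhs = all(t1[i] == t2[i] for i in ri)
        if same_lhs and not same_rhs:
            return False
    return True


teacher_course = satisfies_fd(teach, ATTRS, ["TEACHER"], ["COURSE"])
course_text = satisfies_fd(teach, ATTRS, ["COURSE"], ["TEXT"])
text_course = satisfies_fd(teach, ATTRS, ["TEXT"], ["COURSE"])
print(teacher_course, course_text, text_course)
```

A single instance can only refute an FD. A result of `True` means the instance is consistent with the FD, not that the FD holds for the schema, which is why the slide marks TEXT → COURSE as merely possible.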


12-14a 14-30

Inference Rules for Functional Dependencies

• The designer specifies the functional dependencies that are semantically obvious.

• Closure of F = { X → Y | F ⊨ X → Y }

• F ⊨ X → Y: X → Y is inferred from F. Whenever r (an extension of R) satisfies all the dependencies in F, X → Y also holds in r.

• Example: F = { SSN → {ENAME, BDATE, ADDRESS, DNUMBER}, DNUMBER → {DNAME, DMGRSSN} }

F ⊨ SSN → {DNAME, DMGRSSN}

F ⊨ SSN → SSN

F ⊨ DNUMBER → DNAME


12-15 14-31

2.2 Inference Rules for FDs

• Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold


12-15 14-32

Armstrong’s inference rules:

• Notation: {X,Y} → Z ≡ XY → Z; {X,Y,Z} → {U,V} ≡ XYZ → UV

A1. (Reflexive) If Y ⊆ X, then X → Y (trivial dependency)

A2. (Augmentation) If X → Y, then XZ → YZ (Notation: XZ stands for X ∪ Z)

A3. (Transitive) If X → Y and Y → Z, then X → Z

• A1, A2, A3 form a sound and complete set of inference rules


12-15 14-33

Some additional inference rules that are useful:

(Decomposition) If X →YZ, then X →Y and X→Z

(Union) If X →Y and X →Z, then X →YZ

(Pseudotransitivity) If X → Y and WY → Z, then WX → Z

• The last three inference rules, as well as any other inference rule, can be deduced from A1, A2, and A3 (completeness property)


12-16 14-34

A1. (Reflexive) If Y ⊆ X, then X → Y

Proof.

Assume t1, t2 ∈ r of R and t1[X] = t2[X]

∵ Y ⊆ X and t1[X] = t2[X] ∴ t1[Y] = t2[Y], hence X → Y


12-16 14-35

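A2. (Augmentation) If X → Y, then XZ → YZ

Proof (by contradiction).

Assume X → Y holds in an instance r of R and XZ → YZ does not hold. Then there exist t1, t2 ∈ r such that:

1) t1[X] = t2[X]

2) t1[Y] = t2[Y] (from 1 and X → Y)

3) t1[XZ] = t2[XZ] (XZ → YZ is assumed not to hold)

4) t1[YZ] ≠ t2[YZ] (XZ → YZ is assumed not to hold)

5) t1[Z] = t2[Z] (from 1 and 3)

6) t1[YZ] = t2[YZ] (from 2 and 5)

7) XZ → YZ (from 3 and 6; 6 contradicts 4)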


12-17 14-36

A3. (Transitive) If X → Y and Y → Z, then X → Z

Proof.

Assume t1, t2 ∈ r of R and t1[X] = t2[X]

1) X → Y (given)

2) Y → Z (given)

3) t1[Y] = t2[Y] (t1[X] = t2[X] & (1))

4) t1[Z] = t2[Z] ((3) & (2))

5) X → Z (t1[X] = t2[X] & (4))


12-17 14-37

Decomposition Rule: {X → YZ} ⊨ X → Y

1) X →YZ (given)

2) YZ →Y (Reflexive rule)

3) X →Y (Transitive rule)


12-17 14-38

Union Rule: {X → Y, X → Z} ⊨ X → YZ

1) X → Y (given)

2) X → Z (given)

3) X → XY (augmenting (1) with X; note XX = X)

4) XY → YZ (augmenting (2) with Y)

5) X → YZ (transitive rule on (3) & (4))


12-18 14-39

Pseudotransitive Rule: {X → Y, WY → Z} ⊨ WX → Z

1) X → Y (given)

2) WY → Z (given)

3) WX → WY (augmenting (1) with W)

4) WX → Z (transitive rule on (3) & (2))


12-19 14-40

2.2 Inference Rules for FDs (Cont.)

• Closure of a set F of FDs is the set F+ of all FDs that can be inferred from F: F+ = { X → Y | F ⊨ X → Y }

• Closure of a set of attributes X with respect to F is the set X+ of all attributes that are functionally determined by X: X+ = { Y | F ⊨ X → Y }

• X+ can be calculated by repeatedly applying A1, A2, A3 using the FDs in F


12-20 14-41

Algorithm 14.1 Determining X+

X+ := X;
repeat
    oldX+ := X+;
    for each functional dependency Y → Z in F do
        if Y ⊆ X+ then X+ := X+ ∪ Z;
until (oldX+ = X+);


12-20 14-42

Example

F = { SSN → ENAME, PNUMBER → {PNAME, PLOCATION},

{SSN, PNUMBER} → HOURS}

{SSN}+ = {SSN,ENAME}

{PNUMBER}+ = {PNUMBER, PNAME, PLOCATION}

{SSN, PNUMBER}+ = {SSN, PNUMBER, ENAME, PNAME, PLOCATION, HOURS}

{SSN, PNUMBER} is a key
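
The closure algorithm and the example above can be checked with a short sketch. The function name `closure` and the representation of each FD as a `(lhs, rhs)` pair of sets are assumptions chosen for illustration.

```python
def closure(attrs, fds):
    """Compute X+ for the attribute set attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:  # plays the role of repeat ... until (oldX+ = X+)
        changed = False
        for lhs, rhs in fds:
            # If the whole left-hand side is already in X+, add the right-hand side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The FD set F from the example slide.
F = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]

ALL_ATTRS = {"SSN", "PNUMBER", "ENAME", "PNAME", "PLOCATION", "HOURS"}

print(sorted(closure({"SSN"}, F)))
print(sorted(closure({"PNUMBER"}, F)))
print(closure({"SSN", "PNUMBER"}, F) == ALL_ATTRS)
```

Because {SSN, PNUMBER}+ contains every attribute while neither {SSN}+ nor {PNUMBER}+ does, {SSN, PNUMBER} is a minimal superkey, which matches the conclusion on the slide.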


12-19 14-43

2.3 Equivalence of Sets of FDs

• Two sets of FDs F and G are equivalent if:

– Every FD in F can be inferred from G, and

– Every FD in G can be inferred from F

• Hence, F and G are equivalent if F+ = G+

• Definition: F covers G if every FD in G can be inferred from F (i.e., if G+ ⊆ F+)


12-19 14-44

2.3 Equivalence of Sets of FDs (Cont.)

• F and G are equivalent if F covers G and G covers F

• There is an algorithm for checking equivalence of sets of FDs

F covers E: ∀ X → Y ∈ E, compute X+ w.r.t. F and check that Y ⊆ X+

E covers F: ∀ X → Y ∈ F, compute X+ w.r.t. E and check that Y ⊆ X+
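
The equivalence check reduces to attribute closures, as the following sketch shows. The two FD sets `F` and `G` are made-up examples (they are not taken from the slides), and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD X -> Y in g satisfies Y ⊆ X+ computed with respect to f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """Two FD sets are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)


# Illustrative FD sets over attributes A, C, D, E, H.
F = [({"A"}, {"C"}), ({"A", "C"}, {"D"}), ({"E"}, {"A", "D"}), ({"E"}, {"H"})]
G = [({"A"}, {"C", "D"}), ({"E"}, {"A", "H"})]
H = [({"A"}, {"C"})]

print(equivalent(F, G))
print(covers(F, H), covers(H, F))
```

Computing a closure per FD avoids materializing F+ or G+, which can be exponentially large.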


12-21 14-45

2.4 Minimal Sets of FDs

• A set of FDs F is minimal if it satisfies the following conditions:

1) Every dependency in F has a single attribute for its RHS.

2) We cannot remove any dependency from F and have a set of dependencies that is equivalent to F

3) We cannot replace any dependency X → A in F with a dependency Y → A, where Y ⊂ X, and still have a set of dependencies that is equivalent to F.


12-21 14-46

2.4 Minimal Sets of FDs (Cont.)

• Every set of FDs has an equivalent minimal set

• There can be several equivalent minimal sets

• Having a minimal set is important for some relational design algorithms (see Chapter 15)


12-21a 14-47

Algorithm 14.2 Finding a minimal cover G for F

1. Set G := F.

2. Replace each functional dependency X → {A1, A2, …, An} in G by the n functional dependencies X → A1, X → A2, …, X → An.

3. For each functional dependency X → A in G
    for each attribute B that is an element of X
        if ((G − {X → A}) ∪ {(X − {B}) → A}) is equivalent to G,
        then replace X → A with (X − {B}) → A in G.

4. For each remaining functional dependency X → A in G
    if (G − {X → A}) is equivalent to G,
    then remove X → A from G.
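
The four steps can be sketched in Python using attribute closures for the two equivalence tests. This is a minimal sketch under stated assumptions: FDs are `(lhs, rhs)` pairs of sets, the names `closure` and `minimal_cover` are hypothetical, and the input FD set is a made-up example. The result depends on the order in which attributes and FDs are examined, so other minimal covers may exist.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def minimal_cover(fds):
    # Steps 1-2: copy F and split right-hand sides into single attributes.
    g = []
    for lhs, rhs in fds:
        for a in sorted(rhs):
            fd = (frozenset(lhs), frozenset([a]))
            if fd not in g:
                g.append(fd)
    # Step 3: drop left-hand-side attributes that are not needed.
    for i in range(len(g)):
        for b in sorted(g[i][0]):
            lhs, rhs = g[i]
            reduced = lhs - {b}
            # Replacing X -> A by (X - {B}) -> A keeps G equivalent
            # exactly when A is already in (X - {B})+ under G.
            if reduced and rhs <= closure(reduced, g):
                g[i] = (reduced, rhs)
    g = list(dict.fromkeys(g))  # step 3 may have produced duplicates
    # Step 4: drop FDs that follow from the remaining ones.
    for fd in list(g):
        rest = [other for other in g if other != fd]
        if fd[1] <= closure(fd[0], rest):
            g = rest
    return g


# Illustrative input: F = {B -> A, D -> A, AB -> D}.
F = [({"B"}, {"A"}), ({"D"}, {"A"}), ({"A", "B"}, {"D"})]
G = minimal_cover(F)
for lhs, rhs in G:
    print(sorted(lhs), "->", sorted(rhs))
```

In this example AB → D shrinks to B → D in step 3 (because B → A), and B → A is then removed in step 4 because it follows from B → D and D → A.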


12-22 14-48

3 Normal Forms Based on Primary Keys

3.1 Introduction to Normalization

• Normalization: Process of decomposing unsatisfactory “bad” relations by breaking up their attributes into smaller relations

• Normal form: Condition using keys and FDs of a relation to certify whether a relation schema is in a particular normal form


12-22 14-49

3.1 Introduction to Normalization (Cont.)

• 2NF, 3NF, and BCNF are based on the keys and FDs of a relation schema
– prime attribute: a member of any key
– nonprime attribute: not a member of any key

• 4NF based on keys, MVDs; 5NF based on keys, JDs (Chapter 15)

• Additional properties may be needed to ensure a good relational design (lossless join, dependency preservation; Chapter 15)


12-22 14-50

3.2 First Normal Form

• Disallows composite attributes, multivalued attributes, and nested relations: attributes whose values for an individual tuple are non-atomic

• Considered to be part of the definition of relation


12-23 14-51

Figure 14.8

(a) A relation schema that is not in 1NF

(b) Example relation instance


12-23 14-52

Figure 14.8 (Cont.)

(c) 1NF relation with redundancy

alternative 1

SSN → PLOCATION

KEY: {DNUMBER, DLOCATION}

alternative 2

(better)

SSN → DLOCATION


12-24 14-53

Figure 14.9 (a)

A nested relation PROJS within EMP_PROJ

Primary key Partial key

EMP_PROJ (SSN, ENAME, {PROJS (PNUMBER, HOURS)})


12-24 14-54

Figure 14.9 (b) Example extension of the EMP_PROJ relation showing nested relations within each tuple.


12-24 14-55

Figure 14.9 (c)

Decomposing EMP_PROJ into 1NF relations by migrating the primary key


12-25 14-56

3.3 Second Normal Form

• Uses the concepts of FDs, primary key

Definitions:

• Prime attribute – an attribute that is a member of the primary key K (in the general definitions, a member of any candidate key)

• Full functional dependency – an FD Y → Z where removal of any attribute from Y means the FD no longer holds: ∀ A ∈ Y, (Y − {A}) → Z does not hold


12-25 14-57

3.3 Second Normal Form (Cont.)

Examples:

• {SSN, PNUMBER} → HOURS is a full FD since neither SSN → HOURS nor PNUMBER → HOURS holds

• {SSN, PNUMBER} → ENAME is not a full FD (it is called a partial dependency) since SSN → ENAME also holds: ∃ A ∈ Y such that (Y − {A}) → Z (here, A = PNUMBER)


12-25 14-58

3.3 Second Normal Form (Cont.)

• A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on the primary key
– For a prime attribute A, K → A is a trivial dependency

• R can be decomposed into 2NF relations via the process of 2NF normalization
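
A 2NF test against the primary key can be sketched by looking for a proper subset of the key that already determines a non-prime attribute. The code below uses the EMP_PROJ dependencies from the earlier closure example; the function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, nonprime, fds):
    """List (subset, attribute) pairs where a proper subset of key determines a non-prime attribute."""
    found = []
    key = sorted(key)
    for size in range(1, len(key)):  # proper, non-empty subsets only
        for subset in combinations(key, size):
            for a in sorted(closure(subset, fds) & set(nonprime)):
                found.append((subset, a))
    return found


F = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]
violations = partial_dependencies(
    {"SSN", "PNUMBER"}, {"ENAME", "PNAME", "PLOCATION", "HOURS"}, F
)
print(violations)
```

An empty list would mean the relation is in 2NF with respect to that key. Here ENAME, PNAME, and PLOCATION are reported, while HOURS is not, so only HOURS is fully dependent on {SSN, PNUMBER}.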


12-26 14-59

Figure 14.10: fd2 and fd3 violate 2NF, i.e., ENAME, PNAME, and PLOCATION are partially dependent on {SSN, PNUMBER}

{SSN → DNUMBER, DNUMBER → DMGRSSN} ⊨ SSN → DMGRSSN

2NF (O), 3NF (X)

It is not a primary key


12-26a 14-60

Y → X (non-trivial dependency) ≡ ∀ t1, t2 ∈ r, if t1[Y] = t2[Y] then t1[X] = t2[X]

It is possible that t1[Y] ≠ t2[Y] but t1[X] = t2[X]

X → Z (non-trivial dependency): whenever that possibility occurs, the data is stored redundantly


12-26a 14-61

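SSN (or PNUMBER) is only part of the key, not a key itself, so more than one tuple may have the same value for it. Combined with SSN → ENAME and PNUMBER → {PNAME, PLOCATION}, the dependent attributes are repeated as well.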


12-27 14-62

3.4 Third Normal Form

Definition:

• Transitive functional dependency – an FD Y → Z that can be derived from two FDs Y → X and X → Z, where
– Y → X and X → Z are nontrivial dependencies
– X is not a subset of any key


12-27 14-63

3.4 Third Normal Form (Cont.)

Examples:

• SSN → DMGRSSN is a transitive FD since SSN → DNUMBER and DNUMBER → DMGRSSN hold

• SSN → ENAME is non-transitive since there is no set of attributes X where SSN → X and X → ENAME


12-27 14-64

3.4 Third Normal Form (Cont.)

• A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute A in R is transitively dependent on the primary key (see Figure 14.10 on slides 14-59/60/61)

• R can be decomposed into 3NF relations via the process of 3NF normalization


12-28 14-65

4. General Normal Form Definitions (For Multiple Keys)

• The above definitions consider the primary key only

• The following more general definitions take into account relations with multiple candidate keys

• A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on every key of R (see Figure 14.11)


12-29 14-66

Figure 14.11(a)

Parcels of lands for sale in various counties of a state

Candidate keys: PROPERTY_ID#; {COUNTY_NAME, LOT#}

Partial dependency


12-29 14-67

Figure 14.11 (b)

transitive dependency


12-28 14-68

Definition:

• Superkey of relation schema R – a set of attributes S of R that contains a key of R

• A relation schema R is in third normal form (3NF) if whenever a FD X → A holds in R, then either:
(a) X is a superkey of R, or
(b) A is a prime attribute of R
(see Figure 14.11 on slides 14-67/68/69)

• Boyce-Codd normal form disallows condition (b) above

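A violating FD X → A (X not a superkey, A nonprime) is one of two cases:

• Transitive dependency: for a key Y, Y → X and X → A give Y → A
• Partial dependency: X is a proper subset of a key Y, so Y → X and X → A give Y → A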


12-29 14-69

Figure 14.11 (c) (d)

fd5

Marion County: 0.5, 0.6, 0.7, 0.8, 0.9, 1.0
Liberty County: 1.1, 1.2, …, 1.9, 2.0


12-30 14-70

5 BCNF (Boyce-Codd Normal Form)

• A relation schema R is in Boyce-Codd Normal Form (BCNF) if whenever a FD X →A holds in R, then X is a superkey of R (14-71a) Figure 14.12

• Each normal form is strictly stronger than the previous one:
– Every 2NF relation is in 1NF
– Every 3NF relation is in 2NF
– Every BCNF relation is in 3NF

• There exist relations that are in 3NF but not in BCNF (14-71b) Figure 14.12
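
The general 3NF and BCNF definitions can be compared on a small example. The sketch below assumes the relation R(STUDENT, COURSE, INSTRUCTOR) with FD1: {STUDENT, COURSE} → INSTRUCTOR and FD2: INSTRUCTOR → COURSE; these two FDs are an assumption, since the figure itself is not part of this transcript. The function names are hypothetical, and the check only inspects the FDs as listed.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Brute-force search for minimal attribute sets whose closure is all of R."""
    attrs = sorted(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):  # smaller sets first, so keys stay minimal
        for combo in combinations(attrs, size):
            s = set(combo)
            if closure(s, fds) == set(attrs) and not any(k <= s for k in keys):
                keys.append(s)
    return keys


def violations(attrs, fds, form):
    """Listed FDs X -> A that violate the given normal form ("3NF" or "BCNF")."""
    prime = set().union(*candidate_keys(attrs, fds))
    bad = []
    for lhs, rhs in fds:
        for a in sorted(set(rhs) - set(lhs)):  # skip trivial parts
            if closure(lhs, fds) == set(attrs):
                continue  # condition (a): X is a superkey
            if form == "3NF" and a in prime:
                continue  # condition (b): A is prime (allowed in 3NF only)
            bad.append((sorted(lhs), a))
    return bad


R = {"STUDENT", "COURSE", "INSTRUCTOR"}
FDS = [({"STUDENT", "COURSE"}, {"INSTRUCTOR"}), ({"INSTRUCTOR"}, {"COURSE"})]

print(violations(R, FDS, "3NF"))
print(violations(R, FDS, "BCNF"))
```

Under these assumed FDs every attribute is prime, so INSTRUCTOR → COURSE passes condition (b) of 3NF but fails BCNF because INSTRUCTOR is not a superkey.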


12-31 14-71

Figure 14.12 (a) BCNF normalization with the dependency of FD2 being ‘lost’ in the decomposition

(b) A relation R in 3NF but not in BCNF

Non-prime: C; prime: A, B


14-32 14-72

Three possible decompositions:

1. {STUDENT, INSTRUCTOR} and {STUDENT, COURSE}: generates spurious tuples

2. {COURSE, INSTRUCTOR} and {COURSE, STUDENT}: generates spurious tuples

3. {INSTRUCTOR, COURSE} and {INSTRUCTOR, STUDENT}: lossless join, but FD1 is “lost”

The original relation (with FD1 and FD2) is in 3NF, but not BCNF
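
The three decompositions can be checked with the standard test for binary decompositions: {R1, R2} has the lossless join property if the shared attributes functionally determine R1 − R2 or R2 − R1. That test is not stated on these slides, and the FDs used here (FD1: {STUDENT, COURSE} → INSTRUCTOR, FD2: INSTRUCTOR → COURSE) are an assumption; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def lossless_binary(r1, r2, fds):
    """Binary lossless-join test: (R1 ∩ R2)+ must contain R1 - R2 or R2 - R1."""
    r1, r2 = set(r1), set(r2)
    common_closure = closure(r1 & r2, fds)
    return (r1 - r2) <= common_closure or (r2 - r1) <= common_closure


# Assumed FDs for the STUDENT / COURSE / INSTRUCTOR relation.
FDS = [({"STUDENT", "COURSE"}, {"INSTRUCTOR"}), ({"INSTRUCTOR"}, {"COURSE"})]

decompositions = [
    ({"STUDENT", "INSTRUCTOR"}, {"STUDENT", "COURSE"}),
    ({"COURSE", "INSTRUCTOR"}, {"COURSE", "STUDENT"}),
    ({"INSTRUCTOR", "COURSE"}, {"INSTRUCTOR", "STUDENT"}),
]
results = [lossless_binary(r1, r2, FDS) for r1, r2 in decompositions]
print(results)
```

Only the third decomposition passes, because its shared attribute INSTRUCTOR determines COURSE. Neither of its relations contains both STUDENT and COURSE, which is why FD1 can no longer be checked within a single relation.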


14-33 14-73

Relations: (STUDENT, INSTRUCTOR) and (COURSE, STUDENT)


14-34 14-74

Relations: (INSTRUCTOR, COURSE) and (STUDENT, INSTRUCTOR)


12-30 14-75

5 BCNF (Boyce-Codd Normal Form) (Cont.)

• The goal is to have each relation in BCNF (or 3NF)

• Additional criteria may be needed to ensure that the set of relations in a relational database is satisfactory (see Chapter 15)

– Lossless join property

– Dependency preservation property

• Additional normal forms are discussed in Ch. 15

– 4NF (based on multi-valued dependencies)

– 5NF (based on join dependencies)