Relation A B C D E F 1NF? 2NF? 3NF? Relation1 A B Relation2 A * C D E* Help me Codd!! Relation3 E F Normalisation Reading: Connolly and Begg 13 & 14 (4th ed),

Dec 20, 2015

Transcript
Page 1:

Normalisation

Relation (A, B, C, D, E, F): 1NF? 2NF? 3NF?

Relation1 (A, B)
Relation2 (A*, C, D, E*)
Relation3 (E, F)

Help me Codd!!

Reading: Connolly and Begg, chapters 13 & 14 (4th ed.)

Page 2:

Normalisation

From this…

| regno | student_name | sex | student_address | code | result | title | requires | id | lecturer_name | lecturer_address | qual | position |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 43414 | Jones | Female | Edinburgh | | | | | | | | | |
| | | | | | | | | 55101 | Smith | Edinburgh | BSc | Lecturer |
| | | | | | | | | 55144 | Brown | London | BSc | Lecturer |
| 40986 | Jones | Male | Oxford | 3011 | 65 | Data Structures | 3005 | 55633 | Brown | Abingdon | PhD | Reader |
| 42331 | Smith | Female | London | 3011 | 72 | Data Structures | 3005 | 55633 | Brown | Abingdon | PhD | Reader |
| 40986 | Jones | Male | Oxford | 3080 | | Spreadsheets | 3011 | 55633 | Brown | Abingdon | PhD | Reader |
| 40986 | Jones | Male | Oxford | 3025 | 78 | Databases | 3011 | 55981 | Adams | London | Meng | Lecturer |
| 42331 | Smith | Female | London | 3025 | 81 | Databases | 3011 | 55981 | Adams | London | Meng | Lecturer |
| 40986 | Jones | Male | Oxford | 3081 | 76 | Artificial Intelligence | 2080 | 55981 | Adams | London | Meng | Lecturer |
| 42331 | Smith | Female | London | 3081 | | Artificial Intelligence | 2080 | 55981 | Adams | London | Meng | Lecturer |
| | | | | 3082 | | Software Engineering | | 55981 | Adams | London | Meng | Lecturer |

…to this

In 3+ easy(?) steps

Page 3:

What is normalisation?

A method for database design
– Theory examines how “good” a schema is
– Transforms non-normalised schemas
– Minimises storage

Takes a set of attributes and derives the relational model
– By separating out the required tables

Completely different approach to ERM
– But should get the same result

A minimum of 3 steps are used. At each stage the normal form gets stronger (i.e. removes redundancy), so the relations are less open to update anomalies

All based on functional dependencies

Page 4:

Functional Dependency

Underpins the normalisation process

If every value of column A uniquely determines the value in column B, then
– B is functionally dependent on A (B depends on A)
– A determines B, or, formally, A → B (A is called the determinant)

For example,
– EmpID → Age, Dept (A → B, C)
– Employee ID, Project → Role (X, Y → Z)
– Note multiple attributes are often involved

Relation attributes: EmpID, Project, Age, Dept, Dsize, Budget, Role

Page 5:

Rules for functional dependency

A → B does NOT automatically mean B → A
– E.g. student ID → name, but not name → ID

Transitive dependency: if A → B and B → C then A → C

Many other rules
– E.g. X, Y → Z may hold while X → Z holds as well
– In this case Z is partially dependent on X, Y

“Transitive” and “partial” dependency are two key concepts of the normalisation process
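
One way to see how these rules combine is to compute the closure of an attribute set, i.e. everything it determines once the rules are applied repeatedly. The sketch below is a minimal illustration only: the function name `closure` and the encoding of each dependency as a pair of tuples are assumptions made for this example, not something taken from the slides.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole determinant, we gain its dependants.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Transitivity: A -> B and B -> C together give A -> C.
fds_transitive = [(("A",), ("B",)), (("B",), ("C",))]
print(sorted(closure(("A",), fds_transitive)))  # ['A', 'B', 'C']

# The reverse direction does not follow: B never determines A.
print(sorted(closure(("B",), fds_transitive)))  # ['B', 'C']

# Partial dependency: X, Y -> Z, but X alone already determines Z.
fds_partial = [(("X", "Y"), ("Z",)), (("X",), ("Z",))]
print("Z" in closure(("X",), fds_partial))      # True
```

If the closure of part of a composite key already contains a non-key attribute, that attribute is partially dependent on the key; if an attribute is only reached through another non-key attribute, the dependency is transitive.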

Page 6:

A Question for you!

| EmpID | Project | Age | Dept | Dsize | Budget | Role |
|---|---|---|---|---|---|---|
| E1 | P2 | 33 | D2 | 10 | 100 | Analyst |
| E1 | P1 | 33 | D2 | 10 | 200 | Prog. |
| E2 | P1 | 34 | D5 | 10 | 200 | Prog. |
| E2 | P2 | 34 | D5 | 20 | 100 | Analyst |

Which functional dependency is violated by the data?

A, B, C, D (the content of the four options is not preserved in this transcript)
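
A question like this can be checked mechanically: a dependency is violated as soon as two rows agree on the determinant but disagree on the dependent attributes. The following is a hedged sketch; the helper `fd_holds` is a hypothetical name, and because the four answer options did not survive in the transcript, the candidate dependencies are assumptions (the first two come from the earlier functional dependency slide, the last two are guesses suggested by the column names).

```python
def fd_holds(rows, lhs, rhs):
    """True if each combination of lhs values maps to a single combination of rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


COLUMNS = ["EmpID", "Project", "Age", "Dept", "Dsize", "Budget", "Role"]
DATA = [  # rows copied from the slide
    ("E1", "P2", 33, "D2", 10, 100, "Analyst"),
    ("E1", "P1", 33, "D2", 10, 200, "Prog."),
    ("E2", "P1", 34, "D5", 10, 200, "Prog."),
    ("E2", "P2", 34, "D5", 20, 100, "Analyst"),
]
rows = [dict(zip(COLUMNS, r)) for r in DATA]

CANDIDATES = [  # assumed candidates, not the slide's lost options
    (("EmpID",), ("Age", "Dept")),
    (("EmpID", "Project"), ("Role",)),
    (("Project",), ("Budget",)),
    (("Dept",), ("Dsize",)),
]
results = {(lhs, rhs): fd_holds(rows, lhs, rhs) for lhs, rhs in CANDIDATES}
for (lhs, rhs), ok in results.items():
    print(", ".join(lhs), "->", ", ".join(rhs), ":", "holds" if ok else "VIOLATED")
```

With these rows only Dept → Dsize fails, because department D5 appears with both 10 and 20 as its size. A check like this can only refute a dependency from sample data; it cannot prove that one holds in general.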

Page 7:

Unnormalised Form

Relation contains:
– non-atomic attribute values

| ID | Employee | Salary | Project |
|---|---|---|---|
| 1 | Grey | 31000 | A |
| 2 | Brown | 35000 | B,C |
| 3 | White | 55000 | A,B,C |
| 4 | Black | 47000 | A,C |

The Project column holds non-atomic values

Violation of 1NF

Page 8:

First Normal Form

Permits only single (atomic) attribute values

Flattening into one table gives repeating redundancy:

| ID | Employee | Salary | Project | Budget |
|---|---|---|---|---|
| 1 | Grey | 31000 | A | 10 |
| 2 | Brown | 35000 | B | 5 |
| 2 | Brown | 35000 | C | 5 |
| 3 | White | 55000 | A | 5 |
| 3 | White | 55000 | B | 5 |
| 3 | White | 55000 | C | 5 |
| 4 | Black | 47000 | A | 10 |
| 4 | Black | 47000 | C | 5 |

Remove the repeating group, along with a copy of the primary key from the other table:

| ID | Employee | Salary |
|---|---|---|
| 1 | Grey | 31000 |
| 2 | Brown | 35000 |
| 3 | White | 55000 |
| 4 | Black | 47000 |

| ID (fk) | Project | Budget |
|---|---|---|
| 1 | A | 10 |
| 2 | B | 5 |
| 2 | C | 5 |
| 3 | A | 5 |
| 3 | B | 5 |
| 3 | C | 5 |
| 4 | A | 10 |
| 4 | C | 5 |
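
The move from the unnormalised table on the previous slide to the two 1NF tables can be sketched in a few lines. This is an illustration under assumptions: the variable names are hypothetical, the comma-separated Project string is taken as the repeating group, and Budget is left out because it does not appear in the unnormalised table.

```python
# Unnormalised rows: the last field packs several projects into one value.
unf = [
    (1, "Grey", 31000, "A"),
    (2, "Brown", 35000, "B,C"),
    (3, "White", 55000, "A,B,C"),
    (4, "Black", 47000, "A,C"),
]

# Table 1: the atomic attributes, one row per employee.
employee = [(emp_id, name, salary) for emp_id, name, salary, _ in unf]

# Table 2: one row per (employee, project); ID is carried across as a foreign key.
employee_project = [
    (emp_id, project)
    for emp_id, _, _, projects in unf
    for project in projects.split(",")
]

for row in employee:
    print(row)
for row in employee_project:
    print(row)
```

Every value is now atomic, and each employee's name and salary is stored once rather than once per project.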

Page 9:

Second Normal Form

Full Functional Dependency (FFD)

X → Y is a full functional dependency
– if removal of any attribute from X removes the dependency

X → Y is a partial dependency
– if removal of an attribute from X leaves the dependency intact

2NF test
– involves testing for partial dependency on the PK (therefore the PK MUST be composite to test for 2NF)

Relation R is in 2NF if:
– every non-primary-key attribute in R is FFD on the primary key of R

Page 10:

2NF

So which FDs are violating 2NF?

“Second Normalised” by:
– removing the partially dependent non-primary-key attributes and placing them in a new relation where they are FFD on the appropriate part of the primary key

Relation attributes: EmpID, Project, Age, Dept, Dsize, Budget, Role

{EmpID, Age, Dept, Dsize}
{EmpID*, Project*, Role}
{Project, Budget}
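
The 2NF test above can be written as a small check over a list of dependencies. This is a minimal sketch with assumptions: the dependencies are read off the decomposition on this slide rather than stated there explicitly, the name `partial_dependencies` is hypothetical, and the function only inspects the dependencies it is given (it does not derive implied ones).

```python
def partial_dependencies(primary_key, fds):
    """Return listed FDs whose determinant is a proper subset of the composite
    primary key and whose dependants are all non-key attributes."""
    pk = set(primary_key)
    return [(lhs, rhs) for lhs, rhs in fds
            if set(lhs) < pk and not set(rhs) & pk]


PRIMARY_KEY = ("EmpID", "Project")
FDS = [  # assumed from the slide's decomposition
    (("EmpID",), ("Age", "Dept", "Dsize")),
    (("Project",), ("Budget",)),
    (("EmpID", "Project"), ("Role",)),
]

violations = partial_dependencies(PRIMARY_KEY, FDS)
for lhs, rhs in violations:
    new_relation = ", ".join(lhs + rhs)
    print(", ".join(lhs), "->", ", ".join(rhs), "=> new relation {" + new_relation + "}")
```

Each reported dependency becomes its own relation, and Role stays with the full key, which matches the three relations listed on the slide.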

Page 11:

Third Normal Form

Remove Transitive Dependency

Conditions
– A non-primary-key attribute Z is transitively dependent on primary key X if:
  X → Y and Y → Z (the Y attribute provides the transition to the PK)

A: [EmpID* Project* Role]
B: [Project Budget]
C: [EmpID Age Dept Dsize]
D: None of the above

Which of the above could have transitive dependency?
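
The same style of check works for the 3NF condition: look for a dependency between two non-key attributes inside one relation. The sketch below rests on several assumptions that are not in the slides: the labels A, B and C are matched to the three relations in the order they appear, Dept → Dsize is assumed to hold, and `transitive_dependencies` is a hypothetical helper that only inspects the dependencies listed.

```python
def transitive_dependencies(relation, primary_key, fds):
    """Return listed FDs that lie inside the relation and link non-key attributes."""
    attrs, pk = set(relation), set(primary_key)
    found = []
    for lhs, rhs in fds:
        inside = (set(lhs) | set(rhs)) <= attrs
        non_key_determinant = not set(lhs) & pk
        non_key_dependant = not set(rhs) & pk
        if inside and non_key_determinant and non_key_dependant:
            found.append((lhs, rhs))
    return found


RELATIONS = {  # label: (attributes, primary key); labelling is an assumption
    "A": (("EmpID", "Project", "Role"), ("EmpID", "Project")),
    "B": (("Project", "Budget"), ("Project",)),
    "C": (("EmpID", "Age", "Dept", "Dsize"), ("EmpID",)),
}
FDS = [
    (("EmpID", "Project"), ("Role",)),
    (("Project",), ("Budget",)),
    (("EmpID",), ("Age", "Dept")),
    (("Dept",), ("Dsize",)),  # assumption: department size depends on the department
]

findings = {label: transitive_dependencies(attrs, pk, FDS)
            for label, (attrs, pk) in RELATIONS.items()}
for label, deps in findings.items():
    print(label, deps)
```

A relation needs at least two non-key attributes before a transitive dependency is even possible, which is why only the four-attribute relation is reported under these assumptions.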

Page 12:

Here is an un-normalised Table

| Ord# | Date | Cust# | Name | Prod# | Desc | Qty | Supplier | Tel |
|---|---|---|---|---|---|---|---|---|
| 1 | 12/1/01 | 1 | Jones | 1 | Disk | 3 | X | 101 |
| 1 | 12/1/01 | 1 | Jones | 2 | CD | 5 | Y | 223 |
| 2 | 13/1/01 | 2 | Black | 1 | Disk | 1 | X | 101 |
| 2 | 13/1/01 | 2 | Black | 2 | CD | 1 | Y | 223 |
| 2 | 13/1/01 | 2 | Black | 3 | Mouse | 1 | X | 101 |
| 3 | 13/1/01 | 1 | Jones | 3 | Mouse | 1 | X | 101 |

Page 13:

Normalise it to 1NF

The un-normalised table from Page 12 is split in two, with Ord# carried over as a foreign key (fk):

| Ord# | Date | Cust# | Name |
|---|---|---|---|
| 1 | 12/1/01 | 1 | Jones |
| 2 | 13/1/01 | 2 | Black |
| 3 | 13/1/01 | 1 | Jones |

| Ord# (fk) | Prod# | Desc | Qty | Supplier | Tel |
|---|---|---|---|---|---|
| 1 | 1 | Disk | 3 | X | 101 |
| 1 | 2 | CD | 5 | Y | 223 |
| 2 | 1 | Disk | 1 | X | 101 |
| 2 | 2 | CD | 1 | Y | 223 |
| 2 | 3 | Mouse | 1 | X | 101 |
| 3 | 3 | Mouse | 1 | X | 101 |

Page 14:

Now we normalise this to 2NF, remembering to test on the PK for any partial dependency

Already in 2NF (its PK, Ord#, is not composite):

| Ord# | Date | Cust# | Name |
|---|---|---|---|
| 1 | 12/1/01 | 1 | Jones |
| 2 | 13/1/01 | 2 | Black |
| 3 | 13/1/01 | 1 | Jones |

To be tested:

| Ord# | Prod# | Desc | Qty | Supplier | Tel |
|---|---|---|---|---|---|
| 1 | 1 | Disk | 3 | X | 101 |
| 1 | 2 | CD | 5 | Y | 223 |
| 2 | 1 | Disk | 1 | X | 101 |
| 2 | 2 | CD | 1 | Y | 223 |
| 2 | 3 | Mouse | 1 | X | 101 |
| 3 | 3 | Mouse | 1 | X | 101 |

Result:

| Prod# | Desc | Supplier | Tel |
|---|---|---|---|
| 1 | Disk | X | 101 |
| 2 | CD | Y | 223 |
| 3 | Mouse | X | 101 |

| Ord# (fk) | Prod# (fk) | Qty |
|---|---|---|
| 1 | 1 | 3 |
| 1 | 2 | 5 |
| 2 | 1 | 1 |
| 2 | 2 | 1 |
| 2 | 3 | 1 |
| 3 | 3 | 1 |

Page 15:

So, any transitive dependency?

| Ord# | Date | Cust# | Name |
|---|---|---|---|
| 1 | 12/1/01 | 1 | Jones |
| 2 | 13/1/01 | 2 | Black |
| 3 | 13/1/01 | 1 | Jones |

| Prod# | Desc | Supplier | Tel |
|---|---|---|---|
| 1 | Disk | X | 101 |
| 2 | CD | Y | 223 |
| 3 | Mouse | X | 101 |

| Ord# (fk) | Prod# (fk) | Qty |
|---|---|---|
| 1 | 1 | 3 |
| 1 | 2 | 5 |
| 2 | 1 | 1 |
| 2 | 2 | 1 |
| 2 | 3 | 1 |
| 3 | 3 | 1 |

Page 16:

Yes! But not in all …………….

The order table (Ord#, Date, Cust#, Name) is split into:

| Ord# | Date | Cust# (fk) |
|---|---|---|
| 1 | 12/1/01 | 1 |
| 2 | 13/1/01 | 2 |
| 3 | 13/1/01 | 1 |

| Cust# | Name |
|---|---|
| 1 | Jones |
| 2 | Black |

The product table (Prod#, Desc, Supplier, Tel) is split into:

| Prod# | Desc | Supplier (fk) |
|---|---|---|
| 1 | Disk | X |
| 2 | CD | Y |
| 3 | Mouse | X |

| Supplier | Tel |
|---|---|
| X | 101 |
| Y | 223 |

The order-line table is OK!

| Ord# | Prod# | Qty |
|---|---|---|
| 1 | 1 | 3 |
| 1 | 2 | 5 |
| 2 | 1 | 1 |
| 2 | 2 | 1 |
| 2 | 3 | 1 |
| 3 | 3 | 1 |

Page 17:

Final Decomposition

| Ord# {fk} | Prod# {fk} | Qty |
|---|---|---|
| 1 | 1 | 3 |
| 1 | 2 | 5 |
| 2 | 1 | 1 |
| 2 | 2 | 1 |
| 2 | 3 | 1 |
| 3 | 3 | 1 |

| Ord# | Date | Cust# (fk) |
|---|---|---|
| 1 | 12/1/01 | 1 |
| 2 | 13/1/01 | 2 |
| 3 | 13/1/01 | 1 |

| Cust# | Name |
|---|---|
| 1 | Jones |
| 2 | Black |

| Prod# | Desc | Supplier (fk) |
|---|---|---|
| 1 | Disk | X |
| 2 | CD | Y |
| 3 | Mouse | X |

| Supplier | Tel |
|---|---|
| X | 101 |
| Y | 223 |

Now in 3NF
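
To check that nothing was lost along the way, the five 3NF tables can be loaded into a database and joined back together. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table and column names (`orders`, `order_line`, `descr`, `supplier_code` and so on) are hypothetical choices for this example, picked partly to avoid SQL reserved words such as ORDER and DESC.

```python
import sqlite3

SCHEMA = """
CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE supplier (supplier_code TEXT PRIMARY KEY, tel TEXT);
CREATE TABLE orders (ord_no INTEGER PRIMARY KEY, date TEXT,
                     cust_no INTEGER REFERENCES customer(cust_no));
CREATE TABLE product (prod_no INTEGER PRIMARY KEY, descr TEXT,
                      supplier_code TEXT REFERENCES supplier(supplier_code));
CREATE TABLE order_line (ord_no INTEGER REFERENCES orders(ord_no),
                         prod_no INTEGER REFERENCES product(prod_no),
                         qty INTEGER,
                         PRIMARY KEY (ord_no, prod_no));
"""

con = sqlite3.connect(":memory:")
con.executescript(SCHEMA)

# Rows copied from the final decomposition on the slide.
con.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Jones"), (2, "Black")])
con.executemany("INSERT INTO supplier VALUES (?, ?)", [("X", "101"), ("Y", "223")])
con.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, "12/1/01", 1), (2, "13/1/01", 2), (3, "13/1/01", 1)])
con.executemany("INSERT INTO product VALUES (?, ?, ?)",
                [(1, "Disk", "X"), (2, "CD", "Y"), (3, "Mouse", "X")])
con.executemany("INSERT INTO order_line VALUES (?, ?, ?)",
                [(1, 1, 3), (1, 2, 5), (2, 1, 1), (2, 2, 1), (2, 3, 1), (3, 3, 1)])

# Joining along the foreign keys rebuilds the original wide table.
rebuilt = con.execute("""
    SELECT o.ord_no, o.date, c.cust_no, c.name,
           p.prod_no, p.descr, l.qty, s.supplier_code, s.tel
    FROM order_line AS l
    JOIN orders   AS o ON o.ord_no = l.ord_no
    JOIN customer AS c ON c.cust_no = o.cust_no
    JOIN product  AS p ON p.prod_no = l.prod_no
    JOIN supplier AS s ON s.supplier_code = p.supplier_code
    ORDER BY o.ord_no, p.prod_no
""").fetchall()

for row in rebuilt:
    print(row)
```

The join returns the same six rows as the un-normalised table, while each customer name and supplier telephone number is now stored exactly once.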

Page 18:

The underlying E-R Model …..

(The un-normalised table from Page 12 is shown again.)

[E-R diagram: entities Customer, Order, Product, Supplier; relationships makes, has, despatches; multiplicities shown: 0..*, 1..1, 0..*, 0..*, 1..1, 1..*]

How many tables would you get from mapping?

Page 19:

So Normalisation to 3NF is Normal!!

Remember, 2NF and 3NF disallow partial and transitive dependencies on the PK respectively; relations that keep such dependencies are open to update anomalies

But ….. even in 3NF, a relation may on rare occasions still be open to update anomalies, because some redundancy can remain

So we look briefly at these
– Boyce-Codd
– 4NF

Page 20:

Boyce-Codd NF

Is a stronger normal form than 3NF

Definition: A relation is in BCNF if and only if every determinant is a candidate key

And remember that a candidate key is any key that could become the PK of the relation (i.e. there may be competition for it!)

Potential to violate BCNF comes from:
– A relation containing at least 2 composite candidate keys
– Or candidate keys overlapping (i.e. they have at least one attribute in common)

Page 21:

BCNF Example

Consider the candidate keys for:

| clientNo | interviewDate | interviewTime | staffNo | roomNo |
|---|---|---|---|---|
| CR76 | 13/5/08 | 10.30 | SG5 | G101 |
| CR56 | 13/5/08 | 12.00 | SG5 | G101 |
| CR74 | 13/5/08 | 12.00 | SG37 | G102 |
| CR56 | 1/7/08 | 10.30 | SG5 | G102 |

Adapted from Connolly and Begg, 2005, 4th ed. Page 420

FD1 {PK}: clientNo, interviewDate → interviewTime, staffNo, roomNo
FD2 {CK}: staffNo, interviewDate, interviewTime → clientNo
FD3 {CK}: roomNo, interviewDate, interviewTime → staffNo, clientNo
FD4: staffNo, interviewDate → roomNo

PK is primary key and CK is candidate key.
But what about FD4? Its determinant (staffNo, interviewDate) is not a CK
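
The BCNF condition can be tested directly on FD1 to FD4: a determinant qualifies only if it determines every attribute of the relation. The sketch below is illustrative; `closure` is a hypothetical helper, and it tests whether each determinant is a superkey, which is the usual formal way of checking that "every determinant is a candidate key".

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


ATTRS = {"clientNo", "interviewDate", "interviewTime", "staffNo", "roomNo"}
FDS = {  # the four dependencies listed on the slide
    "FD1": (("clientNo", "interviewDate"), ("interviewTime", "staffNo", "roomNo")),
    "FD2": (("staffNo", "interviewDate", "interviewTime"), ("clientNo",)),
    "FD3": (("roomNo", "interviewDate", "interviewTime"), ("staffNo", "clientNo")),
    "FD4": (("staffNo", "interviewDate"), ("roomNo",)),
}

all_fds = list(FDS.values())
# A dependency violates BCNF when its determinant does not reach every attribute.
bcnf_violations = [name for name, (lhs, _) in FDS.items()
                   if closure(lhs, all_fds) != ATTRS]
print(bcnf_violations)  # ['FD4']
```

The closure of (staffNo, interviewDate) stops at roomNo, so it never reaches clientNo or interviewTime; that is why FD4 is the one dependency that breaks BCNF here.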

Page 22:

So new decomposition?

| clientNo | interviewDate* | interviewTime | staffNo* |
|---|---|---|---|
| CR76 | 13/5/08 | 10.30 | SG5 |
| CR56 | 13/5/08 | 12.00 | SG5 |
| CR74 | 13/5/08 | 12.00 | SG37 |
| CR56 | 1/7/08 | 10.30 | SG5 |

| interviewDate | staffNo | roomNo |
|---|---|---|
| 13/5/08 | SG5 | G101 |
| 13/5/08 | SG37 | G102 |
| 1/7/08 | SG5 | G102 |

So duplication in the room number is now eradicated

Page 23:

4NF

The need for 4NF comes from 2 multi-valued attributes in a relation

E.g. for each value of A there is a set of values for B and a set for C, while B and C remain independent of each other

Branch
– BranchNo
– staffName[1..*]
– ownerName[1..*]

So if you model your databases from ERMs this type of dependency should not arise.

Page 24:

Example of 4NF

| branchNo | staffName | ownerName |
|---|---|---|
| C003 | Anne | Carol |
| C003 | David | Carol |
| C003 | Anne | Tina |
| C003 | David | Tina |

| branchNo* | staffName |
|---|---|
| C003 | Anne |
| C003 | David |

| branchNo* | ownerName |
|---|---|
| C003 | Carol |
| C003 | Tina |

Note: if step 9 is applied to multi-valued attributes then we map this correctly and avoid such redundancy, as the two smaller tables would be the result of the mapping!

Adapted from Connolly and Begg, 2005, 4th ed. Page 428
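
The independence of the two multi-valued attributes can be checked by projecting the three-column table onto its two halves and joining them back on branchNo. This is a small sketch using Python sets; the variable names are hypothetical.

```python
# The three-column relation from the slide.
ROWS = {
    ("C003", "Anne", "Carol"),
    ("C003", "David", "Carol"),
    ("C003", "Anne", "Tina"),
    ("C003", "David", "Tina"),
}

# Project onto (branchNo, staffName) and (branchNo, ownerName).
branch_staff = {(branch, staff) for branch, staff, _ in ROWS}
branch_owner = {(branch, owner) for branch, _, owner in ROWS}

# Natural join of the two projections on branchNo.
rejoined = {
    (branch, staff, owner)
    for branch, staff in branch_staff
    for other_branch, owner in branch_owner
    if branch == other_branch
}

print(sorted(branch_staff))
print(sorted(branch_owner))
print(rejoined == ROWS)
```

The join of the two projections gives back exactly the original four rows, so the wide table holds nothing beyond what the two smaller tables already record; every staff and owner pairing in it is redundant.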

Page 25:

Normal Form Summary

A relation’s degree of normalisation gets stronger at each stage
– less vulnerable to update anomalies

First Normal Form (1NF)
– The relation has no non-atomic values
– Or the relation has “no repeating group”

2nd Normal Form (2NF)
– The relation has no partial dependencies
– All non-key attributes are fully functionally dependent on the PK

3rd Normal Form (3NF)
– The relation has no transitive dependencies

Boyce-Codd
– Every determinant is a candidate key

4NF
– no non-trivial multi-valued dependencies