Database Lecture Notes Normalization 2 – How to Normalize Dr. Meg Murray [email protected]

Lec 8 Normalization 2

Nov 30, 2015

Page 1: Lec 8 Normalization 2

Database Lecture Notes

Normalization 2 – How to Normalize

Dr. Meg Murray

[email protected]

Page 2: Lec 8 Normalization 2

Normalization: Why?

• Not all relations are equal

• Tables that are not normalized experience issues known as modification problems
  – Insertion problems
    • Difficulties inserting data into a relation
  – Modification problems
    • Difficulties modifying data in a relation
  – Deletion problems
    • Difficulties deleting data from a relation

Page 3: Lec 8 Normalization 2

Deletion Anomaly

• If you delete any row, you delete information about both the machine and the repair

KROENKE and AUER - DATABASE CONCEPTS (3rd Edition)© 2008 Pearson Prentice Hall

Page 4: Lec 8 Normalization 2

Modification Anomalies

• The EQUIPMENT_REPAIR table before and after an incorrect update operation on AcquisitionCost for Type = Drill Press:
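A minimal sketch of how such an update goes wrong, assuming a few made-up rows (the values below are illustrative and are not taken from the textbook figure). Because the cost of an item is stored once per repair, an update that touches only one row leaves the table contradicting itself:

```python
# Illustrative rows only; the values are hypothetical.
equipment_repair = [
    {"ItemNumber": 100, "Type": "Drill Press", "AcquisitionCost": 3500, "RepairNumber": 2000},
    {"ItemNumber": 100, "Type": "Drill Press", "AcquisitionCost": 3500, "RepairNumber": 2200},
    {"ItemNumber": 200, "Type": "Lathe", "AcquisitionCost": 4750, "RepairNumber": 2100},
]

# An update that changes only one of the two rows describing item 100.
equipment_repair[0]["AcquisitionCost"] = 5500


def costs_by_item(rows):
    """Collect the distinct AcquisitionCost values recorded for each item."""
    costs = {}
    for row in rows:
        costs.setdefault(row["ItemNumber"], set()).add(row["AcquisitionCost"])
    return costs


# Items whose rows now disagree about the acquisition cost.
inconsistent = {
    item: values
    for item, values in costs_by_item(equipment_repair).items()
    if len(values) > 1
}
print(inconsistent)  # item 100 now carries two different costs
```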


Page 5: Lec 8 Normalization 2

Normalization

• Normalization is a process of analyzing a relation to ensure that it is well formed

• More specifically, if a relation is normalized (well formed), rows can be inserted, deleted, or modified without creating update anomalies


Page 6: Lec 8 Normalization 2

Normalization Review: Solving Modification Problems

• Most modification problems are solved by breaking an existing table into two or more tables through a process known as normalization

• So the question….


Page 7: Lec 8 Normalization 2

How Many Tables?

Should we store these two tables as they are, or should we combine them into one table in our new database?

Page 8: Lec 8 Normalization 2

Normal Forms

• Relations are categorized into normal forms based on the modification anomalies or other problems to which they are subject.

Page 9: Lec 8 Normalization 2

Normal Forms

• 1NF – A table that qualifies as a relation is in 1NF

• 2NF – A relation is in 2NF if all of its nonkey attributes are dependent on all of the primary key [focus is on composite primary keys]

• 3NF – A relation is in 3NF if it is in 2NF and has no determinants except the primary key

• Boyce-Codd Normal Form (BCNF) – A relation is in BCNF if every determinant is a candidate key


Page 10: Lec 8 Normalization 2

BCNF

• Boyce-Codd Normal Form (BCNF) – A relation is in BCNF if every determinant is a candidate key

“I swear to construct my tables so that all nonkey columns are dependent on the key, the whole key and nothing but the key, so help me Codd.”

Page 11: Lec 8 Normalization 2

Normalization Review: Definition Review

• Determinant
  – The attribute that can be used to find the value of another attribute in the relation
  – The left-hand side of a functional dependency

StudentID --> (StudentName, DormName, DormRoom)
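One way to make the definition concrete is to test a proposed dependency against sample rows. The sketch below is only an illustration: the function name `fd_holds` and the student data are hypothetical. Note that sample data can refute a dependency but can never prove it, since a dependency is a rule about all possible rows.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if rows that agree on lhs always agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data.
students = [
    {"StudentID": 1, "StudentName": "Ann", "DormName": "North", "DormRoom": 101},
    {"StudentID": 2, "StudentName": "Bob", "DormName": "North", "DormRoom": 102},
    {"StudentID": 3, "StudentName": "Ann", "DormName": "South", "DormRoom": 101},
]

# StudentID is a determinant of the other three attributes in this sample.
print(fd_holds(students, ["StudentID"], ["StudentName", "DormName", "DormRoom"]))  # True

# StudentName is not a determinant of DormName: two students named Ann differ.
print(fd_holds(students, ["StudentName"], ["DormName"]))  # False
```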

Page 12: Lec 8 Normalization 2

Normalization Review: Definition Review II

• Candidate key
  – The value of a candidate key can be used to find the value of every other attribute in the table
  – A simple candidate key consists of only one attribute
  – A composite candidate key consists of more than one attribute
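Given a list of functional dependencies, candidate keys can be found mechanically: compute which attributes a set determines (its closure) and keep the smallest sets that determine everything. The sketch below assumes the student relation from the determinant slide; the helper names `closure` and `candidate_keys` are hypothetical, and the brute-force search is only practical for small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure covers the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


attributes = {"StudentID", "StudentName", "DormName", "DormRoom"}
fds = [({"StudentID"}, {"StudentName", "DormName", "DormRoom"})]

print(candidate_keys(attributes, fds))  # [{'StudentID'}]
```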

Page 13: Lec 8 Normalization 2

The CUSTOMER Table

CUSTOMER (CustomerNumber, CustomerName, StreetAddress, City, State, ZIP, ContactName, Phone)

• What is the primary key?
• What are the candidate keys?
• What are the nonkey attributes?

Page 14: Lec 8 Normalization 2

Simple Examples

• Remember the question:
  – Is every determinant a candidate key?
  or
  – Are all nonkey columns dependent on the key, the whole key and nothing but the key?

Page 15: Lec 8 Normalization 2

Normalization Example

(StudentID) --> (StudentName, DormName, DormCost)

Is every determinant a candidate key? Are all nonkey columns dependent on the key, the whole key and nothing but the key?

• What are the determinants?
• Does StudentID determine StudentName?
• Does StudentID determine DormName?
• Does StudentID determine DormCost?
  • Probably not; more likely DormName does
  • If so, DormName is a determinant of DormCost
• Is StudentID a candidate key?
• Is DormName a candidate key?

Page 16: Lec 8 Normalization 2

Normalization Example

(StudentID) --> (StudentName, DormName, DormCost)

However, if…

(DormName) --> (DormCost)

Then DormCost should be placed into its own relation, resulting in the relations:

(StudentID) --> (StudentName, DormName)

(DormName) --> (DormCost)
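The split can be sketched as two projections of the original table. The rows and the `project` helper below are hypothetical; the point is that each dorm cost is stored once after the split, and joining the two relations on DormName rebuilds the original rows without loss.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    projected = []
    for row in rows:
        picked = {a: row[a] for a in attrs}
        if picked not in projected:
            projected.append(picked)
    return projected


# Hypothetical sample data for the unnormalized relation.
student_dorm = [
    {"StudentID": 1, "StudentName": "Ann", "DormName": "North", "DormCost": 3000},
    {"StudentID": 2, "StudentName": "Bob", "DormName": "North", "DormCost": 3000},
    {"StudentID": 3, "StudentName": "Cy", "DormName": "South", "DormCost": 3500},
]

student = project(student_dorm, ["StudentID", "StudentName", "DormName"])
dorm = project(student_dorm, ["DormName", "DormCost"])

# Joining on DormName reconstructs the original rows.
rejoined = [
    {**s, **d}
    for s in student
    for d in dorm
    if s["DormName"] == d["DormName"]
]

print(len(dorm))                 # 2: each dorm cost is now stored once
print(rejoined == student_dorm)  # True
```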

Page 17: Lec 8 Normalization 2

Normalization Example

ATTORNEY (AttorneyID, ClientID, ClientName, MeetingDate, Duration)

(AttorneyID, ClientID) --> (ClientName, MeetingDate, Duration)

However, if…

(ClientID) --> (ClientName)

Then…

Page 18: Lec 8 Normalization 2

Then ClientName should be placed into its own relation, resulting in the relations:

(ClientID) --> (ClientName)

(AttorneyID, ClientID) --> (MeetingDate, Duration)

SCHEDULE (AttorneyID, ClientID, MeetingDate, Duration)

CLIENT (ClientID, ClientName)
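As a rough sketch, the two relations could be declared in SQLite through Python's built-in sqlite3 module. The column types and the sample rows are assumptions; the slides give only the attribute names. With ClientName stored once in CLIENT, a name change is a single-row update that every SCHEDULE row sees through the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE CLIENT (
    ClientID   INTEGER PRIMARY KEY,
    ClientName TEXT NOT NULL
);
CREATE TABLE SCHEDULE (
    AttorneyID  INTEGER NOT NULL,
    ClientID    INTEGER NOT NULL REFERENCES CLIENT (ClientID),
    MeetingDate TEXT,
    Duration    INTEGER,
    PRIMARY KEY (AttorneyID, ClientID)
);
""")

# Hypothetical sample data: one client meeting with two attorneys.
conn.execute("INSERT INTO CLIENT VALUES (1, 'Acme Corp')")
conn.executemany(
    "INSERT INTO SCHEDULE VALUES (?, ?, ?, ?)",
    [(10, 1, "2015-11-30", 60), (11, 1, "2015-12-01", 30)],
)

# The client name lives in exactly one row, so one update is enough.
conn.execute("UPDATE CLIENT SET ClientName = 'Acme Inc' WHERE ClientID = 1")

names = [
    row[0]
    for row in conn.execute(
        "SELECT c.ClientName FROM SCHEDULE s JOIN CLIENT c ON c.ClientID = s.ClientID"
    )
]
print(names)  # ['Acme Inc', 'Acme Inc']
```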

Page 19: Lec 8 Normalization 2

Walking through the forms

Page 20: Lec 8 Normalization 2

1st Normal Form [1NF]

• Eliminate Repeating Groups
  – Eliminate duplicative columns from the same table.
    • Create separate tables for each group of related data
  – Give each table a primary key (unique identifier)

• Putting a Table into 1NF makes it a Relation
  – Do you remember the rules of a relation?

Page 21: Lec 8 Normalization 2

Is this in 1NF?

Page 22: Lec 8 Normalization 2

Characteristics of 1NF

• Characteristics
  – Table format
  – No repeating groups
  – Primary key (PK) identified

Page 23: Lec 8 Normalization 2

Steps to 1NF

1. Eliminate repeating groups.
  – Present data in a tabular format, where each cell has a single value and there are no repeating groups.
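A small sketch of this step, assuming made-up aircraft data in which each aircraft carries a nested list of pilots (the repeating group). Flattening produces one row per aircraft and pilot pair, so every cell holds a single value.

```python
# Hypothetical unnormalized data: each aircraft holds a repeating group of pilots.
unnormalized = [
    {"AIRCRAFT_NUMBER": "A1", "AIRCRAFT_NAME": "Falcon",
     "PILOTS": [{"PILOT_NUMBER": "P1", "FLYING_HOUR": 12},
                {"PILOT_NUMBER": "P2", "FLYING_HOUR": 7}]},
    {"AIRCRAFT_NUMBER": "A2", "AIRCRAFT_NAME": "Heron",
     "PILOTS": [{"PILOT_NUMBER": "P1", "FLYING_HOUR": 3}]},
]

# One output row per (aircraft, pilot) pair; aircraft data repeats on each row.
first_nf = [
    {"AIRCRAFT_NUMBER": a["AIRCRAFT_NUMBER"], "AIRCRAFT_NAME": a["AIRCRAFT_NAME"], **pilot}
    for a in unnormalized
    for pilot in a["PILOTS"]
]

for row in first_nf:
    print(row)

# The pair (AIRCRAFT_NUMBER, PILOT_NUMBER) is unique across the flattened rows.
keys = [(r["AIRCRAFT_NUMBER"], r["PILOT_NUMBER"]) for r in first_nf]
print(len(keys) == len(set(keys)))  # True
```

The repeated AIRCRAFT_NAME values in the flattened rows are exactly the redundancy that the later normal forms remove.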

Page 24: Lec 8 Normalization 2

Repeating Groups

• Do you see the repeating group in this table?


Page 25: Lec 8 Normalization 2

Steps to 1NF

2. Identify the Primary Key (PK)
  – At first glance, AIRCRAFT_NUMBER seems a good candidate for a PK, but it would not uniquely identify all of the remaining row attributes.
  – The combination of AIRCRAFT_NUMBER and PILOT_NUMBER is a PK candidate that will uniquely identify all row attributes.

Page 26: Lec 8 Normalization 2

Table in 1NF

Table with primary key identified [attributes listed vertically]

Page 27: Lec 8 Normalization 2

Identify Dependencies

• AIRCRAFT_NUMBER, PILOT_NUMBER --> AIRCRAFT_NAME, PILOT_NAME, MISSION_CLASS, FLYING_HOUR, COST_HOUR
  – Primary Key (PK) dependency. The PK is also a composite key.

• AIRCRAFT_NUMBER --> AIRCRAFT_NAME
  – Partial dependency ... aircraft name is only dependent on a part of the composite AIRCRAFT_NUMBER, PILOT_NUMBER key.

Page 28: Lec 8 Normalization 2

Identify Dependencies

• PILOT_NUMBER --> PILOT_NAME
  – Partial dependency ... pilot name is only dependent on a part of the composite AIRCRAFT_NUMBER, PILOT_NUMBER key.

• PILOT_NUMBER --> PILOT_NAME, MISSION_CLASS, COST_HOUR
  – Partial dependencies

• MISSION_CLASS --> COST_HOUR
  – Transitive dependency ... COST_HOUR, a non-prime/non-key attribute, is dependent on MISSION_CLASS, another non-prime/non-key attribute
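The distinction between these dependency types comes down to how the determinant relates to the primary key. The sketch below is a simplified classifier under that reading; the function name `classify` and its labels are hypothetical, and placing FLYING_HOUR on the full key follows the FLIGHT table used later in these notes.

```python
def classify(determinant, primary_key):
    """Label a dependency by how its determinant relates to the primary key."""
    lhs, key = set(determinant), set(primary_key)
    if lhs == key:
        return "full"        # depends on the whole key
    if lhs < key:
        return "partial"     # depends on only part of a composite key (2NF issue)
    if not lhs & key:
        return "transitive"  # a nonkey attribute determines another (3NF issue)
    return "mixed"           # overlaps the key without lying inside it


primary_key = ("AIRCRAFT_NUMBER", "PILOT_NUMBER")
dependencies = [
    (("AIRCRAFT_NUMBER", "PILOT_NUMBER"), ("FLYING_HOUR",)),
    (("AIRCRAFT_NUMBER",), ("AIRCRAFT_NAME",)),
    (("PILOT_NUMBER",), ("PILOT_NAME",)),
    (("MISSION_CLASS",), ("COST_HOUR",)),
]

labels = {lhs: classify(lhs, primary_key) for lhs, _ in dependencies}
for lhs, rhs in dependencies:
    print(", ".join(lhs), "-->", ", ".join(rhs), ":", labels[lhs])
```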

Page 29: Lec 8 Normalization 2

Helpful to Create Dependency Diagram

Page 30: Lec 8 Normalization 2

2nd Normal Form [2NF]

• Characteristics
  – For tables with composite keys
  – 1NF
  – No partial dependencies

• In other words, a non-key field must provide a fact about the whole key - not just one part of the key

Page 31: Lec 8 Normalization 2

2nd Normal Form [2NF]

• If an attribute depends on only part of a composite key, remove it to a separate table.
  – These often map to components [themes, entities…]

• Create relationships between these new tables and their predecessors through the use of foreign keys.

http://databases.about.com/od/specificproducts/a/2nf.htm

Page 32: Lec 8 Normalization 2

Look for Partial Dependencies

• Ask questions such as:
  • Are both Aircraft_Number and Pilot_Number needed to determine Pilot_Name?
  • Are both Aircraft_Number and Pilot_Number needed to determine Mission_Class?
  • …

Page 33: Lec 8 Normalization 2

Look for Partial Dependencies

• Move partial dependencies to their own tables

AIRCRAFT (Aircraft_Number)

PILOT (Pilot_Number)

FLYING HOURS (Aircraft_Number, Pilot_Number)

Page 34: Lec 8 Normalization 2

Steps to 2NF

Notice how moving the partial dependencies separates the table into its key components

AIRCRAFT

PILOT

FLYING HOURS

Page 35: Lec 8 Normalization 2

Steps to 2NF

• Assign dependent attributes to each key component
  – AIRCRAFT ( AIRCRAFT_NUMBER, AIRCRAFT_NAME )
  – PILOT ( PILOT_NUMBER, PILOT_NAME, MISSION_CLASS, COST_HOUR )
  – FLIGHT ( AIRCRAFT_NUMBER, PILOT_NUMBER, FLYING_HOUR )

Page 36: Lec 8 Normalization 2

Draw New Dependency Diagram

Page 37: Lec 8 Normalization 2

3rd Normal Form [3NF]

• Eliminate Columns Not Dependent On Key: if attributes do not contribute to a description of the key, remove them to a separate table.
  – Third normal form is violated when a non-key field is a fact about another non-key field [transitive dependency]

Page 38: Lec 8 Normalization 2

3rd Normal Form

• Characteristics
  – 2NF
  – No transitive dependencies
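For the air pilot example, removing the transitive dependency MISSION_CLASS --> COST_HOUR from PILOT could give a schema like the sketch below, written for SQLite through Python's sqlite3 module. The table name MISSION, the column types, and the sample rows are assumptions and do not come from the slides. The hourly cost is stored once per mission class and reached through joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AIRCRAFT (AIRCRAFT_NUMBER TEXT PRIMARY KEY, AIRCRAFT_NAME TEXT);
CREATE TABLE MISSION  (MISSION_CLASS TEXT PRIMARY KEY, COST_HOUR REAL);
CREATE TABLE PILOT (
    PILOT_NUMBER  TEXT PRIMARY KEY,
    PILOT_NAME    TEXT,
    MISSION_CLASS TEXT REFERENCES MISSION (MISSION_CLASS)
);
CREATE TABLE FLIGHT (
    AIRCRAFT_NUMBER TEXT REFERENCES AIRCRAFT (AIRCRAFT_NUMBER),
    PILOT_NUMBER    TEXT REFERENCES PILOT (PILOT_NUMBER),
    FLYING_HOUR     REAL,
    PRIMARY KEY (AIRCRAFT_NUMBER, PILOT_NUMBER)
);
-- Hypothetical sample rows.
INSERT INTO AIRCRAFT VALUES ('A1', 'Falcon');
INSERT INTO MISSION  VALUES ('Recon', 100.0);
INSERT INTO PILOT    VALUES ('P1', 'Lee', 'Recon');
INSERT INTO FLIGHT   VALUES ('A1', 'P1', 2.5);
""")

# Changing the hourly cost of a mission class touches exactly one row.
conn.execute("UPDATE MISSION SET COST_HOUR = 120.0 WHERE MISSION_CLASS = 'Recon'")

# The cost of a flight is derived through the joins instead of being stored.
cost = conn.execute("""
    SELECT f.FLYING_HOUR * m.COST_HOUR
    FROM FLIGHT f
    JOIN PILOT p   ON p.PILOT_NUMBER = f.PILOT_NUMBER
    JOIN MISSION m ON m.MISSION_CLASS = p.MISSION_CLASS
""").fetchone()[0]
print(cost)  # 300.0
```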

Page 39: Lec 8 Normalization 2

3rd Normal Form


Page 40: Lec 8 Normalization 2

Back to the ERD


Page 41: Lec 8 Normalization 2

Is the Air Pilot Example in BCNF?

• Are all determinants candidate keys?

• Often when in 3NF, also in BCNF

Page 42: Lec 8 Normalization 2

Steps to BCNF


Page 43: Lec 8 Normalization 2

Another Example

Page 44: Lec 8 Normalization 2

Putting a Relation into BCNF: EQUIPMENT_REPAIR

Page 45: Lec 8 Normalization 2

Identify Functional Dependencies

EQUIPMENT_REPAIR (ItemNumber, Type, AcquisitionCost, RepairNumber, RepairDate, RepairAmount)

FD:

ItemNumber --> (Type, AcquisitionCost)

RepairNumber --> (ItemNumber, Type, AcquisitionCost, RepairDate, RepairAmount)

Is there a determinant that is not a candidate key?
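The question can be answered mechanically from the two dependencies listed on this slide. The sketch below uses hypothetical helper names (`closure`, `bcnf_violations`) and reports every determinant that does not determine the whole relation; for the nontrivial dependencies here, that is the standard BCNF test.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds, a list of (lhs, rhs) tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Determinants whose closure does not cover the whole relation."""
    return [lhs for lhs, _ in fds if closure(lhs, fds) != set(attributes)]


attributes = {
    "ItemNumber", "Type", "AcquisitionCost",
    "RepairNumber", "RepairDate", "RepairAmount",
}
fds = [
    (("ItemNumber",), ("Type", "AcquisitionCost")),
    (("RepairNumber",),
     ("ItemNumber", "Type", "AcquisitionCost", "RepairDate", "RepairAmount")),
]

print(bcnf_violations(attributes, fds))  # [('ItemNumber',)]
```

ItemNumber determines only Type and AcquisitionCost, so it is a determinant without being a candidate key, while RepairNumber determines every attribute.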

Page 46: Lec 8 Normalization 2

Put into Tables

• ItemNumber is a determinant but not a candidate key, so
  – Move it and its attributes to a new table
    • ITEM (ItemNumber, Type, AcquisitionCost)
  – The determinant becomes the primary key
    • ITEM (ItemNumber, Type, AcquisitionCost)
  – Leave a foreign key in the original table
    • REPAIR (ItemNumber, RepairNumber, RepairDate, RepairAmount)

Page 47: Lec 8 Normalization 2

Putting a Relation into BCNF: New Relations

Page 48: Lec 8 Normalization 2

Putting a Relation into BCNF: SKU_DATA

Page 49: Lec 8 Normalization 2

Putting a Relation into BCNF: SKU_DATA

SKU_DATA (SKU, SKU_Description, Department, Buyer)

SKU --> (SKU_Description, Department, Buyer)

SKU_Description --> (SKU, Department, Buyer)

Buyer --> Department

SKU_DATA (SKU, SKU_Description, Buyer)

BUYER (Buyer, Department)

Where SKU_DATA.Buyer must exist in BUYER.Buyer
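A referential integrity constraint of this kind is usually declared as a foreign key from the referencing table to the table that holds the buyer rows. The sketch below uses SQLite through Python's sqlite3 module; the column types, the UNIQUE constraint on SKU_Description (reflecting that it is also a candidate key), and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE BUYER (
    Buyer      TEXT PRIMARY KEY,
    Department TEXT NOT NULL
);
CREATE TABLE SKU_DATA (
    SKU             INTEGER PRIMARY KEY,
    SKU_Description TEXT NOT NULL UNIQUE,
    Buyer           TEXT NOT NULL REFERENCES BUYER (Buyer)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO BUYER VALUES ('Pat', 'Camping')")
conn.execute("INSERT INTO SKU_DATA VALUES (1, 'Tent', 'Pat')")

# A SKU row naming a buyer with no BUYER row is rejected.
try:
    conn.execute("INSERT INTO SKU_DATA VALUES (2, 'Stove', 'Sam')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)  # True
```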

Page 50: Lec 8 Normalization 2

Putting a Relation into BCNF: New Relations

Page 51: Lec 8 Normalization 2

Multivalued Dependencies

• A multivalued dependency occurs when a determinant determines a particular set of values:

Employee -->> Degree

Employee -->> Sibling

PartKit -->> Part

• The determinant of a multivalued dependency can never be a primary key

Page 52: Lec 8 Normalization 2

Multivalued Dependencies


Page 53: Lec 8 Normalization 2

Eliminating Anomalies from Multivalued Dependencies

• Multivalued dependencies are not a problem if they are in a separate relation, so:
  – Always put multivalued dependencies into their own relation
  – This is known as Fourth Normal Form (4NF)
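A short sketch of the split, assuming a made-up table that stores an employee's degrees and siblings together. Because the two facts are independent, the combined table must hold every degree and sibling combination; putting each multivalued dependency in its own relation stores each fact once, and joining the two relations recovers the combined rows.

```python
# Hypothetical combined table: (Employee, Degree, Sibling).
employee_degree_sibling = [
    ("Chau", "BS", "Eileen"),
    ("Chau", "BS", "Jonathan"),
    ("Chau", "MS", "Eileen"),
    ("Chau", "MS", "Jonathan"),
    ("Green", "BS", "Nikki"),
]

# One relation per multivalued dependency.
employee_degree = sorted({(e, d) for e, d, _ in employee_degree_sibling})
employee_sibling = sorted({(e, s) for e, _, s in employee_degree_sibling})

# Joining the two relations on Employee rebuilds the combined table.
rejoined = {
    (e, d, s)
    for e, d in employee_degree
    for e2, s in employee_sibling
    if e == e2
}

print(employee_degree)   # three (Employee, Degree) rows
print(employee_sibling)  # three (Employee, Sibling) rows
print(rejoined == set(employee_degree_sibling))  # True
```

In the combined table, recording a new degree for Chau would require one new row per sibling; after the split it is a single row in the degree relation.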

Page 54: Lec 8 Normalization 2

References

• Example for AirPilot:
  – http://dotnet.org.za/willy/archive/2008/04/10/taking-a-step-back-database-normalisation-1nf-2nf-3nf-bcnf-and-4nf-part-1.aspx

• Good reference
  – http://www.bkent.net/Doc/simple5.htm