Lecture 10

Normalization

Jan 05, 2016


Susan

Normalization. Lecture 10. Introduction. - PowerPoint PPT Presentation
Transcript
Page 1: Normalization

Lecture 10

Page 2: Normalization

In the field of relational database design, normalization is a systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain undesirable characteristics—insertion, update, and deletion anomalies—that could lead to a loss of data integrity.

Page 3: Normalization

“Normalization is the process of successively reducing relations with anomalies to produce smaller, well-structured relations.”

Page 4: Normalization

Minimize data redundancy
Simplify the enforcement of referential integrity
Make it easier to maintain data (insert, update, delete)
Provide a better design that is an improved representation of the real world and a stronger basis for future growth

Page 5: Normalization

We first discuss informal guidelines for good relational design

Then we discuss formal concepts of functional dependencies and normal forms

1NF (First Normal Form)
2NF (Second Normal Form)
3NF (Third Normal Form)
BCNF (Boyce-Codd Normal Form)

Page 6: Normalization

GUIDELINE 1: Informally, each tuple in a relation should represent one entity.

Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be mixed in the same relation

Only foreign keys should be used to refer to other entities

Entity and relationship attributes should be kept apart as much as possible.

Bottom Line: Design a schema that can be explained easily relation by relation. The attributes should be easy to interpret.

Page 7: Normalization
Page 8: Normalization
Page 9: Normalization

Insertion anomalies
Deletion anomalies
Modification anomalies (update anomalies)

Page 10: Normalization

Consider the relation: EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

Insert Anomaly:

Cannot insert a project unless an employee is assigned to it.

Conversely, cannot insert an employee unless he/she is assigned to a project.
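
The insert anomaly can be reproduced with a small in-memory SQLite table. This is only a sketch: it assumes the primary key of EMP_PROJ is (Emp#, Proj#), renames the attributes to emp_no and proj_no because '#' is awkward in SQL identifiers, and uses made-up sample values (P2, "Payroll", E9, "Sana").

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: (emp_no, proj_no) is the primary key, so neither part may be NULL.
conn.execute("""
    CREATE TABLE emp_proj (
        emp_no   TEXT NOT NULL,
        proj_no  TEXT NOT NULL,
        ename    TEXT,
        pname    TEXT,
        no_hours REAL,
        PRIMARY KEY (emp_no, proj_no)
    )
""")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO emp_proj VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# A project with no employee yet: emp_no would have to be NULL.
project_only = try_insert((None, "P2", None, "Payroll", None))
# An employee with no project yet: proj_no would have to be NULL.
employee_only = try_insert(("E9", None, "Sana", None, None))
# A complete assignment is accepted.
assignment = try_insert(("E1", "P1", "Ali", "Billing", 10))

print(project_only, employee_only, assignment)
```

Both partial rows are rejected because part of the key is missing, which is exactly why a project or an employee cannot be recorded on its own in this design.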

Page 11: Normalization

Consider the relation: EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

Delete Anomaly:

When a project is deleted, it will result in deleting all the employees who work on that project.

Alternatively, if an employee is the sole employee on a project, deleting that employee would result in deleting the corresponding project.
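
A minimal sketch of both directions of the delete anomaly, again using an in-memory SQLite table. The three sample rows and the renamed columns (emp_no, proj_no) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE emp_proj (
        emp_no TEXT NOT NULL, proj_no TEXT NOT NULL,
        ename TEXT, pname TEXT, no_hours REAL,
        PRIMARY KEY (emp_no, proj_no)
    )
""")
conn.executemany("INSERT INTO emp_proj VALUES (?, ?, ?, ?, ?)", [
    ("E1", "P1", "Ali",   "Billing", 10),
    ("E2", "P1", "Daud",  "Billing", 20),
    ("E3", "P2", "Zahid", "Payroll", 15),   # E3 is the sole employee on P2
])

def distinct(column):
    """Return the set of values currently stored in one column of emp_proj."""
    # column is only ever one of the fixed strings used below, never user input.
    query = f"SELECT DISTINCT {column} FROM emp_proj"
    return {row[0] for row in conn.execute(query)}

# Deleting the sole employee on P2 also erases the fact that P2 exists.
conn.execute("DELETE FROM emp_proj WHERE emp_no = 'E3'")
projects_left = distinct("pname")

# Deleting project P1 also erases every employee who worked only on P1.
conn.execute("DELETE FROM emp_proj WHERE proj_no = 'P1'")
employees_left = distinct("ename")

print(projects_left, employees_left)
```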

Page 12: Normalization

Consider the relation: EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

Update Anomaly:

Changing the name of project number P1 from “Billing” to “Customer-Accounting” requires the update to be made in the rows of all 100 employees working on project P1. If any row is missed, the relation holds two different names for the same project.
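
The cost of the update anomaly can be made concrete by counting the rows each design has to touch. The sketch below assumes a separate PROJECT(proj_no, pname) relation as the alternative design; that relation and the generated employee names are illustrative, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp_proj (emp_no TEXT, proj_no TEXT, ename TEXT, pname TEXT, no_hours REAL)"
)
# Hypothetical alternative: the project name is stored once, keyed by proj_no.
conn.execute("CREATE TABLE project (proj_no TEXT PRIMARY KEY, pname TEXT)")

# 100 employees all working on project P1.
rows = [(f"E{i}", "P1", f"Employee {i}", "Billing", 10) for i in range(1, 101)]
conn.executemany("INSERT INTO emp_proj VALUES (?, ?, ?, ?, ?)", rows)
conn.execute("INSERT INTO project VALUES ('P1', 'Billing')")

rename = "UPDATE {table} SET pname = 'Customer-Accounting' WHERE proj_no = 'P1'"
rows_touched_wide = conn.execute(rename.format(table="emp_proj")).rowcount
rows_touched_narrow = conn.execute(rename.format(table="project")).rowcount

print(rows_touched_wide, rows_touched_narrow)
```

In the single wide relation the rename touches one row per assignment, while in the separate relation it touches one row in total.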

Page 13: Normalization
Page 14: Normalization
Page 15: Normalization

Design a schema that does not suffer from the insertion, deletion and update anomalies.

If there are any anomalies present, then note them so that applications can be made to take them into account.

Page 16: Normalization

Relations should be designed such that their tuples will have as few NULL values as possible

Attributes that are NULL frequently could be placed in separate relations (with the primary key)

Reasons for nulls:
Attribute not applicable or invalid
Attribute value unknown (may exist)
Value known to exist, but unavailable
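
One way to read the advice about frequently NULL attributes is sketched below. The tables employee and employee_office, the office_no attribute, and the sample rows are all hypothetical; the point is only that the optional attribute moves to its own relation together with the primary key, so no NULL is ever stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn TEXT PRIMARY KEY, ename TEXT NOT NULL);
    -- Only employees who actually have an office get a row here.
    CREATE TABLE employee_office (
        ssn       TEXT PRIMARY KEY REFERENCES employee(ssn),
        office_no TEXT NOT NULL
    );
    INSERT INTO employee VALUES ('111', 'Ali'), ('222', 'Daud'), ('333', 'Zahid');
    INSERT INTO employee_office VALUES ('222', 'B-14');
""")

# A LEFT JOIN produces the NULLs only when a query asks for the combined view.
combined = conn.execute("""
    SELECT e.ssn, e.ename, o.office_no
    FROM employee e LEFT JOIN employee_office o ON o.ssn = e.ssn
    ORDER BY e.ssn
""").fetchall()

stored_office_rows = conn.execute("SELECT COUNT(*) FROM employee_office").fetchone()[0]
print(combined, stored_office_rows)
```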

Page 17: Normalization

A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y

Written as X -> Y

Page 18: Normalization

Social security number determines employee name: SSN -> ENAME

Project number determines project name and location: PNUMBER -> {PNAME, PLOCATION}

Employee SSN and project number determine the hours per week that the employee works on the project: {SSN, PNUMBER} -> HOURS
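
The definition can be tested against sample data: X -> Y fails as soon as two rows agree on X but differ on Y. The helper name holds and the three sample rows below are assumptions for illustration. Note that a sample can only refute a dependency; that it holds in these rows does not prove it holds for the schema in general.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs while differing on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Made-up sample rows over the attributes used in the examples above.
works_on = [
    {"SSN": "111", "ENAME": "Ali",  "PNUMBER": 1, "PNAME": "Billing", "PLOCATION": "LHR", "HOURS": 10},
    {"SSN": "111", "ENAME": "Ali",  "PNUMBER": 2, "PNAME": "Payroll", "PLOCATION": "ISD", "HOURS": 5},
    {"SSN": "222", "ENAME": "Daud", "PNUMBER": 1, "PNAME": "Billing", "PLOCATION": "LHR", "HOURS": 20},
]

print(holds(works_on, ["SSN"], ["ENAME"]))
print(holds(works_on, ["PNUMBER"], ["PNAME", "PLOCATION"]))
print(holds(works_on, ["SSN", "PNUMBER"], ["HOURS"]))
print(holds(works_on, ["SSN"], ["HOURS"]))
```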

Page 19: Normalization
Page 20: Normalization

Partial functional dependency: if A and B are sets of attributes of a table, B is partially dependent on A if some attribute can be removed from A and the dependency still holds. For example, consider the following functional dependency in the Tbl_Staff table: {StaffID, Name} -> BranchID. BranchID is only partially dependent on {StaffID, Name}, because it is functionally dependent on a proper subset of it, namely StaffID.

Page 21: Normalization

A transitive dependency is an indirect functional dependency, one in which X→Z only by virtue of X→Y and Y→Z.

Page 22: Normalization
Page 23: Normalization

Steps and Methods

Page 24: Normalization

Unnormalized – There are multivalued attributes or repeating groups
1NF – No multivalued attributes or repeating groups
2NF – 1NF plus no partial dependencies
3NF – 2NF plus no transitive dependencies

Page 25: Normalization

First normal form (1NF) disallows multivalued attributes.

It is considered to be part of the definition of a relation.

Page 26: Normalization
Page 27: Normalization

Definitions

Prime attribute: an attribute that is a member of the primary key K

Full functional dependency: an FD Y -> Z where removal of any attribute from Y means the FD no longer holds

A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on the primary key

R can be decomposed into 2NF relations via the process of 2NF normalization
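
A minimal sketch of 2NF normalization by projection. It assumes a relation EMP_PROJ(SSN, PNUMBER, HOURS, ENAME, PNAME, PLOCATION) with primary key {SSN, PNUMBER}; the rows, the helper name projection, and the names of the resulting relations are made up for illustration.

```python
# Assumed rows of EMP_PROJ(SSN, PNUMBER, HOURS, ENAME, PNAME, PLOCATION).
emp_proj = [
    ("111", 1, 10, "Ali",  "Billing", "LHR"),
    ("111", 2,  5, "Ali",  "Payroll", "ISD"),
    ("222", 1, 20, "Daud", "Billing", "LHR"),
]

def projection(rows, *positions):
    """Relational projection: keep the given columns and drop duplicate rows."""
    return sorted({tuple(row[p] for p in positions) for row in rows})

# Each non-prime attribute moves to the relation whose key fully determines it.
works_on = projection(emp_proj, 0, 1, 2)      # {SSN, PNUMBER} -> HOURS
employee = projection(emp_proj, 0, 3)         # SSN -> ENAME
project_rel = projection(emp_proj, 1, 4, 5)   # PNUMBER -> {PNAME, PLOCATION}

# Joining the projections back on SSN and PNUMBER recovers the original rows.
names = dict(employee)
projects = {row[0]: row[1:] for row in project_rel}
rejoined = sorted((s, p, h, names[s]) + projects[p] for s, p, h in works_on)

print(employee)
print(project_rel)
print(rejoined == sorted(emp_proj))
```

Each employee name and each project description is now stored once, and the join shows that no information was lost by the split.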

Page 28: Normalization

Examples:

{SSN, PNUMBER} -> HOURS is a full FD since neither SSN -> HOURS nor PNUMBER -> HOURS holds

{SSN, PNUMBER} -> ENAME is not a full FD (it is called a partial dependency) since SSN -> ENAME also holds
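
The two examples can be checked mechanically by trying every proper subset of the left-hand side. This is a sketch over made-up sample rows; the helper names holds and partial_determinants are assumptions, and a result on sample data only illustrates the definition.

```python
from itertools import combinations

def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs while differing on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def partial_determinants(rows, lhs, rhs):
    """Return the proper, non-empty subsets of lhs that still determine rhs."""
    found = []
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, subset, rhs):
                found.append(subset)
    return found

# Made-up sample rows.
works_on = [
    {"SSN": "111", "ENAME": "Ali",  "PNUMBER": 1, "HOURS": 10},
    {"SSN": "111", "ENAME": "Ali",  "PNUMBER": 2, "HOURS": 5},
    {"SSN": "222", "ENAME": "Daud", "PNUMBER": 1, "HOURS": 20},
]

key = ("SSN", "PNUMBER")
print(partial_determinants(works_on, key, ("HOURS",)))   # empty list: full FD
print(partial_determinants(works_on, key, ("ENAME",)))   # SSN alone suffices
```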

Page 29: Normalization
Page 30: Normalization

A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute A in R is transitively dependent on the primary key

Disallows transitive functional dependencies

Transitive functional dependency: an FD X -> Z that can be derived from two FDs X -> Y and Y -> Z

Page 31: Normalization

SSN -> DMGRSSN is a transitive FD, since SSN -> DNUMBER and DNUMBER -> DMGRSSN hold

SSN -> ENAME is non-transitive, since there is no set of attributes X where SSN -> X and X -> ENAME
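
Derived dependencies such as SSN -> DMGRSSN can be found with the standard attribute-closure computation, which is not shown on the slides and is sketched here only as an illustration. The function name closure and the encoding of FDs as pairs of tuples are assumptions; the three FDs are the ones named in the examples above.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is covered and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [
    (("SSN",), ("ENAME",)),
    (("SSN",), ("DNUMBER",)),
    (("DNUMBER",), ("DMGRSSN",)),
]

# SSN reaches DMGRSSN only through DNUMBER, which makes SSN -> DMGRSSN transitive.
print(closure({"SSN"}, fds))
print(closure({"DNUMBER"}, fds))
```

To reach 3NF under these FDs, DMGRSSN would move to a relation keyed by DNUMBER, leaving DNUMBER behind as a foreign key.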

Page 32: Normalization
Page 33: Normalization

Accountant No. | Skill No. | Skill Category | Proficiency | Accountant Name | Accountant Age | Group No. | Group City | Group Supervisor
21 | 113 | System | 3 | Ali | 55 | 52 | ISD | Baber
35 | 113, 179, 204 | System, Tax, Audit | 5, 1, 6 | Daud | 32 | 44 | LHR | Ghafoor
50 | 179 | Tax | 2 | Chohan | 40 | 44 | LHR | Ghafoor
77 | 148 | Consulting, Tax | 6, 6 | Zahid | 52 | 52 | ISD | Baber
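
The repeating groups in this table can be removed by emitting one flat row per skill, which is the step from the unnormalized form to 1NF. The sketch below uses the row for accountant 35 and assumes that skill numbers, categories, and proficiencies pair up by position (113 with System and 5, and so on); the function name to_1nf and the field names are illustrative.

```python
# Accountant 35 from the table, with the repeating group kept as a list.
# Assumption: skill no., category, and proficiency pair up by position.
accountant_35 = {
    "accountant_no": 35,
    "skills": [(113, "System", 5), (179, "Tax", 1), (204, "Audit", 6)],
    "name": "Daud",
    "age": 32,
    "group_no": 44,
    "group_city": "LHR",
    "group_supervisor": "Ghafoor",
}

def to_1nf(record):
    """Emit one flat row per (skill no., skill category, proficiency) entry."""
    rest = {k: v for k, v in record.items() if k != "skills"}
    return [
        {**rest, "skill_no": no, "skill_category": category, "proficiency": level}
        for no, category, level in record["skills"]
    ]

flat_rows = to_1nf(accountant_35)
for row in flat_rows:
    print(row["accountant_no"], row["skill_no"], row["skill_category"], row["proficiency"])
```

The flat rows repeat the accountant and group attributes once per skill, which is the redundancy that 2NF and 3NF then remove.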