NORMALIZATION Prof. Sridhar Vaithianathan
Page 1: Ism normalization pine valley 2012

NORMALIZATION

Prof. Sridhar Vaithianathan

Entities, Attributes and Relationships

- Strong Entity vs. Weak Entity (EMPLOYEE & DEPENDENT)
- Simple vs. Composite Attributes
- Single-Valued vs. Multivalued Attributes
- Stored vs. Derived Attributes
- Identifier Attribute – Primary Key
- Composite Identifier
- Foreign Key
- Subtype vs. Supertype Relationship

Properties of Relations

1. Each relation (or table) in a database has a unique name.

2. An entry at the intersection of each row and column is atomic (single-valued). There can be no multivalued attributes in a relation.

3. Each row (record) is unique; no two rows in a relation are identical.

4. Each attribute (or column) within a table has a unique name.

5. The sequence of columns/rows (left to right/top to bottom) is insignificant.

Integrity Constraints

Domain Constraints: All of the values that appear in a column of a relation must be taken from the same domain.

- A domain is the set of values that may be assigned to an attribute. A domain definition usually consists of: domain name, meaning, data type, size (length), and allowable values or range.

Entity Integrity Constraint: No primary key attribute (or component of a primary key attribute) may be null.

- Null: A value that may be assigned to an attribute when no other value applies or when the applicable value is unknown.
- Null is neither numeric zero nor a string of blanks.
- In reality, null is not a value but rather the absence of a value.

Referential Integrity Constraint: Either each foreign key value must match a primary key value in another relation, or the foreign key value must be null. (E.g., a student who has not been assigned any faculty member as mentor has a null mentor foreign key.)
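The three constraints above can be tried out with a small sketch using Python's built-in sqlite3 module. The FACULTY and STUDENT tables, their columns, and the age range are assumptions made up for illustration; they are not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema. NOT NULL is spelled out on the keys because SQLite
# otherwise tolerates NULL in a non-integer PRIMARY KEY column.
conn.executescript("""
CREATE TABLE FACULTY (
    Faculty_Id   TEXT PRIMARY KEY NOT NULL,
    Faculty_Name TEXT NOT NULL
);
CREATE TABLE STUDENT (
    Student_Id TEXT PRIMARY KEY NOT NULL,                -- entity integrity
    Age        INTEGER CHECK (Age BETWEEN 16 AND 99),    -- domain constraint (assumed range)
    Mentor_Id  TEXT REFERENCES FACULTY (Faculty_Id)      -- referential integrity, null allowed
);
""")

def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

INSERT_STUDENT = "INSERT INTO STUDENT VALUES (?, ?, ?)"
try_insert("INSERT INTO FACULTY VALUES (?, ?)", ("F1", "Rao"))

results = {
    "valid_mentor": try_insert(INSERT_STUDENT, ("S1", 20, "F1")),
    "null_mentor": try_insert(INSERT_STUDENT, ("S2", 21, None)),
    "unknown_mentor": try_insert(INSERT_STUDENT, ("S3", 22, "F9")),
    "null_key": try_insert(INSERT_STUDENT, (None, 22, None)),
    "age_out_of_domain": try_insert(INSERT_STUDENT, ("S4", 150, None)),
}
print(results)
```

Only the first two inserts should be accepted: a student whose mentor exists, and a student with no mentor yet (null foreign key).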

Logical Database Design

1. Top-down approach → E-R modeling

2. Bottom-up approach → Normalization

Databases: Relational vs. Non-Relational

What is Normalization?

It is a formal process for deciding which attributes should be grouped together in a relation.

It is a step-by-step decomposition of complex records into simpler records, thereby reducing redundancy.

Why Normalize?

Normalization reduces redundancy. Redundancy is the unnecessary repetition of data.

Redundancy can lead to:

1. Inconsistencies – Errors are more likely to occur when facts are repeated

2. Update Anomalies

- Inserting, modifying and deleting data may cause inconsistencies

- There is a high likelihood of updating or deleting data in one relation while failing to make the corresponding changes in other relations
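A modification anomaly of this kind can be shown with a few lines of Python. The order and customer values are hypothetical and serve only to illustrate what goes wrong when the same fact is stored more than once.

```python
# Hypothetical un-normalized rows: the customer's city is repeated on every order.
orders_flat = [
    {"order_id": 1, "customer": "C1", "city": "Plano"},
    {"order_id": 2, "customer": "C1", "city": "Plano"},
]

# Modification anomaly: only one of the repeated facts gets updated.
orders_flat[0]["city"] = "Dallas"
cities_seen = {row["city"] for row in orders_flat if row["customer"] == "C1"}
print(cities_seen)  # two different cities for one customer: inconsistent

# Normalized alternative: the fact "C1 is located in ..." is stored exactly once.
customers = {"C1": {"city": "Plano"}}
orders = [{"order_id": 1, "customer": "C1"}, {"order_id": 2, "customer": "C1"}]
customers["C1"]["city"] = "Dallas"  # a single update, nothing left behind
cities_after = {customers[o["customer"]]["city"] for o in orders}
print(cities_after)
```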

A fully normalized record consists of:

1. A primary key that identifies an entity

2. A set of attributes that describe the entity

Normal forms (NF) are table structures with minimum redundancy.

Functional Dependency

Normalization theory is based on the fundamental notion of functional dependency.

Given a relation R, attribute B is functionally dependent on attribute A if, for every valid instance of A, that value of A uniquely determines the value of B.

The functional dependency of B on A is represented as:

A → B

Example: Suppose the entity CUSTOMER has the following attributes: Cust_Code, Name, Address and Phone_Number. Then:

Cust_Code → Name, Address, Phone_Number
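As a rough sketch, the definition can be checked against sample rows: a dependency is violated as soon as one determinant value appears with two different dependent values. The helper name fd_holds and the CUSTOMER rows below are made up for illustration.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in the given rows."""
    seen = {}
    for row in rows:
        lhs = tuple(row[a] for a in determinant)
        rhs = tuple(row[a] for a in dependent)
        # setdefault stores rhs the first time lhs is seen, then returns the stored value
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True

# Hypothetical CUSTOMER rows
customers = [
    {"Cust_Code": "C1", "Name": "Asha", "Address": "12 Lake Rd", "Phone_Number": "555-0101"},
    {"Cust_Code": "C2", "Name": "Ravi", "Address": "12 Lake Rd", "Phone_Number": "555-0102"},
    {"Cust_Code": "C3", "Name": "Asha", "Address": "9 Hill St", "Phone_Number": "555-0103"},
]

print(fd_holds(customers, ["Cust_Code"], ["Name", "Address", "Phone_Number"]))  # True
print(fd_holds(customers, ["Name"], ["Address"]))  # False: two customers named Asha
```

A check like this can only refute a dependency. Whether a dependency really holds is a business rule about all valid instances, not something a sample can prove.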

Steps in Normalization

Unnormalized Relation → 1NF → 2NF → 3NF → Boyce-Codd NF, 4NF and 5NF

Steps in Normalization

1. 1NF: A relation is in 1NF if multivalued attributes (also called repeating groups) have been removed, so there is a single value (possibly null) at the intersection of each row and column of the table.

2. 2NF: A relation is in 2NF if it is in 1NF and contains no partial dependencies.

A partial functional dependency in a relation is a functional dependency in which one or more nonkey attributes are functionally dependent on part (but not all) of the primary key.

3. 3NF: A relation is in 3NF if it is in 2NF and no transitive dependencies exist.

A transitive dependency in a relation is a functional dependency between two (or more) nonkey attributes.

Pine Valley Furniture Company Database

Invoice Data - Pine Valley Furniture Company

1NF: A relation is in 1NF if multivalued attributes (also called repeating groups) have been removed, so there is a single value (possibly null) at the intersection of each row and column of the table.
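Removing a repeating group can be sketched as below. The invoice table from the slide image is not part of this transcript, so the records and attribute names here (Order_ID, Order_Date, Product_ID, Ordered_Quantity) are hypothetical stand-ins.

```python
# Hypothetical invoice records: each order carries a repeating group of line items.
unnormalized = [
    {"Order_ID": 1, "Order_Date": "2012-01-05",
     "Items": [("P7", 2), ("P5", 2), ("P4", 1)]},   # (Product_ID, Ordered_Quantity)
    {"Order_ID": 2, "Order_Date": "2012-01-06",
     "Items": [("P11", 4), ("P4", 3)]},
]

def to_1nf(records):
    """Emit one flat row per line item so every cell holds a single value."""
    rows = []
    for record in records:
        for product_id, quantity in record["Items"]:
            rows.append({
                "Order_ID": record["Order_ID"],
                "Order_Date": record["Order_Date"],
                "Product_ID": product_id,
                "Ordered_Quantity": quantity,
            })
    return rows

invoice_1nf = to_1nf(unnormalized)
for row in invoice_1nf:
    print(row)
```

After flattening, a row is identified by the combination (Order_ID, Product_ID), and Order_Date is repeated on every line of an order. That repetition is the redundancy the later normal forms remove.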

Functional Dependency Diagram for Invoice

A partial functional dependency in a relation is a functional dependency in which one or more nonkey attributes are functionally dependent on part (but not all) of the primary key.

Removing Partial Dependencies

2NF: A relation is in 2NF if it is in 1NF, and contains no partial dependencies.

A transitive dependency in a relation is a functional dependency between two (or more) nonkey attributes.
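Removing partial dependencies amounts to projecting the 1NF relation onto smaller relations, one for each part of the key that determines attributes on its own. The sketch below assumes a composite key (Order_ID, Product_ID) and hypothetical rows; the helper name project is also an assumption.

```python
def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out

# Hypothetical 1NF rows, key = (Order_ID, Product_ID).
# Order_Date depends only on Order_ID; Product_Description only on Product_ID.
invoice_1nf = [
    {"Order_ID": 1, "Product_ID": "P7", "Order_Date": "2012-01-05",
     "Product_Description": "Dining Table", "Ordered_Quantity": 2},
    {"Order_ID": 1, "Product_ID": "P4", "Order_Date": "2012-01-05",
     "Product_Description": "Entertainment Center", "Ordered_Quantity": 1},
    {"Order_ID": 2, "Product_ID": "P4", "Order_Date": "2012-01-06",
     "Product_Description": "Entertainment Center", "Ordered_Quantity": 3},
]

order = project(invoice_1nf, ["Order_ID", "Order_Date"])
product = project(invoice_1nf, ["Product_ID", "Product_Description"])
order_line = project(invoice_1nf, ["Order_ID", "Product_ID", "Ordered_Quantity"])

print(order)
print(product)
print(order_line)
```

Each order date and each product description is now stored once, and only Ordered_Quantity, which depends on the whole key, stays with the composite key.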

Removing Transitive Dependencies

3NF: A relation is in 3NF if it is in 2NF and no transitive dependencies exist.
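A transitive dependency can be removed the same way, by splitting off the nonkey determinant together with what it determines. In this sketch the ORDER rows, the customer names, and the assumed dependency Customer_ID → Customer_Name are illustrative only.

```python
def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out

# Hypothetical 2NF relation with key Order_ID.
# Order_ID -> Customer_ID and Customer_ID -> Customer_Name (transitive).
orders_2nf = [
    {"Order_ID": 1, "Order_Date": "2012-01-05", "Customer_ID": "C2", "Customer_Name": "Customer A"},
    {"Order_ID": 2, "Order_Date": "2012-01-06", "Customer_ID": "C6", "Customer_Name": "Customer B"},
    {"Order_ID": 3, "Order_Date": "2012-01-07", "Customer_ID": "C2", "Customer_Name": "Customer A"},
]

customer = project(orders_2nf, ["Customer_ID", "Customer_Name"])
order = project(orders_2nf, ["Order_ID", "Order_Date", "Customer_ID"])  # Customer_ID stays as foreign key

# Joining the two relations on Customer_ID gives back the original rows.
name_of = {c["Customer_ID"]: c["Customer_Name"] for c in customer}
rejoined = [{**o, "Customer_Name": name_of[o["Customer_ID"]]} for o in order]
print(rejoined == orders_2nf)  # True
```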

Relational Scheme for INVOICE data (MS Visio)

Note to Students: To draw the ER diagram for your project, try MS Visio, an easy-to-use tool for drawing ER diagrams such as the one shown on this slide.

SQL – Structured Query Language

SQL Statements

SELECT (select list)
FROM (table list)
WHERE (condition for retrieval)
ORDER BY (sort criteria)

Example:

SELECT Empno, Ename, Job, Sal
FROM EMP
WHERE Sal > 2500
ORDER BY Job, Ename

Table: EMP

Empno  Ename   Job        Sal
8756   Dravid  President  8000
5348   Raju    Manager    5000
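To see the example query run, one option is Python's built-in sqlite3 module. The column types and the third (clerk) row are assumptions added so that the WHERE clause has something to filter out; the other two rows are the EMP rows from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMP (Empno INTEGER PRIMARY KEY, Ename TEXT, Job TEXT, Sal INTEGER)"
)
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?, ?)",
    [
        (8756, "Dravid", "President", 8000),
        (5348, "Raju", "Manager", 5000),
        (7001, "Meena", "Clerk", 2000),  # hypothetical row, excluded by Sal > 2500
    ],
)

rows = conn.execute(
    """
    SELECT Empno, Ename, Job, Sal
    FROM EMP
    WHERE Sal > 2500
    ORDER BY Job, Ename
    """
).fetchall()
print(rows)
# [(5348, 'Raju', 'Manager', 5000), (8756, 'Dravid', 'President', 8000)]
```

The rows come back sorted by Job first, so the manager precedes the president even though the president was inserted first.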

Example:

SELECT "Order Number", "Unit Price" * Quantity AS Total
FROM "Order"
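A computed column like Total can be tried out as follows. Column names that contain spaces, and the table name Order (a reserved word), have to be written as quoted identifiers for the statement to parse. The column types and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Double quotes mark delimited identifiers in standard SQL.
conn.execute(
    'CREATE TABLE "Order" ("Order Number" INTEGER, "Unit Price" REAL, Quantity INTEGER)'
)
conn.executemany(
    'INSERT INTO "Order" VALUES (?, ?, ?)',
    [(1, 800.0, 2), (2, 325.0, 3)],  # hypothetical order rows
)

totals = conn.execute(
    """
    SELECT "Order Number", "Unit Price" * Quantity AS Total
    FROM "Order"
    ORDER BY "Order Number"
    """
).fetchall()
print(totals)  # [(1, 1600.0), (2, 975.0)]
```

Total is a derived value: it is calculated when the query runs and is not stored in the table.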

Normalization - Recap

SUMMARY - Normalization – Rules – 1NF to 3NF

1. 1NF: A relation is in 1NF if multivalued attributes (also called repeating groups) have been removed, so there is a single value (possibly null) at the intersection of each row and column of the table.

2. 2NF: A relation is in 2NF if it is in 1NF and contains no partial dependencies.

A partial functional dependency in a relation is a functional dependency in which one or more nonkey attributes are functionally dependent on part (but not all) of the primary key.

3. 3NF: A relation is in 3NF if it is in 2NF and no transitive dependencies exist.

A transitive dependency in a relation is a functional dependency between two (or more) nonkey attributes.

First Normal Form (1NF)

First normal form (1NF) sets the very basic rules for an organized database:

- Eliminate duplicative columns from the same table.
- Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

Second Normal Form (2NF)

Second normal form (2NF) further addresses the concept of removing duplicative data:

- Meet all the requirements of the first normal form.
- Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
- Create relationships between these new tables and their predecessors through the use of foreign keys.

Third Normal Form (3NF)

Third normal form (3NF) goes one large step further:

- Meet all the requirements of the second normal form.
- Remove columns that are not dependent upon the primary key.

1NF Eliminate Repeating Groups - Make a separate table for each set of related attributes, and give each table a primary key.

2NF Eliminate Redundant Data - If an attribute depends on only part of a composite (multi-attribute) key, remove it to a separate table.

3NF Eliminate Columns Not Dependent On Key - If attributes do not contribute to a description of the key, remove them to a separate table.

Normalization - Exercises

Normalize

Exercise 1: Emp_No, Prof_Designation, Emp_Name, Dept_Code, Dept_Name, Prof_Office, Student_Name, Student_Id, Student_DOB, Student_Age

Exercise 2: Prod No, Prod Desc, Item No, Salesperson Name, Customer Name, Quantity, Price

Normalize

Emp No, Emp Name, Dept No, Dept Name, Mgr No, Proj No, Proj Name, Start Date, Billing Rate

Normalize

Title | Author1 | Author2 | ISBN | Subject | Pages | Publisher
Database System Concepts | Abraham Silberschatz | Henry F. Korth | 0072958863 | MySQL, Computers | 1168 | McGraw-Hill
Operating System Concepts | Abraham Silberschatz | Henry F. Korth | 0471694665 | Computers | 944 | McGraw-Hill