DATA NORMALISATION Pamela Quick

Dec 26, 2015

Lionel Harris
Page 1: DATA NORMALISATION Pamela Quick. Data Normalisation 2 Objectives  Data normalisation aims to derive record structures which avoid anomalies in u Insertion.

DATA NORMALISATION

Pamela Quick

Objectives

Data normalisation aims to derive record structures which avoid anomalies in:

  Insertion

  Deletion

  Modification

  Accessing

Data normalisation ensures the single-valuedness of facts

Facts are represented in fields in keyed records

The Process of Normalisation

Usually three steps (in industry), giving rise to:

  First Normal Form (1NF)

  Second Normal Form (2NF)

  Third Normal Form (3NF)

In academia:

  Boyce-Codd Normal Form (BCNF)

  Fourth Normal Form (4NF)

At each step we consider relationships between an entity's attributes

These relationships are known as functional dependencies

Steps in Data Normalisation

UN-NORMALISED ENTITY

step 1 ... remove repeating groups

1st NORMAL FORM

step 2 ... remove partial dependencies

2nd NORMAL FORM

step 3 ... remove indirect (transitive) dependencies

3rd NORMAL FORM

step 4 ... remove multivalued dependencies

4th NORMAL FORM

step 4 ... make every determinant a candidate key

BOYCE-CODD NORMAL FORM

Attributes - Identifiers

An entity identifier uniquely determines an occurrence of the entity

A Superkey - a combination of attributes that uniquely identifies an occurrence

When more than one identifier exists we have Candidate Identifiers (Keys) - each a minimal superkey

Primary Key - the candidate key designated as the main identifier

SUPPLIER (Supplier#, Supplier-name, Supp-add)
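The difference between a superkey and a candidate key can be sketched in a few lines of Python. This is only an illustration: the SUPPLIER rows and the helper names `is_superkey` and `candidate_keys` are invented here. A sample of rows can only show which attribute sets are unique in that sample. Whether they are true identifiers depends on the meaning of the data.

```python
from itertools import combinations

# Hypothetical SUPPLIER occurrences; the values are invented for illustration.
SUPPLIER = [
    {"Supplier#": "S1", "Supplier-name": "Acme", "Supp-add": "Leeds"},
    {"Supplier#": "S2", "Supplier-name": "Bolt Co", "Supp-add": "Leeds"},
    {"Supplier#": "S3", "Supplier-name": "Crown", "Supp-add": "York"},
]


def is_superkey(rows, attrs):
    """True if no two rows share the same combination of values for attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Minimal superkeys of this sample, smallest attribute sets first."""
    names = sorted(rows[0])
    keys = []
    for size in range(1, len(names) + 1):
        for attrs in combinations(names, size):
            # A superkey that contains an already-found key is not minimal.
            if any(set(key) <= set(attrs) for key in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


print(candidate_keys(SUPPLIER))
```

In this sample both Supplier# and Supplier-name are unique, so both are candidate identifiers; designating one of them (typically Supplier#) makes it the primary key.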

Attributes - Repeating Groups

When a group of attributes has multiple values then we say there is a repeating group of attributes in the entity

COMPANY   NAME      ADDRESS       BRANCH NAME   BRANCH ADDRESS
A123      ABC Ltd   100 High St   ABC1          Manchester
                                  ABC2          London
                                  ABC3          Glasgow

(BRANCH_NAME, BRANCH_ADDRESS) is a repeating group

Functional Dependency

A -> B

PART# -> PART-DESCRIPTION

[Diagram: attributes A, B and C]

B is functionally dependent on A if a value of A uniquely determines a value of B

Functional Dependency

A -> B: B is functionally dependent on A; A determines B

All tuples that have the same value of A have the same value of B

A functional dependency is trivial if it is satisfied by every possible set of tuples, i.e. A -> A

In general, X -> Y is trivial if Y equals X or is a subset of X

FDs are said to HOLD when every possible attribute combination complies

FDs are said to be SATISFIED when all stated attribute instances comply
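Whether an FD is satisfied by a given set of rows can be tested mechanically. The following is a minimal sketch. The function names `satisfies_fd` and `is_trivial` are assumptions made for this illustration, and the sample rows reuse part numbers and descriptions from the purchase order example. Note that such a test can only show that an FD is satisfied by the stated instances. It cannot show that the FD holds for every possible instance.

```python
def satisfies_fd(rows, lhs, rhs):
    """True if rows that agree on lhs always agree on rhs (lhs -> rhs is satisfied)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault keeps the first rhs value recorded for this lhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_trivial(lhs, rhs):
    """X -> Y is trivial when Y is a subset of X (including Y equal to X)."""
    return set(rhs) <= set(lhs)


PARTS = [
    {"PART#": "1492", "PART-DESCRIPTION": "Bolt"},
    {"PART#": "3164", "PART-DESCRIPTION": "Spanner"},
    {"PART#": "1492", "PART-DESCRIPTION": "Bolt"},
]

print(satisfies_fd(PARTS, ["PART#"], ["PART-DESCRIPTION"]))
print(is_trivial(["PART#", "PART-DESCRIPTION"], ["PART#"]))
```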

More Examples of Functional Dependency

[Diagram: further functional dependencies among attributes X, Y, Z and K]

Example

Form header fields: ORDER NUMBER, SUPPLIER NUMBER, ORDER DATE, DELIVERY DATE

Values on the form: 500028 and 1023 (numbers); 09/05/88 and 25/07/88 (dates)

PART NO.   PART-DESC   QTY-ORD   PRICE
O463       Hook            150   15.00
1492       Bolt           1000   10.00
3164       Spanner          10    5.00
                         TOTAL   30.00

PURCHASE-ORDER (ORDER#, SUPPLIER#, ORDER-DATE, DELIVERY-DATE, (PART#, PART-DESCRIPTION, QUANTITY-ORDERED, PRICE), TOTAL-PRICE)

First Normal Form

An entity type is in 1NF if there are no repeating groups of attribute types

Any un-normalised entity type is transformed to 1NF

Remove all repeating attribute groups

Repeating attribute groups become new entity types in their own right

The identifier of the original entity type must be an attribute (but not necessarily an identifier) of the derived entity type.
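The steps above can be sketched in Python using the COMPANY example with its repeating group (BRANCH_NAME, BRANCH_ADDRESS). This is an illustrative sketch only: the function name `to_first_normal_form` and the dictionary layout are assumptions, not part of the original material.

```python
def to_first_normal_form(record, identifier, group):
    """Split one un-normalised record into a parent row and its child rows.

    The repeating group is removed from the parent, and the parent's
    identifier is copied into every row of the derived entity type.
    """
    parent = {name: value for name, value in record.items() if name != group}
    children = [{identifier: record[identifier], **item} for item in record[group]]
    return parent, children


# Un-normalised COMPANY occurrence: the "BRANCHES" list is the repeating group.
company = {
    "COMPANY": "A123",
    "NAME": "ABC Ltd",
    "ADDRESS": "100 High St",
    "BRANCHES": [
        {"BRANCH_NAME": "ABC1", "BRANCH_ADDRESS": "Manchester"},
        {"BRANCH_NAME": "ABC2", "BRANCH_ADDRESS": "London"},
        {"BRANCH_NAME": "ABC3", "BRANCH_ADDRESS": "Glasgow"},
    ],
}

company_1nf, branch_1nf = to_first_normal_form(company, "COMPANY", "BRANCHES")
print(company_1nf)
for row in branch_1nf:
    print(row)
```

Each derived BRANCH row carries COMPANY as an attribute, which is what allows the original record to be reassembled.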

Example of First Normal Form

PURCHASE-ORDER (ORDER#, SUPPLIER#, ORDER-DATE, DELIVERY-DATE, (PART#, PART-DESCRIPTION, QUANTITY-ORDERED, PRICE), TOTAL-PRICE)

UN-NORMALISED ENTITY TYPE

Example in 1NF

PURCHASE-ORDER (ORDER#, SUPPLIER#, ORDER-DATE, DELIVERY-DATE, TOTAL-PRICE)

PURCHASE-ITEM-1 (ORDER#, PART#, PART-DESCRIPTION, QUANTITY-ORDERED, PRICE)

[NOTE: PART# ALONE DOES NOT IDENTIFY PURCHASE-ITEM]

ENTITY TYPES IN 1NF

Example

REGISTRATION FORM

STUDENT NUMBER:   S0843215
STUDENT NAME:     P. Smith
STUDENT ADDRESS:  1, South Downs Hale

COURSE NO   COURSE      TUTOR NAME   TUTOR NO
PM951       Computing   T. Long      037428
S212        Biology     S. Short     096524

STUDENT (Student#, student-name, student-address)

ENROLMENT (Student#, Course#, course-title, tutor-name, tutor-staff#)

Benefits of First Normal Form

Any 'hidden' entities are identified

Process results in separation of different objects

BUT anomalies may still exist

PURCHASE-ITEM-1 (ORDER#, PART#, PART-DESCRIPTION, QUANTITY-ORDERED, PRICE)

PART-DESCRIPTION appears on every PURCHASE-ITEM occurrence.

This may result in anomalies when updating or deleting records

The problem in the example is that PART-DESCRIPTION is functionally dependent only on PART# (part of the identifier)
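A small sketch of the modification anomaly this causes. The order numbers "A" and "B", the second quantity and the helper name `inconsistent_parts` are hypothetical, chosen only for illustration. Because the description is stored once per purchase item, changing it on one row leaves the other copies out of date.

```python
from collections import defaultdict

# Two PURCHASE-ITEM-1 occurrences for the same part on different (hypothetical) orders.
purchase_items = [
    {"ORDER#": "A", "PART#": "1492", "PART-DESCRIPTION": "Bolt", "QUANTITY-ORDERED": 1000},
    {"ORDER#": "B", "PART#": "1492", "PART-DESCRIPTION": "Bolt", "QUANTITY-ORDERED": 200},
]


def inconsistent_parts(rows):
    """Return the PART# values that carry more than one description."""
    descriptions = defaultdict(set)
    for row in rows:
        descriptions[row["PART#"]].add(row["PART-DESCRIPTION"])
    return sorted(part for part, found in descriptions.items() if len(found) > 1)


before = inconsistent_parts(purchase_items)

# Only one copy of the description is updated: a modification anomaly.
purchase_items[0]["PART-DESCRIPTION"] = "Hex bolt"
after = inconsistent_parts(purchase_items)

print(before, after)
```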

Second Normal Form

An entity type is in 2NF if it is in 1NF and each non-identifying attribute depends upon the whole identifier

Any entity type in 1NF is transformed to 2NF

Identify functional dependencies

Re-write entity types so that each non-identifying attribute is functionally dependent on the whole of the identifier

Functional Dependencies

PURCHASE-ORDER (ORDER#, SUPPLIER#, ORDER-DATE, DELIVERY-DATE, TOTAL-PRICE)

PURCHASE-ITEM-1 (ORDER#, PART#, PART-DESCRIPTION, QUANTITY-ORDERED, PRICE)

ORDER#, PART# -> QUANTITY-ORDERED, PRICE

PART# -> PART-DESCRIPTION

In 2nd Normal Form

Decompose PURCHASE-ITEM into two entity types

PURCHASE-ITEM (Order#, Part#, Quantity-Ordered, Price)

PART (Part#, Part-Description)

Original entity type decomposed into three entity types in 2nd normal form

PURCHASE-ORDER (Order#, Supplier#, Order-Date, Delivery-Date, Total-Price)

PURCHASE-ITEM (Order#, Part#, Quantity-Ordered, Price)

PART (Part#, Part-Description)
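The decomposition can be sketched as two projections of PURCHASE-ITEM-1. This is a minimal illustration: the helper `project` and the order numbers "P1" and "P2" are assumptions, and the third row is invented so that one part appears on two orders. The part values in the first two rows come from the order form.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (first-seen order kept)."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


PURCHASE_ITEM_1 = [
    {"Order#": "P1", "Part#": "O463", "Part-Description": "Hook", "Quantity-Ordered": 150, "Price": 15.00},
    {"Order#": "P1", "Part#": "1492", "Part-Description": "Bolt", "Quantity-Ordered": 1000, "Price": 10.00},
    # Hypothetical second order for the same part.
    {"Order#": "P2", "Part#": "1492", "Part-Description": "Bolt", "Quantity-Ordered": 500, "Price": 5.00},
]

# Attributes that depend on the whole identifier (Order#, Part#) stay together.
PURCHASE_ITEM = project(PURCHASE_ITEM_1, ["Order#", "Part#", "Quantity-Ordered", "Price"])
# The attribute that depends only on Part# moves to its own entity type.
PART = project(PURCHASE_ITEM_1, ["Part#", "Part-Description"])

print(len(PURCHASE_ITEM), "purchase items;", len(PART), "parts")
```

After the split, each part description is stored once, however many orders mention the part.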

Example in 2NF

STUDENT (Student#, Student-Name, Student-Address)

ENROLMENT (Student#, Course#, Tutor-Name, Tutor-Staff#)

COURSE (Course#, Course-Title)

ENTITY TYPES IN 2NF

Third Normal Form

An entity type is in 3NF if it is in 2NF and all non-identifying attributes are independent of one another

Any entity type in 2NF is transformed to 3NF

Determine functional dependencies between non-identifying attributes

Decompose the entity into new entities

Functional Dependencies

STUDENT (Student#, Student-Name, Student-Address)

ENROLMENT (Student#, Course#, Tutor-Name, Tutor-Staff#)

COURSE (Course#, Course-Title)

Student#, Course# -> Tutor-Staff#

Tutor-Staff# -> Tutor-Name

Example in 3NF

STUDENT (Student#, Student-Name, Student-Address)

ENROLMENT (Student#, Course#, Tutor-Staff#)

COURSE (Course#, Course-Title)

TUTOR (Tutor-Staff#, Tutor-Name)

ENTITY TYPES IN 3NF
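The move from 2NF to 3NF for ENROLMENT can be sketched as follows. The helpers `satisfies_fd` and `project` are assumptions made for this illustration, and the student number S0000001 is invented so that one tutor appears on two rows. The sketch first confirms that Tutor-Staff# -> Tutor-Name is satisfied between two non-identifying attributes, then splits the entity.

```python
def satisfies_fd(rows, lhs, rhs):
    """True if rows that agree on lhs always agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


ENROLMENT_2NF = [
    {"Student#": "S0843215", "Course#": "PM951", "Tutor-Name": "T. Long", "Tutor-Staff#": "037428"},
    {"Student#": "S0843215", "Course#": "S212", "Tutor-Name": "S. Short", "Tutor-Staff#": "096524"},
    # Hypothetical second student on the same course with the same tutor.
    {"Student#": "S0000001", "Course#": "PM951", "Tutor-Name": "T. Long", "Tutor-Staff#": "037428"},
]

# Tutor-Name depends on Tutor-Staff#, which is itself a non-identifying attribute.
transitive = satisfies_fd(ENROLMENT_2NF, ["Tutor-Staff#"], ["Tutor-Name"])

ENROLMENT = project(ENROLMENT_2NF, ["Student#", "Course#", "Tutor-Staff#"])
TUTOR = project(ENROLMENT_2NF, ["Tutor-Staff#", "Tutor-Name"])

print(transitive, len(ENROLMENT), len(TUTOR))
```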

Boyce-Codd Normal Form (BCNF)

A relation is in BCNF if every determinant is a candidate key

For a relation with only one candidate key, 3NF and BCNF are equivalent

Violation of BCNF is rare; it may occur in a relation that:

  contains two (or more) composite candidate keys, and

  whose candidate keys overlap, that is, share at least one attribute in common

BCNF

CLIENT_INTERVIEW

Client_no   Interview_date   Interview_time   Staff_no   Room_no
CR76        13-May-95        10.30            SG5        G101
CR56        13-May-95        12.00            SG5        G101
CR74        13-May-95        12.00            SG37       G102
CR56        10-Jun-95        10.00            SG5        G102

The following FDs hold:

Client_no, Interview_date -> Interview_time, Staff_no, Room_no

Staff_no, Interview_date, Interview_time -> Client_no

Staff_no, Interview_date -> Room_no

(Client_no, Interview_date) and (Staff_no, Interview_date, Interview_time) are composite candidate keys that share the common attribute Interview_date

BCNF

The relation CLIENT_INTERVIEW is in 3NF but not BCNF

To transform to BCNF, remove the violating FD (Staff_no, Interview_date -> Room_no) and create two relations:

INTERVIEW (Client_no, Interview_date, Interview_time, Staff_no)

STAFF_ROOM (Staff_no, Interview_date, Room_no)
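The BCNF test can be sketched against the CLIENT_INTERVIEW rows. The helper names (`is_superkey`, `satisfies_fd`, `violates_bcnf`, `project`) are assumptions made for this illustration, and the checks only describe this sample of rows. An FD violates BCNF when it is non-trivial and its determinant does not uniquely identify a row.

```python
def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    return len({tuple(r[a] for a in attrs) for r in rows}) == len(rows)


def satisfies_fd(rows, lhs, rhs):
    """True if rows that agree on lhs always agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def violates_bcnf(rows, lhs, rhs):
    """A satisfied, non-trivial FD whose determinant is not a superkey."""
    non_trivial = not set(rhs) <= set(lhs)
    return non_trivial and satisfies_fd(rows, lhs, rhs) and not is_superkey(rows, lhs)


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


COLUMNS = ["Client_no", "Interview_date", "Interview_time", "Staff_no", "Room_no"]
CLIENT_INTERVIEW = [
    dict(zip(COLUMNS, values))
    for values in [
        ("CR76", "13-May-95", "10.30", "SG5", "G101"),
        ("CR56", "13-May-95", "12.00", "SG5", "G101"),
        ("CR74", "13-May-95", "12.00", "SG37", "G102"),
        ("CR56", "10-Jun-95", "10.00", "SG5", "G102"),
    ]
]

print(violates_bcnf(CLIENT_INTERVIEW, ["Staff_no", "Interview_date"], ["Room_no"]))

INTERVIEW = project(CLIENT_INTERVIEW, ["Client_no", "Interview_date", "Interview_time", "Staff_no"])
STAFF_ROOM = project(CLIENT_INTERVIEW, ["Staff_no", "Interview_date", "Room_no"])
print(len(INTERVIEW), len(STAFF_ROOM))
```

In STAFF_ROOM the room for a member of staff on a given day is recorded once, and (Staff_no, Interview_date) is now the identifier of that relation.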

Fourth Normal Form

An entity type is in 4NF if it is in 3NF and there are no multivalued dependencies between its attribute types

Any entity type in 3NF is transformed to 4NF

Detect any multivalued dependencies

Decompose entity type

Multivalued Dependencies - 1

AUTHOR_NO   BOOK_NO   SUBJECT     BOOK_TITLE   AUTHOR_NAME
A1          B1        Comp. Sc.   Methods      Jones
A1          B1        Maths       Methods      Jones
A2          B1        Comp. Sc.   Methods      Smith
A2          B1        Maths       Methods      Smith
A3          B2        Maths       Calculus     Brown

IN 3rd NORMAL FORM

AUTHOR (Author_no, Author_name)

BOOK (Book_no, Book_title)

AUTHOR-BOOK-SUBJECT (Author_no, Book_no, Subject)

[Dependency diagram: author_no, book_no, subject, author_name, book_title]

Multivalued Dependencies - 2

Example models that "each AUTHOR is associated with all the SUBJECTS under which the BOOK is classified"

The attribute SUBJECT contains redundant values. If SUBJECT were deleted from rows 1 & 2 the values could be deduced from rows 3 & 4

Anomaly because the same set of SUBJECTs is associated with each AUTHOR of the same BOOK

BOOK_NO multidetermines AUTHOR_NO (BOOK_NO ->> AUTHOR_NO)

BOOK_NO multidetermines SUBJECT (BOOK_NO ->> SUBJECT)

AUTHOR_NO   BOOK_NO   SUBJECT
A1          B1        Comp. Sc.
A1          B1        Maths
A2          B1        Comp. Sc.
A2          B1        Maths
A3          B2        Maths

Fourth Normal Form

AUTHOR (Author_no, Author_name)

BOOK (Book_no, Book_Title)

AUTHOR-BOOK (Author_no, Book_no)

BOOK-SUBJECT (Book_no, Subject)

IN 4th NORMAL FORM

AUTHOR_NO   BOOK_NO   SUBJECT
A1          B1        Comp. Sc.
A1          B1        Maths
A2          B1        Comp. Sc.
A2          B1        Maths
A3          B2        Maths

AUTHOR_NO   BOOK_NO
A1          B1
A2          B1
A3          B2

BOOK_NO   SUBJECT
B1        Comp. Sc.
B1        Maths
B2        Maths
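That the two smaller tables lose no information can be checked by joining them back together on BOOK_NO. The following is a minimal sketch using Python sets of tuples. The variable names are chosen here for illustration only.

```python
# AUTHOR-BOOK-SUBJECT rows as (author_no, book_no, subject) tuples.
AUTHOR_BOOK_SUBJECT = {
    ("A1", "B1", "Comp. Sc."),
    ("A1", "B1", "Maths"),
    ("A2", "B1", "Comp. Sc."),
    ("A2", "B1", "Maths"),
    ("A3", "B2", "Maths"),
}

# Project onto the two 4NF entity types.
AUTHOR_BOOK = {(author, book) for author, book, _ in AUTHOR_BOOK_SUBJECT}
BOOK_SUBJECT = {(book, subject) for _, book, subject in AUTHOR_BOOK_SUBJECT}

# Natural join on book_no: pair every author of a book with every subject of that book.
rejoined = {
    (author, book, subject)
    for author, book in AUTHOR_BOOK
    for other_book, subject in BOOK_SUBJECT
    if book == other_book
}

print(len(AUTHOR_BOOK), len(BOOK_SUBJECT), rejoined == AUTHOR_BOOK_SUBJECT)
```

The join reproduces exactly the five original rows, which is what the multivalued dependencies BOOK_NO ->> AUTHOR_NO and BOOK_NO ->> SUBJECT guarantee for this data.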

Conclusions

Data Normalisation is a bottom-up technique that ensures the basic properties of the relational model:

  no duplicate tuples

  no nested relations

Data normalisation is often used as the only technique for database design - implementation view

A more appropriate approach is to complement conceptual modelling with data normalisation