Monash University 2004 1 Summer Semester 2004/2005 Lecture 7 Data Modelling and Normalisation IMS1907 Database Systems

Dec 22, 2015


Monash University 2004

Summer Semester 2004/2005

Lecture 7

Data Modelling and Normalisation

IMS1907 Database Systems

Transforming ER Models to Relations

Transforming an ER diagram into normalised relations, and then merging them into one final, consolidated set of relations, can be accomplished in four steps

1. Represent entities as relations
2. Represent relationships as relations
3. Normalise each relation
4. Merge the relations with a common PK

Representing Entities as Relations

Each entity in the ER diagram is transformed into a relation

[ER diagram: entity CUSTOMER]

Customer (Customer-no, Name, Address, City, State, Postcode, Discount)

Multi-valued Attributes as Relations

The multi-valued attribute becomes a relation with a composite key consisting of the primary key of the entity and the multi-valued attribute itself

[ER diagram: entity EMPLOYEE with identifier Emp_ID and multi-valued attribute Skill]

Employee (Emp_ID, Name, Address)

Employee Skill (Emp_ID, Skill)

Representing Weak Entities as Relations

The weak entity becomes a relation with a composite key consisting of the primary key of the base entity and the weak entity’s partial identifier

[ER diagram: entity MOVIE and weak entity VIDEOTAPE COPY]

Movie (Movie-ID, Title, Release-date)

Videotape Copy (Movie-ID, Copy-No, Condition)
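
The weak-entity rule can be tried out with Python's built-in sqlite3 module. This is only a sketch: the snake_case table and column names, the sample rows and the helper `try_insert` are assumptions made for illustration, not part of the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked to

conn.executescript("""
CREATE TABLE movie (
    movie_id     INTEGER PRIMARY KEY,
    title        TEXT NOT NULL,
    release_date TEXT
);
-- Weak entity: identified by the owner's key plus its partial identifier.
CREATE TABLE videotape_copy (
    movie_id       INTEGER NOT NULL REFERENCES movie(movie_id),
    copy_no        INTEGER NOT NULL,
    tape_condition TEXT,
    PRIMARY KEY (movie_id, copy_no)
);
""")

conn.execute("INSERT INTO movie VALUES (1, 'Example Movie', '2004-01-01')")
conn.execute("INSERT INTO videotape_copy VALUES (1, 1, 'good')")
conn.execute("INSERT INTO videotape_copy VALUES (1, 2, 'worn')")

def try_insert(row):
    """Return True if the copy row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO videotape_copy VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_ok = try_insert((1, 2, "good"))  # same (movie_id, copy_no) pair again
orphan_ok = try_insert((99, 1, "good"))    # copy of a movie that does not exist
copy_count = conn.execute("SELECT COUNT(*) FROM videotape_copy").fetchone()[0]
```

Both extra inserts are rejected: the composite key stops a second copy 2 of movie 1, and the foreign key stops a copy from existing without its movie.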

Representing Relationships as Relations

Binary Relationships (1:N, 1:1)

For 1:N, add the primary key of the entity on the ‘one’ side of the relationship as a foreign key in the relation that is on the ‘many’ side

For a 1:1 relationship involving entities A and B, choose from
- add the primary key of A as a foreign key of B
- add the primary key of B as a foreign key of A
- both of the above – at a cost!
- use business rules and good sense to guide the choice

Representing Relationships as Relations

Binary Relationships (1:N)

[ER diagram: CUSTOMER places ORDER]

Customer (Customer-no, Name, Address, City, State, Postcode, Discount)

Order (Order-no, Order-date, Promised-date, Customer-no)
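
A minimal sqlite3 sketch of the 1:N rule, with the customer's key carried as a foreign key on the order side. The table name `customer_order` (used because ORDER is a reserved word in SQL), the trimmed attribute lists and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_no INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
-- The 'many' side carries the key of the 'one' side as a foreign key.
CREATE TABLE customer_order (
    order_no    INTEGER PRIMARY KEY,
    order_date  TEXT,
    customer_no INTEGER NOT NULL REFERENCES customer(customer_no)
);
INSERT INTO customer VALUES (1, 'Alpha Pty Ltd'), (2, 'Beta Pty Ltd');
INSERT INTO customer_order VALUES
    (100, '2004-12-01', 1),
    (101, '2004-12-02', 1),
    (102, '2004-12-03', 2);
""")

# One customer row can match many order rows; each order matches one customer.
orders_per_customer = conn.execute("""
    SELECT c.name, COUNT(o.order_no)
    FROM customer AS c
    LEFT JOIN customer_order AS o ON o.customer_no = c.customer_no
    GROUP BY c.customer_no
    ORDER BY c.customer_no
""").fetchall()
```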

Representing Relationships as Relations

Binary Relationships (1:1)

[ER diagram: NURSE manages CARE CENTRE]

Nurse (Nurse-no, Name, Address, Qualification)

Care Centre (Care-Centre-No, Location, Type, Nurse-no)

Representing Relationships as Relations

Binary and Higher Degree Relationships (M:N)

If we wish to know the quantity of a product on an order, this quantity is an attribute of the relationship ‘ordered on’

[ER diagram: PRODUCT ordered on ORDER, with relationship attribute Qty_ord]

Representing Relationships as Relations

For M:N, first create a relation for each of the entity types, then create a relation for the relationship, with a composite primary key formed from the primary keys of the participating entity types

[Diagram: PRODUCT, ORDER-LINE and ORDER]

Order (Order-no, Order-date, Promised-date)
Order Line (Order-no, Product-no, Quantity-ordered)
Product (Product-no, description, (other attributes))
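
The M:N rule might look like this in sqlite3. It is a sketch under assumptions: the table names, the sample products and orders, and the helper `quantity_on_order` are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_no INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE customer_order (order_no INTEGER PRIMARY KEY, order_date TEXT);
-- The relationship becomes its own relation; its key combines both entity keys.
CREATE TABLE order_line (
    order_no         INTEGER NOT NULL REFERENCES customer_order(order_no),
    product_no       INTEGER NOT NULL REFERENCES product(product_no),
    quantity_ordered INTEGER NOT NULL,
    PRIMARY KEY (order_no, product_no)
);
INSERT INTO product VALUES (1, 'widget'), (2, 'gadget');
INSERT INTO customer_order VALUES (500, '2004-12-01'), (501, '2004-12-02');
INSERT INTO order_line VALUES (500, 1, 3), (500, 2, 1), (501, 1, 7);
""")

def quantity_on_order(order_no, product_no):
    """Look up the relationship attribute for one (order, product) pair."""
    row = conn.execute(
        "SELECT quantity_ordered FROM order_line WHERE order_no = ? AND product_no = ?",
        (order_no, product_no),
    ).fetchone()
    return row[0] if row else None

total_widgets = conn.execute(
    "SELECT SUM(quantity_ordered) FROM order_line WHERE product_no = 1"
).fetchone()[0]
```

The quantity lives in `order_line` because it describes neither the product nor the order alone, only the pairing of the two.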

Representing Relationships as Relations

Unary or Recursive Relationships (M:N)

[ER diagram: ITEM is part of ITEM]

Item (Item-no, Name, Cost)

Item-Bill (Item-no, Component-no, Quantity)

Representing Relationships as Relations

Unary or Recursive Relationships (1:N)

[ER diagram: EMPLOYEE supervises EMPLOYEE]

Employee (Emp-id, Name, Birthdate, Manager-id)
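
A recursive 1:N relationship ends up as a foreign key that points back at the same relation. The sqlite3 sketch below assumes made-up employee rows and leaves out Birthdate for brevity; a self-join then lists each employee with their manager.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    emp_id     INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    manager_id INTEGER REFERENCES employee(emp_id)  -- NULL at the top of the hierarchy
);
INSERT INTO employee VALUES (1, 'Ann', NULL), (2, 'Bob', 1), (3, 'Cat', 1), (4, 'Dan', 2);
""")

# Join the relation to itself: e is the employee, m is that employee's manager.
pairs = conn.execute("""
    SELECT e.name, m.name
    FROM employee AS e
    JOIN employee AS m ON m.emp_id = e.manager_id
    ORDER BY e.emp_id
""").fetchall()
```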

Representing Relationships as Relations

Ternary or Higher Relationships

[ER diagram: PATIENT, DOCTOR and TREATMENT joined by the relationship treats]

Patient (Patient-id, Patient-name)
Doctor (Doctor-id, Doctor-name)
Treatment (Treatment-code, Description)
Patient-Treatment (Patient-id, Doctor-id, Treatment-code, Date, Time, Result)

Representing Relationships as Relations

Supertype-subtype relationships

[Diagram: supertype PROPERTY with subtypes BEACH PROPERTY and MOUNTAIN PROPERTY]

Property (Property-ID, Address, City, Postcode, No-rooms, Rent)
Beach (Property-ID, Nearest-beach, Distance-to-beach)
Mountain (Property-ID, Skiing)

Normalisation

Normalisation is a process for converting complex data structures into simple, stable data structures in the form of relations

Data models consisting of normalised relations
- are robust, stable, flexible and have minimum redundancy
- simplify enforcement of referential integrity constraints
- make data maintenance easier
- are an improved representation of real world objects

Normalisation

Normalisation ensures that each attribute is attached to the appropriate relation

Each attribute is contained in the relation which represents the real world system object or concept that the attribute describes or is a property of
- the attribute Student-name should be in the relation STUDENT, which represents the real world object “student” of interest to a student records system

Normalisation

Originally developed as part of relational database theory

Accomplished in stages, each of which corresponds to a “normal form”

Codd defined first, second and third normal forms
- third normal form is adequate for most business applications

Later extensions include Boyce-Codd, 4th, 5th and domain-key normal forms

Functional Dependency

Central to normalisation and well-structured relations

A functional dependency is a particular relationship between attributes in a relation

For any relation R with attributes ‘a’ and ‘b’, if each value of ‘a’ is only ever associated with one value of ‘b’, then ‘b’ is said to be functionally dependent on ‘a’

Functional Dependency

If, for every valid instance of A, the value of A uniquely determines the value of B, then B is functionally dependent on A

A identifies B

A → B

Emp# → Emp-name

Emp# → Salary
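
A functional dependency can be tested against sample rows with a few lines of Python. The function `holds` and the employee rows (including the salary figures) are assumptions for illustration. A sample can only disprove a dependency: whether A → B really holds is a business rule, not something the data proves.

```python
def holds(rows, determinant, dependent):
    """Return True if `determinant -> dependent` is not contradicted by the rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # The first value seen for a key is remembered; any later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True

employees = [
    {"Emp#": 10, "Emp-name": "Smith", "Salary": 50000},
    {"Emp#": 20, "Emp-name": "Jones", "Salary": 60000},
    {"Emp#": 25, "Emp-name": "Smith", "Salary": 55000},
]

emp_determines_name = holds(employees, ["Emp#"], ["Emp-name"])
name_determines_salary = holds(employees, ["Emp-name"], ["Salary"])
```

Here Emp# → Emp-name is consistent with the rows, while Emp-name → Salary is refuted by the two employees named Smith.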

Normalisation

Normalisation to third normal form is accomplished in three steps, each corresponding to a basic normal form

A normal form is a state of a relation that can be determined by applying simple rules concerning dependencies within that relation

Each step of the normalisation process is applied in sequence to a single relation until that relation is in third normal form

All identified relations are normalised

Steps in Normalisation

1. Unnormalised table: remove repeating groups to reach First Normal Form (1NF)
2. 1NF: remove partial dependencies to reach Second Normal Form (2NF)
3. 2NF: remove transitive dependencies to reach Third Normal Form (3NF)

First Normal Form 1NF

A relation is in first normal form if it contains no repeating data
- the value of the data at the intersection of each row and column must be single-valued

Remove any repeating groups of attributes to convert a relation to 1NF
- the key of the removed group will be a composite key

First Normal Form 1NF

- Identify primary key
- Identify repeating groups
- Remove repeating groups

Order (Order#, Date, Customer#, Name, (Item#, Desc, Qty))

becomes

Order (Order#, Date, Customer#, Name)
Order-Item (Order#, Item#, Desc, Qty)
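
Removing a repeating group can be sketched in plain Python. The function `to_1nf`, the `Items` field name and the sample orders are assumptions for illustration. Every attribute of the repeating group, including Qty, moves to Order-Item together with the parent key Order#.

```python
def to_1nf(orders):
    """Split each order into a header row and one row per repeating-group entry."""
    order_rows, order_item_rows = [], []
    for order in orders:
        order_rows.append(
            {k: order[k] for k in ("Order#", "Date", "Customer#", "Name")}
        )
        for item in order["Items"]:
            # The parent key is copied down, giving the composite key (Order#, Item#).
            order_item_rows.append({"Order#": order["Order#"], **item})
    return order_rows, order_item_rows

unnormalised = [
    {"Order#": 27, "Date": "2004-12-01", "Customer#": 7, "Name": "Lee",
     "Items": [{"Item#": 873, "Desc": "nut", "Qty": 2}]},
    {"Order#": 28, "Date": "2004-12-02", "Customer#": 9, "Name": "Kim",
     "Items": [{"Item#": 402, "Desc": "bolt", "Qty": 1},
               {"Item#": 873, "Desc": "nut", "Qty": 10}]},
]

order_rows, order_item_rows = to_1nf(unnormalised)
```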

Second Normal Form 2NF

A relation is in 2NF if
- it is in 1NF
- no part of the primary key is dependent on any other part of the primary key
- no non-key attribute is dependent on only part of the primary key

Second Normal Form 2NF

Converting a relation to 2NF involves removing partial dependencies

A partial dependency exists
- if one part of a composite primary key is dependent on another part of the key
- if one or more non-key attributes are dependent on only part of a composite primary key

Second Normal Form 2NF

Remove any partial dependencies to convert a 1NF relation to 2NF

If the primary key consists of only one attribute, then a 1NF relation is automatically in 2NF

If a 1NF relation has no non-key attributes, then as long as there are no dependencies between the parts of the primary key, it is automatically in 2NF

Second Normal Form 2NF

Remove partial dependencies

A non-key attribute cannot be identified by part of a composite key

Order-Item (Order#, Item#, Desc, Qty-ordered)

becomes

Order-Item (Order#, Item#, Qty-ordered)
Item (Item#, Desc)

Dependencies Within the Primary Key

Part of a composite primary key cannot be identified by another part of the primary key

DEPT (Dept#, Dept-name, (Emp#, Emp-name))

Removing the repeating group gives

DEPT (Dept#, Dept-name)
DEPT-EMP (Dept#, Emp#, Emp-name)

But in DEPT-EMP (Dept#, Emp#, Emp-name), Emp# identifies Dept#

Remove Dept# from the key:

EMP (Emp#, Emp-name, Dept#)

Partial Dependency Anomalies

Order-Item

Order#  Item#  Item-desc  Qty
27      873    nut        2
28      402    bolt       1
28      873    nut        10
30      495    washer     50

- UPDATE - change item-desc in many places
- DELETE - data for an item is lost when the last order for that item is deleted
- CREATE - cannot add a new item until it is ordered

Solution to Anomalies – 2NF

Order-Item

Order#  Item#  Qty
27      873    2
28      402    1
28      873    10
30      495    50

Item

Item#  Desc
873    nut
402    bolt
495    washer

- delete last order for item, but item remains
- add new item at any time
- change item description in one place only
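
The decomposition into 2NF is a pair of projections, which plain Python can show using the four Order-Item rows from the anomaly example. The helper `project` is an assumption for illustration. Rejoining the two results on Item# gives back the starting rows, so the split loses no information.

```python
def project(rows, attributes):
    """Project rows onto the given attributes, dropping duplicate tuples."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result

order_item_1nf = [
    {"Order#": 27, "Item#": 873, "Item-desc": "nut", "Qty": 2},
    {"Order#": 28, "Item#": 402, "Item-desc": "bolt", "Qty": 1},
    {"Order#": 28, "Item#": 873, "Item-desc": "nut", "Qty": 10},
    {"Order#": 30, "Item#": 495, "Item-desc": "washer", "Qty": 50},
]

# Item-desc depends on Item# alone, so it moves out with Item# as its key.
order_item = project(order_item_1nf, ["Order#", "Item#", "Qty"])
item = project(order_item_1nf, ["Item#", "Item-desc"])

# Rejoin on Item# to confirm that the original rows can be rebuilt.
desc_by_item = {row["Item#"]: row["Item-desc"] for row in item}
rejoined = [{**row, "Item-desc": desc_by_item[row["Item#"]]} for row in order_item]
```

After the split, "nut" is stored once in `item` rather than once per order line, which is what removes the update anomaly.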

Third Normal Form 3NF

A relation is in 3NF if it is in 2NF and no transitive dependencies exist

A transitive dependency is a functional dependency between two or more non-key attributes
- if a relation has no non-key attributes, or only one, then by definition a transitive dependency cannot exist

Remove any transitive dependencies to convert a 2NF relation to 3NF

Third Normal Form 3NF

Remove Transitive Dependencies

A non-key attribute cannot be identified by another non-key attribute

Employee (Emp#, Ename, Dept#, Dname)

becomes

Employee (Emp#, Ename, Dept#)
Department (Dept#, Dname)

(look for foreign keys and their attributes)

Transitive Dependency Anomalies

Employee

Emp#  Emp-name  Dept#  Dname
10    Smith     D5     MIS
20    Jones     D7     Finance
25    Smith     D7     Finance
30    Black     D8     Sales

- UPDATE - change dept name in many places
- DELETE - data for dept lost when last employee for that dept is deleted
- CREATE - cannot add new dept until an employee is allocated to it

Solution to Anomalies – 3NF

Employee

Emp#  Ename  Dept#
10    Smith  D5
20    Jones  D7
25    Smith  D7
30    Black  D8

Department

Dept#  Dname
D5     MIS
D7     Finance
D8     Sales

- delete last emp in dept, but dept remains
- add new dept at any time
- change dept name in one place
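
The three anomaly fixes can be replayed on the Employee rows from the anomaly example, using dictionaries keyed by primary key as stand-ins for relations. This is a sketch: the helper `dept_name_of`, the new name "Accounting" and the department D9 / Research are made-up illustrations.

```python
employee_2nf = [
    {"Emp#": 10, "Ename": "Smith", "Dept#": "D5", "Dname": "MIS"},
    {"Emp#": 20, "Ename": "Jones", "Dept#": "D7", "Dname": "Finance"},
    {"Emp#": 25, "Ename": "Smith", "Dept#": "D7", "Dname": "Finance"},
    {"Emp#": 30, "Ename": "Black", "Dept#": "D8", "Dname": "Sales"},
]

# Decompose: Employee keeps the foreign key, Department holds the dependent attribute.
employee = {r["Emp#"]: {"Ename": r["Ename"], "Dept#": r["Dept#"]} for r in employee_2nf}
department = {r["Dept#"]: {"Dname": r["Dname"]} for r in employee_2nf}

# UPDATE: renaming a department touches exactly one row.
department["D7"]["Dname"] = "Accounting"

# DELETE: removing the last employee in D5 leaves the department in place.
del employee[10]

# CREATE: a department can exist before anyone is allocated to it.
department["D9"] = {"Dname": "Research"}  # hypothetical new department

def dept_name_of(emp_no):
    """Follow the foreign key from an employee to the department name."""
    return department[employee[emp_no]["Dept#"]]["Dname"]
```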

Normalisation to 3NF

A relation is normalised to 3NF if all attributes are fully functionally dependent on the primary key
- remove repeating groups
- remove partial dependencies
- remove transitive dependencies

Normalisation - Merging Relations

During the normalisation process two or more relations with the same primary key may appear

The set of 3NF relations must not contain any duplicate data

Relations with the same primary key should be merged

Normalisation of Relations

Synonyms

Two or more attributes may have different names but the same meaning

Either adopt one of the names as a standard or choose a third name

STUDENT1 (Student-id, Name, Phone-no)
STUDENT2 (VCE-no, Name, Address)

merge into

STUDENT (Student-id, Name, Address, Phone-no)

Normalisation of Relations

Homonyms

Two or more attributes may have the same name but different meanings

To resolve the conflict, new attribute names need to be created

STUDENT1 (Student-id, Name, Address)
STUDENT2 (Student-id, Name, Phone, Address)

merge into

STUDENT (Student-id, Name, Phone, Campus-address, Permanent-address)

Normalisation of Relations

When two 3NF relations are merged, transitive dependencies may result

STUDENT1 (Student-id, Major)
STUDENT2 (Student-id, Advisor)

merge into

STUDENT (Student-id, Major, Advisor)

But Major → Advisor (dependency!), so split again:

STUDENT (Student-id, Major)
MAJOR (Major, Advisor)
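
Merging two relations on a shared primary key, then checking for a transitive dependency the merge may have introduced, could look like the Python below. The functions `merge` and `determines` and the student rows are assumptions for illustration. The sample data can only suggest that Major → Advisor; whether it really holds is a business rule.

```python
def merge(relation_a, relation_b):
    """Merge two relations keyed on the same primary key into one set of rows."""
    merged = {}
    for relation in (relation_a, relation_b):
        for key, attributes in relation.items():
            merged.setdefault(key, {}).update(attributes)
    return merged

def determines(rows, a, b):
    """True if no two rows share a value of `a` while differing on `b`."""
    seen = {}
    return all(seen.setdefault(row[a], row[b]) == row[b] for row in rows)

# Hypothetical sample data, keyed by Student-id.
student1 = {1: {"Major": "IS"}, 2: {"Major": "IS"}, 3: {"Major": "CS"}}
student2 = {1: {"Advisor": "Ng"}, 2: {"Advisor": "Ng"}, 3: {"Advisor": "Rao"}}

student = merge(student1, student2)
transitive = determines(student.values(), "Major", "Advisor")

# If the dependency is a real rule rather than a coincidence, split it out again.
major = {row["Major"]: {"Advisor": row["Advisor"]} for row in student.values()}
student_3nf = {sid: {"Major": row["Major"]} for sid, row in student.items()}
```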

Data Structure Diagrams (DSD)

A set of 3NF relations may be converted to a simple diagrammatic form to begin physical database design

The conversion is simple

1. Draw a named rectangle for each relation
2. Draw a relationship line between rectangles linked by foreign keys, with a “many” cardinality at the foreign key end of the relationship

Data Structure Diagrams (DSD)

[Data structure diagram: CUSTOMER, SALES ORDER, SALES ORDER LINE and ITEM]

CUSTOMER (Cust#, Cname, Phone number)
SALES ORDER (Sord#, Sord-date, Cust#)
SALES ORDER-ITEM (Sord#, Item#, Qty)
ITEM (Item#, Item-desc)
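
The relationship lines of a data structure diagram can be derived from the relations with a short Python sketch. It rests on one loudly stated assumption: a foreign key has the same attribute names as the primary key it refers to. The `relations` dictionary layout and the function `relationship_lines` are made up for illustration.

```python
relations = {
    "CUSTOMER":         {"key": ["Cust#"], "attrs": ["Cust#", "Cname", "Phone number"]},
    "SALES ORDER":      {"key": ["Sord#"], "attrs": ["Sord#", "Sord-date", "Cust#"]},
    "SALES ORDER-ITEM": {"key": ["Sord#", "Item#"], "attrs": ["Sord#", "Item#", "Qty"]},
    "ITEM":             {"key": ["Item#"], "attrs": ["Item#", "Item-desc"]},
}

def relationship_lines(relations):
    """Return (one_side, many_side) pairs, one per foreign key found by name matching."""
    lines = []
    for child_name, child in relations.items():
        for parent_name, parent in relations.items():
            if parent_name == child_name:
                continue
            # Assumed rule: the parent's whole key appearing in the child is a foreign key.
            if set(parent["key"]) <= set(child["attrs"]) and parent["key"] != child["key"]:
                lines.append((parent_name, child_name))
    return lines

lines = relationship_lines(relations)
```

Each pair is one line in the diagram, with the "many" end at the second relation, the one holding the foreign key.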

Data Structure Diagrams (DSD)

Eliminate redundant relationships

[Data structure diagram: TOUR, DEPARTURE and BOOKING; the direct relationship line between TOUR and BOOKING is marked redundant]

TOUR (Tourcode, ...)
DEPARTURE (Tourcode, depdate, ...)
BOOKING (Booking#, ... , Tourcode, depdate)

Detailed Data Modelling - Summary

Detailed data modelling involves
- collecting detailed attributes for each entity and relationship identified
- converting ER models to relations
- normalising the relations
- merging relations from each user viewpoint
- converting the normalised and merged relations to create a data structure diagram

References

Hoffer, J.A., George, J.F. & Valacich, J.S. (2002), Modern Systems Analysis and Design, 3rd ed., Prentice-Hall, New Jersey, Ch 10, 12.

Whitten, J.L., Bentley, L.D. & Dittman, K.C. (2001), Systems Analysis and Design Methods, 5th ed., Irwin McGraw-Hill, New York, NY, Ch 7.