Project and Data Management Software

Data Analysis and Data Modelling

Normalisation

Normalisation

Normalisation provides an algorithm for reducing complex data structures into simple structures

Formalised by a set of rules, the normal forms, first defined by E. F. Codd

Tidying up the data so there is no data redundancy

Ensuring data is grouped logically

Why Use Normalisation?

Relations formed by the process make the data easier to understand and manipulate.

Provides a stable base for future database growth.

Simplifies relations and reduces anomalies.

Stages of Normalisation

There are 3 stages:

1st Normal Form (1NF)

2nd Normal Form (2NF)

3rd Normal Form (3NF)

Boyce-Codd Normal Form (BCNF) and 4th Normal Form (4NF) also exist

First Normal Form – 1NF

For a relation to be in 1NF all its attributes must be atomic.

Each attribute must contain a single value, not a repeating group of values.

Every non-primary-key attribute must be functionally dependent on the primary key.

Un-normalised data

Course Code

Course Desc

Employee Number

Name

Block

Room No

Date Joined Course

Allocated Hours

Un-normalised data

A list of fields needed for the system

E.g. Staff Development Course

All staff are released for two hours a week for staff development.

Employees work at their own pace in a lab.

A total of six attributes are recorded about each employee, including their normal office location (block and room), the date they joined the course and how many hours it is planned for them to work on it.

First Normal Form (1NF)

An entity is in 1NF if, and only if, it has an identifying key and there are no repeating attributes or groups of attributes

To get to 1NF we must remove all repeating groups (data elements)

Our Example

COURSE (Course Code, Course Desc)

EMP_ON_COURSE (Course Code, Employee Number, Name, Block, Room No, Date Joined Course, Allocated Hours)
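
Removing the repeating group can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values, the snake_case field names and the function name to_first_normal_form are invented here and do not come from the slides. Each course record carries a nested list of employees; the sketch writes one COURSE row per course and one EMP_ON_COURSE row per employee on that course.

```python
# Hypothetical sample data: one un-normalised record per course, with a
# repeating group of employees nested inside it.
unnormalised = [
    {
        "course_code": "C1",
        "course_desc": "Spreadsheets",
        "employees": [
            {"emp_no": 1, "name": "Ann", "block": "A", "room_no": 101,
             "date_joined": "2024-01-10", "allocated_hours": 20},
            {"emp_no": 2, "name": "Bob", "block": "B", "room_no": 201,
             "date_joined": "2024-01-17", "allocated_hours": 10},
        ],
    },
    {
        "course_code": "C2",
        "course_desc": "Databases",
        "employees": [
            {"emp_no": 1, "name": "Ann", "block": "A", "room_no": 101,
             "date_joined": "2024-02-01", "allocated_hours": 15},
        ],
    },
]


def to_first_normal_form(records):
    """Return (course_rows, emp_on_course_rows) with no repeating groups."""
    course = []
    emp_on_course = []
    for record in records:
        course.append({"course_code": record["course_code"],
                       "course_desc": record["course_desc"]})
        for emp in record["employees"]:
            # Copy the course code into every employee row so that
            # (course_code, emp_no) identifies the row.
            emp_on_course.append({"course_code": record["course_code"], **emp})
    return course, emp_on_course


course, emp_on_course = to_first_normal_form(unnormalised)
```

Every value in the resulting rows is atomic, and each EMP_ON_COURSE row is identified by Course Code plus Employee Number.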

Second Normal Form (2NF)

An entity is in 2NF if, and only if, it is in 1NF and has no attributes which require only part of the key to identify them uniquely

To get to 2NF we remove part-key dependencies

All data items must be dependent on the whole primary key

Our Example

Course is already in 2NF

Emp_On_Course is not, because:

Attribute      Depends On
Name           Employee No
Block          Employee No
RoomNo         Employee No
Date Joined    Employee No + Course Code
Hours          Employee No + Course Code
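
One way to look for part-key dependencies is to test candidate dependencies against sample rows. The sketch below is an assumption-laden illustration, not a method from the slides: the helper name determines and the sample rows are hypothetical. Note that sample data can only disprove a dependency; a True result means the rows are consistent with it, not that it holds in general.

```python
def determines(rows, lhs, rhs):
    """Return True if rows with equal values in the lhs columns
    always have the same value in the rhs column."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        if key in seen and seen[key] != row[rhs]:
            return False  # same key, different value: no dependency
        seen[key] = row[rhs]
    return True


# Hypothetical EMP_ON_COURSE rows in 1NF; key is (course_code, emp_no).
emp_on_course = [
    {"course_code": "C1", "emp_no": 1, "name": "Ann", "block": "A",
     "room_no": 101, "date_joined": "2024-01-10", "allocated_hours": 20},
    {"course_code": "C1", "emp_no": 2, "name": "Bob", "block": "B",
     "room_no": 201, "date_joined": "2024-01-17", "allocated_hours": 10},
    {"course_code": "C2", "emp_no": 1, "name": "Ann", "block": "A",
     "room_no": 101, "date_joined": "2024-02-01", "allocated_hours": 15},
]

# Name needs only part of the key, so it is a part-key dependency.
name_on_emp_only = determines(emp_on_course, ["emp_no"], "name")
# Date Joined is not fixed by the employee alone ...
date_on_emp_only = determines(emp_on_course, ["emp_no"], "date_joined")
# ... but it is consistent with depending on the whole key.
date_on_whole_key = determines(
    emp_on_course, ["course_code", "emp_no"], "date_joined")
```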

So we..

Take out details that are linked only to employee into a separate table

If in any doubt, ask a question such as ‘Are these fields affected when they join a course?’

Attribute      Depends On
Name           Employee No
Block          Employee No
RoomNo         Employee No

Cont.

COURSE (Course Code, Course Desc)

EMP_ON_COURSE (Course Code, Emp No, Date Joined Course, Allocated Hours)

EMPLOYEE (Emp No, Name, Block, Room No)
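
The move to 2NF can be sketched as a small Python function. The function name to_second_normal_form, the field names and the sample rows are assumptions made for illustration. Attributes that depend only on the employee go into EMPLOYEE, stored once per employee; attributes that depend on the whole key stay in EMP_ON_COURSE.

```python
def to_second_normal_form(emp_on_course_1nf):
    """Split 1NF EMP_ON_COURSE rows into (employee_rows, emp_on_course_rows)."""
    employee = {}  # keyed by emp_no so each employee is stored once
    emp_on_course = []
    for row in emp_on_course_1nf:
        employee[row["emp_no"]] = {
            "emp_no": row["emp_no"],
            "name": row["name"],
            "block": row["block"],
            "room_no": row["room_no"],
        }
        emp_on_course.append({
            "course_code": row["course_code"],
            "emp_no": row["emp_no"],
            "date_joined": row["date_joined"],
            "allocated_hours": row["allocated_hours"],
        })
    return list(employee.values()), emp_on_course


# Hypothetical 1NF rows: employee 1 appears on two courses.
rows_1nf = [
    {"course_code": "C1", "emp_no": 1, "name": "Ann", "block": "A",
     "room_no": 101, "date_joined": "2024-01-10", "allocated_hours": 20},
    {"course_code": "C1", "emp_no": 2, "name": "Bob", "block": "B",
     "room_no": 201, "date_joined": "2024-01-17", "allocated_hours": 10},
    {"course_code": "C2", "emp_no": 1, "name": "Ann", "block": "A",
     "room_no": 101, "date_joined": "2024-02-01", "allocated_hours": 15},
]

employee, emp_on_course = to_second_normal_form(rows_1nf)
```

Ann's name and location now appear once in EMPLOYEE instead of once per course she has joined.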

Problems

Block and Room Number are related, so if one is updated the other will be affected.

If a block name changes, every employee record that stores that block will have to be altered
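
The update problem can be made concrete with a short sketch. The rows, the block names and the helper rename_block are hypothetical. With Block stored on every EMPLOYEE row, renaming one block means touching every employee located in it.

```python
# Hypothetical EMPLOYEE rows in 2NF: block is repeated on every row.
employee_2nf = [
    {"emp_no": 1, "name": "Ann", "block": "A", "room_no": 101},
    {"emp_no": 2, "name": "Bob", "block": "A", "room_no": 102},
    {"emp_no": 3, "name": "Cy", "block": "B", "room_no": 201},
]


def rename_block(rows, old, new):
    """Rename a block in place and return how many rows had to change."""
    changed = 0
    for row in rows:
        if row["block"] == old:
            row["block"] = new
            changed += 1
    return changed


rows_touched = rename_block(employee_2nf, "A", "North")
```

Here two employee rows are rewritten for a single real-world change, and missing any one of them would leave the data inconsistent.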

Third Normal Form (3NF)

An entity is in 3NF if, and only if, it is in 2NF and no non-key attribute depends on another non-key attribute.

To get to 3NF we must remove attributes that depend on other non-key attributes

It removes any mutual dependence between non-key attributes

Third Normal Form 3NF

In other words:

“The attributes in a relation in 3NF must depend on the key, the whole key and nothing but the key”!

How to do that: Dependency

Decide on the direction of the dependency between the attributes

If B determines A, then A is dependent on B

If A depends on B, create a new entity, keyed by B, with A as an attribute

Leave B in the original entity and mark it as a foreign key, but remove A from the original entity

Our Example: Dependency

If, given a value for A, there is only one possible value for B, then B is dependent on A

Therefore, given a value for Room No, there is only one value for Block. The same is not true vice versa.

Hence Block is dependent on Room No.

Leave Room No in the original entity and mark it as a foreign key, but remove Block from the original entity

Our Example

Hence the EMPLOYEE (2NF) entity becomes

EMPLOYEE (Employee No, Name, Room No *)

LOCATION (Room No, Block)

* Room No is a foreign key in the Employee entity
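
The 3NF step can be sketched in the same style. The function name to_third_normal_form and the sample rows are assumptions for illustration. Block moves to a new LOCATION entity keyed by Room No, and Room No stays in EMPLOYEE as the foreign key. The sketch also rejects data in which one room appears in two blocks, since that would contradict the dependency.

```python
def to_third_normal_form(employee_2nf):
    """Split 2NF EMPLOYEE rows into (employee_rows, location_rows)."""
    location = {}  # room_no -> block
    employee = []
    for row in employee_2nf:
        room = row["room_no"]
        if room in location and location[room] != row["block"]:
            raise ValueError(f"room {room} appears in two different blocks")
        location[room] = row["block"]
        # room_no is kept here as the foreign key; block is removed.
        employee.append({"emp_no": row["emp_no"],
                         "name": row["name"],
                         "room_no": room})
    location_rows = [{"room_no": r, "block": b} for r, b in location.items()]
    return employee, location_rows


# Hypothetical 2NF rows: two employees share room 101.
rows_2nf = [
    {"emp_no": 1, "name": "Ann", "block": "A", "room_no": 101},
    {"emp_no": 2, "name": "Bob", "block": "A", "room_no": 101},
    {"emp_no": 3, "name": "Cy", "block": "B", "room_no": 201},
]

employee, location = to_third_normal_form(rows_2nf)
```

Each room's block is now stored once, so a block rename touches LOCATION rows only.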

Entity Relationship Modelling

[Diagram: entity-relationship model with four entities: Course, Emp_On_Course, Employee, Location]
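
The four entities could be declared as tables, for example with Python's built-in sqlite3 module. This is a sketch under assumptions: the column types, the NOT NULL constraints, the sample rows and the foreign keys from Emp_On_Course to Course and Employee are not stated on the slides; only the Room No foreign key in Employee is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE location (
    room_no INTEGER PRIMARY KEY,
    block   TEXT NOT NULL
);
CREATE TABLE employee (
    emp_no  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    room_no INTEGER NOT NULL REFERENCES location(room_no)
);
CREATE TABLE course (
    course_code TEXT PRIMARY KEY,
    course_desc TEXT NOT NULL
);
CREATE TABLE emp_on_course (
    course_code     TEXT NOT NULL REFERENCES course(course_code),
    emp_no          INTEGER NOT NULL REFERENCES employee(emp_no),
    date_joined     TEXT NOT NULL,
    allocated_hours INTEGER NOT NULL,
    PRIMARY KEY (course_code, emp_no)  -- composite key
);
""")

# Hypothetical sample rows, inserted parent tables first.
conn.execute("INSERT INTO location VALUES (101, 'A')")
conn.execute("INSERT INTO employee VALUES (1, 'Ann', 101)")
conn.execute("INSERT INTO course VALUES ('C1', 'Spreadsheets')")
conn.execute("INSERT INTO emp_on_course VALUES ('C1', 1, '2024-01-10', 20)")
```

With these constraints the database itself refuses a second row for the same employee on the same course, and refuses a row that points at an employee, course or room that does not exist.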

Background - Keys

Primary key
  Unique identifier
  Can be made up of more than one attribute, and is then called a composite key
  If there is no obvious choice, use a number

Foreign key
  Does not belong to the entity
  Used to relate entity to entity
  A primary key in another table

To Normalise

Follow 3 simple steps

1. Remove all repeating data elements

2. Ensure all data items are dependent on the whole primary key

3. Remove all fields dependent on non-key fields