ISOM MIS710 Module 1b Relational Model and Normalization Arijit Sengupta

[PPT]The Entity-Relationship Model - Wright State Universityarijit.sengupta/mis710/notes/lect3-rel.ppt · Web viewTitle The Entity-Relationship Model Subject Database Management Systems

May 23, 2018

Page 1

ISOM

MIS710 Module 1b: Relational Model and Normalization

Arijit Sengupta

Page 2

Structure of this semester

[Figure: MIS710 course roadmap. Stages: 0. Intro, 1. Design, 2. Querying, 3. Applications, 4. Advanced Topics. Topics: Database Fundamentals, Conceptual Modeling, Relational Model, Normalization, Query Languages, Advanced SQL, Java DB Applications (JDBC), Transaction Management, Data Mining. Audience levels: Newbie, Users, Designers, Developers, Professionals.]

Page 3

Today’s Buzzwords

• Relational Model
• Superkey, Candidate Key, Primary Key and Foreign Key
• Entity Integrity Rule
• Referential Integrity Rule
• Normalization
• First, Second, Third, and Boyce-Codd Normal Forms
• Unnormalization

Page 4

Objectives of this lecture

• Understand the Relational Model and its properties
• Understand the notion of keys
• Understand the use and importance of referential integrity
• Provide an alternative way to design relations using semantics rather than concepts
• Take an existing “flat file” design and create a relational design from it through the process of Normalization
• Identify sources of problems (or anomalies) within a given relational design
• Argue about improvements to designs created by others

Page 5

Relational Data Model

• Originally proposed by Codd in 1970
• Based on mathematical set theory

Relation (the heading row gives the attribute names, columns are attributes, rows are tuples, cells are attribute values):

    ID   Name    Age   Address       GPA
    S1   Jose    21    Stoned Hill   3.1
    S2   Alice   18    BigHead       3.2
    S3   Lin     32    Done-Audy     2.9
    S4   Joyce   20    Atlanta       3.7
    S5   Sunil   27    Mare-iota     3.2

Page 6

Relation: Properties

• A relation is a set of tuples
• A tuple is a set of attribute-value properties (relations)
    Ordering of attributes is immaterial
    Ordering of tuples is immaterial
• Tuples are distinct from one another
• Attributes contain atomic values only

    Emp#   Name                  Address
    E1     'Jose' 'M.' 'Smith'   '3413 Main Street', 'Atlanta', GA

Page 7

Attributes

• Attribute name
    Attribute names are unique within a relation
• Attribute domain
    Set of all possible values an attribute may take
    Domain (GPA) = Domain (name) = Domain (DateOfBirth) = Domain (year)
• Number of attributes: degree of the relation

Page 8

Tuples

• Aggregation of attribute values
    S1 = (s1, ‘Jose’, 21, ‘Stoned Hill’, 3.1)
    S2 = (s2, ‘Alice’, 18, ‘BigHead’, 3.2)
• Cardinality: Number of tuples in a relation
• What is the difference between the cardinality and the degree?

    ID   Name    Age   Address       GPA
    S1   Jose    21    Stoned Hill   3.1
    S2   Alice   18    BigHead       3.2
    S3   Lin     32    Done-Audy     2.9
    S4   Joyce   20    Atlanta       3.7
    S5   Sunil   27    Mare-iota     3.2
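As a rough illustration of degree versus cardinality, the student relation can be modelled as a Python set of tuples. This is only a sketch: the names `ATTRIBUTES`, `STUDENT`, `degree` and `cardinality` are made up for the example and are not part of the lecture.

```python
# Attribute names (the relation's heading), taken from the example table.
ATTRIBUTES = ("ID", "Name", "Age", "Address", "GPA")

# A relation is a set of tuples, so a Python set models it directly:
# row order is immaterial and duplicate tuples collapse into one.
STUDENT = {
    ("S1", "Jose", 21, "Stoned Hill", 3.1),
    ("S2", "Alice", 18, "BigHead", 3.2),
    ("S3", "Lin", 32, "Done-Audy", 2.9),
    ("S4", "Joyce", 20, "Atlanta", 3.7),
    ("S5", "Sunil", 27, "Mare-iota", 3.2),
    ("S1", "Jose", 21, "Stoned Hill", 3.1),  # duplicate tuple, kept only once
}


def degree(attributes):
    """Degree: the number of attributes of the relation."""
    return len(attributes)


def cardinality(relation):
    """Cardinality: the number of (distinct) tuples in the relation."""
    return len(relation)


print("degree:", degree(ATTRIBUTES))
print("cardinality:", cardinality(STUDENT))
```

Adding a column changes the degree; adding a row changes the cardinality.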

Page 9

Primary Keys

• Superkey: SK, a subset of attributes of R, satisfying Uniqueness, that is, no two tuples have the same combination of values for these attributes

• Candidate Key: K, a superkey SK, satisfying minimality, that is, no component of K can be eliminated without destroying the uniqueness property.

• Primary Key: PK, the selected Candidate key, K.

• Can a primary key be composed of multiple attributes?
• Can a relation have multiple primary keys?
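The uniqueness and minimality conditions can be tried out on sample data. The sketch below uses hypothetical helper names (`is_superkey`, `candidate_keys`) and only inspects one instance of the relation; a real key is a statement about every legal instance, so attributes that merely happen to be unique here (such as Age) would not normally be declared keys.

```python
from itertools import combinations

ATTRS = ("ID", "Name", "Age", "Address", "GPA")
ROWS = [
    ("S1", "Jose", 21, "Stoned Hill", 3.1),
    ("S2", "Alice", 18, "BigHead", 3.2),
    ("S3", "Lin", 32, "Done-Audy", 2.9),
    ("S4", "Joyce", 20, "Atlanta", 3.7),
    ("S5", "Sunil", 27, "Mare-iota", 3.2),
]


def is_superkey(rows, attrs, subset):
    """Uniqueness: no two rows agree on every attribute in `subset`."""
    idx = [attrs.index(a) for a in subset]
    projected = {tuple(row[i] for i in idx) for row in rows}
    return len(projected) == len(rows)


def candidate_keys(rows, attrs):
    """Minimality: keep superkeys that contain no smaller superkey."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            contains_smaller_key = any(set(k) <= set(subset) for k in keys)
            if not contains_smaller_key and is_superkey(rows, attrs, subset):
                keys.append(subset)
    return keys


print(candidate_keys(ROWS, ATTRS))
```

The primary key is then whichever candidate key the designer selects, typically ID in this example.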

Page 10

Keys - example

• Superkeys?

• Candidate keys?

• Primary key?

Disk: (ISBN#, Artist_name, Album_name, Year, Producer, Genre, time, price)

Page 11

Entity Integrity Rule

• The primary key of a base relation cannot contain a NULL value.

• Enforcement of the rule:
    An update which results in a NULL value in the primary key must be rejected.
• Are the following ok?

    Course   Section   Meets   Enrolled
    201      1         MW      20
    201      NULL      TTh     25
    NULL     NULL      MWF     18

    (Course and Section are marked as the Primary Key)

Page 12

Foreign Key

Physician (ID, Name, …) Patient (ID, Name, PhysID*, …)

Club (ID, Name, …) Player (ID, Name, ?*, …)

Order (OrdID, Date, …, ?*) Customer (ID, Name, …, ?*)

Dept (DeptID, Name, …, ?*) Employee (EID, Name, …, ?*)

• Attribute(s) of one relation that reference(s) the PK of another relation

• FK may or may not be (a part of) the PK of this relation

Course (CourseID, Name, …, ?*) Class (ClassID, Meets, …, ?*)
Student (SID, Name, …, ?*) Registration (?)

• Can an FK refer to a part of the PK of another relation?
• Can an FK refer to a PK of the same relation?

Page 13

Foreign Key ..

• FK and referenced PK may have different names
• The values of FK must draw from the value set of PK
• How do we define the Domain of an FK?
• Can an FK have a NULL value?
• What can we enforce with PKs and FKs?

[Figure: diagram relating the Primary Key (Domain, Value Set) to the Foreign Key (Domain)]

Page 14

Referential Integrity Rule

• If FK is the foreign key of a relation R2, which matches the primary key PK of the relation R1, then:
    the FK value must match the PK value in some tuple of R1, or
    the FK value may be NULL, but only if the FK is not (a part of) the PK of R2.
• Enforcement of the Rule
    An update on either a referenced PK or an FK must satisfy the rule. Otherwise, the operation is rejected.
• Which operation on the primary key may violate this rule?
• Which operation on the foreign key may violate this rule?

Page 15

Referential Integrity Enforcement

• If an operation violates referential integrity:
    Restrict
      • reject the operation
    Cascade
      • try to propagate the operation to all dependent FK values; if that is not possible, reject the operation
    Nullify (or Default)
      • set all dependent FK values to NULL (or a default value); if that is not possible, reject the operation
• Cases for each of the above situations?
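The three enforcement options can be observed with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the physician/patient schema is a simplified, hypothetical version of the earlier Foreign Key example, and SQLite is used only because it ships with Python (the course itself may use another DBMS). SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3


def make_db(action):
    """Build an in-memory parent/child pair whose FK uses the given ON DELETE action."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # FK enforcement is off by default in SQLite
    conn.execute("CREATE TABLE physician (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT, "
        f"phys_id INTEGER REFERENCES physician(id) ON DELETE {action})"
    )
    conn.execute("INSERT INTO physician VALUES (1, 'Dr. A')")
    conn.execute("INSERT INTO patient VALUES (10, 'Pat', 1)")
    return conn


def delete_physician(action):
    """Delete the referenced physician and report what happened to the patient row."""
    conn = make_db(action)
    try:
        conn.execute("DELETE FROM physician WHERE id = 1")
    except sqlite3.IntegrityError:
        return "rejected"
    return conn.execute("SELECT id, phys_id FROM patient").fetchall()


for action in ("RESTRICT", "CASCADE", "SET NULL"):
    print(action, "->", delete_physician(action))
```

With RESTRICT the delete is rejected, with CASCADE the dependent patient row is deleted too, and with SET NULL the patient keeps its row but loses its physician reference.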

Page 16

Creating Relations

create table STUDENT (
    ID    char(11) not null primary key,
    Name  char(30) not null,
    age   int,
    GPA   number(2,1)
);

create table COURSE (
    courseno     char(6)  not null primary key,
    coursename   char(30) not null,
    credithours  number(2,1)
);

-- Oracle-style syntax: the FK columns below omit a data type and take it
-- from the referenced column; other DBMSs may require an explicit type.
create table REGISTRATION (
    ID         references STUDENT (ID) on delete cascade,
    CourseNum  references COURSE (courseno),
    primary key (ID, CourseNum)
);

Page 17

Normalization - Motivating Example

• Is there any redundant data?
• Can we insert a new course# with a new textbook?
• What should be done if ‘CIS’ is changed to ‘MIS’?
• What would happen if we remove all CIS800 students?

    SID   Name     Grade   Course#   Text   Major   Dept
    s1    Joseph   A       CIS800    b1     CIS     CIS
    s1    Joseph   B       CIS820    b2     CIS     CIS
    s1    Joseph   A       CIS872    b5     CIS     CIS
    s2    Alice    A       CIS800    b1     CS      MCS
    s2    Alice    A       CIS872    b5     CS      MCS
    s3    Tom      B       CIS800    b1     Acct    Acct
    s3    Tom      B       CIS872    b5     Acct    Acct
    s3    Tom      A       CIS860    b1     Acct    Acct

Page 18

Why Normalization?

• Poor Relation Design causes Anomalies
    Insertion anomalies - Insertion of some piece of information cannot be performed unless other irrelevant information is added to it.
    Update anomalies - Update of a single piece of information requires updates to multiple tuples.
    Deletion anomalies - Deletion of a piece of information removes other unrelated but necessary information.

• Normalization improves the design to remove these anomalies
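The update and deletion anomalies can be made concrete with the single-table design from the motivating example. The sketch below copies that table into Python; the helper names `course_texts` and `rows_mentioning_major` are invented for the illustration.

```python
# Columns: SID, Name, Grade, Course#, Text, Major, Dept (data from the motivating example).
ROWS = [
    ("s1", "Joseph", "A", "CIS800", "b1", "CIS", "CIS"),
    ("s1", "Joseph", "B", "CIS820", "b2", "CIS", "CIS"),
    ("s1", "Joseph", "A", "CIS872", "b5", "CIS", "CIS"),
    ("s2", "Alice", "A", "CIS800", "b1", "CS", "MCS"),
    ("s2", "Alice", "A", "CIS872", "b5", "CS", "MCS"),
    ("s3", "Tom", "B", "CIS800", "b1", "Acct", "Acct"),
    ("s3", "Tom", "B", "CIS872", "b5", "Acct", "Acct"),
    ("s3", "Tom", "A", "CIS860", "b1", "Acct", "Acct"),
]
COURSE, TEXT, MAJOR = 3, 4, 5  # column positions


def course_texts(rows):
    """Which textbook each course uses, as far as the table still records it."""
    return {row[COURSE]: row[TEXT] for row in rows}


def rows_mentioning_major(rows, major):
    """Update anomaly: how many tuples must change if this major is renamed."""
    return sum(1 for row in rows if row[MAJOR] == major)


# Deletion anomaly: removing every CIS800 enrollment also removes
# the fact that CIS800 uses textbook b1.
AFTER_DELETE = [row for row in ROWS if row[COURSE] != "CIS800"]

print(rows_mentioning_major(ROWS, "CIS"))
print("CIS800" in course_texts(ROWS), "CIS800" in course_texts(AFTER_DELETE))
```

The insertion anomaly is the mirror image: a new course and its textbook have no row to live in until some student enrolls.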

Page 19

Why Normalization?

• Benefits
    contain minimum amount of redundancy
    allow users to insert, delete and modify tuples in the relation without errors or inconsistencies
    improve quality of information in the database
    decrease storage space for the database
• Costs
    may contribute to performance problems
    may require more storage in some cases

Page 20

Unnormalized Relation

• Create a ‘Definition’ for this relation.
• Do you see any problems in the definition?
• Do you see any anomalies in the data?

    STUDENT   STUDENT   COURSE   COURSE         INSTR    ROOM   CREDITS   GRADE
    ID        NAME      ID       NAME           NAME
    224       Waters    CIS20    Intro CBIS     Greene   205G   5         A
                        CIS40    Database Mgt   Hong     311S   5         B
                        CIS50    Sys.Analysis   Purao    139S   5         B
    351       Byron     CIS30    COBOL          Brown    629G   3         B
                        CIS50    Sys.Analysis   Purao    139S   5         C
    421       Smith     CIS20    Intro CBIS     Greene   205G   5         B
                        CIS30    COBOL          Brown    629G   3         B
                        CIS50    Sys.Analysis   Purao    139S   5         B

Page 21

Normal Forms

[Figure: progression of normal forms, with the step applied at each stage]

Unnormalized Relation (NF2)
    Only atomic attributes
First Normal Form (1NF)
    Remove nonkey dependency
Second Normal Form (2NF)
    Remove transitive dependency
Third Normal Form (3NF)
    Higher Order Forms
    Dependency preservation: BCNF
    Remove Multi-valued Dependencies: 4NF
    Remove Join Dependencies: 5NF

Page 22

The Basis of Normalization

• Functional Dependency (FD)
    Consider two attributes, X and Y, and two arbitrary tuples r1 and r2 of a relation R.
• Y is functionally dependent on X iff:
    value of X in r1 = value of X in r2
    implies
    value of Y in r1 = value of Y in r2
• Also stated as: R.X → R.Y or X → Y
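The definition translates almost word for word into a check over sample tuples. The sketch below is illustrative only: `fd_holds` is a made-up helper, the rows are a few tuples in the style of the motivating example, and passing the check on one instance shows only that the FD is not violated by this data, not that it holds by design.

```python
COLUMNS = ("SID", "Course#", "Grade", "Text")
ROWS = [
    dict(zip(COLUMNS, values))
    for values in [
        ("s1", "CIS800", "A", "b1"),
        ("s1", "CIS820", "B", "b2"),
        ("s2", "CIS800", "A", "b1"),
        ("s3", "CIS860", "A", "b1"),
    ]
]


def fd_holds(rows, lhs, rhs):
    """True if any two rows that agree on `lhs` also agree on `rhs`."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y seen for this X; any different Y breaks the FD.
        if seen.setdefault(x, y) != y:
            return False
    return True


print(fd_holds(ROWS, ["Course#"], ["Text"]))  # each course has one textbook here
print(fd_holds(ROWS, ["Text"], ["Course#"]))  # b1 is used by two courses
```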

Page 23

Properties of FDs

• If R.X → R.Y or X → Y
    X is called the determinant of Y.
    X may or may not be the key attribute of R.
    An FD changes with its semantic meaning
      • Name → Address?
    X and Y may be composite
    X and Y may be mutually dependent on each other
      • Husband → Wife, Wife → Husband
    The same Y value may occur in multiple tuples
      • Course# → Text

Page 24

Fully Functional Dependencies

• When is X → Y an FFD?
    When Y is not functionally dependent on any proper subset of X
• X → Y is a fully functional dependency (FFD)
    (SID, Course#) → Name?    (SID, Course#) → Grade?
    (SID, Name) → Major?      (SID, Name) → SID?
• By default, the term FD refers to FFD

Page 25

Transitive Dependencies

• Given attributes X, Y, and Z of a relation R,
• Z is transitively dependent on X (X → Z) iff X → Y and Y → Z
• For example:
    SID → Dept, SID → Major, Dept → School, Major → Dept
• Do you see any Transitive Functional Dependencies?

Page 26

Some Inference Rules for FDs

• An FD is redundant if it can be derived from other FDs based on a set of inference rules. Some of these rules are:

• Reflexive rule: If X ⊇ Y, then X → Y
    X always determines a subset of itself.
• Augmentation rule: If X → Y, then XZ → YZ
    Adding an attribute(s) on both sides does not change the FD.
• Transitive rule: If X → Y & Y → Z, then X → Z
    Functional dependencies can be ‘chained’.
• Decomposition rule: If X → YZ, then X → Y and X → Z
• Given: { SID → Name, SID → Major, Major → Dept }, which ones is/are redundant?
    SID → School, SID → Dept, Dept → School
    SID → (Name, Major), (SID, Name) → (Major, Name)
    SID → SID, SID → (Name, SID)
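One common way to test whether an FD follows from a given set is to compute the attribute closure of its left-hand side, which applies the inference rules repeatedly until nothing new is added. The sketch below uses that technique on the given set; the names `closure`, `implied` and `GIVEN` are assumptions for the example, not notation from the lecture.

```python
def closure(attrs, fds):
    """All attributes determined by `attrs` under the FDs (list of (lhs, rhs) tuples)."""
    result = set(attrs)  # reflexive rule: attrs determine themselves
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Transitive step: once the whole lhs is determined, so is the rhs.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implied(fd, fds):
    """True if `fd` can be derived from `fds`, i.e. it is redundant given them."""
    lhs, rhs = fd
    return set(rhs) <= closure(lhs, fds)


GIVEN = [(("SID",), ("Name",)), (("SID",), ("Major",)), (("Major",), ("Dept",))]

print(sorted(closure(("SID",), GIVEN)))
print(implied((("SID",), ("Dept",)), GIVEN))    # derivable through Major
print(implied((("SID",), ("School",)), GIVEN))  # School is not mentioned in GIVEN
```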

Page 27

First Normal Form

• DEFINITION
    A relation R is in first normal form (1NF) if and only if all underlying domains contain atomic values only.
• Translation
    To be in first normal form the table must not contain any repeating attributes.
• Implication
    Are all ‘relations’ in First Normal Form (1NF)?

Page 28

Example - 1NF

The ‘unnormalized’ relation has been decomposed into two.

• What are the PKs?

Relation: Student-Course

    StudentID   Course#   Course Title   Instrname   ROOM   CREDITS   GRADE
    224         CIS20     Intro CBIS     Greene      205G   5         A
    224         CIS40     Database Mgt   Hong        311S   5         B
    224         CIS50     Sys.Analysis   Purao       139S   5         B
    351         CIS30     COBOL          Brown       629G   3         B
    351         CIS50     Sys.Analysis   Purao       139S   5         C
    421         CIS20     Intro CBIS     Greene      205G   5         B
    421         CIS30     COBOL          Brown       629G   3         B
    421         CIS50     Sys.Analysis   Purao       139S   5         B

Relation: Student

    StudentID   StudentName
    224         Waters
    351         Byron
    421         Smith

Page 29

Anomalies (with only 1NF)

• Insertion Anomaly
    A new course cannot be inserted in the database (relation Student-Course) until a student registers for that course.
• Update Anomaly
    If the instructor of a course is changed, this fact would have to be noted at many places in the database (many tuples of the relation Student-Course).
• Deletion Anomaly
    Withdrawal of all students from an existing course (that is, deletion of related tuples from the relation Student-Course) will result in unwarranted removal of that course from the database.

Page 30

Anomalies in 1NF

Course (SID, Name, Grade, Course#, Text, Major, Dept)

• 1NF Relations have anomalies
    Redundant Information?
    Update Anomalies?
    Insertion Anomalies?
    Deletion Anomalies?

[Figure: dependency diagram over the attributes SID, Course#, Name, Grade, Text, Major, Dept]

Page 31

Second Normal Form

• DEFINITION
    A relation R is in second normal form (2NF) if and only if it is in 1NF and every nonkey attribute is dependent on the full primary key.
• Translation
    A table is in second normal form if there are no partial dependencies.
• Implication
    What kinds of primary keys may lead to a violation of the Second Normal Form (2NF)?
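A partial dependency can be detected mechanically once the FDs are written down: take each proper subset of the composite key and see which nonkey attributes it determines on its own. The sketch below assumes a set of FDs for the Student-Course relation (they are a plausible reading of the running example, not stated in the lecture), and `partial_dependencies` is a hypothetical helper.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by `attrs` under the FDs (list of (lhs, rhs) tuples)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, attributes, fds):
    """Map each proper subset of the key to the nonkey attributes it determines."""
    nonkey = set(attributes) - set(key)
    found = {}
    for size in range(1, len(key)):
        for part in combinations(key, size):
            dependents = closure(part, fds) & nonkey
            if dependents:
                found[part] = dependents
    return found


ATTRIBUTES = ("StudentId", "CourseId", "StudentName", "CourseTitle",
              "Credits", "Instructor", "Classroom", "Grade")
KEY = ("StudentId", "CourseId")
# Assumed FDs for illustration only.
FDS = [
    (("StudentId",), ("StudentName",)),
    (("CourseId",), ("CourseTitle", "Credits", "Instructor", "Classroom")),
    (("StudentId", "CourseId"), ("Grade",)),
]

print(partial_dependencies(KEY, ATTRIBUTES, FDS))
```

Only Grade depends on the full key here; every other nonkey attribute is a 2NF violation. A relation with a single-attribute key has no proper non-empty subset to test, so the function returns an empty result for it.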

Page 32

Bubble Chart

• Reconsider the example ..

[Figure: bubble chart with the compound determinant StudentId+CourseId and the attributes StudentName, CourseTitle, Credits, Instructor, Classroom, Grade]

Page 33

Dealing with Compound Keys

• Revised Bubble Chart

[Figure: revised bubble chart with StudentId and CourseId drawn as separate determinants over the attributes StudentName, CourseTitle, Credits, Instructor, Classroom, Grade]

Page 34

Example - 2NF

    STUDENT ID   STUDENT NAME
    224          Waters
    351          Byron
    421          Smith

    STUDENT ID   COURSE ID   GRADE
    224          CIS20       A
    224          CIS40       B
    224          CIS50       B
    351          CIS30       B
    351          CIS50       C
    421          CIS20       B
    421          CIS30       B
    421          CIS50       B

    COURSE ID   COURSE TITLE       CREDITS
    CIS20       Intro to CIS       5
    CIS30       Java               3
    CIS40       DBMS               5
    CIS50       Systems Analysis   5

Page 35

Anomalies with (only) 2NF

• Insertion anomaly
    Information about a faculty member (potential advisor) cannot be added to the database unless a student is assigned to him/her.
• Update anomaly
    If the advisor’s office location or phone were changed, many tuples would need to be changed.
• Deletion anomaly
    If all students assigned to an advisor graduate, information about the advisor will disappear from the database.

    STUDENT   STUDENT   STATUS   ADVISOR   ADVISOR   ADVISOR   TOTAL
    ID        NAME                         OFFICE    PHONE     CREDITS
    224       Waters    Junior   Young     CBA221    726104    105
    351       Byron     Soph     Greene    CBA215    718434    77
    421       Smith     Junior   Young     CBA221    726104    97

Page 36

Third Normal Form

• DEFINITION
    A relation R is in third normal form (3NF) if and only if it is in 2NF and every nonkey attribute is non-transitively dependent on the primary key.
• Translation
    A table is in Third Normal Form if every non-key attribute is determined by the key, and nothing else.
• Implication
    How many total attributes must the relation have for a possible violation of the Third Normal Form (3NF)?
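Transitive dependencies can be spotted in a similar mechanical way: look for a nonkey attribute that is determined by other nonkey attributes which are not themselves a superkey. The sketch below applies this to the student-advisor relation, with FDs that are assumed for illustration; `transitive_dependencies` is a made-up helper name.

```python
def closure(attrs, fds):
    """All attributes determined by `attrs` under the FDs (list of (lhs, rhs) tuples)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def transitive_dependencies(key, attributes, fds):
    """List (determinant, attribute) pairs where a nonkey determinant fixes a nonkey attribute."""
    all_attrs = set(attributes)
    nonkey = all_attrs - set(key)
    violations = []
    for lhs, rhs in fds:
        lhs_is_superkey = closure(lhs, fds) == all_attrs
        if set(lhs) <= nonkey and not lhs_is_superkey:
            for attr in rhs:
                if attr in nonkey and attr not in lhs:
                    violations.append((lhs, attr))
    return violations


ATTRIBUTES = ("StudentId", "StudentName", "Status", "Advisor",
              "AdvisorOffice", "AdvisorPhone", "TotalCredits")
# Assumed FDs for illustration only.
FDS = [
    (("StudentId",), ("StudentName", "Status", "Advisor", "TotalCredits")),
    (("Advisor",), ("AdvisorOffice", "AdvisorPhone")),
]

print(transitive_dependencies(("StudentId",), ATTRIBUTES, FDS))
```

The reported pairs suggest moving AdvisorOffice and AdvisorPhone into a separate advisor relation keyed by Advisor, which stays behind in the student relation as a foreign key.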

Page 37

3NF Example

• Chalk out the relations.

How do you maintain student-advisor relation?

[Figure: bubble chart over StudentId, StudentName, Status, TotalCredits, Advisor, AdvisorOffice, AdvisorPhone; Advisor appears twice]

Page 38

Boyce-Codd Normal Form (BCNF)

• Update anomalies occur in a 3NF relation R if
    R has multiple candidate keys,
    those candidate keys are composite, and
    the candidate keys overlap.

    Computer-Lab (SID, Account, Class, Hours)

• A relation R is in BCNF iff every determinant is a candidate key.

Page 39

The Normalization Process

1. Flatten the Table Completely (no composite columns)

2. Find the Key and “all” FDs (well, as many as you can possibly detect)

3. Find Partial Dependencies and decompose relation using them (2NF)

4. Find Transitive dependencies and decompose using them (3NF)

5. Remember: this is not a deterministic method. The result depends on the order in which FDs are chosen, so the same relation with the same set of FDs can lead to different decompositions!

Page 40

Lossless Decomposition

• A bad decomposition loses information
• In a good decomposition
    The join of decomposed relations restores the original relation
    Decomposed relations can be maintained independently
• Rissanen’s rule for non-loss decomposition: Two projections R1 and R2 of a relation R are independent iff:
    Every FD in R can be logically deduced from those in R1 and R2, and
    The common attributes of R1 and R2 form a candidate key for at least one of the pair.
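Whether the join of two projections restores the original relation can be tried directly on sample data. The sketch below is an instance-level experiment with invented helper names (`project`, `natural_join`, `is_lossless`); it shows a good and a bad split of a small enrollment relation, but a true non-loss guarantee has to come from the FDs, as in the rule above.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; each tuple is a frozenset of (attr, value)."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}


def natural_join(r1, r2):
    """Combine every pair of tuples that agree on all shared attributes."""
    joined = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            common = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in common):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined


def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original tuples."""
    original = project(rows, list(rows[0].keys()))
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original


COLUMNS = ("SID", "Course", "Grade", "Text")
ROWS = [
    dict(zip(COLUMNS, values))
    for values in [
        ("s1", "CIS800", "A", "b1"),
        ("s1", "CIS820", "B", "b2"),
        ("s2", "CIS800", "A", "b1"),
        ("s3", "CIS860", "A", "b1"),
    ]
]

# Good split: the common attribute Course is a key of (Course, Text).
print(is_lossless(ROWS, ["SID", "Course", "Grade"], ["Course", "Text"]))
# Bad split: joining on Text pairs students with courses they never took.
print(is_lossless(ROWS, ["SID", "Grade", "Text"], ["Course", "Text"]))
```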

Page 41

Higher Normal Forms

• Fourth Normal Form
    Multivalued Dependencies (Fagin 1977)
• Fifth Normal Form
    Join Dependencies (Fagin 1979)
• Other Dependencies
    Inclusion Dependencies (Casanova 1981)
    Template Dependencies (Sadri 1982)
    Domain-Key Normal Form (Fagin 1981)

Page 42

In-class Exercise – Normalize this: