Improving the Quality of Database Designs (Adapted from David Kroenke, Database Processing)

Jan 22, 2016


Jalen

Page 1: Improving the Quality  of Database Designs

Improving the Quality of Database Designs

(Adapted from David Kroenke, Database Processing)

Page 2: Improving the Quality  of Database Designs

Improving the Quality of Database Designs

• Minimizing Redundancy in Database
• Avoiding Anomalies
• Functional Dependency
• Normal Forms
  o First Normal Form
  o Second Normal Form
  o Third Normal Form
• Exercise Problems

Page 3: Improving the Quality  of Database Designs

Minimizing Redundancy in DB

• Redundancy
  o Wastes space
  o Wastes time
  o Causes anomalies (incorrect data)

Page 4: Improving the Quality  of Database Designs

Avoiding Anomalies

• Causes
  o Update Anomaly
  o Insertion Anomaly
  o Deletion Anomaly

Page 5: Improving the Quality  of Database Designs

DVD Table

dvdID  acquired  title         genre      length  studio  country
120    1/25/03   The 39 Steps  Mystery    120     ABC     USA
150    2/5/03    Elizabeth     Drama      105     XYZ     England
172    12/31/03  Lady & Tramp  Animation  93      DEF     Poland
157    3/25/03   Elizabeth     Drama      105     XYZ     England
110    5/12/02   Annie Hall    Comedy     120     ABC     USA
125    3/8/03    Elizabeth     Drama      105     XYZ     England


Page 6: Improving the Quality  of Database Designs

Update Anomaly

• Situation in which an update in one record requires an update in another record.

• E.g., suppose that for dvdID #150 (Elizabeth), length is changed to 100. If the length values in dvdID #157 and #125 are not changed as well, we have an anomaly.
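The update anomaly can be reproduced in a few lines. This is a minimal sketch using Python's standard sqlite3 module and only the three Elizabeth rows of the DVD table; the in-memory database and the lowercase table name dvd are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dvd (dvdID INTEGER PRIMARY KEY, title TEXT, length INTEGER)"
)
conn.executemany(
    "INSERT INTO dvd VALUES (?, ?, ?)",
    [(150, "Elizabeth", 105), (157, "Elizabeth", 105), (125, "Elizabeth", 105)],
)

# Change the length on one copy only, as in the example above.
conn.execute("UPDATE dvd SET length = 100 WHERE dvdID = 150")

# The same title now carries two different lengths: an update anomaly.
lengths = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT length FROM dvd WHERE title = 'Elizabeth'"
    )
)
print(lengths)  # [100, 105]
```

Avoiding the anomaly in this design would require updating every row with the same title in one statement, which is exactly the burden that redundancy creates.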


Page 7: Improving the Quality  of Database Designs

Insertion Anomaly

• Situation in which adding a record results in an inconsistency.

• Suppose another copy of The 39 Steps is added to the table. If its values of genre and length are not the same as those of dvdID #120, we have an anomaly.


Page 8: Improving the Quality  of Database Designs

Deletion Anomaly

• Situation in which deleting one record results in unintended loss of data.

• Suppose dvdID #172 is removed. Then all data items regarding studio DEF and its country (Poland) will be lost.
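The deletion anomaly can be sketched the same way. The example below loads the title, studio, and country columns of the DVD table into an in-memory SQLite database (the table name dvd and the variable names are illustrative assumptions) and removes dvdID 172.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dvd (dvdID INTEGER PRIMARY KEY, title TEXT, studio TEXT, country TEXT)"
)
conn.executemany(
    "INSERT INTO dvd VALUES (?, ?, ?, ?)",
    [
        (120, "The 39 Steps", "ABC", "USA"),
        (150, "Elizabeth", "XYZ", "England"),
        (172, "Lady & Tramp", "DEF", "Poland"),
        (157, "Elizabeth", "XYZ", "England"),
        (110, "Annie Hall", "ABC", "USA"),
        (125, "Elizabeth", "XYZ", "England"),
    ],
)

# Removing the only DVD from studio DEF ...
conn.execute("DELETE FROM dvd WHERE dvdID = 172")

# ... also removes the only record of the studio and its country.
remaining_studios = {
    row[0] for row in conn.execute("SELECT DISTINCT studio FROM dvd")
}
print("DEF" in remaining_studios)  # False
```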


Page 9: Improving the Quality  of Database Designs

Functional Dependence

• Definition
• Given: A and B are attributes of relation (table) R.

Then B is functionally dependent on A if and only if each value of A is associated with exactly one value of B in R.

• A → B (A determines B)
• I.e., any 2 rows with the same value for A will have the same value for B.
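The definition translates directly into a check over sample rows. The following is a sketch, not part of the original slides: the function name holds and the dictionary-per-row representation are assumptions, and the sample rows are taken from the DVD table. Note that sample data can only refute a functional dependency; whether one truly holds is a statement about the meaning of the data.

```python
def holds(rows, lhs, rhs):
    """Return True if the attributes in lhs determine those in rhs for these rows."""
    seen = {}
    for row in rows:
        a = tuple(row[col] for col in lhs)
        b = tuple(row[col] for col in rhs)
        # The first time a value of A appears, remember its B; afterwards it must match.
        if seen.setdefault(a, b) != b:
            return False
    return True


dvd = [
    {"dvdID": 120, "title": "The 39 Steps", "length": 120, "studio": "ABC"},
    {"dvdID": 150, "title": "Elizabeth", "length": 105, "studio": "XYZ"},
    {"dvdID": 157, "title": "Elizabeth", "length": 105, "studio": "XYZ"},
    {"dvdID": 110, "title": "Annie Hall", "length": 120, "studio": "ABC"},
]

print(holds(dvd, ["title"], ["length"]))  # True
print(holds(dvd, ["length"], ["title"]))  # False: length 120 appears with two titles
```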

Page 10: Improving the Quality  of Database Designs

Functional Dependence (1)

• DVD (title, publisher, length, director, pubAddress)
  o publisher → pubAddress (yes)
  o title → length (no)
  o title, publisher → length (yes)


Page 11: Improving the Quality  of Database Designs

Functional Dependence (2)

• Books (bkID, ISBN, title, author, pubAddress)
  o ISBN → title (yes)
  o ISBN → author (yes)
  o bkID → title (yes)
  o bkID → author (yes)
  o bkID → pubAddress (yes)
  o title, publisher → length (yes)

• A primary key determines every nonkey attribute.

Page 12: Improving the Quality  of Database Designs

First Normal Form (1NF)

• A relation (table) is in 1NF if
  o Each row is unique (with primary key)
  o All attributes are atomic

Page 13: Improving the Quality  of Database Designs

Second Normal Form (2NF)

• A relation (table) is in second normal form if

o All nonkey attributes are dependent on all of the key. (This means that the relation is not in 2NF if any nonkey attribute is dependent on only part of the key.)

• E.g., in DVD, length is dependent only on title, but not on publisher.

Page 14: Improving the Quality  of Database Designs

2NF? (No)

StudentActivities

stdID  activities  fee
100    Skiing      200
100    Golf        65
150    Swimming    50
175    Squash      50
175    Swimming    50
200    Swimming    50
200    Golf        65


Page 15: Improving the Quality  of Database Designs

Problems

• Note
  o Key: stdID + activities
  o Attribute fee is dependent only on activities (partial key).

• Problems
  o There are obvious redundancies.
  o If student 175 is removed, the fee ($50) for Squash is deleted.
  o A new activity, say Surfing, cannot be entered until a student is entered.


Page 16: Improving the Quality  of Database Designs

Solution

• Remove the attribute that is dependent only on part of the key and form a new table

• Create a link between the new and the original tables using a foreign key

• Note: if a relation (table) is in 1NF and the primary key consists of a single attribute, the relation is automatically in 2NF.

Page 17: Improving the Quality  of Database Designs

Solution

Activities

stdID  activities
100    Skiing
100    Golf
150    Swimming
175    Squash
175    Swimming
200    Swimming
200    Golf

Fees

activities  fee
Skiing      200
Golf        65
Swimming    50
Squash      50
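The decomposition can be tried out with Python's standard sqlite3 module. This is a sketch under assumptions: the table names student_activities and fees, the column types, and the Surfing fee of 75 are all made up for illustration; the other rows come from the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE fees (activity TEXT PRIMARY KEY, fee INTEGER NOT NULL)")
conn.execute(
    """
    CREATE TABLE student_activities (
        stdID INTEGER,
        activity TEXT REFERENCES fees(activity),
        PRIMARY KEY (stdID, activity)
    )
    """
)
conn.executemany(
    "INSERT INTO fees VALUES (?, ?)",
    [("Skiing", 200), ("Golf", 65), ("Swimming", 50), ("Squash", 50)],
)
conn.executemany(
    "INSERT INTO student_activities VALUES (?, ?)",
    [(100, "Skiing"), (100, "Golf"), (150, "Swimming"), (175, "Squash"),
     (175, "Swimming"), (200, "Swimming"), (200, "Golf")],
)

# Joining on the foreign key rebuilds the original seven rows.
joined = conn.execute(
    """
    SELECT s.stdID, s.activity, f.fee
    FROM student_activities AS s
    JOIN fees AS f ON f.activity = s.activity
    ORDER BY s.stdID, s.activity
    """
).fetchall()

# Removing student 175 no longer loses the Squash fee.
conn.execute("DELETE FROM student_activities WHERE stdID = 175")
squash_fee = conn.execute(
    "SELECT fee FROM fees WHERE activity = 'Squash'"
).fetchone()[0]

# A new activity can be recorded before any student signs up (hypothetical fee).
conn.execute("INSERT INTO fees VALUES ('Surfing', 75)")

print(len(joined), squash_fee)  # 7 50
```

Each fee is now stored once, so the redundancy and the deletion and insertion problems listed earlier do not arise in this sketch.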

Page 18: Improving the Quality  of Database Designs

Third Normal Form (3NF)

• A relation is in 3NF if
  o It is in 2NF and
  o There are no transitive dependencies. (I.e., every nonkey attribute is dependent only on the primary key.)

• Table satisfying 3NF (in common terms)
  o Should have a field that uniquely identifies each record
  o Each field in the table should describe the subject that the table represents

Page 19: Improving the Quality  of Database Designs

3NF? (No)

StudentHousing

stdID  building   fee
100    Randolf    1200
150    Ingersoll  1100
200    Randolf    1200
250    Pitkin     1100
300    Randolf    1200


Page 20: Improving the Quality  of Database Designs

Transitive Dependence

• stdID → building (I.e., building is dependent on stdID)

• building → fee (I.e., fee is dependent on building)

• Thus, stdID → building → fee
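The chain above can be checked against the StudentHousing rows. The sketch below is illustrative only: the helper name determines is an assumption, and a check on sample rows shows consistency with a dependency rather than proving it.

```python
housing = [
    (100, "Randolf", 1200),
    (150, "Ingersoll", 1100),
    (200, "Randolf", 1200),
    (250, "Pitkin", 1100),
    (300, "Randolf", 1200),
]


def determines(pairs):
    """True if every left-hand value is paired with exactly one right-hand value."""
    mapping = {}
    return all(mapping.setdefault(left, right) == right for left, right in pairs)


std_to_building = determines((std, bldg) for std, bldg, _ in housing)
building_to_fee = determines((bldg, fee) for _, bldg, fee in housing)
std_to_fee = determines((std, fee) for std, _, fee in housing)
# The reverse direction fails: a fee of 1100 belongs to two buildings.
fee_to_building = determines((fee, bldg) for _, bldg, fee in housing)

print(std_to_building, building_to_fee, std_to_fee, fee_to_building)
# True True True False
```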

Page 21: Improving the Quality  of Database Designs

Problems

• StudentHousing is in 2NF, but
  o Redundant data will introduce modification anomalies.
  o Removing stdID 150 deletes the fee value for Ingersoll.
  o The fee for a new building, say Barrett, cannot be recorded until a new stdID is entered.


Page 22: Improving the Quality  of Database Designs

Solution

• Remove data that is not dependent on the primary key and form a new relation

• Create a relationship between the new and the original tables using a foreign key

Page 23: Improving the Quality  of Database Designs

Solution

StudentResidence

stdID  building
100    Randolf
150    Ingersoll
200    Randolf
250    Pitkin
300    Randolf

ResidenceFee

building   fee
Randolf    1200
Ingersoll  1100
Pitkin     1100
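The split can be derived mechanically by projecting the StudentHousing rows onto the two attribute sets. The sketch below uses plain Python; the variable names and the Barrett fee of 1150 are assumptions for illustration.

```python
housing = [
    (100, "Randolf", 1200),
    (150, "Ingersoll", 1100),
    (200, "Randolf", 1200),
    (250, "Pitkin", 1100),
    (300, "Randolf", 1200),
]

# Project onto (stdID, building) and (building, fee); sets drop the duplicates.
student_residence = sorted({(std, bldg) for std, bldg, _ in housing})
residence_fee = dict(sorted({(bldg, fee) for _, bldg, fee in housing}))

# Joining the two projections on building gives back the original rows.
rejoined = [(std, bldg, residence_fee[bldg]) for std, bldg in student_residence]
lossless = rejoined == sorted(housing)

# Removing student 150 keeps the Ingersoll fee.
student_residence = [(std, bldg) for std, bldg in student_residence if std != 150]
ingersoll_fee = residence_fee["Ingersoll"]

# A fee for a new building can be recorded without a student (hypothetical value).
residence_fee["Barrett"] = 1150

print(lossless, ingersoll_fee)  # True 1100
```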

Page 24: Improving the Quality  of Database Designs

Try This (Customers Table)


Page 25: Improving the Quality  of Database Designs

Problem

• Note that
  o custNum → ZIP and ZIP → city, state. I.e., custNum → ZIP → city, state
  o Transitive dependence results in redundancy and modification, insertion, & deletion anomalies.


Page 26: Improving the Quality  of Database Designs

Solution

Page 27: Improving the Quality  of Database Designs

Summary

• Examine the attributes of an entity and ask the following questions. If the answer to any of them is “Yes,” the attribute probably belongs to another entity.

o Does an attribute or attributes describe an entity other than the current one?

o Does an attribute of the entity functionally depend on only part of the primary key?

o Does an attribute depend on something other than the primary key?

Page 28: Improving the Quality  of Database Designs

Company Database

Employees

empId, empLastName, empFirstName, empMiddleName, empAddress, empCity, empState, empZip, empPhone, empPager, empPosition, empPositionDescrip, empDateHire, empPayRate, empDateLastRaise

Customers

custId, custName, custAddress, custCity, custState, custZip, custPhone, custFax, orderNum, orderQuantity, orderDate

Products

prodId, prodDescrip, prodCost

Page 29: Improving the Quality  of Database Designs

Company Database (2)

Employees

empId, empLastName, empFirstName, empMiddleName, empAddress, empCity, empState, empZip, empPhone, empPager, empPosition, empDateHire, empPayRate, empDateLastRaise

Employees

empId, empLastName, empFirstName, empMiddleName, empAddress, empCity, empState, empZip, empPhone, empPager

EmployeePays

empId, empPosition, empPositionDescrip, empDateHire, empPayRate, empDateLastRaise

Page 30: Improving the Quality  of Database Designs

Company Database

Customers

custId, custName, custAddress, custCity, custState, custZip, custPhone, custFax, orderNum, orderQuantity, orderDate

Customers

custId, custName, custAddress, custCity, custState, custZip, custPhone, custFax

Orders

custId, orderNum, orderQuantity, orderDate
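The Customers and Orders split can be written as a schema in which custId links the two tables. The sketch below uses Python's standard sqlite3 module; the column types, the choice of orderNum as the Orders key, and the sample rows are assumptions, since the slides give only attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute(
    """
    CREATE TABLE Customers (
        custId TEXT PRIMARY KEY,
        custName TEXT, custAddress TEXT, custCity TEXT,
        custState TEXT, custZip TEXT, custPhone TEXT, custFax TEXT
    )
    """
)
conn.execute(
    """
    CREATE TABLE Orders (
        orderNum INTEGER PRIMARY KEY,
        custId TEXT NOT NULL REFERENCES Customers(custId),
        orderQuantity INTEGER,
        orderDate TEXT
    )
    """
)

# Hypothetical rows, for illustration only.
conn.execute(
    "INSERT INTO Customers (custId, custName) VALUES (?, ?)",
    ("C1", "Example Customer"),
)
conn.execute("INSERT INTO Orders VALUES (?, ?, ?, ?)", (1, "C1", 10, "2000-01-01"))

# An order for a customer that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO Orders VALUES (?, ?, ?, ?)", (2, "C9", 5, "2000-01-02"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
print(rejected, order_count)  # True 1
```

Customer details are stored once, however many orders the customer places.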

Page 31: Improving the Quality  of Database Designs

Quiz

Normalization is the process of grouping logically related data into tables to reduce redundancy. (T/F)

Having no duplicate or redundant data in a database, and having everything in the database normalized, is always the best way to go. (T/F)

If data is in the third normal form, it is automatically in the first and second normal forms. (T/F)

What is the major advantage of a denormalized database versus a normalized database?

What are some major disadvantages of an unnormalized database?

Page 32: Improving the Quality  of Database Designs

Exercise: What Type of Relationships Do the Tables Have?

Positions

os_id, position, position_descrip

EmployeePays

empPayId, empDateHire, empPayRate, empDateLastRaise

Orders

orderNum, orderQuantity, orderDate

Customers

custId, custName, custAddress, custcity, custState, custZip, custPhone, custFax

Employees

empId, empLastName, empFirstName, empMiddleName, empAddress, empCity, empState, empZip, empPhone, empPager

Page 33: Improving the Quality  of Database Designs

Exercise: Normalize the following data.

Take the following data and normalize it. Keep in mind that, in a real DB, there would be many more items than what is given here.

Employees:

Angela Smith, secretary, RR 1 Box 73, Greensburg, IN, 47890, $9.50/hour, started Jan. 22, 1996, SSN is 323149669

Jack Lee Nelson, salesman, 3334 N. Main St., Brownsburg, IN, 45687, 317-852-9901, $35,000.00/year, date started 10/28/95, SSN is 312567342

Customers:

Robert’s Games & Things, 5612 Lafayette Rd., Indianapolis, IN, 46224, 317-291-7888, customer ID is 432A

Reed’s Dairy Bar, 4556 W 10th St., Indianapolis, IN, 46245, 317-271-9823, customer ID is 117A

CustomerOrders:

Customer ID is 117A, date of last order is 2/20/1997, product ordered was napkins, and product ID is 661

Page 34: Improving the Quality  of Database Designs

Tables

Employees

Ssn, lastName, firstName, street, city, state, zip, phoneNum, salary, hourlyRate, startDate, position

Customers

customerID, name, street, city, state, zip, phoneNum

Orders

orderID, customerID, productID, productDescrip, dateOrdered

Page 35: Improving the Quality  of Database Designs

Normalization Case Study

• A database named “Movie Rentals” keeps track of which customers checked out which movies.

Page 36: Improving the Quality  of Database Designs

Rentals Table

1NF
• Each cell value is atomic
• No repeating data fields

Is the table in 1NF?

Page 37: Improving the Quality  of Database Designs

To Satisfy 1NF Requirements (1)

Let's simplify the primary key

Page 38: Improving the Quality  of Database Designs

To Satisfy 1NF Requirements (2)

Make “movieTitle” and “category” atomic.

(Note that fullName has also been split.)

Page 39: Improving the Quality  of Database Designs

To Satisfy 1NF Requirements (3)

Eliminate “repeating” phone data.

Phones

Customers

Rentals
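The 1NF steps (splitting a full name and moving repeating phone fields into their own table) can be sketched in plain Python. The slide's actual tables are not reproduced in this transcript, so the field names custID, fullName, phone1, and phone2 and all sample values below are hypothetical.

```python
# Hypothetical unnormalized rows: a combined name and repeating phone fields.
customers_raw = [
    {"custID": 1, "fullName": "Ann Lee", "phone1": "555-0100", "phone2": "555-0101"},
    {"custID": 2, "fullName": "Bob Ray", "phone1": "555-0102", "phone2": None},
]

# Customers: one row per customer, with the name split into atomic parts.
customers = [
    (row["custID"], *row["fullName"].split(" ", 1)) for row in customers_raw
]

# Phones: one row per (customer, phone number), however many numbers there are.
phones = [
    (row["custID"], row[field])
    for row in customers_raw
    for field in ("phone1", "phone2")
    if row[field]
]

print(customers)  # [(1, 'Ann', 'Lee'), (2, 'Bob', 'Ray')]
print(phones)     # [(1, '555-0100'), (1, '555-0101'), (2, '555-0102')]
```

Splitting on the first space is a simplification; real names need more careful handling.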

Page 40: Improving the Quality  of Database Designs

To Satisfy 2NF Requirements (1)

• 2NF
  o Is in 1NF
  o Each nonkey field value depends on the entire primary key

Page 41: Improving the Quality  of Database Designs

To Satisfy 2NF Requirements (2)

• Note that 2NF is a concern only with a composite primary key.

Page 42: Improving the Quality  of Database Designs

To Satisfy 3NF Requirements (1)

• For 3NF
  o In 2NF
  o All nonkey field values depend only on the primary key (i.e., no transitive dependency)

Page 43: Improving the Quality  of Database Designs

To Satisfy 3NF Requirements (2)

• In Customers table
  o Cust → nameFirst → title. I.e., title depends on nameFirst

Page 44: Improving the Quality  of Database Designs

To Satisfy 3NF Requirements (3)

Titles Customers

Page 45: Improving the Quality  of Database Designs

[Relationship diagram: Titles, Customers, Phones, Rentals; relationship ends labeled 1 and N]