Top Banner
Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)
35

Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Dec 30, 2015

Download

Documents

Felicity Walton
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Improving the Quality of Database Designs

(Adapted from David Kroenke, Dabase Processing)

Page 2: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Improving the Quality of Database Designs

• Minimizing Redundancy in Database• Avoiding Anomalies • Function Dependency• Normal Forms

o First Normal Formo Second Normal Formo Third Normal Form

• Exercise Problems

Page 3: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Minimizing Redundancy in DB

• Redundancyo Wastes spaceo Wastes timeo Causes Anomalies (incorrect data)

Page 4: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Avoiding Anomalies

• Causeso Update Anomalyo Insertion Anomalyo Deletion Anomaly

Page 5: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

DVD Table

dvdID acquired title genre length studio country

120 1/25/03 The 39 Steps Mystery 120 ABC USA

150 2/5/03 Elizabeth Drama 105 XYZ England

172 12/31/03 Lady & Tramp Animation 93 DEF Poland

157 3/25/03 Elizabeth Drama 105 XYZ England

110 5/12/02 Annie Hall Comedy 120 ABC USA

125 3/8/03 Elizabeth Drama 105 XYZ England

Back to UA Back to IA Back to DA

Page 6: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Update Anomaly

• Situation in whichUpdate in one record requires update in another record.

• E.g.Suppose for dvdID #150 (Elizabeth),length is changed to 100.If length values in devID #157 and #125 are not changed also, we have anomalies.

To DVD

Page 7: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Insertion Anomaly

• Situation in whichAdding a record results in an inconsistency

• Suppose another copy of The 39 Steps is added to the table. If its values of genre, length, and rating are not the same as those dvdID #120, we have an anomaly.

To DVD

Page 8: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Deletion Anomaly

• Situation in whichDeleting one record results in unintended loss of data

• Suppose dvdID #172 is removed. Then all data items regarding studio DEF and its country (Poland) —will be lost.

To DVD

Page 9: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Functional Dependency

• Definition• Given: A and B are attributes of relation (table) R

Then B is functionally dependent on A if and only if each value in A has associated with it exactly one value of B in R.

• A B ( A determines B)• I.e., any 2 rows with same value for A will have

the same value for B

Page 10: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Functional Dependence (1)

• DVD (title, publisher, length, director, pubAddress)

o publisher pubAddress (yes)o title length (no)o title, publisher length (yes)

Back to 2NF

Page 11: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Functional Dependence (2)

• Books (bkID, ISBN, title, author, pubAddress)o ISBN title (yes)o ISBN author (yes)o bkID title (yes)o bkID author (yes)o bkID pubAddress (yes)o title, publisher length (yes)

• A primary key determine each nonkey attribute

Page 12: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

First Normal Form (1NF)

• A relation (table) is in 1NF ifo Each row is unique (with primary key)o All attributes are atomic

Page 13: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Second Normal Form (2NF)

• A relation (table) is in second normal form if

o All nonkey attributes are dependent on all of the key. (This means that the relation is not in 2NF if any nonkey attribute is dependent on only part of the key.)

• E.g., in DVD, length is dependent only on title, but not on publisher. To FD1

Page 14: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

2NF? (No)

stdID activities fee

100 Skiing 200

100 Golf 65

150 Swimming 50

175 Squash 50

175 Swimming 50

200 Swimming 50

200 Golf 65

StudentdActivities

Back to Problems

Page 15: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Problems

• Noteo Key: stdID + activitieso Attribute fee is dependent only on activities (partial key).

• Problemso There are obvious redundancies.o If student 175 is removed, fee($50) for Squash is

deleted.o A new activity—say Surfing—cannot be entered until a

student is entered

To 2NF

Page 16: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Solution

• Remove the attribute that is dependent only on part of the key and form a new table

• Create a link between the new and the original tables using a foreign key

• Note: if a relation (table) is 1NF and the primary key consists of a single attribute, the relation is automatically 2NF.

Page 17: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Solution

stdID activities

100 Skiing

100 Golf

150 Swimming

175 Squash

175 Swimming

200 Swimming

200 Golf

Activities fee

Skiing 200

Golf 65

Swimming 50

Squash 50

Activities Fees

Page 18: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Third Normal Form (3NF)

• A relation is in 3NF ifo It is in 2NF ando There are no transitive dependencies. (I.e.,

every nonkey attribute is dependent only on the primary key.)

• Table satisfying 3NF (in common terms)o Should have a field that uniquely identifies

each recordo Each field in the table should describe the

subject that the table represents

Page 19: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

3NF? (No)

stdID building fee

100 Randolf 1200

150 Ingersoll 1100

200 Randolf 1200

250 Pitkin 1100

300 Randolf 1200

StudentHousing

Back to Problems

Page 20: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Transitive Dependence

• stdID building (I.e., building is dependent on stdID)

• building fee (I.e., fee is dependent on building)

• Thus,stdID building fee

Page 21: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Problems

• StdHousing is in 2NF, buto Redundant data will introduce modification

anomalyo Removing stdID 150 deletes fee value for

Ingersollo Fee for a new building—say Barrett—cannot

be recorded until a new stdID is entered

To 3NF

Page 22: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Solution

• Remove data that is not dependent on primary key and form new relation

• Create a relationship between the new and the original tables using foreign key

Page 23: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Solution

stdID Building

100 Randolf

150 Ingersol

200 Randolf

250 Pitkins

300 Randolf

Building Fee

Randolf 1200

Ingersoll 1100

Pitkins 1100

ResidenceFeeStudentResidence

Page 24: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Try This (Customers Table)

Back to Problem

Page 25: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Problem

• Note thato custNum ZIP ZIP city, state

I.e., custNum ZIP city, stateo Transitive dependence results in redundancy

and modification, insertion, & deletion anomalies.

To Customers

Page 26: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Solution

Page 27: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Summary

• Examine the attributes of an entity and ask the following questions. If the answer is any “Yes,” an attribute probably belong to another entity.

o Does an attribute or attributes describe an entity other than the current one?

o Does an attribute of the entity depend (functionally dependent) on only part of the primary key?

o Does an attribute depend on something other than the primary key?

Page 28: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPagerempPositionempPositionDescripempDateHireempPayRateempDateLastRaise

custIdcustNamecustAddresscustCitycustStatecustZipcustPhonecustFaxorderNumorderQuantityorderDate

prodIdprodDescripprodCost

Employees

empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPagerempPositionempPositionDescripempDateHireempPayRateempDateLastRaise

Customers

custIdcustNamecustAddresscustCitycustStatecustZipcustPhonecustFaxorderNumorderQuantityorderDate

Products

prodIdprodDescripprodCost

Company Database

Page 29: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Company Database (2)Employees

empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPagerempPositionempDateHireempPayRateempDateLastRaise

Employees

empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPager

EmployeePaysempIdempPositionempPositionDescripempDateHireempPayRateempDateLastRaise

Page 30: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Company Database

Customers

custIdcustNamecustAddresscustCitycustStatecustZipcustPhonecustFaxorderNumorderQuantityorderDate

Customers

custIdcustNamecustAddresscustCitycustStatecustZipcustPhonecustFax

Orders

custIdorderNumorderQuantityorderDate

Page 31: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

QuizNormalization is the process of grouping data into logically related data into tables to reduce redundancy. (T/F)

Having no duplicate or redundant data in a database, and having everything in the database normalized, is always the best way to go. (T/F)

If data is in the third normal form, it is automatically in the first and second normal forms. (T/F)

What is the major advantage of denormalized database versus a normalized database?

What are some major disadvantages of unnormalized database?

Page 32: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Exercise: What Type of Relationships Do the Tables Have?

Positions

os_idpositionposition_descrip

EmployeePays

empPayIdempDateHireempPayRateempDateLastRaise

Orders

orderNumorderQuantityorderDate

Customers

custIdcustNamecustAddresscustcitycustStatecustZipcustPhonecustFax

Employees

empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPager

Page 33: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Exercise: Normalize the following data.

Take the following data and normalize it. Keep in mind that, in a real DB, there would be many more items than what is given here.

Employees:

Angela Smith, secretary, RR 1 Box 73, Greensburg, IN, 47890, $9.50/hour, started Jan. 22, 1996, SSN is 323149669

Jack Lee Nelson, salesman, 3334 N. Main St., Brownsburg, IN, 45687, 317-852-9901, $35,000.00/year, data started 10/28/95, SSN is 312567342

Customers:

Robert’s Games & Things, 5612 Lafayette Rd., Indianapolis, IN, 46224, 317-291-7888, customer ID is 432A

Reed’s Dairy Bar, 4556 W 10th St., Indianapolis, IN, 46245, 317-271-9823, customer ID is 117A

CustomerOrders:Customer ID is 117A, date of last order is 2/20/1997, product ordered was napkins, and product ID is 661

Page 34: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Tables

Employees Customers OrdersSsnlastNamefirstNamestreetcitystatezipphoneNumsalaryhourlyRatestartDateposition

customerIDnamestreetcitystatezipphoneNum

orderIDcustomerIDproductIDproductDescripdateOrdered

Page 35: Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

Solutions