Design Of Databases •What is Good Design •Normalization
Page 1: Design of databases

Design Of Databases

• What is Good Design

• Normalization

Page 2: Design of databases

Pitfalls in Relational-Database Design

• Repetition of information

• Inability to represent certain information

Page 3: Design of databases

Repetition of Information

• Lending-schema = (branch-name, branch-city, assets, customer-name, customer-city, loan-number, amount)

• t[assets] is the asset figure for the branch named t[branch-name].

• t[branch-city] is the city in which the branch named t[branch-name] is located.

Page 4: Design of databases

Cont…

• t[loan-number] is the number assigned to a loan given by the branch named t[branch-name] to the customer named t[customer-name].

• t[amount] is the amount of the loan whose number is t[loan-number].

Page 5: Design of databases

Example of Repetition

• (branch-name, branch-city, assets, customer-name, customer-city, loan-number, amount)

• (Perryridge, Horseneck, 1700000, Adams, Brooklyn, L-31, 1500)

Page 6: Design of databases

• Suppose that we wish to add a new loan to our database. Say that the loan is made by the Perryridge branch to Adams in the amount of 1500. Let the loan-number be L-31. In our design, we need a tuple with values on all the attributes of Lending-schema.

• Thus, we must repeat the assets and city data for the Perryridge branch, and the customer-city data for Adams.

Page 7: Design of databases

Branch-Name  Branch-City  Assets   Customer-Name  Customer-City  Loan-Number  Amount
Perryridge   Horseneck    1700000  Adams          Brooklyn       L-31         1500
Perryridge   Horseneck    1700000  Adams          Brooklyn       L-32         30000
Perryridge   Horseneck    1700000  Adams          Brooklyn       L-33         2500
Perryridge   Horseneck    1700000  Bob            Horseneck      L-39         4500
Redwood      Palo Alto    2100000  Smith          Rye            L-23         2000
Redwood      Palo Alto    2100000  Smith          Rye            L-52         3000

Page 8: Design of databases

• Repeating information wastes space.

• Furthermore, it complicates updating the database.

• Suppose, for example, that the assets of the Perryridge branch change from 1700000 to 1900000.

• Each tuple with Branch-Name Perryridge must be updated.
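The update problem can be made concrete with a short Python sketch. The rows are the ones from the example table; the helper name `set_assets` is hypothetical and only illustrates that one logical change (a branch's assets) touches several tuples.

```python
# Tuples of the Lending relation, in the column order of the example table:
# (branch-name, branch-city, assets, customer-name, customer-city, loan-number, amount)
lending = [
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-31", 1500),
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-32", 30000),
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-33", 2500),
    ("Perryridge", "Horseneck", 1700000, "Bob", "Horseneck", "L-39", 4500),
    ("Redwood", "Palo Alto", 2100000, "Smith", "Rye", "L-23", 2000),
    ("Redwood", "Palo Alto", 2100000, "Smith", "Rye", "L-52", 3000),
]


def set_assets(rows, branch, new_assets):
    """Return the updated rows and the number of tuples that had to change."""
    updated, touched = [], 0
    for row in rows:
        if row[0] == branch:
            # The assets value is stored once per loan, so every copy must change.
            row = row[:2] + (new_assets,) + row[3:]
            touched += 1
        updated.append(row)
    return updated, touched


updated_rows, touched = set_assets(lending, "Perryridge", 1900000)
print(touched)  # 4: one update per Perryridge loan, not one per branch
```

If any one of those four tuples were missed, the relation would report two different asset figures for the same branch.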

Page 9: Design of databases

Inability to Represent Information

• Another problem with the Lending-schema design is that we cannot represent directly the information concerning a branch (branch-name, branch-city, assets) unless there exists at least one loan at the branch.

• One solution to this problem is to introduce null values.

Page 10: Design of databases

Functional Dependency

• We know that a bank branch has a unique value of assets, so given a branch name we can uniquely identify the assets value.

• In other words, we say that the functional dependency

branch-name → assets

holds.

Page 11: Design of databases

• The fact that a branch has a particular value of assets, and the fact that a branch makes a loan are independent; these facts are best represented in separate relations (Tables).

Page 12: Design of databases

Super Key

• Let R be a relation schema. A subset K of R is a superkey of R if, in any legal relation r(R), for all pairs t1 and t2 of tuples in r, t1[K] = t2[K] implies t1 = t2.

• That is, no two tuples in any legal relation r(R) may have the same value on attribute set K.
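A minimal Python sketch of this definition, checked against the example Lending tuples. The function name `could_be_superkey` and the use of column positions are assumptions made for illustration. Note that a superkey is a statement about every legal relation, so a single instance can only refute a candidate K, never prove it.

```python
# (branch-name, branch-city, assets, customer-name, customer-city, loan-number, amount)
lending = [
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-31", 1500),
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-32", 30000),
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-33", 2500),
    ("Perryridge", "Horseneck", 1700000, "Bob", "Horseneck", "L-39", 4500),
    ("Redwood", "Palo Alto", 2100000, "Smith", "Rye", "L-23", 2000),
    ("Redwood", "Palo Alto", 2100000, "Smith", "Rye", "L-52", 3000),
]


def could_be_superkey(rows, key_positions):
    """True if no two distinct tuples in this instance agree on the key columns."""
    distinct_rows = set(rows)
    distinct_keys = {tuple(row[i] for i in key_positions) for row in distinct_rows}
    # If two different tuples shared a key value, there would be fewer keys than rows.
    return len(distinct_keys) == len(distinct_rows)


print(could_be_superkey(lending, (5,)))    # True:  {loan-number}
print(could_be_superkey(lending, (0,)))    # False: {branch-name} repeats
print(could_be_superkey(lending, (3, 6)))  # True here, but only by accident of the data
```

The last call shows the limit of instance-based checks: {customer-name, amount} happens to be unique in these six tuples, yet nothing prevents a customer from taking two loans of the same amount.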

Page 13: Design of databases

Back to Functional Dependencies

• The notion of functional dependency generalizes the notion of superkey.

• Consider a relation schema R, and let α ⊆ R and β ⊆ R. The functional dependency

α → β

holds on schema R if, in any legal relation r(R), for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], it is also the case that t1[β] = t2[β].
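The definition translates almost directly into code. The sketch below tests whether a dependency is satisfied by one relation instance; `fd_holds` and the dictionary representation of tuples are illustrative assumptions. As with superkeys, a passing check on one instance does not establish that the dependency holds on the schema.

```python
ATTRS = ("branch-name", "branch-city", "assets", "customer-name",
         "customer-city", "loan-number", "amount")

lending = [dict(zip(ATTRS, values)) for values in [
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-31", 1500),
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-32", 30000),
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-33", 2500),
    ("Perryridge", "Horseneck", 1700000, "Bob", "Horseneck", "L-39", 4500),
    ("Redwood", "Palo Alto", 2100000, "Smith", "Rye", "L-23", 2000),
    ("Redwood", "Palo Alto", 2100000, "Smith", "Rye", "L-52", 3000),
]]


def fd_holds(rows, alpha, beta):
    """Check t1[alpha] = t2[alpha] implies t1[beta] = t2[beta] on this instance."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in alpha)
        right = tuple(row[b] for b in beta)
        # setdefault returns the beta-value first recorded for this alpha-value.
        if seen.setdefault(left, right) != right:
            return False
    return True


print(fd_holds(lending, ["branch-name"], ["assets"]))         # True
print(fd_holds(lending, ["customer-name"], ["loan-number"]))  # False
print(fd_holds(lending, ["loan-number"], ATTRS))              # True
```

The third call illustrates the generalization: K is a superkey exactly when K → R holds.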

Page 14: Design of databases

• Consider our original Lending-Schema. The functional dependencies on it are:

– Branch-Name → Branch-City (Branch Schema)

– Branch-Name → Assets (Branch Schema)

– Loan-Number → Amount (Loan Schema)

– Loan-Number → Branch-Name (Loan Schema)

– Loan-Number → Customer-Name (Loan Schema)

– Customer-Name → Customer-City (Customer Schema)

Page 15: Design of databases

Branch Schema

Branch-Name  Branch-City  Assets
Perryridge   Horseneck    1700000
Redwood      Palo Alto    2100000

Page 16: Design of databases

Loan Schema

Loan-Number  Customer-Name  Branch-Name  Amount
L-31         Adams          Perryridge   1500
L-32         Adams          Perryridge   30000
L-33         Adams          Perryridge   2500
L-39         Bob            Perryridge   4500
L-23         Smith          Redwood      2000
L-52         Smith          Redwood      3000

Page 17: Design of databases

Customer Schema

Customer-Name  Customer-City
Adams          Brooklyn
Bob            Horseneck
Smith          Rye
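As a rough check on the three tables, the following Python sketch projects the original Lending tuples onto the Branch, Loan and Customer attribute sets and joins them back. The helpers `project` and `natural_join` are hypothetical and written for this example only; they are not tied to any database library.

```python
ATTRS = ("branch-name", "branch-city", "assets", "customer-name",
         "customer-city", "loan-number", "amount")

lending = [dict(zip(ATTRS, values)) for values in [
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-31", 1500),
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-32", 30000),
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-33", 2500),
    ("Perryridge", "Horseneck", 1700000, "Bob", "Horseneck", "L-39", 4500),
    ("Redwood", "Palo Alto", 2100000, "Smith", "Rye", "L-23", 2000),
    ("Redwood", "Palo Alto", 2100000, "Smith", "Rye", "L-52", 3000),
]]


def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(left, right):
    """Join tuples that agree on every attribute the two relations share."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined


branch = project(lending, ("branch-name", "branch-city", "assets"))
loan = project(lending, ("loan-number", "customer-name", "branch-name", "amount"))
customer = project(lending, ("customer-name", "customer-city"))
print(len(branch), len(loan), len(customer))  # 2 6 3

rejoined = natural_join(natural_join(loan, branch), customer)
# The join reproduces exactly the original six tuples on this instance.
print(len(rejoined) == len(lending) and all(row in lending for row in rejoined))  # True
```

The branch and customer data are now stored once each (2 and 3 tuples) instead of once per loan, and no tuple is lost or invented by the join.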

Page 18: Design of databases

Closure on Set of Functional Dependencies

• Armstrong Rules:

• Reflexivity rule - If α is a set of attributes and β ⊆ α, then α → β holds.

• Augmentation rule - If α → β holds and γ is a set of attributes, then γα → γβ holds.

• Transitivity rule - If α →β holds and β → γ holds, then α → γ holds.

Page 19: Design of databases

Rules derived from Armstrong Rules

• Union rule. If α → β holds and α → γ holds, then α → βγ holds.

• Decomposition rule. If α → βγ holds, then α → β holds and α → γ holds.

• Pseudotransitivity rule. If α → β holds and γβ → δ holds, then αγ → δ holds.
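To see how such rules follow from Armstrong's three rules, here is one possible derivation of the union rule, written out step by step. It assumes only reflexivity, augmentation and transitivity as stated on the previous slide, and uses the fact that αα = α because these are sets of attributes.

```latex
\begin{align*}
&1.\quad \alpha \to \beta && \text{given} \\
&2.\quad \alpha \to \alpha\beta && \text{augmentation of 1 with } \alpha \text{ (since } \alpha\alpha = \alpha\text{)} \\
&3.\quad \alpha \to \gamma && \text{given} \\
&4.\quad \alpha\beta \to \beta\gamma && \text{augmentation of 3 with } \beta \\
&5.\quad \alpha \to \beta\gamma && \text{transitivity on 2 and 4}
\end{align*}
```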

Page 20: Design of databases

Algorithm to compute F+ (F closure)

F+ = F
repeat
    for each functional dependency f in F+
        apply reflexivity and augmentation rules on f
        add the resulting functional dependencies to F+
    for each pair of functional dependencies f1 and f2 in F+
        if f1 and f2 can be combined using transitivity
            add the resulting functional dependency to F+
until F+ does not change any further
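Materializing all of F+ this way can be very large, because reflexivity and augmentation alone generate a dependency for every pair of attribute subsets. A common alternative, sketched below in Python, is the attribute closure α+: the set of attributes determined by α under F. Then α → β is in F+ exactly when β ⊆ α+. This is a different technique from the procedure above, and the names `attribute_closure` and `FDS` are assumptions for illustration.

```python
# Each dependency is a pair (left side, right side) of attribute sets.
FDS = [
    (frozenset({"branch-name"}), frozenset({"branch-city", "assets"})),
    (frozenset({"loan-number"}), frozenset({"amount", "branch-name", "customer-name"})),
    (frozenset({"customer-name"}), frozenset({"customer-city"})),
]


def attribute_closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:  # same fixed-point shape as "until F+ does not change"
        changed = False
        for lhs, rhs in fds:
            # If the closure already contains lhs, it also determines rhs.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


print(sorted(attribute_closure({"branch-name"}, FDS)))
# ['assets', 'branch-city', 'branch-name']
print(len(attribute_closure({"loan-number"}, FDS)))  # 7: loan-number determines everything
```

Since the closure of {loan-number} is the whole Lending-schema, loan-number → customer-city is in F+ even though it is not listed in F.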

Page 21: Design of databases

Properties of Decomposition

• Lossless join decomposition

• Dependency Preservation

• Decrease in Repetition of Information

Page 22: Design of databases

Boyce–Codd Normal Form

A relation schema R is in BCNF with respect to a set F of functional dependencies if, for all functional dependencies in F+ of the form α → β, where α ⊆ R and β ⊆ R, at least one of the following holds:

• α → β is a trivial functional dependency (that is, β ⊆ α).

• α is a superkey for schema R.
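A minimal sketch of this test in Python. It relies on a simplification that is valid when F is stated over R itself: it is enough to check the dependencies in F rather than all of F+. For a sub-schema produced by decomposition this shortcut is not sufficient in general. The names `closure` and `is_bcnf` are illustrative assumptions.

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_bcnf(schema, fds):
    """Every dependency in fds must be trivial or have a superkey on the left."""
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency
        if not schema <= closure(lhs, fds):
            return False  # lhs does not determine all of schema: not a superkey
    return True


BRANCH_FDS = [(frozenset({"branch-name"}), frozenset({"branch-city", "assets"}))]
LENDING_FDS = BRANCH_FDS + [
    (frozenset({"loan-number"}), frozenset({"amount", "branch-name", "customer-name"})),
    (frozenset({"customer-name"}), frozenset({"customer-city"})),
]
LENDING = frozenset({"branch-name", "branch-city", "assets", "customer-name",
                     "customer-city", "loan-number", "amount"})
BRANCH = frozenset({"branch-name", "branch-city", "assets"})

print(is_bcnf(LENDING, LENDING_FDS))  # False: branch-name is not a superkey of Lending
print(is_bcnf(BRANCH, BRANCH_FDS))    # True
```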

Page 23: Design of databases

• A database design is in BCNF if each member of the set of relation schemas that constitutes the design is in BCNF.

• Branch Schema, Loan Schema and Customer Schema together form a BCNF decomposition of the Lending-Schema.

Page 24: Design of databases

BCNF Decomposition Algorithm

result := {R};
done := false;
compute F+;
while (not done) do
    if (there is a schema Ri in result that is not in BCNF)
    then begin
        let α → β be a nontrivial functional dependency that holds on Ri
            such that α → Ri is not in F+, and α ∩ β = ∅;
        result := (result − Ri) ∪ (Ri − β) ∪ (α, β);
    end
    else done := true;
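The algorithm can be sketched in Python as follows. Instead of computing F+ up front, this version looks for a violating α → β on each Ri by taking attribute closures of the subsets of Ri, which is exponential in the size of Ri and only meant for small examples. The function names are assumptions for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(schema, fds):
    """Return (alpha, beta) with alpha -> beta nontrivial on schema and
    alpha not a superkey of schema, or None if schema is in BCNF."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            alpha = frozenset(combo)
            implied = closure(alpha, fds) & schema
            beta = implied - alpha  # alpha and beta are disjoint by construction
            if beta and implied != schema:
                return alpha, frozenset(beta)
    return None


def bcnf_decompose(schema, fds):
    result = [frozenset(schema)]
    done = False
    while not done:
        done = True
        for ri in result:
            violation = find_violation(ri, fds)
            if violation:
                alpha, beta = violation
                # result := (result - Ri) U (Ri - beta) U (alpha, beta)
                result.remove(ri)
                result.extend([ri - beta, alpha | beta])
                done = False
                break  # result changed, so restart the scan
    return result


FDS = [
    (frozenset({"branch-name"}), frozenset({"branch-city", "assets"})),
    (frozenset({"loan-number"}), frozenset({"amount", "branch-name", "customer-name"})),
    (frozenset({"customer-name"}), frozenset({"customer-city"})),
]
LENDING = {"branch-name", "branch-city", "assets", "customer-name",
           "customer-city", "loan-number", "amount"}

for ri in bcnf_decompose(LENDING, FDS):
    print(sorted(ri))
```

On the Lending-schema with the dependencies listed earlier, this yields the same three schemas as the Branch, Loan and Customer tables. In general the result of BCNF decomposition depends on the order in which violations are picked.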