Database Design and the Entity/Relationship Model€¦ · Database Design and the Entity/Relationship Model ... Entity 1 Relationship Entity 2 Attribute 1a Attribute 1b Attribute

Post on 06-Apr-2020

12 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

Transcript

Database Design and the

Entity/Relationship Model

You’ve just been hired by Bank of America as their DBA for their online banking web site.

You are asked to create a database that

monitors:

◦ customers

◦ accounts

◦ loans

◦ branches

◦ transactions, …

Now what??!!!

1. Requirements Specification

◦ Determine the requirements of clients

2. Conceptual Design◦ Express client requirements in terms of some high-level

model (E/R model).◦ Confirm with clients that requirements are correct.

3. Functional Requirements◦ Specify required data operations ◦ priorities, response times

4. Logical Design◦ Convert E/R model to relational, object-based, XML-based,…

5. Physical Design◦ Specify file organizations, build indexes

Conceptual DesignThe E/R Data Model

What is a Data Model?

Framework for organizing and interpreting data

Example: E/R Data Model

Entity1 Entity2Relationship

Attribute1a Attribute1bAttribute2a Attribute2b

Attribute2c

E/R Data ModelBasics

Entities noun phrases (e.g., Bob Smith, Thayer St. Branch)

contained in entity sets (e.g. Employee, Branch)

have attributes (e.g., Employee = (essn, ename, …))

Relationships verb phrases (e.g., works_at, works_for)

relate 2 (binary) or more (n-ary) entities

relationship sets characterize relationships amongst entity sets

e.g., (Bob Smith, Thayer St Branch) Works_At

E/R Data ModelAn Example

Employee Works_At

essn

Branch

ename

phone

children

since seniority bname bcity

Works_Formanager

workerEntity Set

Relationship Set

Attribute

Employee

Works_At

phone

Lots of notation to come.

Color is irrelevant

E/R Data ModelTypes of Attributes

Employee Works_At

essn

Branch

ename

phone

children

since seniority bname bcity

Works_Formanager

worker Defaultename

children

seniority

Multivalued

Derived

E/R Data ModelTypes of relationships

Employee Works_At

essn

Branch

ename

phone

children

since seniority bname bcity

Works_Formanager

workerMany-to-One (n:1)

Many-to-Many (n:m)Works_At

Works_For

E/R Data ModelRecursive relationships

Employee Works_At

essn

Branch

ename

phone

children

since seniority bname bcity

Works_Formanager

worker

Employeemanager

worker Works_For

Recursive relationships: Must be declared with roles

E/R Data ModelDesign Issue #1: Entity Sets vs. Attributes

Employee

phone_no phone_loc

Employee

no loc

PhoneUsesvs

(a) (b)

To resolve, determine how phones are used 1. Can many employees share a phone?

(If yes, then (b))

2. Can employees have multiple phones?

(if yes, then (b), or (a) with multivalued attributes)

3. Else (a)

An Example: How to model bank loans

E/R Data ModelDesign Issue #2: Entity Sets vs. Relationship Sets

Customer

cssn cname

vs

(a)

To resolve, determine how loans are issued 1. Can there be more than one customer per loan?

If yes, then (a). Otherwise, loan info must be replicated for each customer (wasteful, potential update anomalies)

2. Is loan a noun or a verb?

Both, but more of a noun to a bank. (hence (a) probably more

appropriate)

Borrows Loan

lno amt

Customer

cssn cname

(b)

Loans Branch

bname bcity

lno amt

An Example:

E/R Data ModelDesign Issue #3: Relationship Cardinalities

Customer Borrows Loan? ?

Variations on Borrows:

1. Can a customer hold multiple loans?

2. Can a loan be jointly held by more than 1 customer?

E/R Data ModelDesign Issue #3: Relationship Cardinalities

Customer Borrows Loan? ?

Type Illustrated Multiple Loans? Joint Loans?

One-to-One (1:1) No No

Many-to-one (n:1) No Yes

One-to-many (1:n) Yes No

Many-to-many (n:m) Yes Yes

Cardinalities of Borrows:

Borrows

Borrows

Borrows

Borrows

E/R Data ModelDesign Issue #3: Relationship Cardinalities (cont)

In general...

1:1

n:1

1:n

n:m

An Example: Works_At

E/R Data ModelDesign Issue #4: N-ary vs Binary Relationship Sets

Employee Works_at Branch

Dept

Employee WAE Branch

Dept

WABinary:

Ternary:

WAB

WAD

vs

(Joe, Thayer, Acct) Works_At

(Joe, w3) WAE

(Thayer, w3) WAB

(Acct, w3) WAD

Choose n-ary

when possible!

Key = set of attributes identifying individual entities or relationships

E/R Data ModelKeys

Employee

essn ename eaddress ephone

A. Superkey: any attribute set that distinguishes identities e.g., {essn}, {essn, ename, eaddress}

B. Candidate Key:

“minimal superkey” (can’t remove attributes and preserve “keyness”) e.g., {essn}, {ename, eaddress}

C. Primary Key:

candidate key chosen as the key by a DBA e.g., {essn} (denoted by underline)

E/R Data ModelRelationship Set Keys

Employee

essn ename ...

Works_At Branch

bname bcity ...

Q: What attributes are needed to represent

relationships in Works_At?

since

e1

e2

e3

b1

b2

A: {essn, bname, since}

E/R Data ModelRelationship Set Keys (cont.)

Q: What are the candidate keys of Works_At?

e1

e2

e3

b1

b2

A: {essn}

Employee

essn ename ...

Works_At Branch

bname bcity ...since

E/R Data ModelRelationship Set Keys (cont.)

Q: What are the candidate keys if Works_At is...?

A: {essn, bname} b. n:m

a. 1:n

c. 1:1

A: {bname}

A: {essn} or {bname}

Employee

essn ename ...

Works_At Branch

bname bcity ...since

? ?

General Rules for Relationship Set Keys

E/R Data ModelRelationship Set Keys (cont.)

E1

P (E1)...

R E2

P (E2)...

If R is:

R

1:1

1:n

n:1

n:m

Candidate Keys

P (E1) or P (E2)

P (E2)

P (E1)

P (E1) P (E2)

Idea:◦ Existence of one entity depends on another

Example: Loans and Loan Payments

E/R Data ModelExistence Dependencies and Weak Entity Sets

Loan

lno lamt

Loan_Pmt Payment

pno pdate pamt

Weak Entity Set

Identifying Relationship

Total Participation

E/R Data ModelExistence Dependencies and Weak Entity Sets

Weak Entity Sets

existence of payments depends upon loans

have no superkeys: different payment records (for different loans) can be identical

instead of keys, discriminators: discriminate

between payments for given loan (e.g., pno)

Loan

lno lamt

Loan_Pmt Payment

pno pdate pamt

E/R Data ModelExistence Dependencies and Weak Entity Sets

Identifying Relationships

We say:

Loan is dominant in Loan_Pmt

Payment is subordinate in Loan_Pmt

Payment is existence dependent on Loan

Loan

lno lamt

Loan_Pmt Payment

pno pdate pamt

E/R Data ModelExistence Dependencies and Weak Entity Sets

All elements of Payment appear in Loan_Pmt

Loan

lno lamt

Payment

pno pdate pamt

Loan_Pmt

E/R Data ModelExistence Dependencies and Weak Entity Sets

E1

attam

E2

attb1 attbn

Q. Is {attb1, …, attbn} a superkey of E2?

......atta1

A: No

Q. Name a candidate key of E2

A: {atta1, attb1}

R

Q. Does total participation of E2 in R E2 is existence-dep?

A: No

E/R Data ModelExtensions to the Model: Specialization and Generalization

An Example: Customers can have checking and savings accts

Checking ~ Savings (many of the same attributes)

Old Way:

Customer Has1 Savings Acct

acct_no balance interest

Has2 Checking Acct

acct_no balance overdraft

E/R Data ModelExtensions to the Model: Specialization and Generalization

Customer Has Account

acct_no balance

Checking Acct

overdraftinterest

Isa

Savings Acct

An Example: Customers can have checking and savings accts

Checking ~ Savings (many of the same attributes)

New Way:

superclass

subclasses

E/R Data ModelExtensions to the Model: Specialization and Generalization

Subclass Distinctions:

1. User-Defined vs. Condition-Defined

User: Membership in subclasses explicitly determined

(e.g., Employee, Manager < Person)

Condition: Membership predicate associated with

subclasses - e.g:

Person

Isa

Child Adult Senior

age < 18 18 age age > 65

E/R Data ModelExtensions to the Model: Specialization and Generalization

Subclass Distinctions:

2. Overlapping vs. Disjoint

Overlapping: Entities can belong to >1 entity set

(e.g., Adult, Senior)

Disjoint: Entities belong to exactly 1 entity set

(e.g., Child)

Person

Isa

Child Adult Senior

age < 18 18 age age > 65

E/R Data ModelExtensions to the Model: Specialization and Generalization

Subclass Distinctions:

3. Total vs. Partial Membership

Total: Every entity of superclass belongs to a subclass

e.g.,

Partial: Some entities of superclass do not belong to any

subclass (e.g., Suppose Adult condition is age 21 )

Person

Isa

Child Adult Senior

age < 18 age 18 age 65

E/R Data ModelExtensions to the Model: Aggregation

E/R: No relationships between relationships E.g.: Associate loan officers with Borrows relationship set

Customer LoanBorrows

Employee

Loan_Officer

?

Associate Loan Officer with Loan?

What if we want a loan officer for every (customer, loan) pair?

E/R Data ModelSummary

Entities, Relationships (sets)

Both can have attributes (simple, multivalued, derived,

composite)

Cardinality or relationship sets (1:1, n:1, n:m)

Keys: superkeys, candidate keys, primary key

DBA chooses primary key for entity sets

Automatically determined for relationship sets

Weak Entity Sets, Existence Dependence, Total/Partial

Participation

Specialization and Generalization (E/R + inheritance)

E/R Relational Schema

Entity Sets

E = (a1, …, an)E

a1an…

E/R Relational Schema

Entity Sets

E = (a1, …, an)

Relationship Sets

R = (a1, b1, c1, …, ck)

a1: E1’s key

b1: E2’s key

c1, …, ck: attributes of R

E

a1an…

E2

bm

E1

a1an

ck

R

…a1 b1

c1

Not the whole story for Relationship Sets …

What about…E2

bm

E1

a1an

ck

R

…a1

b1c1

Rule of Thumb

Fewer tables

good, as long as

no redundancy

• a1 is a key for R

• a1 also a key for E1 = (a1, …, an)

Could have: R = (a1, b1, c1, …, ck) but…

• Ignore R

• Add b1, c1, …, ck to E1 instead (i.e: E1 = (a1, …, an,, b1, c1, …, ck))

Instead:

Relationship Cardinality Relational Schema

n:m E1 = (a1, …, an)E2 = (b1, …, bm)R = (a1, b1, c1, …, cn)

E2

bm

E1

a1an

ck

R

……

…a1 b1

c1

? ?

R

Relationship Cardinality Relational Schema

n:m E1 = (a1, …, an)E2 = (b1, …, bm)R = (a1, b1, c1, …, cn)

n:1 E1 = (a1, …, an, b1, c1, …, cn)E2 = (b1, …, bm)

E2

bm

E1

a1an

ck

R

……

…a1 b1

c1

? ?

R

R

Relationship Cardinality Relational Schema

n:m E1 = (a1, …, an)E2 = (b1, …, bm)R = (a1, b1, c1, …, cn)

n:1 E1 = (a1, …, an, b1, c1, …, cn)E2 = (b1, …, bm)

1:n E1 = (a1, …, an)E2 = (b1, …, bm,, a1, c1, …, cn)

R

R

R

E2

bm

E1

a1an

ck

R

……

…a1 b1

c1

? ?

Relationship Cardinality Relational Schema

n:m E1 = (a1, …, an)E2 = (b1, …, bm)R = (a1, b1, c1, …, cn)

n:1 E1 = (a1, …, an, b1, c1, …, cn)E2 = (b1, …, bm)

1:n E1 = (a1, …, an)E2 = (b1, …, bm,, a1, c1, …, cn)

1:1Treat as n:1 or 1:n

R

R

R

R

E2

bm

E1

a1an

ck

R

……

…a1 b1

c1

? ?

Acct-BranchAccount

acct_no balance

Branch

bname bcity assets

BorrowerCustomer

cname ccity

Loan

lno amtcstreet

Depositor Loan-Branch

Q. How many tables does this get translated into?

A. 6 (account, branch, customer, loan, depositor, borrower)

Q. What are the schemas?

Acct-BranchAccount

acct_no balance

Branch

bname bcity assets

BorrowerCustomer

cname ccity

Loan

lno amtcstreet

Depositor Loan-Branch

Account

bname acct_no balance

Depositor

cname acct_no

Customer

cname cstreet ccity

Branch

bname bcity assets

Borrower

cname lno

Loan

bname lno amt

Account

bname acct_no balance

DowntownMianusPerryR.H.

BrightonRedwoodBrighton

A-101A-215A-102A-305A-201A-222A-217

500700400350900700750

Depositor

cname acct_no

JohnsonSmithHayesTurner

JohnsonJones

Lindsay

A-101A-215A-102A-305A-201A-217A-222

Customer

cname cstreet ccity

JonesSmithHayesCurry

LindsayTurner

WilliamsAdams

JohnsonGlennBrooksGreen

MainNorthMainNorthPark

PutnamNassauSpringAlma

Sand HillSenatorWalnut

HarrisonRye

HarrisonRye

PittsfieldStanfordPrincetonPittsfieldPalo AltoWoodsideBrooklynStanford

Branch

bname bcity assets

DowntownRedwood

PerryMianusR.H.

PownelN. TownBrighton

BrooklynPalo Alto

HorseneckHorseneckHorseneckBennington

RyeBrooklyn

9M2.1M1.7M0.4M8M

0.3M3.7M7.1M

Borrower

cname lno

JonesSmithHayes

JacksonCurrySmith

WilliamsAdams

L-17L-23L-15L-14L-93L-11L-17L-16

Loan

bname lno amt

DowntownRedwood

PerryDowntown

MianusR.H.Perry

L-17L-23L-15L-14L-93L-11L-16

10002000150015005009001300

E/R Relational Schema

Weak Entity Sets

E1 = (a1, …, an)

E2 = (a1, b1, …, bm)E1

a1an

E2

b1 … bm

IR

E/R Relational Schema

Multivalued Attributes

Emp = (ssn, name)

Emp-Depts = (ssn, dept)

Emp

ssn namedept

Emp

ssn name

001

Smith

Emp-Depts

ssn dept

001

001

Acct

Sales

E/R Relational Schema

Subclasses

Method 1:

E = (a1, …, an)

E1 = (a1, b1, …, bm)

E2 = (a1, c1, …, ck)

E

a1an

E2

c1b1

Isa

E1

… bm … ck

E/Rb Relational Schema

Subclasses

Method 1:

E = (a1, …, an)

E1 = (a1, b1, …, bm)

E2 = (a1, c1, …, ck)

Method 2:

E1 = (a1, …, an, b1, …, bm)

E2 = (a1, …, an, c1, …, ck)

E

a1an

E2

c1b1

Isa

E1

… bm … ck

Subclasses example:

Method 1:Account = (acct_no, balance)SAccount = (acct_no, interest)CAccount = (acct_no, overdraft)

Method 2:SAccount = (acct_no, balance, interest)CAccount = (acct_no, balance, overdraft)

Q: When is method 2 not possible?

A: When subclassing is partial

top related