C20.0046: Database Management Systems, Lecture #3
Matthew P. Johnson, Stern School of Business, NYU, Spring 2005

Feb 02, 2016

Page 1: C20.0046: Database Management Systems Lecture #3

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

C20.0046: Database Management Systems

Lecture #3

Matthew P. Johnson

Stern School of Business, NYU

Spring, 2005

Page 2: C20.0046: Database Management Systems Lecture #3

Admin
  Textbooks?
  This afternoon

Page 3: C20.0046: Database Management Systems Lecture #3

Agenda
  Last time: E/R models, some design issues
  This time:
    More design: “carving at the joints”
      Redundancy
      Whether an element should be an attribute or an entity set
      Replacing a relationship with entity sets
    Constraints
      Identifying & specifying key attributes for an entity set
      Recognizing other types of single-valued constraints
      Representing referential integrity constraints
      Identifying & representing general constraints
    Weak entity sets

Page 4: C20.0046: Database Management Systems Lecture #3

Review
  Multiplicity review:
    Square-of? (e.g., (3,9))
    Cube-of? (e.g., (-3,-27))
    Wife-of?
    Wife-of-in-certain-other-cultures?
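
The multiplicity questions above can be explored on sample data. The sketch below is illustrative only: the function name `multiplicity` and the sample ranges are assumptions, not part of the lecture, and a finite sample can only refute a multiplicity claim, never prove it for the whole relation.

```python
from collections import defaultdict

def multiplicity(pairs):
    """Classify a binary relation, given as (left, right) pairs."""
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for left, right in pairs:
        rights_per_left[left].add(right)
        lefts_per_right[right].add(left)
    # Each left value maps to at most one right value?
    left_to_one = all(len(r) <= 1 for r in rights_per_left.values())
    # Each right value is reached from at most one left value?
    right_to_one = all(len(l) <= 1 for l in lefts_per_right.values())
    if left_to_one and right_to_one:
        return "one-one"
    if left_to_one:
        return "many-one"
    if right_to_one:
        return "one-many"
    return "many-many"

# Hypothetical samples of the two arithmetic relations from the slide.
square_of = [(n, n * n) for n in range(-3, 4)]   # contains (3, 9) and (-3, 9)
cube_of = [(n, n ** 3) for n in range(-3, 4)]    # contains (-3, -27)
```

On these samples, `multiplicity(square_of)` gives "many-one" (both 3 and -3 map to 9) and `multiplicity(cube_of)` gives "one-one".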

Page 5: C20.0046: Database Management Systems Lecture #3

Design Principles
  Faithfulness
  Simplicity
  Avoiding redundancy
  Choice of relationships
  Picking elements

Page 6: C20.0046: Database Management Systems Lecture #3

Simplicity
  Einstein: Theories should be as simple as possible, but not simpler.
  Use as few elements as possible
    Minimum required relations
    No unnecessary attributes (will you be using this attribute?)
    Eliminate “spinning wheels”
  Example: how can we simplify this?
  [E/R diagram: entity sets Movies, Ownings, Studios; relationships Owned-by, Owns]

Page 7: C20.0046: Database Management Systems Lecture #3

Avoiding redundancy
  Say everything exactly once
    Minimize database storage requirements
    More important: prevent possible update errors
      simplest (but not only) e.g.: modify data in one place but not the other (more later)
  Example: spot the redundancy
  [E/R diagram: Studios and Movies joined by Own; Movies has attributes Name, Length, StudioName; Studios has attributes Name, Address]

Page 8: C20.0046: Database Management Systems Lecture #3

Avoiding redundancy
  Example: spot the redundancy
  [E/R diagram: Studios and Movies joined by Own; Movies has attributes Name, Length, StudioName; Studios has attributes Name, Address, Phone]
  Redundancy: Movies “knows” the studio two ways

Page 9: C20.0046: Database Management Systems Lecture #3

Spot more redundancy
  Different redundancy: studio info listed for every movie!
  [E/R diagram: Movies with attributes Name, Length, StudioName, SAddress, SPhone]

  Name            Length  Studio   SAddress  SPhone
  Pulp Fiction    …       Miramax  NYC       212-…
  Sylvia          …       Miramax  NYC       212-…
  Jay & Sil. Bob  …       Miramax  NYC       212-…
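
To make the update error concrete, here is a small sketch of the flat Movies table and of the same data with each studio stored once. The dictionary layout, the key names, and the new address "LA" are assumptions made up for illustration.

```python
# One flat "Movies" table, as on the slide: studio data repeated per movie.
flat_movies = [
    {"name": "Pulp Fiction", "studio": "Miramax", "s_address": "NYC"},
    {"name": "Sylvia", "studio": "Miramax", "s_address": "NYC"},
]

# Update error: the address is changed in one row but not the other.
flat_movies[0]["s_address"] = "LA"  # hypothetical new address
addresses = {m["s_address"] for m in flat_movies if m["studio"] == "Miramax"}
# `addresses` now holds two conflicting values for the same studio.

# Storing each studio exactly once removes the possibility of disagreement.
studios = {"Miramax": {"address": "NYC"}}
movies = [
    {"name": "Pulp Fiction", "studio": "Miramax"},
    {"name": "Sylvia", "studio": "Miramax"},
]
studios["Miramax"]["address"] = "LA"  # one update, seen by every movie
seen = {studios[m["studio"]]["address"] for m in movies}
```

After the flat update, `addresses` contains both "LA" and "NYC"; in the second layout, `seen` contains only "LA".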

Page 10: C20.0046: Database Management Systems Lecture #3

Don’t add relationships that are implied
  Suppose each course again has <=1 TA
  Q: Is the following good design?
  [E/R diagram: Students and Courses joined by Enrolls; Courses and TAs joined by TA-of; Students and TAs joined by Assist]
  A: If TAs other than the course’s TA can help students, then yes;
     if not, then no: we can connect Students and TAs by going through Courses; redundant!
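
"Going through Courses" can be sketched as a composition of two relationships. The student, course, and TA names below are hypothetical sample data, and the set/dict encoding is just one convenient way to write it down.

```python
# Enrolls as (student, course) pairs; TA-of as course -> TA (at most one TA per course).
enrolls = {("alice", "DBMS"), ("bob", "DBMS"), ("bob", "Algorithms")}
ta_of = {"DBMS": "howard"}  # Algorithms currently has no TA

# Derive the student/TA pairs by composing Enrolls with TA-of.
assist_derived = {
    (student, ta_of[course])
    for student, course in enrolls
    if course in ta_of
}
```

If only a course's own TA may help its students, a stored Assist relationship would hold exactly these derived pairs, which is why it would be redundant.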

Page 11: C20.0046: Database Management Systems Lecture #3

Correct E/R models may contain loops
  Person plays multiple roles:
    employee of company
    buyer of product
  [E/R diagram: Person (name, ssn, address), Company (name, stockprice), Product (name, category, price); relationships buys, makes, employs]

Page 12: C20.0046: Database Management Systems Lecture #3

More design
  Q: What’s wrong with this design?
  [E/R diagram: Students and Courses joined by Enrolls; attributes Course-ID, CName, TA-Name, TA-ID, TA-Email]
  A:
    Repeating TA names & IDs is redundant
    If a TA is not TAing any course now, we lose the TA’s data!
    TA should get its own entity set

Page 13: C20.0046: Database Management Systems Lecture #3

Opposite problem: Entity or attribute?
  Some E/Rs are improved by removing entities
  Can convert entity E into attributes of F if
    1. R: F → E is many-one
         (one-one counts because it is a special case)
    2. Attributes of E are independent of each other
         (knowing one attribute value doesn’t tell us another attribute value)
  Then:
    remove E
    add all attributes of E to F

Page 14: C20.0046: Database Management Systems Lecture #3

Entity → attribute
  [E/R diagram, before: Students and Courses joined by Enrolls; Courses (Course-ID, CName, Room) and TA (TA-Name) joined by Assists]
  [E/R diagram, after: Students and Courses joined by Enrolls; Courses (Course-ID, CName, Room, TA-Name)]

Page 15: C20.0046: Database Management Systems Lecture #3

Convert TA entity again?
  [E/R diagram: Students and Courses joined by Enrolls; Courses (Course-ID, CName, Room) and TA (TA-Name) joined by Assists]
  No! Multiple TAs allowed
    Violates condition (1)
    Redundant course data

  CName  CID  Room  TA-Name
  DBMS   46   123   Howard
  DBMS   46   123   Wesley

Page 16: C20.0046: Database Management Systems Lecture #3

Convert TA entity again?
  [E/R diagram: Students and Courses joined by Enrolls; Courses (Course-ID, CName, Room) and TA (TA-Name, TA-ID, TA-Favorite-Color) joined by Assists]
  No! TA has dependent fields
    Violates condition (2)
    How can we tell? Redundant TA data

  CName    TA-Name  TA-ID  TA-Color
  DBMS     Ralph    678    Green
  A.Soft.  Ralph    678    Green
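
One way to look for dependent fields in a sample table is to check whether rows that agree on one attribute ever disagree on another. The helper `determines` below is a hypothetical name for such a check; note that sample data can only be consistent with a dependency, it cannot establish that the dependency always holds.

```python
# The two rows from the slide's table.
rows = [
    {"CName": "DBMS", "TA-Name": "Ralph", "TA-ID": 678, "TA-Color": "Green"},
    {"CName": "A.Soft.", "TA-Name": "Ralph", "TA-ID": 678, "TA-Color": "Green"},
]

def determines(rows, source, target):
    """True if no two rows agree on `source` but differ on `target`."""
    seen = {}
    for row in rows:
        # Remember the first target value seen for each source value.
        if seen.setdefault(row[source], row[target]) != row[target]:
            return False
    return True
```

Here `determines(rows, "TA-ID", "TA-Name")` and `determines(rows, "TA-ID", "TA-Color")` are both true, while `determines(rows, "TA-ID", "CName")` is false: the TA's name and color travel with the TA-ID and are repeated for every course.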

Page 17: C20.0046: Database Management Systems Lecture #3

Entity or attributes?
  Should student address be an entity or an attribute?
  If a student may have multiple addresses, it must be an entity
    campus address, permanent address
    attributes cannot be set-valued
  If we need to examine the structure of the address, it must be an entity
    e.g. find all students from NYS but not NYC
  If it is an attribute, then it’s probably a simple string: no structure!
  NB: this choice is a microcosm of the entire miniworld
    much of the power of a DB comes from the structure imposed on the data
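
A sketch of why structure matters for the "NYS but not NYC" query: with the address modeled as its own structured object, the query is a simple filter. The `Address` class, its fields, the student names, and the reading of "from" as the permanent address are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Address:
    kind: str   # e.g. "campus" or "permanent"
    city: str
    state: str

# Hypothetical students; each may have several addresses (set-valued).
students = {
    "alice": [Address("campus", "New York", "NY"), Address("permanent", "Albany", "NY")],
    "bob": [Address("permanent", "Buffalo", "NY")],
    "carol": [Address("permanent", "Newark", "NJ")],
}

def from_nys_not_nyc(addresses):
    """Permanent address is in New York State but outside New York City."""
    return any(
        a.kind == "permanent" and a.state == "NY" and a.city != "New York"
        for a in addresses
    )

upstate = sorted(name for name, addrs in students.items() if from_nys_not_nyc(addrs))
```

With these sample records `upstate` is `["alice", "bob"]`. If the address were a single unstructured string, the same query would need fragile text matching.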

Page 18: C20.0046: Database Management Systems Lecture #3

Larger example DB design
  Application: library database. Authors have written books about various subjects; different libraries in the system may carry these books.
  Entities (with attributes in parentheses):
    Authors (ssn, name, phone, birthdate)
    Books (ISBN, title)
    Subjects (sname, sid)
    Libraries (lname)
  Relationships [associating entities in square brackets]:
    Wrote-on [Authors, Subjects]
    Cover [Libraries, Subjects]
    On [Books, Subjects]

Page 19: C20.0046: Database Management Systems Lecture #3

E/R of DB design
  [E/R diagram: Author (Name, ssn, phone, birthdate), Subject (SName), Library (LName), Book (ISBN, Title); relationships wrote-on, Carries, On]

Page 20: C20.0046: Database Management Systems Lecture #3

Poor initial design
  First design is a poor model of this system
  Some info not captured:
    How many copies does a library have of a given book?
    What edition of a book does the library have?
  Design problems:
    no direct relationship associating authors and books
    no direct relationship associating libraries and books
  Common queries are complex and difficult/expensive:
    What libraries carry books by a given author?
    What books has a given author written?
    Who is the author of a given book?

Page 21: C20.0046: Database Management Systems Lecture #3

Larger example DB design 2
  Application: library database as before
  Entities (with attributes in parentheses):
    Authors (ssn, name, phone, birthdate)
    Books (ISBN, title)
    Subjects (sname, sid)
    Libraries (lname)
  Relationships [associating entities in square brackets] (attributes in parentheses):
    Wrote [Authors, Books]
    Carries [Libraries, Books] (quantity, edition)
    On [Books, Subjects]

Page 22: C20.0046: Database Management Systems Lecture #3

E/R of improved DB design
  Rule of thumb: if often queried together, make closely connected
  [E/R diagram: Author (Name, ssn, phone, birthdate) and Book (ISBN, Title) joined by wrote; Library (LName) and Book joined by Carries (Edition, Quantity); Book and Subject (SName) joined by On]
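
As a rough illustration of the rule of thumb, the improved design can be written down as tables and the query "What libraries carry books by a given author?" becomes a two-step join. This is a sketch under assumptions: the table encoding, the trimmed attribute lists, the choice of key for Carries, and all sample rows are made up here, and Subjects is left out because this query does not need it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Authors   (ssn TEXT PRIMARY KEY, name TEXT);
CREATE TABLE Books     (isbn TEXT PRIMARY KEY, title TEXT);
CREATE TABLE Libraries (lname TEXT PRIMARY KEY);
-- Wrote [Authors, Books]
CREATE TABLE Wrote (
    ssn  TEXT REFERENCES Authors(ssn),
    isbn TEXT REFERENCES Books(isbn),
    PRIMARY KEY (ssn, isbn)
);
-- Carries [Libraries, Books] (quantity, edition); the key choice is an assumption
CREATE TABLE Carries (
    lname    TEXT REFERENCES Libraries(lname),
    isbn     TEXT REFERENCES Books(isbn),
    edition  INTEGER,
    quantity INTEGER,
    PRIMARY KEY (lname, isbn, edition)
);
-- Hypothetical sample rows
INSERT INTO Authors VALUES ('111', 'A. Author');
INSERT INTO Books VALUES ('isbn-1', 'First Book'), ('isbn-2', 'Second Book');
INSERT INTO Libraries VALUES ('Main'), ('Branch');
INSERT INTO Wrote VALUES ('111', 'isbn-1');
INSERT INTO Carries VALUES ('Main', 'isbn-1', 2, 3), ('Branch', 'isbn-2', 1, 1);
""")

def libraries_carrying(author_name):
    """Libraries that carry at least one book written by the given author."""
    rows = conn.execute(
        """SELECT DISTINCT c.lname
           FROM Authors a
           JOIN Wrote w   ON w.ssn = a.ssn
           JOIN Carries c ON c.isbn = w.isbn
           WHERE a.name = ?
           ORDER BY c.lname""",
        (author_name,),
    ).fetchall()
    return [lname for (lname,) in rows]
```

With the sample rows, `libraries_carrying("A. Author")` returns `["Main"]`. In the first design the same question would have to route through Subjects, which cannot say which library holds which book.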

Page 23: C20.0046: Database Management Systems Lecture #3

Next topic: Constraints
  Review: programmer-defined rules stating what should always be true about consistent databases
  Restrictions on data:
    Keys (e.g. SSNs uniquely identify people)
    Single-value constraints (e.g. everyone has 1 father)
    Referential integrity (e.g. if a person’s record refers to a father, the father must exist)
    Domain constraints (e.g. gender in M/F, age in 0..150)
    General constraints (e.g. no more than 10 customers per sales rep)
  Can’t infer constraints from data
    may hold “accidentally”
    they are a part of the schema
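
Because constraints belong to the schema, they are declared rather than discovered. The sketch below declares one example of several of the kinds listed above using SQLite from Python; the table name `People`, its columns, and the sample rows are assumptions chosen to mirror the slide's examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("""
CREATE TABLE People (
    ssn        TEXT PRIMARY KEY,                       -- key
    name       TEXT NOT NULL,
    gender     TEXT CHECK (gender IN ('M', 'F')),      -- domain constraint
    age        INTEGER CHECK (age BETWEEN 0 AND 150),  -- domain constraint
    father_ssn TEXT REFERENCES People(ssn)             -- single-valued + referential integrity
)
""")

def try_insert(row):
    """Return True if the row satisfies every declared constraint."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO People VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert(("1", "Dad", "M", 60, None)),   # accepted
    try_insert(("1", "Twin", "M", 60, None)),  # rejected: duplicate key
    try_insert(("2", "Kid", "F", 200, "1")),   # rejected: age outside 0..150
    try_insert(("3", "Kid", "F", 20, "999")),  # rejected: father does not exist
    try_insert(("4", "Kid", "F", 20, "1")),    # accepted
]
```

The general constraint from the slide (at most 10 customers per sales rep) has no column-level declaration of this kind and is not shown here.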

Page 24: C20.0046: Database Management Systems Lecture #3

E/R keys
  Uniquely identifies an entity in an entity set
  Attribute or set of attributes
    Two entities cannot agree on all key attributes
    These attributes determine all others
  Every entity set should have a key
    possibly including all attributes
  Primary key attributes underlined
  More than one possible key:
    Candidate keys, primary key
  Practical tip: create an intentional key attribute
    E.g. SSN, course-id, employee-id, etc.
    SSN likely shorter than (name, address)
    Prevents quasi-redundancy
  [E/R diagram: Person with attributes name, ssn, address]

Page 25: C20.0046: Database Management Systems Lecture #3

Single-valued constraints
  “at most one” value
    sharp arrows
  E.g. attributes: could be null or one
  Many-one relationships: the “one” part is single-valued.
  Can think of key atts as (non-null) single-valued
  [E/R diagram: Course and TA joined by Assists]

Page 26: C20.0046: Database Management Systems Lecture #3

Referential integrity
  “Exactly one value”
    NOT NULL attributes
    Relationships
  Non-null value refers to an entity that exists
  Refer to entity with foreign key
  HTML analogy: no broken links
  Programming analogy: no dangling pointers
  Ways of handling deletion:
    Prevent deletion as long as a referrer exists
    Enforce deletion of all referrers
  [E/R diagram: Course and Instructor joined by Taught]
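
The two ways of handling deletion correspond roughly to SQL's `ON DELETE RESTRICT` and `ON DELETE CASCADE`. The sketch below tries both in SQLite; the table and column names and the helper `delete_instructor` are assumptions, and the policy string is spliced into the SQL only because this is a closed demo.

```python
import sqlite3

def delete_instructor(on_delete):
    """Delete an instructor who is still referenced by a course.

    Returns the number of courses left, or None if the delete was refused.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE Instructors (name TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE Courses ("
        " title TEXT PRIMARY KEY,"
        " taught_by TEXT NOT NULL"
        f" REFERENCES Instructors(name) ON DELETE {on_delete})"
    )
    conn.execute("INSERT INTO Instructors VALUES ('MPJ')")
    conn.execute("INSERT INTO Courses VALUES ('DBMS', 'MPJ')")
    try:
        conn.execute("DELETE FROM Instructors WHERE name = 'MPJ'")
    except sqlite3.IntegrityError:
        return None  # deletion prevented while a referrer exists
    return conn.execute("SELECT COUNT(*) FROM Courses").fetchone()[0]
```

`delete_instructor("RESTRICT")` returns `None` (the delete is refused), while `delete_instructor("CASCADE")` returns `0` (the referring course is deleted too).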

Page 27: C20.0046: Database Management Systems Lecture #3

Referential integrity: E/R example
  [E/R diagram: Students and Courses joined by Enrolls; Courses and Instructor joined by Taught]
  Insertion: must refer to an existing entity
  Suppose we need to add
    course: “Oracle”
    instructor: MPJ
  Q: Which order?
  Q: What if the relationship were exactly-exactly, say, M(Hs,Ws)?
    i.e., referential integrity in both directions?
  A: Put both inserts in one transaction (more later)
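
A sketch of the "both inserts in one transaction" answer, using SQLite's deferred foreign keys so that the checks run at commit time rather than per statement. The table names `Husbands` and `Wives` are a guess at what M(Hs,Ws) abbreviates, and the helper names and sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Husbands (
    name TEXT PRIMARY KEY,
    wife TEXT NOT NULL REFERENCES Wives(name) DEFERRABLE INITIALLY DEFERRED
);
CREATE TABLE Wives (
    name    TEXT PRIMARY KEY,
    husband TEXT NOT NULL REFERENCES Husbands(name) DEFERRABLE INITIALLY DEFERRED
);
""")

def marry(husband, wife):
    """Insert both rows in one transaction; the checks run at commit."""
    with conn:
        conn.execute("INSERT INTO Husbands VALUES (?, ?)", (husband, wife))
        conn.execute("INSERT INTO Wives VALUES (?, ?)", (wife, husband))

def insert_husband_only(husband, wife):
    """Try to commit a husband whose wife row is missing."""
    try:
        with conn:
            conn.execute("INSERT INTO Husbands VALUES (?, ?)", (husband, wife))
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # a failed commit leaves the transaction open
        return False
```

With immediate (non-deferred) checks neither insertion order would work, since each row refers to the other; deferring the check to the end of the transaction lets the pair go in together while still rejecting a lone row.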

Page 28: C20.0046: Database Management Systems Lecture #3

Other kinds of constraints
  Domain constraints
    E.g. date: must be after 1980
    Enumerated type: grades A through F, no E
    No specific E/R notation: mention with attribute or relationship
  General constraints:
    A class may have no more than 100 students; a student may not have more than 6 courses:
  [E/R diagram: Students and Courses joined by Enroll, with edge labels <=6 and <=100]
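
One possible encoding of these constraints, sketched in SQLite: the grade domain as a `CHECK`, and the two counting limits as triggers. The table layout, trigger names, and helper `enroll` are assumptions; other systems may offer different mechanisms for general constraints.

```python
import sqlite3

MAX_STUDENTS_PER_COURSE = 100
MAX_COURSES_PER_STUDENT = 6

conn = sqlite3.connect(":memory:")
conn.executescript(f"""
CREATE TABLE Enroll (
    student TEXT NOT NULL,
    course  TEXT NOT NULL,
    grade   TEXT CHECK (grade IN ('A', 'B', 'C', 'D', 'F')),  -- no E; NULL = not graded yet
    PRIMARY KEY (student, course)
);
CREATE TRIGGER course_cap BEFORE INSERT ON Enroll
WHEN (SELECT COUNT(*) FROM Enroll WHERE course = NEW.course) >= {MAX_STUDENTS_PER_COURSE}
BEGIN
    SELECT RAISE(ABORT, 'course is full');
END;
CREATE TRIGGER student_cap BEFORE INSERT ON Enroll
WHEN (SELECT COUNT(*) FROM Enroll WHERE student = NEW.student) >= {MAX_COURSES_PER_STUDENT}
BEGIN
    SELECT RAISE(ABORT, 'student has too many courses');
END;
""")

def enroll(student, course, grade=None):
    """Return True if the enrollment satisfies every constraint."""
    try:
        with conn:
            conn.execute("INSERT INTO Enroll VALUES (?, ?, ?)", (student, course, grade))
        return True
    except sqlite3.DatabaseError:
        return False
```

Under this sketch the 101st enrollment in a course, a seventh course for one student, and a grade of 'E' are all rejected.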

Page 29: C20.0046: Database Management Systems Lecture #3

Next topic: Weak entity sets
  Definition:
    Some or all key attributes belong to another entity set
  Why:
    An entity set is part of a hierarchy (not ISA)
    Connecting entity sets
  The key consists of
    0, 1 or more of its own attributes
    Key attributes of entity sets from supporting relationships

Page 30: C20.0046: Database Management Systems Lecture #3

Conditions of supporting relationships
  Supporting relationship R: E → F
    R is many-one (E-F) (or one-one)
    R is binary
    Referential integrity from E to F
      a rounded arrow
  Those atts supplied to E are the key attributes of F
  F itself may be weak
    Another entity set G, and so on recursively
  [E/R diagram: E and F joined by R; attributes A1, A2]

Page 31: C20.0046: Database Management Systems Lecture #3

Requirements for weak entity sets
  For several supporting relationships from E to F
    Keys of each F role appear as foreign key of E
  Other many-one relationships
    Not necessarily supporting
  [E/R diagram: entity sets Purchases, People, Stores; relationships From, By, At-store; attributes A1, A2, A3]

Page 32: C20.0046: Database Management Systems Lecture #3

Weak entity sets
  Example: Hierarchy: species & genus
  Idea: species name unique per genus only
  [E/R diagram: Species (name) and Genus (name) joined by Belongs-to]
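
A sketch of the species/genus idea as tables: the key of Species combines its own name with the key of Genus, borrowed through the supporting relationship. The column names, the helper `add_species`, and the sample genera are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Genus (name TEXT PRIMARY KEY);
CREATE TABLE Species (
    name  TEXT NOT NULL,
    genus TEXT NOT NULL REFERENCES Genus(name),  -- supporting relationship Belongs-to
    PRIMARY KEY (genus, name)                    -- own attribute + key of Genus
);
INSERT INTO Genus VALUES ('Sturnus'), ('Beta');
""")

def add_species(genus, name):
    """Return True if the species was stored, False if a constraint rejected it."""
    try:
        with conn:
            conn.execute("INSERT INTO Species (name, genus) VALUES (?, ?)", (name, genus))
        return True
    except sqlite3.IntegrityError:
        return False
```

The same species name can appear under two genera (Sturnus vulgaris and Beta vulgaris), but not twice within one genus, and a species cannot be stored for a genus that does not exist.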

Page 33: C20.0046: Database Management Systems Lecture #3

Next time
  We’ll finish E/R models and begin the relational model
  Read chapter 3 through section 3.4
  Info on project, hw likely posted soon