M.P. Johnson, DBMS, Stern/NYU, Sprin g 2005 1 C20.0046: Database Management Systems Lecture #3 Matthew P. Johnson Stern School of Business, NYU Spring, 2005
Feb 02, 2016
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
1
C20.0046: Database Management SystemsLecture #3
Matthew P. Johnson
Stern School of Business, NYU
Spring, 2005
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
2
Admin Textbooks?
This afternoon
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
3
Agenda Last time: E/R models, some design issues This time: More design “carving at the joints”
Redundancy Whether an element should be an attribute or entity
set Replacing a relationships with entity sets
Constraints Identifying & specifying key attributes to an entity set Recognizing other types of single-valued constraints Representing referential integrity constraints Identifying & representing general constraints
Weak entity sets
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
4
Review Multiplicity review:
Square-of? (e.g., (3,9)) Cube-of? (e.g., (-3,-27)) Wife-of? Wife-of-in-certain-other-cultures?
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
5
Design Principles Faithfulness Simplicity Avoiding redundancy Choice of relationships Picking elements
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
6
Simplicity Einstein: Theories should be as simple as possible,
but not simpler. Use as few elements as possible
Minimum required relations No unnecessary attributes (will you be using this
attribute?) Eliminate “spinning wheels”
Example: how can we simplify this?
Movies Ownings StudiosOwned-by Owns
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
7
Avoiding redundancy Say everything exactly once
Minimize database storage requirements More important: prevent possible update errors
simplest but not only e.g.: modify data one place but not the other – more later
Example: Spot the redundancy
Studios MoviesOwn
StudioName
Name
Length
Name
Address
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
8
Avoiding redundancy Say everything exactly once
Minimize database storage requirements More important: prevent possible update errors
simplest but not only e.g.: modify data one place but not the other – more later
Example: Spot the redundancy
Studios MoviesOwn
StudioName
Name
Length
Name
Address
Redundancy: Movies “knows” the studio two ways
Phone
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
9
Spot more redundancy
Different redundancy: studio info listed for every movie!
Movies
StudioName
Name
Length
SAddress
SPhone
Name Length Studio SAddress SPhonePulp Fiction … Miramax NYC 212-…Sylvia … Miramax NYC 212-…Jay & Sil. Bob … Miramax NYC 212-…
…
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
10
Don’t add relships that are implied
Students Courses
TAs
Enrolls
TA-of
Assist
Suppose each course again has <=1 TA
Q: Is the following good design?
A: If TAs other than the course’s TA can help students, then yes;
if not, then no: we can connect Students and TAs by going through Courses; redundant!
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
11
Correct E/R models may contain loops
Person plays multiple roles: employee of company buyer of product
price
address name ssn
Person
buys
makes
employs
CompanyProduct
name category
stockprice
name
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
12
More design
Repeating TA names & IDs – redundant TA is not TAing any course now lose TA’s data! TA should get its own ES
Students CoursesEnrolls
Q: What’s wrong with this design?
A:
TA-Name TA-ID
TA-Email
Course-ID CName
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
13
Opposite problem: Entity or attribute? Some E/Rs improved by removing entities
Can convert Entity E into attributes of F if1. R:FE is many-one
one-one counts because special case2. Attributes for E are independent of each other
knowing one att val doesn’t tell us another att val
Then remove E add all attributes of E to F
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
14
Students CoursesEnrolls
TA-Name AssistsTA
Entity attributeCName
Room
Students CoursesEnrolls
CName
Room
TA-Name
Course-ID
Course-ID
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
15
Convert TA entity again?
No! Multiple TAs allowed Violates condition (1) Redundant course data
Students CoursesEnrolls
AssistsTA
CName CID Room TA-NameDBMS 46 123 HowardDBMS 46 123 Wesley
…
CName
Room
Course-ID
TA-Name
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
16
Convert TA entity again?
Students CoursesEnrolls
AssistsTA
CName
Room
Course-ID
TA-ID TA-Favorite-Color
No! TA has dependent fields Violates condition (2)
How can it tell? Redundant TA data
CName TA-Name TA-ID TA-ColorDBMS Ralph 678 GreenA.Soft. Ralph 678 Green
…
TA-Name
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
17
Entity or attributes? Should student address be an entity or an attribute? If student may have multiple addresses, must be entity
campus address, permanent address attributes cannot be set-valued
If we need to examine structure of address, must be entity find all students from NYS but not NYC
If attribute, then it’s probably a simple string no structure! NB: this choice is a microcosm of entire miniworld (much) power of a DB comes from the structure imposed on the
data
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
18
Larger example DB design Application: library database. Authors have written books about
various subjects; different libraries in the system may carry these books.
Entities (with attributes in parentheses): Authors (ssn, name, phone, birthdate) Books (ISDN, title) Subjects (sname, sid) Libraries (lname)
Relationships [associating entities in square brackets]: Wrote-on [Authors, Subjects] Cover [Libraries, Subjects] On [Books, Subjects]
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
19
E/R of DB designName
Author
ssn phone birthdate
wrote-on
SubjectSNameTitle
Carries
LibraryLName
On Book
ISBN
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
20
Poor initial design First design is a poor model of this system Some info not captured:
How many copies does a lib. have of a given book? What edition of a book does the library have?
Design problems: no direct relship associating authors and books no direct relship associating libraries and books
Common queries complex and difficult/expensive What libraries carry books by a given author? What books has a given author written? Who is the author of a given book?
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
21
Larger example DB design 2 Application: library database as before
Entities (with attributes in parentheses): Authors (ssn, name, phone, birthdate) Books (ISDN, title) Subjects (sname, sid) Libraries (lname)
Relations [associating entities in square brackets] (attributes in parentheses): Wrote [Authors, Books] Carries [Libraries, Books] (quantity, edition) On [Books , Subjects]
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
22
E/R of improved DB design
Rule of thumb: often queried together make closely connected
Name
Author
ssn phone birthdate
wrote
BookISBN
TitleCarries
LibraryLName
Edition
Quantity
On Subject
SName
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
23
Next topic: Constraints Review: programmer-defined rules stating what
should always be true about consistent databases Restrictions on data:
Keys (e.g. SSNs uniquely identify people) Single value constraints (e.g. everyone has 1 father) Referential Integrity (e.g. person’s record refers to father
father must exist) Domain constraints (e.g. gender in M/F, age in 0..150) General constraints (e.g. no more than 10 customers per
sales rep) Can’t infer constraints from data
may hold “accidentally” they are a part of the schema
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
24
E/R keys Uniquely identifies entity in ES Attribute or set of attributes
Two entities cannot agree on all key attributes These attributes determine all others
Every ES should have a key possibly including all attributes
Primary key attributes underlined More than one possible key:
Candidate keys, primary key
Practical tip: create intentional key attribute E.g. SSN, course-id, employee-id, etc. SSN likely shorter than (name,address) Prevents quasi-redundancy
address
name ssn
Person
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
25
Single-valued constraints “at most one” value
sharp arrows E.g. attributes: could be null or one Many-one relationships: the “one” part is
single-valued. Can think of key atts as (non-null) single-
valued
TACourse Assists
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
26
Referential integrity “Exactly one value” NOT NULL attributes Relationships
Non-null value refers to entity that exists Refer to entity with foreign key HTML analogy: no broken links Programming analogy: no dangling pointers Ways of handling deletion:
Prevent deletion as long as referrer exist Enforce deletion of all referrers
InstructorCourse Taught
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
27
Referential integrity – E/R e.g.
Insertion – must refer to existing entity Suppose need to add
course: “Oracle” instructor: MPJ
Q: Which order? Q: What if relship were exactly-exactly, say, M(Hs,Ws)?
i.e., referential integrity in both directions? A: Put both inserts in one xact – later
Students CoursesEnrolls
Instructor
Taught
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
28
Other kinds of constraints Domain constraints
E.g. date: must be after 1980 Enumerated type: grades A through F, no E No specific E/R notation: mention with attribute or
relationship General constraints:
A class may have no more than 100 students; a student may not have more than 6 courses:
Students CoursesEnroll <=6<=100
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
29
Next topic: Weak entity sets Definition:
Some or all key attributes belong to another ES Why:
An entity set is part of a hierarchy (not ISA) Connecting entity sets
The key consists of 0, 1 or more of its own attributes Key attributes of entity sets from supporting
relationships
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
30
Conditions of Supporting relationships
Supporting relationship R:EF R is many-one (E-F) (or one-one) R is binary Referential integrity from E to F
a rounded arrow Those atts supplied to E are the key attributes of F F itself may be weak
Another entity set G, and so on recursively
A1
A2
RE F
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
31
For several supporting relships from E to F Keys of each F role appear as foreign key of E
Other many-one relationships Not necessarily supporting
Requirements for weak entity sets
From
By
Purchases A1
A2
A3
People
StoresAt-store
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
32
Weak entity sets Example: Hierarchy – species & genus Idea: species name unique per genus only
Species
name
Belongs-to Genus
name
M.P. Johnson, DBMS, Stern/NYU, Spring 2005
33
Next time We’ll finish E/R models and begin the
relational model Read chapter 3 through section 3.4 Info on project, hw likely posted soon