Lecture 12: Database Design

Feb 10, 2016

Page 1: Lecture 12: Database Design

2003.10.02 - SLIDE 1
IS 202 – FALL 2003

Prof. Ray Larson & Prof. Marc Davis
UC Berkeley SIMS

Tuesday and Thursday 10:30 am - 12:00 pm
Fall 2003

http://www.sims.berkeley.edu/academics/courses/is202/f03/

SIMS 202: Information Organization and Retrieval

Lecture 12: Database Design

Page 2: Lecture 12: Database Design

Lecture Overview
• Review
  – Databases and Database Design
  – Database Life Cycle
  – ER Diagrams
• Database Design
• Normalization
• Discussion Questions

Page 4: Lecture 12: Database Design

Models (1)

[Diagram: conceptual requirements for Applications 1 through 4, a Conceptual Model, a Logical Model, an Internal Model, and an External Model for each of Applications 1 through 4]

Page 5: Lecture 12: Database Design

Database System Life Cycle

[Diagram: six numbered stages]
1 Design
2 Physical Creation
3 Conversion
4 Integration
5 Operations
6 Growth, Change, & Maintenance

Page 6: Lecture 12: Database Design

Another View of the Life Cycle

[Diagram: the same six numbered stages]
1 Design
2 Physical Creation
3 Conversion
4 Integration
5 Operations
6 Growth, Change

Page 7: Lecture 12: Database Design

Database Design Process

[Diagram, as on the Models (1) slide: conceptual requirements for Applications 1 through 4, Conceptual Model, Logical Model, Internal Model, and an External Model for each application]

Page 8: Lecture 12: Database Design

Entity
• An Entity is an object in the real world (or even imaginary worlds) about which we want or need to maintain information
  – Persons (e.g.: customers in a business, employees, authors)
  – Things (e.g.: purchase orders, meetings, parts, companies)

[Diagram: entity box labeled Employee]

Page 9: Lecture 12: Database Design

Attributes
• Attributes are the significant properties or characteristics of an entity that help identify it and provide the information needed to interact with it or use it (This is the Metadata for the entities)

[Diagram: Employee entity with attribute labels Name (First, Middle, Last), SSN, Age, Birthdate, Projects]

Page 10: Lecture 12: Database Design

Relationships
• Relationships are the associations between entities
• They can involve one or more entities and belong to particular relationship types
  – One to One
  – One to Many
  – Many to Many

Page 11: Lecture 12: Database Design

Relationships

[Diagrams: Student and Class linked by Attends; Supplier, Part, and Project linked by Supplies project parts]

Page 12: Lecture 12: Database Design

Types of Relationships

• Concerned only with cardinality of relationship

[Diagrams in Chen ER notation: Employee Assigned Truck, labeled 1 and 1; Employee Assigned Project, labeled n and 1; Employee Assigned Project, labeled m and n]

Page 13: Lecture 12: Database Design

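More Complex Relationships

[Diagrams: (1) Employee, Project, and Manager linked by Evaluation, with cardinality labels 1/n/n, 1/1/1, and n/n/1; (2) Employee Assigned Project, with labels 4(2-10) and 1 and attribute labels SSN, Project, Date; (3) Employee linked to itself by Manages, with roles Manages and Is Managed By and labels 1 and n]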

Page 14: Lecture 12: Database Design

Weak Entities
• Owe existence entirely to another entity

[Diagram: Order Contains Order-line; attribute labels Invoice #, Part#, Rep#, Quantity, Invoice#]

Page 15: Lecture 12: Database Design

Supertype and Subtype Entities

[Diagram: Employee Is one of Sales-rep, Clerk, Other; further labels Invoice, Sold, and Manages]

Page 16: Lecture 12: Database Design

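Many to Many Relationships

[Diagrams: Employee Is Assigned Project, then the same relationship resolved through an Assignment entity linked to Employee and to Project by Assigned; attribute labels SSN and Proj# on the entities, and SSN, Proj#, Hours on Assignment]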
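
One way to read this slide in relational terms is that the many-to-many Employee/Project relationship becomes an Assignment table keyed on the pair (SSN, Proj#), with Hours as its own attribute. The sketch below is only an illustration using Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions made up for the example, not part of the lecture.

```python
import sqlite3

# In-memory database; all names and rows here are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE project  (proj_no INTEGER PRIMARY KEY, title TEXT);
    -- The Assignment entity resolves the many-to-many relationship:
    -- one row per (employee, project) pair, with Hours as its own attribute.
    CREATE TABLE assignment (
        ssn     TEXT    REFERENCES employee(ssn),
        proj_no INTEGER REFERENCES project(proj_no),
        hours   REAL,
        PRIMARY KEY (ssn, proj_no)
    );
""")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("111-11-1111", "A. Example"), ("222-22-2222", "B. Example")])
conn.executemany("INSERT INTO project VALUES (?, ?)",
                 [(1, "Catalog"), (2, "Archive")])
conn.executemany("INSERT INTO assignment VALUES (?, ?, ?)",
                 [("111-11-1111", 1, 10.0),
                  ("111-11-1111", 2, 5.0),
                  ("222-22-2222", 1, 8.0)])

# One employee can be on many projects, and one project can have many employees.
projects_per_employee = dict(conn.execute(
    "SELECT ssn, COUNT(*) FROM assignment GROUP BY ssn ORDER BY ssn"))
employees_on_project_1 = conn.execute(
    "SELECT COUNT(*) FROM assignment WHERE proj_no = 1").fetchone()[0]
print(projects_per_employee, employees_on_project_1)
```

The composite primary key keeps the same employee from being assigned to the same project twice, while leaving both sides of the relationship free to repeat.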

Page 17: Lecture 12: Database Design

Lecture Overview
• Review
  – Databases and Database Design
  – Database Life Cycle
  – ER Diagrams
• Database Design
• Normalization
• Discussion Questions

Page 18: Lecture 12: Database Design

Database Design Process

[Diagram, as on the Models (1) slide: conceptual requirements for Applications 1 through 4, Conceptual Model, Logical Model, Internal Model, and an External Model for each application]

Page 20: Lecture 12: Database Design

Requirements Analysis
• Conceptual Requirements
  – Systems Analysis Process
    • Examine all of the information sources used in existing applications
    • Identify the characteristics of each data element
      – Numeric
      – Text
      – Date/time
      – Etc.
    • Examine the tasks carried out using the information
    • Examine results or reports created using the information

Page 21: Lecture 12: Database Design

Database Design Process

[Diagram, as on the Models (1) slide: conceptual requirements for Applications 1 through 4, Conceptual Model, Logical Model, Internal Model, and an External Model for each application]

Page 22: Lecture 12: Database Design

Conceptual Design
• Conceptual Model
  – Merge the collective needs of all applications
  – Determine what Entities are being used
    • Some object about which information is to be maintained
  – What are the Attributes of those entities?
    • Properties or characteristics of the entity
    • What attributes uniquely identify the entity?
  – What are the Relationships between entities?
    • How do the entities interact with each other?

Page 23: Lecture 12: Database Design

Developing a Conceptual Model
• Overall view of the database that integrates all the needed information discovered during the requirements analysis
• Elements of the Conceptual Model are represented by diagrams, Entity-Relationship or ER Diagrams, that show the meanings and relationships of those elements independent of any particular database systems or implementation details
• Can also be represented using other modeling tools (such as UML)

Page 24: Lecture 12: Database Design

Database Design Process

[Diagram, as on the Models (1) slide: conceptual requirements for Applications 1 through 4, Conceptual Model, Logical Model, Internal Model, and an External Model for each application]

Page 25: Lecture 12: Database Design

Logical Design
• Logical Model
  – How is each entity and relationship represented in the Data Model of the DBMS?
    • Hierarchic?
    • Network?
    • Relational?
    • Object-Oriented?

Page 26: Lecture 12: Database Design

Database Design Process

[Diagram, as on the Models (1) slide: conceptual requirements for Applications 1 through 4, Conceptual Model, Logical Model, Internal Model, and an External Model for each application]

Page 27: Lecture 12: Database Design

Physical Design
• Internal Model
  – Choices of index file structure
  – Choices of data storage formats
  – Choices of disk layout

Page 28: Lecture 12: Database Design

Database Design Process

[Diagram, as on the Models (1) slide: conceptual requirements for Applications 1 through 4, Conceptual Model, Logical Model, Internal Model, and an External Model for each application]

Page 29: Lecture 12: Database Design

Database Application Design
• External Model
  – User views of the integrated database
  – Making the old (or updated) applications work with the new database design

Page 30: Lecture 12: Database Design

Terms and Concepts
• Key
  – An attribute or set of attributes used to identify or locate records in a file
• Primary Key
  – An attribute or set of attributes that uniquely identifies each record in a file
• Candidate Key
  – An attribute or set of attributes that might be used as a primary key
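
The definitions above can be made concrete with a small check. The following is a minimal sketch, assuming rows are held as Python dictionaries; the helper names `is_key` and `candidate_keys` and the sample rows are hypothetical. Note that uniqueness in a sample of data only suggests a candidate key: a real key is a rule about all data the table may ever hold, so the designer still has to confirm it.

```python
from itertools import combinations

def is_key(rows, attrs):
    """Return True if the values of `attrs` are unique across `rows`."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, attributes):
    """Minimal attribute sets whose values are unique in this sample of rows."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip supersets of keys already found, so results stay minimal.
            if any(set(k) <= set(attrs) for k in found):
                continue
            if is_key(rows, attrs):
                found.append(attrs)
    return found

# Hypothetical sample rows, not from the lecture.
rows = [
    {"emp_id": 1, "email": "a@example.org", "dept": "Sales"},
    {"emp_id": 2, "email": "b@example.org", "dept": "Sales"},
    {"emp_id": 3, "email": "c@example.org", "dept": "Audit"},
]
keys = candidate_keys(rows, ["emp_id", "email", "dept"])
print(keys)
```

Here both `emp_id` and `email` come out as candidate keys; choosing one of them as the primary key is a design decision.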

Page 31: Lecture 12: Database Design

Lecture Overview
• Review
  – Databases and Database Design
  – Database Life Cycle
  – ER Diagrams
• Database Design
• Normalization
• Discussion Questions

Page 32: Lecture 12: Database Design

Normalization
• Normalization theory is based on the observation that relations with certain properties are more effective in inserting, updating and deleting data than other sets of relations containing the same data
• Normalization is a multi-step process beginning with an “unnormalized” relation

Page 33: Lecture 12: Database Design

Normal Forms
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
• Fourth Normal Form (4NF)
• Fifth Normal Form (5NF)

Page 34: Lecture 12: Database Design

Normalization

[Diagram: successive normal forms, each adding a condition]
• Functional dependency of nonkey attributes on the primary key; atomic values only
• Full functional dependency of nonkey attributes on the primary key
• No transitive dependency between nonkey attributes
• Boyce-Codd and higher: all determinants are candidate keys; single multivalued dependency

Page 35: Lecture 12: Database Design

Unnormalized Relations
• First step in normalization is to convert the data into a two-dimensional table
• In unnormalized relations data can repeat within a column
• (The following is a highly contrived example that actually bears only a slight resemblance to the current implementation of the Phone/Photo project database)

Page 36: Lecture 12: Database Design

Unnormalized Relations

Person # | People # | Picture date | Person Name | Person Type | Location | People | Activity | Objects | Object_Feat
1111 | 145; 311 | Oct 1, 2003; Nov 12, 2003 | John White | Student | San Francisco; Berkeley | Beth Little; Michael Diamond | Shopping; Eating | Book bag; Pasta | Blue; none
1234 | 243; 467 | Sep 25, 2003; Oct 10, 2003 | Mary Jones | Auditor | 202 South Hall; Oakland | Charles Field; Patricia Gold | Reading; Drinking | Textbook; Teacup | None; Chinese
2345 | 189 | Sep 27, 2003 | Charles Brown | Student | Sather Gate | David Rosen | Singing | none | none
4876 | 145 | Nov 5, 2003 | Hal Kane | Student | Northside | Beth Little | Shopping | Book bag | Blue
5123 | 145 | Oct 10, 2003 | Paul Kosher | Student | South Hall | Beth Little | Reading | none | none
6845 | 243 | Oct 5, 2003; Dec 15, 2003 | Ann Hood | Student | Oakland; Oakland | Charles Field; Charles Field | Eating; Shopping | Burrito; none | vegetarian; none

Page 37: Lecture 12: Database Design

First Normal Form
• To move to First Normal Form a relation must contain only atomic values at each row and column
  – No repeating groups
  – A column or set of columns is called a Candidate Key when its values can uniquely identify the row in the relation

Page 38: Lecture 12: Database Design

First Normal Form

Person # | People # | Picture Date | Person Name | Person Type | Location | People | Activity | Objects | Object_feat
1111 | 145 | Oct 1, 2003 | John White | Student | San Francisco | Beth Little | Shopping | Book bag | Blue
1111 | 311 | Nov 12, 2003 | John White | Student | Berkeley | Michael Diamond | Eating | Pasta | none
1234 | 243 | Sep 25, 2003 | Mary Jones | Auditor | 202 South Hall | Charles Field | Reading | Textbook | none
1234 | 467 | Oct 10, 2003 | Mary Jones | Auditor | Oakland | Patricia Gold | Drinking | Teacup | Chinese
2345 | 189 | Sep 27, 2003 | Charles Brown | Student | Sather Gate | David Rosen | Singing | none | none
4876 | 145 | Nov 5, 2003 | Hal Kane | Student | Northside | Beth Little | Shopping | Book bag | Blue
5123 | 145 | Oct 10, 2003 | Paul Kosher | Student | South Hall | Beth Little | Reading | none | none
6845 | 243 | Oct 5, 2003 | Ann Hood | Student | Oakland | Charles Field | Eating | Burrito | Vegetarian
6845 | 243 | Dec 15, 2003 | Ann Hood | Student | Oakland | Charles Field | Shopping | none | none
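
The step from the unnormalized table to this one can be sketched in code. The example below is an illustration only: it holds two of the slide's rows as Python dictionaries, stores each repeating group as a list, and assumes the lists in a row are aligned and of equal length. The function name `to_first_normal_form` and the lowercase column names are made up for the example.

```python
# Two unnormalized rows from the slide; repeating groups are held as lists.
unnormalized = [
    {"person_no": 1111, "person_name": "John White", "person_type": "Student",
     "people_no": [145, 311],
     "picture_date": ["Oct 1, 2003", "Nov 12, 2003"],
     "location": ["San Francisco", "Berkeley"]},
    {"person_no": 2345, "person_name": "Charles Brown", "person_type": "Student",
     "people_no": [189],
     "picture_date": ["Sep 27, 2003"],
     "location": ["Sather Gate"]},
]

REPEATING = ["people_no", "picture_date", "location"]

def to_first_normal_form(rows, repeating):
    """Emit one row per element of the repeating groups, copying single-valued fields."""
    flat = []
    for row in rows:
        fixed = {k: v for k, v in row.items() if k not in repeating}
        # zip pairs up the i-th value of every repeating column.
        for values in zip(*(row[c] for c in repeating)):
            flat.append({**fixed, **dict(zip(repeating, values))})
    return flat

first_nf = to_first_normal_form(unnormalized, REPEATING)
for r in first_nf:
    print(r)
```

Every cell in the result holds a single value, at the cost of repeating the person's name and type on each row.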

Page 39: Lecture 12: Database Design

1NF Storage Anomalies
• Insertion: A new person has not yet taken a picture, hence no Picture #. Since Picture # is part of the key, we can’t insert
• Insertion: If People are known and likely to be photographed, but haven’t been yet, there is no way to include them in the database
• Update: If a Person changes status (e.g., Mary Jones becomes a Student) we have to change multiple rows in the database
• Deletion (type 1): Deleting a Person record may also delete all info about People in the pictures
• Deletion (type 2): When there are functional dependencies (like Object and Object_features) changing one item eliminates other information
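
The update and type 1 deletion anomalies can be reproduced on a few of the 1NF rows. This is a hedged sketch using Python's sqlite3 module; the table name `picture_1nf` and the lowercase column names are assumptions, and only six of the slide's ten columns are kept.

```python
import sqlite3

# A few 1NF rows from the slides; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE picture_1nf (
    person_no INTEGER, person_name TEXT, person_type TEXT,
    people_no INTEGER, people_name TEXT, picture_date TEXT)""")
conn.executemany("INSERT INTO picture_1nf VALUES (?, ?, ?, ?, ?, ?)", [
    (1234, "Mary Jones", "Auditor", 243, "Charles Field", "Sep 25, 2003"),
    (1234, "Mary Jones", "Auditor", 467, "Patricia Gold", "Oct 10, 2003"),
    (2345, "Charles Brown", "Student", 189, "David Rosen", "Sep 27, 2003"),
])

# Update anomaly: one change of status has to be applied to every row that repeats it.
cur = conn.execute(
    "UPDATE picture_1nf SET person_type = 'Student' WHERE person_no = 1234")
rows_touched = cur.rowcount

# Deletion anomaly: removing Charles Brown's rows also removes all trace of David Rosen.
conn.execute("DELETE FROM picture_1nf WHERE person_no = 2345")
people_left = sorted(r[0] for r in conn.execute(
    "SELECT DISTINCT people_name FROM picture_1nf"))

print(rows_touched)
print(people_left)
```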

Page 40: Lecture 12: Database Design

Second Normal Form
• A relation is said to be in Second Normal Form when every nonkey attribute is fully functionally dependent on the primary key
  – That is, every nonkey attribute needs the full primary key for unique identification
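
A partial dependency can be spotted by testing whether part of the key already determines a nonkey attribute. The sketch below assumes the key of the 1NF relation is (Person #, People #, Picture Date), which the slides imply but do not state outright; the helper name `determines` and the shortened column names are hypothetical. As with any check on sample data, it can refute a dependency but cannot prove one holds in general.

```python
def determines(rows, lhs, rhs):
    """True if, in these rows, equal values of `lhs` always imply equal values of `rhs`."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True

# A few rows of the 1NF relation from the slides (columns abbreviated).
rows = [
    {"person_no": 1111, "people_no": 145, "date": "Oct 1, 2003",
     "person_name": "John White", "location": "San Francisco"},
    {"person_no": 1111, "people_no": 311, "date": "Nov 12, 2003",
     "person_name": "John White", "location": "Berkeley"},
    {"person_no": 6845, "people_no": 243, "date": "Oct 5, 2003",
     "person_name": "Ann Hood", "location": "Oakland"},
    {"person_no": 6845, "people_no": 243, "date": "Dec 15, 2003",
     "person_name": "Ann Hood", "location": "Oakland"},
]
key = ["person_no", "people_no", "date"]  # assumed composite key

# person_name depends on person_no alone, which is only part of the key:
# a partial dependency, so the relation is not in 2NF.
name_by_person = determines(rows, ["person_no"], ["person_name"])
# location is not determined by person_no alone in this sample, but it is
# determined by the full key.
location_by_person = determines(rows, ["person_no"], ["location"])
location_by_key = determines(rows, key, ["location"])
print(name_by_person, location_by_person, location_by_key)
```

Attributes like `person_name` that hang off part of the key are the ones that move into their own table in the 2NF decomposition.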

Page 41: Lecture 12: Database Design

Second Normal Form

Person Table
Person # | Person Name | Person Type
1111 | John White | Student
1234 | Mary Jones | Auditor
2345 | Charles Brown | Student
4876 | Hal Kane | Student
5123 | Paul Kosher | Student
6845 | Ann Hood | Student

Page 42: Lecture 12: Database Design

Second Normal Form

People Table
People # | People
145 | Beth Little
189 | David Rosen
243 | Charles Field
311 | Michael Diamond
467 | Patricia Gold

Page 43: Lecture 12: Database Design

Second Normal Form

Picture Table
Person # | People # | Picture Date | Location | Activity | Objects | Object_Feat
1111 | 145 | 01-Oct-03 | San Francisco | Shopping | Book bag | Blue
1111 | 311 | 12-Nov-03 | Berkeley | Eating | Pasta | none
1234 | 243 | 25-Sep-03 | 202 South Hall | Reading | Textbook | none
1234 | 467 | 10-Oct-03 | Oakland | Drinking | Teacup | Chinese
2345 | 189 | 27-Sep-03 | Sather Gate | Singing | none | none
4876 | 145 | 05-Nov-03 | Northside | Shopping | Book bag | Blue
5123 | 145 | 10-Oct-03 | South Hall | Reading | none | none
6845 | 243 | 05-Oct-03 | Oakland | Eating | Burrito | vegetarian
6845 | 243 | 15-Dec-03 | Oakland | Shopping | none | none

Page 44: Lecture 12: Database Design

1NF Storage Anomalies Removed

• Insertion: Can now enter new Persons who haven’t yet taken pictures

• Insertion: Can now enter People who haven’t been photographed

• Deletion (type 1): If Charles Brown withdraws his photos the corresponding tuples from Person and Picture tables can be deleted without losing information on David Rosen

• Update: If John White takes a third picture, and has changed status (e.g., graduate), we only need to change the Person table in one place

Page 45: Lecture 12: Database Design

2NF Storage Anomalies
• Insertion: Cannot enter the fact that a particular object has a particular feature unless it is associated with a particular picture
• Deletion: If John White describes some other object that Beth Little has while shopping, we lose the fact that the bookbag is blue
• Update: If the features of an object change we have to update multiple occurrences of object features

Page 46: Lecture 12: Database Design

Third Normal Form
• A relation is said to be in Third Normal Form if there are no transitive functional dependencies between nonkey attributes
  – When one nonkey attribute can be determined by one or more other nonkey attributes there is said to be a transitive functional dependency
• The Object_Feat column in the Picture table is determined by the Object
  – Object_Feat is transitively functionally dependent on Object so Picture is not 3NF
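
Removing the transitive dependency amounts to storing each object's feature once and dropping the feature column from the picture rows. A minimal sketch, assuming rows are Python dictionaries with made-up lowercase column names and using three rows of the 2NF Picture table:

```python
# Three rows of the 2NF Picture table from the slides (other columns omitted).
picture = [
    {"person_no": 1111, "people_no": 145, "date": "01-Oct-03",
     "objects": "Book bag", "object_feat": "Blue"},
    {"person_no": 1234, "people_no": 467, "date": "10-Oct-03",
     "objects": "Teacup", "object_feat": "Chinese"},
    {"person_no": 4876, "people_no": 145, "date": "05-Nov-03",
     "objects": "Book bag", "object_feat": "Blue"},
]

# Object table: each object and its feature stored once.
object_table = {}
for row in picture:
    previous = object_table.setdefault(row["objects"], row["object_feat"])
    # If this fails, objects does not determine object_feat and the split is not valid.
    assert previous == row["object_feat"]

# Picture table without the transitively dependent column.
picture_3nf = [{k: v for k, v in row.items() if k != "object_feat"}
               for row in picture]

print(object_table)
print(picture_3nf[0])
```

After the split, the fact that the book bag is blue lives in one place, so it survives deleting any single picture row.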

Page 47: Lecture 12: Database Design

Third Normal Form

Picture Table
Person # | People # | Picture Date | Location | Activity | Objects
1111 | 145 | 01-Oct-03 | San Francisco | Shopping | Book bag
1111 | 311 | 12-Nov-03 | Berkeley | Eating | Pasta
1234 | 243 | 25-Sep-03 | 202 South Hall | Reading | Textbook
1234 | 467 | 10-Oct-03 | Oakland | Drinking | Teacup
2345 | 189 | 27-Sep-03 | Sather Gate | Singing | none
4876 | 145 | 05-Nov-03 | Northside | Shopping | Book bag
5123 | 145 | 10-Oct-03 | South Hall | Reading | none
6845 | 243 | 05-Oct-03 | Oakland | Eating | Burrito
6845 | 243 | 15-Dec-03 | Oakland | Shopping | none

Page 48: Lecture 12: Database Design

Third Normal Form

Object Table
Objects | Object_Feat
Book bag | Blue
Pasta | none
Textbook | none
Teacup | Chinese
Burrito | Vegetarian

Page 49: Lecture 12: Database Design

2NF Storage Anomalies Removed

• Insertion: We can now enter the fact that an object has a particular feature

• Deletion: If John White describes some other object that Beth Little has while shopping, we don’t lose the fact that the bookbag is blue

• Update: The features for each object appear only once

Page 50: Lecture 12: Database Design

Boyce-Codd Normal Form
• Most 3NF relations are also BCNF relations
• A 3NF relation can fail to be in BCNF only if:
  – Candidate keys in the relation are composite keys (they are not single attributes)
  – There is more than one candidate key in the relation, and
  – The keys are not disjoint, that is, some attributes in the keys are common

Page 51: Lecture 12: Database Design

Most 3NF Relations Are Also BCNF – Is This One?

Person # | Person Name | Person Type
1111 | John White | Student
1234 | Mary Jones | Auditor
2345 | Charles Brown | Student
4876 | Hal Kane | Student
5123 | Paul Kosher | Student
6845 | Ann Hood | Student

Page 52: Lecture 12: Database Design

BCNF Relations

Person # | Person Name
1111 | John White
1234 | Mary Jones
2345 | Charles Brown
4876 | Hal Kane
5123 | Paul Kosher
6845 | Ann Hood

Person # | Person Type
1111 | Student
1234 | Auditor
2345 | Student
4876 | Student
5123 | Student
6845 | Student

Page 53: Lecture 12: Database Design

Additional Issues
• Why separate Person and People?
  – They are really all People/Persons in different roles
• Shouldn’t a picture have a unique ID regardless of who is in it?
• Can’t we have multiple people in the same picture, multiple objects, etc.?
• Can’t objects have multiple characteristics?

Page 54: Lecture 12: Database Design

BCNF Relations

Picture # | Person # | Picture Date
1 | 1111 | 01-Oct-03
2 | 1111 | 12-Nov-03
3 | 1234 | 25-Sep-03
4 | 1234 | 10-Oct-03
5 | 2345 | 27-Sep-03
6 | 4876 | 05-Nov-03
7 | 5123 | 10-Oct-03
8 | 6845 | 05-Oct-03
9 | 6845 | 15-Dec-03

loc # | Location
1 | San Francisco
2 | Berkeley
3 | 202 South Hall
4 | Oakland
5 | Sather Gate
6 | Northside
7 | South Hall

Picture # | loc #
1 | 1
2 | 2
3 | 3
4 | 4
5 | 5
6 | 6
7 | 7
8 | 4
9 | 4

Act # | Activity
1 | Shopping
2 | Eating
3 | Reading
4 | Drinking
5 | Singing

Picture # | Act #
1 | 1
2 | 2
3 | 3
4 | 4
5 | 5
6 | 1
7 | 3
8 | 2
9 | 1

Picture # | Obj #
1 | 1
2 | 2
3 | 3
4 | 4
6 | 1
8 | 5

Obj # | Objects
1 | Book bag
2 | Pasta
3 | Textbook
4 | Teacup
5 | Burrito

Picture # | People #
1 | 145
2 | 311
3 | 243
4 | 467
5 | 189
6 | 145
7 | 145
8 | 243
9 | 243

Page 55: Lecture 12: Database Design

BCNF Added Capabilities
• Can now have a picture with no (identified) people in it
• Can have multiple objects, activities, and people associated with each picture

Page 56: Lecture 12: Database Design

Fourth Normal Form
• Any relation is in Fourth Normal Form if it is BCNF and any multivalued dependencies are trivial
• Eliminate non-trivial multivalued dependencies by projecting into simpler tables

Page 57: Lecture 12: Database Design

Fifth Normal Form
• A relation is in 5NF if every join dependency in the relation is implied by the keys of the relation
• Implies that relations that have been decomposed in previous normal forms can be recombined via natural joins to recreate the original relation
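
The second bullet, recombining decomposed relations by natural join, can be checked directly on small data. The sketch below is not a test for 5NF itself; it only illustrates a decomposition that joins back without loss and one that does not. The helper names `project`, `natural_join`, and `same_relation` are hypothetical, the first data set combines the Picture #/Obj # and Obj #/Objects tables from the slides, and the second is an invented counterexample.

```python
def project(rows, attrs):
    """Projection with duplicate elimination, as in relational algebra."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(unique)]

def natural_join(left, right):
    """Join rows that agree on every attribute name the two relations share."""
    if not left or not right:
        return []
    shared = [a for a in left[0] if a in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in shared)]

def same_relation(a, b):
    """Compare two lists of dict rows as sets, ignoring row and column order."""
    def as_set(rows):
        return {tuple(sorted(r.items())) for r in rows}
    return as_set(a) == as_set(b)

# Rows combining the Picture #/Obj # and Obj #/Objects tables from the slides.
original = [
    {"picture_no": 1, "obj_no": 1, "objects": "Book bag"},
    {"picture_no": 2, "obj_no": 2, "objects": "Pasta"},
    {"picture_no": 6, "obj_no": 1, "objects": "Book bag"},
]
picture_object = project(original, ["picture_no", "obj_no"])
obj = project(original, ["obj_no", "objects"])
lossless = same_relation(natural_join(picture_object, obj), original)

# Invented counterexample: splitting on an attribute that determines nothing
# produces spurious rows when joined back.
bad = [{"a": 1, "b": "x", "c": "p"}, {"a": 2, "b": "x", "c": "q"}]
rejoined = natural_join(project(bad, ["a", "b"]), project(bad, ["b", "c"]))
print(lossless, len(rejoined))
```

The first decomposition rejoins exactly because Obj # determines Objects; the second turns two rows into four.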

Page 58: Lecture 12: Database Design

Fifth Normal Form Relations

Picture # | Person # | Picture Date
1 | 1111 | 01-Oct-03
2 | 1111 | 12-Nov-03
3 | 1234 | 25-Sep-03
4 | 1234 | 10-Oct-03
5 | 2345 | 27-Sep-03
6 | 4876 | 05-Nov-03
7 | 5123 | 10-Oct-03
8 | 6845 | 05-Oct-03
9 | 6845 | 15-Dec-03

loc # | Location
1 | San Francisco
2 | Berkeley
3 | 202 South Hall
4 | Oakland
5 | Sather Gate
6 | Northside
7 | South Hall

Picture # | loc #
1 | 1
2 | 2
3 | 3
4 | 4
5 | 5
6 | 6
7 | 7
8 | 4
9 | 4

Act # | Activity
1 | Shopping
2 | Eating
3 | Reading
4 | Drinking
5 | Singing

Picture # | Act #
1 | 1
2 | 2
3 | 3
4 | 4
5 | 5
6 | 1
7 | 3
8 | 2
9 | 1

Picture # | Obj #
1 | 1
2 | 2
3 | 3
4 | 4
6 | 1
8 | 5

Obj # | Objects
1 | Book bag
2 | Pasta
3 | Textbook
4 | Teacup
5 | Burrito

Picture # | People #
1 | 145
2 | 311
3 | 243
4 | 467
5 | 189
6 | 145
7 | 145
8 | 243
9 | 243

People Table

Page 59: Lecture 12: Database Design

Normalizing to Death
• Normalization splits database information across multiple tables
• To retrieve complete information from a normalized database, the JOIN operation must be used
• JOIN tends to be expensive in terms of processing time, and very large joins are very expensive
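
To make the cost concrete, here is what retrieving a picture's date together with its location looks like once the data is split as in the BCNF tables. This is a sketch using Python's sqlite3 module with three pictures from the slides; the table and column names are assumptions chosen for the example.

```python
import sqlite3

# A subset of the BCNF tables from the slides; names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE picture (picture_no INTEGER PRIMARY KEY,
                          person_no INTEGER, picture_date TEXT);
    CREATE TABLE location (loc_no INTEGER PRIMARY KEY, location TEXT);
    CREATE TABLE picture_location (picture_no INTEGER, loc_no INTEGER);
    INSERT INTO picture VALUES
        (1, 1111, '01-Oct-03'), (8, 6845, '05-Oct-03'), (9, 6845, '15-Dec-03');
    INSERT INTO location VALUES (1, 'San Francisco'), (4, 'Oakland');
    INSERT INTO picture_location VALUES (1, 1), (8, 4), (9, 4);
""")

# Two joins are needed just to see where each picture was taken.
result = conn.execute("""
    SELECT p.picture_no, p.picture_date, l.location
    FROM picture AS p
    JOIN picture_location AS pl ON pl.picture_no = p.picture_no
    JOIN location AS l ON l.loc_no = pl.loc_no
    ORDER BY p.picture_no
""").fetchall()
for row in result:
    print(row)
```

Adding activities, objects, and people to the same report would take two more joins each, which is the trade-off this slide points at.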

Page 60: Lecture 12: Database Design

Lecture Overview
• Review
  – Databases and Database Design
  – Database Life Cycle
  – ER Diagrams
• Database Design
• Normalization
• Discussion Questions

Page 61: Lecture 12: Database Design

Questions: Brooke Maury
• Discussion Questions on Hoffer & McFadden:
• If the goal of the relational database model is to encode a ‘conceptual’ design into a logical design, is it possible that improved technology and the development of new modeling techniques will supplant the RDBMS? Specifically, what impact will XML and the development of document engineering have on organizing information in multiple normalized tables?
• Conversely, what does the relational model have that would be lost if a conceptual design was encoded in another model?

Page 62: Lecture 12: Database Design

Questions: Brooke Maury
• The drive to develop the RDBMS was in part motivated by a need to minimize the space required and improve the performance of database systems by removing redundancies. What impact will very inexpensive data storage and computing power have on the relational database model and the third normal form especially?

Page 63: Lecture 12: Database Design

Questions: Shane Ahern
• Discussion Questions for "Logical Database Design and the Relational Model"
• Is the normalization process described really necessary? When I design a database schema, I find that by thinking of tables in terms of the entities they represent (employees, sales, events), I avoid most of the problems of normalization that the process seeks to address (i.e. salesperson and region in Sales table, salesperson is clearly a distinct entity from sales). If the formal process described in the article is not followed, are there potential pitfalls that might lead to problems with your database schema?

Page 64: Lecture 12: Database Design

Questions: Shane Ahern
• The article points out that "the relational model does not yet directly support supertype/subtype relationships." Once the tables in a relational database have been decomposed to third normal form, the database is efficient from a systems point of view, but the tables no longer provide a representation of the data that is intuitive to humans. The object-oriented model more accurately mirrors the way we think about the concepts that we wish to store in databases. So perhaps object-oriented database systems are worth considering. What about XML databases?

Page 65: Lecture 12: Database Design

Questions: Arthur Law
• The three models that we have been presented with, Entity Relationship Model, NIAM Model, and Object Oriented Model, all enforce a specific thought process in the organization and relationship between items in a database. With all of our recent discussion of computers understanding natural language, are these methods now out of date with how we should be organizing information? Should we use artificial intelligence or learning algorithms to statistically determine the relationship between entities, or is there still value in using these models?

Page 66: Lecture 12: Database Design

Questions: Arthur Law
• Each model is approximately one decade apart in development and a quick Google search shows that companies are using databases with one of the three models. However, as new models arise there doesn't seem to be much interest in migrating from one data model to another, which makes sense given that an organization using a given model probably finds that it works. Now with the proliferation of XML, we see more information being shared between organizations, so are we fated for an expensive and lengthy translation process between databases? Or should all DB administrators be responsible for upgrading to the latest model?

Page 67: Lecture 12: Database Design

Lecture Overview
• Review
  – Databases and Database Design
  – Database Life Cycle
  – ER Diagrams
• Database Design
• Normalization
• Discussion Questions
• Next Time/Readings

Page 68: Lecture 12: Database Design

Next Time
• Guest Lecture – Bob Glushko on XML and “Document Engineering”
• Readings on Class website
• No assigned discussion questions (but bring your questions on the readings)