Chapter 4: Logical Database Design and the Relational Model
Jason C. H. Chen, Ph.D., Professor of MIS, School of Business Administration, Gonzaga University, Spokane, WA 99258
[email protected]
Objectives
• Define terms
• List five properties of relations
• State two properties of candidate keys
• Define first, second, and third normal form
• Describe problems from merging relations
• Transform E-R and EER diagrams to relations
• Create tables with entity and relational integrity constraints
• Use normalization to convert anomalous tables to well-structured relations
Levels of Database Schemas
Different schemas are presented to different users.
[Figure: external level, conceptual level (conceptual schema), and internal level (internal schema)]
• A relation is a named, two-dimensional table of data
  – A table is made up of rows (records) and columns (attributes or fields)
• Not all tables qualify as relations
• Requirements:
  – Every relation has a unique name
  – Every attribute value is atomic (not multivalued, not composite)
  – Every row is unique (no two rows can have exactly the same values for all their fields)
  – Attributes (columns) in a table have unique names
  – The order of the columns is irrelevant
  – The order of the rows is irrelevant
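Several of these requirements can be observed directly in a SQL engine. The following is a minimal sketch using Python's built-in sqlite3 module; the table name `employee` and the snake_case column names are assumptions chosen to match the employee example in this chapter, not names taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table; a primary key guarantees that no two rows are identical.
conn.execute("""
    CREATE TABLE employee (
        emp_id    INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        dept_name TEXT NOT NULL,
        salary    INTEGER NOT NULL
    )
""")
conn.execute("INSERT INTO employee VALUES (100, 'Margaret Simpson', 'Marketing', 48000)")
conn.execute("INSERT INTO employee VALUES (140, 'Allen Beeton', 'Accounting', 52000)")

# Every row is unique: inserting the same row again violates the key.
try:
    conn.execute("INSERT INTO employee VALUES (100, 'Margaret Simpson', 'Marketing', 48000)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Column order is irrelevant: columns are addressed by name, not by position.
row_a = conn.execute("SELECT name, salary FROM employee WHERE emp_id = 140").fetchone()
row_b = conn.execute("SELECT salary, name FROM employee WHERE emp_id = 140").fetchone()

# Row order is irrelevant: sorting changes the presentation, not the set of rows.
ascending = conn.execute("SELECT emp_id FROM employee ORDER BY emp_id").fetchall()
descending = conn.execute("SELECT emp_id FROM employee ORDER BY emp_id DESC").fetchall()
```

Atomicity is not enforced by the engine in this sketch; it remains a design rule that the table designer has to follow.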
EmpID  Name              DeptName      Salary
100    Margaret Simpson  Marketing     48,000
140    Allen Beeton      Accounting    52,000
110    Chris Lucero      Info. System  43,000
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000
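EmpID  Name              DeptName      Salary  CourseTitle  DateCompleted
140    Allen Beeton      Accounting    52,000  Tax Acc      12/8/201X
110    Chris Lucero      Info. System  43,000  SPSS, C++    1/12/201X, 4/22/201X
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000  SPSS, Java   6/16/201X, 8/12/201X

Multi-valued: CourseTitle and DateCompleted hold more than one value in a single cell.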
Can we implement a database as stated above (Y/N)? Why?
How can we:
1) count the number of courses an employee has completed?
2) add a new course just completed today?
We will learn how to handle the issue of “multi-valued” attributes.
(a) Table with repeating groups or multi-valued attributes (Un-Normalized)
EmpID  Name              DeptName      Salary  CourseTitle  DateCompleted
100    Margaret Simpson  Marketing     48,000  SPSS         6/19/201X
                                               Surveys      10/7/201X
140    Allen Beeton      Accounting    52,000  Tax Acc      12/8/201X
110    Chris Lucero      Info. System  43,000  SPSS         1/12/201X
                                               C++          4/22/201X
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000  SPSS         6/16/201X
                                               Java         8/12/201X
EmpID  Name              DeptName      Salary  CourseTitle  DateCompleted
100    Margaret Simpson  Marketing     48,000  SPSS         6/19/201X
100    Margaret Simpson  Marketing     48,000  Surveys      10/7/201X
140    Allen Beeton      Accounting    52,000  Tax Acc      12/8/201X
110    Chris Lucero      Info. System  43,000  SPSS         1/12/201X
110    Chris Lucero      Info. System  43,000  C++          4/22/201X
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000  SPSS         6/16/201X
150    Susan Martin      Marketing     42,000  Java         8/12/201X
Figure 4-2 (a) Table with repeating groups – how to “remove” them (and solve the problem)
Well-Structured Relations
• A well-structured relation contains minimal redundancy and allows users to insert, modify, and delete the rows in a table without errors or inconsistencies.
• The following anomalies should be removed, as far as possible, to obtain a well-structured relation:
  – Insertion Anomaly
  – Deletion Anomaly
  – Modification Anomaly
Insertion Anomaly: To insert a new row, the user must supply values for both EmpID (PK) and CourseTitle (CPK and FK). This is an insertion anomaly, since the user should be able to enter employee data without supplying course data.
Deletion Anomaly: Deleting the row for employee number 140 loses not only the employee’s information but also the fact that the course (Tax Acc) had an offering that was completed on that date.
Modification Anomaly: If employee number 100 gets a salary increase, we must record the increase in each of the rows for that employee (two occurrences in this example); otherwise the data will be inconsistent.
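The three anomalies can be reproduced on a small copy of EMPLOYEE2. This is only a sketch using Python's sqlite3 module: the snake_case names, the subset of rows, the new hire with EmpID 200, and the new salary of 50,000 are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical rendering of EMPLOYEE2 with the composite key (EmpID, CourseTitle).
conn.execute("""
    CREATE TABLE employee2 (
        emp_id         INTEGER NOT NULL,
        name           TEXT NOT NULL,
        dept_name      TEXT NOT NULL,
        salary         INTEGER NOT NULL,
        course_title   TEXT NOT NULL,
        date_completed TEXT,
        PRIMARY KEY (emp_id, course_title)
    )
""")
conn.executemany(
    "INSERT INTO employee2 VALUES (?, ?, ?, ?, ?, ?)",
    [
        (100, "Margaret Simpson", "Marketing", 48000, "SPSS", "6/19/201X"),
        (100, "Margaret Simpson", "Marketing", 48000, "Surveys", "10/7/201X"),
        (140, "Allen Beeton", "Accounting", 52000, "Tax Acc", "12/8/201X"),
    ],
)

# Insertion anomaly: an employee with no course cannot be stored,
# because course_title is part of the key and may not be missing.
try:
    conn.execute(
        "INSERT INTO employee2 (emp_id, name, dept_name, salary) "
        "VALUES (200, 'New Hire', 'Finance', 40000)"  # made-up employee
    )
    insert_blocked = False
except sqlite3.IntegrityError:
    insert_blocked = True

# Deletion anomaly: removing employee 140 also removes the only
# record that the Tax Acc course was ever completed.
conn.execute("DELETE FROM employee2 WHERE emp_id = 140")
tax_acc_rows = conn.execute(
    "SELECT COUNT(*) FROM employee2 WHERE course_title = 'Tax Acc'"
).fetchone()[0]

# Modification anomaly: updating only one of employee 100's two rows
# leaves two different salaries on file for the same person.
conn.execute(
    "UPDATE employee2 SET salary = 50000 "
    "WHERE emp_id = 100 AND course_title = 'SPSS'"
)
salaries_for_100 = conn.execute(
    "SELECT DISTINCT salary FROM employee2 WHERE emp_id = 100 ORDER BY salary"
).fetchall()
```

After the partial update, `salaries_for_100` holds two values, which is exactly the inconsistency the modification anomaly describes.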
Fig. 4-7: EMP_COURSE: Normalized relations from EMPLOYEE2

EMPLOYEE2
EmpID  Name              DeptName      Salary  CourseTitle  DateCompleted
100    Margaret Simpson  Marketing     48,000  SPSS         6/19/201X
100    Margaret Simpson  Marketing     48,000  Surveys      10/7/201X
140    Allen Beeton      Accounting    52,000  Tax Acc      12/8/201X
110    Chris Lucero      Info. System  43,000  SPSS         1/12/201X
110    Chris Lucero      Info. System  43,000  C++          4/22/201X
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000  SPSS         6/16/201X
150    Susan Martin      Marketing     42,000  Java         8/12/201X

EmpID  Name              DeptName      Salary
100    Margaret Simpson  Marketing     48,000
140    Allen Beeton      Accounting    52,000
110    Chris Lucero      Info. System  43,000
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000
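The split of EMPLOYEE2 into an employee relation and an EMP_COURSE relation can be sketched as follows, using Python's sqlite3 module. The table names `employee` and `emp_course`, the snake_case columns, and the new salary of 50,000 are assumptions for illustration; the rows are the ones shown in EMPLOYEE2.

```python
import sqlite3

# Rows of EMPLOYEE2; employee 190 has not completed any course.
employee2_rows = [
    (100, "Margaret Simpson", "Marketing", 48000, "SPSS", "6/19/201X"),
    (100, "Margaret Simpson", "Marketing", 48000, "Surveys", "10/7/201X"),
    (140, "Allen Beeton", "Accounting", 52000, "Tax Acc", "12/8/201X"),
    (110, "Chris Lucero", "Info. System", 43000, "SPSS", "1/12/201X"),
    (110, "Chris Lucero", "Info. System", 43000, "C++", "4/22/201X"),
    (190, "Lorenzo Davis", "Finance", 55000, None, None),
    (150, "Susan Martin", "Marketing", 42000, "SPSS", "6/16/201X"),
    (150, "Susan Martin", "Marketing", 42000, "Java", "8/12/201X"),
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE employee (
        emp_id    INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        dept_name TEXT NOT NULL,
        salary    INTEGER NOT NULL
    );
    CREATE TABLE emp_course (
        emp_id         INTEGER NOT NULL REFERENCES employee (emp_id),
        course_title   TEXT NOT NULL,
        date_completed TEXT,
        PRIMARY KEY (emp_id, course_title)
    );
""")

# Each employee is stored once; each completed course becomes one emp_course row.
for emp_id, name, dept, salary, course, completed in employee2_rows:
    conn.execute(
        "INSERT OR IGNORE INTO employee VALUES (?, ?, ?, ?)",
        (emp_id, name, dept, salary),
    )
    if course is not None:
        conn.execute(
            "INSERT INTO emp_course VALUES (?, ?, ?)", (emp_id, course, completed)
        )

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
course_count = conn.execute("SELECT COUNT(*) FROM emp_course").fetchone()[0]

# A salary change now touches exactly one row.
rows_updated = conn.execute(
    "UPDATE employee SET salary = 50000 WHERE emp_id = 100"
).rowcount

# Counting courses per employee is a plain aggregate, including employees with none.
courses_per_employee = conn.execute("""
    SELECT e.emp_id, COUNT(c.course_title)
    FROM employee AS e
    LEFT JOIN emp_course AS c ON c.emp_id = e.emp_id
    GROUP BY e.emp_id
    ORDER BY e.emp_id
""").fetchall()
```

With this layout, employee 190 can be stored without any course data, and recording a newly completed course is a single insert into `emp_course`.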
Fig. 4-10: Mapping an entity with a multivalued attribute
(a) Employee entity type with multivalued attribute
(b) Mapping a multivalued attribute
EMPLOYEE (EmployeeID, EmployeeName, EmployeeAddress)
Second relation (EmployeeID, Skill)
• The multivalued attribute becomes a separate relation with a foreign key.
• Two relations are created: the first contains all of the attributes except the multivalued attribute; the second contains the primary key of the first and the multivalued attribute.
• There is a one-to-many relationship between the original entity and the new relation.
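A minimal sketch of this mapping, using Python's sqlite3 module, is shown below. The table name `employee_skill`, the snake_case column names, and all data values are assumptions for illustration; only the attribute names EmployeeID, EmployeeName, EmployeeAddress, and Skill come from the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE employee (
        employee_id      INTEGER PRIMARY KEY,
        employee_name    TEXT NOT NULL,
        employee_address TEXT
    );
    -- The multivalued attribute gets its own relation: one row per (employee, skill).
    CREATE TABLE employee_skill (
        employee_id INTEGER NOT NULL REFERENCES employee (employee_id),
        skill       TEXT NOT NULL,
        PRIMARY KEY (employee_id, skill)
    );
""")

# Made-up sample data.
conn.execute("INSERT INTO employee VALUES (1, 'Sample Employee', '1 Example Street')")
conn.executemany(
    "INSERT INTO employee_skill VALUES (?, ?)",
    [(1, "SQL"), (1, "Modeling")],
)

# One employee, many skills.
skills = [
    row[0]
    for row in conn.execute(
        "SELECT skill FROM employee_skill WHERE employee_id = 1 ORDER BY skill"
    )
]

# The foreign key rejects a skill for an employee that does not exist.
try:
    conn.execute("INSERT INTO employee_skill VALUES (99, 'SQL')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The composite primary key on (employee_id, skill) also prevents the same skill from being recorded twice for one employee.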
Draw an entity-relationship diagram (enterprise model) for Mountain View Community Hospital, based on the narrative description of the case and this handout (the entities come from the five (5) figures shown above). Create a file called MVC_Hospital_DD.docx and turn it in as a hard copy. It should contain the following materials:
1. Read and apply the material from Chapters 2, 3, and 4.
2. Include entities, associations (with detailed cardinality), and attributes.
3. Determine and draw the order of entering data.
Next phase (implementation): create a SQL script file for the table structure and the data (values).
(a) Create a graphical relation based on your final ERD (yours or my version), showing pk, cpk, fk, field type, field size, etc., as in JustLee_DDL. Name this file MVC_PhaseI_ERD-2_Lastname_Firstname.docx.
(b) Create a script file (MVC_PhaseII_Lastname_Firstname.SQL) that contains a set of DROP, CREATE, and INSERT commands performing the same functions as those in the Northwoods.sql script file.
(c) Save your spooled file (both the script file and the results from SQL) as MVC_PhaseII_Spool_Lastname_Firstname.txt. Upload three files (MVC_ERD2, *.sql, and *.txt) to the Bb. Turn in ONLY the spooled file (*.txt).
Phase III.
(a) Create a second script file (MVC_PhaseIII_QUERIES_Lastname_Firstname.SQL) containing a set of SQL commands that answer the questions. Test the queries one at a time until each runs successfully. Note that you may need other SQL commands, and you may want to create database views to make the questions easier to answer. You may also need to read other SQL references in the textbook (e.g., Chapters 6 & 7 of the main text).
(b) Save the spooled file as MVC_PhaseIII_Spool_Lastname_Firstname.txt. Finally, create a new file (*.docx) containing all work done in Parts I and II, and save it as MVC_Hospital_Complete_Lastname_Firstname.docx.
(c) The file should contain your class and personal information, the information for each question (retyped with its question number), and each individual query with its result.
Upload both the *.sql and *.txt files to the Bb. Turn in only the spooled file (*.txt).
-- Version A of service_charge_view, which includes patient_no and patient name [for Query 1(A)]
CREATE OR REPLACE VIEW service_charge_view
  (Patient_No, Patient_Name, Item_Code, Service_Charge) AS
SELECT patient.patient_no,
       patient.p_first || ' ' || patient.p_last,
       item.item_code,
       (item_charge * ITS.num_times_serviced)
FROM patient, item, ITEM_SERVICE ITS
WHERE item.item_code = ITS.item_code
  AND ITS.patient_no = patient.patient_no
ORDER BY patient.p_last;

Hint: You need to create one or more VIEWs to help you write the SQL efficiently and effectively. See the sample on the Bb.

-- Total service charge per patient, built on top of service_charge_view
CREATE OR REPLACE VIEW total_service_charge_view AS
SELECT patient_name,
       SUM(service_charge_view.service_Charge) Total_Service_Charge
FROM service_charge_view
GROUP BY patient_name;
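To see how one view can be layered on another, the two views can be tried against a tiny made-up data set. This sketch uses Python's sqlite3 module rather than Oracle, so `CREATE OR REPLACE VIEW` becomes plain `CREATE VIEW` (SQLite has no OR REPLACE for views) and the ORDER BY is left out. The column types and every data value are assumptions; only the table, column, and view names come from the script above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column types are assumed; the real MVC schema may differ.
    CREATE TABLE patient (patient_no INTEGER PRIMARY KEY, p_first TEXT, p_last TEXT);
    CREATE TABLE item (item_code INTEGER PRIMARY KEY, item_charge REAL);
    CREATE TABLE item_service (
        patient_no         INTEGER REFERENCES patient (patient_no),
        item_code          INTEGER REFERENCES item (item_code),
        num_times_serviced INTEGER
    );

    -- Made-up rows for illustration only.
    INSERT INTO patient VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO item VALUES (10, 5.0), (20, 12.5);
    INSERT INTO item_service VALUES (1, 10, 2), (1, 20, 1), (2, 10, 3);

    -- One row per patient and item: charge = unit charge * times serviced.
    CREATE VIEW service_charge_view
        (patient_no, patient_name, item_code, service_charge) AS
    SELECT patient.patient_no,
           patient.p_first || ' ' || patient.p_last,
           item.item_code,
           item.item_charge * its.num_times_serviced
    FROM patient, item, item_service AS its
    WHERE item.item_code = its.item_code
      AND its.patient_no = patient.patient_no;

    -- Second view aggregates the first one per patient.
    CREATE VIEW total_service_charge_view AS
    SELECT patient_name, SUM(service_charge) AS total_service_charge
    FROM service_charge_view
    GROUP BY patient_name;
""")

totals = conn.execute(
    "SELECT patient_name, total_service_charge "
    "FROM total_service_charge_view ORDER BY patient_name"
).fetchall()
```

Because the total is computed from the first view, a query against `total_service_charge_view` never has to repeat the three-table join.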
In-class exercise: Transform it to relations (NOT 3NF)
#2-III-a (p.158): apply Figure 3-6.b (p.102); read step 7 on p.141
HW#2-III-d (p.158): apply Figure 3-10 (p.105)
Figure 4-2 (b) EMPLOYEE2 relation
EMPLOYEE Relation (Table)
EmpID  Name              DeptName      Salary  CourseTitle  DateCompleted
100    Margaret Simpson  Marketing     48,000  SPSS         6/19/201X
100    Margaret Simpson  Marketing     48,000  Surveys      10/7/201X
140    Allen Beeton      Accounting    52,000  Tax Acc      12/8/201X
110    Chris Lucero      Info. System  43,000  SPSS         1/12/201X
110    Chris Lucero      Info. System  43,000  C++          4/22/201X
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000  SPSS         6/16/201X
150    Susan Martin      Marketing     42,000  Java         8/12/201X
A Process of 1NF to 2NF (EMPLOYEE2 is in 1NF)
(b) Functional dependencies in EMPLOYEE2
EMPLOYEE2 (EmpID, CourseTitle, Name, DeptName, Salary, DateCompleted)
EmpID → Name, DeptName, Salary
EmpID, CourseTitle → DateCompleted
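Whether a candidate functional dependency is contradicted by the EMPLOYEE2 sample rows can be checked mechanically. The helper `holds` below is a hypothetical function written for this illustration. Note the limit of the approach: sample data can refute a dependency, but it cannot prove one, since dependencies come from business rules.

```python
def holds(rows, determinant, dependent):
    """Return True if no two rows agree on `determinant` but differ on `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in determinant)
        value = tuple(row[attr] for attr in dependent)
        # Remember the first dependent value seen for this key; any later
        # row with the same key must carry the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True


COLUMNS = ("EmpID", "Name", "DeptName", "Salary", "CourseTitle", "DateCompleted")
employee2 = [
    dict(zip(COLUMNS, values))
    for values in [
        (100, "Margaret Simpson", "Marketing", 48000, "SPSS", "6/19/201X"),
        (100, "Margaret Simpson", "Marketing", 48000, "Surveys", "10/7/201X"),
        (140, "Allen Beeton", "Accounting", 52000, "Tax Acc", "12/8/201X"),
        (110, "Chris Lucero", "Info. System", 43000, "SPSS", "1/12/201X"),
        (110, "Chris Lucero", "Info. System", 43000, "C++", "4/22/201X"),
        (150, "Susan Martin", "Marketing", 42000, "SPSS", "6/16/201X"),
        (150, "Susan Martin", "Marketing", 42000, "Java", "8/12/201X"),
    ]
]

# EmpID alone determines the employee attributes (a partial dependency on the key).
partial = holds(employee2, ["EmpID"], ["Name", "DeptName", "Salary"])

# EmpID alone does not determine DateCompleted ...
emp_to_date = holds(employee2, ["EmpID"], ["DateCompleted"])

# ... but the full composite key (EmpID, CourseTitle) does.
full_key = holds(employee2, ["EmpID", "CourseTitle"], ["DateCompleted"])
```

The partial dependency of Name, DeptName, and Salary on EmpID alone is what keeps EMPLOYEE2 out of second normal form.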
EMPLOYEE2 (1NF)
EmpID  Name              DeptName      Salary  CourseTitle  DateCompleted
100    Margaret Simpson  Marketing     48,000  SPSS         6/19/200X
100    Margaret Simpson  Marketing     48,000  Surveys      10/7/200X
140    Allen Beeton      Accounting    52,000  Tax Acc      12/8/200X
110    Chris Lucero      Info. System  43,000  SPSS         1/12/200X
110    Chris Lucero      Info. System  43,000  C++          4/22/200X
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000  SPSS         6/16/200X
150    Susan Martin      Marketing     42,000  Java         8/12/200X

EmpID  Name              DeptName      Salary
100    Margaret Simpson  Marketing     48,000
140    Allen Beeton      Accounting    52,000
110    Chris Lucero      Info. System  43,000
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000
MVC_Hospital Phase III: Create a script file
1. Create a script file (MVC_PhaseIII_QUERIES_Lastname_Firstname.SQL) containing a set of SQL commands that answer the questions. Test the queries one at a time until each runs successfully. Note that you may need other SQL commands, and you may want to create database views (see the pptx file introducing VIEWS) to make the questions easier to answer. You may also need to read other SQL references in the textbook (e.g., Chapter 7 of McFadden).
2. Spool the script file and save the output in MVC_Hospital_Spool_Lastname_Firstname.txt. Finally, create a new file (*.docx) containing all work done in Parts I (including MVC_ERD2), II, and III, and save it as MVC_Hospital_Complete_Lastname_Firstname.docx. The file should contain your class information and personal information.
3. UPLOAD ALL three files (*.sql, *.txt, and *.docx) to the Bb by the deadline. Turn in ONLY the *.docx file.
• We will study the concept and technique of “normalization and de-normalization” as well as OLTP and OLAP.
More on OLTP vs. OLAP
Fig. Extra-a: A simple database with a relation between two tables.
• The figure depicts a relational database environment with two tables.
• The first table contains information about pet owners; the second, information about pets. The tables are related by the single column they have in common: Owner_ID.
• By relating tables to one another, we can reduce ____________ of data and improve database performance.
• The process of breaking tables apart and thereby reducing data redundancy is called _______________.
Answers: redundancy; normalization
For those who have a database background: pk = primary key, fk = foreign key.
• Most relational databases which are designed to handle a high number of reads and writes (updates and retrievals of information) are referred to as ________ (OnLine Transaction Processing) systems.
• OLTP systems are very efficient for high volume activities such as cashiering, where many items are being recorded via bar code scanners in a very short period of time.
• However, using OLTP databases for analysis is generally not very efficient, because in order to retrieve data from multiple tables at the same time, a query containing ________ must be used.
Answers: OLTP; joins
OLTP vs. OLAP (cont.)
• In order to keep our transactional databases running quickly and smoothly, we may wish to create a data warehouse. A data warehouse is a type of large database (including both current and historical data) that has been _____________ and archived.
• Denormalization is the process of intentionally combining some tables into a single table in spite of the fact that this may introduce duplicate data in some columns.
• The figure depicts what our simple example data might look like if it were in a data warehouse. When we design databases in this way, we reduce the number of joins necessary to query related data, thereby speeding up the process of analyzing our data.
• Databases designed in this manner are called __________ (OnLine Analytical Processing) systems.
Fig. Extra-b: A combination of the tables into a single dataset.
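The trade-off between the two designs can be sketched with the pet-owner example, using Python's sqlite3 module. Only the shared column Owner_ID comes from the slides; the table names, the other columns, and all data values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized (OLTP-style) design: two tables related by owner_id.
    CREATE TABLE owner (
        owner_id   INTEGER PRIMARY KEY,
        owner_name TEXT NOT NULL
    );
    CREATE TABLE pet (
        pet_id   INTEGER PRIMARY KEY,
        pet_name TEXT NOT NULL,
        owner_id INTEGER NOT NULL REFERENCES owner (owner_id)
    );
    INSERT INTO owner VALUES (1, 'Pat'), (2, 'Sam');
    INSERT INTO pet VALUES (1, 'Rex', 1), (2, 'Tom', 1), (3, 'Ivy', 2);

    -- Denormalized (warehouse-style) design: one combined table,
    -- built once so that later queries need no join.
    CREATE TABLE pet_wide AS
    SELECT p.pet_id, p.pet_name, o.owner_id, o.owner_name
    FROM pet AS p
    JOIN owner AS o ON o.owner_id = p.owner_id;
""")

# Pets per owner from the normalized tables: a join is required.
with_join = conn.execute("""
    SELECT o.owner_name, COUNT(*)
    FROM pet AS p
    JOIN owner AS o ON o.owner_id = p.owner_id
    GROUP BY o.owner_name
    ORDER BY o.owner_name
""").fetchall()

# The same question against the combined table: no join.
without_join = conn.execute("""
    SELECT owner_name, COUNT(*)
    FROM pet_wide
    GROUP BY owner_name
    ORDER BY owner_name
""").fetchall()

# The price of denormalization: the owner's name is stored once per pet.
copies_of_pat = conn.execute(
    "SELECT COUNT(*) FROM pet_wide WHERE owner_name = 'Pat'"
).fetchone()[0]
```

Both queries return the same answer; the combined table trades duplicated owner data for simpler analytical queries.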
You have just learned two of the most important concepts for developing a quality database: integrity constraints and normalization.
MVC_Hospital: Create two script files
1. Create a script file (MVC_Hospital_Lastname_Firstname.SQL) that contains a set of DROP, CREATE, and INSERT commands performing the same functions as those in the Northwoods.sql script file.
2. Create a second script file (MVC_Hospital_QUERIES_Lastname_Firstname.SQL) containing a set of SQL commands that answer the questions. Test the queries one at a time until each runs successfully. Note that you may need other SQL commands, and you may want to create database views (see the pptx file introducing VIEWS) to make the questions easier to answer. You may also need to read other SQL references in the textbook (e.g., Chapter 7 of McFadden).
3. Spool (2) and save the output in MVC_Hospital_Spool_Lastname_Firstname.LST. Finally, create a new file (*.docx) containing all work done in Part I, and save it as MVC_Hospital_Complete_Lastname_Firstname.docx. The file should contain your class information and personal information.
4. UPLOAD the .docx file to the Bb by the deadline.