Copyright © Genetic Computer School 2009 DBSQL 4-1 Chapter 4 Database Design
Dec 26, 2015
Chapter 4
Database Design
Copyright © Genetic Computer School, Singapore 2009
DBSQL 4-2
Chapter 4 Overview
Database Design
Redundancy Issues
Data Anomaly
Functional Dependency Theory
Normalization
Drawbacks of Normalization
Denormalization
Database Design
DBMS
A DBMS is a program, or a collection of programs, through which users interact with a database.
It is a software system that enables users to define, create and maintain the database, and that provides controlled access to it.
Database Processing
[Figure: data entry and reports pass through the DBMS to the database, which holds Property, Owner, Renter and Lease details plus the file definitions.]
Database Development Process
PLANNING: Enterprise Data Model
ANALYSIS: Conceptual Data Model
LOGICAL DATABASE DESIGN: Logical Data Model
PHYSICAL DATABASE DESIGN: Physical Data Model
IMPLEMENTATION: Database and Repositories
Redundancy Issues
Redundancy is a common problem when implementing a database. It means that data are duplicated: the same data are stored in more than one place. Removing redundant data carelessly can lose information, and leaving it in place can lead to anomalies. There are three kinds of anomaly: insertion, deletion and update.
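The three anomalies can be illustrated with a small Python sketch. The rows, part numbers and helper name below are hypothetical and made up for illustration; the only idea taken from the text is that the same fact is stored in more than one row.

```python
# Hypothetical redundant table: the part description is repeated in every
# order line that mentions the part.
rows = [
    {"OrderNum": 1, "PartNum": "P1", "Description": "Chair"},
    {"OrderNum": 2, "PartNum": "P1", "Description": "Chair"},
    {"OrderNum": 3, "PartNum": "P2", "Description": "Table"},
]

def descriptions_of(rows, part_num):
    """Return the set of descriptions stored for one part."""
    return {r["Description"] for r in rows if r["PartNum"] == part_num}

# Update anomaly: changing the description in only one row leaves the
# table with two conflicting descriptions for the same part.
updated = [dict(r) for r in rows]
updated[0]["Description"] = "Office chair"
update_anomaly = len(descriptions_of(updated, "P1")) > 1

# Deletion anomaly: deleting the only order for part P2 also deletes the
# fact that P2 is a "Table".
after_delete = [r for r in rows if r["OrderNum"] != 3]
deletion_anomaly = len(descriptions_of(after_delete, "P2")) == 0

# Insertion anomaly: a new part cannot be recorded without an order,
# because every row needs an OrderNum.
new_part = {"OrderNum": None, "PartNum": "P3", "Description": "Lamp"}
insertion_anomaly = new_part["OrderNum"] is None

print(update_anomaly, deletion_anomaly, insertion_anomaly)  # True True True
```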
Data anomalies
Insert
Delete
Update
Functional Dependencies
A functional dependency occurs when one attribute in a relation uniquely determines another attribute. It is denoted 'A → B', which means 'B is functionally dependent upon A'.
A → B means:
B is functionally dependent on A
A determines B
A is called the determinant
B is called the object of the determinant
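One way to make the definition concrete is to test a candidate dependency against sample rows. The sketch below is a minimal illustration in Python; the function name `holds` and the sample data are assumptions, not part of the course material. Sample data can only refute a dependency, it cannot prove that the dependency holds for the schema in general.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns
        # the stored value on later rows with the same key.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample rows.
rows = [
    {"PartNum": "DR93", "Description": "Computer", "NumOrdered": 1},
    {"PartNum": "DW11", "Description": "Chair", "NumOrdered": 1},
    {"PartNum": "DR93", "Description": "Computer", "NumOrdered": 3},
]

print(holds(rows, ["PartNum"], ["Description"]))  # True
print(holds(rows, ["PartNum"], ["NumOrdered"]))   # False
```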
FD Inference Rules
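Primary rules
Subset Property: if Y is a subset of X, then X → Y
Augmentation: if X → Y, then XZ → YZ
Transitivity: if X → Y and Y → Z, then X → Z
Secondary rules
Union: if X → Y and X → Z, then X → YZ
Decomposition: if X → YZ, then X → Y and X → Z
Pseudo-transitivity: if X → Y and WY → Z, then WX → Z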
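In practice the inference rules are usually applied through the attribute closure: the set of all attributes that a given set determines. The following Python sketch shows one common way to compute it; the function name `closure` and the attributes A to E are illustrative assumptions.

```python
def closure(attributes, fds):
    """Return the closure of `attributes` under the dependencies `fds`.

    `fds` is a list of (left, right) pairs of attribute sets.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            # If the left side is already determined, the right side is too.
            if set(left) <= result and not set(right) <= result:
                result |= set(right)
                changed = True
    return result

# Hypothetical dependencies: A -> B, B -> C, CD -> E.
fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C", "D"}, {"E"})]

print(sorted(closure({"A"}, fds)))       # ['A', 'B', 'C']
print(sorted(closure({"A", "D"}, fds)))  # ['A', 'B', 'C', 'D', 'E']
```

The first result shows transitivity at work: A → B and B → C give A → C.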
Types of Dependency
Functional: an attribute is determined by a candidate key, which uniquely identifies each row of the relation.
Partial: only a subset of the attributes of a composite determinant is needed to identify its object.
Transitive: there is an intermediate functional dependency. If A → B and B → C, then A → C holds transitively through B.
Normalization
Normalization is a formal technique for analyzing relations based on their primary key (or candidate keys in the case of BCNF) and functional dependencies.
Normalization is often executed as a series of steps. Each step corresponds to a specific normal form, which has known properties.
The technique involves a series of rules that can be tested against individual relations so that a database can be normalized to any degree.
Stages of Normalization Process
First Normal Form (1NF)
A table is said to be in First Normal Form (1NF) if every attribute of every row holds one and only one atomic value. At this stage repeating groups are eliminated. Removing repeating groups is the starting point in the quest to create tables that are as free of problems as possible.
Example
Orders (OrderNum, OrderDate, (PartNum, NumOrdered))
Orders
OrderNum  OrderDate   PartNum  NumOrdered
21608     10/20/2008  AT94     11
21610     10/20/2008  DR93     1
                      DW11     1
21613     10/21/2008  KL62     4
21614     10/21/2008  KT03     2
21617     10/23/2008  BV06     2
                      CD52     4
21619     10/24/2008  DR93     1
21623     10/25/2008  KV29     2
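Eliminating the repeating group can be sketched in Python as follows. The code uses three of the orders from the example; the nested-list representation and the function name `to_first_normal_form` are assumptions made for illustration.

```python
# Unnormalized orders: each order carries a repeating group of
# (PartNum, NumOrdered) pairs.
orders = [
    ("21608", "10/20/2008", [("AT94", 11)]),
    ("21610", "10/20/2008", [("DR93", 1), ("DW11", 1)]),
    ("21617", "10/23/2008", [("BV06", 2), ("CD52", 4)]),
]

def to_first_normal_form(orders):
    """Emit one row per (order, part) so every cell holds one atomic value."""
    return [
        (order_num, order_date, part_num, num_ordered)
        for order_num, order_date, parts in orders
        for part_num, num_ordered in parts
    ]

flat = to_first_normal_form(orders)
for row in flat:
    print(row)
```

Each output row repeats OrderNum and OrderDate, so the primary key of the flattened table becomes the combination of OrderNum and PartNum.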
Second Normal Form (2NF)
To convert a table into Second Normal Form (2NF), it must first be in First Normal Form (1NF). A table is in 2NF if every non-key attribute depends on the whole primary key, not on just part of it. In other words, there are no partial dependencies.
Example
Orders
OrderNum  OrderDate   PartNum  Description     NumOrdered  QuotedPrice
21608     10/20/2008  AT94     Table           11          $100.00
21610     10/20/2008  DR93     Computer        1           $925.00
21610     10/20/2008  DW11     Chair           1           $21.00
21613     10/21/2008  KL62     Notebook        4           $5.00
21614     10/21/2008  KT03     Aircon          2           $1,500.00
21617     10/23/2008  BV06     TV              2           $2,101.00
21617     10/23/2008  CD52     Microwave Oven  4           $794.00
21619     10/24/2008  DR93     Computer        1           $200.00
21623     10/25/2008  KV29     Fax Machine     2           $350.00
Orders (OrderNum, OrderDate, PartNum, Description, NumOrdered, QuotedPrice)
Removing the partial dependencies yields three relations in 2NF:
Orders (OrderNum, OrderDate)
Part (PartNum, Description)
OrderLine (OrderNum, PartNum, NumOrdered, QuotedPrice)
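The decomposition can be viewed as three projections of the 1NF table, each with duplicate rows removed. Here is a minimal Python sketch using three rows from the example; the helper name `project` and the use of column positions are assumptions made for illustration.

```python
# A few rows of the 1NF Orders table, in the column order
# (OrderNum, OrderDate, PartNum, Description, NumOrdered, QuotedPrice).
orders_1nf = [
    ("21610", "10/20/2008", "DR93", "Computer", 1, 925.00),
    ("21610", "10/20/2008", "DW11", "Chair", 1, 21.00),
    ("21619", "10/24/2008", "DR93", "Computer", 1, 200.00),
]

def project(rows, columns):
    """Project rows onto the given column positions, dropping duplicates
    while keeping first-seen order."""
    return list(dict.fromkeys(tuple(row[c] for c in columns) for row in rows))

orders = project(orders_1nf, [0, 1])            # Orders (OrderNum, OrderDate)
part = project(orders_1nf, [2, 3])              # Part (PartNum, Description)
order_line = project(orders_1nf, [0, 2, 4, 5])  # OrderLine (OrderNum, PartNum, NumOrdered, QuotedPrice)

print(orders)      # [('21610', '10/20/2008'), ('21619', '10/24/2008')]
print(part)        # [('DR93', 'Computer'), ('DW11', 'Chair')]
print(order_line)
```

After the split, the description of part DR93 is stored once instead of once per order line.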
Third Normal Form (3NF)
Third Normal Form (3NF) removes transitive dependencies: a table in 2NF is in 3NF when no non-key attribute depends on the primary key through another non-key attribute.
Example
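Customer (CustID, CustName, CustAddress, Balance, CustCreditLimit, CustStatus, RepNo, FirstName, LastName)
FirstName and LastName depend on RepNo, which in turn depends on CustID. Removing this transitive dependency gives:
Representative (RepNo, FirstName, LastName)
Customer (CustID, CustName, CustAddress, Balance, CustCreditLimit, CustStatus, RepNo)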
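The benefit of the 3NF design can be tried out with Python's built-in sqlite3 module. This is only a sketch: the Customer table is cut down to three columns, and the column types and all data values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Representative (
        RepNo     TEXT PRIMARY KEY,
        FirstName TEXT,
        LastName  TEXT
    );
    CREATE TABLE Customer (
        CustID   TEXT PRIMARY KEY,
        CustName TEXT,
        RepNo    TEXT REFERENCES Representative (RepNo)
    );
    INSERT INTO Representative VALUES ('R1', 'Ann', 'Lee');
    INSERT INTO Customer VALUES ('C1', 'Acme', 'R1');
    INSERT INTO Customer VALUES ('C2', 'Bolt', 'R1');
""")

# The representative's name is stored once, so a single UPDATE is enough
# and no customer row can be left with a stale name.
conn.execute("UPDATE Representative SET LastName = 'Tan' WHERE RepNo = 'R1'")

rows = conn.execute("""
    SELECT c.CustID, r.FirstName, r.LastName
    FROM Customer AS c
    JOIN Representative AS r ON r.RepNo = c.RepNo
    ORDER BY c.CustID
""").fetchall()
print(rows)  # [('C1', 'Ann', 'Tan'), ('C2', 'Ann', 'Tan')]
conn.close()
```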
Boyce Codd Normal Form (BCNF)
A relation is in BCNF if, for every non-trivial functional dependency X → A, X is a super key. In other words, a relation is in BCNF if and only if every determinant is a candidate key.
BCNF is a stronger form of normalization than 3NF: it drops the 3NF exception that allows the right side of a functional dependency to be a prime attribute.
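The BCNF condition can be checked mechanically once the functional dependencies are listed. The Python sketch below is illustrative only: the function names are assumptions, and it tests just the dependencies it is given rather than every dependency they imply. It reuses the Customer and Representative attributes from the 3NF example, with the dependencies assumed for illustration.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under `fds`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Return the non-trivial dependencies whose left side is not a super key."""
    violations = []
    for left, right in fds:
        trivial = right <= left
        is_super_key = closure(left, fds) >= relation
        if not trivial and not is_super_key:
            violations.append((left, right))
    return violations

# Customer before decomposition, with assumed dependencies.
customer = {"CustID", "CustName", "RepNo", "FirstName", "LastName"}
customer_fds = [
    ({"CustID"}, {"CustName", "RepNo"}),
    ({"RepNo"}, {"FirstName", "LastName"}),
]
violations = bcnf_violations(customer, customer_fds)
print([(sorted(l), sorted(r)) for l, r in violations])
# [(['RepNo'], ['FirstName', 'LastName'])]

# Representative after decomposition: RepNo is a key, so nothing is reported.
representative = {"RepNo", "FirstName", "LastName"}
rep_fds = [({"RepNo"}, {"FirstName", "LastName"})]
print(bcnf_violations(representative, rep_fds))  # []
```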
Fourth Normal Form (4NF)
A table is in Fourth Normal Form (4NF) when it is in Third Normal Form (3NF) and there are no multi-valued dependencies.
Multi-valued dependencies
A →→ B
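A multi-valued dependency says that, for each value of A, the set of B values is independent of the remaining attributes. The Python sketch below tests this on sample rows; the function name `mvd_holds` and the Employee, Skill and Language data are hypothetical.

```python
from itertools import product

def mvd_holds(rows, a, b):
    """Return True if the multi-valued dependency a ->> b holds in `rows`.

    `rows` is a list of dicts; `a` and `b` are lists of attribute names.
    All remaining attributes form the third group.
    """
    if not rows:
        return True
    rest = [k for k in rows[0] if k not in a and k not in b]
    groups = {}
    for row in rows:
        key = tuple(row[k] for k in a)
        b_val = tuple(row[k] for k in b)
        rest_val = tuple(row[k] for k in rest)
        groups.setdefault(key, set()).add((b_val, rest_val))
    for pairs in groups.values():
        b_vals = {p[0] for p in pairs}
        rest_vals = {p[1] for p in pairs}
        # Every combination of a b-value and a rest-value must appear
        # for this a-value.
        if pairs != set(product(b_vals, rest_vals)):
            return False
    return True

complete = [
    {"Employee": "E1", "Skill": "Python", "Language": "English"},
    {"Employee": "E1", "Skill": "Python", "Language": "French"},
    {"Employee": "E1", "Skill": "SQL", "Language": "English"},
    {"Employee": "E1", "Skill": "SQL", "Language": "French"},
]
incomplete = complete[:3]  # the (SQL, French) combination is missing

print(mvd_holds(complete, ["Employee"], ["Skill"]))    # True
print(mvd_holds(incomplete, ["Employee"], ["Skill"]))  # False
```

When such a dependency holds, the table stores every Skill and Language combination for each employee, which is the redundancy that 4NF removes by splitting the table in two.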
Fifth Normal Form
The goal of Fifth Normal Form (5NF) is to remove join dependencies. In practice, normalization seldom reaches this stage.
Drawbacks of Normalization
It produces a lot of tables
It might slow performance when querying data across multiple tables
Queries become complex
Denormalization
Denormalization is used primarily to improve performance in cases where a database and its relations are over-normalized, causing overhead for the query processor.