8/6/2019 db mod 4
1/50
RT503 Database Management Systems Module 4
______________________________________________________________________________________________________________
Integrity constraints (ref: DBMS by Silberschatz and Galvin)
Unauthorized users who gain access to a database can damage its data or leave it inconsistent.
An authorized user can also leave the database in an inconsistent state by accident. Restrictions are
therefore placed on the database so that users do not change data accidentally in ways that break
consistency. These restrictions are called constraints.
Integrity constraints are intended for the normal (authorized) user. They ensure that changes made to
the database by authorized users do not result in a loss of data consistency, so they guard against
accidental damage to the database. There are a number of ways to specify integrity constraints.
They are
Key constraints ( primary keys, foreign keys and candidate key specification)
Using not null
Using check clause
Using assertions
Using triggers
Using functional dependencies
Domain constraints
We know that an attribute has a set of possible values associated with it.
For example in the student table
Student ( stdid, name, marks)
We know that the set of possible values for the attribute stdid is in the range of integers.
For attribute name the set of possible values are a group of characters.
For attribute marks the set of possible values are integers.
Integer, character, date and so on are called standard domain types.
______________________________________________________________________________________________________________ Department of IT Mangalam College of Engineering, Ettumanoor
Module 4
Database Design: Design guidelines - Relational database design - Integrity Constraints - Domain
Constraints - Referential integrity - Functional Dependency - Normalization using Functional
Dependencies, Normal forms based on primary keys - General definitions of Second and Third Normal
Forms - Boyce Codd Normal Form - Multivalued Dependencies and Fourth Normal Form - Join
Dependencies and Fifth Normal Form - Pitfalls in Relational Database Design.
Declaring an attribute to be of a particular domain acts as a constraint on the values that it can take. It is
possible for several attributes to have the same domain. For example in our student table, the domain of
stdid is the same as the domain of marks, namely integer. But a query such as "find the names of students
who have the same stdid as a mark" is not meaningful.
We can define new domains by using the create domain clause.
That is
create domain Dollars int;
create domain Pounds int;
define the domains Dollars and Pounds to be integers. An attempt to assign a value of type Dollars to a
variable of type Pounds would result in a syntax error although both have the same underlying type,
because they are different domains.
The check clause in SQL permits domains to be restricted in powerful ways. For example, suppose we are
creating a domain Studmarks and the condition is that a Studmarks value should not be more than 100.
We can specify this by
create domain Studmarks int
constraint marktest check (value <= 100);
create table books (
bid int,
bname char(10),
author char(10),
primary key (bid)
);
Suppose the students are allowed to access and reserve books. We are given that the details of all students
are in the student table and the details of all books are in the books table.
Suppose the condition in the college is that only students of the college are allowed to access and reserve
books. In other words, only students who have an entry in the student table are allowed to access the
books. That is, the stdid values in the reserved and accessed tables must also be present in the student
table.
Suppose another condition is that the students are allowed to access and reserve only those books that are
present in the college library, that is, books that are present in the books table. In other words, the bid
values in the reserved and accessed tables must also be present in the books table.
We can specify the above conditions or restrictions by using a foreign key clause.
That is, the tables accessed and reserved are created by
create table reserved (
stdid int,
bid int,
foreign key (stdid) references student (stdid),
foreign key (bid) references books (bid)
);
create table accessed (
stdid int,
bid int,
foreign key (stdid) references student (stdid),
foreign key (bid) references books (bid)
);
This means that for any tuple inserted in to the reserved table, the values of stdid and bid must be present
in the student and books tables respectively.
Likewise, for any tuple inserted in to the accessed table, the values of stdid and bid must be present in the
student and books tables respectively.
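The effect of the foreign key clause can be tried out with a small sketch. It uses Python's built-in sqlite3 module, which is an assumption of this example and not part of the notes; the student table definition and the sample rows are hypothetical. SQLite checks foreign keys only after the foreign_keys pragma is switched on.

```python
import sqlite3

# In-memory database; SQLite enforces foreign keys only when this pragma is on.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# The student definition is assumed from the schema Student(stdid, name, marks).
conn.executescript("""
create table student (stdid int primary key, name char(10), marks int);
create table books (bid int primary key, bname char(10), author char(10));
create table reserved (
    stdid int,
    bid int,
    foreign key (stdid) references student (stdid),
    foreign key (bid) references books (bid)
);
""")

# Hypothetical sample rows for the two parent tables.
conn.execute("insert into student values (100, 'Anil', 50)")
conn.execute("insert into books values (1, 'DBMS', 'Navathe')")


def try_reserve(stdid, bid):
    """Return True if the row is accepted, False if a foreign key rejects it."""
    try:
        conn.execute("insert into reserved values (?, ?)", (stdid, bid))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_reserve(100, 1)          # both values exist in the parent tables
rejected_student = try_reserve(999, 1)  # there is no student 999
rejected_book = try_reserve(100, 77)    # there is no book 77
print(accepted, rejected_student, rejected_book)
```

Only the first insert succeeds, because the other two refer to a stdid or bid that has no matching row in student or books.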
We can also create the tables reserved and accessed by specifying a constraint name for these foreign
keys. That is, another way of creating the tables is
create table reserved (
stdid int,
bid int,
constraint st foreign key (stdid) references student (stdid),
constraint bks foreign key (bid) references books (bid)
);
create table accessed (
stdid int,
bid int,
constraint stud foreign key (stdid) references student (stdid),
constraint bk1 foreign key (bid) references books (bid)
);
Here we have given names to these constraints.
So there are 2 foreign key constraints for the reserved table. They are st and bks.
There are 2 foreign key constraints for the accessed table. They are stud and bk1.
These facts can be represented by
Student
Stdid Name Marks
Books
Bid Bname Author
Reserved
Stdid Bid Rdate
Accessed
Stdid Bid Adate
Other types of constraints are primary key, unique, not null and check constraints.
For example, consider the table student.
Student ( stdid, branch, sem, rn, name, marks)
In this we can see that there are 2 candidate keys. They are stdid and (branch, sem, rn). One we
declare as the primary key and the other we declare as unique.
Suppose we have the constraint that the name and marks of a student should not be null, that is, there
should always be a value in these fields. Suppose also that we want to ensure that the value of marks is
not more than 100. We can ensure this by using the check clause. We can create the table by
create table student (
stdid int,
branch char(2),
sem int,
rn int,
name char(10) not null,
marks int not null,
primary key (stdid),
unique (branch, sem, rn),
check (marks <= 100)
);
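As a rough illustration of how the not null, unique, primary key and check clauses behave, the following sketch creates the student table in an in-memory SQLite database and tries a few inserts. The use of Python's sqlite3 module and all sample rows are assumptions made for this example; the check condition follows the rule stated above that marks should not be more than 100.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
create table student (
    stdid int,
    branch char(2),
    sem int,
    rn int,
    name char(10) not null,
    marks int not null,
    primary key (stdid),
    unique (branch, sem, rn),
    check (marks <= 100)
)
""")


def accepts(row):
    """Return True if the row can be inserted, False if a constraint rejects it."""
    try:
        conn.execute("insert into student values (?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical rows, each one probing a different constraint.
results = {
    "valid row": accepts((100, "cs", 3, 1, "Anil", 50)),
    "marks above 100": accepts((101, "cs", 3, 2, "Binil", 120)),
    "missing name": accepts((102, "cs", 3, 3, None, 70)),
    "duplicate (branch, sem, rn)": accepts((103, "cs", 3, 1, "Cinil", 70)),
    "duplicate stdid": accepts((100, "ec", 3, 1, "Dinil", 80)),
}
for case, ok in results.items():
    print(case, "->", "accepted" if ok else "rejected")
```

Only the first row is accepted; each of the other rows breaks exactly one of the declared constraints.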
Problems in updating values
Lossy join decomposition
We can see an example. Suppose the information related with a college is stored as
College (dname, dhod, dphone, stdid, stdname, stdmarks)
College
Dname  Dhod  Dphone  stdid  stdname  smarks
CS     Abc   23456   100    Ss1      70
CS     Abc   23456   101    Ss2      20
CS     Abc   23456   102    Ss3      45
EC     Bgh   78905   100    Ss7      67
EC     Bgh   78905   101    Ss8      55
AE     Mkl   34443   100    Ss2      68
CS     Abc   23456   103    Ss4      34
AE     Mkl   34443   101    Ss3      70
Suppose we want to add the details of a new student to the college table,
that is, student (800, hjk, 50) in the AE department.
In our design we need a tuple with values on all attributes of the college schema. Thus we must repeat
the dhod and dphone values and add the tuple
AE, Mkl, 34443, 800, hjk, 50
In general, the Dhod and Dphone for a department must appear once for each student admitted to
that department.
This repetition of information is undesirable. Repeating information wastes space and complicates
updates. Suppose the phone number of department CS changes from 23456 to 56789.
Under this design many tuples of the college relation need to be changed, so updates are costly.
When we perform the update, we must ensure that every tuple corresponding to the CS department is
updated. Otherwise the table will show 2 different phone numbers for the same department.
By observing this, we can say that this design of our table or database is bad.
We know that a department has a unique phone number, so given a department name we can
uniquely identify the phone number value.
We know that a department has many students, so given a department name we cannot uniquely
determine the stdid. In other words the functional dependency dname → dphone holds on the
college schema, but the functional dependency dname → stdid does not hold.
The fact that a department has a particular phone number and the fact that a department has a
student are independent, so these facts are best represented in separate tables. We will see that we can
use functional dependencies to specify formally when a database design is good.
Another problem with the college relation is that we cannot directly represent the information
related to a department (dname, dhod, dphone) if there are no students in that department. This is
because tuples in the college relation require values for stdid, stdname, stdmarks.
One solution for this is to use null values, but null values are difficult to handle. If we do not
want to deal with null values, we can create the department information only when the first student is
admitted to that department. And if all students of that department leave, then we have to delete all
information on that department. This situation is undesirable.
Other problems that can occur are update anomalies (problems in updates) and lossy join
decompositions. For example, consider the student table
Student ( stdid, branch, name, marks, hod, deptphoneno)
Student
Stdid  Branch  Name  Marks  Hod  Deptphoneno
100    Cs      Abc   60     Def  567890
101    Cs      Bcd   70     Def  567890
102    Ec      Sad   80     Ghj  123456
105    Ec      Abc   10     Ghj  123456
In this table we can see that there is repetition of information. There is one particular person as hod for
each branch. If the details of all students are stored in this table and there are 100 students in each
branch, the name of the hod will be repeated 100 times. The department phone number will also be
repeated 100 times. Suppose the hod of a particular branch changes. Then we have to update the hod
field in every tuple of that branch. If there are 100 tuples corresponding to the branch, then all those
tuples have to be updated. This is the case with deptphoneno also. If we want to change the phone
number of a particular department, it has to be changed in all these tuples. These problems are called
update anomalies.
Lossy join decomposition is another pitfall in relational database design. It is explained along with
fourth normal form.
Functional dependency (ref: Navathe)
This is a very important concept in the relational database design. A functional dependency is a
constraint between 2 sets of attributes from the database. First we can see an example.
Consider the student table.
Student
Stdid Sname Marks Rn Branch Sem Hod Grade
100 Anil 50 1 Cs 3 Abc D
101 Binil 80 2 Cs 3 Abc A
102 Cinil 70 3 Cs 3 Abc B
103 Dinil 80 4 Cs 3 Abc A
We are considering the student table, and our assumptions are based on a real world view of the student.
We can see that the candidate keys of the table are stdid and (branch, sem, rn). We know
that a key means that for each tuple the value of the key should be distinct. Take, for example,
stdid: for each row or tuple in the student table, the stdid value should be different. Then take the key
(branch, sem, rn). In this case the 3 values of these three attributes, taken together, are distinct for each
tuple or row.
Stdid
100
101
102
103
108
branch sem rn
cs 3 1
cs 3 2
cs 3 3
cs 5 1
cs 5 2
ec 3 1
ec 3 2
ec 3 3
ec 5 1
ec 5 2
we can see that the key values are distinct for each row.
If we say
Stdid → marks
this is called a functional dependency. That is, stdid functionally determines marks.
Suppose in the above table the values for the attributes are
Stdid marks
100 80
101 85
102 70
103 70
104 85
108 70
109 80
In any case, stdid values are different for each row since stdid is a candidate key. We can see that
for each stdid value there is a unique marks value. It means that if the stdid is 102, its corresponding
marks value is always 70 in this student table. This means that the value of the marks attribute of a
tuple in student depends on, or is determined by, the value of the stdid component. We can also say that
the value of the stdid component of a tuple uniquely (functionally) determines the value of the marks
attribute. We say that there is a functional dependency from stdid to marks, or that marks is
functionally dependent on stdid. The attribute stdid is called the left hand side of the FD and marks is
called the right hand side.
We can write other functional dependencies as
Stdid → sname
Stdid → rn
Stdid → sem
Stdid → branch
Stdid → hod
Stdid → grade
Also we can write as
Stdid → sname, marks, rn, branch, sem, hod, grade
We can see that this is correct. We can write the above dependencies because stdid is a key.
We can also write
Branch, sem, rn → marks
We can write it because the left hand side is a key.
Branch sem rn marks
Cs 3 1 50
Cs 3 2 60
Cs 3 3 70
Cs 3 4 50
Ec 3 1 50
Ec 3 2 20
Ec 3 3 30
Looking at this we can say that
(branch, sem, rn) functionally determines marks.
Also we can write
Branch, sem, rn → stdid
Branch, sem, rn → sname
Branch, sem, rn → marks
Branch, sem, rn → hod
Branch, sem, rn → grade
Or together
Branch, sem, rn → stdid, sname, marks, hod, grade
Since these 2 attribute sets are keys for student, we have written these 2 functional dependencies.
Stdid → branch, sem, rn, sname, marks, hod, grade
Branch, sem, rn → stdid, sname, marks, hod, grade
If we look at that table again, we can find other functional dependencies.
For example
Stdid branch hod
100 cs abc
101 cs abc
103 cs abc
104 cs abc
101 ec bcd
103 ec bcd
105 cs abc
104 ec bcd
If we think about it, we can find that for each branch there is only one hod, that is, for each value of
branch there is a unique hod.
We can write this as
Branch → hod
Next take marks and grade. Suppose the grade is A for a mark of 80 and above. Then whenever
the mark is 80, the grade will be A.
So for each value of marks there is a unique grade.
Stdid marks grade
100 50 D
101 80 A
102 85 A
103 50 D
104 60 C
105 75 B
106 60 C
So we can write
Marks → grade
So we can say that the following functional dependencies hold in the student relation.
Stdid → branch, sem, rn, sname, marks, hod, grade
Branch, sem, rn → stdid, sname, marks, hod, grade
Branch → hod
Marks → grade
In the student schema we represent these functional dependencies as
Student
Stdid Sname Marks Rn Branch Sem Hod Grade
A functional dependency (FD), denoted by X → Y, between 2 sets of attributes X and Y that
are subsets of R specifies a constraint on the possible tuples that can form a relation state r of R. The
constraint is that for any two tuples t1 and t2 in r that have t1[X] = t2[X], we must also have
t1[Y] = t2[Y].
This means that the values of the Y component of a tuple in r depend on, or are determined by, the
values of the X component. In other words, the values of the X component of a tuple uniquely or
functionally determine the values of the Y component.
We say that there is a functional dependency from X to Y, or that Y is functionally
dependent on X. The abbreviation for functional dependency is FD. The set of attributes X is called the
left hand side of the FD, and Y is called the right hand side of the FD.
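The definition can be turned into a small check on a relation state. The sketch below is in Python, which is a choice made for this illustration only; the function name fd_holds is hypothetical, and the rows are the stdid, marks, grade values listed earlier. A state can show that an FD is violated, but a state that satisfies an FD does not prove that the FD holds on the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the FD lhs -> rhs is satisfied by the given rows.

    rows is a list of dicts (one per tuple); lhs and rhs are lists of
    attribute names. Two tuples that agree on lhs must agree on rhs.
    """
    seen = {}
    for t in rows:
        x = tuple(t[a] for a in lhs)
        y = tuple(t[a] for a in rhs)
        # setdefault stores y the first time x is met, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


student = [
    {"stdid": 100, "marks": 50, "grade": "D"},
    {"stdid": 101, "marks": 80, "grade": "A"},
    {"stdid": 102, "marks": 85, "grade": "A"},
    {"stdid": 103, "marks": 50, "grade": "D"},
    {"stdid": 104, "marks": 60, "grade": "C"},
    {"stdid": 105, "marks": 75, "grade": "B"},
    {"stdid": 106, "marks": 60, "grade": "C"},
]

print(fd_holds(student, ["stdid"], ["marks", "grade"]))  # stdid -> marks, grade
print(fd_holds(student, ["marks"], ["grade"]))           # marks -> grade
print(fd_holds(student, ["grade"], ["marks"]))           # grade -> marks
```

In this state grade → marks fails because grade A appears with both 80 and 85, while the other two dependencies are satisfied.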
A functional dependency is a property of the relation schema R, not of a particular relation state r
of R. So an FD cannot be automatically determined from a given relation but it must be explicitly defined
by someone who knows the meaning or semantics of the columns of relation R.
Inference rules for functional dependencies
Normally when designing a table, the database designer specifies a set F of functional
dependencies that are applicable to the table. From this set of functional dependencies we can deduce or
infer additional functional dependencies. There are certain rules for inferring additional FDs. The set of
all dependencies that can be inferred from F, together with F itself, is called the closure of F and is
denoted by F+.
For example suppose that we specify the following set F of obvious FDs on a relation schema
R ( empid, empname, dob, address, deptnumber, deptname, deptmngrid)
The set of FDs is
Empid → empname, dob, address, deptnumber
Deptnumber → deptname, deptmngrid
We can deduce additional FDs such as
Empid → deptname, deptmngrid
Empid → empid
Deptnumber → deptname
To infer dependencies from the given set of FDs in a systematic way, we can use a set of
inference rules.
Armstrong's inference rules
The following are well known inference rules for FDs. Rules 1 to 3 are known as Armstrong's axioms,
and rules 4 to 6 can be derived from them.
1. Reflexive rule
If Y ⊆ X, then X → Y
2. Augmentation rule
From X → Y we can infer XZ → YZ
3. Transitive rule
From X → Y and Y → Z we can infer X → Z
4. Decomposition rule
From X → YZ we can infer X → Y and X → Z
5. Union rule
From X → Y and X → Z we can infer X → YZ
6. Pseudotransitive rule
From X → Y and WY → Z we can infer WX → Z
Trivial and non trivial functional dependencies
In a functional dependency X → Y, if Y ⊆ X then it is a trivial FD.
Otherwise it is non trivial.
For example
A  B  C
Q  L  M
E  J  N
R  B  Y
Q  L  J
T  B  D
U  G  P
R  B  Y
The FDs are
A → B
C → A
These two are non trivial functional dependencies.
We can also write
A, B → B
A, B, C → A, C
These are trivial functional dependencies because the RHS is a subset of the LHS.
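The test for a trivial FD is a plain subset check. A minimal sketch in Python follows; the language and the function name is_trivial are assumptions of this example.

```python
def is_trivial(lhs, rhs):
    """An FD lhs -> rhs is trivial when rhs is a subset of lhs."""
    return set(rhs) <= set(lhs)


checks = {
    "A -> B": is_trivial({"A"}, {"B"}),
    "C -> A": is_trivial({"C"}, {"A"}),
    "A, B -> B": is_trivial({"A", "B"}, {"B"}),
    "A, B, C -> A, C": is_trivial({"A", "B", "C"}, {"A", "C"}),
}
for fd, trivial in checks.items():
    print(fd, "is", "trivial" if trivial else "non trivial")
```

The first two FDs are reported as non trivial and the last two as trivial, matching the classification above.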
Normally database designers first specify a set of functional dependencies for a table. Then Armstrong's
inference rules can be used to deduce additional FDs. For this purpose we can use the following
algorithm.
We are given a set F of FDs for a relation R. We find the additional FDs by finding the closure of
X, where X is the left hand side of each FD.
Determining X+, the closure of X under F
X+ = X;
Do
    OldX+ = X+;
    For each functional dependency Y → Z in F do
        If Y ⊆ X+ then X+ = X+ ∪ Z
Until (X+ = OldX+)
For example
For relation R ( eid, ename, projno, projname, projlocation, hours)
We are given F
Eid → ename
Projno → projname, projlocation
Eid, projno → hours
Using the above algorithm we can find the closure sets for each LHS of the FDs.
{eid}+ = { eid, ename }
{projno}+ = { projno, projname, projlocation }
{eid, projno}+ = { eid, projno, ename, projname, projlocation, hours }
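The closure algorithm can be sketched in a few lines of Python. The language, the function name closure and the representation of each FD as a pair of sets are assumptions made for this illustration; the FDs are the ones given for the relation R above.

```python
def closure(attrs, fds):
    """Return X+, the closure of the attribute set attrs under the FDs in fds.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:                      # repeat until X+ stops growing
        changed = False
        for lhs, rhs in fds:
            # if Y is contained in X+ and Z adds something new, add Z to X+
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


F = [
    ({"eid"}, {"ename"}),
    ({"projno"}, {"projname", "projlocation"}),
    ({"eid", "projno"}, {"hours"}),
]

print(sorted(closure({"eid"}, F)))
print(sorted(closure({"projno"}, F)))
print(sorted(closure({"eid", "projno"}, F)))
```

Since the closure of {eid, projno} contains every attribute of R, the pair (eid, projno) is a superkey of R under these FDs.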
Normal forms (ref: DBMS by Navathe)
The normalization process was first proposed by Codd. It takes a relation schema
or a set of tables through a series of tests to check whether it satisfies a certain normal
form. Codd proposed 3 normal forms.
First normal form
Second normal form and
Third normal form
Then a modification to the third normal form was proposed. That is called
Boyce Codd normal form
All these normal forms are based on functional dependencies.
Later, fourth normal form and fifth normal form were proposed. They are based on multivalued
and join dependencies.
We have already studied some drawbacks or pitfalls in relational database design. The main
drawbacks are repetition of information and inability to represent certain information. The purpose of
normalization is to analyze the given relation schemas, based on their functional dependencies and
candidate keys, and to remove these drawbacks from the database. If a relation schema does not
satisfy the normal form tests, it is decomposed into new relations that do satisfy the
normal form tests.
We know the concept of candidate keys and primary key of a table.
Prime attribute
An attribute of relation schema R is called a prime attribute if it is a member of some candidate key
of R. An attribute is called non-prime if it is not a prime attribute, that is, if it is not a member of any
candidate key.
For example
Student ( stdid, branch, sem, rn, sname, marks)
Branch is a prime attribute because it is a member of the candidate key ( branch, sem, rn).
Like wise sem is a prime attribute.
Stdid is a prime attribute because it is itself a candidate key.
Marks is not a prime attribute.
Also sname is not a prime attribute.
First normal form (1NF)
It is defined to disallow multivalued attributes, composite attributes and their combinations.
It states that domain of an attribute must include only atomic (indivisible) values and that value of
any attribute in a tuple must be a single value from the domain of the attribute.
So first normal form disallows having a set of values, a tuple of values or a combination of both as an
attribute value for a single tuple.
We can explain this using an example.
Consider the student relation.
Student ( stdid, sname, saddress, phoneno)
Student
Stdid  Sname  Saddress             Phoneno
100    Abc    No. 20, KTM, Kerala  567890
102    Bcd    No. 35, EKM, Kerala  564476, 234789
105    Def    No. 41, KTM, Kerala  123245, 367840, 300898
In this relation we can see that there are 3 tuples. There is a composite attribute saddress having
three fields: house no, city and state.
There is also a multivalued attribute phoneno. We can see that student 102 has 2 phones and student 105
has 3 phones.
According to 1NF, such multivalued and composite attributes are not allowed.
We have to find a way to normalize this schema to first normal form.
First we solve the problem caused by the multivalued attribute, here phoneno.
We remove the attribute phoneno and place it in a separate table or relation along with the primary
key of student, that is stdid.
Then we get
Student1 ( stdid, sname, saddress)
Std_phone ( stdid, phoneno)
Here the
primary key of student1 is stdid and
Primary key of std_phone is (stdid, phoneno )
Student1
Stdid  Sname  Saddress
100    Abc    No. 20, KTM, Kerala
102    Bcd    No. 35, EKM, Kerala
105    Def    No. 41, KTM, Kerala
Std_phone
Stdid  Phoneno
100    567890
102    564476
102    234789
105    123245
105    367840
105    300898
Next we have to deal with the composite attribute. We can expand saddress to 3 attributes,
add_house, add_city and add_state. Then the relations will be
Student1A
Stdid  Sname  Add_house  Add_city  Add_state
100    Abc    No. 20     KTM       Kerala
102    Bcd    No. 35     EKM       Kerala
105    Def    No. 41     KTM       Kerala
Std_phone
Stdid  Phoneno
100    567890
102    564476
102    234789
105    123245
105    367840
105    300898
We can see that student1A and std_phone are in first normal form (1NF).
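The two steps above (moving the multivalued attribute to its own relation and expanding the composite attribute) can be sketched as follows. The Python representation, with one dict per student, a list for phoneno and a comma separated saddress string, and the function name to_1nf are assumptions of this example.

```python
# Unnormalized student rows: saddress is composite, phoneno is multivalued.
student = [
    {"stdid": 100, "sname": "Abc", "saddress": "No. 20, KTM, Kerala",
     "phoneno": [567890]},
    {"stdid": 102, "sname": "Bcd", "saddress": "No. 35, EKM, Kerala",
     "phoneno": [564476, 234789]},
    {"stdid": 105, "sname": "Def", "saddress": "No. 41, KTM, Kerala",
     "phoneno": [123245, 367840, 300898]},
]


def to_1nf(rows):
    """Split the rows into student1A (atomic address parts) and std_phone."""
    student1a, std_phone = [], []
    for r in rows:
        # composite attribute: one column per component
        house, city, state = [part.strip() for part in r["saddress"].split(",")]
        student1a.append((r["stdid"], r["sname"], house, city, state))
        # multivalued attribute: one (stdid, phoneno) tuple per value
        for phone in r["phoneno"]:
            std_phone.append((r["stdid"], phone))
    return student1a, std_phone


student1a, std_phone = to_1nf(student)
for row in student1a:
    print(row)
for row in std_phone:
    print(row)
```

Every value in the two resulting relations is atomic, and (stdid, phoneno) identifies each tuple of std_phone.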
Second normal form (2NF)
Before seeing second normal form, we have to learn some definitions.
Partial and full functional dependencies
A functional dependency X → Y is a full functional dependency if removal of any attribute A from X
(that is, A ∈ X) means that the dependency does not hold any more.
A functional dependency X → Y is a partial functional dependency if some attribute A ∈ X can be
removed from X and the dependency still holds.
For example
Student (stdid, branch, sem, rn, name, marks, hod)
We know that the following FDs are correct for this table.
FD1 -- stdid → branch, sem, rn, name, marks, hod
FD2 -- branch, sem, rn → stdid, name, marks
Also
FD3 -- branch, sem, rn → hod
In FD2, if we remove the attribute sem from the LHS or X part, we can see that
(branch, rn) does not functionally determine stdid, name, marks. This is also the case if we remove branch
or rn. So FD2 is a full functional dependency.
In FD3, if we remove the attributes sem and rn we can see that the FD still holds.
That is, branch → hod is also a functional dependency. So FD3 is a partial functional dependency.
A relation schema or a table, R is in second normal form, if every non prime attribute A in R is fully
functionally dependent on the primary key of R.
For example
Student1 ( stdid, branch, sem, rn, name, hod, marks, grade )
(FD1 to FD4 are marked on this schema in the original figure.)
We can see that the student1 relation is not in second normal form, because of FD3, that is
Branch → hod
It violates 2NF because the non prime attribute hod is partially dependent on the candidate key (branch,
sem, rn). The dependency
Branch, sem, rn → hod
is a partial functional dependency, because it still holds if we remove the attributes sem and rn.
The other non prime attributes are name, marks and grade. They are fully functionally dependent on the keys.
Stdid → name
Branch, sem, rn → name
Stdid → marks
Branch, sem, rn → marks
Stdid → grade
Branch, sem, rn → grade
Marks → grade does not violate 2NF, because marks is not a prime attribute, so this is not a partial
dependency on a key.
As a next step we have to normalize student1 to 2NF.
We decompose it by removing the attribute hod, which forms the partial dependency, from student1
and placing it in another relation together with branch.
That is, we decompose student1 into student1A and student1B.
Student1A
Stdid Branch Sem Rn Name Marks Grade
(FD1, FD2 and FD3 are marked on this schema in the original figure.)
Student1B
Branch Hod
So we have decomposed student1 into
student1A (stdid, branch, sem, rn, name, marks, grade) and
student1B (branch, hod)
This is in 2NF.
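A simplified 2NF test can be sketched in Python. It is only a sketch under stated assumptions: the function name partial_dependencies is hypothetical, the candidate keys are supplied by hand, and only the FDs that are listed explicitly are examined (FDs that could be inferred from them are not considered).

```python
def partial_dependencies(fds, candidate_keys):
    """Return the listed FDs that break 2NF.

    An FD breaks 2NF when its left side is a proper subset of some
    candidate key and its right side contains a non-prime attribute.
    """
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        non_prime_rhs = rhs - prime
        if non_prime_rhs and any(lhs < key for key in candidate_keys):
            violations.append((lhs, non_prime_rhs))
    return violations


keys = [{"stdid"}, {"branch", "sem", "rn"}]

student1_fds = [
    ({"stdid"}, {"branch", "sem", "rn", "name", "marks", "hod", "grade"}),
    ({"branch", "sem", "rn"}, {"stdid", "name", "marks", "hod", "grade"}),
    ({"branch"}, {"hod"}),
    ({"marks"}, {"grade"}),
]

# After the decomposition, hod and branch -> hod move to student1B.
student1a_fds = [
    ({"stdid"}, {"branch", "sem", "rn", "name", "marks", "grade"}),
    ({"branch", "sem", "rn"}, {"stdid", "name", "marks", "grade"}),
    ({"marks"}, {"grade"}),
]

print(partial_dependencies(student1_fds, keys))        # branch -> hod is reported
print(partial_dependencies(student1a_fds, keys))       # nothing is reported
print(partial_dependencies([({"branch"}, {"hod"})], [{"branch"}]))  # student1B
```

For student1 the only violation reported is branch → hod, and neither student1A nor student1B reports any, which agrees with the decomposition above.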
Third normal form (3NF)
3NF is based on the concept of transitive dependency. Transitive dependencies are not allowed in
3NF.
Transitive dependency means that if X → Y and Y → Z hold in a relation or table R, then X → Z is also
a functional dependency that holds on R, and Z is transitively dependent on X through Y. Here X, Y, Z
are attributes of the table, and Y should not be a candidate key or a subset of any key (prime attribute)
of the table R.
we can see this by an example.
Student3
Stdid Branch Sem Rn Name Marks Grade
We have shown 3 FDs here. That is
Fd1 Stdid → grade
Fd2 Stdid → marks
Fd3 Marks → grade
We can see that marks is not a prime attribute of student3.
Stdid → grade is a transitive dependency because of Fd2 and Fd3.
This is not allowed in 3NF.
A relation R is said to be in 3NF if R is in 2NF and no non prime attribute of R is transitively
dependent on the key of R.
The above relation schema student3 is in 2NF, since no partial dependencies on a key exist. But
it is not in 3NF because of the transitive dependency stdid → grade via marks.
We can normalize student3 by decomposing it in to two 3NF relation schemas,
Student3A and student3B as follows.
Student3A (stdid, branch, sem, rn, name, marks)
Student3B (marks, grade)
Student3
Stdid Branch Sem Rn Name Marks Grade
Student3A
Stdid Branch Sem Rn Name Marks
Student3B
Marks Grade
We can see that this is in 3NF.
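The transitive dependency above can be checked mechanically with an attribute-closure computation. The following is a minimal Python sketch, not part of the original notes; the function name `closure` and the way FDs are written as pairs of sets are illustrative assumptions.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is already covered.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# FDs of student3 named in the text (Fd1 follows from Fd2 and Fd3).
student3_fds = [
    ({"stdid"}, {"marks"}),   # Fd2
    ({"marks"}, {"grade"}),   # Fd3
]

# grade is reached from stdid only through marks: a transitive dependency.
print(closure({"stdid"}, student3_fds))
print(closure({"marks"}, student3_fds))
```

Because the closure of marks does not cover the whole table, marks is not a key, which is why Stdid → Grade counts as transitive here.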
Example 2:
Emp_dept
Ename Ssn Bdate Address Dnumber Dname Dmgrssn
We can see that the above schema is in 2NF but not in 3NF, because of transitive dependencies on the key Ssn:
Ssn → Dmgrssn holds through Dnumber. Also
Ssn → Dname holds through Dnumber.
We can decompose this in to
ED1
Ename Ssn Bdate Address Dnumber
ED2
Dnumber Dname Dmgrssn
See that this table is in 3NF.
General definitions of second and third normal forms
General definition of second normal form
A relation schema R is in 2NF if no non-prime attribute A in R is partially dependent on
any key of R. We can see an example.
LOTS
Propertyid Countyname Lot Area Price Taxrate
Fd1
Fd2
Fd3
Fd4
We can see that the LOTS schema violates the general definition of 2NF because Taxrate is
partially dependent on the candidate key (Countyname, Lot) due to FD3.
To normalize LOTS into 2NF, we decompose it into 2 relations, Lots1 and Lots2. We construct
Lots1 by removing the attribute Taxrate that violates 2NF and placing it with Countyname (the LHS of
FD3 that causes the partial dependency) into another relation Lots2. Both Lots1 and Lots2 are in 2NF. We can see that FD4 does not violate 2NF.
LOTS1
Propertyid Countyname Lot Area Price
Fd1
Fd2
Fd4
LOTS2
Countyname Taxrate
Fd3
The relations LOTS1 and LOTS2 are in second normal form.
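The partial-dependency test behind the general 2NF definition can be sketched in Python as follows. This is only an illustration: FD3 (Countyname → Taxrate) and FD4 (Area → Price) are the ones this section states, while FD1 and FD2 are assumed here to say that each candidate key determines all other attributes, because the arrows of the figure are not reproduced in these notes. The helper names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(keys, fds):
    """Pairs (part_of_key, attribute) where a non-prime attribute
    depends on a proper subset of some candidate key."""
    prime = set().union(*keys)
    found = set()
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                for attr in closure(part, fds) - prime:
                    found.add((part, attr))
    return found

keys = [{"propertyid"}, {"countyname", "lot"}]
lots_fds = [
    ({"propertyid"}, {"countyname", "lot", "area", "price", "taxrate"}),  # FD1 (assumed)
    ({"countyname", "lot"}, {"propertyid", "area", "price", "taxrate"}),  # FD2 (assumed)
    ({"countyname"}, {"taxrate"}),                                         # FD3
    ({"area"}, {"price"}),                                                 # FD4
]
# LOTS1 is LOTS without Taxrate, so FD3 no longer applies.
lots1_fds = [
    ({"propertyid"}, {"countyname", "lot", "area", "price"}),
    ({"countyname", "lot"}, {"propertyid", "area", "price"}),
    ({"area"}, {"price"}),
]

print(partial_dependencies(keys, lots_fds))   # Taxrate depends on Countyname alone
print(partial_dependencies(keys, lots1_fds))  # nothing left after the decomposition
```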
General definition of third normal form (3NF)
A relation schema R is in 3NF if, whenever a non-trivial functional dependency
X → A holds in R, either
a) X is a super key of R
OR
b) A is a prime attribute of R.
If at least one of these conditions holds for every such dependency, the relation schema is in 3NF.
Using this we can directly check whether a relation schema is in 3NF.
Consider the LOTS relation.
LOTS
Propertyid Countyname Lot Area Price Taxrate
Fd1
Fd2
Fd3
Fd4
According to this definition LOTS is not in 3NF, because FD3 and FD4 violate the conditions.
We can see that FD1 and FD2 satisfy the 3NF conditions.
But in FD3
Countyname → Taxrate
Countyname by itself is not a super key and Taxrate is not a prime attribute.
Also in FD4
Area → Price
Area is not a super key and Price is not a prime attribute.
So LOTS is not in 3NF.
To normalize LOTS we decompose it into LOTS2, LOTS1A and LOTS1B.
We construct LOTS1A by removing the attribute Price that violates 3NF (it goes to LOTS1B together with Area), and LOTS2 by removing the
attribute Taxrate that also violates 3NF (together with Countyname).
LOTS2
Countyname Taxrate
Fd3
LOTS1A
Propertyid Countyname Lot Area
Fd1
Fd2
LOTS1B
Area Price
Fd4
We can see that all the above relations LOTS2, LOTS1A, LOTS1B are in 3NF.
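The general 3NF test (X is a super key, or A is prime) can be sketched in Python as below. It is a simplified illustration that only inspects the listed FDs rather than every FD implied by them. FD1 and FD2 are assumptions (each candidate key determines all other attributes), and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def violations_3nf(attrs, fds, keys):
    """List (lhs, attribute) pairs that break both 3NF conditions."""
    prime = set().union(*keys)
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= attrs:
            continue                      # condition (a): lhs is a super key
        for a in sorted(rhs - lhs):       # ignore the trivial part of the FD
            if a not in prime:            # condition (b) fails as well
                bad.append((tuple(sorted(lhs)), a))
    return bad

lots = {"propertyid", "countyname", "lot", "area", "price", "taxrate"}
keys = [{"propertyid"}, {"countyname", "lot"}]
lots_fds = [
    ({"propertyid"}, lots - {"propertyid"}),             # FD1 (assumed)
    ({"countyname", "lot"}, lots - {"countyname", "lot"}),  # FD2 (assumed)
    ({"countyname"}, {"taxrate"}),                       # FD3
    ({"area"}, {"price"}),                               # FD4
]
print(violations_3nf(lots, lots_fds, keys))

# The decomposed relations have no violating FD left.
lots1a = {"propertyid", "countyname", "lot", "area"}
lots1a_fds = [
    ({"propertyid"}, {"countyname", "lot", "area"}),
    ({"countyname", "lot"}, {"propertyid", "area"}),
]
print(violations_3nf(lots1a, lots1a_fds, keys))
print(violations_3nf({"area", "price"}, [({"area"}, {"price"})], [{"area"}]))
```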
Equivalently, a relation schema R is in 3NF if every non-prime attribute of R meets the following conditions:
It is fully functionally dependent on every key of R.
It is non-transitively dependent on every key of R.
Boyce Codd Normal form (BCNF)
It was first proposed as a simpler form of 3NF, but it was found to be stricter than 3NF: every relation in BCNF is also in 3NF, but a relation in 3NF is not necessarily in BCNF.
A relation schema R is in BCNF if, whenever a non-trivial functional dependency X → A holds in
R, X is a superkey of R. The only difference between BCNF and 3NF is that condition (b) of 3NF
(which allows A to be prime) is absent from BCNF.
Suppose we have a table Lots1A
Lots1A
Propertyid Countyname Lot Area
Fd1
Fd2
Fd5
Here we can see that the relation Lots1A is in 3NF, but it is not in BCNF.
FD5 (Area → Countyname) violates BCNF because Area is not a superkey. Fd1 and Fd2 satisfy BCNF because their LHS are
super keys. So we remove the attribute Countyname and place it in another relation together with Area.
Lots1AX
Propertyid Area Lot
Lots1AY
Area Countyname
These relations are in BCNF.
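The BCNF condition drops the "A is prime" escape clause, so the check is shorter than the 3NF one. A minimal Python sketch follows; FD5 (Area → Countyname) is read off the decomposition above, while FD1 and FD2 are assumed to say that each candidate key determines the remaining attributes. The function names are illustrative only.

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attrs, fds):
    """Non-trivial listed FDs whose left side is not a super key."""
    return [(sorted(lhs), sorted(rhs)) for lhs, rhs in fds
            if not rhs <= lhs and not closure(lhs, fds) >= attrs]

lots1a = {"propertyid", "countyname", "lot", "area"}
lots1a_fds = [
    ({"propertyid"}, {"countyname", "lot", "area"}),   # Fd1 (assumed)
    ({"countyname", "lot"}, {"propertyid", "area"}),   # Fd2 (assumed)
    ({"area"}, {"countyname"}),                        # Fd5
]
print(bcnf_violations(lots1a, lots1a_fds))

# After the decomposition every remaining FD has a super key on its left side.
print(bcnf_violations({"propertyid", "area", "lot"},
                      [({"propertyid"}, {"area", "lot"})]))
print(bcnf_violations({"area", "countyname"},
                      [({"area"}, {"countyname"})]))
```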
Every relation in BCNF is also in 3NF, but a relation in 3NF is not necessarily in BCNF.
For example
R
A B C
Fd1
Fd2
Here the relation R is in 3NF. But we can see that it is not in BCNF because C is not a super key of R.
Exercise:
1. Consider the relation R = { A, B, C, D, E, F, G, H, I, J } and the set of functional dependencies
A, B → C
A → D, E
B → F
F → G, H
D → I, J
What is the key of R?
Decompose R into 2NF, then 3NF relations.
Answer
A B C D E F G H I J
Fd1
Fd2
Fd3
Fd4
Fd5
From the figure, the key of R is (A, B).
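The key can also be found by brute force: try attribute sets in increasing size and keep those whose closure is all of R. The following Python sketch is an illustration under that approach (function names are hypothetical); it is fine for ten attributes but is not meant as an efficient algorithm.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    attrs = sorted(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(attrs):
                keys.append(combo)
    return keys

r = set("ABCDEFGHIJ")
exercise1_fds = [
    ({"A", "B"}, {"C"}),
    ({"A"}, {"D", "E"}),
    ({"B"}, {"F"}),
    ({"F"}, {"G", "H"}),
    ({"D"}, {"I", "J"}),
]
print(candidate_keys(r, exercise1_fds))
```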
This is not in 2NF because fd2 and fd3 are partial functional dependencies on the key. So we remove attributes D,
E, F. But we can see
A → D
D → I
D → J
So we have to remove I, J along with D.
B → F
F → G
F → H
So we have to remove G, H along with F.
So we get the following relations in 2NF
R1
A B C
Fd1
R2
A D E I J
Fd2
Fd5
R3
B F G H
Fd3
Fd4
The above relations R1, R2, R3 are in 2NF because they are in 1NF and there are no partial functional
dependencies in them.
DECOMPOSITION TO 3NF
We can take each of R1, R2 and R3 and analyse them
R1
A B C
Fd1
R1 is in 3NF because in Fd1 (A, B → C), (A, B) is a super key.
R2
A D E I J
Fd2
Fd5
R2 is not in 3NF because
Fd2 (A → D, E) satisfies 3NF because A is a super key of R2, but
Fd5 (D → I, J) violates 3NF because D is not a super key and I, J are not prime attributes.
So we remove I and J from R2.
We decompose R2 as
R2A
A D E
Fd2
R2B
D I J
fd5
R2A and R2B are in 3NF.
Consider R3
R3
B F G H
Fd3
Fd4
We can see fd3 (B → F) satisfies 3NF because B is a super key of R3.
Fd4 (F → G, H) violates 3NF because F is not a super key and G, H are not prime attributes.
We decompose R3 into 2 relations, R3A and R3B.
R3A
B F
Fd3
R3B
F G H
Fd4
R3A and R3B are in 3NF.
So we get the final set of relations as
R1
A B C
Fd1
R2A
A D E
Fd2
R2B
D I J
fd5
R3A
B F
Fd3
R3B
F G H
Fd4
Exercise 2:
Given R = { A, B, C, D, E, F, G, H, I, J }
Functional dependencies are
A, B → C
B, D → E, F
A, D → G, H
A → I
H → J
Find the key and normalise to 2NF and then to 3NF.
R
A B C D E F G H I J
Fd1
Fd2
Fd3
Fd4
Fd5
The key of R is (A, B, D)
Normalizing to 2NF
Fd1, fd2, fd3 and fd4 are partial dependencies on the key, so R is not in 2NF. J depends on H (fd5), so it moves together with H. We decompose R into
R1
A B C
fd1
R2
B D E F
Fd2
R3
A D G H J
Fd3
Fd5
R4
A I
Fd4
R1, R2, R3 and R4 are in 2NF.
Normalization to 3NF
Note that R3 is not in 3NF because of fd5 (H → J): H is not a super key of R3 and J is not a prime attribute. So we decompose it into R3A and R3B.
R3A
A D G H
Fd3
R3B
H J
Fd5
So we get the relations in 3NF as
R1
A B C
fd1
R2
B D E F
Fd2
R3A
A D G H
Fd3
R3B
H J
Fd5
R4
A I
Fd4
Higher level normal forms
We have already studied 4 different normal forms, that is 1NF, 2NF, 3NF and BCNF. They are based on the concept of functional dependencies.
We are now going to see other kinds of dependencies. They are multivalued dependencies and join
dependencies.
Fourth normal form is based on the concept of multivalued dependencies. Fifth normal form is based on join dependencies.
Multivalued dependencies and 4NF
Suppose we are given the table named UCFX
UCFX
Course  Faculty   Textbook
Maths   JYT, RVT  Grewal, Kreyzig
DBMS    SZ        Navathe, Silbz, Desai
We can see that this table is not normalized (it is not even in 1NF).
The meaning of the above table is that the specified course can be taught by any of the specified teachers, and
any of the specified textbooks can be used for learning this course.
So for a given course there can be any number of corresponding teachers and any number of
corresponding textbooks. Also we can see that teachers and textbooks are independent of each other: no matter who actually teaches a course, the same texts are used. Let us convert this to 1NF.
CFX
Course  Faculty  Textbook
Maths   JYT      Grewal
Maths   JYT      Kreyzig
Maths   RVT      Grewal
Maths   RVT      Kreyzig
DBMS    SZ       Navathe
DBMS    SZ       Silbz
DBMS    SZ       Desai
The key of the table is (Course, Faculty, Textbook).
There is a lot of repetition (redundancy) in this table, which leads to update anomalies.
Suppose a new faculty comes to teach Maths: it is necessary to create 2 new tuples, one for each of
the 2 textbooks.
Intuitively it should not be necessary to include all faculty/textbook combinations for a given course. That
is, 2 tuples would be sufficient to show that the Maths course has 2 faculties and 2 textbooks. The problem is deciding which 2 of the 4 tuples to keep. We cannot make that decision.
The difficulty here is caused by the fact that faculties and textbooks are independent of one
another. We can see that this will be improved if we decompose CFX into 2 tables.
CF
Course  Faculty
Maths   RVT
Maths   JYT
DBMS    SZ
CX
Course  Textbook
Maths   Grewal
Maths   Kreyzig
DBMS    Navathe
DBMS    Silbz
DBMS    Desai
The above relations are correct. But we can see that the decomposition cannot be made on the basis of functional dependencies, because there are no non-trivial functional dependencies in the relation. So we introduce multivalued dependencies (MVDs). MVDs are a generalization of FDs, in the sense that every FD is an MVD, but not every MVD is an FD.
There are 2 MVDs in the relation CFX.
CFX
Course  Faculty  Textbook
Maths   JYT      Grewal
Maths   JYT      Kreyzig
Maths   RVT      Grewal
Maths   RVT      Kreyzig
DBMS    SZ       Navathe
DBMS    SZ       Silbz
DBMS    SZ       Desai
Course →→ Faculty
Course →→ Textbook
(Double arrows are used here. Course →→ Faculty is read as "course multidetermines faculty" or "faculty is multidependent on course".)
We know that a course does not have a single corresponding faculty, i.e. the
functional dependency Course → Faculty does not hold. But each course has a well defined set of
corresponding faculties. Well defined here means that for a given course (Maths) and a given textbook
(Grewal), the set of faculties (RVT, JYT) matching the pair (Maths, Grewal) in CFX depends on the value of Maths alone. It makes no difference which particular value of textbook we choose.
The second MVD can be interpreted in the same way.
Definition of multi valued dependency
Let R be a table, and let A, B, C be arbitrary subsets of the set of attributes of R.
Then we say that B is multidependent on A, written A →→ B,
if and only if the set of B values matching a given (A value, C value) pair in R depends only on the A
value and is independent of the C value.
MVDs always go together in pairs. That is, given the table R (A, B, C), the MVD
A →→ B holds if and only if A →→ C also holds. So we can write
A →→ B | C
That is
Course →→ Faculty | Textbook
We said that every FD is an MVD, but not every MVD is an FD. In our CFX table, the
problem is that if we want to insert one more faculty for Maths, we have to insert 2 tuples (because of the 2
textbooks). These 2 tuples are necessary to maintain our MVDs.
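The definition can be tested on a concrete table instance: group the rows by their A value and check that each group contains every combination of its B values with its C values. The following Python sketch is an illustration of that check on the CFX data; the function name `mvd_holds` and the column names are assumptions made for this example.

```python
def mvd_holds(rows, attrs, x, y):
    """Check X ->> Y on a table instance (x and y are disjoint column lists)."""
    ix = [attrs.index(a) for a in x]
    iy = [attrs.index(a) for a in y]
    iz = [i for i in range(len(attrs)) if i not in ix and i not in iy]
    groups = {}
    for row in set(rows):
        key = tuple(row[i] for i in ix)
        ys, zs, members = groups.setdefault(key, (set(), set(), set()))
        ys.add(tuple(row[i] for i in iy))
        zs.add(tuple(row[i] for i in iz))
        members.add(row)
    # Each X-group must be the full cross product of its Y and Z values.
    return all(len(members) == len(ys) * len(zs)
               for ys, zs, members in groups.values())

attrs = ["course", "faculty", "textbook"]
cfx = [
    ("Maths", "JYT", "Grewal"), ("Maths", "JYT", "Kreyzig"),
    ("Maths", "RVT", "Grewal"), ("Maths", "RVT", "Kreyzig"),
    ("DBMS", "SZ", "Navathe"), ("DBMS", "SZ", "Silbz"), ("DBMS", "SZ", "Desai"),
]
print(mvd_holds(cfx, attrs, ["course"], ["faculty"]))
print(mvd_holds(cfx, attrs, ["course"], ["textbook"]))

# Dropping one faculty/textbook combination breaks the MVD.
print(mvd_holds(cfx[:3] + cfx[4:], attrs, ["course"], ["faculty"]))
```

This also shows why adding one faculty for Maths needs 2 new tuples: with only one of them the Maths group would no longer be a full cross product.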
Trivial multivalued dependency
An MVD X →→ Y in R is said to be a trivial multivalued dependency if
a) Y is a subset of X, or
b) X ∪ Y = R.
If either of these conditions holds then it is a trivial MVD.
Otherwise it is a non-trivial multivalued dependency.
For example
CF
Course  Faculty
Maths   RVT
Maths   JYT
DBMS    SZ
Here Course →→ Faculty is a multivalued dependency. It is a trivial MVD because the union of
the attributes of this MVD gives the whole relation CF (Course ∪ Faculty = CF).
Example 2:
CFX
Course  Faculty  Textbook
Maths   JYT      Grewal
Maths   JYT      Kreyzig
Maths   RVT      Grewal
Maths   RVT      Kreyzig
DBMS    SZ       Navathe
DBMS    SZ       Silbz
DBMS    SZ       Desai
Here Course →→ Faculty and Course →→ Textbook are non-trivial MVDs because neither of the two
conditions holds for these MVDs in the CFX table.
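The two triviality conditions translate directly into set operations. A small Python sketch, with a hypothetical function name, applied to the two examples above:

```python
def is_trivial_mvd(x, y, relation_attrs):
    """X ->> Y is trivial if Y is a subset of X or X union Y is the whole relation."""
    x, y, r = set(x), set(y), set(relation_attrs)
    return y <= x or (x | y) == r

cf = {"course", "faculty"}
cfx = {"course", "faculty", "textbook"}

print(is_trivial_mvd({"course"}, {"faculty"}, cf))    # trivial in CF
print(is_trivial_mvd({"course"}, {"faculty"}, cfx))   # non-trivial in CFX
print(is_trivial_mvd({"course"}, {"textbook"}, cfx))  # non-trivial in CFX
```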
Inference rules for multivalued dependencies
We have seen the inference rules for functional dependencies. Like that we have some rules for
multivalued dependencies.
Suppose R is a table and W, X, Y, Z are sets of columns in that table.
Complementation rule for MVDs
If X →→ Y, then X →→ (R − (X ∪ Y))
Augmentation rule for MVDs
If X →→ Y and Z ⊆ W, then WX →→ YZ
Transitive rule for MVDs
If X →→ Y and Y →→ Z, then X →→ (Z − Y)
Replication rule (FD to MVD)
If X → Y, then X →→ Y
Coalescence rule for FDs and MVDs
If X →→ Y and there exists W with the properties that a) W ∩ Y is empty, b) W → Z and c) Z ⊆ Y, then X → Z.
Fourth normal form
This is based on multivalued dependencies.
A relation schema R is in 4NF with respect to a set of dependencies F (that includes FDs and
MVDs) if, for every non-trivial multivalued dependency
X →→ Y, X is a super key of R.
Consider the table CFX.
CFX
Course  Faculty  Textbook
Maths   JYT      Grewal
Maths   JYT      Kreyzig
Maths   RVT      Grewal
Maths   RVT      Kreyzig
DBMS    SZ       Navathe
DBMS    SZ       Silbz
DBMS    SZ       Desai
The relation CFX is not in fourth normal form because the MVDs
Course →→ Textbook and
Course →→ Faculty
do not satisfy the condition of fourth normal form: they are non-trivial MVDs
and Course is not a superkey of the table. They are non-trivial because in the first MVD
Course ∪ Textbook ≠ CFX
and in the second MVD
Course ∪ Faculty ≠ CFX.
We decompose it into CF and CX.
CF
Course  Faculty
Maths   RVT
Maths   JYT
DBMS    SZ
CX
Course  Textbook
Maths   Grewal
Maths   Kreyzig
DBMS    Navathe
DBMS    Silbz
DBMS    Desai
We can see that CF and CX are in 4NF because
Course →→ Faculty is trivial in CF (because Course ∪ Faculty = CF), and
Course →→ Textbook is trivial in CX (because Course ∪ Textbook = CX).
Example 2:
Consider the table emp
EMP
Ename  Pname  Dname
Smith  X      John
Smith  Y      Anna
Smith  X      Anna
Smith  Y      John
Brown  W      Jim
Brown  X      Jim
Brown  Y      Jim
Brown  Z      Jim
Brown  W      Joan
Brown  X      Joan
Brown  Y      Joan
Brown  Z      Joan
Brown  W      Bob
Brown  X      Bob
Brown  Y      Bob
Brown  Z      Bob
In this table Brown has 3 dependents and works on 4 different projects. Smith works
on 2 projects and has 2 dependents. We can see there are 16 tuples in this table. If we decompose the emp
table into two tables, Emp_projects and Emp_dependents, we need to store only 11 tuples in total.
The emp relation is not in 4NF because the MVDs
Ename →→ Pname and Ename →→ Dname are non-trivial and Ename is not a superkey of emp.
We decompose it into 2 tables.
Emp_projects
Ename  Pname
Smith  X
Smith  Y
Brown  W
Brown  X
Brown  Y
Brown  Z
Emp_dependents
Ename  Dname
Smith  Anna
Smith  John
Brown  Jim
Brown  Joan
Brown  Bob
These 2 tables are in 4NF. This is because
in the first table Emp_projects
Ename →→ Pname is a trivial MVD (Ename ∪ Pname = Emp_projects), and
in the second table Emp_dependents
Ename →→ Dname is a trivial MVD (Ename ∪ Dname = Emp_dependents).
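The tuple counts in this example can be reproduced with a few lines of Python. The sketch below starts from the two decomposed tables, joins them on Ename to rebuild emp, and then projects emp back; the helper name `join_on_ename` is an assumption made for this illustration.

```python
emp_projects = {
    ("Smith", "X"), ("Smith", "Y"),
    ("Brown", "W"), ("Brown", "X"), ("Brown", "Y"), ("Brown", "Z"),
}
emp_dependents = {
    ("Smith", "Anna"), ("Smith", "John"),
    ("Brown", "Jim"), ("Brown", "Joan"), ("Brown", "Bob"),
}

def join_on_ename(projects, dependents):
    """Natural join of the two tables on their shared Ename column."""
    return {(e, p, d)
            for (e, p) in projects
            for (e2, d) in dependents
            if e == e2}

emp = join_on_ename(emp_projects, emp_dependents)
print(len(emp))                                  # tuples in the single emp table
print(len(emp_projects) + len(emp_dependents))   # tuples after decomposition

# Projecting emp gives back exactly the two smaller tables.
print({(e, p) for (e, p, d) in emp} == emp_projects)
print({(e, d) for (e, p, d) in emp} == emp_dependents)
```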
Lossless join decomposition
Consider the example database
EMP
Ename  Pname  Dname
Smith  X      John
Smith  Y      Anna
Smith  X      Anna
Smith  Y      John
Suppose we decompose the EMP table into Emp_projects and Emp_dependents.
Emp_projects
Ename  Pname
Smith  X
Smith  Y
Emp_dependents
Ename  Dname
Smith  John
Smith  Anna
If we join these tables again we get the original EMP table. So this decomposition
of the EMP table into Emp_projects and Emp_dependents is a lossless join decomposition, because nothing is
lost by the decomposition.
Consider another table SUPPLY.
SUPPLY
Sname    Partname  Projname
Smith    Bolt      Projx
Smith    Nut       Projy
Adamsky  Bolt      Projy
Walton   Nut       Projz
Adamsky  Nail      Projx
Adamsky  Bolt      Projx
Smith    Bolt      Projy
Suppose we decompose the supply table in to two that is R1 and R2.
We get
R1
Sname    Partname
Smith    Bolt
Smith    Nut
Adamsky  Bolt
Walton   Nut
Adamsky  Nail
R2
Sname    Projname
Smith    Projx
Smith    Projy
Adamsky  Projy
Walton   Projz
Adamsky  Projx
If we again join these two tables R1 and R2 we will get
Sname    Partname  Projname
Smith    Bolt      Projx
Smith    Bolt      Projy
Smith    Nut       Projx
Smith    Nut       Projy
Adamsky  Bolt      Projy
Adamsky  Bolt      Projx
Adamsky  Nail      Projy
Adamsky  Nail      Projx
Walton   Nut       Projz
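We can see that the join of these tables does not give back our original table Supply: it contains extra (spurious) tuples such as (Smith, Nut, Projx) that were never in Supply. So this is a lossy join
decomposition, because after decomposing the Supply table we can no longer tell which combinations were in the original table. We can see this
by joining the decomposed tables.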
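The spurious tuples can be listed by computing the join and subtracting the original table. A minimal Python sketch using sets of tuples (the variable names are illustrative):

```python
supply = {
    ("Smith", "Bolt", "Projx"),
    ("Smith", "Nut", "Projy"),
    ("Adamsky", "Bolt", "Projy"),
    ("Walton", "Nut", "Projz"),
    ("Adamsky", "Nail", "Projx"),
    ("Adamsky", "Bolt", "Projx"),
    ("Smith", "Bolt", "Projy"),
}

# Projections of Supply: R1(Sname, Partname) and R2(Sname, Projname).
r1 = {(s, p) for (s, p, j) in supply}
r2 = {(s, j) for (s, p, j) in supply}

# Natural join of R1 and R2 on the shared column Sname.
joined = {(s, p, j) for (s, p) in r1 for (s2, j) in r2 if s == s2}

spurious = joined - supply
print(len(supply), len(joined))
print(sorted(spurious))
```

The join never loses original tuples (Supply is a subset of the result); the loss of information shows up as the extra tuples.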
Join dependencies and fifth normal form
In some cases there may be no lossless join decomposition of a table R into 2 tables but there may be a
lossless join decomposition into more than 2 tables.
For example in the supply table
SUPPLY
Sname    Partname  Projname
Smith    Bolt      Projx
Smith    Nut       Projy
Adamsky  Bolt      Projy
Walton   Nut       Projz
Adamsky  Nail      Projx
Adamsky  Bolt      Projx
Smith    Bolt      Projy
If we decompose the supply table in to 3 as
R1
Sname    Partname
Smith    Bolt
Smith    Nut
Adamsky  Bolt
Walton   Nut
Adamsky  Nail
R2
Sname    Projname
Smith    Projx
Smith    Projy
Adamsky  Projy
Walton   Projz
Adamsky  Projx
R3
Partname  Projname
Bolt      Projx
Nut       Projy
Bolt      Projy
Nut       Projz
Nail      Projx
Here we can see that if we join all three tables R1, R2 and R3 we get the original table. Joining
just R1 and R2 does not give the Supply table, but joining all 3 tables does.
SUPPLY
Sname    Partname  Projname
Smith    Bolt      Projx
Smith    Nut       Projy
Adamsky  Bolt      Projy
Walton   Nut       Projz
Adamsky  Nail      Projx
Adamsky  Bolt      Projx
Smith    Bolt      Projy
See that there are no functional dependencies in this Supply table. Also we can see that there are no non-trivial MVDs in this table that violate 4NF.
So we move to another type of dependency called join dependency.
If a join dependency is present in a table we decompose it to fifth normal form (5NF).
Here for the Supply table the join dependency is specified by
JD (R1, R2, R3)
This is because by joining the R1, R2 and R3 tables we get the original table Supply.
JD (R1, R2, R3) can also be written as
JD( (sname, partname), (sname, projname), (partname, projname) )
We can see that JD (R1, R2) is not valid for the Supply table because on joining R1 and R2 we do not get the Supply table.
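Whether a join dependency holds on a given table instance can be tested by projecting the table onto each component and joining the projections again. The Python sketch below does this in a simple way: it enumerates every combination of values that occur in the table and keeps those whose projections all appear in the projected tables. It assumes the components together cover every column, and the names `project` and `jd_holds` are hypothetical.

```python
from itertools import product

def project(rows, attrs, cols):
    """Projection of rows onto the columns in cols (duplicates removed)."""
    idx = [attrs.index(c) for c in cols]
    return {tuple(r[i] for i in idx) for r in rows}

def jd_holds(rows, attrs, components):
    """True if joining the projections on `components` gives back rows."""
    rows = set(rows)
    domains = [sorted({r[i] for r in rows}) for i in range(len(attrs))]
    projections = [project(rows, attrs, cols) for cols in components]
    joined = set()
    for t in product(*domains):
        # t is in the join if each of its projections is in the matching table.
        if all(project({t}, attrs, cols) <= proj
               for cols, proj in zip(components, projections)):
            joined.add(t)
    return joined == rows

attrs = ["sname", "partname", "projname"]
supply = [
    ("Smith", "Bolt", "Projx"), ("Smith", "Nut", "Projy"),
    ("Adamsky", "Bolt", "Projy"), ("Walton", "Nut", "Projz"),
    ("Adamsky", "Nail", "Projx"), ("Adamsky", "Bolt", "Projx"),
    ("Smith", "Bolt", "Projy"),
]

r1, r2, r3 = ("sname", "partname"), ("sname", "projname"), ("partname", "projname")
print(jd_holds(supply, attrs, [r1, r2, r3]))  # JD(R1, R2, R3)
print(jd_holds(supply, attrs, [r1, r2]))      # JD(R1, R2)
```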
Example 2.
Consider the table
EMP
Ename  Pname  Dname
Smith  X      John
Smith  Y      Anna
Smith  X      Anna
Smith  Y      John
As we studied earlier, we had decomposed the Emp table into
Emp_proj(ename, projname) and Emp_dept(ename, dname).
We have seen that on joining Emp_proj and Emp_dept we get the original Emp table. So we can
specify a join dependency
JD (Emp_proj, Emp_dept)
Trivial join dependency
For a table R, a join dependency specified as JD(R1, R2, ..., Rn) is trivial if any one of the Ri is
the table R itself.
Fifth normal form
It is also called project join normal form.
A relation schema R is in fifth normal form (5NF) if, for every non-trivial join dependency
JD(R1, R2, ..., Rn) that holds in R, every Ri is a superkey of R.
For example consider the table Supply
SUPPLY
Sname    Partname  Projname
Smith    Bolt      Projx
Smith    Nut       Projy
Adamsky  Bolt      Projy
Walton   Nut       Projz
Adamsky  Nail      Projx
Adamsky  Bolt      Projx
Smith    Bolt      Projy
The key of this table is (sname, partname, projname)
We have seen that it has a join dependency
JD { (sname,partname), (sname,projname), (partname,projname) }
Here the projections are (sname,partname), (sname,projname) and (partname,projname).
We can say that this table Supply is not in 5NF because of this join dependency: none of these projections forms a superkey of Supply.
The superkey of Supply is (sname, projname, partname).
(sname, partname) is not a superkey.
(sname, projname) is not a superkey.
(partname, projname) is not a superkey.
So we have to normalise this table Supply into tables that satisfy 5NF.
We decompose the table Supply by considering the JD. Take each of the projections in the JD and form tables as
R1
Sname    Partname
Smith    Bolt
Smith    Nut
Adamsky  Bolt
Walton   Nut
Adamsky  Nail
R2
Sname    Projname
Smith    Projx
Smith    Projy
Adamsky  Projy
Walton   Projz
Adamsky  Projx
R3
Partname  Projname
Bolt      Projx
Nut       Projy
Bolt      Projy
Nut       Projz
Nail      Projx
See that each of these R1, R2, R3 are in fifth normal form because there are no non trivial join
dependencies in each of these tables.
A join dependency is very difficult to detect in practice. So it is not normally applied in a database.
Example 2: (ref: DBMS by Vipin C. Desai)
Consider the table New_project_assignment
New_Project_assignment
Emp    Proj           Exp
Brent  Workstation    User interface
Brent  Workstation    Artificial intelligence
Mann   Workstation    VLSI technology
Smith  Workstation    Operating systems
King   Sql2           Relational calculus
Ito    Sql2           Relational algebra
Ito    Qbe++          Relational calculus
Smith  Query systems  Database systems
Smith  File systems   Operating systems
There is a join dependency in this table, that is
JD{ (proj, exp), (emp, exp), (emp, proj) }
This is because on joining these 3 projections (proj, exp), (emp, exp) and (emp, proj)
we get the original table New_Project_assignment.
We can see that this JD is not a trivial join dependency. Also this table New_Project_assignment is not in
5NF because none of the projections in the JD is a super key of the table.
The super key of the table is (emp, proj, exp).
(proj, exp) is not a super key.
(emp, exp) is not a super key.
(emp, proj) is not a super key.
So this table is not in 5NF.
We normalize this table to fifth normal form by decomposing it using the projections in the
JD.
That is
S1
Project Expertise
Workstation    User interface
Workstation    Artificial intelligence
Workstation    VLSI technology
Workstation    Operating systems
Sql2           Relational calculus
Sql2           Relational algebra
Qbe++          Relational calculus
Query systems  Database systems
File systems   Operating systems
S2
Employee  Expertise
Brent     User interface
Brent     Artificial intelligence
Mann      VLSI technology
King      Relational calculus
Ito       Relational algebra
Ito       Relational calculus
Smith     Database systems
Smith     Operating systems
S3
Employee  Project
Brent     Workstation
Mann      Workstation
King      Sql2
Ito       Sql2
Ito       Qbe++
Smith     File systems
Smith     Query systems
Smith     Workstation
We can see that each of S1, S2 and S3 is in 5NF because there are no non-trivial join dependencies
in these tables.
Example 3: (ref: Dbms by C. J. Date )
SPJ
S   P   J
S1  P1  J2
S1  P2  J1
S2  P1  J1
S1  P1  J1
If we decompose the table into SP (S, P), PJ (P, J) and JS (J, S) and again perform a join on these 3
tables, we get the original table SPJ (S, P, J).
So there is a join dependency
JD (SP, PJ, JS)
This join dependency of table SPJ does not satisfy 5NF because none of the projections
SP, PJ, JS is a super key of table SPJ.
So decompose SPJ into 3 tables as
SP
S   P
S1  P1
S1  P2
S2  P1
PJ
P   J
P1  J2
P2  J1
P1  J1
JS
J   S
J2  S1
J1  S1
J1  S2
Each of these tables SP, PJ, JS are in 5NF because there are no non trivial join dependencies in these
tables.