Database Designing Concepts

Database
A database is a shared collection of logically interrelated data.

A Database System
Every organization needs to record data relevant to its everyday activities. The organization can choose some of the data it collects and store it in an electronic database.
Eg: A university needs to record data to support its teaching and learning activities. It therefore records what students and lecturers it has, which courses and modules it is running, and which lecturers are teaching which modules.
Once the data is entered into a database, it can be used to get complete and accurate information, such as the list of students who have enrolled for each module. This can help in making decisions on room utilization, etc.

Data vs. Information
The word data refers to facts concerning things such as people, objects or events. Information is data that has been processed and presented in a form suitable for human interpretation.

Disadvantages of manual system
1. A constant stream of inter-company paperwork.
2. Telephone calls or faxes are required to communicate changes and keep the files synchronized.
3. These systems cannot answer "what if" type questions.
4. Managers cannot easily obtain the summarized information required for decision making.
5. Duplicate data can exist throughout the organization, resulting in a lack of consistency.

Disadvantages of file processing system
1. There is limited data sharing.
2. Inconsistency of data.
3. Duplication of data can occur.
4. Excessive program maintenance.
5. Poor enforcement of standards.

Advantages of Database
1. Better data sharing.
2. Better security and integrity.
3. Reduced redundancy.
4. The programs and data are independent.
5. Multiple views of data.

Disadvantages of Database
1. New specialized persons are required.
2. Explicit backups are needed.
3. Problems in data sharing.

Database Approaches
The database approach emphasizes the integration and sharing of data. There are 2 design approaches.

1. Process Driven Design
Requirement Analysis -> Process Design -> Data Design -> Implementation
In file processing systems a process-driven approach was traditionally used. With this approach, organizational processes such as order processing, inventory control and payroll are identified and analyzed. The processes and the data flows between them are described using tools such as DFDs (data flow diagrams). Designers work backwards from the required outputs of the system to determine the required inputs. They can use flow charts to show how inputs are turned into outputs. Finally they design the data files as a by-product of process-driven design.

2. Data Driven Design
Requirement Analysis -> Data Design -> Process Design -> Implementation
Data-driven design focuses on entities: the persons, places, events or concepts about which an organization chooses to record data. This approach identifies the attributes or properties of these entities and the relationships among them. It also identifies the business rules that specify how the entities are managed or used. After creating a suitable model of the data structure and the related business rules, the designer develops the applications required to manage the data.
Designers have now found a balanced approach that combines the two approaches as appropriate.

Database development
The stages of database development are:
1. Enterprise data model
2. Requirement analysis
3. Conceptual model
4. Logical data model
5. Physical data model
6. Implementation

Conceptual data model
This involves building a real-world model expressed in terms of data requirements. The initial set of information and processing requirements is gathered from the users. In the development of large databases, conceptual modeling may include view integration (combining the user views into one schema). The output from this stage is the entity relationship diagram.

Logical data model
This involves building a model of the real world expressed in terms of a data model, eg. hierarchical, network, relational. It involves determining the contents of the database using the conceptual model as input and transforming it into the architectural data model. For a relational database this means mapping the ER model to the relational model and carrying out the process of normalization. The output of this stage is a relational schema, or set of tables.

Physical data model
This involves building a model of the real world expressed in terms of the data structures and access mechanisms available in a chosen database management system. It involves transforming the logical data model into a form suitable for a specific software/hardware specification. This is usually expressed in a data definition language, and the output of this stage is the implementation plan.

Entity
An entity is the thing we model; it is something about which we wish to store information. We name entities with a singular noun. An entity can be:
1. A person: Employer, Student, Customer, etc.
2. A place: a state, country, region, etc.
3. An object: Chair, Table, Building, etc.
4. An event: Sales, Registration, Renewal.
5. A concept: Account, Course, the organization.

ER models are usually constructed during the analysis phase of the DB development process. The output of this stage is a conceptual data model expressed in the form of a detailed ER diagram. The ER diagram consists of entities, relationships and attributes.

Basic symbols of an ER diagram
[Figure: the basic ER symbols are Entity, Attribute, Primary key attribute and Relationship (eg. Registration). An attribute is a property of an entity.]
Eg: the entity Student has the primary key Student No and the attributes Name, Add and Tel No.

Relationship
An association between entities can be specified using a relationship. There are 3 basic kinds of relationship.
1. 1:1 (one to one)
2. 1:M (one to many), also written 1:*
3. M:M (many to many), also written *:*

1:1 examples
1. Country (1) Have (1) Capital
2. Lock, Key
3. Husband, Wife

1:M (1:*) examples
1. Customer (1) Have (*) Invoice
2. Shop (1) Has (*) Sales Man
3. Mother, Child

M:M (*:*) example
Student (*) Register (*) Course

M:M relationships cannot be allowed; a link entity with two 1:M relationships is introduced instead.
[Diagram: Student (1) to (*) Enrolment (*) to (1) Course. Enrolment is the link entity that replaces the M:M relationship between Student and Course.]
[Diagram: Person, Country and Language, related through Communicate with 1 and * cardinalities.]

Exercise: British Council Library
Library items: Book, CD/DVD, Encyclopedia. Other terms: Reception, Membership, Readers, Librarian.
[ER diagram: British Council Library Have Reception; Reception Register Reader; Library Item (Book, CD/DVD, Encyclopedia) Have Copy; further boxes for Librarian and Lending Collection.]

Mapping the ER model to a relational data model
After drawing an ER diagram, it has to be converted into a relational schema. This can be done as follows.

Convert each entity into a relational table in your relational data model. The identifier of the entity becomes the primary key of the table; all other attributes become non-key attributes of the table.
Eg: the entity Student becomes the table
Student (RegNo, Name, Add)
where Student is the table name and RegNo is the primary key.

1:M relationships are represented in the relational schema by taking the key field of the table at the one end and adding it as a foreign key to the table at the many end.
Eg:
Customer (CNo, Name, Billno, goods, ...) with CNo as the primary key
Order (OrderNo, Name, ..., CNo) with CNo as the foreign key

A 1:1 relationship can be represented as one table.

A ternary relationship is represented as one table with a composite primary key which consists of the identifiers of all 3 tables.

A 1:M recursive relationship is represented in one table with a foreign key which is effectively the primary key of the same table.
Eg: Employee Manage Employee
Employee (EmpId, Name, Add, DoB, ...)

M:M relationships should be avoided by breaking them into two 1:M relationships. For that we have to introduce an additional entity called a link entity (LE). The LE cross-references between instances of one entity and instances of the other entity. The many ends of the relationships appear at the LE. The PKs of the 2 entities are incorporated into the link entity.
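
The link entity rule can be sketched in code. The following is a minimal illustration using Python's built-in sqlite3 module; the table names, column names and sample rows are assumptions made for the sketch, not part of the notes.

```python
import sqlite3


def build_enrolment_db():
    """Create Student, Course and the link entity Enrolment in memory.

    All names and sample rows here are hypothetical illustrations.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE student (reg_no TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE course  (course_code TEXT PRIMARY KEY, title TEXT);
        -- Link entity: its primary key combines the PKs of both parents,
        -- and it sits at the many end of two 1:M relationships.
        CREATE TABLE enrolment (
            reg_no      TEXT REFERENCES student(reg_no),
            course_code TEXT REFERENCES course(course_code),
            PRIMARY KEY (reg_no, course_code)
        );
    """)
    conn.executemany("INSERT INTO student VALUES (?, ?)",
                     [("0001", "AA"), ("0002", "BB")])
    conn.executemany("INSERT INTO course VALUES (?, ?)",
                     [("HND", "HND course"), ("BIT", "BIT course")])
    conn.executemany("INSERT INTO enrolment VALUES (?, ?)",
                     [("0001", "HND"), ("0001", "BIT"), ("0002", "HND")])
    return conn


def courses_of(conn, reg_no):
    """Return the course codes one student is enrolled in."""
    rows = conn.execute(
        "SELECT course_code FROM enrolment WHERE reg_no = ? "
        "ORDER BY course_code", (reg_no,))
    return [row[0] for row in rows]
```

A student can appear in many Enrolment rows and so can a course, which gives the M:M behaviour while every stored relationship stays 1:M.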

There can be two types of entities in an ER diagram.
1. Super type entity
2. Sub type entity

Super type entity
This is an entity that stores attributes common to one or more entity sub types.

Sub type entity
This is an entity that inherits some common attributes from an entity super type.

Attributes
These are the characteristics of entities. They can be used to describe entities and relationships in the ER diagram. An entity can have many attributes but should have an identification key. A key is an attribute or group of attributes which can identify an entity uniquely. A key attribute is underlined in the ER diagram.
A single-valued attribute is one that can have a single value. Eg: a person's ID, Subject, Mark, Grade, etc.
Multi-valued attributes can have many values. Eg: an office can have different office numbers; a person can have several qualifications.

Cardinality and optionality
Cardinality has already been discussed.

Relationship participation
Participation in an entity relationship can be either optional or mandatory.

Optional relationship
The participation is optional if one entity occurrence does not require a corresponding occurrence of the related entity.

Mandatory relationship
The participation is mandatory if one entity occurrence requires a corresponding occurrence of the related entity in a particular relationship. This indicates that the minimum cardinality is one for the mandatory entity.

Relationship degree
The relationship degree indicates the number of entities or entity types associated with the relationship. There are 3 degrees of relationship.

Unary (recursive) relationship
This is a relationship within the same entity class or type.

Binary relationship
This is a relationship between 2 different entity types.

Ternary relationship
This is a relationship between 3 entity types; it is a simultaneous relationship among 3 entity types at the same time.

Exercise: find 5 examples each of unary, binary and ternary relationships.

Unary
1. Person Talk
2. Child Play
3. Customer Deals
4. Company Monitor
5. Individual Consult

Ternary
1. Notebook, Pen, Person (Write)
2. Student, Text book, Teacher (Study)
3. Company, Staff, Customer (Need)
4. Bank, Individual, Money (Need)

Binary
1. Person (*) Communicate Language
2. Individual (*) Eat Food
3. Country Have (*) District
4. Company Have (*) Staff
5. Student Study (*) Subject

Weak Entity
A weak entity is one that meets the following conditions.
1. It cannot exist without the entity with which it has a relationship.
2. It has a primary key that is partially or totally derived from the parent entity in the relationship.

Data models
The evolution of data models: the quest for better data management has led to several different models that attempt to resolve the file system's problems. The major data models are:
1. Hierarchical model
2. Network model
3. Relational model

Hierarchical model
This was the first database model, introduced in the mid 1960s. It assumes that all data relationships can be structured as hierarchies. The hierarchical data model uses a simple approach where the relationships between entities are always 1:M. This forms a simple structure in which each record has one parent record above it and many child records below it.

Customer (1) to (*) Invoice (1) to (*) Invoice line

This diagram illustrates a hierarchy showing the relationship between customer and invoice and between invoice and invoice line. A customer can own one or more invoices, and an invoice can have one or more invoice lines. In the hierarchical database model these files are tied together by physical pointers. A pointer is a physical address which identifies where a record can be found on a disk.

Disadvantages
- A large amount of data is stored.
- There is only one path to access a data item.
- It is difficult to model real-world situations.
- It is difficult to perform ad hoc queries.

Network Model
The network model was created to represent complex data relationships more effectively than the hierarchical model. In the network model any record may have many immediate parent records as well as many child records. The network model is able to model a greater number of real-world situations.
The network model is built around the concept of a set. In a network database model a relationship is called a set (there is a 1:M relationship between the owner and the member). Each set is composed of at least 2 record types, where the owner record corresponds to the hierarchical parent record and the member record corresponds to the hierarchical child record. The difference between the two models is that a record might be included in more than one set. In other words, a member can have several owners.

[Diagram: network model of a library. Record types shown: Supplier, Library, Library item, Member, Fine, Staff and Reference, linked by 1:m sets.]

Disadvantages
- Compared to the hierarchical structure, the network structure is made more complicated by the set concept.
- Ad hoc queries are difficult to execute.
- The programmer must know the structure, as well as the set types that make up the network, before processing is done.

Relational Model
The relational model uses the table form of data collection, which is similar to files. Tables within the relational model are known as relations. The records in a table are known as tuples. In the relational model a relation or table consists of a series of rows and columns. A column is known as a field, whereas a row is a tuple.

STUDENT (fields = columns; Reg No is the primary key; each row is a record/tuple)
Reg No   St Name   St Add   Tel       Course code
00106    Krishi    Col-06   2361742   HND

PAYMENT (Reg No is the foreign key)
Reg No   Payment Amt
00106    10,000

Relations can be related to each other by sharing common entity characteristics. The common link between the student and payment tables enables us to connect the student to the payment table, even though their details are stored in separate tables. This is done by the use of the primary key / foreign key concept.
The relational model normally contains 3 relationship types:
1. 1:1
2. 1:M
3. M:M
A relational schema is a visual representation of the relational entities, the attributes of those entities and the relationships between them.

Advantages of relational model
- Easy to understand: users are able to interact with the relational DB very easily, since the table concept is familiar to everyone.
- No physical pointers, compared to the previous models.
- Easy setup, and changes can be made with minimal effort.
- Logical and physical independence.

Relational Database Model Keys
1. Primary key
A primary key is an attribute that can uniquely identify a given row.
Eg: in the Student table the primary key can be taken as Reg No.

2. Composite key
A primary key can be a single attribute or a composition of several attributes which together uniquely identify a row or tuple. A unique identifier made up of several attributes is called a composite primary key.
Eg: (Reg No, Course code) is a composite key of STUDENT.
STUDENT
Reg No   Course code
0001     HND
0001     Multi
0001     C++

3. Foreign key
An attribute (or combination of attributes) in one table whose values match the primary key in another table is called a foreign key.
Eg: STUDENT (Reg No), PAYMENT (RegNo**), where ** marks the foreign key.
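
A foreign key can be tried out with Python's built-in sqlite3 module. This is only a sketch: the pay_id column and the lower-case names are assumptions, and SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued.

```python
import sqlite3


def make_db():
    """Build STUDENT and PAYMENT, with PAYMENT.reg_no as the foreign key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
    conn.execute(
        "CREATE TABLE student (reg_no TEXT PRIMARY KEY, st_name TEXT)")
    conn.execute("""
        CREATE TABLE payment (
            pay_id INTEGER PRIMARY KEY,              -- hypothetical surrogate key
            reg_no TEXT REFERENCES student(reg_no),  -- foreign key
            amount INTEGER
        )""")
    conn.execute("INSERT INTO student VALUES ('00106', 'Krishi')")
    return conn


def try_payment(conn, reg_no, amount):
    """Insert a payment; return False if the foreign key value is rejected."""
    try:
        conn.execute("INSERT INTO payment (reg_no, amount) VALUES (?, ?)",
                     (reg_no, amount))
        return True
    except sqlite3.IntegrityError:
        return False
```

A payment for reg_no 00106 is accepted, a payment for a Reg No that does not exist in STUDENT is refused, and a null reg_no is accepted because a foreign key may be null when it is not part of the primary key.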

4. Candidate key
A candidate key is an attribute or combination of attributes that uniquely identifies a row in a table. A relation can have more than one candidate key; among them, one will be chosen as the primary key.
Eg:
Person ID  Project ID  Supervisor ID  Time spent
0001       01          0002           50 hrs
0001       02          0002
Candidate keys: 1. (Person ID, Project ID) 2. (Person ID, Supervisor ID)
In the two rows shown only (Person ID, Project ID) differs from row to row; (Person ID, Supervisor ID) can serve as a key only if a person never has the same supervisor on two projects.

Reg No  Name    DOB
001     Krishi  22.10.96
002     Krishi  19.11.96
003     Renu    22.10.96
Candidate keys: 1. (Reg No, DOB) 2. (Name, Reg No)
Both combinations identify each row uniquely, although Reg No is already unique by itself.

Relational Database Integrity Rules
1. Entity Integrity
Specifically, all primary key entries are unique and no part of a primary key may be null. This guarantees that each entity will have a unique identity and ensures that foreign key values can properly reference primary key values.
Eg: no invoice can have a duplicate number, and the number cannot be null. All invoices are uniquely identified by their invoice number.

2. Referential Integrity
A foreign key may have either a null value (as long as it is not part of its table's primary key) or an entry that matches a primary key value in the table to which it is related. This makes it possible for an attribute not to have a corresponding value, but impossible for it to have an invalid entry.

Eg 1: Teacher (1) Teach (M) Student
TEACHER
Teacher ID  TName  Qualification  Course
001         Mary   BSc            Dip.ICT
002         Ann    MSc            HND

STUDENT (Teacher ID is the foreign key)
StReg No  Name  DOB       Course   Teacher ID
PG 0001   Jack  25.02.80  HND      001
PG 0002   Tom   08.11.86  Dip.ICT  (blank)

Supervisor (1) Supervision (*) Employee
SUPERVISOR
SupID  SName  Add    Department  Tel
001    Amal   Col-3  1A          256742
002    Kamal  Col-4  2A          232786

EMPLOYEE
EmployeeNo  EName   Department  SupID
001         Rohan   1A          Null
002         Shovon  2A          002
A supervisor ID is not required for every row of the Employee table, so SupID may be null.

Eg 2: Bank Account Have Customer
BANK ACCOUNT
Bank Type  Interest Rate
Saving     0.05
Current    0.15

Eg 3: Movie, Star
MOVIE
MID  MName  Language  Date
123  Star   Tamil     1/1/2000
456  Moon   Tamil     22/2/2000

STAR
StarID  SName    Salary  MName  MID
001     Ananda   10,000  Star   123
002     Nalanda  150,00  Moon   456
The movie ID must be identified in the Star table.

3. Domain Integrity
This concerns the values which are stored in a
particular column of a table.

Relational Database Operators (Relational Algebra)
Tables and their contents are manipulated using 8 relational operators.
1. Union
2. Intersect
3. Difference
4. Product
5. Select
6. Project
7. Join
8. Divide

Union
This combines all the rows from 2 tables. The tables must have the same attributes (be union compatible) to be used in a union.

Table A
StNo  Name  Add    Tel
1111  AA    Col-3  234567
3333  BB    Col-5  543211
5555  CC    Col-7  2113441

Table B
StNo  Name  Add    Tel
2222  DD    Col-4  2234415
4444  EE    Col-6  5432167
6666  FF    Col-7  5762134

A union B
StNo  Name  Add    Tel
1111  AA    Col-3  234567
3333  BB    Col-5  543211
5555  CC    Col-7  2113441
2222  DD    Col-4  2234415
4444  EE    Col-6  5432167
6666  FF    Col-7  5762134

Intersect
Intersect outputs only the rows that appear in both tables. In this case too the tables must be union compatible to produce a valid result. You cannot use intersect if one of the attributes is numeric and the other is character based.

Difference
Difference outputs the rows of one table that do not appear in the other table.
Production A: Soap, Noodles, Oil
Production B: Soap, Margarine, Bread
A minus B: Noodles, Oil
B minus A: Margarine, Bread
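
The three set-based operators can be sketched with Python sets, assuming each row is a tuple and both tables are union compatible. The variable names are illustrative only; the rows are the two Production lists above.

```python
# Each relation is a set of rows; each row is a tuple with the same columns.
production_a = {("Soap",), ("Noodles",), ("Oil",)}
production_b = {("Soap",), ("Margarine",), ("Bread",)}


def union(r, s):
    """All rows found in either table, with duplicates removed."""
    return r | s


def intersect(r, s):
    """Only the rows that appear in both tables."""
    return r & s


def difference(r, s):
    """Rows of r that do not appear in s."""
    return r - s
```

Here intersect(production_a, production_b) yields only the Soap row, while difference gives a different answer depending on which table comes first.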

Product
Product outputs all possible pairs of rows from 2 tables. If one table has 4 rows and the other table has 3 rows, the product outputs 4 x 3 = 12 rows.

Table 1
Itcode  Itdes  ItQty
001     AA     10
002     BB     20

Table 2
ItPrice  Itrecord  OrdNo
100      2         111
50       5         222

Table 1 x Table 2 (2 x 2 = 4 rows)
Itcode  Itdes  ItQty  ItPrice  Itrecord  OrdNo
001     AA     10     100      2         111
001     AA     10     50       5         222
002     BB     20     100      2         111
002     BB     20     50       5         222
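
A product can be sketched with itertools.product from the Python standard library. The sketch assumes rows are dictionaries and that the two tables have no column names in common; only some of the example columns are used.

```python
from itertools import product as cartesian

ITEMS = [
    {"Itcode": "001", "Itdes": "AA"},
    {"Itcode": "002", "Itdes": "BB"},
]
ORDERS = [{"OrdNo": "111"}, {"OrdNo": "222"}]


def product(r, s):
    """Pair every row of r with every row of s.

    Assumes the two tables share no column names, so merging the
    dictionaries does not overwrite anything.
    """
    return [{**a, **b} for a, b in cartesian(r, s)]
```

product(ITEMS, ORDERS) returns 2 x 2 = 4 rows, in the same order as the result table of the example.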

Select
Select outputs values for all rows found in a table. Select can be used to list all of the row values, or it can output only those row values that match a specified criterion. In other words, select outputs a horizontal subset of a table.

Item
ItemID  Name  Description  Qty
0001    AA    Shoe         20
0002    BB    Noodles      30
0003    CC    Bread        40

Eg 1: Select * from Item
The output will be the same as the Item table.

Eg 2: Select * from Item where ItemID = 0001
ItemID  Name  Description  Qty
0001    AA    Shoe         20

Project
Project outputs all values for selected fields. In other words, project outputs a vertical subset of a table.

Item 1
ItemID  Name  Description  Qty
0001    AA    Shoe         20
0002    BB    Noodle       30
0003    CC    Bread        40

Eg: Project ItemID, Name from Item 1
Item 2
ItemID  Name
0001    AA
0002    BB
0003    CC
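
Select and project can be sketched side by side. This sketch assumes rows are dictionaries; the function names and the predicate style are illustrative choices, not a standard API.

```python
ITEM = [
    {"ItemID": "0001", "Name": "AA", "Description": "Shoe", "Qty": 20},
    {"ItemID": "0002", "Name": "BB", "Description": "Noodle", "Qty": 30},
    {"ItemID": "0003", "Name": "CC", "Description": "Bread", "Qty": 40},
]


def select(rows, predicate):
    """Horizontal subset: keep the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def project(rows, fields):
    """Vertical subset: keep only the named fields, dropping duplicate rows."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[f] for f in fields)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(fields, key)))
    return result
```

For example, select(ITEM, lambda r: r["ItemID"] == "0001") corresponds to Eg 2 of Select, and project(ITEM, ["ItemID", "Name"]) corresponds to the Project example.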
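
Divide
Divide requires the use of one single-column table and one two-column table. It outputs the values of the second column that are paired with every value in the single-column table.

Eg 1:
Table A (two columns)
ITcode  Description
0001    AA
0002    BB
0003    CC
0004    DD

Table B (one column)
ITcode: 0001, 0003, 0007, 0008

Divide A/B
Description: AA, CC
(the result shown lists the descriptions whose ITcode occurs in both tables)

Eg 2:
Code  Loc No
A     5
A     7
B     5
B     6
B     3

Divide by
Code: A, B

Result
Loc No: 5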
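
Eg 2 can be reproduced with a short sketch of divide. It assumes the two-column table is held as a set of (code, loc_no) pairs and the single-column table as a set of codes; the names are illustrative.

```python
def divide(pairs, divisor):
    """Return the second-column values paired with every value in divisor.

    pairs   : set of (x, y) tuples, the two-column table
    divisor : set of x values, the single-column table
    """
    candidates = {y for _, y in pairs}
    return {y for y in candidates
            if all((x, y) in pairs for x in divisor)}


CODE_LOC = {("A", 5), ("A", 7), ("B", 5), ("B", 6), ("B", 3)}
CODES = {"A", "B"}
```

Only Loc No 5 is paired with both A and B, so divide(CODE_LOC, CODES) gives {5}.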

Q.
Table 1
Reg No  Name  Add    Tel No
0001    AA    Col-3  1111
0002    BB    Col-4  2222
0003    CC    Col-5  3333
0004    DD    Col-6  4444

Table 2
Reg No  Name  Add     Tel No  Course Name
0005    EF    Nuge    5555    HND
0006    GH    Maha    6666    NCC
0007    HI    Kirula  7777    BIT
0008    JK    Katta   8888    ICT
0003    CC    Col-5   3333    HND

1. Union
Reg No  Name  Add     Tel No
0001    AA    Col-3   1111
0002    BB    Col-4   2222
0003    CC    Col-5   3333
0004    DD    Col-6   4444
0005    EF    Nuge    5555
0006    GH    Maha    6666
0007    HI    Kirula  7777
0008    JK    Katta   8888

2. Intersect
Reg No  Name  Add    Tel No
0003    CC    Col-5  3333

3. Product (4 x 5 = 20 rows)
Each row of Table 1 is paired with every row of Table 2. The rows for Reg No 0001 are shown; the same five pairings repeat for 0002, 0003 and 0004.
Reg No  Name  Add    Tel No  Reg No  Name  Add     Tel No  Course Name
0001    AA    Col-3  1111    0005    EF    Nuge    5555    HND
0001    AA    Col-3  1111    0006    GH    Maha    6666    NCC
0001    AA    Col-3  1111    0007    HI    Kirula  7777    BIT
0001    AA    Col-3  1111    0008    JK    Katta   8888    ICT
0001    AA    Col-3  1111    0003    CC    Col-5   3333    HND

4. Difference (Table 2 minus Table 1)
Reg No  Name  Add     Tel No
0005    EF    Nuge    5555
0006    GH    Maha    6666
0007    HI    Kirula  7777
0008    JK    Katta   8888

5. Select Name from Table 2
Name: EF, GH, HI, JK, CC

6. Select * from Table 2 where Add <> Col-5
Reg No  Name  Add     Tel No
0005    EF    Nuge    5555
0006    GH    Maha    6666
0007    HI    Kirula  7777
0008    JK    Katta   8888

7. Project Table 1 over Reg No, Name
Reg No  Name
0001    AA
0002    BB
0003    CC
0004    DD

Join
Join allows us to combine information from 2 or more tables. Join gives relational databases their power, allowing independent tables to be linked by common attributes. A join is the result of 3 stages.
Stage 1: a product operation is applied.
Stage 2: a select operation is applied.
Stage 3: a project operation is applied.

CUSTOMER
CCode  CName  CAdd   ACode
0001   A      Col-3  2002
0002   B      Col-4  1001
0003   C      Col-5  2002
0004   D      Col-6  3003

AGENT
ACode  AArea   ATel
1001   Kirula  2222
2002   Maha    3333
3003   Kohu    4444

Product (4 x 3 = 12 rows)
CCode  CName  CAdd   ACode  ACode  AArea   ATel
0001   A      Col-3  2002   1001   Kirula  2222
0001   A      Col-3  2002   2002   Maha    3333
0001   A      Col-3  2002   3003   Kohu    4444
0002   B      Col-4  1001   1001   Kirula  2222
0002   B      Col-4  1001   2002   Maha    3333
0002   B      Col-4  1001   3003   Kohu    4444
0003   C      Col-5  2002   1001   Kirula  2222
0003   C      Col-5  2002   2002   Maha    3333
0003   C      Col-5  2002   3003   Kohu    4444
0004   D      Col-6  3003   1001   Kirula  2222
0004   D      Col-6  3003   2002   Maha    3333
0004   D      Col-6  3003   3003   Kohu    4444

Perform the select with the condition Customer.ACode = Agent.ACode.
Table 1
CCode  CName  CAdd   ACode  ACode  AArea   ATel
0001   A      Col-3  2002   2002   Maha    3333
0002   B      Col-4  1001   1001   Kirula  2222
0003   C      Col-5  2002   2002   Maha    3333
0004   D      Col-6  3003   3003   Kohu    4444

Project Table 1 so that ACode appears only once.
CCode  CName  CAdd   ACode  AArea   ATel
0001   A      Col-3  2002   Maha    3333
0002   B      Col-4  1001   Kirula  2222
0003   C      Col-5  2002   Maha    3333
0004   D      Col-6  3003   Kohu    4444

Q.
STUDENT
SNo   SName  Course     LecID
1111  AA     HND        L11
2222  BB     BIT        L12
3333  CC     VOW        L11
4444  DD     Multi Mid  L12

LECTURER
LecID  LecName  LecQual
L11    AB       A
L12    BC       B
L13    DE       C
L15    FG       D

Product (4 x 4 = 16 rows)
Each student row is paired with every lecturer row. The rows for SNo 1111 are shown; the same four pairings repeat for 2222, 3333 and 4444.
SNo   SName  Course  LecID  LecID  LecName  LecQual
1111  AA     HND     L11    L11    AB       A
1111  AA     HND     L11    L12    BC       B
1111  AA     HND     L11    L13    DE       C
1111  AA     HND     L11    L15    FG       D

Select where Table 1.LecID = Table 2.LecID
Table 3
SNo   SName  Course     LecID  LecID  LecName  LecQual
1111  AA     HND        L11    L11    AB       A
2222  BB     BIT        L12    L12    BC       B
3333  CC     VOW        L11    L11    AB       A
4444  DD     Multi Mid  L12    L12    BC       B

Project Table 3 so that LecID appears only once.
SNo   SName  Course     LecID  LecName  LecQual
1111  AA     HND        L11    AB       A
2222  BB     BIT        L12    BC       B
3333  CC     VOW        L11    AB       A
4444  DD     Multi Mid  L12    BC       B
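
The three stages of a join can be traced with the Customer and Agent data. This is a minimal sketch that assumes rows are tuples in the column order shown in the tables; the function name is illustrative.

```python
from itertools import product

# (CCode, CName, CAdd, ACode)
CUSTOMER = [
    ("0001", "A", "Col-3", "2002"),
    ("0002", "B", "Col-4", "1001"),
    ("0003", "C", "Col-5", "2002"),
    ("0004", "D", "Col-6", "3003"),
]
# (ACode, AArea, ATel)
AGENT = [
    ("1001", "Kirula", "2222"),
    ("2002", "Maha", "3333"),
    ("3003", "Kohu", "4444"),
]


def join_customer_agent(customers, agents):
    """Join the two tables on ACode in three explicit stages."""
    # Stage 1: product, every customer paired with every agent (12 pairs).
    paired = list(product(customers, agents))
    # Stage 2: select, keep the pairs whose ACode values match.
    matched = [(c, a) for c, a in paired if c[3] == a[0]]
    # Stage 3: project, drop the duplicate ACode column from the agent side.
    return [c + a[1:] for c, a in matched]
```

Real database systems do not build the full product first, but the result is the same as the final table of the example.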

Relational Analysis
There are 2 ways of describing the relational model. One method uses the terms relation, attribute and tuple. The other method describes the model using the more familiar terms table, field and record/row. The relational model describes the data as a set of relations or tables. Each relational table has a table name and a set of attributes, with each attribute having a unique name. The relation's attributes are the same as table columns or fields. Each relation has a set of tuples, which are the records in a table.

There is a close correspondence between the ER model and the relational model.

ER view
[Diagram: Employee and Project are each linked to the link entity Project Assignment at its many (*) end. EmpNo and ProNo are the keys.]

Employee
Empno  Name  Add    Tel
1111   AA    Col-3  1234
2222   BB    Col-4  1235
3333   CC    Col-5  1236
4444   DD    Col-6  1237

Project (the Protitle column is empty in the example)
Prono  Protitle  Duration  Budget
001              6 months  20,000/=
002              12 months 80,000/=
003              8 months  35,000/=
004              6 months  25,000/=

Project Assignment
Columns: Pro no, Emp no, Time spent

In relational analysis the notation we would use is as follows.
Employee (Empno, Name, Add, Tel No)
Project (Prono, Pro title, Duration, Budget)
Project Assignment (Prono, Empno, Time spent)
In this notation each relation is represented by one line. Each line starts with the relation name, followed by the names of the relation's attributes within brackets. The underlined attribute(s) form the relation's key (primary key). If you can organize the data into a set of such relations, then a good DB design can be achieved. A number of normal forms have been defined to eliminate duplication in relations. They are commonly known as:
- 1st Normal Form (1NF)
- 2nd Normal Form (2NF)
- 3rd Normal Form (3NF)
- Boyce-Codd Normal Form (BCNF)

Normalization
Normalization is a formal process for deciding which attributes should be grouped together in a relation. Before proceeding with the physical design we need a method to validate the logical design up to this point. Normalization is a tool to validate and improve a logical design so that it satisfies certain constraints, such as avoiding unnecessary duplication of data. Normalization is a process of decomposing relations with anomalies to produce smaller, well-structured relations. Normal forms are the rules for structuring relations. These were introduced by E. F. Codd. Later Codd, together with Raymond Boyce, proposed another normal form called Boyce-Codd Normal Form (BCNF).

The reason for normalizing data structures is to avoid data redundancy. If a certain field occurs in the DB several times, errors can arise. 3 types of anomalies exist.
1. Insertion anomaly
2. Deletion anomaly
3. Update anomaly

Employee
Empno  Name   Add
E100   Smith  Col-1
E150   James  Col-2
E300   Ann    Col-4
E350   David  Kandy

Emp_Course
Empno  C-ID    CName  Dura  Price
E100   D-100   DICS   5m    65000
E100   D-1100  IAD    1yr   10900
E300   D-25    BSC    7yr   12500
E100   M-100   MCSE   5yr   15000

Insertion Anomaly
Suppose that we need to add new course details to the Emp_Course table, where the PK is (EmpNo, C-ID). To insert a new row the user must supply values for both EmpNo and C-ID. This is an anomaly, since the user should be able to enter the course details without supplying an EmpNo.

Deletion Anomaly
Suppose that the data for EmpNo E100 is deleted from the table. This will result in losing the information about the courses IAD, DICS and MCSE.

To avoid these anomalies we can apply the normalization process to the data.
Unnormalized form (tables with repeating groups)
-> remove repeating groups
1st Normal Form
-> remove partial dependencies
2nd Normal Form
-> remove transitive dependencies
3rd Normal Form
-> remove remaining anomalies
Boyce-Codd Normal Form

Determinancy and Dependency
During normalization, logical associations between data items are identified and represented in the DB design without any file maintenance anomalies. Identification of the logical associations between data items is important in normalization and DB design. Such associations between data items are called determinancy and dependency relationships.
Eg: if knowing the value of one data item lets us determine a specific value of data item V, the first data item is called the determinant and data item V is said to be the dependent.
There are 2 types of determinancy/dependency relationship.
1. Functional dependency
2. Non-functional dependency

Functional dependency
Normalization is based on the analysis of functional dependencies. A functional dependency is a relationship between 2 attributes.
Eg: in a relation R, attribute B is functionally dependent on attribute A if, for every valid instance of A, that value of A uniquely determines the value of B (written A -> B).

Non-functional dependency
If, for a given data item, we can find several values of another data item, then there is a non-functional dependency between the 2 data items.
Eg: a student number can have several subjects related to it.

Candidate key
This is a determinant that can uniquely identify a row.

Partial dependency
Non-key attributes are functionally dependent on part of the primary key.
Eg: Emp_Course (EmpID, CID, Name, CName, CDuration)
Course name and duration depend on only one part of the primary key (CID); the other part of the key is EmpID.

Multivalued dependency
This is a type of dependency that exists where there are at least 3 attributes in a relation.
Eg: in a relation with attributes A, B and C, for each value of A there is a well-defined set of values for B and a well-defined set of values for C, but the set of values of B is independent of C.
Eg:
Subject  Teacher  Text Book
DBD      AA       Text 1
DBD      BB       Text 1

Transitive dependency
This is a functional dependency between 2 or more non-key attributes in a relation.
Eg: consider the following relation.
Sales (CNo, Name, Sales person, Region)
In the above relation each sales person is assigned to a unique region. We can identify the following functional dependencies in the Sales relation.
1. CNo -> Name, Sales person, Region
2. Sales person -> Region
The region is functionally dependent on the sales person, and the sales person is functionally dependent on CNo. As a result there are update anomalies in the Sales relation.
1. We cannot add a new sales person for a new region without a customer.
2. If we delete a customer we will lose the sales person information as well.
3. If a sales person is transferred to a new region, then several rows must be changed.
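
A functional dependency can be checked mechanically against sample data. The sketch below assumes rows are dictionaries; the notes give no data for the Sales relation, so the sample rows are hypothetical.

```python
def holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in the given rows.

    The dependency fails as soon as two rows agree on the lhs attributes
    but disagree on the rhs attributes.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for Sales (CNo, Name, SalesPerson, Region).
SALES = [
    {"CNo": 1, "Name": "AA", "SalesPerson": "Silva", "Region": "South"},
    {"CNo": 2, "Name": "BB", "SalesPerson": "Perera", "Region": "West"},
    {"CNo": 3, "Name": "CC", "SalesPerson": "Silva", "Region": "South"},
]
```

With these rows, CNo determines Sales person and Sales person determines Region, which is the transitive chain described above. A check on sample data can only disprove a dependency; whether it holds in general is a business rule.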

Rules of Data Normalization
1. 1NF: Eliminate repeating groups. Make a separate table for each set of related attributes and give each table a primary key.
2. 2NF: Eliminate redundant data. If an attribute depends on only part of a composite key, remove it to a separate table.
3. 3NF: Eliminate columns not dependent on the key. If attributes do not contribute to the description of the key, remove them to a separate table.
4. BCNF: Boyce-Codd Normal Form. If there are dependencies between candidate key attributes, separate them into different tables.

1. Eliminate Repeating Groups
In the original member list, each member name is followed by any databases that the member has experience with. Some might know many, and others might not know any. To answer the question "Who knows DB2?" we need to perform an awkward scan of the list looking for references to DB2. This is an inefficient and extremely untidy way to store information. Moving the known databases into a separate table helps a lot. Separating the repeating groups of databases from the member information results in 1NF. The member ID in the Database table matches the PK in the Member table, so the 2 tables can be related with a join operation. Now we can answer the question by looking in the Database table for DB2 and getting the list of members.

Member list
1. John Smith: Access, DB2, FoxPro
2. Dave Jones: dBase, Clipper
3. Mike Beach
4. Jerry Miller: DB2, Oracle
5. Ben Stuart: Oracle, Sybase
6. Fred Flint: Informix
7. Joe Blow
8. Greg Brown: Access, MS SQL Server
9. Doug Hope

Member table
MID  Name
1    John Smith
2    Dave Jones
3    Mike Beach
4    Jerry Miller
5    Ben Stuart
6    Fred Flint
7    Joe Blow
8    Greg Brown
9    Doug Hope

Database table
DID  MID  Database
1    1    Access
2    1    DB2
3    1    FoxPro
4    2    dBase
5    2    Clipper
6    4    DB2
7    4    Oracle
8    5    Oracle
9    5    Sybase
10   6    Informix
11   8    Access
12   8    MS SQL Server
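
The move to 1NF can be sketched as a small transformation. The sketch assumes the original member list is held as (MID, name, list of databases) tuples; the function names are illustrative.

```python
MEMBER_LIST = [
    (1, "John Smith", ["Access", "DB2", "FoxPro"]),
    (2, "Dave Jones", ["dBase", "Clipper"]),
    (3, "Mike Beach", []),
    (4, "Jerry Miller", ["DB2", "Oracle"]),
    (5, "Ben Stuart", ["Oracle", "Sybase"]),
    (6, "Fred Flint", ["Informix"]),
    (7, "Joe Blow", []),
    (8, "Greg Brown", ["Access", "MS SQL Server"]),
    (9, "Doug Hope", []),
]


def to_1nf(member_list):
    """Split the repeating group of databases into its own table."""
    members = [(mid, name) for mid, name, _ in member_list]
    databases = []  # rows of (DID, MID, Database)
    for mid, _, known in member_list:
        for db in known:
            databases.append((len(databases) + 1, mid, db))
    return members, databases


def who_knows(members, databases, db_name):
    """Answer 'Who knows X?' by matching MID between the two tables."""
    names = dict(members)
    return [names[mid] for _, mid, db in databases if db == db_name]
```

Calling who_knows(*to_1nf(MEMBER_LIST), "DB2") returns John Smith and Jerry Miller without scanning any free-text list.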
2. Eliminate Redundant Data
3. In the DB Table, & PK is made up of the member ID &
Databade ID. This makes scnse for other attributes like where
learned & Skill level attributes, since they will be different
for every member/ database combination. But the database name
depends only on the Database. The same database under different
IDs. This is an update anomaly. Or suppose the last member listing
a particular database leaves the group. This records will be
removed from the system, & the data base will not be stored
anywhere . This is a delete anomaly. To avoid these proplems, we
need 2NF. To Achieve this ,separate the attributes deperching on
both parts of the key from those depending only on the Database ID.
This results in 2 tables: database Which gives the name for each
database ID , and Member database which list the databases for each
member. Now we can recassfica database in a single operation: look
up The database ID in the Database table & change its name. The
result will instantly be available throught the application.
Database table (before the split)
DID  MID  Database
1    1    Access
2    1    DB2
3    1    FoxPro
4    2    dBase
5    2    Clipper
6    4    DB2
7    4    Oracle
8    5    Oracle
9    5    Sybase
10   6    Informix
11   8    Access
12   8    MS SQL Server

Member table
MID  Name  (members 1 to 9, unchanged)

Member Database table
MID  DID
1    1
1    2
1    3
2    4
2    5
4    2
4    6
5    6
5    7
6    8
8    1
8    9

Database table
DID  Database
1    Access
2    DB2
3    FoxPro
4    dBase
5    Clipper
6    Oracle
7    Sybase
8    Informix
9    MS SQL Server
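The two anomalies and their fix can be sketched as follows, again with sqlite3. The table names (db, member_db) and the new name 'IBM DB2' are assumptions made for the illustration, and only a few of the rows above are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE db (did INTEGER PRIMARY KEY, db_name TEXT);
    CREATE TABLE member_db (
        mid INTEGER,
        did INTEGER REFERENCES db(did),
        PRIMARY KEY (mid, did)
    );
""")
conn.executemany("INSERT INTO db VALUES (?, ?)",
                 [(1, "Access"), (2, "DB2"), (6, "Oracle")])
conn.executemany("INSERT INTO member_db VALUES (?, ?)",
                 [(1, 1), (1, 2), (4, 2), (4, 6), (5, 6)])

# Update anomaly avoided: one UPDATE renames the database for every member.
# 'IBM DB2' is a hypothetical new name used only for this sketch.
conn.execute("UPDATE db SET db_name = 'IBM DB2' WHERE did = 2")
names_for_member_4 = [name for (name,) in conn.execute(
    "SELECT db.db_name FROM member_db "
    "JOIN db ON db.did = member_db.did "
    "WHERE member_db.mid = 4 ORDER BY db.did")]

# Delete anomaly avoided: removing every member who lists Oracle
# leaves the Oracle row itself in the db table.
conn.execute("DELETE FROM member_db WHERE did = 6")
oracle_rows = conn.execute(
    "SELECT COUNT(*) FROM db WHERE db_name = 'Oracle'").fetchone()[0]

print(names_for_member_4, oracle_rows)  # ['IBM DB2', 'Oracle'] 1
```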
3. Eliminate Columns Not Dependent on Key
The Member table satisfies 1NF: it contains no repeating groups. It
satisfies 2NF, since it does not have a composite (multi-attribute)
key. But the key is the member ID, and the company name and location
describe only a company, not a member. To achieve 3NF, they must be
moved into a separate table. Since they describe a company, the company
code becomes the key for the new Company table. The motivation for this
is the same as for 2NF: we want to avoid update and delete anomalies.
For example, suppose no members from IBM were currently stored in the
database. With the previous design, there would be no record of its
existence, even though 20 past members were from IBM.
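Member table (before)
MID  Name          Company  Company location
1    John Smith    ABC      Alabama
2    Dave Jones    MCI      Florida
3    Mike Beach    IBM      Delaware
4    Jerry Miller  MCI      Florida
5    Ben Stuart    AIC      Nebraska
6    Fred Flint    ABC      Alabama
7    Joe Blow      RuNuts   Iowa
8    Greg Brown    XYZ      New York
9    Doug Hope     IBM      Delaware

Member table (3NF)
MID  Name          CID
1    John Smith    1
2    Dave Jones    2
3    Mike Beach    3
4    Jerry Miller  2
5    Ben Stuart    4
6    Fred Flint    1
7    Joe Blow      5
8    Greg Brown    6
9    Doug Hope     3

Company table
CID  Name    Loc
1    ABC     Alabama
2    MCI     Florida
3    IBM     Delaware
4    AIC     Nebraska
5    RuNuts  Iowa
6    XYZ     New York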
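A small sketch of the 3NF design, assuming illustrative table and column names (company, member, cid, loc) and loading only a subset of the rows above. It shows the IBM case from the text: the company survives even when none of its members remain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (cid INTEGER PRIMARY KEY, name TEXT, loc TEXT);
    CREATE TABLE member (
        mid  INTEGER PRIMARY KEY,
        name TEXT,
        cid  INTEGER REFERENCES company(cid)
    );
""")
conn.executemany("INSERT INTO company VALUES (?, ?, ?)",
                 [(1, "ABC", "Alabama"), (2, "MCI", "Florida"),
                  (3, "IBM", "Delaware")])
conn.executemany("INSERT INTO member VALUES (?, ?, ?)",
                 [(1, "John Smith", 1), (2, "Dave Jones", 2),
                  (3, "Mike Beach", 3), (9, "Doug Hope", 3)])

# Every IBM member leaves the group.
conn.execute("DELETE FROM member WHERE cid = 3")

# The company and its location are still on record.
ibm_row = conn.execute(
    "SELECT name, loc FROM company WHERE cid = 3").fetchone()
print(ibm_row)  # ('IBM', 'Delaware')
```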
BCNF
Boyce-Codd Normal Form (BCNF) states mathematically that a relation R
is in BCNF if, whenever a functional dependency X -> A holds in R and A
is not in X, then X is a superkey of R (that is, X contains a candidate
key). BCNF covers very specific situations where 3NF misses
inter-dependencies between attributes that are not part of the primary
key but do belong to another candidate key. Typically, any relation
that is in 3NF is also in BCNF; a 3NF relation can fail BCNF only when
(a) there are multiple candidate keys, (b) the keys are composed of
multiple attributes, and (c) there are common attributes between the
keys.

E.g.: Class Enrolment (unnormalized)
Class code  Class description  Student No  Name
503         Mgt Info           0001        AA
                               0003        BB
                               0005        CC
540         Maths              0002        DD
                               0004        EE
1. Convert into 1NF
Repeating groups exist within this data: each class code can have any
number of students in it, so the student information forms the
repeating group. Data cannot be stored or processed in a database when
it is in this form. What we must have is one record containing all the
data for each student enrolled in a class; there can be no gaps in the
data when it is stored in a file.

1NF
Class code  Class description  Student No  Name
503         Mgt Info           0001        AA
503         Mgt Info           0003        BB
503         Mgt Info           0005        CC
540         Maths              0002        DD
540         Maths              0004        EE

The above table is in 1NF: no repeating groups and no gaps. To take the
table to 2NF we have to remove the partial dependency.

Class Information
Class code  Class description
503         Mgt Info
540         Maths

Student Info
Class code  Student No  Name
503         0001        AA
503         0003        BB
503         0005        CC
540         0002        DD
540         0004        EE
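The dependency that motivates this split can be checked mechanically against the sample rows. The sketch below is an assumption-laden illustration: the helper names (holds, is_superkey) and the attribute names are invented, and a check on sample data can only show that a dependency is not contradicted by these rows, since real functional dependencies come from the business rules.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first value seen for a key must match every later one.
        if seen.setdefault(key, val) != val:
            return False
    return True


def is_superkey(rows, attrs):
    """attrs identifies rows uniquely if no two rows agree on all of attrs."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(keys)) == len(keys)


# The 1NF Class Enrolment table from above.
enrolment = [
    {"class_code": "503", "class_desc": "Mgt Info", "student_no": "0001", "name": "AA"},
    {"class_code": "503", "class_desc": "Mgt Info", "student_no": "0003", "name": "BB"},
    {"class_code": "503", "class_desc": "Mgt Info", "student_no": "0005", "name": "CC"},
    {"class_code": "540", "class_desc": "Maths", "student_no": "0002", "name": "DD"},
    {"class_code": "540", "class_desc": "Maths", "student_no": "0004", "name": "EE"},
]

fd_holds = holds(enrolment, ["class_code"], ["class_desc"])
lhs_is_key = is_superkey(enrolment, ["class_code"])
print(fd_holds, lhs_is_key)  # True False
```

Here class_code -> class_desc holds while class_code is not a key of the 1NF table, which is the situation the Class Information table removes.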
Non-loss decomposition
This is the process of transforming an unnormalized data set into a
fully normalized database without losing information: joining the
resulting tables must reproduce the original data.
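One way to see what "non-loss" means is to split the 1NF Class Enrolment table and join the pieces back together. This is a minimal sketch in plain Python; the helper names (project, natural_join, as_set) and attribute names are assumptions made for the illustration.

```python
def project(rows, attrs):
    """Keep only attrs from each row and drop duplicate rows."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


def natural_join(left, right, on):
    """Combine rows from both tables that agree on the shared attribute."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]


def as_set(rows):
    """Order-independent form of a table, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


original = [
    {"class_code": "503", "class_desc": "Mgt Info", "student_no": "0001", "name": "AA"},
    {"class_code": "503", "class_desc": "Mgt Info", "student_no": "0003", "name": "BB"},
    {"class_code": "503", "class_desc": "Mgt Info", "student_no": "0005", "name": "CC"},
    {"class_code": "540", "class_desc": "Maths", "student_no": "0002", "name": "DD"},
    {"class_code": "540", "class_desc": "Maths", "student_no": "0004", "name": "EE"},
]

class_info = project(original, ["class_code", "class_desc"])
student_info = project(original, ["class_code", "student_no", "name"])

rejoined = natural_join(student_info, class_info, "class_code")
lossless = as_set(rejoined) == as_set(original)
print(len(class_info), len(student_info), lossless)  # 2 5 True
```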
Relational / Bracketing notation
If we represent every normalized table of a system in full, the
description becomes very complex. To avoid this, we can use bracketing
notation, which represents each table in one statement. The statement
starts with the table name, followed by the key(s) and the other
attributes, separated by commas. Primary keys are underlined and
foreign keys are marked with **.

Emergence of the DBMS
Enter the database management system. The simple approach of opening a
data file for input soon became fraught with problems that had to be
addressed.
Computer system vendors needed to be able to support the critical
needs of a growing and evolving marketplace that supported the data
processing needs of organizations in all fields of human endeavour. The
first DBMSs appeared during the 1960s, at a time in human history when
projects of momentous scale were being contemplated, planned, and
engineered. Never before had such large data problems arisen; as they
were identified, solutions were researched and developed, often in real
time. The DBMS became necessary because the data was far more volatile
than had earlier been planned, and because there were still major
limiting factors in the costs associated with data storage media. Data
grew at the detailed, transaction-by-transaction level. In the 1980s,
all the major vendors of hardware systems large enough to support the
evolving computerized record keeping needs of large organizations
bundled some form of DBMS with their system solution. The first DBMS
species were thus very much vendor specific.
IBM, as usual, led the field, but there were a growing number of
competitors and clones whose database solutions offered varying entry
points onto the bandwagon of computerized record keeping systems.
Through this time, the specific nature of the problems being resolved,
and the workarounds, were evolving with the technology from the
perspective of IT management. The number of mainframe and minicomputer
hardware vendors increased, and the number of peripheral types and
their vendors also increased. The bundling of database operational
services, such as the ability to perform and schedule data backups,
became routine. The IT operational environment at this time was
characterized by a collection of system housekeeping tasks, with major
emphasis on backup of the organization's production databases. File
reorganization and reindexing of data were also standard, but mainly
manual, system housekeeping tasks. Other tasks included storage media
archival and the management of documents and business records,
particularly those related to financial record keeping.

MedPix medical image database uses healthy dose of FOSS
MedPix is a sprawling online medical image database and diagnostic
tool that is used around the world by radiologists, nurses, physicians,
and medical students, and the whole system is powered by Linux and open
source software. MedPix is hosted by the US federal government's health
sciences university, the Uniformed Services University (USU) in
Bethesda, Maryland. It is the brainchild of James G. Smirniotopoulos,
M.D., a USU professor of Radiology, Neurology and Biomedical
Informatics and clinical sciences chair of its Department of Radiology
and Radiological Sciences.

Exploiting the DBMS (Selecting a DBMS)
The DBMS is a software application system that is used to create,
maintain, and provide controlled access to user databases. E.g.: it
provides a user interface to the database system.
Data management layer: the ANSI/SPARC architecture
A DBMS works as an interface between the users' application programs
and the database. The American National Standards Institute Standards
Planning and Requirements Committee (ANSI/SPARC) proposed a three-level
architecture for this interface:

External / User view
This level describes the user's or application program's view of the
database. Several users can share the same view.

Conceptual / Logical view
This describes the overall data requirements as a detailed,
technology-independent specification of the overall structure of a
database. Entity-Relationship modeling and object-oriented modeling are
two different graphical notations used for presenting the conceptual
model.

Physical / Internal view
This describes the way in which the data is stored and accessed.
The above three-level architecture creates independence of data at two
levels.

Logical data independence
This refers to the ability to introduce changes to the conceptual
schema without any influence on the external schemas. E.g.: adding a
new attribute to an entity without affecting existing user views.

Physical data independence
This refers to the ability to introduce changes to the physical schema
without any influence on the logical schema. E.g.: changing the storage
structure of data without any influence on the conceptual schema.
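Both kinds of independence can be sketched with an SQL view standing in for an external view. This is only an illustration using sqlite3; the table, view, index, and column names (student, class_list, idx_student_class, email) are assumptions, not part of any particular system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        student_no TEXT PRIMARY KEY,
        name       TEXT,
        class_code TEXT
    );
    INSERT INTO student VALUES ('0001', 'AA', '503');
    INSERT INTO student VALUES ('0002', 'DD', '540');
    -- External view used by one group of users.
    CREATE VIEW class_list AS SELECT student_no, name FROM student;
""")


def read_view():
    """What a user of the external view sees."""
    return conn.execute(
        "SELECT * FROM class_list ORDER BY student_no").fetchall()


before = read_view()

# Conceptual-level change: a new attribute is added to the base table.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
# Physical-level change: an index alters how rows are found, not what they mean.
conn.execute("CREATE INDEX idx_student_class ON student (class_code)")

after = read_view()
unchanged = before == after
print(unchanged)  # True
```

The program reading class_list needs no change after either alteration, which is the point of separating the three levels.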
The DBMS consists of 3 parts:
1. Kernel (core, the central point)
2. Interface
3. Toolkit

Kernel
The DBMS kernel is the central engine which handles the main data
management functions. Most DBMSs are installed on top of an operating
system. Therefore the DBMS needs to interact with the operating system
to implement and access the data and the system catalogue, which are
usually stored on hard disk. It interacts with operating system
elements such as:
i. File manager: translates between the data structures manipulated by
the DBMS and the files on the hard disk.
ii. Access mechanism: the file manager does not manage the physical
input and output of data directly. It interacts with the appropriate
access mechanism established for the different physical structures.
iii. System buffer: data being read or written is normally held in the
system buffer of the operating system.

Functions of DBMS
- Create, retrieve, update, and delete data in the database (CRUD
functions)
- Data integrity and data security
- Data communication
- Concurrency control
- Data recovery and backups
- Query optimization
- Managing transactions
- Scheduling transactions
- Data dictionary
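A few of these functions can be seen in a minimal sketch with sqlite3: the four CRUD operations, followed by a transaction that is rolled back when one of its statements fails. The table name (module) and the updated title are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE module (code TEXT PRIMARY KEY, title TEXT)")

# Create
conn.execute("INSERT INTO module VALUES ('503', 'Mgt Info')")
# Retrieve
title = conn.execute(
    "SELECT title FROM module WHERE code = '503'").fetchone()[0]
# Update
conn.execute(
    "UPDATE module SET title = 'Management Information' WHERE code = '503'")
conn.commit()

# Transaction management: the block below is one unit of work.
# "with conn" commits on success and rolls back if an exception escapes.
try:
    with conn:
        conn.execute("DELETE FROM module WHERE code = '503'")           # Delete
        conn.execute("INSERT INTO module VALUES ('540', 'Maths')")
        conn.execute("INSERT INTO module VALUES ('540', 'Duplicate')")  # breaks the PK
except sqlite3.IntegrityError:
    pass  # the whole unit of work was undone, including the DELETE

remaining = conn.execute(
    "SELECT code, title FROM module ORDER BY code").fetchall()
print(remaining)  # [('503', 'Management Information')]
```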
Software components available in a DBMS
- DDL compiler
- Data dictionary manager
- Query processor and runtime processor (input: user queries)
- Precompiler (input: application programs)
- DML compiler
- Host language compiler

Database features

Data independence
The separation of data descriptions/definitions from the applications
that use the data. In a relational model, logical independence and
physical independence are achieved by allowing changes to the physical
storage or to the application programs to take place without the user
being aware of the changes.

Data abstraction
Data abstraction is the process by which a database attempts to
represent properties of objects in the real world. A database records
the relevant details required to support some organizational
activities; it does not record all the details. The database is an
abstraction of the real world. The abstraction level varies among the
different commercial databases.

Data security
Data security is the process of protecting the data from external
threats. Data is a key resource for an organization. Loss of data,
privacy, integrity, availability, etc. are issues that cost an
organization in financial terms.

Data Integration
A database should be a collection of data which, at least ideally, has
no redundant data. The aim is to have one database storing each logical
item of data in one place, which can be accessed by a range of
information systems. E.g.: replacing several departmental databases in
one company with one database system that can be made accessible to
several departments.

Physical database design
The purpose of physical database design is to translate the logical
description of data into the technical specification for retrieving
and storing data. The goal is to achieve the required performance and
to ensure database integrity, security, and recoverability. Physical
database design must be performed carefully, since the decisions made
during this stage have a major impact on data accessibility, response
time, security, user friendliness, etc. Physical database design does
not include implementing the database, but it produces a technical
specification that programmers and others involved in information
system construction will use during implementation.

There are 3 major inputs to physical database design:
1. The logical database structures that were developed during logical
design. This structure may be expressed in a hierarchical, network,
relational, or object-oriented model, with a definition for each
attribute.
2. The user processing requirements that were identified during
requirements definition, including size, when and where data is used,
response time, data security, backup and recovery, integrity, and
retention of data.
3. A description of the technology used for implementing the database
(the DBMS).

There are several critical decisions that will affect the integrity
and performance of the application system. These key decisions are:
- Data volume analysis
- Transaction / usage analysis
- Control and security analysis
- Distribution analysis
- Integrity analysis
Integrity Analysis
Integrity defines the business rules or constraints that should apply
to the data. There are 4 basic types of integrity rules.

Entity Integrity
This defines that each and every entity should have a unique
identifier, a primary key, and that it cannot be null.
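As a hedged illustration of how such rules are commonly declared, the sketch below uses sqlite3. The primary key shows entity integrity; the foreign key and the CHECK clause correspond to the referential and domain rules described next. The table names, the age column, and its 16 to 120 range are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE class (
        class_code TEXT PRIMARY KEY NOT NULL
    );
    CREATE TABLE student (
        -- Entity integrity: unique and never null.
        -- (SQLite needs NOT NULL spelled out for non-integer keys.)
        student_no TEXT PRIMARY KEY NOT NULL,
        name       TEXT NOT NULL,
        -- Referential integrity: must match an existing class.
        class_code TEXT REFERENCES class(class_code),
        -- Domain integrity: hypothetical allowed range.
        age        INTEGER CHECK (age BETWEEN 16 AND 120)
    );
    INSERT INTO class VALUES ('503');
    INSERT INTO student VALUES ('0001', 'AA', '503', 20);
""")


def rejected(sql):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        with conn:
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_key = rejected("INSERT INTO student VALUES ('0001', 'BB', '503', 21)")
null_key = rejected("INSERT INTO student VALUES (NULL, 'CC', '503', 22)")
bad_reference = rejected("INSERT INTO student VALUES ('0002', 'DD', '999', 23)")
bad_domain = rejected("INSERT INTO student VALUES ('0003', 'EE', '503', 7)")
print(duplicate_key, null_key, bad_reference, bad_domain)  # True True True True
```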
Referential Integrity
This refers to the rules concerning the relationship between two
entity types, i.e. the rules expressed by defining foreign keys in the
relational data model.

Domain Integrity
This refers to the constraints on the valid values for attributes. We
can achieve domain integrity for an attribute by specifying the
following:
- The type of the attribute
- The length
- Format / range
- Allowed values
- Null support or not
- Check constraints

Triggering Operations
All the other business rules that protect the validity of an attribute
value are defined as triggers. A trigger is a program or procedure
which is automatically executed in response to an event. An event can
be an insertion, a deletion, or an update.
User Rule: check withdrawal amount