Top Banner
Database Designing Concepts Data Base A Data Base is shared collection of logical interrelated data. A Data Base System Every organization needs to record data relevant to their every day activates. The organization can choose from the data collected & stored some of these data in an electronic Data Base. Eg: A university needs to record data to help in the activates of teaching & learning therefore three record what kind of students & lectures they have what are the courses & modules they are running which lecturers are teaching what modules. One the data is entered in to a database it could be utilization to get complete & accrete information’s such as list of students who have enrolled for each modules. this can help to make decision on room utilization .etc.. Data vs. Information The Word data refers to facts concerning things such as people, objects or events. Information is data that have been processed & presented in a from suitable for human interpretation. Disadvantages of manual system 1. A Constant stream inter company paper work 2. Telephone calls or faxes are required to communicate changes & keep the file synchronize. 3. These systems cannot provide “what if” type of questions. 4. Managers cannot easily obtain summarized information required for decision making. 5. Duplicate data can exist throughout organization resulting in lack of consistency. Disadvantages of file processing system 1. There is limited data sharing
68

Database Designing Concepts

Nov 15, 2014

Download

Documents

ChiranSJ

Database Designing Concepts
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript

Database Designing Concepts Data Base A Data Base is shared collection of logical interrelated data. A Data Base System Every organization needs to record data relevant to their every day activates. The organization can choose from the data collected & stored some of these data in an electronic Data Base. Eg: A university needs to record data to help in the activates of teaching & learning therefore three record what kind of students & lectures they have what are the courses & modules they are running which lecturers are teaching what modules. One the data is entered in to a database it could be utilization to get complete & accrete informations such as list of students who have enrolled for each modules. this can help to make decision on room utilization .etc.. Data vs. Information The Word data refers to facts concerning things such as people, objects or events. Information is data that have been processed & presented in a from suitable for human interpretation. Disadvantages of manual system 1. A Constant stream inter company paper work 2. Telephone calls or faxes are required to communicate changes & keep the file synchronize. 3. These systems cannot provide what if type of questions. 4. Managers cannot easily obtain summarized information required for decision making. 5. Duplicate data can exist throughout organization resulting in lack of consistency. Disadvantages of file processing system 1. There is limited data sharing 2. Inconsistency of data 3. Duplication of data can occur. 4. Excessive program maintenance 5. Poor enforcement of standard. Advantages of Database 1. Better data sharing 2. Better security & integrity 3. Reduced redundancy 4. The program & data are independent 5. Multiple views of data.

Disadvantages of Database 1. New specialized persons are required. 2. Need to get 3. Problems in data sharing explicit backup. Database Approaches Database approach emphasizes the integration & sharing of data. There are 2 designing approaches. 1-process Driven Design Requirement Analysis

Process Design

Data Design

Implementation File processing system a process driven approached was traditional being used with this approche organization process such as order processins, inventory control, pay roll are identified & analyzed. Process and data flows between processed are described using tools such as DED. Designers work backwards from the required outputs of the system to determine the requirement unput. They can use flow chart into outputs. Finally they design data file as a wide product of product of process divan design. 2- Data Driven Design Requirement Analysis

Data design Process Design

Implentation

It forcuses on entities person, Plese, events or concept about which an organization choose to record data. This approche ideenti files the attributes of the properties of these entities & the relation ship among them. It also identities the business rules that speifiese how the entities are managed or used. After creating the suitable module of the data structure and related business rules the designer developes the application reqired to manage the data. Designers now have discovered a balance approche to combine to different applications approprialely suitable. Database development Enterprise Data module Requirement Analysis Conceptual model Logical Data modle

Implementation

Physical Data Model

Conceptual data model This involves building a real world model expressed interms of data requirements. The initiaset of information & processing reqirements are gathered from the users. The developments of the large databases conception modeline may consis of view integration (combining the user views into onces schema) The output from this stage is the entity relationship digram. (3) Logical data Model This involves the building of the real world expressed interms of a data model. Eg. Hierarchical, Network ,Relational. This involves determind the concept of the database using the conception model as input & transforming it into the architectural data model.This is the mapping of the ER model.to the relational model & caring the process of normalization. The output of this stage is relational schema up or set of table. Physical data model. This involves building a model of the real world expressed interms of data structures & Acces mechanisum available in a choosen Database management system.

This involved in a transformation of a logical data model suitable for a specific software /Hardware specification. This is usuall expressed in data dominion language & theoutput of this stage is implementation plan. Entity An entity is the thing we modeled & it is something about which we wish to store information we name the entities with the singular noun. An entity can be 1-Aperson Employer, student, customer etc. 2-A place It could be a state country , region etc.3-An object chair,Table, building etc. 4- An Event sales, Rigistration, Renewal. 5- A concept Acount ,Course The organization. ER models are usually constructed during the analysis phase of the DB development process. Output of this stage is a coceptval data model expressed in the form of a detailed ER digram. The ER digram consist of entities relationships & attribules.

(4) Basic Systems of an ER digram.

Entity Attributes Registration Primary key Attributes. This is the property for entity. Primary key

Relationship

Primary key

Eg: Student Relationshipjhhgvf Rel Student No

Attributes Name Add Tel No

ationship Association between entities. Can be specified using relationship. There are 3 basic kind of relationship. 1- 1:1 (one to one) 2.- M (one to Many) / (1:*) 3 M:M (Many to Many ) / (*:*) 1:1 1. (5)

Country 1

H 1 H 1 1 H

Capital

2.

Lock

Key

3

Husband

Wife

1:M/ / 1:*

1 B

*

Invoice

1.

Customer 1 H ave 1 Shop has *

2. . 3

Mother

Child

Sales Man

M: M /*:* PE . Student

* Regis ter

* Sales Man

Cannot allow M:M for introduced Link Entity with two 1:M relationship.

(6)

Student 1 *** * Enrolment

Student OR Course Student

Course

Link Entity

Course

* Person Communi cate

* Country

Country C 1 Language

*

* 1 Person

(7) British counsil library Library items Book / CD DVD / Enayclopedia Reception Membership Readers Libraraian

British Councial Library

H ave

Reception

Regi ster

Reader

CD DVD Library Item books

H ave

H ave

Encyclopedia Copy A 1.* Librararian Lending Collection D eati

Accomerdation of ER model to a relationalData model. After drawing a ER digram it has to be converted into relational schema it can be done as follows. Convert each entity into a relational table in your relational data model the identifier of the entity becomes primary key of the table, all other attributes become nonkey attributes of the table. (8) Eg. Student (Entity) (RegNo,Name,Add) Primary Key Table Name

1: M Relationships are represented as relational schema by taking the key field of the table at one end gide as a foreign key to the table at the many end side. Eg: Customer Customer (CNo Primary key, Name, Billno,goods-------) Foreign Key Order Order (Order No,Name---------(No)

1:1 relationship can represented as one table A turnary relationship is respresented as 1 table with the compersit primary key which consist of all & identifiers of the 3 tables. 1:M Recursive relationship is represented in1 with a forigenkey which is effectively primary key.

Employee

Manage

** Employee (EmpId , Name, Add.DoB ------

If there are M:M relationship they should be avoided by breaking them into 2, 1:M relationship for that we have to introduced additional entity call link entity. LE is added which cross references between instances of 1 entity & instances of other entity. The many ends of the relationship upper at the LE. Thhe PK of the 2 entities are in corperated into the linkentity. (9)

There can be two types of entities avatable in ER digram 1- Super type entity 2. Sub type entity Super type entity This is an entity that stores attributes commen to 1 or more entity sub types Sub type entities This is an entity that inherites some common attributes from an entity super type.

Attributes

These are the charatristics of entities can be used to disscribe entities and relationship in the ER digram. An entity can have many attributes but should have an identification key. A key is an attribute or group of attributes which can identify an entity uniquely. A key attribute can be underlined in the ER digram. There are single valued attribute is 1 that can have sigle valued attribute Is 1 that can have sigle value. Eg.A persons 1D ,Subject,Mark ,Grade, etc. Multy valued attributes they can have many values. Eg.An office can have different officeno.A person have sevelan qualification. Cardinaler & optionarity Cardinality we already discussed. Relationship partiapation Praticipation in an entity relationship can either optional or mandertary Optional relationship Here the participation is optional if one entity occurrence doesnot require the corresponding entity occurrence, Entity. Mandertary relationship Relationship participation is mandertary (must) if one entity occurrence in a pertionship indicate that the minimum cardinality is one mandertary entity (10)

Relationship degree The relationship degree indicates the no.of associated entities or types with the relationship.There are 3 of relationship 1- unary (Reairsive) This is the relationship between the same entity class or type. Binary Relationship This is a relationship between 2 different entity types.

Ternary Relationship This is a relationship between 3 entity type & it is a simaltion rishipamong 3 entity types at the same time. Find 5 types of unary , Binary, ternary relationship Unary

Person

Talk

Child

Play

Customer

Deal s

Company

moni tor

Individual

Cons ult

Turnary 1. Note book

11

pen write

person

2.

Student

Text book Study

Teacher

(12) 3.

Company

Neeal

Staff

Customer

4.

Bank

Individual Need

Money

(13)

Binary 1. Person * Communicat Language

2

Individual

* eat

Food

3.

Country Have

*

District

4

Company Have

*

Staff

5, Student Study

*

Subject

Weak Entity Weak entity is one that needs e following conditions. 1- It cannot exist without e entity with witch have e relationship. 2- - It has a primary key that is partially or to fally dependent rom e parent entity relationship. Data models The evaluation of data model. Request for better data management has lot to do with several different models.That attempt to resolve & file system problems. The mager data models are 1- hierarchical model 2- - Network model 3- Relational model (14) Hierchocl model. This was e 1st database model interduced in mid 1960s & was based on e hierarchical model. Which assumes that all data relationship can be structured as hierarchical. The hierchieal data model uses a simple approche where e relationship between entities are always 1:M, This forms a simple one parent record abcord above it & many child record below it. * Customer Invoice * Invoice line

This digram illustrate how can make a hierrchi showing e relationship between customer invoice & Showing e relationship between customer more invoice line A Cuctomer can own one or more invoice lines. In hierarchical database model these file would be tide together to e physical point. Al Pointer is a physical address which identify where a rewrd can be found on a disribed. Disadvantages. - Large amount of data is stored - There is only one path to access data item - Difficult in the real world. - Difficult to perform ad-hos quries Network Model The Network model was created to represent complex data relationship more effectively than e hierarchical model. In e network model any record main have many immediate parent records as well as many child records. The network model is able to model greater number of real world situation. The network model build around e concepts of Set. In a network Data base model a relationship is called set (There are 1:M relationship between e owner & e member) Each set is composed of atleaset 2 record type whwre e owner = hierarchical parent record & e member record = The hierarchical child record. The differents between e night inelude a than one set. In other words a member can have several owners. (15)

Supplier m 1 Ubrary 1 m Library item m 1 Member 1 m 1 m Fine m Staff

m Reterence

Disadvantages - When compare to hierarchical structure the network strueture is more complicated by the set concepts - The hos quries are difficult to execute - The programmer should know the structure as well as set type which makes of the networking before processing is done. Relational Model Relational model user the table from of data collection, which is similler to hles. Tables within the relational model are known as relations. The records in the table are known as tuples in a relational model a relational or table consist of a series of rows the columns. A column is known as field where a row is to tuple STUDENT Field= Couumns Primary key Reg No 00106 Record/ Tuple/ Row Stude Course code HND Forein key Relations can be related to each other by sharing common entity caracteristies. The common link between student & payment tables enable us to connect the student to the payment table. Even though their details are stored in seperrate tables this is done by the use of payment key foreign key concept. This relational model contains 3 relation type normally 1-1:1 2-1:M 3- M:M A relational schema is a visul representation of the relational entities if attritutes of those entity & relationship. Advantages of relational model. Payment Amt 10,000 Reg No 00106 St Name Krishi St Add Col-06 Tel 2361742 Rg.No Paym

-

Essy to understand the concept users are able to interaet with the relational DB very easily since the familier to everyone. No physical pointers when compared to previous model. Easy setup & change can be done with minimal effort. Logical & physical independent.

Relational Data Base Model Keys 1- Primary Key- A Primary key is a attribute that can uniquely inentity given row. 2- Eg.In student table primary key can be taken as Reg No 2.- Composit key Aprimary key can a single attribute or commpersision of several attributes. Which can be used to uniquely identify a row or tuple. A unique identifier cabn be made up of mony attributes & that is call a composite primarykey. Eg: Reg No, Course Code Compsit key STUDENT Course code Reg No 0001 0001 0001 HND Multi C++

3- Foreign key An attribute (or combination of attributes) in one table whose values match the primary key in another table is called as foreign key. Foreign key is an attribute whose values match primary key values in the related table. STUDENT Reg No PAYMENT RegNo**

4- Candidate key We can identify a candidate key an attribute or combination of attributes that cmiquely identify a new in a table. A relation can have more than one candidate key among, among the one will be choosen as the primary key. Eg. Person ID 0001 0001 Project ID 01 02 Supervisor ID 0002 0002 Time spent 50 hrs

Candidate key 1 Person ID, Project ID 2- Person ID, supervisor ID Reg no 001 002 003 Name Krishi Krishi Renu Dob 22.10.96 19.11.96 22.10.96

Candidate key- 1- Reg No, DOB 2- Name , Reg No Relational Data Base Integrity Rules. 1. Entity Tntegrity Specially all primary key entries are unique & no part of a primary key may be null. This is to gurante that each entity will have a unique identity and ensure that foreign key values can be properly identified . Eg: No invoice can have a duplicate number as well as it cannot be null. All invoices uniquely identified by their invoice number. 11. Referential Integrity A foregn key may have either a null value as long as it is not part of its tables primery key or an entry that matches a primary key value in a table to Which it is related. This is to make. It possible for a attributes not to have a corresponding value but it should be impossible to have invalid entry . Eg: 1 M Teacher Student Teach TEACHER * Techer ID 001 002 TName Mary Ann Qualification Bsc Msc Course Dip.Ict HND

STUDENT StReg No PG 0001 PG 0002 Name Jack Tom DOB 25.02.80 08.11.86 Course HND Dip.ICT Teacher ID 001

1 1Supervision Superviser

*

Employee

Supervisor

SupID 001 002

SName Amal Kamal

Add Col-3 Col-4

Department 1A 2A

Tel 256742 232786

EmployeeNo 001 002

EName Rohan Shovon

Department 1A 2A

SupID Null 002

We dont need details about supervisor ID in a Employee table. 2. Bank Account Have Customer

Bank Account Bank Type Saving Current Intrest Rate 0.05 0.15

3-

Movie Movie

Star

MID 123 456

MName Star Moon

Language Tamil Tamil

Date 1/1/2000 22/2/2000

Star StarID 001 002 SName Ananda Nalanda Salary 10,000 150,00 MName Star Moon MID 123 456

We Need to identify about the movie ID in a star table iii.- Domain Integrity This concerns the values which are stored in a particular columns of a table. Relational Database operators (Relational Algebra) manupilahng tables and its contents using 8 relational operators. 1- Union 2- Intersect 3- Difference

4- Product 5- Select 6- Project 7- Join 8- Divide Union This combines all the rows from 2 tables. The tables must have the some attributes to be used in the union. Stno Name 1111 3333 5555 AA BB CC Add Col-3 Col-5 Col-7 Tel 234567 543211 2113441 + Stno 2222 4444 6666 Name DD EE FF Add Col-4 Col-6 Col-7 Tell 2234415 5432167 5762134

StNo

Name

Add

Tel

1111 3333 5555 2222 4444 6666

AA BB CC DD EE FF

Col-3 Col-5 Col-7 Col-4 Col-6 Col-7

234567 543211 21`13441 2234415 5432167 5762134

Intersect Intersct output only the rows apperar in both tables. In this case the tables must be union compatible to produce a valid result. You cannot use intersect if one of the ottributes is numeric & the other is character base. Difference Production Soap Noodles Oil Production Soap Morgarine Bread Product Noodles oil Margarine Bread

Product Product outputs all possible pairs of rows from 2 tables. If one table has four rows & the other table has 3 rows a product output is a list of components of 4x3=12 rows. Itcode 001 002 Itdes AA BB ItQty X 10 20 ItPrice 100 50 Itrecord 2 5 OrdNo 111 222

Itcode 001 001 002 002

Itdes AA AA BB BB

Itprice 10 10 20 20

Itrecord 100 50 100 50

Ord No 111 222 111 222

Select Select output values for nall rowsfound in table. Select can be used to list either all of the row values or it can output only. Those row values that match a specified criteria, In other words select outputs a horizontal substraction of a table.

Name 0001 0002 003

Item ID AA BB CC

Descriptin Shoe Noodles Bread

Qty 20 30 400 Output will be same as Item Table

Eg: 1- Select * from Item Eg: 2- Select * from Item where ItemID =0001 ItemID 0001 Name AA Description Qty Shoe 20

Project Project outputs all values for selected fields. In other words project outputs a vertical substraction of a table. Item 1 ItemID Name Description Qty 0001 AA Shoe 20 0002 BB Noodle 30 0003 CC Bread 40 Item 1 Item ID 0001 0002 0003 Name AA BB CC Description Shoe Noodle Bread Qty 20 30 40

Eg: 2- Project ItemID, Name from Item Item ID 0001 0002 0003 Item 2 Name AA BB CC

Divide Divide requires the use of one single column. Table 1 two column table A B IT ode 0001 0003 0007 0008 Divide A/B

ITcode 0001 0002 0003 0004

Description AA BB CC DD

Description AA CC

Eg:2 Code A A B B B

Loc No 5 7 5 6 3

Divide

Code A B

Loc No 5

Q Table-1 Reg No 0001 0002 0003 0004

Table - 2 Name AA BB CC DD Add Col-3 Col-4 Col-5 Col-6 Tel No 1111 2222 3333 4444 Reg No 0005 0006 0007 0008 0003 Name EF GH HI JK CC Add Nuge Maha Kirula Katta Col-5 Tel No 5555 6666 7777 8888 3333 Course Name HND NCC BIT ICT HND

1- Union Reg No 0001 0002 0003 0004 0005 0006 0007 0008 Reg No 0003 Name AA BB CC DD EF EH HI JK 2- Intersect Name Cc 3- Product Add Col-5 Add Col-3 Col-4 Col-5 Col-6 Nuge Maha Kirula Kotta Tel 3333 Tel No 1111 2222 3333 4444 5555 6666 7777 8888

Reg No

Reg No 0001 0001 0001 0001 0001 0002 0002 0002 0002 0002 0003 0003 0003 0003 0003 0004 0004 0004 0004 0004

Name AA AA AA AA AA BB BB BB BB BB CC CC CC CC CC DD DD DD DD DD

Add 1111 1111 1111 1111 1111 2222 2222 2222 2222 2222 3333 3333 3333 3333 3333 4444 4444 4444 4444 4444

TelNo 00005 00006 00007 00008 0003 0005 0006 0007 0008 0003 0005 0006 0007 0008 0003 0005 0006 0007 0008 0003

Name EF GH HI JK CC EF GH HI JK CC EF GH HI JK CE EF GH HI JK CC

Add Nuge Maha Kirula Kotta Col-5 Nuge Maha Kirula Katta Col-5 Nuge Maha Kirula Katta Col-5 Nuge Maha Kirula Katta Col-5

Tel No 5555 6666 7777 8888 3333 5555 6666 7777 8888 3333 5555 6666 7777 8888 3333 5555 6666 7777 8888 3333

Course Name HND NCC BIT ICT HND HND NCC BIT ICT HND HND NCC BIT ICT HND HND NCC BIT ICT HND

4-Difference Reg No 0005 0006 0007 0008

Name EF GH HI JK

Add Nuge Maha Kirula Katta

Tel No 5555 6666 7777 8888

5-Select Name from Table 2 Name EF GH HI JK

6-Select * (All) From Table 2 Where Add Col-5 Reg.No 0005 0006 0007 0008 Name EF GH HI JK Add Nuge Maha Kirula Katta Tel No 5555 6666 7777 8888

7- Project Table over Reg No, Name Reg No 0001 0002 0003 0004 Name AA BB CC DD

Join Join allows us to combine information from 2 or more tables. Join has power for relational Databases alliwing to use of independent tabales to linked by the common attributes. Join is a result of 3 stages. Stage 1- The product operation is applied stage 2- select operation is applied stage 3- A Prosect operation is applied. CUSTOMER Code CName 0001 A 0002 B 0003 C 0005 D AGENT CAdd Col-3 Col-4 Col-5 Col-6 ACode 2002 1001 2002 3003

A Code 1001 2002 3003

AArea Kirula Moha Kohu

ATel 2222 3333 4444

Product C code 0001 0001 0001 0002 Cname A A A B CAdd Col-3 Col-3 Col-3 Col-4 Acode 2002 2002 2002 1001 Acode 1001 2002 3003 1001 Aarea Kirula Maha Kohu Kirula ATel 2222 3333 4444 2222

0002 0002 0003 0003 0003 0004 0004 0004

B B C C C D D D

Col-4 Col-4 Col-5 Col-5 Col-5 Col-6 Col-6 Col-6

1001 1001 2002 2002 2002 3003 3003 3003

2002 3003 1001 2002 3003 1001 2002 3003

Moha Kohu Kirula Maha Kohu Kirula Maha Kohu

3333 4444 2222 3333 4444 2222 3333 4444

Per form the select condition where customer Acode = Agent Acod TABVLE 1 Ccode 0001 0002 0003 0004 Cname A B C D CAdd Col-3 Col-4 Col-5 Col-6 Acode 2002 1001 2002 3003 Acode 2002 1001 2002 3003 Area Maha Kirula Maha Kohu Atel 3333 2222 3333 4444

Project Table-1 over Acode Ccode CName 0001 A 2222 B 3333 C 4444 D Q. SNo 1111 2222 3333 4444

CAdd Col-3 Col-4 Col-5 Col-6

ACode 2002 1001 2002 3003

A Add Maha Kirula Maha Kohu

A Tel 3333 2222 3333 4444

SName AA BB CC DD

Course HND BIT VOW Multi Model

CAdd L11 L12 L11 L12

ACode L11 L12 L13 L15

AAdd AB BC DE FG

ATel A B C D

Product Sno 1111 1111 1111

Sname AA AA AA

Course HND HND HND

Lecid L11 L11 L11

Lecid L11 L12 L13

Lecname AB BC DE

LecQual A B C

1111 2222 2222 2222 2222 3333 3333 3333 3333 4444 4444 4444 4444

AA BB BB BB BB CC CC CC CC DD DD DD DD

HND BIT BIT BIT BIT VOW VOW VOW VOW Multi Mid Multi Mid Multi Mid Multi Mid

L11 L12 L12 L12 L12 L11 L11 L11 L11 L12 L12 L12 L12

L15 L11 L12 L13 L15 L11 L12 L13 L15 L11 L12 L13 L15

FG AB BC DE FG AB BC DE FG AB BC DE FG

D A B C D A B C D A B C D

Select Table .LECID = TABLE 2. LECID Table -3 SNO 1111 2222 3333 4444 SNAME AA BB CC DD COURSE HND BIT VOW Multi Mid CECID L11 L12 L11 L12 LECID L11 L12 L11 L12 LECNAME AB BC AB BC LECQUAC A B A B

Project Table 3 over LECID Sno Sname 1111 AA 2222 BB 3333 CC 4444 DD

Course HND BIT Vow Multi mid

Lecid L11 L12 L11 L12

Lecname AB BC AB BC

LecQual A B A B

Relational Analysis There are 2 ways of describing relational model. One method uses the tearms relation, attributes & tuples. The other method describes the model using more familier tearms that is table, field, record/ row. The relational model describes the data as a set of relations & table. Each relational table has a table name with set a attributes & each attribute having a unique name. The relation attributes are same as table columns or fields. Each relation has set of tuples/ records in a table.

There is a close crospondence between the ER model & Relational model. ER View Employee Project

Employee

*

Project Assignment

Project

EmpNo

ProNo

Employee Empno 1111 2222 3333 4444 Project Prono 001 002 003 004 Emp no

Name AA BB CC DD

Add Col-3 Col-4 Col-5 Col-6

Tel 1234 1235 1236 1237

Protitle

Duration 6 month 12 Month 8 Month 6 Month Pro no

Budget 20,000/= 80,000/= 35,000/= 25,000/= Time spent

In relational analysis the notation we would use is as follows. Employee (Empno) Name, Add, Tel No) Project ( Prono, Pro title, Duration, Budget) Project Assignment (prono) ,Empno, Time spent)

In this notation each relation is represented by one line. Each relation is represented by one line. Each line starts with the relational name & it is followed by the names of the relational attributes with in the brackets. The under- lined lined attribute 9s) is the relations,s keys (primary keeys) If you can organize data into set of such relation then we can gurantee a good DB design. A Number of normal forms have been defined to eliminate duplications in relations. They are commonly known as - 1st Normal Form (INF) - 2nd Normal Form (2NF) - 3rd Normal Form (3NF) - Boyce code Normal Form (BCNF) Normalization What is Normalization is a formal process for designg which attributes should be grouped together in a relation. Before proceding with the physical design we need a method to validate the logical design to this point. Normalization is a tool to validate & improve a logical design. It needs to satisfy some constrain. Such avoide unnecessary duplicateting data normalizations is a process of decomposing relations with anomalies to produce small well structured relations. Normal form are the ruled for structuring relations these were indroduced by E.F.Cood. Later Dr.Cood proposed another normal form call boyce could normal form (BCNF) Reason for normalization data structure to avoid data redundancy If a creting fields occurs in DB several times errors can are 3 types of anomalies exist 1- Insertion Anomaly 2- Deletion Anomaly 3- Updation Anomaly Insertion Anomaly Employee Empno E100 E150 E300 E350 Name Smith James Ann David Add Col-1 Col-2 Col-4 Kandy ------Empno C-ID E100 E100 E300 E100 D-100 D-1100 D-25 M-100 CName Dura DICS IAD BSC MCSE 5m 1yr 7yr 5yr Price 65000 10900 12500 15000

Supose that we need to add new course details to EMP_ course table where the PK is EMPNo, C-ID. To insert a new row the user must supply the values for both empNo & C_ID. This is an anomaly since the user should be able toenter the course details without supplying the EMPNo. Deletion Anomaly Suporse that the data for EMPNO E100 is deleted from the table this will result in loosing the information to the course name,IAD ,DICS,MCSE To voide these anomalies we can apply normalization process for the data. Unnrmalization form Tables with repeating groups Remove repeation 1st normal form Remove partial dependencies 2nd Normal form Remove transitive dependencies 3rd Normal form Remove remaining anomalies

Boyce code

Determinency & Dependency During normalization logical assouations between data items are identified & represented in the DB Design without any file maintenance anomalise. Identitifycation of logical associations between data items is important in normalization & DB design. Such associations between data items are calle determinency & dependency relationship . Eg: If we can difermine a specific valie of data item V If we know the data item determinant & Data item V set to be the dependent. Therw are 2 types of deteminency dependency relatitionship 1- Functional dependenty 2- Non functional dependency Functional dependency Normalization is based on the analysis of functional dependency, A functional dependency is a relationship between 2 attributes Eg: - A relation R has an attribute A for every value instance of A this is value of a uniquely determines the value of B for a give. Non Funtional dependency For a given data item if we can find several value of other data item then there is a non functional dependency between 2 data items Eg::- A student No can have several subjects related. Candidate key This is a determinant that can uniquy identified non key attribute are functionally dependent on part of the primary key. Eg:- Emp Course (EmpID,Name CName, CDuration) Course name & Duration depends on parts of the primary key that is empid the other part will be linked by CID. Moltivalued dependency This is a type of dependence of exist where there are atleast 3 attributes in a relation Eg: A,B,C,Relation

Each value of a there is a well defined value for B & well defined value for C But the set of values of B indepennent of C Eg:

Subject DBD DBD

Teacher AA BB

Tex Book Text 1 Text 1

Transitive dependency This is funtionnal dependency between 2 or more non key Attributes in a relation. Eg: Consider thye following relation Eg: Consider the following relation Sales (CNo,Name, Sales person, region) In the above relation each sales person is asigened to a unique region We can identify that functional dependence exist in tne sales relation i- Customer No Name Sales person Region region

2- Sales person

The region is functionally dependent on sales person & sales person is functionally depend on CNo, AS a result there are update anoma lies in the sales relation. Eg: We can,t add new sales person to a new region without a customer. If we delete a customer we,ll loose the sales person information aswell. 3- If a salesperson is transferred to a new region then the several rows must be changed. Rules of Data Normalization. 1- INF- Elemunate repeating groupsMake a Elemunate repreating groupsMake a separate table for each set of related attributes & give each table a primary key. 2- 3NF- Eleminate columns not depend on key- If attribute do not contribute to the description of key remove that to a separate table. 3- BCNF- This is Boyce codd Normal dependency between candidate key attribute sdperte them to different table.

Eliminate repesting Groups. In the original member list,each member name is followed by any database that the mwmber has experience with . some might know many, & others might not know any. To answer the

question, who knows DB2? we need to perform an awkward DB2? we need to perform an awkward scan of the list looking for reerence to BB2.This is inefficient & extremely. Untidy way to store information. Moving the known DB into a separate table helps a lot. Sepatating the repeating groups of DBS from the member information results in INF. The member ID in the Database table matches The PK in the 2 tables with a join operation. Now we can answer the question by looking in the database table for DB2 & getting the list of member.

Member list 1. Jojn SmithAccess DB2 FoxPro 2. Dave jones dBase,Clipper 3. Mike Beach 4. Jerry Miller DB2, Oracle 5. Benstuary Oracle, Sybase 6. Fred Flint Informix 7. Joe Blow 8. Greg brown Access , Missq Server 9. Doughope

DataBase Table Member table MID Name 1. Johnsmith 2. David Jones 3Mile Beach 4.Jerry Miller 5. Benstuart 6. Fred Flint 7. Joe Blow 8. Gre Brows 9. Dougltope DID MID Database 1 1 Access 2 1 DB2 3 1 Foxpro 4 2 dBase 5 2 clipper 6 4 DB2 7 4 Oracle 8 5 Oracle 9 5 Sybase 10 6 Infomix 11 8 Access 12 8 Mssqlserver

2. Eliminate Redundant Data

3. In the DB Table, & PK is made up of the member ID & Databade ID. This makes scnse for other attributes like where learned & Skill level attributes, since they will be different for every member/ database combination. But the database name depends only on the Database. The same database under different IDs. This is an update anomaly. Or suppose the last member listing a particular database leaves the group. This records will be removed from the system, & the data base will not be stored anywhere . This is a delete anomaly. To avoid these proplems, we need 2NF. To Achieve this ,separate the attributes deperching on both parts of the key from those depending only on the Database ID. This results in 2 tables: database Which gives the name for each database ID , and Member database which list the databases for each member. Now we can recassfica database in a single operation: look up The database ID in the Database table & change its name. The result will instantly be available throught the application. Database Table DID 1 2 3 4 5 6 7 8 9 10 11 12 MID Database 1 Access 1 DB2 1 Foxpro 2 dbase 2 Clipper 4 DB2 4 Oracle 5 Oracle 5 Sybase 6 Informax 8 Access 8 Mssqlserren Member Table MID Name 1 2 3 4 5 6 7 8 9 Member Database Table MID DID 1 1 1 2 1 3 2 4 2 5 4 2 4 6 5 6 5 7 6 8 8 1 8 9

Database Table DID Database 1 Access 2 DB2 3 FoxPro 5 dBase 6 Clipper 7 8 9

3 Elimnate columns not Dependent on key The member table satisfies INF It contains no repeating groups. It satisfies NF since it doest have a multivalued key. But the key is member ID,& the company name & location describe only a company, not a member. To achieve 3 NF, They must be moved in to a separate table. Since they describe a company, company code becomes the key for the new company table. The motivation for this is the same for 1NE : we want to avoid update & the IBM were currently stored no members from the previous design, there would be no record to its existence, even though 20 past members were from IBM:

Member Table MID Name Company 1 Johnsmrth ABC 2 Dave Jones MC1 3 Mike Beach IBM 4 Jerry Miller MCI 5 Ben Stuart AIC 6 Fred Flint ABC 7 Joe Blow RuNuts 8 Gerg Brown XYZ 9 Doug Hope IBM Com Loc Alabama Florida Delaware Flnda Nebraska Alahama Iowa Now York Delawre

Member Table MID Name 1 2 3 4 5 6 7 8 9

CID 1 2 3 2 4 1 5 6 3

Company table CID Name Loc 1 2 3 4 5 6

BCNF Boyce code Normal 1 form states mathematically that A relation R is said to be in BCNF if whenever X->A holds in R ,and A is not in X, then X is candidate key for R. BCNF covers very specific situations where 3 NF misses inter- dependencies between nonkey (but candidate key) attributes. Typically, any relation that is in 3 NF is also in BCNF if (a) there are multiple candidate keys, (b) the keys are composed of multiple attributes & (c) there are common attributes between the keys. Eg: Classcode 503 540 Class Enrolment Class Discription Mgt Info Maths Student No 0001 0003 0005 0002 0004 Name AA BB CC DD EE

1- Convert into INF 2- Repealing group exist within this data each class code have any no. of student init, so the student information makes the repeating groups. Data cannot be stored or processed in a database when it this from. What we musthave is one record containing all the data for each student. INF Class code 503 503 503 540 540 Class discription Mgt Info Mgt Info Mgt Info Maths Maths Student No 0001 0003 0005 0002 0004 Name AA BB CC DD EE

Enrolled in a class, there can be no gaps in the data when stored in a file The above table is in the INF no repeating data & no gaps. To take the table to the 2NF we would have to remove partial dependency. Class Information Student Info

Class code 503 540

Class Discription Mgt Info Maths

Class code 503 503 503 540 540

Student No 0001 0003 0005 0002 0004

Name AA BB CC CC EE

Non loss decomposition This is the process of transforming an unnormalized data set in to a fully normalized database. Relational/ Bracketing nitation If we are going to represent all normalized table within the system itll become very complexed, to avoid this , we can use the bracketingnotalion which can represent table by on statement. Itll start with a table name followed by the key (s) attributes are separated with commas as well as primary keys are underlined & the foreign key shown with** Emergence of the DBMS Enter the database management system the simple approohe open data file for input soon became fraught to be addressed. Computer system vendors needed to be able to support the critical needd of a growing & evolving market palce that supported the data processing needs of organization in all fields of human endevaour. Thhe first DBMS appeared during the 1960s at a time in human history where projects of momentous scale were being contemplated planned engineeed never before had such large data problems on the flow where identified & solutions were researched & develop ofen in real time The BBMS became necessare because the data was for more volatile than had earlier been planed & because there was still major limiting factors in the cost associated with data storage media. Data grew as a detailed transations by transation levels In the 1980s all the major vendors of hardware system large enough to support the e volving need of evolving computerized . record keeping systems of large oranizations bundle some from of DBMS with their system solution. The 1st DBMS species were thus very much vendor specific. IBM as usual let the field but there were a growing no of competes and cloneds whose DB solution offered varing entry points in to the bandwagon of computerized record keeping system. Through this time the specifc nature of the problems being resolved & were around from the perspective of IT management were evolving with the teahnology & the no of main frame & mini computer hardware vendors in creased & the no of pheriperal types & their vendors also

increased. The bundeling of DB operational sevices such as the ability to perform & Schedule data backup became routine. The IT operational environment at this time was categorized with a collection of system house keeping task with major emhersis on organization produetion DB backup. File recoranization & reindex ation of data were also standard but mainly manual system house keeping system task. Other tasks incude storage media arehival management of document & records or business perticilarly those relste to finance record keeping. MedPix medical image database uses helthy dos of foss MedPix is a sprawling online medicimages DB & diagnostic tool thats used around The wirld by radiologist, nurses, physicians & medical students & the whole system is powered by linux & open source software. MedPix is hosted by the US federal governments helth sciences university, the uniformed services university in Bethesda, Maryland. It s the brainchild of JamesG. Smirniotopoulos, M.D., a USU professor of Radiology, Neurology & Biomedical informatics & clinical sciences chair of its Department of Radiology & Radiological sciences. Exploiting the DBMS (Selecting a DBMS) The DSMS is a software application system that used to create, maintain & provide control to create, maintain & provide control to user database Eg: It provides a user interface to the data base system. Data management layer The ANS1 / SPARC architechture A DBMS works as an interface between application progras of the users & the data base. The American National standard institute standard planning & requirements comitte propose a 3 level architecture for this interface such as. External/ User view This level describes The user application programa viwew of the database several users can share the same view. Conceptual/ l0gical view This describes overall data requirementd as a detailed technology ideperdent speafication of the over all structure of a database.Entity- Relatiorship modeling & object Oriented modeling are 2 diffw\erent graphical notations used for presenting the comceptual notation used for presenting the conceptual mobbed. Physical / Internal view This describes in which the data is stored & data is accessed

The above 3 layer of architecture create independence of data at 2 level. Logical Data independence This refers to the ability of changes to the external Without Eg:- Phys any influence to the external cchema. Physical Data independence This refers to the ability of introduang changes to the physical sohema without any influence to the logical schema Eg: Chaging the storage structure of data without any influence to the wonceptial Schema.

Toolkit Interface

The DBMS Consist of 3 parts. 1. kernel (core- central point0 2. Interface 3. toolket Kernel. The DBMS kernel is the central ergine which handles the rain data management functions. Most DBMS are installed on thetop of some operating systems. Therefore DBMS needs to interact with operating systems to implement & access the data & system calalogece which are usually stored on harddisk. Therefore they interact with some of the operting system elements such as i.- file manager which translate between the data structures manipulated by the DBMS & the files on the harddisk. ii. acces mechanism the file manager doesnal manage the physical input & output of data directly . It interact with appropriate access mechnisum established for different physical structures.

System buffer The reading & writing data is normaly stored in the system buffer of the operating system. Functions of DBMS. - the DBMS must be able to create, retrieve, update, delete the data from the data base (CRUD Function) Data Integrity & data seairity Data Communication Conarrency cCntrol Data recovery & backups Query optimization Managing transaction Scheduline transaction Data Dictionary

Software components available in DBMS. DDL Compilers DDL Compiler

Data dictioraiyment

Query processor & runtime processor User Queries

Query

Runtime Processor

Pre- compiler Application programs

Precompiler

DML Compiler

Host lang

DML Compiler Host Language Compiler Data dictionary manager Data base features Data independence The srperation of data descriptions/ definitions from the applications that uses the data In a relational model the logical independence & physical independence is achieved by allowing changes to the physical storage or the application programs to take place without the user being aware of the changes. Data abstraction Data abstraction is the process by which a data base attempts to represent properties of objects in the real world. A database records. The relevant details required to support some organizational activities & it does not record all the details. The database is an abstraction of the real world. The abstraction level varies according to the different commercial databases. Data Security Data security is the process of protecting the data from external thrects. Data is key resource for organization. Loss of data , Privacy , integrity, availability etc are issues that cost an organization in financial terms. Data Interation

A data base should be a collection of data which at last has no redundant data. The aim is to hare 1 database storing 1 logical item of data in one place which can be accessed by a range of information systems. Eg: Replacing several departmental data base in one company with one data base system that can be nade accessible to several departments. Physical Data base design. The process of physical database design is to translate the logical description of data in to the technical specification for retieving & storing data, the goal is to create the required performance & ensure the database integrity, security & recoverability. Physical database design must be performed carefully since the decisions made during this stage have major impact on data accessibility , response time , security , user friendliness , ect. Physical database design doses not include implementing database but it produice a technical specification that programmers & others involved in information system construction There are 3 major inputs to physical database design Logical database structure that were developed during logical design. This structure may be expressed in itierarachical, Network, relational object oriented & definition for each attribute. User processing requirements that were identified during requirement definition including sign , when data is used & where response time , time , data ,security , backup & recovery , integrity & retention of data. Description of technology used for omplemnting the databases (DBMS0 There are several critical decisions that ll affect the integrity & performance of the applications system. These key decisions are - data volume analysis - transaction / usage analysis - Contro & security analiysis - Destribution analysis - Integrity analysis

Integrity Analysis Tntegrity defines the business rules or constraints that should apply on the data. There are 4 basic types of integrity rules. Entity Integrity This define that each and every entity should have a unique identifier a primary key and it cannot be null.

Referential Tntegrity This refers to the rules concerining the relationship between the 2 entity types (1e) by defining foregn key relational data model. Domain integrity This refers to the eonstraction valid values for attributes. We can achieve the domain integrity for an attribute by specifying the following. - The type of the attribute - The length - Format range - Allowed values - Null support or Not - Check constraints Triggerring Operations All the other business rules that protect the validity of an attribute value defined as a tribber A trigger is a program / proceduire which is aulmatically executed due to an event. An event can be either Insevetion , deletion / updation. User Rule: check with drawal amount