1
Database Management
2
Basic Concepts related to Database
► Database: a collection of related data.
► Data: known facts that can be recorded and have implicit meaning.
► Mini-world: some part of the real world about which data is stored in a database.
► Database management system (DBMS): a software package used to facilitate the creation and maintenance of a computerized database.
► Database system: the DBMS software together with the data itself.
3
Basics of Database
► A database is a collection of information applicable to a particular subject or purpose.
► It is a shared collection of logically related data (and a description of this data), designed to meet the information needs of an organization.
► Logically related data comprises the entities, attributes, and relationships of an organization's information.
► The data is typically grouped into specific categories of information, which are contained in data storage files called tables.
► It is a collection of non-redundant data which can be shared by different application systems.
4
Contd.
• It stresses the importance of multiple applications and data sharing.
• The spatial database becomes a common resource for an agency.
• A database implies separation of physical storage from use of the data by an application program, i.e. program/data independence:
  - The user, programmer, or application specialist need not know the details of how the data are stored.
  - Changes can be made to data without affecting other components of the system.
5
Contd.
► Data in a database is stored under various categories known as fields.
► When the information from each of these fields is combined as one unit, that unit is considered a single record.
► All of the records combined become a table.
► So tables organize data into rows called records and columns called fields.
► But the tables only store the raw data.
► In order to make use of these data, we need the following six objects:
6
Contd.
► Tables: store the data for the database.
► Queries: allow a user to select or interact with sections of data in the database of their own choosing.
► Forms: used in conjunction with tables, they allow the user to see a single record or allow for easier data entry.
► Reports: organize and summarize information so that it may be easily read and printed.
► Macros: programs within Access that allow users to automate certain tasks.
► Modules: pieces of Visual Basic programming which can be associated with a database or particular parts of a database.
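The field, record, table and query vocabulary above can be tried out in a few lines. The sketch below uses Python's built-in sqlite3 module rather than Access, and the table name, column names and sample values are all assumptions made up for illustration.

```python
import sqlite3

# In-memory database; table, column names and values are illustrative only.
conn = sqlite3.connect(":memory:")

# Each column is a field; the table groups records that share those fields.
conn.execute("CREATE TABLE employee (emp_num INTEGER, name TEXT, job TEXT)")

# Each row is one record: one value per field.
records = [
    (101, "Ann", "Engineer"),
    (102, "Raj", "Analyst"),
    (103, "Lee", "Engineer"),
]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", records)

# A query selects the section of the data the user is interested in.
engineers = conn.execute(
    "SELECT name FROM employee WHERE job = ? ORDER BY emp_num", ("Engineer",)
).fetchall()
print(engineers)
```

The table only stores the raw rows; the query is what turns them into an answer to a specific question.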
7
Database management: An Overview
► A database is a collection of non-redundant data shareable between different application systems. [Howe, D.R. 1989]
► A database management system (DBMS) is a sophisticated software package capable of handling a database stored in computer files.
► In other words, a DBMS is a data storage and retrieval system which permits data to be stored non-redundantly while making it appear to the user as if the data is well-integrated.
8
Contd.
► It is a software system that enables users to define, create and maintain the database, and which provides controlled access to this database.
► The DBMS provides the interface between the application programs and the data.
► The three main features of a DBMS that make it attractive are:
  - Centralized data management
  - Data independence
  - Systems integration
9
DATABASE MANAGEMENT SYSTEM
DBMS MANAGES DATA RESOURCES LIKE AN OPERATING SYSTEM MANAGES HARDWARE RESOURCES
[Figure: APPLICATION #1, APPLICATION #2 and APPLICATION #3 connected through the DBMS to a DATABASE CONTAINING CENTRALIZED SHARED DATA]
10
File Based System
► It is a collection of application programs that perform services for the end users (e.g. reports).
► Each program defines and manages its own data.
► There is no relationship among these files.
11
Limitations of file based system
► Separation and isolation of data: each program maintains its own set of data. Users of one program may be unaware of potentially useful data held by other programs.
► Duplication of data: the same data is held by different programs, wasting space and potentially producing different values and/or different formats for the same item.
► Data dependence: file structure is defined in the program code.
► Incompatible file formats: programs are written in different languages, and so cannot easily access each other's files.
► Fixed queries/proliferation of application programs: programs are written to satisfy particular functions. Any new requirement needs a new program.
12
Problems with the file system
► File systems require extensive programming in a third-generation language (3GL).
► As the number of files expands, system administration becomes difficult.
► Making changes in existing file structures is important and difficult.
► Security features to safeguard data are difficult to program and are usually omitted.
► Difficulty in pooling data creates islands of information.
13
Contd.
► Structural and data dependence
► Field definitions and naming conventions
► Data redundancy that leads to data inconsistency and data anomalies
14
Types of database files
► Flat files and spreadsheets:
  - All records in this database have the same number of "fields".
  - Individual records have different data in each field, with one field serving as a key to locate a particular record.
  - When the number of fields becomes large, a flat file is cumbersome to search.
  - Although this type of database is simple in its structure, expanding the number of fields usually entails reprogramming.
  - Additionally, adding new records is time consuming, particularly when there are numerous fields.
15
Contd.
► Hierarchical files:
  - These store data in more than one type of record.
  - This method is usually described as a "parent-child, one-to-many" relationship.
  - One field is key to all records, but data in one record does not have to be repeated in another.
  - This system allows records with similar attributes to be associated together.
  - The records are linked to each other by a key field in a hierarchy of files.
16
Contd.
► Relational files:
  - These connect different files or tables without using internal pointers or keys.
  - Instead, a common link of data is used to join or associate records.
  - The link is not hierarchical.
  - A "matrix of tables" is used to store the information.
  - As long as the tables have a common link, they may be combined by the user to form new inquiries and data output.
  - This is the most flexible system and is particularly suited to SQL (Structured Query Language).
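As a rough illustration of joining tables through a common link of data, the sketch below builds two small tables with Python's sqlite3 module and combines them with an SQL join. The department and employee tables, the shared dept_code column and all values are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_code TEXT, dept_name TEXT);
    CREATE TABLE employee   (emp_num INTEGER, name TEXT, dept_code TEXT);
    INSERT INTO department VALUES ('ENG', 'Engineering'), ('FIN', 'Finance');
    INSERT INTO employee   VALUES (101, 'Ann', 'ENG'), (102, 'Raj', 'FIN'),
                                  (103, 'Lee', 'ENG');
""")

# The two tables are associated only by the value they share (dept_code),
# not by any stored pointer from one record to another.
joined = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON e.dept_code = d.dept_code
    ORDER BY e.emp_num
""").fetchall()
print(joined)
```

Because the link is just matching data, the same tables could be combined in other ways later without restructuring the stored files.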
17
Major characteristics of database systems
► Self-contained nature of a database system: a DBMS catalog stores the description (meta-data) of the database. This allows the DBMS software to work with different databases.
► Insulation between program and data. This is provided through:
  - Data abstraction: a data model is used to hide storage details and present the user with a conceptual view of the database.
  - Program-data independence: allows changing data storage structures without having to change the DBMS access programs.
  - Program-operation independence: allows changing operation implementation without having to change the DBMS access programs.
► Support of multiple views of data
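Two of the characteristics above, the catalog and multiple views, can be observed directly in a small database. This is a minimal sketch using Python's sqlite3 module, where the catalog is exposed as the sqlite_master table; the employee table, the employee_directory view and the values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_num INTEGER, name TEXT, salary REAL);
    INSERT INTO employee VALUES (101, 'Ann', 5200.0), (102, 'Raj', 4800.0);
    -- A view for users who should not see salaries.
    CREATE VIEW employee_directory AS SELECT emp_num, name FROM employee;
""")

# The catalog holds the description (meta-data) of the database itself.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# The view presents the same stored data in a different, narrower form.
directory = conn.execute(
    "SELECT * FROM employee_directory ORDER BY emp_num"
).fetchall()
print(catalog)
print(directory)
```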
18
Other important characteristics of database technology
► Controlling data redundancy.
► Restricting unauthorized access to data.
► Providing persistent storage for program objects and data structures.
► Providing multiple interfaces to different classes of users.
► Representing complex relationships among data.
► Enforcing integrity constraints on the database.
► Providing backup and recovery services.
► Potential for enforcing standards.
► Flexibility to change data structures.
► Reduced application development time.
► Availability of up-to-date information.
► Economies of scale.
19
COMPONENTS OF DATABASE SYSTEMS
HARDWARE
► COMPUTER
► PERIPHERALS
SOFTWARE
► OPERATING SYSTEMS SOFTWARE
► DBMS SOFTWARE
► APPLICATIONS PROGRAMS AND UTILITIES SOFTWARE
PEOPLE
► SYSTEMS ADMINISTRATORS
► DATABASE ADMINISTRATORS (DBAs)
► DATABASE DESIGNERS
► SYSTEMS ANALYSTS AND PROGRAMMERS
► END USERS
PROCEDURES
► INSTRUCTIONS AND RULES THAT GOVERN THE DESIGN AND USE OF THE DATABASE SYSTEM
DATA
► COLLECTION OF FACTS STORED IN THE DATABASE WHICH ARE USED BY THE ORGANIZATION AND A DESCRIPTION OF THIS DATA CALLED THE SCHEMA
20
TYPES OF DATABASE SYSTEMS
NUMBER OF USERS
► SINGLE-USER
  - DESKTOP DATABASE
► MULTI-USER
  - WORKGROUP DATABASE
  - ENTERPRISE DATABASE
SCOPE
► DESKTOP
► WORKGROUP
► ENTERPRISE
LOCATION
► CENTRALIZED
► DISTRIBUTED
USE
► TRANSACTIONAL (PRODUCTION)
► DECISION SUPPORT
► DATA WAREHOUSE
21
MAJOR FUNCTIONS OF A DBMS
DATA DICTIONARY MANAGEMENT
DATA STORAGE MANAGEMENT
DATA TRANSFORMATION AND PRESENTATION
SECURITY MANAGEMENT
MULTI-USER ACCESS CONTROL
BACKUP AND RECOVERY MANAGEMENT
DATA INTEGRITY MANAGEMENT
DATABASE ACCESS LANGUAGES (DDL AND DML) AND APPLICATION PROGRAMMING INTERFACES
DATABASE COMMUNICATION INTERFACES
22
Advantages of using DBMS
► The three main features of a database management system that make it attractive are centralized data management, data independence, and systems integration.
► In a DBMS, all files are integrated into one system, thus reducing redundancies and making data management more efficient.
► In addition, a DBMS provides centralized control of the operational data.
23
Contd.
► Some of the advantages of data independence, integration and centralized control are:
  - Redundancies and inconsistencies can be reduced
  - Better service to the users
  - Flexibility of the system is improved
  - Cost of developing and maintaining systems is lower
  - Standards can be enforced
  - Security can be improved
  - Integrity can be improved
  - Enterprise requirements can be identified
  - Data model must be developed
24
Disadvantages of using DBMS
►Confidentiality, privacy and security
►Data quality
►Data integrity
►Enterprise vulnerability
►The cost of using a DBMS
25
EVOLUTION OF DATABASE SYSTEMS
► FLAT FILES - 1960S - 1980S
► HIERARCHICAL - 1970S - 1990S
► NETWORK - 1970S - 1990S
► RELATIONAL - 1980S - PRESENT
► OBJECT-ORIENTED - 1990S - PRESENT
► OBJECT-RELATIONAL - 1990S - PRESENT
► DATA WAREHOUSING - 1980S - PRESENT
► WEB-ENABLED - 1990S - PRESENT
26
Database Schema, Instance & State
► Schema
  - A description of a database, but not the database itself!
  - Corresponds to the type in a programming language, or the abstract data type
► Instance
  - An occurrence of a data item described in the schema
► Database state
  - The data in the database at a moment in time
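The difference between schema and state can be made concrete with a short sketch: inserting a record changes the state while the schema stays the same. The example assumes Python's sqlite3 module and a hypothetical project table; the helper functions schema() and state() are invented here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project (proj_num INTEGER, proj_name TEXT)")

def schema(connection):
    """Return the stored table definitions (the description of the data)."""
    return [row[0] for row in connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'")]

def state(connection):
    """Return the data held in the project table at this moment."""
    return connection.execute(
        "SELECT proj_num, proj_name FROM project ORDER BY proj_num").fetchall()

schema_before = schema(conn)
state_before = state(conn)   # the empty state, right after definition

# Adding one instance (a row) moves the database to a new state.
conn.execute("INSERT INTO project VALUES (15, 'Evergreen')")

schema_after = schema(conn)
state_after = state(conn)
print(schema_before == schema_after, state_before, state_after)
```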
27
DBMS languages
► DDL: Data Definition Language
  - These are used to define/change the structure of the database.
  - In other words, these are used to define the schema or describe the data (conceptual schema).
► DML: Data Manipulation Language
  - After the database is built, these are used to query the database, insert data, change data or delete data.
► DCL: Data Control Language
  - These are used to control user access.
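A hedged sketch of how the three language families look in SQL, run through Python's sqlite3 module. The employee table and its values are assumptions. SQLite does not implement DCL statements such as GRANT, so that statement is shown only as a string and is not executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and then change the structure of the database.
conn.execute("CREATE TABLE employee (emp_num INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE employee ADD COLUMN job TEXT")

# DML: insert, change, delete and query data.
conn.execute("INSERT INTO employee VALUES (101, 'Ann', 'Engineer')")
conn.execute("INSERT INTO employee VALUES (102, 'Raj', 'Analyst')")
conn.execute("UPDATE employee SET job = 'Manager' WHERE emp_num = 101")
conn.execute("DELETE FROM employee WHERE emp_num = 102")
remaining = conn.execute("SELECT emp_num, name, job FROM employee").fetchall()

# DCL: controls user access. SQLite has no GRANT/REVOKE, so this is
# illustrative text only; a multi-user DBMS would execute it.
dcl_example = "GRANT SELECT ON employee TO some_user"
print(remaining)
```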
28
Database Models
► A database model is a collection of logical constructs used to represent the data structure and the data relationships found within the database.
► There are two categories of database models:
  - Conceptual models: focus on the logical nature of the data representation. They are concerned with what is represented rather than how it is represented.
  - Implementation models: place the emphasis on how the data are represented in the database or on how the data structures are implemented.
29
Types of Relationships used in Database Models
► Generally, the following three types of relationships are used:
  - One-to-many relationships (1:M): a painter paints many different paintings, but each one of them is painted by only that painter. Painter (1) paints painting (M).
  - Many-to-many relationships (M:N): an employee might learn many job skills, and each job skill might be learned by many employees. Employee (M) learns skill (N).
  - One-to-one relationships (1:1): each store is managed by a single employee, and each store manager (employee) manages only a single store. Employee (1) manages store (1).
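One common way to express these three relationship types in a relational schema is sketched below with Python's sqlite3 module. The table and column names are assumptions, and the design shown (a foreign key for 1:M, a linking table for M:N, a UNIQUE foreign key for 1:1) is one conventional choice rather than the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- 1:M  painter (1) paints painting (M): foreign key on the many side.
    CREATE TABLE painter  (painter_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE painting (
        painting_id INTEGER PRIMARY KEY,
        title       TEXT,
        painter_id  INTEGER NOT NULL REFERENCES painter(painter_id)
    );

    -- M:N  employee (M) learns skill (N): a linking table holds the pairs.
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE skill    (skill_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee_skill (
        emp_id   INTEGER REFERENCES employee(emp_id),
        skill_id INTEGER REFERENCES skill(skill_id),
        PRIMARY KEY (emp_id, skill_id)
    );

    -- 1:1  employee (1) manages store (1): UNIQUE keeps it one-to-one.
    CREATE TABLE store (
        store_id   INTEGER PRIMARY KEY,
        manager_id INTEGER UNIQUE REFERENCES employee(emp_id)
    );

    INSERT INTO employee VALUES (1, 'Ann');
    INSERT INTO store VALUES (10, 1);
""")

# Trying to give the same employee a second store violates the 1:1 rule.
try:
    conn.execute("INSERT INTO store VALUES (11, 1)")
    second_store_allowed = True
except sqlite3.IntegrityError:
    second_store_allowed = False
print(second_store_allowed)
```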
30
Implementation Data Models
► There are three types of implementation database models:
  - Hierarchical database model
  - Network database model
  - Relational database model
31
A HIERARCHICAL STRUCTURE
32
Hierarchical Database Model
► Collection of records logically organized to conform to the upside-down tree (hierarchical) structure.
► The top layer is perceived as the parent of the segment directly beneath it.
► The segments below other segments are the children of the segment above them.
► A tree structure is represented as a hierarchical path on the computer's storage media.
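The parent and child segments described above can be modelled as a small tree. This is a sketch in plain Python, not how any particular hierarchical DBMS stores data; the Segment class, the hierarchical_paths function and the segment names are assumptions.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Segment:
    """One node of the upside-down tree; each child has exactly one parent."""
    name: str
    children: List["Segment"] = field(default_factory=list)

    def add(self, child: "Segment") -> "Segment":
        self.children.append(child)
        return child

def hierarchical_paths(segment, prefix=()):
    """List every root-to-segment path, visiting parents before children."""
    path = prefix + (segment.name,)
    paths = [path]
    for child in segment.children:
        paths.extend(hierarchical_paths(child, path))
    return paths

root = Segment("Company")
dept = root.add(Segment("Department"))
dept.add(Segment("Employee"))
root.add(Segment("Project"))

paths = hierarchical_paths(root)
for p in paths:
    print(" > ".join(p))
```

Reaching a child always means walking down from the root through its parent, which is why access in this model follows a fixed hierarchical path.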
33
Advantages and disadvantages of Hierarchical Database Model
Advantages
► Conceptual simplicity
► Database security
► Data independence
► Database integrity
► Efficiency dealing with a large database
Disadvantages
► Complex implementation
► Difficult to manage
► Lacks structural independence
► Applications programming and use complexity
► Implementation limitations
► Lack of standards
34
A NETWORK DATABASE MODEL
35
A Network Database Model
Basic structure
► Set: a relationship is called a set. Each set is composed of at least two record types: an owner (parent) record and a member (child) record.
► A set represents a 1:M relationship between the owner and the member.
36
Advantages & Disadvantages of a Network Database Model
Advantages
► Conceptual simplicity
► Handles more relationship types
► Data access flexibility
► Promotes database integrity
► Data independence
► Conformance to standards
Disadvantages
► System complexity
► Lack of structural independence
37
Relational database model
► RDBMS allows operations in a human logical environment.
► The relational database is perceived as a collection of tables.
► Each table consists of a series of row/column intersections.
► Tables (or relations) are related to each other by sharing a common entity characteristic.
► The relationship type is often shown in a relational schema.
► A table yields complete data and structural independence.
38
LINKING RELATIONAL TABLES
39
Advantages & Disadvantages of Relational Database Model
Advantages
► Structural independence
► Improved conceptual simplicity
► Easier database design, implementation, management, and use
► Ad hoc query capability (SQL)
► Powerful database management system
Disadvantages
► Substantial hardware and system software overhead
► Possibility of poor design and implementation
► Potential "islands of information" problems
40
Entity Relationship Modeling
► E-R models are normally represented in an entity relationship diagram (ERD).
► An entity is represented by a rectangle.
► Each entity is described by a set of attributes. An attribute describes a particular characteristic of the entity.
► A relationship is represented by a diamond connected to the related entities.
41
E-R Model Concepts
► Entities and attributes:
  - Entity: a thing that has independent existence -> employee
  - Attribute: describes something -> age, ssn, gender, name
  - Value: taken on by an attribute -> 25, 456-876-788, female, bart simpson
  - Composite attributes vs. atomic or simple attributes -> bart simpson vs. 45
  - Single-valued attributes vs. multivalued attributes -> age vs. college degrees
  - Derived attributes vs. stored attributes -> age vs. birth date (age is derived from birth date)
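The attribute kinds listed above map naturally onto a small data structure. The sketch below is one possible rendering in Python; the Name and Employee classes, the birth date and the degree values are assumptions, while the name and ssn values reuse the slide's own examples.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Name:
    """Composite attribute: made of simpler parts."""
    first: str
    last: str

@dataclass
class Employee:
    ssn: str                                           # simple, single-valued
    name: Name                                         # composite
    birth_date: date                                   # stored
    degrees: List[str] = field(default_factory=list)   # multivalued

    def age(self, today: date) -> int:
        """Derived attribute: computed from birth_date, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month, self.birth_date.day)
        return today.year - self.birth_date.year - (0 if had_birthday else 1)

bart = Employee("456-876-788", Name("bart", "simpson"),
                date(2000, 6, 15), ["BSc", "MSc"])
age_before_birthday = bart.age(date(2025, 6, 14))
age_on_birthday = bart.age(date(2025, 6, 15))
print(age_before_birthday, age_on_birthday)
```

Keeping age as a computed value avoids storing a number that would silently go out of date.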
42
Contd.
► Entity types, value sets and key attributes
  - Entity type: defines the structure of a set of entities that have the same attributes
  - Entity: an instance of an entity type
  - Entity set, collections: group of entities
  - Key, uniqueness
  - Combination to create key
  - Value sets (domains)
43
NOTATIONS OF E-R DIAGRAM
► ENTITY TYPE
► ATTRIBUTE
► KEY ATTRIBUTE
► MULTIVALUED ATTRIBUTE
► COMPOSITE ATTRIBUTE
► DERIVED ATTRIBUTE
44
DEGREE OF A RELATIONSHIP: BINARY, TERNARY, UNARY
[Figure: TERNARY RELATIONSHIP: SUPPLY (QUANTITY) connecting SUPPLIER (SNAME), PART (PARTNO) and PROJECT (PROJNAME)]
[Figure: UNARY RELATIONSHIP: MANAGES connecting EMPLOYEE (SSN) to itself]
45
46
47
Advantages & Disadvantages of Entity Relationship Data Model
Advantages
► Exceptional conceptual simplicity
► Visual representation
► Effective communication tool
► Integrated with the relational database model
Disadvantages
► Limited constraint representation
► Limited relationship representation
► No data manipulation language
► Loss of information content
48
Normalization
► Normalization is a process for assigning attributes to entities.
► It reduces data redundancies and helps eliminate the data anomalies.
► Normalization works through a series of stages called normal forms:
  - First normal form (1NF)
  - Second normal form (2NF)
  - Third normal form (3NF)
  - Fourth normal form (4NF)
► The highest level of normalization is not always desirable.
49
Contd.
► It is the process of efficiently organizing data in a database.
► There are two goals of the normalization process:
  - Eliminate redundant data (for example, storing the same data in more than one table), and
  - Ensure data dependencies make sense (only storing related data in a table).
► These goals help to reduce the amount of space a database consumes and ensure that data is logically stored.
50
Contd.
► The database community has developed a series of guidelines for ensuring that databases are normalized.
► These are referred to as normal forms and are numbered from one (the lowest form of normalization, referred to as first normal form or 1NF) through five (fifth normal form or 5NF).
► In practical applications, we often see 1NF, 2NF, and 3NF along with the occasional 4NF.
► Fifth normal form is very rarely seen.
► All these normalization guidelines are cumulative. For a database to be in 2NF, it must first fulfill all the criteria of a 1NF database.
51
Example for Normalization
Case of a construction company
► Building project: project number, name, employees assigned to the project.
► Employee: employee number, name, job classification.
► The company charges its clients by billing the hours spent on each project. The hourly billing rate is dependent on the employee's position.
► Periodically, a report is generated.
► The table whose contents correspond to the reporting requirements is shown in table 5.1.
52
Scenario
A few employees work on one project.
Project Num : 15
Project Name : Evergreen
Employee Num : 101, 102, 103, 105
53
54
TABLE STRUCTURE MATCHES THE REPORT FORMAT
55
Problems with the report format
The project number is intended to be a primary key, but it contains nulls.
The table displays data redundancies.
The table entries invite data inconsistencies.
The data redundancies yield the following anomalies:
► Update anomalies.
► Addition anomalies.
► Deletion anomalies.
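The update and deletion anomalies can be made concrete with a small sketch. It assumes a flat report table held in an in-memory SQLite database through Python's sqlite3 module. Project 15 (Evergreen) and the employee numbers come from the scenario; the table name `report`, the job classes, and the hourly rates are made-up values for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE report (
        proj_num INTEGER, proj_name TEXT,
        emp_num INTEGER, job_class TEXT, chg_hour REAL
    )
""")
# Project 15 / Evergreen and the employee numbers come from the scenario;
# job classes and hourly rates are hypothetical.
conn.executemany(
    "INSERT INTO report VALUES (?, ?, ?, ?, ?)",
    [
        (15, "Evergreen", 101, "Engineer", 80.0),
        (15, "Evergreen", 102, "Analyst", 95.0),
        (15, "Evergreen", 103, "Engineer", 80.0),
        (15, "Evergreen", 105, "Analyst", 95.0),
    ],
)

# Update anomaly: the Engineer rate is stored once per employee row, so
# changing it in only one row leaves two conflicting rates for one job class.
conn.execute("UPDATE report SET chg_hour = 85.0 WHERE emp_num = 101")
engineer_rates = sorted(
    rate for (rate,) in conn.execute(
        "SELECT DISTINCT chg_hour FROM report WHERE job_class = 'Engineer'"
    )
)

# Deletion anomaly: removing the only Analyst rows also removes the only
# record of what an Analyst is billed per hour.
conn.execute("DELETE FROM report WHERE emp_num IN (102, 105)")
analyst_rows = conn.execute(
    "SELECT COUNT(*) FROM report WHERE job_class = 'Analyst'"
).fetchone()[0]

print(engineer_rates, analyst_rows)
```

An addition anomaly shows up the same way: a new job class and its rate cannot be recorded in this table until some employee with that class is assigned to a project.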
56
Solving the problem
► Conversion to first normal form
A relational table must not contain repeating groups.
Repeating groups can be eliminated by adding the appropriate entry in at least the primary key column(s).
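Filling in the missing key entries can be sketched in a few lines of Python. This is only an illustration: project 15 (Evergreen) and the employee numbers come from the scenario, while the hours values and the helper name `fill_repeating_groups` are hypothetical.

```python
# Rows as they appear in the report layout: the project number and name
# are written only on the first line of each group (None marks a blank).
# The hours values are hypothetical.
report_rows = [
    (15, "Evergreen", 101, 20.0),
    (None, None, 102, 12.5),
    (None, None, 103, 8.0),
    (None, None, 105, 30.0),
]


def fill_repeating_groups(rows):
    """Copy the last seen project number and name into blank rows."""
    filled = []
    current_num, current_name = None, None
    for proj_num, proj_name, emp_num, hours in rows:
        if proj_num is not None:
            current_num, current_name = proj_num, proj_name
        filled.append((current_num, current_name, emp_num, hours))
    return filled


rows_1nf = fill_repeating_groups(report_rows)
# Every row now carries a complete (proj_num, emp_num) key.
keys = [(row[0], row[2]) for row in rows_1nf]
print(keys)
```

After this step no key column contains a null, so the combination of project number and employee number can serve as the primary key.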
57
TABLE BEFORE NORMALIZATION
58
TABLE AFTER NORMALIZATION TO 1NF
59
FIRST NORMAL FORM (1NF)
► First normal form (1NF) sets the very basic rules for an organized database.
► 1NF definition
The term first normal form (1NF) describes the tabular format in which:
► All the key attributes are defined.
► There are no repeating groups in the table.
► All attributes are dependent on the primary key.
► To achieve this, eliminate duplicative columns from the same table.
► Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).
60
Contd.
► The first rule dictates that we must not duplicate data within the same row of a table.
► Within the database community, this concept is referred to as the atomicity of a table, and the tables that comply with this rule are said to be atomic.
61
Dependency Diagram for the example
The primary key components are bold, underlined, and shaded in a different color.
The arrows above the entities indicate all desirable dependencies, i.e., dependencies that are based on the PK.
The arrows below the dependency diagram indicate less desirable dependencies -- partial dependencies and transitive dependencies.
62
SECOND NORMAL FORM (2NF)
► Second normal form (2NF) further addresses the concept of removing duplicative data.
► Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
► Create relationships between these new tables and their predecessors through the use of foreign keys.
► These rules can be summarized in a simple statement: 2NF attempts to reduce the amount of redundant data in a table by extracting it, placing it in new table(s), and creating relationships between those tables.
63
Contd.
► Conversion to second normal form
Starting with the 1NF format, the database can be converted into the 2NF format by
► Writing each key component on a separate line, and then writing the original key on the last line and
► Writing the dependent attributes after each new key.
Project (proj_num, proj_name)
Employee (emp_num, emp_name, job_class, chg_hour)
Assign (proj_num, emp_num, hours)
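The three relations above can be sketched as SQLite tables populated from a 1NF table. This is a minimal illustration, not the deck's own implementation: the source table name `report_1nf`, the second project (18, Amber Wave), the employee names, job classes, rates, and hours are all hypothetical, and only project 15 (Evergreen) and the employee numbers come from the scenario.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 1NF table: one row per (project, employee) pair, with everything repeated.
conn.execute("""
    CREATE TABLE report_1nf (
        proj_num INTEGER, proj_name TEXT,
        emp_num INTEGER, emp_name TEXT, job_class TEXT, chg_hour REAL,
        hours REAL
    )
""")
conn.executemany(
    "INSERT INTO report_1nf VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (15, "Evergreen", 101, "Employee A", "Engineer", 80.0, 20.0),
        (15, "Evergreen", 102, "Employee B", "Analyst", 95.0, 12.5),
        (15, "Evergreen", 103, "Employee C", "Engineer", 80.0, 8.0),
        (15, "Evergreen", 105, "Employee D", "Analyst", 95.0, 30.0),
        (18, "Amber Wave", 101, "Employee A", "Engineer", 80.0, 5.0),
    ],
)

# One table per key component, plus one for the original composite key.
conn.executescript("""
    CREATE TABLE project (
        proj_num  INTEGER PRIMARY KEY,
        proj_name TEXT
    );
    CREATE TABLE employee (
        emp_num   INTEGER PRIMARY KEY,
        emp_name  TEXT,
        job_class TEXT,
        chg_hour  REAL
    );
    CREATE TABLE assign (
        proj_num INTEGER REFERENCES project (proj_num),
        emp_num  INTEGER REFERENCES employee (emp_num),
        hours    REAL,
        PRIMARY KEY (proj_num, emp_num)
    );
    INSERT INTO project
        SELECT DISTINCT proj_num, proj_name FROM report_1nf;
    INSERT INTO employee
        SELECT DISTINCT emp_num, emp_name, job_class, chg_hour FROM report_1nf;
    INSERT INTO assign
        SELECT proj_num, emp_num, hours FROM report_1nf;
""")

counts = tuple(
    conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("project", "employee", "assign")
)
print(counts)
```

Each project name and each employee's details are now stored once, and only the hours remain keyed by the full (proj_num, emp_num) pair.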
64
Contd.
A table is in 2NF if:
► It is in 1NF and
► It includes no partial dependencies; that is, no attribute is dependent on only a portion of the primary key.
(It is still possible for a table in 2NF to exhibit transitive dependency; that is, one or more attributes may be functionally dependent on non-key attributes.)
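A partial dependency can be spotted in sample data with a small helper. The sketch below is an illustration only: the function name `determines` and all row values other than project 15 (Evergreen) and the employee numbers are hypothetical. A data sample can refute a dependency but cannot prove that it holds in general; that still depends on the business rules.

```python
def determines(rows, lhs, rhs):
    """Return True if, in this sample, equal lhs values always imply equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        val = tuple(row[col] for col in rhs)
        # setdefault stores val the first time a key is seen and otherwise
        # returns the value recorded earlier for the same key.
        if seen.setdefault(key, val) != val:
            return False
    return True


rows = [
    {"proj_num": 15, "emp_num": 101, "proj_name": "Evergreen", "hours": 20.0},
    {"proj_num": 15, "emp_num": 102, "proj_name": "Evergreen", "hours": 12.5},
    {"proj_num": 18, "emp_num": 101, "proj_name": "Amber Wave", "hours": 5.0},
]

# proj_name depends on proj_num alone: a partial dependency on the composite key.
name_by_project = determines(rows, ["proj_num"], ["proj_name"])
# hours needs the whole key (proj_num, emp_num).
hours_by_project = determines(rows, ["proj_num"], ["hours"])
hours_by_employee = determines(rows, ["emp_num"], ["hours"])
hours_by_full_key = determines(rows, ["proj_num", "emp_num"], ["hours"])
print(name_by_project, hours_by_project, hours_by_employee, hours_by_full_key)
```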
65
DEPENDENCY DIAGRAM FOR 2NF
66
THIRD & FOURTH NORMAL FORM (3NF & 4NF)
► Third normal form (3NF) goes one large step further.
► Remove columns that are not dependent upon the primary key.
► 3NF definition
A table is in 3NF if
► It is in 2NF and
► It contains no transitive dependencies.
► Finally, fourth normal form (4NF) has one requirement:
A relation is in 4NF if it has no multi-valued dependencies.
67
CONVERSION TO 3NF IN THE EXAMPLE
► Create a separate table with attributes in a transitive functional dependence relationship.
PROJECT (PROJ_NUM, PROJ_NAME)
ASSIGN (PROJ_NUM, EMP_NUM, HOURS)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)
JOB (JOB_CLASS, CHG_HOUR)
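The split of EMPLOYEE into EMPLOYEE and JOB can be sketched with SQLite as follows. The starting table name `employee_2nf`, the employee names, job classes, and rates are hypothetical; only the employee numbers come from the scenario.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 2NF employee table: chg_hour depends on job_class, not on emp_num.
    CREATE TABLE employee_2nf (
        emp_num INTEGER PRIMARY KEY, emp_name TEXT,
        job_class TEXT, chg_hour REAL
    );
    INSERT INTO employee_2nf VALUES
        (101, 'Employee A', 'Engineer', 80.0),
        (102, 'Employee B', 'Analyst',  95.0),
        (103, 'Employee C', 'Engineer', 80.0),
        (105, 'Employee D', 'Analyst',  95.0);

    -- 3NF: move the transitively dependent attribute into its own table.
    CREATE TABLE job (
        job_class TEXT PRIMARY KEY,
        chg_hour  REAL
    );
    CREATE TABLE employee (
        emp_num   INTEGER PRIMARY KEY,
        emp_name  TEXT,
        job_class TEXT REFERENCES job (job_class)
    );
    INSERT INTO job SELECT DISTINCT job_class, chg_hour FROM employee_2nf;
    INSERT INTO employee SELECT emp_num, emp_name, job_class FROM employee_2nf;
""")

job_count = conn.execute("SELECT COUNT(*) FROM job").fetchone()[0]

# The rate now lives in exactly one row, so one UPDATE changes it for
# every employee in that job class.
conn.execute("UPDATE job SET chg_hour = 85.0 WHERE job_class = 'Engineer'")
rates = conn.execute("""
    SELECT e.emp_num, j.chg_hour
    FROM employee AS e JOIN job AS j ON j.job_class = e.job_class
    WHERE e.job_class = 'Engineer'
    ORDER BY e.emp_num
""").fetchall()
print(job_count, rates)
```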
68
69
CONVERSION TO 4NF
70
Structured Query Language (SQL)
► SQL is used to define, manipulate, and control data in relational databases.
► SQL statements fall into the following three main categories according to their function:
- Data Definition Language (DDL) - define or change database structure(s): Create, Alter, Drop
- Data Manipulation Language (DML) - select or change data: Insert, Update, Delete, Select
- Data Control Language (DCL) - control user access (e.g., Grant, Revoke) and transactions (e.g., Commit)
71
Data Definition Language (DDL)
Creating table
► Empty tables are constructed using the create table statement.
► Data must be entered later using insert.
Create table s (
    sno char(5),
    Sname char(20),
    Status decimal(3),
    City char(15),
    Primary key (sno) )
► Columns which are defined as primary keys will never have two rows with the same key value.
► Primary keys may consist of more than one column (values unique in combination).
► A table name and unique column names must be specified.
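The primary key guarantee can be checked with a short sketch that runs the create table statement in an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite and the supplier rows are assumptions made for illustration; other database systems report the violation with a different error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE s (
        sno    CHAR(5),
        sname  CHAR(20),
        status DECIMAL(3),
        city   CHAR(15),
        PRIMARY KEY (sno)
    )
""")

# The table starts empty; rows are added afterwards with INSERT.
conn.execute("INSERT INTO s VALUES ('S1', 'Supplier One', 20, 'London')")

# A second row with the same key value is rejected by the primary key.
try:
    conn.execute("INSERT INTO s VALUES ('S1', 'Another Supplier', 10, 'Paris')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM s").fetchone()[0]
print(duplicate_rejected, row_count)
```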
72
Contd.
Alter table:
► This is used to add or remove columns or constraints.
ALTER TABLE categoriesnopix
DROP COLUMN shortdesc;
Drop table:
► Use drop objectname to remove from the database any object that was created.
DROP TABLE categoriesnopix;
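A runnable sketch of altering and dropping a table, assuming SQLite via Python's sqlite3 module. It adds the shortdesc column rather than dropping it, because DROP COLUMN is only available in newer SQLite releases while ADD COLUMN is long-standing. The columns `category_id` and `name` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categoriesnopix (category_id INTEGER PRIMARY KEY, name TEXT)"
)

# ALTER TABLE changes the structure of an existing table.
conn.execute("ALTER TABLE categoriesnopix ADD COLUMN shortdesc TEXT")
# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(categoriesnopix)")]

# DROP TABLE removes the table definition together with all of its rows.
conn.execute("DROP TABLE categoriesnopix")
remaining = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master "
    "WHERE type = 'table' AND name = 'categoriesnopix'"
).fetchone()[0]
print(columns, remaining)
```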
73
Data Manipulation Language (DML)
► There are four basic SQL data manipulation operations.
SELECT - RETRIEVES DATA
INSERT - ADD A NEW ROW
UPDATE - CHANGE VALUES IN EXISTING RECORDS
DELETE - REMOVE ROW(S)
74
Insert Command
► Use the Insert command to enter data into a table.
► You may insert one row at a time, or select several rows from an existing table and insert them all at once.
INSERT
INTO SP ( SNO, PNO, QTY )
VALUES ( 'S4', 'P1', 1000 )
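Both ways of inserting can be sketched with SQLite through Python's sqlite3 module. The single-row insert uses the values from the statement above; the staging table `new_shipments` and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sp (
        sno CHAR(5), pno CHAR(5), qty INTEGER,
        PRIMARY KEY (sno, pno)
    );
    -- Hypothetical existing table to copy rows from.
    CREATE TABLE new_shipments (sno CHAR(5), pno CHAR(5), qty INTEGER);
    INSERT INTO new_shipments VALUES
        ('S1', 'P2', 300), ('S2', 'P1', 200), ('S3', 'P4', 0);
""")

# One row at a time, with the values passed as parameters.
conn.execute(
    "INSERT INTO sp (sno, pno, qty) VALUES (?, ?, ?)", ("S4", "P1", 1000)
)

# Several rows at once, selected from an existing table.
conn.execute("""
    INSERT INTO sp (sno, pno, qty)
    SELECT sno, pno, qty FROM new_shipments WHERE qty > 0
""")

sp_rows = conn.execute("SELECT sno, pno, qty FROM sp ORDER BY sno, pno").fetchall()
print(sp_rows)
```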
75
Update & Delete Commands
► Use the update statement to change data values in one or more columns, usually based on specific criteria.
UPDATE MySuppliers
SET Region = 'UK'
WHERE City IN ('London', 'Manchester');
► The delete command is used to remove whole rows from a table. Use with caution!
DELETE FROM Personnel
WHERE Department = 'Chemistry';
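A small sketch of both commands, assuming SQLite via Python's sqlite3 module. It uses the portable `DELETE FROM` form with single-quoted string literals, which SQLite accepts. The `Name` columns and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MySuppliers (Name TEXT, City TEXT, Region TEXT);
    INSERT INTO MySuppliers VALUES
        ('A', 'London', NULL), ('B', 'Manchester', NULL), ('C', 'Paris', NULL);
    CREATE TABLE Personnel (Name TEXT, Department TEXT);
    INSERT INTO Personnel VALUES
        ('X', 'Chemistry'), ('Y', 'Physics'), ('Z', 'Chemistry');
""")

# Only rows matching the WHERE clause are changed.
updated = conn.execute(
    "UPDATE MySuppliers SET Region = 'UK' "
    "WHERE City IN ('London', 'Manchester')"
).rowcount

# Without a WHERE clause, DELETE would remove every row in the table.
deleted = conn.execute(
    "DELETE FROM Personnel WHERE Department = 'Chemistry'"
).rowcount

remaining = [name for (name,) in conn.execute("SELECT Name FROM Personnel")]
print(updated, deleted, remaining)
```

Running the same WHERE clause in a SELECT first is a simple way to preview which rows an UPDATE or DELETE would affect.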
76
Select Command
► SELECT has the general form SELECT-FROM-WHERE.
► The result is another (new) table.
► If DISTINCT is used in SELECT, duplicate rows are removed from the result.
► When WHERE is missing from the query, all rows of the table are returned.
► SELECT * is used to select the entire row (all columns).
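The points above can be tried out with a short sketch, assuming SQLite via Python's sqlite3 module and a supplier table `s` with the columns sno, sname, status, and city. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE s (
        sno CHAR(5) PRIMARY KEY, sname CHAR(20),
        status DECIMAL(3), city CHAR(15)
    );
    INSERT INTO s VALUES
        ('S1', 'Supplier One',   20, 'London'),
        ('S2', 'Supplier Two',   10, 'Paris'),
        ('S3', 'Supplier Three', 30, 'Paris');
""")

# No WHERE clause: every row is returned; * selects all columns.
all_rows = conn.execute("SELECT * FROM s").fetchall()

# WHERE keeps only the rows that satisfy the condition.
paris = conn.execute(
    "SELECT sno FROM s WHERE city = 'Paris' ORDER BY sno"
).fetchall()

# DISTINCT removes duplicate rows from the result.
cities = conn.execute("SELECT DISTINCT city FROM s ORDER BY city").fetchall()
print(len(all_rows), paris, cities)
```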