Database Basics CPSC 4670/5670
Jan 13, 2016
Purpose of a Database
• The purpose of a database is to keep track of things
• Unlike a list or spreadsheet, a database may store information that is more complicated than a simple list
Database Systems
• The four components of a database system are:
  • Users
  • Database Application
  • Database Management System (DBMS)
  • Database
Users, Database
• A user of a database system will
  • Use a database application to track things
  • Use forms to enter, read, delete and query data
  • Produce reports
• A database is a self-describing collection of related records
  • Self-describing: the database itself contains the definition of its structure
  • Metadata is data describing the structure of the database data
  • Tables within a relational database are related to each other
Database Management System (DBMS)
• A database management system (DBMS) serves as an intermediary between database applications and the database
• The DBMS manages and controls database activities
• The DBMS creates, processes and administers the databases it controls
  • Create databases
  • Create tables
  • Create supporting structures
  • Read database data
  • Modify database data (insert, update, delete)
  • Maintain database structures
  • Enforce rules, e.g., referential integrity constraints
  • Control concurrency
  • Provide security
  • Perform backup and recovery
Database Applications
• A database application is a set of one or more computer programs that serves as an intermediary between the user and the DBMS
• A database application will:
  • Create and process forms
  • Process user queries
  • Create and process reports
  • Execute application logic
  • Control database applications
Desktop Database Systems
• Desktop database systems typically:
  • Have one application
  • Have only a few tables
  • Are simple in design
  • Involve only one computer
  • Support one user at a time
Organizational Database Systems
• Organizational database systems typically:
  • Support several users simultaneously
  • Include more than one application
  • Involve multiple computers
  • Are complex in design
  • Have many tables
  • Have many databases
Commercial DBMS Products
• Example Desktop DBMS Products
  • Microsoft Access
• Example Organizational DBMS Products
  • Oracle’s Oracle
  • Microsoft’s SQL Server
  • IBM’s DB2
  • InterSystems Caché® (http://www.intersystems.com/cache/)
The Relational Model
Relational Databases
• Relational databases are designed to address many of the information complexity issues
• A relational database stores information in tables. Each informational topic is stored in its own table.
• In essence, a relational database breaks up a list into several parts, one part for each theme in the list.
• A Project List would be divided into a CUSTOMER Table, a PROJECT Table, and a PROJECT_MANAGER Table
Entity
• An entity is something of importance to a user that needs to be represented in a database
• An entity represents one theme or topic
Relation
• A relation is a two-dimensional table that has specific characteristics
• The table dimensions, like a matrix, consist of rows and columns
Characteristics of a Relation
• Rows contain data about an entity
• Columns contain data about attributes of the entity
• Cells of the table hold a single value
• All entries in a column are of the same kind
• Each column has a unique name
• The order of the columns is unimportant
• The order of the rows is unimportant
• No two rows may be identical
A Sample Relation
EmployeeNumber  FirstName  LastName
100             Mary       Abermany
101             Jerry      Caldera
104             Alea       Copley
107             Murugan    Jacksoni
A Nonrelation Example
EmployeeNumber  Phone               LastName
100             335-6421, 454-9744  Abermany
101             215-7789            Caldera
104             610-9850            Copley
107             299-9090            Jacksoni
Cells of the table hold multiple values
A Nonrelation Example
EmployeeNumber  Phone     LastName
100             335-6421  Abermany
101             215-7789  Caldera
104             610-9850  Copley
100             335-6421  Abermany
107             299-9090  Jacksoni
No two rows may be identical
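The two rules violated by the nonrelation examples can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the function name `relation_violations` and the use of a Python list to stand for a multi-valued cell are assumptions made for illustration.

```python
def relation_violations(rows):
    """Return the reasons (if any) why `rows` does not qualify as a relation."""
    problems = []
    # Rule: cells of the table hold a single value.
    if any(isinstance(cell, (list, tuple, set)) for row in rows for cell in row):
        problems.append("multi-valued cell")
    # Rule: no two rows may be identical.
    seen = set()
    for row in rows:
        key = repr(row)  # repr() lets us compare rows that contain lists
        if key in seen:
            problems.append("duplicate row")
            break
        seen.add(key)
    return problems


sample = [(100, "Mary", "Abermany"), (101, "Jerry", "Caldera")]
multi_valued = [(100, ["335-6421", "454-9744"], "Abermany"),
                (101, "215-7789", "Caldera")]
duplicated = [(100, "335-6421", "Abermany"),
              (101, "215-7789", "Caldera"),
              (100, "335-6421", "Abermany")]
```

Only `sample` passes both checks; the other two lists reproduce the two nonrelation examples.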
Terminology
Synonyms…
Table              Row     Column
File or Data file  Record  Field
Relation           Tuple   Attribute
A Key and Uniqueness of Keys
• A key is one (or more) columns of a relation that is (are) used to identify a row
• Unique Key
  • Data value is unique for each row.
  • Consequently, the key will uniquely identify a row.
• Non-unique Key
  • Data value may be shared among several rows.
  • Consequently, the key will identify a set of rows.
A Composite Key
• A composite key is a key that contains two or more attributes
• For a key to be unique, often it must become a composite key
• For example,
  • To identify a family member, you need to know a FamilyID, a FirstName, and a Suffix (e.g., Jr.)
  • The composite key is: (FamilyID, FirstName, Suffix)
  • One needs to know the value of all three columns to uniquely identify an individual
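As a rough illustration of the composite key above, the sketch below uses Python's standard `sqlite3` module with an in-memory database. The table name `FamilyMember` and the sample values are assumptions chosen for the example; only the three key columns come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE FamilyMember (
        FamilyID  INTEGER NOT NULL,
        FirstName TEXT    NOT NULL,
        Suffix    TEXT    NOT NULL,
        PRIMARY KEY (FamilyID, FirstName, Suffix)  -- composite key
    )""")

# Two rows that differ only in Suffix are both accepted.
conn.execute("INSERT INTO FamilyMember VALUES (1, 'John', 'Sr.')")
conn.execute("INSERT INTO FamilyMember VALUES (1, 'John', 'Jr.')")

# Repeating all three key values is rejected.
try:
    conn.execute("INSERT INTO FamilyMember VALUES (1, 'John', 'Jr.')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM FamilyMember").fetchone()[0]
```

No single column is unique here, but the combination of all three is.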
A Candidate and Primary Key
• A candidate key is called “candidate” because it is a candidate to become the primary key
• A candidate key is a unique key
• A primary key is a candidate key chosen to be the main key for the relation
• If you know the value of the primary key, you will be able to uniquely identify a single row
Relationships Between Tables
• A table may be related to other tables
• For example
  • An Employee works in a Department
  • A Manager controls a Project
• To preserve relationships, you may need to create a foreign key
• A foreign key is a primary key from one table placed into another table
• The key is called a foreign key in the table that received the key
Foreign Key Example
Project (ProjID, ProjName, MgrID)   MgrID is the Foreign Key
Manager (MgrID, MgrName)            MgrID is the Primary Key
Foreign Key Example
Department (DeptID, DeptName, Location)   DeptID is the Primary Key
Employee (EmpID, DeptID, EmpName)         DeptID is the Foreign Key
Referential Integrity
• Referential integrity states that every value of a foreign key must match a value of an existing primary key
• For example (see previous slide)
  • If EmpID = 4 in EMPLOYEE has a DeptID = 7 (a foreign key), a Department with DeptID = 7 must exist in DEPARTMENT
Referential Integrity
• Another perspective…
  • The value of the Foreign Key EmployeeID in EQUIPMENT must exist in the values of the Primary Key EmployeeID in EMPLOYEE
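To see referential integrity enforced, here is a small sketch using Python's `sqlite3` module and the Department/Employee tables from the foreign key example. The sample values are made up, and note one SQLite-specific assumption: foreign keys are only checked after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE Department (DeptID INTEGER PRIMARY KEY, DeptName TEXT, Location TEXT)")
conn.execute("""
    CREATE TABLE Employee (
        EmpID   INTEGER PRIMARY KEY,
        DeptID  INTEGER NOT NULL REFERENCES Department (DeptID),
        EmpName TEXT
    )""")

conn.execute("INSERT INTO Department VALUES (7, 'Accounting', 'Building A')")
conn.execute("INSERT INTO Employee VALUES (4, 7, 'Abermany')")  # DeptID 7 exists

# DeptID 8 does not exist in Department, so this row is rejected.
try:
    conn.execute("INSERT INTO Employee VALUES (5, 8, 'Caldera')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
```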
A Surrogate Key
• A Surrogate Key is a unique, numeric value that is added to a relation to serve as the Primary Key
• Surrogate Key values have no meaning to users and are usually hidden in forms, queries and reports
• A Surrogate Key is often used in place of a composite primary key
Surrogate Key Example
• If the Family Member Primary Key is (FamilyID, FirstName, Suffix), it would be easier to append and use a surrogate key of FamMemberID
• FamilyID, FirstName and Suffix remain in the relation
• Referential Integrity:
  • Use: (FamMemberID) in School must exist in (FamMemberID) in FamilyMember
  • Instead of: (FamilyID, FirstName, Suffix) in School must exist in (FamilyID, FirstName, Suffix) in FamilyMember
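The surrogate key arrangement could look roughly like the following in SQLite, driven from Python's `sqlite3` module. The `SchoolName` column and all data values are hypothetical; the slides name only the tables and key columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE FamilyMember (
        FamMemberID INTEGER PRIMARY KEY,        -- surrogate key, assigned by SQLite
        FamilyID    INTEGER NOT NULL,
        FirstName   TEXT    NOT NULL,
        Suffix      TEXT    NOT NULL,
        UNIQUE (FamilyID, FirstName, Suffix)    -- the old composite key stays unique
    );
    CREATE TABLE School (
        SchoolName  TEXT    NOT NULL,           -- hypothetical column
        FamMemberID INTEGER NOT NULL REFERENCES FamilyMember (FamMemberID)
    );
""")

cur = conn.execute(
    "INSERT INTO FamilyMember (FamilyID, FirstName, Suffix) VALUES (1, 'John', 'Jr.')")
member_id = cur.lastrowid  # the generated surrogate key value

# School refers to the family member through one column instead of three.
conn.execute("INSERT INTO School VALUES ('Central High', ?)", (member_id,))
enrolled = conn.execute(
    "SELECT FirstName FROM School JOIN FamilyMember USING (FamMemberID)").fetchone()[0]
```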
Functional Dependency
• A functional dependency is a relationship between attributes in which one attribute (or group of attributes) determines the value of another attribute in the same table
• Illustration…
  • The price of one cookie can determine the price of a box of 12 cookies
(CookiePrice, Qty) → BoxPrice
Determinants
• The attribute (or attributes) that we use as the starting point (the variable on the left side of the equation) is called a determinant
(CookiePrice, Qty) → BoxPrice
Determinant: (CookiePrice, Qty)
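Whether a functional dependency holds in a given set of rows can be tested directly: every value of the determinant must map to exactly one value of the dependent attributes. The helper `holds` and the cookie prices below are illustrative assumptions, not part of the original slides.

```python
def holds(rows, determinant, dependent):
    """True if each determinant value maps to exactly one dependent value in `rows`."""
    seen = {}
    for row in rows:
        left = tuple(row[attr] for attr in determinant)
        right = tuple(row[attr] for attr in dependent)
        # setdefault stores the first right-hand side seen for this left-hand side
        if seen.setdefault(left, right) != right:
            return False
    return True


boxes = [
    {"CookiePrice": 0.50, "Qty": 12, "BoxPrice": 6.00},
    {"CookiePrice": 0.50, "Qty": 12, "BoxPrice": 6.00},
    {"CookiePrice": 0.75, "Qty": 12, "BoxPrice": 9.00},
]

full_determinant = holds(boxes, ["CookiePrice", "Qty"], ["BoxPrice"])
qty_alone = holds(boxes, ["Qty"], ["BoxPrice"])
```

In this data (CookiePrice, Qty) determines BoxPrice, while Qty alone does not, because the same quantity appears with two different box prices.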
Candidate/Primary Keys and Functional Dependency
• By definition… A candidate key of a relation will functionally determine all other attributes in the row
• Likewise, by definition… A primary key of a relation will functionally determine all other attributes in the row
Primary Key and Functional Dependency Example
(EmployeeID) → (EmpLastName, EmpPhone)
(ProjectID) → (ProjectName, StartDate)
Normalization
• Normalization is a process of analyzing a relation to ensure that it is well-formed
• More specifically, if a relation is normalized (well-formed), rows can be inserted, deleted, or modified without creating update anomalies
Normalization Principles
• Relational design principles for normalized relations:
  • To be a well-formed relation, every determinant must be a candidate key
  • Any relation that is not well formed should be broken into two or more well-formed relations.
Normalization Example
(StudentID) → (StudentName, DormName, DormCost)
However, if…
(DormName) → (DormCost)
Then DormCost should be placed into its own relation, resulting in the relations:
(StudentID) → (StudentName, DormName)
(DormName) → (DormCost)
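The student/dorm decomposition can be traced with a few rows of made-up data. This Python sketch (all names and values are illustrative assumptions) splits the original rows into the two relations and then joins them back to show that nothing is lost.

```python
# Original, unnormalized rows: (StudentID, StudentName, DormName, DormCost)
students = [
    (1, "Ann", "Oak Hall", 3100),
    (2, "Bob", "Oak Hall", 3100),
    (3, "Cy", "Elm Hall", 3400),
]

# STUDENT(StudentID, StudentName, DormName)
student = sorted({(sid, name, dorm) for sid, name, dorm, _ in students})

# DORM(DormName, DormCost): each dorm's cost is now stored exactly once
dorm = sorted({(dorm_name, cost) for _, _, dorm_name, cost in students})

# Joining the two relations on DormName reproduces the original rows.
dorm_cost = dict(dorm)
rejoined = sorted((sid, name, d, dorm_cost[d]) for sid, name, d in student)
```

After the split, changing the cost of Oak Hall means updating one row in DORM rather than one row per resident.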
Normalization Example
(AttorneyID, ClientID) → (ClientName, MeetingDate, Duration)
However, if…
(ClientID) → (ClientName)
Then ClientName should be placed into its own relation, resulting in the relations:
(ClientID) → (ClientName)
(AttorneyID, ClientID) → (MeetingDate, Duration)
Structured Query Language (SQL)
Structured Query Language
• Structured Query Language
  • Acronym: SQL
  • Pronounced as “S-Q-L”
  • Also pronounced as “Sequel”
  • Originally developed by IBM as the SEQUEL language in the 1970s
  • SQL-92 is an ANSI national standard adopted in 1992
SQL Defined
• SQL is an international standard for creating, processing and querying databases and their tables.
• SQL comprises:
  • A data definition language (DDL)
    • Used to define database structures
  • A data manipulation language (DML)
    • Data insertion, updating and deletion
    • Data retrieval (Queries)
SQL for Data Definition
• The SQL data definition statements include
  • CREATE
    • To create database objects
  • ALTER
    • To modify the structure and/or characteristics of database objects
  • DROP
    • To delete database objects
SQL for Data Definition: CREATE
• Creating database tables
  • The SQL CREATE TABLE statement
CREATE TABLE Employee (
  EmpID   Integer  Primary Key,
  EmpName Char(25) Not Null
);
SQL for Data Definition: CREATE with CONSTRAINT
• Creating database tables with PRIMARY KEY constraints
  • The SQL CREATE TABLE statement
  • The SQL CONSTRAINT keyword
CREATE TABLE Employee (
  EmpID   Integer  Not Null,
  EmpName Char(25) Not Null,
  CONSTRAINT EmpPK PRIMARY KEY (EmpID)
);
SQL for Data Definition: CREATE with CONSTRAINT
• Creating database tables with composite primary keys using PRIMARY KEY constraints
  • The SQL CREATE TABLE statement
  • The SQL CONSTRAINT keyword
CREATE TABLE Emp_Skill (
  EmpID      Integer Not Null,
  SkillID    Integer Not Null,
  SkillLevel Integer,
  CONSTRAINT EmpSkillPK PRIMARY KEY (EmpID, SkillID)
);
SQL for Data Definition: CREATE with CONSTRAINT
• Creating database tables using PRIMARY KEY and FOREIGN KEY constraints
  • The SQL CREATE TABLE statement
  • The SQL CONSTRAINT keyword
CREATE TABLE Emp_Skill (
  EmpID      Integer Not Null,
  SkillID    Integer Not Null,
  SkillLevel Integer,
  CONSTRAINT EmpSkillPK PRIMARY KEY (EmpID, SkillID),
  CONSTRAINT EmpFK FOREIGN KEY (EmpID) REFERENCES Employee (EmpID),
  CONSTRAINT SkillFK FOREIGN KEY (SkillID) REFERENCES Skill (SkillID)
);
SQL for Data Definition: CREATE with CONSTRAINT
• Creating database tables using PRIMARY KEY and FOREIGN KEY constraints
  • The SQL CREATE TABLE statement
  • The SQL CONSTRAINT keyword
  • ON UPDATE CASCADE and ON DELETE CASCADE
CREATE TABLE Emp_Skill (
  EmpID      Integer Not Null,
  SkillID    Integer Not Null,
  SkillLevel Integer,
  CONSTRAINT EmpSkillPK PRIMARY KEY (EmpID, SkillID),
  CONSTRAINT EmpFK FOREIGN KEY (EmpID) REFERENCES Employee (EmpID) ON DELETE CASCADE,
  CONSTRAINT SkillFK FOREIGN KEY (SkillID) REFERENCES Skill (SkillID) ON UPDATE CASCADE
);
When a row in the Employee table is deleted, the Emp_Skill rows whose EmpID (foreign key EmpFK) references that row are deleted as well.
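The cascading behavior can be observed by running the Emp_Skill definition against an in-memory SQLite database through Python's `sqlite3` module. This is a sketch under a few assumptions: the `SkillName` column and all data values are invented, and SQLite needs `PRAGMA foreign_keys = ON` before it enforces foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce foreign keys
conn.executescript("""
    CREATE TABLE Employee (EmpID INTEGER PRIMARY KEY, EmpName CHAR(25) NOT NULL);
    CREATE TABLE Skill    (SkillID INTEGER PRIMARY KEY, SkillName CHAR(25));
    CREATE TABLE Emp_Skill (
        EmpID      INTEGER NOT NULL,
        SkillID    INTEGER NOT NULL,
        SkillLevel INTEGER,
        CONSTRAINT EmpSkillPK PRIMARY KEY (EmpID, SkillID),
        CONSTRAINT EmpFK FOREIGN KEY (EmpID)
            REFERENCES Employee (EmpID) ON DELETE CASCADE,
        CONSTRAINT SkillFK FOREIGN KEY (SkillID)
            REFERENCES Skill (SkillID) ON UPDATE CASCADE
    );
    INSERT INTO Employee VALUES (1, 'Abermany'), (2, 'Caldera');
    INSERT INTO Skill VALUES (10, 'SQL');
    INSERT INTO Emp_Skill VALUES (1, 10, 3), (2, 10, 5);
""")

# ON DELETE CASCADE: removing employee 1 also removes that employee's Emp_Skill rows.
conn.execute("DELETE FROM Employee WHERE EmpID = 1")
# ON UPDATE CASCADE: renumbering the skill updates the referencing Emp_Skill rows.
conn.execute("UPDATE Skill SET SkillID = 11 WHERE SkillID = 10")

remaining = conn.execute("SELECT EmpID, SkillID, SkillLevel FROM Emp_Skill").fetchall()
```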
Deleting Database Objects: DROP
• To remove unwanted database objects from the database, use the SQL DROP statement
• Warning… The DROP statement will permanently remove the object and all data
DROP TABLE Employee;
Removing a Constraint: ALTER & DROP
• To change the constraints on existing tables, you may need to remove the existing constraints before adding new constraints
ALTER TABLE Emp_Skill DROP CONSTRAINT EmpFK;
SQL for Data Retrieval (Queries)
• SELECT is the best known SQL statement
• SELECT will retrieve information from the database that matches the specified criteria
SELECT EmpName
FROM Emp;
The Result of a Query Is a Relation
• A query pulls information from one or more relations and creates (temporarily) a new relation
• This allows for a query to:
  • Create a new relation
  • Feed information to another query (as a “sub-query”)
Displaying All Columns: *
• To show all of the column values for the rows that match the specified criteria, use an *
SELECT *
FROM Emp;
Showing a Row Only Once: DISTINCT
• A qualifier may be added to the SELECT statement to inhibit duplicate rows from displaying
SELECT DISTINCT DeptID
FROM Emp;
Specifying Search Criteria: WHERE
• The WHERE clause stipulates the matching criteria for the records that are to be displayed
SELECT EmpName
FROM Emp
WHERE DeptID = 15;
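The WHERE and DISTINCT queries can be tried out with Python's `sqlite3` module. The Emp table layout and its three rows below are assumptions for illustration; ORDER BY is added only to make the result order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (EmpID INTEGER PRIMARY KEY, EmpName TEXT, DeptID INTEGER)")
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?)",
    [(1, "Abermany", 15), (2, "Caldera", 15), (3, "Copley", 4)],
)

# WHERE keeps only the rows that match the criteria.
names_in_15 = [row[0] for row in conn.execute(
    "SELECT EmpName FROM Emp WHERE DeptID = 15 ORDER BY EmpName")]

# DISTINCT shows each DeptID once, although two employees share DeptID 15.
dept_ids = [row[0] for row in conn.execute(
    "SELECT DISTINCT DeptID FROM Emp ORDER BY DeptID")]
```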
Match Criteria
• The WHERE clause match criteria may include
  • Equals "="
  • Not Equals "<>"
  • Greater than ">"
  • Less than "<"
  • Greater than or Equal to ">="
  • Less than or Equal to "<="
Match Operators
• Multiple matching criteria may be specified using
  • AND
    • Representing an intersection of the data sets
  • OR
    • Representing a union of the data sets
SELECT EmpName
FROM Emp
WHERE DeptID < 7
   OR DeptID > 12;
A List of Values
• The WHERE clause may specify that a particular column value must be included in a list of values
SELECT EmpName
FROM Emp
WHERE DeptID IN (4, 8, 9);
The Logical NOT Operator
• Any criterion may be preceded by the NOT operator, so that all rows are shown except those matching the specified criterion
SELECT EmpName
FROM Emp
WHERE DeptID NOT IN (4, 8, 9);
Finding Data Matching a Range of Values: BETWEEN
• SQL provides a BETWEEN operator that allows a user to specify a minimum and maximum value on one line
SELECT EmpName
FROM Emp
WHERE SalaryCode BETWEEN 10 AND 45;
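The OR, IN, NOT IN and BETWEEN criteria from the last few slides can be compared side by side. In this `sqlite3` sketch the Emp rows and the helper `emp_ids` are hypothetical, chosen so that each criterion selects a different subset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Emp (EmpID INTEGER PRIMARY KEY, EmpName TEXT, "
    "DeptID INTEGER, SalaryCode INTEGER)")
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?, ?)",
    [(1, "Abermany", 4, 10), (2, "Caldera", 8, 30),
     (3, "Copley", 12, 45), (4, "Jacksoni", 15, 50)],
)


def emp_ids(where_clause):
    """Return the EmpIDs of the rows that satisfy the given WHERE clause."""
    sql = f"SELECT EmpID FROM Emp WHERE {where_clause} ORDER BY EmpID"
    return [row[0] for row in conn.execute(sql)]


either_end = emp_ids("DeptID < 7 OR DeptID > 12")   # union of the two conditions
in_list = emp_ids("DeptID IN (4, 8, 9)")
not_in_list = emp_ids("DeptID NOT IN (4, 8, 9)")    # everything the IN list excludes
in_range = emp_ids("SalaryCode BETWEEN 10 AND 45")  # both endpoints are included
```

Note that BETWEEN is inclusive: the rows with SalaryCode 10 and 45 both match.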
Allowing for Wildcard Searches: LIKE
• Sometimes it may be advantageous to find rows matching a string value using wildcards
  • Single character wildcard character is an underscore (_)
  • Multiple character wildcard character is a percent sign (%)
Wildcard Search Examples
SELECT EmpID
FROM Emp
WHERE EmpName LIKE 'Kr%';
SELECT EmpID
FROM Emp
WHERE Phone LIKE '616-___-____';
(3 underscores, a hyphen, then 4 underscores)
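Both wildcard patterns can be exercised with a few made-up rows. The sketch below uses Python's `sqlite3` module; the names and phone numbers are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (EmpID INTEGER PRIMARY KEY, EmpName TEXT, Phone TEXT)")
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?)",
    [(1, "Kramer", "616-555-1234"),
     (2, "Krause", "791-555-0000"),
     (3, "Jones", "616-555-9876")],
)

# % matches any run of characters: every name beginning with "Kr".
starts_with_kr = [row[0] for row in conn.execute(
    "SELECT EmpID FROM Emp WHERE EmpName LIKE 'Kr%' ORDER BY EmpID")]

# _ matches exactly one character: any 616 number in the form ###-####.
area_616 = [row[0] for row in conn.execute(
    "SELECT EmpID FROM Emp WHERE Phone LIKE '616-___-____' ORDER BY EmpID")]
```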
Sorting the Results: ORDER BY
• The results may be sorted using the ORDER BY clause
SELECT *
FROM Emp
ORDER BY EmpName;
Built-in SQL Functions
• SQL provides several built-in functions
  • COUNT
    • Counts the number of rows that match the specified criteria
  • MIN
    • Finds the minimum value for a specific column for those rows matching the criteria
  • MAX
    • Finds the maximum value for a specific column for those rows matching the criteria
Built-in SQL Functions (continued)
  • SUM
    • Calculates the sum for a specific column for those rows matching the criteria
  • AVG
    • Calculates the numerical average of a specific column for those rows matching the criteria
Built-in Function Examples
SELECT COUNT(DISTINCT DeptID)
FROM Emp;
SELECT MIN(Hours), MAX(Hours), AVG(Hours)
FROM Project
WHERE ProjID > 7;
Providing Subtotals: GROUP BY
• Subtotals may be calculated by using the GROUP BY clause
• The HAVING clause restricts which groups are displayed
SELECT DeptID, COUNT(*)
FROM Emp
GROUP BY DeptID
HAVING COUNT(*) > 3;
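To make the effect of GROUP BY and HAVING concrete, this `sqlite3` sketch counts employees per department with and without the HAVING filter. The six Emp rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (EmpID INTEGER PRIMARY KEY, EmpName TEXT, DeptID INTEGER)")
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?)",
    [(1, "Abermany", 15), (2, "Caldera", 15), (3, "Copley", 15),
     (4, "Jacksoni", 15), (5, "Kramer", 4), (6, "Smither", 4)],
)

# One subtotal row per department.
per_dept = conn.execute(
    "SELECT DeptID, COUNT(*) FROM Emp GROUP BY DeptID ORDER BY DeptID").fetchall()

# HAVING filters the groups after they are formed (WHERE filters rows before).
large_depts = conn.execute(
    "SELECT DeptID, COUNT(*) FROM Emp GROUP BY DeptID HAVING COUNT(*) > 3").fetchall()

# COUNT(DISTINCT ...) counts the different departments, not the rows.
dept_count = conn.execute("SELECT COUNT(DISTINCT DeptID) FROM Emp").fetchone()[0]
```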
Retrieving Information from Multiple Tables
• SubQueries
  • As stated earlier, the result of a query is a relation. As a result, a query may feed another query. This is called a subquery
• Joins
  • Another way of combining data is by using a Join
    • Join [also called an Inner Join]
    • Left Outer Join
    • Right Outer Join
Subquery
SELECT EmpName
FROM Emp
WHERE DeptID IN
  (SELECT DeptID
   FROM Department
   WHERE DeptName LIKE 'Accounts%');
Join
SELECT EmpName
FROM Emp, Department
WHERE Emp.DeptID = Department.DeptID
  AND Department.DeptName LIKE 'Account%';
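The subquery and the join are two routes to the same answer. The `sqlite3` sketch below runs both against a small, made-up Department and Emp data set; both queries use the same 'Accounts%' pattern so their results can be compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (DeptID INTEGER PRIMARY KEY, DeptName TEXT);
    CREATE TABLE Emp (EmpID INTEGER PRIMARY KEY, EmpName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES (1, 'Accounts Payable'), (2, 'Marketing');
    INSERT INTO Emp VALUES (1, 'Abermany', 1), (2, 'Caldera', 2), (3, 'Copley', 1);
""")

subquery_sql = """
    SELECT EmpName FROM Emp
    WHERE DeptID IN (SELECT DeptID FROM Department WHERE DeptName LIKE 'Accounts%')
    ORDER BY EmpName"""

join_sql = """
    SELECT EmpName FROM Emp, Department
    WHERE Emp.DeptID = Department.DeptID
      AND Department.DeptName LIKE 'Accounts%'
    ORDER BY EmpName"""

via_subquery = [row[0] for row in conn.execute(subquery_sql)]
via_join = [row[0] for row in conn.execute(join_sql)]
```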
Modifying Data using SQL
• Insert
  • Will add a new row in a table
• Update
  • Will update the data in a table that matches the specified criteria
• Delete
  • Will delete the data in a table that matches the specified criteria
Adding Data: INSERT
• To add a row to an existing table, use the INSERT statement
INSERT INTO Emp VALUES (91, 'Smither', 12);
INSERT INTO Emp (EmpID, SalaryCode) VALUES (62, 11);
Changing Data Values: UPDATE
• To change the data values in an existing row (or set of rows) use the UPDATE statement
UPDATE Emp
SET Phone = '791-555-1234'
WHERE EmpID = 29;
UPDATE Emp
SET DeptID = 44
WHERE EmpName LIKE 'Kr%';
Deleting Data: DELETE
• To delete a row or set of rows from a table, use the DELETE statement
DELETE FROM Emp
WHERE EmpID = 29;
DELETE FROM Emp
WHERE EmpName LIKE 'Kr%';
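The three data modification statements can be run in sequence to see their combined effect. The slides do not give the full Emp column list, so the four-column table in this `sqlite3` sketch is an assumption, as are the data values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Emp (EmpID INTEGER PRIMARY KEY, EmpName TEXT, "
    "DeptID INTEGER, Phone TEXT)")

# INSERT: columns left out of the column list are stored as NULL.
conn.execute("INSERT INTO Emp (EmpID, EmpName, DeptID) VALUES (29, 'Kramer', 12)")
conn.execute("INSERT INTO Emp (EmpID, EmpName, DeptID) VALUES (91, 'Smither', 12)")
conn.execute("INSERT INTO Emp (EmpID, DeptID) VALUES (62, 11)")

# UPDATE: changes every row matching the criteria (here only 'Kramer').
updated = conn.execute(
    "UPDATE Emp SET DeptID = 44 WHERE EmpName LIKE 'Kr%'").rowcount

# DELETE: removes every row matching the criteria.
deleted = conn.execute("DELETE FROM Emp WHERE EmpID = 29").rowcount

rows = conn.execute(
    "SELECT EmpID, EmpName, DeptID FROM Emp ORDER BY EmpID").fetchall()
```

An UPDATE or DELETE without a WHERE clause would affect every row in the table, so the criteria deserve a second look before running either statement.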
Movie(title, length)
SELECT title AS name, length * 0.016667 AS lengthInHours
FROM Movie;
MovieExec(name, address, cert#, netWorth)
Movies(title, year, length, genre, studioName, producerC#)
StarsIn(movieTitle, movieYear, starName)
SELECT name
FROM MovieExec, Movies, StarsIn
WHERE cert# = producerC# AND
      title = movieTitle AND
      year = movieYear AND
      starName = 'Harrison Ford';
Movies(title, year, length, genre, studioName, producerC#)
MovieExec(name, address, cert#, netWorth)
StarsIn(movieTitle, movieYear, starName)
SELECT name
FROM MovieExec
WHERE cert# IN
  (SELECT producerC#
   FROM Movies
   WHERE (title, year) IN
     (SELECT movieTitle, movieYear
      FROM StarsIn
      WHERE starName = 'Harrison Ford')
  );