Relational Databases CIS-182
May 24, 2015
What is a database?
• Database: Collection of related data and the tools to manage and use that data
• Collection refers to a group of like things
– Students at SPSCC represent a collection
• What belongs in the group is determined by a purpose, task or need:
– What will the data be used for?
– Provides the ability to determine what specific data is needed to complete a task and satisfy the stated purpose
Manage & Use Data
• Add new data
• Edit existing data
• Remove data
• Find data
– Filtering: limit by characteristics
– Sorting: order by value
Database Tools
• How data is stored doesn’t matter
– May be a list
– May be post-it notes
• Tools may be simple or complex
– Piece of paper and pencil
– Spreadsheet
• A Grocery List is a database
– Using pen/paper, can add items to buy, change items to buy, remove items to buy
– Use a different list for different days or stores
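The grocery list idea can be sketched in a few lines of Python, as a rough illustration that even a plain list supports adding, editing, removing, filtering and sorting. The item names, store names and the `quantity` field are made up for this example.

```python
# A grocery list kept as a list of dictionaries (hypothetical items and stores).
grocery_list = [
    {"item": "milk", "store": "Store A", "quantity": 1},
    {"item": "apples", "store": "Store B", "quantity": 6},
]

# Add new data
grocery_list.append({"item": "bread", "store": "Store A", "quantity": 2})

# Edit existing data: change the quantity of apples
for entry in grocery_list:
    if entry["item"] == "apples":
        entry["quantity"] = 4

# Remove data: drop milk from the list
grocery_list = [e for e in grocery_list if e["item"] != "milk"]

# Find data by filtering (limit by a characteristic) ...
store_a_items = [e["item"] for e in grocery_list if e["store"] == "Store A"]

# ... and by sorting (order by a value)
sorted_items = sorted(e["item"] for e in grocery_list)
```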
What is a relational database?
• Incorporates basic definition of a database
• Data organized in a set of tables, where the data and relationships between data are modeled based on the real world
– Table: Group of records about one kind of thing (entity)
– Record: Entry for one entity (row)
– Field: Single value describing characteristic of one entity (column)
• Reduces data entry, size of files, number of errors
• Helps to ensure the accuracy and validity of data
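The table, record and field vocabulary can be tried out with Python's built-in `sqlite3` module. This is only a minimal sketch: the `students` table, its columns and the sample names are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Table: a group of records about one kind of thing (students)
conn.execute("CREATE TABLE students (student_id INTEGER, name TEXT, major TEXT)")

# Record: one row per student
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Ana", "CIS"), (2, "Ben", "Math")],
)

# Field: a single value describing one characteristic of one student
major = conn.execute(
    "SELECT major FROM students WHERE student_id = ?", (1,)
).fetchone()[0]

rows = conn.execute("SELECT * FROM students ORDER BY student_id").fetchall()
```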
Relationships – 1:1
• One to one: for each record in one table there is a single corresponding record in a second table
• Similar to splitting a table in two:
– Given a persons table (name, address), there could be a students table (school ID, major); each person can be only a single student
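One possible way to model the persons/students split is to let the students table reuse the person's key as its own primary key. The sketch below uses SQLite through Python; the column names and sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default

conn.execute(
    "CREATE TABLE persons (person_id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
# Using person_id as the primary key of students means a person can appear at most once.
conn.execute(
    """CREATE TABLE students (
           person_id INTEGER PRIMARY KEY REFERENCES persons(person_id),
           school_id TEXT,
           major TEXT)"""
)

conn.execute("INSERT INTO persons VALUES (1, 'Ana', '12 Elm St')")
conn.execute("INSERT INTO students VALUES (1, 'S100', 'CIS')")

try:
    # A second student row for the same person violates the one-to-one rule.
    conn.execute("INSERT INTO students VALUES (1, 'S200', 'Math')")
    second_row_allowed = True
except sqlite3.IntegrityError:
    second_row_allowed = False

student_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
```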
Relationships – 1:M
• One to many: for each record in one table, there can be one or more related records in a second table
• In simplest form, represents ownership
– One student completes many assignments
• Each assignment “belongs” to a single student, and only that student
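A one-to-many relationship could look like the following sketch, where each assignment row carries the ID of the single student who owns it. Table names, column names and data are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """CREATE TABLE assignments (
           assignment_id INTEGER PRIMARY KEY,
           student_id INTEGER REFERENCES students(student_id),
           title TEXT)"""
)
conn.executemany("INSERT INTO students VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany(
    "INSERT INTO assignments VALUES (?, ?, ?)",
    [(10, 1, "Essay"), (11, 1, "Quiz"), (12, 2, "Essay")],
)

# Count the assignments owned by each student (the "many" side per "one" side).
counts = conn.execute(
    """SELECT s.name, COUNT(a.assignment_id)
       FROM students AS s
       JOIN assignments AS a ON a.student_id = s.student_id
       GROUP BY s.student_id, s.name
       ORDER BY s.student_id"""
).fetchall()
```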
Relationships – M:N
• Many to many: for each record in both tables, there can be many matching records in the other table
• Most common kind of relationship
– One student takes many classes, each class has many students
• Requires a third table to create relationship
– Third table “joins” entries in the original tables
• An Enrollments table would identify which student is in which class
– Join table has at least two foreign keys
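The students/classes/enrollments arrangement might be sketched as below. The join table holds one row per (student, class) pair. The class `MATH-101` and all sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE classes  (class_id TEXT PRIMARY KEY, title TEXT);
    -- Join table: one row per (student, class) pair, with two foreign keys
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(student_id),
        class_id   TEXT    REFERENCES classes(class_id),
        PRIMARY KEY (student_id, class_id)
    );
    INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO classes VALUES ('CIS-182', 'Relational Databases'), ('MATH-101', 'Algebra');
    INSERT INTO enrollments VALUES (1, 'CIS-182'), (1, 'MATH-101'), (2, 'CIS-182');
    """
)

# One student takes many classes ...
classes_for_ana = [
    row[0]
    for row in conn.execute(
        "SELECT class_id FROM enrollments WHERE student_id = 1 ORDER BY class_id"
    )
]

# ... and each class has many students.
students_in_cis182 = [
    row[0]
    for row in conn.execute(
        """SELECT s.name
           FROM students AS s
           JOIN enrollments AS e ON e.student_id = s.student_id
           WHERE e.class_id = 'CIS-182'
           ORDER BY s.name"""
    )
]
```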
Primary Keys
• Keys provide a means to find specific rows
• Primary key defines unique, required value in a table
– Provides a way to get one row in the table
– May be one or more columns
• Column(s) in primary key must have a value
• Value(s) must be different for each row
• Table can have only one primary key
• Student ID represents a value that is different (unique) for each student in the Students table
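The "required and unique" rules can be observed directly in SQLite, as in this sketch. The student IDs and names are hypothetical, and the helper `try_insert` exists only for this example. `NOT NULL` is written out because SQLite does not otherwise enforce it on a non-integer primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is spelled out because SQLite, for historical reasons, otherwise
# accepts NULL in non-integer primary key columns.
conn.execute("CREATE TABLE students (student_id TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO students VALUES ('S100', 'Ana')")


def try_insert(student_id, name):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO students VALUES (?, ?)", (student_id, name))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_insert("S100", "Ben")  # same key as Ana: not unique
missing_ok = try_insert(None, "Cam")      # no key value at all
new_ok = try_insert("S101", "Dee")        # new, unique key

# Looking up by primary key returns exactly one row.
one_row = conn.execute(
    "SELECT name FROM students WHERE student_id = 'S101'"
).fetchall()
```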
Foreign Keys
• Foreign key is a value in one table that refers to a unique value in a different table
– “Foreign” means outside
• A student ID in enrollments refers to an entry in the Students table
– Usually refers to primary key, but can use any unique index
• Foreign key must “look like” related primary key
– Same number of fields
• Field names don’t have to match
– Data types must match
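As a small sketch of a foreign key whose field name differs from the key it refers to, the example below names the column `student` in one table and `student_id` in the other. Both names are assumptions. `PRAGMA foreign_key_list` is SQLite's way of reporting the foreign keys defined on a table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
# The foreign key column is called "student" here, not "student_id":
# the names differ, but both columns hold the same type of value.
conn.execute(
    """CREATE TABLE enrollments (
           class_id TEXT,
           student  INTEGER REFERENCES students(student_id))"""
)

# Ask SQLite to describe the foreign key it recorded for enrollments.
# Result columns: id, seq, table, from, to, on_update, on_delete, match
fk = conn.execute("PRAGMA foreign_key_list(enrollments)").fetchone()
referenced_table, from_column, to_column = fk[2], fk[3], fk[4]
```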
Referential Integrity
• Ensures that data is consistent
– Value in foreign key must exist in related primary key
– Prevents “orphans”, records on many side without a valid “parent”
• Creates limits on both tables
– Can’t enter a row in the many side with a foreign key that doesn’t exist
– Can’t remove a row on the one side if there are related rows in the many side
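Both limits can be seen in a short SQLite sketch. The tables and data are hypothetical, and the helper `rejected` exists only for this example. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """CREATE TABLE enrollments (
           class_id TEXT,
           student_id INTEGER NOT NULL REFERENCES students(student_id))"""
)
conn.execute("INSERT INTO students VALUES (1, 'Ana')")
conn.execute("INSERT INTO enrollments VALUES ('CIS-182', 1)")


def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Many side: student 99 does not exist, so this row would be an orphan.
orphan_rejected = rejected("INSERT INTO enrollments VALUES ('CIS-182', 99)")
# One side: student 1 still has an enrollment, so the delete is refused.
delete_rejected = rejected("DELETE FROM students WHERE student_id = 1")
```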
Cascade Update/Delete
• Can implement referential integrity to help manage changes automatically
• Cascade update passes changes to primary key values to the related rows on the many side
– If student ID is changed in Students, change the student ID in related rows in Enrollments to match the new value
• Cascade delete deletes related rows from the many table when a row from the one side is deleted
– If a student is deleted from Students table, delete related rows in Enrollments
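In SQL, cascading is typically requested with `ON UPDATE CASCADE` and `ON DELETE CASCADE` on the foreign key. The following is a sketch in SQLite with made-up student IDs, not a recommendation for when cascades are appropriate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for cascades to fire in SQLite
conn.execute("CREATE TABLE students (student_id TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    """CREATE TABLE enrollments (
           class_id TEXT,
           student_id TEXT REFERENCES students(student_id)
               ON UPDATE CASCADE
               ON DELETE CASCADE)"""
)
conn.executemany("INSERT INTO students VALUES (?, ?)", [("S1", "Ana"), ("S2", "Ben")])
conn.executemany(
    "INSERT INTO enrollments VALUES (?, ?)", [("CIS-182", "S1"), ("CIS-182", "S2")]
)

# Cascade update: changing Ana's ID rewrites her enrollment rows too.
conn.execute("UPDATE students SET student_id = 'S100' WHERE student_id = 'S1'")
ids_after_update = sorted(
    row[0] for row in conn.execute("SELECT student_id FROM enrollments")
)

# Cascade delete: removing Ben removes his enrollment rows.
conn.execute("DELETE FROM students WHERE student_id = 'S2'")
ids_after_delete = [
    row[0] for row in conn.execute("SELECT student_id FROM enrollments")
]
```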
Normalizing a database
• Process of organizing data in database to:
– Reduce redundancy: don’t repeat values
• Some repetition is needed for primary keys/foreign keys
– Reduce inconsistent dependencies: changing one value shouldn’t require a change to a second value
• Rather than store Price, Quantity, Total Price (which is Price * Quantity), store Price, Quantity and calculate Total Price when needed
• Helps to ensure each table is about one thing
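The Price/Quantity/Total Price point can be illustrated with a query that computes the total on demand instead of storing it. The `order_lines` table and its values are assumptions for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only price and quantity are stored; there is no total_price column to keep in sync.
conn.execute("CREATE TABLE order_lines (item TEXT, price REAL, quantity INTEGER)")
conn.execute("INSERT INTO order_lines VALUES ('notebook', 2.50, 4)")

query = "SELECT price * quantity FROM order_lines WHERE item = 'notebook'"
total_before = conn.execute(query).fetchone()[0]

# Changing the quantity needs no second update: the total is recalculated when asked for.
conn.execute("UPDATE order_lines SET quantity = 6 WHERE item = 'notebook'")
total_after = conn.execute(query).fetchone()[0]
```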
Using Normalization
• Different degrees of normalization are referred to as “forms”
• Database designer determines how far to normalize
– Most relational databases are in 3rd normal form
– Some “de-normalization” is common
• Each higher level of normalization leads to more tables with fewer columns
• More joins in queries are required to make data useful, understandable
First Normal Form
• First Normal Form is most basic level
• Each row/column combination has only one value
– Address should be broken up to Street, City, State, Zip
• Eliminate repeating groups
– Don’t have multiple phone number columns in a Students table
First Normal Form Example
• Example: Instead of using a single field for all items purchased, each item gets its own row, with its own quantity
Not Normalized: OrderID | CustomerID | OrderDate | Items Purchased
1st Normal Form: OrderID | CustomerID | OrderDate | ItemID | Quantity | ItemName
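The move to first normal form can be sketched in plain Python. The packed `ItemsPurchased` format (`ItemID:Quantity:ItemName` separated by semicolons) is invented purely for this example, as are the values and the helper `to_first_normal_form`.

```python
# One unnormalized order: every purchased item is packed into a single field.
unnormalized = {
    "OrderID": 1,
    "CustomerID": 42,
    "OrderDate": "2015-05-24",
    "ItemsPurchased": "7:2:Pencil; 9:1:Notebook",  # hypothetical ItemID:Quantity:ItemName
}


def to_first_normal_form(order):
    """Return one row per purchased item, each field holding a single value."""
    rows = []
    for chunk in order["ItemsPurchased"].split(";"):
        item_id, quantity, item_name = chunk.strip().split(":")
        rows.append(
            (order["OrderID"], order["CustomerID"], order["OrderDate"],
             int(item_id), int(quantity), item_name)
        )
    return rows


first_nf_rows = to_first_normal_form(unnormalized)
```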
Second Normal Form
• Remove fields that are not fully dependent on the key, and place in separate table(s)
– Each row should be about just one thing
– Listing the grades a student receives doesn’t help describe a student; grades belong in a different table
Second Normal Form Example
• Example: CustomerID and OrderDate depend only on OrderID, not on ItemID; an order can include many different items, so the order details and the items ordered go in separate tables
1st Normal Form: OrderID | CustomerID | OrderDate | ItemID | Quantity | ItemName
2nd Normal Form (two tables):
OrderID | CustomerID | OrderDate
OrderID | ItemID | Quantity | ItemName
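The split into two tables can be sketched with plain Python tuples. The sample rows are made up. The point is that order-level facts are stored once per order instead of once per item.

```python
# Rows in 1st normal form: (OrderID, CustomerID, OrderDate, ItemID, Quantity, ItemName)
first_nf_rows = [
    (1, 42, "2015-05-24", 7, 2, "Pencil"),
    (1, 42, "2015-05-24", 9, 1, "Notebook"),
    (2, 43, "2015-05-25", 7, 5, "Pencil"),
]

# First table: facts that depend on OrderID alone, stored once per order.
orders = sorted(
    {(order_id, customer_id, order_date)
     for order_id, customer_id, order_date, _, _, _ in first_nf_rows}
)

# Second table: facts about one item within one order.
order_items = [
    (order_id, item_id, quantity, item_name)
    for order_id, _, _, item_id, quantity, item_name in first_nf_rows
]
```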
Third Normal Form
• All non-key columns are mutually independent
– A change in one field does not require a change in another field in the table (i.e. no calculations)
• All fields contribute to describing the key (making the record unique)
• There are limits to how far to go:
– A change in city could require a change in state and zip code
– Need to either add many small tables or some level of not being normalized
Third Normal Form Example
• If the ItemID that’s part of an order changes, the item name must change too, so ItemName depends on ItemID rather than on the order; break out Products into its own table
2nd Normal Form (two tables):
OrderID | CustomerID | OrderDate
OrderID | ItemID | Quantity | ItemName
3rd Normal Form (three tables):
OrderID | CustomerID | OrderDate
OrderID | ItemID | Quantity
ItemID | ItemName
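The three-table layout might be built and queried as in the sketch below. The table names `orders`, `order_items` and `products` and all sample data are assumptions. It also shows the trade-off of normalizing: joins are needed to produce a readable listing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT);
    CREATE TABLE products (item_id INTEGER PRIMARY KEY, item_name TEXT);
    CREATE TABLE order_items (
        order_id INTEGER REFERENCES orders(order_id),
        item_id  INTEGER REFERENCES products(item_id),
        quantity INTEGER,
        PRIMARY KEY (order_id, item_id)
    );
    INSERT INTO orders VALUES (1, 42, '2015-05-24'), (2, 43, '2015-05-25');
    INSERT INTO products VALUES (7, 'Pencil'), (9, 'Notebook');
    INSERT INTO order_items VALUES (1, 7, 2), (1, 9, 1), (2, 7, 5);
    """
)

# The item name is stored once, so renaming it is a single-row change.
conn.execute("UPDATE products SET item_name = 'HB Pencil' WHERE item_id = 7")

# Joins put the pieces back together when a readable order listing is needed.
listing = conn.execute(
    """SELECT o.order_id, p.item_name, oi.quantity
       FROM orders AS o
       JOIN order_items AS oi ON oi.order_id = o.order_id
       JOIN products AS p ON p.item_id = oi.item_id
       ORDER BY o.order_id, p.item_id"""
).fetchall()
```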
Normalization Summary
• A change in one field should not require change in another field in the table
– No calculations
• All fields help describe the key
– Each record is unique
– Each table stores information about one “thing”