27/10/2011
Introduction to Databases
G6921 and G6931
Web Technologies
Dr. Séamus Lawless
• Course Structure
– 1) Intro to the Web
– 2) HTML
– 3) HTML and CSS
– Essay Information Session
– 4) Intro to Databases
– 5) Intro to Databases
– 6) PHP and MySQL
– 7) Reading Week
– 8) PHP and MySQL
– 9) PHP and XML
– 10) CMS
– 11) Analytics
– 12) Visualisation
Housekeeping
• Do you all have a laptop?
• Can you all get connected to the Internet?
• Can you all get connected to your webspace through SSH?
• Can you use a URL to view your files?
• Assessment
– Continuous Assessment
• Upload the files you create!
Exercise 1
• An airline company flies many flights, but each flight is flown by only one airline
Exercise 2
• The Globex Corporation operates many factories. Each of these factories is located in a region. Each region can have more than one Globex factory. Each factory employs many employees, but each of these employees is employed by only one factory.
Exercise 3
• Dodgy Builders Construction Company is a building contractor that specialises in mid-range homes.
• Dodgy Builders has a number of customers and employees, handles a series of projects and owns lots of building equipment.
• A customer can engage the company for work on more than one project.
• Dodgy Builders employees can often work on more than one project at a time.
• Building equipment is always assigned to only one project at a time.
• Draw an ERD that models this company’s data.
• Write out each table and its associated columns.
Data Types
• Character
– char
– varchar
– text
• Date and Time
– date
– time
– timestamp
• Number
– bit
– integer
– real
– numeric
• Others
– blob
– boolean
DBMS
• Database Management System
• The goal of a DBMS is to simplify the storage of, and access to, data
• DBMS support:
– Definition
– Manipulation
– Querying
DBMS
• Well-known DBMSs:
– Proprietary
• Oracle
• Access (Microsoft)
• SQL Server (Microsoft)
• DB2 (IBM)
– Open Source
• MySQL
• SQLite
• PostgreSQL
Database Languages
• Programming languages which are used to:
– Define a database (i.e. its entities and the relationships between them)
– Manipulate its content (i.e. insert new data and update or delete existing data)
– Conduct queries (i.e. request information based upon defined criteria)
• The Structured Query Language (SQL) is the most commonly used language for Relational Databases
– It is a standard, and all of the DBMSs listed above support it (each with its own dialect differences).
SQL
• SQL is split into four sets of commands which are divided based upon the tasks they are used for:
– Data Definition Language
– Data Manipulation Language
– Data Query Language
– Data Control Language
Data Definition Language
• SQL uses a collection of imperative verbs whose effect is to modify the schema of the database
• Can be used to add, change or delete definitions of tables or other objects.
• These statements can be freely mixed with other SQL statements
– so the DDL is not truly a separate language.
Data Definition Language
• CREATE Statement
– This statement is used for creating the database and its objects
• ALTER Statement
– This statement is used for modifying the database and its objects
• DROP Statement
– This statement is used for deleting the database and its objects
• TRUNCATE Statement
– This statement is used to delete all the data in a table without disturbing its structure (the rows are removed in one operation rather than row by row)
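The DDL statements above can be tried out in a small sketch. It assumes Python's built-in sqlite3 module and an in-memory SQLite database rather than the MySQL server used in the course; the supplier table and its columns are hypothetical names chosen only for illustration. TRUNCATE is left out because SQLite does not provide it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# CREATE: define a new table (hypothetical table and column names)
conn.execute(
    "CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name VARCHAR(50))"
)

# ALTER: change the definition of an existing table
conn.execute("ALTER TABLE supplier ADD COLUMN phone VARCHAR(20)")
columns = [row[1] for row in conn.execute("PRAGMA table_info(supplier)")]
print(columns)

# DROP: remove the table definition and all of its data
conn.execute("DROP TABLE supplier")
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(tables)
```

The PRAGMA and sqlite_master lookups are SQLite-specific ways of inspecting the schema; in MySQL the equivalent checks would be DESCRIBE and SHOW TABLES.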
Data Manipulation Language
• The data manipulation language comprises the SQL data change statements
– Modifies stored data
– Does NOT modify the schema or database objects
• This is always the responsibility of the Data Definition Language
• Used for inserting, deleting and updating data in the tables of a database
Data Manipulation Language
• INSERT Statement
– This statement is used for inserting data into a table
• DELETE Statement
– This statement is used to delete data from a table that matches certain criteria
• UPDATE Statement
– This statement is used to update data in a table that matches certain criteria
Data Query Language
• The data query language allows users of a database to formulate requests and generate reports
• There is one primary command used in SQL to query the database - the SELECT Statement
– This statement is used to query or retrieve data from a table in the database.
– A query may retrieve information from specified columns or from all of the columns in the table
– A query may have specified criteria that must be met in order for data to be returned
ST3001 – MySQL Tutorial 1 - University of Dublin, Trinity College
SELECT location
FROM department;
Location
-----------
Tower 221
The Docks
City Centre
Downtown
1) SQL Statement Entered
2) Statement is sent to Database
3) Result Data is Returned
4) Returned Data is Displayed
Example
• Let’s create the database from your “Dodgy Builders” exercise
• See if you can create all the tables from the Dodgy Builders example
• Be aware of:
– Primary Keys
• Especially on the Resource Allocation Table
– The order in which you create the Tables
• Foreign Keys
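One possible shape for part of the Dodgy Builders schema is sketched below, assuming Python's built-in sqlite3 module in place of MySQL. All table and column names (customer, employee, project, allocation) are illustrative guesses, not the model answer, and the equipment table is omitted. The sketch shows the two points flagged above: parent tables are created before the tables that reference them, and the allocation table uses the pair of foreign keys as its primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Parent tables first, then the tables whose foreign keys point at them.
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        VARCHAR(50) NOT NULL
);
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    firstname   VARCHAR(50) NOT NULL
);
CREATE TABLE project (
    project_id  INTEGER PRIMARY KEY,
    name        VARCHAR(50) NOT NULL,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
-- An employee can work on many projects, so the key is the pair of columns.
CREATE TABLE allocation (
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    project_id  INTEGER NOT NULL REFERENCES project(project_id),
    PRIMARY KEY (employee_id, project_id)
);
""")

# Placeholder rows, invented for the demonstration
conn.execute("INSERT INTO customer VALUES (1, 'Example Customer')")
conn.execute("INSERT INTO employee VALUES (1, 'Shay')")
conn.execute("INSERT INTO project VALUES (1, 'Example Project', 1)")
conn.execute("INSERT INTO allocation VALUES (1, 1)")

try:
    conn.execute("INSERT INTO allocation VALUES (1, 1)")  # same pair again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

try:
    conn.execute("INSERT INTO project VALUES (2, 'Orphan Project', 99)")  # no customer 99
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(duplicate_rejected, orphan_rejected)
```

The composite primary key stops the same employee being allocated to the same project twice, and the foreign key stops a project pointing at a customer that does not exist.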
Populating the Tables
• INSERT Statement – This Statement is used for populating the Tables with Data
• It is possible to write the INSERT statement in two ways.
• Syntax
INSERT INTO table_name
VALUES
(value_1, value_2, ... , value_n);
Populating the Tables
• However, if you only want to populate certain columns in the table, you can specify which columns to enter data into:
INSERT INTO table_name
(column_1, column_2, ... , column_n)
VALUES
(value_1, value_2, ... , value_n);
Populating the Tables
• To populate the employee table:
INSERT INTO employee
VALUES
('Shay', 'Lawless', 'Bray', '11223344u');
• Or
INSERT INTO employee
(firstname, surname, address)
VALUES
('Shay', 'Lawless', 'Bray');
Populating the Tables
• It is possible to populate many rows at once:
INSERT INTO employee
VALUES
('Shay', 'Lawless', 'Bray', '11223344u'),
('Jimmy', 'McNulty', 'Baltimore', '22334455u'),
('Homer', 'Simpson', 'Springfield', '00000000f');
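The INSERT forms above can be run end to end in a short sketch. It assumes Python's built-in sqlite3 module instead of MySQL, and a simplified employee table whose fourth column is called pps_number here (the real column name and types come from your own table definition). String literals use straight single quotes, which is what SQL expects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "firstname TEXT, surname TEXT, address TEXT, pps_number TEXT)"
)

# Named columns only: pps_number is not listed, so it is left NULL
conn.execute(
    "INSERT INTO employee (firstname, surname, address) "
    "VALUES ('Shay', 'Lawless', 'Bray')"
)

# A value for every column, several rows in one statement
conn.execute("""
    INSERT INTO employee VALUES
    ('Jimmy', 'McNulty', 'Baltimore', '22334455u'),
    ('Homer', 'Simpson', 'Springfield', '00000000f')
""")

rows = conn.execute(
    "SELECT firstname, pps_number FROM employee ORDER BY rowid"
).fetchall()
for row in rows:
    print(row)
```

Python shows the missing PPS number as None, which is how the sqlite3 module represents NULL.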
Populating the Tables
• See if you can populate all the tables from the Dodgy Builders example
• Be aware of:
– AUTO_INCREMENT columns
– NOT NULL columns
• Foreign Keys
– Data Types
Query the Database
• The SELECT statement is used to query a DB
• It is the most frequently used command in SQL
• SELECT statements look like this:
SELECT a1, a2, … , an
FROM t1, t2, … , tm
WHERE condition
Query the Database
• The most basic SELECT statement is used to retrieve all the data from a single table
SELECT *
FROM employee;
• Think of * like a wildcard
– It is a quick way of selecting all columns
Query the Database
• You include the WHERE clause when you want to retrieve only the data that matches certain criteria
• For example:
SELECT firstname, surname, PPS
FROM employee
WHERE firstname = 'Jimmy';
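A minimal sketch of both kinds of SELECT, assuming Python's built-in sqlite3 module rather than MySQL and a simplified employee table (the column name pps_number is an assumption). The second query passes the search value as a parameter using sqlite3's ? placeholder, which avoids building SQL strings by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "firstname TEXT, surname TEXT, address TEXT, pps_number TEXT)"
)
conn.execute("""
    INSERT INTO employee VALUES
    ('Shay', 'Lawless', 'Bray', '11223344u'),
    ('Jimmy', 'McNulty', 'Baltimore', '22334455u'),
    ('Homer', 'Simpson', 'Springfield', '00000000f')
""")

# SELECT * returns every column of every row
all_rows = conn.execute("SELECT * FROM employee").fetchall()
print(len(all_rows), "rows,", len(all_rows[0]), "columns each")

# WHERE keeps only the rows that match the condition
jimmy = conn.execute(
    "SELECT firstname, surname, pps_number FROM employee WHERE firstname = ?",
    ("Jimmy",),
).fetchall()
print(jimmy)
```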
Query the Database
• Often it is necessary to combine the information in two tables to answer a query
• For example, we want to know the names of all the customers and the project number(s) that they are involved in
• This type of SELECT is known as a “Join”
– Inner Join
– Outer Join
Joins
• Imagine we have two tables
table_a          table_b
id name          id name
-- ----          -- ----
1  Pirate        1  Rutabaga
2  Monkey        2  Pirate
3  Ninja         3  Darth Vader
4  Spaghetti     4  Ninja
• We want to retrieve all names that appear in both tables
• INNER JOIN produces only the set of records that match in both Table A and Table B.
SELECT * FROM table_a INNER JOIN table_b ON table_a.name = table_b.name
id name id name
-- ---- -- ----
1 Pirate 2 Pirate
3 Ninja 4 Ninja
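The INNER JOIN above can be reproduced with the same two tables. This is a sketch assuming Python's built-in sqlite3 module and an in-memory database; an ORDER BY is added only so that the rows come back in a predictable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE table_b (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO table_a VALUES
    (1, 'Pirate'), (2, 'Monkey'), (3, 'Ninja'), (4, 'Spaghetti');
INSERT INTO table_b VALUES
    (1, 'Rutabaga'), (2, 'Pirate'), (3, 'Darth Vader'), (4, 'Ninja');
""")

# Only names that appear in both tables survive the join
inner_rows = conn.execute("""
    SELECT *
    FROM table_a
    INNER JOIN table_b ON table_a.name = table_b.name
    ORDER BY table_a.id
""").fetchall()
for row in inner_rows:
    print(row)
```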
Joins
• Full outer join produces the set of all records in Table A and Table B, with matching records from both sides where available.
– If there is no match, the missing side will contain null.
SELECT *
FROM table_a
FULL OUTER JOIN table_b
ON table_a.name = table_b.name
Joins
id name id name
-- ---- -- ----
1 Pirate 2 Pirate
2 Monkey null null
3 Ninja 4 Ninja
4 Spaghetti null null
null null 1 Rutabaga
null null 3 Darth Vader
Joins
• You can also specify LEFT or RIGHT OUTER JOINs
• LEFT OUTER JOIN produces a complete set of records from table_a, with the matching records (where available) in table_b.
– If there is no match, the right side will contain null.
• RIGHT OUTER JOIN produces a complete set of records from table_b, with the matching records (where available) in table_a.
– If there is no match, the left side will contain null.
Joins
SELECT * FROM table_a LEFT OUTER JOIN table_b ON table_a.name = table_b.name
id name id name
-- ---- -- ----
1 Pirate 2 Pirate
2 Monkey null null
3 Ninja 4 Ninja
4 Spaghetti null null
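The outer joins can be sketched the same way, again assuming Python's built-in sqlite3 module. One point to be aware of: MySQL does not support FULL OUTER JOIN, so the second query below shows a common workaround (a swapped-in technique, not the syntax from the slides) that combines a LEFT OUTER JOIN with the unmatched rows from the other table using UNION ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE table_b (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO table_a VALUES
    (1, 'Pirate'), (2, 'Monkey'), (3, 'Ninja'), (4, 'Spaghetti');
INSERT INTO table_b VALUES
    (1, 'Rutabaga'), (2, 'Pirate'), (3, 'Darth Vader'), (4, 'Ninja');
""")

# Every row of table_a, with NULLs where table_b has no match
left_rows = conn.execute("""
    SELECT *
    FROM table_a
    LEFT OUTER JOIN table_b ON table_a.name = table_b.name
    ORDER BY table_a.id
""").fetchall()

# Full outer join emulation: the left join above, plus the rows of
# table_b that have no partner in table_a
full_rows = conn.execute("""
    SELECT table_a.id, table_a.name, table_b.id, table_b.name
    FROM table_a
    LEFT OUTER JOIN table_b ON table_a.name = table_b.name
    UNION ALL
    SELECT table_a.id, table_a.name, table_b.id, table_b.name
    FROM table_b
    LEFT OUTER JOIN table_a ON table_a.name = table_b.name
    WHERE table_a.id IS NULL
""").fetchall()

for row in full_rows:
    print(row)
```

Python prints NULL as None. The order of full_rows is not guaranteed because the combined query has no ORDER BY.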
Joins
• INNER JOIN is the most commonly used
– If no join type is specified, INNER JOIN is the default
• In our example, we want to know the names of all the customers and the project number(s) that they are involved in:
SELECT customer.name, project.name
FROM customer, project
WHERE customer.customer_id = project.customer_id;
Joins
• Returns:
– Customer name and Project name
– Only where the Customer IDs match
Joins
employee (DEPTNO is a Foreign Key referencing department)
EMPNO  NAME      JOB         DEPTNO
7856   MCNULTY   OFFICER     30
7710   DANIELS   LIEUTENANT  40
7992   GREGGS    DETECTIVE   10
7428   MORELAND  DETECTIVE   20
department (DEPTNO is the Primary Key)
DEPTNO  NAME       LOCATION
10      NARCOTICS  TOWER 221
20      HOMICIDE   CITY CENTER
30      MARINE     DOCKS
40      EVIDENCE   DOWNTOWN
Joins
SELECT employee.name, job, department.name
FROM employee, department
WHERE employee.deptno = department.deptno;
NAME JOB NAME
MCNULTY OFFICER MARINE
DANIELS LIEUTENANT EVIDENCE
GREGGS DETECTIVE NARCOTICS
MORELAND DETECTIVE HOMICIDE
Query the Database
• Often we need to list results in a particular order.
– ascending order, descending order, based on either numerical value or text value.
• To do this we can use the ORDER BY keyword
SELECT customer.name, project.name
FROM customer, project
WHERE customer.customer_id = project.customer_id
ORDER BY customer.name ASC;
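To see ORDER BY working on real rows, the sketch below reuses the employee and department data from the earlier join example, since no customer or project rows are given. It assumes Python's built-in sqlite3 module, with column types guessed from the values shown. The join is written with explicit INNER JOIN ... ON syntax, which returns the same rows as listing both tables in FROM and matching them in WHERE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    deptno INTEGER PRIMARY KEY, name TEXT, location TEXT
);
CREATE TABLE employee (
    empno INTEGER PRIMARY KEY, name TEXT, job TEXT,
    deptno INTEGER REFERENCES department(deptno)
);
INSERT INTO department VALUES
    (10, 'NARCOTICS', 'TOWER 221'), (20, 'HOMICIDE', 'CITY CENTER'),
    (30, 'MARINE', 'DOCKS'), (40, 'EVIDENCE', 'DOWNTOWN');
INSERT INTO employee VALUES
    (7856, 'MCNULTY', 'OFFICER', 30), (7710, 'DANIELS', 'LIEUTENANT', 40),
    (7992, 'GREGGS', 'DETECTIVE', 10), (7428, 'MORELAND', 'DETECTIVE', 20);
""")

query = """
    SELECT employee.name, job, department.name
    FROM employee
    INNER JOIN department ON employee.deptno = department.deptno
    ORDER BY employee.name {direction}
"""
ascending = conn.execute(query.format(direction="ASC")).fetchall()
descending = conn.execute(query.format(direction="DESC")).fetchall()

for row in ascending:
    print(row)
```

ASC is the default direction, so ORDER BY employee.name on its own gives the same result as the ascending query.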
Update Data
• Once there is data in a table, it may become necessary to modify the data.
• To do so, we can use the UPDATE command.
UPDATE table_name SET column1=value1, column2=value2 WHERE some_column=some_value;
UPDATE employee SET firstname = 'Séamus' WHERE PPS_number = '11223344u';
Delete Data
• Sometimes it may be necessary to remove records from a table.
• To do so, we can use the DELETE command
DELETE FROM table_name WHERE some_column=some_value;
DELETE FROM employee WHERE firstname = 'Séamus' AND surname = 'Lawless';
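The UPDATE and DELETE statements can be checked in a short sketch, assuming Python's built-in sqlite3 module and a simplified employee table (the column name pps_number is an assumption). The cursor's rowcount attribute reports how many rows each statement changed, which is a quick way to confirm that the WHERE clause matched what you expected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "firstname TEXT, surname TEXT, address TEXT, pps_number TEXT)"
)
conn.execute("""
    INSERT INTO employee VALUES
    ('Shay', 'Lawless', 'Bray', '11223344u'),
    ('Jimmy', 'McNulty', 'Baltimore', '22334455u')
""")

# UPDATE changes only the rows matched by WHERE
cur = conn.execute(
    "UPDATE employee SET firstname = 'Séamus' WHERE pps_number = '11223344u'"
)
updated = cur.rowcount

# DELETE removes only the rows matched by WHERE
cur = conn.execute(
    "DELETE FROM employee WHERE firstname = 'Séamus' AND surname = 'Lawless'"
)
deleted = cur.rowcount

remaining = conn.execute("SELECT firstname, surname FROM employee").fetchall()
print(updated, deleted, remaining)
```

Leaving out the WHERE clause makes UPDATE or DELETE apply to every row in the table, so it is worth checking the condition with a SELECT first.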
Delete a Table
• It may be necessary to remove a table in the database
• The DROP TABLE command is used to do this.
• The syntax for DROP TABLE is simply:
DROP TABLE table_name;
• So, to drop the table called customer:
DROP TABLE customer;
Other SQL Commands
• There are numerous other SQL commands that are useful to know
• You can find them all here:
– http://dev.mysql.com/doc/refman/5.5/en/sql-syntax.html
• There are useful tutorials here:
– http://www.1keydata.com/sql
– http://www.w3schools.com/sql/
Normalisation
• When designing databases you are faced with a series of choices.
– How many tables will there be and what will they represent?
– Which columns will go in which tables?
– What will the relationships between the tables be?
• The answer to each of these questions lies in normalisation.
• Normalisation is the process of simplifying the design of a database so that it achieves its optimum structure.
• Normalisation theory gives us the concept of “normal forms” to assist in achieving the optimum structure.
• The normal forms are a set of rules that you apply to your database
– Each higher normal form achieves a better, more efficient design.
• The normal forms are:
– First Normal Form
– Second Normal Form
– Third Normal Form
– Boyce Codd Normal Form
– Fourth Normal Form
– Fifth Normal Form
Normalisation
• Normal Forms are based on relations rather than tables. A relation is an algebraic entity that has the following attributes:
– It describes one entity.
– It has no duplicate rows; hence there is always a primary key.
– The columns are unordered.
– The rows are unordered.
• For all practical purposes the terms table and relation are interchangeable once a table meets the definition of a relation.
First Normal Form
• First Normal Form (1NF) says that all column values must be “atomic”.
– The word atom comes from the Greek atomos, meaning indivisible (or literally "not to cut").
• 1NF dictates that at every row-column intersection, there exists only one value, not a list of values.
• The benefits from this rule should be fairly obvious. If lists of values are stored in a single column, there is no simple way to manipulate those values.
First Normal Form
• The following would violate 1NF
order_id  customer_id  items
1         7            2 hammers, 1 screwdriver
2         33           1 shelf, 4 brackets, 8 screws
3         12           1 lawnmower
4         4            8 drill bits
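One way to bring this table into 1NF is to give each item its own row, sketched below with Python's built-in sqlite3 module. The order_item table and its column names are hypothetical; customer_id would stay in a separate table with one row per order. With one value per row-column intersection, ordinary queries can work on individual items.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per item on an order, instead of a list in a single column
CREATE TABLE order_item (
    order_id INTEGER NOT NULL,
    item     TEXT    NOT NULL,
    quantity INTEGER NOT NULL,
    PRIMARY KEY (order_id, item)
);
INSERT INTO order_item VALUES
    (1, 'hammer', 2), (1, 'screwdriver', 1),
    (2, 'shelf', 1), (2, 'bracket', 4), (2, 'screw', 8),
    (3, 'lawnmower', 1),
    (4, 'drill bit', 8);
""")

# Which orders include brackets?
bracket_orders = conn.execute(
    "SELECT order_id FROM order_item WHERE item = 'bracket'"
).fetchall()

# How many units were on order 2 in total?
units_on_order_2 = conn.execute(
    "SELECT SUM(quantity) FROM order_item WHERE order_id = 2"
).fetchone()[0]

print(bracket_orders, units_on_order_2)
```

Neither question can be answered simply while the items are stored as text such as "1 shelf, 4 brackets, 8 screws".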
Second Normal Form
• A table is said to be in Second Normal Form (2NF) if:
– it is 1NF compliant
– every non-key column is fully dependent on the (entire) primary key.
• In other words:
– tables should only store data relating to one "thing" (or entity)
– that entity should be described by its primary key.
Third Normal Form
• A table is said to be in Third Normal Form (3NF) if:
– it is 2NF compliant
– all non-key columns are mutually independent.
• For example
– if a table contains the columns “quantity” and “per_item_cost”
– you could opt to calculate and store in that same table a “total_cost” column (which would be equal to quantity * per_item_cost)
– this table would not then be 3NF compliant
• It's better to leave this column out of the table and allow the application to make this calculation
– Saves on storage space in the database
– Avoids having to update total_cost every time quantity or per_item_cost changes.
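The total_cost example can be sketched as follows, assuming Python's built-in sqlite3 module; the order_line table and the prices are invented for illustration. The total is calculated in the query each time it is needed, so it can never fall out of step with quantity or per_item_cost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No total_cost column: it depends entirely on the other two columns
conn.execute(
    "CREATE TABLE order_line (item TEXT, quantity INTEGER, per_item_cost REAL)"
)
conn.execute("INSERT INTO order_line VALUES ('bracket', 4, 2.5)")

query = "SELECT item, quantity * per_item_cost AS total_cost FROM order_line"
before = conn.execute(query).fetchall()

# Changing the quantity needs one UPDATE; the derived total follows automatically
conn.execute("UPDATE order_line SET quantity = 6 WHERE item = 'bracket'")
after = conn.execute(query).fetchall()

print(before, after)
```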
Higher Normal Forms
• Every higher normal form includes all the requirements of the lower forms. Thus, if your design is in Third Normal Form, by definition it is also in 1NF and 2NF.
• If you've normalized your database to 3NF, you've likely also achieved Boyce/Codd Normal Form (and maybe even 4NF or 5NF).
• To quote C.J. Date, the principles of database design are "nothing more than formalized common sense."
• Database design is more art than science.
– While it's relatively easy to work through the examples in this course, the process gets more difficult when you are presented with a real world problem!
Modelling a Database
• Identify and model the entities
• Identify and model the relationships between the entities