Page 1: Introduction to Databases - Trinity College Dublin...27/10/2011 1 Introduction to Databases G6921 and G6931 Web Technologies Dr. Séamus Lawless •Course Structure –1) Intro to

27/10/2011

Introduction to Databases

G6921 and G6931

Web Technologies

Dr. Séamus Lawless

• Course Structure

– 1) Intro to the Web

– 2) HTML

– 3) HTML and CSS

– Essay Information Session

– 4) Intro to Databases

– 5) Intro to Databases

– 6) PHP and MySQL

– 7) Reading Week

– 8) PHP and MySQL

– 9) PHP and XML

– 10) CMS

– 11) Analytics

– 12) Visualisation

Housekeeping

• You all have a Laptop?

• Can you all get connected to the Internet?

• Can you all get connected to your webspace through SSH?

• Can you use a URL to view your files?

• Assessment

– Continuous Assessment

• Upload the files you create!

Exercise 1

• An airline company flies many flights, but each flight is flown by only one airline

Exercise 2

• The Globex Corporation operates many factories. Each of these factories is located in a region. Each region can have more than one Globex factory. Each factory employs many employees, but each of these employees is employed by only one factory.

Exercise 3

• Dodgy Builders Construction Company is a building contractor that specialises in mid-range homes.

• Dodgy Builders has a number of customers and employees, handles a series of projects and owns lots of building equipment.

• A customer can engage the company for work on more than one project.

• Dodgy Builders employees can often work on more than one project at a time.

• Building equipment is always assigned to only one project at a time.

• Draw an ERD that models this company’s data.

• Write out each table and its associated columns.

Data Types

• Character

– char

– varchar

– text

• Date and Time

– date

– time

– timestamp

• Number

– bit

– integer

– real

– numeric

• Others

– blob

– boolean

DBMS

• Database Management System

• The goal of a DBMS is to simplify the storage of, and access to, data

• DBMS support:

– Definition

– Manipulation

– Querying

DBMS

• Well-known DBMSs:

– Proprietary

• Oracle

• Access (Microsoft)

• SQL Server (Microsoft)

• DB2 (IBM)

– Open Source

• MySQL

• SQLite

• PostgreSQL

Database Languages

• Programming languages which are used to:

– Define a database (i.e. its entities and the relationships between them)

– Manipulate its content (i.e. insert new data and update or delete existing data)

– Conduct queries (i.e. request information based upon defined criteria)

• The Structured Query Language (SQL) is the most commonly used language for Relational Databases

– It is an ISO/ANSI standard, supported (with vendor-specific variations) by all major relational DBMSs.

SQL

• SQL is split into four sets of commands which are divided based upon the tasks they are used for:

– Data Definition Language

– Data Manipulation Language

– Data Query Language

– Data Control Language

Data Definition Language

• SQL uses a collection of imperative verbs whose effect is to modify the schema of the database

• Can be used to add, change or delete definitions of tables or other objects.

• These statements can be freely mixed with other SQL statements

– so the DDL is not truly a separate language.

Data Definition Language

• CREATE Statement

– This statement is used for creating the database and its objects

• ALTER Statement

– This statement is used for modifying the database and its objects

• DROP Statement

– This statement is used for deleting the database and its objects

• TRUNCATE Statement

– This statement is used to delete all of the data in a table without disturbing its structure. Unlike DELETE, it does not work in a row-by-row manner; MySQL empties the table in a single operation.
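
The effect of the DDL statements on a schema can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database rather than the MySQL server used in this course, and the customer table, its columns and the columns() helper are hypothetical. SQLite has no TRUNCATE statement, so that one is left out.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the course's MySQL server.
conn = sqlite3.connect(":memory:")


def columns(table):
    """Return the column names of a table (empty list if the table does not exist)."""
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# CREATE adds a new object (here, a hypothetical customer table) to the schema.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name VARCHAR(100))")
after_create = columns("customer")

# ALTER changes the definition of an existing object.
conn.execute("ALTER TABLE customer ADD COLUMN phone VARCHAR(20)")
after_alter = columns("customer")

# DROP removes the object, and any data it held, from the schema.
conn.execute("DROP TABLE customer")
after_drop = columns("customer")

print(after_create, after_alter, after_drop)
```

None of these statements touches individual rows; they only change which tables and columns exist.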

Data Manipulation Language

• The data manipulation language comprises the SQL data change statements

– Modifies stored data

– Does NOT modify the schema or database objects

• This is always the responsibility of the Data Definition Language

• Used for inserting, deleting and updating data in the tables of a database

Data Manipulation Language

• INSERT Statement

– This statement is used for inserting data into a table

• DELETE Statement

– This statement is used to delete data from a table that matches certain criteria

• UPDATE Statement

– This statement is used to update data in a table that matches certain criteria

Data Query Language

• The data query language allows users of a database to formulate requests and generate reports

• There is one primary command used in SQL to query the database - the SELECT Statement

– This statement is used to query or retrieve data from a table in the database.

– A query may retrieve information from specified columns or from all of the columns in the table

– A query may have specified criteria that must be met in order for data to be returned

ST3001 – MySQL Tutorial 1 - University of Dublin, Trinity College

• SQL Statement Entered:

SELECT location
FROM department;

• Statement is sent to Database

• Result Data is Returned

• Returned Data is Displayed:

Location
-----------
Tower 221
The Docks
City Centre
Downtown

Example

• Let’s create the database from your “Dodgy Builders” exercise

• We will use MySQL on your Webspace

• Windows – Use PuTTY

– (http://www.putty.org/)

• Mac – Use Terminal

• Linux – Use Terminal

– xTerm, Gnome-Terminal

Connect

• Connect to your webspace using the command line

• Windows – In PuTTY

• Host Name – dh-sandbox.tchpc.tcd.ie

• Port – 22

• SSH

• Mac / Linux – In Terminal

• ssh [email protected]

Open MySQL

• On the command line type:

– mysql -u username -p

• Enter the password I have given you

• You are now connected to MySQL

• Try the following commands

– show databases;

– show tables;

Creating a Database

• There are several steps you need to complete:

– Create the Database

– Create the Tables

– Populate the Tables

• You can then:

– Query the Database

– Update the Data or Insert new Data

– Delete Data or Tables

Creating the Database

• CREATE Statement

– This Statement is used for creating the database and its objects

• We will be using two versions of the CREATE statement

– CREATE DATABASE

– CREATE TABLE

– Both do exactly as they say on the tin!

• Syntax

– CREATE ENTITY entity_name; (where ENTITY is DATABASE or TABLE)

Creating the Database

• Once you have typed:

– CREATE DATABASE my_database_name;

• Your empty database has been created

– Try the following:

• use my_database_name;

– This gives you access to the database

– Now try:

• show tables;

Creating the Tables

• This time we use the CREATE TABLE statement

• You need to specify:

– Table name

– Column names

– Column datatypes

• Syntax

CREATE TABLE table_name
(
column_name_1 data_type,
column_name_2 data_type,
....
column_name_n data_type
);

Creating the Tables

• So in order to create the employee table, we would use the following command:

CREATE TABLE employee
(
employee_id int,
firstname varchar(255),
surname varchar(255),
address varchar(255),
pps_number varchar(255)
);

Creating the Tables

• However, we also need to add in Constraints

– NOT NULL

– PRIMARY KEY

– FOREIGN KEY

– UNIQUE

– CHECK

– DEFAULT

• …and Attributes

– AUTO_INCREMENT

Constraints

CREATE TABLE employee
(
employee_id int AUTO_INCREMENT,
firstname varchar(255),
surname varchar(255) NOT NULL,
address varchar(255),
pps_number varchar(255),
PRIMARY KEY (employee_id)
);
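
What the constraints on the employee table do at run time can be sketched as below. This is an illustration under assumptions, not the course's MySQL setup: it uses Python's built-in sqlite3 module with an in-memory SQLite database, where the key column is written INTEGER PRIMARY KEY AUTOINCREMENT instead of MySQL's int AUTO_INCREMENT, and the varchar lengths are arbitrary.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; SQLite spells the
# auto-numbering attribute AUTOINCREMENT and ties it to INTEGER PRIMARY KEY.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY AUTOINCREMENT,
        firstname   VARCHAR(255),
        surname     VARCHAR(255) NOT NULL,
        address     VARCHAR(255),
        pps_number  VARCHAR(255)
    )
""")

# No employee_id is supplied, so the database generates one.
conn.execute("INSERT INTO employee (firstname, surname) VALUES ('Jimmy', 'McNulty')")
generated_id = conn.execute("SELECT employee_id FROM employee").fetchone()[0]

# surname is NOT NULL, so a row without a surname is refused.
try:
    conn.execute("INSERT INTO employee (firstname) VALUES ('Homer')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(generated_id, rejected)
```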

Foreign Key

• Foreign Keys also need to be declared:

CREATE TABLE project
(
project_id int AUTO_INCREMENT,
customer_id int NOT NULL,
name varchar(255),
start_date date,
PRIMARY KEY (project_id),
FOREIGN KEY (customer_id) REFERENCES customer(customer_id)
);
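
The point of declaring the foreign key is that the database then refuses a project whose customer does not exist. A minimal sketch of that behaviour follows, assuming an in-memory SQLite database driven from Python's sqlite3 module in place of MySQL; the simplified customer table and the sample rows are hypothetical. SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption specific to SQLite: foreign key checks are off until enabled.
conn.execute("PRAGMA foreign_keys = ON")

# The referenced (parent) table has to exist before the table that refers to it.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name VARCHAR(255))")
conn.execute("""
    CREATE TABLE project (
        project_id  INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        name        VARCHAR(255),
        start_date  DATE,
        FOREIGN KEY (customer_id) REFERENCES customer(customer_id)
    )
""")

conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Globex')")
conn.execute("INSERT INTO project (customer_id, name) VALUES (1, 'Extension')")

# Customer 99 does not exist, so this project row is rejected.
try:
    conn.execute("INSERT INTO project (customer_id, name) VALUES (99, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

project_count = conn.execute("SELECT COUNT(*) FROM project").fetchone()[0]
print(orphan_rejected, project_count)
```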

Creating the Tables

• See if you can create all the tables from the Dodgy Builders example

• Be aware of:

– Primary Keys

• Especially on the Resource Allocation Table

– The order in which you create the Tables

• Foreign Keys

Populating the Tables

• INSERT Statement – This Statement is used for populating the Tables with Data

• It is possible to write the INSERT statement in two ways.

• Syntax

INSERT INTO table_name
VALUES
(value_1, value_2, ... , value_n);

Populating the Tables

• However if you only want to populate certain columns in the table, then you can specify which to enter data into:

INSERT INTO table_name
(column_1, column_2, ... , column_n)
VALUES
(value_1, value_2, ... , value_n);

Populating the Tables

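• To populate the employee table, supply a value for every column in order (NULL lets AUTO_INCREMENT generate the employee_id):

INSERT INTO employee
VALUES
(NULL, 'Shay', 'Lawless', 'Bray', '11223344u');

• Or name the columns you are populating:

INSERT INTO employee
(firstname, surname, address)
VALUES
('Shay', 'Lawless', 'Bray');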
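
The difference between the two INSERT forms can be tried out as follows. This is a sketch under assumptions: Python's built-in sqlite3 module and an in-memory SQLite database replace MySQL, and the employee table is the five-column one with an auto-generated employee_id. Without a column list, the number of values has to match the number of columns in the table.

```python
import sqlite3

# Assumption: SQLite in memory stands in for the course's MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY AUTOINCREMENT,
        firstname   VARCHAR(255),
        surname     VARCHAR(255) NOT NULL,
        address     VARCHAR(255),
        pps_number  VARCHAR(255)
    )
""")

# Four values for a five-column table: refused, because no column list is given.
try:
    conn.execute("INSERT INTO employee VALUES ('Shay', 'Lawless', 'Bray', '11223344u')")
    count_mismatch = False
except sqlite3.OperationalError:
    count_mismatch = True

# Form 1: one value per column, NULL for the auto-generated key.
conn.execute("INSERT INTO employee VALUES (NULL, 'Shay', 'Lawless', 'Bray', '11223344u')")

# Form 2: only the named columns are filled; the others stay NULL.
conn.execute(
    "INSERT INTO employee (firstname, surname, address) VALUES (?, ?, ?)",
    ("Jimmy", "McNulty", "Baltimore"),
)

rows = conn.execute(
    "SELECT employee_id, firstname, pps_number FROM employee ORDER BY employee_id"
).fetchall()
print(count_mismatch, rows)
```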

Populating the Tables

• It is possible to populate many rows at once:

INSERT INTO employee
(firstname, surname, address, pps_number)
VALUES
('Shay', 'Lawless', 'Bray', '11223344u'),
('Jimmy', 'McNulty', 'Baltimore', '22334455u'),
('Homer', 'Simpson', 'Springfield', '00000000f');

Populating the Tables

• See if you can populate all the tables from the Dodgy Builders example

• Be aware of:

– AUTO_INCREMENT columns

– NOT NULL columns

• Foreign Keys

– Data Types

Query the Database

• The SELECT statement is used to query a DB

• It is the most regularly used command in the SQL language

• SELECT statements look like this:

SELECT a1, a2, … , an
FROM t1, t2, … , tm
WHERE condition

Query the Database

• The most basic SELECT statement is used to retrieve all the data from a single table

SELECT *
FROM employee;

• Think of * like a wildcard

– It is a quick way of selecting all columns

Query the Database

• You include the WHERE clause when you want to only retrieve data that matches certain criteria

• For example:

SELECT firstname, surname, pps_number
FROM employee
WHERE firstname = 'Jimmy';
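
A filtered query like this can be run from a program as sketched below. The assumptions are an in-memory SQLite database accessed through Python's sqlite3 module instead of MySQL, a cut-down employee table, and a hypothetical helper named find_by_firstname. The value being searched for is passed as a parameter (the ? placeholder) rather than pasted into the SQL text.

```python
import sqlite3

# Assumption: SQLite in memory stands in for the course's MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, "
    "firstname TEXT, surname TEXT, pps_number TEXT)"
)
conn.executemany(
    "INSERT INTO employee (firstname, surname, pps_number) VALUES (?, ?, ?)",
    [
        ("Shay", "Lawless", "11223344u"),
        ("Jimmy", "McNulty", "22334455u"),
        ("Homer", "Simpson", "00000000f"),
    ],
)


def find_by_firstname(firstname):
    """Return (firstname, surname, pps_number) for every matching employee."""
    query = (
        "SELECT firstname, surname, pps_number "
        "FROM employee "
        "WHERE firstname = ?"
    )
    return conn.execute(query, (firstname,)).fetchall()


matches = find_by_firstname("Jimmy")
no_matches = find_by_firstname("Marge")
print(matches, no_matches)
```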

Query the Database

• Often it is necessary to combine the information in two tables to answer a query

• For example, we want to know the names of all the customers and the project number(s) that they are involved in

• This type of SELECT is known as a “Join”

– Inner Join

– Outer Join

Joins

• Imagine we have two tables

table_a              table_b
id  name             id  name
--  ----             --  ----
1   Pirate           1   Rutabaga
2   Monkey           2   Pirate
3   Ninja            3   Darth Vader
4   Spaghetti        4   Ninja

• We want to retrieve all names that appear in both tables

• INNER JOIN produces only the set of records that match in both Table A and Table B.

SELECT *
FROM table_a
INNER JOIN table_b
ON table_a.name = table_b.name;

id  name    id  name
--  ----    --  ----
1   Pirate  2   Pirate
3   Ninja   4   Ninja
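
The inner join result can be reproduced with the short sketch below, which loads the two example tables into an in-memory SQLite database through Python's sqlite3 module (an assumption; the course itself uses MySQL). The ORDER BY clause is added here only so that the rows come back in a predictable order.

```python
import sqlite3

# Assumption: SQLite in memory stands in for the course's MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE table_b (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?)",
    [(1, "Pirate"), (2, "Monkey"), (3, "Ninja"), (4, "Spaghetti")],
)
conn.executemany(
    "INSERT INTO table_b VALUES (?, ?)",
    [(1, "Rutabaga"), (2, "Pirate"), (3, "Darth Vader"), (4, "Ninja")],
)

# Only names present in both tables produce a row.
inner_rows = conn.execute("""
    SELECT *
    FROM table_a
    INNER JOIN table_b
    ON table_a.name = table_b.name
    ORDER BY table_a.id
""").fetchall()

for row in inner_rows:
    print(row)
```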

Joins

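• Full outer join produces the set of all records in Table A and Table B, with matching records from both sides where available.

– If there is no match, the missing side will contain null.

– MySQL does not support FULL OUTER JOIN directly; there, the same result is usually obtained by combining a LEFT OUTER JOIN and a RIGHT OUTER JOIN with UNION.

SELECT *
FROM table_a
FULL OUTER JOIN table_b
ON table_a.name = table_b.name;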

Joins

id    name       id    name
--    ----       --    ----
1     Pirate     2     Pirate
2     Monkey     null  null
3     Ninja      4     Ninja
4     Spaghetti  null  null
null  null       1     Rutabaga
null  null       3     Darth Vader
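
Not every DBMS accepts FULL OUTER JOIN (MySQL, used in this course, does not, and older SQLite versions do not either). One common workaround, sketched here as an assumption rather than as the course's method, builds the same six rows from a LEFT OUTER JOIN plus the rows of table_b that found no partner. The sketch uses Python's sqlite3 module with an in-memory database; the aliases a and b are introduced only for this example.

```python
import sqlite3

# Assumption: SQLite in memory stands in for the course's MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE table_b (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?)",
    [(1, "Pirate"), (2, "Monkey"), (3, "Ninja"), (4, "Spaghetti")],
)
conn.executemany(
    "INSERT INTO table_b VALUES (?, ?)",
    [(1, "Rutabaga"), (2, "Pirate"), (3, "Darth Vader"), (4, "Ninja")],
)

full_rows = conn.execute("""
    -- Every row of table_a, with its match from table_b where one exists.
    SELECT a.id, a.name, b.id, b.name
    FROM table_a AS a
    LEFT OUTER JOIN table_b AS b ON a.name = b.name
    UNION ALL
    -- Plus the rows of table_b that have no match in table_a.
    SELECT a.id, a.name, b.id, b.name
    FROM table_b AS b
    LEFT OUTER JOIN table_a AS a ON a.name = b.name
    WHERE a.id IS NULL
""").fetchall()

for row in full_rows:
    print(row)
```

Python shows SQL null as None. No ORDER BY is given, so the order of the rows is not guaranteed.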

Joins

• You can also specify LEFT or RIGHT OUTER JOINs

• LEFT OUTER JOIN produces a complete set of records from table_a, with the matching records (where available) in table_b.

– If there is no match, the right side will contain null.

• RIGHT OUTER JOIN produces a complete set of records from table_b, with the matching records (where available) in table_a.

– If there is no match, the left side will contain null.

Joins

SELECT *
FROM table_a
LEFT OUTER JOIN table_b
ON table_a.name = table_b.name;

id  name       id    name
--  ----       --    ----
1   Pirate     2     Pirate
2   Monkey     null  null
3   Ninja      4     Ninja
4   Spaghetti  null  null

Joins

• INNER JOIN is the most commonly used

– If no join type is specified, INNER JOIN is the default

• In our example, we want to know the names of all the customers and the project number(s) that they are involved in:

SELECT customer.name, project.name
FROM customer, project
WHERE customer.customer_id = project.customer_id;

Joins

• Returns:

– Customer name and Project name

– Only where the Customer IDs match

Joins

employee (DEPTNO is the Foreign Key)

EMPNO  NAME      JOB         DEPTNO
7856   MCNULTY   OFFICER     30
7710   DANIELS   LIEUTENANT  40
7992   GREGGS    DETECTIVE   10
7428   MORELAND  DETECTIVE   20

department (DEPTNO is the Primary Key)

DEPTNO  NAME       LOCATION
10      NARCOTICS  TOWER 221
20      HOMICIDE   CITY CENTER
30      MARINE     DOCKS
40      EVIDENCE   DOWNTOWN

Joins

SELECT employee.name, job, department.name
FROM employee, department
WHERE employee.deptno = department.deptno;

NAME      JOB         NAME
MCNULTY   OFFICER     MARINE
DANIELS   LIEUTENANT  EVIDENCE
GREGGS    DETECTIVE   NARCOTICS
MORELAND  DETECTIVE   HOMICIDE
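
Listing two tables in FROM and matching them in WHERE gives the same rows as writing the INNER JOIN explicitly. The sketch below checks this on the employee and department data, assuming an in-memory SQLite database accessed through Python's sqlite3 module rather than MySQL; the ORDER BY empno clause is added only to make the row order predictable.

```python
import sqlite3

# Assumption: SQLite in memory stands in for the course's MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (deptno INTEGER PRIMARY KEY, name TEXT, location TEXT)")
conn.execute(
    "CREATE TABLE employee (empno INTEGER PRIMARY KEY, name TEXT, job TEXT, "
    "deptno INTEGER REFERENCES department(deptno))"
)
conn.executemany(
    "INSERT INTO department VALUES (?, ?, ?)",
    [
        (10, "NARCOTICS", "TOWER 221"),
        (20, "HOMICIDE", "CITY CENTER"),
        (30, "MARINE", "DOCKS"),
        (40, "EVIDENCE", "DOWNTOWN"),
    ],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (7856, "MCNULTY", "OFFICER", 30),
        (7710, "DANIELS", "LIEUTENANT", 40),
        (7992, "GREGGS", "DETECTIVE", 10),
        (7428, "MORELAND", "DETECTIVE", 20),
    ],
)

# Join written as a list of tables plus a WHERE condition.
implicit = conn.execute("""
    SELECT employee.name, job, department.name
    FROM employee, department
    WHERE employee.deptno = department.deptno
    ORDER BY empno
""").fetchall()

# The same join written with INNER JOIN ... ON.
explicit = conn.execute("""
    SELECT employee.name, job, department.name
    FROM employee
    INNER JOIN department ON employee.deptno = department.deptno
    ORDER BY empno
""").fetchall()

print(implicit == explicit)
```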

Query the Database

• Often we need to list results in a particular order.

– ascending order, descending order, based on either numerical value or text value.

• To do this we can use the ORDER BY keyword

SELECT customer.name, project.name
FROM customer, project
WHERE customer.customer_id = project.customer_id
ORDER BY customer.name ASC;
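
The effect of ASC and DESC can be compared with the sketch below. It assumes an in-memory SQLite database accessed through Python's sqlite3 module rather than MySQL, and the customer and project rows are made-up sample data. A second sort column (project.name) is added so that projects of the same customer also come back in a fixed order.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE project (project_id INTEGER PRIMARY KEY, customer_id INTEGER, name TEXT)"
)
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Simpson"), (2, "Burns")])
conn.executemany(
    "INSERT INTO project VALUES (?, ?, ?)",
    [(1, 1, "Garage"), (2, 2, "Mansion wing"), (3, 1, "Attic")],
)

# Customers in ascending (A to Z) order.
ascending = conn.execute("""
    SELECT customer.name, project.name
    FROM customer, project
    WHERE customer.customer_id = project.customer_id
    ORDER BY customer.name ASC, project.name ASC
""").fetchall()

# Customers in descending (Z to A) order; projects still A to Z.
descending = conn.execute("""
    SELECT customer.name, project.name
    FROM customer, project
    WHERE customer.customer_id = project.customer_id
    ORDER BY customer.name DESC, project.name ASC
""").fetchall()

print(ascending)
print(descending)
```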

Update Data

• Once there is data in a table, it may become necessary to modify the data.

• To do so, we can use the UPDATE command.

UPDATE table_name
SET column1=value1, column2=value2
WHERE some_column=some_value;

UPDATE employee
SET firstname = 'Séamus'
WHERE pps_number = '11223344u';
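
The WHERE clause decides which rows an UPDATE touches; without it, every row in the table is changed. A minimal sketch follows, assuming an in-memory SQLite database accessed through Python's sqlite3 module instead of MySQL and a cut-down employee table.

```python
import sqlite3

# Assumption: SQLite in memory stands in for the course's MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, "
    "firstname TEXT, surname TEXT, pps_number TEXT)"
)
conn.executemany(
    "INSERT INTO employee (firstname, surname, pps_number) VALUES (?, ?, ?)",
    [("Shay", "Lawless", "11223344u"), ("Jimmy", "McNulty", "22334455u")],
)

# Only the row whose pps_number matches is modified.
cursor = conn.execute(
    "UPDATE employee SET firstname = ? WHERE pps_number = ?",
    ("Séamus", "11223344u"),
)
rows_changed = cursor.rowcount

names = conn.execute(
    "SELECT firstname, surname FROM employee ORDER BY employee_id"
).fetchall()
print(rows_changed, names)
```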

Delete Data

• Sometimes it may be necessary to remove records from a table.

• To do so, we can use the DELETE command

DELETE FROM table_name
WHERE some_column=some_value;

DELETE FROM employee
WHERE firstname = 'Séamus' AND surname = 'Lawless';
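
A DELETE removes only the rows that satisfy every condition in its WHERE clause. The sketch below shows this on a small employee table, assuming an in-memory SQLite database accessed through Python's sqlite3 module rather than MySQL.

```python
import sqlite3

# Assumption: SQLite in memory stands in for the course's MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, firstname TEXT, surname TEXT)"
)
conn.executemany(
    "INSERT INTO employee (firstname, surname) VALUES (?, ?)",
    [("Séamus", "Lawless"), ("Jimmy", "McNulty"), ("Homer", "Simpson")],
)

# Both conditions must hold for a row to be deleted.
conn.execute(
    "DELETE FROM employee WHERE firstname = ? AND surname = ?",
    ("Séamus", "Lawless"),
)

remaining = [
    row[0] for row in conn.execute("SELECT surname FROM employee ORDER BY employee_id")
]
print(remaining)
```

The table itself still exists afterwards; only DROP TABLE removes the table definition.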

Delete a Table

• It may be necessary to remove a table in the database

• The DROP TABLE command is used to do this.

• The syntax for DROP TABLE is simply:

DROP TABLE table_name;

• So, to drop the table called customer:

DROP TABLE customer;

Other SQL Commands

• There are numerous other SQL commands that are useful to know

• You can find them all here:

– http://dev.mysql.com/doc/refman/5.5/en/sql-syntax.html

• There are useful tutorials here:

– http://www.1keydata.com/sql

– http://www.w3schools.com/sql/

Normalisation

• When designing databases you are faced with a series of choices.

– How many tables will there be and what will they represent?

– Which columns will go in which tables?

– What will the relationships between the tables be?

• The answer to each of these questions lies in normalisation.

• Normalisation is the process of simplifying the design of a database so that it achieves its optimum structure.

Normalisation

• Normalisation theory gives us the concept of “normal forms” to assist in achieving the optimum structure.

• The normal forms are a set of rules that you apply to your database

– Each higher normal form achieves better, more efficient design.

• The normal forms are:

– First Normal Form

– Second Normal Form

– Third Normal Form

– Boyce Codd Normal Form

– Fourth Normal Form

– Fifth Normal Form

Normalisation

• Normal Forms are based on relations rather than tables. A relation is an algebraic entity that has the following attributes:

– It describes one entity.

– It has no duplicate rows; hence there is always a primary key.

– The columns are unordered.

– The rows are unordered.

• For all practical purposes the terms table and relation are interchangeable once a table meets the definition of a relation.

First Normal Form

• First Normal Form (1NF) says that all column values must be “atomic”.

– The word atom comes from the Greek atomos, meaning indivisible (or literally "not to cut").

• 1NF dictates that at every row-column intersection, there exists only one value, not a list of values.

• The benefits from this rule should be fairly obvious. If lists of values are stored in a single column, there is no simple way to manipulate those values.

First Normal Form

• The following would violate 1NF

order_id  customer_id  items
1         7            2 hammers, 1 screwdriver
2         33           1 shelf, 4 brackets, 8 screws
3         12           1 lawnmower
4         4            8 drill bits
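
One way to bring this data into 1NF, sketched here as an assumption rather than as the only correct design, is to move the items into their own table with one row per order and item. The table names orders and order_item and the singular item names are hypothetical (ORDER on its own is a reserved word in SQL). The sketch uses Python's sqlite3 module with an in-memory SQLite database instead of MySQL.

```python
import sqlite3

# Assumption: SQLite in memory stands in for the course's MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL)")
conn.execute("""
    CREATE TABLE order_item (
        order_id INTEGER NOT NULL REFERENCES orders(order_id),
        item     TEXT    NOT NULL,
        quantity INTEGER NOT NULL,
        PRIMARY KEY (order_id, item)
    )
""")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 7), (2, 33), (3, 12), (4, 4)])

# Each row-column intersection now holds exactly one value.
conn.executemany(
    "INSERT INTO order_item VALUES (?, ?, ?)",
    [
        (1, "hammer", 2), (1, "screwdriver", 1),
        (2, "shelf", 1), (2, "bracket", 4), (2, "screw", 8),
        (3, "lawnmower", 1),
        (4, "drill bit", 8),
    ],
)

# Questions that were awkward with a list in one column become simple queries.
orders_with_screws = conn.execute(
    "SELECT order_id FROM order_item WHERE item = 'screw'"
).fetchall()
units_per_order = conn.execute(
    "SELECT order_id, SUM(quantity) FROM order_item GROUP BY order_id ORDER BY order_id"
).fetchall()
print(orders_with_screws, units_per_order)
```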

Second Normal Form

• A table is said to be in Second Normal Form (2NF) if:

– it is 1NF compliant

– every non-key column is fully dependent on the (entire) primary key.

• In other words:

– tables should only store data relating to one "thing" (or entity)

– that entity should be described by its primary key.

Third Normal Form

• A table is said to be in Third Normal Form (3NF) if:

– it is 2NF compliant

– all non-key columns are mutually independent.

• For example

– if a table contains the columns “quantity” and “per_item_cost”

– you could opt to calculate and store in that same table a “total_cost” column (which would be equal to quantity * per_item_cost)

– this table would not then be 3NF compliant

• It's better to leave this column out of the table and allow the application to make this calculation

– Saves on storage space in the database

– Avoids having to update total_cost every time quantity or per_item_cost changes.
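
Leaving total_cost out of the table and calculating it when it is needed could look like the sketch below. The assumptions are an in-memory SQLite database accessed through Python's sqlite3 module instead of MySQL, a hypothetical order_line table, and whole-number costs to keep the arithmetic exact.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; order_line is a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_line (
        order_line_id INTEGER PRIMARY KEY,
        quantity      INTEGER NOT NULL,
        per_item_cost INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO order_line (quantity, per_item_cost) VALUES (?, ?)",
    [(2, 5), (4, 3)],
)

# total_cost is derived in the query instead of being stored.
TOTALS_QUERY = """
    SELECT order_line_id, quantity * per_item_cost AS total_cost
    FROM order_line
    ORDER BY order_line_id
"""
totals_before = conn.execute(TOTALS_QUERY).fetchall()

# Changing quantity needs no second update to keep a stored total in step.
conn.execute("UPDATE order_line SET quantity = 3 WHERE order_line_id = 1")
totals_after = conn.execute(TOTALS_QUERY).fetchall()
print(totals_before, totals_after)
```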

Higher Normal Forms

• Each higher normal form includes all the requirements of the lower forms. Thus, if your design is in Third Normal Form, by definition it is also in 1NF and 2NF.

• If you've normalized your database to 3NF, you've likely also achieved Boyce/Codd Normal Form (and maybe even 4NF or 5NF).

• To quote C.J. Date, the principles of database design are "nothing more than formalized common sense."

• Database design is more art than science.

– While it's relatively easy to work through the examples in this course, the process gets more difficult when you are presented with a real world problem!

Modelling a Database

• Identify and model the entities

• Identify and model the relationships between the entities

• Identify and model the attributes

• Identify unique identifiers for each entity

• Normalise

Physical Database Design

• Entities become tables in the database

– You have already named the tables

• Attributes become columns

– Choose an appropriate datatype for each column

• Unique identifiers become primary keys

– These can never be NULL

• Relationships are modelled as ‘foreign keys’

– Attributes added to tables to make links

