Basic SQL. Implementation Schemes: Once we have a set of relation schemes, we can translate them into implementation schemes. We use to express implementation.

Basic SQL

Implementation Schemes:Once we have a set of relation schemes,

we can translate them into implementation schemes. We use to express implementation schemes a Data-Definition Language (DDL).

The DDL in next slide is written in SQL, a standardized commercial database language for specifying updating, and querying a database.

CREATE TABLE Library ( 1 L_Code VCHAR(10) NOT NULL UNIQUE 2 L_City VCHAR(10) NOT NULL 3 L_Size INTEGER NOT NULL 4 B_Code INTEGER 5 6 PRIMARY KEY (L_Code), 7 FOREIGN KEY (B_Code ) );

CREATE TABLE Library ( 1 L_Code VCHAR(10) NOT NULL UNIQUE 2 L_City VCHAR(10) NOT NULL 3 L_Size INTEGER NOT NULL 4 B_Code INTEGER 5 6 PRIMARY KEY (L_Code), 7 FOREIGN KEY (B_Code ) );

What is update operation:

Update operations insert tuples into relations, delete tuples form relations,and modify tuples in relations.

What is Retrieval operation:

Retrieval operations obtain tuples from relations that satisfy certain conditions and combine these tuples in various ways to produce result relations.

Basic Structure

The SELECT clause is used to list the attributes desired in the result of a query.

The FROM clause lists the relations (tables) to be scanned in the evaluation of the expression.

The WHERE clause consists of a predicate involving attributes of the relations that appear in the FROM clause.

Form

A typical SQL query has the form:

SELECT A1, A2, . . . , An

FROM r1, r2, . . . , r3

WHERE P Each A1 represents an attribute and each r1 a

relation. P is a predicate.

Form

If the WHERE clause is omitted, the predicate P is true. The list of attributes may be replaced with a star (*) to select all attributes of all relations appearing in the FROM clause.

Some Examples of SQL Queries

Find the names of all the branches in the deposit relation (relation):

SELECT branch-name

FROM deposit Now, we want no duplicates in the list so we

add the DISTINCT clause:

SELECT DISTINCT branch-name

FROM deposit

More Examples

If we want to make sure duplicates are not removed, we can add the ALL clause.

SELECT ALL branch-name

FROM deposit

Constraints

If we want to find all the customers in the deposit relation who bank at the branch "Perryridge."

SELECT DISTINCT customer-name

FROM deposit

WHERE branch-name = "Perryridge"

Constraints - continued

Or we can pull the customers in the borrow relation who bank at the branch "Perryridge."

SELECT DISTINCT customer-name

FROM borrow

WHERE branch-name = "Perryridge"

Constraints - continued

We can use operators to find balances between $90,000 and $100,000.

SELECT customer-nameFROM depositWHERE balance BETWEEN 90000 AND 100000

This is the same as:

SELECT customer-nameFROM depositWHERE balance 100000 AND balance 90000

Operators

Other operators such as <. >, +, /, *, = etc work the same as does the OR constraint.

Substrings

As part of the WHERE clause we can do substrings. For example, we can find the customers who live on Main Street, even though their address only includes Main as part of the address.

SELECT customer-nameFROM customerWHERE street LIKE "%Main%"

Joining Relations We can find the name and city of all

customers from the CUSTOMER relation who have deposits at the Perryridge branch (from the DEPOSIT relation)

To do this, you can specify the entire relation name or make a pseudonym for it with a letter:SELECT DISTINCT customer-name, customer-cityFROM deposit, customerWHERE deposit.customer-name = customer.customer-name

Joining

This is the same as:

SELECT DISTINCT customer-name, customer-city

FROM deposit D, customer C

WHERE D.customer-name = C.customer-name

Other Functions

AVG = Average, MIN = Minimum, MAX = Maximum, SUM = Total, COUNT = Count

Order by = sort order for result Group by = places the result in groups

Insert and Delete Adds and deletions to relations can be made with

the INSERT and DELETE clauses. For example:

INSERT INTO deposit

VALUES ("Perryridge", 9732, "Smith", 1200) This will put the values into the relation in order,

if you wish be more specific:

INSERT INTO deposit (branch-name, account-number, customer-name, balance)

VALUES ("Perryridge", 9732, "Smith", 1200)

Creating a Table

CREATE TABLE r (A1, D1, A2, D2, . . . An, Dn) Each A is the name of an attribue in the relation

r and D is the data type. The DROP clause eliminates a table (relation). The DROP clause differs from the DELETE

clause in that DELETE only eliminates tuples in r leaving the relation itself, DROP eliminates the relation and all information relating to it.

Data DefinitionThe SQL Data Definition Language (DDL)

allows us to create and destroy database objects such as schemas, domains, tables, views, and indexes. The ISO standard also allows the creation of assertions,

character sets, collations

and translations.

The main SQL data definition language statements are: CREATE SCHEMA DROP SCHEMA CREATE DOMAIN ALTER DOMAIN DROP DOMAIN

CREATE TABLE ALTER TABLE DROP TABLE CREATE VIEW DROP VIEW

While CREATE INDEX and

DROP INDEX are provided

by DBMS

Joins: Combining Data from Multiple Tables Inner Joins Equi-Joins and Natural Joins Self Joins Outer Joins (Left, Right and Full), Nulls Union Operators Merge Joins Hash Joins

Inner Joins

Inner join is the default join type. Therefore, you are not required to specify it. But, I recommend that you do for clarity.

The SELECT clause says which columns to output, the FROM clause lists the tables, and the WHERE clause tells SQL Server how to join the tables.

SELECT pub_name, title FROM titles INNER JOIN publishers ON titles.pub_id = publishers.pub_id

Equi-Joins

An equijoin is an INNER JOIN in which you ask for the join columns from both tables. This join will return redundant data. The whole premise of this join is that the data in the join column is the same in both tables. An equijoin would look like this:

SELECT pub_name, publishers.pub_id, titles.pub_id, title FROM publishers INNER JOIN titles ON publishers.pub_id = titles.pub_id

Natural Joins

Natural Join is the same as equi-join except you show the join criteria field only once. Since we want to see the publishers info, here is the natural join query.

SELECT pub_name, publishers.pub_id, title FROM titles INNER JOIN publishers ON titles.pub_id = publishers.pub_id

Self-Joins Useful join to identify multiple names for same address.

SELECT au1.au_fname, au1.au_lname, au2.au_fname, au2.au_lname

FROM authors au1, authors au2

WHERE au1.state = 'UT'

AND au1.au_lname = au2.au_lname

AND au1.au_id < au2.au_id Notice the table name aliases au1 and au2 for same table,

named authors.

Left Outer Joins (*=) Left Outer Join means, "Give me all the data from the

table to the left of the Join (in this case, titles) and the matching data from the table to the right of the Join (in this case, sales)." So you see all the titles, including those with no sales. Where sales data exists for a title, it is output. Where no sales data exists, you see Nulls.

SELECT substring (title,1,40) AS Book, stor_id, ord_num, qty, ord_date FROM titles, sales WHERE titles.title_id *= sales.title_id

Left verses Right Joins In the Left Outer Join example, no sales data could appear if

no entries were in the titles table. Thus, if you just change Left to Right and run the query, you

see what is really an Inner Join.

Right Outer Joins (=*)

But if you flip the tablesSELECT substring (title,1,40) AS Book, stor_id, ord_num, qty,

ord_date FROM sales RIGHT OUTER JOIN titles ON sales.title_id = titles.title_id

then you have a RIGHT OUTER JOIN. In other words, titles Left Join sales produces the same results as sales Right Join titles. In fact, internally, SQL Server doesn't process Right Joins. It just flips them around and handles them as Left Joins.

Right Outer Join returns all the rows from the table to the right of the join and only the matching rows from the table on the left.

Full Outer Joins (*=*) The Full Outer Join combines the left and right outer joins;

and it gives you all the data from both table A and table B, matching rows when appropriate and filling in NULLS when there is no match.

Select v.VendorName, i.InventoryDscrFROM tblInventory i FULL OUTER JOIN tblVendor vON i.VendorId = v. VendorIDOrder by V.VendorName

Outer Joins with Nulls

When results show nulls, does that indicate that the table really doesn’t have that data or does it just not show because of the join?

Union Operators

The UNION operator combines two tables in a single result set that contains all rows appearing in either or both tables.

UNION processing is different depending on whether you use the ALL option.

If you specify ALL, the optimizer displays all resulting rows, including duplicates.

If you don't specify ALL, the optimizer processes UNION the same way it processes the DISTINCT clause, removing all duplicate rows.

Union Operators .. example

SELECT Name, City, 'Supplier' AS Source FROM Suppliers WHERE Country = 'Canada' UNION SELECT Name, City, 'Customer' FROM Customers WHERE Country = 'Canada' ORDER BY City, Source;

Merge Joins

The SQL Server query optimizer uses a variety of algorithms and new join techniques to find the best available plan for your queries.

If the optimizer decides that a merge join is optimal, the query execution scans two sorted inputs and merges them.

For a merge join, the query processor sorts both inputs on the join column; if the two inputs to the join are already sorted on the join column (for example, if each has a clustered index on the join column), merge join is a logical choice.

SQL Server scans both the Orders table and the OrderDetails table. For each row in Orders, the matching rows in OrderDetails are merged.

Merge Joins .. logic If a one-to-many relationship exists between the inputs, SQL

Server follows a series of steps for an inner join executed with a merge join strategy.

First, it gets a row from each input, then compares the join columns of the rows.

If the join columns match, SQL Server returns the requested columns from each row.

If the columns don't match, SQL Server discards the row with the lower value and gets the next row from that input.

SQL Server repeats these steps until it has processed all rows in both inputs, scanning both inputs only once.

Merge Joins .. An example

To demonstrate a situation in which the optimizer will choose merge join, we inserted 10,000 rows into the Orders table. For each Order, we inserted five products into the OrderDetails table, resulting in 50,000 rows. Both Orders and OrderDetails have clustered indexes on the OrderID column. Then we executed the following query against the tables:

SELECT O.OrderID, O.OrderDate, OD.ProductID, OD.Quantity FROM Orders O INNER JOIN OrderDetails OD ON O.OrderID = OD.OrderID

Hash Joins

The main reason why the SQL Server optimizer chooses the hash join algorithm is a lack of adequate indexes on the join columns.

If the optimizer chooses a hash join, the execution plan involves building a hash table, which is like building a special index on the fly.

The hash join strategy has two phases: build and probe.

Hash Joins ..

During the build phase, the query processor builds a hash table by scanning each value in the build input (usually the smaller table) and applying the hashing algorithm to the key. The hash table consists of linked lists called hash buckets. The hash function applied to each hash key in the build input determines the relevant hash bucket.

During the probe phase, the query processor scans each row from the probe input (the second table) and computes the same hash value on the hash key to find any matches in the corresponding hash bucket.

Hash Joins .. demonstration

To demonstrate a situation in which the optimizer usually chooses a hash join, drop the constraints and both clustered indexes from the Orders and OrderDetails tables. Now execute the same query we ran in the merge join example:

SELECT O.OrderID, O.OrderDate, OD.ProductID, OD.Quantity FROM Orders O INNER JOIN OrderDetails OD ON O.OrderID = OD.OrderID

Hash Joins .. Demo explained ..

The Orders table is the best candidate for the build input because it's considerably smaller. The linked lists start with the function's possible results; in this case, there are only five. Linked to each bucket are records containing all the values the query will return from the build input table that correspond to the bucket. Applying the hash function to the join column of each build input record determines the correspondence. OrderID and OrderDate are the only columns you need from the build input (the Orders table), so these values are the only ones that stay in each record of the hash table.

Hash Joins .. Demo explained …

SQL Server performs a table scan on the smaller table, Orders, which is the build input. Then it builds a hash table with the values that result from applying the hash function to each scanned row. SQL Server also performs a table scan on the probe—the OrderDetails table—and looks for the matching values in the hash table by applying to each scanned row the same formula that it used to build the hash table.

Basic SQL. Implementation Schemes: Once we have a set of relation schemes, we can translate them into implementation schemes. We use to express implementation.

Documents