3/2/2011
SQL
Data Definition Language
Allows the specification of:
o The schema for each relation, including attribute types.
o Integrity constraints.
o Authorization information for each relation.
o Non-standard SQL extensions also allow specification of:
    o The set of indices to be maintained for each relation.
    o The physical storage structure of each relation on disk.
Basic Query Structure
• A typical SQL query has the form:
    select A1, A2, ..., An
    from r1, r2, ..., rm
    where P
• Ai represents an attribute
• ri represents a relation
• P is a predicate.
• This query is equivalent to the relational algebra expression
    $\Pi_{A_1, A_2, \ldots, A_n}(\sigma_P(r_1 \times r_2 \times \cdots \times r_m))$
• The result of an SQL query is a relation.
Semicolon is the standard way to separate each SQL statement in database systems that allow more than one SQL statement to be executed in the same call to the server.
Basic SQL Query
    SELECT [DISTINCT] target-list
    FROM relation-list
    WHERE qualification
• relation-list: A list of relation names
• target-list: A list of attributes of relations in relation-list
• qualification: Comparisons (Attr op const or Attr1 op Attr2, where op is one of <, >, =, ≤, ≥, ≠) combined using AND, OR and NOT.
• DISTINCT is an optional keyword indicating that the answer should not contain duplicates. Default is that duplicates are not eliminated!
Conceptual Evaluation Strategy
Semantics of an SQL query defined in terms of the following conceptual evaluation strategy:
• Compute the cross-product of relation-list.
• Discard resulting tuples if they fail qualifications.
• Delete attributes that are not in target-list.
• If DISTINCT is specified, eliminate duplicate rows.
This strategy is probably the least efficient way to compute a query! An optimizer will find more efficient strategies to compute the same answers.
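The four steps of the conceptual evaluation strategy can be traced in plain Python. This is only a minimal sketch of the semantics, not how a real engine works; the helper name `evaluate` and the two tiny relations are made up for illustration.

```python
from itertools import product

# Hypothetical sample relations (rows as dicts); the data is made up.
borrower = [
    {"customer_name": "Jones", "loan_number": "L-170"},
    {"customer_name": "Smith", "loan_number": "L-230"},
]
loan = [
    {"loan_number": "L-170", "branch_name": "Downtown", "amount": 3000},
    {"loan_number": "L-230", "branch_name": "Redwood", "amount": 4000},
]

def evaluate(relations, qualification, target_list, distinct=False):
    """Evaluate a query by the four conceptual steps (deliberately naive)."""
    answer = []
    # Step 1: cross-product of the relation-list.
    for combo in product(*relations):
        # Step 2: discard combinations that fail the qualification.
        if not qualification(*combo):
            continue
        # Step 3: keep only the attributes named in the target-list.
        merged = {}
        for row in combo:
            merged.update(row)
        answer.append(tuple(merged[a] for a in target_list))
    # Step 4: eliminate duplicates only if DISTINCT was requested.
    if distinct:
        answer = list(dict.fromkeys(answer))
    return answer

# select customer_name, amount from borrower, loan
# where borrower.loan_number = loan.loan_number
result = evaluate(
    [borrower, loan],
    lambda b, l: b["loan_number"] == l["loan_number"],
    ["customer_name", "amount"],
)
```

The cross-product here has 2 x 2 = 4 combinations, of which only two survive the qualification, which is why an optimizer would avoid materializing the product in the first place.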
The select Clause
• The select clause lists the attributes desired in the result of a query
– Corresponds to the projection operation of the relational algebra
• Example: find the names of all branches in the loan relation:
    select branch_name
    from loan
• In the relational algebra, the query would be:
    $\Pi_{branch\_name}(loan)$
• NOTE: SQL names are case insensitive (i.e., you may use upper- or lower-case letters).
• E.g. Branch_Name ≡ BRANCH_NAME ≡ branch_name
The where Clause
• The where clause specifies conditions that the result must satisfy
– Corresponds to the selection predicate of the relational algebra.
• To find all loan numbers for loans made at the Perryridge branch with loan amounts greater than $1200:
    select loan_number
    from loan
    where branch_name = 'Perryridge' and amount > 1200
• Comparison results can be combined using the logical connectives and, or, and not.
The from Clause
• The from clause lists the relations involved in the query
– Corresponds to the Cartesian product operation of the relational algebra.
• Find the Cartesian product borrower × loan:
    select *
    from borrower, loan
• Find the name, loan number and loan amount of all customers having a loan at the Perryridge branch:
    select customer_name, borrower.loan_number, amount
    from borrower, loan
    where borrower.loan_number = loan.loan_number and
          branch_name = 'Perryridge'
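The where and from clauses can be tried out with Python's built-in sqlite3 module. This is a small sketch against an in-memory database; the rows are made-up sample data, and only the table and column names come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table loan (loan_number text, branch_name text, amount integer);
    create table borrower (customer_name text, loan_number text);
    -- Made-up sample rows, for illustration only.
    insert into loan values ('L-1', 'Perryridge', 1500),
                            ('L-2', 'Perryridge', 900),
                            ('L-3', 'Downtown', 2000);
    insert into borrower values ('Adams', 'L-1'), ('Hayes', 'L-2'), ('Jones', 'L-3');
""")

# where clause: selection on a single relation.
big_perryridge = conn.execute("""
    select loan_number
    from loan
    where branch_name = 'Perryridge' and amount > 1200
""").fetchall()

# from clause alone: every borrower row paired with every loan row (3 x 3).
product_rows = conn.execute("select * from borrower, loan").fetchall()

# The join condition and the branch predicate cut the product down.
perryridge = conn.execute("""
    select customer_name, borrower.loan_number, amount
    from borrower, loan
    where borrower.loan_number = loan.loan_number
      and branch_name = 'Perryridge'
    order by customer_name
""").fetchall()
```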
The Rename Operation
• SQL allows renaming relations and attributes using the as clause:
    old-name as new-name
• E.g. Find the name, loan number and loan amount of all customers; rename the column name loan_number as loan_id.
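One way the rename example might be written is sketched below with Python's sqlite3 module. The query text is an assumption reconstructed from the description above, and the single row in each table is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table loan (loan_number text, branch_name text, amount integer);
    create table borrower (customer_name text, loan_number text);
    insert into loan values ('L-1', 'Perryridge', 1500);   -- made-up row
    insert into borrower values ('Adams', 'L-1');          -- made-up row
""")

cursor = conn.execute("""
    select customer_name, borrower.loan_number as loan_id, amount
    from borrower, loan
    where borrower.loan_number = loan.loan_number
""")
# cursor.description holds one 7-tuple per result column; item 0 is the name.
column_names = [col[0] for col in cursor.description]
rows = cursor.fetchall()
```

The data is unchanged by the rename; only the column heading of the result relation differs.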
Tuple Variables
• Tuple variables are defined in the from clause via the use of the as clause.
• Find the customer names and their loan numbers and amount for all customers having a loan at some branch:
    select customer_name, T.loan_number, S.amount
    from borrower as T, loan as S
    where T.loan_number = S.loan_number
• Find the names of all branches that have greater assets than some branch located in Brooklyn:
    select distinct T.branch_name
    from branch as T, branch as S
    where T.assets > S.assets and S.branch_city = 'Brooklyn'
• Keyword as is optional and may be omitted:
    borrower as T ≡ borrower T
• Some databases, such as Oracle, require as to be omitted
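The Brooklyn query compares a relation with itself, which is only possible because T and S name two independent tuple variables over branch. A small sqlite3 sketch, with made-up branch rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table branch (branch_name text, branch_city text, assets integer);
    -- Made-up sample rows.
    insert into branch values ('Downtown', 'Brooklyn', 900),
                              ('Brighton', 'Brooklyn', 700),
                              ('Perryridge', 'Horseneck', 800),
                              ('Mianus', 'Horseneck', 400);
""")

with_as = conn.execute("""
    select distinct T.branch_name
    from branch as T, branch as S
    where T.assets > S.assets and S.branch_city = 'Brooklyn'
    order by T.branch_name
""").fetchall()

# Same query with the optional keyword "as" omitted.
without_as = conn.execute("""
    select distinct T.branch_name
    from branch T, branch S
    where T.assets > S.assets and S.branch_city = 'Brooklyn'
    order by T.branch_name
""").fetchall()
```

With this data the smallest Brooklyn branch has assets of 700, so every branch with more than 700 qualifies.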
String Operations
• SQL includes a string-matching operator for comparisons on character strings. The operator “like” uses patterns that are described using two special characters:
– percent (%). The % character matches any substring.
– underscore (_). The _ character matches any single character.
• Find the names of all customers whose street includes the substring “Main”:
    select customer_name
    from customer
    where customer_street like '%Main%'
• Match the string “Main%”:
    like 'Main\%' escape '\'
• SQL supports a variety of string operations such as
– concatenation (using “||”)
– converting from upper to lower case (and vice versa)
– finding string length, extracting substrings, etc.
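The pattern characters and a few string functions can be checked with sqlite3. This is a sketch with made-up customer rows; the helper `names` is hypothetical, and `upper` and `length` are SQLite's names for functions that other systems may spell differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customer (customer_name text, customer_street text);
    -- Made-up sample rows.
    insert into customer values ('Adams', 'Spring'),
                                ('Brooks', 'Main'),
                                ('Curry', 'North Main St'),
                                ('Glenn', 'Main%');
""")

def names(where_clause):
    """Return customer names matching a where clause, in alphabetic order."""
    sql = ("select customer_name from customer where "
           + where_clause + " order by customer_name")
    return [row[0] for row in conn.execute(sql)]

# % matches any substring, including the empty one.
contains_main = names("customer_street like '%Main%'")
# With an escape character, \% stands for a literal percent sign.
literal_percent = names(r"customer_street like 'Main\%' escape '\'")
# _ matches exactly one character, so 'Main_' needs five characters.
five_chars = names("customer_street like 'Main_'")
# || concatenates; upper() and length() are SQLite string functions.
label = conn.execute("""
    select upper(customer_name) || ':' || length(customer_street)
    from customer where customer_name = 'Brooks'
""").fetchone()[0]
```

Note that SQLite's like is case-insensitive for ASCII letters by default, which is not true of every database system.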
Ordering the Display of Tuples
• List in alphabetic order the names of all customers having a loan in Perryridge branch:
    select distinct customer_name
    from borrower, loan
    where borrower.loan_number = loan.loan_number and
          branch_name = 'Perryridge'
    order by customer_name
• We may specify desc for descending order or asc for ascending order, for each attribute; ascending order is the default.
– Example: order by customer_name desc
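A short sqlite3 sketch of asc versus desc on the Perryridge query. The rows are made up; Adams is given two Perryridge loans so that distinct has something to remove.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table loan (loan_number text, branch_name text);
    create table borrower (customer_name text, loan_number text);
    -- Made-up sample rows; Adams holds two Perryridge loans.
    insert into loan values ('L-1', 'Perryridge'), ('L-2', 'Perryridge'),
                            ('L-3', 'Downtown');
    insert into borrower values ('Hayes', 'L-1'), ('Adams', 'L-2'),
                                ('Adams', 'L-1'), ('Jones', 'L-3');
""")

query = """
    select distinct customer_name
    from borrower, loan
    where borrower.loan_number = loan.loan_number
      and branch_name = 'Perryridge'
    order by customer_name {direction}
"""
ascending = [r[0] for r in conn.execute(query.format(direction="asc"))]
descending = [r[0] for r in conn.execute(query.format(direction="desc"))]
```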
Set Operations
• The set operations union, intersect, and except operate on relations and correspond to the relational algebra operations ∪, ∩, and −.
• Each of the above operations automatically eliminates duplicates; to retain all duplicates use the corresponding multiset versions union all, intersect all and except all.
• Suppose a tuple occurs m times in r and n times in s; then it occurs:
– m + n times in r union all s
– min(m, n) times in r intersect all s
– max(0, m – n) times in r except all s
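The multiset counts can be checked with Python's collections.Counter, whose +, & and - operators happen to follow the same m + n, min(m, n) and max(0, m – n) rules. This models the semantics only; it is not a database query, and the multisets r and s are made up.

```python
from collections import Counter

# Multisets r and s: tuple 'a' occurs m = 3 times in r and n = 1 time in s;
# tuple 'b' occurs m = 1 time in r and n = 2 times in s.
r = Counter({"a": 3, "b": 1})
s = Counter({"a": 1, "b": 2})

union_all = r + s        # m + n occurrences of each tuple
intersect_all = r & s    # min(m, n) occurrences
except_all = r - s       # max(0, m - n); tuples with no copies left are dropped
```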
Set Operations
• Find all customers who have a loan, an account, or both:
    (select customer_name from depositor)
    union
    (select customer_name from borrower)
• Find all customers who have both a loan and an account:
    (select customer_name from depositor)
    intersect
    (select customer_name from borrower)
• Find all customers who have an account but no loan:
    (select customer_name from depositor)
    except
    (select customer_name from borrower)
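The three set queries can be run side by side with sqlite3. This is a sketch with made-up rows; the parentheses around each select are dropped because SQLite does not accept them in compound queries, and the helper `names` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table depositor (customer_name text, account_number text);
    create table borrower (customer_name text, loan_number text);
    -- Made-up sample rows; Johnson has two accounts.
    insert into depositor values ('Hayes', 'A-1'), ('Johnson', 'A-2'),
                                 ('Johnson', 'A-3'), ('Smith', 'A-4');
    insert into borrower values ('Hayes', 'L-1'), ('Smith', 'L-2'),
                                ('Adams', 'L-3');
""")

def names(set_operator):
    """Combine depositor and borrower names with the given set operator."""
    sql = ("select customer_name from depositor " + set_operator +
           " select customer_name from borrower order by customer_name")
    return [row[0] for row in conn.execute(sql)]

loan_or_account = names("union")        # duplicates eliminated
loan_and_account = names("intersect")
account_no_loan = names("except")
with_duplicates = names("union all")    # multiset version keeps duplicates
```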
• So far, we’ve applied aggregate operators to all (qualifying) tuples. Sometimes, we want to apply them to each of several groups of tuples.
• Consider: Find the age of the youngest sailor for each rating level.
– In general, we don’t know how many rating levels exist, and what the rating values for these levels are!
– Suppose we know that rating values go from 1 to 10; we can write 10 queries that look like this (!):
    For i = 1, 2, ... , 10:
    SELECT MIN (S.age)
    FROM Sailors S
    WHERE S.rating = i
Queries With GROUP BY and HAVING
    SELECT [DISTINCT] target-list
    FROM relation-list
    WHERE qualification
    GROUP BY grouping-list
    HAVING group-qualification
• The target-list contains (i) attribute names (ii) terms with aggregate operations (e.g., MIN (S.age)).
– The attribute list (i) must be a subset of grouping-list. Intuitively, each answer tuple corresponds to a group, and these attributes must have a single value per group. (A group is a set of tuples that have the same value for all attributes in grouping-list.)
Conceptual Evaluation
• The cross-product of relation-list is computed, tuples that fail qualification are discarded, ‘unnecessary’ fields are deleted, and the remaining tuples are partitioned into groups by the value of attributes in grouping-list.
• The group-qualification is then applied to eliminate some groups. Expressions in group-qualification must have a single value per group!
– In effect, an attribute in group-qualification that is not an argument of an aggregate op also appears in grouping-list.
• One answer tuple is generated per qualifying group.
Aggregate Functions – Group By
• Find the number of depositors for each branch.
Note: Attributes in select clause outside of aggregate functions must appear in group by list
• Find the names of all branches where the average account balance is more than $1,200.
Note: predicates in the having clause are applied after the formation of groups whereas predicates in the where clause are applied before forming groups
• Find the average balance for each customer who lives in Harrison and has at least three accounts.
    select d.customer_name, avg (balance)
    from account a, depositor d, customer c
    where d.account_number = a.account_number and
          d.customer_name = c.customer_name and
          customer_city = 'Harrison'
    group by d.customer_name
    having count (d.account_number) >= 3
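The first two requests in this section come without query text, so the sketch below shows one plausible way to write them and runs them with sqlite3. Both queries and all rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table account (account_number text, branch_name text, balance integer);
    create table depositor (customer_name text, account_number text);
    -- Made-up sample rows; account A-4 is held jointly by Smith and Hayes.
    insert into account values ('A-1', 'Perryridge', 1000), ('A-2', 'Perryridge', 2000),
                               ('A-3', 'Downtown', 500), ('A-4', 'Downtown', 900);
    insert into depositor values ('Hayes', 'A-1'), ('Johnson', 'A-2'), ('Johnson', 'A-3'),
                                 ('Smith', 'A-4'), ('Hayes', 'A-4');
""")

# Number of depositors for each branch; distinct counts a customer once per
# branch. branch_name sits outside the aggregate, so it is in the group by list.
depositors_per_branch = conn.execute("""
    select branch_name, count(distinct customer_name)
    from depositor, account
    where depositor.account_number = account.account_number
    group by branch_name
    order by branch_name
""").fetchall()

# Branches whose average balance exceeds 1200; having filters whole groups
# after they are formed.
high_average = conn.execute("""
    select branch_name, avg(balance)
    from account
    group by branch_name
    having avg(balance) > 1200
""").fetchall()
```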
Find age of the youngest sailor with age ≥ 18, for each rating with at least 2 such sailors
    SELECT S.rating, MIN (S.age) AS minage
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING COUNT (*) > 1
Sailors instance:
    sid  sname    rating  age
    22   dustin   7       45.0
    29   brutus   1       33.0
    31   lubber   8       55.5
    32   andy     8       25.5
    58   rusty    10      35.0
    64   horatio  7       35.0
    71   zorba    10      16.0
    74   horatio  9       35.0
    85   art      3       25.5
    95   bob      3       63.5
    96   frodo    3       25.5
Answer relation:
    rating  minage
    3       25.5
    7       35.0
    8       25.5
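Find age of the youngest sailor with age ≥ 18, for each rating with at least 2 such sailors.
Rating and age of every sailor (other fields deleted):
    rating  age
    7       45.0
    1       33.0
    8       55.5
    8       25.5
    10      35.0
    7       35.0
    10      16.0
    9       35.0
    3       25.5
    3       63.5
    3       25.5
Sailors with age ≥ 18, partitioned into groups by rating:
    rating  age
    1       33.0
    3       25.5
    3       63.5
    3       25.5
    7       45.0
    7       35.0
    8       55.5
    8       25.5
    9       35.0
    10      35.0
Answer relation (groups with at least 2 sailors):
    rating  minage
    3       25.5
    7       35.0
    8       25.5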
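The Sailors example can be reproduced with sqlite3 using the instance shown in this section. The sketch also runs the "one query per rating" alternative for comparison; the added ORDER BY is an assumption made only to get a stable row order.

```python
import sqlite3

# Sailors instance as given in the slides: (sid, sname, rating, age).
sailors = [
    (22, "dustin", 7, 45.0), (29, "brutus", 1, 33.0), (31, "lubber", 8, 55.5),
    (32, "andy", 8, 25.5), (58, "rusty", 10, 35.0), (64, "horatio", 7, 35.0),
    (71, "zorba", 10, 16.0), (74, "horatio", 9, 35.0), (85, "art", 3, 25.5),
    (95, "bob", 3, 63.5), (96, "frodo", 3, 25.5),
]
conn = sqlite3.connect(":memory:")
conn.execute("create table Sailors (sid integer, sname text, rating integer, age real)")
conn.executemany("insert into Sailors values (?, ?, ?, ?)", sailors)

answer = conn.execute("""
    SELECT S.rating, MIN(S.age) AS minage
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING COUNT(*) > 1
    ORDER BY S.rating
""").fetchall()

# The ten separate queries, one per rating value (no age filter, no HAVING).
# Ratings with no sailors yield None, i.e. SQL null.
one_per_rating = {
    i: conn.execute(
        "SELECT MIN(S.age) FROM Sailors S WHERE S.rating = ?", (i,)
    ).fetchone()[0]
    for i in range(1, 11)
}
```

Ratings 1, 9 and 10 drop out of the grouped answer because, once zorba (age 16.0) is filtered by the WHERE clause, each of those groups holds a single sailor.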
Nested Subqueries
• SQL provides a mechanism for the nesting of subqueries.
• A subquery is a select-from-where expression that is nested within another query.
• A common use of subqueries is to perform tests for set membership, set comparisons, and set cardinality.
“In” Construct
• Find all customers who have both an account and a loan at the bank:
    select distinct customer_name
    from borrower
    where customer_name in (select customer_name
                            from depositor)
• Find all customers who have a loan at the bank but do not have an account at the bank:
    select distinct customer_name
    from borrower
    where customer_name not in (select customer_name
                                from depositor)
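A quick sqlite3 check of in versus not in, using made-up depositor and borrower rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table depositor (customer_name text, account_number text);
    create table borrower (customer_name text, loan_number text);
    -- Made-up sample rows.
    insert into depositor values ('Hayes', 'A-1'), ('Johnson', 'A-2'), ('Smith', 'A-4');
    insert into borrower values ('Hayes', 'L-1'), ('Smith', 'L-2'), ('Adams', 'L-3');
""")

# Set membership: borrowers whose name also appears among the depositors.
loan_and_account = [row[0] for row in conn.execute("""
    select distinct customer_name
    from borrower
    where customer_name in (select customer_name from depositor)
    order by customer_name
""")]

# Negated membership: borrowers with no account.
loan_only = [row[0] for row in conn.execute("""
    select distinct customer_name
    from borrower
    where customer_name not in (select customer_name from depositor)
    order by customer_name
""")]
```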
Nested Queries
• A very powerful feature of SQL: a WHERE clause can itself contain an SQL query! (Actually, so can FROM and HAVING clauses.)
• Find names of sailors who’ve reserved boat #103:
    SELECT S.sname
    FROM Sailors S
    WHERE S.sid IN (SELECT R.sid
                    FROM Reserves R
                    WHERE R.bid=103)
• To find sailors who’ve not reserved #103, use NOT IN.
• To understand semantics of nested queries, think of a nested loops evaluation: For each Sailors tuple, check the qualification by computing the subquery.
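The nested-loops reading can be written out literally and compared with what the database returns. This is a sketch with sqlite3; the Sailors and Reserves rows are made up, and the Python loop models the semantics only, not how an engine would execute the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Sailors (sid integer, sname text);
    create table Reserves (sid integer, bid integer);
    -- Made-up sample rows.
    insert into Sailors values (22, 'dustin'), (31, 'lubber'), (58, 'rusty');
    insert into Reserves values (22, 103), (31, 103), (58, 104);
""")

reserved_103 = [row[0] for row in conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid = 103)
    ORDER BY S.sname
""")]

not_reserved_103 = [row[0] for row in conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE S.sid NOT IN (SELECT R.sid FROM Reserves R WHERE R.bid = 103)
    ORDER BY S.sname
""")]

# Nested-loops reading: for each Sailors tuple, compute the subquery
# and check the qualification against its result.
nested_loops = []
for sid, sname in conn.execute("SELECT sid, sname FROM Sailors ORDER BY sname").fetchall():
    subquery = {r[0] for r in conn.execute("SELECT sid FROM Reserves WHERE bid = 103")}
    if sid in subquery:
        nested_loops.append(sname)
```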
Example Query
• Find all customers who have both an account and a loan at the Perryridge branch
Note: Above query can be written in a much simpler manner. The formulation above is simply to illustrate SQL features.
A variable from the outer level that is referenced inside a subquery is known as a correlation variable
Null Values
• It is possible for tuples to have a null value, denoted by null, for some of their attributes
• null signifies an unknown value or that a value does not exist.
• The predicate is null can be used to check for null values.
– Example: Find all loan numbers which appear in the loan relation with null values for amount.
    select loan_number
    from loan
    where amount is null
• The result of any arithmetic expression involving null is null
– Example: 5 + null returns null
• However, aggregate functions simply ignore nulls
Null Values and Three Valued Logic
• Any comparison with null returns unknown
– Example: 5 < null or null <> null or null = null
• Three-valued logic using the truth value unknown:
– OR: (unknown or true) = true, (unknown or false) = unknown, (unknown or unknown) = unknown
– AND: (true and unknown) = unknown, (false and unknown) = false, (unknown and unknown) = unknown
– NOT: (not unknown) = unknown
– “P is unknown” evaluates to true if predicate P evaluates to unknown
• Result of where clause predicate is treated as false if it evaluates to unknown
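The truth tables can be spot-checked with sqlite3, which reports unknown as null (None in Python) and represents true and false as 1 and 0. This is a sketch; the helper `value` and the two loan rows are made up, and the “P is unknown” form is left out because SQLite is assumed here not to support it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def value(expr):
    """Evaluate one SQL expression; unknown/null comes back as None."""
    return conn.execute("select " + expr).fetchone()[0]

comparisons = [value("5 < null"), value("null <> null"), value("null = null")]
or_cases = [value("null or 1"), value("null or 0"), value("null or null")]
and_cases = [value("1 and null"), value("0 and null"), value("null and null")]
not_case = value("not (null)")

# A where predicate that evaluates to unknown is treated as false.
conn.executescript("""
    create table loan (loan_number text, amount integer);
    insert into loan values ('L-1', 1000), ('L-2', null);   -- made-up rows
""")
above_500 = [r[0] for r in conn.execute(
    "select loan_number from loan where amount > 500")]
not_above_500 = [r[0] for r in conn.execute(
    "select loan_number from loan where not (amount > 500)")]
```

Loan L-2 appears in neither result: both the predicate and its negation evaluate to unknown for a null amount.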
Null Values and Aggregates
• Total all loan amounts
    select sum (amount)
    from loan
– Above statement ignores null amounts
• All aggregate operations except count(*) ignore tuples with null values on the aggregated attributes.
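The difference between count(*) and the other aggregates shows up as soon as one amount is null. A sqlite3 sketch with three made-up loan rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table loan (loan_number text, amount integer);
    -- Made-up sample rows; L-2 has a null amount.
    insert into loan values ('L-1', 1000), ('L-2', null), ('L-3', 500);
""")

def scalar(sql):
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]

total = scalar("select sum(amount) from loan")           # null amount ignored
average = scalar("select avg(amount) from loan")         # averages 2 values, not 3
all_rows = scalar("select count(*) from loan")           # counts every tuple
known_amounts = scalar("select count(amount) from loan") # skips the null
arithmetic = scalar("select 5 + null")                   # arithmetic with null is null
null_loans = [r[0] for r in conn.execute(
    "select loan_number from loan where amount is null")]
```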
Derived Relations
o SQL allows a subquery expression to be used in the from clause
o Find the average account balance of those branches where the average account balance is greater than $1200.
    select branch_name, avg_balance
    from (select branch_name, avg (balance)
          from account
          group by branch_name)
         as branch_avg (branch_name, avg_balance)
    where avg_balance > 1200
o Note that we do not need to use the having clause, since we compute the temporary (view) relation branch_avg in the from clause, and the attributes of branch_avg can be used directly in the where clause.
o Find the maximum total balance across all branches.
    select max (tot_balance)
    from (select branch_name, sum (balance)
          from account
          group by branch_name) as branch_total (branch_name, tot_balance)
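Both derived-relation queries can be run with sqlite3 after one adjustment: SQLite is assumed here not to accept the column list after the alias, as in branch_avg (branch_name, avg_balance), so this sketch names the aggregate column inside the subquery instead. The account rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table account (account_number text, branch_name text, balance integer);
    -- Made-up sample rows.
    insert into account values ('A-1', 'Perryridge', 1000), ('A-2', 'Perryridge', 2000),
                               ('A-3', 'Downtown', 500), ('A-4', 'Downtown', 900);
""")

# The subquery in the from clause acts as a temporary relation, so its
# attributes can be filtered with where rather than having.
above_1200 = conn.execute("""
    select branch_name, avg_balance
    from (select branch_name, avg(balance) as avg_balance
          from account
          group by branch_name) as branch_avg
    where avg_balance > 1200
""").fetchall()

# Aggregate of an aggregate: total per branch first, then the maximum.
max_total = conn.execute("""
    select max(tot_balance)
    from (select branch_name, sum(balance) as tot_balance
          from account
          group by branch_name) as branch_total
""").fetchone()[0]
```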
Joined Relations
• Join operations take two relations and return as a result another relation.
• These additional operations are typically used as subquery expressions in the from clause
• Join condition – defines which tuples in the two relations match, and what attributes are present in the result of the join.
• Join type – defines how tuples in each relation that do not match any tuple in the other relation (based on the join condition) are treated.
Joined Relations – Datasets for Examples
Relation loan Relation borrower
Note: borrower information missing for L-260 and loan information missing for L-155
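The effect of join type on unmatched tuples can be sketched with sqlite3. The rows below are assumptions chosen only to agree with the note above (loan L-260 has no borrower, and borrower L-155 has no loan); they are not the lost tables from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table loan (loan_number text, branch_name text, amount integer);
    create table borrower (customer_name text, loan_number text);
    -- Assumed sample rows, consistent with the note above.
    insert into loan values ('L-170', 'Downtown', 3000), ('L-230', 'Redwood', 4000),
                            ('L-260', 'Perryridge', 1700);
    insert into borrower values ('Jones', 'L-170'), ('Smith', 'L-230'),
                                ('Hayes', 'L-155');
""")

# Join condition: matching loan_number. Join type inner: unmatched tuples
# on either side are dropped.
inner = conn.execute("""
    select loan.loan_number, customer_name
    from loan inner join borrower on loan.loan_number = borrower.loan_number
    order by loan.loan_number
""").fetchall()

# Join type left outer: every loan tuple is kept, padded with null
# where no borrower matches.
left_outer = conn.execute("""
    select loan.loan_number, customer_name
    from loan left outer join borrower on loan.loan_number = borrower.loan_number
    order by loan.loan_number
""").fetchall()
```

In both results the borrower row for L-155 is absent, since neither join type used here preserves unmatched tuples from the right-hand relation.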