
Post on 01-Nov-2020


CS5530/6530 Juliana Freire 1!

Cs5530/6530 Exam 1 Review

Juliana Freire


What will be covered?

Everything we covered through 3/1:
•  Relational model
•  Relational algebra
•  SQL
•  Integrity constraints, triggers and security
•  Indexes and transactions


Exam Format

•  Open book, closed everything else
•  Similar to homework assignments


The Relational Model and Relational Algebra


Database Model

•  Provides the means for specifying particular data structures, for constraining the data sets associated with these structures, and for manipulating the data

•  Data definition language (DDL): define structures and constraints

•  Data manipulation language (DML): specify manipulations/operations over the data


Why Study the Relational Database Model?

•  Extremely useful and simple
   –  Single data-modeling concept: relations = 2-D tables
   –  Allows clean yet powerful manipulation languages

•  Most widely used model


Describing a Relational Database Mathematically: Relational Algebra

•  We can describe tables in a relational database as sets of tuples
•  We can describe query operators using set theory
•  The query language is called relational algebra
•  Normally not used directly; it is the foundation for SQL and query processing


Relational Algebra

•  The usual set operations: union, intersection, difference
•  Operations that remove parts of relations: selection, projection
•  Operations that combine tuples from two relations: Cartesian product, join
•  Since each operation returns a relation, operations can be composed! (Algebra is “closed”)


Relational Algebra

•  Several ways to express the same query
•  We can prove that 2 queries are equivalent!
•  Query optimizer derives several equivalent queries and selects the one that is most efficient

Natural join, for R(A,B,C) and S(B,C,D):
R ⋈ S = π_{R.A, R.B, R.C, S.D}(σ_{R.B = S.B AND R.C = S.C}(R × S))

Division, where R and S denote the attribute sets of r and s:
r ÷ s = π_{R−S}(r) − π_{R−S}((π_{R−S}(r) × s) − π_{R−S,S}(r))

Cascading selections:
σ_{cond1}(σ_{cond2}(R)) ≡ σ_{cond2}(σ_{cond1}(R)) ≡ σ_{cond1 AND cond2}(R)

Theta join:
R1 ⋈_{cond} R2 ≡ σ_{cond}(R1 × R2)


Challenge Question

•  How could you express the natural join operation if you didn’t have a natural join operator in relational algebra? Consider you have two relations R(A,B,C) and S(B,C,D).

π_{R.A, R.B, R.C, S.D}(σ_{R.B = S.B AND R.C = S.C}(R × S))
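The equivalence can be checked on a small instance. The sketch below uses Python's built-in sqlite3 module; the sample rows are made up for illustration, and agreement on one instance is only a sanity check, not a proof of equivalence.

```python
import sqlite3

# In-memory database with the two relations from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (A INTEGER, B INTEGER, C INTEGER);
    CREATE TABLE S (B INTEGER, C INTEGER, D INTEGER);
    -- Hypothetical sample rows, chosen so that one S row does not match.
    INSERT INTO R VALUES (1, 10, 100), (2, 20, 200), (3, 30, 300);
    INSERT INTO S VALUES (10, 100, 7), (20, 999, 8), (30, 300, 9);
""")

# Built-in natural join.
natural = conn.execute(
    "SELECT A, B, C, D FROM R NATURAL JOIN S ORDER BY A"
).fetchall()

# Projection over a selection over the Cartesian product.
expanded = conn.execute("""
    SELECT R.A, R.B, R.C, S.D
    FROM R CROSS JOIN S
    WHERE R.B = S.B AND R.C = S.C
    ORDER BY R.A
""").fetchall()

print(natural)
print(expanded)
```

Both queries return the same two rows here, because only the R tuples with A = 1 and A = 3 agree with some S tuple on both B and C.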


Challenge Question

•  How could you express the left outer join operation in relational algebra? Consider you have two relations R(A,B) and S(B,C).

•  (R ⋈ S) ∪ π_{R.A, R.B, NULL}(R − π_{R.A, R.B}(R ⋈ S))

•  What about the right outer join? And full outer join?
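As a sanity check, the left outer join expression can be compared with SQL's built-in LEFT OUTER JOIN. This is a sketch using Python's sqlite3 module with made-up sample rows; it uses set semantics (UNION and EXCEPT remove duplicates), so it matches the built-in operator only when the inputs contain no duplicate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (A INTEGER, B INTEGER);
    CREATE TABLE S (B INTEGER, C INTEGER);
    -- Hypothetical rows: (2, 20) has no partner in S.
    INSERT INTO R VALUES (1, 10), (2, 20);
    INSERT INTO S VALUES (10, 5);
""")

# Built-in left outer join.
builtin = conn.execute("""
    SELECT R.A, R.B, S.C
    FROM R LEFT OUTER JOIN S ON R.B = S.B
    ORDER BY R.A
""").fetchall()

# Join result, plus the dangling R tuples padded with NULL.
manual = conn.execute("""
    SELECT R.A, R.B, S.C FROM R JOIN S ON R.B = S.B
    UNION
    SELECT A, B, NULL FROM (
        SELECT A, B FROM R
        EXCEPT
        SELECT R.A, R.B FROM R JOIN S ON R.B = S.B
    )
    ORDER BY 1
""").fetchall()

print(builtin)
print(manual)
```

The right outer join follows by swapping the roles of R and S, and the full outer join by padding the dangling tuples of both sides.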


SQL


What is SQL?

•  Data manipulation: ad-hoc queries and updates
   SELECT * FROM Account WHERE Type = 'checking';

•  Data definition: creation of tables and views
   CREATE TABLE Account (Number integer NOT NULL, Owner character, Balance currency, Type character, PRIMARY KEY (Number));

•  Control: assertions to protect data integrity
   CHECK (Owner IS NOT NULL)


Relational Algebra vs. SQL

•  Relational algebra = query only
•  SQL = data manipulation + data definition + constraints
•  SQL data manipulation is similar to, but not exactly the same as relational algebra
   –  SQL is based on set and relational operations with certain modifications and enhancements
   –  SQL supports aggregates
   –  Results are not always sets


Relational Algebra and SQL: Going Back and Forth

SELECT * FROM ATMWithdrawal WHERE Amount < 50;

σ Amount < 50 ATMWithdrawal


Relational Algebra and SQL: Going Back and Forth

SELECT AcctNo, Amount FROM ATMWithdrawal WHERE Amount < 50;

πAcctNo, Amount (σ Amount < 50 ATMWithdrawal)


Relational Algebra and SQL: Going Back and Forth

SELECT AcctNo, Amount FROM ATMWithdrawal A, Customer C WHERE Amount < 50 AND A.cid = C.cid;

π_{AcctNo, Amount}(σ_{ATMWithdrawal.cid = Customer.cid AND Amount < 50}(ATMWithdrawal × Customer))

π_{AcctNo, Amount}(σ_{Amount < 50}(ATMWithdrawal ⋈_{ATMWithdrawal.cid = Customer.cid} Customer))


Sets vs. Bags

•  Query results can be bags!
•  Duplicates are not always removed by default
   –  It is expensive!
   –  Need to explicitly request duplicate removal: SELECT DISTINCT
•  Some operators do remove duplicates: UNION, INTERSECT, MINUS (spelled EXCEPT in standard SQL)
   –  Unless you specify otherwise: UNION ALL, INTERSECT ALL and MINUS ALL return bags
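The difference between set and bag results is easy to observe. The following sketch uses Python's sqlite3 module with two hypothetical one-column tables, T1 and T2, that are not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (x INTEGER);
    CREATE TABLE T2 (x INTEGER);
    -- Hypothetical data: T1 contains a duplicate, and 2 appears in both tables.
    INSERT INTO T1 VALUES (1), (1), (2);
    INSERT INTO T2 VALUES (2), (3);
""")

def run(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

plain = run("SELECT x FROM T1 ORDER BY 1")            # bag: duplicates kept
distinct = run("SELECT DISTINCT x FROM T1 ORDER BY 1")  # set: duplicates removed
union_set = run("SELECT x FROM T1 UNION SELECT x FROM T2 ORDER BY 1")
union_bag = run("SELECT x FROM T1 UNION ALL SELECT x FROM T2 ORDER BY 1")

print(plain, distinct, union_set, union_bag, sep="\n")
```

A plain SELECT and UNION ALL keep every copy of a row, while SELECT DISTINCT and UNION return each value once.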


How is an SQL query evaluated?

SELECT AcctNo, Amount FROM ATMWithdrawal WHERE Amount < 50;

First, the FROM clause tells us the input tables.

Second, the WHERE clause is evaluated for all possible combinations from the input tables.

Third, the SELECT clause tells us which attributes to keep in the query answer.


Relational Algebra Helps!

•  select A1, A2, ..., An from r1, r2, ..., rm where P

∏A1, A2, ..., An(σP (r1 x r2 x ... x rm))


Subquery: Semantics

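•  Analyze from the inside out
   –  Evaluate the innermost subquery, and replace that with the resulting relation
   –  Repeat

SELECT C1.Number, C1.Name FROM Customer C1 WHERE C1.CRating IN
  (SELECT MAX (C2.CRating) FROM Customer C2)

Here the inner subquery evaluates to a single-value relation, e.g. (10), which then replaces the subquery in the outer query.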


Correlated Subqueries

•  The simplest subqueries can be evaluated once and for all and the result used in a higher-level query

•  More complicated subqueries must be evaluated once for each assignment of a value to a term in the subquery that comes from a tuple outside the subquery (nested loop semantics)

SELECT S.Number, S.Name FROM Salesperson S, Customer C WHERE S.Number = C.SalespersonNum AND C.Name = S.Name;

SELECT S.Number, S.Name FROM Salesperson S WHERE S.Number IN
  (SELECT C.SalespersonNum FROM Customer C WHERE C.Name = S.Name);

The second query is correlated because the subquery mentions an attribute (S.Name) from a table in the outer query.


Grouping

•  GROUP BY partitions a relation into groups of tuples that agree on the value of one or more columns

•  Useful when combined with aggregation – apply aggregation within each group

•  Any form of SQL query (e.g., with or without subqueries) can have the answer “grouped”

•  The query result contains one output row for each group


Group By: Example

•  R(a,b) = {(1,2),(1,3),(2,3)}

Select a From R Group by a

What is the result of this query? {1,2}

How would you write this query without the Group By clause?
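One possible answer is SELECT DISTINCT, since grouping on a alone yields one row per distinct value of a. The sketch below checks this on the instance from the slide using Python's sqlite3 module; agreement on one instance is a sanity check, not a proof.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (a INTEGER, b INTEGER);
    -- The instance from the slide: R(a,b) = {(1,2),(1,3),(2,3)}
    INSERT INTO R VALUES (1, 2), (1, 3), (2, 3);
""")

# One output row per group, i.e. per distinct value of a.
grouped = conn.execute("SELECT a FROM R GROUP BY a ORDER BY a").fetchall()

# Candidate rewrite without GROUP BY.
distinct = conn.execute("SELECT DISTINCT a FROM R ORDER BY a").fetchall()

print(grouped)
print(distinct)
```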


Group By: Another Example

•  R(a,b) = {(1,2),(1,3),(2,3)}

Select a,b From R Group by a

What is the result of this query?


HAVING Clauses

•  Select groups based on some aggregate property of the group
   –  E.g., Only list a salesperson if he/she has more than 10 customers
•  The HAVING clause is a condition evaluated against each group
   –  A group participates in the query answer if it satisfies the HAVING predicate

SELECT SalespersonNum FROM Customer GROUP BY SalespersonNum HAVING Count(*) > 10;
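To see HAVING filter whole groups, here is a small sketch using Python's sqlite3 module. The Customer rows are invented, and the threshold is lowered from 10 to 2 so that a four-row table is enough to show the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (Name TEXT, SalespersonNum INTEGER);
    -- Hypothetical data: salesperson 1 has three customers, salesperson 2 has one.
    INSERT INTO Customer VALUES ('Ann', 1), ('Bob', 1), ('Cy', 1), ('Dee', 2);
""")

# One row per group, with the group's size.
counts = conn.execute("""
    SELECT SalespersonNum, COUNT(*)
    FROM Customer
    GROUP BY SalespersonNum
    ORDER BY SalespersonNum
""").fetchall()

# HAVING keeps only the groups whose aggregate satisfies the predicate.
# The threshold 2 is an illustrative stand-in for the slide's 10.
busy = conn.execute("""
    SELECT SalespersonNum
    FROM Customer
    GROUP BY SalespersonNum
    HAVING COUNT(*) > 2
""").fetchall()

print(counts)
print(busy)
```

WHERE filters individual tuples before grouping, whereas HAVING filters groups after the aggregates have been computed.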


GROUP BY, HAVING: Note

•  The only attributes that can appear in a “grouped” query answer are aggregate operators (that are applied to the group) or the grouping attribute(s)

SELECT SalespersonNum, COUNT(*) FROM Customer GROUP BY SalespersonNum;

SELECT SalespersonNum FROM Customer GROUP BY SalespersonNum HAVING Count(*) > 10;

Incorrect! C.Name is neither a grouping attribute nor inside an aggregate:

SELECT SalespersonNum, C.Name, COUNT(*) FROM Customer C GROUP BY SalespersonNum;


Challenge Question

Under what conditions are the following two queries equivalent?

SELECT DISTINCT A FROM Table1;

SELECT A FROM Table1;

Note: Two queries are equivalent only if they produce identical results for all instances

Answer: When A is a key for Table1.


Challenge Questions

What is the implication of using DISTINCT when computing the SUM or AVG of an attribute?

SUM(DISTINCT CBalance) or AVG(DISTINCT CBalance)


Challenge Questions

What about the following usage?

DISTINCT SUM(CBalance) or DISTINCT AVG(CBalance)

and

DISTINCT MIN(CBalance) or DISTINCT MAX(CBalance)


Null Values are trouble!

•  A null value may have different meanings: unknown, value inapplicable, value withheld
•  NULL is not a constant
   –  NULL+3 is not a legal expression!
   –  If x is NULL, x+3 = NULL
•  Special predicates: IS NULL, IS NOT NULL
•  In comparisons, NULLs may lead to UNKNOWN!
   –  TRUE = 1; FALSE = 0; UNKNOWN = 1/2
   –  X and Y = min(X,Y); X or Y = max(X,Y); not X = 1 - X
•  Tuples for which the condition evaluates to UNKNOWN are not included in the result
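The numeric encoding of three-valued logic can be written down directly. The sketch below is plain Python, not SQL; the function names and the use of None to stand in for NULL are illustrative choices.

```python
from fractions import Fraction

# Truth values encoded as on the slide: TRUE = 1, FALSE = 0, UNKNOWN = 1/2.
TRUE, FALSE, UNKNOWN = Fraction(1), Fraction(0), Fraction(1, 2)

def and3(x, y):
    """Three-valued AND: the minimum of the two truth values."""
    return min(x, y)

def or3(x, y):
    """Three-valued OR: the maximum of the two truth values."""
    return max(x, y)

def not3(x):
    """Three-valued NOT: 1 minus the truth value."""
    return 1 - x

def ge(a, b):
    """Comparison a >= b, where None stands in for NULL."""
    if a is None or b is None:
        return UNKNOWN
    return TRUE if a >= b else FALSE

# A WHERE clause keeps a tuple only if the condition is TRUE,
# so both FALSE and UNKNOWN rows are dropped.
amounts = [20, None, 0]
kept = [a for a in amounts if ge(a, 0) == TRUE]

print(and3(TRUE, UNKNOWN), or3(FALSE, UNKNOWN), not3(UNKNOWN))
print(kept)
```

Note that not3(UNKNOWN) is still UNKNOWN, which is why negating a condition does not bring back the rows that a NULL comparison filtered out.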


Challenge Question

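Since all withdrawals have Amount greater than or equal to zero, is it the case that the query

SELECT * FROM ATMWithdrawal WHERE Amount >= 0;

always returns a copy of the ATMWithdrawal table?

Answer: Not if NULLs are allowed in the Amount column! For a tuple whose Amount is NULL, the comparison evaluates to UNKNOWN, so the tuple is left out of the result.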


Aggregates and NULLs

•  General rule: aggregates ignore NULL values
   –  Avg(1,2,3,NULL,4) = Avg(1,2,3,4)
   –  Count(1,2,3,NULL,4) = Count(1,2,3,4)
•  But…
   –  Count(*) returns the total number of tuples, regardless of whether they contain NULLs or not
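These rules can be observed directly. The sketch below uses Python's sqlite3 module and a hypothetical one-column table V holding the values 1, 2, 3, NULL, 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE V (x INTEGER);
    -- The values from the slide, including one NULL.
    INSERT INTO V VALUES (1), (2), (3), (NULL), (4);
""")

# COUNT(*) counts tuples; COUNT(x) and AVG(x) skip the NULL.
count_star, count_x, avg_x = conn.execute(
    "SELECT COUNT(*), COUNT(x), AVG(x) FROM V"
).fetchone()

print(count_star, count_x, avg_x)
```

COUNT(*) sees five tuples, while COUNT(x) and AVG(x) operate on the four non-NULL values only.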


Modifying the Database

delete from account where balance < (select avg(balance) from account)

insert into account select loan_number, branch_name, 200 from loan where branch_name = 'Perryridge'

update account set balance = balance * 1.05 where balance > (select avg(balance) from account)


Integrity Constraints and Security


Why Integrity Constraints?

•  Integrity constraints guard against accidental damage to the database, by ensuring that authorized changes to the database do not result in a loss of data consistency.


Constraints: Keys and Foreign Keys

•  Keys
   –  Can only have one primary key
   –  Multiple candidate keys defined through UNIQUE

CREATE TABLE Student (sid CHAR(20), ssn CHAR(8), sname CHAR(40), PRIMARY KEY (sid), UNIQUE (ssn));

•  Foreign-key, or referential-integrity

CREATE TABLE Enrolled (sid CHAR(20), ssn CHAR(8), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid, cid), FOREIGN KEY (sid) REFERENCES Student);


Constraints: Value- and Tuple-Based

•  Value-based constraints.
   –  Constrain values of a particular attribute

CREATE TABLE Enrolled (sid CHAR(20), ssn CHAR(8), cid CHAR(20), grade CHAR(2), CHECK (grade >= 0 AND grade <= 4), … )

•  Tuple-based constraints.
   –  Relationship among components

CREATE TABLE Enrolled (sid CHAR(20) CHECK (sid IN (SELECT sid FROM Student)), cid CHAR(20), grade CHAR(2), … )

When are these constraints checked?


Constraints: Assertions

•  Assertions: any SQL boolean expression

CREATE ASSERTION NoOverworkedStudent CHECK (
  NOT EXISTS (SELECT sid, COUNT(cid) AS TotCourses FROM Enrolled GROUP BY sid HAVING COUNT(cid) > 4));

When are these constraints checked?


SQL and Recursion

•  Adding recursion extends relational algebra and SQL-92 in a fundamental way; included in SQL:1999, though not in the core subset
   –  Not supported by all vendors

WITH RECURSIVE Comp(Part, Subpt) AS (
  (SELECT A1.Part, A1.Subpt FROM Assembly A1)
  UNION ALL
  (SELECT A2.Part, C1.Subpt FROM Assembly A2, Comp C1 WHERE A2.Subpt = C1.Part)
)
SELECT * FROM Comp;
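The recursive query can be tried out in SQLite, which supports WITH RECURSIVE. The sketch below uses Python's sqlite3 module; the Assembly rows are invented, and the parentheses around the two SELECT branches are dropped because SQLite does not accept them there. UNION ALL terminates here only because the sample part hierarchy is acyclic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Assembly (Part TEXT, Subpt TEXT);
    -- Hypothetical, acyclic part hierarchy.
    INSERT INTO Assembly VALUES
        ('trike', 'wheel'), ('trike', 'frame'),
        ('wheel', 'spoke'), ('wheel', 'tire'),
        ('tire', 'rim');
""")

# Comp holds every (part, component) pair at any nesting depth.
trike_parts = conn.execute("""
    WITH RECURSIVE Comp(Part, Subpt) AS (
        SELECT A1.Part, A1.Subpt FROM Assembly A1
        UNION ALL
        SELECT A2.Part, C1.Subpt
        FROM Assembly A2, Comp C1
        WHERE A2.Subpt = C1.Part
    )
    SELECT Part, Subpt FROM Comp WHERE Part = 'trike' ORDER BY Subpt
""").fetchall()

print(trike_parts)
```

The first branch seeds Comp with the direct subparts; the second repeatedly extends known components by one more level until no new rows appear.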


Granting of Privileges

•  The passage of authorization from one user to another may be represented by an authorization graph
•  The root of the graph is the database administrator
•  The nodes of this graph are the users
•  An edge Ui → Uj indicates that user Ui has granted update authorization to Uj.

[Figure: authorization graph with root DBA and user nodes U1, U2, U3, U4, U5]

How would you write an SQL query to find the names of all users that have update privilege?
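One possible approach is a recursive query that collects every user reachable from the DBA. The sketch below uses Python's sqlite3 module; the Grants(grantor, grantee) table and its edges are assumptions made for illustration, since the edges of the slide's figure are not recoverable from this transcript.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical encoding of the authorization graph: one row per edge Ui -> Uj.
    CREATE TABLE Grants (grantor TEXT, grantee TEXT);
    INSERT INTO Grants VALUES
        ('DBA', 'U1'), ('DBA', 'U2'),
        ('U1', 'U4'), ('U2', 'U5'),
        ('U5', 'U2');  -- a cycle, to show that the query still terminates
""")

# A user holds the privilege if there is a path from the DBA to that user.
# UNION (not UNION ALL) discards already-seen users, so cycles do not loop forever.
holders = conn.execute("""
    WITH RECURSIVE HasUpdate(name) AS (
        SELECT 'DBA'
        UNION
        SELECT G.grantee
        FROM Grants G JOIN HasUpdate H ON G.grantor = H.name
    )
    SELECT name FROM HasUpdate WHERE name <> 'DBA' ORDER BY name
""").fetchall()

print(holders)
```

With these sample edges U3 never receives a grant, so it does not appear in the answer.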
