
Normalization (© D. Wong 2003)

Jan 17, 2016


Alan Stephens
Page 1: © D. Wong 2003

Normalization

Purpose: a process to eliminate redundancy in relations due to functional or multivalued dependencies.

Decompose the relation schema into normal forms:

– Boyce-Codd Normal Form (BCNF)
– Third Normal Form (3NF)
– Fourth Normal Form (4NF)

To obtain the new relations, project the original relation (e.g. Movie) onto the new schemas.

To recover the information (i.e. Movie) from the new relations, natural join the new relations.

Page 2:

BCNF Decomposition Example 3.24 pp 104

Relation: Movie(title, year, length, filmType, studioName, starName)

Key: {title, year, starName}

FD's: title year → length filmType studioName is a BCNF violation, so Movie is not in BCNF

Decomposition:

Schema 1: {title, year, length, filmType, studioName}

Schema 2: {title, year, starName}

To obtain the new relations, project Movie onto the two schemas.

To recover the information (i.e. Movie) from the new relations, natural join the new relations. This does not lose information.
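The violation check in this example can be sketched in Python using attribute closure. The helper names `closure` and `is_superkey` are illustrative assumptions and do not come from the slides; the schema and the FD are the ones in Example 3.24.

```python
# Attribute closure under a set of FDs, used to test for a BCNF violation.
# Helper names are hypothetical; the schema and FD are taken from the slide.

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Once the left side is covered, the right side follows.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_superkey(attrs, schema, fds):
    return closure(attrs, fds) == set(schema)

MOVIE = ["title", "year", "length", "filmType", "studioName", "starName"]
FDS = [(("title", "year"), ("length", "filmType", "studioName"))]

# The left side {title, year} is not a superkey of Movie: BCNF violation.
violates_bcnf = not is_superkey(("title", "year"), MOVIE, FDS)

# Within Schema 1 the same left side determines everything: no violation.
SCHEMA1 = ["title", "year", "length", "filmType", "studioName"]
schema1_ok = is_superkey(("title", "year"), SCHEMA1, FDS)
```

The same `closure` call confirms the stated key: the closure of {title, year, starName} is all of Movie.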

Page 3:

Functional Dependencies (FD)

Given: relation schema R(A1, …, An), and let X and Y be subsets of {A1, …, An}.

FD: X → Y means X functionally determines Y, i.e. any two tuples of R that agree on X must also agree on Y.

e.g. A1 A2 … An → B1 B2 … Bm

A1 A2 … An → B1 B2 … Bm is an assertion about R that two attributes or sets of attributes in R are dependent on one another.
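As a rough illustration, an FD can be tested against one relation instance in Python. The function name `satisfies_fd` and the sample rows are made up for this sketch. A single instance can refute an FD, but it cannot prove that the FD holds for the schema in general.

```python
def satisfies_fd(rows, lhs, rhs):
    """True if all rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first row with this key fixes the expected right-side value.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Illustrative sample rows (not data from the slides).
movies = [
    {"title": "Star Wars", "year": 1977, "length": 124, "starName": "Carrie Fisher"},
    {"title": "Star Wars", "year": 1977, "length": 124, "starName": "Mark Hamill"},
    {"title": "Wayne's World", "year": 1992, "length": 95, "starName": "Dana Carvey"},
]

length_fd = satisfies_fd(movies, ["title", "year"], ["length"])
star_fd = satisfies_fd(movies, ["title", "year"], ["starName"])
```

Here `length_fd` is true, while `star_fd` is false because one movie has two stars.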

Page 4:

Multivalued Dependencies (MVD)

Given: relation schema R, and let A1 A2 … An and B1 B2 … Bm be subsets of attributes of R.

MVD: A1 A2 … An →→ B1 B2 … Bm holds in R if:

For each pair of tuples t and u of relation R that agree on all the A's, we can find in R some tuple v that agrees:

1. With both t and u on the A's,
2. With t on the B's, and
3. With u on all attributes of R that are not among the A's or B's

A1 A2 … An →→ B1 B2 … Bm is an assertion about R that two attributes or sets of attributes in R are independent of one another.

MVD's cause redundancy not related to FD's in a BCNF schema.

Most common source: putting 2 or more many-many relationships in a single relation.
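The t, u, v definition above translates fairly directly into a check on one instance. This is a minimal sketch: `satisfies_mvd` is a hypothetical helper, and the rows are invented sample data using the attribute names from the Star_Star_In example.

```python
def satisfies_mvd(rows, a_attrs, b_attrs):
    """Check A ->> B on one instance using the t, u, v definition."""
    schema = list(rows[0]) if rows else []
    present = {tuple(r[c] for c in schema) for r in rows}
    for t in rows:
        for u in rows:
            if any(t[c] != u[c] for c in a_attrs):
                continue  # only pairs that agree on the A's matter
            # v agrees with t on the B's and with u on everything else.
            v = dict(u)
            for c in b_attrs:
                v[c] = t[c]
            if tuple(v[c] for c in schema) not in present:
                return False
    return True

COLS = ("name", "street", "city", "title", "year")
data = [
    ("C. Fisher", "123 Maple St.", "Hollywood", "Star Wars", 1977),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Star Wars", 1977),
    ("C. Fisher", "123 Maple St.", "Hollywood", "Empire Strikes Back", 1980),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Empire Strikes Back", 1980),
]
full = [dict(zip(COLS, t)) for t in data]

mvd_holds = satisfies_mvd(full, ["name"], ["street", "city"])
# Dropping one address/movie combination breaks the independence.
mvd_broken = satisfies_mvd(full[:-1], ["name"], ["street", "city"])
```

The redundancy is visible in the data: each address is repeated once per movie.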

Page 5:

MVD Rules

Trivial dependencies rule

If A1 A2 … An →→ B1 B2 … Bm holds for R, then A1 A2 … An →→ C1 C2 … Ck holds, where the C's are the B's plus one or more of the A's. The converse also holds.

Transitive rule

If A1 A2 … An →→ B1 B2 … Bm and B1 B2 … Bm →→ C1 C2 … Ck, then A1 A2 … An →→ C1 C2 … Ck (strictly, for those C's that are not among the B's).

Splitting rule does not hold

E.g. name →→ street city, but not name →→ street

So always work with the whole set of attributes on the right side (R.S.), because the splitting rule does not hold.

Page 6:

More MVD Rules

Every FD is an MVD

Because if the FD A1 A2 … An → B1 B2 … Bm holds, then swapping the B's between tuples that agree on the A's doesn't create new tuples.

Complementation rule

If X →→ Y, then X →→ Z, where Z is all attributes not in X or Y

e.g. Star_Star_In {name, street, city, title, year}

name →→ street city

name →→ title year

[Figure: tuples t and u, with columns labeled A's and B's]

Page 7:

Nontrivial MVD

A1 A2 … An →→ B1 B2 … Bm for a relation R is nontrivial if:

1. B1 B2 … Bm is not a subset of A1 A2 … An
2. A1 A2 … An ∪ B1 B2 … Bm is not all attributes of R

Page 8:

Fourth Normal Form (4NF)

Decompose relations that have MVD's into 4NF to eliminate the redundancy due to MVD's.

Definition:

R is in 4NF if, whenever A1 A2 … An →→ B1 B2 … Bm is a nontrivial MVD, {A1 A2 … An} is a superkey.

Since every FD is an MVD, 4NF is more stringent than BCNF.

Only nontrivial MVD's have the potential to violate 4NF.

Page 9:

4NF Decomposition

Given: relation R, and a nontrivial MVD X →→ Y that violates 4NF

1. Decompose R using X →→ Y into XY and X ∪ (R − Y)
2. Produce the relations by projecting R onto XY and X ∪ (R − Y)
3. Reconstruct R from the new relations using natural join

e.g. Star_Star_In {name, street, city, title, year} and name →→ street city

Decompose Star_Star_In using name →→ street city into {name, street, city} and {name, title, year}

[Figure: diagram labeled X, Y, R]
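The three steps can be tried on a small instance of Star_Star_In. This is only a sketch: `project` and `natural_join` are hypothetical helpers written for this example, and the rows are invented sample data.

```python
def project(rows, attrs):
    """Projection with duplicate elimination, keeping first-seen order."""
    seen, out = set(), []
    for r in rows:
        key = tuple(r[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out

def natural_join(left, right):
    """Join rows that agree on every attribute the two sides share."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in common)]

SCHEMA = ["name", "street", "city", "title", "year"]
data = [
    ("C. Fisher", "123 Maple St.", "Hollywood", "Star Wars", 1977),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Star Wars", 1977),
    ("C. Fisher", "123 Maple St.", "Hollywood", "Empire Strikes Back", 1980),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Empire Strikes Back", 1980),
]
star_star_in = [dict(zip(SCHEMA, t)) for t in data]

# Steps 1 and 2: X = {name}, Y = {street, city}.
xy = project(star_star_in, ["name", "street", "city"])      # XY
x_rest = project(star_star_in, ["name", "title", "year"])   # X with (R - Y)

# Step 3: the natural join gives back the original tuples.
rejoined = natural_join(xy, x_rest)
same = ({tuple(r[a] for a in SCHEMA) for r in rejoined}
        == {tuple(r[a] for a in SCHEMA) for r in star_star_in})
```

Each projection holds two rows instead of four, so each address and each movie is stored once per star.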

Page 10:

Relationships among normal forms

4NF is the most stringent

4NF ⇒ BCNF ⇒ 3NF

Page 11:

Lossless-join decomposition

Given: relation R, decomposed into schemes R1, R2, …, Rk, and D is a set of dependencies.

Definition: R1, R2, …, Rk is a lossless-join decomposition (w.r.t. D) if for every relation r for R satisfying D:

r = π_R1(r) ⋈ π_R2(r) ⋈ … ⋈ π_Rk(r)

i.e. every relation r for R is the natural join of its projections onto the Ri's.

The lossless-join property is necessary if the decomposed relation is to be recoverable from its decomposition.

However, joins are expensive. So, don't over-decompose!
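A small experiment shows what goes wrong without the lossless-join property. The helpers and the relation r(A, B, C) below are assumptions made for this sketch. The definition quantifies over every relation satisfying D, so a test on one instance can expose a lossy decomposition but cannot certify a lossless one.

```python
def project(rel, schema, attrs):
    """Project a set of tuples over `schema` onto `attrs`."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(row[i] for i in idx) for row in rel}

def natural_join(r1, s1, r2, s2):
    """Natural join of two tuple sets; returns (tuples, joined schema)."""
    common = [a for a in s1 if a in s2]
    extra = [a for a in s2 if a not in s1]
    out = set()
    for t in r1:
        for u in r2:
            if all(t[s1.index(a)] == u[s2.index(a)] for a in common):
                out.add(t + tuple(u[s2.index(a)] for a in extra))
    return out, s1 + extra

def rejoins_to_original(rel, schema, parts):
    """Compare rel with the natural join of its projections onto parts."""
    joined, js = project(rel, schema, parts[0]), list(parts[0])
    for part in parts[1:]:
        joined, js = natural_join(joined, js,
                                  project(rel, schema, part), list(part))
    return project(joined, js, schema) == set(rel)

SCHEMA = ["A", "B", "C"]
r = {(1, 2, 3), (4, 2, 5)}

# Joining on B alone produces the spurious tuples (1, 2, 5) and (4, 2, 3).
ab_bc_ok = rejoins_to_original(r, SCHEMA, [["A", "B"], ["B", "C"]])
# Joining on A recovers r exactly for this instance.
ab_ac_ok = rejoins_to_original(r, SCHEMA, [["A", "B"], ["A", "C"]])
```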

Page 12:

Structured Query Language (SQL)

A DDL and DML for relational DBMSs

History: ANSI SQL, SQL-92 (SQL2), SQL-99 (SQL3)

SQL-99 extends SQL2 with object-relational features and other new features

Most DBMS vendors implement the core, and then add bells and whistles and variations

Query capability is close to relational algebra, with lots of extensions.

Case insensitive except for characters inside quoted strings ' '

e.g. 'Smith' ≠ 'SMITH'

; as statement delimiter

Page 13:

Example database schema

Movie(title, year, length, inColor, studioName, producerC#)

StarsIn(movieTitle, movieYear, starName)

MovieStar(name, address, gender, birthdate)

MovieExec(name, address, cert#, netWorth)

Studio(name, address, presC#)

Page 14:

SQL Queries – basic form

SELECT attribute(s)

FROM relations / views / subquery

WHERE conditional expression;

Page 15:

SQL query examples

1. Example 1:

SELECT *
FROM Movie;   -- * => all attributes of Movie

2. Example 2:

SELECT *
FROM Movie
WHERE studioName = 'Disney' AND year = 1990;

3. Example 3:

SELECT title, length
FROM Movie
WHERE studioName = 'Disney' AND year = 1990;
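To try Example 3, the query can be run against an in-memory SQLite database from Python. This is a sketch under assumptions: the column types and the sample rows are invented, and `producerC#` is double-quoted because SQLite does not accept `#` in an unquoted identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE Movie (title TEXT, year INTEGER, length INTEGER, '
    'inColor INTEGER, studioName TEXT, "producerC#" INTEGER)'
)
# Illustrative rows, not data from the slides.
conn.executemany(
    "INSERT INTO Movie VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Pretty Woman", 1990, 119, 1, "Disney", 999),
        ("Mighty Ducks", 1991, 104, 1, "Disney", 999),
        ("Star Wars", 1977, 124, 1, "Fox", 12345),
    ],
)

# Example 3 from the slide: project two columns of the selected rows.
example3 = conn.execute(
    "SELECT title, length FROM Movie "
    "WHERE studioName = 'Disney' AND year = 1990;"
).fetchall()

# Example 2 returns whole rows for the same condition.
example2 = conn.execute(
    "SELECT * FROM Movie WHERE studioName = 'Disney' AND year = 1990;"
).fetchall()
```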

Page 16:

Duplicates

SQL generally operates using bags instead of sets

Exception: the UNION, INTERSECT, and EXCEPT operations

To eliminate duplicates, add the keyword DISTINCT to the SELECT clause

e.g. SELECT DISTINCT starName
     FROM StarsIn;

Duplicate elimination is costly. Use judiciously.
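The difference between bag and set results is easy to see with SQLite from Python. The rows below are invented sample data for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE StarsIn (movieTitle TEXT, movieYear INTEGER, starName TEXT)"
)
# Illustrative rows: one star appears in two movies.
conn.executemany(
    "INSERT INTO StarsIn VALUES (?, ?, ?)",
    [
        ("Star Wars", 1977, "Harrison Ford"),
        ("Raiders of the Lost Ark", 1981, "Harrison Ford"),
        ("Star Wars", 1977, "Carrie Fisher"),
    ],
)

# Bag semantics: one output row per input row, duplicates included.
bag = [r[0] for r in conn.execute("SELECT starName FROM StarsIn")]

# DISTINCT collapses the duplicates.
distinct = [r[0] for r in conn.execute("SELECT DISTINCT starName FROM StarsIn")]
```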

Page 17:

SQL Correspondence to Relational Algebra

SELECT L    -- R.A. project
FROM R      -- R.A. operands
WHERE C;    -- R.A. select

R.A. expression: π_L(σ_C(R))

When reading and writing queries:

1. FROM -- what relations are involved
2. WHERE -- what the tuple selection criteria are
3. SELECT -- what columns to output
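The correspondence can be mimicked in plain Python, with a relation as a list of dictionaries. The function names `sigma` and `pi` and the sample rows are assumptions for this sketch. Like SQL's SELECT, this `pi` keeps duplicates.

```python
def sigma(cond, rel):
    """R.A. select: keep the rows that satisfy the condition C."""
    return [row for row in rel if cond(row)]

def pi(attrs, rel):
    """R.A. project (bag version): keep only the listed columns L."""
    return [tuple(row[a] for a in attrs) for row in rel]

# FROM: the relation involved (illustrative rows).
movie = [
    {"title": "Pretty Woman", "year": 1990, "length": 119, "studioName": "Disney"},
    {"title": "Mighty Ducks", "year": 1991, "length": 104, "studioName": "Disney"},
    {"title": "Star Wars", "year": 1977, "length": 124, "studioName": "Fox"},
]

# WHERE: the tuple selection criteria.
def is_disney_1990(row):
    return row["studioName"] == "Disney" and row["year"] == 1990

# SELECT: the output columns, applied last, as in pi_L(sigma_C(R)).
result = pi(["title", "length"], sigma(is_disney_1990, movie))
```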

Page 18:

Union, Intersection, Difference of Queries

UNION: R1 UNION R2 or (Q1) UNION (Q2)

e.g. (SELECT title, year FROM Movie)
     UNION
     (SELECT movieTitle AS title, movieYear AS year FROM StarsIn);

INTERSECT: R1 INTERSECT R2 or (Q1) INTERSECT (Q2)

EXCEPT: R1 EXCEPT R2 or (Q1) EXCEPT (Q2)   -- difference

Page 19:

Union, Intersection, Difference of Queries (continued)

Q1 and Q2 are queries that produce relations

R1 and R2, or the results of Q1 and Q2, should have the same list of attributes and attribute types. Rename if necessary.

Duplicates are eliminated automatically

Add the keyword ALL after UNION, INTERSECT, or EXCEPT to prevent duplicate elimination
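These set operations can be tried with SQLite from Python. Two caveats are specific to this sketch: SQLite does not accept parentheses around the operands of a compound query, so they are dropped here, and of the ALL variants it supports only UNION ALL. The rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movie (title TEXT, year INTEGER)")
conn.execute(
    "CREATE TABLE StarsIn (movieTitle TEXT, movieYear INTEGER, starName TEXT)"
)
conn.executemany("INSERT INTO Movie VALUES (?, ?)",
                 [("Star Wars", 1977), ("Pretty Woman", 1990)])
conn.executemany("INSERT INTO StarsIn VALUES (?, ?, ?)",
                 [("Star Wars", 1977, "Carrie Fisher"),
                  ("Star Wars", 1977, "Mark Hamill")])

Q1 = "SELECT title, year FROM Movie"
Q2 = "SELECT movieTitle AS title, movieYear AS year FROM StarsIn"

def run(op):
    """Combine Q1 and Q2 with the given operator; sort for a stable result."""
    return sorted(conn.execute(f"{Q1} {op} {Q2}").fetchall())

union = run("UNION")          # duplicates eliminated
union_all = run("UNION ALL")  # duplicates kept
intersect = run("INTERSECT")
difference = run("EXCEPT")
```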

Page 20:

SQL and Relational Algebra

The six independent relational algebra operations are implemented by SQL

SQL is relationally complete

Page 21:

Some data values in SQL

1. Strings
2. Dates and Times
3. Null values
4. Truth value of Unknown

Page 22:

1. Strings

Comparison operators (according to lexicographical order): <, >, <=, >=, =

LIKE -- pattern matching

% -- matches any sequence of 0 or more characters

_ -- matches any one character

E.g.: title LIKE 'Star ____'   (four underscores)

E.g.: title LIKE '%''s%'

Can specify an escape character

E.g. title LIKE 'x%%x%' ESCAPE 'x'
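The three patterns can be checked against a few contrived titles using SQLite from Python. The titles are invented for this sketch, including the odd `%off%`, which exists only to exercise the escape character. SQLite's LIKE ignores case for ASCII letters by default, which does not affect these results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movie (title TEXT)")
conn.executemany(
    "INSERT INTO Movie VALUES (?)",
    [("Star Wars",), ("Star Trek",), ("Stargate",),
     ("Logan's Run",), ("%off%",)],
)

def titles(where):
    """Return the titles matching a WHERE condition, in sorted order."""
    sql = f"SELECT title FROM Movie WHERE {where} ORDER BY title"
    return [r[0] for r in conn.execute(sql)]

# 'Star', a space, then exactly four characters.
four_after_star = titles("title LIKE 'Star ____'")

# An apostrophe inside a string literal is written as two apostrophes.
possessive = titles("title LIKE '%''s%'")

# With ESCAPE 'x', x% is a literal percent sign and a bare % is a wildcard.
percent_wrapped = titles("title LIKE 'x%%x%' ESCAPE 'x'")
```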

Page 23:

2. Dates and Times

Date constant: DATE '2002-10-01'

Time constant: TIME '15:00:02.5'

Timestamp (combines dates and times):

TIMESTAMP '2002-10-01 15:00:02.5'

(beware of implementation differences!)

Comparison operators apply

Page 24:

3. Null Values

NULL to represent:

1. Value unknown
2. Value inapplicable
3. Value withheld

Operations involving NULL:

1. Arithmetic operation: result is NULL
2. Comparison: result is UNKNOWN

NULL is not a constant, therefore NULL cannot be used explicitly as an operand.

IS NULL and IS NOT NULL checks

Read "Pitfalls Regarding Nulls" pp. 250
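The effect of NULL on arithmetic and comparisons can be observed with SQLite from Python, where NULL comes back as `None`. The table and rows are invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movie (title TEXT, length INTEGER)")
conn.executemany(
    "INSERT INTO Movie VALUES (?, ?)",
    [("Star Wars", 124), ("Untitled", None)],  # None is stored as NULL
)

# Arithmetic on NULL yields NULL.
arithmetic = conn.execute(
    "SELECT title, length + 1 FROM Movie ORDER BY title"
).fetchall()

# Both comparisons are UNKNOWN for the NULL row, so the row is not selected
# even though the condition looks like it should always be true.
always_true = conn.execute(
    "SELECT title FROM Movie WHERE length = 124 OR length <> 124"
).fetchall()

# IS NULL is the way to test for a NULL value.
missing = conn.execute(
    "SELECT title FROM Movie WHERE length IS NULL"
).fetchall()
```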

Page 25:

4. UNKNOWN

Consider TRUE = 1, FALSE = 0, UNKNOWN = 0.5

1. AND of 2 truth values = min. of the 2 values
2. OR of 2 truth values = max. of the 2 values
3. Negation of v = 1 - v

Refer to Fig. 6.2 pp. 250 for the truth table for 3-valued logic
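The min/max/1 - v encoding can be written out directly. This is a small sketch with made-up function names; `Fraction` is used so that 1/2 is represented exactly.

```python
from fractions import Fraction

TRUE, UNKNOWN, FALSE = Fraction(1), Fraction(1, 2), Fraction(0)
NAMES = {TRUE: "TRUE", UNKNOWN: "UNKNOWN", FALSE: "FALSE"}

def t_and(x, y):
    return min(x, y)

def t_or(x, y):
    return max(x, y)

def t_not(x):
    return 1 - x

# Build the full 3-valued truth table as (x, y) -> (x AND y, x OR y).
table = {
    (NAMES[x], NAMES[y]): (NAMES[t_and(x, y)], NAMES[t_or(x, y)])
    for x in (TRUE, UNKNOWN, FALSE)
    for y in (TRUE, UNKNOWN, FALSE)
}
```

For example, `table[("TRUE", "UNKNOWN")]` is `("UNKNOWN", "TRUE")`, and the negation of UNKNOWN is UNKNOWN.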

Page 26:

The Six Clauses in SQL Queries

1. SELECT -- required
2. FROM -- required
3. WHERE
4. GROUP BY
5. HAVING -- if used, must follow a GROUP BY clause
6. ORDER BY

Subqueries may appear in the FROM clause and the WHERE clause

Comments begin with '--'

Page 27:

Table level SQL (ref. 6.6, pp. 292)

Create table – to define the schema of a base table (ref. 6.6.1 for data types syntax)

E.g. create table EMP (
         empno int not null,
         lastName varchar(30) not null,
         firstName varchar(30) not null,
         num_of_children int,
         constraint pk_EMP primary key (empno)
     );

Drop table – to destroy a base table

e.g. drop table EMP;
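The two statements can be run as written against an in-memory SQLite database from Python. The `table_names` helper is an assumption of this sketch; it reads SQLite's own catalog table `sqlite_master`, which other DBMSs do not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.execute("""
    create table EMP (
        empno int not null,
        lastName varchar(30) not null,
        firstName varchar(30) not null,
        num_of_children int,
        constraint pk_EMP primary key (empno)
    )
""")

def table_names(connection):
    """List the base tables recorded in SQLite's catalog."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
    return [r[0] for r in rows]

before_drop = table_names(conn)

conn.execute("drop table EMP")
after_drop = table_names(conn)
```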

Page 28:

Tuple Modification Statements (ref. 6.5, pp. 286)

Insert – to add a row

Syntax: insert into R(A1, …, An) values (v1, …, vn)

– E.g. insert into emp(empno, lastName, firstName, num_of_children) values (12345, 'Doe', 'John', 1)

– Or insert into emp values (12345, 'Doe', 'John', 1)

Delete – to remove a row

Syntax: delete from R where <condition>

– E.g. delete from emp where empno = 12345

Update – to modify the contents of a row

Syntax: update R set Ai = value where Aj = targetValue

– E.g. update emp set num_of_children = 2 where empno = 12345
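The three statements can be exercised in sequence with SQLite from Python. This sketch assumes the `emp` table has the same columns as the EMP table defined for the create table example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table emp (empno int not null, lastName varchar(30) not null, "
    "firstName varchar(30) not null, num_of_children int, "
    "constraint pk_EMP primary key (empno))"
)

def all_rows():
    return conn.execute("select * from emp").fetchall()

# Insert: add a row, naming the columns explicitly.
conn.execute(
    "insert into emp(empno, lastName, firstName, num_of_children) "
    "values (12345, 'Doe', 'John', 1)"
)
after_insert = all_rows()

# Update: change one attribute of the rows matching the condition.
conn.execute("update emp set num_of_children = 2 where empno = 12345")
after_update = all_rows()

# Delete: remove the rows matching the condition.
conn.execute("delete from emp where empno = 12345")
after_delete = all_rows()
```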

Page 29:

Some JOINS in SQL (ref. pp. 270)

CROSS JOIN -- R.A. cartesian product

e.g. Movie CROSS JOIN StarsIn;

JOIN … ON -- R.A. theta-join

e.g. Movie JOIN StarsIn ON title = movieTitle AND year = movieYear;

NATURAL JOIN -- R.A. natural join

e.g. MovieStar NATURAL JOIN MovieExec;

Note: in standard SQL the keyword NATURAL is required for a natural join; a JOIN without NATURAL needs an ON or USING clause.

OUTER JOINS -- joins that include dangling tuples
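The three join forms can be compared on small tables with SQLite from Python. The rows and column types are invented for this sketch, and `cert#` is double-quoted because SQLite does not accept `#` in an unquoted identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie (title TEXT, year INTEGER);
    CREATE TABLE StarsIn (movieTitle TEXT, movieYear INTEGER, starName TEXT);
    CREATE TABLE MovieStar (name TEXT, address TEXT, gender TEXT, birthdate TEXT);
    CREATE TABLE MovieExec (name TEXT, address TEXT, "cert#" INTEGER, netWorth INTEGER);

    INSERT INTO Movie VALUES ('Star Wars', 1977), ('Pretty Woman', 1990);
    INSERT INTO StarsIn VALUES
        ('Star Wars', 1977, 'Carrie Fisher'),
        ('Star Wars', 1977, 'Mark Hamill'),
        ('Pretty Woman', 1990, 'Julia Roberts');
    INSERT INTO MovieStar VALUES
        ('Mary Tyler Moore', 'Maple St.', 'F', '9/9/99'),
        ('Tom Hanks', 'Cherry Ln.', 'M', '8/8/88');
    INSERT INTO MovieExec VALUES
        ('Mary Tyler Moore', 'Maple St.', 12345, 100),
        ('George Lucas', 'Oak Rd.', 23456, 200);
""")

# Cartesian product: every Movie row paired with every StarsIn row.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM Movie CROSS JOIN StarsIn"
).fetchone()[0]

# Theta-join: only the pairs that satisfy the ON condition.
theta = conn.execute(
    "SELECT title, year, starName FROM Movie JOIN StarsIn "
    "ON title = movieTitle AND year = movieYear ORDER BY starName"
).fetchall()

# Natural join: rows equal on the shared columns name and address.
natural = conn.execute(
    "SELECT name FROM MovieStar NATURAL JOIN MovieExec"
).fetchall()
```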

Page 30:

OUTER JOINS

An operator to augment the result of a join by the dangling tuples, padded with null values.

FULL outerjoin of R1 and R2 is a join that includes all rows from R1 and R2, matched or not. Unmatched rows are padded with NULLs.

LEFT outerjoin of R1 and R2 is a join that includes all rows from R1, matched or not, plus the matching values from R2. Unmatched rows are padded with NULLs.

RIGHT outerjoin of R1 and R2 is a join that includes all rows from R2, matched or not, plus the matching values from R1. Unmatched rows are padded with NULLs.

The joining may be NATURAL or theta join
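A left outerjoin with a dangling tuple can be seen with SQLite from Python. The rows are invented for this sketch. Only the LEFT form is run here because RIGHT and FULL outer joins need SQLite 3.39 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie (title TEXT, year INTEGER);
    CREATE TABLE StarsIn (movieTitle TEXT, movieYear INTEGER, starName TEXT);

    INSERT INTO Movie VALUES ('Star Wars', 1977), ('Pretty Woman', 1990);
    -- No StarsIn row for Pretty Woman, so that Movie tuple is dangling.
    INSERT INTO StarsIn VALUES ('Star Wars', 1977, 'Carrie Fisher');
""")

ON_CLAUSE = "ON title = movieTitle AND year = movieYear"

# Plain join: the dangling Movie tuple disappears.
inner = conn.execute(
    f"SELECT title, year, starName FROM Movie JOIN StarsIn {ON_CLAUSE} "
    "ORDER BY title"
).fetchall()

# Left outerjoin: the dangling tuple is kept, padded with NULL (None).
left_outer = conn.execute(
    f"SELECT title, year, starName FROM Movie LEFT OUTER JOIN StarsIn "
    f"{ON_CLAUSE} ORDER BY title"
).fetchall()
```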

Page 31:

Outerjoins Syntax

1. R1 NATURAL {FULL | LEFT | RIGHT} OUTER JOIN R2;

E.g. 1. MovieStar NATURAL FULL OUTER JOIN MovieExec;

E.g. 2. MovieStar NATURAL LEFT OUTER JOIN MovieExec;

E.g. 3. MovieStar NATURAL RIGHT OUTER JOIN MovieExec;

Page 32:

Outerjoins Syntax (continued)

2. R1 {FULL | LEFT | RIGHT} OUTER JOIN R2 ON conditional expression;

E.g. 1. Movie FULL OUTER JOIN StarsIn ON title = movieTitle AND year = movieYear;

E.g. 2. Movie LEFT OUTER JOIN StarsIn ON title = movieTitle AND year = movieYear;

E.g. 3. Movie RIGHT OUTER JOIN StarsIn ON title = movieTitle AND year = movieYear;

Page 33:

Use result of joins as subqueries in queries

E.g.

SELECT title, year, length, inColor, studioName, producerC#, starName
FROM Movie JOIN StarsIn ON
     title = movieTitle AND year = movieYear;