Top Banner
® © 2005 IBM Corporation Access Path Selection and Physical DB Design in Relational Database Management Systems: An Overview Padua University – 10 June 2005
31

Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

Mar 22, 2020

Download

Documents

dariahiddleston
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

®

©2005 IB

M C

orp

ora

tion

Acce

ss P

ath

Se

lectio

n a

nd

P

hysic

al D

B D

esig

n

in R

ela

tio

na

l D

ata

ba

se

Ma

na

ge

me

nt

Syste

ms:

An O

verv

iew

����������

������� �����������

Padua U

niv

ers

ity –

10 J

un

e 2

005

Page 2: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

2

��������

Co

nte

nts

�R

DB

MS

an

d S

QL L

angua

ge

�S

tatic S

QL a

nd D

ynam

ic S

QL

�M

app

ing t

he P

hysic

al D

esig

n

�Q

uery

Com

pile

r O

verv

iew

�Q

uery

Optim

izatio

n

�A

ccess P

ath

Sele

ction

�Q

uery

Rew

rite

�E

xam

ple

s

�T

he O

ptim

izer

Exp

lain

ed

�T

he R

ole

of th

e D

BA

Page 3: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

3

��������

Re

lation

al M

ode

l &

Rela

tio

na

l D

ata

base M

ana

gem

ent

Syste

ms

�T

he r

ela

tio

na

l m

ode

l is

the a

pp

lication o

f

�p

red

icate

lo

gic

and

�set

theo

ry

to d

ata

base m

anag

em

ent

�T

heory

: p

red

icate

lo

gic

and s

et

math

em

ati

cs

�S

tructu

re: R

-tab

les

�In

tegrity

: d

om

ain

, co

lum

n, ta

ble

and d

ata

base inte

grity

�M

anip

ula

tion: R

-op

era

tio

ns

(restr

ict, p

roje

ct, join

, etc

.)

Page 4: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

4

��������

R-o

pe

ratio

ns:

Th

e S

QL

La

ng

ua

ge

SELECT *

FROM Today_Orders

�Q

uery

constr

ucts

�S

et O

riente

d

�Logic

al “e

xpre

ssio

ns”

�N

O r

efe

rence t

o a

ny p

hysic

al str

uctu

res

�F

unctiona

lly in

depe

nde

nt

from

ph

ysic

al str

uctu

res

�Q

uery

perf

orm

ance

�H

ighly

(but not fu

lly)

independent fr

om

language c

onstr

ucts

�H

ighly

dependent on D

B p

hysic

al desig

n

Page 5: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

5

��������

R-o

pe

ratio

ns:

Ma

pp

ing

th

e P

hysic

al D

esig

n

SELECT *

FROM Today_Orders

CREATE VIEW Today_OrdersAS

SELECT O.CUST#, O.ORD#, RO.PROD#, RO.QTY

FROM Orders

O

, Order_Lines RO

WHERE O.ODR# = RO.ORD#

AND O.ORDER_DATE = current date

ORDER BY O.CUST#, O.ORD#, RO.PROD#

Ind

exes

Ind

exes

Page 6: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

6

��������

Sta

tic S

QL

�P

re-d

efin

ed s

ynta

x,

hard

-cod

ed into

a h

ost la

ng

ua

ge

(e.g

., C

ob

ol, C

, C

++

, etc

.)

�It m

ay r

efe

rence (

usually

does)

host-

pro

gra

m v

ariable

s

�E

xam

ple

EXEC SQL DECLARE CRS1CURSOR FOR

SELECT C1, C2, C3

FROM T1

WHERE C4 = :c4

OPEN CRS1

FETCH CRS1INTO :c1, :c2, :c3

…….

CLOSE CRS1

�M

ostly u

sed f

or

OLT

P a

nd B

atc

h p

rocessin

g, w

here

data

nee

ds a

re k

now

n in a

dvance

Page 7: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

7

��������

Dyn

am

ic S

QL

�D

ata

needs tota

lly o

r part

ially

unknow

n: e.g

.

�Q

uery

too

ls

�Q

uery

typed b

y e

nd-u

ser

on a

bro

wser

page a

nd b

uffere

d b

y

applic

ation p

rogra

m

�E

xam

ple

EXEC SQL PREPARE STMTINTO :MYSQLDA FROM :SQLBUFFER

EXEC SQL DECLARE CRS1CURSOR FOR STMT

EXEC SQL OPEN CRS1USING :C

EXEC SQL FETCH CRS1INTO :C1, :C2, :C3

….

EXEC SQL CLOSE CRS1

SQLBUFFER = ‘SELECT C1, C2, C3 FROM T1 WHERE C4 = ?’

Page 8: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

8

��������

Qu

ery

Co

mp

iler

Ove

rvie

w

SQ

L Q

uery

Pars

er

Glo

bal Q

uery

Sem

antics

Query

Re-W

rite

Pla

nO

ptim

ization

Code

Genera

tio

n

Qu

ery

Gra

ph

Mo

del

Query

Pla

n

Query

Exp

lain

Pla

nE

xp

lain

Executa

ble

Pla

n

Page 9: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

9

��������

Ste

ps in

Qu

ery

Co

mp

ilatio

n

�P

ars

ing

�A

na

lyze "

text"

of S

QL q

uery

�D

ete

ct synta

x e

rrors

�C

reate

inte

rna

l qu

ery

re

pre

senta

tio

n

�S

em

antic C

heckin

g

�V

alid

ate

SQ

L s

tate

ment

�V

iew

ana

lysis

�In

corp

ora

te c

onstr

ain

ts,

trig

gers

, etc

.

�Q

uery

Optim

ization

�M

od

ify q

uery

to im

pro

ve

perf

orm

ance (

Query

Re

write

)

�C

hoose the m

ost effic

ient

"access p

lan"

(Query

O

ptim

ization)

�C

ode G

enera

tion: genera

te

code that is

�E

xecuta

ble

�E

ffic

ient

Query

Com

pila

tion e

xecute

d d

uring

�A

pplic

ation P

rogra

m B

IND

pro

cess (

Sta

tic S

QL)

�E

xecution o

f S

QL P

repare

(D

ynam

ic S

QL)

Page 10: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

10

��������

Qu

ery

Op

tim

iza

tio

n:

Acce

ss P

ath

Se

lectio

n

�B

ase

d o

n e

stim

ating Q

uery

execu

tion c

ost

�C

PU

�I/O

�E

lapsed T

ime

�…

of A

ccess P

ath

variations b

ased o

n

�U

sin

g d

iffe

rent in

dex(e

s)

/ no index a

t all

�U

sin

g d

iffe

rent jo

in s

equence

�U

sin

g d

iffe

rent jo

in m

eth

od

�U

sin

g q

uery

para

llelis

m

�…

.

�Less e

xpe

nsiv

e A

cce

ss P

ath

chosen

Page 11: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

11

��������

Wh

at’s a

n I

nd

ex

�A

n Index is a

physic

al

str

uctu

re a

llow

ing d

irect (a

nd

ord

ere

d)

access to r

ecord

s

matc

hin

g the v

alu

es o

f th

e

Index K

ey o

r a p

ort

ion o

f it

�A

n Index k

ey c

an b

e

�sin

gle

-colu

mn o

r m

ulti-colu

mn

�U

niq

ue o

r no

n-u

niq

ue

�A

dditio

nal o

ptions m

ight

ap

ply

(e

.g.

Clu

ste

r)

�M

ost com

mon Indexes h

ave a

B

-tre

e s

tructu

re (

see fig

ure

)

�In

dexes p

rovid

e p

erf

orm

ance

advanta

ge for

data

retr

ieval,

but in

cre

ase the c

ost of D

ele

te

/ In

sert

/ U

pdate

opera

tions

Page 12: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

12

��������

Acce

ss P

ath

Qu

iz –

1:

Sin

gle

Ta

ble

Acce

ss

�T

1 =

Rela

tional T

able

(not a V

iew

)

�S

ingle

table

in s

ingle

ph

ysic

al file

�T

CA

RD

=

1,0

00,0

00

�In

de

x I

X1 o

n (

C1),

with D

up

licate

s

�In

de

x I

X2 o

n (

C2),

with D

up

licate

s

�In

de

x I

X3 o

n (

C4),

Uniq

ue

�W

hic

h Index m

ight pro

vid

e the b

est A

ccess P

ath

?

�A

ny s

uggestions for

impro

vin

g the A

ccess P

ath

?

SELECT C1, C2, C3, C4

FROM T1

WHERE C1 = ?

AND C2 = ?

ORDER BY C3

Page 13: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

13

��������

Acce

ss P

ath

Qu

iz 1

: F

ilte

r F

acto

rs

�F

F(C

ol=

litera

l) =

1 / C

ol-C

AR

D

�C

ol-C

AR

D

= N

um

ber

of dis

tinct valu

es for

Col

�T

CA

RD

= N

um

ber

of tu

ple

sin

Table

�E

xam

ple

:

�C

1-C

AR

D=

10,0

00; C

2-C

AR

D =

100,0

00

�F

F(C

1 =

litera

l) =

1 / 1

0,0

00 =

0,0

001 (i.e

. 0,0

1%

)

�F

F(C

2 =

litera

l) =

1 / 1

00,0

00 =

0,0

0001 (

i.e. 0,0

01%

)

SELECT C1, C2, C3, C4

FROM T1

WHERE C1 = ?

AND C2 = ?

ORDER BY C3

Page 14: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

14

��������

Acce

ss P

ath

Qu

iz 1

: C

om

pa

rin

g I

nd

ex A

cce

ss

�U

sin

g IX

1 (

C1)

�Q

ualif

yin

g T

uple

s (

estim

ate

) =

10**

6*

10**

(-4

)=

100

�R

etr

ieve a

ll 10

0 q

ualif

yin

g T

uple

s,

app

ly p

redic

ate

on C

2,

sort

(O

RD

ER

B

Y C

3)

�U

sin

g IX

2 (

C2)

�Q

ualif

yin

g T

uple

s (

estim

ate

) =

10**

6*

10**

(-5

)=

10

�R

etr

ieve a

ll 10 q

ualif

yin

g T

uple

s,

apply

pre

dic

ate

on C

1,

sort

�W

hat about S

kew

ing?

�W

hat if IX

4(C

1,C

2)?

�S

uppose n

um

ber

of

Dis

tinct

Ke

ys in I

X4 =

50

0,0

00

�F

F(C

1 =

litera

l A

ND

C2 =

litera

l) =

4 *

10**

(-6

)

�C

an t

he R

DB

MS

derive F

F(C

1 =

litera

l, C

2 =

litera

l) b

y its

kn

ow

ledge o

f F

F(C

1 =

litera

l) a

nd F

F(C

2 =

litera

l) ?

�A

ny a

dvanta

ge, if a

bove h

old

s tru

e, w

ith IX

4(C

1,C

2,C

3)

?

Page 15: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

15

��������

Acce

ss P

ath

Qu

iz 1

: S

ke

win

g

y

1000y

Colu

mn V

alu

e

Frequency

Avera

ge

Actu

al

Actu

al

Actu

al

Page 16: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

16

��������

Ske

win

g:

No

tes

�F

ilter

Facto

r assum

es U

niform

Dis

trib

ution o

f V

alu

es

Fre

que

ncy

�T

hat’s u

su

ally

not

the c

ase in r

ea

l lif

e s

ituations

�R

DB

MS

Optim

izer

needs to k

now

more

on v

alu

es fre

quency

(e.g

. dis

trib

ution s

tatistics)

�Q

uery

perf

orm

ance w

ill b

e h

igh

ly d

epen

de

nt

on

valu

es s

pe

cifie

d in p

red

icate

s

�F

or

better

optim

ization, R

DB

MS

Optim

izer

must know

com

parison v

alu

es in p

redic

ate

at com

pile

tim

e

�U

se D

ynam

ic S

QL w

ith

out

para

mete

r m

ark

ers

; or

..

�R

e-o

ptim

ize S

QL q

uery

at

execution t

ime

Page 17: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

17

��������

Acce

ss P

ath

Qu

iz –

2:

Jo

in

�T

1, T

2 =

Rela

tional T

able

s (

not a V

iew

s)

�T

1-C

AR

D =

500,0

00

�T

2-C

AR

D =

5,0

00,0

00

�F

ilter

facto

rs

�F

F(T

1.C

2 =

litera

l) =

10**

(-4

)

�F

F(T

2.C

3 =

litera

l) =

10**

(-6

)

�N

o Indexes: H

ow

would

you m

anage the join

?

�W

hic

h Indexes w

ould

you r

ecom

mend?

SELECT T1.C1, T1.C2, T2.C3

FROM T1

, T2

WHERE T1.C1 = T2.C1

AND T1.C2 = ?

AND T2.C3 = ?

ORDER BY T1.C1

Page 18: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

18

��������

Acce

ss P

ath

Qu

iz –

2:

Mo

st C

om

mo

n J

oin

Me

tho

ds

�N

este

d L

oop J

oin

(N

LJ)

�F

or

each q

ualif

yin

g tuple

of T

x, sele

ct m

atc

hin

g tuple

sfr

om

Ty

�U

sually

, an Index o

n T

y.J

Cs

used

�M

erg

e S

ca

n J

oin

(M

SJ)

�S

ort

qualif

yin

g tuple

sfr

om

Tx

on join

colu

mns

�S

ort

qualif

yin

g tuple

sfr

om

Ty

on join

colu

mns

�M

erg

e tuple

sm

atc

hin

g join

pre

dic

ate

s

�H

ash J

oin

(H

J)

�S

imila

r to

NLJ

�H

ashin

g u

sed to s

peed u

p r

etr

ieval of m

atc

hin

g tuple

sfr

om

Ty

Page 19: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

19

��������

Qu

ery

Re

wri

te

�R

ew

riti

ng

a g

iven

SQ

L q

uery

in

to a

sem

an

tically e

qu

ivale

nt

form

th

at

�m

ay b

e p

rocessed m

ore

eff

icie

ntly

�giv

es t

he O

ptim

izer

more

latitu

de

�W

hy

�S

am

e q

uery

ma

y h

ave m

ultip

le r

epre

se

nta

tio

ns in S

QL

�C

om

ple

x q

ueri

es o

ften r

esult in r

edun

dancy,

especia

lly w

ith v

iew

s

�Q

uery

genera

tors

�o

ften

pro

du

ce s

ub

op

tim

al

qu

eri

es t

hat

do

n't

perf

orm

well

�d

on

't p

erm

it "

han

d o

pti

miz

ati

on

"

�D

B2 c

ap

ab

ilit

ies is b

ased

on

Sta

rbu

rst

Qu

ery

Rew

rite

�R

ule

-base

d q

uery

re

wri

te e

ng

ine

�T

ransfo

rms legal Q

GM

into

more

eff

icie

nt

QG

M

�T

erm

inate

s w

hen n

o r

ule

s e

ligib

le o

r b

udg

et

excede

d

Page 20: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

20

��������

Qu

ery

Re

wri

te:

Exa

mp

les

�E

quiv

ale

nt pre

dic

ate

s

�N

AM

E L

IKE

‘a%

�S

UB

ST

R(N

AM

E,1

,1)

= ‘a’

�T

ransfo

rm S

ubsele

ctin

to J

oin

SELECT T1.*

FROM T1

WHERE EXISTS

(SELECT *

FROM T2

WHERE T1.K = T2.K)

SELECT DISTINCT T1.*

FROM T1

, T2

WHERE T1.K = T2.K

�V

iew

Merg

e

�A

llow

s a

dditio

nal jo

ins o

rder

�C

an e

limin

ate

redun

dant

join

s

�R

edundant jo

in e

limin

ation

�S

atisfies m

ultip

le r

efe

rences t

o t

he s

am

e t

ab

le w

ith a

sin

gle

scan

Page 21: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

21

��������

Qu

ery

Re

wri

te:

Su

bq

ue

ry M

ad

ne

ss!

SE

LE

CT

*

FR

OM

( S

ELE

CT

F

LA

G,

TO

_N

UM

BE

R (

NU

M )

NU

M

FR

OM

SU

BT

ES

T

WH

ER

EF

LA

G =

'N' )

WH

ER

EN

UM

> 0

;

�E

RR

OR

: O

RA

-01722: in

valid

num

ber

�R

eason: query

rew

rite

SE

LE

CT

F

LA

G, T

O_N

UM

BE

R (

NU

M )

NU

M

FR

OM

SU

BT

ES

T

WH

ER

EF

LA

G =

'N'

AN

D T

O_N

UM

BE

R (

NU

M )

> 0

;

and p

redic

ate

sequence c

hanged!!!

�“M

odel vs. im

ple

menta

tion is o

ne o

f th

e g

reat lo

gic

al d

iffe

ren

ces”

(C.

Date

)

Zdv

C

428

N

23

N

Abc

C

100

N

NU

MF

LA

G

Page 22: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

22

��������

Op

tim

ize

r E

xp

lain

ed

–1

ase

lect

co

d_n

dg

_co

ntr

op

arte

, co

un

t(*)

fr

om

rai

nb

ow

.co

inte

staz

ion

e ct

wh

ere

exis

ts (

sele

ct *

fr

om

rai

nb

ow

.co

ntr

op

arte

cp

w

her

e cp

.idn

_co

ntr

op

arte

=

ct.id

n_c

oin

test

atar

ioan

dcp

.idn

_cr_

tip

o_c

on

tro

par

te =

421

9)

gro

up

by

cod

_nd

g_c

on

tro

par

te

hav

ing

co

un

t(*)

> 1

0 o

rder

by

2 d

esc

SE

LE

CT

Q4.

$C1

AS

"C

OD

_ND

G_C

ON

TR

OP

AR

TE",

Q4.

$C0

FR

OM

(S

EL

EC

T C

OU

NT

(* )

, Q3.

$C0

FR

OM

(S

ELE

CT

Q2.

CO

D_N

DG

_CO

NT

RO

PA

RT

E

FR

OM

RA

INB

OW

.CO

NTR

OP

AR

TE

AS

Q1

, R

AIN

BO

W.C

OIN

TE

STA

ZIO

NE

A

S Q

2 W

HE

RE

(Q1.

IDN

_CO

NT

RO

PA

RT

E =

Q

2.ID

N_C

OIN

TE

STA

TAR

IO)

AN

D (

Q1.

IDN

_CR

_TIP

O_C

ON

TR

OP

AR

TE

= 42

19))

AS

Q3

GR

OU

P B

Y Q

3.$C

0) A

S Q

4 W

HE

RE

(10

< Q

4.$C

0)

OR

DE

R B

Y Q

4.$C

0 D

ES

C

Page 23: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

23

��������

Op

tim

ize

r E

xp

lain

ed

–1

b (

afte

r R

eo

rg)

sele

ct

cod

_nd

g_c

on

tro

par

te, c

ou

nt(

*)

fro

m r

ain

bo

w.c

oin

test

azio

ne

ctw

her

e ex

ists

(se

lect

*

fro

m r

ain

bo

w.c

on

tro

par

te c

p

wh

ere

cp.id

n_c

on

tro

par

te =

ct

.idn

_co

inte

stat

ario

and

cp.id

n_c

r_ti

po

_co

ntr

op

arte

= 4

219)

g

rou

p b

y co

d_n

dg

_co

ntr

op

arte

h

avin

g c

ou

nt(

*) >

10

ord

er b

y 2

des

c

SE

LE

CT

Q4.

$C1

AS

"C

OD

_ND

G_C

ON

TR

OP

AR

TE",

Q4.

$C0

FR

OM

(S

EL

EC

T C

OU

NT

(* )

, Q3.

$C0

FR

OM

(S

ELE

CT

Q2.

CO

D_N

DG

_CO

NT

RO

PA

RT

E

FR

OM

RA

INB

OW

.CO

NTR

OP

AR

TE

AS

Q1

, R

AIN

BO

W.C

OIN

TE

STA

ZIO

NE

A

S Q

2 W

HE

RE

(Q1.

IDN

_CO

NT

RO

PA

RT

E =

Q

2.ID

N_C

OIN

TE

STA

TAR

IO)

AN

D (

Q1.

IDN

_CR

_TIP

O_C

ON

TR

OP

AR

TE

= 42

19))

AS

Q3

GR

OU

P B

Y Q

3.$C

0) A

S Q

4 W

HE

RE

(10

< Q

4.$C

0)

OR

DE

R B

Y Q

4.$C

0 D

ES

C

Page 24: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

24

��������

Th

e O

ptim

ize

r: S

um

ma

ry

�V

ery

cle

ver

pie

ce o

f code

�B

eh

avio

r heavily

dep

ende

nt

on lo

gic

al a

nd p

hysic

al

desig

n

�N

um

ber

of jo

ins (

logic

al desig

n)

�In

dexes a

nd their c

hara

cte

ristics

�A

lso d

ep

en

dent

on

�R

DB

MS

configura

tion (

e.g

. buffer

siz

e, sort

heap s

ize, etc

.)

�N

um

ber

of C

PU

and C

PU

pow

er

�I/O

configura

tion

Page 25: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

25

��������

DB

A R

ole

: M

ost

rele

va

nt

pe

rfo

rma

nce

kn

ob

s

�R

DB

MS

co

nfigura

tion s

ett

ings

�H

ighly

dependent on n

am

ed R

DB

MS

�Log

ical d

esig

n

�B

e a

ware

of cost of over-

norm

aliz

ation

�H

igh

er

num

ber

of

join

s

�H

igh

er

num

ber

of

join

ed t

ab

les

�P

hysic

al desig

n

�N

um

ber

of in

dexes

�In

dex k

ey

�B

e a

ware

of

the m

ain

ten

ance

cost

of

each inde

x

�G

oal: b

ala

nce t

hro

ug

hput

vs.

sin

gle

query

pe

rform

ance

Page 26: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

26

��������

DB

A R

ole

: A

Me

tho

d A

pp

roa

ch

to

SQ

L O

ptim

iza

tio

n

�E

nsure

the e

ntire

Data

Access L

ogic

is p

art

of th

e S

QL

sta

tem

ent

�A

void

ha

ndlin

g p

redic

ate

s o

r re

latio

nal o

pera

tors

(e.g

. jo

in)

inapplic

atio

n c

od

e,

unle

ss r

eq

uir

ed b

y a

kno

wn

pro

duct

limitation

�R

un O

ptim

izer

Expla

in

�U

nders

tand the p

rovid

ed info

rmation

�U

nders

tand w

heth

er

there

is r

oom

for

impro

vem

ent

�A

t th

e S

QL s

yn

tax level

�A

t th

e D

B d

esig

n leve

l

�U

nders

tand the c

ost of im

ple

menta

tion a

nd p

ote

ntial im

pact on

exis

ting a

pplic

ations.

�E

.g.

addin

g ind

exes

�R

e-o

rderin

g in

de

x c

olu

mns

�Im

ple

ment changes

Page 27: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

27

��������

Th

e I

mp

act

of P

oo

r O

ptim

iza

tio

n

�In

the p

ast, d

ue t

o lim

ited m

em

ory

siz

e

�I/O

bottle

necks

�Long e

lapsed tim

es

�T

oday,

with

very

pow

erf

ul C

PU

s a

nd v

ery

larg

e

mem

ory

siz

e

�H

igh C

PU

utiliz

ation

�T

hro

ughput lo

wer

than e

xpecte

d

�Lockin

g c

onte

ntion

We a

ll te

nd to o

verlook the im

pact of share

d r

esourc

es, due to the

incre

dib

le a

mount of re

sourc

es a

vaila

ble

on o

ur

ow

n s

ingle

-user

PC

Page 28: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

28

��������

Co

nte

nts

Re

vie

w

�R

DB

MS

an

d S

QL L

angua

ge

�S

tatic S

QL a

nd D

ynam

ic S

QL

�M

app

ing t

he P

hysic

al D

esig

n

�Q

uery

Com

pile

r O

verv

iew

�Q

uery

Optim

izatio

n

�A

ccess P

ath

Sele

ction

�Q

uery

Rew

rite

�E

xam

ple

s

�T

he O

ptim

izer

Exp

lain

ed

�T

he R

ole

of th

e D

BA

Page 29: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

29

��������

Page 30: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

30

��������

Bib

liogr

aphy

-1

1.

Selin

ger

et

al.,

“Access P

ath

Sele

ctio

n in a

Rela

tiona

l D

ata

base

Manag

em

ent

Syste

m”,

Sig

mo

d 1

979,

htt

p:/

/ww

w.c

s.b

erk

ele

y.e

du/~

bre

wer/

cs26

2/A

ccessP

ath

.pdf

2.

Ioannid

is a

nd K

ang,

“Ran

do

miz

ed A

lgorith

ms f

or

Optim

izin

g L

arg

e J

oin

Queries”,

Sig

mod

199

0,

htt

p:/

/ww

w.c

c.g

ate

ch.e

du/c

om

puting/D

ata

base/r

eadin

ggro

up

/art

icle

s/p

31

2-

ioan

nid

is.p

df

3.

Ham

id P

irahesh, T

. Y

. C

liff

Leung,

Waqar

Ha

san,

"A R

ule

Engin

e f

or

Query

T

ransfo

rmation in S

tarb

urs

t an

d I

BM

DB

2 C

/S D

BM

S",

IC

DE

1997,

pp.

391-4

00,

htt

p:/

/ww

w.d

iku

.dk/u

nderv

isnin

g/2

003f/

72

9/p

apers

/rule

_en

gin

e.p

df

4.

Pete

r G

assner,

Gu

y M

. Lohm

an,

K.

Bern

hard

Schie

fer,

Yu

nW

ang,

"Query

Optim

ization in

the I

BM

DB

2 F

am

ily",

Data

Engin

eeri

ng B

ulle

tin 1

6(4

): 4

-18

(1993),

ftp

://f

tp.r

esearc

h.m

icro

soft

.com

/pub/d

ebull/

93D

EC

-CD

.pdf

Page 31: Access Path Selection and Physical DB Design in Relational ...pretto/bd2005/RDBMS_Query_Optimizer_2005.pdf · Access Path Selection and Physical DB Design in Relational Database Management

IBM

Softw

are

Gro

up

31

��������

Bib

liogr

aphy

-2

1.

Subqu

ery

Mad

ness! at

htt

p:/

/fiv

e.p

airlis

t.net/

pip

erm

ail/

ora

cle

-

art

icle

/20

04/0

00012.h

tml;

htt

p:/

/dba-

ora

cle

.com

/ora

cle

_n

ew

s/2

004

_9_1

4a_

200

4.h

tm;

htt

p:/

/ww

w.d

bd

ebunk.c

om

/pa

ge/p

ag

e/1

35

13

81.h

tm

2.

Cost C

ontr

ol: Insid

e the O

racle

Optim

izer,

htt

p:/

/ww

w.o

racle

.com

/techno

log

y/o

ram

ag/w

ebcolu

mns/2

00

3/t

echart

icle

s/b

urleso

n_cb

o_p

t1.h

tml

3.

DB

2 R

ed

books:

htt

p://w

ww

.re

dbooks.ibm

.co

m/

4.

Ora

cle

White P

apers

:

htt

p:/

/ora

cle

.itt

oolb

ox.c

om

/do

cum

ents

/defa

ult.a

sp?S

ection=

White+

Papers