Top Banner
Why SQL DBMS’s Still Lack Full Declarative Constraint Support (A polite excuse) Toon Koppelaars
47

Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Dec 02, 2014

Download

Software

Kyle Hailey

Toon Koppelaars presentation on Oracle database constraints
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Why SQL DBMS’s Still

Lack Full Declarative

Constraint Support

(A polite excuse)

Toon Koppelaars

Page 2: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Who am I?

• A database guy, relational dude

– Developer (with DBA experience)

• Oracle technology user since 1987

• Co-author of this book

• Today’s talk = last chapter

Page 3: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Agenda

• SQL Assertions?

– A few examples

• Validation Execution Models

– More efficient along the way

• Serializability

– Probably the biggest problem

• Conclusions

Page 4: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

SQL Assertions?

• Examples

1. There is at most one president

2. One cannot manage more than two

departments

3. Department with president and/or

manager should have an administrator

Data integrity constraints

– Just like: CHECK, PK, UK, FK

– ‘all the other ones’

They constrain

data allowed in

our tables

They constrain

data allowed in

our tables

Page 5: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Syntax?

• Create assertion [some_name] as

check([some_SQL_expression]);

• For over 25 years part of Ansi/Iso

SQL standard

Page 6: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Example 1

• create assertion

at_most_one_president

as check

(1 =>

(select count(*)

from EMP e

where e.JOB = ‘PRESIDENT’

)

);

• Task of DBMS: make sure it’s true at all times

- Closed SQL

expression (no

free variables)

- Evaluates to

true or false

Page 7: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Example 2

• create assertion

cannot_manage_more_than_2

as check

(not exists

(select ‘x’

from (select d.MGR

,count(*) as cnt

from DEPT d

group by d.MGR)

where cnt > 2

)

);

Page 8: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Example 3

• create assertion admin_in_dept_with_vip as check (not exists (select ‘x’ from (select distinct e.DEPTNO from EMP e) e where exists (select ‘y’ from EMP e1 where e1.DEPTNO = e.DEPTNO and e1.JOB in (‘P’,’M’)) and not exists (select ‘z’ from EMP e2 where e2.DEPTNO = e.DEPTNO and e2.JOB = ‘ADMIN’) ) );

Page 9: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

• Can this be the killer feature of

OracleXIII?

Page 10: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Imagine...

• If DBMS vendor would support these

– How much less lines of code one would

need to write in application

development

– How much less bugs this results into

– How easy we could accomodate

change requested by the business

– How data quality would improve

Page 11: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Point to be made

• SQL assertions ‘cover’ all the other

declarative constraints available

– CHECK can be written as assertion

– PK/UK can be written as assertion

– FK can be written as assertion

• We just have shorthands for these,

since these are so common in every

database design

Page 12: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Check writen as assertion

• create assertion hire_only_18_plus

as check

(not exists

(select ‘x’

from EMP e

where e.HIRED - e.BORN <

18*365

)

);

CHECK((HIRED – BORN) >= 18*365)

Page 13: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

PK/UK written as assertion

• create assertion empno_is_unique

as check

(not exists

(select ‘x’

from EMP e1

,EMP e2

where e1.EMPNO = e2.EMPNO

and e1.rowid != e2.rowid

)

);

PRIMARY KEY (EMPNO)

Page 14: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

FK written as assertion

• create assertion work_in_known_dept as check (not exists (select ‘x’ from EMP e where not exists (select ‘y’ from DEPT d where d.deptno = e.deptno ) ) ); FOREIGN KEY (DEPTNO) REFERENCES DEPT(DEPTNO)

Page 15: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Scientific Problem

• Why is CREATE ASSERTION not

supported?

• Think about this:

– The DBMS has our assertion expression

but:

• Would you accept full evaluation of given

expression for every insert/update/delete?

– No.

• Performance would be horrible in real world db’s

• You require a much more sophisticated execution

model in the real world

Page 16: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Scientific Problem

• The challenge is:

– Given: arbitrary complex assertion expression

+

– Given: arbitrary complex SQL statement

– What is the minimum required check that

needs to be executed by the DBMS for the

assertion to remain true?

• Can be ‘nothing’ if assertion is immune for the given

DML

Page 17: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

For instance...

• Department that employs president or manager should also employ administrator – Obviously only when DML operates on EMP

1.Insert into emp values(100,’Smith’,’SALESREP’

,’1/1/80’,’20/11/07’,7000,20);

2.Insert into emp values(:b0,:b1,:b2

,:b3,:b4,:b5,:b6);

3.Insert into emp (select * from emp_loaded);

4.Update emp set msal=1.05*msal;

5.Delete from emp where empno=101;

Page 18: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Models

• Using triggers:

– EM1: always

– EM2: involved tables

– EM3: involved columns

– EM4: table polarities

– EM5: involved literals (TE)

– EM6: TE + minimal check

Page 19: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 1

• Evaluate every assertion on every

DML statement

– Evaluating every <boolean

expression with subqueries>

– On every DML statement

• In after-statement table trigger

– WOULD BE *VERY* INEFFICIENT

• Let’s quickly forget this “EM1”

Page 20: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 2

• Only evaluate the assertions that involve the table that is being “DML-led”

– Finding involved tables by parsing the

assertion expression

– Per assertion 100% code generation of a

(insert/update/delete) “after statement” table

trigger

• Select from dual where [expression]

• Found? OK

• Not found raise_application_error

Page 21: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 2

• Using examples 2 and 3: – Admin_in_dept_with_vip EMP table

– Cannot_manage_more_than_2 DEPT table

Page 22: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 2

create trigger EMP_AIUDS_EMP01

after insert or update or delete on EMP

declare pl_dummy varchar(40);

begin

--

select 'Constraint EMP01 is satisfied' into pl_dummy

from DUAL

where not exists(select ‘a violation’

from (select distinct deptno from EMP) d

where exists(select e2.*

from EMP e2

where e2.DEPTNO = d.DEPTNO

and e2.JOB in ('PRESIDENT','MANAGER'))

and not exists(select e3.*

from EMP e3

where e3.DEPTNO = d.DEPTNO

and e3.JOB = 'ADMIN'));

--

exception when no_data_found then

raise_application_error(-20999,'Constraint EMP01 is violated.');

end;

Assertion

predicate

Page 23: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 2

create trigger DEPT_AIUDS_DEPT01

after insert or update or delete on DEPT

declare pl_dummy varchar(1);

begin

--

select ‘a' into pl_dummy

from DUAL

where not exists (select ‘a violation’ from (select d.MGR ,count(*) as cnt from DEPT d group by d.MGR) where cnt > 2));

--

exception when no_data_found then

raise_application_error(-20999,'Constraint DEPT01 is violated.');

end;

EM2 could be supported declaratively

Page 24: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 2

• Inefficiencies:

– EMP01 and DEPT01 are checked when updating columns that are not involved, for instance:

• Updating EMP.ENAME

• Updating DEPT.LOC

Page 25: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 3

• For inserts + deletes: EM3 == EM2

• For updates: only evaluate

assertions that involve *columns*

being changed

– Simple parse will find columns

– Assumes ‘clean specification’

• Create trigger syntax allows

specification of columns that are

being changed

Page 26: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 3

create trigger EMP_AUS_EMP01

after update of DEPTNO,JOB on EMP

declare pl_dummy varchar(40);

Begin

--

select 'Constraint EMP01 is satisfied' into pl_dummy

from DUAL

where not exists(select ‘department in violation’

from (select distinct deptno from EMP) d

where exists(select e2.*

from EMP e2

where e2.DEPTNO = d.DEPTNO

and e2.JOB in ('PRESIDENT','MANAGER'))

and not exists(select e3.*

from EMP e3

where e3.DEPTNO = d.DEPTNO

and e3.JOB = 'ADMIN'));

--

exception when no_data_found then

raise_application_error(-20999,'Constraint EMP01 is violated.');

end;

Page 27: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 3

create trigger DEPT_AIUDS_DEPT01

after update of MGR on DEPT

declare pl_dummy varchar(40);

begin

--

select ‘a' into pl_dummy

from DUAL

where not exists (select ‘x’ from (select d.MGR ,count(*) as cnt from DEPT d group by d.MGR) where cnt > 2));

--

exception when no_data_found then

raise_application_error(-20999,'Constraint DEPT01 is violated.');

end;

EM3 could be supported declaratively

Page 28: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 3

• Inefficiencies:

– Sometimes inserts (e)or deletes can

never violate a constraint

– For Cannot_manage_more_than_2,

deleting a department does not require

re-validation

– For Admin_in_dept_with_vip, both

inserts and deletes do require re-

validation

Page 29: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 4

• For updates: EM4 = EM3

• For deletes and inserts EM4 drops unnecessary delete (e)or insert table triggers

– Polarity of a table for a given constraint • Positive: inserts can violate

• Negative: deletes can violate

• Neutral: both can violate

• Undefined: table is not involved

Polarity of table for given constraint can be

computed via special parsing

Page 30: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 4

• EM4 maintains:

– All EM3 triggers, except for one:

• Drops:

– Cannot_manage_more_than_2 delete

trigger

EM4 could be supported declaratively

Page 31: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 4

• Inefficiencies: – Admin_in_dept_with_vip: eg. inserting

SALESMAN, or deleting TRAINER does not require re-validation

• Start involving literals mentioned in assertions

• If assertion does not have literals then next EM is same as EM4 – Cannot_manage_more_than_2 has no literals

Page 32: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 5

• How do we see: – Salesman is inserted? Trainer deleted?

• Could parse the DML statement

– But does not always work due to absence of literals

• Make use of column values of affected rows – Requires “Transition Effect” (TE) of a DML statement

• Common concept (aka. “delta”-tables)

• Inserted_rows, Updated_rows, Deleted_rows

– Maintaining TE is straightforward (see book)

• EM5: Only check the assertion when a property holds in the TE

Page 33: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 5

• TE property for

Admin_in_dept_with_vip:

1. Inserted_rows holds a president or a

manager

or,

2. Deleted_rows holds an admininstrator

3. Updated rows shows that ...

Page 34: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 5

create trigger EMP_AIS_EMP01

after insert on EMP

declare pl_dummy varchar(40);

begin

-- If this returns no rows, then EMP01 cannot be violated.

select 'EMP01 must be validated' into pl_dummy

from DUAL

where exists

(select 'A president or manager has just been inserted'

from inserted_rows

where JOB in ('PRESIDENT','MANAGER'));

--

begin

--

<same trigger code as EM4>

--

end;

exception when no_data_found then

-- No need to validate EMP01.

null;

--

end;

Page 35: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 5

create trigger EMP_ADS_EMP01

after delete on EMP

declare pl_dummy varchar(40);

begin

-- If this returns no rows, then EMP01 cannot be violated.

select 'EMP01 must be validated' into pl_dummy

from DUAL

where exists

(select 'An administrator has just been deleted'

from deleted_rows

where JOB = 'ADMIN');

--

begin

--

<same trigger code as EM4>

--

end;

exception when no_data_found then

-- No need to validate EMP01.

null;

--

end;

Page 36: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 5

• Update TE-property for EMP01:

select 'EMP01 is in need of validation'

from DUAL

where exists

(select 'Some department just won a president/

manager or just lost an administrator'

from updated_rows

where (n_job in ('PRESIDENT','MANAGER') and

o_job not in ('PRESIDENT','MANAGER')

or (o_job='ADMIN' and n_job<>'ADMIN')

or (o_deptno<>n_deptno and

(o_job='ADMIN' or n_job in

('PRESIDENT','MANAGER')))

• Can be deduced from insert + delete properties

• EM5 fully declarative too? – Here it gets complex...

Page 37: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 5

• Inefficiencies:

– Admin_in_dept_with_vip:

triggers validate all departments

• Unacceptable in real-world databases

• Only some require re-validation

– Cannot_manage_more_than_2:

triggers validate all department

managers

• Unacceptable in real-world databases

• Only some require re-validation

Page 38: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 6

• On TE-property + optimized

validation query

– Use the TE-query to find:

• Which deptno-values require re-validation

• Which mgr-values require re-validation

– Then use these values in the assertion-

expression

Page 39: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

create trigger EMP_AIS_EMP01

after insert on EMP

declare pl_dummy varchar(40);

begin

--

for r in (select distinct deptno

from inserted_rows

where JOB in ('PRESIDENT','MANAGER'));

loop

begin

-- Note: this now uses r.deptno value from preceeding TE-query.

select 'Constraint EMP01 is satisfied' into pl_dummy

from DUAL

where not exists(select ‘department in violation’

from (select distinct deptno from EMP where deptno = r.deptno) d

where exists(select e2.*

from EMP e2

where e2.DEPTNO = d.DEPTNO

and e2.JOB in ('PRESIDENT','MANAGER'))

and not exists(select e3.*

from EMP e3

where e3.DEPTNO = d.DEPTNO

and e3.JOB = 'ADMIN'));

--

exception when no_data_found then

--

raise_application_error(-20999,

'Constraint EMP01 is violated for department '||to_char(r.deptno)||'.');

--

end;

end loop;

end;

Page 40: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Execution Model 6

• This requires:

– Detecting that the ASSERTION can be (re)written as a universal quantification

– Can sometimes be done in multiple ways

• Which to choose?

– EM6 fully declarative? • Complexity introduced in EM5 further

increases

• Remember: given any arbitrary complex assertion + dml-statement

Page 41: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Still not there yet...

• Then there is something else too...

– Which is often overseen by database

professionals

– And, which is neglected in every

research paper (I’ve read...) that deals

with generating constraint validation

code

Serializability

Page 42: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Concurrent Transactions

• Deptno 13 has two admins and one

manager

– TX1 deletes an admin from 13

• Does not yet commit

– TX2 deletes the other admin 13

• Commits

– TX1 commits

• Constraint is violated for deptno 13!

– TX1 and TX2 must be serialized

Page 43: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Concurrent Transactions

• Note: this is *not* about locking rows of data, but rather: locking a constraint No two TX’s can validate at same time

• We can use DBMS_LOCK to serialize these transactions – See book for example code

• Again complexity further increases

Page 44: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Concurrent Transactions

• Concurrency impact of acquiring rule locks

– EM1: One TX at a time

– EM2: One TX per table at a time

– EM3, EM4, EM5 slowly relaxes • Up to EM5: not acceptable

– EM6: Only if two TX’s actually validate *and* involve same deptno (EMP01 assertion)

• Acceptable

Page 45: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Another complicating

factor

• Deferrabilty...

– Involves temporarily storing violation

cases for re-evaluation at commit time

– More comments on that in chapter 11

of the book

Page 46: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

The Polite Excuse

• Inefficient EM’s

– Could be supported, but: • Are unacceptable wgt. Performance &

transaction concurrency

• Efficient EM’s

– Aligned with business requirements • Performance and TX concurrency

But,

– Need more research to determine if they could be declaratively supported

Page 47: Oaktable World 2014 Toon Koppelaars: database constraints polite excuse

Questions?

• For slidedeck email me at [email protected]

• Twitter @ToonKoppelaars