Advanced Distributed Software Architectures and Technology group ADSaT 1 Transactions and Databases Paul Greenfield CSIRO.

Post on 21-Dec-2015

214 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

Transcript

1Advanced Distributed Software Architectures and Technology group

ADSaT

Transactions and Databases

Paul GreenfieldCSIRO

2Advanced Distributed Software Architectures and Technology group

ADSaT

This Week

• More on transactions – Left overs– http://research.microsoft.com/

~gray/wics_99_TP• Isolation and locking

– How do we achieve isolation?• Recovery

– How do we recover after failure?

3Advanced Distributed Software Architectures and Technology group

ADSaT

Why bother with TP?

• Use two-tier apps with database transactions?– Business logic in client

and stored procedures– Fast!– Scalable?– Maintainable?– Cheaper?– Flexible??

Stored procedures

Database Server

4Advanced Distributed Software Architectures and Technology group

ADSaT

Two-tier Applications

• The most recent ‘legacy’• Stored procedures

– Different and proprietary languages– Integrated debugging?– Re-use in different applications?

• DB connection per client– Even when not active

5Advanced Distributed Software Architectures and Technology group

ADSaT

Three-tier Applications

• Business logic written in common or standard languages (VB, C++, Java)

• Clean separation of business logic– Easier re-use and maintainability?

• Use server resources only for active transactions– Process and connection pooling

6Advanced Distributed Software Architectures and Technology group

ADSaT

TP Implementation

• What are the TP programs?– Small ‘one-shot’ executable

programs?– Application programs fed from queue?– Libraries called from a process?– Libraries called from threads?

• Answer have an effect on performance, integrity and management

7Advanced Distributed Software Architectures and Technology group

ADSaT

One-shot Programs

• Old-style solution (CICS, TIP, …)• Schedule application to run when

transaction request arrives– Start app, process request, terminate– Single function per application

• OS/TP monitor support for– Fast application startup– Application recycling (reduce

overheads)

8Advanced Distributed Software Architectures and Technology group

ADSaT

Queued Applications• TP application ‘always’ running

– Instances balanced against load– Queue of waiting requests– Application supports multiple functions

• Group functions into applications• Clients not bound to server applications

– Tune response times • Faster response time for some transactions• Multiple copies of critical applications

9Advanced Distributed Software Architectures and Technology group

ADSaT

TP Processes

Client bound to server process – Typical CORBA approach – Queue of requests for each server – Need to run/manage multiple servers

• Tune response times? – Can allocate transactions to programs– Fast, critical transactions delayed?

• Need for load balancing– Unequal server load possible

10Advanced Distributed Software Architectures and Technology group

ADSaT

TP Process - OrbixServer

processes

Waiting requests

Server objects

11Advanced Distributed Software Architectures and Technology group

ADSaT

Orbix ExampleConfiguration 1: 20 Servers

0500

100015002000250030003500

100 200 300 400

Number of Clients

Mil

lise

con

ds

buy

create

getholding

query code

queryid

sell

update

12Advanced Distributed Software Architectures and Technology group

ADSaT

TP Threads

• Thread pool inside a server process– No binding from client to thread– Objects live in process address space– Threads have access to all objects– Queue of requests shared by all

threads• No need for load balancing

– No idle/busy processes– No way to push priority of some

transactions – may not matter?

13Advanced Distributed Software Architectures and Technology group

ADSaT

TP Threads - MTSServer threads

Waiting requests

Activeserver objects

Proxyserver objects

14Advanced Distributed Software Architectures and Technology group

ADSaT

MTS ExampleTransaction times - C++ & Keytable

0

500

1000

1500

2000

2500

3000

3500

4000

4500

5000

Buy CreateAccount

Get HoldingStmnt

QueryCode QueryID Sell Update

Res

po

nse

tim

e (m

s)

100 clients

200 clients

400 clients

600 clients

15Advanced Distributed Software Architectures and Technology group

ADSaT

Failure

• Need to isolate faults– Failing application takes down

what??– Entire application process?– Process holding thread pool?– Entire transaction system?

• Need to run applications as separate processes or have careful fault traps

16Advanced Distributed Software Architectures and Technology group

ADSaT

What Goes Where?

• Routing and directories– Where to send a request message?– Where to create a remote object?

• Routing tables– Table of what requests go where

• Directories/name servers– Database and server that knows

who is providing what service

17Advanced Distributed Software Architectures and Technology group

ADSaT

Directory/Name Servers

• Map name onto server locations• Could be part of TP system

– CORBA Name Servers• Could be part of system-wide

directory– Active Directory for COM+

• ‘Hard-wiring’ also works – Administration costs can be high

18Advanced Distributed Software Architectures and Technology group

ADSaT

Name Servers

• Client asks name server where to find a service when creating object

• Servers advertise their services to the name server

• Load balancing by name server distributing requests over multiple server processes and systems

19Advanced Distributed Software Architectures and Technology group

ADSaT

Name Servers

ClientApp

Object

Name Server

Object

A

B

Goods?

Use object Xon server B

Goods server

Goods server

20Advanced Distributed Software Architectures and Technology group

ADSaT

Request Integrity

• What happens to requests on failure– Transactions ensure database integrity– Incoming requests can be saved to

disk– Fetch request operation included as

part of transaction• Undone and request requeued on failure• Need to avoid failure loops!• Easy recovery from transient errors

21Advanced Distributed Software Architectures and Technology group

ADSaT

Response Integrity

• Are responses part of transaction?– Rolled out if transaction fails– Recovered and sent after system

recovery if committed• Is this reasonable? Sent to who??• Just discard?• Need feedback to know delivery

succeeded

• Just what does the operator see/do?– Wait? Retry? Check success?

22Advanced Distributed Software Architectures and Technology group

ADSaT

RPC Extras

• DCE, CORBA, COM, … are language and platform independent– Interfaces specified in IDL– Marshalling translates between

languages and platforms• Character sets, byte order, …

– Translate to and from ‘canonical’ form– Or use ‘receiver makes it right’

• Send in client format• Receiver translates only if necessary

23Advanced Distributed Software Architectures and Technology group

ADSaT

IDL Example

• COM IDL fragment – More detail in a later lecture!!

[object, uuid(6B29FC40-CA47-1067-B31D-00DD010662DA)]interface IHop : IUnknown {

import “unknwn.idl”; // bring in definition of IUnknownHRESULT Walk([in] long How_far);HRESULT Hop([in] long How_far);HRESULT Bound([in] BSTR Over_what);

}

24Advanced Distributed Software Architectures and Technology group

ADSaT

Nested Transactions

• Calling a transaction from anywhere– Directly from a client– From within a transaction

• Start a sub-transaction, linked into the parent transaction

– All transactions committed together• Sub-transaction commit does not really

commit and make changes durable. Changes made visible to other sub-transactions.

25Advanced Distributed Software Architectures and Technology group

ADSaT

Nested Transactions

• Not widely supported• Alternative programming models

– Top-level transactional service code calling on business logic

– MTS and EJB ‘requires transaction’• Run in existing transaction if there is

one• Start new transaction otherwise• More in MTS/COM+ and EJB lectures

26Advanced Distributed Software Architectures and Technology group

ADSaT

Nested TransactionsFunction transfer(src, dest, amt) tx_start withdraw(src, amt) deposit(src, amt) tx_commit

Function withdraw(src, amt) tx_start …….. Tx_commit

Function deposit(dest, amt) tx_start …….. Tx_commit

Nested Transactions

Function transfer(src, dest, amt) tx_start withdraw(src, amt) deposit(src, amt) tx_commit

Function withdraw(src, amt) ……..

Function deposit(dest, amt) ……..

Transactional Services

27Advanced Distributed Software Architectures and Technology group

ADSaT

Isolation and Locking

• How do resource managers achieve the illusion of ‘isolation’– Application programmers can

(largely) pretend no other programs are running concurrently

– Done using ‘locks’ and ‘lock managers’

– Application programmers still need to be aware of possible problems

28Advanced Distributed Software Architectures and Technology group

ADSaT

Serialisable

• Concurrent execution of concurrent transactions has the same effect as running them serially.– One after another with no overlap

• Highest level SQL Isolation Level• Implemented by locking

resources before they are used

29Advanced Distributed Software Architectures and Technology group

ADSaT

Locks

• Lock data before using it– Set read lock before reading– Set write lock before writing – Wait if lock cannot be granted– Locks only granted if no conflicts

• Read locks conflict with write locks• Write locks conflict with both read and

write locks

30Advanced Distributed Software Architectures and Technology group

ADSaT

Locks

• Locks affect performance– All computers wait at the same speed– Can result in single-threading

• Concurrent transactions waiting for access to the same resource

• Strongly influenced by application design

• Locks introduce new problems– deadlocks

31Advanced Distributed Software Architectures and Technology group

ADSaT

Two-phase Rule

• Correct locking avoids problems– Locks have to be held until commit

to achieve isolation• Locks are held for longer• Performance is reduced

– Two phases• Locking resources• Unlocking (only at commit)• Avoids cascading aborts

32Advanced Distributed Software Architectures and Technology group

ADSaT

Lock Managers• Code that manages locking

– Maintains a lock table• Keeps track of all locks in the database• Waiting requests and granted locks

– Lock operations are atomic• Protected by low-level locks (mutex, spin)

y T2(write) T4(read), T1(read)

x T1(read), T2(read) T3(write)

z T1(read)

Locks granted Locks requested

33Advanced Distributed Software Architectures and Technology group

ADSaT

Lock Managers

• Distributed systems can have interesting locking problems– No lock analysis across databases?

• Distributed databases have distributed lock managers– Shared lock state– Communication between LMs

34Advanced Distributed Software Architectures and Technology group

ADSaT

Lock Types

• More than just read and write!– Shared (read) locks– Exclusive (write) locks– Update (read then write)– Intent locks (lock also held at finer

level)– Key locks (lock ranges within keys)

35Advanced Distributed Software Architectures and Technology group

ADSaT

Lock Granularity

• What is locked?– Whole database– Whole table?– Page of data?– Individual record?

• All of the above at times– X lock on record– IX locks on page and table– S locks on database

36Advanced Distributed Software Architectures and Technology group

ADSaT

Tables to Records

Table

Page Page Page

Record Record Record

37Advanced Distributed Software Architectures and Technology group

ADSaT

Lock Granularity

• Level of locking a DB decision– Fine grain locks give less

contention and better performance– Fine grain locks using lots of locks

and are more expensive to manage

• Choose record lock when..– Just locking a few records

• Otherwise get coarser locks

38Advanced Distributed Software Architectures and Technology group

ADSaT

Lock Escalation

• DB can start with record locks and move to page/table locks– Finds that many locks are being

held for the page/table– Escalate lock up a level– Free lock resources

• Guess at proper locking level and adjust as needed (up only?)

39Advanced Distributed Software Architectures and Technology group

ADSaT

SQL Isolation Levels

• Uncommitted read (dirty read)– Read all changes, no locks, no waits– Fastest and sometimes useful

• Statistical scans of data

• Committed read (SQL default)– Only read committed data– Release read locks after use– Repeating an SQL statement can

give different results each time

40Advanced Distributed Software Architectures and Technology group

ADSaT

SQL Isolation Levels

• Repeatable read– Same query always returns same

data • Can get phantoms – new records

– Keep shared locks until Commit• Serializable (TP Isolation)

– Same query returns same data• No phantoms! • Lock data that does not exist

– Need to keep key locks as well

41Advanced Distributed Software Architectures and Technology group

ADSaT

Locking Hints

• DB decides what locks to use– Shared or exclusive lock?– Locks can be converted normally– Programmer can override with ‘hints’

• Programmer knows what will happen next• Avoid deadlocks?

Select * from accounts (updlock) where acc_no = 123

Update accounts set balance= … where acc_no=123

42Advanced Distributed Software Architectures and Technology group

ADSaT

Deadlocks

• Normally applications just wait for locks to be granted

• Sometimes dependencies between locks means they would wait forever

Lock A

Lock B

T1 Lock B

Lock A

T2A

B

T1

T2

T2

T1

Granted

Waiting

43Advanced Distributed Software Architectures and Technology group

ADSaT

Deadlocks

• Db performs locking graph analysis

• Deadlock if loop found! • Solution?

– Pick a process/transaction and return a db error

– Application recovers or dies…– Transaction abort and retry?

44Advanced Distributed Software Architectures and Technology group

ADSaT

Deadlocks

• Deadlock avoidance is an application coding problem – and a hard one– Use ‘canonical locking orders’

• Define a standard locking order• Invoice header before invoice details

– Nice idea in theory– Can still get ‘conversion deadlocks’

45Advanced Distributed Software Architectures and Technology group

ADSaT

Conversion Deadlocks

• Database uses shared locks rather than exclusive locks for reading– Can convert to exclusive later– Deadlocks when DB cannot do convert

Select next from keytable where type=1

Update keytable set next=next+1 where type=1

K1 T1(s) T1(x)

Granted

Waiting

T2(s) T2(x)

46Advanced Distributed Software Architectures and Technology group

ADSaT

Conversion Deadlocks

• A use for locking hints– Tell DB to get exclusive lock earlier

Select next from keytable (updlock) where type=1

Update keytable set next=next+1 where type=1

K1 T1(x) T2(x)

Granted

Waiting

47Advanced Distributed Software Architectures and Technology group

ADSaT

Performance

• Blocking on waits undesirable– Remove hot spots

• ‘next entry’ counters, summary information, end of file counter

• Avoid altogether• Cache high contention records

– Reduce ‘path length’– Obtain locks as late as possible

48Advanced Distributed Software Architectures and Technology group

ADSaT

PerformanceC++ transaction rates

0

50

100

150

200

250

300

350

400

450

500

0 200 400 600 800 1000 1200

Client threads

TP

S

Local keytable

Local Identity

Remote identity 10M

Remote identity 100M

Remote keytable 100M

49Advanced Distributed Software Architectures and Technology group

ADSaT

PerformanceC++ response times

remote db - identity & keytable

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

0 200 400 600 800 1000 1200

Clients

Resp

on

se t

ime (

ms)

Read ident

Update ident

Average ident

Read key

Update key

Average key

50Advanced Distributed Software Architectures and Technology group

ADSaT

Recovery

• Durability and redundancy – Keep critical information on disk– In-memory copies for performance– Ensure disk writes complete before

continuing at critical times– Keep multiple copies of disk data

• Protecting against …– Memory loss when system fails– Disk file loss with disk failure

51Advanced Distributed Software Architectures and Technology group

ADSaT

Database Model

• Really two databases– Database tables on disk +

in-memory changed pages/records• For performance

– Logged changes on disk/tape + database dump• For durability

• The log really the durable database– Can recreate the disk/memory form

52Advanced Distributed Software Architectures and Technology group

ADSaT

Logging• Write to log…

– Before images• Changes, deletions

– After images• Changes, insertions

– Data pages, index blocks, storage allocation

• Need to wait for log flushes– Can be major performance

bottleneck– Batch flushes by adding a short delay

53Advanced Distributed Software Architectures and Technology group

ADSaT

Logging

• Write-log-ahead– Never flush an uncommitted

change to the database.

• Changes can be flushed after they have been committed– Leave in memory until cache

manager needs the space…

54Advanced Distributed Software Architectures and Technology group

ADSaT

Commit

• Changes are written to a log page– Page write initiated when page full

• At commit time– Flush all logged changes to disk– Flush logged commit record to disk

• Changes are now in stable storage– Database is recoverable

55Advanced Distributed Software Architectures and Technology group

ADSaT

Recovery

• Recover from abort– Apply before images if necessary to

pages in cache• Recovery from system failure

– Apply after images to disk pages• Recovery from media failure

– Restore from backup– Apply after images to disk pages

56Advanced Distributed Software Architectures and Technology group

ADSaT

Checkpoints• How can we recover more quickly?

– How far back do we go in the log?• When do we know that there are no more

log records that need to be applied?

– Problem comes from caching and lazy database page writes

• Checkpoints force database pages back out to disk now and then– Stop recovery when checkpoint found– Fuzzy checkpoints to improve CP cost

57Advanced Distributed Software Architectures and Technology group

ADSaT

CheckpointsLast

checkpointAll updates in stable database

Log

Lastcheckpoint

All updates in stable database

Log

2nd lastcheckpoint

Classic checkpoint

Fuzzy checkpoint

58Advanced Distributed Software Architectures and Technology group

ADSaT

Media failure?

• Duplicate the media (disks)– RAID disks– Mirror/shadow disks– Avoid sharing anything

• Multiple disks with multiple controllers• Remote sites for backup?

– Put logs on mirror/RAID at least• Archive logs to tape or …

59Advanced Distributed Software Architectures and Technology group

ADSaT

Performance

• Disk performance is the key– Disks are slow to rotate (latency)– Disk heads are slow to move (seek)

• One heavily used file per disk is best

– Allocate DB files and logs across disks to balance out usage

– Number of disks can be more important than storage capacity

60Advanced Distributed Software Architectures and Technology group

ADSaT

Next week

• Security!– Access control– Authentication– Data privacy– Public key crypto– SSL/TLS

top related