Improving Transaction-Time DBMS Performance and Functionality

David Lomet, Microsoft Research
Feifei Li, Florida State University

Mar 29, 2015
Immortal DB: A Transaction-Time DB

• What is a transaction-time DB?
  – Retains versions of records
    • Current and prior database states
  – Supports temporal access to these versions, using transaction time
• Immortal DB goals
  – Performance close to an unversioned DB
  – Full indexed access to history
  – Explore other functionality based on versions
    • History as backup
    • Bad user transaction removal
    • Auditing
Prior Publications

• SIGMOD’04: demo and demo paper
• ICDE’04: initial running system described
• SIGMOD’06: removing effects of bad user transactions
• ICDE’08: indexing with version compression
• ICDE’09: performance and functionality
Talk Outline

• Immortal DB: a transaction-time database
• Update performance: timestamping
  – Timestamping is the main update overhead
  – Prior approaches
  – Our new approach
  – Update performance results
• Support for auditing
  – What we provide
  – Exploiting the timestamping implementation
• Range read performance: new page-splitting strategy
  – Storage utilization determines range read performance
  – Prior split strategy guaranteeing “as of” version utilization
  – Our new approach
  – Storage utilization results
Timestamping & Update Performance

• Timestamp not known until commit
  – Fixing it too early leads to aborts
• Requires a 2nd “touch” to add the TS to each record
  – 1st touch: the update itself, when the TS is not yet known
  – 2nd touch: adding the TS once it is known
• The TID:TS mapping must be stable until all timestamping completes and is stable
• Timestamping is the biggest single extra cost for updates
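The two-touch problem above can be sketched in a few lines. This is a hypothetical illustration (all names are mine, not Immortal DB code): versions are first stamped with the transaction ID (TID), and only after commit, once the timestamp (TS) is fixed and the TID:TS mapping is stable, are they revisited to swap TID for TS.

```python
# Why versioned updates need a second "touch": the commit timestamp is
# unknown at update time, so each new version is stamped with its TID;
# after commit, every such version must be revisited to replace TID -> TS.

records = {}    # key -> (stamp_kind, stamp, value)
tid_to_ts = {}  # TID:TS mapping; must stay stable until stamping completes

def update(tid, key, value):
    # 1st touch: TS not yet known, stamp the version with the TID
    records[key] = ("TID", tid, value)

def commit(tid, ts):
    tid_to_ts[tid] = ts  # mapping persisted before lazy stamping can finish

def timestamp_lazily():
    # 2nd touch: replace TIDs of committed transactions with their TS
    for key, (kind, stamp, value) in records.items():
        if kind == "TID" and stamp in tid_to_ts:
            records[key] = ("TS", tid_to_ts[stamp], value)

update(7, "a", "v1")
commit(7, 1001)
timestamp_lazily()   # record "a" now carries TS 1001 instead of TID 7
```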
Prior Timestamping Techniques

• Eager timestamping
  – Done as a 2nd update during the transaction
  – Delays commit; roughly doubles update cost
• Lazy timestamping – several variations
  – Replace the transaction ID (TID) with the timestamp (TS) lazily after commit; but this requires …
  – Persisting the (TID:TS) mapping
• The trick is handling this efficiently
  – Most prior efforts updated a Persistent Transaction Timestamp Table (PTT) at commit with the TID:TS mapping
  – We improve on this part of the process
Lazier Timestamping
[Diagram: flow of the TID:TS mapping between the log, the volatile timestamp table, and the PTT]

• TS assigned at commit; TID:TS posted to the log in the commit record
• Main memory holds a volatile timestamp table (VTT): TID:TS entries with a reference count
• Timestamping activity is based mostly on the VTT
• At checkpoint, TID:TS entries are batch-written from the VTT to the PTT
  – Only TID:TS entries with unfinished timestamping are written
• VTT entries are removed when timestamping is complete (ref count = 0) and stable
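A minimal sketch of this lazier scheme (VTT/PTT names follow the slides; the code is illustrative, not the Immortal DB implementation): commit places TID:TS in the volatile table with a count of not-yet-stamped versions; stamping decrements the count; at checkpoint, only entries whose stamping is unfinished are batch-written to the persistent table, and finished entries are garbage-collected.

```python
# Lazier timestamping: batch the VTT -> PTT write at checkpoint instead of
# updating the PTT at every commit, and skip entries already fully stamped.

vtt = {}  # TID -> [TS, ref_count of versions still awaiting their TS]
ptt = {}  # persistent TID:TS table (a dict stands in for the disk table)

def commit(tid, ts, versions_written):
    # TID:TS also goes into the commit log record, making it recoverable
    vtt[tid] = [ts, versions_written]

def stamp_one(tid):
    vtt[tid][1] -= 1  # one more of this transaction's versions got its TS

def checkpoint():
    # Batch insert: only TIDs with unfinished timestamping reach the PTT;
    # fully stamped entries are dropped from both tables (ref count = 0).
    for tid in list(vtt):
        ts, refs = vtt[tid]
        if refs > 0:
            ptt[tid] = ts
        else:
            vtt.pop(tid)
            ptt.pop(tid, None)

commit(1, 100, 2)
commit(2, 101, 1)
stamp_one(2)   # transaction 2 is now fully timestamped
checkpoint()   # only TID 1 is written to the PTT
```

The design point this captures: PTT traffic becomes one batched write per checkpoint rather than one write per commit, and short-lived entries may never touch the PTT at all.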
Execution Time

[Chart: execution time of a simple “ONE UPDATE” transaction – unversioned DB vs. the prior (unbatched) TS method vs. 100%, 50%, and 20% PTT batch inserts]

• IMPORTANT: measured on a simple “ONE UPDATE” transaction
• Expected result: overhead below the 20% batch-insert case
Talk Outline

• Immortal DB: a transaction-time database
• Update performance: timestamping
  – Timestamping is the main update overhead
  – Prior approaches
  – Our new approach
  – Update performance results
• Support for auditing
  – What we provide
  – Exploiting the timestamping implementation
• Range read performance: new page-splitting strategy
  – Storage utilization determines range read performance
  – Prior split strategy guaranteeing “as of” version utilization
  – Our new approach
  – Storage utilization results
Adding Audit Support

• Basic infrastructure only
  – A full audit system is too large an undertaking here
  – For every update: who did it and when
• Technique
  – Extend the PTT schema to include a user ID (UID): each row is TID:TS:UID
  – Always persist this information; no garbage collection
  – The timestamping technique permits batch updates to the PTT
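A hedged sketch of the audit extension (schema follows the slide; the in-memory list and function names are mine): PTT rows carry (TID, TS, UID), and audit mode simply skips deletion, so "who changed what, and when" is always answerable.

```python
# Audit-mode PTT: the same checkpoint batch insert the timestamping scheme
# already performs, but entries are never garbage-collected.

ptt = []  # append-only (TID, TS, UID) rows

def checkpoint_batch(entries, audit_mode=True):
    ptt.extend(entries)          # one batched insert per checkpoint
    if not audit_mode:
        ...  # non-audit mode would purge fully timestamped entries here

def who_when(tid):
    # Audit query: which user committed this transaction, and at what time?
    for t, ts, uid in ptt:
        if t == tid:
            return uid, ts

checkpoint_batch([(1, 100, "alice"), (2, 101, "bob")])
```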
What Does It Cost?

[Chart: execution time of a simple “ONE UPDATE” transaction – unversioned DB vs. the prior (unbatched) TS method vs. 100%, 50%, and 20% PTT batch inserts, with audit mode added]

• Audit mode: always keep everything in the PTT; never delete
• Cost roughly equals the 50% batch-insert case, since those entries are also batch-deleted
• IMPORTANT: measured on a simple “ONE UPDATE” transaction
Talk Outline

• Immortal DB: a transaction-time database
• Update performance: timestamping
  – Timestamping is the main update overhead
  – Prior approaches
  – Our new approach
  – Update performance results
• Support for auditing
  – What we provide
  – Exploiting the timestamping implementation
• Range read performance: new page-splitting strategy
  – Storage utilization determines range read performance
  – Prior split strategy guaranteeing “as of” version utilization
  – Our new approach
  – Storage utilization results
Utilization => Range Read Performance

• The biggest factor is records per page
• Current data is the most frequently read
• We need a technique that improves storage utilization
  – Certainly for current data
  – With no compromise for historical data
• Prior page-splitting technology evolved from the WOB-tree
  – Which was constrained by write-once media
• We can do better with write-many media
Prior Approaches to Guaranteed Utilization

• Choose a target fill factor for the current database
  – Can’t be 100% as in an unversioned DB
  – Higher fill factor => more redundant versions for “partially persistent indexes”
    • Like the TSB-tree, BV-tree, and WOB-tree
    • Because splitting by time creates redundant versions when they cross a time-split boundary
• “Naked” key splits compromise version utilization
  – A key split splits history as well as current data
  – Excessive key splits without time splits drive down storage utilization for any specific version
• What to do? Always time split with a key split
  – Removes historical data from new current pages, permitting them to fill fully to the fill factor
  – Protects historical versions from further splitting
  – Originally in the WOB-tree – a necessity there with write-once storage media
Why time split with key split?

[Figure: the same page over time – historical data plus added versions fill the page; a naked key split splits history along with current data, while a time split performed with the key split separates out a historical page from the current page]

• A time split with each key split guarantees the historical page will have good utilization for its versions
Intuition for the new splitting technique

• Always time split when the page first becomes full
• Key split afterwards, when the page becomes full again

[Figure: the page fills, is time split into a historical page and a current page, fills again, and only then is key split]

• Historical page utilization is preserved
• Current page utilization is improved
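The deferred policy above can be sketched as follows. This is a toy illustration under my own representation (a page is a list of (key, timestamp, is_current) tuples; function and variable names are assumptions, not the paper's code): the first fill triggers a time split, the next fill triggers a key split.

```python
# Deferred splitting: time split on the first fill (historical versions move
# to a historical page, current ones stay); key split on the second fill.

def on_page_full(page, already_time_split):
    if not already_time_split:
        # Time split: separate historical versions from current ones.
        historical = [v for v in page if not v[2]]
        current = [v for v in page if v[2]]
        return ("time_split", historical, current)
    # Key split: divide the (again full) current page around the median key.
    page.sort(key=lambda v: v[0])
    mid = len(page) // 2
    return ("key_split", page[:mid], page[mid:])

# A full page: one superseded (historical) version of "a" plus three current.
page = [("a", 1, False), ("a", 2, True), ("b", 1, True), ("c", 1, True)]
kind, left, right = on_page_full(page, already_time_split=False)
# First fill -> time split: 1 historical version out, 3 current versions stay,
# so the current page has extra room to fill before its eventual key split.
```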
Analytical Result

• We can show the following:

  SVCU_defer^max = SVCU_no-defer^max + [in / (in + up·cr)] · (1 − SVCU_no-defer^max)

  SVCU_defer^avg = SVCU_no-defer^avg + [in / (in + up·cr)] · [ln 2 + (1 − ln 2)·ln 2 − SVCU_no-defer^avg]

  where in is the insertion ratio, up is the update ratio, and cr is the compression ratio.

* Formulas derived based on one extra fill of current pages before the key split: the added current records come from that extra page fill.
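As a numeric sanity check, the two single-version current utilization (SVCU) formulas on this slide can be coded directly. The function names are mine, and the formulas are restated in the comments so the check is self-contained; ln 2 is the classic average B-tree fill factor.

```python
import math

def svcu_max_defer(svcu_max, in_ratio, up_ratio, cr):
    # SVCU_defer^max = SVCU^max + [in/(in + up*cr)] * (1 - SVCU^max):
    # the free space reclaimed by deferral, weighted by the fraction of new
    # record bytes that are insertions (updates are compressed by cr).
    return svcu_max + (in_ratio / (in_ratio + up_ratio * cr)) * (1 - svcu_max)

def svcu_avg_defer(svcu_avg, in_ratio, up_ratio, cr):
    ln2 = math.log(2)
    # Average fill with one extra fill phase is ln2 + (1 - ln2)*ln2: the page
    # averages ln2 full, and its free space (1 - ln2) again averages ln2 full.
    target = ln2 + (1 - ln2) * ln2
    return svcu_avg + (in_ratio / (in_ratio + up_ratio * cr)) * (target - svcu_avg)

# Pure insertions (up = 0): deferral lets the page fill completely at split.
print(svcu_max_defer(0.5, 1.0, 0.0, 1.0))   # -> 1.0
```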
Analysis: Current Storage Utilization vs. Update Ratio

[Chart: current utilization (y-axis, 0 to 0.8) vs. update ratio (x-axis, 0 to 1) for a B-tree baseline and for CR = 0.10 and CR = 1.0, each with and without deferred splitting]

• Expect an update ratio of 65% – 85% in practice
Summary

• Optimizing timestamping yields update performance close to an unversioned DB
• Optimizing page splitting yields current-time range search performance close to an unversioned DB
• Audit functionality is easy to add via the timestamping infrastructure

Questions?