Transactions in HBase. Andreas Neumann, Gokul Gunasekaran. HBaseCon, June 2017. gokul at cask.co, anew at apache.org, @caskoid

HBaseCon2017 Transactions in HBase

Jan 23, 2018

Transcript
Page 1: HBaseCon2017 Transactions in HBase

Transactions in HBase

Andreas Neumann, Gokul Gunasekaran
HBaseCon, June 2017

gokul at cask.co, anew at apache.org

@caskoid

Page 2: HBaseCon2017 Transactions in HBase

Goals of this Talk

- Why transactions?
- Optimistic Concurrency Control
- Three Apache projects: Omid, Tephra, Trafodion
- How are they different?

Page 3: HBaseCon2017 Transactions in HBase

Transactions in noSQL?

History
• SQL: RDBMS, EDW, …
• noSQL: MapReduce, HDFS, HBase, …
• n(ot)o(nly)SQL: Hive, Phoenix, …

Motivation:
• Data consistency under highly concurrent loads
• Partial outputs after failure
• Consistent view of data for long-running jobs
• (Near) real-time processing

Page 4: HBaseCon2017 Transactions in HBase

Stream Processing

[Figure with labels: HBase Table, Queue, Flowlet.]

Page 5: HBaseCon2017 Transactions in HBase

[Figure with labels: HBase Table, Queue, Flowlet, and the callout "Write Conflict!"]

Page 6: HBaseCon2017 Transactions in HBase

Transactions to the Rescue

[Figure with labels: HBase Table, Queue, Flowlet.]

- Atomicity of all writes involved
- Protection from concurrent update

Page 7: HBaseCon2017 Transactions in HBase

ACID Properties

From good old SQL:

• Atomic - Entire transaction is committed as one
• Consistent - No partial state change due to failure
• Isolated - No dirty reads, transaction is only visible after commit
• Durable - Once committed, data is persisted reliably

Page 8: HBaseCon2017 Transactions in HBase

What is HBase?

[Figure: a Client and two Region Servers; each Region Server hosts several Regions and a Coprocessor.]

Page 9: HBaseCon2017 Transactions in HBase

What is HBase?

Simplified:

• Distributed Key-Value Store
• Key = <row>.<family>.<column>.<timestamp>
• Partitioned into Regions (= contiguous range of rows)
  • Each Region Server hosts multiple regions
  • Optional: Coprocessor in Region Server
• Durable writes
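
To make the key model concrete, here is a minimal Java sketch. It is not the HBase client API: it models one table as a sorted map from a cell key to its timestamped versions, and a read returns the newest version at or below a given timestamp. The names `VersionedStore`, `put`, and `get` are hypothetical, and joining the key parts with "." is only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of a versioned key-value layout.
// Not the HBase client API; all names are illustrative.
class VersionedStore {
    // "<row>.<family>.<column>" -> (timestamp -> value), timestamps sorted ascending
    private final Map<String, TreeMap<Long, String>> cells = new TreeMap<>();

    void put(String row, String family, String column, long timestamp, String value) {
        cells.computeIfAbsent(row + "." + family + "." + column, k -> new TreeMap<>())
             .put(timestamp, value);
    }

    // Returns the newest value whose timestamp is <= maxTimestamp, or null if there is none.
    String get(String row, String family, String column, long maxTimestamp) {
        TreeMap<Long, String> versions = cells.get(row + "." + family + "." + column);
        if (versions == null) {
            return null;
        }
        Map.Entry<Long, String> newest = versions.floorEntry(maxTimestamp);
        return newest == null ? null : newest.getValue();
    }
}
```

Keeping every version under its timestamp is what the transaction systems in this talk build on: a reader can be pointed at an older version without blocking a writer.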

Page 10: HBaseCon2017 Transactions in HBase

ACID Properties in HBase

• Atomic
  • At cell, row, and region level
  • Not across regions, tables or multiple calls
• Consistent - No built-in rollback mechanism
• Isolated - Timestamp filters provide some level of isolation
• Durable - Once committed, data is persisted reliably

How to implement full ACID?

Page 11: HBaseCon2017 Transactions in HBase

Implementing Transactions

• Traditional approach (RDBMS): locking
  • May produce deadlocks
  • Causes idle wait
  • Complex and expensive in a distributed environment
• Optimistic Concurrency Control
  • Lockless: allow concurrent writes to go forward
  • On commit, detect conflicts with other transactions
  • On conflict, roll back all changes and retry
• Snapshot Isolation
  • Similar to repeatable read
  • Take snapshot of all data at transaction start
  • Read isolation
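
The optimistic pattern (proceed without locks, check for a conflict at commit, retry on failure) can be sketched for a single in-memory counter. This is only an analogy under stated assumptions: it uses Java's `AtomicLong.compareAndSet` as the commit-time check and is not how any of the HBase transaction systems is implemented. The class name `OptimisticCounter` is hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical single-value illustration of optimistic concurrency control.
class OptimisticCounter {
    private final AtomicLong x;
    private final AtomicLong rollbacks = new AtomicLong();

    OptimisticCounter(long initial) {
        x = new AtomicLong(initial);
    }

    // Increments x without holding a lock while computing the new value.
    long increment() {
        while (true) {
            long seen = x.get();                 // read the current value
            long updated = seen + 1;             // compute the change locally
            if (x.compareAndSet(seen, updated)) {
                return updated;                  // commit: nobody changed x in between
            }
            rollbacks.incrementAndGet();         // conflict: discard the result and retry
        }
    }

    long value() {
        return x.get();
    }

    long rollbacks() {
        return rollbacks.get();
    }
}
```

With low contention almost every attempt commits on the first try; under heavy contention the retries are the price paid for avoiding locks.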

Page 12: HBaseCon2017 Transactions in HBase

Optimistic Concurrency Control

[Figure: timeline with x=10. client1: start, then fail/rollback. client2: start, read x, commit; client2 must see the old value of x.]

Page 13: HBaseCon2017 Transactions in HBase

Optimistic Concurrency Control

[Figure: timeline starting with x=10. client1: start, incr x, commit, giving x=11. client2: start, incr x (sees the old value of x=10), commit, which ends in a rollback.]

Page 14: HBaseCon2017 Transactions in HBase

Conflicting Transactions

[Figure, built up over pages 14 to 20: timeline showing transaction tx:A together with tx:B, tx:C (A fails), tx:D (A fails), tx:E (E fails), tx:F (F fails), and tx:G.]

Page 21: HBaseCon2017 Transactions in HBase

Conflicting Transactions

• Two transactions have a conflict if
  • they write to the same cell
  • they overlap in time
• If two transactions conflict, the one that commits later rolls back
• Active change set = set of transactions t such that:
  • t is committed, and
  • there is at least one in-flight tx t’ that started before t’s commit time
• This change set is needed in order to perform conflict detection.
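
The conflict rule and the active change set can be sketched in a few lines of Java. This is a hypothetical illustration and is not taken from Omid, Tephra, or Trafodion; the names `ConflictDetector`, `tryCommit`, and `prune` are assumptions, cells are identified by plain strings, and start and commit times are assumed to be unique, increasing numbers.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of commit-time conflict detection.
class ConflictDetector {
    // Change set of a committed transaction that may still overlap an in-flight one.
    private static final class Committed {
        final long commitTime;
        final Set<String> cells;

        Committed(long commitTime, Set<String> cells) {
            this.commitTime = commitTime;
            this.cells = cells;
        }
    }

    private final List<Committed> activeChangeSets = new ArrayList<>();

    // True if some transaction committed after startTime and wrote one of the same cells.
    boolean hasConflict(long startTime, Set<String> writeSet) {
        for (Committed c : activeChangeSets) {
            if (c.commitTime > startTime) {          // the two transactions overlap in time
                for (String cell : writeSet) {
                    if (c.cells.contains(cell)) {    // and they wrote the same cell
                        return true;
                    }
                }
            }
        }
        return false;
    }

    // Returns false if the caller must roll back; otherwise records the change set.
    boolean tryCommit(long startTime, long commitTime, Set<String> writeSet) {
        if (hasConflict(startTime, writeSet)) {
            return false;
        }
        activeChangeSets.add(new Committed(commitTime, new HashSet<>(writeSet)));
        return true;
    }

    // Drops change sets committed at or before the start of the oldest in-flight transaction.
    void prune(long oldestInFlightStart) {
        activeChangeSets.removeIf(c -> c.commitTime <= oldestInFlightStart);
    }

    int size() {
        return activeChangeSets.size();
    }
}
```

Because the later committer is the one checked against already committed change sets, it is the one that rolls back, as stated above. `prune` is what keeps the active change set from growing without bound.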

Page 22: HBaseCon2017 Transactions in HBase

HBase Transactions in Apache

Apache Omid (incubating)

Apache Trafodion (incubating)

Apache Tephra (incubating)

Page 23: HBaseCon2017 Transactions in HBase

In Common

• Optimistic Concurrency Control must:
  • maintain transaction state:
    • which transactions are in flight, and which are committed?
    • what is the change set of each tx? (for conflict detection, rollback)
    • which transactions are invalid? (failed to roll back due to crash etc.)
  • generate unique transaction IDs
  • coordinate the life cycle of a transaction
    • start, detect conflicts, commit, rollback
• All of { Omid, Tephra, Trafodion } implement this
  • but vary in how they do it

Page 24: HBaseCon2017 Transactions in HBase

Apache Tephra

• Based on the original Omid paper:
  Daniel Gómez Ferro, Flavio Junqueira, Ivan Kelly, Benjamin Reed, Maysam Yabandeh: Omid: Lock-free transactional support for distributed data stores. ICDE 2014.
• Transaction Manager:
  • Issues unique, monotonic transaction IDs
  • Maintains the set of excluded (in-flight and invalid) transactions
  • Maintains change sets for active transactions
  • Performs conflict detection
• Client:
  • Uses transaction ID as timestamp for writes
  • Filters excluded transactions for isolation
  • Performs rollback
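
The client-side rule (write with the transaction ID as timestamp, skip excluded transactions on read) can be sketched as follows. This is a hypothetical model of a single cell and does not use Tephra's actual classes; `SnapshotReader`, `write`, and `read` are illustrative names.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch of snapshot reads with an exclude list.
class SnapshotReader {
    // Versions of one cell: write timestamp (the writer's transaction ID) -> value.
    private final TreeMap<Long, Integer> versions = new TreeMap<>();

    void write(long txId, int value) {
        versions.put(txId, value);
    }

    // Returns the newest value written by a transaction that is not newer than the reader
    // and is not in the reader's exclude list; null if no version is visible.
    Integer read(long readerTxId, Set<Long> excludes) {
        for (Map.Entry<Long, Integer> e
                : versions.headMap(readerTxId, true).descendingMap().entrySet()) {
            if (!excludes.contains(e.getKey())) {
                return e.getValue();
            }
        }
        return null;
    }
}
```

With x written as 10 by transaction 37 and as 11 by transaction 42, a reader with ID 48 and excludes containing 42 gets 10, while a later reader with ID 52 and 42 no longer excluded gets 11.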

Page 25: HBaseCon2017 Transactions in HBase

Transaction Lifecycle

[Figure, built up over pages 25 to 34: state diagram. "start new tx" leads to "in progress". From "in progress", "write to HBase" and "detect conflicts" follow; "ok" leads to "make visible" and "complete", while "conflicts" leads to "aborting". From "aborting", "roll back in HBase" ends with "ok" or, on "failure", in "invalid". A "timeout" also leads to "invalid".]

• Transaction consists of:
  • transaction ID (unique timestamp)
  • exclude list (in-flight and invalid tx)
• Transactions that do complete
  • must still participate in conflict detection
  • disappear from transaction state when they do not overlap with in-flight tx
• Transactions that do not complete
  • time out (by transaction manager)
  • added to invalid list
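
The lifecycle can be written down as a small state machine. The sketch below is an assumption-laden reading of the diagram, not Tephra code: the enum and method names are hypothetical, and `ROLLED_BACK` is a name invented here for the end state after a successful rollback, which the slide does not name.

```java
// Hypothetical state machine for the transaction lifecycle.
enum TxState { IN_PROGRESS, ABORTING, COMPLETE, ROLLED_BACK, INVALID }

enum TxEvent { NO_CONFLICTS, CONFLICTS, ROLLBACK_OK, ROLLBACK_FAILED, TIMEOUT }

class TxLifecycle {
    // Returns the state that follows `state` when `event` happens.
    static TxState next(TxState state, TxEvent event) {
        if (state == TxState.IN_PROGRESS) {
            switch (event) {
                case NO_CONFLICTS: return TxState.COMPLETE;   // commit succeeds, writes become visible
                case CONFLICTS:    return TxState.ABORTING;   // client must roll back its writes
                case TIMEOUT:      return TxState.INVALID;    // never completed, goes on the invalid list
                default: break;
            }
        } else if (state == TxState.ABORTING) {
            switch (event) {
                case ROLLBACK_OK:     return TxState.ROLLED_BACK; // assumed name, see lead-in
                case ROLLBACK_FAILED: return TxState.INVALID;     // writes remain, must stay excluded
                default: break;
            }
        }
        throw new IllegalStateException("no transition from " + state + " on " + event);
    }
}
```

The point of the `INVALID` state is that leftover writes of such a transaction can never be made visible, so readers must keep excluding them until they are cleaned up.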

Page 35: HBaseCon2017 Transactions in HBase

Apache Tephra

[Figure, built up over pages 35 to 38: Client A, TxManager, and two HBase Region Servers.]

1. HBase holds x:10, written with timestamp 37; the TxManager's in-flight list is "…".
2. Client A calls start(); the TxManager returns id: 42, excludes = {…} and adds 42 to the in-flight list.
3. Client A writes x=11; the Region Server stores x:11 with timestamp 42 next to x:10 (37).
4. Client A writes y=17; the other Region Server stores y:17 with timestamp 42.

Page 39: HBaseCon2017 Transactions in HBase

Apache Tephra

[Figure, built up over pages 39 to 42: Client B, TxManager, and two HBase Region Servers holding x:10 (37), x:11 (42), and y:17 (42); in-flight: …,42.]

1. Client B calls start(); the TxManager returns id: 48, excludes = {…,42} and adds 48 to the in-flight list.
2. Client B reads x and gets x:10, because the version written by transaction 42 is excluded.

Page 43: HBaseCon2017 Transactions in HBase

Apache Tephra

[Figure, built up over pages 43 to 47: Client A, TxManager, and two HBase Region Servers holding x:10 (37), x:11 (42), and y:17 (42); in-flight: …,42.]

1. Client A calls commit(); the TxManager reports a conflict.
2. Client A rolls back its writes in HBase, leaving x:10 (37).
3. The TxManager removes 42 from the in-flight list (in-flight: …); the slide labels this step "make visible".

Page 48: HBaseCon2017 Transactions in HBase

Apache Tephra

[Figure, built up over pages 48 to 51: Client A, Client C, TxManager, and two HBase Region Servers holding x:10 (37), x:11 (42), and y:17 (42); in-flight: …,42.]

1. Client A calls commit(); the TxManager reports success and removes 42 from the in-flight list (in-flight: …).
2. Client C calls start(); the TxManager returns id: 52, excludes: {…}, and the in-flight list becomes …,52.
3. Client C reads x and gets x:11.

Page 52: HBaseCon2017 Transactions in HBase

Apache Tephra

[Figure: Client, TxManager, and HBase with two Region Servers, each hosting Regions and a Coprocessor. Labels: Tx id generation, Tx lifecycle, Tx state, rollback, lifecycle transitions, data operations.]

Page 53: HBaseCon2017 Transactions in HBase

Apache Tephra

• HBase coprocessors
  • For efficient visibility filtering (on region-server side)
  • For eliminating invalid cells on flush and compaction
• Programming Abstraction
  • TransactionAwareHTable:
    • Implements HTable interface
    • Existing code is easy to port
  • TransactionContext:
    • Implements transaction lifecycle

Page 54: HBaseCon2017 Transactions in HBase

Apache Tephra - Example

    txTable = new TransactionAwareHTable(table);
    txContext = new TransactionContext(txClient, txTable);
    txContext.start();
    try {
      // perform HBase operations on txTable
      txTable.put(…);
      ...
    } catch (Exception e) {
      // throws TransactionFailureException(e)
      txContext.abort(e);
    }
    // throws TransactionConflictException if there is a conflict
    txContext.finish();

Page 55: HBaseCon2017 Transactions in HBase

Apache Tephra - Strengths

• Compatible with existing, non-tx data in HBase
• Programming model
  • Same API as HTable, keep existing client code
• Conflict detection granularity
  • Row, Column, Off
• Special “long-running tx” for MapReduce and similar jobs
• HA and Fault Tolerance
  • Checkpoints and WAL for transaction state, Standby Tx Manager
• Replication compatible
  • Checkpoint to HBase, use HBase replication
• Secure, Multi-tenant

Page 56: HBaseCon2017 Transactions in HBase

Apache Tephra - Not-So Strengths

• Exclude list can grow large over time
  • RPC, post-filtering overhead
  • Solution: Invalid tx pruning on compaction - complex!
• Single Transaction Manager
  • performs all lifecycle state transitions, including conflict detection
  • conflict detection requires lock on the transaction state
  • becomes a bottleneck
  • Solution: distributed Transaction Manager with consensus protocol

Page 57: HBaseCon2017 Transactions in HBase

Apache Trafodion (incubating)

• A complete distributed database (RDBMS)
  • transaction system is not available by itself
  • APIs: jdbc, SQL
• Inspired by original HBase TRX (transactional region server)
  • migrated transaction logic into coprocessors
  • coprocessors cache in-flight data in-memory
  • transaction state (change sets) in coprocessors
  • conflict detection with 2-phase commit
• Transaction Manager
  • orchestrates transaction lifecycle across involved region servers
  • multiple instances, but one per client
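
The combination of buffered in-flight data and a two-phase commit across regions can be sketched as below. This is a single-threaded toy model and not Trafodion's code: the classes `Region` and `TwoPhaseCoordinator` are hypothetical, a transaction's start is approximated by its first write in each region, and a real implementation would also have to guard the window between the conflict check and the commit.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical region that buffers in-flight writes and checks conflicts locally.
class Region {
    private final Map<String, Integer> committed = new HashMap<>();
    private final Map<String, Long> lastCommitSeq = new HashMap<>();
    private final Map<Long, Map<String, Integer>> buffers = new HashMap<>();
    private final Map<Long, Long> startSeq = new HashMap<>();
    private long commitSeq = 0;

    void write(long tx, String cell, int value) {
        startSeq.putIfAbsent(tx, commitSeq);   // assumption: tx "starts" here on first write
        buffers.computeIfAbsent(tx, k -> new HashMap<>()).put(cell, value);
    }

    // Committed read: in-flight data of any transaction is never returned.
    Integer read(String cell) {
        return committed.get(cell);
    }

    // Phase 1: is there a commit to one of tx's cells since tx started in this region?
    boolean prepare(long tx) {
        Map<String, Integer> buffer = buffers.get(tx);
        if (buffer == null) {
            return true;
        }
        long start = startSeq.get(tx);
        for (String cell : buffer.keySet()) {
            if (lastCommitSeq.getOrDefault(cell, 0L) > start) {
                return false;
            }
        }
        return true;
    }

    // Phase 2 (success): flush the buffered writes into the committed data.
    void commit(long tx) {
        Map<String, Integer> buffer = buffers.remove(tx);
        startSeq.remove(tx);
        if (buffer == null) {
            return;
        }
        commitSeq++;
        for (Map.Entry<String, Integer> e : buffer.entrySet()) {
            committed.put(e.getKey(), e.getValue());
            lastCommitSeq.put(e.getKey(), commitSeq);
        }
    }

    // Phase 2 (failure): drop the buffered writes.
    void rollback(long tx) {
        buffers.remove(tx);
        startSeq.remove(tx);
    }
}

class TwoPhaseCoordinator {
    // Commits tx in all regions, or rolls it back in all of them if any region sees a conflict.
    static boolean commit(long tx, List<Region> regions) {
        for (Region r : regions) {
            if (!r.prepare(tx)) {
                for (Region each : regions) {
                    each.rollback(tx);
                }
                return false;
            }
        }
        for (Region r : regions) {
            r.commit(tx);
        }
        return true;
    }
}
```

Because uncommitted writes never reach the committed map, a rollback is just a matter of discarding the buffer, which is also why very large transactions put memory pressure on this design.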

Page 58: HBaseCon2017 Transactions in HBase

Apache Trafodion


Page 59: HBaseCon2017 Transactions in HBase

Apache Trafodion

[Figure, built up over pages 59 to 62: Client A, TxManager, and two HBase Region Servers.]

1. HBase holds x:10; the TxManager's in-flight list is "…".
2. Client A calls start(); the TxManager returns id:42 and adds 42 to the in-flight list.
3. Client A writes x=11; the Region Server holds x:11 as in-flight data, and the region is recorded for transaction 42 (region:…).
4. Client A writes y=17; the Region Server holds y:17 as in-flight data.

Page 63: HBaseCon2017 Transactions in HBase

Apache Trafodion

[Figure, built up over pages 63 to 66: Client B, TxManager (in-flight: …,42), and HBase Region Servers with committed x:10 and in-flight x:11, y:17.]

1. Client B calls start(); the TxManager returns id: 48 and adds 48 to the in-flight list.
2. Client B reads x and gets x:10.

Page 67: HBaseCon2017 Transactions in HBase

Apache Trafodion

[Figure, built up over pages 67 to 73: Client A, TxManager (in-flight: …,42), and HBase Region Servers with committed x:10 and in-flight x:11, y:17.]

1. Client A calls commit().
2. Step "1. conflicts?": the Region Servers are checked for conflicts.
3. Step "2. roll back": a conflict was found, so the in-flight data is rolled back.
4. The TxManager removes 42 from the in-flight list (in-flight: …).

Page 74: HBaseCon2017 Transactions in HBase

Apache Trafodion

[Figure, built up over pages 74 to 80: Client A, TxManager (in-flight: …,42), and HBase Region Servers with committed x:10 and in-flight x:11, y:17.]

1. Client A calls commit().
2. Step "1. conflicts?": the Region Servers are checked for conflicts.
3. Step "2. commit!": no conflict was found, so x:11 and y:17 are committed.
4. The TxManager removes 42 from the in-flight list (in-flight: …).

Page 81: HBaseCon2017 Transactions in HBase

Apache Trafodion

[Figure: Client, TxManager, and HBase with two Region Servers, each hosting Regions and a Coprocessor; a second client ("Client 2") with its own manager ("Tx Manager 2"). Labels: Tx id generation, conflicts, Tx state, Tx life cycle (commit), transitions, region ids, 2-phase commit, data operations, Tx lifecycle, In-flight data.]

Page 82: HBaseCon2017 Transactions in HBase

Apache Trafodion

• Scales well:
  • Conflict detection is distributed: no single bottleneck
  • Commit coordination by multiple transaction managers
  • Optimization: bypass 2-phase commit if single region
• Coprocessors cache in-flight data in memory
  • Flushed to HBase only on commit
  • Committed read (not snapshot, not repeatable read)
  • Option: cause conflicts for reads, too
• HA and Fault Tolerance
  • WAL for all state
  • All services are redundant and take over for each other
• Replication: Only in paid (non-Apache) add-on

Page 83: HBaseCon2017 Transactions in HBase

Apache Trafodion - Strengths

• Very good scalability
  • Scales almost linearly
  • Especially for very small transactions
• Familiar SQL/jdbc interface for RDB programmers
• Redundant and fault-tolerant
• Secure and multi-tenant:
  • Trafodion/SQL layer provides authn+authz

Page 84: HBaseCon2017 Transactions in HBase

Apache Trafodion - Not-So Strengths

• Monolithic, not available as standalone transaction system
• Heavy load on coprocessors
  • memory and compute
• Large transactions (e.g., MapReduce) will cause out-of-memory errors
  • no special support for long-running transactions

Page 85: HBaseCon2017 Transactions in HBase

Apache Omid

• Evolution of Omid based on the Google Percolator paper:
  Daniel Peng, Frank Dabek: Large-scale Incremental Processing Using Distributed Transactions and Notifications, USENIX 2010.
• Idea: Move as much transaction state as possible into HBase
  • Shadow cells represent the state of a transaction
  • One shadow cell for every data cell written
  • Track committed transactions in an HBase table
• Transaction Manager (TSO) has only 3 tasks
  • issue transaction IDs
  • conflict detection
  • write to commit table
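
How a reader might use shadow cells and the commit table can be sketched for a single cell. This is a hypothetical model, not Omid's API: `ShadowCellStore` and its methods are illustrative names, and the visibility rule used (a version is visible if its commit timestamp is lower than the reader's start timestamp) is the usual snapshot-isolation rule, assumed here rather than quoted from the slides.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of reads that consult a shadow cell first, then the commit table.
class ShadowCellStore {
    private static final class Version {
        final int value;
        Long shadowCommitTs;   // shadow cell content; null while the writer is in flight

        Version(int value) {
            this.value = value;
        }
    }

    private final TreeMap<Long, Version> versions = new TreeMap<>();  // start ts -> version
    private final Map<Long, Long> commitTable = new HashMap<>();      // start ts -> commit ts

    void write(long startTs, int value) {
        versions.put(startTs, new Version(value));
    }

    // Recorded when a transaction commits successfully.
    void recordCommit(long startTs, long commitTs) {
        commitTable.put(startTs, commitTs);
    }

    // Copies the commit timestamp into the shadow cell so later reads skip the commit table.
    void markCommitted(long startTs) {
        Version v = versions.get(startTs);
        Long commitTs = commitTable.get(startTs);
        if (v != null && commitTs != null) {
            v.shadowCommitTs = commitTs;
        }
    }

    // Newest version whose writer committed before the reader started; null if none.
    Integer read(long readerStartTs) {
        for (Map.Entry<Long, Version> e
                : versions.headMap(readerStartTs, false).descendingMap().entrySet()) {
            Version v = e.getValue();
            Long commitTs = v.shadowCommitTs != null
                    ? v.shadowCommitTs
                    : commitTable.get(e.getKey());   // second lookup: the commit table
            if (commitTs != null && commitTs < readerStartTs) {
                return v.value;
            }
        }
        return null;
    }
}
```

The fallback to the commit table is the extra read that occurs whenever a shadow cell has not been updated yet.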

Page 86: HBaseCon2017 Transactions in HBase

Apache Omid


Page 87: HBaseCon2017 Transactions in HBase

Apache Omid

[Figure, built up over pages 87 to 90: Client A, TxManager, two HBase Region Servers, and the Commits table.]

1. HBase holds x:10, written by transaction 37, with shadow cell "37: commit.40"; the Commits table holds 37: 40.
2. Client A calls start(); the TxManager returns id: 42.
3. Client A writes x=11; the Region Server stores x:11 with shadow cell "42: in-flight".
4. Client A writes y=17; the other Region Server stores y:17 with shadow cell "42: in-flight".

Page 91: HBaseCon2017 Transactions in HBase

Apache Omid

[Figure, built up over pages 91 to 94: Client B, TxManager, the Commits table (37: 40), and two HBase Region Servers holding x:10 (37: commit.40), x:11 (42: in-flight), and y:17 (42: in-flight).]

1. Client B calls start(); the TxManager returns id: 48.
2. Client B reads x and gets x:10, because x:11 is still marked "42: in-flight".

Page 95: HBaseCon2017 Transactions in HBase

Apache Omid

[Figure, built up over pages 95 to 98: Client A, TxManager, the Commits table (37: 40), and two HBase Region Servers holding x:10 (37: commit.40), x:11 (42: in-flight), and y:17 (42: in-flight).]

1. Client A calls commit(); the TxManager reports a conflict.
2. Client A rolls back its writes, leaving x:10 with shadow cell "37: commit.40".

Page 99: HBaseCon2017 Transactions in HBase

Apache Omid

[Figure, built up over pages 99 to 103: Client A, Client C, TxManager, the Commits table (37: 40), and two HBase Region Servers holding x:10 (37: commit.40), x:11 (42: in-flight), and y:17 (42: in-flight).]

1. Client A calls commit(); the TxManager reports success:50, and the Commits table gets the entry 42: 50.
2. The cells are marked as committed: the shadow cells of x:11 and y:17 become "42: commit.50".
3. Client C calls start(); the TxManager returns id: 52.
4. Client C reads x and gets x:11.

Page 104: HBaseCon2017 Transactions in HBase

Apache Omid - Future

• Atomic commit with linking?
  • Eliminate need for commit table

[Figure: two HBase Region Servers holding x:10 (37: commit.40), x:11 (42: in-flight), and y:17, plus the Commits table with 37: 40.]

Page 105: HBaseCon2017 Transactions in HBase

Apache Omid

[Figure: Client, TxManager, and HBase with two Region Servers, each hosting Regions and a Coprocessor. Labels: Tx id generation, Conflict detection, start, commit, data operations + shadow cells, Tx state, Tx lifecycle, rollback, commit, commit table.]

Page 106: HBaseCon2017 Transactions in HBase

Apache Omid - Strengths

• Transaction state is in the database
  • Shadow cells plus commit table
  • Scales with the size of the cluster
• Transaction Manager is lightweight
  • Generation of tx IDs delegated to timestamp oracle
  • Conflict detection
  • Writing to commit table
• Fault Tolerance:
  • After failure, fail all existing transactions attempting to commit
  • Self-correcting: Read clients can delete invalid cells

Page 107: HBaseCon2017 Transactions in HBase

Apache Omid - Not So Strengths

• Storage intensive - shadow cells double the space
• I/O intensive - every cell requires two writes
  1. write data and shadow cell
  2. record commit in shadow cell
• Reads may also require two reads from HBase (commit table)
  • Producer/Consumer: will often find the (uncommitted) shadow cell
• Scans: high-throughput sequential read disrupted by frequent lookups
• Security/Multi-tenancy:
  • All clients need access to commit table
  • Read clients need write access to repair invalid data
• Replication: Not implemented

Page 108: HBaseCon2017 Transactions in HBase

Summary

| | Apache Tephra | Apache Trafodion | Apache Omid |
|---|---|---|---|
| Tx State | Tx Manager | Distributed to region servers | Tx Manager (changes), HBase (shadows/commits) |
| Conflict detection | Tx Manager | Distributed to regions, 2-phase commit | Tx Manager |
| ID generation | Tx Manager | Distributed to multiple Tx Managers | Tx Manager |
| API | HTable | SQL | Custom |
| Multi-tenant | Yes | Yes | No |
| Strength | Scans, Large Tx, API | Scalable, full SQL | Scale, throughput |
| So so | Scale, Throughput | API not HBase, Large Tx | Scans, Producer/Consumer |

Page 109: HBaseCon2017 Transactions in HBase

Links

Join the community:

Apache Omid (incubating)
http://omid.apache.org/

Apache Trafodion (incubating)
http://trafodion.apache.org/

Apache Tephra (incubating)
http://tephra.apache.org/

Page 110: HBaseCon2017 Transactions in HBase

Thank you… for listening to my talk.

Credits:
- Sean Broeder, Narendra Goyal (Trafodion)
- Francisco Perez-Sorrosal (Omid)

Questions?