Splice Machine Open Source RDBMS September 26, 2016 Daniel Gómez Ferro John Leach
Open Source Stack: Spark, Hadoop and Derby
Apache Derby
▪ ANSI SQL-99 RDBMS
▪ Java-based
▪ ODBC/JDBC Compliant
Apache HBase/Hadoop
▪ Auto-sharding
▪ High availability
▪ Scalability to 100s of PBs
Apache Spark
▪ Analytical engine
▪ Fast, in-memory technology
▪ Memory resilient to node failure
Splice Machine: Query Execution
1. Parse SQL
• Generate Abstract Syntax Tree (AST)
• Bind AST to Transactional Dictionary
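The parse-and-bind step can be illustrated with a toy example. This is only a sketch under stated assumptions: the `SelectNode` class, the `DICTIONARY` contents, and the single-table `SELECT` grammar are all hypothetical, and the plain Python dict stands in for the transactional dictionary without modelling transactions.

```python
import re
from dataclasses import dataclass, field


@dataclass
class SelectNode:
    """Tiny AST for a single-table SELECT (hypothetical shape)."""
    columns: list
    table: str
    bound_types: dict = field(default_factory=dict)


# Hypothetical data dictionary: table name -> {column name: SQL type}.
DICTIONARY = {"orders": {"o_orderkey": "BIGINT", "o_totalprice": "DECIMAL"}}


def parse(sql):
    """Turn 'SELECT a, b FROM t' into an unbound AST node."""
    match = re.fullmatch(
        r"\s*SELECT\s+(.+?)\s+FROM\s+(\w+)\s*;?\s*", sql, re.IGNORECASE
    )
    if match is None:
        raise ValueError("unsupported statement")
    columns = [c.strip().lower() for c in match.group(1).split(",")]
    return SelectNode(columns=columns, table=match.group(2).lower())


def bind(node, dictionary):
    """Resolve every name in the AST against the dictionary."""
    if node.table not in dictionary:
        raise LookupError(f"unknown table: {node.table}")
    schema = dictionary[node.table]
    for column in node.columns:
        if column not in schema:
            raise LookupError(f"unknown column: {node.table}.{column}")
        node.bound_types[column] = schema[column]
    return node
```

Binding is where a misspelled table or column is rejected, before any plan is built.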
2. Optimize query plan
• Determine join order and storage structure (e.g., base table, index) using table statistics (e.g., cardinality estimates)
• Push predicates
• Unroll nested subqueries
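How cardinality estimates can drive join order is sketched below. This is a simple greedy heuristic (smallest estimated input first), not the cost-based optimizer the slide describes; the row counts are taken from the TPCH 100 tables later in this deck, while the selectivity value is a made-up assumption.

```python
def estimated_rows(stats, selectivity):
    """Scale each table's row count by the selectivity of its pushed predicates."""
    return {table: rows * selectivity.get(table, 1.0) for table, rows in stats.items()}


def greedy_join_order(stats, selectivity):
    """Order tables from smallest to largest estimated input (ties broken by name)."""
    estimates = estimated_rows(stats, selectivity)
    return sorted(estimates, key=lambda table: (estimates[table], table))


# Row counts from the TPCH 100 load table in this deck.
TABLE_STATS = {"LINEITEM": 600037902, "ORDERS": 150000000, "CUSTOMERS": 15000000}

# Hypothetical: a pushed-down predicate keeps about 1% of ORDERS.
PREDICATE_SELECTIVITY = {"ORDERS": 0.01}

join_order = greedy_join_order(TABLE_STATS, PREDICATE_SELECTIVITY)
```

With the predicate pushed down, ORDERS becomes the smallest input and moves to the front, which is why predicate pushdown and join ordering are decided together.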
3. Generate optimal byte code
OLTP Execution on HBase
4a. Execute OLTP query from byte code
5a. Use block cache and bloom filters to optimize data access
6a. Return results
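Step 5a relies on bloom filters to avoid reading files that cannot contain a row. A minimal sketch of the idea follows; the `BloomFilter` class and `files_to_read` helper are illustrative names, not HBase APIs, and the sizes are arbitrary.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for position in self._positions(key):
            self.bits |= 1 << position

    def might_contain(self, key):
        return all((self.bits >> position) & 1 for position in self._positions(key))


def files_to_read(files, row_key):
    """Keep only the files whose bloom filter cannot rule out row_key."""
    return [name for name, bloom in files if bloom.might_contain(row_key)]
```

A point lookup then touches only the files that pass the filter; a false positive costs one unnecessary read but never a wrong result.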
OLAP Execution on Spark
4b. Generate Spark execution plan
5b. Submit Spark plan with byte code
6b. Fair scheduling of distributed tasks
7b. Generate RDD from HFiles and Memstore
8b. Execute query and return results
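The two execution paths share steps 1 to 3 and diverge afterwards. The slides do not say how the engine is chosen, so the sketch below assumes a simple cutoff on the optimizer's estimated row count; `OLAP_ROW_THRESHOLD` and both function names are hypothetical.

```python
# Hypothetical cutoff: plans expected to touch more rows than this go to Spark.
OLAP_ROW_THRESHOLD = 20_000

OLTP_STEPS = [
    "execute OLTP query from byte code",
    "use block cache and bloom filters",
    "return results",
]
OLAP_STEPS = [
    "generate Spark execution plan",
    "submit Spark plan with byte code",
    "fair scheduling of distributed tasks",
    "generate RDD from HFiles and Memstore",
    "execute query and return results",
]


def choose_engine(estimated_rows, threshold=OLAP_ROW_THRESHOLD):
    """Route small scans to HBase (OLTP) and large scans to Spark (OLAP)."""
    return "spark" if estimated_rows > threshold else "hbase"


def execution_steps(estimated_rows):
    """Return the chosen engine and the steps that follow byte code generation."""
    engine = choose_engine(estimated_rows)
    return engine, (OLAP_STEPS if engine == "spark" else OLTP_STEPS)
```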
Architectural Differences: Don’t we already have SQL on HBase?
Feature | Phoenix | Trafodion | Splice Machine
Transactional System | Tephra, Centralized SI | Two Phase Commit | Hierarchical Distributed SI
Analytical Engine | HBase Coprocessors, JDBC Client | HBase Coprocessors, Executor Services Processes | Spark on YARN
Import Process | Python or MapReduce | MapReduce via Hive | JDBC Command, Spark job
Scanning Data | Coprocessor Internal Scans, HBase Scans | Coprocessor Internal Scans, HBase Scans | File Oriented Hybrid Scanner
Compaction | HBase Compaction | HBase Compaction | Spark Compaction
Resource Management | HBase Call Queues | Workload Management System | Spark Job Scheduling (FAIR)
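"SI" in the Transactional System row stands for snapshot isolation. The core rules can be sketched in a few lines, assuming globally unique timestamps; the `Version` class and function names are illustrative and do not reflect any of the compared systems' internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    value: str
    begin_ts: int              # start timestamp of the writing transaction
    commit_ts: Optional[int]   # None while the writer is still uncommitted


def visible(version, reader_begin_ts):
    """A reader sees only versions committed before its own snapshot was taken."""
    return version.commit_ts is not None and version.commit_ts <= reader_begin_ts


def read(versions, reader_begin_ts):
    """Return the newest visible value for a row, or None if nothing is visible."""
    candidates = [v for v in versions if visible(v, reader_begin_ts)]
    if not candidates:
        return None
    return max(candidates, key=lambda v: v.commit_ts).value


def has_write_conflict(versions, writer_begin_ts):
    """Write-write conflict: someone else wrote the row after our snapshot."""
    return any(
        v.commit_ts is None or v.commit_ts > writer_begin_ts for v in versions
    )
```

Readers never block writers under these rules; only two transactions writing the same row concurrently conflict.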
TPCH 100 Load Times (h:mm:ss)
Table | Row Count | Phoenix | Trafodion | Splice Machine
LINEITEM | 600037902 | 5:19:27 | 1:25:46 | 0:22:34
ORDERS | 150000000 | 0:51:28 | 0:15:29 | 0:09:58
PARTSUPP | 80000000 | 0:18:41 | 0:08:52 | 0:06:28
PART | 20000000 | 0:07:26 | 0:02:27 | 0:02:14
CUSTOMERS | 15000000 | 0:05:37 | 0:02:03 | 0:01:42
SUPPLIER | 1000000 | 0:01:48 | 0:00:26 | 0:00:18
NATION | 25 | 0:00:41 | 0:00:07 | 0:00:01
REGION | 5 | 0:00:43 | 0:00:05 | 0:00:01
TPCH 100 Load Throughput
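The throughput chart is not reproduced in this text, but throughput can be derived from the load times table above as rows divided by elapsed seconds. The sketch below does that for two tables; the three times per table are kept in the same left-to-right column order as the table, and the helper names are illustrative.

```python
# (row count, load times in the table's left-to-right column order)
LOAD_TIMES = {
    "LINEITEM": (600037902, ("5:19:27", "1:25:46", "0:22:34")),
    "ORDERS": (150000000, ("0:51:28", "0:15:29", "0:09:58")),
}


def to_seconds(hms):
    """Convert 'h:mm:ss' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def rows_per_second(table):
    """Load throughput for each time column of the given table."""
    rows, times = LOAD_TIMES[table]
    return [rows / to_seconds(hms) for hms in times]


for name in LOAD_TIMES:
    print(name, [round(rate) for rate in rows_per_second(name)])
```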
Write Pipeline
▪ Features
  ▪ Batched writes per region server
  ▪ Congestion control, retries
  ▪ Asynchronous writes
  ▪ Constraint checking (PK, FK…)
  ▪ Index updates
▪ One-for-all pipeline
  ▪ OLTP queries
  ▪ Batch data ingestion (Imports, Hadoop OutputFormat, OLAP query inserts...)
  ▪ Streaming data ingestion (Kafka, Spark streaming…)
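Two of the listed features, batching per region server and retries, can be sketched as follows. This is an illustration only: `locate` and `send` are hypothetical callbacks supplied by the caller, the calls are synchronous, and constraint checking, index updates and congestion control are left out.

```python
from collections import defaultdict


def batch_by_server(writes, locate, batch_size):
    """Group (row_key, value) writes by owning server, then cut into batches."""
    per_server = defaultdict(list)
    for row_key, value in writes:
        per_server[locate(row_key)].append((row_key, value))
    batches = []
    for server, rows in per_server.items():
        for start in range(0, len(rows), batch_size):
            batches.append((server, rows[start:start + batch_size]))
    return batches


def send_with_retries(batch, send, max_attempts=3):
    """Send one batch, retrying on ConnectionError; return attempts used."""
    server, rows = batch
    for attempt in range(1, max_attempts + 1):
        try:
            send(server, rows)
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise
```

Because every ingestion path produces the same kind of batch, OLTP statements, imports and streaming jobs can share one pipeline.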
Spark Compactions
[Figure: Spark UI]
▪ Out of process compactions
  ▪ Minor and Major
  ▪ Decrease Regionserver load
  ▪ Increase stability
  ▪ Remote compactions
  ▪ Prioritized by Spark’s fair scheduler
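How a fair scheduler can keep compactions from starving queries, and the reverse, is sketched below. This is a simplified weighted-fair pick and not Spark's actual scheduling algorithm; the pool names, weights and data layout are assumptions made for the example.

```python
def next_pool(pools):
    """Pick the pool with pending work and the lowest running/weight ratio."""
    runnable = {name: p for name, p in pools.items() if p["pending"] > 0}
    if not runnable:
        return None
    return min(
        runnable,
        key=lambda name: (runnable[name]["running"] / runnable[name]["weight"], name),
    )


def schedule(pools, free_slots):
    """Assign free task slots one at a time; mutates the pool counters."""
    order = []
    for _ in range(free_slots):
        name = next_pool(pools)
        if name is None:
            break
        pools[name]["pending"] -= 1
        pools[name]["running"] += 1
        order.append(name)
    return order


# Hypothetical pools: queries weigh twice as much as compactions.
example_pools = {
    "queries": {"running": 0, "weight": 2, "pending": 4},
    "compactions": {"running": 0, "weight": 1, "pending": 4},
}
assignment = schedule(example_pools, free_slots=6)
```

With these weights, queries receive about two slots for every compaction slot, yet compactions still make progress.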
TPCH 100 Query Times (seconds)
Query | Phoenix | Trafodion | Splice Machine
1 | 395 | TRAFODION-2237 | 99
2 | PHOENIX-3322 | 516 | 44
3 | PHOENIX-3322 | TRAFODION-2237 | 126
4 | PHOENIX-3322 | TBD | 133
5 | PHOENIX-3322 | TBD | 192
6 | 74 | 3178 | 38
7 | PHOENIX-3322 | 4442 | 220
8 | PHOENIX-3322 | TRAFODION-2239 | 620
9 | PHOENIX-3322 | 941 | 273
10 | PHOENIX-3322 | TRAFODION-2241 | 101
11 | PHOENIX-3317 | 463 | 56
12 | 379 | TBD | 85
13 | PHOENIX-3318 | TBD | 71
14 | PHOENIX-3322 | TBD | 50
15 | PHOENIX-3319 | TBD | 102
16 | PHOENIX-3322 | TBD | 33
17 | PHOENIX-3322 | TBD | 929
18 | PHOENIX-3322 | TBD | SPLICE-34
19 | PHOENIX-3322 | TBD | 57
20 | PHOENIX-3320 | TBD | SPLICE-410
21 | PHOENIX-3321 | TBD | 479
22 | PHOENIX-3322 | TBD | 219
Splice Machine: Advanced Spark Integration
Innovative, High-Performance RDD Creation
▪ Fast access to HFiles in HDFS
▪ Merged with deltas from Memstore
▪ Avoids slower HBase API
▪ Reduces load in HBase
Universal Execution Plan and Byte Code
▪ Optimizer, plan and code shared across Spark or HBase execution
[Figure: each physical node runs an HBase Region Server (Region 1 … Region N, each with a Memstore) and a Spark Worker (RDD 1 … RDD N); the HFiles are stored in HDFS]
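The RDD creation described here reads HFiles directly and merges them with in-memory deltas from the Memstore. The merge can be sketched as a sorted k-way merge in which the newest version of each row wins. The cell layout `(row_key, timestamp, value)`, the `TOMBSTONE` marker and the `hybrid_scan` name are assumptions for illustration, not the actual scanner.

```python
import heapq

TOMBSTONE = object()  # hypothetical marker for a deleted row


def hybrid_scan(hfiles, memstore):
    """Merge file cells with Memstore deltas; yield the newest value per row.

    Each source is a list of (row_key, timestamp, value) cells.
    """
    def sort_key(cell):
        # Order by row key, then newest timestamp first.
        return (cell[0], -cell[1])

    sources = [sorted(cells, key=sort_key) for cells in hfiles + [memstore]]
    last_key = object()  # sentinel that equals no real row key
    for row_key, _timestamp, value in heapq.merge(*sources, key=sort_key):
        if row_key == last_key:
            continue  # an older version of a row already emitted or deleted
        last_key = row_key
        if value is not TOMBSTONE:
            yield row_key, value


example_hfiles = [
    [("a", 1, "a1"), ("b", 1, "b1")],
    [("b", 2, "b2"), ("c", 2, "c2")],
]
example_memstore = [("a", 3, "a3"), ("c", 3, TOMBSTONE)]
rows = list(hybrid_scan(example_hfiles, example_memstore))
```

In the example, row `a` takes its value from the Memstore, row `b` from the newer file, and row `c` is dropped because its newest cell is a delete.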
Resources
▪ Do you trust us? Nah...
  ▪ Give it a shot yourself and let us know what you find...
  ▪ https://github.com/splicemachine/benchmarks
▪ Want to get involved?
  ▪ http://community.splicemachine.com/
▪ Want to code? Yeah, me too...
  ▪ https://github.com/splicemachine/spliceengine