GUSTAVO ALONSO SYSTEMS GROUP DEPT. OF COMPUTER SCIENCE ETH ZURICH Crazy little thing called hardware HTDC 2014

Jul 17, 2018

Page 1

GUSTAVO ALONSO SYSTEMS GROUP

DEPT. OF COMPUTER SCIENCE ETH ZURICH

Crazy little thing called hardware

HTDC 2014

Page 2

Systems Group = www.systems.ethz.ch

Enterprise Computing Center = www.ecc.ethz.ch

Page 3

Hardware rules

• Multicore, Many core

• Transactional Memory

• SIMD, AVX, vectorization

• SSDs, persistent memory

• Infiniband, RDMA

• GPUs, FPGAs (hardware acceleration)

• Intelligent storage engines, main memory

• Database appliances

Reacting to changes we do not control

Page 4

Do we care?

Software: producer of performance instead of consumer of performance

Mark D. Hill (U. of Wisconsin)

Page 5

What does it mean?

Many good ideas from decades ago are no longer good ideas (as conceived):

• General purpose solutions

• Threading, thread based parallelism

• Locking, shared state

• Centralized synchronization

• Concurrency control

• STM (sorry …)

Page 6

The take away message

Ignoring hardware trends is irresponsible

• Software evolves more slowly than hardware

Focus on the end (solve a problem), not on the means (the technique used)

How to succeed in research:

• Make hardware your friend

• Influence hardware evolution

Page 7

Outline

1. Hardware as a problem for system design

a) A jungle of improvements

b) An example (database joins)

2. The case for custom systems

a) Economies of scale

b) An example (Crescando)

3. Ideas and directions

Page 8

1. Hardware as a problem for system design

a) A jungle of improvements

Page 9

Why is this happening?

Single thread performance

The Future of Computing Performance: Game Over or Next Level? Fuller and Millett (Eds) http://www.nap.edu/catalog.php?record_id=12980 Presentation idea from Bob Colwell

Page 10

Why is this happening?

Clock frequency

Page 11

Why is this happening?

Power dissipation

Page 12

Why is this happening?

MULTICORE

Page 13

Multicore is great: avoid distribution

• Analysis of MapReduce workloads:

– Microsoft: median job size < 14 GB

– Yahoo: median job size < 12.5 GB

– Facebook: 90% of jobs less than 100 GB

• Fit in main memory

• One server more efficient than a cluster

• Adding memory to a big server better than using a cluster

Nobody ever got fired for using Hadoop on a Cluster. A. Rowstron, D. Narayanan, A. Donnelly, G. O’Shea, A. Douglas. HotCDP 2012, Bern, Switzerland

Page 14

… but multicore is dead …

MULTICORE

DARK SILICON

Page 15

Dark Silicon = heterogeneity

Symmetric

Asymmetric

Dynamic (power)

Specialized/dynamic

Probabilistic

Page 16

Heterogeneity is a mess

Example: deployment on multicores

Experiment setup

• 8GB datastore size

• SLA latency requirement 8s

• 4 different machines

Machine          Min Cores   Partition Size [GB]   RT [s]
Intel Nehalem    2           4                     6.54
AMD Barcelona    5           1.6                   3.55
AMD Shanghai     3           2.6                   4.33
AMD MagnyCours   2           2                     7.37

Jana Giceva, Tudor-Ioan Salomie, Adrian Schüpbach, Gustavo Alonso, Timothy Roscoe: COD: Database / Operating System Co-Design. CIDR 2013

Page 17

Beyond multicore: manycore

MULTICORE

DARK SILICON

MANY CORE

Page 18

Manycore (example)

“normal” CPU

PCIe

Page 19

Extreme manycore: Intelligent storage

Louis Woods, Zsolt István, Gustavo Alonso: Hybrid FPGA-accelerated SQL query processing. FPL 2013

Page 20

Don’t need to trust the software stack

Offload all sensitive operations to secure co-processor

Testing industry-standard benchmarks with modified SQL.exe, 0-60% perf overhead

[Diagram: CPU, Memory, Secure Co-processor; labels: Keys, Plaintext]

SLIDE COURTESY OF KEN EGURO, MICROSOFT RESEARCH

Page 21

The take away message

Ignore hardware trends at your own peril

• You will be solving the wrong problem

• Your solution will be irrelevant very quickly

• Your solution will be making the wrong assumptions

• Hardware might solve the problem that you are trying to solve (may have solved it already)

Page 22

1. Hardware as a problem for system design

b) An example (database joins)

Page 23

The joy of joins

Cagri Balkesen, Jens Teubner, Gustavo Alonso, M. Tamer Özsu: Main-Memory Hash Joins on Multi-Core CPUs: Tuning to the Underlying Hardware. ICDE 2013

Cagri Balkesen, Gustavo Alonso, Jens Teubner, M. Tamer Özsu: Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited. PVLDB 7(1): 85-96 (2013)

Page 24

Performance in the multicore era 24

Joins are a complex and demanding operation

Lots of work on implementing all kinds of joins

In 2009

– Kim, Sedlar et al. paper (PVLDB 2009)

Radix join on multicore

Sort Merge join on multicore

Claim fastest implementation to date

Key message: when SIMD wide enough, sort merge will be faster than radix join

Hardware conscious

Page 25

In 2011

– Blanas, Li et al. (SIGMOD 2011)

No partitioning join

(vs. radix join version of Kim paper)

Claim: Hardware is good enough, no need for careful tailoring to the underlying hardware

Hardware oblivious

Page 26

In 2012

– Albutiu, Kemper et al. (PVLDB 2012)

Sort merge joins

(vs join version of Blanas)

Claim: Sort merge already better and without using SIMD

Hardware confusion

Page 27

The basic hash join

Page 28

Canonical Hash Join

✔ Complexity: O(|R|+|S|), i.e., O(N)

✔ Easy to parallelize

1. Build phase: for each tuple of R, compute hash(key) and store the tuple in the matching bucket (bucket 0 … bucket n-1) of the hash table

2. Probe phase: for each tuple of S, compute hash(key) and look for a match in the hash table
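The build and probe phases can be sketched in a few lines. This is only an illustration, not the tuned implementation from the cited papers: the name `hash_join` and the (key, payload) tuple layout are assumptions made for the example, and a Python dict stands in for the array of buckets.

```python
from collections import defaultdict


def hash_join(r, s):
    """Join two lists of (key, payload) tuples on key."""
    # 1. Build phase: hash every tuple of R into its bucket.
    table = defaultdict(list)
    for key, payload in r:
        table[key].append(payload)

    # 2. Probe phase: look up every tuple of S and emit the matches.
    out = []
    for key, payload in s:
        for r_payload in table.get(key, ()):
            out.append((key, r_payload, payload))
    return out


R = [(1, "r1"), (2, "r2"), (2, "r2b")]
S = [(2, "s2"), (3, "s3"), (1, "s1")]
print(hash_join(R, S))
```

Each input is read exactly once, which is where the O(|R|+|S|) bound comes from. Every probe is a random access into the table, and that is the cost the hardware-conscious variants try to reduce.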

Page 29

Need for Speed

Hardware-Conscious Hash Joins

Page 30

No more cache misses during the join

Partitioned Hash Join (Shatdal et al. 1994)

Idea: Partition input into disjoint chunks of cache size

① Partition R and S with h1(key) into partitions 1 … p

② Build a hash table over each partition of R with h2(k)

③ Probe it with the matching partition of S

Problem: p can be too large!

• p > #TLB-entries: TLB misses

• p > #Cache-entries: Cache thrashing

"Cache conscious algorithms for relational query processing", Shatdal et al, VLDB ‘94
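A minimal sketch of the partition, build, probe sequence, assuming integer keys and (key, payload) tuples. The names `partition` and `partitioned_hash_join` are made up for this example, `hash(key) % p` plays the role of h1, and a dict plays the role of h2. Python cannot show the cache effect itself; the point is only that each build table covers one partition rather than all of R.

```python
def partition(rel, p):
    """Split a list of (key, payload) tuples into p disjoint partitions."""
    parts = [[] for _ in range(p)]
    for key, payload in rel:
        parts[hash(key) % p].append((key, payload))  # h1(key)
    return parts


def partitioned_hash_join(r, s, p=4):
    out = []
    # (1) Partition both inputs with the same function, so matching keys
    #     always end up in partitions with the same index.
    for r_part, s_part in zip(partition(r, p), partition(s, p)):
        # (2) Build a small table over one partition of R only.
        table = {}
        for key, payload in r_part:
            table.setdefault(key, []).append(payload)
        # (3) Probe with the matching partition of S only.
        for key, payload in s_part:
            for r_payload in table.get(key, ()):
                out.append((key, r_payload, payload))
    return out


R = [(k, f"r{k}") for k in range(8)]
S = [(k, f"s{k}") for k in range(4, 12)]
print(sorted(partitioned_hash_join(R, S)))
```

With a large p each per-partition table gets smaller, but the partitioning step itself writes to p output locations at once, which is the TLB and cache problem named on the slide.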

Page 31

Multi-Pass Radix Partitioning

Problem: Hardware limits fan-out, i.e. T = #TLB-entries (typically 64-512)

Solution: Do the partitioning in multiple passes!

• 1st pass: h1(key) uses the 1st log_2 T bits of hash(key) and splits the input relation into partitions 1 … T

• 2nd pass: h2(key) uses the 2nd log_2 T bits of hash(key) and splits each of those partitions into partitions 1 … T

• ... ith pass ...: partitions 1 … T^i, with i = log_T p

TLB & Cache efficiency compensates multiple read/write passes

«Database Architecture Optimized for the new Bottleneck: Memory Access», Manegold et al, VLDB ‘99
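The multi-pass idea can be sketched as follows, under the simplifying assumption that keys are non-negative integers used directly as their own hash value. `radix_pass`, `multi_pass_partition` and `passes_needed` are hypothetical names for this example; `passes_needed` computes i = log_T p (rounded up) with integer arithmetic.

```python
def radix_pass(rel, bits, shift):
    """One pass: split on `bits` bits of the key, starting at bit `shift`."""
    fanout = 1 << bits
    parts = [[] for _ in range(fanout)]
    for key, payload in rel:
        parts[(key >> shift) & (fanout - 1)].append((key, payload))
    return parts


def multi_pass_partition(rel, bits_per_pass, passes):
    """Partition in several passes, each with fan-out 2**bits_per_pass."""
    parts = [rel]
    for i in range(passes):
        shift = i * bits_per_pass  # pass i looks at the next group of bits
        parts = [sub for part in parts
                 for sub in radix_pass(part, bits_per_pass, shift)]
    return parts


def passes_needed(p, fanout):
    """Smallest i such that fanout**i >= p."""
    passes, reach = 0, 1
    while reach < p:
        reach *= fanout
        passes += 1
    return passes


rel = [(k, None) for k in range(64)]
parts = multi_pass_partition(rel, bits_per_pass=2, passes=2)
print(len(parts), passes_needed(4096, 64))
```

No single pass writes to more than 2**bits_per_pass outputs, so the fan-out can be kept below the TLB limit while the total number of partitions still reaches T^i.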

Page 32

Parallel Radix Join

Parallelizing the Partitioning: Pass - 1

• 1st Scan: Thread-1 … Thread-N each build a local histogram over their part of the relation

• Global Histogram & Prefix Sum

• 2nd Scan: Each thread scatters out its tuples based on the prefix sum into the partitioned relation

Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs, VLDB ‘09
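The histogram and prefix-sum scheme can be illustrated as below. To keep the sketch deterministic the "threads" are simulated by a plain loop over chunks, which is an assumption of this example and not how a real implementation runs. `parallel_partition` and its parameters are hypothetical names, and keys are assumed to be non-negative integers.

```python
def parallel_partition(rel, num_threads, bits):
    fanout = 1 << bits

    def part_of(key):
        return key & (fanout - 1)

    # Each "thread" owns one contiguous chunk of the input relation.
    size = (len(rel) + num_threads - 1) // num_threads
    chunks = [rel[t * size:(t + 1) * size] for t in range(num_threads)]

    # 1st scan: every thread counts its tuples per partition.
    hist = [[0] * fanout for _ in range(num_threads)]
    for t, chunk in enumerate(chunks):
        for key, _ in chunk:
            hist[t][part_of(key)] += 1

    # Global histogram & prefix sum: a private start offset for every
    # (thread, partition) pair in the output array.
    offset = [[0] * fanout for _ in range(num_threads)]
    starts, pos = [], 0
    for p in range(fanout):
        starts.append(pos)
        for t in range(num_threads):
            offset[t][p] = pos
            pos += hist[t][p]

    # 2nd scan: every thread scatters its tuples. The target ranges are
    # disjoint, so the threads would not need to synchronize here.
    out = [None] * len(rel)
    for t, chunk in enumerate(chunks):
        cursor = offset[t][:]
        for tup in chunk:
            p = part_of(tup[0])
            out[cursor[p]] = tup
            cursor[p] += 1
    return out, starts


rel = [(k, k * 10) for k in (5, 2, 7, 0, 3, 6, 1, 4)]
out, starts = parallel_partition(rel, num_threads=3, bits=1)
print([k for k, _ in out], starts)
```

The price for the synchronization-free scatter is that the input is read twice, once to count and once to move the tuples.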

Page 33

Parallel Radix Join

Parallelizing the Partitioning: Pass - (2 .. i)

Each thread individually partitions sub-relations from pass-1 (P1 P2 P3 P4 into P11 P12 P13 P14, P21 P22 P23 P24, P31 P32 P33, P41 P42 P43, . . .)

• 1st Scan: Histogram, Prefix Sum

• 2nd Scan: scatter into the partitioned sub-relation

Page 34

Trust the force

Hardware-Oblivious Hash Joins

Page 35

Parallel Hash Join („no partitioning join“ of Blanas et al.)

1. Build Phase: Thread-1 … Thread-N read R and insert into a shared hashtable (bucket 0 … bucket n-1), with one latch per bucket (L0 … L(n-1))

   1. Acquire latch

   2. Store in bucket

2. Probe Phase: Thread-1 … Thread-N read S, then compare & match against the shared hashtable
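A small sketch of the shared-table approach, using Python threads and one `threading.Lock` per bucket as the latch. The class and function names are invented for this example, and the code only illustrates the control flow of the two phases; it says nothing about the performance of the original implementation.

```python
import threading


class SharedHashTable:
    def __init__(self, n_buckets):
        self.buckets = [[] for _ in range(n_buckets)]
        self.latches = [threading.Lock() for _ in range(n_buckets)]

    def insert(self, key, payload):
        b = hash(key) % len(self.buckets)
        with self.latches[b]:                       # 1. Acquire latch
            self.buckets[b].append((key, payload))  # 2. Store in bucket

    def probe(self, key):
        # The table is read-only after the build phase: no latch needed.
        b = hash(key) % len(self.buckets)
        return [p for k, p in self.buckets[b] if k == key]


def run_in_threads(worker, rel, n_threads):
    threads = [threading.Thread(target=worker, args=(rel[t::n_threads],))
               for t in range(n_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()  # acts as the barrier between the two phases


def no_partitioning_join(r, s, n_threads=4, n_buckets=64):
    table = SharedHashTable(n_buckets)
    results, results_lock = [], threading.Lock()

    def build(chunk):
        for key, payload in chunk:
            table.insert(key, payload)

    def probe(chunk):
        local = []
        for key, payload in chunk:
            local.extend((key, rp, payload) for rp in table.probe(key))
        with results_lock:
            results.extend(local)

    run_in_threads(build, r, n_threads)  # 1. Build phase over R
    run_in_threads(probe, s, n_threads)  # 2. Probe phase over S
    return results


R = [(k, f"r{k}") for k in range(100)]
S = [(k, f"s{k}") for k in range(50, 150)]
print(len(no_partitioning_join(R, S)))
```

Nothing in this code depends on cache or TLB sizes, which is what makes the approach hardware oblivious.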

Page 36

And the winner is …

Page 37

Effect of Workloads – Case 1

Hardware-oblivious vs. hardware-conscious with our optimized code

≈ -30% may seem in favor of hardware-oblivious, but …

Workload A: 16M ⋈ 256M, 16-byte tuples, i.e., 256MiB ⋈ 4096MiB

16.5 cy/tpl 12.9 cy/tpl

1. Effective on-chip threading

2. Efficient sync. primitives (ldstub)

3. Larger page size

Page 38

Effect of Workloads – Case 2

Picture radically changes: Hardware-conscious is better by 3.5X on Intel and 2.5X on others

With larger build table, overhead of not being hardware-conscious is clearly visible

Workload B: Equal-sized tables, 977MiB ⋈ 977MiB, 8-byte tuples

50 cy/tpl

14 cy/tpl

≈3.5X


Page 39

Input size is an important parameter!

Optimizations on radix: NUMA-aware data placement, partitioning with software-managed buffers

Sort or Hash?

375M/sec 6.41 cy/tpl

299M/sec 8.03 cy/tpl

619M/sec 3.87 cy/tpl

305M/sec 7.99 cy/tpl

Workload: 12.8GB ⋈ 12.8GB

Workload: 1GB ⋈ 1GB

Machine: Intel Sandy Bridge E4640, 2.4GHz, 32-cores, 64-threads


Page 40

Game on for academia

• The next several hundred papers …

1. Pick an algorithm

2. Pick an architecture

3. Optimize

4. Publish

5. Go to 1

• Algorithm can be used in parallel for increased throughput

Page 41

2. The case for custom systems

a) Economies of scale

Page 42

Game changers

Hardware evolution:

One size does not fit all

Cloud computing

IT industry as service industry

Big X (X=data volume, loads, system costs, requirements, scale)

Demand for customized solutions

Page 43

ORACLE EXADATA

• Intelligent storage manager

• Massive caching

• RAC based architecture

• Fast network interconnect

Page 44

NETEZZA (IBM) TWINFIN

• No storage manager

• Distributed disks (per node)

• FPGA processing

• No indexing

Page 45

Has been tried before …

The idea that database workloads are relevant enough to justify customization is not new

Decades ago it was difficult to fight the progress of general purpose systems.

Not any more …

Gamma Database Machine (DeWitt)

Page 46

2. The case for custom systems

b) An example (Crescando)

Page 47

Amadeus Workload

Passenger-Booking Database

• ~ 600 GB of raw data (two years of bookings)

• single table, denormalized

• ~ 50 attributes: flight-no, name, date, ..., many flags

Query Workload

• up to 4000 queries / second

• latency guarantees: 2 seconds

• today: only pre-canned queries allowed

Update Workload

• avg. 600 updates per second (1 update per GB per sec)

• peak of 12000 updates per second

• data freshness guarantee: 2 seconds

Problems with State-of-the-Art

• Simple queries work only because of materialized views: multi-month project to implement new query / process

• Complex queries do not work at all

Page 48

Crescando: the Amadeus use case

Remove load interaction

Remove unpredictability

Simplify design for scalability and modeling

Treat a multicore machine as a collection of individual nodes (not as a parallel machine)

Run only on main memory

One thread per core

Highly tune the code at each core

Page 49

Scan on a core

[Diagram: a WRITE CURSOR and a READ CURSOR sweep over the DATA IN CIRCULAR BUFFER (WIDE TABLE); incoming QUERIES and UPDATES are collected to BUILD QUERY INDEX FOR NEXT SCAN]
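One pass of such a scan can be sketched as follows. This is a strong simplification and not the Crescando code: the two cursors are collapsed into a single loop in which each record first receives the pending updates and is then matched against the query batch. Only equality predicates are handled, and the name `scan_once` and the record layout (a list of dicts) are assumptions of this example.

```python
from collections import defaultdict


def scan_once(table, queries, updates):
    """One full pass over `table`, a list of dict records.

    queries: {query_id: (attribute, value)}, equality predicates only
    updates: list of (attribute, value, new_fields), in arrival order
    """
    # Build the query index for this scan: predicate -> interested queries.
    index = defaultdict(list)
    for qid, (attr, value) in queries.items():
        index[(attr, value)].append(qid)
    attrs = {attr for attr, _ in index}

    results = defaultdict(list)
    for record in table:
        # Write cursor: apply the pending updates to the record first ...
        for attr, value, new_fields in updates:
            if record.get(attr) == value:
                record.update(new_fields)
        # ... read cursor: then match it against the whole query batch.
        for attr in attrs:
            for qid in index.get((attr, record.get(attr)), ()):
                results[qid].append(dict(record))
    return dict(results)


table = [
    {"flight": "LX1", "name": "A", "seat": "1A"},
    {"flight": "LX2", "name": "B", "seat": "2B"},
    {"flight": "LX1", "name": "C", "seat": "3C"},
]
queries = {"q1": ("flight", "LX1"), "q2": ("name", "B"), "q3": ("flight", "LX9")}
updates = [("name", "C", {"seat": "9F"})]
res = scan_once(table, queries, updates)
print(sorted(res))
```

The work per record depends on the number of distinct attributes in the index rather than on the number of queries, so one pass can serve a large batch.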

Page 50

Crescando on 1 Machine (N Cores)

[Diagram: Input Queue (Operations) → Split → Scan Thread, Scan Thread, ..., Scan Thread → Merge → Output Queue (Result Tuples)]
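The split, scan, merge flow can be pictured with the sketch below. It assumes that every scan thread owns a disjoint slice of the table and that each operation is sent to all of them; the function names and the use of `ThreadPoolExecutor` are choices made for this example, not details taken from the slides.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_partition(partition, predicate):
    """The work of one scan thread: a full scan of its own partition."""
    return [row for row in partition if predicate(row)]


def split_scan_merge(table, predicate, n_cores=4):
    # Each scan thread owns a disjoint slice of the table.
    partitions = [table[i::n_cores] for i in range(n_cores)]
    with ThreadPoolExecutor(max_workers=n_cores) as pool:
        # Split: the same operation is handed to every scan thread.
        partials = pool.map(scan_partition, partitions, [predicate] * n_cores)
        # Merge: the partial results are concatenated into one output.
        return [row for part in partials for row in part]


table = list(range(20))
print(sorted(split_scan_merge(table, lambda x: x % 2 == 0)))
```

Because the slices are disjoint, the scan threads share no state and need no locking among themselves.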

Page 51

Crescando in a Data Center (N Machines)

[Diagram: External Clients, Aggregation Layers, Replication Groups of Crescando nodes]

Page 52

Why is this interesting?

Fully predictable performance

• Response time determined by design regardless of load

Only two parameters:

• Size of the scan

• Number of queries per scan

Scalable to arbitrary numbers of nodes

Philipp Unterbrunner, Georgios Giannikis, Gustavo Alonso, Dietmar Fauser, Donald Kossmann: Predictable Performance for Unpredictable Workloads. PVLDB 2(1): 706-717 (2009)

Page 53

Even more interesting

On such a system (or a key value store, No-SQL database, etc.):

• Why use a general purpose CPU?

• Many other options available

Smaller CPUs

FPGAs

ASIC

Zsolt István, Gustavo Alonso, Michaela Blott, Kees A. Vissers: A flexible hash table design for 10Gbps key-value stores on FPGAs. FPL 2013

Page 54

3. Ideas and directions

Page 55

Not everything is parallel

P. Roy, J. Teubner, G. Alonso Efficient Frequent Item Counting in Multi-Core Hardware, KDD 2012

Page 56

Not everything is a CPU

Louis Woods, Zsolt István, Gustavo Alonso: Hybrid FPGA-accelerated SQL query processing. FPL 2013

Page 57

Hardware might solve your problem

Louis Woods, Gustavo Alonso, Jens Teubner: Parallel Computation of Skyline Queries. FCCM 2013

Page 58

Pipeline parallelism

SharedDB does not run queries individually (each one in its own thread). Instead, it runs operators that process queries in batches of thousands of queries at a time

Georgios Giannikis, Gustavo Alonso, Donald Kossmann: SharedDB: Killing One Thousand Queries With One Stone. PVLDB 5(6): 526-537 (2012)
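One way to picture batch-at-a-time operators is the sketch below. It is an assumption-laden illustration and not SharedDB's design: tuples flowing between operators are tagged with the ids of the queries that still want them, and the names `shared_scan` and `shared_count` are invented for this example.

```python
def shared_scan(table, batch):
    """Scan the table once for the whole batch and tag each row with the
    ids of the queries whose predicate it satisfies."""
    for row in table:
        qids = {qid for qid, pred in batch.items() if pred(row)}
        if qids:
            yield row, qids


def shared_count(tagged_rows, batch):
    """Downstream operator shared by all queries: one counter per query."""
    counts = {qid: 0 for qid in batch}
    for _, qids in tagged_rows:
        for qid in qids:
            counts[qid] += 1
    return counts


table = [{"price": p} for p in (5, 15, 25, 35)]
batch = {
    "cheap": lambda r: r["price"] < 20,
    "mid": lambda r: 10 <= r["price"] < 30,
    "none": lambda r: r["price"] > 100,
}
print(shared_count(shared_scan(table, batch), batch))
```

The table is read once no matter how many queries are in the batch, and the operators form a pipeline in which each stage can run on its own core.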

Page 59

SharedDB can run TPC-W!

Page 60

For the non-db people

TPC-W has updates!!!

Full consistency without conventional transaction manager

Transactions are no longer what you read in textbooks …

• Sequential execution

• Memory CoW (Hyder, TU Munich)

• Snapshot isolation

Page 61

Raw performance

Page 62

Predictability, robustness

Page 63

Parallelism or distribution?

Page 64

COD: Overview

What is the knowledge we have?

Who knows what?

• DBMS: Application requirements and characteristics

• OS: System state and utilization of resources + Hardware & architecture

Insert interface here

COD: Database/Operating System co-design (CIDR’13)

Page 65

COD’s Interface supports

Policy Engine

• System-level: OS (System-level facts), DB storage engine (DB system-level properties)

• Application-specific: DB-specific Facts & properties

Push application-specific facts:

• #Requests (in a batch)

• Datastore size (#Tuples, and TupleSize)

• SLA response time requirement

Needed for: cost / utility functions

Cost functions

Page 66

COD’s key features

Declarative interface

Resource allocation for imperative requests

Resource allocation based on cost functions

Proactive interface

Inform of system state

Request releasing of resources

Recommend reallocation of resources


Page 67

Deployment in a noisy system

Experimental results

[Figure: System Performance - Throughput. x-axis: Increasing load [#requests], 0 to 4000; y-axis: Throughput [req/second], 0 to 1200. Series: 48 cores - isolated; 48 cores - noisy. Annotation: what we get using COD, knowing the system state]

Experiment setup

• AMD MagnyCours

• 4 x 2.2GHz AMD Opteron 6174 processors

• total Datastore size 53GB

• Noise: another CPU-intensive task running on core 0

Page 68

Deployment in a noisy system

Experimental results

[Figure: System Performance - Throughput. x-axis: Increasing load [#requests], 0 to 4000; y-axis: Throughput [req/second], 0 to 1200. Series: 48 cores - isolated; 47 cores – COD noisy; 48 cores - noisy. Annotation: what we get using COD, knowing the system state]

Experiment setup

• AMD MagnyCours

• 4 x 2.2GHz AMD Opteron 6174 processors

• total Datastore size 53GB

• Noise: another CPU-intensive task running on core 0

Page 69

Adaptability to dynamic system state

Experimental results

[Figure: Adaptability – Latency. x-axis: Elapsed time [min], 0 to 20; y-axis: Latency [sec], 0 to 9. Series: SLA]

Experiment setup

• AMD MagnyCours

• 4 x 2.2GHz AMD Opteron 6174 processors

• total Datastore size 53GB

• Noise: other CPU-intensive threads spawned every 4-5min on core 0

Page 70

Adaptability to dynamic system state

Experimental results

[Figure: Adaptability – Latency. x-axis: Elapsed time [min], 0 to 20; y-axis: Latency [sec], 0 to 9. Series: Naïve datastore engine; SLA]

Experiment setup

• AMD MagnyCours

• 4 x 2.2GHz AMD Opteron 6174 processors

• total Datastore size 53GB

• Noise: other CPU-intensive threads spawned every 4-5min on core 0

Page 71

Adaptability to dynamic system state

Experimental results

[Figure: Adaptability – Latency. x-axis: Elapsed time [min], 0 to 20; y-axis: Latency [sec], 0 to 9. Series: Naïve datastore engine; SLA; COD]

Experiment setup

• AMD MagnyCours

• 4 x 2.2GHz AMD Opteron 6174 processors

• total Datastore size 53GB

• Noise: other CPU-intensive threads spawned every 4-5min on core 0

Page 72

Conclusions

Page 73

The opportunity is now

Consensus on major crisis in hardware (from the sw perspective)

Hardware not really improving, responsibility passed on to software

Business models and IT systems moving towards specialization