How to Select an Analytic DBMS Overview, checklists, and tips by Curt A. Monash, Ph.D. President, Monash Research Editor, DBMS2 contact @monash.com http://www.monash.com http://www.DBMS2.com

Mar 26, 2015

How to Select an Analytic DBMS

Overview, checklists, and tips

by Curt A. Monash, Ph.D.

President, Monash Research
Editor, DBMS2

contact @monash.com
http://www.monash.com
http://www.DBMS2.com

Curt Monash

Analyst since 1981, own firm since 1987
  Covered DBMS since the pre-relational days
  Also analytics, search, etc.

Publicly available research
  Blogs, including DBMS2 (www.DBMS2.com -- the source for most of this talk)
  Feed at www.monash.com/blogs.html
  White papers and more at www.monash.com

User and vendor consulting

Our agenda

Why are there such things as specialized analytic DBMS?

What are the major analytic DBMS product alternatives?

What are the most relevant differentiations among analytic DBMS users?

What’s the best process for selecting an analytic DBMS?

Why are there specialized analytic DBMS?

General-purpose database managers are optimized for updating short rows …

… not for analytic query performance

10-100X price/performance differences are not uncommon

At issue is the interplay between storage, processors, and RAM

Moore’s Law, Kryder’s Law, and a huge exception

Growth factors:
  Transistors/chip: >100,000 since 1971
  Disk density: >100,000,000 since 1956
  Disk speed: 12.5 since 1956

The disk speed barrier dominates everything!

[Chart: Compound Annual Growth Rate (axis 0% to 45%) for Transistors/Chip since 1971, Disk Density since 1956, and Disk Speed since 1956]
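
The growth factors on this slide can be turned into compound annual growth rates with a one-line formula. The sketch below is illustrative only: the slide does not say in which year the factors were measured, so `END_YEAR = 2009` is an assumption, and `cagr` and `growth_rates` are hypothetical helper names. The ">" on the first two factors means those results are lower bounds.

```python
def cagr(growth_factor: float, years: int) -> float:
    """Compound annual growth rate implied by a total growth factor over `years`."""
    return growth_factor ** (1.0 / years) - 1.0


END_YEAR = 2009  # ASSUMPTION: the slide does not state the measurement year

# (total growth factor, start year), taken from the slide
FACTORS = {
    "Transistors/chip": (100_000, 1971),
    "Disk density": (100_000_000, 1956),
    "Disk speed": (12.5, 1956),
}


def growth_rates(end_year: int = END_YEAR) -> dict:
    """Map each metric to its implied annual growth rate up to `end_year`."""
    return {
        name: cagr(factor, end_year - start)
        for name, (factor, start) in FACTORS.items()
    }


if __name__ == "__main__":
    for name, rate in growth_rates().items():
        print(f"{name}: {rate:.1%} per year")
```

Under that assumed end year, the first two rates land in the 35-45% range while disk speed comes out near 5% per year, which is the gap the slide calls the disk speed barrier.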

The “1,000,000:1” disk-speed barrier

RAM access times ~5-7.5 nanoseconds
  CPU clock speed <1 nanosecond
  Interprocessor communication can be ~1,000X slower than on-chip

Disk seek times ~2.5-3 milliseconds
  Limit = ½ rotation
  i.e., 1/30,000 minutes
  i.e., 1/500 seconds = 2 ms

Tiering brings it closer to ~1,000:1 in practice, but even so the difference is VERY BIG
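
The arithmetic behind this slide can be checked directly. This is a rough sketch, not a benchmark: `RPM = 15_000` is an assumption inferred from the slide's "1/30,000 minutes" per half rotation, and the function names are hypothetical.

```python
NS = 1e-9  # one nanosecond, in seconds
MS = 1e-3  # one millisecond, in seconds

RPM = 15_000  # ASSUMPTION: implied by "1/30,000 minutes" for half a rotation


def half_rotation_seconds(rpm: float) -> float:
    """Time for half a disk revolution, the floor the slide gives for seek time."""
    return 0.5 * 60.0 / rpm


def disk_to_ram_ratio_range() -> tuple:
    """(lowest, highest) ratio of the slide's disk seek times to RAM access times."""
    lowest = (2.5 * MS) / (7.5 * NS)   # fastest seek vs. slowest RAM access
    highest = (3.0 * MS) / (5.0 * NS)  # slowest seek vs. fastest RAM access
    return lowest, highest


if __name__ == "__main__":
    print(f"half rotation: {half_rotation_seconds(RPM) * 1e3:.1f} ms")
    low, high = disk_to_ram_ratio_range()
    print(f"disk seek vs. RAM access: {low:,.0f}:1 to {high:,.0f}:1")
    print(f"disk seek vs. a 1 ns clock tick: {(2.5 * MS) / NS:,.0f}:1")
```

With the slide's own figures, disk versus RAM works out to a few hundred thousand to one, and disk versus a sub-nanosecond clock tick to a few million to one, so "1,000,000:1" reads as an order-of-magnitude label rather than an exact ratio.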

Software strategies to optimize analytic I/O

Minimize data returned
  Classic query optimization

Minimize index accesses
  Page size

Precalculate results
  Materialized views
  OLAP cubes

Return data sequentially
Store data in columns
Stash data in RAM
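
As one illustration of "precalculate results", the sketch below uses Python's built-in `sqlite3` module. SQLite has no materialized views, so an ordinary summary table stands in for one here. The table and column names (`sales`, `sales_by_region`) and the sample rows are hypothetical.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create a tiny detail table plus a precalculated summary table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10), ("east", 15), ("west", 7), ("west", 3), ("west", 5)],
    )
    # Precalculate once; this table plays the role of a materialized view.
    conn.execute(
        "CREATE TABLE sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )
    return conn


def total_from_detail(conn: sqlite3.Connection, region: str) -> int:
    """Answer the question by scanning the detail rows."""
    row = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ?", (region,)
    ).fetchone()
    return row[0]


def total_from_summary(conn: sqlite3.Connection, region: str) -> int:
    """Answer the same question from the precalculated table."""
    row = conn.execute(
        "SELECT total FROM sales_by_region WHERE region = ?", (region,)
    ).fetchone()
    return row[0]
```

Both functions return the same totals, but the second reads one small row instead of every detail row. The cost is that the summary has to be rebuilt or maintained whenever the detail data changes.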

Hardware strategies to optimize analytic I/O

Lots of RAM
Parallel disk access!!!
Lots of networking

Tuned MPP (Massively Parallel Processing) is the key
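
A back-of-the-envelope model shows why parallel disk access matters for scans. Everything numeric here is an assumption chosen for round arithmetic (1 TB of data, 100 MB/s per disk), and the model is idealized: perfectly even data spread, no network or CPU bottleneck. `scan_seconds` is a hypothetical helper.

```python
TB = 1e12  # bytes, decimal terabyte


def scan_seconds(data_bytes: float, disks: int, mb_per_sec_per_disk: float) -> float:
    """Idealized full-scan time when data is spread evenly across `disks`."""
    aggregate_bytes_per_sec = disks * mb_per_sec_per_disk * 1e6
    return data_bytes / aggregate_bytes_per_sec


if __name__ == "__main__":
    for disks in (1, 10, 100):
        seconds = scan_seconds(TB, disks, 100.0)  # ASSUMED 100 MB/s per disk
        print(f"{disks:>3} disks: {seconds:,.0f} s to scan 1 TB")
```

In this model the scan time falls in direct proportion to the number of disks read in parallel. Real systems fall short of that, which is one reason the slide stresses that the MPP configuration has to be tuned.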

Specialty hardware strategies

Custom or unusual chips (rare)
Custom or unusual interconnects
Fixed configurations of common parts
  Appliances or recommended configurations

And there’s also SaaS

18 contenders (and there are more)

Aster Data
Dataupia
Exasol
Greenplum
HP Neoview
IBM DB2 BCUs
Infobright/MySQL
Kickfire/MySQL
Kognitio
Microsoft Madison
Netezza
Oracle Exadata
Oracle w/o Exadata
ParAccel
SQL Server w/o Madison
Sybase IQ
Teradata
Vertica

General areas of feature differentiation

Query performance
Update/load performance
Compatibilities
Advanced analytics
Alternate datatypes
Manageability and availability
Encryption and security

Major analytic DBMS product groupings

Architecture is a hot subject

Traditional OLTP
Row-based MPP
Columnar
(Not covered tonight) MOLAP/array-based

Traditional OLTP examples

Oracle (especially pre-Exadata)
IBM DB2 (especially mainframe)
Microsoft SQL Server (pre-Madison)

Analytic optimizations for OLTP DBMS

Two major kinds of precalculation
  Star indexes
  Materialized views

Other specialized indexes
Query optimization tools
OLAP extensions
SQL 2003
Other embedded analytics

Drawbacks

Complexity and people cost
Hardware cost
Software cost
Absolute performance

Legitimate use scenarios

When TCO isn’t an issue
  Undemanding performance (and therefore administration too)

When specialized features matter
  OLTP-like
  Integrated MOLAP
  Edge-case analytics

Rigid enterprise standards
Small enterprise/true single-instance

Row-based MPP examples

Teradata
DB2 (open systems version)
Netezza
Oracle Exadata (sort of)
DATAllegro/Microsoft Madison
Greenplum
Aster Data
Kognitio
HP Neoview

Typical design choices in row-based MPP

“Random” (hashed or round-robin) data distribution among nodes

Large block sizes
  Suitable for scans rather than random accesses

Limited indexing alternatives
  Or little optimization for using the full boat

Carefully balanced hardware
High-end networking
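
The two "random" distribution schemes named on this slide can be sketched in a few lines. This is a generic illustration, not any vendor's actual method: CRC32 is used only because it is a stable hash available in the standard library, and `hash_node`, `round_robin_node`, and `distribute` are hypothetical names.

```python
import zlib
from collections import Counter


def hash_node(key: str, nodes: int) -> int:
    """Pick a node from a stable hash of the distribution key."""
    return zlib.crc32(key.encode("utf-8")) % nodes


def round_robin_node(row_number: int, nodes: int) -> int:
    """Pick a node by dealing rows out in turn, ignoring their contents."""
    return row_number % nodes


def distribute(keys, nodes: int, method: str = "hash") -> Counter:
    """Count how many rows each node receives under the chosen method."""
    counts = Counter()
    for row_number, key in enumerate(keys):
        if method == "hash":
            node = hash_node(key, nodes)
        elif method == "round_robin":
            node = round_robin_node(row_number, nodes)
        else:
            raise ValueError(f"unknown method: {method}")
        counts[node] += 1
    return counts


if __name__ == "__main__":
    keys = [f"customer-{i}" for i in range(1000)]  # hypothetical keys
    print("hash:       ", sorted(distribute(keys, 4, "hash").items()))
    print("round robin:", sorted(distribute(keys, 4, "round_robin").items()))
```

Round-robin gives an exactly even spread but places rows without regard to their key. Hashing always sends the same key to the same node, at the price of a spread that depends on the data.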

Tradeoffs among row MPP alternatives

Enterprise standards
Vendor size
Hardware lock-in
Total system price
Features

Columnar DBMS examples

Sybase IQ
SAND
Vertica
ParAccel
Infobright
Kickfire
Exasol
MonetDB
SAP BI Accelerator (sort of)

Columnar pros and cons

Bulk retrieval is faster
Pinpoint I/O is slower
Compression is easier
Memory-centric processing is easier
MPP is not quite as crucial
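
The first two points can be illustrated with a deliberately simplified storage model. It assumes fixed-width values and ignores compression, block sizes, and caching. `TableShape`, `bytes_scanned`, and `accesses_for_one_row` are hypothetical names, and the table dimensions in the usage lines are made up.

```python
from dataclasses import dataclass


@dataclass
class TableShape:
    rows: int
    columns: int
    bytes_per_value: int = 8  # ASSUMPTION: fixed-width values, no compression


def bytes_scanned(shape: TableShape, columns_needed: int, layout: str) -> int:
    """Bytes read to scan `columns_needed` columns across every row."""
    if layout == "row":
        # A row store reads whole rows, including columns the query ignores.
        return shape.rows * shape.columns * shape.bytes_per_value
    if layout == "column":
        # A column store reads only the columns the query touches.
        return shape.rows * columns_needed * shape.bytes_per_value
    raise ValueError(f"unknown layout: {layout}")


def accesses_for_one_row(shape: TableShape, layout: str) -> int:
    """Separate storage locations touched to fetch all columns of one row."""
    if layout == "row":
        return 1
    if layout == "column":
        return shape.columns
    raise ValueError(f"unknown layout: {layout}")


if __name__ == "__main__":
    shape = TableShape(rows=1_000_000, columns=50)  # hypothetical table
    for layout in ("row", "column"):
        print(layout, bytes_scanned(shape, 2, layout), accesses_for_one_row(shape, layout))
```

In this model a two-column scan over a 50-column table reads 25 times less data in the columnar layout, while fetching one complete row touches 50 locations instead of one.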

Segmentation – a first cut

One database to rule them all
One analytic database to rule them all
Frontline analytic database
Very, very big analytic database
Big analytic database handled very cost-effectively

Basics of systematic segmentation

Use cases
Metrics
Platform preferences

Use cases – a first cut

Light reporting
Diverse EDW
Big Data
Operational analytics

Metrics – a first cut

Total raw/user data
  Below 1-2 TB, references abound
  10 TB is another major breakpoint

Total concurrent users
  5, 15, 50, or 500?

Data freshness
  Hours
  Minutes
  Seconds

Basic platform issues

Enterprise standards
Appliance-friendliness
Need for MPP?
Cloud/SaaS

The selection process in a nutshell

Figure out what you’re trying to buy
Make a shortlist
Do free POCs*
Evaluate and decide

*The only part that’s even slightly specific to the analytic DBMS category

Figure out what you’re trying to buy

Inventory your use cases
  Current
  Known future
  Wish-list/dream-list future

Set constraints
  People and platforms
  Money

Establish target SLAs
  Must-haves
  Nice-to-haves
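
One way to make the must-have versus nice-to-have split concrete is to write each target SLA as a pass/fail check. The sketch below is only an illustration: the product names, metric names, and thresholds are all hypothetical, and `evaluate` is not taken from the talk.

```python
from typing import Callable, Dict

Check = Callable[[dict], bool]

# Hypothetical SLA targets; a real list would come from the use-case inventory.
MUST_HAVES: Dict[str, Check] = {
    "dashboard query under 5 s": lambda c: c["dashboard_query_seconds"] < 5,
    "at least 50 concurrent users": lambda c: c["concurrent_users"] >= 50,
}
NICE_TO_HAVES: Dict[str, Check] = {
    "data fresh within 15 min": lambda c: c["load_latency_minutes"] <= 15,
}

# Hypothetical measurements for two made-up candidates.
CANDIDATES = [
    {"name": "Product A", "dashboard_query_seconds": 2.0,
     "concurrent_users": 80, "load_latency_minutes": 60},
    {"name": "Product B", "dashboard_query_seconds": 8.0,
     "concurrent_users": 200, "load_latency_minutes": 5},
]


def evaluate(candidate: dict, must: Dict[str, Check], nice: Dict[str, Check]) -> dict:
    """A candidate qualifies only if it passes every must-have check."""
    failed = [name for name, passes in must.items() if not passes(candidate)]
    met = [name for name, passes in nice.items() if passes(candidate)]
    return {
        "name": candidate["name"],
        "qualifies": not failed,
        "failed_must_haves": failed,
        "nice_to_haves_met": met,
    }


if __name__ == "__main__":
    for candidate in CANDIDATES:
        print(evaluate(candidate, MUST_HAVES, NICE_TO_HAVES))
```

The design point is that a nice-to-have can never rescue a candidate that fails a must-have, so the two lists have to be kept separate from the start.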

Use-case checklist -- generalities

Database growth
  As time goes by …
  More detail
  New data sources

Users (human)
Users/usage (automated)
Freshness (data and query results)

Use-case checklist – traditional BI

Reports
  Today
  Future

Dashboards and alerts
  Today
  Future
  Latency

Ad-hoc
  Users
  Now that we have great response time …

Use-case checklist – data mining

How much do you think it would improve results to
  Run more models?
  Model on more data?
  Add more variables?
  Increase model complexity?

Which of those can the DBMS help with anyway?

What about scoring?
  Real-time
  Other latency issues

SLA realism

What kind of turnaround truly matters?
  Customer or customer-facing users
  Executive users
  Analyst users

How bad is downtime?
  Customer or customer-facing users
  Executive users
  Analyst users

Short list constraints

Cash cost
  But purchases are heavily negotiated

Deployment effort
  Appliances can be good

Platform politics
  Appliances can be bad
  You might as well consider incumbent(s)

Filling out the shortlist

Who matches your requirements in theory?

What kinds of evidence do you require?
  References?
    How many?
    How relevant?
  A careful POC?
  Analyst recommendations?
  General “buzz”?

A checklist for shortlists

What’s your tolerance for specialized hardware?
What’s your tolerance for set-up effort?
What’s your tolerance for ongoing administration?
What are your insert and update requirements?
At what volumes will you run fairly simple queries?
What are your complex queries like?
For which third-party tools do you need support?

and, most important,

Are you madly in love with your current DBMS?

Proof-of-Concept basics

The better you match your use cases, the more reliable the POC is

Most of the effort is in the set-up
  You might as well do POCs for several vendors – at (almost) the same time!

Where is the POC being held?

The three big POC challenges

Getting data
  Real?
    Politics
    Privacy
  Synthetic?
  Hybrid?

Picking queries
  And more?

Realistic simulation(s)
  Workload
  Platform
  Talent
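
The "synthetic data" and "picking queries" challenges can be prototyped before any vendor is involved. In the sketch below, `sqlite3` is only a stand-in for whichever DBMS is under test, and the `facts` table, its columns, and the helper names are hypothetical. A fixed random seed makes the synthetic data reproducible, so every candidate can be loaded with identical rows.

```python
import random
import sqlite3
import statistics
import time


def make_synthetic_rows(n: int, seed: int = 0) -> list:
    """Generate reproducible fake fact rows: (id, region, amount)."""
    rng = random.Random(seed)
    regions = ["east", "west", "north", "south"]
    return [(i, rng.choice(regions), rng.randint(1, 1000)) for i in range(n)]


def load(rows: list) -> sqlite3.Connection:
    """Load rows into an in-memory database (stand-in for the system under test)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE facts (id INTEGER, region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", rows)
    return conn


def time_query(conn: sqlite3.Connection, sql: str, repeats: int = 5) -> float:
    """Median wall-clock seconds over several runs, fetching all results."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(sql).fetchall()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


if __name__ == "__main__":
    conn = load(make_synthetic_rows(100_000, seed=1))
    sql = "SELECT region, SUM(amount) FROM facts GROUP BY region"
    print(f"median: {time_query(conn, sql):.4f} s")
```

A single query timed in isolation says little about concurrency or platform effects, so a harness like this covers only part of what the slide means by a realistic simulation.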

POC tips

Don’t underestimate requirements
Don’t overestimate requirements
Get SOME data ASAP
Don’t leave the vendor in control
Test what you’ll be buying
Use the baseball bat

Evaluate and decide

It all comes down to
  Cost
  Speed
  Risk

and in some cases
  Time to value
  Upside

Further information

Curt A. Monash, Ph.D.
President, Monash Research
Editor, DBMS2

contact @monash.com
http://www.monash.com
http://www.DBMS2.com