
MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Jan 17, 2017

Transcript
Page 1: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

MariaDB ColumnStore

Page 2: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

"We should be talking about the analytics of things, not the internet of things."

Jim Davis, CMO, SAS

Page 3: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Current State of Analytics

•  Traditional OLAP
   o  Cost to perform
   o  Appliances or Proprietary Solutions
•  Big-Data Analytics
   o  Scale to perform
   o  Non-SQL Interfaces
•  Analytics and Transaction Separation

Page 4: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Why MariaDB ColumnStore

•  Price to Performance at Scale
•  Data Analytics using SQL or Spark
•  Unified Simplicity (Transaction and Analytics under the same Roof)
•  Open-Source GPL2

Page 5: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Why Customers Choose MariaDB ColumnStore

SCALE
●  Massively parallel architecture designed for big data, scaling to process petabytes of data
●  Read performance scales linearly with data growth

SPEED
●  Exceptional performance
●  Real-time response to analytics queries and high-speed data loading

SECURITY and RELIABILITY
●  Data with encryption for data in motion, role-based access and audit features of MariaDB Enterprise
●  Built-in high availability at access and data layers

SIMPLICITY with POWER
●  Simplified management and maintenance, easy installation and scaling
●  Same interface as MariaDB and MySQL, attaches to wide range of BI tools

Page 6: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

MariaDB ColumnStore Architecture

▪  User Module: Processes SQL Requests
▪  Performance Module: Distributed Processing Engine

[Diagram labels: Clients; User Connections; User Modules; MariaDB SQL Front End; Query Engine; Performance Modules 1, 2, 3 ... N; Columnar Distributed Data Storage]

Page 7: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Row-Oriented vs Column-Oriented

Row-oriented: rows stored sequentially in a file

Key  Fname     Lname  State  Zip    Phone          Age  Sales
1    Bugs      Bunny  NJ     11217  (123)938-3235  34   100
2    Yosemite  Sam    CT     95389  (234)375-6572  52   500
3    Daffy     Duck   IA     10013  (345)227-1810  35   200
4    Elmer     Fudd   CT     04578  (456)882-7323  43   10
5    Witch     Hazel  CT     01970  (567)744-0991  57   250

Column-oriented: each column is stored in a separate file. Each column for a given row is at the same offset.

Key:    1, 2, 3, 4, 5
Fname:  Bugs, Yosemite, Daffy, Elmer, Witch
Lname:  Bunny, Sam, Duck, Fudd, Hazel
State:  NJ, CT, IA, CT, CT
Zip:    11217, 95389, 10013, 04578, 01970
Phone:  (123)938-3235, (234)375-6572, (345)227-1810, (456)882-7323, (567)744-0991
Age:    34, 52, 35, 43, 57
Sales:  100, 500, 200, 10, 250
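The two layouts on this slide can be sketched in a few lines of Python. This is only an in-memory illustration using the slide's sample data; the function names (`to_columns`, `total_sales_by_state`, `row_at`) are hypothetical and say nothing about how ColumnStore lays out its files on disk.

```python
# Illustrative sketch only: in-memory lists stand in for the per-column files.
COLUMNS = ["Key", "Fname", "Lname", "State", "Zip", "Phone", "Age", "Sales"]

# Row-oriented: each tuple holds every column of one row.
ROWS = [
    (1, "Bugs", "Bunny", "NJ", "11217", "(123)938-3235", 34, 100),
    (2, "Yosemite", "Sam", "CT", "95389", "(234)375-6572", 52, 500),
    (3, "Daffy", "Duck", "IA", "10013", "(345)227-1810", 35, 200),
    (4, "Elmer", "Fudd", "CT", "04578", "(456)882-7323", 43, 10),
    (5, "Witch", "Hazel", "CT", "01970", "(567)744-0991", 57, 250),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column; offset i in every list belongs to row i."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def total_sales_by_state(columns, state):
    """Aggregate by reading only the State and Sales columns."""
    return sum(
        sales
        for st, sales in zip(columns["State"], columns["Sales"])
        if st == state
    )


def row_at(columns, names, offset):
    """Rebuild a full row by reading the same offset from every column."""
    return tuple(columns[name][offset] for name in names)


columns = to_columns(ROWS, COLUMNS)
ct_sales = total_sales_by_state(columns, "CT")
third_row = row_at(columns, COLUMNS, 2)
```

The aggregate touches two of the eight columns, which is the usual argument for columnar layouts in analytics; rebuilding a whole row, by contrast, needs one read per column.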

Page 8: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Data Storage - Extents and PMs

[Diagram: Extents 1 to 8 distributed across two PMs (PM 1, PM 2), and the same eight extents distributed across four PMs (PM 1 to PM 4)]

●  Extent Map
   ○  In-memory metadata of an extent's min, max value for a column, the extent's physical block offset and the PM on which the extent resides
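One way such per-extent min/max metadata can be used is to skip extents that cannot contain matching values for a range predicate. The sketch below is a hypothetical illustration of that idea: the `ExtentEntry` fields mirror the items listed on the slide, but the class, the function name and the sample values are assumptions, not ColumnStore's actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtentEntry:
    """Hypothetical extent map entry for one column (fields follow the slide)."""
    extent_id: int
    pm: int            # PM on which the extent resides
    block_offset: int  # physical block offset of the extent
    min_value: int     # smallest column value stored in the extent
    max_value: int     # largest column value stored in the extent


def extents_to_scan(extent_map, low, high):
    """Keep only extents whose [min, max] range overlaps the predicate range [low, high]."""
    return [e for e in extent_map if e.max_value >= low and e.min_value <= high]


# Made-up sample map: four extents spread over two PMs.
EXTENT_MAP = [
    ExtentEntry(1, pm=1, block_offset=0, min_value=1, max_value=100),
    ExtentEntry(2, pm=2, block_offset=0, min_value=101, max_value=200),
    ExtentEntry(3, pm=1, block_offset=8192, min_value=201, max_value=300),
    ExtentEntry(4, pm=2, block_offset=8192, min_value=301, max_value=400),
]

# A predicate such as "value BETWEEN 150 AND 250" only needs extents 2 and 3.
candidates = extents_to_scan(EXTENT_MAP, 150, 250)
```

Because each entry also records its PM, the surviving extents indicate which PMs need to take part in the scan.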

Page 9: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Data Ingestion

●  Bulk data load
   ○  cpimport: CSV and Binary
   ○  LOAD DATA INFILE: CSV
●  Apache Sqoop Integration:
   ○  Integration with cpimport and SQL interface
●  Future Release
   ○  Data Streaming from MariaDB/MySQL database to MariaDB ColumnStore cluster
      •  via Kafka
      •  Avro data record

Page 10: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Data Ingestion - Bulk Data Load

●  cpimport
   ○  Fastest way to load data
      •  Load data from CSV file
      •  Load data from Standard Input
      •  Load data from Binary Source file
   ○  Multiple tables can be loaded in parallel by launching multiple jobs
   ○  Read queries continue without being blocked
   ○  Successful cpimport is auto-committed
   ○  In case of errors, entire load is rolled back
●  LOAD DATA INFILE
   ○  Traditional way of importing data into any MariaDB storage engine table
   ○  Up to 2 times slower than cpimport for large size imports
   ○  Either success or error operation can be rolled back
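As a rough sketch of "launching multiple jobs", the Python below builds one cpimport command per table and runs them concurrently. It assumes the commonly documented invocation form (`cpimport [-s delimiter] dbName tblName loadFile`); the database, table and file names are made up, and the helper names are hypothetical. Check the cpimport usage text for your release before relying on any flag.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_cpimport_command(database, table, csv_path, delimiter=","):
    """Build the argument list for one bulk-load job.

    Assumption: cpimport accepts database, table and load file as positional
    arguments and -s to set the column delimiter.
    """
    return ["cpimport", "-s", delimiter, database, table, csv_path]


def run_parallel_loads(jobs, max_workers=4):
    """Launch one cpimport process per (database, table, csv_path) job.

    Returns the exit code of each job in the same order as `jobs`.
    """
    commands = [build_cpimport_command(*job) for job in jobs]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda cmd: subprocess.run(cmd, check=False), commands))
    return [result.returncode for result in results]


# Hypothetical jobs: one CSV file per target table.
JOBS = [
    ("sales_db", "orders", "/tmp/orders.csv"),
    ("sales_db", "customers", "/tmp/customers.csv"),
]
COMMANDS = [build_cpimport_command(*job) for job in JOBS]
```

Calling `run_parallel_loads(JOBS)` on a host where cpimport is installed would start both loads at once; since each job targets a different table, a failed load rolls back independently of the other.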

Page 11: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

MariaDB ColumnStore 1.0

Analytics
•  In-database analytics with complex joins, windowing functions and UDFs
•  Out-of-box BI tools connectivity
•  Analytics integration with R

Scale
•  Columnar, Massively Parallel
•  Linear scalability

Performance
•  High performance ad hoc analysis
•  Consistent query response time

High Availability
•  Built-in redundancy and high availability

Ease of Use
•  ANSI SQL compatible
•  ACID compliant
•  No indexes, No materialized views
•  No manual partitioning

Data Ingestion
•  CONNECT Engine
•  Create Table as Select
•  High speed parallel data load and extract

Security
•  SSL support, Audit Plugin, Authentication Plugin, Role Based Access

Deployment Options
•  On premise, AWS, Hadoop

Page 12: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Use Case: Scaling Big Data Analytics

•  Harvest new value from large historical datasets by deriving new insights
•  Support growth in your business while continuing to deliver high service levels for data analytics

[Figure: Rows / Data Size Scope. Row counts: 1; 100; 10,000; 1,000,000; 100,000,000; 10,000,000,000; 100,000,000,000. Data sizes: 10-100 GB; 100-1000 GB; 1-10 TB; 10-100 TB ... PB. Regions: MariaDB Enterprise OLTP; MariaDB Enterprise OLAP]

Page 13: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Use Case: Scaling Big Data Analytics

Business Challenge
●  An organization is generating large amount of operational data
●  Multiple terabytes of historical data
●  With growth in business and in operational data
   ○  Analytics query performance degrades
   ○  Impractical to do analytics

MariaDB Solution
●  Put past data into MariaDB ColumnStore
●  As data grows
●  Perform analytics without performance degradation
●  Linear Scalability with data growth

[Diagram: MariaDB ColumnStore 1.0 with nodes 1, 2, 3; add new node(s)]

Page 14: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Use Case: Discover Insight

●  Uncover new business opportunity with data exploration and analytics on petabyte data volumes
●  Generate real-time insights to inform and enhance live customer interactions

Page 15: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Use Case: Discover Insight

Global Commercial Aviation Manufacturer

Challenges
●  Need to analyze real-time and historical flight parameter data
●  Too time-consuming to perform analytics with current toolset
●  Most data analysts have SQL background

Objectives:
●  Maintain flight safety - accurately predict part replacement
●  Provide high service levels and minimize cost - proactively plan equipment maintenance and retirement

[Diagram labels]
Historical data; real-time in-flight performance data
Micro-batch upload real-time flight performance into MariaDB ColumnStore
•  Complex-join, aggregation and windowing functions
•  High speed real-time performance
Analytics; Data Scientist; Familiar SQL Interface
Timely maintenance forecast, part replacement, flight retirement

The company plans to sell this solution as a service to commercial airliners

Page 16: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Use Case: Accelerated Analytics with SQL & Spark

●  Familiar SQL interfaces democratize access to big data to larger user base
●  Attach wide range of BI tools via MariaDB/MySQL connectors
●  Getting most value out of big data while minimizing Opex cost
●  Leverage Hadoop deployments

Page 17: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Use Case: Accelerated Analytics with Hadoop

Business Challenge
●  Large amount of data in Hadoop
●  Hadoop is suitable for
   ○  batch processing
   ○  Transforms via Map-Reduce programming
●  Real-time analytics on Hadoop
   ○  Speed cannot meet business requirement with the Hadoop toolset
●  Shortage of Hadoop skills for Data Scientist/BA
   ○  SQL interfaces on Hadoop Tools are not mature

MariaDB Solution
●  MariaDB ColumnStore OLAP can run on premise, on cloud or on Hadoop cluster
●  Ingest data from Hadoop
●  Mature ANSI-SQL compliance
●  Stellar performance: 70 to 80 times faster than SQL-on-Hadoop counterparts Hive, HBase and Impala
●  Mature interfaces

[Diagram labels: Hadoop Distributed File System; MapReduce; HBase; Pig/Hive; MariaDB ColumnStore; Batch Processing; High Performance analytics]

Page 18: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Use Case: Simplifying Big Data Management

●  Improved DBA productivity
●  Familiar SQL interfaces democratize access to big data to larger user base
●  Reduced operational complexity
●  Getting most value out of big data while minimizing DBA Opex cost

Page 19: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Use Case: Simplifying Big Data Management

Business Challenge
Complexity of data management increases as data volume grows
●  Tedious to keep up with indexes and partitioning as data grow
●  Scaling-out or Scaling up management
●  Moving operational data to big data analytics platform in real-time

MariaDB Solution
●  MariaDB ColumnStore
●  Liberation from Index management
●  Automatic partitioning
●  Easy to grow
●  Micro-batch bulk load for real-time data-flow

[Diagram labels: three Sources; cpimport; one UM Node; three PM Nodes]
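The "micro-batch bulk load" idea on this slide can be sketched as follows: group incoming records into small batches and serialize each batch as delimited text that a bulk loader could read from standard input (the earlier bulk-load slide lists standard input as a cpimport source). The helper names, the pipe delimiter and the sample records are assumptions made for illustration only.

```python
import csv
import io
from itertools import islice


def micro_batches(rows, batch_size):
    """Yield consecutive lists of at most batch_size rows from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def batch_to_csv(batch, delimiter="|"):
    """Serialize one batch as delimited text, one line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(batch)
    return buffer.getvalue()


# Made-up operational records: (id, source name, reading).
EVENTS = [(i, f"sensor-{i % 2}", i * 10) for i in range(5)]

# Each payload is what one micro-batch load job would receive as input.
PAYLOADS = [batch_to_csv(batch) for batch in micro_batches(EVENTS, batch_size=2)]
```

Smaller batches shorten the delay before data becomes queryable, at the cost of more load jobs; the right batch size depends on the workload.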

Page 20: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

MariaDB ColumnStore Roadmap

First release
•  MariaDB ColumnStore (Porting of InfiniDB on MariaDB 10.1)
•  Amazon EBS support
•  Create Table Like / As Select

Future Releases
•  Spark Integration
•  Data Streaming integration with MaxScale
•  Native API for columnar file
•  Join and Filter performance optimization
•  ROLLUP, CUBE in MariaDB ColumnStore
•  AS OF implementation in MariaDB Server
•  CONNECT Engine support in MariaDB Server
•  SQL Editor (OSS or 3rd party partner)

Subscription offering

Page 21: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Learn more about MariaDB ColumnStore

●  BETA release in Q4 2016.
●  Sign up for notification of BETA availability today
●  Product Page: https://mariadb.com/products/mariadb-columnstore

Page 22: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore


Q&A

Page 23: MariaDB Roadshow 2016: Introduction to MariaDB ColumnStore

Thank You