The Modern Data Warehouse 1 Marc Schöni Technology Solution Professional BI Microsoft Switzerland Karl-Heinz Sütterlin Technology Solution Professional App Plat

The Modern Data Warehouse - download.microsoft.comdownload.microsoft.com/.../SQL_Roadshow_ModernDWH_240614.pdf · Data Warehouse . Big Data Big data solutions deal with complexities

Jul 17, 2020

Transcript
Page 1

The Modern Data Warehouse

Marc Schöni, Technology Solution Professional BI, Microsoft Switzerland
Karl-Heinz Sütterlin, Technology Solution Professional App Plat

Page 2

Unlocking Insights on Any Data

Breakthrough Data Platform Performance with SQL Server 2014

Enabling Familiar, Powerful Business Intelligence

The Modern Data Warehouse

Page 3
Page 4
Page 5

Big Data

Big data solutions deal with the complexities of:

• VOLUME (Size)
• VARIETY (Structure)
• VELOCITY (Speed)
• VALUE

Page 6

Industry Trends

How the challenges are tackled technology-wise

• VOLUME (Size)
• VARIETY (Structure)
• VELOCITY (Speed)

Technologies: MPP, In-Memory, Hadoop

VALUE: Excel/Power BI

Page 7

Hadoop = MapReduce + HDFS

[Diagram: Hadoop ecosystem]

• MapReduce (Job Scheduling/Execution System)
• HDFS (Hadoop Distributed File System)
• HBase (Column DB)
• Hive, Mahout, Pig, Oozie, Sqoop, Flume, Avro, ZooKeeper, Ambari, HCatalog, Cascading, R
• HBase/Cassandra/Couch/MongoDB
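The equation "Hadoop = MapReduce + HDFS" can be made concrete with a toy example. The sketch below imitates the map, shuffle, and reduce phases in plain Python on an in-memory list. It does not use Hadoop, and the function names (`map_phase`, `shuffle`, `reduce_phase`) and the sample lines are illustrative assumptions that do not come from the deck.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over all input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input; in Hadoop these lines would be blocks of a file in HDFS.
sample_lines = ["big data big value", "data warehouse"]
counts = word_count(sample_lines)
```

In a real cluster the map and reduce calls would run on many machines next to the data blocks they process; here everything runs in one process purely to show the data flow.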

Page 8

[data + analytics + people] @ speed

Microsoft Analytics Platform System

• Relational MPP Database (PDW)
• Hadoop (HDInsight)
• PolyBase

MPP
• Pre-built and performance-tuned appliance for analytical workload
• Scale-out to petabytes of data

In-Memory
• 100x speed improvement
• Huge storage savings with columnstore

Hadoop
• Dedicated region for Hadoop
• Joining relational and non-relational data with PolyBase

Page 9

Concurrency and mixed workloads

Great Query Performance at Scale

• Massively Parallel Processing (MPP) parallelizes queries
• Handles query complexity and concurrency at scale

[Diagram: MPP query execution, from query to results]
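How MPP parallelizes a query can be sketched as a scatter-gather pattern: rows are distributed across compute nodes, each node aggregates its own slice, and a control node merges the partial results. The example below is a minimal illustration under stated assumptions, not the PDW engine. The thread pool only stands in for separate machines, and the names `distribute`, `local_sum`, and `mpp_sum_by_store` as well as the sample sales rows are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute(rows, node_count):
    """Distribute rows across compute nodes by store_id modulo node count."""
    nodes = [[] for _ in range(node_count)]
    for store_id, amount in rows:
        nodes[store_id % node_count].append((store_id, amount))
    return nodes


def local_sum(partition):
    """Each compute node aggregates only its own slice of the data."""
    totals = {}
    for store_id, amount in partition:
        totals[store_id] = totals.get(store_id, 0) + amount
    return totals


def mpp_sum_by_store(rows, node_count=4):
    """Control node: scatter the work, then merge the partial results."""
    partitions = distribute(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(local_sum, partitions))
    merged = {}
    for partial in partials:
        for store_id, total in partial.items():
            merged[store_id] = merged.get(store_id, 0) + total
    return merged


# Hypothetical (store_id, amount) sales rows.
sales = [(1, 10), (2, 5), (1, 7), (3, 2), (2, 1), (5, 4)]
result = mpp_sum_by_store(sales)
```

Because rows with the same `store_id` always land on the same node, each node can finish its group totals without talking to the others, which is the property that lets this kind of query scale out by adding nodes.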

Page 10

Scaling out relational data to petabytes

Start small and scale as you grow

[Diagram: Scale-out PDW from 0 TB to 6 PB, shown as a row of six PDW/HDInsight units]

• Dedicated CPU, memory and storage
• Incrementally add hardware for near-linear scale
• Integrates HDInsight and PDW
• No “forklift” of prior warehouse to increase capacity

Page 11

Driver for In Memory

[Chart: Intel CPU trends and memory prices ($/GB), 1970 to 2015, vertical axis ticks from 1 to 10,000,000]

• Computing power still follows Moore's Law (due to parallelism)
• CPU clock frequency stalled
• Memory has gotten a LOT cheaper

Updatable clustered columnstore vs. table with customary indexing

• Up to 100x faster queries
• Up to 15x more compression
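One reason column-oriented storage compresses well is that the values of a single column are stored together, so repeated values can be encoded compactly. The sketch below uses run-length encoding as a simple stand-in for that idea. It is an illustrative assumption, not the actual SQL Server columnstore format, and it does not reproduce the "up to 15x" figure from the slide. The `region_column` data is hypothetical.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse runs of equal adjacent values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, length in runs for _ in range(length)]


# Hypothetical "region" column, sorted so that equal values sit together.
region_column = ["East"] * 6 + ["North"] * 5 + ["West"] * 4

runs = rle_encode(region_column)
# Number of stored entries before and after encoding, as a rough ratio.
ratio = len(region_column) / len(runs)
```

The ratio depends entirely on how repetitive the column is: a low-cardinality, sorted column collapses into a few runs, while a column of unique values would not shrink at all under this encoding.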

Page 12

Integrate relational data and Hadoop

Query relational + non-relational data with PolyBase

[Diagram: Select… → PolyBase (Analytics Platform System) → Result set. Hadoop sources: Hortonworks (Windows, Linux), Cloudera, Microsoft Azure HDInsight, Microsoft HDInsight]

• PDW and HDInsight in a single appliance
• Single query model
• Enterprise-ready Hadoop (Security, Manageability and HA)
• Hybrid: spans Hadoop on-prem or in the cloud
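The "single query model" means that one SQL statement can combine warehouse tables with file-based data. The sketch below imitates that idea with Python's built-in `sqlite3` module. It is not PolyBase and does not use its syntax. The table names, the sample customers, and the `CLICKSTREAM_FILE` content are all hypothetical, and copying the file rows into SQLite is a simplification of reading external data at query time.

```python
import csv
import io
import sqlite3

# Hypothetical export that would normally live as a delimited file in Hadoop.
CLICKSTREAM_FILE = "customer_id,page_views\n1,40\n2,3\n3,12\n"

conn = sqlite3.connect(":memory:")

# Relational side: an ordinary warehouse table.
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Anna"), (2, "Beat"), (3, "Chiara")],
)

# Stand-in for an external table: expose the file's rows under a table name.
conn.execute(
    "CREATE TABLE ext_clickstream (customer_id INTEGER, page_views INTEGER)"
)
reader = csv.DictReader(io.StringIO(CLICKSTREAM_FILE))
conn.executemany(
    "INSERT INTO ext_clickstream VALUES (?, ?)",
    [(int(r["customer_id"]), int(r["page_views"])) for r in reader],
)

# One SQL statement joins the relational rows with the file-based rows.
rows = conn.execute(
    """
    SELECT c.name, e.page_views
    FROM customers AS c
    JOIN ext_clickstream AS e ON e.customer_id = c.customer_id
    WHERE e.page_views > 10
    ORDER BY e.page_views DESC
    """
).fetchall()
```

The point of the pattern is that the person writing the query only needs SQL; where each table physically lives is hidden behind the table name.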

Page 13

Analytics Platform System - Evolution

Extending the Data Warehouse further

[Diagram: Analytics Platform System with SQL Server, a Control Node, four Compute Nodes, and four Hadoop Nodes; regions labeled PDW and HDInsight]

Page 14

Polybase

Page 15
Page 16

What did Coop do?

• Traditional SMP
  • Traditional approach, 1 single server
  • 32 physical cores
  • 256 GB memory
  • Shared SAN
  • HA with clustering

• Scale out (MPP) = APS/PDW
  • Modern way of data warehousing
  • 32 physical cores (active)
  • 512 GB memory
  • Direct attached storage
  • HA built-in
  • Connected to Hadoop

Modernized from SMP to MPP and got “Big Data” ready

[Diagram labels: SQL Instance, SQL Instance #1, SQL Instance #2, Storage]

Page 17

Customer Scoring Procedure (Example)

60x or more performance improvement

• More data, more accurate results
• From nightly batch to intraday ad-hoc/ongoing scoring
• Do the impossible
• Immediate responses to ad-hoc queries, model improvements

Page 18

Business need & result

Need for faster analytics using more data from Point of Sales and Loyalty Programs

Business result

• Improved supply chain and time to market
• Optimized and better targeted marketing
• Faster price adaptations, even at regional level

Page 19

Another customer example

Data platform

[Diagram: internal and external data sources (SQL, Oracle, SAP, …); Hadoop (HDFS); Azure HDInsight (Hadoop on Azure); APS with PolyBase; BI Apps. Steps shown: Collect data, Reduce data]

• Driving the ITDM’s decision: Cost/performance profile
• Driving the BDM’s decision: Time to value

Page 20
Page 21

Past / Now / Forever

Excel & the Power BI tools

VOLUME (Size), VARIETY (Structure), VELOCITY (Speed)

Past
• Huge SQL Servers: massive tuning & partitioning effort
• Complex event processing software: .NET know-how
• Hadoop without PolyBase: massive MapReduce & Java/.NET know-how

Now
• Analytics Platform System: PDW Region (MPP)
• Analytics Platform System: Clustered Column Store Index (In Memory)
• Analytics Platform System: Hadoop Region & PolyBase

Page 22
Page 23