Azure SQL Data Warehouse Andrija Marcic Microsoft
Azure SQL Data Warehouse
Andrija Marcic
Microsoft
App Service
Intelligent App
Hadoop
Azure Machine Learning
Power BI
End to end platform built for the cloud
Azure SQL Database
SQL
Azure SQL DataWarehouse
Options:
SQL Server on physical machines
SQL Server in on-premises VM (private
cloud)
SQL Server in Azure VM (public cloud)
Azure SQL Database (public cloud)
SQL Server, Azure VMs, Azure SQL DB
Shared Lower Cost
Dedicated Higher Cost
Higher Administration Lower Administration
Off Premises
Hybrid Cloud
Physical
SQL ServerPhysical Machines(raw Iron)SQL
Virtual
SQL Server Private CloudVirtualized Machines + Appliances
Infrastructure as a service
SQL Server in Azure VMVirtualized machinesSQL
Platform as a service
Software as a services
Azure SQL DatabaseVirtualized Databases
SQL
On Premises
Demo
Azure SQL DB vs Azure SQL DataWarehouse
Azure SQL DB
SQL Azure Data Warehouse (MPP)
SQL Server
Real-time elasticity
Resize in <1 minute On-demand compute
Expand or reduce
as needed
Data Warehouse Units (DWUs) are a measure of reserved compute performance or ‘power.’ A customer’s DWU needs can vary depending on the needs of their workload.
100 200 300 400 500 600 1000 1200 1500 2000 3000 6000
DATA WAREHOUSE UNITS (DWUS)
ALTER DATABASE ContosoDW
MODIFY
(service_objective ='DW1000'
)
;
DWU
DW100
DW200
DW300
DW400
DW500
DW600
DW1000
DW1200
DW1500
DW2000
DW3000
DW6000
Data Warehouse Unit (DWU)
Demo
CREATE DATABASE MyDB COLLATESQL_Latin1_General_CP1_CI_AS
(
EDITION = 'DataWarehouse'
, SERVICE_OBJECTIVE = 'DW400'
, MAXSIZE = 10240 GB
);
ALTER DATABASE MyDBMODIFY (MAXSIZE = 245760 GB);
SQL Server Bulk Copy tool, SQL Server Integration Services (SSIS), Informatica, Attunity, Bryte Systems, snapLogic, others
Azure Blob Storage, HDInsight (future), Azure Data Lake Store (future)
Azure Data Factory, Redgate Data Platform Studio
Query unstructured data via PolyBase/T-SQL
PolyBase
Scale out compute
SQL DW instanceHadoop VMs /
Azure Storage/
Azure Datalake
Any data, any size, anywhere
T-SQL query
Azure SQL DW Hadoop
Quote:
************************
**********************
*********************
**********************
***********************
$658.39
Jim Gray
Name
11/13/58
DOB
WA
State
Ann Smith 04/29/76 ME
32active queries
1024
op
en se
ssions
1
2
4
8
16
32
64
128
Slo
ts
mediumrc (m) largerc (l) xlargerc (xl) Max Concurrency Slots
0,0
10,0
20,0
30,0
40,0
50,0
60,0
70,0
80,0
90,0
100,0
mediumrc (m) largerc (l) xlargerc (xl)
% C
on
sum
ed
DW100 DW200 DW300 DW400 DW500 DW600 DW1000 DW1200 DW1500 DW2000
0
4
8
12
16
20
24
28
32
DW100 DW200 DW300 DW400 DW500 DW600 DW1000 DW1200 DW1500 DW2000
Slo
ts
Concurrency impact
DWUMax External
ReadersMax Writers
DW100 8 60
DW200 16 60
DW300 24 60
DW400 32 60
DW500 40 60
DW600 48 60
DW1000 80 80
DW1200 96 96
DW1500 120 120
DW2000 160 160
DW3000 240 240
DW6000 480 480
0
1000
2000
3000
4000
5000
6000
7000
8000
1 2 3 4 5 6 10 12 15 20 30 60
GB/Hour
GB/Hour
0
5000
10000
15000
20000
25000
DW100 DW200 DW300 DW400 DW500 DW600 DW1000 DW1200 DW1500 DW2000 DW3000 DW6000
Capacity (GB)
0-4 TBs
4-8 TBs
8-12 TBs
12-16 TBs
16-20 TBs
20-36 TBs
36-48 TBs
48-60 TBs
60-80 TBs
80-160 TBs
>160 TBs
100 200 300 400 500 600 1000 1200 1500 2000 3000 6000
Recommended starting point
Flexibility to select any range of DWUs
?
Tack för eran uppmärkasmhet
Tack för eran uppmärksamhet&
Glöm inte att fylla i utvärderingen
Azure SQL Data Warehouse Architecture
Control
Node
Compute
Node
Compute
Node
Compute
Node
Compute
Node
SQL DB
SQL DB
SQL DB
SQL DB
Blob storage [WASB(S)]
ComputeScale compute up or down
when required(SLA <= 60 seconds).
Pause, Restart, Stop, Start.
StorageAdd\Load data to WASB(S) without incurring compute
costs
Massively Parallel Processing (MPP) Engine
Azure Infrastructure and Storage
100 DWU < > 2000 DWU
Storage and Compute are de-coupled, enabling a true elastic service and
separate charging for both compute and storage
Application or User connection
HDInsight
Data Loading(SSIS, REST, OLE, ADO, ODBC,
WebHDFS, AZCopy, PS) DMS
DMS DMS DMS DMS
DMS (Data Movement Service) executes across all
database nodes
Azure SQL Data Warehouse – Control Node
Control
Node
Compute
Node
Compute
Node
Compute
Node
Compute
Node
SQL DB
SQL DB
SQL DB
SQL DB
Blob storage [WASB(S)]
Massively Parallel Processing (MPP) Engine
HDInsight
Control
Node
SQL DB
• Endpoint for connections
• Regular SQL endpoint (TCP 1433)
• Persists no user data (metadata only)
• Coordinates compute activity using MPP
Azure SQL Data Warehouse - Compute Nodes
Control
Node
Compute
Node
Compute
Node
Compute
Node
Compute
Node
SQL DB
SQL DB
SQL DB
SQL DB
Blob storage [WASB(S)]
Massively Parallel Processing (MPP) Engine
HDInsight
Compute
Node(s)
Azure SQL Database
SQL DB
An increase of DWU will increase the number of
compute nodes
Azure SQL Data Warehouse – Blob storage
Control
Node
Compute
Node
Compute
Node
Compute
Node
Compute
Node
SQL DB
SQL DB
SQL DB
SQL DB
Blob storage [WASB(S)]
Massively Parallel Processing (MPP) Engine
HDInsight
• RA-GRS storage
• +PB’s of storage
• Load data without incurring compute costs