SQL Server Enterprise Data Warehousing Scott Hulke Microsoft Technology Center - Dallas
Dec 19, 2015
Agenda
What’s new in R2 for Data Warehousing
Data Warehousing Architectures
Customer Examples
Demos
Customer Insights & Tech Trends
Compliance & Risk Management
Right Information at the Right Time
IT Agility and Cost Efficiency
BUSINESS REQUIREMENTS CONSTANTLY GROWING
Globalization
Virtualization Hardware Innovation
Cloud Services
GAME CHANGERS DRIVING TECHNOLOGY INNOVATION
Digitally Born Data
Enterprise security and scalability
Data consistency across heterogeneous systems
High-scale, complex event processing
TRUSTED, SCALABLE PLATFORM
IT & DEVELOPER EFFICIENCY
MANAGED SELF-SERVICE BI
Multi-server management
Virtualization & Live Migration
Accelerated development & deployment
Self-service analytics
Self-service reporting
Streamlined collaboration & management
Large Scale Data Warehousing in R2
StreamInsight
Parallel Data Warehouse
RDBMS Engine Enhancements
Enhanced Compression
Datacenter Edition
Up to 256 processor cores
Not R2-specific, but…
Hardware advances such as increased core density, solid state disks (SSDs)
Complex Event Processing
"Intelligence isn't just about knowing what is happening. It's about looking at the patterns in real time. If business people want to truly optimize their resources, they must examine patterns in real time.”
- Mike Gualtieri - Forrester
SITUATION TODAY
Pressure to handle large volumes of complex data in real time
The size and frequency of data make it challenging to store for data mining and analysis
Growing need to monitor, analyze and act on the data in motion
SQL SERVER 2008 R2 STREAMINSIGHT
Process large volumes of events across multiple data streams in less than a second
Manage your business through historical data mining and continuous insights
Built-in support for different types of event handling and rich query semantics
Example CEP Scenarios
[Diagram labels: Data Stream; Event Processing Engine; Data Stream; Stream Data Store & Archive; Lookup; Asset Specs & Parameters; Asset Instrumentation for Data Acquisition, Subscriptions to Data Feeds]

Event Processing Engine:
• Threshold queries
• Event correlation from multiple sources
• Pattern queries

Manufacturing:
• Sensor on plant floor
• React through device controllers
• Aggregated data
• 10,000 events/sec

Web Analytics:
• Click-stream data
• Online customer behavior
• Page layout
• 100,000 events/sec

Power, Utilities:
• Energy consumption
• Outages
• Smart grids
• 100,000 events/sec

Financial Services:
• Stock & news feeds
• Algorithmic trading
• Patterns over time
• Super-low latency
• 100,000 events/sec

Visual trend-line and KPI monitoring
Batch & product management
Automated anomaly detection
Real-time customer segmentation
Algorithmic trading
Proactive condition-based maintenance
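The threshold queries listed above can be illustrated with a minimal sketch. This is plain Python, not StreamInsight's query API; the `Event` type, the `threshold_alerts` function, and the sample readings are hypothetical and only show the idea of evaluating a condition over a trailing time window as events arrive.

```python
from collections import deque
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since an arbitrary start
    sensor_id: str
    value: float


def threshold_alerts(
    events: Iterable[Event], window_seconds: float, threshold: float
) -> Iterator[tuple[float, str, float]]:
    """Yield (timestamp, sensor_id, window_average) whenever a sensor's
    average over the trailing window exceeds the threshold."""
    windows: dict[str, deque[Event]] = {}
    for event in events:  # assumed to arrive in timestamp order
        window = windows.setdefault(event.sensor_id, deque())
        window.append(event)
        # Drop readings that have fallen out of the trailing window.
        while window[0].timestamp <= event.timestamp - window_seconds:
            window.popleft()
        average = sum(e.value for e in window) / len(window)
        if average > threshold:
            yield (event.timestamp, event.sensor_id, average)


# Hypothetical plant-floor readings for one sensor.
readings = [
    Event(0.0, "press-1", 10.0),
    Event(1.0, "press-1", 30.0),
    Event(2.0, "press-1", 50.0),
    Event(12.0, "press-1", 5.0),
]
alerts = list(threshold_alerts(readings, window_seconds=10.0, threshold=25.0))
print(alerts)
```

With these sample readings, only the third event pushes the 10-second average above 25, so a single alert is produced. Event correlation and pattern queries would need state across several streams, which this sketch does not attempt.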
High Scale Data Warehouse
“Parallel Data Warehouse is a natural complement to SQL Server, so we are excited about the possibilities the DatAllegro acquisition will bring.”
- Ron Van Zanten, Directing Officer of Business Intelligence, Premier Bankcard Inc
SITUATION TODAY
Data volumes are exploding
Growing population of users accessing information
Increasingly complex data analyses performed against data
SQL SERVER 2008 R2 PARALLEL DATA WAREHOUSE
Predictable scale-out through MPP on SQL Server and Windows
Massive Scale with Low TCO – 10s to 100TB+ (total cost starts at $15k/TB!)
Integrated BI platform for small and very large Enterprises
SQL Server Parallel Data Warehouse
Choice of hardware vendor
High scale through Massively Parallel Processing (MPP) system
Hub and Spoke architecture
Deep integration with Microsoft BI
A data warehouse appliance with massive scalability
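The MPP scale-out named on this slide can be pictured with a small sketch. It is not Parallel Data Warehouse's actual distribution logic: the hash function, the names `node_for_key`, `distribute`, and `parallel_sum`, and the four-node layout are all assumptions chosen for illustration. The idea shown is only that rows are spread across nodes by a hash of a key, each node aggregates its own slice, and the partial results are combined.

```python
import hashlib
from collections import defaultdict


def node_for_key(key: str, node_count: int) -> int:
    """Map a distribution key to a node index with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def distribute(rows, key_field, node_count):
    """Assign each row to the node chosen by hashing its key field."""
    shards = defaultdict(list)
    for row in rows:
        shards[node_for_key(str(row[key_field]), node_count)].append(row)
    return shards


def parallel_sum(shards, value_field):
    """Each node sums its own rows; the partial sums are then combined."""
    partials = [sum(row[value_field] for row in rows) for rows in shards.values()]
    return sum(partials)


# Hypothetical fact rows, distributed over four hypothetical compute nodes.
rows = [{"customer_id": i, "amount": i} for i in range(100)]
shards = distribute(rows, "customer_id", node_count=4)
total = parallel_sum(shards, "amount")
print({node: len(node_rows) for node, node_rows in sorted(shards.items())}, total)
```

Because every row lands on exactly one node, the combined total matches a single-node sum, while each node only scans its own share of the data.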
Enterprise-level Scalability and Security
Fewest critical vulnerabilities of any Enterprise Database according to NIST
SITUATION TODAY
Businesses need a data platform which keeps up with demands of their growing business
Sensitive & valuable information needs to be highly-secured
More demand for 24/7 availability
SQL SERVER 2008 and R2
Enhanced data compression improves performance and reduces storage requirements
Transparent data encryption prevents access to secure data from unauthorized users
Supports up to 256 logical processors
Empower Your Users
Taking Advantage of Latest Trends
Memory becoming increasingly affordable
99% of all BI apps of Fortune 5000 companies can fit in 1 TB of RAM
Client computers by 2012: 8-12 core processors will be standard
Agenda
What’s new in R2 for Data Warehousing
Data Warehousing Architectures
Customer Examples
Demos
Microsoft DW Solutions
SSIS
Microsoft & Partner Services
Two SQL DW Infrastructure Options: SQL Classic DW or Fast Track SQL DW
SQL 2008 Data Warehouse SMP Server

SQL Classic DW Architecture
Leverages Shared SAN
Enterprise Shared SAN Storage
Shared Network Bandwidth
OLTP Applications

Fast Track SQL DW Architecture
Architecture modeled after DW appliances (“appliance-like” solutions)
Uses dedicated SAN arrays and network
Dedicated SAN
Dedicated Network Bandwidth
SQL Fast Track DW supports “Scan Centric” DW workloads that are index light

SAN Arrays
1:4 CPU cores
8 data disks per array – 4 RAID 1 pairs
Simultaneous SQL Server reads
2 log and 1 hot spare
EMC AX4 – HP MSA2312 – IBM 3400
SQL Server Fast Track Data Warehouse
A method for designing a cost-effective, balanced system for Data Warehouse workloads
Reference hardware configurations developed in conjunction with hardware partners using this method
Best practices for data layout, loading and management
Relational Database Only – Not SSAS, SSIS, SSRS
Fast Track Data Warehouse Components
Software:
• SQL Server 2008 Enterprise
• Windows Server 2008
Configuration guidelines:
• Physical table structures
• Indexes
• Compression
• SQL Server settings
• Windows Server settings
• Loading
Hardware:
• Tight specifications for servers, storage and networking
• ‘Per core’ building block
SQL Server Fast Track Data Warehouse 2.0 for DELL
2 Processor Configuration
Server: Dell PowerEdge R710 with 2 quad-core Intel Xeon processors
8 CPU cores
32 GB memory
Storage server: EMC CLARiiON AX4
Scalability: 4 – 8 TB

4 Processor Configuration
Server: Dell PowerEdge R900 with 4 6-core Intel Xeon processors
24 CPU cores
96 GB memory
Storage server: EMC CLARiiON AX4
Scalability: 12 – 24 TB
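The ‘per core’ building block can be checked against the two Dell configurations with simple arithmetic. The sketch below is only that: the `FastTrackConfig` class and its method names are hypothetical, and the SAN array count reads the deck's "1:4 CPU cores" ratio as one array per four cores, which is an interpretation rather than a published sizing rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FastTrackConfig:
    name: str
    cpu_cores: int
    min_capacity_tb: float
    max_capacity_tb: float

    def capacity_per_core_tb(self) -> tuple[float, float]:
        """Low and high end of the stated capacity range, per CPU core."""
        return (
            self.min_capacity_tb / self.cpu_cores,
            self.max_capacity_tb / self.cpu_cores,
        )

    def san_arrays(self, cores_per_array: int = 4) -> int:
        """Array count if one array serves `cores_per_array` cores (assumed)."""
        return -(-self.cpu_cores // cores_per_array)  # ceiling division


# Figures taken from the two configurations listed above.
CONFIGS = [
    FastTrackConfig("Dell PowerEdge R710, 2 processors", 8, 4, 8),
    FastTrackConfig("Dell PowerEdge R900, 4 processors", 24, 12, 24),
]

for config in CONFIGS:
    low, high = config.capacity_per_core_tb()
    print(f"{config.name}: {low:.1f}-{high:.1f} TB per core, "
          f"{config.san_arrays()} SAN arrays at 1:4")
```

Both configurations work out to the same 0.5 to 1 TB per core, which is consistent with capacity scaling linearly with core count.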
SQL Server 2008 R2 Parallel DW
“PDW”
SQL Server Parallel DW Architecture
[Architecture diagram; labels recovered from the slide:]
Control Nodes (Active / Passive)
Management Servers
Landing Zone
Backup Node
Database Servers (SQL)
Spare Database Server
Storage Nodes
Dual Infiniband
Dual Fiber Channel
Client Drivers
ETL Load Interface
Corporate Backup Solution
Data Center Monitoring
Corporate Network
Private Network
SQL Parallel and Fast Track Hub and Spoke
Central EDW Hub
Regional Reporting
Departmental Reporting
ETL Tools
High Performance HQ Reporting
Agenda
What’s new in R2 for Data Warehousing
Data Warehousing Architectures
Customer Examples
Demos
Top statistics
Category | Metric
Largest single database | 80 TB
Largest table | 20 TB
Biggest total data, 1 customer | 2.5 PB
Highest transactions per second, 1 db | 36,000
Fastest I/O subsystem in production | 20 GB/sec
Fastest “real time” cube | 15 sec latency
Data load for 1 TB | 20 minutes
Largest cube | 4.2 TB
Pan-STARRS Project
Largest astronomy project in history
4 telescopes capturing 1.5-gigapixel images
Largest DB approaching 80 TB+
Total data managed > 1 PB
5+ TB added per day
HA/DR: relying on backups of the input files for now.
Telecom
CDR Analytics
70 TB relational
4 TB largest cube
100+ concurrent queries
Itanium 64 core with storage system rated over 20 GB/sec throughput
Loading 1 TB in < 30 minutes
Processing 1 million records/sec in AS cubes
Business
Online gaming applications - Europe’s largest betting line-up
Sports, Poker, Casino, Skill Games
90 different sports covered in 22 languages
> 12,000 different bets offered per day
> 3 million individual and combination bets placed every day
Bwin.com sponsors top world soccer teams: Real Madrid, AC Milan, FC Bayern Munich

Key Technologies
Running on SQL Server 2008 & Windows 2008 Enterprise
Windows Communication Foundation
Synchronous database mirroring between two centers 12 km apart
Added 1 ms delay on transaction
99.99x% availability @ 24 x 7 since migrating to SQL from Oracle
100.00% uptime in 2008 and 2009 (since moving to SQL 2008 and Windows 2008)
Zero data loss (financial transactions are involved)
Replication and Log shipping for most databases
DB Mirroring for betting database
Full suite of SQL products - IS, AS and RS
ASP.NET for application
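To put the availability figures above in concrete terms, an availability percentage can be converted into allowed downtime per year. This is generic arithmetic, not anything specific to SQL Server or to this customer; the function name and the 365-day year are assumptions made for the sketch.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime per year permitted at a given availability."""
    return (1.0 - availability_percent / 100.0) * MINUTES_PER_YEAR


for percent in (99.9, 99.99, 99.999, 100.0):
    minutes = downtime_minutes_per_year(percent)
    print(f"{percent}% availability allows about {minutes:.2f} minutes of downtime per year")
```

At 99.99% this comes to a little under an hour per year, which gives a sense of what a 24 x 7 figure in that range implies.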
Some numbers
Peak financial transactions: 6,000 per second
Peak db transactions: 30,000 per second
Databases: 800+
Instances: 100+
Largest table: 2 billion rows
Total data in SQL Server: 100+ TB
Backup of 2 TB over network: under 1 hr
Largest machines: 64 core 512 GB IA2 HP; 6 x 32 core IA2
http://sqlcat.com/whitepapers/archive/2009/08/13/a-technical-case-study-fast-and-reliable-backup-and-restore-of-a-vldb-over-the-network.aspx
http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?casestudyid=4000001470
Customer Success Stories

Premier Bankcard
Credit Card Company Runs its Business with 17-Terabyte Mission Critical BI Solution
Problem: Premier needed to enhance scalability and performance for its business intelligence (BI) data warehouse and online transaction processing (OLTP) databases.
Solution: Enhanced BI infrastructure by upgrading 17-terabyte data warehouse to Microsoft® SQL Server™ 2008 Enterprise (64-bit), hosted on 16 Intel Itanium 2 processors.
Benefits: We have about 9,000 concurrent users generating a continuous 700 transactions per second, sometimes more than doubling to 2,000 transactions per second. With the 64-bit version of SQL Server 2008 running on Itanium 2 processors we see no limit to our ability to scale our transaction processing.

MySpace
MySpace Uses SQL Server Service Broker to Protect Integrity of 1 Petabyte of Data
Problem: MySpace needed to find a data platform to support 130 million monthly active users, with 300,000 new users added each day, the 8 billion friend relationships it manages, and the 34 billion e-mail messages it stores, with 41 million more added daily.
Solution: The site’s 1 petabyte of data is managed by 440 Microsoft® SQL Server® instances and resides on 3PAR® Utility Storage.
Benefits: We needed to see if Service Broker could handle loads of 4,000 messages per second. Our testing found it could handle more than 18,000 messages a second. We were delighted that we could build our solution using Service Broker, rather than creating a custom solution on our own.

Entergy
Problem: Entergy needed a data store for 3 trillion SCADA records in order to control their power grid.
Solution: The system’s Microsoft SQL Server handles 80 terabytes of data compressed down to 8. It continues to grow at a rate of 20 terabytes (2 terabytes compressed) per year.
Benefits: The ability to act proactively is the holy grail in our industry, and that is what we are gaining from our Pegasus RDS hosting of trillions of SCADA records on the Microsoft Application Platform.
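The Entergy figure of 80 terabytes compressed down to 8 can be restated as a compression ratio and a storage saving. The helper names below are hypothetical; the arithmetic only restates the numbers quoted above and says nothing about what other workloads would achieve.

```python
def compression_ratio(uncompressed_tb: float, compressed_tb: float) -> float:
    """How many times smaller the data is after compression."""
    return uncompressed_tb / compressed_tb


def space_saved_fraction(uncompressed_tb: float, compressed_tb: float) -> float:
    """Fraction of the original footprint that compression removes."""
    return 1.0 - compressed_tb / uncompressed_tb


ratio = compression_ratio(80, 8)
saved = space_saved_fraction(80, 8)
print(f"Compression ratio {ratio:.0f}:1, about {saved:.0%} less storage")
```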
Agenda
What’s new in R2 for Data Warehousing
Data Warehousing Architectures
Customer Examples
Demos
Fast Track Data Warehouse
• Best price/performance for DW workload
• High scan rate through sequential I/O
• Data Compression reduces disk footprint
demo
Data Loads
• Load 100GB in 8 min.
• Note: significantly faster loads possible given more powerful HW – see SSIS World Record Benchmark
• Parallel data load using SSIS
• Take advantage of available hardware
demo
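The load figure on this slide can be turned into a sustained throughput, which makes it easier to compare with other numbers in the deck such as the 1 TB in 20 minutes listed under top statistics. This is a back-of-the-envelope sketch: the function names are hypothetical, and it assumes decimal units (1 TB = 1,000 GB) and a constant load rate.

```python
def throughput_gb_per_sec(size_gb: float, minutes: float) -> float:
    """Average load rate in GB/sec for a load of `size_gb` taking `minutes`."""
    return size_gb / (minutes * 60.0)


def minutes_to_load(size_gb: float, rate_gb_per_sec: float) -> float:
    """Time in minutes to load `size_gb` at a constant rate."""
    return size_gb / rate_gb_per_sec / 60.0


demo_rate = throughput_gb_per_sec(100, 8)      # the demo: 100 GB in 8 minutes
record_rate = throughput_gb_per_sec(1000, 20)  # top statistics: 1 TB in 20 minutes

print(f"Demo load rate: {demo_rate * 1000:.0f} MB/sec")
print(f"1 TB in 20 minutes: {record_rate * 1000:.0f} MB/sec")
print(f"1 TB at the demo rate: {minutes_to_load(1000, demo_rate):.0f} minutes")
```

At the demo rate a 1 TB load would take roughly 80 minutes, so the 20-minute figure implies hardware with about four times the sustained load throughput.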
© 2010 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.