Les Conférences, Groupe des Utilisateurs SQL Server
June 2013: SQL Server in-memory
Alexandre Chemla (Masao), Frédéric Pichaut (Microsoft)
Apr 01, 2015
Frédéric Pichaut, Sr. Escalation Engineer, Microsoft France
SQL Server 14 In-Memory “Hekaton”
24 June 2013
Agenda
• What is “Hekaton”
• Integrated in SQL Server
• Memory consideration
• Storage
• New DMVs
• AMR Tool
• Bonus…
What is Hekaton
Project “Hekaton” adds in-memory technology to boost performance of OLTP workloads in SQL “14”.
• Memory optimized table and index structures
• No buffer pool
• Native compilation of business logic in stored procedures
• “Hekaton” is fully integrated into SQL Server
• Latch- and lock-free data structures
Hekaton Integration and Application Migration
[Figure: architecture diagram of SQL Server.exe. Key: existing SQL component, Hekaton component, generated .dll.]
• Client App
• TDS Handler and Session Management
• Parser, Catalog, Algebrizer, Optimizer
• Hekaton Compiler
• Natively Compiled SPs and Schema (generated .dll)
• Proc/Plan cache for ad-hoc T-SQL and SPs
• Interpreter for TSQL, query plans, expressions
• Query Interop
• Access Methods
• Hekaton Engine: Memory_optimized Tables & Indexes (tables T1 to T4, including a non-durable table)
• Buffer Pool for Tables & Indexes
• Checkpoint & Recovery
• Memory-optimized Table Filegroup
• Data Filegroup
• Transaction Log
Performance Gains
[Figure: the same SQL Server.exe architecture diagram (Client App, TDS Handler and Session Management, Parser, Catalog, Algebrizer, Optimizer, Hekaton Compiler, Natively Compiled SPs and Schema, Proc/Plan cache, Interpreter, Query Interop, Access Methods, Hekaton Engine, Buffer Pool, filegroups, Transaction Log), annotated with the callouts below.]
• 10-30x more efficient
• Reduced log bandwidth & contention. Log latency remains
• Checkpoints are background sequential IO
• No improvements in communication stack, parameter passing, result set generation
Hekaton Performances
Factor faster (slower) than regular SQL
Operation | Interop | Native | Comments
Select count(*) (1) | (2.5) | = | No clustered index scan in Hekaton
Hash Join (1) | (1.3) | N/A | Uses index scan
Nested-loop Join (1) | 4.0 | N/A | Probes into hash index
Single-row selects (1) | 1.3 | 40 | SP doing selects in loop
Single-row selects (1) | 1.2 | 17 | Native compiled SP calls SQL’s rand()
Single-row updates (1) | N/A | 10 | SP doing update in loop
Bwin Session State: 6 (Version M4)
(1) 1 million rows accessed in single query or SP
• Interop targets app migration, not perf
• Advantage of pushing work to SPs
• Expectation for OLTP workloads
Integrated Experience
• Backup and Restore: full and log backup and restore is supported; piecemeal restore is supported
• Failover Clustering: failover time depends on size of durable memory optimized tables
• AlwaysOn: secondary has memory optimized tables in memory; failover time is not dependent on size of durable memory optimized tables
• DMVs, Catalog Views, Perfmon counters, XEvents: monitoring memory, GC activity, transaction details
• SSMS: creating, managing and monitoring tables, databases and server
• Query Optimisation: same SQL Server optimiser
Hekaton syntaxes
• Storage
ALTER DATABASE ContosoOLTP
    ADD FILEGROUP [ContosoOLTP_hk_fs_fg] CONTAINS MEMORY_OPTIMIZED_DATA;
ALTER DATABASE ContosoOLTP
    ADD FILE (NAME = [ContosoOLTP_fs_dir], FILENAME = 'H:\MOUNTHEAD\DATA\CONTOSOOLTP_FS_DIR')
    TO FILEGROUP [ContosoOLTP_hk_fs_fg];
• Table
CREATE TABLE Customers (
    CustomerID nchar (5) NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT=100000),
    CompanyName nvarchar (40) NOT NULL INDEX IX_CompanyName HASH(CompanyName) WITH (BUCKET_COUNT=65536),
    ContactName nvarchar (30) NOT NULL,
    ContactTitle nvarchar (30) NOT NULL,
    Address nvarchar (60) NOT NULL,
    City nvarchar (15) NOT NULL INDEX IX_City HASH(City) WITH (BUCKET_COUNT=1024),
    Region nvarchar (15) NOT NULL INDEX IX_Region HASH(Region) WITH (BUCKET_COUNT=1024),
    PostalCode nvarchar (10) NOT NULL INDEX IX_PostalCode HASH(PostalCode) WITH (BUCKET_COUNT=100000),
    Country nvarchar (15) NOT NULL,
    Phone nvarchar (24) NOT NULL
) WITH (MEMORY_OPTIMIZED=ON, DURABILITY = SCHEMA_AND_DATA)
• Native procedure
CREATE PROC InsertCustomers (
    @CustomerID nchar(5), @CompanyName nvarchar(40),
    @ContactName nvarchar(30), @ContactTitle nvarchar(30),
    @Address nvarchar(60), @City nvarchar(15), @Region nvarchar(15),
    @PostalCode nvarchar(10), @Country nvarchar(15), @Phone nvarchar(24))
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = 'english')
    -- One value per table column; the table above has no Fax column.
    INSERT INTO [dbo].[Customers]
    VALUES (@CustomerID, @CompanyName, @ContactName, @ContactTitle, @Address,
            @City, @Region, @PostalCode, @Country, @Phone);
END
Table Creation
1. CREATE TABLE DDL
2. Table code generated
3. Compiler invoked
4. Table DLL produced
5. Table DLL loaded
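Memory Optimized Tables and Indexes
[Figure: row versions linked through hash index chains. Each row holds timestamps, chain pointers, Name and City.]
Timestamps | Name | City
90, 150 | Susan | Bogota
50, ∞ | Jane | Prague
100, 200 | John | Paris
200, ∞ | John | Beijing
• Hash index on Name (buckets J, S)
• Hash index on City (buckets B, P)
• Garbage collection removes unused rows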
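The timestamp pairs in the figure can be read as a validity interval for each row version. The following is a minimal sketch, assuming a version is visible when the reader's timestamp falls in [begin, end) and that garbage collection may reclaim a version once its end timestamp is no later than the oldest active transaction. The names `RowVersion`, `visible_at` and `collectable` are illustrative only and are not SQL Server APIs.

```python
import math
from typing import NamedTuple


class RowVersion(NamedTuple):
    begin_ts: float   # timestamp of the transaction that created the version
    end_ts: float     # timestamp of the transaction that replaced or deleted it
    name: str
    city: str


# The four row versions shown on the slide.
VERSIONS = [
    RowVersion(90, 150, "Susan", "Bogota"),
    RowVersion(50, math.inf, "Jane", "Prague"),
    RowVersion(100, 200, "John", "Paris"),
    RowVersion(200, math.inf, "John", "Beijing"),
]


def visible_at(versions, read_ts):
    """Versions a reader at read_ts can see (assumed rule: begin <= ts < end)."""
    return [v for v in versions if v.begin_ts <= read_ts < v.end_ts]


def collectable(versions, oldest_active_ts):
    """Versions no active transaction can still see (assumed GC rule)."""
    return [v for v in versions if v.end_ts <= oldest_active_ts]
```

Under these assumptions a reader at timestamp 120 sees John in Paris, while a reader at 250 sees John in Beijing; the Paris version becomes collectable once no transaction older than 200 is still running.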
Hekaton Checkpoint Data Flows
[Figure: row versions flow from Hekaton memory through the transaction log into a checkpoint file and a delta file.]
Checkpoint file (XID | RowID | Name (PK) | Airport)
234 | 001 | George | LAX (Add)
235 | 002 | Fred | CHI (Add)
237 | 001 | George | SEA (Add)
Delta file (Create XID | RowID | Delete XID)
234 | 001 | 237 (Del)
Memory Optimized Tables - Limitations
Optimized for high-throughput OLTP
• No DML triggers
• No XML and no CLR data types
Optimized for in-memory
• Rows are at most 8060 bytes, no off-row data
• No Large Object (LOB) types like varchar(max)
Scoping limitations
• No FOREIGN KEY and no CHECK constraints
• No schema changes (ALTER TABLE): need to drop/recreate table
• No add/remove index: need to drop/recreate table
Accessing Memory Optimized Tables
Natively Compiled Stored Procedures
• Access only memory optimized tables
• Maximum performance
• Limited T-SQL surface area
• When to use: OLTP-style operations; optimizing performance-critical business logic
Interpreted T-SQL Access
• Access both memory- and disk-based tables
• Less performant
• Virtually full T-SQL surface area
• When to use: ad hoc queries; reporting-style queries; speeding up app migration
T-SQL Compiled to Machine Code
• T-SQL is compiled to machine code via a C code generator and the Visual C compiler
• Invoking a procedure is just a call to a DLL entry point
• Aggressive optimizations at compile time
[Figure: pillar diagram. Driver (hardware trends): stalling CPU clock rate. Hekaton tech pillar: T-SQL compiled to machine code. Customer benefit: efficient business-logic processing.]
Native Compiled Stored Procedures – Design Considerations
Aspect | Native Compiled Stored Procedures | Non-Native Compilation
Performance | High. Significantly fewer instructions to go through | No different than T-SQL calls in SQL Server today
Migration Strategy | Application changes, development overhead | Easier app migration as can still access Memory Optimized (MO) tables
Access to objects | Can only interact with Memory Optimized tables | All objects. Access for transactions across MO and b-tree tables
Support for T-SQL Constructs | Limited | T-SQL surface area (limit on MO interaction)
Optimization/Stats and Query Plan | Statistics utilized at CREATE -> compile time | Statistics updates can be utilized to modify plan at runtime
Flexibility | Limited (e.g., no ALTER procedure, compile-time isolation level) | Ad-hoc query patterns
Statistics On Memory-optimized Table
• Statistics on the index key columns are created when the table is empty.
• They need to be updated after data is loaded into the table.
• For natively compiled stored procedures, execution plans for queries in the procedure are optimized when the procedure is compiled. This happens when the procedure is created and when the server restarts, not when statistics are updated.
• The tables need to contain a representative set of data, and statistics need to be up to date, before the procedures are created. (Natively compiled stored procedures are also recompiled if the database is taken offline and brought back online.)
Hekaton Concurrency Control
Multi-version
• Multi-version data store
• Snapshot-based transaction isolation
• No TempDB
Optimistic
• Conflict detection to ensure isolation
• No deadlocks
No locks, no latches, minimal context switches
• No blocking
Supported Isolation Levels
• SNAPSHOT: reads are consistent as of start of the transaction; writes are always consistent
• REPEATABLE READ: read operations yield same row versions if repeated at commit time
• SERIALIZABLE: transaction is executed as if there are no concurrent transactions; all actions happen at a single serialization point (commit time)
Example: Write conflict
Time | Transaction T1 (SNAPSHOT) | Transaction T2 (SNAPSHOT)
1 | BEGIN |
2 | | BEGIN
3 | | UPDATE t SET c1='bla' WHERE c2=123
4 | UPDATE t SET c1='bla' WHERE c2=123 (write conflict) |
First writer wins
Guidelines for usage
1. Declare isolation level, no locking hints
2. Use retry logic to handle conflicts and validation failures
3. Avoid using long-running transactions
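The "first writer wins" rule and the retry guideline can be illustrated with a toy model. This is a sketch only: `Table`, `Transaction`, `WriteConflict` and `run_with_retry` are hypothetical names, and the model only detects conflicts with uncommitted writes, which is a simplification of what the engine checks.

```python
class WriteConflict(Exception):
    """The row already has an uncommitted write from another transaction."""


class Table:
    def __init__(self, rows):
        self.rows = dict(rows)   # committed values, keyed by c2
        self.writers = {}        # key -> transaction holding an uncommitted write


class Transaction:
    def __init__(self, table):
        self.table = table
        self.writes = {}

    def update(self, key, value):
        owner = self.table.writers.get(key)
        if owner is not None and owner is not self:
            # First writer wins: the second writer fails immediately, no waiting.
            self.rollback()
            raise WriteConflict(f"row {key} is being written by another transaction")
        self.table.writers[key] = self
        self.writes[key] = value

    def commit(self):
        self.table.rows.update(self.writes)
        self.writes.clear()
        self._release()

    def rollback(self):
        self.writes.clear()
        self._release()

    def _release(self):
        for key, owner in list(self.table.writers.items()):
            if owner is self:
                del self.table.writers[key]


def run_with_retry(work, max_attempts=3):
    """Re-run the whole unit of work when it hits a write conflict."""
    for attempt in range(max_attempts):
        try:
            return work()
        except WriteConflict:
            if attempt == max_attempts - 1:
                raise
```

The retry wraps the whole transaction rather than the single statement, because the failed transaction has been rolled back and must start again from a fresh snapshot.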
“Hekaton” Memory
Considerations
• Table data: rule of thumb is 2 x data_size
• Index data: bucket_count x pointer_size
Management
• Monitoring: use DMVs, SMO, SSMS
• Configuration: use Resource Governor
Memory Size Estimation
CREATE TABLE dbo.Orders (
    OrderID int NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT=1000000),
    CustomerID int NOT NULL INDEX IX_CustomerID HASH WITH (BUCKET_COUNT=1000000),
    OrderDate datetime NOT NULL,
    OrderDescription nvarchar(1000)
) WITH (MEMORY_OPTIMIZED=ON)
• Assume the Orders table has 1M rows, and the average length of OrderDescription is 78 characters.
• Index size:
  • The bucket_count is 1000000. This is rounded up to the nearest power of 2: 1048576.
  • 8 * 1048576 + 8 * 1048576 = 16777216 bytes
• Table data size:
  • [row size] * [row count] = [row size] * 1000000
  • [row size] = [row header size] + [actual row body size]
  • [row header size] = 24 + 8 * [number of indices] = 24 + 8 * 2 = 40 bytes
  • [actual row body size]:
    • SUM([size of shallow types]) = 4 [int] + 4 [int] + 8 [datetime] = 16
    • 2 + 2 * [number of deep type columns] = 2 + 2 * 1 = 4
    • NULL array = 1, NULL array padding = 1
    • Size so far is 16 + 4 + 1 + 1 = 22. Padded to the nearest multiple of 8, this is 24.
    • [actual row body size] = 24 + 2 * 78 = 180 bytes
  • So [row size] = 40 + 180 = 220 bytes
• [table size] = 16777216 + 220 * 1000000 = 236777216 bytes, about 237 MB (roughly 226 MiB)
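The arithmetic above can be checked with a short script. This is a sketch that encodes only the rules stated on the slide (8-byte bucket pointers, bucket counts rounded up to a power of two, a 24-byte base row header plus 8 bytes per index, fixed part padded to a multiple of 8); the function names are illustrative and the constants for the NULL array are taken as given in the example.

```python
def round_up_to_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (n must be positive)."""
    return 1 << (n - 1).bit_length()


def pad_to_multiple(n: int, multiple: int = 8) -> int:
    """Round n up to the next multiple of `multiple`."""
    return -(-n // multiple) * multiple


def hash_index_bytes(bucket_counts, pointer_size=8):
    """One pointer per bucket, after rounding each bucket count up."""
    return sum(pointer_size * round_up_to_power_of_two(b) for b in bucket_counts)


def row_bytes(index_count, shallow_bytes, deep_columns, avg_deep_bytes):
    header = 24 + 8 * index_count
    # Shallow columns + offset array for deep columns + NULL array + its padding.
    fixed = shallow_bytes + (2 + 2 * deep_columns) + 1 + 1
    return header + pad_to_multiple(fixed) + avg_deep_bytes


def table_bytes(row_count, bucket_counts, shallow_bytes, deep_columns, avg_deep_bytes):
    row = row_bytes(len(bucket_counts), shallow_bytes, deep_columns, avg_deep_bytes)
    return hash_index_bytes(bucket_counts) + row * row_count


# dbo.Orders from the slide: two hash indexes, int + int + datetime,
# one nvarchar column averaging 78 characters (2 bytes each).
ORDERS_BYTES = table_bytes(
    row_count=1_000_000,
    bucket_counts=[1_000_000, 1_000_000],
    shallow_bytes=4 + 4 + 8,
    deep_columns=1,
    avg_deep_bytes=2 * 78,
)
print(ORDERS_BYTES, "bytes =", round(ORDERS_BYTES / 2**20, 1), "MiB")
```

Changing `row_count` or the bucket counts shows how quickly the index share grows when buckets are oversized relative to the data.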
Determining Bucket Count
• The bucket count should be set to about two times the maximum expected number of distinct values in the index key.
• Balance the amount of memory allocated to the hash table and the number of distinct values in the index key.
• The higher the bucket_count value, the more empty buckets there will be in the index.
• The lower the bucket count, the more values are assigned to a single bucket. This decreases performance for point lookups and inserts, because SQL Server may need to traverse several values in a single bucket to find the value specified by the search predicate.
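The trade-off between memory and chain length can be made concrete with a small estimate. This sketch assumes keys are spread uniformly over the buckets, which is an idealization, and uses the two-times rule and power-of-two rounding mentioned in this deck; `suggested_bucket_count` and `average_chain_length` are illustrative helper names.

```python
def round_up_to_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (n must be positive)."""
    return 1 << (n - 1).bit_length()


def suggested_bucket_count(distinct_keys: int, factor: int = 2) -> int:
    """About `factor` times the distinct key count, rounded up to a power of two."""
    return round_up_to_power_of_two(factor * distinct_keys)


def average_chain_length(distinct_keys: int, bucket_count: int) -> float:
    """Expected keys per non-empty bucket, assuming uniform hashing."""
    if distinct_keys == 0:
        return 0.0
    empty_fraction = (1 - 1 / bucket_count) ** distinct_keys
    return distinct_keys / (bucket_count * (1 - empty_fraction))


for buckets in (1024, 131_072, suggested_bucket_count(100_000)):
    chain = average_chain_length(100_000, buckets)
    print(f"{buckets:>8} buckets: {8 * buckets:>9} bytes, avg chain {chain:.2f}")
```

With 100,000 distinct keys, 1,024 buckets cost little memory but produce chains of roughly a hundred entries, whereas the suggested count keeps chains close to one entry at the price of a larger pointer array.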
Memory Considerations
Scenario | Symptom
Inserting more rows than can fit in memory | Transactions start failing
Recovering database that does not fit in memory | Database does not come online
Memory pressure from “Hekaton” workload on other workloads | Operations in other workloads start failing
Diagnosis
• Read error log
• Identify via DMVs, SSMS whether “Hekaton” is using most memory
Solution
• Free up memory
• Add memory
• Identify and stop long running transactions
Durability Options
• SCHEMA_ONLY (non-durable table)
  • When SQL Server is restarted, the non-durable table is recreated, but starts with no data.
  • Avoids both transaction logging and checkpoint, which can significantly reduce IO operations.
• SCHEMA_AND_DATA (durable table)
  • The data is persisted in the memory-optimized filegroup (a filestream filegroup).
  • It can hold multiple containers.
Containers
• Root File
  • Contains metadata and the description of the other files
• Data File
  • The rows are appended in transaction log order
  • A given data file contains the rows of transactions whose end timestamps fall within a given range.
• Delta File
  • Contains references to data rows that were deleted. For each deleted row, it stores a minimal record {inserting_tx_id, row_id, deleting_tx_id}
  • Each data file has a corresponding delta file.
Populating Data and Delta File
• Data and delta files are populated by a background thread called the offline checkpoint thread, which reads the log records.
[Figure: Memory Optimized Data Filegroup with data files for timestamp ranges 100-199, 200-299, 300-399, 400-499 and 500-. The offline checkpoint thread reads log records, inserts into the newest data file, and records deletes (Del 150 TS, Del 250 TS, Del 420 TS) in the delta files of the matching ranges.]
A transaction with a commit timestamp of 600 inserts one new row and deletes rows inserted by transactions with commit timestamps of 150, 250 and 420.
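The routing in this example can be sketched in a few lines. This is an illustrative model, not the on-disk format: `FilePair`, `apply_transaction` and `live_rows` are hypothetical names, and each file pair is identified simply by the timestamp range it covers.

```python
from dataclasses import dataclass, field


@dataclass
class FilePair:
    low: int
    high: float                                 # float("inf") for the open, newest file
    data: list = field(default_factory=list)    # (insert_ts, row_id, payload)
    delta: list = field(default_factory=list)   # (insert_ts, row_id, delete_ts)

    def covers(self, ts):
        return self.low <= ts <= self.high


def find_pair(pairs, ts):
    """File pair whose timestamp range contains ts."""
    return next(p for p in pairs if p.covers(ts))


def apply_transaction(pairs, commit_ts, inserts, deletes):
    """Append inserts to the data file covering commit_ts; record each
    delete in the delta file paired with the data file holding the row."""
    target = find_pair(pairs, commit_ts)
    for row_id, payload in inserts:
        target.data.append((commit_ts, row_id, payload))
    for insert_ts, row_id in deletes:
        find_pair(pairs, insert_ts).delta.append((insert_ts, row_id, commit_ts))


def live_rows(pairs):
    """Rows that would be loaded at recovery: data rows minus delta entries."""
    deleted = {(ts, row_id) for p in pairs for ts, row_id, _ in p.delta}
    return [row for p in pairs for row in p.data if (row[0], row[1]) not in deleted]
```

Data files are only ever appended to; a delete never rewrites the data file, it only adds a small entry to the matching delta file.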
Merge Operation
[Figure: Memory Optimized Data Filegroup before and after a merge. Key: data file with rows generated in timestamp range a-b; delta file with IDs of deleted rows.]
• Files as of Time 500: Range 100-199, Range 200-299, Range 300-399, Range 400-499
• Merge 200-399: the files for Range 200-299 and Range 300-399 are merged
• Files as of Time 600: Range 100-199, Range 200-399, Range 400-499, Range 500-599
• Deleted Files: Range 200-299, Range 300-399
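A merge of two adjacent file pairs can be sketched as follows. The sketch assumes that the merged data file keeps only rows not referenced by either delta file and starts with an empty delta file; the slide does not spell this out, and `merge_file_pairs` and the dictionary layout are illustrative.

```python
def merge_file_pairs(first, second):
    """Merge two adjacent file pairs into one covering both ranges.

    Each pair is a dict with:
      'range': (low_ts, high_ts)
      'data':  list of (insert_ts, row_id, payload)
      'delta': list of (insert_ts, row_id, delete_ts)
    """
    if first["range"][1] + 1 != second["range"][0]:
        raise ValueError("only adjacent timestamp ranges can be merged")
    deleted = {(ts, row_id) for ts, row_id, _ in first["delta"] + second["delta"]}
    surviving = [row for row in first["data"] + second["data"]
                 if (row[0], row[1]) not in deleted]
    return {
        "range": (first["range"][0], second["range"][1]),
        "data": surviving,   # deleted rows are dropped during the merge
        "delta": [],         # assumed: the merged pair starts with no deletes
    }


pair_200 = {"range": (200, 299),
            "data": [(210, 1, "a"), (250, 2, "b")],
            "delta": [(250, 2, 600)]}
pair_300 = {"range": (300, 399),
            "data": [(310, 4, "d")],
            "delta": []}
merged = merge_file_pairs(pair_200, pair_300)
```

After the merge the two source pairs correspond to the "Deleted Files" in the figure, and `merged` plays the role of Range 200-399.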
“Hekaton” Storage Considerations
Data
• Capacity needed is 2-3 x size of durable memory optimized tables
• Use sequential bandwidth sufficient to meet RTO
• Spinning media
Log
• Latency is important
• SSDs
• Per-transaction log consumption is less than for disk-based tables
New DMVs for In-Memory OLTP
DMV | Description
sys.dm_db_xtp_checkpoint | Returns database that has one or more IM objects
sys.dm_db_xtp_checkpoint_files | Displays information about checkpoint files
sys.dm_db_xtp_hash_index_stats | Useful for understanding and tuning the bucket counts and duplicates for index key
sys.dm_db_xtp_index_stats | Reports statistics about scans on an index
sys.dm_db_xtp_memory_consumers | Reports the database-level memory consumers in the IM database engine
sys.dm_db_xtp_object_stats | Reports statistics about operations on a memory optimized object
sys.dm_db_xtp_table_memory_stats | Returns memory usage statistics for each IM table (user and system) in the current database
sys.dm_db_xtp_transactions | Reports the active transactions in the IM database engine
sys.dm_xtp_consumer_memory_usage | Reports memory usage for all memory consumers, at database level and at system level
sys.dm_xtp_gc_stats | Reports information about the current behavior of the IM garbage-collection process
sys.dm_xtp_system_memory_consumers | Reports information about memory usage
sys.dm_xtp_threads | Reports the threads that the IM database engine has started internally
sys.dm_xtp_transaction_stats | Reports statistics about transactions that have run since the server started
New and Updated Properties
New or updated property, system view, stored procedure, or DMV | Change
OBJECTPROPERTYEX | New property: TableIsMemoryOptimized
SERVERPROPERTY | New property: IsXTPSupported
sys.data_spaces | The following columns display additional values: type and type_desc
sys.indexes | The following columns display additional values: type and type_desc
sys.parameters | New column: is_nullable
sys.all_sql_modules | New column: uses_native_compilation
sys.sql_modules | New column: uses_native_compilation
sys.table_types | New column: is_memory_optimized
sys.tables | New columns: durability, durability_desc, and is_memory_optimized
sys.hash_indexes | New: shows the current hash indexes and the hash index properties
sp_xtp_merge_checkpoint_files | New stored procedure: merges all data and delta files in the transaction range specified
• There are also 13 new wait types in sys.dm_os_wait_stats
• New Extended Events under category xtp
AMR
• Analysis, Migrate and Report Tool
• Configure Management Data Warehouse,
• Configure Data Collection, and
• Run AMR Reports to identify performance hotspots
• Included in CTP1
[Figure: AMR workflow]
1. BEGIN
2. Is MDW set up? If not: configure Management Data Warehouse
3. Configure Data Collection
4. Establish system performance baseline; run workload
5. Run AMR Reports
6. Migrate
7. Run workload and collect performance metrics
8. Compare to baseline and set as new baseline
9. COMPLETE
AMR Data Collection
AMR Report
• Table Analysis
  • Usage Analysis
  • Contention Analysis
• Stored Procedure Analysis
  • Usage Analysis
AMR Report
• Table Analysis
  • Gain expected
Clustered Column Store Index
• Fast execution for data warehouse queries
  • Speedups of 10x and more
• No need for separate base table
  • Save space
• Data can be inserted, updated or deleted
• Eliminate need for other indexes
• More data types supported
• Removes many limitations from non-clustered columnstores in SQL 2012
[Figure: bar chart of space used in GB for a 101 million row table (table + index space); axis from 0.0 to 20.0.]
Structure of a CCI Partition
[Figure: a partition made of a Column Store (CS), a Deleted Bitmap and a Row Store (RS).]
• CREATE CLUSTERED COLUMNSTORE: organizes and compresses data into CS
• BULK INSERT: creates new CS row groups
• INSERT: rows are placed in the RS (heap)
  • When RS is big enough, a new CS row group is created
• DELETE: rows are marked in the Deleted Bitmap
• UPDATE: delete plus insert
• Most data is in CS format
Not intended for OLTP applications, but great for read-mostly data warehouses!
Commands
• CREATE TABLE <table> ( … )
• CREATE CLUSTERED COLUMNSTORE INDEX <name> ON <table>
  • Converts entire table to CS format
  • Take care of memory needed and parallelism (MAXDOP 1)
• BULK INSERT, SELECT INTO <name> on <table>
  • Creates new CS row groups
• INSERT/UPDATE
  • Stored in Row Store
• Tuple Mover
  • When RS reaches 1M rows, it is converted to a CS row group
  • Runs every 5 minutes by default
  • Started explicitly by ALTER INDEX <name> ON <table> REORGANIZE
• Partitioning works on clustered columnstores
  • Just like any other table
  • The motivation is manageability more than performance
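How the row store, column store and deleted bitmap interact can be shown with a toy model. This is a simplified sketch: the class `CciPartition` and its methods are hypothetical, rows are plain dictionaries, compression is ignored, and the row group size is a parameter so the behavior is visible with a handful of rows (the slide quotes 1M rows).

```python
class CciPartition:
    def __init__(self, row_group_size=1_000_000):
        self.row_group_size = row_group_size
        self.row_store = []      # RS: recently inserted rows, kept row-wise
        self.column_store = []   # CS: row groups, each a dict column -> values
        self.deleted = set()     # Deleted Bitmap: (row group index, position)

    @staticmethod
    def _to_columns(rows):
        """Pivot a list of row dicts into one list of values per column."""
        return {col: [r[col] for r in rows] for col in rows[0]}

    def insert(self, row):
        self.row_store.append(row)

    def bulk_insert(self, rows):
        self.column_store.append(self._to_columns(rows))

    def tuple_mover(self):
        """Convert full chunks of the row store into CS row groups."""
        moved = 0
        n = self.row_group_size
        while len(self.row_store) >= n:
            chunk, self.row_store = self.row_store[:n], self.row_store[n:]
            self.column_store.append(self._to_columns(chunk))
            moved += 1
        return moved

    def delete_from_cs(self, group_index, position):
        self.deleted.add((group_index, position))   # mark only, no rewrite

    def update_in_cs(self, group_index, position, new_row):
        self.delete_from_cs(group_index, position)  # UPDATE = delete ...
        self.insert(new_row)                        # ... plus insert

    def scan(self):
        """Yield all live rows: CS rows not in the bitmap, then RS rows."""
        for gi, group in enumerate(self.column_store):
            size = len(next(iter(group.values())))
            for pos in range(size):
                if (gi, pos) not in self.deleted:
                    yield {col: values[pos] for col, values in group.items()}
        yield from self.row_store
```

The model makes the read-mostly recommendation visible: every update leaves a dead entry in a compressed row group and pushes a new row into the uncompressed row store.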
Batch Mode Improvements
• SQL Server 2012
  • There are several engine limitations that can cause queries to run in row mode instead of batch mode
• SQL Server 14
  • Support for all flavors of JOIN
    • OUTER JOIN
    • Semi-join: IN, NOT IN
  • UNION ALL
  • Scalar aggregates
  • Mixed mode plans
  • Improvements in bitmaps, spill support, …
Links
• SQL Server 2014 Hastens Transaction Processing
  http://www.cio.com/article/734462/SQL_Server_2014_Hastens_Transaction_Processing
• Hekaton Breaks Through
  http://research.microsoft.com/en-us/news/features/hekaton-122012.aspx
• Hekaton: SQL Server’s Memory-Optimized OLTP Engine
  http://research.microsoft.com/apps/pubs/default.aspx?id=193594
© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
GUSS.fr