BlackRock Achieves Performance and Scale Using SAP ASE
Prasad Pandhigunta, BlackRock
Ashok Swaminathan, SAP
March 1, 2016
© 2015 SAP SE or an SAP affiliate company. All rights reserved. Internal
AGENDA
• Achieving performance and scalability using SAP ASE at BlackRock
• Overview of SAP ASE 16 SP02 Performance Features
SAP ASE 15.7 - Performance and Scale
Prasad Pandhigunta
March 1, 2016
FOR SAP WEBCAST ATTENDEES ONLY
BlackRock
BlackRock – World’s largest Asset Manager
• 135+ Investment Teams
• 7,700+ Portfolios Managed
• $4.6 Trillion Assets Under Management
• 12,000+ Employees around the world
Aladdin® - Asset Management Platform
• Offered as a hosted service to other investment managers
• http://www.blackrock.com/aladdin/offerings/aladdin-overview
Aladdin® is used to manage or oversee $17 Trillion, including $4.6 Trillion of BlackRock client assets
SAP ASE - Operating Environment
SAP ASE is central to BlackRock’s Aladdin® Platform
250 Production ASE Servers, 1000 Replication Servers
Almost all the servers are running on ASE 15.7 SP63+++
ASE configured with up to 80 Engines and 1.5 TB of memory
Total Aladdin DB disk allocation: 500 TB
Maximum total database size for an ASE: 25 TB
Largest individual database: 9 TB
3 Trillion Logical Reads per day
Over 11,000 SQL statements per second on our busiest server
Security, Scale and Simplicity
3 S
ASE 12.5.4 to 15.7 – Compelling Reasons To Upgrade
1. Scalability Limitations
• CPU utilization spikes with no clear correlation to a root cause
• Low Logical IOs observed during CPU spikes
• Increasing engines was not helpful
• No understanding of why and where the bottleneck was in scaling up
2. Data server Size Limitations
• 8 TB Max Limit
3. Query Plan Stability
• Limited options available to influence query plans for runtime problems
4. Engine Group/Execution Class
• Lack of “DEFAULT CLASS” prevented strict ring fencing
ASE 15.7 - Lessons Learnt Post Upgrade
1. Abstract Plan Injection
2. Update Stats – Changing densities
3. Logical IOs to CPU Ratio Analysis
4. Statement Cache Settings
5. Procedure Cache Settings
6. Data Cache Tuning
7. DES Bind
8. Engine Groups
9. VLDB – Very Large Databases
1. Abstract Plan Injection
Problem: Non-optimal plan
Solution: Abstract Plan Injection
1. sp_configure 'abstract plan load', 1
2. Use different optimizer settings to get the right plan and inject the plan as shown
3. You can also append the optimal plan to the query and inject the plan
4. Hashkey of the runtime query should match the hashkey in sysqueryplans
5. The plan injection login id and db context should match the runtime user login id and db context
6. Auto literal parametrization rules apply
Abstract Plan Injection
Select * from sysqueryplans

Query text:
select distinct v1.wkflw_item_id
into #MemoField_wi
from wkflwdb.dbo.wkflw_attribute_value as v0,
     wkflwdb.dbo.wkflw_attribute_value as v1
where v0.data_dictionary_id = @@@V0_INT
  and v0.wkflw_attribute_value_id = @@@V1_INT
  and v0.value_varchar IN (@@@V2_VCHAR1, @@@V3_VCHAR1, @@@V4_VCHAR1, @@@V5_VCHAR1, @@@V6_VCHAR1, @@@V7_VCHAR1, @@@V8_VCHAR1)
  and v1.data_dictionary_id = @@@V9_INT
  and v1.wkflw_attribute_value_id = @@@V10_INT
  and v1.value_varchar IN (@@@V11_VCHAR1)
  and v0.wkflw_item_id = v1.wkflw_item_id
order by wkflw_item_id asc

Abstract plan:
( insert
  ( distinct_sorted
    ( nl_join
      ( i_scan idx_wkflw_attribute_value3 ( table ( v1 [wkflwdb.dbo.wkflw_attribute_value] ) ) )
      ( i_scan idx_wkflw_attribute_value3 ( table ( v0 [wkflwdb.dbo.wkflw_attribute_value] ) ) )
    )
  )
)
( prop ( table ( v1 [wkflwdb.dbo.wkflw_attribute_value] ) ) ( parallel 1 ) ( prefetch 2 ) ( lru ) )
( prop ( table ( v0 [wkflwdb.dbo.wkflw_attribute_value] ) ) ( parallel 1 ) ( prefetch 16 ) ( lru ) )
2. Update Statistics - Changing densities
Problem: Non-optimal query plan – Non-optimal join order
Solution: Manipulate Densities
1. More than anything in statistics, densities play the most important role in generating an optimal plan
   Statistics for column: "fund"
   Range cell density: 0.003093883935182
   Total density: 0.024056913728555
2. The smaller the density values, the more selective the index is
3. sp_modifystats trade, fund, MODIFY_DENSITY, total, factor, "0.001"
4. Best way to manage skew in the data
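A rough way to see why scaling a density changes the plan: in a simplified model, the row estimate for an equality match on a column is the density multiplied by the table's row count, so a smaller density makes the index look more selective. The sketch below applies that model to the total density shown for "fund". The row count of 1,000,000 is a hypothetical value, and treating "factor" as a multiplier on the stored density is an assumption about sp_modifystats that should be checked against the ASE documentation.

```python
# Simplified model, not the ASE optimizer itself.
TOTAL_DENSITY = 0.024056913728555   # "Total density" for column "fund" on the slide
TABLE_ROWS = 1_000_000              # hypothetical row count, for illustration only


def estimated_rows(total_rows, density):
    """Rows expected to match a single value under the simplified model."""
    return total_rows * density


def apply_factor(density, factor):
    """Assumed effect of MODIFY_DENSITY ... factor: scale the stored density."""
    return density * factor


before = estimated_rows(TABLE_ROWS, TOTAL_DENSITY)
after = estimated_rows(TABLE_ROWS, apply_factor(TOTAL_DENSITY, 0.001))
print(f"estimated rows before: {before:.0f}, after: {after:.0f}")
```

Under these assumptions the estimate drops by three orders of magnitude, which is the kind of shift that can change the join order the optimizer picks.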
3. Logical IOs to CPU Ratio
Problem: High CPU utilization but low LogicalReads
Solution: Check the Logical IOs to CPU ratio
1. This is the miles-per-gallon measure for the DBA
2. Query Elapsed Time = CPUTime + WaitTime
3. CPUTime = Parse and Compile Time + Time To Perform LIOs/Sorts etc. + CPU cycles spent spinning for spinlocks
4. monProcessActivity – Ratio of LogicalReads to CPUTime
5. Identify any changes in the LIO to CPUTime ratio for your application
6. If the ratio has decreased significantly, it indicates one of two things
   – High Spinlock Contention or High Parse and Compile time
7. Look for high Spin Counts in monSpinlockActivity
   • “Contention” does not tell the full story. Use Spins while looking for Spinlock contention
8. If there is no clear-cut spinlock contention, look for queries taking too much time to parse and compile
   • Large queries with too many tables and joins
   • Statement Cache/Prepared Statements are not being used effectively (ExecutionCount of LWPs is very low)
   • High create/drop temp table statements can also result in a low Logical IOs to CPU ratio
Logical IOs to CPU Ratio
program_name      count(*)            LIO        PIO        cpu  LIORatio
DecoStatementCal        69  1,323,728,571    330,033  5,055,900       261
binary_preserver       191    244,632,068    442,591  4,578,300        53
irv_server             459     79,816,387     14,571  3,642,400        21
index_copy.pl        1,485     35,767,223     24,857  2,640,400        13
CashExceptionSer       511      8,445,364     81,522  2,047,900         4
ops_mng_reports.        32    625,564,517  1,339,311  1,975,200       316
IndexValidations        33    335,114,722          0  1,695,400       197
SLServer                92    697,156,579  5,448,493  1,644,200       424
generate_adx_ext       122    152,953,052     10,788  1,447,200       105
fGP                    256    228,649,043  2,958,052  1,217,900       187
write_barra_exp         28     46,474,605      1,167  1,024,900        45
publishDoubleDip         7    217,229,469    913,336    826,300       262
gen_menu.pl             65    341,956,496     51,927    819,800       417
eod_cred_dig_che         8    237,230,606    674,441    816,200       290
missing_factor_c        13    148,446,955    185,088    777,600       190
buysell_px.pl           15    193,496,991     10,105    695,400       278
EjvPriceLoad.pl         10     37,153,816      4,696    683,600        54
gen_pni                 13    226,815,157    616,467    681,700       332
dtcc_soi_file_ge         9    291,024,845     31,737    631,300       460
The LIO to CPU ratio for the highlighted applications is low. Generally the root cause is parse and compile time, spinlock contention, or a very high number of create/drop temp tables.
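The LIORatio column above is consistent with LIO divided by cpu, truncated to an integer. A minimal sketch of that calculation follows, using a handful of rows copied from the table. The threshold of 100 for "low" is an assumption chosen for illustration; the slides do not state a cutoff, and a sensible value depends on the application's own baseline.

```python
# (program_name, logical_reads, cpu_time) copied from the table above
ROWS = [
    ("DecoStatementCal", 1_323_728_571, 5_055_900),
    ("binary_preserver", 244_632_068, 4_578_300),
    ("irv_server", 79_816_387, 3_642_400),
    ("index_copy.pl", 35_767_223, 2_640_400),
    ("CashExceptionSer", 8_445_364, 2_047_900),
    ("SLServer", 697_156_579, 1_644_200),
]


def lio_ratio(logical_reads, cpu_time):
    """Logical reads per unit of CPU time, truncated; 0 when no CPU was used."""
    return logical_reads // cpu_time if cpu_time else 0


def low_ratio_programs(rows, threshold):
    """Return (name, ratio) for programs whose ratio falls below the threshold."""
    ratios = [(name, lio_ratio(lio, cpu)) for name, lio, cpu in rows]
    return [(name, ratio) for name, ratio in ratios if ratio < threshold]


LOW = low_ratio_programs(ROWS, threshold=100)  # threshold is an assumed cutoff
for name, ratio in LOW:
    print(f"{name}: {ratio}")
```

Programs flagged this way are candidates for the follow-up checks listed earlier: spin counts in monSpinlockActivity, parse and compile time, and temp table churn.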
4. Statement Cache
Problem: High Parse and Compile Time, High spinlock contention on system tables
Solution: Statement Cache with auto literal parametrization
1. Too large a statement cache increases hash chain length, which in turn increases spinlock contention (SSQLCACHE_SPIN)
2. Too much churn in the statement cache when temp tables are involved and there are no persistent connections
   – Use trace flag (TF) 467 to prevent caching of statements with temp tables
3. Cached Statements – Low reuse count
   – Literals in the select clause do not parametrize (select '123' as col1 from mytable)
   – Literals in sub-queries do not parametrize (select * from myTable where id = (select max(id) from myTable2 where name = 'ABC'))
   – Literals in a LIKE clause do not parametrize (select * from myTables where col1 like 'ABC%')
4. Total of 6 grabs on the spinlock per execution of a single cached statement
   – Convert high frequency queries to prepared statements or stored procedures to reduce SSQLCACHE_SPIN
Statement Cache
• SSQLCACHE_SPIN contention: high frequency statements are better served as prepared statements or stored procedures than as cached statements
5. Procedure Cache
Problem: Persistent 702 errors and Server Stability Problems
Solution: Right Sizing of Procedure Cache
1. Make procedure cache big enough
2. Huge SQL statement batches need a large allocation of procedure cache
   – Batch Inserts via perl
   – Can cause server stability issues if there is a sudden surge
3. Watch for proliferation of LWPs from prepared statements
   – Typically problems with OCS version or programming errors
   – Same hashkey but many copies of LWP
     -- count LWP copies per hashkey; many copies of one hashkey indicate proliferation
     select substring(ObjectName, 14, 15), count(*)
     from master..monCachedProcedure
     where ObjectName like '*s[qh]%'
     group by substring(ObjectName, 14, 15)
     order by 2
4. Procedure Cache utilization keeps growing and does not reduce even after purging
   – Typically OCS problems
6. Data Cache Tuning
Problem: High data cache spinlock contention
Solution: 21 named caches + default data cache
1. Static data/small tables with high logical IOs are bound to a cache with ‘relaxed LRU’
   – All the data fits into the cache (e.g. 10 GB table data bound to 11 GB cache)
2. DOL tables in certain queries can increase spinlock contention abnormally
   – One spinlock per nested loop iteration
   – Had to convert a small set of tables back to APL
3. Cache pools that are too small and too many partitions lead to problems
   • The 4K memory pool of named cache tempdb_cache (cache id 9, cachelet id 14) is configured too small for current demands
4. 2K vs 16K buffer pool configuration is critical
   – Our default is a 2:1 ratio, but monitor monCachePool to avoid too many physical IOs in a specific pool
7. DES Bind
Problem: Very high IDES Chain Spinlock contention
Solution: 60 tables are des_bound using dbcc tune
1. CPU Utilization was very high with no proportional increase in Logical IOs
   • IDES chain spinlocks were dominant and the root cause of contention
2. Typically use “used_count” in monOpenObjectActivity while looking for candidates to des_bind
3. Looking forward to ASE 16’s keep count enhancements to eliminate or reduce some of the spinlocks like IDES chain
8. Engine Groups/Execution Classes
Problem: Persistent workload balancing issues
Solution: Define Engine Groups and bind to execution class
1. Each engine is bound only to one engine group
2. Network Engines handle network traffic for all SPIDs
3. Critical Applications are bound to a specific engine group
4. DEFAULT_CLASS handles all the applications NOT bound
5. sp_bindexeclass NULL, 'DF', NULL, 'DEFAULT_CLASS'

classname       priority  engine_group     engines
NETWORK_CLASS   HIGH      NETWORK_ENGINES  0,1,2,3,4
ADMIN_CLASS     HIGH      ADMIN_ENGINES    5,6,7
GRP1_CLASS      HIGH      GRP1_ENGINES     8,9,10,11,12,13,14
GRP2_CLASS      HIGH      GRP2_ENGINES     15,16,17,18,19,20,21
GRP3_CLASS      HIGH      GRP3_ENGINES     22,23,24,25,26,27,28
GRP4_CLASS      HIGH      GRP4_ENGINES     29,30,31,32,33,34,35
GRP5_CLASS      HIGH      GRP5_ENGINES     36,37,38,39,40,41,42
GRP6_CLASS      HIGH      GRP6_ENGINES     43,44,45,46,47,48,49
DEFAULT_CLASS   HIGH      DEFAULT_ENGINES  50,51,52,53,54,55,56,58,59,60,61,62,63
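The rule that each engine belongs to exactly one engine group can be checked mechanically. The sketch below copies the engine lists from the table above and reports any engine claimed by two groups, plus any engine number in the range that no group claims. The helper names are hypothetical; this is an offline sanity check of the layout, not an ASE API.

```python
# Engine lists copied from the table above.
ENGINE_GROUPS = {
    "NETWORK_ENGINES": [0, 1, 2, 3, 4],
    "ADMIN_ENGINES": [5, 6, 7],
    "GRP1_ENGINES": list(range(8, 15)),
    "GRP2_ENGINES": list(range(15, 22)),
    "GRP3_ENGINES": list(range(22, 29)),
    "GRP4_ENGINES": list(range(29, 36)),
    "GRP5_ENGINES": list(range(36, 43)),
    "GRP6_ENGINES": list(range(43, 50)),
    "DEFAULT_ENGINES": [50, 51, 52, 53, 54, 55, 56, 58, 59, 60, 61, 62, 63],
}


def find_overlaps(groups):
    """Return (engine, first_group, second_group) for engines in two groups."""
    owner = {}
    overlaps = []
    for group, engines in groups.items():
        for engine in engines:
            if engine in owner:
                overlaps.append((engine, owner[engine], group))
            else:
                owner[engine] = group
    return overlaps


def unassigned_engines(groups):
    """Engine numbers from 0 to the highest assigned one that no group claims."""
    assigned = {engine for engines in groups.values() for engine in engines}
    return sorted(set(range(max(assigned) + 1)) - assigned)


print("overlaps:", find_overlaps(ENGINE_GROUPS))
print("unassigned:", unassigned_engines(ENGINE_GROUPS))
```

For the layout as printed, no engine appears in two groups, and engine 57 is absent from every list. The slides do not say whether that gap is intentional.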
8. Engine Groups/Execution Classes
sp_listener 'status' – Just 4 engines are active

proto  host    port  engine  status
tcp    dbhost  4500  0       active   (Network Engines)
tcp    dbhost  4500  1       active   (Network Engines)
tcp    dbhost  4500  2       active   (Network Engines)
tcp    dbhost  4500  3       active   (Network Engines)
tcp    dbhost  4500  4       stopped
tcp    dbhost  4500  5       stopped
tcp    dbhost  4500  6       stopped
9. Very Large Databases (VLDB)
Problem: Current maximum total database size for an ASE is 25 TB, with a maximum single database size of 9 TB
Solution: VLDB implementation
1. Semantic Partitioning
   • Range partition
   • Mostly local indexes
   • Maintenance window needed reduced from over 48 hours to just a few hours
     – Update stats performed on partitions where data_change() > 0
   • Great boost in performance – Most queries target just the last partition
   • Replication latency reduced, attributed to the reduced depth of the local indexes
2. Migration to ASE with 16K page size allows for a maximum DB size of 64 TB
Security, Scale and Simplicity
Tomorrow’s story!
Disclosure
No part of this material may be reproduced or transmitted in any form or by any means, electronic, mechanical, recording or otherwise, without the prior written consent of BlackRock. © 2016 BlackRock, Inc. All rights reserved. BLACKROCK, BLACKROCK SOLUTIONS, and iSHARES are registered trademarks of BlackRock, Inc. or its subsidiaries. All other trademarks are the property of their respective owners.
SAP ASE 16 SP02 Performance Features
Ashok Swaminathan (Senior Director, Product Management, SAP)
March 2016
Disclaimer
This presentation outlines our general product direction and should not be relied on in making a purchase decision. This presentation is not subject to your license agreement or any other agreement with SAP. SAP has no obligation to pursue any course of business outlined in this presentation or to develop or release any functionality mentioned in this presentation. This presentation and SAP's strategy and possible future developments are subject to change and may be changed by SAP at any time for any reason without notice. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or grossly negligent.
Overview of SAP ASE 16 SP02 Performance Features
SAP ASE Investments In High Performance
• Investments starting from ASE 15.7
• Early vendor to introduce In-Memory technology
• Continued focus with each release

Version 15.7
Features:
• ASE In-memory Database (IMDB) - entire database in memory, but without support for durability of transactions
• Typical usage: IMDB as temp DB or use of IMDB for fast computations without the need to persist changes
Benefits:
• Performance increase by 2X (up to 10x in some customer cases)

Version 16.0
Features:
• Run Time Logging Enhancements (Buffer Unpinning)
• Metadata Management Enhancements
• Lock Management Enhancements
Benefits:
• Allowed scale up beyond 32 cores (up to 80 cores), near linear scaling
• Achieved over 1 million/minute business transactions

Version 16 SP02
Features:
• Compiled Queries
• Transactional Memory
• Lockless Data cache
• Latchfree Btree
• Non Volatile Cache Management
Benefits:
• Latency and Throughput benefits
ASE 16 – Start Of High Performance Focus
[Figure: Scale-up Performance. Transactions per minute (axis 0 to 1,200,000) plotted against Engines (16, 32, 48, 64, 80), annotated "> 1 million tpm".]
Environment: SUSE Linux Enterprise Server 11
Platform: 8 sockets/80 cores Intel® Xeon® [email protected] with 1TB RAM
Feature: Description

Scale up on large SMP systems
• Minimize locking/latch contention in highly concurrent environments
• Enhancements in the area of run time logging optimization, metadata management, lock management

Partition level Locking
• Granular locking for partitioned tables
• Enables concurrent DDL and DML operations on a table

Dynamic Thread Assignment
• Allows executing query plans in parallel with fewer resources/threads
• If # of threads < work units, threads that complete their work execute the remaining tasks

Index Compression
• Index compression enabling storage savings for large indexes
• Completes current capabilities - data compression (row/page), LOB, backup

Optimized Star Join Queries
• Allows hints in the join syntax
• Can use the syntax “plan (use fact_table tablename)” – query processing optimizations related to star join
SAP ASE 16 SP02 – High Performance Features
[Figure: ASE architecture diagram with a legend (Available, New In SP02, Upcoming, To Be Developed). Components shown: T/SQL, JDBC, ODBC, Scripts, XOLTP QP, Parallel Dynamic Threading, Code Compile, MVCC, In-memory Row Store, Main Memory, Lockless Metadata, Transaction Memory, Kernel Threading, Buffer Cache, Latch Free BTree, Lockless Data Cache, Disk, Flash Disk.]
Simplified Native Access Plans (Compiled Queries)
• Compiled query plans – faster execution
• Transparent to applications and users

Latchfree B-Tree on Indexes
• Reduces contention
• Increases concurrency and performance

Lockless Data Cache
• Decreases cache contention
• Increases concurrency and performance

Transactional Memory
• Minimize contention leveraging hardware for identifying memory conflicts

Non Volatile Cache Management
• Leverage SSD for storing frequently accessed/updated pages
Simplified Native Access Plan (SNAP) - Overview
• A new query plan execution feature
• Avoids repetitive query exec code generation from the physical operators output from query optimization
  – Works on
    o Cached SQL Statements in statement cache
    o Statements within stored procedures
    o Fully prepared SQL statements
• Available on x86-64/amd64 Linux
• Transparent to the user
• Enabled by configuration parameter; ASE uses SNAP automatically when possible.
• Unsupported plans continue to use the Lava execution engine.
[Figure: SNAP query processing flow. Stages shown: TDS LANG request (select * from table where due_dt =getdate() and recv_date is null), Receive Buffer, SQL Parsing, Normalization, Pre-Processing, Query Optimization, Query Compilation producing a Native Access Plan (Query Exec Code), Query Execution. The figure also shows two intermediate representations of the statement: SELECT {column list} FROM table, COND1 due_dt <=getdate(), COND2 (AND) recv_date is null; and SELECT {column id’s & datatypes} FROM objid=123456, COND1 col_id=3 (dt) >= (dt) ‘Jan 1 2015’, COND2 (AND) col_id=4 (dt) IS NULL.]
Latch Free BTree
• A latch is used to ensure physical consistency of a page between DMLs and queries.
• Minimizes contention where SH_LATCH and EX_LATCH block each other when multiple threads need to synchronize while modifying/reading index pages
• Logical lock semantics not changed
[Figure: a query (Select * from customer where money > 5000) with SH_LATCH and a DML (Insert into customer values(6000)) with EX_LATCH on the same page; the two BLOCK each other.]
Lockless Data Cache
Cache Contention
Access to pages in named caches is through hash tables. Spinlocks protect the buckets in a named cache’s hash table.

Contention on the cache spinlock can be resolved by
• Moving object(s) to separate named caches, or
• Partitioning the cache

For example, lockless data cache can help when there is significant contention on a single cache partition but not significant contention on other cache partitions

This means fewer waits and spins on spinlocks, and thus lower contention. In turn, lower CPU utilization, higher performance, and improved scalability
Transactional Memory
Objective
• Increase concurrency (and minimize contention) by leveraging hardware for tracking memory conflicts as a result of concurrent execution of code

Background
– Spinlock (used to protect certain data structures) has been fragmented and distributed in the ASE code
– Contention is on the lock; many times, not on the actual data which we want to protect

Solution
• Allow multiple threads to modify data structures, unless there is a conflict at memory level
• Newer hardware from Intel (Haswell-EX/Xeon E7 V3 processor) and IBM (Power) keeps track of memory reads/writes and memory contention
• By allowing concurrent processes to modify memory, and undoing the offending process/thread when there is a memory conflict, concurrency is increased
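The "proceed without a lock, undo on conflict" idea can be illustrated with a software analogy. The sketch below is not hardware transactional memory and not how ASE implements it: it uses a version counter to detect that another writer committed in between, whereas the processors named above track conflicting memory accesses in hardware. It is a single-threaded simulation, and the class and function names are hypothetical.

```python
class VersionedCell:
    """A value plus a version counter that is bumped on every commit."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def try_commit(self, read_version, new_value):
        """Commit only if nobody else committed since read_version."""
        if self.version != read_version:
            return False  # conflict detected: the caller discards its work and retries
        self.value = new_value
        self.version += 1
        return True


def optimistic_add(cell, delta, interfere=None):
    """Add delta without taking a lock; returns the number of attempts needed."""
    attempts = 0
    while True:
        attempts += 1
        seen_version, seen_value = cell.version, cell.value
        if interfere is not None and attempts == 1:
            interfere(cell)  # simulate a concurrent writer during the first attempt
        if cell.try_commit(seen_version, seen_value + delta):
            return attempts


cell = VersionedCell(10)
# A simulated concurrent writer adds 1 while the first attempt is in flight.
tries = optimistic_add(cell, 5, interfere=lambda c: c.try_commit(c.version, c.value + 1))
print(tries, cell.value)
```

When no other writer interferes, the update commits on the first attempt with no lock taken, which is the case the Background bullets describe: most contention is on the lock itself rather than on the data being protected.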
Non-Volatile Cache Management
[Figure: DB ACCESS/STORAGE layer showing the Buffer Manager, the SSD CACHE, and DISK.]
• The SSD cache will act as an intermediate storage tier.
• The Buffer Manager reads pages from disk (cold pages)
• Hot and warm pages that cannot fit in the Buffer Manager will be evicted to the SSD cache.
• Pages which are dirty will also be shifted to the SSD cache. These dirty pages will then be written to the HDD in a delayed manner.
• Frequently read pages will be in SSD
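The read path described above can be modelled as a small three-tier lookup. This is a toy sketch under stated assumptions, not ASE's buffer manager: eviction from RAM is plain LRU, every evicted page goes to the SSD tier, and the delayed write-back of dirty pages is left out. The class name and its methods are hypothetical.

```python
from collections import OrderedDict


class TieredPageCache:
    """Toy model: RAM buffer pool in front of an SSD cache in front of disk."""

    def __init__(self, ram_capacity, disk):
        self.ram = OrderedDict()  # page_id -> data, least recently used first
        self.ssd = {}             # pages evicted from RAM
        self.disk = disk          # backing store: page_id -> data
        self.ram_capacity = ram_capacity

    def read(self, page_id):
        """Return (data, tier), where tier names where the page was found."""
        if page_id in self.ram:
            self.ram.move_to_end(page_id)  # mark as most recently used
            return self.ram[page_id], "ram"
        if page_id in self.ssd:
            data, tier = self.ssd.pop(page_id), "ssd"
        else:
            data, tier = self.disk[page_id], "disk"
        self._admit(page_id, data)
        return data, tier

    def _admit(self, page_id, data):
        """Place a page in RAM, evicting the LRU page to SSD if RAM is full."""
        self.ram[page_id] = data
        if len(self.ram) > self.ram_capacity:
            victim, victim_data = self.ram.popitem(last=False)
            self.ssd[victim] = victim_data  # keep it on SSD instead of dropping it


cache = TieredPageCache(ram_capacity=2, disk={1: "a", 2: "b", 3: "c"})
trace = [cache.read(p)[1] for p in (1, 2, 3, 1, 3)]
print(trace)
```

In this trace the second read of page 1 is served from the SSD tier rather than from disk, which is the benefit the slide attributes to the SSD cache.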
Future Directions – Relating To Performance Features
XOLTP
In-Memory Processing (DRC)
Transactional Memory
Buffer Cache Extension via SSD
Compiled Queries
MVCC
Thank you
Contact information:
Ashok [email protected]
QUESTIONS