BlackRock Achieves Performance and Scale Using SAP ASE

Feb 12, 2017

Page 1: BlackRock Achieves Performance and Scale Using SAP ASE

BlackRock Achieves Performance and Scale Using SAP ASE

Prasad Pandhigunta, BlackRock
Ashok Swaminathan, SAP
March 1, 2016

Page 2: BlackRock Achieves Performance and Scale Using SAP ASE

© 2015 SAP SE or an SAP affiliate company. All rights reserved. Internal

AGENDA

• Achieving performance and scalability using SAP ASE at BlackRock

• Overview of SAP ASE 16 SP02 Performance Features

Page 3: BlackRock Achieves Performance and Scale Using SAP ASE

SAP ASE 15.7 - Performance and Scale

Prasad Pandhigunta

March 1, 2016

FOR SAP WEBCAST ATTENDEES ONLY

Page 4: BlackRock Achieves Performance and Scale Using SAP ASE

BlackRock

BlackRock – World’s largest Asset Manager
• 135+ Investment Teams
• 7,700+ Portfolios Managed
• $4.6 Trillion Assets Under Management
• 12,000+ Employees around the world

Aladdin® - Asset Management Platform
• Offered as a hosted service to other investment managers
• http://www.blackrock.com/aladdin/offerings/aladdin-overview

Aladdin® is used to manage or oversee $17 Trillion, including $4.6 Trillion of BlackRock client assets

Page 5: BlackRock Achieves Performance and Scale Using SAP ASE

SAP ASE - Operating Environment

SAP ASE is central to BlackRock’s Aladdin® Platform

• 250 Production ASE Servers, 1000 Replication Servers
• Almost all the servers are running on ASE 15.7 SP63+++
• ASE configured with up to 80 Engines and 1.5 TB of memory
• Total Aladdin DB disk allocation: 500 TB
• Maximum total database size for an ASE: 25 TB
• Largest individual database: 9 TB
• 3 Trillion Logical Reads per day
• Over 11,000 SQL statements per second on our busiest server

Page 6: BlackRock Achieves Performance and Scale Using SAP ASE

Security, Scale and Simplicity

3 S

Page 7: BlackRock Achieves Performance and Scale Using SAP ASE

ASE 12.5.4 to 15.7 – Compelling Reasons To Upgrade

1. Scalability Limitations
• CPU utilization spikes with no clear correlation to a root cause
• Low Logical IOs observed during CPU spikes
• Increasing engines was not helpful
• No understanding of why and where the bottleneck was in scaling up

2. Data server Size Limitations
• 8 TB Max Limit

3. Query Plan Stability
• Limited options available to influence query plans for runtime problems

4. Engine Group/Execution Class
• Lack of “DEFAULT CLASS” prevented strict ring fencing

Page 8: BlackRock Achieves Performance and Scale Using SAP ASE

ASE 15.7 - Lessons Learnt Post Upgrade

1. Abstract Plan Injection
2. Update Stats – Changing densities
3. Logical IOs to CPU Ratio Analysis
4. Statement Cache Settings
5. Procedure Cache Settings
6. Data Cache Tuning
7. DES Bind
8. Engine Groups
9. VLDB – Very Large Databases

Page 9: BlackRock Achieves Performance and Scale Using SAP ASE

1. Abstract Plan Injection

Problem: Non-optimal plan
Solution: Abstract Plan Injection

1. sp_configure ‘abstract plan load’, 1
2. Use different optimizer settings to get the right plan and inject the plan as shown
3. You can also append the optimal plan to the query and inject the plan
4. Hashkey of the runtime query should match the hashkey in sysqueryplans
5. The plan injection login id and db context should match the runtime user login id and db context
6. Auto literal parametrization rules apply

Page 10: BlackRock Achieves Performance and Scale Using SAP ASE

Abstract Plan Injection

Select * from sysqueryplans

select distinct v1.wkflw_item_id into #MemoField_wi
from wkflwdb.dbo.wkflw_attribute_value as v0, wkflwdb.dbo.wkflw_attribute_value as v1
where v0.data_dictionary_id = @@@V0_INT
and v0.wkflw_attribute_value_id = @@@V1_INT
and v0.value_varchar IN (@@@V2_VCHAR1, @@@V3_VCHAR1, @@@V4_VCHAR1, @@@V5_VCHAR1, @@@V6_VCHAR1, @@@V7_VCHAR1, @@@V8_VCHAR1)
and v1.data_dictionary_id = @@@V9_INT
and v1.wkflw_attribute_value_id = @@@V10_INT
and v1.value_varchar IN (@@@V11_VCHAR1)
and v0.wkflw_item_id = v1.wkflw_item_id
order by wkflw_item_id asc

( insert
  ( distinct_sorted
    ( nl_join
      ( i_scan idx_wkflw_attribute_value3 ( table ( v1 [wkflwdb.dbo.wkflw_attribute_value] ) ) )
      ( i_scan idx_wkflw_attribute_value3 ( table ( v0 [wkflwdb.dbo.wkflw_attribute_value] ) ) )
    )
  )
)
( prop ( table ( v1 [wkflwdb.dbo.wkflw_attribute_value] ) ) ( parallel 1 ) ( prefetch 2 ) ( lru ) )
( prop ( table ( v0 [wkflwdb.dbo.wkflw_attribute_value] ) ) ( parallel 1 ) ( prefetch 16 ) ( lru ) )

Page 11: BlackRock Achieves Performance and Scale Using SAP ASE

2. Update Statistics - Changing densities

Problem: Non-optimal query plan – Non-optimal join order
Solution: Manipulate Densities

1. More than anything in statistics, densities play the most important role in generating an optimal plan
   Statistics for column: "fund"
   Range cell density: 0.003093883935182
   Total density: 0.024056913728555
2. The smaller the density values, the more selective the index is
3. sp_modifystats trade,fund,MODIFY_DENSITY,total,factor,"0.001"
4. Best way to manage skew in the data
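
The link between skew, density and the factor argument of sp_modifystats can be sketched in Python. This is only an illustration, assuming the common definition of total density as the sum of squared value frequencies and a simple "density times row count" row estimate. The sample data, the 1,000,000-row table size and the helper names are invented for the example; only the density value and the 0.001 factor come from the slide.

```python
from collections import Counter

def total_density(values):
    """Sum of squared relative frequencies of the distinct values (assumed definition)."""
    n = len(values)
    return sum((count / n) ** 2 for count in Counter(values).values())

def estimated_rows(density, table_rows):
    """Simple 'density times rows' estimate for an equality match on an unknown value."""
    return density * table_rows

# Hypothetical data: 100 funds with equal row counts versus one dominant fund.
uniform = [f"F{i % 100}" for i in range(10_000)]
skewed = ["BIG"] * 5_000 + [f"F{i % 100}" for i in range(5_000)]

uniform_density = total_density(uniform)   # 100 values, each 1% of the rows
skewed_density = total_density(skewed)     # one value holds 50% of the rows

# Total density quoted on the slide for column "fund".
slide_total_density = 0.024056913728555
# With "factor", the stored density is multiplied by the supplied value.
adjusted_density = slide_total_density * 0.001

TABLE_ROWS = 1_000_000  # assumed table size, not from the slides
print(f"uniform {uniform_density:.4f}  skewed {skewed_density:.4f}")
print(f"estimated rows before: {estimated_rows(slide_total_density, TABLE_ROWS):,.0f}")
print(f"estimated rows after:  {estimated_rows(adjusted_density, TABLE_ROWS):,.0f}")
```

A single dominant value inflates the density well beyond what most lookups experience, which is why lowering it can steer the optimizer back to the selective index.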

Page 12: BlackRock Achieves Performance and Scale Using SAP ASE

3. Logical IOs to CPU Ratio

Problem: High CPU Utilization but low LogicalReads
Solution: Check for Logical IOs to CPU Ratio

1. This is the Miles to the Gallon measure for the DBA
2. Query Elapsed Time = CPUTime + WaitTime
3. CPUTime = Parse and Compile Time + Time To Perform LIOs/Sorts etc. + CPU cycles spent spinning for spinlocks
4. monProcessActivity – Ratio of LogicalReads to CPUTime
5. Identify any changes in LIO to CPUTime ratio for your application
6. A significant decrease in the ratio indicates one of two things
– High Spinlock Contention or High Parse and Compile time
7. Look for high Spin Counts in monSpinlockActivity
• “Contention” does not tell the full story. Use Spins while looking for Spinlock contention
8. If there is no clear-cut spinlock contention, look for queries taking too much time to parse and compile
• Large queries with too many tables and joins
• Statement Cache/Prepared Statements are not being used effectively (ExecutionCount of LWPs is very low)
• High create/drop temp table statements can also result in a low Logical IOs to CPU Ratio

Page 13: BlackRock Achieves Performance and Scale Using SAP ASE

Logical IOs to CPU Ratio

program_name      count(*)            LIO        PIO        cpu  LIORatio
DecoStatementCal        69  1,323,728,571    330,033  5,055,900       261
binary_preserver       191    244,632,068    442,591  4,578,300        53
irv_server             459     79,816,387     14,571  3,642,400        21
index_copy.pl        1,485     35,767,223     24,857  2,640,400        13
CashExceptionSer       511      8,445,364     81,522  2,047,900         4
ops_mng_reports.        32    625,564,517  1,339,311  1,975,200       316
IndexValidations        33    335,114,722          0  1,695,400       197
SLServer                92    697,156,579  5,448,493  1,644,200       424
generate_adx_ext       122    152,953,052     10,788  1,447,200       105
fGP                    256    228,649,043  2,958,052  1,217,900       187
write_barra_exp         28     46,474,605      1,167  1,024,900        45
publishDoubleDip         7    217,229,469    913,336    826,300       262
gen_menu.pl             65    341,956,496     51,927    819,800       417
eod_cred_dig_che         8    237,230,606    674,441    816,200       290
missing_factor_c        13    148,446,955    185,088    777,600       190
buysell_px.pl           15    193,496,991     10,105    695,400       278
EjvPriceLoad.pl         10     37,153,816      4,696    683,600        54
gen_pni                 13    226,815,157    616,467    681,700       332
dtcc_soi_file_ge         9    291,024,845     31,737    631,300       460

The LIO to CPU ratio for the highlighted applications is low. Generally the root cause is parse and compile time, spinlock contention or a very high number of create/drop temp tables
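
The LIORatio column is consistent with LIO divided by cpu, truncated to an integer. A minimal Python sketch of that calculation, using a few rows from this slide, follows. The threshold of 100 is an assumed cut-off chosen only for illustration (the slide does not state which value counts as "low"), and the function names are hypothetical.

```python
# (program_name, LIO, cpu) taken from rows of the table on this slide.
ROWS = [
    ("DecoStatementCal", 1_323_728_571, 5_055_900),
    ("binary_preserver", 244_632_068, 4_578_300),
    ("irv_server", 79_816_387, 3_642_400),
    ("index_copy.pl", 35_767_223, 2_640_400),
    ("CashExceptionSer", 8_445_364, 2_047_900),
    ("SLServer", 697_156_579, 1_644_200),
]

LOW_RATIO_THRESHOLD = 100  # assumed cut-off for illustration; not from the slides

def lio_ratio(lio, cpu):
    """Logical IOs per unit of CPU time, truncated like the LIORatio column."""
    return lio // cpu if cpu else 0

def low_ratio_programs(rows, threshold=LOW_RATIO_THRESHOLD):
    """Return (program, ratio) pairs below the threshold, lowest ratio first."""
    ratios = [(name, lio_ratio(lio, cpu)) for name, lio, cpu in rows]
    return sorted((r for r in ratios if r[1] < threshold), key=lambda r: r[1])

for name, ratio in low_ratio_programs(ROWS):
    print(f"{name:<18}{ratio:>6}")
```

Programs that surface here burn CPU without doing much logical IO, so they are the first candidates to check for compile overhead, spinlock spins or temp table churn.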

Page 14: BlackRock Achieves Performance and Scale Using SAP ASE

4. Statement Cache

Problem: High Parse and Compile Time, High spinlock contention on system tables
Solution: Statement Cache with auto literal parametrization

1. Too large a statement cache increases hash chain length, which in turn increases spinlock contention (SSQLCACHE_SPIN)
2. Too much churn in statement cache when temp tables are involved and there are no persistent connections
– Use TF - 467 to prevent caching of statements with temp tables
3. Cached Statements – Low reuse count
– Literals in the select clause do not parametrize (select ‘123’ as col1 from mytable)
– Literals in sub-queries do not parametrize (select * from myTable where id = (select max (id) from myTable2 where name = ‘ABC’))
– Literals in the LIKE clause do not parametrize (select * from myTables where col1 like ‘ABC%’)
4. Total of 6 grabs on spinlock per execution of a single cached statement
– Convert high frequency queries to prepared statements or stored procedures to reduce SSQLCACHE_SPIN
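
Why unparametrized literals lead to a low reuse count can be shown with a toy cache key in Python. This is a sketch under loud assumptions: the regular expressions below are a simplified stand-in and do not reproduce ASE's actual literal parametrization rules, and the table and column names are borrowed from the sp_modifystats example purely for illustration.

```python
import re
from collections import Counter

STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")

def parametrize(sql):
    """Replace string and numeric literals with placeholders (toy rules only)."""
    sql = STRING_LITERAL.sub("@str", sql)
    sql = NUMBER_LITERAL.sub("@num", sql)
    return " ".join(sql.lower().split())

statements = [
    "select * from trade where id = 101",
    "select * from trade where id = 102",
    "select * from trade where fund = 'ABC'",
    "select * from trade where fund = 'XYZ'",
]

# Keyed on raw text, every statement is a separate cache entry used once.
raw_entries = Counter(statements)
# Keyed on parametrized text, statements differing only in literals share an entry.
parametrized_entries = Counter(parametrize(s) for s in statements)

print(len(raw_entries), "entries without parametrization")
print(len(parametrized_entries), "entries with parametrization")
```

Any literal that is left in place, as in the three cases listed on the slide, keeps otherwise identical statements in separate cache entries.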

Page 15: BlackRock Achieves Performance and Scale Using SAP ASE

Statement Cache

• SSQLCACHE_SPIN contention: high-frequency statements, such as those shown on the original slide, are better served as prepared statements or stored procedures than as cached statements

Page 16: BlackRock Achieves Performance and Scale Using SAP ASE

5. Procedure Cache

Problem: Persistent 702 errors and Server Stability Problems
Solution: Right Sizing of Procedure Cache

1. Make procedure cache big enough
2. Huge SQL statement batches need large allocation of procedure cache
– Batch Inserts via perl
– Can cause server stability issues if there is a sudden surge
3. Watch for proliferation of LWPs from prepared statements
– Typically problems with OCS version or programming errors
– Same hashkey but many copies of LWP

Select substring(ObjectName,14,15), count(*)
from master..monCachedProcedure
where ObjectName like '*s[qh]%'
Group by substring(ObjectName,14,15)
Order by 2

4. Procedure Cache utilization keeps growing and does not reduce even after purging
– Typically OCS problems
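
The grouping done by the monCachedProcedure query can be mirrored in a few lines of Python to show what "same hashkey but many copies" looks like. This is a hedged sketch: the object names below are made up and merely shaped so that character 14 onward holds a hashkey-like suffix, and the helper names are hypothetical. Real LWP names on a server may be laid out differently.

```python
from collections import Counter

def sql_substring(text, start, length):
    """Emulate SQL substring(text, start, length) with a 1-based start."""
    return text[start - 1:start - 1 + length]

def is_lwp(name):
    """Mirror the LIKE '*s[qh]%' filter used in the query."""
    return len(name) >= 3 and name.startswith("*s") and name[2] in "qh"

def lwp_copies(object_names):
    """Count cached objects per substring(ObjectName, 14, 15), fewest copies first."""
    counts = Counter(sql_substring(n, 14, 15) for n in object_names if is_lwp(n))
    return sorted(counts.items(), key=lambda item: item[1])

# Made-up names: "*sq" + 10-digit id + "_" + 10-digit hashkey + "sq*".
names = [
    "*sq0000000001_1111111111sq*",
    "*sq0000000002_1111111111sq*",
    "*sq0000000003_1111111111sq*",
    "*sq0000000004_2222222222sq*",
    "sp_help",
]

for key, copies in lwp_copies(names):
    print(key, copies)
```

A key that appears many times at the bottom of the list is the pattern to chase: the same statement text prepared again and again instead of being reused.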

Page 17: BlackRock Achieves Performance and Scale Using SAP ASE

6. Data Cache Tuning

Problem: High data cache spinlock contention
Solution: 21 named caches + default data cache

1. Static data/small tables with high logical IOs are bound to a cache with ‘relaxed LRU’
– All the data fits into the cache (e.g. 10 GB table data bound to 11 GB cache)
2. DOL tables in certain queries can increase spinlock contention abnormally
– One spinlock per nested loop iteration
– Had to convert a small set of tables back to APL
3. Cache pools that are too small and too many partitions lead to problems
• The 4K memory pool of named cache tempdb_cache (cache id 9, cachelet id 14) is configured too small for current demands
4. 2K vs 16K buffer pool configuration is critical
– Our default is a 2:1 ratio, but monitor monCachePool to avoid too many physical IOs in a specific pool

Page 18: BlackRock Achieves Performance and Scale Using SAP ASE

7. DES Bind

Problem: Very high IDES Chain Spinlock contention
Solution: 60 tables are des_bound using dbcc tune

1. CPU Utilization was very high with no proportional increase in Logical IOs
• IDES chain spinlocks were dominant and the root cause of contention
2. Typically use “used_count” in monOpenObjectActivity when looking for candidates to des_bind
3. Looking forward to ASE16’s keep count enhancements to eliminate or reduce some of the spinlocks like IDES chain

Page 19: BlackRock Achieves Performance and Scale Using SAP ASE

8. Engine Groups/Execution Classes

Problem: Persistent workload balancing issues
Solution: Define Engine Groups and bind to execution class

1. Each engine is bound only to one engine group
2. Network Engines handle network traffic for all SPIDs
3. Critical Applications are bound to a specific engine group
4. DEFAULT_CLASS handles all the applications NOT bound
5. sp_bindexeclass NULL,’DF’,NULL,’DEFAULT_CLASS’

classname      priority  engine_group     engines
NETWORK_CLASS  HIGH      NETWORK_ENGINES  0,1,2,3,4
ADMIN_CLASS    HIGH      ADMIN_ENGINES    5,6,7
GRP1_CLASS     HIGH      GRP1_ENGINES     8,9,10,11,12,13,14
GRP2_CLASS     HIGH      GRP2_ENGINES     15,16,17,18,19,20,21
GRP3_CLASS     HIGH      GRP3_ENGINES     22,23,24,25,26,27,28
GRP4_CLASS     HIGH      GRP4_ENGINES     29,30,31,32,33,34,35
GRP5_CLASS     HIGH      GRP5_ENGINES     36,37,38,39,40,41,42
GRP6_CLASS     HIGH      GRP6_ENGINES     43,44,45,46,47,48,49
DEFAULT_CLASS  HIGH      DEFAULT_ENGINES  50,51,52,53,54,55,56,58,59,60,61,62,63
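
The rule that each engine belongs to exactly one engine group is easy to check mechanically. The Python sketch below encodes the engine lists exactly as they appear in this transcript and looks for overlaps and gaps. The function names and the assumed total of 64 engines (numbered 0 to 63) are illustrative choices, not something stated on the slide.

```python
from collections import Counter

# Engine lists as transcribed from the slide.
ENGINE_GROUPS = {
    "NETWORK_ENGINES": [0, 1, 2, 3, 4],
    "ADMIN_ENGINES": [5, 6, 7],
    "GRP1_ENGINES": [8, 9, 10, 11, 12, 13, 14],
    "GRP2_ENGINES": [15, 16, 17, 18, 19, 20, 21],
    "GRP3_ENGINES": [22, 23, 24, 25, 26, 27, 28],
    "GRP4_ENGINES": [29, 30, 31, 32, 33, 34, 35],
    "GRP5_ENGINES": [36, 37, 38, 39, 40, 41, 42],
    "GRP6_ENGINES": [43, 44, 45, 46, 47, 48, 49],
    "DEFAULT_ENGINES": [50, 51, 52, 53, 54, 55, 56, 58, 59, 60, 61, 62, 63],
}

def engines_in_multiple_groups(groups):
    """Engines that violate the one-group-per-engine rule."""
    counts = Counter(e for engines in groups.values() for e in engines)
    return sorted(e for e, c in counts.items() if c > 1)

def unassigned_engines(groups, engine_count):
    """Engines in range(engine_count) that no group lists."""
    assigned = {e for engines in groups.values() for e in engines}
    return [e for e in range(engine_count) if e not in assigned]

print("overlaps:", engines_in_multiple_groups(ENGINE_GROUPS))
print("unassigned:", unassigned_engines(ENGINE_GROUPS, engine_count=64))
```

With the lists as transcribed there are no overlaps, and engine 57 is the only number in the 0 to 63 range that does not appear in any group.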

Page 20: BlackRock Achieves Performance and Scale Using SAP ASE

8. Engine Groups/Execution Classes

sp_listener ‘status’ – Just 4 engines are active

proto  host    port  engine  status
tcp    dbhost  4500  0       active   (Network Engines)
tcp    dbhost  4500  1       active   (Network Engines)
tcp    dbhost  4500  2       active   (Network Engines)
tcp    dbhost  4500  3       active   (Network Engines)
tcp    dbhost  4500  4       stopped
tcp    dbhost  4500  5       stopped
tcp    dbhost  4500  6       stopped

Page 21: BlackRock Achieves Performance and Scale Using SAP ASE

9. Very Large Databases (VLDB)

Problem: Current maximum total database size for an ASE is 25 TB, with a maximum single database size of 9 TB
Solution: VLDB implementation

1. Semantic Partitioning
• Range partition
• Mostly local indexes
• Maintenance window needed reduced from over 48 hours to just a few hours
– Update stats performed on partitions where data_change() > 0
• Great boost in performance – Most queries target just the last partition
• Replication latency reduced and attributed to the reduction in the depth of the indexes in local indexes
2. Migration to ASE with 16K page size allows for a maximum DB size of 64 TB
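
The "update stats only where data_change() > 0" idea amounts to filtering partitions before issuing maintenance commands. A minimal Python sketch of that selection step follows. The partition names, the change percentages and the helper name are hypothetical, and the sketch only builds command strings; how the data_change() values are collected and how the commands are run is left out.

```python
def update_stats_commands(table, change_by_partition):
    """Build one 'update statistics' command per partition whose change value is > 0."""
    return [
        f"update statistics {table} partition {partition}"
        for partition, changed in change_by_partition.items()
        if changed > 0
    ]

# Hypothetical range partitions and their data_change() values (percent changed).
changes = {"p2015q3": 0.0, "p2015q4": 0.0, "p2016q1": 12.5}

commands = update_stats_commands("trade", changes)
for command in commands:
    print(command)
```

When most activity lands in the latest range partition, the untouched partitions drop out of the maintenance run, which is consistent with the shrinking maintenance window described on the slide.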

Page 22: BlackRock Achieves Performance and Scale Using SAP ASE

Security, Scale and Simplicity

Tomorrow’s story!

Page 23: BlackRock Achieves Performance and Scale Using SAP ASE

Disclosure

No part of this material may be reproduced or transmitted in any form or by any means, electronic, mechanical, recording or otherwise, without the prior written consent of BlackRock. © 2016 BlackRock, Inc. All rights reserved. BLACKROCK, BLACKROCK SOLUTIONS, and iSHARES are registered trademarks of BlackRock, Inc. or its subsidiaries. All other trademarks are the property of their respective owners. 

Page 24: BlackRock Achieves Performance and Scale Using SAP ASE

SAP ASE 16 SP02 Performance Features

Ashok Swaminathan (Senior Director, Product Management, SAP)
March 2016

Page 25: BlackRock Achieves Performance and Scale Using SAP ASE

Disclaimer

This presentation outlines our general product direction and should not be relied on in making a purchase decision. This presentation is not subject to your license agreement or any other agreement with SAP. SAP has no obligation to pursue any course of business outlined in this presentation or to develop or release any functionality mentioned in this presentation. This presentation and SAP's strategy and possible future developments are subject to change and may be changed by SAP at any time for any reason without notice. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or grossly negligent.

Page 26: BlackRock Achieves Performance and Scale Using SAP ASE

Overview of SAP ASE 16 SP02 Performance Features

Page 27: BlackRock Achieves Performance and Scale Using SAP ASE

SAP ASE Investments In High Performance

• Investments starting from ASE 15.7
• Early vendor to introduce In-Memory technology
• Continued focus with each release

Version 15.7
Features
• ASE In-memory Database (IMDB) - entire database in memory, but without support for durability of transactions
• Typical usage: IMDB as temp DB or use of IMDB for fast computations without the need to persist changes
Benefits
• Performance increase by 2X (up to 10x in some customer cases)

Version 16.0
Features
• Run Time Logging Enhancements (Buffer Unpinning)
• Metadata Management Enhancements
• Lock Management Enhancements
Benefits
• Allowed scale up beyond 32 cores (up to 80 cores), near linear scaling
• Achieved over 1 million/minute business transactions

Version 16 SP02
Features
• Compiled Queries
• Transactional Memory
• Lockless Data cache
• Latchfree Btree
• Non Volatile Cache Management
Benefits
• Latency and Throughput benefits

Page 28: BlackRock Achieves Performance and Scale Using SAP ASE

ASE 16 – Start Of High Performance Focus

[Chart: Scale-up Performance. Transactions per minute (0 to 1,200,000) against Engines (16, 32, 48, 64, 80). Annotation: > 1 million tpm]

Environment
• SUSE Linux Enterprise Server 11
• Platform: 8 sockets/80 cores Intel® Xeon® [email protected] with 1TB RAM

Feature: Scale up on large SMP systems
• Minimize locking/latch contention in highly concurrent environments
• Enhancements in the area of run time logging optimization, metadata management, lock management

Feature: Partition level Locking
• Granular locking for partitioned table
• Enables concurrent DDL and DML operations on a table

Feature: Dynamic Thread Assignment
• Allows executing query plans in parallel with fewer resources/threads
• If # of threads < work units, threads completing execute remaining tasks

Feature: Index Compression
• Index compression enabling storage savings for large indexes
• Completes current capabilities - data compression (row/page), LOB, backup

Feature: Optimized Star Join Queries
• Allows hints in the join syntax
• Can use the syntax “plan (use fact_table tablename)” – query processing optimizations related to star join
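
The Dynamic Thread Assignment behaviour, where a small set of threads picks up remaining work units as each one finishes, resembles an ordinary worker pool. The Python sketch below is only an analogy built on the standard library and says nothing about how ASE implements the feature. The work function and the counts are invented for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def run_work_units(work_units, thread_count):
    """Run every work unit on a pool that may be smaller than the number of units."""
    used_threads = set()
    lock = threading.Lock()

    def run(unit):
        # Record which pool thread handled this unit.
        with lock:
            used_threads.add(threading.get_ident())
        return unit * unit  # stand-in for processing one partition or range

    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # map() hands the next pending unit to whichever thread becomes free.
        results = list(pool.map(run, work_units))
    return results, len(used_threads)

results, threads_used = run_work_units(range(5), thread_count=2)
print(results, "completed by", threads_used, "thread(s)")
```

Five work units complete even though at most two threads exist, which is the resource saving the feature description points at.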

Page 29: BlackRock Achieves Performance and Scale Using SAP ASE

SAP ASE 16 SP02 – High Performance Features

[Architecture diagram. Status legend: Available, New In SP02, Upcoming, To Be Developed. Labels: T/SQL, JDBC, ODBC, Scripts, XOLTP QP, Parallel Dynamic Threading, Code Compile, MVCC, In-memory Row Store, Main Memory, Lockless Metadata, Transaction Memory, Kernel Threading, Buffer Cache, Latch Free BTree, Lockless Data Cache, Disk, Flash Disk]

Simplified Native Access Plans (Compiled Queries)
• Compiled query plans – faster execution
• Transparent to applications and users

Latchfree B-Tree on Indexes
• Reduces contention
• Increases concurrency and performance

Lockless Data Cache
• Decreases cache contention
• Increases concurrency and performance

Transactional Memory
• Minimize contention leveraging hardware for identifying memory conflicts

Non Volatile Cache Management
• Leverage SSD for storing frequently accessed/updated pages

Page 30: BlackRock Achieves Performance and Scale Using SAP ASE

Simplified Native Access Plan (SNAP) - Overview

• A new query plan execution feature
• Avoids repetitive query exec code generation from the physical operators output from query optimization
– Works on
  o Cached SQL Statements in statement cache
  o Statements within stored procedures
  o Fully prepared SQL statements
• Available on x86-64/amd64 Linux
• Transparent to the user
• Enabled by configuration parameter; ASE uses SNAP automatically when possible.
• Unsupported plans continue to use lava execution engine.

[Diagram: query processing stages. Labels: TDSLANG, Receive Buffer, SQL Parsing, Normalization, Pre-Processing, Query Optimization, Query Compilation, Native Access Plan (Query Exec Code), Query Execution, SNAP]

Example statement shown in the diagram:
TDSLANG select * from table where due_dt =getdate() and recv_date is null

SELECT {column list}
FROM table
COND1 due_dt <=getdate()
COND2 (AND) recv_date is null

SELECT {column id’s & datatypes}
FROM objid=123456
COND1 col_id=3 (dt) >= (dt) ‘Jan 1 2015’
COND2 (AND) col_id=4 (dt) IS NULL

Page 31: BlackRock Achieves Performance and Scale Using SAP ASE

Latch Free BTree

• Latch is used to ensure physical consistency of a page between DMLs and queries.
• Minimize contention where SH_LATCH and EX_LATCH block each other, when multiple threads need to synchronize with each other while modifying/reading index pages
• Logical lock semantics not changed

[Diagram: a query (Select * from customer where money > 5000) with SH_LATCH and a DML (Insert into customer values(6000)) with EX_LATCH meet on the same page, labeled BLOCK]

Page 32: BlackRock Achieves Performance and Scale Using SAP ASE

Lockless Data Cache

Cache Contention
• Access to pages in named caches is through hash tables. Spinlocks protect buckets in a named cache’s hash table.
• Contention on cache spinlock can be resolved by
– Moving object(s) to separate named caches, or
– Partitioning the cache
• For example, lockless data cache can help when there is significant contention on a single cache partition but not significant contention on other cache partitions
• This means fewer waits and spins on spinlocks, and thus lower contention. In turn, lower CPU utilization, higher performance and improved scalability

Page 33: BlackRock Achieves Performance and Scale Using SAP ASE

Transactional Memory

Objective: Increase concurrency (and minimize contention) by leveraging hardware for tracking memory conflicts as a result of concurrent execution of code

Background
– Spinlock (used to protect certain data structures) has been fragmented and distributed in the ASE code
– Contention is on Lock – Many times, not on actual data which we want to protect

Solution
• Allow multiple threads to modify data structures, unless there is conflict at memory level
• Newer hardware from Intel (Haswell-EX/Xeon E7 V3 processor), and IBM (Power), keeps track of memory reads/writes and memory contention
• By allowing concurrent processes to modify memory, and undoing the offending process/thread when there is a memory conflict, concurrency is increased

Page 34: BlackRock Achieves Performance and Scale Using SAP ASE

Non-Volatile Cache Management

[Diagram: DB ACCESS/STORAGE, Buffer Manager, SSD CACHE, DISK]

• SSD cache will act as an intermediate storage.
• Buffer manager reads pages from disk (cold pages)
• Hot and warm pages that cannot fit in Buffer manager will be evicted to the SSD cache.
• Pages which are dirty will also be shifted to the SSD cache. These dirty pages will then be written in a delayed manner to the HDD disk.
• Frequently read pages will be in SSD
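
The page flow described on this slide (buffer manager first, SSD cache as an intermediate tier, disk last, with delayed writes for dirty pages) can be modelled with a small two-tier cache in Python. This is a toy sketch, assuming LRU eviction at both tiers, which the slide does not specify. The class, its attributes and the capacities are invented for illustration and do not describe ASE internals.

```python
from collections import OrderedDict

class TieredCache:
    """Toy model: buffer tier, then SSD tier, then disk. LRU at both tiers (assumed)."""

    def __init__(self, buffer_pages, ssd_pages):
        self.buffer = OrderedDict()  # page_id -> dirty flag, oldest first
        self.ssd = OrderedDict()     # page_id -> dirty flag, oldest first
        self.buffer_pages = buffer_pages
        self.ssd_pages = ssd_pages
        self.disk_reads = 0
        self.disk_writes = 0

    def _evict_from_buffer(self):
        # Pages that no longer fit in the buffer move to the SSD tier.
        page_id, dirty = self.buffer.popitem(last=False)
        self.ssd[page_id] = dirty
        if len(self.ssd) > self.ssd_pages:
            _, old_dirty = self.ssd.popitem(last=False)
            if old_dirty:
                self.disk_writes += 1  # delayed write of a dirty page to disk

    def access(self, page_id, write=False):
        """Touch a page and return the tier it was found in."""
        if page_id in self.buffer:
            source, dirty = "buffer", self.buffer.pop(page_id)
        elif page_id in self.ssd:
            source, dirty = "ssd", self.ssd.pop(page_id)
        else:
            source, dirty = "disk", False
            self.disk_reads += 1
        self.buffer[page_id] = dirty or write
        if len(self.buffer) > self.buffer_pages:
            self._evict_from_buffer()
        return source

cache = TieredCache(buffer_pages=2, ssd_pages=2)
trace = [cache.access(1), cache.access(2, write=True), cache.access(3), cache.access(1)]
print(trace, "disk reads:", cache.disk_reads, "disk writes:", cache.disk_writes)
```

In the trace, page 1 is pushed out of the buffer by page 3 and is then served from the SSD tier instead of disk, while the dirty page 2 reaches disk only when it later ages out of the SSD tier.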

Page 35: BlackRock Achieves Performance and Scale Using SAP ASE

Future Directions – Relating To Performance Features

• XOLTP
• In-Memory Processing (DRC)
• Transactional Memory
• Buffer Cache Extension via SSD
• Compiled Queries
• MVCC

Page 36: BlackRock Achieves Performance and Scale Using SAP ASE

Thank you

Contact information:

Ashok [email protected]

Page 37: BlackRock Achieves Performance and Scale Using SAP ASE

QUESTIONS