TEN QUERY TUNING TECHNIQUES Every SQL Programmer Should Know Kevin Kline Director of Engineering Services at SQL Sentry Microsoft MVP since 2003 Facebook, LinkedIn, Twitter at KEKLINE [email protected] KevinEKline.com, ForITPros.com
Jan 24, 2015
TEN QUERY TUNING TECHNIQUES
Every SQL Programmer Should Know
Kevin KlineDirector of Engineering Services at SQL
SentryMicrosoft MVP since 2003Facebook, LinkedIn, Twitter at [email protected], ForITPros.com
FOR FRIENDS OF SQL SENTRY
• Free Plan Explorer download: http://www.sqlsentry.net/plan-explorer/
• Free query tuning consultations: http://answers.sqlperformance.com.
• Free new ebook (regularly $10) to attendees. Send request to [email protected].
• SQL Server educational videos, scripts, and slides: http://SQLSentry.TV
• Tuning blog: http://www.sqlperformance.com/
• Monthly eNews tips and tricks: http://www.sqlsentry.net/newsletter-archive.asp
AGENDA• Introductions• Test & tuning environment
• 1. Clearing caches
• Looking for red flags• 2. Reading execution plans
• Query tuning techniques:• 8 more specific examples of widespread approaches that lead to poor
performance
• Summary
3
TEST & TUNING ENVIRONMENT
• Code to clear the caches*: o CHECKPOINTo DBCC [FreeProcCache | FreeSystemCache | FlushProcInDB(<dbid>) ]o DBCC DropCleanBuffers
• Code to set measurements:o SET STATISTICS [TIME | IO]o SET SHOWPLAN [TEXT | XML] or Graphic Execution Plans
• Code for Dynamic Management Views (DMV) checks.o System info – sys.dm_os_performance_counters, sys.dm_os_wait_statso Query info – sys.dm_exec_requestso Index info – sys.dm_db_index_usage_stats, sys.dm_io_virtual_file_stats
RED FLAGS IN YOUR SQL CODE
• Red Flags Query Operators:o Lookups, Scans, Spools, Parallelism Operations
• Other Red Flags:o Dissimilar estimated versus actual row countso High physical readso Missing statistics alarmso Large sort operationso Implicit data type conversions
DEMOS: DEFAULT CURSORS
• I don’t always use cursors…o …but when I do, I avoid the default optionso Slow and heavy-handed: Global, updateable, dynamic,
scrollableo I use LOCAL FAST_FORWARDo May want to test STATIC vs. DYNAMIC, when tempdb is a
bottleneck
• Blog post: http://bit.ly/AB-cursors
DEMOS: WHERE IN VERSUS WHERE EXISTS
• There are lots of ways to find data existing within subsets:• IN, EXISTS, JOIN, Apply, subquery
• Which technique is best?• Blog post: http://bit.ly/AB-NOTIN
OPTIMIZING FOR SELECT VERSUS DML
• Big differences between a SELECT and a DML statement that effects the same rows.
• Shouldn’t blindly create every index the Tuning Advisor or execution plan tells you to!
• Blog post - http://bit.ly/AB-BlindIndex
READS & INDEX STRUCTURE• 8K pages• Leaf pages ARE the data.• Non-leaf pages are pointers.
Leaf Pages
Root Page
Level 0
Intermediate Pages
Level 1
Level 2
WRITES & INDEX STRUCTURE• Each change to the leaf pages requires all index
structures be updated.
Leaf Pages
Root Page
Level 0
Intermediate Pages
Level 1
Level 2
Page Split
DML
Actualplace-ment
DEMOS: UNWANTED RECOMPILES
Execution
Load metadataNOIn Memory?
compile
optimize
ExecuteY
ES
ReComp
Execute
CAUSES OF RECOMPILE
• Expected: Because we request it:• CREATE PROC … WITH RECOMPILE or EXEC myproc … WITH
RECOMPILE• SP_RECOMPILE foo
• Expected: Plan was aged out of memory• Unexpected: Interleaved DDL and DML• Unexpected: Big changes since last execution:
• Schema changes to objects in underlying code• New/updated index statistics• Sp_configure
INTERLEAVED DDL AND DML• CREATE PROC testddldml AS … ;• CREATE TABLE #testdml; -- (DDL)• <some T-SQL code here>• INSERT INTO #testdml; -- (DML + RECOMPILE)• <some T-SQL code here>• ALTER TABLE #testdml; -- (DDL)• <some T-SQL code here>• INSERT INTO #testdml; -- (DML + RECOMPILE)• <some T-SQL code here>• DROP TABLE #testdml; -- (DDL)• <some T-SQL code here>
DEMOS: THE "KITCHEN SINK" PROCEDURE
• Usually see it as a one-query-for-all-queries procedure, or even one-proc-for-for-all-transactions procedure:o Where name starts with S, or placed an order this year, or lives in Texaso Insert AND Update AND Delete AND Select
• Conflicting optional parameters make optimization impossibleo OPTION (RECOMPILE)o Dynamic SQL + Optimize for ad hoc workloadso Specialized procedures
• Better approach?o Specialize and optimize each piece of code to do ONE THING really effectively
DEMOS: SP_EXECUTESQL VS. EXEC(…)
• I don’t always use dynamic SQL…o …but when I do, I always use sp_executesqlo Less fuss with concatenation and implicit/explicit conversionso Better protection against SQL injection (but not for all things)o At worst case, behavior is the same
• Can promote better plan re-use• Encourages strongly typed parameters instead of
building up a massive string
IMPLICIT CONVERSIONS
• SQL Server has to do a lot of extra work / scans when conversion operations are assumed by the SQL programmer.
• Happens all the time with data types you’d think wouldn’t need it, e.g. between date types and character types.
• Very useful data type conversion chart at http://bit.ly/15bDRRA.
• Data type precedence call also have an impact: http://bit.ly/13Zio1f.
IMPLICIT CONVERSION RESOURCES
• Ian Stirk’s Column Mismatch Utility at http://www.sqlservercentral.com/articles/Administration/65138/.
• Jonathan Kehayias’ plan cache analyzer at http://sqlblog.com/blogs/jonathan_kehayias/archive/2010/01/08/finding-implicit-column-conversions-in-the-plan-cache.aspx.
• Jonathan Kehayias’ index scan study at http://www.sqlperformance.com/2013/04/t-sql-queries/implicit-conversion-costs
DEMOS: COMMA-DELIMITED PARAMETERS
• Example: pass a comma-separated list of OrderIDs• String splitting is expensive, even using CLR• Table-valued parameters are typically a better approach
DEMOS: TEMPORARY STRUCTURES
• Which are better, temp tables or temp variables?
Temp Table Temp Variable
Stored in? Tempdb Tempdb
Statistics? Yes No (1 row)
Indexs/Keys? Yes 1 UK / PK only
Truncate? Yes No
Recompiles? Yes No
Parallelism? Yes No
Metadata Overhead?
Low Lowest
Lock Overhead? Normal Lowest
CODING STANDARDS AND DISSIMILARITY
• Might sound frivolous, but naming schemes are importanto Convention is not important; but rather being consistent and logical
• Story: dbo.UpdateCustomer vs. dbo.Customer_Update
• Always specify schema when creating, altering, referencing objectso Object resolution works a little bit harder without ito More importantly, it can get the wrong answero And will often yield multiple copies of the same plan
• Do not use the sp_ prefix on stored procedureso This has observable overhead, no matter how specific you are
MIMICKING PRODUCTION
• Your dev machine is usually nothing like productiono Build representative data when you cano Build a stats-only database when you can’t (a.k.a. a database clone)
• Will allow you to see plan issues, but not speedo Make sure settings are the same
• @@VERSION, edition• Max memory if possible, sp_configure options• Logins (and permissions), tempdb settings• Parameterization settings, recovery model, compression, snapshot isolation• Compatibility level (usually not an issue when working with a restore)• Run a full business cycle workload after a restore
o Simulate equivalent hardware: DBCC OPTIMIZER_WHATIFo Use Distributed Replay when you can
• Not perfect, but more realistic than single-threaded trace replay
SUMMARYTest Harness for your SQL Code to
ensure stable results.Remember the red flags in SQL code
running against SQL Server.Plus 8 more coding techniques:1. Writing “good” cursors2. WHERE IN versus WHERE EXISTS
3. The nature of indexes and Optimizing for Reads versus Writes
4. Inadvertent recompiles5. EXEC versus SP_EXECUTESQL6. Data types and Implicit conversion7. Dissimilar execution plans and
tracking plan cache bloat8. Mimicking the production environmentMore info?
Let’s connect!Facebook, LinkedIn, Twitter at KEKLINE.
Email at [email protected]
Blogs at http://KevinEKline.com Andhttp://ForITPros.com
WRAP UPEngage with us on social media
o I’m thankful for your word of mouth promotions and endorsements!
Share your tough SQL tuning problems with us: http://answers.sqlperformance.com
Download SQL Sentry Plan Explorer for free: http://www.sqlsentry.com/plan-explorer/
Check out our other award winning tools: http://www.sqlsentry.net/download
NOLOCK
http://www.flickr.com/photos/relyjus/4289185639/
NOLOCK
• It is a turbo button …if you’re ok with inaccuracy
• There are times it is perfectly valido Ballpark row countso Please use session-level setting, not table hint
• Usually, though, better to use SNAPSHOT or RCSIo But test under heavy load