Ten Query Tuning Techniques Every SQL Programmer Should Know

TEN QUERY TUNING TECHNIQUES
Every SQL Programmer Should Know
Kevin Kline
Director of Engineering Services at SQL Sentry
Microsoft MVP since 2003
Facebook, LinkedIn, Twitter at KEKLINE
KEKline@sqlsentry.com
KevinEKline.com, ForITPros.com
FOR FRIENDS OF SQL SENTRY
• Free Plan Explorer download: http://www.sqlsentry.net/plan-explorer/
• Free query tuning consultations: http://answers.sqlperformance.com.
• Free new ebook (regularly $10) to attendees. Send request to sales@sqlsentry.net.
• SQL Server educational videos, scripts, and slides: http://SQLSentry.TV
• Tuning blog: http://www.sqlperformance.com/
• Monthly eNews tips and tricks: http://www.sqlsentry.net/newsletterarchive.asp
AGENDA
• Introductions
• Test & tuning environment
o 1. Clearing caches
• Looking for red flags
o 2. Reading execution plans
• Query tuning techniques:
o 8 more specific examples of widespread approaches that lead to poor performance
• Summary
TEST & TUNING ENVIRONMENT
• Code to clear the caches*:
o CHECKPOINT
o DBCC [FreeProcCache | FreeSystemCache | FlushProcInDB(<dbid>) ]
o DBCC DropCleanBuffers
• Code to set measurements:
o SET STATISTICS [TIME | IO]
o SET SHOWPLAN_TEXT | SHOWPLAN_XML, or graphical execution plans
• Code for Dynamic Management Views (DMV) checks.
o System info – sys.dm_os_performance_counters, sys.dm_os_wait_stats
o Query info – sys.dm_exec_requests
o Index info – sys.dm_db_index_usage_stats, sys.dm_io_virtual_file_stats
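A minimal test-harness sketch for the commands above, assuming a dedicated test or development instance (clearing caches on a production server slows every other workload). The query against sys.objects is only a stand-in for the statement being tuned.

```sql
-- Test/dev instance only: these commands flush server-wide caches.
CHECKPOINT;               -- write dirty pages to disk so they become clean
DBCC DROPCLEANBUFFERS;    -- empty the buffer pool: the next read is a cold (physical) read
DBCC FREEPROCCACHE;       -- empty the plan cache: the next execution compiles a fresh plan

SET STATISTICS IO ON;     -- report logical and physical reads per table
SET STATISTICS TIME ON;   -- report compile and execution CPU/elapsed time

-- Stand-in for the query under test (assumption: replace with your own statement).
SELECT COUNT(*) AS object_count
FROM sys.objects;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

-- Largest accumulated waits since the instance started (or the stats were cleared).
SELECT TOP (10) wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;
```

Read the Messages tab for the I/O and time figures; compare them before and after each tuning change rather than relying on elapsed time alone.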
RED FLAGS IN YOUR SQL CODE
• Red Flags Query Operators:
o Lookups, Scans, Spools, Parallelism Operations
• Other Red Flags:
o Dissimilar estimated versus actual row counts
o High physical reads
o Missing statistics alarms
o Large sort operations
o Implicit data type conversions
DEMOS: DEFAULT CURSORS
• I don’t always use cursors…
o …but when I do, I avoid the default options
o Slow and heavy-handed: Global, updateable, dynamic, scrollable
o I use LOCAL FAST_FORWARD
o May want to test STATIC vs. DYNAMIC, when tempdb is a bottleneck
• Blog post: http://bit.ly/AB-cursors
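A short sketch of a cursor declared with LOCAL FAST_FORWARD instead of the defaults. The row source (sys.objects) and the variable name are placeholders chosen only so the example runs anywhere.

```sql
DECLARE @name sysname;

-- LOCAL: scoped to this batch/procedure. FAST_FORWARD: forward-only and read-only.
DECLARE object_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name
    FROM sys.objects
    WHERE type = 'S';      -- placeholder row source

OPEN object_cursor;
FETCH NEXT FROM object_cursor INTO @name;

WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @name;           -- per-row work goes here
    FETCH NEXT FROM object_cursor INTO @name;
END;

CLOSE object_cursor;
DEALLOCATE object_cursor;
```

To compare options, swap FAST_FORWARD for STATIC or DYNAMIC and measure with the same harness each time.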
DEMOS: WHERE IN VERSUS WHERE EXISTS
• There are lots of ways to find data existing within subsets:
o IN, EXISTS, JOIN, APPLY, subquery
• Which technique is best?
• Blog post: http://bit.ly/AB-NOTIN
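A small illustration of why the choice matters, using hypothetical temp tables (#Customers and #Orders are invented for this sketch). IN and EXISTS agree here, but NOT IN and NOT EXISTS diverge as soon as the subquery column contains a NULL.

```sql
CREATE TABLE #Customers (CustomerID int NOT NULL PRIMARY KEY);
CREATE TABLE #Orders (OrderID int NOT NULL PRIMARY KEY, CustomerID int NULL);

INSERT INTO #Customers (CustomerID) VALUES (1), (2), (3);
INSERT INTO #Orders (OrderID, CustomerID) VALUES (10, 1), (11, NULL);

-- Customers with at least one order: both forms return CustomerID 1.
SELECT c.CustomerID
FROM #Customers AS c
WHERE c.CustomerID IN (SELECT o.CustomerID FROM #Orders AS o);

SELECT c.CustomerID
FROM #Customers AS c
WHERE EXISTS (SELECT 1 FROM #Orders AS o WHERE o.CustomerID = c.CustomerID);

-- Customers with no orders: NOT EXISTS returns CustomerID 2 and 3.
SELECT c.CustomerID
FROM #Customers AS c
WHERE NOT EXISTS (SELECT 1 FROM #Orders AS o WHERE o.CustomerID = c.CustomerID);

-- NOT IN returns no rows at all, because the NULL in #Orders.CustomerID
-- makes the comparison evaluate to UNKNOWN for every customer.
SELECT c.CustomerID
FROM #Customers AS c
WHERE c.CustomerID NOT IN (SELECT o.CustomerID FROM #Orders AS o);

DROP TABLE #Orders;
DROP TABLE #Customers;
```

Correctness comes first; after that, compare the execution plans of the candidate forms on realistic data before picking one.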
OPTIMIZING FOR SELECT VERSUS DML
• Big differences between a SELECT and a DML statement that affects the same rows.
• You shouldn’t blindly create every index the Tuning Advisor or execution plan tells you to!
• Blog post - http://bit.ly/AB-BlindIndex
READS & INDEX STRUCTURE
• 8K pages
• Leaf pages ARE the data.
• Non-leaf pages are pointers.
[Diagram: B-tree index structure. Root page (level 2), intermediate pages (level 1), leaf pages (level 0).]
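One way to see these levels for a real index is sys.dm_db_index_physical_stats in DETAILED mode. The sketch below builds a throwaway temp table (#IndexDemo, a name invented for this example) so it can run on any instance; the number of levels you see depends on row count and row size.

```sql
-- Throwaway table with a clustered primary key and wide rows.
CREATE TABLE #IndexDemo (
    id     int IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED,
    filler char(500) NOT NULL
);

INSERT INTO #IndexDemo (filler)
SELECT TOP (5000) 'x'
FROM sys.all_columns AS a
CROSS JOIN sys.all_columns AS b;

-- One row per index level: level 0 is the leaf level, the highest level is the root.
SELECT index_level, page_count, record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(N'tempdb'),
         OBJECT_ID(N'tempdb..#IndexDemo'),
         NULL, NULL, 'DETAILED')
ORDER BY index_level DESC;

DROP TABLE #IndexDemo;
```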
WRITES & INDEX STRUCTURE
• Each change to the leaf pages requires all index
structures be updated.
[Diagram: the same B-tree (root page at level 2, intermediate pages at level 1, leaf pages at level 0), showing a DML change, the resulting page split, and the actual placement of the row.]
DEMOS: UNWANTED RECOMPILES
[Flowchart: Execution. Is the plan in memory? NO: load metadata, compile, optimize, execute. YES: check for recompile (ReComp), then execute.]
CAUSES OF RECOMPILE
• Expected: Because we request it:
o CREATE PROC … WITH RECOMPILE or EXEC myproc … WITH RECOMPILE
o SP_RECOMPILE foo
• Expected: Plan was aged out of memory
• Unexpected: Interleaved DDL and DML
• Unexpected: Big changes since last execution:
o Schema changes to objects in underlying code
o New/updated index statistics
o Sp_configure
INTERLEAVED DDL AND DML
CREATE PROC testddldml AS … ;
CREATE TABLE #testdml;     -- (DDL)
<some T-SQL code here>
INSERT INTO #testdml;      -- (DML + RECOMPILE)
<some T-SQL code here>
ALTER TABLE #testdml;      -- (DDL)
<some T-SQL code here>
INSERT INTO #testdml;      -- (DML + RECOMPILE)
<some T-SQL code here>
DROP TABLE #testdml;       -- (DDL)
<some T-SQL code here>
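A commonly suggested mitigation is to group all DDL at the top of the procedure so the DML that follows sees a stable schema. The sketch below is an assumption about how the outline above could be restructured, not the author's demo code; the procedure name and columns are hypothetical.

```sql
IF OBJECT_ID(N'dbo.testddldml_grouped', N'P') IS NOT NULL
    DROP PROCEDURE dbo.testddldml_grouped;
GO

CREATE PROCEDURE dbo.testddldml_grouped   -- hypothetical name
AS
BEGIN
    SET NOCOUNT ON;

    -- All DDL first: the table is created with its final shape,
    -- so no ALTER TABLE is needed between the inserts.
    CREATE TABLE #testdml (
        id   int         NOT NULL,
        note varchar(50) NULL
    );

    -- All DML afterwards, against an unchanging schema.
    INSERT INTO #testdml (id, note) VALUES (1, 'first');
    INSERT INTO #testdml (id, note) VALUES (2, 'second');

    SELECT id, note
    FROM #testdml;
    -- The temp table is dropped automatically when the procedure ends.
END;
GO

EXEC dbo.testddldml_grouped;
```

Whether this removes every recompile depends on the version and on statistics changes, so verify with a trace or Extended Events session on your own build.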
DEMOS: THE "KITCHEN SINK" PROCEDURE
• Usually see it as a one-query-for-all-queries procedure, or even a one-proc-for-all-transactions procedure:
o Where name starts with S, or placed an order this year, or lives in Texas
o Insert AND Update AND Delete AND Select
• Conflicting optional parameters make optimization
impossible
o OPTION (RECOMPILE)
o Dynamic SQL + Optimize for ad hoc workloads
o Specialized procedures
• Better approach?
o Specialize and optimize each piece of code to do ONE THING really effectively
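A minimal sketch of the optional-parameter pattern with OPTION (RECOMPILE), which lets the optimizer build a plan for the parameter values actually supplied on each execution. The table, columns, and variable names are invented for illustration.

```sql
CREATE TABLE #Customers (
    CustomerID int          NOT NULL PRIMARY KEY,
    LastName   nvarchar(50) NOT NULL,
    StateCode  char(2)      NOT NULL
);

INSERT INTO #Customers (CustomerID, LastName, StateCode)
VALUES (1, N'Smith', 'TX'), (2, N'Jones', 'TX'), (3, N'Stone', 'CA');

-- Optional search arguments: NULL means "do not filter on this".
DECLARE @LastNamePrefix nvarchar(50) = N'S';
DECLARE @StateCode      char(2)      = NULL;

SELECT CustomerID, LastName, StateCode
FROM #Customers
WHERE (@LastNamePrefix IS NULL OR LastName LIKE @LastNamePrefix + N'%')
  AND (@StateCode IS NULL OR StateCode = @StateCode)
OPTION (RECOMPILE);   -- compile for these specific values; costs CPU on every call

DROP TABLE #Customers;
```

With the values shown, the query returns customers 1 and 3. The trade-off is a compile on every execution, which is why specialized procedures or dynamic SQL can be better for hot code paths.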
DEMOS: SP_EXECUTESQL VS. EXEC(…)
• I don’t always use dynamic SQL…
o …but when I do, I always use sp_executesql
o Less fuss with concatenation and implicit/explicit conversions
o Better protection against SQL injection (but not for all things)
o In the worst case, behavior is the same
• Can promote better plan re-use
• Encourages strongly typed parameters instead of
building up a massive string
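A side-by-side sketch of the two approaches. The query against sys.objects and the parameter names are placeholders; the point is that sp_executesql keeps the values out of the statement text.

```sql
-- sp_executesql: the statement text stays constant and the values are typed parameters.
DECLARE @sql nvarchar(max) =
    N'SELECT name, object_id
      FROM sys.objects
      WHERE type = @type AND name LIKE @pattern;';

EXEC sys.sp_executesql
     @sql,
     N'@type char(2), @pattern sysname',   -- parameter definitions
     @type = 'S',
     @pattern = N'sys%';

-- EXEC(): the values must be concatenated (and escaped) into the string,
-- so every distinct value produces a different statement text.
DECLARE @pattern2 sysname = N'sys%';
DECLARE @sql2 nvarchar(max) =
    N'SELECT name, object_id
      FROM sys.objects
      WHERE type = ''S'' AND name LIKE N'''
    + REPLACE(@pattern2, N'''', N'''''')   -- manual quote escaping
    + N''';';

EXEC (@sql2);
```

Parameters cannot stand in for object names; when a table or column name must be dynamic, wrap it with QUOTENAME() before concatenating it.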
IMPLICIT CONVERSIONS
• SQL Server has to do a lot of extra work / scans when conversion operations are assumed by the SQL programmer.
• Happens all the time with data types you’d think wouldn’t need it, e.g. between date types and character types.
• Very useful data type conversion chart at http://bit.ly/15bDRRA.
• Data type precedence can also have an impact: http://bit.ly/13Zio1f.
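A small sketch of a typical mismatch: a varchar column compared with an nvarchar variable. Because nvarchar has higher data type precedence, the column side is converted, which shows up as CONVERT_IMPLICIT in the execution plan. The table and values are hypothetical, and whether the plan degrades to a scan depends on the column's collation.

```sql
CREATE TABLE #Accounts (
    AccountNumber varchar(20) NOT NULL PRIMARY KEY
);

INSERT INTO #Accounts (AccountNumber)
VALUES ('A100'), ('A200'), ('A300');

-- Mismatched: nvarchar variable against a varchar column.
-- Check the plan for CONVERT_IMPLICIT on AccountNumber.
DECLARE @wide nvarchar(20) = N'A100';
SELECT AccountNumber
FROM #Accounts
WHERE AccountNumber = @wide;

-- Matched: the variable has the column's type, so no conversion is needed.
DECLARE @narrow varchar(20) = 'A100';
SELECT AccountNumber
FROM #Accounts
WHERE AccountNumber = @narrow;

DROP TABLE #Accounts;
```

Run both statements with the actual execution plan enabled and compare the predicates on the index operator.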
IMPLICIT CONVERSION RESOURCES
• Ian Stirk’s Column Mismatch Utility at http://www.sqlservercentral.com/articles/Administration/65138/.
• Jonathan Kehayias’ plan cache analyzer at http://sqlblog.com/blogs/jonathan_kehayias/archive/2010/01/08/finding-implicit-column-conversions-in-the-plancache.aspx.
• Jonathan Kehayias’ index scan study at http://www.sqlperformance.com/2013/04/t-sqlqueries/implicit-conversion-costs
DEMOS: COMMA-DELIMITED PARAMETERS
• Example: pass a comma-separated list of OrderIDs
• String splitting is expensive, even using CLR
• Table-valued parameters are typically a better approach
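A minimal table-valued parameter sketch for the OrderID case. The type and procedure names (dbo.OrderIDList, dbo.GetOrdersByIDs) are hypothetical, and the procedure simply echoes the IDs; in practice it would join the parameter to the real orders table.

```sql
-- A reusable table type that carries the list of IDs.
CREATE TYPE dbo.OrderIDList AS TABLE (
    OrderID int NOT NULL PRIMARY KEY
);
GO

CREATE PROCEDURE dbo.GetOrdersByIDs
    @OrderIDs dbo.OrderIDList READONLY   -- TVPs must be declared READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- Assumption: a real procedure would join @OrderIDs to the orders table here.
    SELECT ids.OrderID
    FROM @OrderIDs AS ids;
END;
GO

-- Caller side: fill a variable of the table type and pass it as one parameter.
DECLARE @ids dbo.OrderIDList;
INSERT INTO @ids (OrderID) VALUES (1001), (1002), (1003);

EXEC dbo.GetOrdersByIDs @OrderIDs = @ids;
```

The list arrives already typed and set-based, so no string splitting happens inside the procedure.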
DEMOS: TEMPORARY STRUCTURES
• Which are better, temp tables or table variables?

                      Temp Table    Table Variable
Stored in?            Tempdb        Tempdb
Statistics?           Yes           No (1 row)
Indexes/Keys?         Yes           1 UK / PK only
Truncate?             Yes           No
Recompiles?           Yes           No
Parallelism?          Yes           No
Metadata Overhead?    Low           Lowest
Lock Overhead?        Normal        Lowest
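A short sketch contrasting the two structures, with invented names. It shows the syntax differences behind the comparison above: a temp table accepts separate CREATE INDEX and TRUNCATE TABLE statements, while a table variable gets its keys only through constraints in the declaration and is emptied with DELETE.

```sql
-- Temp table: full DDL support and column statistics.
CREATE TABLE #OrdersTemp (
    OrderID int   NOT NULL PRIMARY KEY,
    Amount  money NOT NULL
);
CREATE INDEX IX_OrdersTemp_Amount ON #OrdersTemp (Amount);

INSERT INTO #OrdersTemp (OrderID, Amount) VALUES (1, 10.00), (2, 25.50);
TRUNCATE TABLE #OrdersTemp;          -- allowed on temp tables
DROP TABLE #OrdersTemp;

-- Table variable: keys only via PRIMARY KEY / UNIQUE constraints in the declaration.
DECLARE @OrdersVar TABLE (
    OrderID int   NOT NULL PRIMARY KEY,
    Amount  money NOT NULL,
    UNIQUE (Amount, OrderID)
);

INSERT INTO @OrdersVar (OrderID, Amount) VALUES (1, 10.00), (2, 25.50);
DELETE FROM @OrdersVar;              -- TRUNCATE TABLE is not allowed here
```

For more than a handful of rows, test both with the actual plan, since the missing statistics on the table variable can lead to poor row estimates.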
CODING STANDARDS AND DISSIMILARITY
• Might sound frivolous, but naming schemes are important
o The specific convention matters less than being consistent and logical
• Story: dbo.UpdateCustomer vs. dbo.Customer_Update
• Always specify schema when creating, altering,
referencing objects
o Object resolution works a little bit harder without it
o More importantly, it can get the wrong answer
o And will often yield multiple copies of the same plan
• Do not use the sp_ prefix on stored procedures
o This has observable overhead, no matter how specific you are
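A brief sketch of the schema-qualification habit, using hypothetical object names (dbo.DemoCustomer and dbo.DemoCustomer_Update). Every create, reference, and call names the schema explicitly, and the procedure avoids the sp_ prefix.

```sql
CREATE TABLE dbo.DemoCustomer (
    CustomerID int           NOT NULL PRIMARY KEY,
    Name       nvarchar(100) NOT NULL
);
GO

-- Schema-qualified name, no sp_ prefix.
CREATE PROCEDURE dbo.DemoCustomer_Update
    @CustomerID int,
    @Name       nvarchar(100)
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE dbo.DemoCustomer          -- schema-qualified reference
    SET Name = @Name
    WHERE CustomerID = @CustomerID;
END;
GO

INSERT INTO dbo.DemoCustomer (CustomerID, Name) VALUES (1, N'Old name');
EXEC dbo.DemoCustomer_Update @CustomerID = 1, @Name = N'New name';   -- schema-qualified call

SELECT CustomerID, Name
FROM dbo.DemoCustomer;
```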
MIMICKING PRODUCTION
• Your dev machine is usually nothing like production
o Build representative data when you can
o Build a stats-only database when you can’t (a.k.a. a database clone)
• Will allow you to see plan issues, but not speed
o Make sure settings are the same
• @@VERSION, edition
• Max memory if possible, sp_configure options
• Logins (and permissions), tempdb settings
• Parameterization settings, recovery model, compression, snapshot isolation
• Compatibility level (usually not an issue when working with a restore)
• Run a full business cycle workload after a restore
o Simulate equivalent hardware: DBCC OPTIMIZER_WHATIF
o Use Distributed Replay when you can
• Not perfect, but more realistic than single-threaded trace replay
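A quick sketch for comparing settings between environments: run the same script on development and production and diff the output. It uses only catalog views and built-in functions; which settings matter most is a judgment call for your workload.

```sql
-- Version and edition.
SELECT @@VERSION                        AS version_string,
       SERVERPROPERTY('Edition')        AS edition,
       SERVERPROPERTY('ProductVersion') AS product_version;

-- Instance-level sp_configure options, including max server memory.
SELECT name, value_in_use
FROM sys.configurations
ORDER BY name;

-- Database-level settings for the current database.
SELECT name,
       compatibility_level,
       recovery_model_desc,
       is_parameterization_forced,
       snapshot_isolation_state_desc,
       is_read_committed_snapshot_on
FROM sys.databases
WHERE name = DB_NAME();
```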
SUMMARY
Let’s connect!
• Facebook, LinkedIn, Twitter at KEKLINE.
• Email at KEKline@sqlsentry.com
• Blogs at http://KevinEKline.com and http://ForITPros.com
WRAP UP
• Engage with us on social media
o I’m thankful for your word of mouth promotions and endorsements!
• Share your tough SQL tuning problems with us: http://answers.sqlperformance.com
• Download SQL Sentry Plan Explorer for free: http://www.sqlsentry.com/plan-explorer/
• Check out our other award winning tools: http://www.sqlsentry.net/download
NOLOCK
• It is a turbo button …if you’re ok with inaccuracy
• There are times it is perfectly valid
o Ballpark row counts
o Please use session-level setting, not table hint
• Usually, though, better to use SNAPSHOT or RCSI
o But test under heavy load
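A sketch of the session-level setting in place of per-table NOLOCK hints, followed by the statements that enable the row-versioning alternatives. The count against sys.objects is a stand-in query, and the database name in the commented ALTER DATABASE lines is a placeholder.

```sql
-- Session-level equivalent of NOLOCK on every table in the following queries.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- Ballpark figure only: dirty reads can count rows twice or miss them.
SELECT COUNT(*) AS approximate_row_count
FROM sys.objects;

-- Return to the default level when the approximate work is done.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Row-versioning alternatives (placeholder database name; left commented out
-- because they change database-wide behavior and add tempdb load):
-- ALTER DATABASE [YourDatabase] SET READ_COMMITTED_SNAPSHOT ON;
-- ALTER DATABASE [YourDatabase] SET ALLOW_SNAPSHOT_ISOLATION ON;
```

Setting the level once per session keeps the choice in a single place, which makes it easy to remove later.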