Performance for Small/Medium Business

SQL Performance 2011/12
Joe Chang, SolidQ
www.qdpma.com
http://sqlblog.com/blogs/joe_chang/default.aspx
jchang6@yahoo.com
2011
• Hardware is Powerful & Cheap
– CPU (cores), memory, and now IO too!
• Quad-core since 2006, 1GHz since 2000
• Is Performance still a concern?
– Yes, along with fundamentals
• Modern Performance Strategy
– Can handle minor inefficiencies
– Identify and circumvent the really bad things
Modern Hardware
• 4-12 cores per processor socket
• 16GB DIMM at less than $1K
• SSD
– Enterprise grade SSD still moderately expensive
• Both SLC and MLC
– Consumer SSD – really cheap, $3-4K per TB
• Uneven performance characteristics over time
• Desired IO performance: 1-2GB/s, 20-50K IOPS
Hardware Baseline 2011
Entry
• 1 Xeon E3 quad-core
• 16GB memory
– 4 x 4GB unbuffered ECC
• SSD options:
– 2-4 SATA SSDs
– 1-2 PCI-E SSDs
Mid-range
• 2 Xeon 5600 6-core
• 48 – 192GB
– 6 x 8GB to 12 x 16GB
• SSD options
– 16+ SATA SSDs
– 4-5 PCI-E SSDs
Performance Fundamentals
• Network round-trips
– Owner qualified, case correct
• Log write latencies
– Sufficiently low to support transaction volume
• Not necessarily separate data and log disks
• Normalization – correct data trumps all!
• Indexes – a few good ones, and not too many!
• SQL – written so that the optimizer can understand it
What can go wrong?
• With immense hardware resources
• And a great database engine
– What can go wrong?
• Following a fixed set of rules and procedures
– Basic transaction processing should work well
– If your process does something unanticipated
• Some things can go horribly wrong
Performance Concepts
• Query Optimizer
• Execution plan operators
– Formula for component operation
• Data distribution statistics
– to estimate rows & pages, automatically updated
– Rules when estimate not possible
• Stored procedure compile rules
– Parameters and variables
Stored Procedure
Basics
Parameters and Variables
• On compile
– Parameter values are used for the row estimate
– Variables – assume unknown value
• Consider effect of skewed distribution
Parameter & Variable
6 rows for value 1
4 rows for value unknown
Consider impact for skewed data distributions
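The difference between the two estimates can be sketched in a few lines. This is a minimal illustration, not the optimizer's actual code: the column data is hypothetical and chosen only so that the numbers match the slide (6 rows for value 1, 4 rows for an unknown value), and the function names are made up for this example.

```python
from collections import Counter

# Hypothetical column values: value 1 occurs 6 times, and
# 12 rows over 3 distinct values average 4 rows per value.
column = [1] * 6 + [2] * 3 + [3] * 3

def estimate_for_known_value(values, v):
    """Parameter case: the value is known at compile time, so a
    histogram-style lookup can return that value's own row count."""
    return Counter(values)[v]

def estimate_for_unknown_value(values):
    """Variable case: the value is unknown at compile time, so the
    estimate falls back to the average rows per distinct value."""
    return len(values) / len(set(values))

print(estimate_for_known_value(column, 1))  # 6
print(estimate_for_unknown_value(column))   # 4.0
```

With a heavily skewed column the average can be far from the row count of any particular value, which is why the choice between parameter and variable matters.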
Stored Procedure Compile Options
• WITH RECOMPILE
• OPTIMIZE FOR
• Plan Guide
• Temp table
– KEEP PLAN, KEEPFIXED PLAN
Compile & Execute Time
• Plan reuse desired when
– Compile cost is high relative to execute cost
• Recompile desired when
– Execute cost is high relative to compile cost
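The trade-off can be made concrete with a rough cost comparison. All numbers below are hypothetical, and the model (total elapsed milliseconds over a number of calls) is an assumption for illustration rather than anything the engine computes.

```python
def total_cost_reuse(compile_ms, execute_ms, executions):
    """Compile once, then reuse the cached plan for every call."""
    return compile_ms + execute_ms * executions

def total_cost_recompile(compile_ms, execute_ms, executions):
    """Compile on every call, so each call gets a plan built for it."""
    return (compile_ms + execute_ms) * executions

# Case A (hypothetical): a cheap lookup, 50 ms to compile, 1 ms to run.
reuse_a = total_cost_reuse(50, 1, 1000)          # 1050
recompile_a = total_cost_recompile(50, 1, 1000)  # 51000

# Case B (hypothetical): a report where the reused plan fits the actual
# parameters badly (20000 ms) but a freshly compiled plan runs in 2000 ms.
reuse_b = total_cost_reuse(50, 20000, 10)          # 200050
recompile_b = total_cost_recompile(50, 2000, 10)   # 20500

print(reuse_a, recompile_a)
print(reuse_b, recompile_b)
```

In case A the compile cost dominates, so plan reuse wins. In case B the compile cost is negligible next to the execution time saved by a plan that fits the parameters.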
Statistics
Basics
Statistics
• No statistics – table variables
• Temp table – statistics auto recompute
– Recompute triggers: 6 rows modified, then 500 rows, then every 20% thereafter
• Statistics sampling
– Random page, how to handle skewed distribution?
• Upper and lower bounds
– Problems caused by incrementing columns
• Propagation errors
Statistics Recompute
• Scenario: start with accurate statistics
• Update column with new values
– That did not previously exist
• If fewer than 20% of rows updated
– Auto-recompute is not triggered
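The scenario can be checked against the thresholds with a small helper. This is a sketch under an assumption: it reads the rule for larger tables as 500 modifications plus 20% of the row count, which is how the thresholds were commonly described for SQL Server versions of this period. The function names are hypothetical.

```python
def recompute_threshold(row_count, is_temp_table=False):
    """Modifications needed before statistics are auto-recomputed.

    Assumed rule: 6 for a temp table with fewer than 6 rows, 500 for
    tables up to 500 rows, otherwise 500 plus 20% of the row count.
    """
    if is_temp_table and row_count < 6:
        return 6
    if row_count <= 500:
        return 500
    return 500 + row_count * 20 // 100

def auto_recompute_fires(row_count, rows_modified, is_temp_table=False):
    """True when enough rows changed to trigger a recompute."""
    return rows_modified >= recompute_threshold(row_count, is_temp_table)

# Hypothetical scenario: 150,000 of 1,000,000 rows get new values.
table_rows = 1_000_000
rows_updated = 150_000
print(recompute_threshold(table_rows))                 # 200500
print(auto_recompute_fires(table_rows, rows_updated))  # False
```

Here 15% of the rows carry values the statistics have never seen, yet no recompute is triggered, so estimates for those values are based on stale data.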
Sampling
• Default sampling percentage is usually good
• Caution: not a random row sample!
– Random sampling of page
• From nonclustered index if available
• If there is correlation between pages & values
– Then serious over estimation possible
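A toy simulation shows why correlation between pages and values matters. It is only a sketch: the table layout is hypothetical (every page holds rows of a single value), and the scale-up estimator is deliberately naive and is not the algorithm SQL Server uses.

```python
import random

ROWS_PER_PAGE = 100
PAGES = 1000
SCALE = 10  # sample one tenth of the data, then scale up by 10

# Hypothetical table: page N holds 100 rows, all with value N, so page
# and value are perfectly correlated. True rows per value: 100.
pages = [[page_no] * ROWS_PER_PAGE for page_no in range(PAGES)]
all_rows = [v for page in pages for v in page]

def rows_per_value_estimate(sample, scale):
    """Naive scale-up: average rows per value seen in the sample,
    multiplied by the inverse of the sampling fraction."""
    return len(sample) / len(set(sample)) * scale

rng = random.Random(2011)

# Page sample: pick one tenth of the pages and read every row on them.
sampled_pages = rng.sample(pages, PAGES // SCALE)
page_sample = [v for page in sampled_pages for v in page]

# Row sample: pick one tenth of the rows, independent of page.
row_sample = rng.sample(all_rows, len(all_rows) // SCALE)

page_estimate = rows_per_value_estimate(page_sample, SCALE)
row_estimate = rows_per_value_estimate(row_sample, SCALE)
print(page_estimate)  # 1000.0, ten times the true 100 rows per value
print(row_estimate)   # close to 100
```

Every sampled page shows its value at full strength, so scaling up treats each value as ten times more common than it is. A true row sample does not have this problem because each value is thinned out along with the rest of the data.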
Out of range
• Statistics sampling tries to identify lower and upper bound
Bad Execution Plan Examples
Not comprehensive
Scenarios
• Or condition
• Multiple optional search arguments
• Skewed distributions
– One set of business logic for both small and large data sets
– Reports for 1 day, 1 week, 1 month, 1 year
• Statistics related problems
– Resulting in horrible execution plan
When the Query Optimizer Does Not Understand You
Simple OR Conditions
Simple Index Seek
OR Condition in Join
Alternative – Union
UNION and UNION ALL
• UNION
– Only distinct rows
– Sort to eliminate duplicates
• Can be expensive for high row counts
• UNION ALL
– All rows
– No sort to eliminate duplicates
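The semantics of the OR-to-UNION rewrite can be verified on a small data set. The sketch below uses SQLite through the Python standard library purely as a stand-in, since the UNION and UNION ALL semantics are standard SQL. The `orders` table and its columns are hypothetical, and nothing here reproduces SQL Server's execution plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         bill_to INTEGER, ship_to INTEGER);
    INSERT INTO orders VALUES (1, 10, 10), (2, 10, 20),
                              (3, 30, 10), (4, 40, 50);
""")

# Original form: one query with an OR across two columns.
or_rows = conn.execute(
    "SELECT order_id FROM orders "
    "WHERE bill_to = 10 OR ship_to = 10 ORDER BY order_id").fetchall()

# Rewrite: one query per column, duplicates removed by UNION.
union_rows = conn.execute(
    "SELECT order_id FROM orders WHERE bill_to = 10 "
    "UNION "
    "SELECT order_id FROM orders WHERE ship_to = 10 "
    "ORDER BY order_id").fetchall()

# UNION ALL keeps order 1 twice because it matches both branches.
union_all_rows = conn.execute(
    "SELECT order_id FROM orders WHERE bill_to = 10 "
    "UNION ALL "
    "SELECT order_id FROM orders WHERE ship_to = 10 "
    "ORDER BY order_id").fetchall()

print(or_rows)         # [(1,), (2,), (3,)]
print(union_rows)      # [(1,), (2,), (3,)]
print(union_all_rows)  # [(1,), (1,), (2,), (3,)]
```

UNION matches the OR query exactly. UNION ALL is only a safe substitute when the branches cannot return the same row, or when the second branch explicitly excludes rows already returned by the first.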
Multiple Optional SARGs
This was supposed to work, but does not
Parameterized SQL
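The query text from these slides did not survive, so the following is a sketch under an assumption: that the optional search arguments are handled by building parameterized SQL that contains only the predicates actually supplied. The table, columns, and function name are hypothetical, and SQLite stands in for the database.

```python
import sqlite3

def build_order_search(customer_id=None, status=None):
    """Return (sql, params) holding only the supplied predicates.

    Values always travel as parameters, never as concatenated text.
    """
    sql = "SELECT order_id FROM orders"
    clauses, params = [], []
    if customer_id is not None:
        clauses.append("customer_id = ?")
        params.append(customer_id)
    if status is not None:
        clauses.append("status = ?")
        params.append(status)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY order_id", params

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER, status TEXT);
    INSERT INTO orders VALUES (1, 7, 'open'), (2, 7, 'shipped'),
                              (3, 8, 'open');
""")

sql, params = build_order_search(customer_id=7)
print(sql)
print(conn.execute(sql, params).fetchall())  # [(1,), (2,)]
```

Each combination of arguments yields its own statement text, so each can get a plan that fits the predicates present instead of one plan that must cover every combination.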
Skew and Range Variation
Statistics Out of Range
Statistics Out of Range (cont.)
Table Variable
No Statistics
assumes 1 row, 1 page
Why? No recompiles
Loop Join – Scan Inner Source
Estimate 1 row
Really Bad News
The Correct Plan
Estimate 1 row
Hash Join forced
with hint
Loop Join vs. Hash Join
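A rough count of rows touched shows why a one-row estimate is so damaging here. This is a simplified model with hypothetical row counts, not the optimizer's cost formulas.

```python
def loop_join_scan_cost(outer_rows, inner_rows):
    """Nested loop that scans the whole inner source per outer row."""
    return outer_rows * inner_rows

def hash_join_cost(outer_rows, inner_rows):
    """Hash join: each input is read once (build, then probe)."""
    return outer_rows + inner_rows

INNER_ROWS = 100_000  # hypothetical size of the inner source

# What the plan was chosen for: the table variable estimated at 1 row.
estimated_loop = loop_join_scan_cost(1, INNER_ROWS)  # 100000
estimated_hash = hash_join_cost(1, INNER_ROWS)       # 100001

# What actually runs: the table variable holds 50,000 rows.
actual_loop = loop_join_scan_cost(50_000, INNER_ROWS)  # 5000000000
actual_hash = hash_join_cost(50_000, INNER_ROWS)       # 150000

print(estimated_loop, estimated_hash)
print(actual_loop, actual_hash)
```

At the estimated single row the two plans look equivalent, so the loop join is a reasonable choice. At the actual row count the loop join touches tens of thousands of times more rows than the hash join.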
Alternative Plan with Index
Temp Table versus CTE
• Consider options
– SELECT xxx INTO #Temp FROM Sql
Main Expression
– WITH tmp AS (SELECT xxx FROM Sql)
Main Expression