EXECUTION PLANS
By Nimesh Shah, Amit Bhawnani
Outline

What is an execution plan
How are execution plans created
How to get an execution plan
  - Graphical
  - Text
What is an execution plan


The execution plan is created by the optimizer and
used for the execution of a statement. Once the
execution of a statement has started, the execution
plan is followed in a step-by-step manner to
retrieve the required result. It is an explanation of
the steps to perform during statement execution.
An execution plan is composed of primitive
operations. Examples of primitive operations are:
reading a table completely, using an index,
performing a nested loop or a hash join.
How are execution plans created


Parse: The first phase parses the SQL query to check its syntax and creates a query
processor tree, which defines the logical steps needed to execute the SQL. The
component that produces this tree is called the ‘algebrizer’.
Optimize: The next step is to find an optimized way of executing the query
processor tree produced by the ‘algebrizer’. This task is done by the ‘Optimizer’.
The optimizer takes data statistics, such as how many rows a table has, how many
unique values exist in those rows, and whether the table spans more than one page.
In other words, it uses information about the data itself. Using these statistics and
the query processor tree, the optimizer generates and evaluates many candidate
plans, estimates the CPU and I/O cost of each, and chooses the one with the lowest
estimated cost among those it evaluated.
The optimizer arrives at an estimated plan; for this estimated plan, it then tries to
find an actual execution plan in the cache. The estimated plan is what comes out of
the optimizer, and the actual plan is the one generated once the query is actually
executed.

Execute: The final step is to execute the plan sent by the optimizer.
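
The cost-based choice described in the Optimize step can be sketched in a few lines of Python. This is only an illustration, not how SQL Server's optimizer is implemented: the `CandidatePlan` class, the plan names, and the cost numbers are all made up for the example, and total cost is assumed to be CPU cost plus I/O cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePlan:
    # Hypothetical candidate plan with made-up cost estimates.
    name: str
    cpu_cost: float
    io_cost: float

    @property
    def total_cost(self) -> float:
        # Simplifying assumption: total cost is CPU cost plus I/O cost.
        return self.cpu_cost + self.io_cost


def choose_plan(candidates):
    """Return the candidate with the lowest estimated total cost."""
    return min(candidates, key=lambda plan: plan.total_cost)


candidates = [
    CandidatePlan("table scan + hash join", cpu_cost=0.5, io_cost=4.0),
    CandidatePlan("index seek + nested loops", cpu_cost=0.25, io_cost=0.5),
    CandidatePlan("index scan + merge join", cpu_cost=0.75, io_cost=2.0),
]
best = choose_plan(candidates)
print(best.name, best.total_cost)
```

Only the candidates in the list are compared, which mirrors the point that the chosen plan is the cheapest of those evaluated, not necessarily the cheapest possible.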





The creation of an execution plan takes time.
Not every execution option is explored; often a “good
enough” execution plan is generated and then sent to the
database engine for execution.
The execution plan is estimated, and may change when the
T-SQL is actually executed by the database engine.
Execution plans are usually cached (in the plan cache) so
that they can be reused if an identical (or parameterized)
query is submitted for execution again.
Reusing a cached execution plan can save time because a
new execution plan does not have to be created each
time the same query is re-executed.
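
Plan reuse can be pictured with a small Python sketch. The `PlanCache` class is hypothetical and is not SQL Server's plan cache; it assumes that replacing literals with a placeholder is a fair stand-in for parameterization, and the "plan" it stores is just a string.

```python
import re


class PlanCache:
    """Toy plan cache: one stored plan per parameterized query text."""

    def __init__(self):
        self._plans = {}
        self.compilations = 0  # how many times a plan had to be created

    @staticmethod
    def _cache_key(sql):
        # Assumption: swapping string and integer literals for "?" is a
        # rough stand-in for query parameterization.
        without_strings = re.sub(r"'[^']*'", "?", sql)
        return re.sub(r"\b\d+\b", "?", without_strings)

    def get_plan(self, sql):
        key = self._cache_key(sql)
        if key not in self._plans:
            # Cache miss: "compile" a new plan (here just a label).
            self.compilations += 1
            self._plans[key] = f"plan for: {key}"
        return self._plans[key]


cache = PlanCache()
first = cache.get_plan("SELECT * FROM Orders WHERE OrderID = 42")
second = cache.get_plan("SELECT * FROM Orders WHERE OrderID = 7")
third = cache.get_plan("SELECT * FROM Customers WHERE Name = 'Ann'")
print(cache.compilations)
```

The first two queries differ only in a literal, so the second call reuses the plan created by the first; the third query needs a plan of its own.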
How to get an execution plan

SQL Server has multiple ways to get execution
plans. The two most important methods are:
- Graphical: The graphical representation of SQL Server execution plans
is easily accessible in Management Studio but is hard to
share, especially because detailed information for the
individual operations is only visible when the mouse is over
the particular operation ("hover").
- Text: The tabular execution plan is hard to read but easy to
copy. The table includes all the information in one shot.
Graphical Execution plan
Interpreting Graphical Execution Plans




You read a graphical execution plan from right to left and
top to bottom.
Icons (operators) - The icons you see in the above execution
plan are 2 of the several operators that represent the various
actions and decisions that can make up an execution
plan.
Arrows - An arrow between two operators
represents data being passed between them. The thickness
of the arrow reflects the amount of data being passed;
a thicker arrow means more rows.
Costs (per operator) - Below each icon, a number is
displayed as a percentage. This number represents the relative
cost of that operator within the query.
Tooltips

Each of the icons and the arrows has, associated with it, a pop-up window called a
ToolTip, which you can access by hovering your mouse pointer over the icon.

Physical Operation - Lists the physical operation being performed for the
node, such as a Clustered Index Scan, Index Seek, Aggregate, Hash or Nested
Loop Join, and so on.

Logical Operation - Lists the logical operation that corresponds with the
physical operation, such as the logical operation of a union being physically
performed as a merge join.

Estimated I/O Cost - Indicates the estimated relative I/O cost for the operation.
Preferably, this value should be as low as possible.

Estimated CPU Cost - Lists the estimated relative CPU cost for the operation.

Estimated Number of Executions - Lists the estimated number of times this
operation will be executed.

Estimated Operator Cost - Indicates the estimated cost to execute the physical
operation. For best performance, you want this value as low as possible.
Tooltips (contd.)

Estimated Number of Rows - Lists the estimated number of rows to be output
by the operation and passed on to the parent operation.

Estimated Row Size - Indicates the estimated average row size of the rows
being passed through the operator.

Estimated Subtree Cost - Lists the estimated cumulative total cost of this
operation and all child operations preceding it in the same subtree.

Object - Indicates which database object is being accessed by the operation
being performed by the current node.

Predicate - Indicates the search predicate specified for the object in the original
query.

Seek Predicates - Indicates the search predicate being used in the seek against
the index when an index seek is being performed.

Output List - Indicates which columns of data are being returned by the
operation.

Ordered - Indicates whether the rows are being retrieved via an index in sorted
order.
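
How the operator cost, the subtree cost, and the percentage under each icon relate to one another can be sketched as follows. The `Operator` class and all cost values are invented for illustration, and the sketch makes the simplifying assumption that an operator's cost is its I/O cost plus its CPU cost.

```python
from dataclasses import dataclass, field


@dataclass
class Operator:
    # Hypothetical plan node; the cost values used below are made up.
    name: str
    io_cost: float
    cpu_cost: float
    children: list = field(default_factory=list)

    @property
    def operator_cost(self):
        # Simplifying assumption: operator cost = I/O cost + CPU cost.
        return self.io_cost + self.cpu_cost

    @property
    def subtree_cost(self):
        # Cumulative cost of this operator and all of its child operators.
        return self.operator_cost + sum(c.subtree_cost for c in self.children)


def cost_percentages(root):
    """Each operator's own cost as a percentage of the whole plan's cost."""
    total = root.subtree_cost
    result = {}
    stack = [root]
    while stack:
        op = stack.pop()
        result[op.name] = round(100 * op.operator_cost / total)
        stack.extend(op.children)
    return result


seek = Operator("Index Seek", io_cost=0.5, cpu_cost=0.5)
scan = Operator("Clustered Index Scan", io_cost=1.5, cpu_cost=0.5)
join = Operator("Hash Match", io_cost=0.0, cpu_cost=1.0, children=[seek, scan])
percentages = cost_percentages(join)
print(join.subtree_cost, percentages)
```

The subtree cost of the root operator is the estimated cost of the whole plan, which is why the per-operator percentages add up to 100.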
Logical and Physical Operators

Each operator implements a single basic operation,
such as:







Scanning data from a table
Seeking data in a table
Aggregating data
Sorting
Joining two data sets
Etc.
In total, there are 79 different operators that can
be included in an execution plan.
Table Scan




Seeing a table scan often indicates a problem that needs to be
addressed.
A table scan indicates that every row in the table had to be
examined to see if it met the query criteria, which can mean slow
performance if there is a large number of rows.
A table scan indicates there is no clustered index on the table, and
the table is a heap.
In most cases, you will want to add a clustered index to every table,
as it has the potential of boosting the performance of the query.
Clustered Index Scan




A clustered index scan is a scan of all the rows of a
table that has a clustered index.
Like a table scan, clustered index scans can be slow
and use up lots of server resources.
Generally, clustered index scans should be avoided
(although they are better than table scans).
On the other hand, when tables are small or many
rows are returned, then a clustered index scan might
be the fastest way to return data.
Clustered Index Seek


If there is an available and useful index, and there
is a sargable WHERE clause, the query optimizer
can usually identify the rows to be returned very
quickly and return them without having to scan
each row of the table.
Ideally, for best query performance, clustered index
seeks should be used as often as feasible. Consider
them the “gold standard” for returning data.
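
The difference between a scan and a seek can be illustrated with a short Python sketch. It is a loose analogy, not SQL Server's storage engine: a binary search over a sorted key list stands in for navigating an index, and the table, keys, and function names are invented for the example.

```python
from bisect import bisect_left


def scan(rows, key):
    """Examine every row, the way a table scan or index scan does."""
    examined = 0
    matches = []
    for row in rows:
        examined += 1
        if row[0] == key:
            matches.append(row)
    return matches, examined


def seek(sorted_keys, rows, key):
    """Jump to the first matching key, then read only the matching rows."""
    # Binary search is an assumption standing in for index navigation.
    pos = bisect_left(sorted_keys, key)
    examined = 0
    matches = []
    while pos < len(rows) and sorted_keys[pos] == key:
        matches.append(rows[pos])
        examined += 1
        pos += 1
    return matches, examined


# Hypothetical table of 1000 rows, stored in key order.
rows = [(i, f"customer-{i}") for i in range(1, 1001)]
keys = [row[0] for row in rows]

scan_result, scan_examined = scan(rows, 750)
seek_result, seek_examined = seek(keys, rows, 750)
print(scan_examined, seek_examined)
```

Both functions return the same row, but the scan reads all 1000 rows while the seek reads only the one that matches.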
Nonclustered Index Scan



All rows in the non-clustered index are scanned, and all
rows that match the WHERE clause are returned.
As with all scans, it can be slow and require extra
I/O resources.
Generally, non-clustered index scans should be
avoided.
Nonclustered Index Seek




A non-clustered index is used to identify the row(s)
to be returned, so every row does not need to be
scanned (assumes a sargable WHERE clause).
This is generally much faster than a non-clustered
index scan.
Like clustered index seeks, non-clustered index seeks
are generally a good thing.
One exception is when bookmark (RID or Key) lookups
occur as part of the non-clustered index seek; in that
case, performance may lag if many rows are returned.
RID Lookup/ Key Lookup





A RID/Key Lookup is generally an indicator of a
performance issue.
A RID Lookup is a form of a bookmark index lookup on
a table without a clustered index (a heap).
A Key Lookup is a form of a bookmark index lookup on
a table with a clustered index.
While Lookups are often faster than most “scans,” this is
often not the case if many rows have to be returned.
Generally, RID Lookups should be eliminated with the
addition of an appropriate clustered index and, if
necessary, a covering index or an index with included columns.
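
Why a covering index removes the lookup can be sketched in Python. The dictionaries below are only stand-ins for a clustered and a non-clustered index, and the table, column names, and helper functions are hypothetical.

```python
# Hypothetical table; "id" plays the role of the clustered index key.
rows = [
    {"id": 1, "city": "Pune", "name": "Asha"},
    {"id": 2, "city": "Mumbai", "name": "Ravi"},
    {"id": 3, "city": "Pune", "name": "Meera"},
]
clustered_index = {row["id"]: row for row in rows}


def build_index(rows, key_col, included=()):
    """Non-clustered index: key -> entries holding the id plus included columns."""
    index = {}
    for row in rows:
        entry = {"id": row["id"], **{col: row[col] for col in included}}
        index.setdefault(row[key_col], []).append(entry)
    return index


def names_in_city(index, city):
    """Return the names for a city and the number of lookups that were needed."""
    names, lookups = [], 0
    for entry in index.get(city, []):
        if "name" in entry:
            # The index covers the query: no trip to the base table.
            names.append(entry["name"])
        else:
            # Key lookup: fetch the missing column through the clustered index.
            lookups += 1
            names.append(clustered_index[entry["id"]]["name"])
    return names, lookups


narrow = build_index(rows, "city")
covering = build_index(rows, "city", included=("name",))
narrow_result = names_in_city(narrow, "Pune")
covering_result = names_in_city(covering, "Pune")
print(narrow_result, covering_result)
```

With the narrow index, each matching row costs one extra lookup, which is why lookups become expensive when many rows are returned; the covering index answers the same query with none.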
Joins (Loop, Merge, Hash)



The nested loop join compares each row from one table (the “outer
table”) to each row from the other table (the “inner table”), looking
for rows that satisfy the join predicate.
The merge join works by simultaneously reading and comparing the
two sorted inputs one row at a time. At each step, it compares the
next row from each input. If the rows are equal, it outputs a joined
row and continues. If the rows are not equal, it discards the lesser of
the two rows and continues.
The hash join algorithm executes in two phases known as the “build”
and “probe” phases. During the build phase, it reads all rows from
the first input (often called the left or build input), hashes the rows
on the equijoin keys, and builds an in-memory hash table. During the
probe phase, it reads all rows from the second input (often called the
right or probe input), hashes these rows on the same equijoin keys,
and probes for matching rows in the hash table.
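
The three join algorithms described above can be sketched in Python as follows. This is a minimal illustration under simple assumptions, not SQL Server's implementation: each row is a tuple whose first element is the join key, the join is an inner equijoin, and the sample tables are made up.

```python
from collections import defaultdict


def nested_loop_join(outer, inner):
    """Compare each outer row to each inner row."""
    return [(o, i) for o in outer for i in inner if o[0] == i[0]]


def merge_join(left, right):
    """Join two inputs that are already sorted on the join key."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1  # discard the lesser row
        elif left_key > right_key:
            j += 1  # discard the lesser row
        else:
            # Pair this left row with every right row that has the same key.
            k = j
            while k < len(right) and right[k][0] == left_key:
                result.append((left[i], right[k]))
                k += 1
            i += 1
    return result


def hash_join(build, probe):
    """Build a hash table on one input, then probe it with the other."""
    table = defaultdict(list)
    for row in build:  # build phase
        table[row[0]].append(row)
    # Probe phase: look up each probe row's key in the hash table.
    return [(b, p) for p in probe for b in table.get(p[0], [])]


# Hypothetical inputs, both sorted on the join key.
customers = [(1, "Asha"), (2, "Ravi"), (3, "Meera")]
orders = [(1, "book"), (1, "pen"), (3, "lamp"), (4, "desk")]

loop_rows = nested_loop_join(customers, orders)
merge_rows = merge_join(customers, orders)
hash_rows = hash_join(customers, orders)
print(loop_rows)
```

All three functions produce the same joined rows; they differ in how much work they do and in what they require of their inputs (the merge join needs sorted inputs, and the hash join needs memory for the hash table).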
Text execution plan