Query Execution
Chapter 15, Section 15.1
Presented by Jiten Oswal
CS 257

Agenda
Query processor and the major parts of the query processor
Physical-query-plan operators
Scanning tables
Basic approaches to locating the tuples of a relation R
Sorting while scanning tables
Computation model for physical operators
I/O cost for scan operators
Iterators

What is a Query Processor
A group of DBMS components that converts user queries and data-modification commands into a sequence of database operations.
It also executes those operations.
It must supply the details of how the query is to be executed.

Major Parts of the Query Processor
Query execution: the algorithms that manipulate the data of the database.
The focus is on the operations of the extended relational algebra.

Outline of Query Compilation
Query compilation consists of three steps:
Parsing: A parse tree for the query is constructed.
Query rewrite: The parse tree is converted to an initial query plan, which is then transformed into an equivalent logical query plan that is expected to take less time to execute.
Physical plan generation: The logical query plan is converted into a physical query plan by selecting an algorithm for each operator and the order in which the operators execute.

Physical-Query-Plan Operators
Physical operators are implementations of the operators of relational algebra.
Physical query plans also use operators that are not part of relational algebra, such as "scan", which scans a table, that is, brings each tuple of some relation into main memory.

Scanning Tables
One of the most basic things we can do in a physical query plan is read the entire contents of a relation R.
A variation of this operator involves a simple predicate: read only those tuples of R that satisfy the predicate.
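The scan operator and its predicate variation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation is modeled as an in-memory list of blocks, and the names `table_scan`, `Row`, `Block`, and the toy relation `R` are hypothetical, not taken from the slides or from any DBMS.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Row = Tuple          # hypothetical: one tuple of attribute values
Block = List[Row]    # hypothetical: one disk block holding several tuples


def table_scan(blocks: Iterable[Block],
               predicate: Optional[Callable[[Row], bool]] = None) -> Iterator[Row]:
    """Read every block of R once and yield the tuples that pass the predicate."""
    for block in blocks:            # each block stands in for one disk read
        for row in block:
            # With no predicate, every tuple is returned (a plain scan).
            if predicate is None or predicate(row):
                yield row


# Toy relation R(id, city) stored in two blocks (assumed data).
R = [
    [(1, "Oslo"), (2, "Lima")],
    [(3, "Oslo")],
]

all_rows = list(table_scan(R))
oslo_rows = list(table_scan(R, lambda row: row[1] == "Oslo"))
```

Note that the predicate version still reads every block; it only reduces the number of tuples passed on to the rest of the plan.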
Scanning Tables (continued)
Basic approaches to locating the tuples of a relation R:
Table-scan: Relation R is stored in secondary memory with its tuples arranged in blocks. It is possible to get the blocks one by one.
Index-scan: If there is an index on any attribute of R, we can use this index to get all the tuples of R.

Sorting While Scanning Tables
There are a number of reasons to sort a relation:
A query could include an ORDER BY clause, requiring that a relation be sorted.
Some algorithms that implement relational algebra operations require one or both arguments to be sorted relations.
The physical-query-plan operator sort-scan takes a relation R and the attributes on which the sort is to be made, and produces R in that sorted order.

Computation Model for Physical Operators
Choosing physical-plan operators wisely is essential for a good query processor.
The cost of each operator is estimated by the number of disk I/Os it performs.
The cost of writing the final answer depends only on the size of the answer, not on how it was computed, so the final write-back cost is added to the total cost of the query rather than charged to an individual operator.

Parameters for Measuring Costs
Parameters that affect the performance of a query:
The buffer space available in main memory at the time the query executes
The size of the input and the size of the output generated
The size of a block on disk and the size of main memory

Parameters for Measuring Costs (continued)
B: The number of blocks needed to hold all tuples of relation R. Also denoted B(R).
T: The number of tuples in relation R.
Also denoted T(R).
V: The number of distinct values that appear in a column of relation R. V(R, a) is the number of distinct values of column a in relation R.

I/O Cost for Scan Operators
If relation R is clustered, the table-scan operator requires approximately B disk I/Os.
If relation R is not clustered, the number of required disk I/Os is generally much higher.
An index on a relation R generally occupies many fewer than B(R) blocks.
Therefore a scan of the entire relation R, which takes at least B disk I/Os, requires more I/Os than reading the entire index.

Iterators for Implementation of Physical Operators
Many physical operators can be implemented as an iterator.
Iterators are groups of methods by which the operators comprising a physical query plan pass requests for tuples, and the answers, among themselves.
The three methods forming the iterator for an operation are Open(), GetNext(), and Close():
1. Open(): Starts the process of getting tuples. It initializes any data structures needed to perform the operation.
2. GetNext(): Returns the next tuple in the result. If there are no more tuples to return, GetNext returns a special value NotFound.
3. Close(): Ends the iteration after all tuples have been obtained. It calls Close on any arguments of the operator.

Thank You
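Supplementary sketch: the Open/GetNext/Close interface from the iterator slides can be illustrated with two small Python classes. This is a minimal sketch under stated assumptions, not the implementation used by any particular DBMS: the class names `TableScan` and `Select`, the helper `run`, the sentinel `NOT_FOUND`, and the list-of-blocks layout are all hypothetical.

```python
NOT_FOUND = object()  # stand-in for the special NotFound value


class TableScan:
    """Leaf iterator over a relation stored as a list of blocks (assumed layout)."""

    def __init__(self, blocks):
        self.blocks = blocks

    def open(self):
        # Initialize the cursor: current block and position inside it.
        self.block_no = 0
        self.pos = 0

    def get_next(self):
        # Skip exhausted or empty blocks, then return one tuple.
        while self.block_no < len(self.blocks):
            block = self.blocks[self.block_no]
            if self.pos < len(block):
                row = block[self.pos]
                self.pos += 1
                return row
            self.block_no += 1
            self.pos = 0
        return NOT_FOUND

    def close(self):
        # A leaf has no arguments to close; mark the scan as finished.
        self.block_no = len(self.blocks)


class Select:
    """Selection operator that pulls tuples from its argument on demand."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def get_next(self):
        # Keep asking the argument until a tuple qualifies or input runs out.
        while True:
            row = self.child.get_next()
            if row is NOT_FOUND or self.predicate(row):
                return row

    def close(self):
        # Close calls Close on the operator's argument.
        self.child.close()


def run(plan):
    """Drive a plan to completion and collect its result tuples."""
    plan.open()
    result = []
    row = plan.get_next()
    while row is not NOT_FOUND:
        result.append(row)
        row = plan.get_next()
    plan.close()
    return result


plan = Select(TableScan([[(1, "a"), (2, "b")], [], [(3, "a")]]),
              lambda row: row[1] == "a")
rows = run(plan)
```

Because each operator asks its argument for one tuple at a time, the selection never needs the whole relation in memory at once, which is the main reason for structuring operators this way.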