Nested Loop Joins (15.2)
Chinmay Bhawe( 119 )
April 16, 2012
O Tuple-Based Nested-Loop Join
O An Iterator for Tuple-Based Nested-Loop Join
O A Block-Based Nested-Loop Join Algorithm
O Analysis of Nested-Loop Join
15.3.1
Tuple-Based Nested-Loop Join
O The simplest variation of nested-loop join has loops that range over individual tuples of the relations involved. In this algorithm, which we call tuple-based nested-loop join, we compute the join as follows
R
S
O
For each tuple r in R Do if r and s join to make a tuple t THEN output t;
O If we are careless about how the buffer the blocks of relations R and S, then this algorithm could require as many as T(R)T(S) disk .there are many situations where this algorithm can be modified to have much lower cost.
O
O
One case is when we can use an index on the join attribute or attributes of R to find the tuples of R that match a given tuple of S, without having to read the entire relation R.
The second improvement looks much more carefully at the way tuples of R and S are divided among blocks, and uses as much of the memory as it can to reduce the number of disk I/O's as we go through the inner loop.
We shall consider this block-based version of nested-loop join.
15.3.2
An Iterator for Tuple-Based Nested-Loop Join
O
}
O
O
Open() {
O R.Open();
S.open();
A:=S.getnext();
GetNext() {
Repeat { r:= R.Getnext();
IF(r= Not found) {/* R is exhausted for the current s*/
R.close(); s:=S.Getnext();
IF( s= Not found) RETURN Not Found;
/* both R & S are exhausted*/
R.Close(); r:= R.Getnext();
}
} until ( r and s join)
RETURN the join of r and s;
}
}
Close() {
R.close ();
S.close ();
15.3.3
We can Improve Nested loop Join by compute R
|><| S.
1.
2.
Organizing access to both argument relations by blocks.
Using as much main memory as we can to store tuples belonging to the relation S, the relation of the outer loop.
FOR each chunk of M-1 blocks of S DO BEGIN read these blocks into main-memory buffers; organize their tuples into a search structure whose search key is the common attributes of R and S;
FOR each block b of R DO BEGIN read b into main memory;
FOR each tuple t of b DO BEGIN find the tuples of S in main memory that join with t ; output the join of t with each of these tuples;
END ;
END ;
END ;
15.3.4
Analysis of Nested-Loop Join
Assuming S is the smaller relation, the number of chunks or iterations of outer loop is B(S)/(M - 1). At each iteration, we read hf -
1 blocks of S and
B(R) blocks of R. The number of disk I/O's is thus
B(S)/M-1(M-1+B(R)) or B(S)+B(S)B(R)/M-1
Assuming all of M, B(S), and B(R) are large, but M is the smallest of these, an approximation to the above formula is
B(S)B(R)/M. That is, cost is proportional to the product of the sizes of the two relations, divided by the amount of available main memory.
O
O
O
O
O
O
O
O
O
B(R) = 1000, B(S) = 500, M = 101
Important Aside: 101 buffer blocks is not as unrealistic as it sounds. There may be many queries at the same time, competing for main memory buffers.
Outer loop iterates 5 times
At each iteration we read M-1 (i.e. 100) blocks of S and all of R (i.e. 1000) blocks.
Total time: 5*(100 + 1000) = 5500 I/O’s
Question: What if we reversed the roles of R and S?
We would iterate 10 times, and in each we would read
100+500 blocks, for a total of 6000 I/O’s.
Compare with one-pass join, if it could be done!
We would need 1500 disk I/O’s if B(S) M-1
1.
2.
The cost of the nested-loop join is not much greater than the cost of a one-pass join, which is 1500 disk 110's for this example. In fact.if B(S) 5 lZI - 1, the nestedloop join becomes identical to the one-pass join algorithm of Section 15.2.3
Nested-loop join is generally not the most efficient join algorithm.
In This topic we have learned about how the nested tuple Loop join are used in database using query execution and what is the process for that.
?