91.4902 – Advanced Database System: Assignment 2
QUESTION 1
The implementation of the join operation (R ⋈A=B S) using sort merge is outlined by the
following algorithm.
set i ← 1; j ← 1;
while (i ≤ n) and (j ≤ m)
do {if R(i)[A] > S(j)[B] then set j ← j + 1
    else if R(i)[A] < S(j)[B] then set i ← i + 1
    else {/* R(i)[A] = S(j)[B], so we output a matched tuple */
          set k ← i;
          while (k ≤ n) and (R(k)[A] = S(j)[B])
          do {set l ← j;
              while (l ≤ m) and (R(i)[A] = S(l)[B])
              do {output; l ← l + 1;}
              set k ← k + 1;}
          set i ← k, j ← l;}}
The two tables below show the given relations R and S before any step of the
above sort-merge algorithm has been executed. Note that both relations are already
sorted on the join attributes A and B, so the sort phase of the sort-merge algorithm
(its first two lines) can be ignored.
R
A  C
3  a
3  b
5  c
6  e
7  f
8  g

S
B  D
1  j
2  j
3  h
3  k
8  f
8  g
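The pseudocode above can be sketched in Python and run on the two relations. This is only an illustrative transcription, not a definitive implementation: the function name `sort_merge_join`, the tuple representation of records and the 0-based indexes are assumptions made for the example, and the bare `output` step is assumed to emit the combined tuple <R(k), S(l)>.

```python
# Relations from the assignment, already sorted on the join attributes A and B.
R = [(3, "a"), (3, "b"), (5, "c"), (6, "e"), (7, "f"), (8, "g")]  # (A, C)
S = [(1, "j"), (2, "j"), (3, "h"), (3, "k"), (8, "f"), (8, "g")]  # (B, D)


def sort_merge_join(R, S):
    """Merge phase only; indexes are 0-based, unlike the 1-based pseudocode."""
    T = []
    i, j = 0, 0
    n, m = len(R), len(S)
    while i < n and j < m:
        if R[i][0] > S[j][0]:
            j += 1
        elif R[i][0] < S[j][0]:
            i += 1
        else:
            # R[i].A == S[j].B: pair every matching R record with every matching S record.
            k, l = i, j
            while k < n and R[k][0] == S[j][0]:
                l = j
                while l < m and R[i][0] == S[l][0]:
                    T.append(R[k] + S[l])  # assumed meaning of "output"
                    l += 1
                k += 1
            i, j = k, l
    return T


T = sort_merge_join(R, S)
for row in T:
    print(row)
```

Run on R and S, the sketch produces six combined tuples, four for the join value 3 and two for the join value 8.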
The merge phase starts by creating two access indexes, i and j. Both indexes are
initialized to 1 so that they refer to the first records of the relations R and S.
Since the value of attribute A in R(1) is greater than the value of attribute B in S(1),
the index j is incremented by 1 until the values of A and B are equal. When j is equal
to 3 and i is equal to 1, the value of B is equal to the value of A, and hence the
combination of the two tuples referenced by i and j becomes the first record of the
output relation T. These steps are illustrated below.
Index positions: i = 1 (R(1) = <3, a>), j = 3 (S(3) = <3, h>)

T
A  C  B  D
3  a  3  h
The algorithm then searches for other records of S that have a matching value of
attribute B. This is done by creating a new index l and setting its value to j + 1.
The while loop then iterates over S, examining the value of attribute B in each record.
When l is equal to 4, the value of B in the 4th record of S is equal to the value of A
in the record referenced by the index i (S(4)[B] = R(1)[A]). Hence, the combination of
the two records referenced by the current indexes is added as a new record of relation T.
Index positions: i = 1 (R(1) = <3, a>), l = 4 (S(4) = <3, k>)

T
A  C  B  D
3  a  3  h
3  a  3  k
After the new output tuple is created, the index l is incremented by 1. The next record
of S no longer satisfies the search condition, so the while loop terminates when l is
equal to 5.
In the next step of the algorithm, another inner while loop iterates over the records of
R. For this loop, another index, named k, is created and initialized to i + 1. When the
index k is equal to 2 and the index j is equal to 3, the value of A in the relation R is
equal to the value of B in the relation S, and hence a new output tuple is generated in T
by combining the two matched records. These steps are illustrated below.
Index positions: k = 2 (R(2) = <3, b>), j = 3 (S(3) = <3, h>)

T
A  C  B  D
3  a  3  h
3  a  3  k
3  b  3  h
Incrementing the index k by 1 terminates the second inner while loop, since the value of
A in the third record of R is greater than the value of B in the third record of S.
To complete the iteration of the main while loop, the last line of the algorithm sets
the index i and the index j to the values of k-1 and l-1 respectively. The index i is now
equal to 2, whereas the index j is equal to 4. Since the value of A in the second record
of R is equal to the value of B in the fourth record of S, a new record is created in T
by combining the two records currently pointed to. These steps are illustrated below.
Index positions: i = 2 (R(2) = <3, b>), j = 4 (S(4) = <3, k>)

T
A  C  B  D
3  a  3  h
3  a  3  k
3  b  3  h
3  b  3  k
Following the above steps, the index l is set to 5 (the value of j + 1). The index l now
points to the next record in S, <8, f>. The value of B in this fifth record of S fails
to satisfy the second condition of the first inner while loop, and hence the loop is
not executed.
Continuing with the algorithm, the index k is set to 3 and the condition of the while
loop is applied to the record of R pointed to, <5, c>. Since the value of A in the third
record of R does not match the value of B in the fourth record of S, the second inner
while loop is not executed either.
The last assignment line then sets the index i to 3 (the value of k) and the index j to
5 (the value of l). This line ends the current iteration, and the algorithm proceeds by
checking the values of i and j against the values of n and m respectively. Since i is
less than n and j is less than m, the next iteration of the main while loop is executed.
At the start of the iteration, the value of A in the current record of R is compared
with the value of B in the current record of S. When i is equal to 3 and j is equal to 5,
the value of A is less than the value of B, and therefore the index i is incremented by
1. The same happens when i is equal to 4 and j is equal to 5, so the index i is
incremented by 1 again. The record of R pointed to by i = 5 also has a value of A less
than the value of B in the current record of S, and the index i is therefore incremented
by 1 once more.
When the index i is equal to 6, it refers to the record whose value of A is equal to the
value of B in the current record of S. The current records of the relation R and the
relation S are combined to generate a new record of relation T, as illustrated below.
Index positions: i = 6 (R(6) = <8, g>), j = 5 (S(5) = <8, f>)

T
A  C  B  D
3  a  3  h
3  a  3  k
3  b  3  h
3  b  3  k
8  g  8  f
After the new output record is created, the index l is set to 6 (the value of j + 1).
Since l is equal to m and the values of the attributes A and B are equal for the records
of R and S currently pointed to (R(6)[A] = S(6)[B]), those two records are merged to
form a new record of relation T. These steps are illustrated below.
Index positions: i = 6 (R(6) = <8, g>), l = 6 (S(6) = <8, g>)

T
A  C  B  D
3  a  3  h
3  a  3  k
3  b  3  h
3  b  3  k
8  g  8  f
8  g  8  g
After the output shown above is generated, the index l is incremented by 1. Since l is
now equal to 7, the next iteration of this while loop is not executed because l exceeds
m. The algorithm then sets k to i + 1. As a result, k exceeds n, and the while loop that
follows is therefore not executed either. The next line to be executed is the assignment
that sets the indexes i and j to the values of k and l respectively. Since the value of
the index i now exceeds n, the main while loop terminates, and the sort-merge algorithm
completes with the following result.
A  C  B  D
3  a  3  h
3  a  3  k
3  b  3  h
3  b  3  k
8  g  8  f
8  g  8  g
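As a cross-check of the result above, the same join can be computed with a brute-force nested-loop join, which compares every record of R with every record of S. This is a minimal sketch using a different technique from the sort-merge algorithm traced above; the function name `nested_loop_join` and the tuple representation of the records are assumptions made for illustration.

```python
# Relations from the assignment: R(A, C) and S(B, D).
R = [(3, "a"), (3, "b"), (5, "c"), (6, "e"), (7, "f"), (8, "g")]
S = [(1, "j"), (2, "j"), (3, "h"), (3, "k"), (8, "f"), (8, "g")]


def nested_loop_join(R, S):
    """Combine every pair of records whose join attributes A and B are equal."""
    return [r + s for r in R for s in S if r[0] == s[0]]


expected = [
    (3, "a", 3, "h"),
    (3, "a", 3, "k"),
    (3, "b", 3, "h"),
    (3, "b", 3, "k"),
    (8, "g", 8, "f"),
    (8, "g", 8, "g"),
]

check = nested_loop_join(R, S)
print(check == expected)
```

Because R and S are listed in sorted order, the nested-loop join emits its tuples in the same order as the table above.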
QUESTION 2
π S.Name, S.StudentNumber
  ⋈ G.StuNumber = S.StudentNumber
    ⋈ E.SectionIdentifier = G.SectionIdentifier
      ⋈ C.CourseNumber = E.CourseNumber
        ⋈ C.CourseNumber = P.CourseNumber
          σ C.Department = “BC”
            COURSE
          PREREQUISITE
        SECTION
      GRADE_REPORT
    STUDENT

π Name, StudentNumber ((σ Dept = ‘BC’ (COURSE) ⋈ CourseNum = CourseNum PREREQUISITE
⋈ CourseNum = CourseNum SECTION ⋈ SecIdentifier = SecIdentifier GRADE_REPORT)
⋈ StuNumber = StuNumber STUDENT)
Based on the given relational algebra expression, the graph above is the initial query
tree that corresponds to the expression. Applying the heuristic algebraic optimization
algorithm to the initial query tree yields the optimized query tree illustrated below.
Although the complete algorithm involves six transformation steps, only one of them is
applicable to the relational algebra expression under discussion. Hence, the fifth step
has been applied to the initial query tree, and several PROJECT operations have been
created as a result. The purpose of the PROJECT operations here is to limit the
attributes produced by each subtree to those required in the query result and in the
subsequent operations of the query tree.
⋈ G.StudentNumber = S.StudentNumber
  π G.StudentNumber
    ⋈ E.SectionIdentifier = G.SectionIdentifier
      π E.SectionIdentifier
        ⋈ C.CourseNumber = E.CourseNumber
          ⋈ C.CourseNumber = P.CourseNumber
            π C.CourseNumber
              σ C.Department = “BC”
                COURSE
            π P.CourseNumber
              PREREQUISITE
          π E.CourseNumber, E.SectionIdentifier
            SECTION
      π G.SectionIdentifier, G.StudentNumber
        GRADE_REPORT
  π S.Name, S.StudentNumber
    STUDENT
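The attribute lists of the PROJECT operations placed directly above each base relation can be derived mechanically: a relation must keep every attribute that appears in the final result or in a join condition. The following is a rough sketch of that idea only, not the full heuristic optimization algorithm; the helper name `required_attributes` and the (alias, attribute) encoding are assumptions made for illustration, and the attribute names follow the optimized tree.

```python
def required_attributes(projection, joins):
    """Map each relation alias to the attributes it must keep above its leaf."""
    needed = {}
    for alias, attr in projection:
        needed.setdefault(alias, set()).add(attr)
    for (left_alias, left_attr), (right_alias, right_attr) in joins:
        needed.setdefault(left_alias, set()).add(left_attr)
        needed.setdefault(right_alias, set()).add(right_attr)
    return needed


# Final projection list and join conditions of the query.
# C.Department is used only by the SELECT below the projection, so it is not listed.
projection = [("S", "Name"), ("S", "StudentNumber")]
joins = [
    (("C", "CourseNumber"), ("P", "CourseNumber")),
    (("C", "CourseNumber"), ("E", "CourseNumber")),
    (("E", "SectionIdentifier"), ("G", "SectionIdentifier")),
    (("G", "StudentNumber"), ("S", "StudentNumber")),
]

needed = required_attributes(projection, joins)
for alias in sorted(needed):
    print(alias, sorted(needed[alias]))
```

For this query the sketch keeps CourseNumber for COURSE and PREREQUISITE, CourseNumber and SectionIdentifier for SECTION, SectionIdentifier and StudentNumber for GRADE_REPORT, and Name and StudentNumber for STUDENT.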
QUESTION 3
Given the initial values of X and Y, the following table shows how the values of X and Y
change at each point in time. Note that the values of N and M used for the modifications
in the two transactions are 12 and 8 respectively.
Time  Transaction 1  Transaction 2  T1                      T2
1     READ(X)                       T1.X = 90
2     X := X - N                    T1.X = 90 - 12 = 78
3                    READ(X)                                T2.X = 90
4                    X := X + M                             T2.X = 90 + 8 = 98
5     WRITE(X)                      X = T1.X = 78
6     READ(Y)                       T1.Y = 100
7                    WRITE(X)                               X = T2.X = 98 (overwrites the previously updated value of X)
8     Y := Y + N                    T1.Y = 100 + 12 = 112
9     WRITE(Y)                      Y = T1.Y = 112
In the above computation of T1 and T2, the final value of the data item X is 98 (at
time = 7, X = T2.X = 98), while the final value of the other data item, Y, is 112 (at
time = 9, Y = T1.Y = 112). These values result from the last write operation on X,
performed by T2, and the only write operation on Y, performed by T1. However, the
interleaved execution yields an incorrect value of X. The final value of X is incorrect
because the interleaved write operation on X overwrote the value of X previously written
by T1. That is, X was first updated to 78 and written back to the database at time = 5,
and later, at time = 7, the write operation of T2 overwrote that value with one computed
from the old value of X. This problem is known as the lost update problem.
The following computation shows the correct values of X and Y, and therefore gives
further evidence that the interleaved transactions above produced an incorrect value of
X.
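The interleaved schedule can be replayed with a short simulation in which each transaction keeps private copies of the items it has read. This is a minimal sketch under stated assumptions: the step encoding, the operation names (`sub_N`, `add_N`, `add_M`) and the function `run` are invented for illustration, while the initial values X = 90, Y = 100, N = 12 and M = 8 come from the table.

```python
def run(steps, db, N=12, M=8):
    """Replay a schedule; each transaction works on a private copy of what it read."""
    db = dict(db)
    local = {"T1": {}, "T2": {}}
    for txn, op, item in steps:
        workspace = local[txn]
        if op == "read":
            workspace[item] = db[item]
        elif op == "write":
            db[item] = workspace[item]
        elif op == "sub_N":
            workspace[item] -= N
        elif op == "add_N":
            workspace[item] += N
        elif op == "add_M":
            workspace[item] += M
    return db


# Times 1 to 9 of the interleaved schedule.
interleaved = [
    ("T1", "read", "X"),
    ("T1", "sub_N", "X"),
    ("T2", "read", "X"),
    ("T2", "add_M", "X"),
    ("T1", "write", "X"),
    ("T1", "read", "Y"),
    ("T2", "write", "X"),
    ("T1", "add_N", "Y"),
    ("T1", "write", "Y"),
]

result = run(interleaved, {"X": 90, "Y": 100})
print(result)
```

The replay ends with X = 98 and Y = 112: the subtraction of N performed by T1 is lost because T2 wrote a value computed from the old X.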
T1 → T2
T1:  T1.X = 90
     T1.X = 90 - 12 = 78
     X = T1.X = 78
     T1.Y = 100
     T1.Y = 100 + 12 = 112
     Y = T1.Y = 112
T2:  T2.X = 78
     T2.X = 78 + 8 = 86
     X = T2.X = 86

T2 → T1
T2:  T2.X = 90
     T2.X = 90 + 8 = 98
     X = T2.X = 98
T1:  T1.X = 98
     T1.X = 98 - 12 = 86
     X = T1.X = 86
     T1.Y = 100
     T1.Y = 100 + 12 = 112
     Y = T1.Y = 112
The above computations execute the transactions one after another. The first
computation executes T1 and then T2, while the second executes T1 after T2. Either way,
both computations yield the same values of X and Y, which are 86 and 112 respectively.
The value of X differs from the value of 98 produced by the interleaved execution, and
hence the interleaved transactions are shown to yield an incorrect update.
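The two serial executions can be checked with a few lines of code. This is a small sketch in which `t1` and `t2` are assumed names for functions that run each transaction from start to finish on a dictionary standing in for the database.

```python
def t1(db, N=12):
    """Transaction 1: X := X - N, then Y := Y + N."""
    x = db["X"]
    db["X"] = x - N
    y = db["Y"]
    db["Y"] = y + N


def t2(db, M=8):
    """Transaction 2: X := X + M."""
    x = db["X"]
    db["X"] = x + M


def run_serial(order):
    """Run the transactions one after another, starting from X = 90 and Y = 100."""
    db = {"X": 90, "Y": 100}
    for transaction in order:
        transaction(db)
    return db


t1_then_t2 = run_serial([t1, t2])
t2_then_t1 = run_serial([t2, t1])
print(t1_then_t2, t2_then_t1)
```

Both orders give X = 86 and Y = 112, so any correct execution of the two transactions must leave X at 86.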
QUESTION 4
S1: R1(X), R2(X), R1(Y), R2(Z), W1(X), R1(Z), W2(Z), C2, W1(Y), C1.
In the above representation, the schedule S1 consists of two transactions. The
operations belonging to the first transaction (index 1) are highlighted in yellow,
while the operations belonging to the second transaction (index 2) are highlighted in
light blue.
By definition, a strict schedule is a schedule in which transactions can neither read
nor write an item X until the last transaction that wrote X has committed (or aborted).
The above schedule contains three write operations, on the data items X, Z and Y. The
first write operation, W1(X), is performed by the first transaction and is not followed
by any read or write operation on X by the other transaction. Similarly, the second and
third write operations, W2(Z) and W1(Y), are not followed by any read or write operation
on Z or Y performed by a different transaction. Hence, no violation of strictness occurs
before both transactions commit, and the above schedule is a strict schedule. Since S1
is a strict schedule, the schedule S1 is also cascadeless and recoverable.

S2: R1(X), W1(X), R2(Z), R2(Y), R2(X), W2(Y), C2, A1
Using the same representation, the first transaction is identified by the operations
highlighted in yellow, whereas the second transaction is identified by the operations
highlighted in blue. In this schedule, the first transaction writes the data item X in
the second operation, W1(X). Following this write operation, the second transaction
reads the data item X written by the first transaction, R2(X). Since the second
transaction reads the value of X written by the first transaction and commits before
the first transaction ends (the first transaction later aborts), this schedule is
nonrecoverable. Because S2 is nonrecoverable, it is neither cascadeless nor strict.

S3: R1(X), W1(X), R1(Y), R2(Y), W2(Y), W1(Y), C1, R2(X), W2(X), C2
In the above schedule S3, the definition of a strict schedule is violated at the sixth
operation, W1(Y). The preceding operation, W2(Y), is a write operation on Y performed by
the second transaction. Since the second transaction has not yet committed when the
first transaction writes the same data item, the above schedule is not strict. Checked
against the definition of a cascadeless schedule, however, the schedule is cascadeless,
because every read operation in the schedule reads only data items written by committed
transactions. Being cascadeless, S3 is also recoverable.
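The three classifications above can be cross-checked with a small script. This is a simplified sketch, not a complete recovery checker: the function name `classify` and the string encoding of the schedules are assumptions made for illustration, and the effect of an abort on the value of an item (restoring the earlier writer) is deliberately ignored, which is sufficient for S1, S2 and S3.

```python
import re


def classify(schedule):
    """Return the strict / cascadeless / recoverable flags of a schedule string."""
    ops = re.findall(r"([RWCA])(\d+)(?:\((\w+)\))?", schedule)
    last_writer = {}   # item -> transaction that wrote it most recently
    finished = set()   # transactions that have committed or aborted
    committed = set()
    dirty_reads = {}   # reader -> writers whose uncommitted data it read
    strict = cascadeless = recoverable = True
    for kind, txn, item in ops:
        if kind in ("R", "W"):
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in finished:
                strict = False          # touched an item with an unfinished writer
                if kind == "R":
                    cascadeless = False  # read uncommitted data
                    dirty_reads.setdefault(txn, set()).add(writer)
            if kind == "W":
                last_writer[item] = txn
        elif kind == "C":
            # Committing before a transaction we read from breaks recoverability.
            if any(w not in committed for w in dirty_reads.get(txn, ())):
                recoverable = False
            committed.add(txn)
            finished.add(txn)
        else:  # "A": abort
            finished.add(txn)
    return {"strict": strict, "cascadeless": cascadeless, "recoverable": recoverable}


s1 = classify("R1(X), R2(X), R1(Y), R2(Z), W1(X), R1(Z), W2(Z), C2, W1(Y), C1")
s2 = classify("R1(X), W1(X), R2(Z), R2(Y), R2(X), W2(Y), C2, A1")
s3 = classify("R1(X), W1(X), R1(Y), R2(Y), W2(Y), W1(Y), C1, R2(X), W2(X), C2")
print(s1, s2, s3, sep="\n")
```

Under these assumptions the script reports S1 as strict, S2 as nonrecoverable, and S3 as cascadeless but not strict, in agreement with the analysis above.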