CMPE 272 HW #1 solutions

CMPE 272
HW #1
KEY SHEET
16.34. Consider a disk with the following characteristics (these are not parameters of any
particular disk unit): block size B = 512 bytes; interblock gap size G = 128 bytes; number
of blocks per track = 20; number of tracks per surface = 400. A disk pack consists of 15
double-sided disks.
a. What is the total capacity of a track, and what is its useful capacity (excluding
interblock gaps)?
Given Block size B=512 bytes
Interblock gap size G=128 bytes
Number of blocks per track = 20
Number of tracks per surface = 400.
Total capacity of a track (TC) = (Block size + Interblock gap size) * number of blocks per track
= (512 + 128) * 20 = 12800 bytes = 12.8 KB
Useful capacity of a track (UC) = Block size * number of blocks per track
= 512 * 20 = 10240 bytes = 10.24 KB
b. How many cylinders are there?
Total number of cylinders = number of tracks per surface= 400
c. What are the total capacity and the useful capacity of a cylinder?
Total cylinder capacity = number of double-sided disks * 2 (since double-sided) * total track capacity
= 15 * 2 * 12800 = 384000 bytes = 384 KB
Useful capacity of a cylinder = number of double-sided disks * 2 (since double-sided) * useful track capacity
= 15 * 2 * 10240 = 307200 bytes = 307.2 KB
d. What are the total capacity and the useful capacity of a disk pack?
Total capacity of a disk pack = number of tracks per surface (= number of cylinders) * total cylinder capacity
= 400 * 384000 = 153600000 bytes = 153.6 MB
Useful capacity of a disk pack = number of tracks per surface (= number of cylinders) * useful cylinder capacity
= 400 * 307200 = 122880000 bytes = 122.88 MB
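The capacity arithmetic in parts (a) through (d) can be checked with a short script. This is a minimal sketch that uses only the parameters stated in the exercise; the variable names are illustrative and not taken from the textbook.

```python
# Disk parameters from Exercise 16.34.
B = 512                    # block size, bytes
G = 128                    # interblock gap size, bytes
BLOCKS_PER_TRACK = 20
TRACKS_PER_SURFACE = 400
SURFACES = 15 * 2          # 15 double-sided disks

track_total = (B + G) * BLOCKS_PER_TRACK     # includes interblock gaps
track_useful = B * BLOCKS_PER_TRACK          # data bytes only
cylinders = TRACKS_PER_SURFACE               # one cylinder per track position
cyl_total = SURFACES * track_total
cyl_useful = SURFACES * track_useful
pack_total = cylinders * cyl_total
pack_useful = cylinders * cyl_useful

print(track_total, track_useful)   # 12800 10240
print(cyl_total, cyl_useful)       # 384000 307200
print(pack_total, pack_useful)     # 153600000 122880000
```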
e. Suppose that the disk drive rotates the disk pack at a speed of 2,400 rpm (revolutions
per minute); what are the transfer rate (tr) in bytes/msec and the block transfer time (btt)
in msec? What is the average rotational delay (rd) in msec? What is the bulk transfer
rate? (See Appendix B.)
Speed = 2400 rpm
Time for one disk revolution in msec = 60 * 1000/Disk drive rpm = 60 * 1000/2400 = 25 ms
Transfer rate (TR) = total track capacity / time for one disk revolution
= 12800/25
= 512 bytes/msec
Block transfer time (BTT) = B/TR = 512/512 = 1 ms
Average rotational delay (rd) = time for one disk revolution / 2 = 25/2 = 12.5 ms
Bulk transfer rate (BTR) = TR * B/(B+G) = 512 * 512/(512+128) = 409.6 bytes/msec
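The timing quantities in part (e) follow from the rotation speed and the track layout. The sketch below recomputes them under the same parameters; the lowercase variable names are chosen here for illustration.

```python
B, G, BLOCKS_PER_TRACK = 512, 128, 20   # bytes, bytes, blocks
RPM = 2400

rev_ms = 60 * 1000 / RPM                     # time for one revolution, msec
tr = (B + G) * BLOCKS_PER_TRACK / rev_ms     # transfer rate, bytes/msec
btt = B / tr                                 # block transfer time, msec
rd = rev_ms / 2                              # average rotational delay, msec
btr = tr * B / (B + G)                       # bulk transfer rate, bytes/msec

print(rev_ms, tr, btt, rd, round(btr, 1))
```

The printed values should match the hand calculation: 25 msec per revolution, 512 bytes/msec, 1 msec, 12.5 msec, and 409.6 bytes/msec.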
f. Suppose that the average seek time is 30 msec. How much time does it take (on the
average) in msec to locate and transfer a single block, given its block address?
Seek time (S) = 30 ms
Time to locate and transfer a single block = S + rd + BTT
= 30 + 12.5 + 1 = 43.5 ms
g. Calculate the average time it would take to transfer 20 random blocks, and compare
this with the time it would take to transfer 20 consecutive blocks using double buffering
to save seek time and rotational delay.
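Time to transfer 20 random blocks = 20 * (S + rd + BTT) = 20 * 43.5 = 870 ms
Time to transfer 20 consecutive blocks = S + rd + (20 * BTT) = 30 + 12.5 + (20 * 1) = 62.5 ms
With double buffering, the consecutive transfer pays the seek time and rotational delay only once, so it is about 14 times faster (870/62.5 = 13.92).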
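The comparison in part (g) comes down to whether the seek and rotational delay are paid once or once per block. A minimal sketch, with hypothetical function names and the values of S, rd, and BTT computed above:

```python
S, RD, BTT = 30.0, 12.5, 1.0   # seek time, rotational delay, block transfer time (msec)

def random_access_ms(n_blocks):
    """Each random block pays its own seek and rotational delay."""
    return n_blocks * (S + RD + BTT)

def consecutive_access_ms(n_blocks):
    """Consecutive blocks with double buffering pay seek and delay only once."""
    return S + RD + n_blocks * BTT

print(random_access_ms(20))        # 870.0
print(consecutive_access_ms(20))   # 62.5
```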
16.35. A file has r = 20,000 STUDENT records of fixed length. Each record has the
following fields: Name (30 bytes), Ssn (9 bytes), Address (40 bytes), PHONE (10 bytes),
Birth_date (8 bytes), Sex (1 byte), Major_dept_code (4 bytes), Minor_dept_code (4 bytes),
Class_code (4 bytes, integer), and Degree_program (3 bytes). An additional byte is used
as a deletion marker. The file is stored on the disk whose parameters are given in
Exercise 16.27.
a. Calculate the record size R in bytes.
Assuming block size =512 bytes
Record size R = Sum of all field size
=30+9+40+10+8+1+4+4+4+3+1 =114B
b. Calculate the blocking factor bfr and the number of file blocks b, assuming an
unspanned organization
BFR = floor(B/R) = floor(512/114) = floor(4.49) = 4 records per block
Number of blocks (b) = ceil(r/BFR) = 20000/4 = 5000 blocks
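The record size, blocking factor, and block count can be reproduced directly from the field list. This is a sketch for the 512-byte block case; the dictionary layout is just one convenient way to hold the field sizes given in the exercise.

```python
import math

B = 512        # block size, bytes
r = 20000      # number of STUDENT records

FIELD_SIZES = {
    "Name": 30, "Ssn": 9, "Address": 40, "Phone": 10, "Birth_date": 8,
    "Sex": 1, "Major_dept_code": 4, "Minor_dept_code": 4,
    "Class_code": 4, "Degree_program": 3,
}
DELETION_MARKER = 1

R = sum(FIELD_SIZES.values()) + DELETION_MARKER   # record size, bytes
bfr = B // R                                      # unspanned: whole records per block
b = math.ceil(r / bfr)                            # number of file blocks

print(R, bfr, b)   # 114 4 5000
```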
c. Calculate the average time it takes to find a record by doing a linear search on the file
if (i) the file blocks are stored contiguously, and double buffering is used; (ii) the file
blocks are not stored contiguously.
On average, a linear search reads half of the file blocks: b/2 = 5000/2 = 2500 blocks.
BTR = TR * (B/(B+G)) = 512 * (512/(512+128)) = 409.6 bytes/msec
(i) If the blocks are stored contiguously and double buffering is used, the time to read
2500 consecutive blocks is
S + rd + 2500 * (B/BTR)
= 30 + 12.5 + 2500 * (512/409.6)
= 3167.5 ms = 3.1675 sec
(ii) If the blocks are scattered, each block needs its own seek and rotational delay, so the time is
(S + rd + BTT) * 2500 = (30 + 12.5 + 1) * 2500 = 108750 ms = 108.75 sec
d. Assume that the file is ordered by Ssn; by doing a binary search, calculate the time it
takes to search for a record given its Ssn value.
A binary search on the file ordered by Ssn reads about ceil(log2 b) blocks, each needing its own seek and rotational delay:
ceil(log2 b) * (S + rd + BTT) = ceil(log2 5000) * (30 + 12.5 + 1) = 13 * 43.5 = 565.5 ms = 0.5655 sec
Assuming block size =2400 bytes
a)Record size R = Sum of all field size
=30+9+40+10+8+1+4+4+4+3+1 =114B
b) BFR = floor(B/R) = floor(2400/114) = floor(21.05) = 21 records per block
Number of blocks (b) = ceil(r/BFR) = ceil(20000/21) = ceil(952.38) = 953 blocks
c) On average, a linear search reads half of the file blocks: b/2 = 953/2 = 476.5, taken as 476 blocks here.
BTR = TR * (B/(B+G)) = 2400 * (2400/(2400+600)) = 1920 bytes/msec (this working uses G = 600 bytes and TR = 2400 bytes/msec)
(i) If the blocks are stored contiguously and double buffering is used, the time to read
476 consecutive blocks is
S + rd + 476 * (B/BTR)
= 20 + 10 + 476 * (2400/1920)
= 625 ms = 0.625 sec (with S = 20 ms and rd = 10 ms)
(ii) If the blocks are scattered, each block needs its own seek and rotational delay, so the time is
(S + rd + BTT) * 476 = (20 + 10 + 1) * 476 = 14756 ms = 14.756 sec
d) Time to find a record by binary search when the file is ordered by Ssn
= ceil(log2 b) * (S + rd + BTT) = ceil(log2 953) * (20 + 10 + 1) = 10 * 31 = 310 ms
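Both workings of parts (c) and (d) use the same three formulas with different parameters, so one helper can cover them. This is a sketch only: the function name is hypothetical, the average linear search is taken as b // 2 blocks, and the second parameter set (G = 600, TR = 2400, S = 20, rd = 10) is read off the figures used in the 2400-byte working, not from the exercise text.

```python
import math

def search_times(b, B, G, tr, S, rd):
    """Return (contiguous linear, scattered linear, binary) search times in msec."""
    btt = B / tr                  # block transfer time
    btr = tr * B / (B + G)        # bulk transfer rate
    half = b // 2                 # blocks read on average by a linear search
    linear_contiguous = S + rd + half * (B / btr)
    linear_scattered = half * (S + rd + btt)
    binary = math.ceil(math.log2(b)) * (S + rd + btt)
    return linear_contiguous, linear_scattered, binary

# 512-byte blocks, disk of Exercise 16.34
print(search_times(b=5000, B=512, G=128, tr=512, S=30, rd=12.5))
# 2400-byte blocks, parameters as used in the second working (assumed)
print(search_times(b=953, B=2400, G=600, tr=2400, S=20, rd=10))
```

The first call should give roughly 3167.5, 108750, and 565.5 msec; the second roughly 625, 14756, and 310 msec.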
16.36. Suppose that only 80% of the STUDENT records from Exercise 16.28 have a value
for Phone, 85% for Major_dept_code, 15% for Minor_dept_code, and 90% for
Degree_program; and suppose that we use a variable-length record file. Each record has
a 1-byte field type for each field in the record, plus the 1-byte deletion marker and a
1-byte end-of-record marker. Suppose that we use a spanned record organization, where
each block has a 5-byte pointer to the next block (this space is not used for record
storage).
a. Calculate the average record length R in bytes.
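Fixed part (Name, Ssn, Address, Birth_date, Sex, and Class_code, each with a 1-byte field type, plus the deletion marker and the end-of-record marker):
(30+1)+(9+1)+(40+1)+(8+1)+(1+1)+(4+1)+1+1 = 100 bytes
Variable part (Phone, Major_dept_code, Minor_dept_code, and Degree_program, weighted by how often each is present):
((10+1)*0.8)+((4+1)*0.85)+((4+1)*0.15)+((3+1)*0.9)
= 8.8+4.25+0.75+3.6 = 17.4 bytes
The average record size R = R(fixed) + R(variable) = 100 + 17.4 = 117.4 bytes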
b. Calculate the number of blocks needed for the file.
Using a spanned record organization, where each block has a 5-byte pointer to the next
block, the bytes available in each block (B - 5) = (512-5) = 507 bytes.
The number of blocks needed for the file: b = ceiling((r*R)/(B-5))
= ceiling((20000*117.4)/507) = ceiling(2348000/507) = ceiling(4631.16) = 4632 blocks
Assuming block size =2400 bytes
a) Fixed size = (30+1)+(9+1)+(40+1)+(8+1)+(1+1)+(4+1)+1+1 = 100 bytes
Variable Size = ((10+1)*0.8)+((4+1)*0.85)+((4+1)*0.15)+((3+1)*0.9)
= 8.8+4.25+0.75+3.6=17.4 bytes
The average record size R = R(fixed) + R(variable) = 100 + 17.4 = 117.4 bytes
b) Using a spanned record organization, where each block has a 5-byte pointer to
the next block, the bytes available in each block (B - 5) = (2400-5) = 2395 bytes.
The number of blocks needed for the file: b = ceiling((r*R)/(B-5))
= ceiling((20000*117.4)/2395) = ceiling(2348000/2395) = ceiling(980.38) = 981 blocks
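The average record length and the spanned block count can be recomputed for either block size. A minimal sketch, assuming the field sizes and presence percentages stated in the exercise; the names `ALWAYS_PRESENT`, `OPTIONAL`, and `spanned_blocks` are illustrative.

```python
import math

r = 20000               # number of records
FIELD_TYPE = 1          # 1-byte field type stored with every field present
MARKERS = 2             # deletion marker + end-of-record marker
POINTER = 5             # per-block pointer to the next block

ALWAYS_PRESENT = [30, 9, 40, 8, 1, 4]   # Name, Ssn, Address, Birth_date, Sex, Class_code
OPTIONAL = {                            # field: (size in bytes, fraction of records with a value)
    "Phone": (10, 0.80),
    "Major_dept_code": (4, 0.85),
    "Minor_dept_code": (4, 0.15),
    "Degree_program": (3, 0.90),
}

fixed = sum(size + FIELD_TYPE for size in ALWAYS_PRESENT) + MARKERS
variable = sum((size + FIELD_TYPE) * p for size, p in OPTIONAL.values())
R = fixed + variable    # average record length, bytes

def spanned_blocks(B):
    """Blocks needed when records may span blocks and each block loses POINTER bytes."""
    return math.ceil(r * R / (B - POINTER))

print(fixed, round(variable, 2), round(R, 2))
print(spanned_blocks(512), spanned_blocks(2400))
```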
18.13. Consider SQL queries Q1, Q8, Q1B, and Q4 in Chapter 6 and Q27 in Chapter 7. a.
Draw at least two query trees that can represent each of these queries. Under what
circumstances would you use each of your query trees? b. Draw the initial query tree for
each of these queries, and then show how the query tree is optimized by the algorithm
outlined in Section 18.7. c. For each query, compare your own query trees of part (a) and
the initial and final query trees of part (b).
Q8: select E.fname, E.lname, S.fname, S.lname
from Employee E, Employee S
where E.Superssn = S.ssn
(In the trees below, the children of an operation are indented under it.)
Q8's tree 1:
Project E.fname, E.lname, S.fname, S.lname
  Join E.Superssn = S.ssn
    Employee E
    Employee S
Q8's tree 2:
Project E.fname, E.lname, S.fname, S.lname
  Select E.Superssn = S.ssn
    Cartesian Product
      Employee E
      Employee S
The initial query tree is the same as tree 2. Replacing the selection and the Cartesian product with a join gives tree 1, so tree 1 is the result after optimization.
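The equivalence of the two Q8 trees (a selection over a Cartesian product versus a join) can be illustrated on a tiny in-memory relation. This is a sketch only: the three rows are made-up sample data, and the hash lookup in `tree1` is just one possible way to evaluate the join, not the method prescribed by the textbook.

```python
from itertools import product

# Hypothetical sample rows for the Employee relation (not from the textbook).
employee = [
    {"fname": "Ann", "lname": "Lee", "ssn": "1", "superssn": None},
    {"fname": "Bob", "lname": "Ray", "ssn": "2", "superssn": "1"},
    {"fname": "Cy",  "lname": "Poe", "ssn": "3", "superssn": "1"},
]

def tree2(emp):
    """Project(Select(Cartesian Product)): build every (E, S) pair, then filter."""
    return [
        (e["fname"], e["lname"], s["fname"], s["lname"])
        for e, s in product(emp, emp)
        if e["superssn"] == s["ssn"]
    ]

def tree1(emp):
    """Project(Join): look up each employee's supervisor directly by ssn."""
    by_ssn = {s["ssn"]: s for s in emp}
    return [
        (e["fname"], e["lname"], by_ssn[e["superssn"]]["fname"], by_ssn[e["superssn"]]["lname"])
        for e in emp
        if e["superssn"] in by_ssn
    ]

print(sorted(tree1(employee)) == sorted(tree2(employee)))   # True
```

Both functions return the same rows, but `tree2` examines every pair of tuples while `tree1` touches each employee once, which is why the optimizer prefers the join form.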
Q27: select fname, lname, 1.1*salary
from Employee, Works_on, Project
where ssn = Essn and Pno = Pnumber and Pname = 'ProductX'
(In the trees below, the children of an operation are indented under it; "Project (relation)" is the PROJECT table, as opposed to the Project operation.)
Q27's tree 1:
Project fname, lname, 1.1*salary
  Join Pno = Pnumber
    Join Ssn = Essn
      Employee
      Works_on
    Select Pname = 'ProductX'
      Project (relation)
Q27's tree 2:
Project fname, lname, 1.1*salary
  Select Pno = Pnumber and Ssn = Essn and Pname = 'ProductX'
    Cartesian Product
      Cartesian Product
        Employee
        Works_on
      Project (relation)
The initial query tree of Q27 is tree 2. The tree produced by heuristic optimization is not the same as tree 1, however; the query can be optimized further as follows:
Project fname, lname, 1.1*salary
  Join Ssn = Essn
    Employee
    Join Pno = Pnumber
      Select Pname = 'ProductX'
        Project (relation)
      Works_on
18.14. A file of 4,096 blocks is to be sorted with an available buffer space of 64 blocks.
How many passes will be needed in the merge phase of the external sort-merge
algorithm?
Let nR = number of initial runs, b = number of file blocks, nB = available buffer space (in blocks), dM = degree of merging.
b = 4096
nB = 64
Sorting phase: nR = ceil(b/nB) = 4096/64 = 64
dM = min(nB - 1, nR) = min(63, 64) = 63
Number of merge passes = ceil(log_dM(nR)) = ceil(log_63(64)) = ceil(1.004) = 2
Number of passes = 2
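The pass count can also be obtained by simulating the merge phase, which avoids evaluating the logarithm by hand. A minimal sketch; the function name and the error check for very small buffers are additions made here for illustration.

```python
import math

def merge_passes(b, nB):
    """Return (initial runs, degree of merging, merge passes) for external sort-merge."""
    if nB < 3:
        raise ValueError("need at least 3 buffer blocks to merge two runs")
    nR = math.ceil(b / nB)        # runs produced by the sorting phase
    dM = min(nB - 1, nR)          # one buffer block is reserved for output
    runs, passes = nR, 0
    while runs > 1:               # each pass merges groups of dM runs
        runs = math.ceil(runs / dM)
        passes += 1
    return nR, dM, passes

print(merge_passes(4096, 64))   # (64, 63, 2)
```

With 64 runs and a 63-way merge, the first pass leaves 2 runs and the second pass merges them, giving the 2 passes found above.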
20.14. Change transaction T2 in Figure 20.2(b) to read read_item(X); X := X + M; if X > 90
then exit else write_item(X);
Discuss the final result of the different schedules in Figures 20.3(a) and (b), where M = 2
and N = 2, with respect to the following questions: Does adding the above condition
change the final outcome? Does the outcome obey the implied consistency rule (that the
capacity of X is 90)?
read_item(X);
X := X + M;
if X > 90 then exit
else write_item(X);
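If M = 2 and the initial value of X is 88, then X = X + M = 88 + 2 = 90, which is not greater than 90, so write_item(X) is executed.
The if condition evaluates to true, and T2 exits without writing, only when the initial value of X is greater than 88. In other words, write_item(X) is skipped only when the initial X > 88, so T2 never writes a value of X above the capacity of 90.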
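The boundary behaviour of the modified T2 can be checked by running the guarded update on a few starting values. This sketch models only T2 in isolation, not the interleavings of Figure 20.3; the function name and its `capacity` parameter are hypothetical.

```python
def t2_modified(x, m=2, capacity=90):
    """Guarded T2: return (value of X in the database afterwards, whether a write happened)."""
    x_local = x + m               # X := X + M on the local copy
    if x_local > capacity:
        return x, False           # exit: the stored value of X is left unchanged
    return x_local, True          # write_item(X)

for x0 in (87, 88, 89):
    print(x0, t2_modified(x0))
# 87 (89, True)
# 88 (90, True)
# 89 (89, False)
```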
20.22 Which of the following schedules is (conflict) serializable ? For each serializable schedule,
determine the equivalent serial schedules.
a. r1(X); r3(X); w1(X); r2(X); w3(X)
b. r1(X); r3(X); w3(X); w1(X); r2(X)
c. r3(X); r2(X); w3(X); r1(X); w1(X)
d. r3(X); r2(X); r1(X); w3(X); w1(X)
a) Given schedule: r1(X); r3(X); w1(X); r2(X); w3(X)
Precedence (conflict) graph edges: T3 -> T1, T1 -> T2, T1 -> T3, T2 -> T3.
There is a cycle T1 -> T2 -> T3 -> T1 in this graph, so the schedule is not serializable.
b) Given schedule: r1(X); r3(X); w3(X); w1(X); r2(X)
Precedence graph edges: T1 -> T3, T3 -> T1, T3 -> T2, T1 -> T2.
There is a cycle T1 -> T3 -> T1 in this graph, so the schedule is not serializable.
c) Given schedule: r3(X); r2(X); w3(X); r1(X); w1(X)
Precedence graph edges: T2 -> T3, T3 -> T1, T2 -> T1.
This graph does not contain any cycle, so the schedule is serializable. The equivalent serial schedule is T2 -> T3 -> T1, that is, r2(X); r3(X); w3(X); r1(X); w1(X).
d) Given schedule: r3(X); r2(X); r1(X); w3(X); w1(X)
Precedence graph edges: T3 -> T1, T1 -> T3, T2 -> T3, T2 -> T1.
There is a cycle T1 -> T3 -> T1 in this graph, so the schedule is not serializable.
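The test applied to each schedule (build the precedence graph from conflicting operations, then look for a cycle) can be sketched in a few lines. The helper names are hypothetical, and schedule (b) is written with w3(X) as its third operation, which is the reading that the cycle T1 -> T3 -> T1 in the answer implies.

```python
import re
from itertools import combinations

def parse(schedule):
    """Turn 'r1(X); w3(X)' into [('r', 1, 'X'), ('w', 3, 'X')]."""
    pattern = r"([rw])(\d+)\s*\(\s*(\w+)\s*\)"
    return [(op, int(t), item) for op, t, item in re.findall(pattern, schedule)]

def precedence_edges(schedule):
    """Edge (Ti, Tj) when an operation of Ti conflicts with a later operation of Tj."""
    edges = set()
    for (op1, t1, x1), (op2, t2, x2) in combinations(parse(schedule), 2):
        if t1 != t2 and x1 == x2 and "w" in (op1, op2):
            edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}                                   # node -> "visiting" or "done"
    def visit(n):
        state[n] = "visiting"
        for m in graph[n]:
            if state.get(m) == "visiting" or (m not in state and visit(m)):
                return True
        state[n] = "done"
        return False
    return any(n not in state and visit(n) for n in graph)

schedules = {
    "a": "r1(X); r3(X); w1(X); r2(X); w3(X)",
    "b": "r1(X); r3(X); w3(X); w1(X); r2(X)",    # assumed reading of (b)
    "c": "r3(X); r2(X); w3(X); r1(X); w1(X)",
    "d": "r3(X); r2(X); r1(X); w3(X); w1(X)",
}
for name, s in schedules.items():
    print(name, "not serializable" if has_cycle(precedence_edges(s)) else "serializable")
```

Only schedule (c) should be reported as serializable.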
20.23. Consider the three transactions T1, T2, and T3, and the schedules S1 and S2 given
below. Draw the serializability (precedence) graphs for S1 and S2, and state whether each
schedule is serializable or not. If a schedule is serializable, write down the equivalent serial
schedule(s).
T1: r1 (X); r1 (Z); w1 (X);
T2: r2 (Z); r2 (Y); w2 (Z); w2 (Y);
T3: r3 (X); r3 (Y); w3(Y);
S1: r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); w3 (Y); r2 (Y); w2 (Z); w2 (Y);
S2: r1 (X); r2(Z); r3(X); r1(Z) ; r2(Y) ; r3(Y) ; w1 (X); w2 (Z) ; w3( Y) ; w2(Y);
Schedule: S1
Time   T1       T2       T3
t0     r1(X)
t1              r2(Z)
t2     r1(Z)
t3                       r3(X)
t4                       r3(Y)
t5     w1(X)
t6                       w3(Y)
t7              r2(Y)
t8              w2(Z)
t9              w2(Y)
From the above schedule table, we can determine the conflicting operations and dependencies:
Conflicting Operations    Dependencies between the Transactions
r1(Z), w2(Z)              T1 -> T2
r3(X), w1(X)              T3 -> T1
r3(Y), w2(Y)              T3 -> T2
w3(Y), r2(Y)              T3 -> T2
w3(Y), w2(Y)              T3 -> T2
Using the above two tables we can draw the precedence graph, which has the edges T1 -> T2, T3 -> T1, and T3 -> T2.
Since there are no cycles in the precedence graph, schedule S1 is serializable. The equivalent serial schedule is T3 -> T1 -> T2.
Schedule: S2
Time   T1       T2       T3
t0     r1(X)
t1              r2(Z)
t2                       r3(X)
t3     r1(Z)
t4              r2(Y)
t5                       r3(Y)
t6     w1(X)
t7              w2(Z)
t8                       w3(Y)
t9              w2(Y)
From the above schedule table, we can determine the conflicting operations and dependencies:
Conflicting Operations    Dependencies between the Transactions
r3(X), w1(X)              T3 -> T1
r1(Z), w2(Z)              T1 -> T2
r2(Y), w3(Y)              T2 -> T3
r3(Y), w2(Y)              T3 -> T2
Using the above two tables we can draw the precedence graph, which has the edges T3 -> T1, T1 -> T2, T2 -> T3, and T3 -> T2.
From the above precedence graph, we can see that there is a cycle between T2 and T3 (T2 -> T3 -> T2).
Hence, schedule S2 is not serializable.
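Once the dependencies are tabulated, an equivalent serial order (if one exists) is simply a topological order of the precedence graph. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) on the dependencies listed for S1 and S2; the helper name `serial_order` is hypothetical.

```python
from graphlib import TopologicalSorter, CycleError

TRANSACTIONS = ["T1", "T2", "T3"]

# Dependencies (before, after) as tabulated for each schedule.
S1_EDGES = {("T1", "T2"), ("T3", "T1"), ("T3", "T2")}
S2_EDGES = {("T3", "T1"), ("T1", "T2"), ("T2", "T3"), ("T3", "T2")}

def serial_order(edges, transactions):
    """Return an equivalent serial order, or None if the precedence graph has a cycle."""
    ts = TopologicalSorter({t: set() for t in transactions})
    for before, after in edges:
        ts.add(after, before)     # 'after' may only run once 'before' has finished
    try:
        return list(ts.static_order())
    except CycleError:
        return None

print(serial_order(S1_EDGES, TRANSACTIONS))   # ['T3', 'T1', 'T2']
print(serial_order(S2_EDGES, TRANSACTIONS))   # None
```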