Uploaded by Olyviah Amwoma

ICS 3101 - Advanced database systems - July 2020

FACULTY OF INFORMATION TECHNOLOGY
BACHELOR OF INFORMATICS AND COMPUTER SCIENCE
END OF SEMESTER EXAMINATION
ICS 3101: ADVANCED DATABASE SYSTEMS
DATE: July 2020
Time: 2 Hours
Instructions
1. This examination consists of FIVE questions.
2. Answer Question ONE (COMPULSORY) and any other TWO questions.
Question ONE (30 Marks)
a) Consider a disk with block size B = 512 bytes. A block pointer is P = 6 bytes long, and
a record pointer is PR = 7 bytes long. A file has r = 30,000 EMPLOYEE records of fixed
length. Each record has the following fields: Name (30 bytes), Ssn (9 bytes),
Department_code (9 bytes), Address (40 bytes), Phone (10 bytes), Birth_date (8 bytes),
Gender (1 byte), Job_code (4 bytes), and Salary (4 bytes, real number). An additional
byte is used as a deletion marker.
i) Calculate the record size R in bytes. (2 Marks)
ii) Calculate the blocking factor bfr and the number of file blocks b, assuming an
unspanned organization. (2 Marks)
iii) Suppose that the file is ordered by the key field Ssn and we want to construct a
primary index on Ssn. Calculate:
a. The index blocking factor bfri (which is also the index fan-out fo) (2 Marks)
b. The number of first-level index entries and the number of first-level
index blocks (2 Marks)
c. The number of levels needed if we make it into a multilevel index (2
Marks)
d. The total number of blocks required by the multilevel index (3 Marks)
e. The number of block accesses needed to search for and retrieve a record
from the file given its Ssn value using the primary index. (2 Marks)
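The quantities in part (a) chain together through a few floor and ceiling divisions. The following is a minimal sketch, assuming the usual textbook formulas (unspanned records, one first-level index entry per file block, and each index entry holding an Ssn value plus a 6-byte block pointer). The variable names are illustrative and are not part of the examination paper.

```python
import math

B = 512        # block size in bytes
P = 6          # block pointer size in bytes
r = 30_000     # number of EMPLOYEE records

# Field sizes in bytes, as listed in the question.
field_sizes = {
    "Name": 30, "Ssn": 9, "Department_code": 9, "Address": 40,
    "Phone": 10, "Birth_date": 8, "Gender": 1, "Job_code": 4, "Salary": 4,
}

R = sum(field_sizes.values()) + 1      # +1 byte for the deletion marker
bfr = B // R                           # records per block (unspanned)
b = math.ceil(r / bfr)                 # number of file blocks

# Assumption: a primary index entry is (Ssn value, block pointer).
index_entry = field_sizes["Ssn"] + P
bfri = B // index_entry                # index blocking factor (fan-out)

# Build index levels until a level fits in a single block.
level_blocks = []
entries = b                            # one first-level entry per file block
while True:
    blocks = math.ceil(entries / bfri)
    level_blocks.append(blocks)
    if blocks == 1:
        break
    entries = blocks                   # next level indexes this level's blocks

levels = len(level_blocks)
total_index_blocks = sum(level_blocks)
accesses = levels + 1                  # one block per index level + the data block

print(R, bfr, b, bfri, level_blocks, levels, total_index_blocks, accesses)
```

Under these assumptions the loop stops as soon as one block can hold all entries of the level below, which is what defines the top level of a multilevel index.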
b) Conceptual modelling is an important step in data warehousing design. With the aid of a
diagram, discuss three schemas used in the conceptual modelling of a data warehouse.
(6 Marks)
c) Describe the circumstances in which you would choose to use embedded SQL rather
than SQL alone or only a general-purpose programming language. (2 Marks)
d) Consider the bank database schema given below. Write an SQL trigger to carry out the
following action: On delete of an account, for each owner of the account, check if the
owner has any remaining accounts, and if she does not, delete her from the Deposit
relation. (4 Marks)
Branch (branchName, branchCity, assets)
Customer (customerName, customerStreet, customerCity)
Loan (loanNumber, branchName, amount)
Borrower (customerName, loanNo)
Account (accountNo, branchName, balance)
Deposit (customerName, AccountNo)
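One way to experiment with the trigger described in part (d) is through an in-memory SQLite database. The sketch below is only an illustration under stated assumptions: it uses SQLite's trigger dialect (other systems differ in syntax), creates only the Account and Deposit tables from the schema, and the trigger name `remove_orphan_depositors` and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Account (accountNo TEXT PRIMARY KEY, branchName TEXT, balance REAL);
CREATE TABLE Deposit (customerName TEXT, AccountNo TEXT);

-- Hypothetical trigger name; runs once for every deleted Account row.
CREATE TRIGGER remove_orphan_depositors
AFTER DELETE ON Account
FOR EACH ROW
BEGIN
    -- Remove owners of the deleted account who hold no other account.
    DELETE FROM Deposit
    WHERE AccountNo = OLD.accountNo
      AND customerName NOT IN (
          SELECT customerName FROM Deposit
          WHERE AccountNo <> OLD.accountNo
      );
END;

-- Hypothetical sample data: alice owns one account, bob owns two.
INSERT INTO Account VALUES ('A-101', 'Downtown', 500.0), ('A-102', 'Uptown', 900.0);
INSERT INTO Deposit VALUES ('alice', 'A-101'), ('bob', 'A-101'), ('bob', 'A-102');
""")

conn.execute("DELETE FROM Account WHERE accountNo = 'A-101'")
remaining = sorted(conn.execute("SELECT customerName, AccountNo FROM Deposit").fetchall())
print(remaining)
```

In this sketch alice disappears from Deposit because A-101 was her only account, while bob's rows are left alone because he still owns A-102.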
e) Define the following terms in relation to file structure and organization (3 Marks)
i. Heap file
ii. Sorted file
iii. Transfer rate
Question TWO (15 Marks)
a) Consider these relations with the following properties:
r(A, B, C): 30,000 tuples, 25 tuples fit on 1 block
s(C, D, E): 60,000 tuples, 30 tuples fit on 1 block
i) Estimate the number of disk block accesses required for a natural join of r and s
using a nested-loop join if r is used as the outer relation. Show the calculation for
best and worst case scenario (4 Marks)
ii) Estimate the number of disk block accesses required for a natural join of r and s
using a block nested-loop join if s is used as the outer relation. Assume that there are
more than 2000 memory buffers available to facilitate this operation, where each
memory buffer can buffer one disk block. Show the calculation for the best and
worst case scenario (4 Marks)
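The estimates in part (a) can be sketched with the commonly taught cost formulas for nested-loop and block nested-loop joins. This is an illustration under assumptions: costs count block transfers only, the worst case means one buffer block per relation, the best case means the inner relation (or both relations) fits in memory, and the helper `bnlj_with_buffers` with its `M - 2` outer-chunk size is a hypothetical name for the standard buffered variant.

```python
import math

n_r, n_s = 30_000, 60_000          # tuples in r and s
b_r = math.ceil(n_r / 25)          # blocks of r
b_s = math.ceil(n_s / 30)          # blocks of s

# (i) Nested-loop join, r as the outer relation.
nlj_worst = n_r * b_s + b_r        # scan s once per tuple of r, plus one scan of r
nlj_best = b_r + b_s               # inner relation stays in memory: read each once

# (ii) Block nested-loop join, s as the outer relation.
bnlj_worst = b_s * b_r + b_s       # scan r once per block of s, plus one scan of s
bnlj_best = b_s + b_r              # read each relation once


def bnlj_with_buffers(b_outer: int, b_inner: int, M: int) -> int:
    """Cost when M buffer blocks are available: M - 2 hold the outer chunk,
    one holds the current inner block and one holds the output."""
    return math.ceil(b_outer / (M - 2)) * b_inner + b_outer


print(b_r, b_s, nlj_worst, nlj_best, bnlj_worst, bnlj_best)
print(bnlj_with_buffers(b_s, b_r, 2002))
```

With 2002 buffers the whole outer relation s fits in one chunk, so the buffered cost collapses to the best case. Whether "more than 2000" buffers is enough depends on how many buffers are reserved for the inner relation and the output.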
b) Discuss the index that would be the most efficient to evaluate the following SQL query.
Justify your answer (3 Marks)
SELECT S.Id
FROM Student S
WHERE S.grade <= 2 AND S.grade >= 3.3
c) The CAP theorem and BASE properties are central to NoSQL databases. Discuss how these
properties improve the performance of the database. (4 Marks)
Question THREE (15 Marks)
A STUDENT file with STUDENTID as the hash key includes records with the following
STUDENTID values: 123, 456, 789, 102, 131, 415, 161, 718, 192, 021, 222, 324, 252, 627,
282, 930. The file uses 8 buckets, numbered 0 to 7. Each bucket is one disk block and holds two
records.
a) Load these records into the file in the given order, using the hash function h(K)
= K mod 8. Clearly show the hashing table and the buckets. (6 Marks)
b) Calculate the average number of block accesses for a random retrieval on
STUDENTID. (3 Marks)
c) Discuss three collision resolution methods which are used to handle overflows
when hashing. (6 Marks)
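The loading step in part (a) and the average in part (b) can be simulated in a few lines. The sketch below is an illustration under assumptions: overflow records are chained from their home bucket (one of several possible collision resolution schemes), reaching the j-th record on a chain costs one extra block access per link, and every record is equally likely to be requested. The value 021 is written as 21 because Python does not accept integer literals with a leading zero.

```python
NUM_BUCKETS = 8
CAPACITY = 2   # records per bucket (one disk block)

keys = [123, 456, 789, 102, 131, 415, 161, 718,
        192, 21, 222, 324, 252, 627, 282, 930]

buckets = [[] for _ in range(NUM_BUCKETS)]    # primary blocks
overflow = [[] for _ in range(NUM_BUCKETS)]   # assumed overflow chain per bucket

for k in keys:
    h = k % NUM_BUCKETS                       # h(K) = K mod 8
    if len(buckets[h]) < CAPACITY:
        buckets[h].append(k)
    else:
        overflow[h].append(k)                 # home bucket is full

# Access cost: 1 for a record in its home bucket,
# 2 + position for a record on the overflow chain.
total_accesses = 0
for h in range(NUM_BUCKETS):
    total_accesses += len(buckets[h]) * 1
    total_accesses += sum(2 + j for j in range(len(overflow[h])))

average_accesses = total_accesses / len(keys)

for h in range(NUM_BUCKETS):
    print(h, buckets[h], overflow[h])
print(average_accesses)
```

A different overflow scheme, such as open addressing, would place the two overflow records elsewhere and could change the average.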
Question FOUR (15 Marks)
a) Discuss the differences among primary, secondary, and clustering indexes. How do
these differences affect the ways in which these indexes are implemented? (2 Marks)
b) Discuss three key properties of a data warehouse. (3 Marks)
c) Discuss how a B-tree differs from a B+-tree. Why is a B+-tree the most preferred
access structure to a data file? (4 Marks)
d) Construct a B+-tree of order 3 with the following set of data: 8, 5, 1, 7, 3, 12, 9, 6. Then
delete 5, 12 and 9 from the tree. (6 Marks)
Question FIVE (15 Marks)
Use the following Company schema to answer Question FIVE.
Employee (Fname, Ssn(PK), Bdate, Address, Gender, Salary, Supervisor_Ssn, Dno(FK))
Department (Dname, Dnumber(PK), Manager_Ssn, Mng_start_date)
a) Define the following terms
i. Windowing (2 Marks)
ii. Stored procedures (2 Marks)
b) Suppose we want to check whenever an employee’s salary is greater than the salary of
his/her direct supervisor in the company database. This action would call the external stored
procedure SALARY_VIOLATION. Write the SQL syntax for this action. (3 Marks)
c) Data helps analysts to make informed decisions in an organization. Discuss five main
differences between OLAP and OLTP and explain the need for a separate data
warehouse. (5 Marks)
d) Discuss the following terms in relation to data warehousing: data mart, data
extraction and dimension. (3 Marks)