FACULTY OF INFORMATION TECHNOLOGY
BACHELOR OF INFORMATICS AND COMPUTER SCIENCE
END OF SEMESTER EXAMINATION
ICS 3101: ADVANCED DATABASE SYSTEMS
DATE: July 2020
Time: 2 Hours

Instructions
1. This examination consists of FIVE questions.
2. Answer Question ONE (COMPULSORY) and any other TWO questions.

Question ONE (30 Marks)

a) Consider a disk with block size B = 512 bytes. A block pointer is P = 6 bytes long, and a record pointer is PR = 7 bytes long. A file has r = 30,000 EMPLOYEE records of fixed length. Each record has the following fields: Name (30 bytes), Ssn (9 bytes), Department_code (9 bytes), Address (40 bytes), Phone (10 bytes), Birth_date (8 bytes), Gender (1 byte), Job_code (4 bytes), and Salary (4 bytes, real number). An additional byte is used as a deletion marker.
   i) Calculate the record size R in bytes. (2 Marks)
   ii) Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned organization. (2 Marks)
   iii) Suppose that the file is ordered by the key field Ssn and we want to construct a primary index on Ssn. Calculate:
      a. The index blocking factor bfri (which is also the index fan-out fo) (2 Marks)
      b. The number of first-level index entries and the number of first-level index blocks (2 Marks)
      c. The number of levels needed if we make it into a multilevel index (2 Marks)
      d. The total number of blocks required by the multilevel index (3 Marks)
      e. The number of block accesses needed to search for and retrieve a record from the file, given its Ssn value, using the primary index (2 Marks)

b) Conceptual modelling is an important step in data warehouse design. With the aid of diagrams, discuss three schemas used in the conceptual modelling of a data warehouse. (6 Marks)

c) Describe the circumstances in which you would choose to use embedded SQL rather than SQL alone or only a general-purpose programming language. (2 Marks)

d) Consider the bank database schema given below.
Write an SQL trigger to carry out the following action: on delete of an account, for each owner of the account, check whether the owner has any remaining accounts and, if she does not, delete her from the depositor relation. (4 Marks)

   Branch (branchName, branchCity, assets)
   Customer (customerName, customerStreet, customerCity)
   Loan (loanNumber, branchName, amount)
   Borrower (customerName, loanNo)
   Account (accountNo, branchName, balance)
   Depositor (customerName, accountNo)

e) Define the following terms in relation to file structure and organization. (3 Marks)
   i. Heap file
   ii. Sorted file
   iii. Transfer rate

Question TWO (15 Marks)

a) Consider these relations with the following properties:

   Relation      Tuples     Tuples per block
   r(A, B, C)    30,000     25
   s(C, D, E)    60,000     30

   i) Estimate the number of disk block accesses required for a natural join of r and s using a nested-loop join if r is used as the outer relation. Show the calculation for the best-case and worst-case scenarios. (4 Marks)
   ii) Estimate the number of disk block accesses required for a natural join of r and s using a block nested-loop join if s is used as the outer relation. Assume that there are more than 2000 memory buffers available to facilitate this operation, where each memory buffer can hold one disk block. Show the calculation for the best-case and worst-case scenarios. (4 Marks)

b) Discuss the index that would be the most efficient to evaluate the following SQL query. Justify your answer. (3 Marks)

   SELECT S.Id
   FROM Student S
   WHERE S.grade <= 3.3 AND S.grade >= 2

c) The CAP theorem and the BASE properties are central to NoSQL databases. Discuss how these properties improve the performance of the database. (4 Marks)

Question THREE (15 Marks)

A STUDENT file with STUDENTID as the hash key includes records with the following STUDENTID values:

   123, 456, 789, 102, 131, 415, 161, 718, 192, 021, 222, 324, 252, 627, 282, 930

The file uses 8 buckets, numbered 0 to 7. Each bucket is one disk block and holds two records.
a) Load these records into the file in the given order, using the hash function h(K) = K mod 8. Clearly show the hash table and the buckets. (6 Marks)

b) Calculate the average number of block accesses for a random retrieval on STUDENTID. (3 Marks)

c) Discuss three collision resolution methods that are used to handle overflows when hashing. (6 Marks)

Question FOUR (15 Marks)

a) Discuss the differences among primary, secondary, and clustering indexes. How do these differences affect the ways in which these indexes are implemented? (2 Marks)

b) Discuss three key properties of a data warehouse. (3 Marks)

c) Discuss how a B-tree differs from a B+-tree. Why is a B+-tree the most preferred access structure to a data file? (4 Marks)

d) Construct a B+-tree of order 3 with the following set of data: 8, 5, 1, 7, 3, 12, 9, 6. Then delete 5, 12 and 9 from the tree. (6 Marks)

Question FIVE (15 Marks)

Use the following Company schema to answer Question FIVE.

   Employee (Fname, Ssn(PK), Bdate, Address, Gender, Salary, Supervisor_Ssn, Dno(FK))
   Department (Dname, Dnumber(PK), Manager_Ssn, Mng_start_date)

a) Define the following terms:
   i. Windowing (2 Marks)
   ii. Stored procedures (2 Marks)

b) Suppose we want to check whenever an employee’s salary is greater than the salary of his/her direct supervisor in the company database. This action would call the external stored procedure SALARY_VIOLATION. Write the SQL syntax for this action. (3 Marks)

c) Data helps analysts to make informed decisions in an organization. Discuss five main differences between OLAP and OLTP, and explain the need for a separate data warehouse. (5 Marks)

d) Discuss the following terms in relation to data warehousing: data mart, data extraction, and dimension. (3 Marks)
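
Illustrative note on Question THREE (a) and (b). The loading procedure described there (h(K) = K mod 8, 8 buckets, two records per bucket) can be sketched in Python as below. This is a minimal sketch and not a marking scheme. It assumes overflow records are placed in a separate overflow chain per bucket (chaining), which is only one of the possible collision resolution methods, and it assumes that a record in its home bucket costs 1 block access while a record in the overflow chain costs 2. The names load_buckets and average_accesses are hypothetical helpers introduced for this illustration.

```python
from collections import defaultdict


def load_buckets(keys, n_buckets=8, capacity=2):
    """Hash each key with h(K) = K mod n_buckets.

    A key goes into its home bucket while that bucket has room;
    otherwise it is appended to that bucket's overflow chain
    (assumption: chaining is the overflow method in use).
    """
    buckets = {i: [] for i in range(n_buckets)}
    overflow = defaultdict(list)
    for key in keys:
        home = key % n_buckets
        if len(buckets[home]) < capacity:
            buckets[home].append(key)
        else:
            overflow[home].append(key)
    return buckets, dict(overflow)


def average_accesses(buckets, overflow):
    """Average block accesses for a random successful retrieval.

    Assumption: 1 access for a record in its home bucket,
    2 accesses for a record in the overflow chain.
    """
    in_home = sum(len(records) for records in buckets.values())
    in_overflow = sum(len(records) for records in overflow.values())
    return (in_home * 1 + in_overflow * 2) / (in_home + in_overflow)


# STUDENTID values in the order given; 021 is written as 21 because
# Python does not allow integer literals with a leading zero.
keys = [123, 456, 789, 102, 131, 415, 161, 718,
        192, 21, 222, 324, 252, 627, 282, 930]

buckets, overflow = load_buckets(keys)
for number in sorted(buckets):
    print(number, buckets[number], "overflow:", overflow.get(number, []))
print("average block accesses:", average_accesses(buckets, overflow))
```

Changing the capacity argument or the overflow assumption changes both the bucket layout and the average, so any written answer should state which overflow method it uses.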