Bahria University
Department of Computer Sciences
Advanced Databases (Spring 2014)
Lab08: Using Distributed DB Design Concepts

Objectives
- To understand the fragmentation concept of distributed databases
- To understand the replication concept of distributed databases

Fragmentation

A horizontal fragment of a relation is a subset of the tuples in that relation, specified by a condition on one or more attributes, where each subset has a certain logical meaning.

Vertical fragmentation divides a relation “vertically” by columns, keeping only certain attributes of the relation.

Mixed (hybrid) fragmentation is achieved when a relation is fragmented both horizontally and vertically.

Replication

A fully replicated distributed database is one in which the whole database is replicated (copied) at every site in the distributed system.

In partial replication, some fragments of the database are replicated whereas others are not. The number of copies of each fragment can range from one up to the total number of sites in the distributed system.

Example of Fragmentation and Replication

Using the database in Figures 3.5 and 3.6, suppose that the company has three computer sites. Site 1 is used by company headquarters and accesses all employee and project information regularly, in addition to keeping track of DEPENDENT information for insurance purposes. In other words, the whole database in Figure 3.6 can be stored at site 1.

Sites 2 and 3 are for departments 5 and 4, respectively. At each of these sites, we expect frequent access to the EMPLOYEE and PROJECT information for the employees who work in that department and the projects controlled by that department. Further, we assume that these sites mainly access the Name, Ssn, Salary, and Super_ssn attributes of EMPLOYEE.
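The three kinds of fragmentation defined above can each be simulated with a view. The following is a minimal sketch using Python's built-in sqlite3 module. The simplified EMPLOYEE schema, the placeholder rows, and the view names EMP_H5, EMP_V, and EMP_MIXED_5 are assumptions made for illustration; they are not the schema or data of Figures 3.5 and 3.6.

```python
import sqlite3

# In-memory database. The simplified schema and the rows below are
# illustrative placeholders, not the data of Figure 3.6.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    Name TEXT, Ssn TEXT PRIMARY KEY, Address TEXT,
    Salary INTEGER, Super_ssn TEXT, Dno INTEGER
);
INSERT INTO EMPLOYEE VALUES
    ('Emp A', '111', 'Addr 1', 30000, '333', 5),
    ('Emp B', '222', 'Addr 2', 25000, '444', 4),
    ('Emp C', '333', 'Addr 3', 40000, NULL,  5),
    ('Emp D', '444', 'Addr 4', 43000, NULL,  4);

-- Horizontal fragment: a subset of the tuples, chosen by a condition.
CREATE VIEW EMP_H5 AS
    SELECT * FROM EMPLOYEE WHERE Dno = 5;

-- Vertical fragment: a subset of the columns. The key Ssn is kept so
-- that the relation could be rebuilt by joining vertical fragments.
CREATE VIEW EMP_V AS
    SELECT Name, Ssn, Salary, Super_ssn FROM EMPLOYEE;

-- Mixed (hybrid) fragment: both restrictions applied together.
CREATE VIEW EMP_MIXED_5 AS
    SELECT Name, Ssn, Salary, Super_ssn, Dno FROM EMPLOYEE WHERE Dno = 5;
""")

horizontal = conn.execute("SELECT Ssn FROM EMP_H5 ORDER BY Ssn").fetchall()
vertical_cols = [d[0] for d in conn.execute("SELECT * FROM EMP_V").description]
mixed = conn.execute("SELECT Name, Dno FROM EMP_MIXED_5 ORDER BY Ssn").fetchall()

print(horizontal)     # employees of department 5 only
print(vertical_cols)  # only the projected attributes
print(mixed)          # department 5 rows, restricted columns
```

A view only simulates a fragment on a single machine. In a real distributed database each fragment would be stored at its own site.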
To determine the fragments to be replicated at sites 2 and 3, we can first horizontally fragment DEPARTMENT by its key Dnumber. We then apply derived fragmentation to the EMPLOYEE, PROJECT, and DEPT_LOCATIONS relations based on their foreign keys for department number, called Dno, Dnum, and Dnumber, respectively. We can vertically fragment the resulting EMPLOYEE fragments to include only the attributes {Name, Ssn, Salary, Super_ssn, Dno}.

We must now fragment the WORKS_ON relation and decide which fragments of WORKS_ON to store at sites 2 and 3. We are confronted with the problem that no attribute of WORKS_ON directly indicates the department to which each tuple belongs. In fact, each tuple in WORKS_ON relates an employee e to a project P. We could fragment WORKS_ON based on the department D in which e works or based on the department D' that controls P.

Exercises

Exercise 1

1. Using the example given above, perform the following steps:
   a) Create all the example tables and add all the sample data given above.
   b) For the fragmentation at sites 2 and 3, draw the tables on paper along with their data.
   c) Use SQL statements to create views that fulfill the requirements given in the example, simulating the fragmentation for sites 2 and 3. Name every view with the site suffix; for example, for the fragmentation of the EMPLOYEE relation at site 2, use EMPLOYEE_2. The views should contain the correct data and the correct columns.
   d) After creating the views, run SELECT queries on them to verify that the fragmentation is correct.
   Note: For a proper understanding of the steps to follow, read the details of the example given in the book (page 898).
2. Is there any replication involved in fulfilling the requirements? If yes, where?
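As a starting point for Exercise 1(c), the derived fragmentation and the two WORKS_ON options described in the example can be sketched for site 2 (department 5) as follows. This is a sketch under stated assumptions, not a model answer. The table definitions are simplified, the rows are placeholders rather than the data of Figure 3.6, the column names Essn, Pno, Hours, Pname, Pnumber, and Dname are assumed from the usual COMPANY schema, and the view names WORKS_ON_2_BY_EMP and WORKS_ON_2_BY_PROJ are hypothetical.

```python
import sqlite3

# In-memory database with simplified tables and placeholder rows
# (assumptions for illustration, not the data of Figure 3.6).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DEPARTMENT (Dname TEXT, Dnumber INTEGER PRIMARY KEY);
CREATE TABLE EMPLOYEE (Name TEXT, Ssn TEXT PRIMARY KEY, Salary INTEGER,
                       Super_ssn TEXT, Dno INTEGER);
CREATE TABLE PROJECT (Pname TEXT, Pnumber INTEGER PRIMARY KEY, Dnum INTEGER);
CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER, Hours REAL);

INSERT INTO DEPARTMENT VALUES ('Dept five', 5), ('Dept four', 4);
INSERT INTO EMPLOYEE VALUES
    ('Emp A', '111', 30000, '333', 5),
    ('Emp B', '222', 25000, '444', 4);
INSERT INTO PROJECT VALUES ('Proj X', 1, 5), ('Proj Y', 10, 4);
-- Emp A (dept 5) works on both projects; Emp B (dept 4) only on Proj Y.
INSERT INTO WORKS_ON VALUES
    ('111', 1, 20.0), ('111', 10, 5.0), ('222', 10, 30.0);

-- Primary horizontal fragment of DEPARTMENT for site 2 (department 5).
CREATE VIEW DEPARTMENT_2 AS
    SELECT * FROM DEPARTMENT WHERE Dnumber = 5;

-- Derived fragments: tuples whose foreign key matches a DEPARTMENT_2 tuple.
-- EMPLOYEE_2 is also restricted to the listed attributes (vertical step).
CREATE VIEW EMPLOYEE_2 AS
    SELECT Name, Ssn, Salary, Super_ssn, Dno FROM EMPLOYEE
    WHERE Dno IN (SELECT Dnumber FROM DEPARTMENT_2);
CREATE VIEW PROJECT_2 AS
    SELECT * FROM PROJECT
    WHERE Dnum IN (SELECT Dnumber FROM DEPARTMENT_2);

-- Option 1: WORKS_ON tuples of employees who work in department 5.
CREATE VIEW WORKS_ON_2_BY_EMP AS
    SELECT * FROM WORKS_ON WHERE Essn IN (SELECT Ssn FROM EMPLOYEE_2);

-- Option 2: WORKS_ON tuples of projects controlled by department 5.
CREATE VIEW WORKS_ON_2_BY_PROJ AS
    SELECT * FROM WORKS_ON WHERE Pno IN (SELECT Pnumber FROM PROJECT_2);
""")

by_emp = conn.execute(
    "SELECT Essn, Pno FROM WORKS_ON_2_BY_EMP ORDER BY Essn, Pno").fetchall()
by_proj = conn.execute(
    "SELECT Essn, Pno FROM WORKS_ON_2_BY_PROJ ORDER BY Essn, Pno").fetchall()

print(by_emp)   # fragment chosen by the employee's department
print(by_proj)  # fragment chosen by the project's controlling department
```

With these placeholder rows the two options give different fragments, because one employee of department 5 also works on a project controlled by department 4. This is the choice the example leaves open for WORKS_ON.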