DBMS Question Bank - Unit 1

CS6302 – DBMS VENKATESH.R AP/CSE Vi INSTITUTE OF TECHNOLOGY DEPARTMENT OF COMPUTER SCIENCE QUESTION BANK FOR DATABASE MANAGEMENT SYSTEM QUESTION BANK – UNIT I PART – A 1. Define database and database management systems. A database-management system (DBMS) is a collection of interrelated data and a set of programs to access those data. The collection of data, usually referred to as the database, contains information relevant to an enterprise. 2. What is the primary goal of DBMS? The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and efficient 3. List the applications of database system.  Banking  Airlines  Universities  Credit card transactions  Telecommunication  Finance  Sales  Manufacturing  Human Resources 4. What are the disadvantages of file processing system?  Data redundancy and inconsistency  Difficulty in accessing data  Data Isolation  Integrity problems  Atomicity problems  Concurrent-access anomalies  Security problems 5. What are the advantages of using database system?  Controlling redundancy  Restricting unauthorized access  Providing multiple user interfaces  Enforcing integrity constraints.  Providing back up and recovery 6. List the levels of data abstraction  Physical level 1 CS6302 – DBMS   VENKATESH.R AP/CSE Logical level View level 7. Define Instances and Schema Instance: Collection of data stored in the data base at a particular moment is called an Instance of the database. Schema: The overall design of the data base is called the data base schema. 8. Define the terms physical schema and logical schema Physical schema: The physical schema describes the database design at the physical level, which is the lowest level of abstraction describing how the data are actually stored. Logical schema: The logical schema describes the database design at the logical level, which describes what data are stored in the database and what relationship exists among the data. 9. What is conceptual schema The schemas at the view level are called subschema’s that describe different views of the database. 10. Define data model A data model is a collection of conceptual tools for describing data, data relationships, data semantics and consistency constraints. 11. List the data models  Entity-Relationship model  Relational model  Object Oriented data model  Semi Structured data model  Network Model  Hierarchical model 12. What is a storage manager? List the components of storage manager. A storage manager is a program module that provides the interface between the low level data stored in a database and the application programs and queries submitted to the system. The storage manager components include  Authorization and integrity manager  Transaction manager  File manager  Buffer manager 13. What is the purpose of storage manager? The storage manager is responsible for the following  Interaction with he file manager  Translation of DML commands in to low level file system commands  Storing, retrieving and updating data in the database 14. List the data structures implemented by the storage manager. The storage manager implements the following data structure 2 CS6302 – DBMS VENKATESH.R AP/CSE Data files Data dictionary Indices 15. What is a data dictionary? A data dictionary is a data structure which stores metadata about the structure of the database ie. the schema of the database. 16. What is an entity relationship model? The entity relationship model is a collection of basic objects called entities and relationship among those objects. An entity is a thing or object in the real world that is distinguishable from other objects. 17. What are attributes? Give examples. An entity is represented by a set of attributes. Attributes are descriptive properties possessed by each member of an entity set. Example: possible attributes of customer entity are customer name, customer id, Customer Street, customer city. 18. What is relationship? Give examples A relationship is an association among several entities. Example: A depositor relationship associates a customer with each account that he/she has. 19. Define the terms Entity set and Relationship set Entity set: The set of all entities of the same type is termed as an entity set. Relationship set: The set of all relationships of the same type is termed as a relationship set. 20. Define single valued and multivalued attributes. Single valued attributes: attributes with a single value for a particular entity are called single valued attributes. Multivalued attributes: Attributes with a set of value for a particular entity are called multivalued attributes. 21. What are stored and derived attributes? Stored attributes: The attributes stored in a data base are called stored attributes. Derived attributes: The attributes that are derived from the stored attributes are called derived attributes. 22. What are composite attributes? Composite attributes can be divided in to sub parts. 23. Define null values. In some cases a particular entity may not have an applicable value for an attribute or if we do not know the value of an attribute for a particular entity. In these cases null value is used. 24. Define the term Entity type 3 CS6302 – DBMS VENKATESH.R AP/CSE Entity type: An entity type defines a collection of entities that have the same attributes. 25. What is meant by the degree of relationship set? The degree of relationship type is the number of participating entity types. 26. Define the terms Key attribute and Value set Key attribute: An entity type usually has an attribute whose values are distinct from each individual entity in the collection. Such an attribute is called a key attribute. Value set: Each simple attribute of an entity type is associated with a value set that specifies the set of values that may be assigned to that attribute for each individual entity. 27. Define weak and strong entity sets? Weak entity set: entity set that do not have key attribute of their own are called weak entity sets. Strong entity set: Entity set that has a primary key is termed a strong entity set. 28. What does the cardinality ratio specify? Mapping cardinalities or cardinality ratios express the number of entities to which another entity can be associated. Mapping cardinalities must be one of the following:  One to one  One to many  Many to one  Many to many 29. Explain the two types of participation constraint. Total: The participation of an entity set E in a relationship set R is said to be total if every entity in E participates in at least one relationship in R. Partial: if only some entities in E participate in relationships in R, the participation of entity set E in relationship R is said to be partial. 30. Define the terms DDL and DML DDL: Data base schema is specified by a set of definitions expressed by a special language called a data definition language. DML: A data manipulation language is a language that enables users to access or manipulate data as organized by the appropriate data model. 31. Write short notes on relational model The relational model uses a collection of tables to represent both data and the relationships among those data. The relational model is an example of a record based model. 32. Define tuple and attribute  Attributes: column headers  Tuple: Row 33. List and define the two classes of DML  Procedural – user specifies what data is required and how to get those data  Declarative (nonprocedural) – user specifies what data is required without specifying how to get those data 4 CS6302 – DBMS VENKATESH.R AP/CSE 34. What is the role of a database administrator?  Schema definition  Storage structure and access method definition  Schema and physical organization modification  Granting user authority to access the database  Specifying integrity constraints  Acting as liaison with users  Monitoring performance and responding to changes in requirements 35. Define and explain the two types of Data Independence. Two types of data independence are i. Physical data independence ii. Logical data independence Physical data independence: The ability to modify the physical schema without causing application programs to be rewritten in the higher levels. Modifications in physical level occur occasionally whenever there is a need to improve performance. Logical data independence: The ability to modify the logical schema without causing application programs to be rewritten in the higher level (external level). Modifications in physical level occur frequently more than that in physical level, whenever there is an alteration in the logical structure of the database. 36. Explain DML pre-compiler. DML precompiler converts DML statements embedded in an application program to normal procedure calls in the host language. The precompiler must interact with the DML compiler to generate the appropriate code. 37. Define file manager and buffer manager. File manager: File manager manages the allocation of space on disk storage and the data structures used to represent information stored on disk. Buffer manager: Buffer manager is responsible for fetching data from the disk storage into the main memory, and deciding what data to cache in memory. 38. Define specialization and generalization Specialization and generalization define a containment relationship between a higherlevel entity set and one or more lower-level entity sets. Specialization is the result of taking a subset of a higher-level entity set to form a lowerlevel entity set. Generalization is the result of taking the union of two or more disjoint (lower-level) entity sets to produce a higher-level entity set. The attributes of higher-level entity sets are inherited by lower-level entity sets. 39. Define the term aggregation Aggregation is an abstraction in which relationship sets (along with their associated entity sets) are treated as higher-level entity sets, and can participate in relationships. 5 CS6302 – DBMS VENKATESH.R AP/CSE 40. Define the term disjoint and overlapping • Disjoint. A disjointness constraint requires that an entity belong to no more than one lower-level entity set. In our example, an account entity can satisfy only one condition for the account-type attribute; an entity can be either a savings account or a checking account, but cannot be both. • Overlapping. In overlapping generalizations, the same entity may belong to more than one lower-level entity set within a single generalization. For an illustration, consider the employee work team example, and assume that certain managers participate in more than one work team. A given employee may therefore appear in more than one of the team entity sets that are lower-level entity sets of employee. Thus, the generalization is overlapping. As another example, suppose generalization applied to entity sets customer and employee leads to a higher-level entity set person. The generalization is overlapping if an employee can also be a customer. PART – B 1. a) Explain the views of data. b) Explain the two types of data models in brief 2. Write short notes on database system structure. 3. How do database systems differ from file systems? Explain. 4. a) Discuss various design issues in E-R model. b) How to draw an E-R diagram? Explain its major components. 5. Write in detail about the extended E-R features. 6. Explain the design of an E-R database schema. 7. Write brief notes on database languages. 8. Explain weak entity sets with suitable example. 9. Draw an E-R diagram for Engineering College enterprise 10. Consider an employment agency computerizing its service. An employer may offer a number of jobs. Applicants may apply for many jobs. The job history of each applicant is recorded as previous job. Employer details include employer name and address. Job details include job code, title and salary. Applicant details include application number, name, address, phone, date of birth and applying date. Job history details include job history id, title, salary, duration and skill required. Construct an E-R diagram for the above scenario and transform it to a logical database design. 11. A weak entity set can always be made into a strong entity set by adding to its attributes the primary-key attributes of its identifying entity set. Outline what sort of redundancy will result if we do so. 12. Illustrate the issues to be considered while developing an ER-diagram. 13. Construct an ER diagram to a modal online book store. 14. Construct an ER diagram for a car insurance company that has a set of customers, each of whom owns one/more cars. Each car has associated with it zero to any number of recorded accidents. Construct appropriate tables for the above ER diagram. 15. Define data model. Explain the different types of data models with relevant examples. 6 CS6302 – DBMS VENKATESH.R AP/CSE QUESTION BANK – UNIT II PART – A 15. Define the term relation Relation is a subset of a Cartesian product of list domains. 16. Define tuple variable Tuple variable is a variable whose domain is the set of all tuples. 3. Define the term Domain. For each attribute there is a set of permitted values called the domain of that attribute. 4. What is a candidate key? Minimal super keys are called candidate keys. 5. What is a primary key? Primary key is chosen by the database designer as the principal means of identifying an entity in the entity set. 6. What is a super key? A super key is a set of one or more attributes that collectively allows us to identify uniquely an entity in the entity set. 7. Define- relational algebra. The relational algebra is a procedural query language. It consists of a set of operations that take one or two relation as input and produce a new relation as output. 8. What is a SELECT operation? The select operation selects tuples that satisfy a given predicate. We use the lowercase letter s to denote selection. 9. What is a PROJECT operation? The project operation is a unary operation that returns its argument relation with certain attributes left out. Projection is denoted by pie (p). 10. Write short notes on tuple relational calculus. The tuple relational calculation is anon procedural query language. It describes the desired information with out giving a specific procedure for obtaining that information. A query or expression can be expressed in tuple relational calculus as {t | P (t)} which means the set of all tuples‘t’ such that predicate P is true for‘t’. Notations used: • t[A] ® the value of tuple ‘t’ on attribute, A • t Î r ® tuple ‘t’ is in relation ‘r’ • $ ® there exists Definition for ‘there exists’ ($): $ t Î r(Q(t)) which means there exists a tuple ‘t’ in relation ‘r’ such that predicate Q(t) is true. • " ® for all Definition for ‘for all’ ("): "t Î r(Q(t)) which means Q(t) is true for all tuples ‘t’ in relation ‘r’. 7 CS6302 – DBMS VENKATESH.R AP/CSE • _ ® Implication Definition for Implication (_): P_Q means if P is true then Q must be true. 11. Write short notes on domain relational calculus The domain relational calculus uses domain variables that take on values from an attribute domain rather than values for entire tuple. 12. Define query language? A query is a statement requesting the retrieval of information. The portion of DML that involves information retrieval is called a query language. 13. Write short notes on Schema diagram. A database schema along with primary key and foreign key dependencies can be depicted pictorially by schema diagram. Each relation appears as a box with attributes listed inside it and the relation name above it. 14. What is foreign key? A relation schema r1 derived from an ER schema may include among its attributes the primary key of another relation schema r2.this attribute is called a foreign key from r1 referencing r2. 15. What are the parts of SQL language? The SQL language has several parts:  Data - definition language  Data manipulation language  View definition  Transaction control  Embedded SQL  Integrity  Authorization 16. What are the categories of SQL command? SQL commands are divided in to the following categories:  Data - definition language  Data manipulation language  Data Query language  Data control language  Data administration statements  Transaction control statements 17. What are the three classes of SQL expression? SQL expression consists of three clauses:  Select  From  Where 18. Give the general form of SQL query? Select A1, A2…………., An 8 CS6302 – DBMS VENKATESH.R AP/CSE From R1, R2……………, Rm Where P 19. What is the use of rename operation? Rename operation is used to rename both relations and a attributes. It uses the as clause, taking the form: Old-name as new-name 20. Define tuple variable? Tuple variables are used for comparing two tuples in the same relation. The tuple variables are defined in the from clause by way of the as clause. 21. List the string operations supported by SQL?  Pattern matching Operation  Concatenation  Extracting character strings  Converting between uppercase and lower case letters. 22. List the set operations of SQL?  Union  Intersect operation  The except operation 23. What is the use of Union and intersection operation? Union: The result of this operation includes all tuples that are either in r1 or in r2 or in both r1 and r2.Duplicate tuples are automatically eliminated. Intersection: The result of this relation includes all tuples that are in both r1 and r2. 24. What are aggregate functions? And list the aggregate functions supported by SQL? Aggregate functions are functions that take a collection of values as input and return a single value. Aggregate functions supported by SQL are Average: avg  Minimum: min  Maximum: max  Total: sum  Count: count 25. What is the use of group by clause? Group by clause is used to apply aggregate functions to a set of tuples. The attributes given in the group by clause are used to form groups. Tuples with the same value on all attributes in the group by clause are placed in one group. 26. What is the use of sub queries? 9 CS6302 – DBMS VENKATESH.R AP/CSE A sub query is a select-from-where expression that is nested with in another query. A common use of sub queries is to perform tests for set membership, make set comparisons, and determine set cardinality. 27. What is view in SQL? How is it defined? Any relation that is not part of the logical model, but is made visible to a user as a virtual relation is called a view. We define view in SQL by using the create view command. The form of the Create view command is Create view v as <query expression> 28. What is the use of with clause in SQL? The with clause provides a way of defining a temporary view whose definition is available only to the query in which the with clause occurs. 29. List the table modification commands in SQL?  Deletion  Insertion  Updates  Update of a view 30. List out the statements associated with a database transaction?  Commit work  Rollback work 31. What is transaction? Transaction is a unit of program execution that accesses and possibly updated various data items. 32. List the SQL domain Types? SQL supports the following domain types. 1) Char(n) 2) varchar(n) 3) int 4) numeric(p,d) 5) float(n) 6) date. 33. What is the use of integrity constraints? Integrity constraints ensure that changes made to the database by authorized users do not result in a loss of data consistency. Thus integrity constraints guard against accidental damage to the database. 34. Mention the 2 forms of integrity constraints in ER model?  Key declarations  Form of a relationship 35. What is trigger? Triggers are statements that are executed automatically by the system as the side effect of a modification to the database. 36. What are domain constraints? 10 CS6302 – DBMS VENKATESH.R AP/CSE A domain is a set of values that may be assigned to an attribute .all values that appear in a column of a relation must be taken from the same domain. 37. What are referential integrity constraints? A value that appears in one relation for a given set of attributes also appears for a certain set of attributes in another relation. 38. What is assertion? Mention the forms available. An assertion is a predicate expressing a condition that we wish the database always to satisfy.  Domain integrity constraints.  Referential integrity constraints 39. Give the syntax of assertion? Create assertion <assertion name>check<predicate> 36. What is the need for triggers? Triggers are useful mechanisms for alerting humans or for starting certain tasks automatically when certain conditions are met. 37. List the requirements needed to design a trigger. The requirements are  Specifying when a trigger is to be executed.  Specify the actions to be taken when the trigger executes. 38. Give the forms of triggers?  The triggering event can be insert or delete.  For updated the trigger can specify columns.  The referencing old row as clause  The referencing new row as clause  The triggers can be initiated before the event or after the event. 39. What does database security refer to? Database security refers to the protection from unauthorized access and malicious destruction or alteration. 40. List some security violations (or) name any forms of malicious access.  Unauthorized reading of data  Unauthorized modification of data  Unauthorized destruction of data. 41. List the types of authorization.  Read authorization  Write authorization  Update authorization 11 CS6302 – DBMS  VENKATESH.R AP/CSE Drop authorization 42. What is authorization graph? Passing of authorization from one user to another can be represented by an authorization graph. 43. List out various user authorization to modify the database schema.  Index authorization  Resource authorization  Alteration authorization  Drop authorization 44. What are audit trails? An audit trail is a log of all changes to the database along with information such as which user performed the change and when the change was performed. 45. Mention the various levels in security measures.  Database system  Operating system  Network  Physical  Human 46. Name the various privileges in SQL?  Delete  Select  Insert  Update 47. Mention the various user privileges.  All privileges directly granted to the user or role.  All privileges granted to roles that have been granted to the user or role. 48. Give the limitations of SQL authorization.  The code for checking authorization becomes intermixed with the rest of the application code.  Implementing authorization through application code rather than specifying it declaratively in SQL makes it hard to ensure the absence of loopholes. 49. Give some encryption techniques?  DES  AES  Public key encryption 50. What does authentication refer? Authentication refers to the task of verifying the identity of a person. 12 CS6302 – DBMS VENKATESH.R AP/CSE 51. List some authentication techniques.  Challenge response scheme  Digital signatures  Nonrepudiation 52. List the disadvantages of relational database system  Repetition of data  Inability to represent certain information. 53. Draw the schema diagram of a banking enterprise. 54. Define materialized views. Certain database systems allow view relations to be stored, but they make sure that, if the actual relations used in the view definition change, the view is kept up to date. Such views are called materialized views. 55. What does set membership do in nested sub queries? SQL draws on the relational calculus for operations that allow testing tuples for membership in a relation. The in connective tests for set membership, where the set is a collection of values produced by a select clause. The not in connective tests for the absence of set membership. 56. Define ODBC and explain its role. The Open DataBase Connectivity (ODBC) standard defines a way for an application program to communicate with a database server. ODBC defines an application program interface (API) that applications can use to open a connection with a database, send queries and updates, and get back results. Applications such as graphical user interfaces, statistics packages, and spreadsheets can make use of the same ODBC API to connect to any database server that supports ODBC. 57. List the Integrity Constraints.  Not Null  Unique  Check 13 CS6302 – DBMS VENKATESH.R AP/CSE 58. Write a SQL statement to find the names and loan numbers of all customers who have loan at Chennai branch. Select cust_name, loan_no from depositor, loan where branch_name=’Chennai’ and depositor.loan_no=loan.loan_no 59. List the different types of joins in SQL.  Inner Join  Left Outer Join  Right Outer Join  Full Outer Join 60. What is static SQL? How does it differ from dynamic SQL? The SQL statements in the program are static if they do not change each time the program in run time. Dynamic SQL statements can be built at run time and placed in a string host variable. They are then sent to the DBMS for processing. PART – B 1. Consider the following relational schema. An employee can work in more than one department. Emp(eid: integer, ename: string, salary: real) Works(eid: integer, did: integer) Dept(did: integer, dname: string, managerid: integer, floornum: integer) Write SQL queries for the following. Be sure to underline your variables to distinguish them from your constants. i. Print the names of all employees who work on the 10th floor and make less than $50,000. ii. Print the names of all managers who manage three or more departments on the same floor. iii. Print the names of all managers who manage 10 or more departments on the same floor. iv. Give every employee who works in the toy department a 10 percent raise v. Print the names of the departments that employee Santa works in. vi. Print the names and salaries of employees who work in both the toy department and the candy department. vii. Print the names of employees who earn a salary that is either less than $10,000 or more than $100,000. viii. Print all of the attributes for employees who work in some department that employee Santa also works in. ix. Fire Santa. x. Print the names of employees who make more than $20,000 and work in either the video department or the toy department. xi. Print the names of all employees who work on the floor(s) where Jane Dodecahedron works. xii. Print the name of each employee who earns more than the manager of the department that he or she works in. xiii. Print the name of each department that has a manager whose last name is Psmith and who is neither the highest-paid nor the lowest-paid employee in the department. 14 CS6302 – DBMS VENKATESH.R AP/CSE 2. Consider the relational database employee (empname, street,city) works (empname.companyname,salary) company(companyname,city) manages(empname,managername) and give relational algebraic expressions for the following queries: i. Find the names of all employees who work for first Bank Corporation. ii. Find the names.street addresses,and cities of residence of all employees who work for first bank corporation and earn more than 200000 per annum. iii. Find the names of all employees in this database who live in the city as the companies for which they work. iv. Find the names of all the employees who earn more than every employees of Small Corporation. 3. Let R= (A,B,C) and let r1and r2 both be relations on the schema R. Give an expression in the domain relational calculus that is equivalent to each of the following: i. ?A (r) ii. sB = 17 (r) iii. r1 n r2 iv. r1 U r2 v. r1 – r2 4. Explain the various keys involved in relational database with a suitable example. 5. Discuss about the triggers. How do triggers offer a powerful mechanism for dealing with the changes to database with suitable example? 6. What are nested queries? Explain with example. 7. What are the relational algebra operations supported in SQL? Write the SQL statement for each operation. 8. Explain the basic relational algebra operations with the symbol used with example. 9. Discuss about tuple relational calculus and domain relational calculus 10. Consider the database given by the following schemas: Customer( Cust _no, sales_ person _no, City) Sales_person(sales_person_no,Sales_person_name,common_prec,year_of_hire) Give the SQL for the following: i. Display the list of all customers by cust_no with the city in which each is located. ii. List the names of the sales persons who have accounts in Delhi. 11. Discuss the strengths and weaknesses of the trigger mechanism. Compare triggers with other integrity 12. Discuss Extended Relational-Algebra Operations in detail. 13. What is a domain type? List and explain the various domain types available in SQL with example. 14. Write brief notes on the following i. Set Operations ii. Aggregate Functions iii. Null Values 15. Define and Illustrate Integrity constraint with suitable examples. 16. State and explain Embedded SQL and Dynamic SQL in detail. 17. Write the following queries in SQL, based on this schema: Suppliers(sid: integer, sname: string, city: string) Parts(pid: integer, pname: string, color: string) Orders(sid: integer, pid: integer, quantity: integer) 15 CS6302 – DBMS VENKATESH.R AP/CSE i. For each supplier from whom all of the following things have been ordered in quantities of at least 150, print the name and city of the supplier: a blue gear, a red crankshaft, and a yellow bumper. ii. Print the names of the purple parts that have been ordered from suppliers located in Madison, Milwaukee, or Waukesha. iii. Print the names and cities of suppliers who have an order for more than 150 units of a yellow or purple part. iv. Print the pids of parts that have been ordered from a supplier named American but have also been ordered from some supplier with a different name in a quantity that is greater than the American order by at least 100 units. v. Print the names of the suppliers located in Madison. Could there be any duplicates in the answer? vi. Print all available information about suppliers that supply green parts. vii. For each order of a red part, print the quantity and the name of the part. viii. Print the names of the parts that come in both blue and green. (Assume that no two distinct parts can have the same name and color.) ix. Print (in ascending order alphabetically) the names of parts supplied both by a Madison supplier and by a Berkeley supplier. x. Print the names of parts supplied by a Madison supplier, but not supplied by any Berkeley supplier. Could there be any duplicates in the answer? xi. Print the total number of orders. xii. Print the largest quantity per order for each sid such that the minimum quantity per order for that supplier is greater than 100. xiii. Print the average quantity per order of red parts. 18. Consider the following schema Suppliers(sid: integer, sname: string, address: string) Parts(pid: integer, pname: string, color: string) Catalog(sid: integer, pid: integer, cost: real) The key elds are underlined, and the domain of each eld is listed after the eld name. Thus sid is the key for Suppliers, pid is the key for Parts, and sid and pid together form the key for Catalog. The Catalog relation lists the prices charged for parts by Suppliers. Write the following queries in relational algebra, tuple relational calculus, and domain relational calculus: i. Find the names of suppliers who supply some red part. ii. Find the sids of suppliers who supply some red or green part. iii. Find the sids of suppliers who supply some red part or are at 221 Packer Ave. iv. Find the sids of suppliers who supply some red part and some green part. v. Find the sids of suppliers who supply every part. vi. Find the sids of suppliers who supply every red part. vii. Find the sids of suppliers who supply every red or green part. viii. Find the sids of suppliers who supply every red part or supply every green part. ix. Find pairs of sids such that the supplier with the rst sid charges more for some part than the supplier with the second sid. x. Find the pids of parts that are supplied by at least two different suppliers. xi. Find the pids of the most expensive parts supplied by suppliers named Yosemite Sham. xii. Find the pids of parts supplied by every supplier at less than $200. (If any supplier either does not supply the part or charges more than $200 for it, the part is not selected.) 16 CS6302 – DBMS VENKATESH.R AP/CSE 19. Consider the following relations containing airline flight information: flights(flno: integer, from: string, to: string, distance: integer, departs: time, arrives: time) aircraft(aid: integer, aname: string, cruisingrange: integer) certified(eid: integer, aid: integer) employees(eid: integer, ename: string, salary: integer) Note that the Employees relation describes pilots and other kinds of employees as well; every pilot is certified for some aircraft (otherwise, he or she would not qualify as a pilot), and only pilots are certified to fly. Write the following queries in relational algebra, tuple relational calculus, and domain relational calculus. Note that some of these queries may not be expressible in relational algebra (and, therefore, also not expressible in tuple and domain relational calculus)! For such queries, informally explain why they cannot be expressed. i. Find the eids of pilots certified for some Boeing aircraft. ii. Find the names of pilots certified for some Boeing aircraft. iii. Find the aids of all aircraft that can be used on non-stop flights from Bonn to Madras. iv. Identify the flights that can be piloted by every pilot whose salary is more than $100,000. v. (Hint: The pilot must be certified for at least one plane with a sufficiently large cruising range.) vi. Find the names of pilots who can operate some plane with a range greater than 3,000 miles but are not certified on any Boeing aircraft. vii. Find the eids of employees who make the highest salary. viii. Find the eids of employees who make the second highest salary. ix. Find the eids of pilots who are certified for the largest number of aircraft. x. Find the eids of employees who are certified for exactly three aircraft. xi. Find the total amount paid to employees as salaries. xii. Is there a sequence of flights from Madison to Timbuktu? Each flight in the sequence is required to depart from the city that is the destination of the previous flight; the rst flight must leave Madison, the last flight must reach Timbuktu, and there is no restriction on the number of intermediate flights. Your query must determine whether a sequence of flights from Madison to Timbuktu exists for any input Flights relation instance 20. Consider the following relation Sailors(sid: integer, sname: string, rating: integer, age: real) Boats(bid: integer, bname: string, color: string) Reserves(sid: integer, bid: integer, day: date) Write SQL queries for the following. i. Find the names and ages of all sailors ii. Find all sailors with rating above 7. iii. Find the names of sailors who have reserved boat number 103. iv. Find the sids of sailors who have reserved a red boat. v. Find the names of sailors who have reserved a red boat. vi. Find the color of boat reserved by Lubber. vii. Find the names of sailors who have reserved at least one boat. viii. Compute increments for the ratings of persons who have sailed two different boats on the same day ix. Find the ages of sailors whose name begins and ends with B and has at least three characters. x. Find the names of sailors who have reserved a red or a green boat. 17 CS6302 – DBMS VENKATESH.R AP/CSE xi. Find the names of sailors who have reserved both a red and a green boat. xii. Find the names of sailors who have reserved red boats but not green boats. xiii. Find the ages of sailors whose name begins and ends with B and had at least three characters xiv. Find all sids of sailors who have a rating of 10 or have reserved boat 104. xv. Find the names of sailors who have not reserved a red boat. xvi. Find sailors whose rating is better than some sailor called Harito. xvii. Find sailors whose rating is better than every sailor called Horatio. xviii. Find the sailors with highest rating xix. Find the names of sailors who have reserved all boats. xx. Find the average age of all sailors. xxi. Find the average age of sailors with a rating of 10 xxii. Find the name and age of the oldest sailor. xxiii. Count the number of sailors. xxiv. Find the names of sailors who are older than the oldest sailor with a rating of 10. xxv. Find the age of the youngest sailor for each rating level. xxvi. Find the age of the youngest sailor who is eligible to vote for each rating level with at least two such sailors. xxvii. Find those ratings for which the average age of sailors is the minimum overall ratings. 21. Consider the following relations: Student(snum: integer, sname: string, major: string, level: string, age: integer) Class(name: string, meets at: time, room: string, d: integer) Enrolled(snum: integer, cname: string) Faculty(fid: integer, fname: string, deptid: integer) The meaning of these relations is straightforward; for example, Enrolled has one record per student-class pair such that the student is enrolled in the class. Write the following queries in SQL. No duplicates should be printed in any of the answers. i. Find the names of all Juniors (Level = JR) who are enrolled in a class taught by I. Teach. ii. Find the age of the oldest student who is either a History major or is enrolled in a course taught by I. Teach. iii. Find the names of all classes that either meet in room R128 or have five or more students enrolled. iv. Find the names of all students who are enrolled in two classes that meet at the same time. v. Find the names of faculty members who teach in every room in which some class is taught. vi. Find the names of faculty members for whom the combined enrollment of the courses that they teach is less than five. vii. Print the Level and the average age of students for that Level, for each Level. viii. Print the Level and the average age of students for that Level, for all Levels except JR. ix. Find the names of students who are enrolled in the maximum number of classes. x. Find the names of students who are not enrolled in any class. xi. For each age value that appears in Students, find the level value that appears most often. xii. For example, if there are more FR level students aged 18 than SR, JR, or SO students aged 18, you should print the pair (18, FR). 18 CS6302 – DBMS VENKATESH.R AP/CSE 22. The following relations keep track of airline flight information Flights(flno: integer, from: string, to: string, distance: integer, departs: time, arrives: time, price: integer) Aircraft(aid: integer, aname: string, cruisingrange: integer) Certified(eid: integer, aid: integer) Employees(eid: integer, ename: string, salary: integer) Note that the Employees relation describes pilots and other kinds of employees as well; every pilot is certified for some aircraft, and only pilots are certified to fly. Write each of the following queries in SQL. i. Find the names of aircraft such that all pilots certified to operate them earn more than 80,000. ii. For each pilot who is certified for more than three aircraft, find the eid and the maximum cruisingrange of the aircraft that he (or she) is certified for. iii. Find the names of pilots whose salary is less than the price of the cheapest route from Los Angeles to Honolulu. iv. For all aircraft with cruisingrange over 1,000 miles, find the name of the aircraft and the average salary of all pilots certified for this aircraft. v. Find the names of pilots certified for some Boeing aircraft. vi. Find the aids of all aircraft that can be used on routes from Los Angeles to Chicago. vii. Identify the flights that can be piloted by every pilot who makes more than $100,000. (Hint: The pilot must be certified for at least one plane with a sufficiently large cruising range.) viii. Print the enames of pilots who can operate planes with cruisingrange greater than 3,000 miles, but are not certified on any Boeing aircraft. ix. A customer wants to travel from Madison to New York with no more than two changes of flight. List the choice of departure times from Madison if the customer wants to arrive in New York by 6 p.m. x. Compute the difference between the average salary of a pilot and the average salary of all employees (including pilots). xi. Print the name and salary of every nonpilot whose salary is more than the average salary for pilots. 23. Consider the following relational schema. An employee can work in more than one department; the pct time eld of the Works relation shows the percentage of time that a given employee works in a given department. Write the following queries in SQL: i. Print the names and ages of each employee who works in both the Hardware department and the Software department. ii. For each department with more than 20 full-time-equivalent employees (i.e., where the part-time and full-time employees add up to at least that many full-time employees), print the did together with the number of employees that work in that department. iii. Print the name of each employee whose salary exceeds the budget of all of the departments that he or she works in. iv. Find the managerids of managers who manage only departments with budgets greater than $1,000,000. v. Find the enames of managers who manage the departments with the largest budget. vi. If a manager manages more than one department, he or she controls the sum of all the budgets for those departments. Find the managerids of managers who control more than $5,000,000. 19 CS6302 – DBMS VENKATESH.R AP/CSE vii. Find the managerids of managers who control the largest amount. 24. Consider the instance of the Sailors relation shown in the below figure Figure 1.1 An Instance of Sailors i. Write SQL queries to compute the average rating, using AVG; the sum of the ratings, using SUM; and the number of ratings, using COUNT. ii. If you divide the sum computed above by the count, would the result be the same as the average? How would your answer change if the above steps were carried out with respect to the age field instead of rating? iii. Consider the following query: Find the names of sailors with a higher rating than all sailors with age < 21. The following two SQL queries attempt to obtain the answer to this question. Do they both compute the result? If not, explain why. Under what conditions would they compute the same result? SELECT S.sname FROM Sailors S WHERE NOT EXISTS (SELECT * FROM Sailors S2 WHERE S2.age < 21 AND S.rating <= S2.rating) SELECT * FROM Sailors S WHERE S.rating > ANY (SELECT S2.rating FROM Sailors S2 WHERE S2.age < 21) iv. Consider the instance of Sailors shown in Figure 1.1. Let us define instance S1 of Sailors to consist of the first two tuples, instance S2 to be the last two tuples, and S to be the given instance. (a) Show the left outer join of S with itself, with the join condition being sid=sid. (b) Show the right outer join of S with itself, with the join condition being sid=sid. (c) Show the full outer join of S with itself, with the join condition being sid=sid. (d) Show the left outer join of S1 with S2, with the join condition being sid=sid. (e) Show the right outer join of S1 with S2, with the join condition being sid=sid. (f) Show the full outer join of S1 with S2, with the join condition being sid=sid. 25. Explain the term impedance mismatch in the context of embedding SQL commands in a host language such as C. 20 CS6302 – DBMS VENKATESH.R AP/CSE QUESTION BANK – UNIT III PART – A 17. Define Boyce Codd Normal Form A relation schema R is in BCNF with respect to a set F of functional dependencies if, for all functional dependencies in F+ of the form α->β, where α R and β R, at least one of the following holds:  α→β is a trivial functional dependency (that is β α).  Α is the superkey for schema R. 18. What is first normal form? The domain of attribute must include only atomic (simple, indivisible) values. 17.What is meant by functional dependencies? The notion of functional dependency generalizes the notion of superkey. Consider a relation schema R, and let α R and β R, The functional dependency α→β holds on schema R if, in any legal relation r(R), for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], it is also the case that t1[β] = t2[β]. 18. What is a trivial dependency? The functional dependency of the form α→β is trivial if β dependencies are satisfied by all the relations α. Trivial functional 19. What are axioms? Axioms or rules of inference provide a simpler technique for reasoning about functional dependencies. 6. What is meant by computing the closure of a set of functional dependency? The closure of F denoted by F+ is the set of functional dependencies logically implied by F. 7. What is meant by normalization of data? It is a process of analyzing the given relation schemas based on their Functional Dependencies (FDs) and primary key to achieve the properties  Minimizing redundancy  Minimizing insertion, deletion and updating anomalies. 8. Define canonical cover? A canonical cover Fc for F is a set of dependencies such that F logically implies all dependencies in FC and Fc logically implies all dependencies in F. Fc must have the following properties. 9. List the properties of canonical cover. Fc must have the following properties.  No functional dependency in Fc contains an extraneous attribute.  Each left side of a functional dependency in Fc is unique. 10. List the desirable properties of decomposition. 21 CS6302 – DBMS    VENKATESH.R AP/CSE Lossless-join decomposition Dependency preservation Repetition of information 14. What is 2NF? A relation schema R is in 2NF if it is in 1NF and every non-prime attribute A in R is fully functionally dependent on primary key. 15. What is multivalued dependency? Multivalued dependence x→→y specified on relation schema R where x and y are both subsets of R, specifies the following constraints on any relation state r of R. If two tuples t1 and t2 exist in r with the following properties where we use Z to denote (R – (x y)) t3[x]=t4[x]=t1[x]=t2[x] t3[y]=t1[y] and t4[y]=t2[y] t3[z]=t2[z]=t4[z]=t1[z] 16. List the two ways on using functional dependencies. 1. To test relations to see whether they are legal under a given set of functional dependencies. If a relation r is legal under a set F of functional dependencies, we say that r satisfies F. 2. To specify constraints on the set of legal relations. We shall thus concern ourselves with only those relations that satisfy a given set of functional dependencies. If we wish to constrain ourselves to relations on schema R that satisfy a set F of functional dependencies, we say that F holds on R. 17. When functional dependency is said to be logically implied? Given a relational schema R, a functional dependency f on R is logically implied by a set of functional dependencies F on R if every relation instance r(R) that satisfies F also satisfies f. Suppose we are given a relation schema R = (A, B, C, G, H, I) and the set of functional dependencies A→B A→C CG→H CG→I B→H The functional dependency A→H is logically implied. 18. Define extraneous attributes. An attribute of a functional dependency is said to be extraneous if we can remove it without changing the closure of the set of functional dependencies. The formal definition of extraneous attributes is as follows. Consider a set F of functional dependencies and the functional dependency α →β in F.  Attribute A is extraneous in α if A Є α, and F logically implies (F − {α →β}) {(α − A) → β}.  Attribute A is extraneous in β if A Є β, and the set of functional dependencies (F − {α →β}) {α → (β − A)} logically implies F. 19. List and define Armstrong’s axioms. This collection of rules is called Armstrong’s axioms in honor of the person who first proposed it.  Reflexivity rule. If α is a set of attributes and β α, then α →β holds.  Augmentation rule. If α → β holds and γ is a set of attributes, then γα → γβ holds. 22 CS6302 – DBMS  VENKATESH.R AP/CSE Transitivity rule. If α →β holds and β → γ holds, then α → γ holds. 20. When Armstrong’s Axioms are said to be sound and complete? Armstrong’s axioms are sound, because they do not generate any incorrect functional dependencies. They are complete, because, for a given set F of functional dependencies, they allow us to generate all F+ 21. List the additional rules of Armstrong’s Axioms. Although Armstrong’s axioms are complete, it is tiresome to use them directly for the computation of F+. To simplify matters further, we list additional rules. It is possible to use Armstrong’s axioms to prove that these rules are correct  Union rule. If α → β holds and α → γ holds, then α →βγ holds.  Decomposition rule. If α →βγ holds, then α → β holds and α →γ holds.  Pseudotransitivity rule. If α→β holds and γβ →δ holds, then αγ →δ holds. 22. Define Decomposition. The bad design suggests that we should decompose a relation schema that has many attributes into several schemas with fewer attributes. Careless decomposition, however, may lead to another form of bad design. 23. When decomposition is said to be lossless join decomposition? Let C represent a set of constraints on the database, and let R be a relation schema. A decomposition {R1,R2, . . .,Rn} of R is a lossless-join decomposition if, for all relations r on schema R that are legal under C, r = ΠR1 (r) ΠR2 (r) … ΠRn (r) Let R be a relation schema, and let F be a set of functional dependencies on R. Let R1 and R2 form a decomposition of R. This decomposition is a lossless-join decomposition of R if at least one of the following functional dependencies is in F+:  R1 ∩ R2 → R1  R1 ∩ R2 → R2 In other words, if R1 ∩ R2 forms a superkey of either R1 or R2, the decomposition of R is lossless-join decomposition 24. Define the dependency preservation. A relational-database design is said to be dependency preservation, when an update is made to the database, the system should be able to check that the update will not create an illegal relation—that is, one that does not satisfy all the given functional dependencies. If we are to check updates efficiently, we should design relational-database schemas that allow update validation without the computation of joins. We say that a decomposition having the property F’+ = F+ is a dependencypreserving decomposition. 25. Define Third Normal Form BCNF requires that all nontrivial dependencies be of the form α → β, where α is a superkey. 3NF relaxes this constraint slightly by allowing nontrivial functional dependencies whose left side is not a superkey. A relation schema R is in third normal form (3NF) with respect to a set F of functional dependencies if, for all functional dependencies in F+ of the form α → β, where α R and β R, at least one of the following holds: 23 CS6302 – DBMS   VENKATESH.R AP/CSE α → β is a trivial functional dependency. α is a superkey for R. 26. Define 3NF synthesis algorithm. Decomposition algorithm is also called the 3NF synthesis algorithm, since it takes a set of dependencies and adds one schema at a time, instead of decomposing the initial schema repeatedly. The result is not uniquely defined, since a set of functional dependencies can have more than one canonical cover, and, further, in some cases the result of the algorithm depends on the order in which it considers the dependencies in Fc. 27. Define Fourth Normal Form A relation schema R is in fourth normal form (4NF) with respect to a set D of functional and multivalued dependencies if, for all multivalued dependencies in D+ of the form α →→ β, where α R and β R, at least one of the following holds  α →→ β is a trivial multivalued dependency.  α is a superkey for schema R. A database design is in 4NF if each member of the set of relation schemas that constitutes the design is in 4NF. 28. What is join dependency? Join dependencies constrain the set of legal relations over a schema R to those relations for which a given decomposition is a lossless-join decomposition. Let R be a relation schema and R1, R2,..., Rnbe a decomposition of R. If R = R1 R2 …… Rn, we say that a relation r® satisfies the join dependency*(R1, R2,...Rn) if: r=ΠR1®⋈ΠR2®⋈……⋈ΠRn®. A join dependency is trivial if one of the Ri is R itself. PART-B: 1. 2. 3. 4. 5. 6. What is Normalization? Explain 1NF, 2NF, 3NF and BCNF with suitable example. What is FD? Explain the role of FD in the process of normalization. Explain multivalued dependency. Describe 4NF with suitable example. Write brief notes on Join Dependency With example, explain 5NF. 24 CS6302 – DBMS VENKATESH.R AP/CSE QUESTION BANK – UNIT IV PART – A 1. What is transaction? Collections of operations that form a single logical unit of work are called transactions. 2. What are the two statements regarding transaction? The two statements regarding transaction of the form:  Begin transaction  End transaction 3. What are the properties of transaction? The properties of transactions are:  Atomicity  Consistency  Isolation  Durability 4. What is recovery management component? Ensuring durability is the responsibility of a software component of the base system called the recovery management component. 5. When is a transaction rolled back? Any changes that the aborted transaction made to the database must be undone. Once the changes caused by an aborted transaction have been undone, then the transaction has been rolled back. 6. What are the states of transaction?  The states of transaction are  Active  Partially committed  Failed  Aborted  Committed  Terminated 7. What is a shadow copy scheme? 25 CS6302 – DBMS VENKATESH.R AP/CSE It is simple, but efficient, scheme called the shadow copy schemes. It is based on making copies of the database called shadow copies that one transaction is active at a time. The scheme also assumes that the database is simply a file on disk. 8. Give the reasons for allowing concurrency? The reasons for allowing concurrency is if the transactions run serially, a short transaction may have to wait for a preceding long transaction to complete, which can lead to unpredictable delays in running a transaction. So, concurrent execution reduces the unpredictable delays in running transactions. 9. What is average response time? The average response time is that the average time for a transaction to be completed after it has been submitted. 10. What are the two types of Serializability? The two types of Serializability is  Conflict Serializability  View Serializability. 11. Define lock? Lock is the most common used to implement the requirement is to allow a transaction to access a data item only if it is currently holding a lock on that item. 12. What are the different modes of lock? The modes of lock are:  Shared(S)  Exclusive(X) 13. Define deadlock? Neither of the transaction can ever proceed with its normal execution. This situation is called deadlock. 14. Define the phases of two phase locking protocol Growing phase: a transaction may obtain locks but not release any lock. Shrinking phase: a transaction may release locks but may not obtain any new locks. 15. Define upgrade and downgrade? A mechanism for conversion from shared lock to exclusive lock is known as upgrade. 26 CS6302 – DBMS VENKATESH.R AP/CSE A mechanism for conversion from exclusive lock to shared lock is known as downgrade. 16. What is a database graph? The partial ordering implies that the set D may now be viewed as a directed acyclic graph, called a database graph. 17. What are the two methods for dealing deadlock problem? The two methods for dealing deadlock problem is deadlock detection and deadlock recovery. 18. What is a recovery scheme? An integral part of a database system is a recovery scheme that can restore the database to the consistent state that existed before the failure. 19. What are the two types of errors? The two types of errors are:  Logical error  System error 20. What are the storage types? The storage types are:  Volatile storage  Nonvolatile storage 21. Define blocks? The database system resides permanently on nonvolatile storage, and is partitioned into fixed-length storage units called blocks. 22. What is meant by Physical blocks? The input and output operations are done in block units. The blocks residing on the disk are referred to as physical blocks. 23. What is meant by buffer blocks? The blocks residing temporarily in main memory are referred to as buffer blocks. 24. What is meant by disk buffer? The area of memory where blocks reside temporarily is called the disk buffer. 25. What is meant by log-based recovery? 27 CS6302 – DBMS VENKATESH.R AP/CSE The most widely used structures for recording database modifications is the log. The log is a sequence of log records, recording all the update activities in the database. There are several types of log records. 26. What are uncommitted modifications? The immediate-modification technique allows database modifications to be output to the database while the transaction is still in the active state. Data modifications written by active transactions are called uncommitted modifications. 27. Define shadow paging. An alternative to log-based crash recovery technique is shadow paging. This technique needs fewer disk accesses than do the log-based methods. 28. Define page. The database is partitioned into some number of fixed-length blocks, which are referred to as pages. 29. Explain current page table and shadow page table. The key idea behind the shadow paging technique is to maintain two page tables during the life of the transaction: the current page table and the shadow page table. Both the page tables are identical when the transaction starts. The current page table may be changed when a transaction performs a write operation. 30. What are the drawbacks of shadow-paging technique?  Commit Overhead  Data fragmentation  Garbage collection 31. Define garbage collection. Garbage may be created also as a side effect of crashes. Periodically, it is necessary to find all the garbage pages and to add them to the list of free pages. This process is called garbage collection. 32. Differentiate strict two phase locking protocol and rigorous two phase locking protocol. In strict-two-phase locking protocol all exclusive mode locks taken by a transaction is held until that transaction commits. Rigorous-two-phase locking protocol requires that all locks be held until the transaction commits. 33. How the time stamps are implemented 28 CS6302 – DBMS VENKATESH.R AP/CSE Use the value of the system clock as the time stamp. That is a transaction’s time stamp is equal to the value of the clock when the transaction enters the system. Use a logical counter that is incremented after a new timestamp has been assigned; that is the time stamp is equal to the value of the counter. 34. What are the time stamps associated with each data item? W-timestamp(Q) denotes the largest time stamp if any transaction that executed WRITE(Q) successfully. R-timestamp (Q) denotes the largest time stamp if any transaction that executed READ(Q) successfully. 35. What benefit is provided by strict-two-phase locking? What are the advantages results? Cascading rollbacks can be avoided by a modification of two phase locking protocol. This protocol required not only that locking be two-phase, but also that all exclusive mode locks taken than by a transaction be held until that transaction commits. PART-B: 7. Explain the different forms of Serializability. 8. What are the different types of schedules that are acceptable for recoverability? 9. Discuss two-phase locking protocol and timestamp-based protocol. 10. Explain the deferred and immediate modification versions of the log-based recovery scheme. 11. Explain the four different properties of transactions that a DBMS must ensure to maintain database. 12. What is concurrency control? How is it implemented in DBMS? Explain. 13. Explain the various recovery techniques during transaction in detail. 14. Write short notes on shadow paging. 15. Explain testing for Serializability with respect to concurrency control schemes. How will you determine, whether a schedule is serializable or not. 29 CS6302 – DBMS VENKATESH.R AP/CSE QUESTION BANK – UNIT V YEAR/SEM: II/IV PART – A 1. What is an index? An index is a structure that helps to locate desired records of a relation quickly, without examining all records 2. Define query optimization. Query optimization refers to the process of finding the lowest –cost method of evaluating a given query. 3. What are called jukebox systems? Jukebox systems contain a few drives and numerous disks that can be loaded into one of the drives automatically. 4. What are the types of storage devices? _ Primary storage _ Secondary storage _ Tertiary storage _ Volatile storage _ Nonvolatile storage 5. What is called remapping of bad sectors? If the controller detects that a sector is damaged when the disk is initially formatted, or when an attempt is made to write the sector, it can logically map the sector to a different physical location. 6. Define access time. Access time is the time from when a read or write request is issued to when data transfer begins. 7. Define seek time. The time for repositioning the arm is called the seek time and it increases with the distance that the arm is called the seek time. 8. Define average seek time. The average seek time is the average of the seek times, measured over a sequence of random requests. 9. Define rotational latency time. The time spent waiting for the sector to be accessed to appear under the head is called the rotational latency time. 10. Define average latency time. The average latency time of the disk is one-half the time for a full rotation of the disk. 11. What is meant by data-transfer rate? 30 CS6302 – DBMS VENKATESH.R AP/CSE The data-transfer rate is the rate at which data can be retrieved from or stored to the disk. 12. What is meant by mean time to failure? The mean time to failure is the amount of time that the system could run continuously without failure. 13. What is a block and a block number? A block is a contiguous sequence of sectors from a single track of one platter. Each request specifies the address on the disk to be referenced. That address is in the form of a block number. 14. What are called journaling file systems? File systems that support log disks are called journaling file systems. 15. What is the use of RAID? A variety of disk-organization techniques, collectively called redundant arrays of independent disks are used to improve the performance and reliability. 16. What is called mirroring? The simplest approach to introducing redundancy is to duplicate every disk. This technique is called mirroring or shadowing. 17. What is called mean time to repair? The mean time to failure is the time it takes to replace a failed disk and to restore the data on it. 18. What is called bit-level striping? Data striping consists of splitting the bits of each byte across multiple disks. This is called bit-level striping. 19. What is called block-level striping? Block level striping stripes blocks across multiple disks. It treats the array of disks as a large disk, and gives blocks logical numbers. 20. What are the two main goals of parallelism? _ Load –balance multiple small accesses, so that the throughput of such accesses increases. _ Parallelize large accesses so that the response time of large accesses is reduced 21. What are the factors to be taken into account when choosing a RAID level? o Monetary cost of extra disk storage requirements. o Performance requirements in terms of number of I/O operations o Performance when a disk has failed. o Performances during rebuild. 22. What is meant by software and hardware RAID systems? 31 CS6302 – DBMS VENKATESH.R AP/CSE RAID can be implemented with no change at the hardware level, using only software modification. Such RAID implementations are called software RAID systems and the systems with special hardware support are called hardware RAID systems. 23. Define hot swapping? Hot swapping permits the removal of faulty disks and replaces it by new ones without turning power off. Hot swapping reduces the mean time to repair. 24. What are the ways in which the variable-length records arise in database systems? _ Storage of multiple record types in a file. _ Record types that allow variable lengths for one or more fields. _ Record types that allow repeating fields. 25. What is the use of a slotted-page structure and what is the information present in the header? The slotted-page structure is used for organizing records within a single block. The header contains the following information. _ The number of record entries in the header. _ The end of free space _ An array whose entries contain the location and size of each record. 26. What are the two types of blocks in the fixed –length representation? Define them. • Anchor block: Contains the first record of a chain. • Overflow block: Contains the records other than those that are the first record of a chain. 27. What is known as heap file organization? In the heap file organization, any record can be placed anywhere in the file where there is space for the record. There is no ordering of records. There is a single file for each relation. 28. What is known as sequential file organization? In the sequential file organization, the records are stored in sequential order, according to the value of a “search key” of each record. 29. What is hashing file organization? In the hashing file organization, a hash function is computed on some attribute of each record. The result of the hash function specifies in which block of the file the record should be placed. 30. What is known as clustering file organization? In the clustering file organization, records of several different relations are stored in the same file. 31. What are the types of indices? _ Ordered indices 32 CS6302 – DBMS VENKATESH.R AP/CSE _ Hash indices 32. What are the techniques to be evaluated for both ordered indexing and hashing? _ Access types _ Access time _ Insertion time _ Deletion time _ Space overhead 33. What is known as a search key? An attribute or set of attributes used to look up records in a file is called a search key. 34. What is a primary index? A primary index is an index whose search key also defines the sequential order of the file. 35. What are called index-sequential files? The files that are ordered sequentially with a primary index on the search key, are called index-sequential files. 36. What are the two types of indices? _ Dense index _ Sparse index 37. What are called multilevel indices? Indices with two or more levels are called multilevel indices. 38. What is B-Tree? A B-tree eliminates the redundant storage of search-key values .It allows search key values to appear only once. 39. What is a B+-Tree index? A B+-Tree index takes the form of a balanced tree in which every path from the root of the root of the root of the tree to a leaf of the tree is of the same length. 40. What is a hash index? A hash index organizes the search keys, with their associated pointers, into a hash file structure. 41. What are called as index scans? Search algorithms that use an index are referred to as index scans. 42. What is called as external sorting? Sorting of relations that do not fit into memory is called as external sorting. 43. What is called as recursive partitioning? The system repeats the splitting of the input until each partition of the build input fits in the memory. Such partitioning is called recursive partitioning. 33 CS6302 – DBMS VENKATESH.R AP/CSE 44. What is called as an N-way merge? The merge operation is a generalization of the two-way merge used by the standard in-memory sort-merge algorithm. It merges N runs, so it is called an N-way merge. 45. What is known as fudge factor? The number of partitions is increased by a small value called the fudge factor, which is usually 20 percent of the number of hash partitions computed. 46. What are the sparse and dense indices? Sparse Indices: An index record appears only some of the search-key values. Dense Indices: An index record appears for every search-key value in the file. In a dense primary index, the index record contains the search-key value and a pointer to the first data record with that search-key value. 47. Compare sequential access devices versus random access devices with an example. Tape storage is referred to as sequential access storage. Disk storage is referred to as direct access storage. 48. What can be done to reduce the occurrences of bucket overflows in a hash file organization? To reduce the occurrence of overflows, we can  Choose the hash function more carefully and make better estimation of the relation size.  If the estimated size of the relation is nr and the number of records per block is fr, allocate (nr/fr)*(1+d) buckets instead of (nr/fr) buckets. Here d is a fudge factor, typically around 0.2. Some space is wasted. But the benefit is that some of the skew is handled and the probability of overflow is reduced. 49. Define volatile storage. Volatile storage loses its contents when the power to thedevice is removed. 50. Define buffer replacement strategy When there is no room left in the buffer, a block must be removed from the buffer before a new one can be read in. Most operating systems use a least recently used (LRU) scheme, in which the block that was referenced least recently is written back to disk and is removed from the buffer. This simple approach can be improved on for database applications. 51. What are pinned blocks? For the database system to be able to recover from crashes, it is necessary to restrict those times when a block may be written back to disk. For instance, most recovery systems require that a block should not be written to disk while an update on the block is in progress. A block that is not allowed to be written back to disk is said to be pinned. Although many operating systems do not support pinned blocks, such a feature is essential for a database system that is resilient to crashes. 52. Define forced output of blocks. 34 CS6302 – DBMS VENKATESH.R AP/CSE There are situations in which it is necessary to write back the block to disk, even though the buffer space that it occupies is not needed. This write is called the forced output of a block. 53. List the disadvantages of Byte-String representation. It is not easy to reuse space occupied formerly by a deleted record. Although techniques exist to manage insertion and deletion, they lead to a large number of small fragments of disk storage that are wasted. There is no space, in general, for records to grow longer. If a variable-length record becomes longer, it must be moved—movement is costly if pointers to the record are stored elsewhere in the database (e.g., in indices, or in other records), since the pointers must be located and updated. 54. What is meant by two-way and multi-way joins? Two-way join is a join on two-files. Joins involving more than two files are called multi-way joins. 55. Define hash join. In the hash join algorithm, a hash function h is used to partition tuples of both relations. The basic idea is to partition the tuples of each of relations into sets that have the same hash value on the join attributes. 56. Define hybrid hash join The hybrid hash-join algorithm is a variation of partition hash join, where the joining phase for one of the partitions is included in the partitioning phase. 57. What are the goals of database tuning?  To make applications run faster  To lower the response time of queries/transactions.  To improve the overall throughput of transactions. 58. What are the steps involved in query processing? The basic steps are:  Parsing and translation  Optimization  Evaluation 59. What is called an evaluation primitive? A relational algebra operation annotated with instructions on how to evaluate is called an evaluation primitive. 60. What is called a query evaluation plan? A sequence of primitive operations that can be used to evaluate ba query is a query evaluation plan or a query execution plan. 61. What is called a query –execution engine? The query execution engine takes a query evaluation plan, executes that plan, and returns the answers to the query. 35 CS6302 – DBMS VENKATESH.R AP/CSE PART – B 2. Explain the various stages involved in Query processing. 3. What is the need for indexing and hashing? How will they improve the access efficiency in a database? Explain. 4. Write notes on RAID levels. 5. Explain in detail about the hash and hybrid hash join operation. 6. Explain why allocations of records to blocks affect database system performance significantly. 7. Describe the structure of B+ tree and give the algorithm for search in the B+ tree with example. 8. Give the comparison between ordered indexing and hashing. 9. Explain the security provided in commercial query languages. 10. How would you estimate the cost of the query? 11. Explain how RAID system improves performance and reliability? 12. Explain in detail about the static and dynamic hashing in detail. 13. Describe the different types of file organization. Explain using a sketch of each of them with their advantages and disadvantages. 14. Explain the index schemas used in detail. 15. How does a DBMS represent a relational query evaluation plan? 16. Construct a B+ tree for the following set of key values: (2,3,6,9,12,18,19,23,29,31,40,67,90,110) Assume that the tree is initially empty and values are added in ascending order. Note: The number of pointers that will fit in one node is three. Illustrate the tree construction process step by step. 17. Explain the distinction between closed and open hashing. Discuss the relative merits of each technique in database applications. 18. What are the causes of bucket overflow in a hash file organization? What can be done to reduce the occurrence of bucket overflows? Discuss. 19. Discuss physical storage media in detail. 20. Write brief notes on magnetic disks. 21. Define and differentiate fixed and variable length records on file organization. 22. Explain organization of records in files. 23. Explain about database tuning. 24. Write brief notes on external-merge sort. 25. Explain the methods used to implement a join operation. 26. Explain in detail about search algorithm used to implement a select operation. 27. How will you search the RAID level? Discuss. 28. Explain storage access in detail. 29. Differentiate B-Tree with B+ tree. 30. Explain sorting of relations in detail. 31. Write in detail about the evaluation of expressions in detail. 36

DBMS Question Bank - Unit 1

Related documents

Products

Support

DBMS Question Bank - Unit 1

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib