DBMS Question Bank - Unit 1

CS6302 – DBMS
VENKATESH.R AP/CSE
Vi INSTITUTE OF TECHNOLOGY
DEPARTMENT OF COMPUTER SCIENCE
QUESTION BANK FOR DATABASE MANAGEMENT
SYSTEM
QUESTION BANK – UNIT I
PART – A
1. Define database and database management systems.
A database-management system (DBMS) is a collection of interrelated data and a
set of programs to access those data. The collection of data, usually referred to as the
database, contains information relevant to an enterprise.
2. What is the primary goal of a DBMS?
The primary goal of a DBMS is to provide a way to store and retrieve database
information that is both convenient and efficient.
3. List the applications of database systems.
• Banking
• Airlines
• Universities
• Credit card transactions
• Telecommunication
• Finance
• Sales
• Manufacturing
• Human Resources
4. What are the disadvantages of a file processing system?
• Data redundancy and inconsistency
• Difficulty in accessing data
• Data isolation
• Integrity problems
• Atomicity problems
• Concurrent-access anomalies
• Security problems
5. What are the advantages of using a database system?
• Controlling redundancy
• Restricting unauthorized access
• Providing multiple user interfaces
• Enforcing integrity constraints
• Providing backup and recovery
6. List the levels of data abstraction
• Physical level
• Logical level
• View level
7. Define Instances and Schema
Instance: The collection of data stored in the database at a particular moment is called
an instance of the database.
Schema: The overall design of the database is called the database schema.
8. Define the terms physical schema and logical schema
Physical schema: The physical schema describes the database design at the physical
level, which is the lowest level of abstraction describing how the data are actually stored.
Logical schema: The logical schema describes the database design at the logical
level, which describes what data are stored in the database and what relationships exist
among the data.
9. What is conceptual schema
The conceptual schema is the schema at the logical level; it describes the whole
database. Schemas at the view level are called subschemas, and they describe different
views of the database.
10. Define data model
A data model is a collection of conceptual tools for describing data, data
relationships, data semantics and consistency constraints.
11. List the data models
• Entity-Relationship model
• Relational model
• Object-oriented data model
• Semi-structured data model
• Network model
• Hierarchical model
12. What is a storage manager? List the components of storage manager.
A storage manager is a program module that provides the interface between the low
level data stored in a database and the application programs and queries submitted to the
system.
The storage manager components include
• Authorization and integrity manager
• Transaction manager
• File manager
• Buffer manager
13. What is the purpose of storage manager?
The storage manager is responsible for the following
• Interaction with the file manager
• Translation of DML commands into low level file system commands
• Storing, retrieving and updating data in the database
14. List the data structures implemented by the storage manager.
The storage manager implements the following data structures
• Data files
• Data dictionary
• Indices
15. What is a data dictionary?
A data dictionary is a data structure which stores metadata about the structure of the
database, i.e., the schema of the database.
16. What is an entity relationship model?
The entity relationship model is based on a collection of basic objects called entities and
relationships among those objects. An entity is a thing or object in the real world that is
distinguishable from other objects.
17. What are attributes? Give examples.
An entity is represented by a set of attributes. Attributes are descriptive properties
possessed by each member of an entity set.
Example: possible attributes of a customer entity are customer name, customer id,
customer street, and customer city.
18. What is relationship? Give examples
A relationship is an association among several entities.
Example: A depositor relationship associates a customer with each account that
he/she has.
19. Define the terms Entity set and Relationship set
Entity set: The set of all entities of the same type is termed as an entity set.
Relationship set: The set of all relationships of the same type is termed as a
relationship set.
20. Define single valued and multivalued attributes.
Single valued attributes: Attributes with a single value for a particular entity are called
single valued attributes.
Multivalued attributes: Attributes with a set of values for a particular entity are called
multivalued attributes.
21. What are stored and derived attributes?
Stored attributes: The attributes stored in a database are called stored attributes.
Derived attributes: The attributes whose values are derived from the stored attributes are called
derived attributes.
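The distinction can be sketched in a few lines of Python. This is only an illustration: the Customer class, its fields, and the age() helper are hypothetical names, with date_of_birth as the stored attribute and age as the attribute derived from it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Customer:
    customer_name: str
    date_of_birth: date  # stored attribute: kept in the database

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        years = today.year - self.date_of_birth.year
        # Subtract one year if this year's birthday has not happened yet.
        if (today.month, today.day) < (self.date_of_birth.month, self.date_of_birth.day):
            years -= 1
        return years


c = Customer("Asha", date(1990, 6, 15))
print(c.age(date(2024, 6, 15)))
```

Because age is recomputed on demand, it never goes stale the way a stored copy could.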
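22. What are composite attributes?
Composite attributes are attributes that can be divided into subparts. Example: a name
attribute can be divided into first name, middle name and last name.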
23. Define null values.
A null value is used when a particular entity does not have an applicable value for an
attribute, or when the value of the attribute for that entity is not known.
24. Define the term Entity type
Entity type: An entity type defines a collection of entities that have the same
attributes.
25. What is meant by the degree of relationship set?
The degree of a relationship set is the number of entity sets that participate in it.
26. Define the terms Key attribute and Value set
Key attribute: An entity type usually has an attribute whose values are distinct for
each individual entity in the collection. Such an attribute is called a key attribute.
Value set: Each simple attribute of an entity type is associated with a value set that
specifies the set of values that may be assigned to that attribute for each individual entity.
27. Define weak and strong entity sets?
Weak entity set: An entity set that does not have sufficient attributes to form a key of
its own is called a weak entity set.
Strong entity set: An entity set that has a primary key is termed a strong entity set.
28. What does the cardinality ratio specify?
Mapping cardinalities or cardinality ratios express the number of entities to which
another entity can be associated. Mapping cardinalities must be one of the following:
• One to one
• One to many
• Many to one
• Many to many
29. Explain the two types of participation constraint.
Total: The participation of an entity set E in a relationship set R is said to be total if
every entity in E participates in at least one relationship in R.
Partial: if only some entities in E participate in relationships in R, the participation of
entity set E in relationship R is said to be partial.
30. Define the terms DDL and DML
DDL: A database schema is specified by a set of definitions expressed in a special
language called a data definition language.
DML: A data manipulation language is a language that enables users to access or
manipulate data as organized by the appropriate data model.
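A minimal sketch of the difference, using Python's built-in sqlite3 module with an in-memory database. The customer table and its rows are made-up illustrations, not part of the syllabus text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: defines the schema (hypothetical customer table).
conn.execute("""
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        customer_city TEXT
    )
""")

# DML: inserts, updates and retrieves the data stored under that schema.
conn.execute("INSERT INTO customer VALUES (1, 'Asha', 'Chennai')")
conn.execute("INSERT INTO customer VALUES (2, 'Ravi', 'Madurai')")
conn.execute("UPDATE customer SET customer_city = 'Salem' WHERE customer_id = 2")

rows = conn.execute(
    "SELECT customer_name, customer_city FROM customer ORDER BY customer_id"
).fetchall()
print(rows)
```

The CREATE TABLE statement changes only the schema; the INSERT, UPDATE and SELECT statements change or read only the instance.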
31. Write short notes on relational model
The relational model uses a collection of tables to represent both data and the
relationships among those data. The relational model is an example of a record based model.
32. Define tuple and attribute
• Attribute: a column header of a table (relation)
• Tuple: a row of a table (relation)
33. List and define the two classes of DML
• Procedural: the user specifies what data is required and how to get those data
• Declarative (nonprocedural): the user specifies what data is required without
specifying how to get those data
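The contrast can be illustrated as follows. This is a sketch only: the account records are invented, a plain Python loop stands in for a procedural DML, and SQL run through sqlite3 stands in for a declarative one.

```python
import sqlite3

records = [("A-101", 500), ("A-102", 1200), ("A-103", 900)]  # made-up data

# Procedural style: the program spells out HOW to find the data,
# visiting every record and testing it explicitly.
procedural = []
for account_no, balance in records:
    if balance > 800:
        procedural.append(account_no)
procedural.sort()

# Declarative style: the query states only WHAT is wanted;
# the DBMS decides how to locate the qualifying rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_no TEXT, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", records)
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT account_no FROM account WHERE balance > 800 ORDER BY account_no"
    )
]

print(procedural, declarative)
```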
34. What is the role of a database administrator?
• Schema definition
• Storage structure and access method definition
• Schema and physical organization modification
• Granting user authority to access the database
• Specifying integrity constraints
• Acting as liaison with users
• Monitoring performance and responding to changes in requirements
35. Define and explain the two types of Data Independence.
The two types of data independence are
i. Physical data independence
ii. Logical data independence
Physical data independence:
The ability to modify the physical schema without causing application programs to be
rewritten at the higher levels. Modifications at the physical level are occasionally
necessary to improve performance.
Logical data independence:
The ability to modify the logical schema without causing application programs to be
rewritten at the higher level (external level). Modifications at the logical level occur more
frequently than those at the physical level, whenever there is an alteration in the logical
structure of the database.
36. Explain DML pre-compiler.
DML precompiler converts DML statements embedded in an application program to
normal procedure calls in the host language. The precompiler must interact with the DML
compiler to generate the appropriate code.
37. Define file manager and buffer manager.
File manager: File manager manages the allocation of space on disk storage and the
data structures used to represent information stored on disk.
Buffer manager: Buffer manager is responsible for fetching data from the disk storage
into the main memory, and deciding what data to cache in memory.
38. Define specialization and generalization
Specialization and generalization define a containment relationship between a
higher-level entity set and one or more lower-level entity sets. Specialization is the result of taking a
subset of a higher-level entity set to form a lower-level entity set. Generalization is the result
of taking the union of two or more disjoint (lower-level) entity sets to produce a higher-level
entity set. The attributes of higher-level entity sets are inherited by lower-level entity sets.
39. Define the term aggregation
Aggregation is an abstraction in which relationship sets (along with their associated
entity sets) are treated as higher-level entity sets, and can participate in relationships.
40. Define the term disjoint and overlapping
• Disjoint. A disjointness constraint requires that an entity belong to no more than
one lower-level entity set. In our example, an account entity can satisfy only one condition
for the account-type attribute; an entity can be either a savings account or a checking
account, but cannot be both.
• Overlapping. In overlapping generalizations, the same entity may belong to more
than one lower-level entity set within a single generalization. For an illustration, consider the
employee work team example, and assume that certain managers participate in more than one
work team. A given employee may therefore appear in more than one of the team entity sets
that are lower-level entity sets of employee. Thus, the generalization is overlapping.
As another example, suppose generalization applied to entity sets customer and
employee leads to a higher-level entity set person. The generalization is overlapping if an
employee can also be a customer.
PART – B
1. a) Explain the views of data.
b) Explain the two types of data models in brief
2. Write short notes on database system structure.
3. How do database systems differ from file systems? Explain.
4. a) Discuss various design issues in E-R model.
b) How to draw an E-R diagram? Explain its major components.
5. Write in detail about the extended E-R features.
6. Explain the design of an E-R database schema.
7. Write brief notes on database languages.
8. Explain weak entity sets with suitable example.
9. Draw an E-R diagram for Engineering College enterprise
10. Consider an employment agency computerizing its service. An employer may offer a number
of jobs. Applicants may apply for many jobs. The job history of each applicant is recorded as
previous job. Employer details include employer name and address. Job details include job
code, title and salary. Applicant details include application number, name, address, phone,
date of birth and applying date. Job history details include job history id, title, salary,
duration and skill required. Construct an E-R diagram for the above scenario and transform it
to a logical database design.
11. A weak entity set can always be made into a strong entity set by adding to its attributes the
primary-key attributes of its identifying entity set. Outline what sort of redundancy will result
if we do so.
12. Illustrate the issues to be considered while developing an ER-diagram.
13. Construct an ER diagram to model an online book store.
14. Construct an ER diagram for a car insurance company that has a set of customers, each of
whom owns one/more cars. Each car has associated with it zero to any number of recorded
accidents. Construct appropriate tables for the above ER diagram.
15. Define data model. Explain the different types of data models with relevant examples.
QUESTION BANK – UNIT II
PART – A
1. Define the term relation
A relation is a subset of a Cartesian product of a list of domains.
2. Define tuple variable
A tuple variable is a variable whose domain is the set of all tuples.
3. Define the term Domain.
For each attribute there is a set of permitted values called the domain of that attribute.
4. What is a candidate key?
Minimal super keys are called candidate keys.
5. What is a primary key?
Primary key is chosen by the database designer as the principal means of identifying
an entity in the entity set.
6. What is a super key?
A super key is a set of one or more attributes that collectively allows us to identify
uniquely an entity in the entity set.
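The relationship between super keys and candidate keys can be sketched in Python. The rows and the helper names is_super_key and candidate_keys are hypothetical. Note also that a key is really a property of the schema (it must hold for every legal instance), so a check on one sample instance can only rule attribute sets out, not prove that they are keys.

```python
from itertools import combinations

# One sample instance of a hypothetical customer relation.
rows = [
    {"customer_id": 1, "customer_name": "Asha", "customer_city": "Chennai"},
    {"customer_id": 2, "customer_name": "Ravi", "customer_city": "Chennai"},
    {"customer_id": 3, "customer_name": "Asha", "customer_city": "Salem"},
]


def is_super_key(attrs, rows):
    """True if no two rows agree on all the attributes in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Minimal super keys: super keys with no smaller super key inside them."""
    attributes = sorted(rows[0])
    keys = []
    # Smaller attribute sets are tried first, so every proper subset that is
    # a key has already been recorded when a larger set is examined.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in keys)
            if is_super_key(attrs, rows) and not contains_smaller_key:
                keys.append(attrs)
    return keys


print(candidate_keys(rows))
```

On this sample, (customer_id, customer_name) is a super key but not a candidate key, because customer_id alone already identifies each row. The designer would then pick one of the candidate keys as the primary key.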
7. Define- relational algebra.
The relational algebra is a procedural query language. It consists of a set of operations
that take one or two relation as input and produce a new relation as output.
8. What is a SELECT operation?
The select operation selects tuples that satisfy a given predicate. We use the
lowercase Greek letter sigma (σ) to denote selection.
9. What is a PROJECT operation?
The project operation is a unary operation that returns its argument relation with
certain attributes left out. Projection is denoted by the uppercase Greek letter pi (Π).
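A small Python sketch of the select and project operations over a relation held as a list of dictionaries. The function names and the loan data are illustrative assumptions; the point is that select filters rows, while project keeps only some columns and drops duplicate tuples.

```python
def select(relation, predicate):
    """sigma: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """Pi: keep only the listed attributes and eliminate duplicate tuples."""
    result = []
    for t in relation:
        projected = {a: t[a] for a in attributes}
        if projected not in result:  # a relation is a set, so no duplicates
            result.append(projected)
    return result


# Hypothetical loan relation.
loan = [
    {"loan_no": "L-11", "branch_name": "Chennai", "amount": 900},
    {"loan_no": "L-14", "branch_name": "Madurai", "amount": 1500},
    {"loan_no": "L-15", "branch_name": "Chennai", "amount": 1500},
]

chennai_loans = select(loan, lambda t: t["branch_name"] == "Chennai")
amounts = project(loan, ["amount"])
print(chennai_loans)
print(amounts)
```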
10. Write short notes on tuple relational calculus.
The tuple relational calculus is a nonprocedural query language. It describes the
desired information without giving a specific procedure for obtaining that information.
A query or expression can be expressed in tuple relational calculus as {t | P(t)}, which
means the set of all tuples t such that predicate P is true for t. Notations used:
• t[A] : the value of tuple t on attribute A
• t ∈ r : tuple t is in relation r
• ∃ : there exists
Definition for 'there exists' (∃): ∃ t ∈ r (Q(t)) means there exists a tuple t in
relation r such that predicate Q(t) is true.
• ∀ : for all
Definition for 'for all' (∀): ∀ t ∈ r (Q(t)) means Q(t) is true for all tuples t in
relation r.
• ⇒ : implication
Definition for implication (⇒): P ⇒ Q means if P is true then Q must be true.
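Python set comprehensions read almost like the {t | P(t)} notation, so they give a convenient way to experiment with it. The sketch below uses an invented loan relation; any() and all() play the roles of "there exists" and "for all", and implies() is a hypothetical helper for implication.

```python
from collections import namedtuple

Loan = namedtuple("Loan", ["loan_no", "branch_name", "amount"])

# Hypothetical relation r.
r = {
    Loan("L-11", "Chennai", 900),
    Loan("L-14", "Madurai", 1500),
    Loan("L-15", "Chennai", 1500),
}

# {t | t in r and t[amount] > 1200}
big_loans = {t for t in r if t.amount > 1200}

# there exists t in r (t[branch_name] = "Chennai")
exists_chennai = any(t.branch_name == "Chennai" for t in r)

# for all t in r (t[amount] > 1000)
all_above_1000 = all(t.amount > 1000 for t in r)


def implies(p, q):
    """P implies Q is false only when P is true and Q is false."""
    return (not p) or q


# for all t in r (t[branch_name] = "Madurai" implies t[amount] > 1000)
madurai_loans_are_big = all(
    implies(t.branch_name == "Madurai", t.amount > 1000) for t in r
)

print(sorted(t.loan_no for t in big_loans))
print(exists_chennai, all_above_1000, madurai_loans_are_big)
```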
11. Write short notes on domain relational calculus
The domain relational calculus uses domain variables that take on values from an
attribute domain rather than values for an entire tuple.
12. Define query language?
A query is a statement requesting the retrieval of information. The portion of DML
that involves information retrieval is called a query language.
13. Write short notes on Schema diagram.
A database schema along with primary key and foreign key dependencies can be
depicted pictorially by schema diagram. Each relation appears as a box with attributes listed
inside it and the relation name above it.
14. What is foreign key?
A relation schema r1 derived from an ER schema may include among its attributes
the primary key of another relation schema r2. This attribute is called a foreign key from r1
referencing r2.
15. What are the parts of SQL language?
The SQL language has several parts:
• Data definition language
• Data manipulation language
• View definition
• Transaction control
• Embedded SQL
• Integrity
• Authorization
16. What are the categories of SQL command?
SQL commands are divided into the following categories:
• Data definition language
• Data manipulation language
• Data query language
• Data control language
• Data administration statements
• Transaction control statements
17. What are the three clauses of an SQL expression?
An SQL expression consists of three clauses:
• Select
• From
• Where
18. Give the general form of SQL query?
select A1, A2, ..., An
from R1, R2, ..., Rm
where P
Here each Ai is an attribute, each Ri is a relation, and P is a predicate.
19. What is the use of rename operation?
The rename operation is used to rename both relations and attributes. It uses the as
clause, taking the form:
old-name as new-name
20. Define tuple variable?
Tuple variables are used for comparing two tuples in the same relation. The tuple
variables are defined in the from clause by way of the as clause.
21. List the string operations supported by SQL?
• Pattern matching operation
• Concatenation
• Extracting character strings
• Converting between uppercase and lowercase letters
22. List the set operations of SQL?
• The union operation
• The intersect operation
• The except operation
23. What is the use of Union and intersection operation?
Union: The result of this operation includes all tuples that are either in r1 or in r2 or
in both r1 and r2. Duplicate tuples are automatically eliminated.
Intersection: The result of this operation includes all tuples that are in both r1 and r2.
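The set operations can be tried out with Python's sqlite3 module. The depositor and borrower tables below hold made-up customer names, and names() is a hypothetical helper; except is included for completeness.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE depositor (cust_name TEXT)")
conn.execute("CREATE TABLE borrower (cust_name TEXT)")
# 'Ravi' appears twice in depositor to show duplicate elimination.
conn.executemany("INSERT INTO depositor VALUES (?)", [("Asha",), ("Ravi",), ("Ravi",)])
conn.executemany("INSERT INTO borrower VALUES (?)", [("Ravi",), ("Meena",)])


def names(sql):
    """Run a one-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


union_names = names(
    "SELECT cust_name FROM depositor UNION SELECT cust_name FROM borrower"
)
intersect_names = names(
    "SELECT cust_name FROM depositor INTERSECT SELECT cust_name FROM borrower"
)
except_names = names(
    "SELECT cust_name FROM depositor EXCEPT SELECT cust_name FROM borrower"
)
print(union_names, intersect_names, except_names)
```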
24. What are aggregate functions? And list the aggregate functions supported by SQL?
Aggregate functions are functions that take a collection of values as input and return a
single value.
Aggregate functions supported by SQL are
• Average: avg
• Minimum: min
• Maximum: max
• Total: sum
• Count: count
25. What is the use of group by clause?
Group by clause is used to apply aggregate functions to a set of tuples. The attributes
given in the group by clause are used to form groups. Tuples with the same value on all
attributes in the group by clause are placed in one group.
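A short sqlite3 sketch of grouping with aggregate functions, assuming a hypothetical account table with invented balances:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_no TEXT, branch_name TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("A-101", "Chennai", 500), ("A-102", "Chennai", 700), ("A-103", "Madurai", 900)],
)

# Tuples with the same branch_name fall into one group;
# avg and count are then computed once per group.
avg_by_branch = conn.execute("""
    SELECT branch_name, avg(balance), count(*)
    FROM account
    GROUP BY branch_name
    ORDER BY branch_name
""").fetchall()
print(avg_by_branch)
```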
26. What is the use of sub queries?
A subquery is a select-from-where expression that is nested within another query. A
common use of subqueries is to perform tests for set membership, make set comparisons,
and determine set cardinality.
27. What is view in SQL? How is it defined?
Any relation that is not part of the logical model, but is made visible to a user as a
virtual relation, is called a view. We define a view in SQL by using the create view command.
The form of the create view command is
create view v as <query expression>
28. What is the use of with clause in SQL?
The with clause provides a way of defining a temporary view whose definition is
available only to the query in which the with clause occurs.
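The difference between a view and a with clause can be sketched with sqlite3. The account table and the names chennai_account, max_balance and top_balance are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_no TEXT, branch_name TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("A-101", "Chennai", 500), ("A-102", "Chennai", 700), ("A-103", "Madurai", 900)],
)

# A view is a named virtual relation that later queries can reuse.
conn.execute("""
    CREATE VIEW chennai_account AS
    SELECT account_no, balance FROM account WHERE branch_name = 'Chennai'
""")
view_rows = conn.execute(
    "SELECT account_no FROM chennai_account ORDER BY account_no"
).fetchall()

# The WITH clause defines a temporary view visible only inside this one query.
with_rows = conn.execute("""
    WITH max_balance(top_balance) AS (SELECT max(balance) FROM account)
    SELECT account_no
    FROM account, max_balance
    WHERE account.balance = max_balance.top_balance
""").fetchall()
print(view_rows, with_rows)
```

The view chennai_account stays defined for later queries, whereas max_balance disappears as soon as its query finishes.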
29. List the table modification commands in SQL?
• Deletion
• Insertion
• Updates
• Update of a view
30. List out the statements associated with a database transaction?
• Commit work
• Rollback work
31. What is transaction?
A transaction is a unit of program execution that accesses and possibly updates various
data items.
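A sketch of commit and rollback using Python's sqlite3 module. The account table, the CHECK constraint and the transfer() helper are illustrative assumptions; the example shows a funds transfer that is either applied completely or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        account_no TEXT PRIMARY KEY,
        balance    INTEGER CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A-101", 500), ("A-102", 200)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount from source to target as one transaction."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE account_no = ?",
            (amount, target),
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE account_no = ?",
            (amount, source),
        )
        conn.commit()    # make both updates permanent
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # undo the credit that was already applied
        return False


def balances(conn):
    return dict(conn.execute("SELECT account_no, balance FROM account"))


failed = transfer(conn, "A-102", "A-101", 300)  # would overdraw A-102
after_failed = balances(conn)
ok = transfer(conn, "A-101", "A-102", 300)
after_ok = balances(conn)
print(failed, after_failed, ok, after_ok)
```

In the failed transfer the credit to A-101 has already been executed when the debit violates the constraint, and the rollback removes it again.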
32. List the SQL domain Types?
SQL supports the following domain types.
1) Char(n) 2) varchar(n) 3) int 4) numeric(p,d) 5) float(n) 6) date.
33. What is the use of integrity constraints?
Integrity constraints ensure that changes made to the database by authorized users do
not result in a loss of data consistency. Thus integrity constraints guard against accidental
damage to the database.
34. Mention the 2 forms of integrity constraints in ER model?
• Key declarations
• Form of a relationship
35. What is trigger?
Triggers are statements that are executed automatically by the system as the side
effect of a modification to the database.
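A minimal sketch of a trigger, written in SQLite's dialect and run from Python. The tables and the trigger name log_balance_change are hypothetical. SQLite refers to the old and new row as OLD and NEW directly, whereas standard SQL uses a referencing clause, so the syntax differs between systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_no TEXT PRIMARY KEY, balance INTEGER)")
conn.execute(
    "CREATE TABLE audit_log (account_no TEXT, old_balance INTEGER, new_balance INTEGER)"
)

# Event: an update of balance. Condition: the value really changed.
# Action: record the old and new values in audit_log.
conn.execute("""
    CREATE TRIGGER log_balance_change
    AFTER UPDATE OF balance ON account
    FOR EACH ROW
    WHEN OLD.balance <> NEW.balance
    BEGIN
        INSERT INTO audit_log VALUES (OLD.account_no, OLD.balance, NEW.balance);
    END
""")

conn.execute("INSERT INTO account VALUES ('A-101', 500)")
conn.execute("UPDATE account SET balance = 650 WHERE account_no = 'A-101'")

audit = conn.execute("SELECT * FROM audit_log").fetchall()
print(audit)
```

The application never inserts into audit_log itself; the row appears as a side effect of the update.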
36. What are domain constraints?
A domain is a set of values that may be assigned to an attribute. All values that appear
in a column of a relation must be taken from the same domain.
37. What are referential integrity constraints?
A value that appears in one relation for a given set of attributes also appears for a
certain set of attributes in another relation.
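A sketch of a referential integrity constraint enforced by the DBMS, using sqlite3 with invented branch and account tables. SQLite checks foreign keys only after PRAGMA foreign_keys = ON is issued; other systems enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.execute("CREATE TABLE branch (branch_name TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE account (
        account_no  TEXT PRIMARY KEY,
        branch_name TEXT NOT NULL REFERENCES branch(branch_name)
    )
""")

conn.execute("INSERT INTO branch VALUES ('Chennai')")
conn.execute("INSERT INTO account VALUES ('A-101', 'Chennai')")  # branch exists

try:
    # 'Salem' does not appear in branch, so this row violates the constraint.
    conn.execute("INSERT INTO account VALUES ('A-102', 'Salem')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

account_count = conn.execute("SELECT count(*) FROM account").fetchone()[0]
print(rejected, account_count)
```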
38. What is assertion? Mention the forms available.
An assertion is a predicate expressing a condition that we wish the database always to
satisfy.
• Domain integrity constraints
• Referential integrity constraints
39. Give the syntax of assertion?
create assertion <assertion name> check <predicate>
36. What is the need for triggers?
Triggers are useful mechanisms for alerting humans or for starting certain tasks
automatically when certain conditions are met.
37. List the requirements needed to design a trigger.
The requirements are
• Specifying when a trigger is to be executed.
• Specifying the actions to be taken when the trigger executes.
38. Give the forms of triggers?
• The triggering event can be insert or delete.
• For update, the trigger can specify columns.
• The referencing old row as clause
• The referencing new row as clause
• The triggers can be initiated before the event or after the event.
39. What does database security refer to?
Database security refers to the protection from unauthorized access and malicious
destruction or alteration.
40. List some security violations (or) name any forms of malicious access.
• Unauthorized reading of data
• Unauthorized modification of data
• Unauthorized destruction of data
41. List the types of authorization.
• Read authorization
• Write authorization
• Update authorization
• Drop authorization
42. What is authorization graph?
Passing of authorization from one user to another can be represented by an
authorization graph.
43. List out various user authorization to modify the database schema.
• Index authorization
• Resource authorization
• Alteration authorization
• Drop authorization
44. What are audit trails?
An audit trail is a log of all changes to the database along with information such as
which user performed the change and when the change was performed.
45. Mention the various levels in security measures.
• Database system
• Operating system
• Network
• Physical
• Human
46. Name the various privileges in SQL?
• Delete
• Select
• Insert
• Update
47. Mention the various user privileges.
• All privileges directly granted to the user or role.
• All privileges granted to roles that have been granted to the user or role.
48. Give the limitations of SQL authorization.
• The code for checking authorization becomes intermixed with the rest of the
application code.
• Implementing authorization through application code rather than specifying it
declaratively in SQL makes it hard to ensure the absence of loopholes.
49. Give some encryption techniques?
• DES
• AES
• Public key encryption
50. What does authentication refer to?
Authentication refers to the task of verifying the identity of a person.
51. List some authentication techniques.
• Challenge response scheme
• Digital signatures
• Nonrepudiation
52. List the disadvantages of relational database system
• Repetition of data
• Inability to represent certain information
53. Draw the schema diagram of a banking enterprise.
54. Define materialized views.
Certain database systems allow view relations to be stored, but they make sure that, if
the actual relations used in the view definition change, the view is kept up to date. Such
views are called materialized views.
55. What does set membership do in nested sub queries?
SQL draws on the relational calculus for operations that allow testing tuples for
membership in a relation. The in connective tests for set membership, where the set is a
collection of values produced by a select clause. The not in connective tests for the absence
of set membership.
56. Define ODBC and explain its role.
The Open DataBase Connectivity (ODBC) standard defines a way for an
application program to communicate with a database server. ODBC defines an application
program interface (API) that applications can use to open a connection with a database,
send queries and updates, and get back results. Applications such as graphical user interfaces,
statistics packages, and spreadsheets can make use of the same ODBC API to connect to any
database server that supports ODBC.
57. List the Integrity Constraints.
• Not Null
• Unique
• Check
58. Write a SQL statement to find the names and loan numbers of all customers who have a loan at
the Chennai branch.
select cust_name, loan.loan_no from borrower, loan where branch_name = 'Chennai' and
borrower.loan_no = loan.loan_no
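The query can be tried out with sqlite3. The sketch assumes two hypothetical relations, borrower(cust_name, loan_no) linking customers to their loans and loan(loan_no, branch_name, amount), filled with made-up rows. The loan_no column is qualified in the select list because it occurs in both relations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE borrower (cust_name TEXT, loan_no TEXT)")
conn.execute("CREATE TABLE loan (loan_no TEXT, branch_name TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO borrower VALUES (?, ?)",
    [("Asha", "L-11"), ("Ravi", "L-14"), ("Meena", "L-15")],
)
conn.executemany(
    "INSERT INTO loan VALUES (?, ?, ?)",
    [("L-11", "Chennai", 900), ("L-14", "Madurai", 1500), ("L-15", "Chennai", 1500)],
)

chennai_borrowers = conn.execute("""
    SELECT cust_name, loan.loan_no
    FROM borrower, loan
    WHERE borrower.loan_no = loan.loan_no
      AND branch_name = 'Chennai'
    ORDER BY cust_name
""").fetchall()
print(chennai_borrowers)
```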
59. List the different types of joins in SQL.
• Inner Join
• Left Outer Join
• Right Outer Join
• Full Outer Join
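A sketch contrasting an inner join with a left outer join, using sqlite3 and invented customer and borrower tables. Right and full outer joins follow the same pattern but need a recent SQLite version, so they are left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_name TEXT)")
conn.execute("CREATE TABLE borrower (cust_name TEXT, loan_no TEXT)")
conn.executemany("INSERT INTO customer VALUES (?)", [("Asha",), ("Ravi",)])
conn.execute("INSERT INTO borrower VALUES ('Asha', 'L-11')")  # Ravi has no loan

# Inner join: only customers that have a matching borrower row.
inner_rows = conn.execute("""
    SELECT c.cust_name, b.loan_no
    FROM customer AS c INNER JOIN borrower AS b ON c.cust_name = b.cust_name
    ORDER BY c.cust_name
""").fetchall()

# Left outer join: every customer; missing loan data is padded with NULL.
left_rows = conn.execute("""
    SELECT c.cust_name, b.loan_no
    FROM customer AS c LEFT OUTER JOIN borrower AS b ON c.cust_name = b.cust_name
    ORDER BY c.cust_name
""").fetchall()
print(inner_rows, left_rows)
```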
60. What is static SQL? How does it differ from dynamic SQL?
The SQL statements in a program are static if their text is fixed when the program is
written and does not change each time the program is run. Dynamic SQL statements are built
at run time and placed in a string host variable. They are then sent to the DBMS for processing.
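The difference can be sketched in Python with sqlite3. Python is not embedded SQL, so this is only an analogy: STATIC_SQL is a statement whose text never changes, while build_query() (a hypothetical helper) assembles the statement text at run time. The column whitelist and the ? placeholders are a precaution the sketch adds so that run-time input never becomes part of the SQL text unchecked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_no TEXT, branch_name TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("A-101", "Chennai", 500), ("A-102", "Chennai", 700), ("A-103", "Madurai", 900)],
)

# Static: the full statement text is known when the program is written.
STATIC_SQL = (
    "SELECT account_no FROM account "
    "WHERE branch_name = 'Chennai' ORDER BY account_no"
)

ALLOWED_COLUMNS = {"account_no", "branch_name"}


def build_query(filters):
    """Dynamic: build the statement text at run time from the given filters."""
    clauses, params = [], []
    for column, value in filters.items():
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {column}")
        clauses.append(f"{column} = ?")
        params.append(value)
    sql = "SELECT account_no FROM account"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY account_no", params


static_rows = conn.execute(STATIC_SQL).fetchall()
sql, params = build_query({"branch_name": "Chennai"})
dynamic_rows = conn.execute(sql, params).fetchall()
print(sql, params, dynamic_rows)
```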
PART – B
1. Consider the following relational schema. An employee can work in more than one
department.
Emp(eid: integer, ename: string, salary: real)
Works(eid: integer, did: integer)
Dept(did: integer, dname: string, managerid: integer, floornum: integer)
Write SQL queries for the following. Be sure to underline your variables to distinguish them
from your constants.
i. Print the names of all employees who work on the 10th floor and make less than
$50,000.
ii. Print the names of all managers who manage three or more departments on the same
floor.
iii. Print the names of all managers who manage 10 or more departments on the same
floor.
iv. Give every employee who works in the toy department a 10 percent raise
v. Print the names of the departments that employee Santa works in.
vi. Print the names and salaries of employees who work in both the toy department and
the candy department.
vii. Print the names of employees who earn a salary that is either less than $10,000 or
more than $100,000.
viii. Print all of the attributes for employees who work in some department that
employee Santa also works in.
ix. Fire Santa.
x. Print the names of employees who make more than $20,000 and work in either the
video department or the toy department.
xi. Print the names of all employees who work on the floor(s) where Jane Dodecahedron
works.
xii. Print the name of each employee who earns more than the manager of the department
that he or she works in.
xiii. Print the name of each department that has a manager whose last name is Psmith
and who is neither the highest-paid nor the lowest-paid employee in the department.
2. Consider the relational database
employee (empname, street, city)
works (empname, companyname, salary)
company (companyname, city)
manages (empname, managername)
and give relational algebraic expressions for the following queries:
i. Find the names of all employees who work for First Bank Corporation.
ii. Find the names, street addresses, and cities of residence of all employees who work for
First Bank Corporation and earn more than 200000 per annum.
iii. Find the names of all employees in this database who live in the same city as the companies
for which they work.
iv. Find the names of all the employees who earn more than every employee of Small
Corporation.
3. Let R = (A, B, C) and let r1 and r2 both be relations on the schema R. Give an expression
in the domain relational calculus that is equivalent to each of the following:
i. Π_A (r1)
ii. σ_{B = 17} (r1)
iii. r1 ∩ r2
iv. r1 ∪ r2
v. r1 – r2
4. Explain the various keys involved in relational database with a suitable example.
5. Discuss about the triggers. How do triggers offer a powerful mechanism for dealing with the
changes to database with suitable example?
6. What are nested queries? Explain with example.
7. What are the relational algebra operations supported in SQL? Write the SQL statement for
each operation.
8. Explain the basic relational algebra operations with the symbol used with example.
9. Discuss about tuple relational calculus and domain relational calculus
10. Consider the database given by the following schemas:
Customer(Cust_no, sales_person_no, City)
Sales_person(sales_person_no, Sales_person_name, common_prec, year_of_hire)
Give the SQL for the following:
i. Display the list of all customers by cust_no with the city in which each is located.
ii. List the names of the sales persons who have accounts in Delhi.
11. Discuss the strengths and weaknesses of the trigger mechanism. Compare triggers with other
integrity constraints.
12. Discuss Extended Relational-Algebra Operations in detail.
13. What is a domain type? List and explain the various domain types available in SQL with
example.
14. Write brief notes on the following
i. Set Operations
ii. Aggregate Functions
iii. Null Values
15. Define and Illustrate Integrity constraint with suitable examples.
16. State and explain Embedded SQL and Dynamic SQL in detail.
17. Write the following queries in SQL, based on this schema:
Suppliers(sid: integer, sname: string, city: string)
Parts(pid: integer, pname: string, color: string)
Orders(sid: integer, pid: integer, quantity: integer)
i. For each supplier from whom all of the following things have been ordered in
quantities of at least 150, print the name and city of the supplier: a blue gear, a red
crankshaft, and a yellow bumper.
ii. Print the names of the purple parts that have been ordered from suppliers located in
Madison, Milwaukee, or Waukesha.
iii. Print the names and cities of suppliers who have an order for more than 150 units of a
yellow or purple part.
iv. Print the pids of parts that have been ordered from a supplier named American but
have also been ordered from some supplier with a different name in a quantity that is
greater than the American order by at least 100 units.
v. Print the names of the suppliers located in Madison. Could there be any duplicates in
the answer?
vi. Print all available information about suppliers that supply green parts.
vii. For each order of a red part, print the quantity and the name of the part.
viii. Print the names of the parts that come in both blue and green. (Assume that no two
distinct parts can have the same name and color.)
ix. Print (in ascending order alphabetically) the names of parts supplied both by a
Madison supplier and by a Berkeley supplier.
x. Print the names of parts supplied by a Madison supplier, but not supplied by any
Berkeley supplier. Could there be any duplicates in the answer?
xi. Print the total number of orders.
xii. Print the largest quantity per order for each sid such that the minimum quantity per
order for that supplier is greater than 100.
xiii. Print the average quantity per order of red parts.
18. Consider the following schema
Suppliers(sid: integer, sname: string, address: string)
Parts(pid: integer, pname: string, color: string)
Catalog(sid: integer, pid: integer, cost: real)
The key fields are underlined, and the domain of each field is listed after the field name. Thus sid
is the key for Suppliers, pid is the key for Parts, and sid and pid together form the key for
Catalog. The Catalog relation lists the prices charged for parts by Suppliers. Write the
following queries in relational algebra, tuple relational calculus, and domain relational
calculus:
i. Find the names of suppliers who supply some red part.
ii. Find the sids of suppliers who supply some red or green part.
iii. Find the sids of suppliers who supply some red part or are at 221 Packer Ave.
iv. Find the sids of suppliers who supply some red part and some green part.
v. Find the sids of suppliers who supply every part.
vi. Find the sids of suppliers who supply every red part.
vii. Find the sids of suppliers who supply every red or green part.
viii. Find the sids of suppliers who supply every red part or supply every green part.
ix. Find pairs of sids such that the supplier with the first sid charges more for some part
than the supplier with the second sid.
x. Find the pids of parts that are supplied by at least two different suppliers.
xi. Find the pids of the most expensive parts supplied by suppliers named Yosemite
Sham.
xii. Find the pids of parts supplied by every supplier at less than $200. (If any supplier
either does not supply the part or charges more than $200 for it, the part is not
selected.)
19. Consider the following relations containing airline flight information:
flights(flno: integer, from: string, to: string, distance: integer, departs: time, arrives: time)
aircraft(aid: integer, aname: string, cruisingrange: integer)
certified(eid: integer, aid: integer)
employees(eid: integer, ename: string, salary: integer)
Note that the Employees relation describes pilots and other kinds of employees as
well; every pilot is certified for some aircraft (otherwise, he or she would not qualify as a
pilot), and only pilots are certified to fly.
Write the following queries in relational algebra, tuple relational calculus, and
domain relational calculus. Note that some of these queries may not be expressible in
relational algebra (and, therefore, also not expressible in tuple and domain relational
calculus)! For such queries, informally explain why they cannot be expressed.
i. Find the eids of pilots certified for some Boeing aircraft.
ii. Find the names of pilots certified for some Boeing aircraft.
iii. Find the aids of all aircraft that can be used on non-stop flights from Bonn to Madras.
iv. Identify the flights that can be piloted by every pilot whose salary is more than
$100,000. (Hint: The pilot must be certified for at least one plane with a sufficiently large
cruising range.)
v. Find the names of pilots who can operate some plane with a range greater than 3,000
miles but are not certified on any Boeing aircraft.
vi. Find the eids of employees who make the highest salary.
vii. Find the eids of employees who make the second highest salary.
viii. Find the eids of pilots who are certified for the largest number of aircraft.
ix. Find the eids of employees who are certified for exactly three aircraft.
x. Find the total amount paid to employees as salaries.
xi. Is there a sequence of flights from Madison to Timbuktu? Each flight in the sequence
is required to depart from the city that is the destination of the previous flight; the first
flight must leave Madison, the last flight must reach Timbuktu, and there is no
restriction on the number of intermediate flights. Your query must determine whether
a sequence of flights from Madison to Timbuktu exists for any input Flights relation
instance.
20. Consider the following relations:
Sailors(sid: integer, sname: string, rating: integer, age: real)
Boats(bid: integer, bname: string, color: string)
Reserves(sid: integer, bid: integer, day: date)
Write SQL queries for the following.
i. Find the names and ages of all sailors
ii. Find all sailors with rating above 7.
iii. Find the names of sailors who have reserved boat number 103.
iv. Find the sids of sailors who have reserved a red boat.
v. Find the names of sailors who have reserved a red boat.
vi. Find the color of boat reserved by Lubber.
vii. Find the names of sailors who have reserved at least one boat.
viii. Compute increments for the ratings of persons who have sailed two different boats
on the same day
ix. Find the ages of sailors whose name begins and ends with B and has at least three
characters.
x. Find the names of sailors who have reserved a red or a green boat.
xi. Find the names of sailors who have reserved both a red and a green boat.
xii. Find the names of sailors who have reserved red boats but not green boats.
xiii. Find the ages of sailors whose name begins and ends with B and had at least three
characters
xiv. Find all sids of sailors who have a rating of 10 or have reserved boat 104.
xv. Find the names of sailors who have not reserved a red boat.
xvi. Find sailors whose rating is better than some sailor called Horatio.
xvii. Find sailors whose rating is better than every sailor called Horatio.
xviii. Find the sailors with the highest rating.
xix. Find the names of sailors who have reserved all boats.
xx. Find the average age of all sailors.
xxi. Find the average age of sailors with a rating of 10
xxii. Find the name and age of the oldest sailor.
xxiii. Count the number of sailors.
xxiv. Find the names of sailors who are older than the oldest sailor with a rating of 10.
xxv. Find the age of the youngest sailor for each rating level.
xxvi. Find the age of the youngest sailor who is eligible to vote for each rating level with
at least two such sailors.
xxvii. Find those ratings for which the average age of sailors is the minimum over all
ratings.
21. Consider the following relations:
Student(snum: integer, sname: string, major: string, level: string, age: integer)
Class(name: string, meets_at: time, room: string, fid: integer)
Enrolled(snum: integer, cname: string)
Faculty(fid: integer, fname: string, deptid: integer)
The meaning of these relations is straightforward; for example, Enrolled has one record per
student-class pair such that the student is enrolled in the class.
Write the following queries in SQL. No duplicates should be printed in any of the answers.
i. Find the names of all Juniors (Level = JR) who are enrolled in a class taught by I.
Teach.
ii. Find the age of the oldest student who is either a History major or is enrolled in a
course taught by I. Teach.
iii. Find the names of all classes that either meet in room R128 or have five or more
students enrolled.
iv. Find the names of all students who are enrolled in two classes that meet at the same
time.
v. Find the names of faculty members who teach in every room in which some class is
taught.
vi. Find the names of faculty members for whom the combined enrollment of the courses
that they teach is less than five.
vii. Print the Level and the average age of students for that Level, for each Level.
viii. Print the Level and the average age of students for that Level, for all Levels except
JR.
ix. Find the names of students who are enrolled in the maximum number of classes.
x. Find the names of students who are not enrolled in any class.
xi. For each age value that appears in Student, find the level value that appears most
often. For example, if there are more FR level students aged 18 than SR, JR, or SO students
aged 18, you should print the pair (18, FR).
22. The following relations keep track of airline flight information
Flights(flno: integer, from: string, to: string, distance: integer, departs: time, arrives: time,
price: integer)
Aircraft(aid: integer, aname: string, cruisingrange: integer)
Certified(eid: integer, aid: integer)
Employees(eid: integer, ename: string, salary: integer)
Note that the Employees relation describes pilots and other kinds of employees as
well; every pilot is certified for some aircraft, and only pilots are certified to fly. Write each
of the following queries in SQL.
i. Find the names of aircraft such that all pilots certified to operate them earn more than
80,000.
ii. For each pilot who is certified for more than three aircraft, find the eid and the
maximum cruisingrange of the aircraft that he (or she) is certified for.
iii. Find the names of pilots whose salary is less than the price of the cheapest route from
Los Angeles to Honolulu.
iv. For all aircraft with cruisingrange over 1,000 miles, find the name of the aircraft and
the average salary of all pilots certified for this aircraft.
v. Find the names of pilots certified for some Boeing aircraft.
vi. Find the aids of all aircraft that can be used on routes from Los Angeles to Chicago.
vii. Identify the flights that can be piloted by every pilot who makes more than $100,000.
(Hint: The pilot must be certified for at least one plane with a sufficiently large
cruising range.)
viii. Print the enames of pilots who can operate planes with cruisingrange greater than
3,000 miles, but are not certified on any Boeing aircraft.
ix. A customer wants to travel from Madison to New York with no more than two
changes of flight. List the choice of departure times from Madison if the customer
wants to arrive in New York by 6 p.m.
x. Compute the difference between the average salary of a pilot and the average salary
of all employees (including pilots).
xi. Print the name and salary of every nonpilot whose salary is more than the average
salary for pilots.
23. Consider the following relational schema. An employee can work in more than one
department; the pct_time field of the Works relation shows the percentage of time that a
given employee works in a given department.
Write the following queries in SQL:
i. Print the names and ages of each employee who works in both the Hardware
department and the Software department.
ii. For each department with more than 20 full-time-equivalent employees (i.e., where
the part-time and full-time employees add up to at least that many full-time
employees), print the did together with the number of employees that work in that
department.
iii. Print the name of each employee whose salary exceeds the budget of all of the
departments that he or she works in.
iv. Find the managerids of managers who manage only departments with budgets greater
than $1,000,000.
v. Find the enames of managers who manage the departments with the largest budget.
vi. If a manager manages more than one department, he or she controls the sum of all the
budgets for those departments. Find the managerids of managers who control more
than $5,000,000.
vii. Find the managerids of managers who control the largest amount.
24. Consider the instance of the Sailors relation shown in the figure below.
Figure 1.1 An Instance of Sailors
i. Write SQL queries to compute the average rating, using AVG; the sum of the ratings,
using SUM; and the number of ratings, using COUNT.
ii. If you divide the sum computed above by the count, would the result be the same as
the average? How would your answer change if the above steps were carried out with
respect to the age field instead of rating?
iii. Consider the following query: Find the names of sailors with a higher rating than all
sailors with age < 21. The following two SQL queries attempt to obtain the answer to
this question. Do they both compute the result? If not, explain why. Under what
conditions would they compute the same result?
SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS (SELECT *
                  FROM Sailors S2
                  WHERE S2.age < 21
                  AND S.rating <= S2.rating)

SELECT *
FROM Sailors S
WHERE S.rating > ANY (SELECT S2.rating
                      FROM Sailors S2
                      WHERE S2.age < 21)
iv. Consider the instance of Sailors shown in Figure 1.1. Let us define instance S1 of
Sailors to consist of the first two tuples, instance S2 to be the last two tuples, and S to
be the given instance.
(a) Show the left outer join of S with itself, with the join condition being sid=sid.
(b) Show the right outer join of S with itself, with the join condition being
sid=sid.
(c) Show the full outer join of S with itself, with the join condition being sid=sid.
(d) Show the left outer join of S1 with S2, with the join condition being sid=sid.
(e) Show the right outer join of S1 with S2, with the join condition being sid=sid.
(f) Show the full outer join of S1 with S2, with the join condition being sid=sid.
25. Explain the term impedance mismatch in the context of embedding SQL commands in
a host language such as C.
QUESTION BANK – UNIT III
PART – A
1. Define Boyce-Codd Normal Form.
A relation schema R is in BCNF with respect to a set F of functional dependencies if,
for all functional dependencies in F+ of the form α→β, where α ⊆ R and β ⊆ R, at least one of
the following holds:
 α→β is a trivial functional dependency (that is, β ⊆ α).
 α is a superkey for schema R.
2. What is first normal form?
A relation schema is in first normal form (1NF) if the domain of every attribute contains
only atomic (simple, indivisible) values.
3. What is meant by functional dependencies?
The notion of functional dependency generalizes the notion of superkey. Consider a
relation schema R, and let α ⊆ R and β ⊆ R. The functional dependency α→β holds on
schema R if, in any legal relation r(R), for all pairs of tuples t1 and t2 in r such that t1[α] =
t2[α], it is also the case that t1[β] = t2[β].
4. What is a trivial dependency?
A functional dependency of the form α→β is trivial if β ⊆ α. Trivial functional
dependencies are satisfied by all relations.
5. What are axioms?
Axioms or rules of inference provide a simpler technique for reasoning about
functional dependencies.
6. What is meant by computing the closure of a set of functional dependencies?
The closure of F denoted by F+ is the set of functional dependencies logically implied
by F.
7. What is meant by normalization of data?
Normalization is the process of analyzing the given relation schemas based on their functional
dependencies (FDs) and primary keys to achieve the following properties:
 Minimizing redundancy
 Minimizing insertion, deletion and updating anomalies.
8. Define canonical cover.
A canonical cover Fc for F is a set of dependencies such that F logically implies all
dependencies in Fc and Fc logically implies all dependencies in F.
9. List the properties of canonical cover.
Fc must have the following properties.
 No functional dependency in Fc contains an extraneous attribute.
 Each left side of a functional dependency in Fc is unique.
10. List the desirable properties of decomposition.
21
CS6302 – DBMS



VENKATESH.R AP/CSE
Lossless-join decomposition
Dependency preservation
Repetition of information
14. What is 2NF?
A relation schema R is in 2NF if it is in 1NF and every non-prime attribute A in R is
fully functionally dependent on the primary key of R.
15. What is a multivalued dependency?
A multivalued dependency X →→ Y specified on relation schema R, where X and Y are
both subsets of R, specifies the following constraint on any relation state r of R: if two
tuples t1 and t2 exist in r such that t1[X] = t2[X], then two tuples t3 and t4 should also
exist in r with the following properties, where we use Z to denote (R − (X ∪ Y)):
t3[X] = t4[X] = t1[X] = t2[X]
t3[Y] = t1[Y] and t4[Y] = t2[Y]
t3[Z] = t2[Z] and t4[Z] = t1[Z]
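The multivalued-dependency condition can be checked mechanically on a small relation instance. The following is a minimal sketch, assuming the relation is held in memory as a list of dictionaries. The function name satisfies_mvd and the course/teacher/book data are hypothetical and chosen only for illustration. For every pair of tuples t1, t2 that agree on X, it looks for the tuple t3 that takes its X and Y values from t1 and its remaining values from t2; the tuple t4 is covered when the same pair is visited in the opposite order.

```python
def satisfies_mvd(rows, attrs, x, y):
    """Check X ->> Y on a relation given as a list of dicts keyed by attribute name."""
    xy = set(x) | set(y)
    present = {tuple(row[a] for a in attrs) for row in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # Build t3: X and Y values from t1, Z values from t2.
                t3 = tuple(t1[a] if a in xy else t2[a] for a in attrs)
                if t3 not in present:
                    return False
    return True


# Hypothetical instance: every teacher of a course uses every book of that course.
ATTRS = ("course", "teacher", "book")
ROWS = [
    {"course": "DB", "teacher": "Smith", "book": "Korth"},
    {"course": "DB", "teacher": "Smith", "book": "Elmasri"},
    {"course": "DB", "teacher": "Jones", "book": "Korth"},
    {"course": "DB", "teacher": "Jones", "book": "Elmasri"},
]

print(satisfies_mvd(ROWS, ATTRS, ["course"], ["teacher"]))      # full instance
print(satisfies_mvd(ROWS[:3], ATTRS, ["course"], ["teacher"]))  # one tuple missing
```

Dropping any one of the four tuples breaks the dependency, because the combination of that teacher with that book is then missing.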
16. List the two ways of using functional dependencies.
1. To test relations to see whether they are legal under a given set of functional
dependencies. If a relation r is legal under a set F of functional dependencies, we say that r
satisfies F.
2. To specify constraints on the set of legal relations. We shall thus concern ourselves
with only those relations that satisfy a given set of functional dependencies. If we wish to
constrain ourselves to relations on schema R that satisfy a set F of functional dependencies,
we say that F holds on R.
17. When is a functional dependency said to be logically implied?
Given a relational schema R, a functional dependency f on R is logically implied by a
set of functional dependencies F on R if every relation instance r(R) that satisfies F also
satisfies f. Suppose we are given a relation schema R = (A, B, C, G, H, I) and the set of
functional dependencies
A→B A→C CG→H CG→I B→H
The functional dependency A→H is logically implied, because A→B and B→H both hold.
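One common way to test whether a dependency is logically implied is to compute the closure of its left side under F and check that the right side is contained in it. The sketch below applies this to the example dependencies; the function names attribute_closure and implies are illustrative and are not taken from the question bank.

```python
def attribute_closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Once the closure contains the left side, it gains the right side.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """F logically implies lhs -> rhs exactly when rhs lies inside the closure of lhs."""
    return set(rhs) <= attribute_closure(lhs, fds)


# The dependencies of the example: A->B, A->C, CG->H, CG->I, B->H
F = [
    ({"A"}, {"B"}),
    ({"A"}, {"C"}),
    ({"C", "G"}, {"H"}),
    ({"C", "G"}, {"I"}),
    ({"B"}, {"H"}),
]

print(sorted(attribute_closure({"A"}, F)))  # ['A', 'B', 'C', 'H']
print(implies(F, {"A"}, {"H"}))             # True
print(implies(F, {"A"}, {"I"}))             # False: I needs G as well
```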
18. Define extraneous attributes.
An attribute of a functional dependency is said to be extraneous if we can remove it
without changing the closure of the set of functional dependencies. The formal definition of
extraneous attributes is as follows. Consider a set F of functional dependencies and the
functional dependency α →β in F.
 Attribute A is extraneous in α if A ∈ α, and F logically implies (F − {α → β}) ∪ {(α −
A) → β}.
 Attribute A is extraneous in β if A ∈ β, and the set of functional dependencies (F − {α
→ β}) ∪ {α → (β − A)} logically implies F.
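Both tests for extraneous attributes reduce to attribute-closure computations. The sketch below is one way to write them, assuming dependencies are stored as (lhs, rhs) pairs of frozensets. The helper names and the two small dependency sets are illustrative only.

```python
def closure(attrs, fds):
    """Closure of an attribute set under a list of (lhs, rhs) dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def extraneous_in_lhs(attr, fd, fds):
    """attr is extraneous in the left side if F already implies (lhs - attr) -> rhs."""
    lhs, rhs = fd
    if attr not in lhs:
        return False
    return rhs <= closure(lhs - {attr}, fds)


def extraneous_in_rhs(attr, fd, fds):
    """attr is extraneous in the right side if F with lhs -> (rhs - attr) still gives lhs -> attr."""
    lhs, rhs = fd
    if attr not in rhs:
        return False
    reduced = [f for f in fds if f != fd] + [(lhs, rhs - {attr})]
    return attr in closure(lhs, reduced)


# Illustrative set 1: {AB -> C, A -> C}; B is extraneous in AB -> C.
AB_C = (frozenset("AB"), frozenset("C"))
A_C = (frozenset("A"), frozenset("C"))
print(extraneous_in_lhs("B", AB_C, [AB_C, A_C]))   # True
print(extraneous_in_lhs("A", AB_C, [AB_C, A_C]))   # False

# Illustrative set 2: {AB -> CD, A -> C}; C is extraneous in the right side of AB -> CD.
AB_CD = (frozenset("AB"), frozenset("CD"))
print(extraneous_in_rhs("C", AB_CD, [AB_CD, A_C]))  # True
print(extraneous_in_rhs("D", AB_CD, [AB_CD, A_C]))  # False
```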
19. List and define Armstrong’s axioms.
This collection of rules is called Armstrong’s axioms in honor of the person who
first proposed it.
 Reflexivity rule. If α is a set of attributes and β ⊆ α, then α → β holds.
 Augmentation rule. If α → β holds and γ is a set of attributes, then γα → γβ holds.
 Transitivity rule. If α → β holds and β → γ holds, then α → γ holds.
20. When are Armstrong’s axioms said to be sound and complete?
Armstrong’s axioms are sound, because they do not generate any incorrect functional
dependencies. They are complete, because, for a given set F of functional dependencies, they
allow us to generate all of F+.
21. List the additional rules of Armstrong’s Axioms.
Although Armstrong’s axioms are complete, it is tiresome to use them directly for the
computation of F+. To simplify matters further, we list additional rules. It is possible to use
Armstrong’s axioms to prove that these rules are correct
 Union rule. If α → β holds and α → γ holds, then α →βγ holds.
 Decomposition rule. If α →βγ holds, then α → β holds and α →γ holds.
 Pseudotransitivity rule. If α→β holds and γβ →δ holds, then αγ →δ holds.
22. Define Decomposition.
Decomposition is the replacement of a relation schema that has many attributes by
several schemas with fewer attributes, in order to correct a bad design. Careless
decomposition, however, may lead to another form of bad design.
23. When is a decomposition said to be a lossless-join decomposition?
Let C represent a set of constraints on the database, and let R be a relation schema. A
decomposition {R1, R2, ..., Rn} of R is a lossless-join decomposition if, for all relations r on
schema R that are legal under C,
r = ΠR1(r) ⋈ ΠR2(r) ⋈ … ⋈ ΠRn(r)
Let R be a relation schema, and let F be a set of functional dependencies on R. Let R1
and R2 form a decomposition of R. This decomposition is a lossless-join decomposition of R
if at least one of the following functional dependencies is in F+:
 R1 ∩ R2 → R1
 R1 ∩ R2 → R2
In other words, if R1 ∩ R2 forms a superkey of either R1 or R2, the decomposition of R is a
lossless-join decomposition.
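The two-schema test above can be sketched in a few lines: compute the closure of R1 ∩ R2 under F and check whether it covers R1 or R2. The function names and the sample schema R = (A, B, C) with F = {A → B} are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Closure of an attribute set under a list of (lhs, rhs) dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def passes_lossless_test(r1, r2, fds):
    """True if R1 ∩ R2 -> R1 or R1 ∩ R2 -> R2 is in F+."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


# Illustrative schema R = (A, B, C) with the single dependency A -> B.
F = [({"A"}, {"B"})]

print(passes_lossless_test({"A", "B"}, {"A", "C"}, F))  # True: A is a key of (A, B)
print(passes_lossless_test({"A", "B"}, {"B", "C"}, F))  # False: B determines nothing
```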
24. Define dependency preservation.
When an update is made to the database, the system should be able to check that the update
will not create an illegal relation, that is, one that does not satisfy all the given functional
dependencies. To check updates efficiently, we should design relational-database schemas
that allow update validation without the computation of joins. A decomposition with this
property is said to be dependency preserving.
Formally, let F’ be the union of the restrictions of F to the schemas R1, R2, ..., Rn of the
decomposition. A decomposition having the property F’+ = F+ is a dependency-preserving
decomposition.
25. Define Third Normal Form
BCNF requires that all nontrivial dependencies be of the form α → β, where α is a
superkey. 3NF relaxes this constraint slightly by allowing nontrivial functional dependencies
whose left side is not a superkey.
A relation schema R is in third normal form (3NF) with respect to a set F of
functional dependencies if, for all functional dependencies in F+ of the form α → β, where
α ⊆ R and β ⊆ R, at least one of the following holds:
 α → β is a trivial functional dependency.
 α is a superkey for R.
 Each attribute A in β − α is contained in a candidate key for R.
26. Define 3NF synthesis algorithm.
The 3NF decomposition algorithm is also called the 3NF synthesis algorithm, since it takes a
set of dependencies and adds one schema at a time, instead of decomposing the initial
schema repeatedly. The result is not uniquely defined, since a set of functional dependencies
can have more than one canonical cover, and, further, in some cases the result of the
algorithm depends on the order in which it considers the dependencies in Fc.
27. Define Fourth Normal Form
A relation schema R is in fourth normal form (4NF) with respect to a set D of
functional and multivalued dependencies if, for all multivalued dependencies in D+ of the
form α →→ β, where α ⊆ R and β ⊆ R, at least one of the following holds:
 α →→ β is a trivial multivalued dependency.
 α is a superkey for schema R.
A database design is in 4NF if each member of the set of relation schemas that
constitutes the design is in 4NF.
28. What is join dependency?
Join dependencies constrain the set of legal relations over a schema R to those
relations for which a given decomposition is a lossless-join decomposition.
Let R be a relation schema and R1, R2, ..., Rn be a decomposition of R. If R = R1 ∪
R2 ∪ … ∪ Rn, we say that a relation r(R) satisfies the join dependency *(R1, R2, ..., Rn) if:
r = ΠR1(r) ⋈ ΠR2(r) ⋈ … ⋈ ΠRn(r).
A join dependency is trivial if one of the Ri is R itself.
PART-B:
1. What is Normalization? Explain 1NF, 2NF, 3NF and BCNF with suitable example.
2. What is FD? Explain the role of FD in the process of normalization.
3. Explain multivalued dependency.
4. Describe 4NF with suitable example.
5. Write brief notes on Join Dependency.
6. With example, explain 5NF.
QUESTION BANK – UNIT IV
PART – A
1. What is a transaction?
Collections of operations that form a single logical unit of work are called
transactions.
2. What are the two statements regarding a transaction?
A transaction is delimited by two statements of the form:
 Begin transaction
 End transaction
3. What are the properties of a transaction?
The properties of transactions are:
 Atomicity
 Consistency
 Isolation
 Durability
4. What is recovery management component?
Ensuring durability is the responsibility of a software component of the database system
called the recovery management component.
5. When is a transaction rolled back?
Any changes that the aborted transaction made to the database must be undone. Once
the changes caused by an aborted transaction have been undone, the transaction has been
rolled back.
6. What are the states of a transaction?
The states of a transaction are:
 Active
 Partially committed
 Failed
 Aborted
 Committed
 Terminated
7. What is a shadow copy scheme?
The shadow copy scheme is a simple, but extremely inefficient, recovery scheme. It is based
on making copies of the database, called shadow copies, and assumes that only one
transaction is active at a time. The scheme also assumes that the database is simply a file on
disk.
8. Give the reasons for allowing concurrency.
If transactions run serially, a short transaction may have to wait for a preceding long
transaction to complete, which can lead to unpredictable delays in running a transaction.
Concurrent execution reduces these delays, and it also improves throughput and resource
utilization.
9. What is average response time?
The average response time is the average time for a transaction to be completed
after it has been submitted.
10. What are the two types of Serializability?
The two types of Serializability are:
 Conflict Serializability
 View Serializability
11. Define lock.
A lock is a mechanism for controlling concurrent access to a data item. The most common
way to ensure serializability is to allow a transaction to access a data item only if it is
currently holding a lock on that item.
12. What are the different modes of lock?
The modes of lock are:
 Shared (S)
 Exclusive (X)
13. Define deadlock.
A deadlock is a situation in which two or more transactions each wait for a lock held by
another, so that none of them can ever proceed with its normal execution.
14. Define the phases of two phase locking protocol
Growing phase: a transaction may obtain locks but not release any lock.
Shrinking phase: a transaction may release locks but may not obtain any new locks.
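Whether a transaction obeys the two phases can be checked from its sequence of lock and unlock requests: once any lock has been released, no further lock may be requested. The sketch below assumes a transaction is recorded as a list of (action, item) pairs; that representation and the function name is_two_phase are illustrative.

```python
def is_two_phase(operations):
    """Return True if a sequence of ("lock" | "unlock", item) pairs obeys two-phase locking."""
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            # The first unlock ends the growing phase.
            shrinking = True
        elif action == "lock" and shrinking:
            # Acquiring a lock after releasing one violates the protocol.
            return False
    return True


T1 = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
T2 = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]

print(is_two_phase(T1))  # True
print(is_two_phase(T2))  # False
```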
15. Define upgrade and downgrade?
A mechanism for conversion from shared lock to exclusive lock is known as
upgrade.
A mechanism for conversion from exclusive lock to shared lock is known as
downgrade.
16. What is a database graph?
In graph-based locking protocols, a partial ordering is imposed on the set D of all data
items. The partial ordering implies that the set D may be viewed as a directed acyclic
graph, called a database graph.
17. What are the two methods for dealing with the deadlock problem?
The two methods for dealing with the deadlock problem are deadlock prevention, and
deadlock detection followed by deadlock recovery.
18. What is a recovery scheme?
An integral part of a database system is a recovery scheme that can restore the database
to the consistent state that existed before the failure.
19. What are the two types of errors?
The two types of errors are:
 Logical error
 System error
20. What are the storage types?
The storage types are:
 Volatile storage
 Nonvolatile storage
21. Define blocks.
The database system resides permanently on nonvolatile storage, and is partitioned into
fixed-length storage units called blocks.
22. What is meant by Physical blocks?
The input and output operations are done in block units. The blocks residing on the disk
are referred to as physical blocks.
23. What is meant by buffer blocks?
The blocks residing temporarily in main memory are referred to as buffer blocks.
24. What is meant by disk buffer?
The area of memory where blocks reside temporarily is called the disk buffer.
25. What is meant by log-based recovery?
The most widely used structure for recording database modifications is the log. The
log is a sequence of log records, recording all the update activities in the database. There are
several types of log records.
26. What are uncommitted modifications?
The immediate-modification technique allows database modifications to be output to
the database while the transaction is still in the active state. Data modifications written by
active transactions are called uncommitted modifications.
27. Define shadow paging.
Shadow paging is an alternative to log-based crash-recovery techniques. Under certain
circumstances, it may require fewer disk accesses than the log-based methods do.
28. Define page.
The database is partitioned into some number of fixed-length blocks, which are referred
to as pages.
29. Explain current page table and shadow page table.
The key idea behind the shadow paging technique is to maintain two page tables
during the life of the transaction: the current page table and the shadow page table. Both
page tables are identical when the transaction starts. The shadow page table is never
changed over the duration of the transaction. The current page table may be changed when a
transaction performs a write operation.
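The interplay of the two page tables can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: pages are Python values in a list standing in for the disk, a single transaction is active, and the commit and abort steps follow the usual textbook description rather than anything stated in this answer. The class name ShadowPagedDB is hypothetical.

```python
class ShadowPagedDB:
    """Toy model of shadow paging with one active transaction."""

    def __init__(self, pages):
        self.disk = list(pages)                     # physical pages
        self.shadow = list(range(len(self.disk)))   # shadow page table: logical -> physical
        self.current = list(self.shadow)            # identical when the transaction starts

    def read(self, logical):
        return self.disk[self.current[logical]]

    def write(self, logical, value):
        if self.current[logical] == self.shadow[logical]:
            # First write to this page: place the new version on a fresh physical page
            # and redirect only the current page table to it.
            self.disk.append(value)
            self.current[logical] = len(self.disk) - 1
        else:
            # Page was already copied by this transaction; update the copy in place.
            self.disk[self.current[logical]] = value

    def commit(self):
        # The current page table becomes the new shadow page table.
        self.shadow = list(self.current)

    def abort(self):
        # Discard the current page table; the shadow table still points at the old pages.
        self.current = list(self.shadow)


db = ShadowPagedDB(["a", "b"])
db.write(0, "x")
db.abort()
print(db.read(0))   # a
db.write(1, "y")
db.commit()
print(db.read(1))   # y
```

In this model the physical pages left behind by an abort or superseded by a commit are never reused, which is the garbage that a real system has to collect.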
30. What are the drawbacks of shadow-paging technique?
 Commit overhead
 Data fragmentation
 Garbage collection
31. Define garbage collection.
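Each time a transaction commits, the database pages containing the old version of the data
changed by the transaction become inaccessible; such pages are garbage. Garbage may also be
created as a side effect of crashes. Periodically, it is necessary to find all the garbage
pages and to add them to the list of free pages. This process is called garbage collection.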
32. Differentiate strict two phase locking protocol and rigorous two phase locking protocol.
In the strict two-phase locking protocol, all exclusive-mode locks taken by a transaction are
held until that transaction commits.
The rigorous two-phase locking protocol requires that all locks be held until the transaction
commits.
33. How are timestamps implemented?
Use the value of the system clock as the timestamp; that is, a transaction’s timestamp is
equal to the value of the clock when the transaction enters the system.
Use a logical counter that is incremented after a new timestamp has been assigned; that is,
a transaction’s timestamp is equal to the value of the counter when the transaction enters
the system.
34. What are the time stamps associated with each data item?
W-timestamp(Q) denotes the largest timestamp of any transaction that executed
WRITE(Q) successfully.
R-timestamp(Q) denotes the largest timestamp of any transaction that executed
READ(Q) successfully.
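How the two timestamps of a data item are maintained can be sketched as follows. The accept and reject rules used here are those of the basic timestamp-ordering protocol as it is commonly described; they are an assumption brought in for illustration and are not part of the answer above. The class and function names are hypothetical, and transaction timestamps are plain integers.

```python
class DataItem:
    """A data item Q with its R-timestamp and W-timestamp."""

    def __init__(self):
        self.r_ts = 0   # largest timestamp of a transaction that read Q successfully
        self.w_ts = 0   # largest timestamp of a transaction that wrote Q successfully


def read(item, ts):
    """Transaction with timestamp ts issues READ(Q); returns False if it must roll back."""
    if ts < item.w_ts:
        # Q was already overwritten by a younger transaction.
        return False
    item.r_ts = max(item.r_ts, ts)
    return True


def write(item, ts):
    """Transaction with timestamp ts issues WRITE(Q); returns False if it must roll back."""
    if ts < item.r_ts or ts < item.w_ts:
        # A younger transaction has already read or written Q.
        return False
    item.w_ts = ts
    return True


q = DataItem()
print(read(q, 5))    # True, R-timestamp(Q) becomes 5
print(write(q, 4))   # False, a younger transaction (5) has read Q
print(write(q, 6))   # True, W-timestamp(Q) becomes 6
print(read(q, 5))    # False, Q was written by a younger transaction (6)
```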
35. What benefit is provided by strict two-phase locking?
Cascading rollbacks can be avoided by a modification of the two-phase locking protocol
called strict two-phase locking. This protocol requires not only that locking be two-phase,
but also that all exclusive-mode locks taken by a transaction be held until that transaction
commits.
PART-B:
7. Explain the different forms of Serializability.
8. What are the different types of schedules that are acceptable for recoverability?
9. Discuss two-phase locking protocol and timestamp-based protocol.
10. Explain the deferred and immediate modification versions of the log-based recovery scheme.
11. Explain the four different properties of transactions that a DBMS must ensure to maintain
the integrity of the database.
12. What is concurrency control? How is it implemented in DBMS? Explain.
13. Explain the various recovery techniques during transaction in detail.
14. Write short notes on shadow paging.
15. Explain testing for Serializability with respect to concurrency control schemes. How will you
determine whether a schedule is serializable or not?
QUESTION BANK – UNIT V
YEAR/SEM: II/IV
PART – A
1. What is an index?
An index is a structure that helps to locate desired records of a relation quickly,
without examining all records.
2. Define query optimization.
Query optimization refers to the process of finding the lowest-cost method of
evaluating a given query.
3. What are called jukebox systems?
Jukebox systems contain a few drives and numerous disks that can be loaded into one
of the drives automatically.
4. What are the types of storage devices?
_ Primary storage
_ Secondary storage
_ Tertiary storage
_ Volatile storage
_ Nonvolatile storage
5. What is called remapping of bad sectors?
If the controller detects that a sector is damaged when the disk is initially formatted,
or when an attempt is made to write the sector, it can logically map the sector to a different
physical location.
6. Define access time.
Access time is the time from when a read or write request is issued to when data
transfer begins.
7. Define seek time.
The time for repositioning the arm is called the seek time, and it increases with the
distance that the arm must move.
8. Define average seek time.
The average seek time is the average of the seek times, measured over a sequence of
random requests.
9. Define rotational latency time.
The time spent waiting for the sector to be accessed to appear under the head is called
the rotational latency time.
10. Define average latency time.
The average latency time of the disk is one-half the time for a full rotation of the disk.
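The half-rotation rule turns directly into arithmetic once the rotation speed is known. The sketch below assumes the speed is given in revolutions per minute; the two speeds used are illustrative values, not figures from this question bank.

```python
def average_rotational_latency_ms(rpm):
    """Average latency in milliseconds: half the time of one full rotation."""
    full_rotation_ms = 60_000 / rpm   # 60,000 ms per minute divided by rotations per minute
    return full_rotation_ms / 2


print(average_rotational_latency_ms(15_000))          # 2.0
print(round(average_rotational_latency_ms(7_200), 2))  # 4.17
```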
11. What is meant by data-transfer rate?
The data-transfer rate is the rate at which data can be retrieved from or stored to the
disk.
12. What is meant by mean time to failure?
The mean time to failure is the amount of time that, on average, the system can be expected
to run continuously without any failure.
13. What is a block and a block number?
A block is a contiguous sequence of sectors from a single track of one platter. Each
request specifies the address on the disk to be referenced. That address is in the form of a
block number.
14. What are called journaling file systems?
File systems that support log disks are called journaling file systems.
15. What is the use of RAID?
A variety of disk-organization techniques, collectively called redundant arrays of
independent disks (RAID), are used to improve performance and reliability.
16. What is called mirroring?
The simplest approach to introducing redundancy is to duplicate every disk. This
technique is called mirroring or shadowing.
17. What is called mean time to repair?
The mean time to repair is the time it takes to replace a failed disk and to restore the
data on it.
18. What is called bit-level striping?
Data striping consists of splitting the bits of each byte across multiple disks. This is
called bit-level striping.
19. What is called block-level striping?
Block level striping stripes blocks across multiple disks. It treats the array of disks as
a large disk, and gives blocks logical numbers.
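A common way to assign logical block numbers under block-level striping is round-robin placement. The sketch below assumes an array of n disks numbered from 0, with logical block i placed on disk i mod n at physical block position i div n. The numbering scheme and the function name locate_block are assumptions for illustration.

```python
def locate_block(logical_block, num_disks):
    """Map a logical block number to (disk index, physical block index on that disk)."""
    disk = logical_block % num_disks
    physical_block = logical_block // num_disks
    return disk, physical_block


# With 4 disks, consecutive logical blocks land on consecutive disks.
for i in range(6):
    print(i, locate_block(i, 4))
```

Because consecutive logical blocks sit on different disks, a large sequential read can be served by all disks in parallel.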
20. What are the two main goals of parallelism?
_ Load-balance multiple small accesses, so that the throughput of such accesses
increases.
_ Parallelize large accesses so that the response time of large accesses is reduced.
21. What are the factors to be taken into account when choosing a RAID level?
o Monetary cost of extra disk storage requirements.
o Performance requirements in terms of number of I/O operations.
o Performance when a disk has failed.
o Performance during rebuild.
22. What is meant by software and hardware RAID systems?
RAID can be implemented with no change at the hardware level, using only software
modification. Such RAID implementations are called software RAID systems and the
systems with special hardware support are called hardware RAID systems.
23. Define hot swapping.
Hot swapping permits faulty disks to be removed and replaced by new ones without
turning the power off. Hot swapping reduces the mean time to repair.
24. What are the ways in which the variable-length records arise in database systems?
_ Storage of multiple record types in a file.
_ Record types that allow variable lengths for one or more fields.
_ Record types that allow repeating fields.
25. What is the use of a slotted-page structure and what is the information present in the
header?
The slotted-page structure is used for organizing records within a single block.
The header contains the following information.
_ The number of record entries in the header.
_ The end of free space
_ An array whose entries contain the location and size of each record.
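The header fields listed above can be illustrated with a toy sketch. The class name
SlottedPage and its methods are hypothetical, and for brevity the header is kept as
Python attributes instead of being encoded in the first bytes of the block, so the
space the header itself would occupy is not accounted for.

```python
class SlottedPage:
    """Toy slotted page: records grow from the end of the block toward the header."""

    def __init__(self, block_size=4096):
        self.data = bytearray(block_size)
        self.free_end = block_size  # header field: end of free space
        self.slots = []             # header field: (location, size) of each record

    def insert(self, record):
        """Store a record (bytes) and return its slot number."""
        offset = self.free_end - len(record)
        if offset < 0:
            raise ValueError("not enough free space in block")
        self.data[offset:self.free_end] = record
        self.free_end = offset
        self.slots.append((offset, len(record)))
        return len(self.slots) - 1

    def read(self, slot):
        """Return the record stored in the given slot."""
        offset, size = self.slots[slot]
        return bytes(self.data[offset:offset + size])


if __name__ == "__main__":
    page = SlottedPage(block_size=64)
    first = page.insert(b"alice")
    second = page.insert(b"bob")
    # The number of record entries is the length of the slot array.
    print(len(page.slots), page.free_end, page.read(first), page.read(second))
```

Because other structures refer to a record by its slot number, a record can be moved
inside the block by updating only its slot entry.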
26. What are the two types of blocks in the fixed-length representation? Define them.
• Anchor block: Contains the first record of a chain.
• Overflow block: Contains the records other than those that are the first record of a
chain.
27. What is known as heap file organization?
In the heap file organization, any record can be placed anywhere in the file where
there is space for the record. There is no ordering of records. There is a single file for each
relation.
28. What is known as sequential file organization?
In the sequential file organization, the records are stored in sequential order,
according to the value of a “search key” of each record.
29. What is hashing file organization?
In the hashing file organization, a hash function is computed on some attribute of
each record. The result of the hash function specifies in which block of the file the
record should be placed.
30. What is known as clustering file organization?
In the clustering file organization, records of several different relations are stored in
the same file.
31. What are the types of indices?
_ Ordered indices
_ Hash indices
32. What are the factors on which ordered indexing and hashing techniques are evaluated?
_ Access types
_ Access time
_ Insertion time
_ Deletion time
_ Space overhead
33. What is known as a search key?
An attribute or set of attributes used to look up records in a file is called a search key.
34. What is a primary index?
A primary index is an index whose search key also defines the sequential order of the
file.
35. What are called index-sequential files?
Files that are ordered sequentially, with a primary index on the search key, are
called index-sequential files.
36. What are the two types of indices?
_ Dense index
_ Sparse index
37. What are called multilevel indices?
Indices with two or more levels are called multilevel indices.
38. What is B-Tree?
A B-tree eliminates the redundant storage of search-key values. It allows search-key
values to appear only once.
39. What is a B+-Tree index?
A B+-Tree index takes the form of a balanced tree in which every path from the root
of the tree to a leaf of the tree is of the same length.
40. What is a hash index?
A hash index organizes the search keys, with their associated pointers, into a hash file
structure.
41. What are called as index scans?
Search algorithms that use an index are referred to as index scans.
42. What is called as external sorting?
Sorting of relations that do not fit into memory is called as external sorting.
43. What is called as recursive partitioning?
The system repeats the splitting of the input until each partition of the build input fits
in the memory. Such partitioning is called recursive partitioning.
44. What is called as an N-way merge?
The merge operation is a generalization of the two-way merge used by the standard
in-memory sort-merge algorithm. It merges N runs, so it is called an N-way merge.
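A minimal sketch of run creation followed by an N-way merge is given below. It assumes
that small in-memory lists stand in for sorted runs on disk; the helper names make_runs
and n_way_merge are illustrative. The merge uses heapq.merge from the Python standard
library, which lazily merges already sorted inputs.

```python
import heapq


def make_runs(values, run_size):
    """Split the input into chunks of run_size values and sort each chunk (one run each)."""
    return [sorted(values[i:i + run_size]) for i in range(0, len(values), run_size)]


def n_way_merge(runs):
    """Merge N sorted runs into a single sorted list."""
    return list(heapq.merge(*runs))


if __name__ == "__main__":
    runs = make_runs([9, 3, 7, 1, 8, 2, 6], run_size=3)
    print(runs)
    print(n_way_merge(runs))
```

In a real external sort each run would be read from disk one block at a time, so only
one buffer block per run needs to be in memory during the merge.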
45. What is known as fudge factor?
The number of partitions is increased by a small value called the fudge factor, which
is usually 20 percent of the number of hash partitions computed.
46. What are the sparse and dense indices?
Sparse Indices: An index record appears for only some of the search-key values.
Dense Indices: An index record appears for every search-key value in the file. In a
dense primary index, the index record contains the search-key value and a pointer to the first
data record with that search-key value.
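The difference can be sketched on a small file sorted on its search key. The sample
records, the function names and the choice of one sparse entry per three records are
assumptions made only for illustration. A sparse lookup finds the last index entry
whose key is not greater than the search key and then scans the file sequentially.

```python
import bisect

# Sample file, sorted on the search key (first field); positions are list indexes.
records = [(10, "A"), (20, "B"), (30, "C"), (40, "D"), (50, "E"), (60, "F")]


def build_dense(records):
    """One index entry per search-key value, pointing at the first matching record."""
    index = {}
    for pos, (key, _) in enumerate(records):
        index.setdefault(key, pos)
    return index


def build_sparse(records, every=3):
    """One index entry for every `every`-th record only."""
    return [(records[pos][0], pos) for pos in range(0, len(records), every)]


def sparse_lookup(sparse, records, key):
    """Find the largest indexed key <= key, then scan forward in the file."""
    keys = [k for k, _ in sparse]
    i = bisect.bisect_right(keys, key) - 1
    if i < 0:
        return None
    pos = sparse[i][1]
    while pos < len(records) and records[pos][0] <= key:
        if records[pos][0] == key:
            return records[pos]
        pos += 1
    return None


if __name__ == "__main__":
    print(build_dense(records)[30])
    print(sparse_lookup(build_sparse(records), records, 50))
```

The sparse index is smaller, but each lookup pays for a short sequential scan, and it
only works because the file is ordered on the search key.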
47. Compare sequential access devices versus random access devices with an example.
Tape storage is referred to as sequential access storage. Disk storage is referred to as
direct access storage.
48. What can be done to reduce the occurrences of bucket overflows in a hash file
organization?
To reduce the occurrence of overflows, we can
 Choose the hash function more carefully and make better estimation of the relation
size.
 If the estimated size of the relation is nr and the number of records per block is fr,
allocate (nr/fr)*(1+d) buckets instead of (nr/fr) buckets. Here d is a fudge factor,
typically around 0.2. Some space is wasted. But the benefit is that some of the skew is
handled and the probability of overflow is reduced.
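The bucket-count formula above can be evaluated with a short sketch. The function name
bucket_count is hypothetical, and rounding the result up to a whole number of buckets
is an assumption added here, since the formula itself may give a fraction.

```python
import math


def bucket_count(n_r, f_r, d=0.2):
    """Number of buckets to allocate: (n_r / f_r) * (1 + d), rounded up.

    n_r: estimated number of records in the relation
    f_r: number of records that fit in one block
    d:   fudge factor (typically around 0.2)
    """
    if n_r < 0 or f_r <= 0 or d < 0:
        raise ValueError("n_r and d must be non-negative and f_r must be positive")
    return math.ceil((n_r / f_r) * (1 + d))


if __name__ == "__main__":
    # Compare the allocation with and without the fudge factor.
    print(bucket_count(1005, 20, d=0.0), bucket_count(1005, 20, d=0.2))
```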
49. Define volatile storage.
Volatile storage loses its contents when the power to the device is removed.
50. Define buffer replacement strategy
When there is no room left in the buffer, a block must be removed from the buffer
before a new one can be read in. Most operating systems use a least recently used (LRU)
scheme, in which the block that was referenced least recently is written back to disk and is
removed from the buffer. This simple approach can be improved on for database
applications.
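The LRU scheme described above can be sketched as follows. The class LRUBuffer and the
read_block callable are hypothetical names; the sketch only records which block is
evicted, whereas a real buffer manager would write a modified block back to disk at
that point.

```python
from collections import OrderedDict


class LRUBuffer:
    """Toy buffer pool that evicts the least recently used block when full."""

    def __init__(self, capacity, read_block):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.read_block = read_block  # hypothetical callable: block number -> contents
        self.blocks = OrderedDict()   # ordered from least to most recently used
        self.evicted = []             # block numbers removed so far, oldest first

    def get(self, block_no):
        """Return the contents of a block, reading it in if it is not buffered."""
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)  # mark as most recently used
            return self.blocks[block_no]
        if len(self.blocks) >= self.capacity:
            victim, _ = self.blocks.popitem(last=False)  # least recently used block
            self.evicted.append(victim)
        self.blocks[block_no] = self.read_block(block_no)
        return self.blocks[block_no]


if __name__ == "__main__":
    buf = LRUBuffer(2, lambda n: f"block-{n}")
    for n in (1, 2, 1, 3):
        buf.get(n)
    print(buf.evicted, list(buf.blocks))
```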
51. What are pinned blocks?
For the database system to be able to recover from crashes, it is necessary to restrict
those times when a block may be written back to disk. For instance, most recovery systems
require that a block should not be written to disk while an update on the block is in progress.
A block that is not allowed to be written back to disk is said to be pinned. Although many
operating systems do not support pinned blocks, such a feature is essential for a database
system that is resilient to crashes.
52. Define forced output of blocks.
There are situations in which it is necessary to write back the block to disk, even
though the buffer space that it occupies is not needed. This write is called the forced output
of a block.
53. List the disadvantages of Byte-String representation.
It is not easy to reuse space occupied formerly by a deleted record. Although
techniques exist to manage insertion and deletion, they lead to a large number of small
fragments of disk storage that are wasted.
There is no space, in general, for records to grow longer. If a variable-length record becomes
longer, it must be moved—movement is costly if pointers to the record are stored elsewhere
in the database (e.g., in indices, or in other records), since the pointers must be located and
updated.
54. What is meant by two-way and multi-way joins?
A two-way join is a join on two files. Joins involving more than two files are called
multi-way joins.
55. Define hash join.
In the hash join algorithm, a hash function h is used to partition tuples of both
relations. The basic idea is to partition the tuples of each of the relations into sets that have the
same hash value on the join attributes.
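A minimal in-memory sketch of this idea follows. Tuples are plain Python tuples, the
join attribute is given by its position in the tuple, and Python's built-in hash
stands in for the hash function h; all names are illustrative assumptions. Tuples can
only join with tuples in the partition that has the same number, so each pair of
partitions is joined separately.

```python
from collections import defaultdict


def partition(tuples, key, n):
    """Split tuples into n partitions by hashing the join attribute at position key."""
    parts = defaultdict(list)
    for t in tuples:
        parts[hash(t[key]) % n].append(t)
    return parts


def hash_join(build, probe, build_key, probe_key, n=4):
    """Join two relations on equality of build[build_key] and probe[probe_key]."""
    result = []
    build_parts = partition(build, build_key, n)
    probe_parts = partition(probe, probe_key, n)
    for i in range(n):
        # Build an in-memory hash table on partition i of the build relation.
        table = defaultdict(list)
        for b in build_parts.get(i, []):
            table[b[build_key]].append(b)
        # Probe it with the matching partition of the other relation.
        for p in probe_parts.get(i, []):
            for b in table.get(p[probe_key], []):
                result.append((b, p))
    return result


if __name__ == "__main__":
    depts = [(1, "CSE"), (2, "ECE")]
    students = [("Ann", 1), ("Bob", 2), ("Cy", 1), ("Di", 3)]
    for pair in sorted(hash_join(depts, students, build_key=0, probe_key=1)):
        print(pair)
```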
56. Define hybrid hash join
The hybrid hash-join algorithm is a variation of partition hash join, where the joining
phase for one of the partitions is included in the partitioning phase.
57. What are the goals of database tuning?
 To make applications run faster
 To lower the response time of queries/transactions.
 To improve the overall throughput of transactions.
58. What are the steps involved in query processing?
The basic steps are:
 Parsing and translation
 Optimization
 Evaluation
59. What is called an evaluation primitive?
A relational algebra operation annotated with instructions on how to evaluate is called
an evaluation primitive.
60. What is called a query evaluation plan?
A sequence of primitive operations that can be used to evaluate a query is a query
evaluation plan or a query execution plan.
61. What is called a query-execution engine?
The query execution engine takes a query evaluation plan, executes that plan, and
returns the answers to the query.
PART – B
2. Explain the various stages involved in Query processing.
3. What is the need for indexing and hashing? How will they improve the access efficiency in a
database? Explain.
4. Write notes on RAID levels.
5. Explain in detail about the hash and hybrid hash join operation.
6. Explain why the allocation of records to blocks affects database system performance
significantly.
7. Describe the structure of B+ tree and give the algorithm for search in the B+ tree with
example.
8. Give the comparison between ordered indexing and hashing.
9. Explain the security provided in commercial query languages.
10. How would you estimate the cost of the query?
11. Explain how RAID system improves performance and reliability?
12. Explain static and dynamic hashing in detail.
13. Describe the different types of file organization. Explain using a sketch of each of them with
their advantages and disadvantages.
14. Explain the index schemas used in detail.
15. How does a DBMS represent a relational query evaluation plan?
16. Construct a B+ tree for the following set of key values:
(2,3,6,9,12,18,19,23,29,31,40,67,90,110)
Assume that the tree is initially empty and values are added in ascending order.
Note: The number of pointers that will fit in one node is three. Illustrate the tree construction
process step by step.
17. Explain the distinction between closed and open hashing. Discuss the relative merits of each
technique in database applications.
18. What are the causes of bucket overflow in a hash file organization? What can be done to
reduce the occurrence of bucket overflows? Discuss.
19. Discuss physical storage media in detail.
20. Write brief notes on magnetic disks.
21. Define and differentiate fixed and variable length records on file organization.
22. Explain organization of records in files.
23. Explain about database tuning.
24. Write brief notes on external-merge sort.
25. Explain the methods used to implement a join operation.
26. Explain in detail the search algorithms used to implement a select operation.
27. How will you choose the RAID level? Discuss.
28. Explain storage access in detail.
29. Differentiate B-Tree with B+ tree.
30. Explain sorting of relations in detail.
31. Write in detail about the evaluation of expressions.