1. What is database? A database is a logically coherent collection of data with some inherent meaning, representing some aspect of real world and which is designed, built and populated with data for a specific purpose. 2. What is DBMS? It is a collection of programs that enables user to create and maintain a database. In other words it is general-purpose software that provides the users with the processes of defining, constructing and manipulating the database for various applications. 3. What are the advantages of DBMS? Redundancy is controlled. Unauthorised access is restricted. Providing multiple user interfaces. Enforcing integrity constraints. Providing backup and recovery. 4. What are the disadvantage in File Processing System? Data redundancy and inconsistency. Difficult in accessing data. Data isolation. Data integrity. Concurrent access is not possible. Security Problems. 5. Describe the three levels of data abstraction? The are three levels of abstraction: Physical level: The lowest level of abstraction describes how data are stored. Logical level: The next higher level of abstraction, describes what data are stored in database and what relationship among those data. View level: The highest level of abstraction describes only part of entire database. 6. Define the "integrity rules"? There are two Integrity rules. Entity Integrity: States that "Primary key cannot have NULL value" Referential Integrity: States that "Foreign Key can be either a NULL value or should be Primary Key value of other relation. 7. What is Data Independence? Data independence means that "the application is independent of the storage structure and access strategy of data". In other words, The ability to modify the schema definition in one level should not affect the schema definition in the next higher level. Two types of Data Independence: Physical Data Independence: Modification in physical level should not affect the logical level. Logical Data Independence: Modification in logical level should affect the view level. 8. What is Data Model? A collection of conceptual tools for describing data, data relationships data semantics and constraints. 9. What is E-R model? This data model is based on real world that consists of basic objects called entities and of relationship among these objects. Entities are described in a database by a set of attributes. 10. What is Object Oriented model? This model is based on collection of objects. An object contains values stored in instance variables with in the object. An object also contains bodies of code that operate on the object. These bodies of code are called methods. Objects that contain same types of values and the same methods are grouped together into classes. 11. What is an Entity? It is a 'thing' in the real world with an independent existence. 12. What is an Entity type? It is a collection (set) of entities that have same attributes. 13. What is an Entity set? It is a collection of all entities of particular entity type in the database. 14. What is Weak Entity set? An entity set may not have sufficient attributes to form a primary key, and its primary key compromises of its partial key and primary key of its parent entity, then it is said to be Weak Entity set. 15. What is an attribute? It is a particular property, which describes the entity. 16. What is a Relation Schema and a Relation? A relation Schema denoted by R(A1, A2, ..., An) is made up of the relation name R and the list of attributes Ai that it contains. A relation is defined as a set of tuples. Let r be the relation which contains set tuples (t1, t2, t3, ..., tn). Each tuple is an ordered list of n-values t=(v1,v2, ..., vn). 17. What is Relationship? It is an association among two or more entities. 18. What is Relationship type? Relationship type defines a set of associations or a relationship set among a given set of entity types. 19. What is DDL (Data Definition Language)? A data base schema is specifies by a set of definitions expressed by a special language called DDL. 20. What is DML (Data Manipulation Language)? This language that enable user to access or manipulate data as organised by appropriate data model. Procedural DML or Low level: DML requires a user to specify what data are needed and how to get those data. Non-Procedural DML or High level: DML requires a user to specify what data are needed without specifying how to get those data. 21. What is Relational Algebra? It is procedural query language. It consists of a set of operations that take one or two relations as input and produce a new relation. 22. What is Relational Calculus? It is an applied predicate calculus specifically tailored for relational databases proposed by E.F. Codd. E.g. of languages based on it are DSL ALPHA, QUEL. 23. What is normalization? It is a process of analysing the given relation schemas based on their Functional Dependencies (FDs) and primary key to achieve the properties (1).Minimizing redundancy, (2). Minimizing insertion, deletion and update anomalies. 24. What is Functional Dependency? A Functional dependency is denoted by X Y between two sets of attributes X and Y that are subsets of R specifies a constraint on the possible tuple that can form a relation state r of R. The constraint is for any two tuples t1 and t2 in r if t1[X] = t2[X] then they have t1[Y] = t2[Y]. This means the value of X component of a tuple uniquely determines the value of component Y. 25. What is Lossless join property? It guarantees that the spurious tuple generation does not occur with respect to relation schemas after decomposition. 26. What is 1 NF (Normal Form)? The domain of attribute must include only atomic (simple, indivisible) values. 27. What is Fully Functional dependency? It is based on concept of full functional dependency. A functional dependency X Y is full functional dependency if removal of any attribute A from X means that the dependency does not hold any more. 28. What is 2NF? A relation schema R is in 2NF if it is in 1NF and every non-prime attribute A in R is fully functionally dependent on primary key. 29. What is 3NF? A relation schema R is in 3NF if it is in 2NF and for every FD X A either of the following is true X is a Super-key of R. A is a prime attribute of R. In other words, if every non prime attribute is non-transitively dependent on primary key. 30. What is BCNF (Boyce-Codd Normal Form)? A relation schema R is in BCNF if it is in 3NF and satisfies an additional constraint that for every FD X A, X must be a candidate key. 31. What is 4NF? A relation schema R is said to be in 4NF if for every Multivalued dependency X Y that holds over R, one of following is true. 1.) X is subset or equal to (or) XY = R. 2.) X is a super key. 32. What is 5NF? A Relation schema R is said to be 5NF if for every join dependency {R1, R2, ..., Rn} that holds R, one the following is true 1.) Ri = R for some i. 2.) The join dependency is implied by the set of FD, over R in which the left side is key of R. 33. What are partial, alternate, artificial, compound and natural key? Partial Key: It is a set of attributes that can uniquely identify weak entities and that are related to same owner entity. It is sometime called as Discriminator. Alternate Key: All Candidate Keys excluding the Primary Key are known as Alternate Keys. Artificial Key: If no obvious key, either stand alone or compound is available, then the last resort is to simply create a key, by assigning a unique number to each record or occurrence. Then this is known as developing an artificial key. Compound Key: If no single data element uniquely identifies occurrences within a construct, then combining multiple elements to create a unique identifier for the construct is known as creating a compound key. Natural Key: When one of the data elements stored within a construct is utilized as the primary key, then it is called the natural key. 34. What is meant by query optimization? The phase that identifies an efficient execution plan for evaluating a query that has the least estimated cost is referred to as query optimization. 35. What is durability in DBMS? Once the DBMS informs the user that a transaction has successfully completed, its effects should persist even if the system crashes before all its changes are reflected on disk. This property is called durability. 36. What do you mean by atomicity and aggregation? Atomicity: Either all actions are carried out or none are. Users should not have to worry about the effect of incomplete transactions. DBMS ensures this by undoing the actions of incomplete transactions. Aggregation: A concept which is used to model a relationship between a collection of entities and relationships. It is used when we need to express a relationship among relationships. 37. What is a query? A query with respect to DBMS relates to user commands that are used to interact with a data base. The query language can be classified into data definition language and data manipulation language. 38. What do you mean by Correlated subquery? Subqueries, or nested queries, are used to bring back a set of rows to be used by the parent query. Depending on how the subquery is written, it can be executed once for the parent query or it can be executed once for each row returned by the parent query. If the subquery is executed for each row of the parent, this is called a correlated subquery. A correlated subquery can be easily identified if it contains any references to the parent subquery columns in its WHERE clause. Columns from the subquery cannot be referenced anywhere else in the parent query. The following example demonstrates a non-correlated subquery. 39. Define SQL and state the differences between SQL and other conventional programming Languages. SQL is a nonprocedural language that is designed specifically for data access operations on normalized relational database structures. The primary difference between SQL and other conventional programming languages is that SQL statements specify what data operations should be performed rather than how to perform them. 40. What is database Trigger? A database trigger is a PL/SQL block that can defined to automatically execute for insert, update, and delete statements against a table. The trigger can e defined to execute once for the entire statement or once for every row that is inserted, updated, or deleted. For any one table, there are twelve events for which you can define database triggers. A database trigger can call database procedures that are also written in PL/SQL. 41. What are stored-procedures? And what are the advantages of using them? Stored procedures are database objects that perform a user defined operation. A stored procedure can have a set of compound SQL statements. A stored procedure executes the SQL commands and returns the result to the client. Stored procedures are used to reduce network traffic. 42. what is a database? a dbms is a complex software system that is used to manage, store and manipulate data and metadata used to describe the data. 43. what is a key?what are different keys in database? a key is nothing but a attribute or group of attributes. they are used to perform some specific operation depending on their operation. the keys are classified into primary key, secondary key, alternative key super key, candidate key, compoundor concatinated or composite key. 44. what is a primary key? primary key: an attribute to identify a record uniquely is considered to be primary key. for eg in the student table student_no is the primary key because it can be used to identify unique record or unique student. 45. what is a secondary key? an attribute used to identify a group of records satisfying a given condition is said to be a secondary key. in the employee table designation is a secondary key because more than one employee can have the same designation 46. what is a candidate key? register no usually alloted in the exams is also unique for each student in that case for identifying a student uniquely either student_no or register_no can beused. here two different candidates are contesting for primary key post. any of them can be selected as primary key. 47. what is an alternate key? if any one of the candidate keys among the different candidate keys available is selected as primary key then remaining keys are called alternate key. 48. what is a super key? with primary key if any other attribute is added then that combination is called super key in other words, primary key is the minimum possible super key. in the student table student_no+student_name is one the super key. 49. what is a composite key? if the primary key is combination of more than one key then it is calles thecomposite key.in the table called marks student_no+subject is the composite key. 50. what is a relation? a relation consists of a homogeneous set of tuples. 51. what is a table? it is the representation of a relation having records as rows and attributes ascolumns. 52. what is an attribute? an object or entity is characterised by its properties or attributes. inrelational database systems attributes corresponds to fields. 53. what is a domain? the set of allowable value for the attribute is the domain of the attribute. 54. what is a tuple? tuples are the members of a relation.an entity type having attributes can berepresented by set of these attributes called tuple. 55. what are operations in relational algebra? union:the term of the relation as performed by combining the tuples from onerelation with those a second relation to produce a third relation, duplicate tuples are eliminated, the the relation must be union compatable. difference:the difference of two relations is a third relation having tuples that occur in the first relation but not in the second relation. intersection:the intersection operation selects the common tuples from the two relations. cartesian product:the cartesian product of two relations is the concatination of tuples belongingto the two relations.a new resultant scheme is created consisting ofconcatination of all possible combination of tuples. 56. what are different dbms facilities? 1)the data definition facility or data definition language(DDL) 2)the data manipulation facility or data manipulation language(DML) 3)the data control facility(DCL) 57. what is a data directory or data dictionary? the result of compilation of DDL statements is a set of tables which are stored in a special file called data dictionary or data directory. 58. what is a SQL? structered query language(sql) originated in 1974 at IBM.SQL was the data definition and manipulation language. 59. what are the features of SQL? portability, client srever architecture, dynamic data definition, multiple views of data, complete data base language, interactive, high level structure and SQL standards. 60. What is Union, minus and Interact commands? MINUS operator is used to return rows from the first query but not from the second query. INTERSECT operator is used to return rows returned by both the queries. 61. Delete vs. Truncate table. Delete logs the deletion of each row whereas Truncate doesn't log deleted rows in the transaction log. This makes truncate command is bit faster than Delete command. 62. Define constraints. Constraints enforce integrity of the database. Constraints can be of following types Not Null, Check, Unique, Primary key, Foreign key 63. Define stored procedure. Stored procedure is a set of pre-compiled SQL statements, executed when it is called in the program. 64. Define Trigger. Triggers are similar to stored procedure except it is executed automatically when any operations are occurred on the table. 65. What is domain value or value set of an attribute? It is the set of values that may be assigned to that attribute for each individual entities . LAQ What is a database? Describe the advantages and disadvantages of using of DBMS Ans: Database - A database is a collection of related data and/or information stored so that it is available to many users for different purposes. Advantages Of DBMS Centralized Management and Control - One of the main advantages of using a database system is that the organization can exert, via the DBA, centralized management and control over the data. Reduction of Redundancies and Inconsistencies - Centralized control avoids unnecessary duplication of data and effectively reduces the total amount of data storage required. Removing redundancy eliminates inconsistencies. Data Sharing - A database allows the sharing of data under its control by any number of application programs or users. Data Integrity - Data integrity means that the data contained in the database is both accurate and consistent. Centralized control can also ensure that adequate checks are incorporated in the DBMS to provide data integrity. Data Security - Data is of vital importance to an organization and may be confidential. Such confidential data must not be accessed by unauthorized persons. The DBA who has the ultimate responsibility for the data in the DBMS can ensure that proper access procedures are followed. Different levels of security could be implemented for various types of data and operations. Data Independence - Data independence is the capacity to change the schema at one level of a database system without having to change the schema at the next level. It is usually considered from two points of view: physical data independence and logical data independence. Physical data independence is the capacity to change the internal schema without having to change conceptual schema. Logical data independence is the capacity to change the conceptual schema without having to change external schemas or application programs. Providing Storage Structures for Efficient Query Processing - Database systems provide capabilities for efficiently executing queries and updates. Auxiliary files called indexes are used for this purpose. Backup and Recovery - These facilities are provided to recover databases from hardware and/or software failures. Disadvantages Of DBMS Cost of Software/Hardware and Migration - A significant disadvantage of the DBMS system is cost. Reduced Response and Throughput - The processing overhead introduced by the DBMS to implement security, integrity, and sharing of the data causes a degradation of the response and throughput times. Problem with Centralization - Centralization also means that the data is accessible from a single source namely the database. This increases the potential of security breaches and disruption of the operation of the organization because of downtimes and failures. Explain five duties of Database Administrator. Ans: 1. DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. 2. Mappings between the internal and the conceptual levels, as well as between the conceptual and external levels, are also defined by the DBA. 3. DBA ensures that appropriate measures are in place to maintain the integrity of the database and that the database is not accessible to unauthorized users. 4. DBA is responsible for granting permission to the users of the database and stores the profile of each user in the database. 5. DBA is responsible for defining procedures to recover the database from failures with minimal loss of data. Explain Various Normalization Forms? The Normal Forms Are As Follows… First normal form (1NF) sets the basic rules for an organized database: Eliminate duplicative columns from the same database table. Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key). Second normal form (2NF) further Handles the concept of removing duplicative data: Meet all the requirements of the first normal form. Remove subsets of data that apply to multiple rows of a table and place them in separate tables. Create relationships between these new tables and their predecessors through the use of foreign keys. Third normal form (3NF) goes one large step further: Meet all the requirements of the second normal form. Remove columns that are not dependent upon the primary key. Finally, fourth normal form (4NF) has one additional requirement: Meet all the requirements of the third normal form. A relation is in 4NF if it has no multi-valued dependencies Describe the evolution of DBMS First general purpose DBMS was designed by Charles Bachman at general purpose Electric in early 1960’s which was called as Integrated Data Store This Ids formed the basis formed the basis for the network model which was standardized by the conference on data System languages. (CODASYL). In late 1960’s IBM developed the information management system (IMS) DBMS This IMS formed an alternative data representation framework called hierarchical data model. A system called SABRE for airline reservations was jointly developed by American Airlines and IBM Now days the same system is being used by web-based travel services called Travelocity. In 1970’s Edgar codd proposed a new data representation frame work called relational data model which became academic disciplines and its popularity changed to commercial landscape. Then in 1980’s SQL query language for relational databases was developed as apart of IBM’s system and project The current standards of SQL were adopted by ANSI and ISO. In 1990’s advances were made in the area’s of databases system and also researches were made on query languages Finally DBMS has gained lot of importance as more and more data is brought online and made ever accessible through computer networking. Compare file system and DBMS A Database Management System (DMS) is a combination of computer software, hardware, and information designed to electronically manipulate data via computer processing. Two types of database management systems are DBMS’s and FMS’s. In simple terms, a File Management System (FMS) is a Database Management System that allows access to single files or tables at a time. FMS’s accommodate flat files that have no relation to other files. The FMS was the predecessor for the Database Management System (DBMS), which allows access to multiple files or tables at a time (see Figure 1 below). File Management Systems Advantages Simpler to use Less expensive· Fits the needs of many small businesses and home users Popular FMS’s are packaged along with the operating systems of personal computers (i.e. Microsoft Card file and Microsoft Works) Good for database solutions for hand held devices such as Palm Pilot Disadvantages Typically does not support multi-user access Limited to smaller databases Limited functionality (i.e. no support for complicated transactions, recovery, etc.) Decentralization of data Redundancy and Integrity issues The goals of a File Management System can be summarized as follows (Calleri, 2001): Data Management. An FMS should provide data management services to the application. Generality with respect to storage devices. The FMS data abstractions and access methods should remain unchanged irrespective of the devices involved in data storage. Validity. An FMS should guarantee that at any given moment the stored data reflect the operations performed on them. Protection. Illegal or potentially dangerous operations on the data should be controlled by the FMS. Concurrency. In multiprogramming systems, concurrent access to the data should be allowed with minimal differences. Performance. Compromise data access speed and data transfer rate with functionality. From the point of view of an end user (or application) an FMS typically provides the following functionalities File creation, modification and deletion. Ownership of files and access control on the basis of ownership permissions. Facilities to structure data within files (predefined record formats, etc). Facilities for maintaining data redundancies against technical failure (back-ups, disk mirroring, etc.). Logical identification and structuring of the data, via file names and hierarchical directory structures. Database Management Systems Database Management Systems provide the following advantages and disadvantages: Advantages Disadvantages Greater flexibility Difficult to learn Good for larger databases Packaged separately from the operating system (i.e. Oracle, Microsoft Access, Lotus/IBM Approach, Borland Paradox, Claris FileMaker Pro) Greater processing power Slower processing speeds Fits the needs of many medium to large-sized organizations Requires skilled administrators Storage for all relevant data Expensive Provides user views relevant to tasks performed Ensures data integrity by managing transactions (ACID test = atomicity, consistency, isolation, durability) Supports simultaneous access Enforces design criteria in relation to data format and structure Provides backup and recovery controls Advanced security The goals of a Database Management System can be summarized as follows (Connelly, Begg, and Strachan, 1999, pps. 54 – 60): Data storage, retrieval, and update (while hiding the internal physical implementation details) A user-accessible catalog Transaction support Concurrency control services (multi-user update functionality) Recovery services (damaged database must be returned to a consistent state) Authorization services (security) Support for data communication Integrity services (i.e. constraints) Services to promote data independence Utility services (i.e. importing, monitoring, performance, record deletion, etc.) The components to facilitate the goals of a DBMS may include the following: Query processor Data Manipulation Language preprocessor Database manager (software components to include authorization control, command processor, integrity checker, query optimizer, transaction manager, scheduler, recovery manager, and buffer manager) Data Definition Language compiler File manager Catalog manager Explain the level of abstraction in DBMS Levels of abstraction: Data is described in three levels of abstraction as shown in the figure below. They The conceptual Physical External Payroll View 1 Billing View 2 Record External Schemas View 3 Conceptual schema Physical schema Disk External and conceptual schema are defined by data definition language.(DDL) SQL commands are used to describe physical schema. The info of these schemas is stored in system catalogs. Conceptual schema: Hides details. – In the relational model, the conceptual schema presents data as a set of tables. • DBMS maps from conceptual to physical schema automatically. • Physical schema can be changed without changing application: – DBMS would change mapping from conceptual to physical transparently – This property is referred to as physical data independence Physical Schema Physical schema describes details of how data is stored: tracks, cylinders, indices etc. Early applications worked at this level – explicitly dealt with details. Problem: Routines were hard-coded to deal with physical representation. Changes to data structure difficult to make. Application code becomes complex since it must deal with details. Rapid implementation of new features impossible. External Schema In the relational model, the external schema also presents data as a set of relations. An external schema specifies a view of the data in terms of the conceptual level. It is tailored to the needs of a particular category of users. Portions of stored data should not be seen by some users. Students should not see their files in full. Faculty should not see billing data. Information that can be derived from stored data might be viewed as if it were stored. Application is written in terms of an external schema. A view is computed when accessed (not stored). Different external schemas can be provided to different categories of users. Translation from external to conceptual done automatically by DBMS at run time. Conceptual schema can be changed without changing application: Mapping from external to conceptual must be changed. Referred to as conceptual data independence. What is Data Modeling? Data modeling is the act of exploring data-oriented structures. Like other modeling artifacts data models can be used for a variety of purposes, from high-level conceptual models to physical data models. From the point of view of an object-oriented developer data modeling is conceptually similar to class modeling. With data modeling you identify entity types whereas with class modeling you identify classes. Data attributes are assigned to entity types just as you would assign attributes and operations to classes. There are associations between entities, similar to the associations between classes – relationships, inheritance, composition, and aggregation are all applicable concepts in data modeling. Traditional data modeling is different from class modeling because it focuses solely on data – class models allow you to explore both the behavior and data aspects of your domain, with a data model you can only explore data issues. Because of this focus data modelers have a tendency to be much better at getting the data “right” than object modelers. However, some people will model database methods (stored procedures, stored functions, and triggers) when they are physical data modeling. It depends on the situation of course, but I personally think that this is a good idea and promote the concept in my UML data modeling profile (more on this later). Although the focus of this article is data modeling, there are often alternatives to data-oriented artifacts (never forget Agile Modeling’s Multiple Models principle). For example, when it comes to conceptual modeling ORM diagrams aren’t your only option – In addition to LDMs it is quite common for people to create UML class diagrams and even Class Responsibility Collaborator (CRC) cards instead. In fact, my experience is that CRC cards are superior to ORM diagrams because it is very easy to get project stakeholders actively involved in the creation of the model. Instead of a traditional, analyst-led drawing session you can instead facilitate stakeholders through the creation of CRC cards. A database model is a theory or specification describing how a database is structured and used. Several such models have been suggested.