Study Material: Relational Database Management System

Class: II BCA 'A' & 'B'
Subject: Relational Database Management System
Unit: I
Semester: IV
Staff: Malar.N

Unit-1:
Introduction: Purpose of the database system, view of the data, data models, database languages, transaction management, storage management, database administrator, database users, system structure.
Entity relationship model: Basic concepts, keys, entity relationship diagram, weak entity sets, E-R features (specialization, generalization).
Relational model: Structure of relational databases, relational algebra, views.

INTRODUCTION:
A database management system (DBMS) consists of a collection of interrelated data and a set of programs to access those data. The primary aim of a DBMS is to provide an environment that is both convenient and efficient to use in retrieving and storing database information.

Purpose of the Database System:
Previously, organizations used conventional file systems, which were supported by all operating systems. Permanent records are stored in various files, and different application programs are written to extract records from, and add records to, the appropriate files.

Disadvantages of the conventional file system:
1. Data redundancy and inconsistency
2. Difficulty in accessing the data
3. Data isolation
4. Integrity problems
5. Atomicity problems
6. Concurrent access anomalies
7. Security problems

Data redundancy and inconsistency:
Since the files and application programs are written by different programmers, the files may have different formats and the programs may be written in different programming languages. The same data may then be duplicated in several files. For example, the address and telephone number of a particular customer may be stored in two different files, which wastes storage space. It also leads to data inconsistency: a change made to the data in one file may not appear in the other files. For example, a changed customer address in one file may not be reflected in the other file.
Difficulty in accessing the data:
Suppose a new kind of request arises for which the current system has no application program. There are two options:
1. Develop a new program to satisfy the user's need
2. Extract the data manually
Suppose a bank officer needs the list of customers who reside in Coimbatore. He has two choices:
1. Get the list of all customers and pick out the required ones by hand
2. Have a new piece of code written
Both choices are inefficient. Even if new code is written, the manager may later need the list of customers who have a balance of Rs 20,000, and then yet another piece of code has to be written. Conventional file processing does not allow needed data to be retrieved in a convenient and efficient manner.

Data isolation:
Since the data are scattered in various files, and the files may be in different formats, it is difficult to write new application programs to retrieve the appropriate data.

Integrity problems:
The data values stored in a database must satisfy certain types of consistency constraints. Example: a bank balance should never fall below Rs 500. In a file system these constraints are enforced by code inside the application programs, so when new constraints are added it is difficult to change the programs to enforce them. It is even more difficult when a constraint involves more than one file.

Atomicity problems:
A computer system, like any other system, is subject to failure. If a failure occurs, the data should be restored to the consistent state that existed prior to the failure. A funds transfer must be atomic, i.e. it must happen in its entirety or not at all. It is difficult to ensure this in a file-processing system. Consider transferring Rs 20,000 from account A to account B: if a failure occurs in the middle, the amount might be debited from A's account but not credited to B's account.

Concurrent access anomalies:
Many systems allow multiple users to update the data simultaneously. Interaction of concurrent updates may result in inconsistent data.
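A minimal sketch of how interleaved updates can leave a wrong balance. The interleaving is simulated by hand rather than with real threads, and the starting balance of Rs 500 with withdrawals of Rs 50 and Rs 100 are illustrative values.

```python
# Lost update, simulated step by step (no real concurrency is used here).
balance = 500  # shared account balance in rupees

# Both programs read the balance before either one writes it back.
read_by_program_1 = balance
read_by_program_2 = balance

# Each program subtracts its own withdrawal from the stale value it read.
balance = read_by_program_1 - 50    # program 1 writes 450
balance = read_by_program_2 - 100   # program 2 overwrites it with 400

interleaved_result = balance

# Running the two withdrawals one after the other gives the correct balance.
serial_result = 500 - 50 - 100

print("interleaved:", interleaved_result, "serial:", serial_result)
```

The update made by program 1 is lost because program 2 writes last. Had program 1 written last, the stored balance would have been 450, which is equally wrong.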
To allow multiple accesses, the system has to let programs run concurrently. If two programs run concurrently, both may read the same value before either writes its result back, which leaves the data in an inconsistent state. To guard against this possibility, the system must maintain some form of supervision. Consider bank account A, which contains Rs 500, and suppose two customers withdraw Rs 50 and Rs 100 simultaneously. If the two programs execute at the same time, each reads the old balance of Rs 500, and the balance finally stored might be Rs 400 or Rs 450, depending on which program writes last. The actual balance should be Rs 350. So a file-processing system cannot safely allow concurrent updating of bank balances.

Security problems:
Not every user of the database system should be able to access all the data. So the database system should be able to manage the authorization of the users.

View of the Data:
A DBMS provides an abstract view of the data. It hides certain details of how the data are stored and maintained.

Data abstraction:
A database uses complex data structures to represent data so that the data can be retrieved efficiently. To simplify the users' interaction with the system, developers hide this complexity from users through several levels of abstraction.

[Figure: the three levels of data abstraction. The view level (View 1, View 2, ..., View n) is at the top, the logical level is below it, and the physical level is at the bottom.]

There are three levels of abstraction:
1. Physical level
2. Logical level
3. View level

Physical level: This is the lowest level of abstraction and it describes "how" the data are actually stored. Complex low-level data structures are described in detail. At this level a record is described as a block of consecutive storage locations.

Logical level: The next higher level of abstraction describes "what" data are stored in the database and what relationships exist among those data. Implementing these simple logical structures may involve complex physical-level structures, but the user of the logical level does not need to be aware of this complexity.
This is handled by the DBA (database administrator). At the logical level each record is described by a type definition, together with the interrelationships among the record types.

View level: This is the highest level of abstraction and it describes only part of the entire database, since many users of the database are not concerned with all the information. The system may provide many views for the same database. A view hides the details of data types and also provides a security mechanism.

Instances and Schemas:
Databases change over time as information is inserted and deleted. The collection of information stored in the database at a particular moment is called an instance of the database. The overall design of the database is called the schema. There are three different types of schema:
1. Logical schema
2. Physical schema
3. Sub schema
In general, a database system supports one physical schema, one logical schema, and several sub schemas.

Analogous example:
Schema -> a programming language type definition.
Instance -> the value held by a variable of that type at a given moment.
Eg: class Abc {}; is a type definition, and Abc a; declares a variable of that type.

Data independence:
The ability to modify a schema definition at one level without affecting a schema definition at the next higher level is called data independence. There are two levels of data independence.
1. Physical data independence: This is the ability to modify the physical schema without causing application programs to be rewritten.
2. Logical data independence: This is the ability to modify the logical schema without causing application programs to be rewritten. Eg: adding one more field.
Logical data independence is harder to achieve than physical data independence, because application programs depend heavily on the logical structure of the data they access.

Data Models:
The underlying structure of a database is the data model, which is a collection of conceptual tools for describing data, data relationships, data semantics and consistency constraints. The different data models are
1.
Object-based logical models
2. Record-based logical models
3. Physical models

Object-based logical models:
There are many different object-based models, and more are likely to come. Several of the more widely known models are
1. Entity relationship model
2. Object oriented model
3. Semantic data model
4. Functional data model

1. Entity relationship model:
It is a collection of basic objects called entities, and of relationships among these objects. An entity is a thing or object in the real world that is distinguishable from other objects. For example, a bank account is an entity. Entities are described by a number of attributes. A relationship is an association among several entities. E-R diagrams can express the overall structure of the database. The symbols used in E-R diagrams are
1. Rectangles, which represent entity sets
2. Ellipses, which represent attributes
3. Diamonds, which represent relationships among entity sets
4. Lines, which link attributes to entity sets and entity sets to relationships

[Figure: E-R diagram in which the entity sets Customer (attributes Name, City) and Account (attributes Acc. no, Balance) are linked by the relationship Deposits.]

2. Object oriented model:
This model also consists of a collection of objects. An object contains values stored in instance variables, and bodies of code that operate on the object, called methods. Objects that contain the same types of values and the same methods are grouped together into classes. The only way in which one object can access the data of another object is by invoking a method of that other object. The values and method bodies are thus hidden from the outside, which gives two levels of data abstraction.
The advantage: if a change is needed, there is no need to change the entire program; changing the method alone is enough. For example, a bank account object contains the instance variables account number and balance. If the bank decides to increase or decrease the interest rate, the change is made only within the pay interest method and not in the external interface.

Record-based logical models:
These models are used to describe the data at the logical and view levels.
They are used to specify the logical structure of the database and to provide a higher-level description of the implementation. They are called record-based because the database is structured in fixed-format records of several types. Each record type defines a fixed number of fields, and each field is usually of fixed length. Object-based models, in contrast, have a richer structure that often leads to variable-length records at the physical level. There are three types of record-based models:
1. Relational model
2. Network model
3. Hierarchical model

Relational model:
The relational model uses a collection of tables to represent both data and the relationships among those data. Each table has multiple columns, and each column has a unique name.

Customer name   Social status   Street               City         Account number
Henry           Lecturer        12, Saibaba Colony   Coimbatore   143
Mythili         Lecturer        Valluvar Nagar       Dharmapuri   154

Account number   Balance
143              5000
154              4000

Network model (arbitrary graphs):
Data in the network model are represented by collections of records, and relationships among data are represented by links, which are pointers.

[Figure: network model. The customer records (Henry, Lecturer, 12 Saibaba Colony, Coimbatore, 143) and (Mythili, Lecturer, 12 Valluvar Nagar, Dharmapuri, 153) are linked by pointers to the account records (143, 5000) and (153, 4000).]

Hierarchical model:
The hierarchical model also stores the data in the form of records and links. The only difference is that the records are organized as a collection of trees rather than arbitrary graphs.

[Figure: hierarchical model. Each customer record (Henry, Lecturer, 12 Saibaba Colony, Coimbatore, 143; Mythili, Lecturer, 12 Valluvar Nagar, Dharmapuri, 153) is the parent of its account record, (143, 5000) and (153, 4000) respectively.]

Physical data models:
These data models are used to describe data at the lowest level. The two models available are
1. Unifying model
2. Frame memory model

Database languages:
There are two types of languages:
1. A language to specify the database schema
2. A language to express database queries and updates

Data definition language (DDL):
A database schema is specified by a set of definitions expressed in a special language called a data definition language.
The result of compiling DDL statements is a set of tables that is stored in a special file called the data dictionary or data directory. A data dictionary is a file that contains metadata, that is, data about data. This file is consulted before actual data are read or modified in the database system. Eg: CREATE, ALTER and DROP statements.

Data storage and definition language:
The storage structures and access methods used by the database system are specified by a set of definitions in a special type of DDL.

Data manipulation language (DML):
A data manipulation language is a language that enables users to access or manipulate data as organized by the appropriate data model. There are basically two types of DML:
1. Procedural DML
2. Non-procedural DML
Procedural DML: requires the user to specify "what" data are needed and "how" to get those data.
Non-procedural DML: requires the user to specify only "what" data are needed, without specifying how to get them.
Insert, delete, update and select queries are examples of DML statements.

Transaction management:
A transaction is a collection of operations that performs a single logical function in a database application. Each transaction is a unit of both atomicity and consistency. Thus we require that transactions do not violate any database consistency constraints.

Atomicity: Either all the operations of a transaction take effect, or none of them do. In the fund transfer example, either both the debit and the credit must occur, or neither.

Consistency: The correctness requirement is called consistency. It is essential that the execution of the fund transfer preserves the consistency of the database, for example that the sum of the balances of A and B is unchanged. It is the responsibility of the programmer to define the various transactions properly, so that each preserves the consistency of the database.

Ensuring the atomicity and durability properties is the responsibility of the database system itself; durability means that the effects of a completed transaction persist even if the system fails afterwards. This is done by the transaction management component. If a transaction fails, the database must be restored to the state it was in before the transaction started executing.
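The fund transfer can be sketched with Python's built-in sqlite3 module to show what atomicity means in practice. This is only an illustration under stated assumptions: the table name account, the starting balances and the fail_midway switch are hypothetical, and only the Rs 20,000 transfer from A to B comes from the example in these notes.

```python
import sqlite3


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst; both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                # Hypothetical crash between the debit and the credit.
                raise RuntimeError("simulated failure after the debit")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
    except RuntimeError:
        pass  # the debit has already been rolled back at this point


def balances(conn):
    """Return the current balances as a dict such as {'A': ..., 'B': ...}."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 50000), ("B", 10000)])
conn.commit()

transfer(conn, "A", "B", 20000, fail_midway=True)
after_failure = balances(conn)   # unchanged: the partial debit was undone

transfer(conn, "A", "B", 20000)
after_success = balances(conn)   # both the debit and the credit were applied

print(after_failure, after_success)
```

In both outcomes the sum of the two balances stays the same, which is the consistency requirement of the transfer.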
It is the responsibility of the database system to detect system failures and to restore the database to a state that existed prior to the failure. When the database is updated by more than one transaction at a time, the consistency of the database may no longer be preserved. It is the responsibility of the concurrency control manager to control the interaction among concurrent transactions so that the consistency of the database is ensured. The databases of small firms may execute only one transaction at a time, so their cost is low.

Storage Management:
Databases require a large amount of storage space, measured in gigabytes. Since main memory cannot hold all the data permanently, the data are stored on disks and are moved between disk and main memory as needed. Since this movement takes time, the movement of records is minimized. Good performance of a database system is measured by quick response time. The goal is to facilitate access to data, and high-level views help to achieve this goal.
A storage manager is a program that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. The storage manager is thus responsible for storing, retrieving and updating the data in the database. The raw data are stored on the disk using the file system. The storage manager translates the various DML statements into low-level file system commands.

Database Administrator:
A person who has central control over the data and the programs is called the database administrator (DBA). The functions of a DBA include
1. Schema definition
2. Storage structure and access method definition
3. Schema and physical organization modification
4. Granting of authorization for data access
5. Integrity constraint specification

Schema definition: The DBA creates the original database schema by writing a set of definitions that are translated by the DDL compiler. These are permanently stored in the data dictionary.
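As a rough illustration of DDL statements and of the data dictionary, the sketch below uses Python's built-in sqlite3 module. SQLite is only a stand-in here: its sqlite_master table and the PRAGMA table_info command play the role of the data dictionary, and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statements: CREATE, ALTER and DROP change the schema, not the data.
conn.execute(
    "CREATE TABLE account (account_number INTEGER PRIMARY KEY, balance INTEGER)"
)
conn.execute("CREATE TABLE scratch (note TEXT)")
conn.execute("ALTER TABLE account ADD COLUMN branch_name TEXT")
conn.execute("DROP TABLE scratch")

# The metadata produced by those statements can itself be queried.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
# Each table_info row is (cid, name, type, notnull, dflt_value, pk).
column_names = [row[1] for row in conn.execute("PRAGMA table_info(account)")]

print(table_names, column_names)
```

No rows were inserted at any point; everything the queries return is metadata about the schema.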
Storage structure and access method definition: The DBA creates appropriate storage structures and access methods by writing a set of definitions, which are translated by the data storage and definition language compiler.

Schema and physical organization modification: Modifications to the database schema or to the physical organization of the data are relatively rare; when they are needed, the DBA carries them out.

Granting of authorization for data access: By granting different types of authorization, the DBA regulates which parts of the database each user can use. The authorization information is kept in a special system structure that the database system consults whenever access to the data is attempted.

Integrity constraint specification: The data values stored in the database must satisfy certain consistency constraints, which the DBA specifies.

Database Users:
The four different types of database users are
1. Application programmers
2. Sophisticated users
3. Specialized users
4. Naïve users

Application programmers: They are computer professionals who interact with the system through DML calls, which are embedded in a program written in a high-level host language such as Pascal, C++ or Java. Because the DML statements are not part of the host language, they are processed separately by a DML precompiler, which converts them into normal procedure calls in the host language; the host language compiler then generates the object code.

Sophisticated users: They interact with the system without writing programs. They submit their requests to a query processor, which breaks down the DML statements into instructions that the storage manager understands.

Specialized users: Specialized users are sophisticated users who write database applications that do not fit into the traditional data processing framework. Example: computer aided design, knowledge-based expert systems etc.

Naïve users: They are unsophisticated users who interact with the system by invoking one of the permanent application programs that have been written previously.
Example: A bank teller uses a program called transfer to transfer an amount of Rs 5000 from account A to account B.

Overall System Structure:
The functional components of a database system are broadly classified into
1. Query processor components
2. Storage manager components

Query processor:
The components of the query processor are
1. DML compiler
2. Embedded DML precompiler
3. DDL interpreter
4. Query evaluation engine

DML compiler: This translates DML statements in a query language into low-level instructions that the query evaluation engine can understand.
Embedded DML precompiler: This converts DML statements embedded in an application program into normal procedure calls in the host language.
DDL interpreter: This interprets DDL statements and records them in a set of tables containing metadata.
Query evaluation engine: This executes the low-level instructions generated by the DML compiler.

[Figure: overall system structure. Users: naïve users (tellers, agents, etc.) work through application interfaces, application programmers write application programs, sophisticated users submit queries, and the database administrator defines the database schema. Query processor: embedded DML precompiler, application programs object code, DML compiler, DDL compiler and query evaluation engine. Storage manager: transaction manager, buffer manager and file manager. Disk storage: indices, data files, statistical data and data dictionary.]

Storage Manager:
The storage manager components provide an interface between the low-level data stored in the database and the application programs and queries submitted to the system. The components of the storage manager are
1. Authorization and integrity manager
2. Transaction manager
3. File manager
4. Buffer manager

Authorization and integrity manager: This tests for the satisfaction of the integrity constraints and checks the authority of users to access data.
Transaction manager: This ensures that the database remains in a consistent state despite system failures and that concurrent transactions proceed without conflicting.
File manager: This manages the allocation of space on disk storage and the data structures used to represent information stored on the disk.
Buffer manager: This is responsible for fetching data from disk storage into main memory and for deciding what data to cache in memory.

In addition to the above components, the following data structures are required as part of the physical system implementation:
1. Data files, which store the database itself
2. Data dictionary, which stores metadata about the structure of the database
3. Indices, which provide fast access to the data items that hold particular values
4. Statistical data, which store statistical information about the data in the database

Entity Relationship Model
The entity relationship model is based on a perception of a real world that consists of a set of basic objects called entities and of relationships among these objects. It was developed to facilitate database design by allowing the specification of an enterprise schema, which represents the overall logical structure of the database. The E-R model is useful in mapping the meanings and interactions of real-world entities onto a conceptual schema.

Basic concepts:
There are three basic concepts in the E-R model:
1. Entity sets
2. Relationship sets
3. Attributes

Entity sets:
An entity is a thing or object in the real world that is distinguishable from all other objects. An entity has a set of properties, and the values of some set of properties may uniquely identify the entity. For example, the account number of a customer in a bank identifies that person in the enterprise. An entity set is a set of entities of the same type that share the same properties, or attributes. The individual entities that constitute a set are said to be the extension of the entity set.
Entity sets do not need to be disjoint. For example, the same person may belong both to the employee entity set and to the customer entity set of a bank.

Attributes:
An entity is represented by a set of attributes.
Attributes are descriptive properties possessed by each member of an entity set. Example: the set of all customers in a bank can be called the entity set "Customer", and each customer in it is described by the same attributes. For each attribute there is a set of permitted values, called the domain or value set of that attribute. The domain of an attribute may be, for instance, the set of all text strings of a certain length. An attribute of an entity set is a function that maps from the entity set into a domain. An entity can therefore be described by a set of (attribute, data value) pairs, one pair for each attribute of the entity set.
The attributes are of different types:
1. Simple and composite attributes
2. Single valued and multi valued attributes
3. Null attributes
4. Derived attributes

1. Simple and composite attributes:
Simple attributes cannot be divided into subparts. Composite attributes can be divided into subparts. For example, customer name can be divided into first name, middle name and last name. Customer address is also a composite attribute, which may be divided into door number, house name, street name, city, pin code etc.

2. Single valued and multi valued attributes:
An attribute that has a single value for a particular entity is called a single valued attribute. For example, an employee can have only one employee number, so employee number is a single valued attribute.
An attribute that may have a set of values for a specific entity is called a multi valued attribute. Where appropriate, upper and lower bounds may be placed on the number of values of a multi valued attribute. For example, a customer of a bank may have more than one address; here the lower bound is one and the upper bound is two.

3. Null attributes:
A null value is used when an entity does not have a value for an attribute. As an illustration, if a particular customer does not have a second address, the second address value is null.
4. Derived attributes:
The value of this type of attribute can be derived from the values of other related attributes or entities. For example, in a bank the time for which a loan has been running can be calculated from the loan start date and today's date.

Relationship sets:
A relationship is an association among several entities. For example, we can define a relationship that associates customer Henry with loan L-15. A relationship set is a set of relationships of the same type. Formally, it is a mathematical relation on n >= 2 (possibly non-distinct) entity sets. If E1, E2, ..., En are entity sets, then a relationship set R is a subset of
{(e1, e2, ..., en) | e1 ∈ E1, e2 ∈ E2, ..., en ∈ En}
where (e1, e2, ..., en) is a relationship.
Consider the two entity sets customer and loan. We define the relationship set borrower to denote the association between customers and the bank loans that they have.

[Figure: the relationship set borrower, linking loan entities (for example L-17, 1000) to customer entities (for example Henry, 312-317 main, gowthami).]

The association between entity sets is referred to as participation; that is, the entity sets E1, E2, ..., En participate in relationship set R. A relationship instance in an E-R schema represents an association that exists between the named entities.
The function that an entity plays in a relationship is called that entity's role. In most cases the entity sets participating in a relationship set are distinct, so roles are implicit and are not usually specified. When the same entity set participates in a relationship set more than once, in different roles, the relationship set is called a recursive relationship set, and explicit role names are necessary to specify how an entity participates in a relationship instance.

Descriptive attributes: A relationship may also have descriptive attributes. Consider the relationship set depositor with entity sets customer and account. We can associate the attribute access_date with that relationship set.

Binary relationship: a relationship that involves two entity sets.
Ternary relationship: a relationship that involves three entity sets.
The number of entity sets that participate in a relationship set is the degree of the relationship set. The degree of a binary relationship set is 2, and the degree of a ternary relationship set is 3.
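The set definition of a relationship set can be checked directly with Python sets. In this small sketch the entity sets are reduced to identifying values, and the particular members and pairs are illustrative only.

```python
from itertools import product

# Entity sets, reduced to one identifying value per entity (illustrative data).
customer = {"Henry", "Mythili"}
loan = {"L-15", "L-17"}

# Every possible (customer, loan) combination: the set {(e1, e2) | e1 in E1, e2 in E2}.
all_pairs = set(product(customer, loan))

# The relationship set borrower is whichever subset of those pairs actually holds.
borrower = {("Henry", "L-15"), ("Henry", "L-17")}

is_relationship_set = borrower <= all_pairs  # subset test
degree = len(next(iter(borrower)))           # number of participating entity sets

print(is_relationship_set, degree)
```

Each tuple has one component per participating entity set, so the tuple length is the degree of the relationship set; here it is binary.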
Keys:
We need a way to distinguish the entities within an entity set and the relationships within a relationship set. The concept of keys is used to make these distinctions. A key is a single attribute, or a combination of two or more attributes, of an entity set that is used to identify one or more instances of the set.

Entity set:
Primary key: The attribute or set of attributes chosen as the unique identifier of an entity is referred to as the primary key.
Super key: If we add additional attributes to a primary key, the resulting combination still uniquely identifies an instance of the entity set. Such a combination is a super key, and a primary key is therefore a minimal super key.
Candidate key: There may be two or more attributes, or combinations of attributes, that uniquely identify an instance of an entity set. These are called candidate keys. We must decide which of the candidate keys is the primary key; the other candidate keys are alternate keys.
Secondary key: A secondary key is an attribute or combination of attributes that may not be a candidate key but that classifies the entity set on a particular characteristic. Eg: the department attribute of an employee.

Relationship set:
The primary key of an entity set allows us to distinguish among the various entities of the set. We need a similar mechanism to distinguish among the various relationships of a relationship set.
Let R be a relationship set involving entity sets E1, E2, ..., En. Let primary key(Ei) denote the set of attributes that forms the primary key for entity set Ei. Assume that the attribute names of all primary keys are unique (if they are not, use an appropriate renaming scheme). The composition of the primary key for a relationship set depends on the set of attributes associated with the relationship set R.
If the relationship set R has no attributes associated with it, then the set of attributes
primary key(E1) ∪ primary key(E2) ∪ ... ∪ primary key(En)
describes an individual relationship in R.
If the relationship set R has attributes a1, a2, ..., an associated with it, then the set of attributes
primary key(E1) ∪ primary key(E2) ∪ ... ∪ primary key(En) ∪ {a1, a2, ..., an}
describes an individual relationship in R.
In both cases, the set primary key(E1) ∪ primary key(E2) ∪ ... ∪ primary key(En) forms a super key for the relationship set.

Entity relationship diagram:
An E-R diagram can express the overall logical structure of a database graphically.
The major components of an E-R diagram are:
Rectangles, which represent entity sets.
Ellipses, which represent attributes.
Diamonds, which represent relationship sets.
Lines, which link attributes to entity sets and entity sets to relationship sets.
Double ellipses, which represent multi valued attributes.
Dashed ellipses, which denote derived attributes.
Double lines, which indicate total participation of an entity set in a relationship set.

Consider an entity relationship diagram that consists of two entity sets, customer and loan, related through a binary relationship set borrower. The relationship set borrower may be many to many, one to many, many to one or one to one. For a binary relationship set R between entity sets A and B, the mapping cardinality must be one of the following:
One to one: An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A.
One to many: An entity in A is associated with any number of entities in B. An entity in B, however, can be associated with at most one entity in A.
Many to one: An entity in A is associated with at most one entity in B. An entity in B, however, can be associated with any number of entities in A.
Many to many: An entity in A is associated with any number of entities in B, and an entity in B is associated with any number of entities in A.

[Figure: the four mapping cardinalities between entity sets A and B: one to one, one to many, many to one and many to many.]

To distinguish among these types, we draw either a directed line (an arrow) or an undirected line between the relationship set and the entity set. A directed line from the relationship set borrower to the entity set loan specifies that borrower is either a one to one or a many to one relationship set from customer to loan; borrower cannot be a many to many or a one to many relationship set from customer to loan.
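The four definitions can be turned into a small checker. Note the assumption: a mapping cardinality is a constraint on the schema, whereas this sketch only reports the tightest cardinality consistent with one given set of (a, b) pairs. The function name and the sample pairs are hypothetical.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a binary relationship set, given as (a, b) pairs, from A to B."""
    pairs = set(pairs)
    b_per_a = Counter(a for a, _ in pairs)  # number of B entities linked to each A
    a_per_b = Counter(b for _, b in pairs)  # number of A entities linked to each B
    a_has_many = any(n > 1 for n in b_per_a.values())
    b_has_many = any(n > 1 for n in a_per_b.values())
    if a_has_many and b_has_many:
        return "many to many"
    if a_has_many:
        return "one to many"   # an A entity has several Bs, each B has one A
    if b_has_many:
        return "many to one"   # a B entity has several As, each A has one B
    return "one to one"


# Illustrative borrower instances from customer (A) to loan (B).
one_customer_two_loans = mapping_cardinality({("Henry", "L-15"), ("Henry", "L-17")})
two_customers_one_loan = mapping_cardinality({("Henry", "L-15"), ("Mythili", "L-15")})
one_each = mapping_cardinality({("Henry", "L-15"), ("Mythili", "L-17")})
mixed = mapping_cardinality(
    {("Henry", "L-15"), ("Henry", "L-17"), ("Mythili", "L-15")}
)

print(one_customer_two_loans, two_customers_one_loan, one_each, mixed)
```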
An undirected line from the relationship set borrower to the entity set loan specifies that borrower is either a many to many or a one to many relationship set from customer to loan.
If a relationship set also has some attributes associated with it, then we link these attributes to the relationship set.

[Figure: E-R diagram with an attribute attached to a relationship set. The entity sets Customer (Cus no, Cus add) and Account (accno, balance) are linked by the relationship set depositor, which carries the attribute Access date.]

We can indicate roles in an E-R diagram by labeling the lines that connect diamonds to rectangles.

[Figure: E-R diagram with role indicators. The entity set employee (attribute Employee name) is connected to the relationship set Worksfor by two lines labeled manager and worker.]

Non-binary relationship sets can be specified easily in an E-R diagram.

Weak entity set:
An entity set may not have sufficient attributes to form a primary key. Such an entity set is termed a weak entity set. An entity set that has a primary key is termed a strong entity set. For a weak entity set to be meaningful, it must be part of a one to many relationship set. Although a weak entity set does not have a primary key, we need to distinguish among all those entities in the entity set that depend on one particular strong entity. The discriminator of a weak entity set is a set of attributes that allows this distinction to be made. The discriminator of a weak entity set is also called the partial key. The primary key of a weak entity set is formed by the primary key of the strong entity set on which the weak entity set is existence dependent, plus the weak entity set's discriminator. A weak entity set is indicated in E-R diagrams by a doubly outlined box, and the corresponding identifying relationship by a doubly outlined diamond.

[Figure: E-R diagram with a weak entity set. The strong entity set loan (Loan_no, Amount) is linked by the identifying relationship Loanpayment to the weak entity set payment (Pay_no, Pay_date, Pay_amt).]

Extended E-R Features:
Specialization:
An entity set may include subgroupings of entities that are distinct in some way from the other entities in the set. For example, consider the entity set account with attributes account number and balance. An account can be classified as one of the following:
1. Savings bank account
2. Checking account
The entity set account has two attributes, account number and balance. Each of the above account types may have its own attributes in addition to the standard account attributes. The process of designating such subgroupings is called specialization. An entity set may be specialized by one or more distinguishing features. In the case of an account, the type of the account is the distinguishing feature. In an E-R diagram, specialization is depicted by a triangle labeled ISA. ISA stands for "is a".

Generalization:
Generalization is a simple inversion of specialization. The design proceeds in a bottom-up fashion. Generalization starts from the recognition that a number of entity sets share some common features. Based on these commonalities, generalization synthesizes the entity sets into a single, higher-level entity set. It is mainly used to hide the differences among the lower-level entity sets, and it also provides economy of representation in that shared attributes are not repeated.

Relational Model
The relational model has established itself as the primary data model for commercial data processing applications.

Structure of relational databases:
A relational database is a collection of tables, each of which is assigned a unique name. A row in a table represents a relationship among a set of values. Consider the account relation, in which the column headers branch name, account number and balance are called attributes. For each attribute there is a set of permitted values, called the domain of that attribute. For branch name, let D1 be the set of all branch names, and let D2 and D3 be the sets of all account numbers and all balances respectively. Every row of account is then a tuple (v1, v2, v3) with v1 in D1, v2 in D2 and v3 in D3, so account is a subset of D1 x D2 x D3.
We require that, for all relations, the domains of all attributes be atomic. A domain is said to be atomic when its elements are indivisible units.
It is possible for several attributes to have the same domain. For example, the customer name attribute of a customer relation and the employee name attribute of an employee relation may draw their values from the same domain.
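A small sketch of the statement that account is a subset of D1 x D2 x D3. The domains are written as explicit Python collections; the numeric ranges, the pairing of accounts with branches and the helper name in_product are assumptions made only for this illustration.

```python
# One domain per attribute, in schema order (branch name, account number, balance).
domains = (
    {"Rspuram", "Saibabacly"},  # D1: all branch names
    range(100, 200),            # D2: all account numbers (assumed range)
    range(0, 100001),           # D3: all balances in rupees (assumed range)
)


def in_product(row, domains):
    """True if row is an element of D1 x D2 x ... x Dn."""
    return len(row) == len(domains) and all(
        value in domain for value, domain in zip(row, domains)
    )


# A relation is a set of tuples, each drawn from the product of the domains.
account = {("Rspuram", 143, 5000), ("Saibabacly", 154, 4000)}

account_is_relation = all(in_product(row, domains) for row in account)

# A row whose balance is not in D3 cannot belong to any relation on this schema.
bad_row_accepted = in_product(("Rspuram", 143, "five thousand"), domains)

print(account_is_relation, bad_row_accepted)
```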
Their name attributes may share a common domain, but balance and branch name cannot come from the same domain. One domain value that is a member of every possible domain is the null value, which signifies that a value is unknown or does not exist. For example, the telephone number of a customer may not be known, or the customer may not have one.

Database schema :
The database schema is the logical design of the database. A database instance is a snapshot of the data in the database at a given instant of time.

Relation and relation schema :
The concept of a relation corresponds to the programming language notion of a variable. The concept of a relation schema corresponds to the programming language notion of a type definition.

Branch relation :
Branchname     Branchcity   Assets
Rspuram        cbe          1000
Saibabacly     cbe          2600

Relation schema names begin with a capital letter, and relation names are written in small letters.
Account_schema = (branch_name, account_number, balance)
We denote the fact that account is a relation on Account_schema by
account(Account_schema)
A relation instance corresponds to the value of a programming language variable: just as the value of a variable may change over time, the contents of a relation instance change when the database is updated.

We can relate the tuples (records) in two relations. For example:
Branch_schema = (branch_name, branch_city, assets)
Account_schema = (branch_name, account_number, balance)
The attribute branch name is present in both schemas, and we can use it to relate the two relations. This repetition of branch name is not counted as data redundancy. Suppose we wish to find the information about all the accounts in branches located in Coimbatore. We first look at the branch relation to find the names of all branches located in Coimbatore. Then, for each such branch, we look in the account relation to find the information about the accounts maintained at that branch.

It is not always advisable to use a single schema in place of multiple schemas. The disadvantage of the single-schema approach is redundancy of data.

Relational Algebra
The relational algebra is a procedural query language.
It consists of a set of operations, each of which takes one or two relations as input and produces a new relation as output. The fundamental operations of relational algebra are:
1. Select
2. Project
3. Union
4. Set difference
5. Cartesian product
6. Rename

Fundamental operations of relational algebra:

Select operation:
The select operation selects the tuples that satisfy a given predicate. It is denoted by the lowercase Greek letter sigma (σ). The predicate appears as a subscript to σ, and the argument relation is given in parentheses. To select the tuples of loan where the branch name is "cbe":
σ branch_name = "cbe" (loan)
We can find all the tuples where the loan amount is greater than 1200:
σ amount > 1200 (loan)
In general, the comparison operators that can be used in the predicate are:
1. =
2. ≠
3. <
4. ≤
5. >
6. ≥
Predicates can be combined, with ∧ used as the logical AND and ∨ used as the logical OR.

Project operation :
The project operation is a unary operation that returns its argument relation with certain attributes left out. Since a relation is a set, duplicate rows are eliminated. It is denoted by the Greek letter pi (Π).
Π loan_number, amount (loan)

Composition of relational operations :
The result of a relational operation is itself a relation. This fact is useful when, for example, we want to find all the customers who live in "Saibaba colony":
Π customer_name (σ customer_city = "Saibaba colony" (customer))
In general, relational algebra expressions can be given as input to other relational algebra operations.

Union operation :
Suppose we want the list of names of all the bank customers who have either an account or a loan or both. The customer relation does not contain loan information, and the borrower relation does not contain account information. So we need the union operation. The names of the customers with a loan and the names of the customers with an account are given by:
Π customer_name (borrower)
Π customer_name (depositor)
To combine these we use the union symbol ∪:
Π customer_name (borrower) ∪ Π customer_name (depositor)
This fetches all the customers who have an account, a loan, or both. Since relations are sets, duplicate values are eliminated. For the union operation r ∪ s to be valid, the following conditions must hold:
1. The relations r and s must have the same number of attributes.
2. The domain of the i-th attribute of r and the domain of the i-th attribute of s must be the same for all i.

The set difference operation :
The set difference operation, denoted by -, allows us to find tuples that are in one relation but not in another. The expression r - s results in a relation containing those tuples that are in r but not in s. For example, the customers who have an account but no loan are given by:
Π customer_name (depositor) - Π customer_name (borrower)
For the set difference operation r - s to be valid:
1. r and s must be of the same arity.
2. The domain of the i-th attribute of r and the domain of the i-th attribute of s must be the same.

Cartesian product operation :
The Cartesian product is denoted by a cross (x). It allows us to combine information from any two relations. If the same attribute name is used in both relations, we prefix the attribute with the name of its relation to differentiate, for example borrower.loan_number and loan.loan_number. Consider R = borrower x loan.

Borrower :
Customer name   Loan number   Branch name
Jones           L-17          Coimbatore
Smith           L-23          Trichy
Hayes           L-15
Jackson         L-93          Salem
Curry           L-11          Chennai
Smith           L-17

Loan :
Loan no   Amount
L-17      40000
L-14      7000
L-15      56778
L-12      34456

Example: suppose we want to find the names of all customers who have a loan at the Cbe branch.
σ branch_name = "cbe" (borrower x loan)
The output of the above expression is given overleaf. Because the Cartesian product pairs every borrower tuple with every loan tuple, we select the tuples with matching loan numbers alone:
σ borrower.loan_number = loan.loan_number (σ branch_name = "cbe" (borrower x loan))
To display only the customer names:
Π customer_name (σ borrower.loan_number = loan.loan_number (σ branch_name = "cbe" (borrower x loan)))

Rename Operation:
Unlike relations in the database, the results of relational algebra expressions do not have a name that we can use to refer to them. It is useful to give them names. This can be done by using the rename operation, denoted by the lowercase Greek letter rho (ρ). The expression
ρ x (E)
returns the result of the expression E under the name x.
To illustrate its use, consider the query “Find the largest account balance in the bank”.
Step 1: Compute a temporary relation consisting of those balances that are not the largest.
Step 2: Take the set difference between the relation of all balances, Π balance (account), and the temporary relation computed in step 1.
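The two steps above can be sketched in Python, modelling a relation as a set of tuples. This is only an illustrative sketch: the sample account data, the account numbers and the helper names project, select and product are assumptions made for the example and are not part of these notes.

```python
# Hypothetical sample data: each account tuple is
# (branch_name, account_number, balance).
account = {
    ("Rspuram", "A-101", 500),
    ("Rspuram", "A-102", 900),
    ("Saibabacly", "A-201", 700),
}

def project(relation, *indexes):
    """Keep only the listed columns; using a set removes duplicate rows."""
    return {tuple(row[i] for i in indexes) for row in relation}

def select(relation, predicate):
    """Keep only the rows that satisfy the predicate."""
    return {row for row in relation if predicate(row)}

def product(r, s):
    """Cartesian product: every row of r joined with every row of s."""
    return {a + b for a in r for b in s}

# Rename: d is a second copy of account, so that the two balances
# can be told apart inside the product.
d = set(account)

# In account x d, columns 0..2 come from account and columns 3..5 from d.
pairs = product(account, d)

# Step 1: balances that are smaller than some other balance.
not_largest = project(select(pairs, lambda row: row[2] < row[5]), 2)

# Step 2: all balances minus the balances that are not the largest.
all_balances = project(account, 2)
largest = all_balances - not_largest

print(largest)
```

The result is a relation with a single attribute, so the largest balance appears as a one-element tuple inside a set.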
The relation of all balances is
Π balance (account)
The comparison can be done by taking the Cartesian product account x ρ d (account) and comparing the two balances that appear in each resulting tuple. Here ρ d (account) renames the second copy of account to d, so that its attributes can be distinguished from those of the first copy. The temporary relation that consists of the balances that are not the largest is
Π account.balance (σ account.balance < d.balance (account x ρ d (account)))
The largest balance is then given by the set difference
Π balance (account) - Π account.balance (σ account.balance < d.balance (account x ρ d (account)))

Additional operation:

Views:
It is not desirable for all users to see the entire logical model. Security constraints may require that part of the database be hidden from certain users. It may also be useful to give a user a collection of relations that matches that user's view of the data better than the logical model does. Any relation that is not part of the logical model but is made available to a user is called a view. For example, a view listing every branch together with its customers can be defined as
Π branch_name, customer_name (depositor ⋈ account) ∪ Π branch_name, customer_name (borrower ⋈ loan)
Here ⋈ is the natural join: it keeps only those tuples of the Cartesian product in which the two relations agree on their common attribute (the account number or the loan number).

Questions:

2 marks question:
What is DBMS?
What is a schema?
What is data independence?
What is procedural DML?
What is entity set?
Explain briefly about attributes.
What is relationship set?
What is existence dependency?
What is the select operation in relational algebra?
Explain the relational algebra for views.

Descriptive questions.
Explain the data abstraction.
Explain the data models.
Describe the use of transaction management.
Explain the use of storage management.
Explain the overall system structure.
Explain E-R model.
What are the extended features of E-R model?
Explain the symbols used in E-R model.
Explain the mapping cardinalities.
Explain weak entity sets.
Explain the different relational symbols used.
Explain rename operation.
Explain the tuple operation.
***************************************************************************
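As a revision aid, the union and set difference operations covered in this unit can be tried out with a short Python sketch. It assumes that a relation can be modelled as a Python set; the sample rows, the account numbers and the helper name project_names are hypothetical and chosen only for illustration.

```python
# Hypothetical sample data, not taken from the notes.
# borrower tuples: (customer_name, loan_number)
borrower = {("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15")}
# depositor tuples: (customer_name, account_number)
depositor = {("Hayes", "A-102"), ("Johnson", "A-101"), ("Smith", "A-215")}

def project_names(relation):
    """Project on customer_name (column 0); a set drops duplicates."""
    return {row[0] for row in relation}

borrower_names = project_names(borrower)
depositor_names = project_names(depositor)

# Union: customers who have an account, a loan, or both.
# Both operands have one attribute with the same domain,
# so the conditions for union are met.
either = borrower_names | depositor_names

# Set difference: customers who have an account but no loan.
account_only = depositor_names - borrower_names

print(sorted(either))
print(sorted(account_only))
```

For brevity the single-attribute results are stored as sets of plain strings rather than sets of one-element tuples.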