UNIT – I Data base Management System and RDBMS – Normalization – Oracle terminology – Database Connection – Creating tables – The Basics of SQL : SQL Grammar. What is RDBMS? RDBMS stands for Relational Database Management System. RDBMS is the basis for SQL, and for all modern database systems like MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access.A Relational database management system (RDBMS) is a database management system (DBMS) that is based on the relational model as introduced by E. F. Codd. Creating tables The data in RDBMS is stored in database objects called tables. The table is a collection of related data entries and it consists of columns and rows. Remember, a table is the most common and simplest form of data storage in a relational database. Following is the example of a CUSTOMERS table: The create table statement is used to create a new table. Here is the format of a simple create table statement: create table "tablename" ("column1" "data type", "column2" "data type", "column3" "data type"); +----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | +----+----------+-----+-----------+----------+ The table and column names must start with a letter and can be followed by letters, numbers, or underscores - not to exceed a total of 30 characters in length. Do not use any SQL reserved keywords as names for tables or column names (such as "select", "create", "insert", etc). Data types specify what the type of data can be for that particular column. If a column called "Last_Name", is to be used to hold names, then that particular column should have a "varchar" (variable-length character) data type. Here are the most common Data types: char(size) Fixed-length character string. Size is specified in parenthesis. Max 255 bytes. varchar(size) Variable-length character string. Max size is specified in parenthesis. number(size) Number value with a max number of column digits specified in parenthesis. date Date value number(size,d) Number value with a maximum number of digits of "size" total, with a maximum number of "d" digits to the right of the decimal. What is field? Every table is broken up into smaller entities called fields. The fields in the CUSTOMERS table consist of ID, NAME, AGE, ADDRESS and SALARY. A field is a column in a table that is designed to maintain specific information about every record in the table. What is record, or row? A record, also called a row of data, is each individual entry that exists in a table. For example there are 3 records in the above CUSTOMERS table. Following is a single row of data or record in the CUSTOMERS table: +----+----------+-----+-----------+----------+ 1 | Ramesh | 32 | Ahmedabad | 2000.00 | A record is a horizontal entity in a table. What is column? A column is a vertical entity in a table that contains all information associated with a specific field in a table. For example, a column in the CUSTOMERS table is ADDRESS which represents location description and would consist of the following: +-----------+ | ADDRESS | +-----------+ | Ahmedabad | | Delhi | | +----+------+ What is NULL value? A NULL value in a table is a value in a field that appears to be blank which means A field with a NULL value is a field with no value. It is very important to understand that a NULL value is different than a zero value or a field that contains spaces. A field with a NULL value is one that has been left blank during record creation. SQL Constraints: Constraints are the rules enforced on data columns on table. These are used to limit the type of data that can go into a table. This ensures the accuracy and reliability of the data in the database. Contraints could be column level or table level. Column level constraints are applied only to one column where as table level constraints are applied to the whole table. Following are commonly used constraints available in SQL: NOT NULL Constraint: Ensures that a column cannot have NULL value. DEFAULT Constraint : Provides a default value for a column when none is specified. UNIQUE Constraint: Ensures that all values in a column are different. PRIMARY Key: Uniquely identified each rows/records in a database table. FOREIGN Key: Uniquely identified a rows/records in any another database table. CHECK Constraint: The CHECK constraint ensures that all values in a column satisfy certain conditions. INDEX: Use to create and retrieve data from the database very quickly. Data Integrity: The following categories of the data integrity exist with each RDBMS: Entity Integrity : There are no duplicate rows in a table. Domain Integrity : Enforces valid entries for a given column by restricting the type, the format, or the range of values. Referential integrity : Rows cannot be deleted, which are used by other records. User-Defined Integrity : Enforces some specific business rules that do not fall into entity, domain, or referential integrity. Database Normalization Database normalization is the process of efficiently organizing data in a database. There are two reasons of the normalization process: 1. Eliminating redundant data, for example, storing the same data in more than one tables. 2. Ensuring data dependencies make sense. Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored. Normalization consists of a series of guidelines that help guide you in creating a good database structure. Normalization guidelines are divided into normal forms; think of form as the format or the way a database structure is laid out. The aim of normal forms is to organize the database structure so that it complies with the rules of first normal form, then second normal form, and finally third normal form. It's your choice to take it further and go to fourth normal form, fifth normal form, and so on, but generally speaking, third normal form is enough. 1. First Normal Form (1NF) 2. Second Normal Form (2NF) ORACLE 3. Third Normal Form (3NF) It is very large and multi-user database management system. Oracle is a relational database management system developed by 'Oracle Corporation'. Oracle works to efficiently manage its resource, a database of information, among the multiple clients requesting and sending data in the network. It is an excellent database server choice for client/server computing. Oracle supports all major operating systems for both clients and servers, including MSDOS, NetWare, UnixWare, OS/2 and most UNIX flavors. History: Oracle began in 1977 and celebrating its 32 wonderful years in the industry (from 1977 to 2009). 1977 - Larry Ellison, Bob Miner and Ed Oates founded Software Development Laboratories to undertake development work. 1979 - Version 2.0 of Oracle was released and it became first commercial relational database and first SQL database. The company changed its name to Relational Software Inc. (RSI). 1981 - RSI started developing tools for Oracle. 1982 - RSI was renamed to Oracle Corporation. 1983 - Oracle released version 3.0, rewritten in C language and ran on multiple platforms. 1984 - Oracle version 4.0 was released. It contained features like concurrency control multi-version read consistency etc. 1985 - Oracle version 4.0 was released. It contained features like concurrency control multi-version read consistency etc. 2007 - Oracle has released Oracle11g. The new version focused on better partitioning, easy migration etc. Features: Concurrency Concurrency Read Consistency Locking Mechanisms Quiesce Database Portability Selfmanaging database SQL*Plus ASM Scheduler Resource Manager Data Warehousing Materialized views Bitmap indexes Table compression Parallel Execution Analytic SQL Data mining Partitioning SQL BASICS: SQL is a standard language for accessing and manipulating databases. SQL stands for Structured Query Language SQL lets you access and manipulate databases SQL is an ANSI (American National Standards Institute) standard What Can SQL do? SQL can execute queries against a database SQL can retrieve data from a database SQL can insert records in a database SQL can update records in a database SQL can delete records from a database SQL can create new databases SQL can create new tables in a database SQL can create stored procedures in a database SQL can create views in a database SQL can set permissions on tables, procedures, and viewsr ADVANTAGES OF DBMS: Scalable: Describes the ability of a system to adapt to different amounts of data and numbers of users while maintaining the same performance. In a scalable system those can be increased through the extension of the computing capacity without any programming work. Expandability: New applications or other additions like new user interfaces can be realised without interfering with already working applications. Standardised and Combined Features: Features for data definition, data organisation and data integrity can be standardised and combined. More Efficient System Operation and Development: It is possible to have more efficient features for maintaining system and further developing of the system. Access Control: Access to a database or parts of it (like single tables) can be controlled or restricted for each user but also for user groups. Recovery and Back-up: A database management system has features which allow to recover data sets, for example, after a failed transaction or a system crash. Additionally, a back-up from the whole system can be made to store in a save place. Quicker Application Development: As all the data handling is 'delegated' to the database managment system it is possible to develop new applications quick and flexible. Disadvantages of dbms 1. Increased costs. Database systems require sophisticated hardware and software and highly skilled personnel. The cost of maintaining the hardware, software, and personnel required to operate and manage a database system can be substantial. Training, licensing, and regulation compliance costs are often overlooked when database systems are implemented. 2. Management complexity. Database systems interface with many different technologies and have a significant impact on a company’s resources and culture. The changes introduced by the adoption of a database system must be properly managed to ensure that they help advance the company’s objectives. Given the fact that database systems hold crucial company data that are accessed from multiple sources, security issues must be assessed constantly. 3. Maintaining currency. To maximize the efficiency of the database system, you must keep your system current. Therefore, you must perform frequent updates and apply the latest patches and security measures to all components. Because database technology advances rapidly, personnel training costs tend to be significant. Vendor dependence. Given the heavy investment in technology and personnel training, companies might be reluctant to change database vendors. As a consequence, vendors are less likely to offer pricing point advantages to existing customers, and those customers might be limited in their choice of database system components. 4. Frequent upgrade/replacement cycles. DBMS vendors frequently upgrade their products by adding new functionality. Such new features often come bundled in new upgrade versions of the software. Some of these versions require hardware upgrades. Not only do the upgrades themselves cost money, but it also costs money to train database users and administrators to properly use and manage the new features. DBMS Keys A key is an attribute (also known as column or field) or a combination of attribute that is used to identify records. Sometimes we might have to retrieve data from more than one table, in those cases we require to join tables with the help of keys. The purpose of the key is to bind data together across tables without repeating all of the data in every table. The various types of key with e.g. in SQL are mentioned below, (For examples let suppose we have an Employee Table with attributes ‘ID’ , ‘Name’ ,’Address’ , ‘Department_ID’ ,’Salary’) (I) Super Key – An attribute or a combination of attribute that is used to identify the records uniquely is known as Super Key. A table can have many Super Keys. E.g. of Super Key 1 ID 2 ID, Name 3 ID, Address 4 ID, Department_ID 5 ID, Salary 6 Name, Address 7 Name, Address, Department_ID ………… So on as any combination which can identify the records uniquely will be a Super Key. (II) Candidate Key – It can be defined as minimal Super Key or irreducible Super Key. In other words an attribute or a combination of attribute that identifies the record uniquely but none of its proper subsets can identify the records uniquely. E.g. of Candidate Key 1 Code 2 Name, Address For above table we have only two Candidate Keys (i.e. Irreducible Super Key) used to identify the records from the table uniquely. Code Key can identify the record uniquely and similarly combination of Name and Address can identify the record uniquely, but neither Name nor Address can be used to identify the records uniquely as it might be possible that we have two employees with similar name or two employees from the same house. (III) Primary Key – A Candidate Key that is used by the database designer for unique identification of each row in a table is known as Primary Key. A Primary Key can consist of one or more attributes of a table. E.g. of Primary Key - Database designer can use one of the Candidate Key as a Primary Key. In this case we have “Code” and “Name, Address” as Candidate Key, we will consider “Code” Key as a Primary Key as the other key is the combination of more than one attribute. (IV) Foreign Key – A foreign key is an attribute or combination of attribute in one base table that points to the candidate key (generally it is the primary key) of another table. The purpose of the foreign key is to ensure referential integrity of the data i.e. only values that are supposed to appear in the database are permitted. E.g. of Foreign Key – Let consider we have another table i.e. Department Table with Attributes “Department_ID”, “Department_Name”, “Manager_ID”, ”Location_ID” with Department_ID as an Primary Key. Now the Department_ID attribute of Employee Table (dependent or child table) can be defined as the Foreign Key as it can reference to the Department_ID attribute of the Departments table (the referenced or parent table), a Foreign Key value must match an existing value in the parent table or be NULL. (V) Composite Key – If we use multiple attributes to create a Primary Key then that Primary Key is called Composite Key (also called a Compound Key or Concatenated Key). E.g. of Composite Key, if we have used “Name, Address” as a Primary Key then it will be our Composite Key. (VI) Alternate Key – Alternate Key can be any of the Candidate Keys except for the Primary Key. E.g. of Alternate Key is “Name, Address” as it is the only other Candidate Key which is not a Primary Key. (VII) Secondary Key – The attributes that are not even the Super Key but can be still used for identification of records (not unique) are known as Secondary Key. E.g. of Secondary Key can be Name, Address, Salary, Department_ID etc. as they can identify the records but they might not be unique.