SQL - all creation, manipulation and use of data objects is done through SQL statements.
DB2 objects - Databases, Tablespaces & Indexspaces - their creation & use, and other terminology associated with databases.
DDL - Data Definition Language
An introduction to SQL
SQL, or Structured Query Language, is
A powerful language that performs the functions of data manipulation (DML), data definition (DDL) and data control or data authorization (DAL/DCL).
A non-procedural language - it acts on a set of data, and the user does not need to know how the data is retrieved. A single SQL statement can do the work of more than one procedure.
The de facto standard query language for RDBMS
Very flexible
SQL - Features:-
Unlike COBOL or 4GLs, SQL is coded without data-navigational instructions. The optimal access paths are determined by the DBMS. This is advantageous because the DBMS knows better than the user how it has stored the data.
What you want and not how to get it
Set level processing & multiple row processing
SQL - Types (based on the functionality)
Data Definition Language (DDL)
-Create, Alter and Drop
Data Manipulation Language (DML)
-Select, Insert, Update and Delete
Data Control Language (DCL)
-Grant and Revoke
SQL - Types (Others)
•Static or Dynamic SQL
•Embedded or Stand-alone SQL
The following are the operations that an SQL statement can perform on the database tables:
Select
Union
Join
Topics dealt with, in DB2 objects
Stogroup, Databases, Tablespaces (types, creation and modification)
Indexspaces (creation and modification)
Some more terms associated with tablespaces
DB2 Objects
• The DB2 Object Hierarchy
Stogroup
• It is a collection of direct access volumes, all of the same device type
• The option is defined as a part of tablespace definition
• When a given space needs to be extended, storage is acquired from the appropriate stogroup
DB2 - IBM’s Relational DBMS
Prerequisite for this course
The participant should be exposed to :
• IBM Mainframe Concepts
• COBOL and File Handling Concepts
• VSAM
Database
• A collection of logically related objects - like Tablespaces, Indexspaces, Tables etc.
• Not a physical kind of object - may occupy more than one disk space
• A STOGROUP & a BUFFERPOOL (a buffer area used to hold recently accessed table and index pages) must be defined for each database.
• Stogroup and user-defined VSAM are the two storage allocations for a DB2 dataset definition.
• In a given database, all the spaces need not have the same stogroup
• Stogroups are, in a sense, the most physical of the various storage objects in DB2
• More than one volume can be defined in a stogroup. DB2 keeps track of which volume was defined first & uses that volume.
Tablespaces
• Logical address space on secondary storage to hold one or more tables
• A ‘SPACE’ is basically an extendable collection of pages with each page of size 4K or 32K bytes.
• It is the storage unit for recovery and reorganization purposes
• Three types of Tablespaces - Simple, Partitioned & Segmented
Simple Tablespaces
• Can contain more than one stored table
• Depending on the application, storing more than one table might enable faster retrieval for joins using these tables
• Usually only one table is preferred. This is because a single page can contain rows from all tables defined in the tablespace.
• LOAD with the REPLACE option deletes all data in the tablespace
Segmented Tablespaces
• Can contain more than one stored table, with each table kept in its own segments
• A ‘Segment’ consists of a logically contiguous set of ‘n’ pages.
• The SEGSIZE parameter sets the number of pages in each segment
• No segment is allowed to contain records for more than one table
• Sequential access to a particular table is more efficient
• Mass Delete is much more efficient than in any other Tablespace
• Reorganizing the tablespace will restore every table to its clustered order
• Lock Table on table locks only the table, not the entire tablespace
• If a table is dropped, the space for that table can be reclaimed with minimum reorg
Partitioned Tablespaces
• Primarily used for Very large tables
• Only one table in a partitioned TS; 1 to 64 partitions/TS
• Numpart parameter specifies the no. of partitions
• It is partitioned in accordance with value ranges for single or a combination of columns. Hence these column(s) cannot be updated
• Individual partitions can be independently recovered and reorganized
• Different partitions can be stored on different storage groups for efficient access.
Tablespace parameters to be specified for TS creation
• LOCKSIZE - indicates the type of locking DB2 performs for the given TS
• Page
• Table
• Tablespace
• ANY - DB2 chooses the lock size, usually starting with page locks
Data Definition Language
CREATE
This statement is used to create objects
Syntax : For Creating a Table
CREATE TABLE <tabname> (Col Definitions)
PRIMARY KEY(Columns) / FOREIGN KEY
UNIQUE (Colname) (referential constraint)
[LIKE Table name / View name]
[IN Database Tablespace Name ]
• FOREIGN KEY (columns) REFERENCES dbname.table ON DELETE 'delete rule'
• Table1 references Table2 (the target) - Table2's primary key is the foreign key defined in Table1
• The delete rules that can be used are CASCADE, RESTRICT & SET NULL (the referential constraint for the foreign key definition)
• Deleting (or updating the primary key of) a row in the target table is allowed under RESTRICT only if no matching rows exist in the referencing table
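The three delete rules can be tried out with a minimal sketch. It assumes Python's sqlite3 module as a stand-in for DB2, and the DEPT and EMP_* tables are hypothetical names invented for this illustration; DB2 syntax and error reporting differ in detail.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
    CREATE TABLE DEPT (DEPT_NO INTEGER PRIMARY KEY, DEPT_NAME TEXT);
    -- Three hypothetical referencing tables, one per delete rule.
    CREATE TABLE EMP_CASCADE  (EMP_NO INTEGER PRIMARY KEY,
        DEPT_NO INTEGER REFERENCES DEPT(DEPT_NO) ON DELETE CASCADE);
    CREATE TABLE EMP_SETNULL  (EMP_NO INTEGER PRIMARY KEY,
        DEPT_NO INTEGER REFERENCES DEPT(DEPT_NO) ON DELETE SET NULL);
    CREATE TABLE EMP_RESTRICT (EMP_NO INTEGER PRIMARY KEY,
        DEPT_NO INTEGER REFERENCES DEPT(DEPT_NO) ON DELETE RESTRICT);
    INSERT INTO DEPT VALUES (10, 'SALES'), (20, 'HR'), (30, 'IT');
    INSERT INTO EMP_CASCADE  VALUES (1, 10);
    INSERT INTO EMP_SETNULL  VALUES (2, 20);
    INSERT INTO EMP_RESTRICT VALUES (3, 30);
""")

# CASCADE: deleting the target row also deletes the referencing row.
conn.execute("DELETE FROM DEPT WHERE DEPT_NO = 10")
cascade_rows = conn.execute("SELECT COUNT(*) FROM EMP_CASCADE").fetchone()[0]

# SET NULL: the referencing row stays, its foreign key becomes null.
conn.execute("DELETE FROM DEPT WHERE DEPT_NO = 20")
setnull_value = conn.execute(
    "SELECT DEPT_NO FROM EMP_SETNULL WHERE EMP_NO = 2").fetchone()[0]

# RESTRICT: the delete is refused while a referencing row exists.
try:
    conn.execute("DELETE FROM DEPT WHERE DEPT_NO = 30")
    restrict_blocked = False
except sqlite3.IntegrityError:
    restrict_blocked = True
```

Each rule is given its own referencing table so that one delete shows exactly one behaviour.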
ALTER
This statement is used for altering all DB2 objects
Syntax : For altering a Table
ALTER TABLE <Tablename>
ADD Column Data-type [ not null with default]
• Alter allows primary & Foreign key specifications to be changed
• It does not support changes to width or data type of a column or dropping a column
DROP
This statement is used for dropping all DB2 objects
Syntax : For dropping a table
DROP TABLE <Tablename>
Some general rules for RI & Table Parameters
• Avoid nulls in columns participating in arithmetic logic or comparisons
• Primary key columns cannot be null
• Limit referential structures to no more than three levels in a direction
• Use DB2's inherent features rather than program-coded RI.
SQL - Selection & Projection
• Selection retrieves a specified subset of rows from a table
• Projection retrieves a specified subset of columns (but all rows) from the table
E.g.: SELECT CUST_NO, CUST_NAME FROM CUSTOMER;
• The WHERE clause defines the predicates for the SQL operation.
• A WHERE clause can combine multiple conditions using AND & OR.
Other Clauses
Many other clauses can be used in conjunction with the WHERE clause to code the required predicate, some are :-
• Between / Not Between
• In / Not In
• Like / Not Like
• IS NULL / IS NOT NULL
SELECT using a range :
Between Clause
E.g. SELECT CUST_NO, CUST_NAME, CUST_ADDR FROM CUSTOMER
WHERE CUST_NO BETWEEN 1000 AND 2000;
In Clause
E.g. SELECT CUST_NO, CUST_NAME, CUST_ADDR FROM CUSTOMER
WHERE CUST_NO IN(1000, 1001,1002);
Like Clause
E.g. SELECT CUST_NO, CUST_NAME, CUST_ADDR
FROM CUSTOMER
WHERE CUST_ID [NOT] LIKE '425%'
Note :- '_' matches a single char ; '%' matches a string of chars
ESCAPE '\' - defines '\' as the escape char; when it precedes '_' or '%' it overrides their special meaning
NULL Clause : To check for null the syntax is 'IS NULL'
E.g. SELECT CUST_NO, CUST_NAME, ORDER_NO
FROM CUSTOMER
WHERE ORDER_NO IS NULL;
A comparison such as ORDER_NO = 5 never evaluates to true for rows where ORDER_NO is null, so nulls must be tested with IS NULL.
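The predicates above can be exercised with a minimal sketch. It assumes Python's sqlite3 module as a stand-in for DB2; the sample rows and the helper `names` are hypothetical and exist only for this illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUST_NO INTEGER, CUST_NAME TEXT,
                           CUST_ID TEXT, ORDER_NO INTEGER);
    INSERT INTO CUSTOMER VALUES
        (1000, 'ANN',  '42501', 5),
        (1500, 'BOB',  '42599', NULL),
        (2500, 'CARL', '77001', 7);
""")

def names(where):
    # Hypothetical helper: CUST_NAME values for a given WHERE predicate.
    sql = "SELECT CUST_NAME FROM CUSTOMER WHERE " + where + " ORDER BY CUST_NO"
    return [row[0] for row in conn.execute(sql)]

between_rows = names("CUST_NO BETWEEN 1000 AND 2000")
in_rows = names("CUST_NO IN (1000, 2500)")
like_rows = names("CUST_ID LIKE '425%'")
null_rows = names("ORDER_NO IS NULL")

# BOB's null ORDER_NO satisfies neither '= 5' nor '<> 5'.
equals_5 = names("ORDER_NO = 5")
not_5 = names("ORDER_NO <> 5")
```

The last two queries show why IS NULL is needed: the row with a null ORDER_NO appears in neither result.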
Order by and Group by clauses:
• Order by sorts the retrieved data in the specified order; it is applied to the rows that satisfy the WHERE clause
• Group by operator causes the table represented by the FROM clause to be rearranged into groups, such that within one group all rows have the same value for the Group by column (not physically in the database). The Select clause is applied to the grouped data and not to the original table.
Here ‘HAVING’ is used to eliminate groups, just like WHERE is used for rows.
E.g. SELECT ORDER_NO, SUM(NO_PRODUCTS)
FROM ORDER
GROUP BY ORDER_NO
HAVING AVG(NO_PRODUCTS) < 10
ORDER BY ORDER_NO ;
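A minimal sketch of how HAVING removes whole groups, assuming Python's sqlite3 module as a stand-in for DB2. The table is named ORDERS here because ORDER is a reserved word in SQLite, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ORDERS (ORDER_NO INTEGER, NO_PRODUCTS INTEGER);
    INSERT INTO ORDERS VALUES (1, 4), (1, 6), (2, 20), (2, 30), (3, 9);
""")

# Rows are grouped by ORDER_NO; HAVING then drops every group whose
# average NO_PRODUCTS is 10 or more (here: order 2, average 25).
rows = conn.execute("""
    SELECT ORDER_NO, SUM(NO_PRODUCTS)
    FROM ORDERS
    GROUP BY ORDER_NO
    HAVING AVG(NO_PRODUCTS) < 10
    ORDER BY ORDER_NO
""").fetchall()
```

Order 1 (average 5) and order 3 (average 9) survive, so the result holds one summed row for each of them.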
Functions
There are two types :
• Column Function
• Scalar Function
Column Functions
• Compute a single aggregate value for the specified column(s) from a group of rows
• AVG, COUNT, MAX, MIN, SUM
Scalar Functions
• Are applied to a column or expression and operate on a single value.
• CHAR, DATE, DAY(S), DECIMAL, DIGITS, FLOAT, HEX, HOUR, INTEGER, LENGTH, MICROSECOND,
MINUTE, MONTH, SECOND, SUBSTR, TIME, TIMESTAMP, VALUE, VARGRAPHIC, YEAR
Complex SQL
• An SQL statement is termed complex when the data to be retrieved comes from more than one table
• SQL provides two ways of coding a complex SQL statement
• Subqueries and
• Joins
Subqueries
• Nested Select statements
• Specified using the IN (or NOT IN) predicate, equality or non-equality predicate ('=' or '<>') and comparative operators (<, <=, >, >=)
• When using the equality, non-equality or comparative operators, the inner query should return only a single value
E.g. SELECT CUST_NO, CUST_NAME
FROM CUSTOMER
WHERE ORDER_NO IN (SELECT ORDER_NO FROM ORDER WHERE NO_PRODUCTS <5);
E.g. SELECT CUST_NO, CUST_ADDR
FROM CUSTOMER
WHERE ORDER_NO =
(SELECT ORDER_NO FROM ORDER
WHERE NO_PRODUCTS = 5);
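• Nested select statements give the user the flexibility to query multiple tables
• A specialized form is the Correlated Subquery - the nested select statement refers back to columns in the outer select statement
• It works in Top-Bottom-Top fashion: the outer query supplies a row, the subquery is evaluated for that row, and control returns to the outer query
• A non-correlated Subquery works in Bottom-to-Top fashion: the inner query is evaluated first, then the outer query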
Correlated Subquery
E.g. SELECT A.CUST_NAME, A.CUST_ADDR
FROM CUSTOMER A WHERE A.ORDER_NO IN
(SELECT ORDER_NO
FROM CUSTOMER B
WHERE A.CUST_ID = B.CUST_ID)
ORDER BY A.CUST_ID, A.CUST_NO ;
Correlated Subquery using EXISTS clause :
E.g. SELECT CUST_NO, CUST_NAME
FROM CUSTOMER A
WHERE EXISTS
(SELECT * FROM ORDER B
WHERE B.ORDER_NO = A.ORDER_NO
AND B.ORDER_NO = 5);
Multiple levels of Subquery
E.g. SELECT CUST_NO, CUST_NAME, CUST_ADDR
FROM CUSTOMER
WHERE ORDER_NO IN
(SELECT ORDER_NO FROM ORDER
WHERE PROD_ID IN
(SELECT PROD_ID
FROM PRODUCTS
WHERE PROD_NAME = 'NUTS'));
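A non-correlated IN subquery and a correlated EXISTS subquery can express the same question. The following is a minimal sketch, assuming Python's sqlite3 module as a stand-in for DB2; the table is named ORDERS because ORDER is reserved in SQLite, and the sample data is invented.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUST_NO INTEGER, CUST_NAME TEXT, ORDER_NO INTEGER);
    CREATE TABLE ORDERS (ORDER_NO INTEGER, NO_PRODUCTS INTEGER);
    INSERT INTO CUSTOMER VALUES (1, 'ANN', 100), (2, 'BOB', 200), (3, 'CARL', 300);
    INSERT INTO ORDERS VALUES (100, 3), (200, 8), (300, 4);
""")

# Non-correlated: the inner query runs once, independently of the outer row.
in_result = [r[0] for r in conn.execute("""
    SELECT CUST_NAME FROM CUSTOMER
    WHERE ORDER_NO IN (SELECT ORDER_NO FROM ORDERS WHERE NO_PRODUCTS < 5)
    ORDER BY CUST_NO
""")]

# Correlated: the inner query refers to A.ORDER_NO from the outer query.
exists_result = [r[0] for r in conn.execute("""
    SELECT CUST_NAME FROM CUSTOMER A
    WHERE EXISTS (SELECT * FROM ORDERS B
                  WHERE B.ORDER_NO = A.ORDER_NO AND B.NO_PRODUCTS < 5)
    ORDER BY A.CUST_NO
""")]
```

Both forms return the customers whose order has fewer than five products.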
Joins
OUTER JOIN : For one or more of the tables being joined, both matching and non-matching rows are returned. Duplicate columns may be eliminated.
For a non-matching row, the columns that come from the other table contain nulls.
INNER JOIN : Only matching rows are returned, so one or more rows from either or both tables being joined may be left out of the result table.
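The difference between the two joins shows up as soon as one row has no match. The following is a minimal sketch, assuming Python's sqlite3 module as a stand-in for DB2 and the explicit JOIN ... ON syntax of standard SQL; the ORDERS table name and the sample rows are invented.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUST_NO INTEGER, CUST_NAME TEXT, ORDER_NO INTEGER);
    CREATE TABLE ORDERS (ORDER_NO INTEGER, NO_PRODUCTS INTEGER);
    INSERT INTO CUSTOMER VALUES (1, 'ANN', 100), (2, 'BOB', 999);
    INSERT INTO ORDERS VALUES (100, 3), (400, 9);
""")

# Inner join: BOB has no matching order, so BOB is left out.
inner_rows = conn.execute("""
    SELECT A.CUST_NAME, B.NO_PRODUCTS
    FROM CUSTOMER A INNER JOIN ORDERS B ON A.ORDER_NO = B.ORDER_NO
    ORDER BY A.CUST_NO
""").fetchall()

# Left outer join: BOB is kept, with null for the ORDERS column.
outer_rows = conn.execute("""
    SELECT A.CUST_NAME, B.NO_PRODUCTS
    FROM CUSTOMER A LEFT OUTER JOIN ORDERS B ON A.ORDER_NO = B.ORDER_NO
    ORDER BY A.CUST_NO
""").fetchall()
```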
Other DML Statements
INSERT
E.g..: INSERT INTO Tablename(column1, column2, column3 ,......)
VALUES( value1, value2, value3 ,........)
If any column is omitted in an INSERT statement and that column is NOT NULL, then the INSERT fails; if the column is nullable, it is set to null
• If the column is defined as NOT NULL WITH DEFAULT, it is set to the default value
• Omitting the list of columns is equivalent to listing all columns, so a value must be supplied for each
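The rules for omitted columns can be checked with a minimal sketch. It assumes Python's sqlite3 module as a stand-in for DB2; the PART table is hypothetical, and SQLite's `NOT NULL DEFAULT 0` is used here in place of DB2's NOT NULL WITH DEFAULT.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PART (
        A INTEGER NOT NULL,           -- must always be supplied
        B INTEGER,                    -- nullable
        C INTEGER NOT NULL DEFAULT 0  -- not null, but has a default
    )
""")

# B and C are omitted: B becomes null, C takes its default.
conn.execute("INSERT INTO PART (A) VALUES (1)")
row = conn.execute("SELECT A, B, C FROM PART").fetchone()

# A is NOT NULL with no default, so omitting it makes the INSERT fail.
try:
    conn.execute("INSERT INTO PART (B) VALUES (2)")
    omitted_not_null_failed = False
except sqlite3.IntegrityError:
    omitted_not_null_failed = True
```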
SELECT - INSERT
E.g. INSERT INTO TEMP (A#, B)
SELECT A#, SUM(B)
FROM TEMP1 GROUP BY A# ;
UPDATE
E.g.. UPDATE tablename
SET Columnname(s) = scalar expression WHERE [ condition ]
• Single or Multiple row updates
• Update with a Subquery
DELETE
E.g. DELETE FROM Tablename
WHERE [condition ];
• Single or multiple row delete or deletion of all rows
Static SQL
• Hard-coded into an application program
• cannot be modified during the program’s execution except for changes to the values assigned to the host variables
• Cursors are used to access set-level data (i.e when a SQL SELECT returns more than 1 row)
• The general form is EXEC SQL
[SQL statements]
END-EXEC.
Dynamic SQL
• Statements can change throughout the program’s execution
• When the SQL is bound, the application plan or package that is created does not contain the same information as that for a static SQL program
• The access paths cannot be determined before execution
What is an Index ?
‘An index is an ordered set of pointers to rows of a base table’.
Or
‘An Index is a balanced B-tree structure that orders the values of columns in a table’
Why an Index ?
‘One can access data directly and more efficiently’
• Each index is based on the values of data in one or more columns. An index is an object that is separate from the data in the table.
• When you define an index using the CREATE INDEX statement, DB2 builds this structure and maintains it automatically.
• Indexes can be used by DB2 to improve performance and ensure uniqueness.
• In most cases, access to data is faster with an index.
• A table with a unique index cannot have rows with identical keys.
Syntax : For creation of an Index
CREATE INDEX <indexname> ON <tabname>
(colname asc/desc)
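A minimal sketch of a unique index rejecting a duplicate key, assuming Python's sqlite3 module as a stand-in for DB2; the index name CUSTIX and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUST_NO INTEGER, CUST_NAME TEXT)")
conn.execute("CREATE UNIQUE INDEX CUSTIX ON CUSTOMER (CUST_NO ASC)")

conn.execute("INSERT INTO CUSTOMER VALUES (1000, 'ANN')")

# A second row with the same key violates the unique index.
try:
    conn.execute("INSERT INTO CUSTOMER VALUES (1000, 'BOB')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM CUSTOMER").fetchone()[0]
```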
Other DB2 Objects
VIEWS
• It is a logical derivation of a table from other table/tables. A View does not exist in its own right.
• They provide a certain amount of logical independence
• They allow the same data to be seen by different users in different ways
• In DB2 a view that is to accept an update must be derived from a single base table
Aliases
• Mean ‘another name’ for the table.
• Aliases are used basically for accessing remote tables (in distributed data processing), which add a location prefix to their names.
• Using aliases creates a shorter name.
Synonym
• Also means another name for the table, but is private to the user who created it.
Syntax:
CREATE VIEW <Viewname> (<columns>)
AS Subquery (Subquery - SELECT FROM other Table(s))
CREATE ALIAS <Aliasname> FOR <Tablename>
CREATE SYNONYM <Synonymname> FOR <Tablename>
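That a view holds no data of its own can be seen in a minimal sketch. It assumes Python's sqlite3 module as a stand-in for DB2 (SQLite has no aliases or synonyms, so only the view is shown); the view name CUSTNAMES and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUST_NO INTEGER, CUST_NAME TEXT, CUST_ADDR TEXT);
    INSERT INTO CUSTOMER VALUES (1000, 'ANN', 'PUNE');
    CREATE VIEW CUSTNAMES (CUST_NO, CUST_NAME)
        AS SELECT CUST_NO, CUST_NAME FROM CUSTOMER;
""")

before = conn.execute("SELECT * FROM CUSTNAMES ORDER BY CUST_NO").fetchall()

# A row added to the base table is visible through the view at once,
# because the view is derived from the table each time it is queried.
conn.execute("INSERT INTO CUSTOMER VALUES (1001, 'BOB', 'DELHI')")
after = conn.execute("SELECT * FROM CUSTNAMES ORDER BY CUST_NO").fetchall()
```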
Application programming using DB2
Application environments supporting DB2 :
• IMS(Batch/Online), CICS, TSO(Batch/Online)
• CAF - Call Attachment Facility
• All DB2 application types can execute concurrently
• Host Language support - COBOL, PL/I, C, Fortran or Assembler language
Steps involved in creating a DB2 application
Coding the application
• using Embedded SQL
• using Host variables (DCLGEN)
• using SQLCA
• pre-compile the program
• compile & link edit the program
• bind
Note : Cursors can also be used
Embedded SQL statements
• It is like the file I/O
• Normally the embedded SQL statements contain the host variables coded with the INTO clause of the SELECT statement.
• They are delimited with EXEC SQL ...... END-EXEC.
• E.g. EXEC SQL
SELECT Empno, Empname INTO :H-empno, :H-empname
FROM EMPLOYEE
WHERE empno = 1001
END-EXEC.
Host Variables
• These are variables (or rather areas of storage) defined in the host language that pass values to and from the columns of a DB2 table. They are referenced in the SQL statement.
• A means of moving data from and to DB2 tables
• DCLGEN produces host variables, the same as the columns of the table
Host variables can be used
• In WHERE Clause of Select, Insert, Update & Delete
• ‘INTO’ Clause of Select & Fetch statements
• As input of ‘SET’ Clause of Update Statements
• As Input for the ‘VALUES’ Clause of Insert statements
• As Literals in Select list of a Select Statement
E.g. SELECT Cust_No, Cust_name, Cust_addr
INTO :H-CUST-NO, :H-CUST-NAME,
:H-CUST-ADDR
FROM CUSTOMER
WHERE CUST_NO = :H-CUST-NO;
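Outside COBOL, the closest everyday analogue to host variables is a parameterised query. The following is a minimal sketch, assuming Python's sqlite3 module as a stand-in for DB2; the variable names `h_cust_no`, `h_cust_name` and `h_cust_addr` are hypothetical and merely mirror the host variables in the example above.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUST_NO INTEGER, CUST_NAME TEXT, CUST_ADDR TEXT);
    INSERT INTO CUSTOMER VALUES (1000, 'ANN', 'PUNE'), (1001, 'BOB', 'DELHI');
""")

# Input value: plays the role of :H-CUST-NO in the WHERE clause.
h_cust_no = 1000

row = conn.execute(
    "SELECT CUST_NO, CUST_NAME, CUST_ADDR FROM CUSTOMER "
    "WHERE CUST_NO = :h_cust_no",
    {"h_cust_no": h_cust_no},
).fetchone()

# Output values: unpacking the row plays the role of the INTO clause.
h_cust_no, h_cust_name, h_cust_addr = row
```

The SQL text stays fixed; only the values bound to it change between executions, which is also what the static SQL section describes.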
DCLGEN
• Issued for a single table
• Prepares the structure of the table in a COBOL copybook
• The copybook contains a ‘SQL DECLARE TABLE’ statement along with a working storage host variable definition for the table
SQLCA
• An SQLCA is a structure or collection of variables that is updated after each SQL statement executes.
• An application program that contains executable SQL statements must provide exactly one SQLCA.
Structure of the SQLCA (for COBOL)
01 SQLCA.
05 SQLCAID PIC X(8).
05 SQLCABC PIC S9(9) COMP.
05 SQLCODE PIC S9(9) COMP.
05 SQLERRM.
:
05 SQLWARN.
10 SQLWARN0 PIC X(1).
:
10 SQLWARNA PIC X(1).
10 SQLSTATE PIC X(5).
Cursors
• Used when a SELECT can return more than one row
• Can be likened to a pointer
• Can be used for modifying data using ‘FOR UPDATE OF’ clause
The four (4) Cursor control statements are –
• Declare : name assigned for a particular SQL statement
• Open : readies the cursor for row retrieval; sometimes builds the result table. However it does not assign values to the host variables
• Fetch : returns data from the results table one row at a time and assigns the value to specified host variables
• Close : releases all resources used by the cursor
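The four control statements map loosely onto the cursor object of most database APIs. The following is a minimal sketch, assuming Python's sqlite3 module as a stand-in for embedded SQL in COBOL; the EMP rows are invented, and the mapping in the comments is an analogy rather than DB2 behaviour.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP (EMPNO INTEGER, JOB TEXT);
    INSERT INTO EMP VALUES (1001, 'CLERK'), (1002, 'ANALYST'), (1003, 'MANAGER');
""")

cur = conn.cursor()                                      # roughly DECLARE
cur.execute("SELECT EMPNO, JOB FROM EMP ORDER BY EMPNO")  # roughly OPEN

fetched = []
while True:
    row = cur.fetchone()   # FETCH: one row at a time
    if row is None:        # end of result table (SQLCODE +100 in DB2)
        break
    fetched.append(row)

cur.close()                # CLOSE: release the cursor's resources
```

The loop ends on the not-found condition, which is the situation the WHENEVER NOT FOUND clause handles in an embedded SQL program.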
OPEN
E.g. - For the Open statement
FETCH
E.g. - For the Fetch statement
CLOSE
E.g. - For the Close statement
EXEC SQL
CLOSE EMPCUR
END-EXEC.
WHENEVER
E.g. - For the Whenever Clause
EXEC SQL
WHENEVER NOT FOUND
Go To Close-EMPCUR
END-EXEC.
UPDATE
E.g. - For the Update statement using cursors
EXEC SQL
UPDATE EMP
Set Job = :New-job
WHERE current of EMPCUR
END-EXEC.
DELETE
E.g. - For the Delete statement using cursors
EXEC SQL
DELETE FROM EMP
WHERE current of EMPCUR
END-EXEC.
Application development guidelines
• Code modular DB2 programs and make them as small as possible
• Use unqualified SQL statements; this enables movement from one environment to another(test to production)
• Never use 'Select *' in an embedded SQL program
• Use joins rather than subqueries
• Use WHERE clause and filter out data
• Use cursors when fetching multiple rows, though they add overheads
• Use FOR UPDATE OF clause for UPDATE or DELETE with cursor - this ensures data integrity.
• Use Inserts minimally ; use LOAD utility instead of INSERT, if the inserts are not application dependent
QMF - Query Management Facility
• It is an MVS- and VM- based query tool
• allows end users to enter SQL queries to produce a variety of reports and graphs as a result of this query
• QMF queries can be formulated in several ways : by direct SQL statements, by means of relational prompted query interface or by query-by-example (QBE). QBE is similar to SQL in some ways but more user friendly
SPUFI - SQL Processing Using File Input
• Supports the online execution of SQL statements from a TSO terminal
• Used for developers to check SQL statements or view table details
• SPUFI menu contains the input file in which the SQL statements are coded, option for default settings and editing and the output file.
Precompile
• Searches for all the SQL statements and DB2 related INCLUDE members and comments out every SQL statement in the program
• The SQL statements are replaced by a CALL to the DB2 runtime interface module, along with parameters.
• All SQL statements are extracted and put in a Database Request Module (DBRM)
• Places a timestamp in the modified source and the DBRM so that the two are tied together. If the timestamps do not match at run time, SQLCODE -818 (timestamp mismatch) occurs
• All DB2 related INCLUDE statements must be placed between the EXEC SQL & END-EXEC keywords for the precompiler to recognize them
Compile & Link
• Modified precompiler COBOL output is compiled
• Compiled source is link edited to an executable load module
• Appropriate DB2 host language interface module should also be included in the link edit step (e.g. DSNELI)
Bind
• A type of compiler for SQL statements
• It reads the SQL statements from the DBRM and produces a mechanism to access data (in an efficient manner) as directed by the SQL statements being bound
• Checks syntax, checks for correctness of table & column definitions against the catalog information & performs authorization validation
Bind Types
• BIND PLAN : accepts as input one or more DBRMs and outputs an application plan containing executable logic representing optimized access paths to DB2 data.
• BIND PACKAGE : accepts as input a single DBRM and produces a single package containing the optimized access path. The PLAN in this case contains a reference to the physical location of the package(s).
What is a Package ?
• It is a single bound DBRM with optimized access paths
• It also contains a location identifier, a collection identifier and a package identifier
• A package can have multiple versions, each with its own version identifier
Advantages of Package
• Reduced bind time
• Can specify bind options at the programmer level
• Versioning
• Provides remote data access(in version DB2 V2.3 or higher)
What is a Plan ?
• An application plan contains one or both of the following elements:
• A list of package names
• The bound form of SQL statements taken from one or more DBRMs.
• Every DB2 application requires an application plan.
• Plans are created using the DB2 subcommand BIND PLAN
For the following, refer to the handout
• List of common SQL return codes and solutions
DB2 Utilities
DB2 System administration
DB2 UTILITIES
• Check
• Copy/Mergecopy
• Recover
• Load
• Reorg
• Runstats
• Explain
Check
• Checks the integrity of DB2 data structures
• Checks the referential integrity between two tables and also checks DB2 indexes for consistency
• Can delete invalid rows and copy them to an exception table
• Use CHECK DATA when loading a table without specifying the ‘ENFORCE CONSTRAINTS’ option or after the partial recovery of tablespaces in a referential set
Copy
• Used to create an imagecopy for the complete tablespace or a partition of the tablespace - full imagecopy or incremental imagecopy
• Every successful execution of the COPY utility places at least one row in the table SYSIBM.SYSCOPY that indicates the status of the imagecopy