CIS 211 Study Guide - Dutchess Community College

FILE ORGANIZATION STUDY GUIDE
TABLE OF CONTENTS:
1. Disk Storage Concepts
2. Processing Files
3. ISAM
4. VSAM
5. Access Method Services
6. Variable Length Records
7. Alternate Index
8. Data Structures
9. Relational Databases
10. Normalization
11. Structured Query Language (SQL)
12. Embedded SQL
Appendix A. COBOL with SQL
Appendix B. Microcomputer SQL
Appendix C. Modulus 11 in COBOL
Appendix D. Modulus 11 in Pascal
Appendix E. CALL an external module in Fujitsu COBOL
Appendix F. VSAM COBOL Status Codes
DISK STORAGE CONCEPTS.
DASD - direct access storage device
inexpensive fast & reliable online data storage
virtual storage
update records in place
CONCEPTS IN DATA ORGANIZATION.
logical record - one user data record as processed by application program
physical record (block) - the data between the gaps
control interval –
actual unit of data transfer in VSAM
contains physical record plus control information
IBM mainframe DASD devices can be one of two types:
FBA – fixed block architecture
CKD – count key data
blocking - how many logical records in one physical record
add 1 block for the EOF on an FBA device
Error detection and correction on an FBA device:
CRC - cyclic redundancy check
- write a control total with each record
Parity checking is used between the host CPU and the controller, but
CRC is used between the controller and the 3370.
CONCEPTS IN ACCESSING DATA.
access motion time (seek time) - select correct cylinder
head selection - electronically activate head over a track
rotational delay - half a rotation used as average
data transfer rate - kilobytes / second
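The access-time components listed above can be added up to estimate how long one block takes to read. The following is a minimal sketch in Python; the drive figures are hypothetical, head selection is treated as negligible because it is electronic, and a kilobyte is assumed to be 1000 bytes.

```python
def average_access_time_ms(seek_ms, rpm, block_bytes, transfer_kb_per_sec):
    """Estimate the time to read one block, in milliseconds."""
    rotation_ms = 60_000 / rpm               # time for one full rotation
    rotational_delay_ms = rotation_ms / 2    # half a rotation used as average
    # Assumption: 1 kilobyte = 1000 bytes
    transfer_ms = block_bytes / (transfer_kb_per_sec * 1000) * 1000
    return seek_ms + rotational_delay_ms + transfer_ms


# Hypothetical drive: 20 ms average seek, 3600 rpm, 1000 KB/second
print(round(average_access_time_ms(20, 3600, 4096, 1000), 2), "ms")
```

With figures like these, seek time and rotational delay are much larger than the transfer time, which is why blocking (fewer, larger physical records) pays off.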
PROCESSING FILES.
Definitions:
bit, byte, field, record, file, database
File
- collection of related records.
- called a data set in OS
- called a cluster in VSAM
volatility - frequency of record addition & deletion
static = low
dynamic = high
activity - % of records accessed to total records
size - leave room for future growth
File processing - the manner in which the blocks are read from or written to. Examples:
sequential - used in batch processing with activity % over 60
direct
random
indexed
File Access Methods:
SAM - sequential access method
DAM - direct access method
ISAM - indexed sequential access method
VSAM - virtual storage access method
VSAM:
designed especially for DASD
device independent
position individual records of a file on the storage medium without
respect to the physical characteristics of the DASD.
Note: a record is accessed by its displacement, in bytes, from the
beginning of the cluster (file); called RBA (relative byte addressing)
supports both sequential and direct access
SEQUENTIAL FILE ACCESS METHOD - REVIEW
Required mainframe DOS/VSE JCL:
DLBL
EXTENT
ASSGN
EXEC
Required COBOL connection:
SELECT / ASSIGN – Environment Division
Which is first? JCL or COBOL?
Processing in COBOL:
OPEN determines which verbs will be allowed
READ determines how records will be accessed
CLOSE writes any records still in buffer, plus EOF
Importance of making backups :
Sequential access:
File rotation scheme
Direct access:
Updating THE master file (versus OLD and NEW master files)
Audit trail
Any file access:
Offsite / secure storage
Site redundancy
PROCESSING INDEXED FILES.
index
- cross-reference table created and stored on disk
- relates the key field to the corresponding address of the data
- the key field is defined somewhere in the 01 record for the indexed file.
An indexed file is created sequentially, sorted by key field, in ascending order.
There can be no duplicate key field values. After creation, the file can be accessed
sequentially or randomly.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT filename
        ASSIGN TO device
        ORGANIZATION IS INDEXED
        ACCESS IS SEQUENTIAL (default) | RANDOM | DYNAMIC
        RECORD KEY IS
        FILE STATUS IS
        .
File status:
a field in working storage defined as PIC XX
holds the two-character status return code
Reading this file:
READ statement form varies with type of access used.
Examples: READ; READ NEXT; READ KEY IS
Dynamic access
file can be processed with either sequential or random access.
To update the file in place using the REWRITE statement:
1. obtain the value of the key field from transaction or algorithm
2. open the indexed master file for I-O
3. move the key field value to the key field of the master file record
4. READ a master record
5. make changes in the record area
6. REWRITE the master record
7. check system result by status code
You can REWRITE without an initial READ. Should you?
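The update-in-place steps above can be sketched outside COBOL. The Python below is only an illustration: a dictionary stands in for the indexed master file, the function names are hypothetical, and the status values '00' (successful) and '23' (record not found) are the only ones modeled.

```python
# A dict stands in for the indexed master file: key field -> record.
master = {"A100": {"name": "WIDGET", "qty": 5}}


def read_record(key):
    """Random READ by key: returns (status, copy of the record)."""
    if key in master:
        return "00", dict(master[key])   # the copy plays the role of the record area
    return "23", None                    # record not found


def rewrite_record(key, record):
    """REWRITE: replace an existing record and return the status."""
    if key not in master:
        return "23"
    master[key] = record
    return "00"


def update_quantity(key, new_qty):
    status, record = read_record(key)      # steps 3-4: move key, READ
    if status != "00":
        return status                      # step 7: always check the status
    record["qty"] = new_qty                # step 5: change the record area
    return rewrite_record(key, record)     # steps 6-7: REWRITE and check
```

Reading first means the program rewrites the current contents of the record plus its changes, instead of whatever happened to be in the record area.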
You can also DELETE the record, or WRITE new records:
DELETE filename
WRITE recordname
Use of ALTERNATE KEYS: see chapter on Alternate Index (AIX).
Duplicate values of alternate keys are permitted.
PROCESSING RELATIVE FILES.
Relative files are another way (besides indexed files) to access files randomly.
The key field is, or is converted to, the actual storage address for the record.
Advantage: fastest access. No index needed.
The key field, defined in working storage, is called the relative key. The programmer
must calculate its value.
The file is created as slots with blank records wherever a record does not exist with the
key value. There may be an algorithm to compute the value of the key (see below).
Alternate keys are NOT allowed. Variable length records are NOT allowed.
SELECT
ASSIGN TO
ORGANIZATION IS RELATIVE
ACCESS IS
RELATIVE KEY IS
FILE STATUS IS
.
To obtain the value of the relative key, if the key field is too large or not suitable (would
leave too many gaps):
1. Best if no conversion needed; key field IS relative key.
2. HASHING algorithm (also known as randomizing).
3. Other algorithms exist. Collisions from synonyms must be handled.
A collision is two or more records assigned to the same space, called a bucket.
Note that a collision triggers an INVALID KEY and requires an algorithm to
compute the new location of the record in the overflow area. There are "digit
analysis" programs which can recommend the best algorithm to minimize
collisions.
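As a rough illustration of collision handling, the Python sketch below keeps a small prime area of slots and a separate overflow list. The slot count, the function names and the sequential search of the overflow area are all assumptions made for the example; a real system may compute the overflow location differently.

```python
PRIME_SLOTS = 7                      # hypothetical number of slots in the prime area
prime_area = [None] * PRIME_SLOTS    # None marks an empty slot
overflow_area = []                   # synonyms that lost their home slot


def home_slot(key):
    """Hash the key to its home slot (division-remainder method)."""
    return key % PRIME_SLOTS


def store(key, data):
    slot = home_slot(key)
    if prime_area[slot] is None:
        prime_area[slot] = (key, data)
        return "prime"
    overflow_area.append((key, data))   # collision: place the synonym in overflow
    return "overflow"


def fetch(key):
    entry = prime_area[home_slot(key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    for stored_key, data in overflow_area:   # not in home slot: search overflow
        if stored_key == key:
            return data
    return None
```

Keys 10 and 17 both hash to slot 3 here, so the second one stored lands in the overflow area and costs an extra search to retrieve.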
Some hashing algorithms:
1. Divide the key value by a prime number just larger than the number of records,
and use the remainder as the relative key.
2. Square the key value and truncate to the number of digits needed.
3. Use extraction of digit(s) in a fixed position of the key.
4. Split the key into two or more parts, add, truncate.
5. Radix transformation. Convert key to another base.
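The five methods above might look like this in Python. This is a sketch under stated assumptions: keys are positive integers, "truncate" is taken to mean keeping the low-order digits, and 1 is added to the remainder in the first method because relative record numbers start at 1. The function names are hypothetical.

```python
def division_remainder(key, prime):
    """1. Remainder after dividing by a prime, plus 1 so slots start at 1."""
    return key % prime + 1


def square_and_truncate(key, digits):
    """2. Square the key and keep the low-order `digits` digits."""
    return (key * key) % 10 ** digits


def digit_extraction(key, positions):
    """3. Pick the digits at fixed positions (0-based, from the left)."""
    text = str(key)
    return int("".join(text[p] for p in positions))


def folding(key, part_len, digits):
    """4. Split the key into parts, add the parts, truncate the sum."""
    text = str(key)
    parts = [int(text[i:i + part_len]) for i in range(0, len(text), part_len)]
    return sum(parts) % 10 ** digits


def radix_transformation(key, base, digits):
    """5. Read the key's decimal digits as a number in another base."""
    value = 0
    for digit in str(key):
        value = value * base + int(digit)
    return value % 10 ** digits
```

For the key 123456, `folding(123456, 2, 2)` adds 12 + 34 + 56 = 102 and keeps the last two digits, giving 2.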
Efficient processing of relative files:
It is NOT efficient to process relative files sequentially if the file was created
with a hashing algorithm.
1. Records are "out of order".
2. Lots of unused disk space.
COBOL file processing statements:
OPEN I-O
READ or READ INTO has several forms depending on type of access used:
READ filename [AT END] – sequential read in sequential access
READ filename KEY IS – random read in random or dynamic access
KEY IS key field or KEY IS alternate key field – tells the system
which field to use to read the file by
READ filename NEXT – sequential read in dynamic access
WRITE recordname INVALID KEY
CLOSE
START – used to establish the CRP (current record pointer) if you do not want to start at
the beginning of the file; followed by a READ or READ NEXT for sequential read
REWRITE recordname
DELETE filename
File Status return codes:
The system returns a two-character status code after every I/O operation on an
indexed file. These codes must be checked by the programmer after each I/O verb
affecting the indexed file. A value of '00' indicates successful processing. Any higher
value is a warning or error, although some warnings are expected during normal
processing. Any value greater than '23' indicates a serious error, and processing should
be terminated after showing the reason and closing all files.
The file status return codes may be used with DECLARATIVES / USE AFTER ERROR
to let the system show the code if any I/O error occurs for the file. In this case,
INVALID KEY would not be used.
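The rule of thumb for file status values can be written as a small check. The Python below is a sketch that assumes two-digit codes; the category names are hypothetical labels, not system terms.

```python
def classify_status(status):
    """Classify a two-character file status using the guide's thresholds."""
    if status == "00":
        return "ok"
    if status <= "23":        # two-digit strings compare in numeric order
        return "check"        # warning or expected condition
    return "serious"          # show the reason, close all files, terminate


for code in ("00", "10", "23", "30"):
    print(code, classify_status(code))
```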
EXAM 1 includes material to this point.
ISAM.
Indexed Sequential Access Method
An access method supporting sequential and direct file processing.
ISAM maintains the logical order of records by an index, or cross-reference file, even
when the physical order is changed from additions or deletions of records.
To delete a record from ISAM, move HIGH-VALUES to the one-byte field at the start of
the record. The record is NOT physically deleted.
When the file is first built, it is built sequentially, and in order. When records are then
added during an update, they are added in entry order at the end of the file. Therefore,
after an update, some of the records in an ISAM file will be "out of order" physically.
ISAM maintains a track index, prime data tracks, and overflow areas.
The system maintains POINTERS to keep the logical order.
Each cylinder of an ISAM file has one index. Each record of data is actually two parts:
normal and overflow.
Problems with ISAM:
1. After a time, ISAM files need to be reorganized (rebuilt) by the programmer,
or performance suffers. Why?
2. ISAM is limited by the physical size and characteristics of the storage device.
3. Like any DASD file that is updated directly, care must be taken to provide:
backups
audit trails
VSAM.
Virtual Storage Access Method - a revolutionary IBM product that replaced ISAM:
an access method for direct or sequential processing of fixed and variable length
unblocked records on direct access devices.
logical records are stored in VSAM format, invisible to the user.
Relative byte addressing (RBA) is used to describe the offset, in bytes, of a record from
the beginning of the file. However, the physical location of indexed records changes as
records are added. Index in virtual storage. All transparent to user.
ADVANTAGES:
more efficient than ISAM because index structure is handled in virtual storage as
much as possible
no overflow areas; free space is available throughout
overcomes physical limitations of DASD device (device independent) – why?
allows sequential, indexed, and relative file access
DISADVANTAGES:
sequential access alone would not be worth trouble of VSAM; however, indexed
processing allows sequential access when preferred
often, much wasted disk space for data space and catalog(s)
Serious I/O errors can go undetected because VS COBOL does not handle
VSAM error checking; the programmer must check the FILE STATUS codes.
VSAM processes three types of files:
KSDS - key sequence data set
ESDS - entry sequence data set
RRDS - relative record data set
Records of a KSDS or ESDS file may be either fixed or variable length.
Records of an RRDS are always fixed length.
KSDS file processing:
A file of existing data is built sequentially in ascending order by key value with no
duplicates of the key field value
efficient use of the index; kept mostly in virtual storage; data and index are only
physically stored on master catalog by a CLOSE statement
accessed sequentially or randomly
free space can be allocated (NOTE: default = no free space)
free space allows for records to be added or lengthened easily
can support variable length records
does not have to be reorganized often by programmer
VSAM recovers space when records are deleted or shortened
KSDS is most frequently used VSAM organization
Definitions about the INDEX.
Used to locate a record in a KSDS file
both the index and the file are defined as a cluster
Definition: the “main” index is called the prime index
index relates the key field value to the RBA location in file
RBA: relative byte address; the record's offset, in bytes, from the beginning of the file
value of the key field must not be altered during processing one record. Why?
VSAM index is a file with one or more levels; each level is a set of control intervals
each control interval has one index record that can have one or more index entries
Definition: lowest level of the index is called the sequence set; one per control area
records in all higher levels are collectively called the index set
alternate index possible for KSDS; may have duplicates
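To make the index levels concrete, the Python sketch below models a tiny KSDS with four control intervals in two control areas. All of the data and names are invented for illustration, and each index entry is assumed to hold the highest key of the unit it points to.

```python
from bisect import bisect_left

# Data control intervals, each holding record keys in ascending order.
control_intervals = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
    [100, 110, 120],
]
# Sequence set: one index record per control area. Each entry is
# (highest key in a control interval, control interval number).
sequence_set = [
    [(30, 0), (60, 1)],      # control area 0
    [(90, 2), (120, 3)],     # control area 1
]
# Index set: (highest key covered by a sequence-set record, record number).
index_set = [(60, 0), (120, 1)]


def find_entry(entries, key):
    """Return the pointer of the first entry whose high key >= key."""
    highs = [high for high, _ in entries]
    pos = bisect_left(highs, key)
    return entries[pos][1] if pos < len(entries) else None


def locate(key):
    """Return the control interval number holding key, or None."""
    seq_no = find_entry(index_set, key)                # search the index set
    if seq_no is None:
        return None
    ci_no = find_entry(sequence_set[seq_no], key)      # search the sequence set
    if ci_no is None or key not in control_intervals[ci_no]:
        return None
    return ci_no
```

Each level narrows the search, so only one control interval of data has to be read to find or reject a key.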
VSAM CATALOGS and data storage concepts:
A catalog keeps track of file and space characteristics. A catalog is similar to VTOC but
relates to VSAM data space, and VSAM manages the location of data for you.
Therefore, no EXTENT card!
4 major VSAM components:
MASTER CATALOG
optional USER CATALOG(s)
data space
files (clusters).
CATALOG contains DATA SPACE; DATA SPACE contains CLUSTERS.
CLUSTER -
VSAM data set = a file.
DEFINE a cluster does NOT mean that data is put into it; use REPRO command in
IDCAMS utility program. See chapter on Access Method Services.
You do NOT choose blocks; assigned when cluster defined.
Password protection and overwrite protection possible.
CONTROL INTERVAL (CI) - unit of data transfer between virtual storage & DASD.
must be a binary multiple of 512 bytes; FBA device = 512
maximum size is 32,768 bytes (CKD only)
size is independent of the type of DASD used, but the size is chosen by VSAM
for the device on which it is defined
stores data records and control information about them
size & number of control intervals per control area is fixed
VSAM chooses size & number of logical records. Depends on:
maximum record size
size of VSAM I/O buffers
the type of DASD device used
COBOL programmer does NOT block VSAM records.
No recording mode!
No label records!
No block contains!
It is possible to have spanned records (larger than one control interval) in ESDS or
KSDS.
CONTROL AREA (CA) - contains control intervals & free space
free space for KSDS additions is defined with cluster
adding records out of sequence may cause a control interval split; this can continue as
long as there is free space
Remember, VSAM reclaims space if record is shortened or deleted.
Control area is normally the size of a cylinder on the DASD device
Each Control Area has same number of Control Intervals
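A control interval split can be pictured with the simplified Python model below. It is a sketch, not VSAM's actual algorithm: a control area is a list of control intervals, an empty list is a free control interval, the capacity of four records is hypothetical, and a split moves the upper half of the records to the free control interval.

```python
from bisect import insort

CI_CAPACITY = 4   # hypothetical number of records that fit in one control interval


def insert_record(control_area, key):
    """Insert key into a control area; returns what happened."""
    # Logical (key) order of the used control intervals, as the index keeps it.
    used = sorted((ci for ci in control_area if ci), key=lambda ci: ci[-1])
    if not used:
        control_area[0].append(key)
        return "inserted"
    # Target: first control interval whose highest key >= key, else the last one.
    target = next((ci for ci in used if ci[-1] >= key), used[-1])
    if len(target) < CI_CAPACITY:
        insort(target, key)
        return "inserted"
    free = next((ci for ci in control_area if not ci), None)
    if free is None:
        return "no free space"       # a control area split would be needed
    insort(target, key)
    half = len(target) // 2
    free.extend(target[half:])       # upper half moves to the free control interval
    del target[half:]
    return "split"
```

After a split the control intervals are no longer in physical key order; in this model the sort by highest key plays the part of the index that keeps the logical order.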
DATA SPACE - storage area for use exclusively by VSAM, defined in the VTOC.
Contains continuous set of Control Intervals.
VSAM also maintains a 'pool' of free space. What about disk efficiency?
CATALOG -
contains information on VSAM data sets.
This is an area on disk not used by files running under ISAM or SAM.
We store SAM info in VTOC but VSAM names in the catalog.
There is a Master Catalog (STUDENT.MCAT at DCC) and there can be User Catalog(s)
if more security is needed. User catalogs are also safer: damage to one catalog affects only the files in it.
ACCESS METHOD SERVICES.
Processing of VSAM files requires the use of the Access Method Services, IDCAMS, to access
the VSAM files through the VSAM catalog. VSAM files are identified by the COBOL external
name.
// EXEC IDCAMS,SIZE=AUTO
syntax: COMMAND PARAMETER
Commands available are Functional or Modal commands.
functional:
DELETE        release storage space; includes options
              delete cluster removes index, data, and any alternate index defined on it.
   PURGE      overwrite protected file
   ERASE      binary 0 overwrite sensitive data
DEFINE        reserve catalog, space, or cluster
REPRO         almost any file organization to another (repro = reproduce)
PRINT         CHARACTER, HEX, or DUMP (both, by default)
LISTCAT       see cluster, data, and index names; catalog
              Ex. LISTC ENTRIES(base.cluster.name) ALL -
VERIFY        was EOF set properly; uses CC condition code
IMPORT        from another computer system
EXPORT        to another computer system
PARM SYNCHK   check syntax without altering any data!
modal:
IF LASTCC = 0   modal command reads condition code
THEN            some IDCAMS command
ELSE            another IDCAMS command; ELSE is optional
Condition Codes used in LASTCC in the IF command:
0 = successful processing
4 = warning
8 = error
12 = error and function not performed
16 = severe error; rest of job flushed
VSAM names require that each string of 8 characters (or less) be separated with a period.
No trailing period. Name = 44 characters maximum length. First character MUST be a
letter.
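The naming rules above can be checked mechanically. The Python function below is a small sketch with a hypothetical name; it tests only the rules stated here (segments of 1 to 8 characters separated by periods, no trailing period, 44 characters at most, first character a letter) and ignores any further restrictions a real system may apply.

```python
def is_valid_vsam_name(name):
    """Check a name against the rules stated in this guide."""
    if not name or len(name) > 44:
        return False
    if not name[0].isalpha():
        return False
    # A trailing or doubled period produces an empty segment, which fails.
    return all(1 <= len(segment) <= 8 for segment in name.split("."))


print(is_valid_vsam_name("yourname.master.file"))
print(is_valid_vsam_name("toolongsegment.file"))
```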
Parameters of the same command can be split over lines and a hyphen is the continuation
character.
The hyphen must be the last non-blank character on the line. To end a command, either
omit the hyphen on its last line or use a semicolon as the terminator.
The order of parameters is not significant.
Must have matching parentheses total for each command, even though many individual
lines do not.
Commands in col. 2 - 72.
Parameters must be separated. May use space or comma.
/* comments flanked by */ also can indent for readability
Always use SIZE= in VSAM. Size=AUTO leaves room for other modules to be loaded
in the partition.
Passwords are optional for cluster and catalog. Default is not to require a password for
clusters.
READPW parameters - read only
UPDATEPW parameters - update
MASTERPW - all operations permitted
Normal order of IDCAMS commands:
DELETE
DEFINE
REPRO
Do some COBOL processing
PRINT results
DEFINE CLUSTER (NAME(yourname.master.file) -
    VOLUME(DAP000) RECORDS(#) -                  allow for additions - future growth
    RECORDSIZE(80,80) -                          1st = average size; 2nd = maximum size
    KEYS(x,y) -                                  x = length of field; y = offset = # characters
                                                 displaced from beginning of record
    FREESPACE(10,20)) -                          (CI%,CA%) room for added records
  DATA (NAME(yourname.master.file.data)) -       otherwise system supplies odd names
  INDEX (NAME(yourname.master.file.index)) -
  CATALOG(STUDENT.MCAT/IOXYE)                    name of ours
Also, use of PARM SYNCHK allows the Access Method Services to check the syntax of
your IDCAMS commands without taking action or altering any actual data.
In the above example, the name of the VSAM cluster used is yourname.master.file but
you would use your own unique name, as long as it has periods at least each 8 characters
and is no longer than 44 characters including the periods.
To print your cluster:
// JOB
// DLBL extname
// EXEC IDCAMS,SIZE=AUTO
PRINT INFILE(extname) CHARACTER
/*
/&
COBOL, DITTO, IDCAMS and JCL CONNECTIONS.
SELECT filename                                  [SAM file]
    ASSIGN TO SYS015-one
FD
    RECORDING MODE F LABEL RECORDS ARE STANDARD.
01                          PIC X(two)
SELECT filename                                  [VSAM file]
    ASSIGN TO SYS016-three
    ORGANIZATION IS INDEXED
    ACCESS IS RANDOM
    RECORD KEY IS four
    FILE STATUS IS five.
FD  filename.                                    [no recording mode and
01  recordname.                                   no label records for VSAM]
    05 four                 PIC X( ).
WORKING-STORAGE SECTION.
01  five                    PIC XX.
_________________________________________________
// ASSGN SYS015,DISK
// DLBL one,'SAM unique file-id here',0,SD
// EXTENT SYS015,...
// DLBL three,'yourname.master.file',,VSAM
// EXEC         ,SIZE=AUTO
_________________________________________________
// EXEC IDCAMS,SIZE=AUTO
   DELETE ...
   DEFINE CLUSTER (NAME(yourname.master.file) ...
   REPRO INFILE(one, ENV(RECFM(F),BLKSZ(two))) OUTFILE(three)
_________________________________________________
// EXEC DITTO                                    [for the SAM file only]
$$DITTO SPR FILEIN=one
/*
_________________________________________________
// EXEC IDCAMS,SIZE=AUTO                         [for the VSAM file only]
   PRINT INFILE(three) CHARACTER
/*
one   - COBOL external name for the SAM file. Max. 7 characters.
two   - length of the logical record in the SAM file.
three - COBOL external name for the VSAM file. Max. 7 characters.
four  - name of the key field. Should be PIC X in COBOL 85, but OK if not.
five  - name of the field used to store the two-character status codes.
EXAM 2 includes material to this point.
VARIABLE LENGTH RECORDS.
IDCAMS
DEFINE CLUSTER RECORDSIZE(x,y)                                  where y > x
REPRO INFILE(      , ENV(RECFM(F),BLKSZ(nn))) OUTFILE(      )   inputs SAM file, fixed (F) length of (nn)
DITTO may be used with the CVS (Card to VSam) command to load cards to a VSAM cluster.
Variable length records may be used in ESDS and KSDS files (not RRDS).
Methods of defining variable length records in COBOL.
1. record descriptions of different lengths in one FD
or 2. table defined as OCCURS DEPENDING ON
or 3. RECORD CONTAINS range
or 4. RECORDING MODE IS V (sequential files only)
Example:
FD  class-file
    RECORD CONTAINS 27 TO 627 CHARACTERS.
01  class-record.
    05 class-name             PIC X(6).
    05 room-number            PIC X(7).
    05 teacher-name           PIC X(12).
    05 number-of-students     PIC 9(2).
    05 class-table OCCURS 0 TO 20 TIMES
           DEPENDING ON number-of-students.
       10 student-number      PIC X(9).
       10 student-name        PIC X(21).
How many elements? How much storage is required for this table?
How could this table be processed?
If you knew a student number, could you get the person’s name?
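One way to reason about these questions is to compute the record length from the field sizes in the example and to search the table sequentially. The Python below is a sketch; the function names and the sample table are hypothetical.

```python
FIXED_PART = 6 + 7 + 12 + 2   # class-name, room-number, teacher-name, number-of-students
ENTRY_SIZE = 9 + 21           # student-number + student-name
MAX_STUDENTS = 20             # OCCURS 0 TO 20 TIMES


def record_length(number_of_students):
    """Length in characters of one class-record."""
    if not 0 <= number_of_students <= MAX_STUDENTS:
        raise ValueError("number-of-students out of range")
    return FIXED_PART + ENTRY_SIZE * number_of_students


def find_student_name(class_table, student_number):
    """Sequential search of the table for a student number."""
    for number, name in class_table:
        if number == student_number:
            return name
    return None


print(record_length(0), record_length(20))
```

The two extremes match the RECORD CONTAINS 27 TO 627 CHARACTERS clause: 27 characters with no students and 27 + 20 x 30 = 627 with a full table.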
Notes on the use of OCCURS .. DEPENDING ON element-counter.
1. Because of the possibility of the element-counter causing the compiler to allocate
dynamically the storage beyond the intended range, define the table as the last part of
storage when possible. The next field in storage may be overlapped upon. If the next
area is instead used by another program, this has been known to 'bring down' CICS
on-line processing.
Example:
01
05 table occurs depending on X
05 table occurs depending on Y
Unreliable; must be put in separate records.
2. A subscript can be out of range when the table is not as large as the maximum size.
OCCURS .. DEPENDING ON does not check for a subscript out of range. Consider:
OCCURS X TO Y TIMES DEPENDING ON Z
X <= Z <= Y
An error occurs when a subscript is used that is > Z when the table is not as large
as Y.
3. The compiler calculates the size of the table dynamically. There are three conditions
that cause this calculation:
a. when a file is read in and the value of the element-counter is a field in that
record
b. when a new value is moved to the element-counter (however, if the
element-counter is redefined a value may be moved to that field without causing
a recalculation of the record length)
c. after a record is written, the length is set to the maximum size to accommodate
the next record
ALTERNATE INDEX.
Purpose: allow access to a record by secondary (alternate) key
FILE-CONTROL.
    SELECT
        ASSIGN
        ORGANIZATION IS
        ACCESS IS
        RECORD KEY IS
        ALTERNATE RECORD KEY IS
            [WITH DUPLICATES]
        [ALTERNATE RECORD KEY IS                 can have more than one
            [WITH DUPLICATES]]
        FILE STATUS IS
        .
Requires in VSAM the following commands in IDCAMS in this order:
1. define the base cluster:       DEFINE CLUSTER
2. define the alternate index:    DEFINE AIX
3. define a logical path:         DEFINE PATH
4. build the alternate index:     BLDINDEX
5. process it and print it
Note that printing the AIX will only show a cross-reference between the primary and
alternate keys. To see the data sorted by the alternate key, print the PATH. See the
following sample code:
// DLBL base,'example.base',,VSAM
// DLBL aix,'example.aix',,VSAM
// DLBL base1,'example.path',,VSAM
// EXEC IDCAMS,SIZE=AUTO
   [first: delete, define & repro the base cluster]
   DEFINE AIX (NAME(example.aix) RELATE(example.base) -
       VOLUME(DAP000) RECORDS(n) RECORDSIZE(a,m) KEYS(x,y) -
       NONUNIQUEKEY -                            if duplicates allowed
       SHAREOPTIONS(2) -                         must be in define cluster also
       UPGRADE)
   DEFINE PATH (NAME(example.path) PATHENTRY(example.aix))
   BLDINDEX INFILE(base) OUTFILE(aix)
   PRINT INFILE(base) CHARACTER
   PRINT INFILE(aix) CHARACTER
   PRINT INFILE(base1) CHARACTER
/*
/&
PROCESSING ALTERNATE INDEXED VSAM KSDS FILES.
The alternate key must be fixed length & in fixed position.
Duplicates in alternate index keys may be used. To process, move the value you wish to
search for to the alternate key and READ KEY IS to find the first match. Then READ
NEXT which will read sequentially each succeeding record with that key. When there
are no more matches, “AT END” is triggered. This combination of random then
sequential access requires ACCESS IS DYNAMIC in COBOL and uses status codes 02
and 10.
1st record:
READ file KEY IS dataname [IN record]
(this record is called the "record of reference")
then:
READ filename NEXT
(sequential read until key not = the key of the record of reference)
DUPLICATES require: NONUNIQUEKEY in IDCAMS and WITH DUPLICATES in
COBOL
The path is a logical, not physical concept required by the access method; no space is
defined for it.
Definition: the alternate index is built on a cluster called the base cluster. The alternate
index is an indexed cluster, so you may supply a DATA name and INDEX name if you
wish. You actually process the PATH in your application program, however.
The technical file structure (see the next chapter) needed to support an alternate index is
either a MULTIPLE LINKED LIST or an INVERTED FILE. Both structures contain
records for each alternate key value and pointers to the original file. VSAM uses the
inverted file for a KSDS alternate index, which references the alternate key to the record
key in alternate key order.
Updates to a VSAM file with alternate indexing mean that updates must also be made to
the structure supporting the alternate indexing. This reduces performance! Although
VSAM handles this for you when you use the UPGRADE option, alternate indexing
should not be installed unless necessary.
VSAM currently does not provide for retrieval of a record based on more than one
alternate key at a time.
The path is a logical concept, so the IDCAMS PRINT of the path shows all records in the
base cluster but in order of the alternate key. A PRINT of the AIX shows the alternate
key & the primary key, in order of the alternate key. In other words, the AIX is only the
cross-reference of the keys.
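The difference between printing the AIX and printing the path can be imitated with an inverted file built in Python. This is only a sketch with invented data and hypothetical function names; it does not reproduce VSAM's internal record formats.

```python
# Base cluster: primary key -> record.
base_cluster = {
    "001": {"id": "001", "dept": "MATH", "name": "ADAMS"},
    "002": {"id": "002", "dept": "ACCT", "name": "BAKER"},
    "003": {"id": "003", "dept": "MATH", "name": "CLARK"},
}


def build_alternate_index(base, alt_field):
    """Inverted file: alternate key -> list of primary keys, in alternate key order."""
    index = {}
    for primary_key in sorted(base):
        index.setdefault(base[primary_key][alt_field], []).append(primary_key)
    return dict(sorted(index.items()))


def print_aix(index):
    """What a print of the AIX shows: only the cross-reference of keys."""
    return [(alt_key, keys) for alt_key, keys in index.items()]


def print_path(base, index):
    """What a print of the path shows: base records in alternate key order."""
    return [base[pk]["name"] for keys in index.values() for pk in keys]


aix = build_alternate_index(base_cluster, "dept")
print(print_aix(aix))
print(print_path(base_cluster, aix))
```

The index holds keys only; the path combines the index with the base cluster to return full records in alternate key order.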
A path may be specified over a base cluster for a reason other than using alternate
indexing. For example, the path could provide an alias with different protection
attributes.
System conventions include SHAREOPTIONS(2) so that both the path and base cluster
may be accessed, much like two programs trying to access one file. There are several
levels of protection. Use a 2 for KSDS alone, and a 3 under CICS.
The external name of the path must be the same as the external name of the base with a
number at the end, beginning at 1 for the first alternate index, and never exceeding the 7
character maximum.
ex. XYZ, XYZ1, XYZ2
ex. XYZMAST, XYZMAS1, XYZMAS2 ... XYZMA10 etc.
IMPORTANT: despite what the IBM VSAM manuals say, the DCC
implementation does not work like this! You must use no more than 6
letters for the base name, followed by a ‘1’ to make the path name.
Example:
if the base is XYZMAS, the path is XYZMAS1
When you work with DLBLs, remember that a job can have all the DLBLs defined at the
top of the job, and they will work throughout the entire job. However, if you choose to
repeat a DLBL, you must repeat all DLBLs (and related EXTENTs and ASSGNs if
needed), since having just one “resets” the system’s knowledge of which ones are in
place for that job step.
PROCESSING DUPLICATES IN AN ALTERNATE INDEX
To do this, three conditions must be true:
In COBOL, ACCESS must be DYNAMIC
In COBOL, must have WITH DUPLICATES under the alternate key
IDCAMS DEFINE AIX must have NONUNIQUEKEY
To process a record by its alternate key, move the value you wish to search for to
the alternate key field, and code READ filename KEY IS alternate key
This will find the first record with the matching alternate key value. If it is the
only record with the alternate key value, the file status code is ‘00’. However, if
it is the first of several records that have the same alternate key value, the file
status code will be ‘02’. You would then switch to sequential processing using a
READ NEXT statement to find the desired record, usually in conjunction with an
IF statement to match a third field for a particular value.
If you set up a loop to read sequentially all the records with the same alternate
key value, the last matching record (the last duplicate) will give you a file status
value of ‘00’. If you continue to READ sequentially, you will read the rest of the
records in the file from that point, and eventually get a value of ‘10’, which
logically means “end of file” in the sense that there are no more records when
reading sequentially. One method to avoid reading more records than you need
to, is to set up a loop when you get a file status value of ‘02’. Within the loop, use
sequential processing to search for the desired value of another field that will
find the correct record from within those with a matching alternate key value.
You can stop the loop at that point with a switch. Be sure also to set the switch
once the value changes to ‘00’, which means you have read the last duplicate and
still not found what you are looking for. Loop until the switch is set.
File Status code summary on a READ with alternate indexing with duplicates:
‘00’   successfully found only record, or last record if more than one
‘02’   successfully found first matching record, but there are more
‘10’   no more records to process sequentially
A file status code of ‘02’ may also be obtained on a WRITE or a REWRITE
statement if there are already other records with the same alternate key value.
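The read loop for duplicates described above can be simulated in Python. Everything here is a stand-in: the records are invented, the class and function names are hypothetical, and only the status values '00', '02', '10' and '23' (record not found) are modeled.

```python
records = [
    {"id": "001", "dept": "ACCT", "name": "ADAMS"},
    {"id": "002", "dept": "MATH", "name": "BAKER"},
    {"id": "003", "dept": "MATH", "name": "CLARK"},
    {"id": "004", "dept": "MATH", "name": "DAVIS"},
    {"id": "005", "dept": "PHYS", "name": "EVANS"},
]
# The path presents the records in alternate key (dept) order.
path = sorted(records, key=lambda r: (r["dept"], r["id"]))


class AltKeyReader:
    def __init__(self):
        self.pos = None                      # current record pointer

    def _status(self, i):
        more = i + 1 < len(path) and path[i + 1]["dept"] == path[i]["dept"]
        return "02" if more else "00"        # '02' = more duplicates follow

    def read_key(self, dept):                # READ filename KEY IS alternate key
        for i, rec in enumerate(path):
            if rec["dept"] == dept:
                self.pos = i
                return self._status(i), rec
        return "23", None

    def read_next(self):                     # READ filename NEXT
        if self.pos is None or self.pos + 1 >= len(path):
            return "10", None                # no more records
        self.pos += 1
        return self._status(self.pos), path[self.pos]


def find_in_duplicates(dept, name):
    """Find the record with this alternate key whose name also matches."""
    reader = AltKeyReader()
    status, rec = reader.read_key(dept)
    while status in ("00", "02"):
        if rec["name"] == name:
            return rec                       # found: stop the loop
        if status == "00":
            return None                      # last duplicate read, no match
        status, rec = reader.read_next()
    return None
```

Stopping on '00' is what keeps the loop from reading on into records that belong to the next alternate key value.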
COBOL programmers may also use the START statement to position the READ
statement at the appropriate record. This could be done to position the first
record at somewhere other than the beginning of the file (sequential or dynamic
access). START can also give options to prevent errors if there is no exact match
for the first value of the key field value you select. The format of the START
statement, with three options, is:
START filename KEY IS EQUAL TO dataname
START filename KEY IS GREATER THAN dataname
START filename KEY IS NOT LESS THAN dataname
If you use the START statement with no KEY IS phrase, the primary key
is assumed.
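The three START options differ only in where they place the current record pointer. The Python sketch below models that with a sorted list of keys; the function name is hypothetical, and returning status '23' when no record satisfies the condition is an assumption of the sketch.

```python
from bisect import bisect_left, bisect_right


def start(keys, relation, value):
    """Position on a sorted key list; returns (status, position)."""
    if relation == "EQUAL":
        pos = bisect_left(keys, value)
        found = pos < len(keys) and keys[pos] == value
    elif relation == "GREATER":
        pos = bisect_right(keys, value)      # first key strictly greater
        found = pos < len(keys)
    elif relation == "NOT LESS":
        pos = bisect_left(keys, value)       # first key greater than or equal
        found = pos < len(keys)
    else:
        raise ValueError("unknown relation")
    return ("00", pos) if found else ("23", None)


keys = [10, 20, 30, 40]
print(start(keys, "EQUAL", 25))
print(start(keys, "NOT LESS", 25))
```

With no exact match for 25, EQUAL fails while NOT LESS positions on the next higher key, which is how START can prevent an error when the exact value is not in the file.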
EXAM 3 includes material to this point.
DATA STRUCTURES.
Pointer:
A field associated with one piece of data used to identify the location of
another piece of data.
If the data is reorganized the pointers must be changed.
If pointers are destroyed they can be very difficult to reconstruct.
Stack:
all insertions and deletions are made at the same end of the data structure.
Last in - first out.
Queue:
all insertions occur at one end and all deletions at the other;
First in - first out.
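A short Python sketch shows the difference in removal order. A list is used as the stack and `collections.deque` as the queue; the function names are hypothetical.

```python
from collections import deque


def stack_order(items):
    """Push every item, then pop them all: last in, first out."""
    stack = []
    for item in items:
        stack.append(item)           # insert at the top
    return [stack.pop() for _ in range(len(stack))]   # delete from the top


def queue_order(items):
    """Add every item at the rear, remove from the front: first in, first out."""
    queue = deque()
    for item in items:
        queue.append(item)           # insert at the rear
    return [queue.popleft() for _ in range(len(queue))]   # delete from the front


print(stack_order(["A", "B", "C"]))
print(queue_order(["A", "B", "C"]))
```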
Sorted List:
insertions and deletions may occur anywhere; elements are maintained in logical
order based on a key field
Inverted List:
a table, list, index, or directory of data addresses that indicate all the records with
something in common
An index is usually used to speed processing. Indexes can be created on
secondary keys, which may not be unique to one record. An index is more
compact than the data it references, but an index can be very large.
An index can be treated as a file itself, on which an index can be created.
Example: VSAM alternate index.
Linked List:
when the above structures are used with a system of pointers from one element to
the next. Also called a Chain.
Tree Structure: each element of the structure (except the root) has one pointer to it, but may have
zero or many pointers from it to other elements. Element also called node.
Binary Tree:
each element has at most two pointers from it. Very efficient for processing.
In a sequence tree, there is a left pointer from an element which points to the
element in the next level with a lesser value, and a right pointer - next level,
greater value.
This kind of tree is dependent on how the records are loaded. The worst case for
a tree occurs when the records are already sorted (the tree becomes a one-sided
linked list).
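The effect of load order on a sequence tree can be seen with a minimal binary search tree in Python. This is an illustrative sketch with hypothetical names; keys equal to a node's key are sent to the right.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None     # pointer to a lesser value
        self.right = None    # pointer to a greater value


def insert(root, key):
    """Insert key and return the root of the tree."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right


def height(node):
    """Number of levels in the tree."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


print(height(build([40, 20, 60, 10, 30, 50, 70])))   # mixed load order
print(height(build([10, 20, 30, 40, 50, 60, 70])))   # already sorted
```

The same seven keys give three levels when loaded in a mixed order and seven levels when loaded already sorted, because every insertion then goes to the right.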
DATA MODELS.
Data Model:
an abstract representation, or description, of real objects and their associations.
DBMS:
database management system
Schema:
the internal model designed by the data base administrator and seen by the
programmers
includes the data base management system, operating system access methods,
and other programs
Subschema:
the data as seen by a user of a specific data processing technology; also called
external model
DATABASE MODELS:
Models:
Hierarchical, Network, Relational, Object-oriented, XML
Hierarchical:
Parent / child structure. No record has more than one parent. Example:
IBM's IMS (Information Management System)
Network:
Parent / child structure but more flexible than the hierarchical and with
less redundancy. Example: TOTAL
Relational:
A relationship between data elements based on two-dimensional tables
that is easiest to use and does not depend on any specific access.
Examples: ORACLE, SQL/DS, dBASE III, dBASE IV, RBase,
ACCESS, many others.
Note: a vendor's definition of what makes a database relational, does not
always agree with the original Codd specifications. Many so-called
relational databases lack certain features. Also, do not confuse a
relational database product with a simple file manager (often called a flat
file).
Object-oriented:
Properties of data are stored along with the data, including the function
of inheritance. Used in C++
XML:
XML is a recent system developed to deal with real-world data, which
does not always have a fixed record structure. This should be interesting
to watch, since it has the potential to change completely how we deal
with information.
RELATIONAL DATABASES.
database: collection of shared related files.
Note: if you are working with just one file, it’s not really a database. Many
programs that work with one file should not be called database management
software; they are instead more properly called flat-file managers.
record: collection of related fields.
relational: refers to the qualities of a record, that every field in a record is directly related
to the key field of that record.
Bad example: Student name, student number, advisor, name of advisor’s cat
Using databases:
a. Design the structure of the database
b. Load the data
c. Query the database
Relational database structure.
Uses two-dimensional tables.
Each column contains a single value. The order of the columns does not matter.
Each row is different. The order of rows does not matter.
There are rules for a process called Normalization that governs good design of relational
structures.
Terminology used in relational databases:
Relational theory     Relational DBMS     File processing
_________________     _______________     _______________
Relation              Table               File
Tuple                 Row                 Record
Attribute             Column              Field
Programming with a relational database.
The relational model provides data independence from the physical database. If the
database is changed, application programs do not have to be rewritten.
A disadvantage is much data redundancy. Also, there is not as much efficiency in high
volume applications as the other structures (network and hierarchical).
Data definition concepts:
a table has columns and rows
intersection of row and column is called a field
domain: set of all possible values in a field
view: a logical table derived from tables
not physically stored; may be processed like a table
index: change logical order of data; SQL decides when to use indexes
referential integrity: must not delete a record from one table if that record is needed by
another
NORMALIZATION.
Normalization is the application of a set of rules for the most efficient design of a relational
data base to eliminate or at least reduce problems caused by insertion, deletion, or modification of
records.
First normal form: each relation contains no repeating groups (these should be in a table
by themselves)
Second - fifth normal form: more rules to make the database as efficient as possible for
data integrity.
example: no field in a record that is not directly related to the key field
There are several other levels of normalization.
The way a database is structured when it has multiple files being shared affects the efficiency of
information storage and retrieval. A database must store all needed data, and it must be able to
connect that data logically, even across files. The following are design principles, not to be
violated without a very good reason:
1. Assign a unique key field to each record in a file.
2. If two fields are always related to each other, put them in the same record.
3. If the same field is found in more than one record, make separate files and put the key
field of the record in both files to allow the connection to be made from one to the other.
4. Do not repeat a data type within a record. Instead construct two files with a common
key field.
5. Do not put any field in a record that is not directly related to the key field (remember
the Advisor’s Cat). Construct another file and JOIN it to the first file by a field with
common values.
6. Adding indexes for secondary keys, or additional files, can save time in some data
retrieval operations, but this will add to the use of disk space and the complexity of the
overall design.
7. Use enough columns (fields) so that any potential data can be queried or sorted.
8. Give each column enough width for any possible entry.
9. Make columns that may be compared with one another the same data type.
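Principles 3 and 5 can be sketched with the Advisor's Cat example. This is only an illustration: it uses Python's standard sqlite3 module as a stand-in for a mainframe SQL system, and the table names, column names and data are invented. The cat's name is kept in the advisor file, where it relates directly to the key, and the student file carries only the advisor's key.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE advisors (advisor_id   INTEGER PRIMARY KEY,
                           advisor_name TEXT NOT NULL,
                           cat_name     TEXT);
    CREATE TABLE students (student_no   INTEGER PRIMARY KEY,
                           student_name TEXT NOT NULL,
                           advisor_id   INTEGER NOT NULL);
    INSERT INTO advisors VALUES (1, 'Rivera', 'Whiskers');
    INSERT INTO students VALUES (1001, 'Lee', 1);
    INSERT INTO students VALUES (1002, 'Patel', 1);
""")

# The cat's name is stored once, so one UPDATE corrects it for every student.
con.execute("UPDATE advisors SET cat_name = 'Tiger' WHERE advisor_id = 1")

# JOIN the two files on the field with common values.
rows = con.execute("""
    SELECT s.student_name, a.advisor_name, a.cat_name
    FROM students s, advisors a
    WHERE s.advisor_id = a.advisor_id
    ORDER BY s.student_no
""").fetchall()
print(rows)
```

Had the cat's name been stored in each student record, the same correction would have to be repeated for every student of that advisor.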
STRUCTURED QUERY LANGUAGE (SQL).
SQL/DS - Structured Query Language / Data System
IBM product
ANSI standard for relational databases, Feb. 1985
Concept of relational model invented by E. F. Codd, IBM, 1970
First commercially available SQL-based relational DBMS, 1979, Oracle (then Relational Software, Inc.)
IBM versions:
SQL/DS for DOS, 1982
SQL for VM, 1983
SQL for MVS, 1984
Description:
fourth-generation relational database query language. A fourth-generation language
specifies what to do but not how to do it.
non-procedural language to create, access, and modify data
has important data security and integrity features
Methods of accessing SQL:
ISQL - runs as a CICS transaction and allows testing of commands
through QMF
SQL engine in a database management application package or Internet search engine
Batch SQL
// EXEC ARIDBS
(SQL utility - see IBM manual #5046)
Embedded SQL
embedded into a high-level host language like COBOL, Assembler, or PL/1
embedded SQL commands must first be precompiled
SQL uses the VSE/VSAM access method
Microcomputer SQL implementations, such as ACCESS
Data definition statements:
CREATE          (table, view, or index)
ALTER           (change the column structure of a table)
GRANT, REVOKE   (access to tables by other users)
DROP            (delete a table and its data)
Data query statement:
SELECT          (for queries - retrieve data)
Data manipulation statements:
UPDATE, INSERT, INPUT, DELETE
Syntax:
names are short-ID (to 8 characters) or long-ID (to 18)
short-ID used for databases, table space
long-ID used for tables, indexes, views, columns
names usually start with letters; short may start with # or $; long may contain numbers
but not special characters or spaces. Exception: long-ID may use the underscore
avoid SQL reserved words as user-defined names; there are over 300 reserved words
Types of data:
integer            -2,147,483,648 to +2,147,483,647
smallint           -32,768 to +32,767
tinyint            0 to 255
float              floating point; magnitudes from about 10^-79 to 10^+75
decimal(m,n)       m = total number of digits, n = decimal places
char(n)            fixed length (n) data
varchar(n)         to 254 characters
long varchar(n)    between 254 and 32,767 characters
entries may be NULL (no value); displays as a ?
Catalogs in SQL:
catalogs are tables themselves
syntax: USERID.TABLENAME if user has granted rights
SYSTEM.SYSCATALOG shows all tables
SYSTEM.SYSCOLUMNS detailed list of columns in all tables
SQLDBA.SYSUSERLIST users (only DBA sees passwords)
while a table is created, altered, or deleted, the system locks out any operation on that
table and temporarily locks out the system catalog.
SQL recognizes and can break deadlock
Examples of useful queries of catalogs:
SELECT * FROM SQLDBA.SYSUSERLIST
SELECT * FROM SYSTEM.SYSCOLUMNS WHERE CREATOR = user
SELECT * FROM SYSTEM.SYSPROGAUTH
SELECT * FROM SYSTEM.SYSCATALOG
SELECT * FROM SYSTEM.SYSVIEWS
CREATING TABLES:
CREATE TABLE tablename (columnname type,
...
)
may use NOT NULL after the type to show that there must be data in that field
if a table already exists, error message
DELETING TABLES:
DROP TABLE tablename
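A minimal sketch of these three rules, using Python's standard sqlite3 module as a stand-in (SQL/DS messages and data types will differ; the table and column names are invented for the example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE inventory (partno INTEGER NOT NULL, descr VARCHAR(20))")

# Creating a table that already exists is an error.
try:
    con.execute("CREATE TABLE inventory (partno INTEGER)")
    duplicate_rejected = False
except sqlite3.OperationalError:
    duplicate_rejected = True

# NOT NULL means a row without data in that field is refused.
try:
    con.execute("INSERT INTO inventory (partno, descr) VALUES (NULL, 'CAM')")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

# DROP TABLE removes the table and its data.
con.execute("DROP TABLE inventory")
tables_left = con.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'").fetchone()[0]

print(duplicate_rejected, null_rejected, tables_left)
```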
RETRIEVING DATA FROM A TABLE:
command syntax:
SELECT columnname,...
FROM tablename,...
[WHERE condition]
SELECT specifies columns (which fields are displayed)
FROM specifies tables (from which data file)
WHERE specifies rows (for which records)
Select query shows cost estimate and row count.
* = all columns Example:
SELECT * FROM INVENTORY
DISTINCT removes duplicate rows, so that the columns which are displayed contain different
data in each row Example:
SELECT DISTINCT PARTNO FROM INVENTORY
Retrieving only some rows which meet a condition: WHERE
Alphanumeric data is placed in quotes. Ex. WHERE PART = 'CAM'
Arithmetic operators:
+ - * /
Relational operators:
= < > <= >= <>
Statistical functions:
AVG MAX MIN SUM COUNT
Logical operators:
AND, OR, NOT used to form compound condition
AND evaluated before OR; parentheses evaluated first
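The following sketch tries DISTINCT, a statistical function, and the AND-before-OR rule. It runs against SQLite through Python's standard sqlite3 module, which is assumed here as a convenient stand-in for SQL/DS; the INVENTORY rows are invented sample data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE inventory (partno INTEGER, part TEXT, qty INTEGER)")
con.executemany("INSERT INTO inventory VALUES (?, ?, ?)", [
    (207, 'GEAR', 75),
    (209, 'CAM', 50),
    (221, 'BOLT', 650),
    (221, 'BOLT', 100),
])

# DISTINCT: part number 221 is listed only once.
distinct_parts = [row[0] for row in con.execute(
    "SELECT DISTINCT partno FROM inventory ORDER BY partno")]

# Without parentheses the AND is evaluated first:
#   part = 'CAM'  OR  (part = 'BOLT' AND qty > 500)
no_parens = con.execute(
    "SELECT COUNT(*) FROM inventory "
    "WHERE part = 'CAM' OR part = 'BOLT' AND qty > 500").fetchone()[0]

# Parentheses force the OR to be evaluated first.
with_parens = con.execute(
    "SELECT COUNT(*) FROM inventory "
    "WHERE (part = 'CAM' OR part = 'BOLT') AND qty > 500").fetchone()[0]

# A statistical function summarizes a column.
total_qty = con.execute("SELECT SUM(qty) FROM inventory").fetchone()[0]

print(distinct_parts, no_parens, with_parens, total_qty)
```

The two counts differ because the CAM row has a quantity of only 50: it satisfies the first condition but not the second.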
Additional operators used in a query: BETWEEN, IN, NULL, LIKE
Examples:
WHERE partno BETWEEN 241 AND 266
WHERE major IN ('CIS', 'ACC', 'BAT')
WHERE description IS [NOT] NULL
WHERE description LIKE 'large %'
Note: LIKE works with alphanumeric data only and finds any number of
appearances of that character string
Special characters in search expression: _ and %
_ (underscore) means ignore a single character in the string
% (percent) means ignore zero or more characters in string
Examples:
'CIS%'     finds all courses starting with CIS
'%A%'      finds BAT, ACC, CPA, etc.
'C_S'      finds CIS and CPS etc.
Sorting the output:
ORDER BY columnname [DESC], ...
Examples:
ORDER BY PARTNO, SUPPNO DESC
ORDER BY 2, 1
ORDER BY 2 DESC, 1
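These operators can be tried out with a small sketch. It assumes SQLite (Python's standard sqlite3 module) as a stand-in for SQL/DS, and the COURSES table, its rows, and the helper function majors are invented for the example. One known difference: SQLite's LIKE ignores upper/lower case for ordinary letters, which may not be true of other SQL systems.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE courses (major TEXT, course TEXT, credits INTEGER)")
con.executemany("INSERT INTO courses VALUES (?, ?, ?)", [
    ('CIS', 'CIS211', 3), ('CPS', 'CPS141', 4),
    ('ACC', 'ACC101', 3), ('BAT', 'BAT105', 3), ('MUS', 'MUS101', 2),
])

def majors(condition):
    """Return the majors of all rows meeting the WHERE condition, sorted."""
    sql = "SELECT major FROM courses WHERE " + condition + " ORDER BY major"
    return [row[0] for row in con.execute(sql)]

starts_cis  = majors("course LIKE 'CIS%'")           # % = zero or more characters
contains_a  = majors("major LIKE '%A%'")
c_any_s     = majors("major LIKE 'C_S'")             # _ = exactly one character
in_list     = majors("major IN ('CIS', 'ACC', 'BAT')")
mid_credits = majors("credits BETWEEN 3 AND 4")      # both end values included

# ORDER BY column position: 2nd column descending, then 1st column.
by_position = con.execute(
    "SELECT major, credits FROM courses ORDER BY 2 DESC, 1").fetchall()

print(starts_cis, contains_a, c_any_s, in_list, mid_credits, by_position)
```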
Summarizing:
GROUP BY columnname, ...
Using calculated fields:
SELECT suppno, price*qonorder AS ordercost
FROM QUOTATIONS
WHERE price*qonorder > 100
This results in a column name of ORDERCOST after SUPPNO. The addition of 'AS' is useful
to provide a more meaningful column name, if supported by SQL version. (ORDER by itself
is a reserved word and should not be used as a column name.)
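A short sketch of a calculated field and of GROUP BY, again assuming SQLite through Python's sqlite3 module in place of SQL/DS. The QUOTATIONS rows and the column alias order_value are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE quotations "
            "(suppno INTEGER, partno INTEGER, price REAL, qonorder INTEGER)")
con.executemany("INSERT INTO quotations VALUES (?, ?, ?, ?)", [
    (51, 124, 1.25, 400),   # value on order 500.00
    (51, 125, 0.50, 100),   # value on order  50.00
    (53, 124, 1.50, 500),   # value on order 750.00
    (53, 221, 0.25, 200),   # value on order  50.00
])

# Calculated field: one result row per quotation worth more than 100.
large_orders = con.execute("""
    SELECT suppno, price*qonorder AS order_value
    FROM quotations
    WHERE price*qonorder > 100
    ORDER BY suppno
""").fetchall()

# GROUP BY: one summary row per supplier.
per_supplier = con.execute("""
    SELECT suppno, COUNT(*), SUM(price*qonorder)
    FROM quotations
    GROUP BY suppno
    ORDER BY suppno
""").fetchall()

print(large_orders)
print(per_supplier)
```

WHERE selects individual rows before any grouping; GROUP BY then collapses the rows for each supplier into a single line of totals.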
RETRIEVING DATA FROM MULTIPLE TABLES:
This is one of the most powerful relational database capabilities. The operation is called JOIN
and allows for an unplanned logical connection between the tables. There must be a column in
each table with data in common for a join. The column names do not need to match, but the data
must. If duplicate column names exist in the joined tables, each reference to column name must
either use an alias, or be a qualified name (containing the name of the table, a period, and the
column name), to eliminate ambiguity.
SELECT ..
FROM tablename1, tablename2
WHERE tablename1.columnname = tablename2.columnname
AND ...
The WHERE condition shows the relationship between the tables. If a
columnname is found in both of the tables, the command requires a prefix of the
tablename and a period so the system knows which column is desired. Column
names that are only in one table require no prefix. There is also normally some
other part of the condition (AND, OR).
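The join form above can be sketched as follows. SQLite (Python's sqlite3 module) is assumed as a stand-in for SQL/DS, and the two tables and their rows are invented. PARTNO appears in both tables, so it needs the table-name prefix; DESCR, SUPPNO and PRICE each appear in only one table and need none.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE inventory  (partno INTEGER, descr TEXT);
    CREATE TABLE quotations (suppno INTEGER, partno INTEGER, price REAL);
    INSERT INTO inventory  VALUES (124, 'BOLT');
    INSERT INTO inventory  VALUES (209, 'CAM');
    INSERT INTO inventory  VALUES (231, 'NUT');
    INSERT INTO quotations VALUES (51, 124, 1.25);
    INSERT INTO quotations VALUES (53, 124, 1.5);
    INSERT INTO quotations VALUES (57, 209, 18.0);
""")

# The WHERE condition states the relationship; AND adds a further condition.
joined = con.execute("""
    SELECT inventory.partno, descr, suppno, price
    FROM inventory, quotations
    WHERE inventory.partno = quotations.partno
      AND price < 10
    ORDER BY suppno
""").fetchall()

# A column found in both tables is rejected when it has no prefix.
try:
    con.execute("SELECT partno FROM inventory, quotations")
    ambiguous_rejected = False
except sqlite3.OperationalError:
    ambiguous_rejected = True

print(joined, ambiguous_rejected)
```

Part 231 has no quotation, so it does not appear in the result at all: a row shows up only when matching data exists in both tables.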
ALIAS
It is possible to join a table to itself to solve certain queries. This is done by using an alias for the
name of the table, so the same table is referred to by two different names when you join them in
the query. In the following example, only one table is used, but it is given an alias as FIRST and
SECOND.
SELECT DISTINCT FIRST.columnname
FROM tablename FIRST, tablename SECOND
WHERE FIRST.columnname = SECOND.columnname AND …;
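As an illustration, the query below finds rooms that are used by more than one course by joining a table to itself. It is a sketch under assumptions: SQLite via Python's sqlite3 module stands in for SQL/DS, the SCHEDULES rows are invented, and the aliases s1 and s2 play the roles of FIRST and SECOND.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE schedules (course TEXT, room TEXT);
    INSERT INTO schedules VALUES ('CIS211', 'HU101');
    INSERT INTO schedules VALUES ('CIS112', 'HU101');
    INSERT INTO schedules VALUES ('ACC101', 'TA205');
""")

# The same table is named twice, under two different aliases.
shared_rooms = [row[0] for row in con.execute("""
    SELECT DISTINCT s1.room
    FROM schedules s1, schedules s2
    WHERE s1.room = s2.room
      AND s1.course <> s2.course
""")]

print(shared_rooms)
```

Without the second condition every row would match itself, and every room would be reported. DISTINCT is needed because each shared room is found once for each pairing of courses.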
OTHER SQL COMMANDS:
Subqueries
SELECT .. WHERE join AND .. IN (SELECT ...)
There are several types, including nested queries and correlated subqueries.
VIEWS
Views are logical tables.
can make complicated queries simpler
can grant access to a view to see only part of a table
GRANT TO and REVOKE FROM
GRANT privilege ON table TO userid
REVOKE privilege ON table FROM userid
INPUT - begin adding data to a table
INSERT - add one row with specific values
HELP
Examples: HELP SELECT or HELP 'CREATE VIEW'
UPDATE - change one value or values in a row or rows
DELETE FROM tablename WHERE condition - delete row(s) from table
ALTER - change column structure
COMMIT WORK - save changes
ROLLBACK WORK - if error occurs, ignore changes
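Several of these commands can be seen working together in a small sketch. It assumes SQLite through Python's sqlite3 module as a stand-in: con.commit() and con.rollback() play the roles of COMMIT WORK and ROLLBACK WORK, and GRANT, INPUT and HELP have no SQLite equivalent, so they are not shown. The STAFF table and its data are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER)")

# A view is a logical table; this one shows only part of STAFF.
con.execute("CREATE VIEW staff_public AS SELECT name, dept FROM staff")

con.execute("INSERT INTO staff VALUES ('Ames', 'CIS', 50000)")
con.execute("INSERT INTO staff VALUES ('Boyd', 'ACC', 48000)")
con.commit()                                    # COMMIT WORK - save changes

con.execute("UPDATE staff SET salary = 0")      # mistake: no WHERE condition
con.execute("DELETE FROM staff WHERE dept = 'ACC'")
con.rollback()                                  # ROLLBACK WORK - ignore changes

view_rows = con.execute(
    "SELECT * FROM staff_public ORDER BY name").fetchall()
salaries = [row[0] for row in con.execute(
    "SELECT salary FROM staff ORDER BY name")]

print(view_rows)
print(salaries)
```

The rollback undoes everything since the last commit, so both rows and both salaries are still present. The view never shows the salary column, which is how a view can be used to give someone access to only part of a table.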
OTHER SQL FUNCTIONS:
YEAR(birthdate)
HOUR(current time)
SUBSTR(fname,1,1)
DECIMAL(invtotal,9,2)
Ex. SELECT DECIMAL (AVG (fieldname), 7,3)
EMBEDDED SQL.
Using embedded SQL in Fujitsu MicroCOBOL is not possible with the limited free version. The
following notes are mainframe-based.
Preprocessing changes SQL code to be processed by the host compiler and converts SQL
statements to access modules in machine code, stored in the SQL/DS database and called by the
host program. The complete development cycle for COBOL is as follows:
1. Define tables
2. Code the host program with embedded SQL top-down
3. Precompile (converts SQL statements to CALL statements)
4. Bind (system finds access paths for each SQL request)
5. Compile (normal, although SQL shown)
6. Link-edit (normal)
7. Execute (steps 3 - 7 done by special JCL)
Writing the program:
begin each SQL statement with EXEC SQL in area B
end each SQL statement with END-EXEC
define your host variables (will be prefaced by a colon when used in a SQL command) in
the DECLARE SECTION
include the SQL Communication Area (for error handling)
EXEC SQL INCLUDE SQLCA END-EXEC
establish error handling WHENEVER a problem occurs
negative numbers are errors
0 = successful processing; 100 = EOF
CONNECT to SQL by a USERID and PASSWORD
SQL DECLARE CURSOR
a cursor is a pointer into the database; it associates a query with a name
the query may return one or many rows, called the "active set"
verbs are OPEN, FETCH, PUT, DELETE, UPDATE, CLOSE
FETCH functions like a COBOL "READ INTO"
COMMIT WORK RELEASE - make changes and release database
ROLLBACK WORK RELEASE - ignore changes and release
The program in Appendix A is a working mainframe COBOL program with an
embedded SQL query. This program was written to provide a simple demonstration of
the technique by which a SQL query can substitute for a great deal of COBOL procedural
code.
Direct references to SQL have a pair of instructions surrounding them: EXEC SQL and
END-EXEC. This indicates that the precompiler will first translate only the statements
between these instructions into COBOL CALL statements. Then the COBOL compiler,
invoked by rather elaborate JCL, can translate the precompiled SQL as calls to object
modules along with the rest of the regular COBOL. Technically, the jobstream which is
actually sent does not contain the source code; the job entry control language punches the
appropriate source code from another file and any required data at the required time.
The embedded SQL query is in lines 53 - 56. The demonstration query in this program
retrieves all the records in the SCHEDULES table, except those scheduled in a building
whose identification starts with the characters "DU". The data will be sorted by course
number.
Line 52 assigns the query to a cursor, named in this case C1. This is not a cursor in the
sense of an indication of where you are on a monitor; the cursor defines a pointer to the
desired record. Lines 80 and 81 show a FETCH C1 INTO which functions as a COBOL
READ INTO statement, reading the desired data from the SQL database. The datanames
which begin with colons on line 81 are defined in the DECLARE section, lines 22 - 29.
Although the names are the same as the attribute names in the query (for example,
'course'), this is not required. The colon in front of the name indicates that the dataname
is a COBOL field which will accept a SQL attribute.
Some very powerful and convenient program control logic is interesting to note here.
SQL queries give a return code: a negative number is an error; a 0 means processing was
successful; a positive number is a warning, even though processing continues. Lines 62
and 83 refer to SQLCODE = 100, which is a specific warning indicating an end-of-file
condition, or no more records in the table.
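For readers without access to a mainframe, the same fetch loop can be imitated in Python. This is an analogy, not the SQL/DS interface: Python's sqlite3 module has no SQLCODE, and fetchone() returning None plays the part of SQLCODE = 100. The SCHEDULES rows are invented sample data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE schedules "
            "(course TEXT, title TEXT, room TEXT, instructor TEXT)")
con.executemany("INSERT INTO schedules VALUES (?, ?, ?, ?)", [
    ('CIS211', 'File Organization', 'HU201', 'Instructor A'),
    ('ACC101', 'Accounting I',      'DU110', 'Instructor B'),
    ('CIS112', 'Programming I',     'TA105', 'Instructor C'),
])

c1 = con.cursor()                    # like DECLARE C1 CURSOR ... and OPEN C1
c1.execute("""
    SELECT course, title, room, instructor
    FROM schedules
    WHERE room NOT LIKE 'DU%'
    ORDER BY course
""")

report = []
while True:
    row = c1.fetchone()              # like FETCH C1 INTO :COURSE, :TITLE, ...
    if row is None:                  # like SQLCODE = 100: no more rows
        break
    course, title, room, instructor = row
    report.append(f"{course:<9} {title:<20} {room:<6} {instructor}")
c1.close()                           # like CLOSE C1

for line in report:
    print(line)
```

The row for building DU is filtered out by the query itself, so the loop needs no extra test for it.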
The SQL WHENEVER commands in lines 69 and 70 may be the most time-saving lines
in the program. If a SQL query return code is a warning, ignore it and continue; if a
return code indicates that an error, any error, occurred, control is transferred
unconditionally to the error-handling paragraph, ERRCHK, which begins on line 91.
This avoids the programmer having to code individual error-handling conditional logic
throughout the program. Just say, if there is ever an error, go there. This is perhaps the
best example of the fourth-generation principle: tell the computer what you want but let it
figure out how. If you need more control, you can still have it. Although the technique
was not used in this program, "WHENEVER statements may be specified in more than
one place in a program. A WHENEVER statement is applicable to all SQL statements
that follow it, until the end of the program or the next WHENEVER statement."
Finally, line 111 in the ERRCHK paragraph contains a ROLLBACK WORK statement
which causes any change to any table made during this program to be undone. Only if no
errors occur, and the ERRCHK routine is never called, will the program store any
changes made by the COMMIT WORK RELEASE instruction in line 64. Up until that
statement, a ROLLBACK WORK (called from a WHENEVER) can keep any updates
from becoming permanent, avoiding possible corruption of data and allowing the
programmer to try again on the original data.
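The "if there is ever an error, go there" idea can be imitated in Python with a single try/except around the SQL work. This is an analogy under assumptions, not embedded SQL: sqlite3 stands in for SQL/DS, the except branch plays the part of the ERRCHK paragraph, and the variable step plays the part of STEP-DENOTER. The table and data are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE schedules (course TEXT NOT NULL, room TEXT)")
con.execute("INSERT INTO schedules VALUES ('CIS211', 'HU201')")
con.commit()

step = ""
try:
    step = "UPDATE"
    con.execute("UPDATE schedules SET room = 'TA105'")
    step = "INSERT"
    # This row breaks the NOT NULL rule, so the statement fails.
    con.execute("INSERT INTO schedules VALUES (NULL, 'HU201')")
    con.commit()                     # reached only if no error occurred
    outcome = "committed"
except sqlite3.Error as err:
    con.rollback()                   # back out every change since the commit
    outcome = "rolled back during " + step

room = con.execute("SELECT room FROM schedules").fetchone()[0]
print(outcome, "- room is still", room)
```

The UPDATE itself succeeded, but because the later INSERT failed before the commit, the rollback also undoes the UPDATE and the original data is left untouched.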
During the precompile, SQL commands are read top-down. The precompiler will not
follow COBOL logic. Therefore the DECLARE cursor must come before OPEN cursor
or the OPEN will be flagged as an error etc. Submit the precompile job, not COBOL
code. The special JCL will precompile then compile, link, and execute.
Programming:
In the SQL precompile job:
put any JCL statements that the job needs to run
put your userid and password
reference file name, type, disk which has COBOL source code
In the COBOL source code:
WS error-handling fields STEP-DENOTER and DECODED-SQLCODE
in working-storage to be used by SQL error output routines
WS fields for userid and password as SQL declared variables in the DECLARE
SECTION; best method is to ACCEPT them from card input, so they are not
shown on a printout of the job
when defining host variables, use COMP for smallint, and COMP-3 for decimal
host fields.
INCLUDE the SQLCA module for error handling routines (both errors and
warnings)
move a message to STEP-DENOTER before each SQL command
write your embedded SQL commands, using a colon in front of COBOL host
variables used in the SQL command
use standard COBOL techniques for output formatting
reproduce the recommended error handling paragraph ERRCHK which is
invoked by the SQL WHENEVER clauses
Handling SQL return codes:
negative numbers are errors ex. -504
use HELP -504 in ISQL to see info on this error
a 0 means successful processing
positive numbers are warnings
a 100 means no more rows remain in the active set; in other words, the EOF has been
reached
APPENDIX A. Sample COBOL program with embedded SQL query.
001  IDENTIFICATION DIVISION.
002  PROGRAM-ID. SQLDEMO.
003  AUTHOR. M.K.FINLEY.
004  ENVIRONMENT DIVISION.
005  CONFIGURATION SECTION.
006  SOURCE-COMPUTER. IBM390 WITH DEBUGGING MODE.
007  OBJECT-COMPUTER. IBM390 WITH DEBUGGING MODE.
008  INPUT-OUTPUT SECTION.
009  FILE-CONTROL.
010      SELECT PRINT-FILE
011          ASSIGN TO SYS011-SQLPRT.
012  DATA DIVISION.
013  FILE SECTION.
014  FD  PRINT-FILE
015      LABEL RECORDS ARE OMITTED.
016  01  PRINT-REC               PIC X(133).
017  WORKING-STORAGE SECTION.
018  01  STEP-DENOTER            PIC X(50) VALUE SPACES.
019  01  DECODED-SQLCODE         PIC --------999.
020      EXEC SQL BEGIN DECLARE SECTION
021      END-EXEC.
022  01  COURSE                  PIC X(9)  VALUE SPACES.
023  01  TITLE                   PIC X(20) VALUE SPACES.
024  01  ROOM                    PIC X(6)  VALUE SPACES.
025  01  INSTRUCTOR              PIC X(20) VALUE SPACES.
026  01  USERID                  PIC X(8)  VALUE SPACES.
027  01  PASSW                   PIC X(8)  VALUE SPACES.
028      EXEC SQL END DECLARE SECTION
029      END-EXEC.
030  01  PRINT-LINE              VALUE SPACES.
031      05  CARRIAGE-CONTROL    PIC X.
032      05  COURSE-OUT          PIC XXXXXXBXBXX.
033      05  FILLER              PIC X(3).
034      05  TITLE-OUT           PIC X(20).
035      05  FILLER              PIC X(3).
036      05  ROOM-OUT            PIC XXXBXXX.
037      05  FILLER              PIC X(3).
038      05  INSTRUCTOR-OUT      PIC X(20).
039      05  FILLER              PIC X(65).
040  01  MESSAGE-LINE            VALUE SPACES.
041      05  CARRIAGE-CONTROL    PIC X.
042      05  SQLMESSAGE          PIC X(132).
043      EXEC SQL INCLUDE SQLCA END-EXEC.
044
045  PROCEDURE DIVISION.
046  000-MAIN-MODULE.
047      OPEN OUTPUT PRINT-FILE.
048      MOVE 'SQLMKF NOW EXECUTING.' TO SQLMESSAGE.
049      WRITE PRINT-REC FROM MESSAGE-LINE AFTER 3.
050      PERFORM INITIALIZE-SQL THROUGH INITIALIZE-EXIT.
051      MOVE 'DECLARE' TO STEP-DENOTER.
052      EXEC SQL DECLARE C1 CURSOR FOR
053          SELECT COURSE, TITLE, ROOM, INSTRUCTOR
054          FROM SCHEDULES
055          WHERE ROOM NOT LIKE 'DU%'
056          ORDER BY COURSE
057      END-EXEC.
058      EXEC SQL OPEN C1 END-EXEC
059      MOVE 'COURSE     TITLE     ROOM     INSTR.' TO SQLMESSAGE.
060      WRITE PRINT-REC FROM MESSAGE-LINE AFTER 2.
061      PERFORM PROCESS-SQL THROUGH PROCESS-EXIT
062          UNTIL SQLCODE = 100.
063      EXEC SQL CLOSE C1 END-EXEC.
064      EXEC SQL COMMIT WORK RELEASE END-EXEC.
065      CLOSE PRINT-FILE.
066      STOP RUN.
067  INITIALIZE-SQL.
068      MOVE 'WHENEVERS' TO STEP-DENOTER.
069      EXEC SQL WHENEVER SQLWARNING CONTINUE END-EXEC.
070      EXEC SQL WHENEVER SQLERROR GO TO ERRCHK END-EXEC.
071      ACCEPT USERID.
072      ACCEPT PASSW.
073      MOVE 'CONNECT' TO STEP-DENOTER.
074      EXEC SQL CONNECT :USERID IDENTIFIED BY :PASSW
075      END-EXEC.
076  INITIALIZE-EXIT.
077      EXIT.
078  PROCESS-SQL.
079      MOVE 'FETCH ' TO STEP-DENOTER.
080      EXEC SQL FETCH C1
081          INTO :COURSE, :TITLE, :ROOM, :INSTRUCTOR
082      END-EXEC.
083      IF SQLCODE NOT = 100
084          MOVE COURSE TO COURSE-OUT
085          MOVE TITLE TO TITLE-OUT
086          MOVE ROOM TO ROOM-OUT
087          MOVE INSTRUCTOR TO INSTRUCTOR-OUT
088          WRITE PRINT-REC FROM PRINT-LINE AFTER 1.
089  PROCESS-EXIT.
090      EXIT.
091  ERRCHK.
092 **********************************************************
093 * THE NEXT ROUTINE PRINTS THE SQLCA STRUCTURE
094 * - SQL CODE = SQL RETURN CODE
095 * - SQL ERRM = SQL ERROR MSG
096 * - SQL ERRP = MODULE DETECTING ERROR
097 * - SQL ERRD = INTERNAL ERROR VALUES
098 * - SQL WARN = SQL WARNING STRUCTURE
099 **********************************************************
100      DISPLAY SPACES.
101      DISPLAY 'CHANGES WILL BE BACKED OUT'.
102      DISPLAY SPACES.
103      DISPLAY 'A PROBLEM HAS BEEN DETECTED IN THE'.
104      DISPLAY SPACES.
105      MOVE SQLCODE TO DECODED-SQLCODE.
106      DISPLAY STEP-DENOTER.
107      DISPLAY SPACES.
108      DISPLAY 'SQLCODE: = ' DECODED-SQLCODE.
109      DISPLAY 'SQLERRM: = ' SQLERRMC.
110      DISPLAY 'SQLERRP: = ' SQLERRP.
111      EXEC SQL ROLLBACK WORK RELEASE END-EXEC.
112  ERRCHK-EXIT.
113      EXIT.
APPENDIX B. Microcomputer SQL.
Affordable versions of SQL became available on microcomputers in the late
1980s. These early versions, especially shareware programs, were quite limited, with many
commands not supported. dBASE IV implemented a fairly complete version, but its use was
cumbersome at best.
SQL on a microcomputer can be used to learn the query syntax. You can also translate a query
by example (QBE) into SQL, and examine the results to help learn SQL.
Microsoft ACCESS supports SQL, although the steps to get to it are not intuitive, since SQL does
not initially appear on the pull-down menus! Why are they hiding this powerful tool?
How to use SQL in Microsoft ACCESS:
1. Create or load a table in ACCESS.
2. Click on the Queries tab under Objects in the Database window.
3. Choose NEW in the menu of that window.
4. Choose DESIGN VIEW in the New Query window and click OK.
5. A Show Table window appears automatically. Click Close.
6. Choose SQL View from the View menu. A new window with SELECT; appears.
7. Type in a query. Be sure it ends with the semicolon.
8. To run the query (see the results), click on the red exclamation point icon, or
select Query, Run from the menu.
9. Depending on window sizes, your query window may disappear. Find the
window named Query1 (the default first query name). To edit your query, click
on View, SQL View.
10. You can save the query by clicking on the Save icon. The default name is Query1
but you can change the name to a more meaningful one. The next time you load
the database, the query will be loaded as well and will appear as a choice in the
Database window from the Query tab.
11. If you edit a query and save it, it will overwrite the existing query. To keep the
existing query and add a new query, choose File, SAVE AS and give it a different
name.
To join two tables in Microsoft Access:
Load the first table
On the Database window, click on Tables
From the top menu, Insert, Table
From the New Table window, click on Import Table & click OK
Navigate to the second table and double-click on it
In the Import window, select what you want to bring in (tables, queries, etc.)
You should now have the second table in the Database window with your first table
You are now ready to write SQL queries using both tables.
Access recommends the INNER JOIN ... ON form of join. This form was added to the ANSI standard
in 1992 and may not be available in older SQL implementations. You may use it, or you may use
the older form with the join condition in the WHERE clause. See the two examples below:
SELECT songlist.song, number, publisher
FROM songlist INNER JOIN composers
ON songlist.composer = composers.name;
is the same query as:
SELECT songlist.song, number, publisher
FROM songlist, composers
WHERE songlist.composer = composers.name;
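That the two forms return the same rows can be checked with a quick sketch. It is run here against SQLite through Python's sqlite3 module rather than Access (an assumption made so the example is self-contained); the rows in SONGLIST and COMPOSERS are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE songlist  (song TEXT, composer TEXT, number INTEGER);
    CREATE TABLE composers (name TEXT, publisher TEXT);
    INSERT INTO songlist  VALUES ('Song A', 'Composer X', 1);
    INSERT INTO songlist  VALUES ('Song B', 'Composer Y', 2);
    INSERT INTO songlist  VALUES ('Song C', 'Composer Z', 3);
    INSERT INTO composers VALUES ('Composer X', 'Publisher P');
    INSERT INTO composers VALUES ('Composer Y', 'Publisher Q');
""")

inner_join_rows = con.execute("""
    SELECT songlist.song, number, publisher
    FROM songlist INNER JOIN composers
    ON songlist.composer = composers.name
    ORDER BY number
""").fetchall()

where_rows = con.execute("""
    SELECT songlist.song, number, publisher
    FROM songlist, composers
    WHERE songlist.composer = composers.name
    ORDER BY number
""").fetchall()

print(inner_join_rows == where_rows, inner_join_rows)
```

Song C has no matching composer row, so neither form returns it.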
Differences between Standard (ANSI) SQL and Microsoft Access SQL
Access SQL is a subset of standard SQL, probably because Microsoft provides other ways within
Access to create and manipulate data and expects that you will use those, not SQL. Microsoft
even hides the SQL menu from you, until you are in a particular design query environment. This
is unfortunate.
Not supported in Access SQL:
CREATE TABLE
- done using Access Table menu
CREATE VIEW and DROP VIEW
- however, you can simulate a VIEW by using SELECT … INTO
UPDATE
DELETE
GRANT
REVOKE
NOT NULL in WHERE conditions. This is because Access won’t let you specify
that a field could be NULL when designing the table.
Wildcard characters “_” or “%”
- however, the “*” is used as a wildcard character
Data types CHAR and VARCHAR (use TEXT in table design)
Special notes:
Access allows either single or double quotes for strings.
Access SQL uses and recommends INNER JOIN ON but you can still join two
tables using standard SQL language (WHERE table.field = table.field AND)
When using GROUP BY, you must list each field in the SELECT.
Access does not recommend joining a table to itself, but it is supported.
Access SQL assumes ALL in a SELECT when you don’t use DISTINCT
Using PL/SQL in ORACLE 9I
SQL implementations by different vendors often have minor differences in features and syntax,
particularly in two areas: data types supported, and output formatting.
The following is a quick reference to some of the features of PL/SQL I found that do not match
the batch SQL used at DCC by running jobs with // EXEC ARIDBS.
The textbook reference used for the following information is Chapters 2, 3 and 4 of Guide to
Oracle9i by Morrison & Morrison, published 2003 by Thomson Course Technology.
Please note this is not meant to be a comprehensive list, nor is it meant to imply one SQL
implementation is preferable to another.
Data types:
NCHAR and NVARCHAR data types are not supported in batch SQL.
DATE, TIMESTAMP, INTERVAL YEAR TO MONTH, INTERVAL DAY TO
SECOND, LOB data types are not supported in batch SQL.
Constraints:
Constraints, including foreign and composite keys, are handled differently in batch SQL,
except that NOT NULL is the same.
Query commands:
The DESCRIBE command is not found in batch SQL; use SELECT instead.
The RENAME command is not supported in batch SQL.
DATE and INTERVAL values are handled differently; check the manuals.
The CREATE SEQUENCE and USER SEQUENCE commands are not found in batch
SQL.
Single row number functions, single row character functions and single row date
functions are much more limited in batch SQL than in Oracle.
Oracle’s SQL*Plus functions are not supported in batch SQL.
The exponentiation operator, **, does not work in batch SQL.
Procedural language constructs:
SQL by itself is not a procedural language, so it should not be expected to have
programming constructs like IF THEN or WHILE LOOP that are contained in Oracle’s
implementation.
APPENDIX C. MODULUS 11 Check Digit algorithm in COBOL.
// JOB 21100FAC 01 FAC MATT FINLEY - INSTRUCTOR
// OPTION DECK
// EXEC IGYCRCTL,SIZE=IGYCRCTL
 PROCESS LIB
 IDENTIFICATION DIVISION.
 PROGRAM-ID.    MODCALL.
 AUTHOR.        MADISON K. FINLEY.
 DATE-WRITTEN.  9-27-99.
 ENVIRONMENT DIVISION.
 DATA DIVISION.
 WORKING-STORAGE SECTION.
 01  NEEDED-FIELDS.
     05  PRODUCT             PIC 99.
     05  THE-SUM             PIC 999.
     05  REMAINDER-DIGIT     PIC 99.
     05  DONT-NEED           PIC 99.
 LINKAGE SECTION.
 01  WORK-AREA.
     05  DIGIT-1             PIC 9.
     05  DIGIT-2             PIC 9.
     05  DIGIT-3             PIC 9.
     05  DIGIT-4             PIC 9.
     05  CHECK-DIGIT         PIC 9.
 PROCEDURE DIVISION USING WORK-AREA.
 000-MAIN-PARA.
*    DISPLAY "MODCALL IS RUNNING.".
     COMPUTE THE-SUM = 0.
     COMPUTE PRODUCT = DIGIT-1 * 5.
     ADD PRODUCT TO THE-SUM
     COMPUTE PRODUCT = DIGIT-2 * 4.
     ADD PRODUCT TO THE-SUM
     COMPUTE PRODUCT = DIGIT-3 * 3.
     ADD PRODUCT TO THE-SUM
     COMPUTE PRODUCT = DIGIT-4 * 2.
     ADD PRODUCT TO THE-SUM
     DIVIDE THE-SUM BY 11 GIVING DONT-NEED
         REMAINDER REMAINDER-DIGIT.
     COMPUTE CHECK-DIGIT = 11 - REMAINDER-DIGIT.
*    DISPLAY CHECK-DIGIT.
 000-EXIT.
     EXIT.
 100-RETURN-PARA.
     EXIT PROGRAM.
 100-EXIT.
/*
/&
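For comparison, the same calculation can be sketched in a few lines of Python. The function name mod11_check_digit is made up for this illustration. The weights start at (number of digits + 1) and count down to 2, which gives 5, 4, 3, 2 for a four-digit number as in MODCALL. A result of 11 is replaced by 1 and a result of 10 by 0, the rule the Pascal program in Appendix D applies.

```python
def mod11_check_digit(digits):
    """Return the Modulus 11 check digit for a list of digits (left to right)."""
    count = len(digits)
    # Weights count down from count + 1 to 2: for 4 digits they are 5, 4, 3, 2.
    weights = range(count + 1, 1, -1)
    total = sum(d * w for d, w in zip(digits, weights))
    check = 11 - total % 11
    if check == 11:      # remainder was 0
        check = 1
    if check == 10:      # remainder was 1
        check = 0
    return check

# Worked example: 1*5 + 2*4 + 3*3 + 4*2 = 30; 30 mod 11 = 8; 11 - 8 = 3.
print(mod11_check_digit([1, 2, 3, 4]))
```

Appending the check digit 3 gives 12343. Weighting that last digit by 1 brings the sum to 33, which divides evenly by 11; this is the test a program would apply to validate an account number.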
APPENDIX D. MODULUS 11 Check Digit algorithm in Pascal.
program modulus(input,output);
{ Calculates a check digit using Modulus 11 }
{ Written by M.K.Finley. 9/27/85            }
{ Last revised 4/29/03. Bug fix 8/28/07.    }
uses CRT;
const
  max = 19;
type
  datatype = 0..max;
  indextype = 1..max;
  arraytype = array [indextype] of datatype;
var
  digit,weight,product,sum,remainder,check,total : integer;
  answer : char;
  counter : 1 .. max;
  number : arraytype;

procedure welcome;                          {initial screen}
begin
  writeln('   MODULUS.PAS        Last revised 4/28/03. ');
  writeln;
  writeln('   by M.K.Finley                   9/27/85. ');
  writeln;
  writeln('   This program calculates what the check digit is, using');
  writeln('   the Rule of Modulus 11. Use of a check digit can prevent');
  writeln('   unauthorized or invalid account numbers.');
  writeln;
end;                                        {welcome}

procedure initialize;                       {zero accumulators}
begin
  answer:='a';
  product:=0;
  sum:=0;
  total:=1
end;                                        {initialize}

procedure choice;                           {set switch to continue or exit}
begin
  write('   A. Calculate check digit.    B. End program. ');
  readln(answer);
  writeln
end;                                        {choice}

procedure howmany;                          {length of input number}
begin
  writeln;writeln;
  write(' How many digits is the number, including the check digit? ');
  readln(total);
  if total = 1 then howmany;
  total:= total - 1;
  writeln
end;                                        {howmany}

procedure input;                            {input & store each digit in an array}
begin
  for counter:= 1 to total do
    begin
      write(' Enter number in position # ',counter, ' ( L to R ): ');
      digit:=-1;
      readln(digit);
      while (digit < 0) or (digit > 9) do
        begin
          write(' Error. Try # ',counter, ' again.   ');
          readln(digit)
        end;
      number[counter]:=digit
    end;
  writeln
end;                                        {input}

procedure calculate;                        {compute & display Modulus 11 formula}
begin
  weight:=total + 1;
  for counter:= 1 to total do
    begin
      write(' Number = ',number[counter]);
      write('   weight = ',weight);
      product:=number[counter] * weight;
      write('   product = ',product);
      sum:=sum + product;
      writeln('   sum = ',sum);
      if weight > 2 then
        weight:= weight - 1;
    end;
  remainder:= sum mod 11;
  write(' Remainder = ',remainder);
  check:= 11 - remainder;
  if check = 11 then check:= 1;
  if check = 10 then check:= 0;
  writeln('   Check = ',check);
end;                                        {calculate}

procedure results;                          {display the answer}
begin
  writeln;write(' The number was: ');
  for counter:= 1 to total do
    write(number[counter]);
  writeln;write(' The check digit would be: ');
  writeln(check);
  writeln;write(' The new number should be: ');
  for counter:= 1 to total do
    write(number[counter]);writeln(check);
  writeln
end;                                        {results}

begin                                       {MAIN LOGIC}
  clrscr;
  welcome;
  initialize;
  choice;
  clrscr;
  while (answer <> 'b') and (answer <> 'B') do
    begin
      howmany;
      input;
      calculate;
      results;
      initialize;
      choice;
      clrscr
    end
end.                                        {PROGRAM}
APPENDIX E. Call an external module in Fujitsu COBOL
(Instructions adapted by Dean Finley from Gary Fidler)
______________________________________________________________
Write the calling program. Ex. CALLING.COB
Write the called program. Ex. CALLED.COB
In Programming Staff, click the Project button then click Open
- make sure you are in the folder with the source code
In the Open Project dialog box, enter a project name
ex. CALLTEST.PRJ
then click Open
In the new Open Project dialog box, click Yes to create the file
In the Target Files dialog box, click Add then click OK
In the Dependent Files dialog box, browse to and add files
Ex. CALLING.COB and CALLED.COB
In the list box, select CALLING.COB, click Main Program, then click OK
In the CALLTEST.PRJ window, click Build
When you get the Make ended message, click OK
Close the editor window that opens
In the CALLTEST.PRJ window, click Execute
In the Runtime Environment window, click OK
Check your output file
APPENDIX F. VSAM COBOL Status Codes.
CODE  CAUSE & ACTION TO TAKE
____  _______________________
00    Successful processing.
02    Duplicate key on a READ in sequential access (more records match).
04    Wrong length record on a READ.
05    Optional file is not present (I'm not sure how to get this error).
10    End of file on a READ using sequential access, or when reading duplicates in
      dynamic access.
20    Invalid key (I'm not sure how to get this error).
21    Invalid key, sequence error (for example, a REWRITE of a record that doesn't exist).
      Don't change the value of the key between READ and REWRITE.
22    Duplicate key (record already exists) on WRITE or REWRITE.
23    Key not found (record does not exist) on READ, DELETE or START.
24    Attempted to write beyond the boundary of the file.
30    Hardware parity error (let us know if you get this error).
35    Required file missing on an OPEN. Check that the file even exists.
37    Device conflict.
      Did you open a tape for I-O? Did you open a KSDS or RRDS as I-O when it is an
      empty file? Check also the COBOL SELECT for the correct ACCESS usage.
39    Can't OPEN; file attributes conflict (check the record length, the length of the
      primary key, and the length of the alternate key). Also check the COBOL SELECT
      for the correct ORGANIZATION usage.
41    Can't OPEN a file that is already open.
42    Can't CLOSE a file that is already closed.
43    Error on REWRITE or DELETE, in sequential access only, when you did not READ the
      record first to establish the record pointer.
46    Sequential access only: READ shows no more records. Did you READ past the EOF?
47    Can't READ when the file was not opened, or was opened for output.
48    Can't WRITE when the file was not opened for output or I-O.
49    Can't DELETE or REWRITE because the file was not opened for I-O.
90    Undocumented error. This code occurs when there is an error that none of the
      other status codes can explain. Difficult to diagnose this one.
91    VSAM password failed. Did you create the file with one?
92    Logic error. Make sure the record description is the same length as specified in
      the RECORDSIZE(). File not open? File opened the wrong way? File already open?
      Reading past the EOF? Many of these errors are now reported with the 30s and
      40s codes.
93    Are you working with an alternate index? If so, did you put SHAREOPTIONS(2) in
      BOTH the DEFINE CLUSTER and the DEFINE AIX? It has to be in both.
      Let us know if you get 93 for any other reason.
94    In a sequential READ or READ NEXT, the current record is undefined.
95    Incomplete or invalid information for the VSAM file.
      Error in the DLBL for the VSAM file? File not in the catalog?
      Check that the KEYS() in IDCAMS is correct for the record key in the COBOL code.
      Check your PIC clauses and calculate the displacement of the key field by adding
      up the total length of all PIC clauses before the key field. A key field in
      position 1 is a displacement of zero. Is the record key different from the KEYS
      in the DEFINE CLUSTER?
96    No DLBL for the VSAM file.
      Did you use the AIX name instead of the PATH name with a cluster that uses an
      alternate index?
97    OPEN was successful and integrity verified, but the cluster was not properly
      closed during the last processing run.
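
To see one of these codes, a program has to name a status field on its SELECT and test it after each I/O statement. The following is a minimal sketch of that pattern, not a definitive implementation: the file name CUSTOMER-FILE, the assignment name CUSTMAS, the record layout, and the key value "00100" are all hypothetical placeholders, and the ASSIGN clause in particular will differ by system.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STATCHK.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * CUSTMAS is a placeholder; use your system's assignment name.
           SELECT CUSTOMER-FILE ASSIGN TO CUSTMAS
               ORGANIZATION IS INDEXED
               ACCESS MODE IS RANDOM
               RECORD KEY IS CUST-KEY
               FILE STATUS IS WS-FILE-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
       01  CUSTOMER-RECORD.
           05  CUST-KEY            PIC X(5).
           05  CUST-NAME           PIC X(25).
       WORKING-STORAGE SECTION.
      * The two-character status field is set by every I/O statement.
       01  WS-FILE-STATUS          PIC XX.
           88  STATUS-OK               VALUE "00".
           88  OPEN-OK                 VALUE "00" "97".
           88  STATUS-NOT-FOUND        VALUE "23".
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT CUSTOMER-FILE.
      * 97 is listed above as a successful OPEN, so accept it too.
           IF NOT OPEN-OK
               DISPLAY "OPEN FAILED, STATUS = " WS-FILE-STATUS
               STOP RUN
           END-IF.
           MOVE "00100" TO CUST-KEY.
      * INVALID KEY CONTINUE leaves the decision to the status test.
           READ CUSTOMER-FILE
               INVALID KEY CONTINUE
           END-READ.
           EVALUATE TRUE
               WHEN STATUS-OK
                   DISPLAY "FOUND: " CUST-NAME
               WHEN STATUS-NOT-FOUND
                   DISPLAY "KEY NOT FOUND: " CUST-KEY
               WHEN OTHER
                   DISPLAY "READ ERROR, STATUS = " WS-FILE-STATUS
           END-EVALUATE.
           CLOSE CUSTOMER-FILE.
           STOP RUN.
```

Displaying the raw status value in the catch-all branch is a deliberate choice: it gives you the exact code to look up in the table above.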
FILE ORGANIZATION
STUDY GUIDE FOR CIS 211
Madison K. Finley
Associate Dean of Academic Affairs Emeritus
Adjunct Lecturer, Department of Engineering, Architecture and Computer Technologies
Adjunct Lecturer, Department of Performing, Visual Arts and Communications
Certified Computing Professional (C.C.P.)
I gratefully acknowledge the support of the DCC Computer Information Systems faculty, staff, and students
for all that I have learned over the last twenty-five years. Specific recognition goes to Professor Emeritus
William G. Kleinhomer for his lasting contributions to the development of this course, to Dr. Frank Whittle
for his knowledge and continuing support, and to Gary Fidler for all his assistance in adapting to a new
system.
Copyright ©2007 by Madison K. Finley
All rights reserved. This book or parts thereof may not be reproduced or used in any form
or by any means without written permission of the author. Making copies of this book, or
any portion, is a violation of United States copyright laws.
Throughout this guide, trademarked names are used. Rather than place a trademark
symbol in every occurrence of a trademarked name, please note that the names are used
only in an editorial fashion to the benefit of the trademark owner with no intention of
infringement of the trademark.
First printing: August 1987.
Eleventh printing: September 2007.
Dutchess Community College
Poughkeepsie, New York 12601