Chapter 9 File and Database Processing Programming Logic for Business Copyright 2001 Laura Saret Laura Saret EdD Chapter Outline What Are File Organization and Access? What Are the Various Types of Files Used by Business Systems? Example: Sequential File Merging Program Example: Sequential File Matching Program Example: Sequential File Update Program How Are Direct and Indexed Files Processed? What Are Pointers, Chains, and Rings? Why Use Databases Instead of Traditional Files? What Are the Models for Database Organization? How Do You Select a File or Database Organization? Chapter Objectives After completing this chapter you should be able to.… Explain the differences between sequential and direct (random) access and give examples of applications that use each. Describe the various types of business files including master files, transaction files, table files, temporary files, log files, mirror files, and archive files. Describe the two ways of organizing data on sequential files. Discuss the purposes of sequential file merging, matching, and update programs, as well as construct the logic for such programs. Chapter Objectives (continued) After completing this chapter you should be able to.… Explain the advantages and disadvantages of nonsequential processing. Explain methods for backup. Describe how records are accessed nonsequentially by their addresses, including the purposes of randomizing routines, pointers, chains, and rings. Explain how indexed files are organized and accessed. Describe the issues involved in selecting a file organization. Chapter Objectives (continued) After completing this chapter you should be able to.… Understand the data redundancy, integrity, and independence issues involved with traditional file organization. Explain the purpose of a database management system. Explain how data are organized on databases using the hierarchical, network, and relational models as well as how object-oriented databases differ from traditional databases. Discuss selecting a particular file or database organization. File Organization Describes how records are stored within a file Four organization methods Serial Sequential Direct (relative) Indexed All other organization schemes are variations of or combinations of these methods Serial Organization Records are stored in chronological order (in order as they are input or occur) Examples Recording of telephone charges ATM transactions Telephone queues Sequential Organization Records are stored in order based on a key field Key field A field containing a value that uniquely identifies a record Examples Names Account numbers Student numbers Example Phone directories Direct (Relative) Organization Each record is stored based on a physical address or location on the device Address is calculated from the value stored in the record’s key field Randomizing routine or hashing algorithm does the conversion Disadvantages Creating a good randomizing routine is difficult Collisions Wasted space on the device can be considerable Sequential processing may not be possible Indexed Organization Records can be processed both sequentially and nonsequentially using indexes File Access Method by which computer programs read or write records on a file Sequential access Direct (random) access Sequential Access Every record on the file is processed starting with the first record until EOF is reached Sequential Access Suppose we only want to access the 4th record The 3 records that precede it must be accessed first Inefficient when records must be processed in a sequence other than their physical order on the device Sequential Access Is Efficient When… A large number of the records on the file must be accessed at any given time Direct (Random) Access A high percentage of today’s information processing must be done nonsequentially Records are located by knowing their physical locations or addresses on the device rather than their positions relative to other records Compare Looking for a Recording on a CD versus on a Cassette Tape DIRECT SEQUENTIAL Data stored on a CD device (direct-access) can be accessed either sequentially or nonsequentially Data stored on a tape (sequential access) can be accessed only sequentially Main Disadvantage of Nonsequential Access You must know a record’s address to find it Process of calculating or looking up an address takes more time than getting the “next record” as is done with sequential access You must weigh… amount of time it takes to find individual records number of records to be accessed Types of Files Used by Business Systems Mirror Archive Files Temporary Files Files Master Files Contain the current information for a system customer file student file telephone directory Table Files Type of master file Change infrequently Stored in a tabular format Zipcode Transaction Files Contain the day-to-day information generated from business activities Used to update or process the master file Temporary Files Created and used whenever needed by a system Mirror Files Exact duplicates of other files Help minimize the risk of downtime in cases when the original becomes unusable Must be modified each time the original file is changed Log Files Contain copies of master and transaction records in order to chronicle any changes that are made to the master file Facilitate auditing Provide mechanism for recovery in case of system failure Archive Files Backup files that contain historical versions of other files Sequential File Merging Programs Combine two or more sequential or serial files into a single file Examples merge the phone directories of two cities combine transactions from more than one day Merging cassette tapes + Tape A Tape B Tape C Merging Cassette Tapes + Tape A Tape B Tape C Copy all the music from tape A to tape C Without rewinding tape C, copy all the music from tape B onto tape C Suppose the Cassette Tapes have Been Recorded so that Music Is in Order by Recording Date—You Want the Merged Tape to Be in the Same Order + Tape A Tape B Tape C You need to switch back and forth between Tape A and Tape B to make sure they are copied in the correct order to Tape C Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Adams Andrews Baker Berkowitz Desai Crown Fisher Fisher Smith Miller Patel Tripp Wu Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Merge File Adams Andrews Adams Baker Berkowitz Desai Crown Fisher Fisher Smith Miller Patel Tripp Wu Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Merge File Adams Andrews Adams Baker Berkowitz Andrews Desai Crown Fisher Fisher Smith Miller Patel Tripp Wu Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Merge File Adams Andrews Adams Baker Berkowitz Andrews Desai Crown Baker Fisher Fisher Smith Miller Patel Tripp Wu Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Merge File Adams Andrews Adams Baker Berkowitz Andrews Desai Crown Baker Fisher Fisher Berkowitz Smith Miller Patel Tripp Wu Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Merge File Adams Andrews Adams Baker Berkowitz Andrews Desai Crown Baker Fisher Fisher Berkowitz Smith Miller Crown Patel Tripp Wu Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Merge File Adams Andrews Adams Baker Berkowitz Andrews Desai Crown Baker Fisher Fisher Berkowitz Smith Miller Crown Patel Desai Tripp Wu Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Merge File Adams Andrews Adams Baker Berkowitz Andrews Desai Crown Baker Fisher Fisher Berkowitz Smith Miller Crown Patel Desai Tripp Fisher Wu Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Merge File Adams Andrews Adams Baker Berkowitz Andrews Desai Crown Baker Fisher Fisher Berkowitz Smith Miller Crown Patel Desai Tripp Fisher Wu Fisher Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Merge File Adams Andrews Adams Baker Berkowitz Andrews Desai Crown Baker Fisher Fisher Berkowitz Smith Miller Crown Patel Desai Tripp Fisher Wu Fisher Miller Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Merge File Adams Andrews Adams Baker Berkowitz Andrews Desai Crown Baker Fisher Fisher Berkowitz Smith Miller Crown Patel Desai Tripp Fisher Wu Fisher Miller Patel Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Merge File Adams Andrews Adams Baker Berkowitz Andrews Desai Crown Baker Fisher Fisher Berkowitz Smith Miller Crown Patel Desai Tripp Fisher Wu Fisher Miller Patel Smith Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Merge File Adams Andrews Adams Baker Berkowitz Andrews Desai Crown Baker Fisher Fisher Berkowitz Smith Miller Crown Patel Desai Tripp Fisher Wu Fisher There are no more records on this file We finish copying records from this file Miller Patel Smith Let’s Merge 2 Phone Directory Files Names on File 1 Names on File 2 Merge File Adams Andrews Adams Baker Berkowitz Andrews Desai Crown Baker Fisher Fisher Berkowitz Smith Miller Crown Patel Desai Tripp Fisher Wu Fisher We finish copying records from this file Miller Patel Smith Tripp Let’s Merge 2 Phone Directory Files Names on File 1 Adams Baker Desai Fisher Smith Names on File 2 Merge File Andrews Berkowitz Crown Fisher Miller Patel Tripp Adams Andrews Baker Berkowitz Crown Desai Fisher Fisher Miller Patel Smith Tripp Wu Wu Let’s Examine the Logic for a Merge Program (page 391) Start Do Initialize Done-File1 =? "NO" "YES" Done-File2 =? "NO" "YES" Do Terminate Stop Do Merge Terminate is called when either input file reaches EOF The indicators, Done-File1 and Done-File2, are used to ensure that neither file is read past EOF Terminate finishes processing whichever file still contains records Initialize Assign initial values Read a record from each input file Initialize Assign names and initial values in work area Done-File1 = "NO" Done-File2 = "NO" Open Files Name1 Address1 City1 Phone1 Read a record from File1 EOF? Yes Move "Yes" to Done-File1 No Name2 Address2 City2 Phone2 Read a record from File2 EOF? Yes No Return Move "Yes" to Done-File2 The Merge Routine Determines which Record to Write on the Merge File Merge Is Name1 < Name2? Yes No Write Name1, Address1, City1, Phone1 on MergeFile Write Name2, Address2, City2, Phone2 on MergeFile Read a record from File1 Read a record from File2 EOF? EOF? Name1 Address1 City1 Phone1 Move "Yes" to Done-File1 yes Name2 Address2 City2 Phone2 yes no no Return Move "Yes" to Done-File2 How Does Terminate Finish Processing the Files? Terminate Done-File1 =? "YES" "NO" "NO" Done-File2 =? "YES" Write Name2, Address2, City2, Phone2 on Merge-File Name2 Address2 City2 Phone2 Read a record from File2 No EOF? Write Name1, Address1, City1, Phone1 on Merge-File Name1 Address1 City1 Phone1 Yes Yes Close Files Return Read a record from File1 EOF? No Makes sure that neither file is read past EOF Initialized to “NO” in Initialize routine As long as both indicators contain the value “NO”, Mainline continues to do the Merge routine As soon as one of the indicators becomes “YES”, Mainline calls Terminate When Terminate is called, either Done-File1 or DoneFile2 is “NO”—the other indicator is “YES” Terminate finishes processing the file that still contains records Done-File1 and Done-File2 Sequential File Matching Programs Information from records contained on the master file are needed to process the transaction records The example in section 9.4 (page 393) generates a billing report using… Purchase amounts from transaction records Customer numbers, customer names, addresses, and discount type from master records Master File Billing Report Transaction File Why Use 2 Files to Produce a Billing Report? Accuracy of data input Speed of data input Each time a billing report is generated, you must enter… amount of purchases You do not need to enter constant data… customer names customer addresses discount type Data Dictionary—Input Files (page 394) Both files are in the same order—increasing order by customer number All data have been validated prior to this program Why do both the Master-Record and Transaction-Record contain the customer number? There may be… no master record for a transaction record no transaction record for a master record a match based on the customer numbers—there may be one or more transaction records for a matching master record The Report See page 393 When there is a match, line 5 is generated When there is a transaction with no master, line 6 is generated When there is a master with no transaction, line 7 is generated Let’s Consider Some “Real” Data Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 11 Let’s Consider Some “Real” Data Master File Transaction File Customer Numbers Customer Numbers 1 Match 1 Match 2 No Trans. 4 Match 4 Match 4 Match 6 No Trans. 5 No Master 7 No Trans. 5 No Master 8 Match 8 Match 9 No Trans. 11 No Trans. Hierarchy Chart (page 396) Mainline ProcessInputs Initialize Read-Mast Read-Trans No-Trans Headings Calculations Terminate Match Read-Mast No-Mast Headings Read-Trans Headings Read-Trans Start Mainline Routine Do Initialize What is High-Values? makes sure that we don’t stop processing until both files are at EOF makes sure that we don’t read “past” EOF Look at Read-Mast (page 398) and Read-Trans (page 399) when EOF is obtained, HighValues is moved to the customer number (Cust-No-M or Cust-No-T) Is Cust-No-M or Cust-No-T = High-Values? Yes No Do Process-Inputs Are Cust-No-M and Cust-No-T = No High-Values? Yes Do Terminate Stop How Does Initialize Use HighValues (page 397) INITIALIZE ASSIGN names and initial values in work area HEADING-1 “BILLING REPORT” “PAGE” PAGE-NUMBER = 0 HEADING-2 “CUST.” “NAME” “ADDRESS” “PURCHASES” “BILLING” OTHER LINES-COUNTER = 0 MAX-LINES = 49 HIGH-VALUES PURCH-AMT-TOTAL BILL-AMT-TOTAL CURRENT-CUST-NO OPEN files DO READ-MAST routine IF CUST-NO-M = HIGH-VALUES THEN WRITE “MASTER FILE IS EMPTY” on report ENDIF DO READ-TRANS routine IF CUST-NO-T = HIGH-VALUES THEN WRITE “TRANSACTION FILE IS EMPTY” on report ENDIF LINES-COUNTER = MAX-LINES ENDINITIALIZE Let’s Look at the Sample Files Again and See How Process-Inputs (page 398) Works Initialize reads a record from each input file Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast Return > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 11 Let’s Look at the Sample Files Again and See How Process-Inputs (page 398) Works Process-Inputs calls Match (see page 400) Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast Return > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 11 What Does Match Do? Master File Initializes Purch-Amt-Total and Bill-Amt-Total Customer Numbers to zero Purch-Amt-Total accumulates the total 1 amount of purchases for all transactions 2 with the current customer number 4 Bill-Amt-Total accumulates the total billing amount for all transactions with the 6 current customer number 7 The current customer number is saved in Current-Cust-No to determine when to stop 8 executing the loop that follows 9 The loop calls Calculations to add the 11 appropriate amounts to the totals The next transaction record is read Transaction File Customer Numbers 1 4 4 5 5 8 What Does Match Do? Master File Initializes Purch-Amt-Total and Bill-Amt-Total Customer to zero Numbers Purch-Amt-Total accumulates the total amount of purchases for all transactions 1 with the current customer number 2 Bill-Amt-Total accumulates the total billing 4 amount for all transactions with the current customer number 6 The current customer number is saved in 7 Current-Cust-No to determine when to stop executing the loop that follows 8 The loop calls Calculations to add the 9 appropriate amounts to the totals 11 The next transaction record is read If the transaction has the same customer number as the current customer number, the loop is done again Otherwise, we write a line on the report, and exit the routine Transaction File Customer Numbers 1 4 4 5 5 8 We Exit Match and Return to ProcessInputs to Read the Next Master Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 Return 11 The Master Is Less Than the Transaction Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 Return 11 What Does No-Trans (page 399) Do? Writes a line on the report Returns to Process-Inputs Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 11 We Return to Process-Inputs to Read the Next Master Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 Return 11 We Have a Match Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 Return 11 What Does Match Do? Master File Initializes Purch-Amt-Total and Bill-Amt-Total Customer Numbers to zero Purch-Amt-Total accumulates the total 1 amount of purchases for all transactions 2 with the current customer number 4 Bill-Amt-Total accumulates the total billing amount for all transactions with the 6 current customer number 7 The current customer number is saved in Current-Cust-No to determine when to stop 8 executing the loop that follows 9 The loop calls Calculations to add the 11 appropriate amounts to the totals The next transaction is read Transaction File Customer Numbers 1 4 4 5 5 8 What Does Match Do? Master File Transaction File Initializes Purch-Amt-Total and Bill-Amt-Total to zero Purch-Amt-Total accumulates the total amount of purchases for all transactions with the current customer number Bill-Amt-Total accumulates the total billing amount for all transactions with the current customer number The current customer number is saved in Current-Cust-No to determine when to stop executing the loop that follows The loop calls Calculations to add the appropriate amounts to the totals The next transaction is read Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 If the transaction has the same customer number as the current customer number, the loop is done again Otherwise, we write a line on the report, and exit the routine 9 11 What Does Match Do? Master File Transaction File Initializes Purch-Amt-Total and Bill-Amt-Total to zero Purch-Amt-Total accumulates the total amount of purchases for all transactions with the current customer number Bill-Amt-Total accumulates the total billing amount for all transactions with the current customer number The current customer number is saved in Current-Cust-No to determine when to stop executing the loop that follows The loop calls Calculations to add the appropriate amounts to the totals Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 11 What Does Match Do? Master File Transaction File Initializes Purch-Amt-Total and Bill-Amt-Total to zero Purch-Amt-Total accumulates the total amount of purchases for all transactions with the current customer number Bill-Amt-Total accumulates the total billing amount for all transactions with the current customer number The current customer number is saved in Current-Cust-No to determine when to stop executing the loop that follows The loop calls Calculations to add the appropriate amounts to the totals The next transaction is read Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 If the transaction has the same customer number as the current customer number, the loop is done again Otherwise, we write a line on the report, and exit the routine 9 11 We Exit Match and Return to ProcessInputs to Read the Next Master Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 Return 11 The Master Is Greater Than the Transaction Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 Return 11 What Does No-Mast (page 401) Do? Writes a line on the report Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 11 What Does No-Mast (page 401) Do? Writes a line on the report Reads transaction records until the customer number on the transaction record has changed Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 11 What Does No-Mast (page 401) Do? Writes a line on the report Reads transaction records until the customer number on the transaction record has changed Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 11 We Return to Process-Inputs—Notice that a Master Record is NOT Read Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 Return 11 We Return to Mainline, Which Sends Us Back to Process-Inputs--The Master Is Less Than the Transaction Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 Return 11 What Does No-Trans (page 399) Do? Writes a line on the report Returns to Process-Inputs Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 11 We Return to Process-Inputs to Read the Next Master Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 Return 11 Once Again, We Do No-Trans Write a line on the report Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 11 Return Once Again, We Do No-Trans Read the next master Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 11 Return We Have a Match Master File Initializes Purch-Amt-Total and Bill-Amt- Customer Numbers Total to zero 1 Purch-Amt-Total accumulates the total amount of purchases for all 2 transactions with the current customer 4 number 6 Bill-Amt-Total accumulates the total 7 billing amount for all transactions with the current customer number 8 The current customer number is saved in 9 Current-Cust-No to determine when to 11 stop executing the loop that follows The loop calls Calculations to add the appropriate amounts to the totals The next transaction record is read Transaction File Customer Numbers 1 4 4 5 5 8 We Have a Match Initializes Purch-Amt-Total and Bill-AmtTotal to zero Purch-Amt-Total accumulates the total amount of purchases for all transactions with the current customer number Bill-Amt-Total accumulates the total billing amount for all transactions with the current customer number The current customer number is saved in Current-Cust-No to determine when to stop executing the loop that follows The loop calls Calculations to add the appropriate amounts to the totals The next transaction record is read Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 High-Values We get EOF which moves High-Values to the customer number The customer number is not the same, so we write a line on the report, and exit the routine 11 We Exit Match and Return to ProcessInputs to Read the Next Master Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast Return > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 High-Values 11 From Now Until EOF on the Master File, the Customer Number on the Master will be < the Customer Number on the Transaction Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast Return > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 High-Values 11 So, We Continue to Do No-Trans and Read-Mast Until We Reach EOF on the Master Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Read-Mast Do Match Do Read-Mast Return > Do No-Mast Master File Transaction File Customer Numbers Customer Numbers 1 1 2 4 4 4 6 5 7 5 8 8 9 High-Values 11 If the Master File Had Ended Before the Transaction File, Then… Process-Inputs Cust-No-M: < Cust-No-T = Do No-Trans Do Match Do Read-Mast Do Read-Mast > Do No-Mast Return We Would Have Moved High-Values to the Master’s Customer Number and Kept Doing No-Mast Until EOF Was Obtained on the Transaction File Sequential File Update Programs Modifies a sequential master file by adding, changing, and deleting records Similar logic to matching programs matching programs -- records from the master file are used to process the transactions update programs – records from the transaction file are used to add, change, and delete records on the master file Old master file Transaction file Update program New master file Update report Exception report Sequential File Update Programs The old master file is saved as backup The new master file becomes the old master file the next time the update program is executed Old master file Transaction file Update program New master file Update report Exception report Two Reports are Produced (page 408) How are two reports produced simultaneously with one printer? Spooling Old master file Transaction file Update program New master file Update report Exception report Spooling Reports are written to a disk or tape (instead of to the printer) Update program Reports are printed using a utility program whenever the printer is available Let’s Compare the Processing with Making Changes on a Cassette Tape on which You Have Previously Recorded Songs Delete Song 4 Song 1 Song 2 Song 3 Song 4 Song 5 Song 1 Song 2 Song 3 Song 5 Let’s Compare the Processing with Making Changes on a Cassette Tape on which You Have Previously Recorded Songs Change Song 3 Song Song Song Song Song 1 2 3 4 5 Song Song Song Song Song 1 2 3 4 5 Let’s Study the Input Files (page 410) Each transaction record contains a code indicating whether the transaction is for an add (A), change (C), or delete (D) Both files are in sorted order by customer number Transaction file has been previously validated to ensure that … the code is valid there is a maximum of 1 transaction for each customer number Adds, Changes, and Deletes Code = A new customer a new master record is created with the data from the transaction record an error will occur if the master record already exists Code = D customer is to be deleted (account closed) the transaction only contains the account number an error will occur if the master record does NOT exist Code = C change to record on master file A new master record is created with the changes from the transaction record an error will occur if the master does NOT exist What Changes Can Occur? customer name customer address discount type customer number can NOT be changed Look at Mainline (page 412) What does it mean if Cust-No-T = High-Values after returning from Initialize? there were no records on the Transaction File the program is terminated because it does not make sense to do an update program if there are NO records on the transaction file Initialize writes an error on the Exception Report Mainline does not check Cust-No-M after returning from Initialize. Why not? if there are no records on the master file, the update program can be used to create a new master file all of the transactions must be adds Mainline calls Update-File until both files reach EOF Look at Initialize (page 413) It doesn’t matter whether the Master File or Transaction File is read first Initialize writes an error if there are no records on the Transaction File No error occurs if the Master File is empty Mainline Calls Update-File (page 414) Update-File < Cust-No-M: Cust-No-T > = Do No-Trans Do Match Return Do No-Mast What Happens if Cust-No-M < Cust-No-T? No-Trans is done (page 415) The record on the old master file is copied to the new master file The next master record is read What Happens if Cust-No-M > Cust-No-T? No-Mast is done (page 416) The only “valid” code is A (add) Do Add-Mast (page 418) Write the new master record Write a line on the update report Read the next transaction What Happens if Cust-No-M = Cust-No-T? Match is done (page 415) If the Code is “A”, an error results An error message is written on the report the next transaction is read If the Code is “D”, the master is deleted Delete-Mast (page 416) writes a message on the report the next master is read the next transaction is read If the code is “C”, the master is changed Change-Mast (page 417) writes the changed master on the master file and a message on the report the next master is read the next transaction is read What Data Should We Use to Test the Program? Combinations for… master < transaction master = transaction master > transaction < condition followed by… another < condition > condition = condition = condition followed by… another = condition < condition > condition > condition followed by… another > condition < condition = condition What Data Should We Use to Test the Program? Add with… = condition < condition > condition Change with… = condition < condition > condition Delete with = condition < condition > condition One data set should have the master file reach EOF before the transaction file ends Another data set should have the transaction file reach EOF before the master file ends Let’s Consider Nonsequential Processing Advantages You can find a record without accessing all preceding records Transaction records can be in any order—they don’t have to be sorted Updating does not require an entirely new master file Disadvantages Creation of a backup must be done as a separate processing step—no old master to use as backup To access a record, the program must first determine the address—this is slow compared to “getting the next record” Updates – Direct Files Access records by calculating their physical addresses Adding a record calculate where it should be stored write it on the device Changing a record change the values stored in particular fields rewrite the record on the device in the same physical location it was in originally Deleting a record not physically deleted marked as a deletion—we logically delete it each time we access a record, we check to see whether it has been logically deleted Updates - Indexed Files Three areas are used on the storage device index area contains record keys and addresses stored in sequential order prime data area contains records originally on file overflow data area Contains records added to the file Need for reorganization Indexed files can be maintained by programmers even when the programming language does not provided for indexed processing maintain indexes in tables in sequential files store records in a direct file Pointers, Chains, and Rings Pointers are used in nonsequential processing to maintain a logical order even though records are located randomly Pointers create chains of records RECORD 1 R E C O R D RECORD 3 R E C O R D RECORD 7 * 2 4 RECORD 2 R E C O R D RECORD 4 R E C O R D 3 5 RECORD 5 R E C O R D 6 RECORD 6 R E C O R D 7 Pointers, Chains, and Rings Sometimes files contain more than one pointer This example shows forward and backward pointers RECORD 1 R E C O R D * 2 RECORD 4 RECORD 2 R E C O R D R E C O R D 3 1 RECORD 3 R E C O R D R E C O R D 4 2 R E C O R D R E C O R D 5 3 RECORD 5 * R E C O R D 4 Pointers, Chains, and Rings When a ring is used, there is no “end of chain” RECORD 1 R E C O R D RECORD 3 R E C O R D RECORD 7 2 4 RECORD 2 R E C O R D RECORD 4 1 R E C O R D 3 5 RECORD 5 R E C O R D 6 R E C O R D RECORD 6 R E C O R D 7 Why Use Databases Instead of Traditional Files? Data redundancy Are the same fields are used in many different files Data integrity Are the same data maintained in different files consistent? Data independence If a change is made to the file or database structure, do programs have to be changed to accommodate the new structure? Database Management System (DBMS) A set of programs that serve as the interface between application programs and the database Provides capabilities for the following… creating or modifying the file structure field names field lengths data types special attributes confidential fields can be accessed only by providing the correct password protected fields can be access but cannot be changed without providing the correct password Data entry importing data from another file downloading data from another computer or server Entering data interactively using input forms or screens Data retrieval (extracting data) programming languages allow for customization Models for Database Organization Physical view deals with the actual location and storage of data as well as the pointers—DBMS worries about this Logical view deals with the relationship between records and fields— programmer works with this There are many models used to represent logical views hierarchical model network model relational model object-oriented model Hierarchical Model Developed in 1960s to be used on large mainframes One-to-many relationships Each record is called a node Subordinate items are called children Superior items are called parents President Vice President of Finance and Facilities Dean of Allied Health Vice President of Community Affairs Dean of Business and Social Sciences Vice President of Student Affairs Dean of Humanities and the Arts Vice President of Academic Affairs Dean of Mathematics and Sciences Network Model Many-to-many relationships are possible Science Dept. Biology Student 1 Student 2 Chemistry Student 3 Physics Student 4 Geology Student 5 Student 6 Relational Model Represents data in two-dimensional tables Rows represent records Columns represent fields Name Cohn Test 1 48 Test 2 50 Test 3 68 Test 4 89 Test 5 98 Test 6 92 Test 7 80 Kim 85 77 77 90 79 81 80 Katsumoto 91 41 90 76 58 85 87 Brown 90 90 92 85 77 68 72 Johnson 68 78 45 69 72 73 80 Patel 100 90 92 92 80 79 88 Jerwinski 100 98 88 79 88 92 79 Object-Oriented Models Integrates object-oriented concepts (such as encapsulation and inheritance) with traditional database capabilities Can handle user-defined data types How Do You Select a File or Database Organization? Ease of implementing and maintaining a particular type of file or database organization Availability of software Cost The nature of the application volatility (frequency of changes) activity (percentage of records that are actually used each time the program is run) query (retrieval of specific information) requirements security requirements