INPUT CONTROLS: DATA AND INSTRUCTION INPUT

Components in the input subsystem are responsible for bringing information into a system. This information takes two forms: first, it may be raw data to be processed; second, it may be instructions that direct the system to execute particular processes, update or interrogate particular data, or prepare particular types of output. This chapter examines controls over the capture, preparation, and entry of data and instructions into a system. Allen [1977] studied 156 cases of computer fraud and found that 108 of the cases involved addition, deletion, or alteration of an input transaction.

DATA CAPTURE METHODS

Historically, document-based data capture has been used most frequently because the technology needed to support direct entry and hybrid methods has been costly. The costs are decreasing quickly, however, and direct entry and hybrid methods are now widely used.

Document-Based Data Capture
➢ When document-based data capture methods are used, some type of data preparation activity, such as scanning or keyboarding, is also undertaken.
➢ The advantage of document-based methods is that they are easy and flexible.
➢ Documents can be readily distributed close to the points of data capture.
➢ Expensive data capture and input devices are not needed at each location where data capture occurs.
➢ However, they often require substantial amounts of human intervention.

Direct Entry Data Capture
➢ Involves immediate recording of an event as it occurs using an input device.
➢ The risks of clerical or operator errors decrease.
➢ Immediate validation of data can be undertaken to provide operators with feedback on data capture errors.
➢ If many data capture points are required, it may be difficult to provide direct entry facilities at each location.

Hybrid Data Capture
➢ Uses a combination of document-based and direct entry techniques.
➢ The hardware and software needed to support hybrid data capture methods are still costly, although costs are decreasing rapidly.

DATA PREPARATION METHODS

Data preparation comprises one or more of the following tasks:
➢ Converting data to machine-readable form
➢ Converting data from one machine-readable form to another
➢ Preparing batches and control totals
➢ Scanning for authenticity, accuracy, completeness, and uniqueness
➢ Verifying data converted to machine-readable form. Verification is costly, so it should be used for critical fields where errors are difficult to detect with an input validation program.

DIRECT ENTRY DEVICES

There are several direct entry devices, such as voice recognition units, process control devices, light pens, joysticks, and mice. For now we look at some of the most widely used.

Point-of-Sale Terminals
➢ Optical scanning of a premarked code, for example the universal product code, enables faster throughput of items.
➢ Increased accuracy in pricing items.
➢ Reduced price marking upon receipt of an item and upon change of the item's price.
➢ Customer satisfaction: a more detailed customer receipt is printed.
➢ Improved control over tender, since the terminal controls the cash drawer.
➢ Better inventory control and shelf allocation through more timely information on item sales.

Automatic Teller Machines
➢ They are designed to be physically secure; they have the same anti-theft features as a safe.
➢ Camera surveillance must be undertaken, and heat, motion, and sound detectors might be installed.
➢ ATMs usually provide some type of facility for entering a cryptographic key.
➢ If device control software is not secure, fraudulent modifications can be carried out so that an ATM dispenses all its cash when the software recognizes a particular card.

INPUT DEVICES

Input devices are used to read the data into the application.
Since most input devices function reliably, however, the auditor's primary concern is that a regular maintenance schedule for these devices be maintained. Nonetheless, the auditor should understand what types of errors will be prevented, detected, and corrected by the controls in the input devices.

Card Readers
Card reader malfunctions occur for three reasons:
(a) Cards are defective in some way.
(b) Mechanical components have failed, so that cards do not move across the read stations in the correct positions or during the correct timing intervals.
(c) Electronic components have failed, so that photoelectric cells or brushes in the read station fail to sense the card correctly.

To detect card reader malfunctions, four types of controls are used:
➢ Dual read: the card is read twice, either by two different read stations or by the same read station, and the results of the two reads are compared.
➢ Hole count: the card is read twice, and the counts of the holes in each column and row made during each read are compared.
➢ Echo check: the central processor sends a message to the card reader to activate the read function, and the card reader returns a message to the central processor to indicate it has been activated.
➢ Character check: the card reader checks that the combination of holes read represents a valid character.

Magnetic Ink Character Recognition (MICR) Devices?
Optical Character Recognition (OCR) Devices?
Optical Mark Sensing Devices?

SOURCE DOCUMENT DESIGN

The auditor must understand the fundamentals of good source document design. As a basic data input control, a well-designed source document achieves several purposes:
➢ Increases the speed and accuracy with which data can be recorded.
➢ Controls the workflow.
➢ Facilitates preparation of the data in machine-readable form.
➢ For pattern recognition devices, increases the speed and accuracy with which data can be read.
➢ Facilitates subsequent reference checking.

Three decisions can be made after source document analysis:

Choice of Medium
➢ Choices of length and width, grade, and weight need to be made.
➢ A wrong choice of length and width, or of grade and weight, can cause a variety of problems.

Choice of Makeup
➢ There are four types of makeup: padding, multipart sets, continuous forms, and snap-apart sets.
➢ Choosing the wrong makeup results in input errors through documents tearing and data being too lightly written to be read.

Choice of Layout and Style
Some general design guidelines are:
1) Preprint wherever possible: preprint the responses and have users tick the correct response.
2) Provide titles, headings, notes, and instructions.
3) Use techniques for emphasis and to highlight differences.
4) Arrange fields for ease of use.
5) Where possible, provide multiple-choice answers to questions to avoid omissions.
6) Use boxes to identify field size errors.
7) Combine instructions with questions.
8) Space items appropriately on forms.
9) Design for ease of keying.
10) Prenumber source documents.
11) Conform to the organization's standards.

DATA CODE CONTROLS

Data codes have two purposes. First, they uniquely identify an entity or identify an entity as a member of a group or set. Second, codes are often more efficient than textual or narrative descriptions, since they require a smaller number of characters to carry a given amount of information.

Design Requirements
A well-designed coding system achieves:
➢ Flexibility: easy addition of new items or categories.
➢ Meaningfulness: where possible, a code should indicate the values of the attributes of the entity.
➢ Compactness: maximum information conveyed with a minimum number of characters.
➢ Convenience: a code should be easy to assign, encode, decode, and key.
➢ Evolvability: where possible, a code can be adapted to changing user requirements.

Data Coding Errors
➢ Addition
➢ Truncation
➢ Transcription
➢ Transposition
➢ Double transposition

Types of Codes

Serial Codes
➢ Serial coding systems assign consecutive numbers or alphabetics to an entity irrespective of the attributes of the entity.
➢ The advantages of a serial code are the ease with which a new item can be added and its conciseness.
➢ Deleted items must have their codes reassigned to new items.

Block Sequence Codes
➢ Block sequence codes assign blocks of numbers to particular categories of an entity.
➢ Block sequence codes have the advantage of giving some mnemonic value to the code.
➢ There are problems in choosing the size of the block needed (and the remedy if overflow occurs) and in ensuring blocks are not so large that characters are wasted and the code is no longer concise.

Hierarchical Codes
➢ They require the selection of the set of attributes of the entity to be coded and their ordering in terms of importance.
➢ The value of the code for the entity is a combination of the values of the codes for each attribute of the entity.
➢ They are more meaningful to their users.
➢ They carry more information about the entity to which they are assigned.
➢ Sometimes they present problems when changes occur.

Association Codes
➢ The attributes of the entity to be coded are selected, and unique codes are assigned to each attribute value.
➢ The code for the entity is simply a linear combination of the different codes assigned to the attributes of the entity.
➢ They carry substantial information about the entity they represent.
➢ They are not concise.
➢ An example is SHM32DRCOT.

CHECK DIGITS

In some cases errors made in transcribing or keying data can have serious consequences. One control used to guard against these types of errors is a check digit.

Calculating Check Digits
➢ A check digit is a redundant digit added to a data code that enables the accuracy of the other characters in the code to be checked.
➢ If the code contains alphabetics, a check digit can still be calculated.
Each alphabetic must be assigned a number according to some rule.

When to Use Check Digits
➢ Use of check digits should be limited to critical fields.
➢ Where possible, the computer should assign new codes with their check digits.
➢ Check digits should be checked only by machine.
➢ To save storage space, the check digit can be dropped once it has been read into the machine and recalculated upon output.

INSTRUCTION INPUT

Ensuring the quality of instruction input to a computer system is a more difficult objective to achieve than ensuring the quality of data input. Users often attempt to communicate complex actions that they want the system to undertake. On the one hand, the input subsystem needs to provide considerable flexibility so users can accomplish their processing objectives. On the other hand, it needs to exercise careful control over the actions they undertake. The languages used to communicate instructions to the system tend to trade off flexibility against control.

Question-Answer Dialogs
➢ Used primarily to obtain data input.
➢ Can also be used to obtain instruction input in conjunction with the data input.
➢ In those cases where the required answers are not obvious, a help facility can be used to assist inexperienced users.
➢ Effectiveness and efficiency issues are of primary concern.
➢ For experienced users, the alternating sequence of question and answer may be slow and frustrating.
➢ Experienced users may be allowed to stack answers or change to another language mode.

Command Languages
➢ SQL
➢ To print the customer numbers of those customers who had more than 10 transactions over $200?
➢ Advantages and disadvantages?

Job Control Languages?
Menu-Driven Languages?
Forms-Based Languages?
Natural Languages?
Direct Manipulation Languages?
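
The request listed under Command Languages (the customer numbers of customers who had more than 10 transactions over $200) might be phrased in SQL roughly as shown below. This is only a sketch, run through Python's standard sqlite3 module so it is self-contained. The table name `transactions`, its columns `customer_no` and `amount`, and the demo data are assumptions made for illustration; the notes do not specify a schema.

```python
import sqlite3

# Hypothetical schema: one row per transaction, with a customer number and an amount.
QUERY = """
    SELECT customer_no
    FROM transactions
    WHERE amount > ?          -- keep only transactions over the threshold
    GROUP BY customer_no
    HAVING COUNT(*) > ?       -- keep customers with more than N such transactions
    ORDER BY customer_no
"""


def customers_with_many_large_transactions(conn, min_amount=200, min_count=10):
    """Return customer numbers having more than min_count transactions over min_amount."""
    return [row[0] for row in conn.execute(QUERY, (min_amount, min_count))]


def build_demo_db():
    """Create an in-memory database with made-up data for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (customer_no TEXT, amount REAL)")
    rows = (
        [("C001", 250.0)] * 11    # 11 transactions over $200: qualifies
        + [("C002", 250.0)] * 10  # exactly 10: does not exceed 10
        + [("C003", 150.0)] * 12  # many transactions, but none over $200
        + [("C004", 200.0)] * 11  # exactly $200 is not over $200
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
    conn.commit()
    return conn


if __name__ == "__main__":
    print(customers_with_many_large_transactions(build_demo_db()))
```

The single declarative statement illustrates the flexibility of a command language; the same flexibility is why a slightly wrong condition (for example `>=` in place of `>`) can produce plausible but incorrect output, so instruction input still needs validation.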

AUDIT TRAIL CONTROLS

For the data input and instruction input functions, the audit trail in the input subsystem maintains the chronology of events from the time data and instructions are captured until they are entered into the system.

Accounting Audit Trail
➢ A source document should show who prepared the document, who authorized the document, when it was prepared, what account or record is to be updated, and the batch number of the physical batch in which the document is to be included.
➢ With direct entry data capture, the input program must attach certain audit trail data to the input record, for example the identity of the terminal operator, the identity of the terminal, the time and date of input, and a unique reference number for the transaction that will be carried through the system.
➢ In the case of instruction input, the input subsystem must retain a record on magnetic media containing such data items as the originator of the instructions, the type of instruction and its arguments, the results produced, and the time and date of entry of the instruction. Sometimes a hardcopy of the instruction is available.

Operations Audit Trail
Some of the types of operations audit trail data that might be collected are:
➢ Time to key in a source document at a terminal
➢ Number of read errors made by an OCR device
➢ Number of keying errors identified during verification
➢ Frequency with which an instruction in a command language is used
➢ Time taken to execute the same instruction using a light pen versus a mouse

By analyzing this data, error-prone input activities can be identified and remedial action taken. For example, the time taken to enter data on a screen may indicate that more user training is needed or that a screen redesign is necessary.

VALIDATION AND ERROR CONTROLS

Input validation controls are used to identify errors in data or instructions before the data is processed or the instructions are executed.

DATA INPUT VALIDATION CHECKS

Data should be validated as soon as possible after it has been captured and as close as possible to the source of the data. Controls to check the validity of input data can be exercised at four levels:

Field Checks
➢ The validation logic applied to the field in the input validation program does not depend on other fields within the record or other records within the batch.
➢ Several field checks can be applied: missing data/blanks, alphabetics/numerics, range, check digit, and size.

Record Checks
➢ With a record check, the validation logic applied to a field depends on the field's logical interrelationships with the other fields in the record.
➢ The following record checks can be applied: reasonableness, valid sign-numerics, and sequence checks.

Batch Checks
➢ They apply validation logic to fields and records based on their interrelationships with controls established for the batch.
➢ Batch checks include control totals, sequence checks, and size.

File Checks
➢ They ensure correct files are input to a production run of an application system.
➢ These checks are especially important for master files, where reconstruction may be difficult and costly.
➢ Internal label, generation number, retention date, and control totals are checked.

DESIGN OF THE DATA INPUT VALIDATION PROGRAM

A well-designed data input program ensures that the quality of the data entering an application system is high, and it facilitates correction and resubmission of errors. The auditor is interested in three aspects: how data is validated, how errors are handled, and how errors are reported.

Data Validation
➢ System specifications often give a programmer some indication of the errors to be expected.
➢ Start the design of the validation routines by specifying what should be correct, and then identify deviations that may occur.
➢ The input program must identify as many errors as possible in a record or batch.
➢ Another requirement in writing an input program is ensuring its ability to recover when errors occur.
➢ Where possible, the input program should correct errors automatically.
➢ Documentation of the input program is essential.

Handling of Errors
➢ The validation program must report errors and exercise careful control to ensure the errors are corrected.
➢ Upon receiving error reports, users must identify the reasons for the errors, correct the data, and resubmit the data for validation.
➢ If errors are not cleared off an error file within a reasonable time period, the input validation program should remind the users that the errors still await correction.
➢ The user must decide on the number and types of errors that can be tolerated before further processing of the data through the system.

Reporting of Errors
Errors must be reported in a way that facilitates fast and accurate correction.

Screen Error Messages
➢ Error messages must be clear and concise, courteous and neutral.
➢ The input validation program must also provide various levels of error messages.

Printed Error Messages
➢ The report file must be sorted before printing to facilitate error correction; for example, errors to be corrected by a particular user may be sorted together.
➢ The field in error can be identified by printing indicators such as upward arrows.
➢ Space should exist on the error report for the signature of the person correcting the errors. This gives an audit trail for errors corrected.
➢ The error messages printed must clearly state the nature of the error. Where possible, printing error codes instead of error messages should be avoided.
➢ At the end of the error report, summary statistics should be printed for transactions processed and the different types of errors identified.
➢ The frequency of each error type should also be printed.

INSTRUCTION INPUT VALIDATION CHECKS

Instruction input entered via a job control language or interactive dialog also must be validated.
The auditor should understand the types of validation that should be carried out by a job control language or interactive dialog, and the way in which errors should be reported, as a basis for evaluating the quality of the language and the likelihood of user errors being made.

Lexical Validation
➢ The language evaluates each "word" entered by a user.
➢ As words are formed from characters, the language must establish rules whereby strings of characters are recognized as discrete words.
➢ Usually this recognition occurs via boundary characters and delimiters.

Syntactic Validation
➢ The language reads a string of words identified and validated by the lexical analyzer and attempts to determine the sequence of operations that the string of words is intended to invoke.
➢ The syntax analyzer validates the syntax of an instruction by parsing the string of words entered to determine whether it conforms to a particular rule in the grammar of the language.

Semantic Validation
➢ During semantic validation, the language completes its analysis of the meaning of the instruction entered.
➢ The boundary between syntactic validation and semantic validation is often obscure.
➢ During semantic validation, the language might check that variables to be multiplied together are numeric types and not alphabetic or alphanumeric types.
➢ The language might prevent a comparison of two numeric values that would be meaningless, for example the salaries of employees with their weights.
➢ The quality of semantic analysis depends on how well the constraints (logical restrictions) surrounding the data on which the language operates can be expressed.
➢ The language can check that the operations to be undertaken on the data items, or the results produced, conform to the constraints expressed for the data items in the data definition.

Reporting of Errors
➢ Guidelines for reporting errors that were discussed earlier for data validation apply also to instruction validation.
➢ Error messages must communicate to users, as completely and meaningfully as possible, the nature of errors made during transaction input.
➢ If the language fails to identify an error, meaningless results may be produced without the user knowing.

AUDIT TRAIL CONTROLS

Audit trail controls relating to input validation and error control maintain the chronology of events from the time data is validated to the time data is corrected.

Accounting Audit Trail
➢ When data is validated, a time and date stamp should be attached so that the timeliness of data validation, error correction, and resubmission can be assessed.
➢ If an input validation program identifies an error, it must generate and attach a unique error number to the data in error unless the data can be corrected immediately. In this way the path of the erroneous data can be traced until the time of correction.

Operations Audit Trail
➢ The operations audit trail should maintain a record of the nature and number of errors made during data and instruction input, the resources consumed to detect and correct errors, and the elapsed time between error identification and error correction.
➢ It might also record the amount of central processor time used to detect particular types of errors.
➢ Periodically the operations audit trail should be analyzed to determine whether users need retraining, specific types of data input or instruction input need to be redesigned, or input validation programs need to be rewritten.

EXISTENCE CONTROLS
➢ Existence controls must enable input validation programs and files of valid data and erroneous data to be reestablished in the event of destruction or loss.

COMMUNICATION CONTROLS

The communication subsystem is responsible for transmitting data among all the other subsystems within a system or for transmitting data to or receiving data from another system. This chapter examines the controls that can be established within the communication subsystem to preserve asset safeguarding and data integrity.

COMMUNICATION SUBSYSTEM EXPOSURES

There are two major types of exposures in the communication subsystem.

Component Failure
➢ The primary components in the communication subsystem are communication lines, hardware, and software.
➢ In communication lines, errors arise because of noise. Noise increases as the amount of data carried on a communication line increases.
➢ Hardware and software failure can occur for many reasons: circuitry failure, a disk crash, a power surge, insufficient temperature storage, program bugs, and so on.
➢ Hardware and software failures may be temporary or permanent, and they may be either localized or global.

Subversive Threats
➢ In a subversive attack on the communication subsystem, an intruder attempts to violate the integrity of some component in the subsystem.
➢ Subversive attacks can be either passive or active.
➢ Passive attacks can be performed for traffic analysis. Intruders may read and analyze the cleartext source and destination identifiers attached to a message for routing purposes, or they may examine the length and frequency of messages being transmitted. These can provide an intruder with important information about the messages being transmitted.
➢ There are seven types of active attacks: message insertion, message deletion, message modification, changed message order, message duplication, denial of message services, and spurious association. In a spurious association, for example, an intruder may play back a handshaking sequence previously used by a legitimate user of the system.

CONTROLS OVER COMPONENT FAILURE

Previous chapters have already discussed some of the important controls that can be used to prevent, detect, or correct component failures that affect the communication subsystem. Here are some additional controls.

Treatment of Line Errors
➢ When public lines are used for data transmission, the designer should assume a wide range of line error rates will be encountered.
➢ Certain line errors may have more serious effects than others.
➢ Controls must be implemented to detect and correct line errors.

Error Detection
Line errors can be detected by using either a loop (echo) check or by building some form of redundancy into the message transmitted.

Loop Check
➢ Involves the receiver of the message sending back the message received to the sender.
➢ Since a loop check at least halves the throughput of communication lines, normally it is used on full-duplex lines or where communication lines are short.

Redundancy Checks
Redundancy takes the form of error detection codes. Three major types of codes exist: (a) parity checking codes, (b) M-out-of-N codes, and (c) cyclic codes.

Parity Check
➢ Both horizontal and vertical parity checks can be used. A vertical parity check applies to a character, and a horizontal parity check applies to a string of characters.

M-out-of-N Codes
➢ Characters must be represented by a fixed number of 1 and 0 bits in a character.
➢ For example, if a 4-out-of-8 code is used, the bit string of a character must comprise four 1 bits and four 0 bits if a line error has not occurred.
➢ Unfortunately, bursts of noise often oscillate, causing one bit to change in one direction and another bit to change in the other direction, so that an error goes undetected.
➢ A 4-out-of-8 code allows 70 characters, whereas 128 characters are possible when a parity check is used.

Cyclic Codes
Cyclic codes, or polynomial codes, are more complex than parity checking or M-out-of-N codes.

Error Correction
Two methods are used to correct errors.

Error Correcting Codes: They enable line errors to be detected and corrected at the receiving station. However, to be able to carry out error correction, large amounts of redundancy are required in the messages transmitted. There is also a danger that the attempted correction of an error will be carried out incorrectly.

Retransmission: If this method is to be used, a decision must be made on how much data is to be retransmitted. Retransmission of a small quantity of data is faster.
The disadvantage is that error detection codes are less efficient for small amounts of data. Error correction through retransmission requires special logic to indicate the correct or incorrect receipt of a message. Noise may also corrupt the control characters; an odd-even record count enables such errors to be detected.

Improving Network Reliability
Besides using hardware and software to detect and correct line errors, a communication network can be designed to reduce the likelihood of line errors and system failure occurring and to minimize the effects of line errors and system failure when they do occur.

Choice of modem?
Choice of communication line?
Choice of multiplexing/concentration technique?
Choice of network topology?
Choice of network control software?

CONTROLS OVER SUBVERSIVE THREATS

There are two types of controls over subversive threats to the communication subsystem. The first type seeks to establish physical barriers to the data traversing the subsystem. The second type accepts that an intruder can gain access. Here we look at controls that seek to render data useless if it is intercepted by an intruder.

Link Encryption
➢ Protects all data traversing a communication link between two nodes in a network.
➢ The two nodes share a common encryption key.
➢ The message and its source and destination identifiers can be encrypted.
➢ It is possible to mask frequency and length patterns in data by maintaining a continuous stream of ciphertext between two nodes.
➢ Link encryption cannot protect the integrity of data if a node in the network is subverted.
➢ High costs may have to be incurred to protect the security of each node in the network.

End-to-End Encryption
➢ End-to-end encryption protects the integrity of data passing between a sender and a receiver, independently of the nodes that the data traverses.
➢ It can be refined further to implement association encryption, whereby each session between a sender and a receiver is protected.
➢ It provides limited protection against traffic analysis.
➢ Consequently, link encryption is sometimes used in conjunction with end-to-end encryption to reduce exposures from traffic analysis.

Message Authentication Codes
➢ In electronic funds transfer systems (EFTS), a control used to identify changes to a message in transit is a message authentication code (MAC).
➢ It is calculated by applying the DES algorithm and a secret key to selected data items in a message or to the entire message.
➢ If the calculated MAC and the received MAC are not equal, the message has been altered in some way during transit.

Message Sequence Numbers
➢ Message sequence numbers are required to detect any attack on the order of messages being transmitted between a sender and a receiver.
➢ It must be impossible for the intruder to alter the sequence number in a message.

Request-Response Mechanism
➢ Is used to identify attacks by an intruder aimed at denying message services to a sender and receiver.
➢ With a request-response mechanism, a timer is placed with the sender and the receiver.
➢ The timer periodically triggers a control message from the sender, and since the timer at the receiver is synchronized with the sender, the receiver must respond to show the communication link has not been broken.
➢ The intruder must not be able to find valid responses to the control messages.

AUDIT TRAIL CONTROLS

The audit trail in the communication subsystem maintains the chronology of events from the time a sender dispatches a message to the time a receiver obtains the message.

Accounting Audit Trail
The accounting audit trail must allow a message to be traced through each node in the network. Some of the data items that might be kept in the accounting audit trail are:
➢ Unique identifier of the source node
➢ Unique identifier of the person/process authorizing dispatch of the message
➢ Time and date at which the message was dispatched
➢ Message sequence number
➢ Unique identifier of each node in the network that the message traversed
➢ Image of the message received at each node traversed in the network

As always, what audit trail information should be kept and how long it should be kept is a cost-benefit decision.

Operations Audit Trail
Some examples of data items that might be kept in the operations audit trail are:
➢ Number of messages that have traversed each link
➢ Number of messages that have traversed each node
➢ Queue lengths at each node
➢ Number of errors occurring on each link of the node
➢ Number of retransmissions that have occurred across each link
➢ Log of system restarts
➢ Message transit times between nodes and at nodes

EXISTENCE CONTROLS

Recovering a communication network if it fails poses some difficult problems. Some of the controls have been discussed earlier; here are additional backup and recovery controls:
➢ Where possible, place redundant components and spare parts throughout the network.
➢ Use equipment with built-in fault diagnosis capabilities.
➢ Acquire high-quality test equipment.
➢ Ensure adequate maintenance of hardware and software, especially at remote sites.
➢ Ensure adequate logging facilities exist for recovery purposes, especially where store-and-forward operations must be carried out in the network.

It is essential that well-trained personnel with high technical competence operate the network. They must be provided with well-documented backup and recovery procedures.