Data Validation

Objectives of Control
The objectives of control are:
To ensure that all data are processed
To preserve the integrity of maintained data
To detect, correct and re-process all errors
To prevent and detect fraud
Types of Controls
The different controls can be grouped under five
headings:
Manual controls
Data protection controls
Validation checks
Batch controls
Other controls
Types of Error
System designers must guard against the following
types of error:
Missing source documents
Source documents on which entries are omitted,
illegible or dubious
Transcription errors
Data preparation errors
Program faults
Machine hardware faults
Manual Checks
Even in advanced systems, checking of source
documents is necessary. Checks may include:
Scrutiny to detect:
  • Missing entries
  • Illegible entries
  • Illogical or unlikely entries
Reference of the document to stored data to verify
entries
Re-calculating to check calculations made on the
document
Data Collection Controls
• The collection of data for processing involves
  transcribing it into a form suitable for machine
  processing.
• There is a real possibility of error at this stage.
• Controls must be imposed to prevent or detect
  errors at this stage.
• The type of control depends on the method of data
  collection used:
Data Collection Controls 2
• On-line systems. These depend on the operator
  checking the data, displayed on a VDU or printed,
  before it is processed.
• Character recognition. With these techniques,
  accuracy depends on the character reader
  detecting any doubtful character or mark.
Validation Checks 1
• A computer can’t notice errors in data being
  processed in the same way that a human
  operator can.
• Validation checks are an attempt to build into
  computer programs the ability to detect and
  report incorrect data items.
• Checks can be made at two stages:
  • Input – when data is entered
  • Updating – after processing
Validation Checks 2
The main types of validation check used are:
• Presence. Data are checked to ensure that all
  necessary fields are present, e.g. a payroll
  program must have complete employee and
  national insurance fields.
• Size. Fields are checked to ensure they contain
  the correct number of characters, e.g. Student_Id
  should contain 6 characters, such as G01234.
• Range. Numbers or codes are checked to ensure
  that they fall within a permissible range, e.g. DOB
  in a school database should fall between 1982 and
  1989.
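The presence, size and range checks above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names, the record layout and the field names (`student_id`, `name`, `birth_year`) are assumptions made for the example; only the 6-character Student_Id and the 1982 to 1989 range come from the slide.

```python
def check_presence(record, required_fields):
    """Presence check: return the required fields that are missing or blank."""
    missing = []
    for field in required_fields:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


def check_size(value, length):
    """Size check: True if the field contains exactly `length` characters."""
    return len(value) == length


def check_range(value, low, high):
    """Range check: True if low <= value <= high."""
    return low <= value <= high


# Hypothetical student record; the name field has been left blank.
record = {"student_id": "G01234", "name": "  ", "birth_year": 1987}

print(check_presence(record, ["student_id", "name", "birth_year"]))  # ['name']
print(check_size(record["student_id"], 6))                           # True
print(check_range(record["birth_year"], 1982, 1989))                 # True
print(check_range(1995, 1982, 1989))                                 # False
```

Each check reports a problem rather than correcting it, which matches the aim of detecting and reporting incorrect data items.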
Validation Checks 3
• Character. Fields are checked to ensure that they
  contain only characters of the correct type, e.g.
  there are no letters in a number field.
• Format. Also called a picture check. Fields are
  checked to ensure that the format is correct, e.g.
  that a code contains the correct number of letters
  and numbers AND they are in the correct
  sequence.
• Reasonableness. Quantities are checked to
  ensure that they are not abnormally high or low.
• Check Digits. See next slide.
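The character, format and reasonableness checks can be sketched in the same way. The picture used here (one capital letter followed by five digits) is an assumption inferred from the Student_Id example G01234, and the limits of 0 to 80 hours per week are hypothetical values chosen only for illustration.

```python
import re


def check_character(value):
    """Character check: True if a number field contains only the digits 0-9."""
    return value.isascii() and value.isdigit()


def check_format(value, picture=r"[A-Z][0-9]{5}"):
    """Format (picture) check: the whole value must match the picture.

    The default picture, one capital letter then five digits, is an assumption.
    """
    return re.fullmatch(picture, value) is not None


def check_reasonable(quantity, low, high):
    """Reasonableness check: True if the quantity is not abnormally low or high."""
    return low <= quantity <= high


print(check_character("01234"))       # True
print(check_character("01Z34"))       # False, a letter in a number field
print(check_format("G01234"))         # True
print(check_format("01234G"))         # False, right characters, wrong sequence
print(check_reasonable(37, 0, 80))    # True
print(check_reasonable(370, 0, 80))   # False, probably a keying error
```

Note that `01234G` has the correct number of letters and digits but fails the format check because they are in the wrong sequence.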
Check Digits
• Use of a check digit allows a number to be
  self-checking.
• It is calculated using a mathematical formula and
  then becomes part of the number itself.
• When the number is input into the computer, the
  validation program uses the same mathematical
  formula to check that the number is correct,
  e.g. that no digits have been transposed.
• The modulus 11 algorithm is used to create
  check digits for 10-digit ISBNs.
Modulus 11
• Assign each digit a weight. The right-hand digit (the
  least significant) is given a weight of 2, the next
  digit to the left 3, and so on.
• Multiply each digit by its weight and add the
  products together.

  Number      2    5    4    6
  Weight      5    4    3    2
  Products   10   20   12   12    Total 54

• Divide the total by 11 and find the remainder.
Modulus 11 - 2
• 54 divided by 11 = 4 remainder 10.
• Subtract the remainder from 11 to find the check
  digit: 11 – 10 = 1.
• The new number is 25461.
• Now try this. The ISBN of Pat Heathcote’s book
  is 095324900. What is the check digit?
• Why is it presented in this format?
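The modulus 11 steps can be sketched in Python using the worked example 2546. The function names are hypothetical. The slides do not say what to do when subtracting the remainder from 11 gives 10 or 11; the sketch assumes the ISBN-10 convention of writing 10 as "X" and 11 as "0".

```python
def mod11_check_digit(digits):
    """Return the modulus 11 check digit for a string of decimal digits."""
    total = 0
    # The right-hand digit has weight 2, the next digit to the left 3, and so on.
    for weight, digit in enumerate(reversed(digits), start=2):
        total += weight * int(digit)
    result = 11 - total % 11
    if result == 11:      # remainder 0; assumption: check digit 0
        return "0"
    if result == 10:      # assumption: written as "X", as in ISBN-10
        return "X"
    return str(result)


def mod11_is_valid(number):
    """True if the last character of `number` is the correct check digit."""
    return mod11_check_digit(number[:-1]) == number[-1]


print(mod11_check_digit("2546"))   # 1  (total 54, remainder 10, 11 - 10 = 1)
print(mod11_is_valid("25461"))     # True
print(mod11_is_valid("52461"))     # False, the first two digits are transposed
```

Because neighbouring digits carry different weights, swapping two different adjacent digits changes the total, which is how the transposition in the last line is caught.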
Batch Processing 1
Batch controls are fundamental to most
computer-based accounting systems. The main stages of
batch control are:
Batching. Documents are arranged in batches by
being placed in a wallet or clipped together. A batch
cover note (see next slide) is attached to the batch.
Numbering. Each batch is allocated a unique
number, which is entered on the batch cover note.
Batch Registers. Each department responsible for
processing the batch records its receipt and
dispatch in a register. It is then possible to check
that all batches have been dealt with and trace any
batch that gets lost or delayed.
Batch Processing 2
• Batch Totals. Control totals are obtained for each
  batch. The control totals comprise:
  • The total number of documents in the batch
  • Totals of the fields that need to be controlled,
    e.g. total value of invoices
• Hash Totals. A hash total is a sum of values
  calculated solely for validation purposes, e.g.
  adding all the employee numbers together. The
  computer should be able to perform the same
  calculation and report any discrepancy.
• All of the above is recorded on the batch cover
  note so that the results of processing can be
  checked against manually calculated values.
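As an illustration of how the computer might recompute the control totals and compare them with the batch cover note, here is a minimal Python sketch. The document layout, the field names (`employee_no`, `amount_pence`) and the sample values are all hypothetical; amounts are held as whole pence to avoid rounding problems.

```python
def batch_totals(documents):
    """Recompute the control totals for a batch of keyed documents."""
    return {
        "document_count": len(documents),
        "value_total": sum(d["amount_pence"] for d in documents),
        # Hash total: meaningless in itself, used only to detect errors.
        "hash_total": sum(d["employee_no"] for d in documents),
    }


def discrepancies(cover_note, documents):
    """Return {total name: (cover note value, computed value)} for mismatches."""
    computed = batch_totals(documents)
    return {
        name: (cover_note[name], value)
        for name, value in computed.items()
        if cover_note[name] != value
    }


# Totals calculated manually and written on the (hypothetical) cover note.
cover_note = {"document_count": 3, "value_total": 26450, "hash_total": 6429}

# The batch as keyed in; employee number 2210 was mistyped as 2201.
keyed = [
    {"employee_no": 1041, "amount_pence": 12500},
    {"employee_no": 2201, "amount_pence": 9900},
    {"employee_no": 3178, "amount_pence": 4050},
]

print(discrepancies(cover_note, keyed))  # {'hash_total': (6429, 6420)}
```

The document count and value total agree, so only the hash total reveals that an employee number was keyed incorrectly.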
Verification
• Verification is the process of entering data twice.
• The second entry is compared to the first to
  ensure that it is accurate.
• It is commonly used in batch processing, where a
  second data entry clerk will key in each batch to
  verify it.
• You will have come across this technique when
  changing a password and being asked to confirm
  your new password.
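Double-entry verification amounts to comparing the two keyings field by field. The sketch below assumes each keying is held as a list of field values in the same order; the function name and sample data are hypothetical.

```python
def verify(first_keying, second_keying):
    """Return the positions (counting from 0) where the two keyings differ."""
    if len(first_keying) != len(second_keying):
        raise ValueError("the two keyings contain different numbers of fields")
    return [
        position
        for position, (first, second) in enumerate(zip(first_keying, second_keying))
        if first != second
    ]


first = ["G01234", "02/10/87", "Smith"]
second = ["G01234", "02/01/87", "Smith"]

print(verify(first, second))  # [1], the dates do not match
print(verify(first, first))   # []
```

A mismatch shows only that one of the two entries is wrong; the source document still has to be consulted to decide which.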
Validity vs. Accuracy
• It is possible to ensure that data is valid.
• Valid data can never be guaranteed to give us
  correct information, as the following examples show:
  • “A student’s DOB is 02/01/87”. The data falls
    within a valid range and is entered in the correct
    format. However, her actual DOB is 02/10/87!
  • “The market research questionnaire data we
    collected shows that 95% of the population eats
    soup twice a day”. However, this research was
    carried out outside a soup kitchen!
• Remember: valid data can lead to inaccurate
  information.
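The DOB example can be demonstrated with a short Python sketch. It assumes dates are keyed as DD/MM/YY and reuses the 1982 to 1989 range from the earlier range check; the function name is hypothetical.

```python
from datetime import datetime


def dob_is_valid(text):
    """Format and range check for a date of birth keyed as DD/MM/YY."""
    try:
        dob = datetime.strptime(text, "%d/%m/%y")
    except ValueError:
        return False  # wrong format, or not a real calendar date
    return 1982 <= dob.year <= 1989


print(dob_is_valid("02/10/87"))  # True, the student's actual DOB
print(dob_is_valid("02/01/87"))  # True, the mistyped DOB is accepted as well
print(dob_is_valid("31/02/87"))  # False, there is no 31 February
```

Both dates pass, so validation alone cannot tell the true value from the mistyped one.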
Further Reading
• Paper
  • Heathcote – pages 127–131
  • De Watteville – page 164
  • Mott – pages 66–68
• Web
  • www.creditron.com/checkdig.htm
    (Bar codes and Mod10 explained)
  • www.lcweb.loc.gov/issn/check.html
    (Modulus 11 explained)