Validate routine

Data validation
Please use speaker notes for
additional information!
All data when it is initially entered into the system should be checked for
errors so that bad data does not get put onto permanent disk files.
Remember the rule: "Garbage in, garbage out!" This process of error
checking is called VALIDATING OR EDITING.
[Systems flowchart: Data entered into the system -> Validate or edit program -> Good transactions, or Errors (transactions that contain errors)]
In this systems flowchart, I am showing data being entered from a screen. The
validate/edit program is checking the data. Data that passes the tests will be written on
the good transaction file. Data that contains errors is written to the screen. Other
methodologies will be shown on the next few slides.
[Systems flowchart: Data being keyed in -> Keyed in data stored on disk -> Validate or edit program -> Good transactions, or Errors (transactions that contain errors)]
Here I am showing transactions being keyed in and stored on disk with no editing happening - this is just data entry.
[Systems flowchart: Data entered into the system <-> Validate or edit program -> Good transactions, or Errors (transactions that contain errors)]
Note the double arrow between the data being entered and the validate/edit program. This means that the data is being checked and feedback is going back to the person entering the data so they can correct errors.
[Systems flowchart: Data entered into the system -> Validate or edit program -> Good transactions, Invalid transactions, and Errors (transactions that contain errors)]
Invalid transactions can be viewed on the screen, corrected, and then made good transactions if they have no errors. This would involve additional processing.
As can be seen, reporting can be an important part of editing. Both valid
and invalid records can be written. Usually if you are reporting valid and
invalid transactions, they are done on separate reports, but sometimes
you will see reports that mix valid and invalid record reporting.
The report can be done using a variety of styles depending on the needs
of the users. The important thing is that on a report of valid transactions
the entire record is printed if the purpose is a paper trail. On an error
report, the reader must be able to identify the error so it can be fixed.
The error report must contain:
* the id# or some other identifying field from the record
* the contents of the field that is in error
* an error message that explains the error
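The three required items can be carried on a single report line. The following is a minimal Python sketch; the function name `format_error_line` and the column layout are assumptions made for illustration and do not come from the slides.

```python
def format_error_line(record_id: str, field_value: str, message: str) -> str:
    """Build one error-report line (hypothetical layout)."""
    # repr() makes blank or space-filled field contents visible on the report
    return f"{record_id}  {field_value!r}  {message}"


print(format_error_line("10023", "   ", "missing name"))
```

Showing the field contents in quotes helps the reader see errors such as a field that holds only spaces.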
Examples of the kinds of errors that validating looks for:
Validating presence of data:
One common error is no data in a field where data is required. For example, id is
frequently required, as are name and, for a payroll problem, hours worked. A pseudocode
example testing for the presence of a name is shown below:
Set invalid indicator to no prior to entering the validate routine
Validate routine
    if name = spaces
        report missing name
        set invalid indicator to yes
    end if
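The presence test above might look like this in Python. This is a sketch under assumptions: the record is a dictionary with `id` and `name` keys, and the error report is a plain list; neither layout comes from the slides.

```python
def validate_name(record: dict, errors: list) -> bool:
    """Return True (invalid) when the name field is empty or all spaces."""
    name = record.get("name", "")
    if name.strip() == "":
        # keep the id, the field contents, and a message for the error report
        errors.append((record.get("id"), name, "missing name"))
        return True
    return False


errors = []
invalid = validate_name({"id": "10023", "name": "   "}, errors)
```

The returned value plays the role of the invalid indicator, which starts as no (False) and is set to yes (True) only when a test fails.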
Validating data type:
The biggest issue here is non-numeric data in a numeric field. However, you can also
validate for character data in a character field. Pseudocode is shown below.
Assume that payhr is a numeric field and I want to make sure that no non-numeric data is
entered in the field. Most languages have a way to ask if a field is numeric. Assume also
that paycode should be an uppercase character field.
Set invalid indicator to no prior to entering the validate routine
Validate routine
    if payhr is not numeric
        report non numeric data in payhr
        set invalid indicator to yes
    end if
    if paycode < “A” OR paycode > “Z”
        report non uppercase character in paycode
        set invalid indicator to yes
    end if
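A possible Python version of the two data type tests follows. It assumes the fields arrive as text exactly as keyed, and that "numeric" means digits with an optional decimal part; both are assumptions for this sketch, and the helper names are hypothetical.

```python
import re

# Assumption: a numeric payhr looks like "12" or "12.50" (no sign, no exponent)
NUMERIC_PATTERN = re.compile(r"[0-9]+(\.[0-9]+)?")


def is_numeric(text: str) -> bool:
    return NUMERIC_PATTERN.fullmatch(text) is not None


def is_uppercase_letter(code: str) -> bool:
    # same comparison as the pseudocode: not below "A" and not above "Z"
    return len(code) == 1 and "A" <= code <= "Z"


def validate_types(record: dict) -> list:
    """Return the error messages for one record; an empty list means valid."""
    messages = []
    if not is_numeric(record["payhr"]):
        messages.append("non numeric data in payhr")
    if not is_uppercase_letter(record["paycode"]):
        messages.append("non uppercase character in paycode")
    return messages
```

Both tests run on every record, so a record with two bad fields produces two messages, just as the two separate if statements do in the pseudocode.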
Validating data codes:
The valid paycodes may be S, F, and P, and those are the only codes you want entered in that field.
Set invalid indicator to no prior to entering the validate routine
Validate routine
    if paycode = “S” OR paycode = “F” OR paycode = “P”
        no processing
    else
        report invalid code in paycode
        set invalid indicator to yes
    end if
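In Python, a list of valid codes is often expressed as a set membership test. A minimal sketch, with the names `VALID_PAYCODES` and `paycode_invalid` chosen for illustration:

```python
VALID_PAYCODES = {"S", "F", "P"}  # the three codes named in the text


def paycode_invalid(paycode: str) -> bool:
    """Return True (invalid) when paycode is not one of the allowed codes."""
    return paycode not in VALID_PAYCODES
```

Adding a new code then means changing one line of data rather than extending a chain of OR conditions.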
Validating data range:
    if paycode < “A” OR paycode > “Z”
        report non uppercase character in paycode
        set invalid indicator to yes
    end if
In the top example, I am saying that anything that is outside of the range is an error. I am
using an OR because if it is outside the range on either end it is a problem. This could not
be an AND because a character cannot be outside the range on both ends.
In the bottom example, I am saying that anything inside the range is valid and requires no
processing. I am using the AND because both conditions must be true to make it inside
the range. If either or both conditions are false, paycode is not in the range and I have an
error.
    if paycode >= “A” AND paycode <= “Z”
        no processing
    else
        report non uppercase character in paycode
        set invalid indicator to yes
    end if
[Flowcharts of the two forms. OR form: test paycode < A, then paycode > Z; a Y on either test leads to "Error & set ind". AND form: test paycode >= A, then paycode <= Z; an N on either test leads to "Error & set ind".]
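The OR form and the AND form are two ways of writing the same test: one asks "is it outside the range?" and the other asks "is it inside?". The short Python sketch below checks that the two never agree for any single ASCII character, so exactly one of them is true each time. The function names are hypothetical.

```python
def outside_range_or(paycode: str) -> bool:
    # error condition: below the low end OR above the high end
    return paycode < "A" or paycode > "Z"


def inside_range_and(paycode: str) -> bool:
    # valid condition: at or above the low end AND at or below the high end
    return paycode >= "A" and paycode <= "Z"


# try every single character in the ASCII range
candidates = [chr(n) for n in range(128)]
disagreements = [c for c in candidates
                 if outside_range_or(c) == inside_range_and(c)]
print(disagreements)  # → []
```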
Validating data range (another example):
For this example, I want to make sure that the payhr is within the range of 10 to 25.
Set invalid indicator to no prior to entering the validate routine
Validate routine
    if payhr >= 10.00 AND payhr <= 25.00
        no processing
    else
        report payhr out of range
        set invalid indicator to yes
    end if

Set invalid indicator to no prior to entering the validate routine
Validate routine
    if payhr < 10.00 OR payhr > 25.00
        report payhr out of range
        set invalid indicator to yes
    end if
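A numeric range test in Python could be sketched as below. Using `Decimal` for pay rates is an assumption of this sketch (it avoids binary floating point surprises with money); the limits 10.00 and 25.00 come from the example.

```python
from decimal import Decimal

LOW = Decimal("10.00")
HIGH = Decimal("25.00")


def payhr_out_of_range(payhr: Decimal) -> bool:
    """Return True (invalid) when payhr is below LOW or above HIGH."""
    return payhr < LOW or payhr > HIGH
```

Both end values are treated as valid, matching the >= and <= in the AND form of the pseudocode.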
Validating data range where the range is dependent on another field:
For this example, I want to make sure that the payhr is within the range of 10 to 25 for
employees with the paycode F.
Set invalid indicator to no prior to entering the validate routine
Validate routine
    if paycode not = “F” OR (payhr >= 10.00 AND payhr <= 25.00)
        no processing
    else
        report payhr out of range
        set invalid indicator to yes
    end if
Set invalid indicator to no prior to entering the validate routine
Validate routine
    if paycode = “F”
        if payhr < 10.00 OR payhr > 25.00
            report payhr out of range
            set invalid indicator to yes
        end if
    else
        ...
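When a range depends on another field, the range test should only run for the records it applies to. A minimal Python sketch, assuming `Decimal` pay rates and a hypothetical function name:

```python
from decimal import Decimal


def payhr_invalid_for_paycode(paycode: str, payhr: Decimal) -> bool:
    """Return True (invalid) only for paycode F records with payhr outside 10 to 25."""
    if paycode == "F":
        return payhr < Decimal("10.00") or payhr > Decimal("25.00")
    # other paycodes are not covered by this rule, so they pass this test
    return False
```

Note that a record with a different paycode is not an error for this particular rule, whatever its payhr is.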
Validating reasonableness and consistency:
For this example, I am checking to see if the state is reasonable for the zipcode. If the zipcode
is 02184 then the state must be MA. I will also check and see if the date of the payment is
later than today’s date (that will be shown on the next slide).
Set invalid indicator to no prior to entering the validate routine
Validate routine
    if zipcode = “02184”
        if state = “MA”
            no processing
        else
            report state inaccurate for zip code
            set invalid indicator to yes
        end if
    else
        ...
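A consistency test like this is usually driven by a lookup table instead of one if statement per zip code. The sketch below is an illustration only: the table `STATE_FOR_ZIP` is hypothetical, and only the 02184/MA pair comes from the example.

```python
# Hypothetical lookup table; only the 02184 -> MA pair comes from the text.
STATE_FOR_ZIP = {"02184": "MA"}


def state_inconsistent(zipcode: str, state: str) -> bool:
    """Return True (invalid) when the state does not match a known zip code."""
    expected = STATE_FOR_ZIP.get(zipcode)
    # zip codes missing from the table are not judged by this test
    return expected is not None and state != expected
```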
Validating reasonableness and consistency:
For this example, I will also check and see if the date of the payment is later than today’s date.
Set invalid indicator to no prior to entering the validate routine
Validate routine
    if dateentered > todaysdate
        report date inaccurate
        set invalid indicator to yes
    end if
    ...
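In Python the date comparison can be done directly on `datetime.date` values. In this sketch today's date is passed in as a parameter, which is a design choice made here so the test can be run with fixed dates; in production the caller would typically pass `date.today()`.

```python
from datetime import date


def date_in_future(date_entered: date, today: date) -> bool:
    """Return True (invalid) when the entered date is later than today."""
    return date_entered > today
```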
Validating group of fields together:
For this example, I will check to see if an employee has worked 40 hours (nothing more and
nothing less) when I look at the regular hours, the vacation hours and the sick hours.
Set invalid indicator to no prior to entering the validate routine
Validate routine
    emphrs = reghrs + vacahrs + sickhrs
    if emphrs not = 40
        report error in employee hours worked
        set invalid indicator to yes
    end if
    ...
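The group test adds the related fields and compares the sum to the expected total. A minimal Python sketch, assuming the hours are whole numbers (an assumption of this example; fractional hours would be better held as `Decimal` values to keep the equality test exact):

```python
def hours_invalid(reghrs: int, vacahrs: int, sickhrs: int) -> bool:
    """Return True (invalid) when the three hour fields do not total exactly 40."""
    emphrs = reghrs + vacahrs + sickhrs
    return emphrs != 40
```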
Validating fields together where some should be empty:
In this example, if the code is S then there should be data in the salary field, but no data in the
pay per hour field.
Set invalid indicator to no prior to entering the validate routine
Validate routine
    if paycode = “S”
        if salary > 0
            if payhr = 0 OR payhr = spaces
                no processing
            else
                report error in pay per hour
                set invalid indicator to yes
            end if
        else
            report error in salary
            set invalid indicator to yes
        end if
    else
        ...
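The nested test for paycode S might be sketched in Python as follows. The record layout (a dictionary where `salary` is a number and `payhr` is either a number or a blank string) is an assumption of this example.

```python
def validate_salaried(record: dict) -> list:
    """Return error messages for a paycode S record; an empty list means valid."""
    messages = []
    if record["paycode"] == "S":
        if record["salary"] > 0:
            payhr = record["payhr"]
            # payhr must be empty: either zero or blank
            if not (payhr == 0 or str(payhr).strip() == ""):
                messages.append("error in pay per hour")
        else:
            messages.append("error in salary")
    # other paycodes are outside the scope of this sketch
    return messages
```

As in the pseudocode, pay per hour is only examined once the salary test has passed, so a record reports at most one of the two errors.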
Validating accuracy:
The biggest problem is validating accuracy within a valid range. This is almost impossible to
do. For example, consider a payment sent in to a credit card company. If the person sent 120
and 210 was entered, this would be very difficult to catch. Batch processing can be used to
check this type of data.
Batch editing: in batch editing, a group of transactions is gathered together as a batch (for
example, 20 transactions might be called a batch), and each batch is given a number. Before data
entry, totals are run on the significant numeric fields of the batch. This might mean running a
total on part number, on hand, cost, etc.; as many totals can be gathered as needed. This total
information is entered into a batch header along with the batch number. When the data is keyed
in, the batch header is keyed in followed by all of the transactions in the batch, and then
another batch header followed by its transactions. In the edit program, the batch header is read
and the totals on it are stored in memory. Then the transactions are read one at a time and the
same totals are accumulated (if you did part number, on hand and cost, you would total the same
three fields). When a new batch header is read, it is the signal that the old batch is complete,
and the totals are compared. If the totals accumulated during processing do not match the totals
from the batch header, the batch is considered unbalanced and the information is printed out.
The advantage of this system is that the unbalanced batch only involves 20 transactions, so
finding the error or errors is a much less significant problem than searching for the errors
in thousands of transactions.
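The batch balancing procedure described above can be sketched in Python. The record layout is an assumption of this example: each record is a dictionary with a `type` of `"header"` or `"transaction"`, and a header carries a `batch_number` and a `totals` dictionary of the fields that were totalled before data entry.

```python
def find_unbalanced_batches(records: list) -> list:
    """Return the batch numbers whose accumulated totals differ from the header."""
    unbalanced = []
    header = None
    accumulated = {}

    def close_batch():
        # compare the totals built during processing with the header totals
        if header is not None and accumulated != header["totals"]:
            unbalanced.append(header["batch_number"])

    for record in records:
        if record["type"] == "header":
            close_batch()  # a new header signals that the old batch is complete
            header = record
            accumulated = {field: 0 for field in header["totals"]}
        else:
            for field in accumulated:
                accumulated[field] += record[field]
    close_batch()  # the last batch has no header after it
    return unbalanced


keyed = [
    {"type": "header", "batch_number": 1, "totals": {"amount": 320}},
    {"type": "transaction", "amount": 120},
    {"type": "transaction", "amount": 200},
    {"type": "header", "batch_number": 2, "totals": {"amount": 320}},
    {"type": "transaction", "amount": 210},  # 120 was sent, 210 was keyed
    {"type": "transaction", "amount": 200},
]
print(find_unbalanced_batches(keyed))  # → [2]
```

The 120 versus 210 keying error from the credit card example passes every range test, but it throws batch 2 out of balance, so only that batch needs to be searched.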
Validating accuracy:
Check digit: a check digit is the calculated last digit of an identification type of number, such
as an employee number or item number. For example, with an eight-digit id #, the first seven
digits would be assigned and the eighth digit (the check digit) would be calculated using special
formulas designed for this purpose. The eighth digit now becomes part of the id. Any time the
id is typed in, the calculation can be redone on the first seven digits to see if the answer is
the same as the eighth digit; if it is, the id is considered valid. This is a great technique for
catching transposition of digits, etc.
ID: 5329012456
The check digit is created by running the rest of the id
through a formula to produce the check digit before the ID
is issued.
Every time a transaction with that ID is processed, the
calculation is redone to make sure that the last digit is
what it should be.
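The slides do not say which formula is used. As one illustration, the sketch below uses the Luhn algorithm, a widely used check digit formula; this choice is an assumption of the example and not necessarily the formula the author had in mind. The function names are hypothetical.

```python
def luhn_check_digit(base: str) -> int:
    """Calculate the Luhn check digit for the assigned digits of an id."""
    total = 0
    # work from the rightmost assigned digit, doubling every second digit
    for position, ch in enumerate(reversed(base)):
        digit = int(ch)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def is_valid_id(full_id: str) -> bool:
    """Redo the calculation on all but the last digit and compare."""
    base, check = full_id[:-1], full_id[-1]
    return full_id.isdigit() and luhn_check_digit(base) == int(check)


print(luhn_check_digit("1234567"))  # → 4, so the issued id is 12345674
print(is_valid_id("12345674"))      # → True
print(is_valid_id("12345764"))      # → False (6 and 7 transposed)
```

Luhn catches every single wrong digit and most transpositions of adjacent digits; the one adjacent pair it misses is 0 and 9.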
Mainline
    Housekeeping
    Process
    Wrapup
End Mainline

Housekeeping
    Set up variables
    Open files
End Housekeeping

Process
    Read a record
    while Not EOF
        Process Record Loop
    end while
End Process

Wrapup
    Close Files
End Wrapup
Validate routine
Process record loop
    Set invalid indicator to no
    Validate Routine
    if invalid indicator = no
        Write to good transactions
    end if
    Read data to edit
End Process record loop

Validate routine
    if Name = spaces
        Write to error report
        Set invalid indicator to yes
    end if
    if Amt > 5000
        Write to error report
        Set invalid indicator to yes
    end if
End Validate routine
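The mainline, process record loop, and validate routine can be put together in a short Python sketch. It assumes the input is already in memory as a list of dictionaries with `id`, `name`, and `amt` keys, and that the good transaction file and error report are plain lists; opening and closing real files is left out. The error message wording is also an assumption.

```python
def validate(record: dict, error_report: list) -> bool:
    """Run every test on one record and return the invalid indicator."""
    invalid = False  # set invalid indicator to no
    if record["name"].strip() == "":
        error_report.append((record["id"], record["name"], "missing name"))
        invalid = True
    if record["amt"] > 5000:
        error_report.append((record["id"], record["amt"], "amt over 5000"))
        invalid = True
    return invalid


def mainline(records: list):
    good_transactions, error_report = [], []  # housekeeping
    for record in records:                    # process until end of input
        if not validate(record, error_report):
            good_transactions.append(record)
    return good_transactions, error_report    # wrapup: hand back both outputs


good, errors = mainline([
    {"id": "1", "name": "Ann", "amt": 100},
    {"id": "2", "name": "   ", "amt": 6000},
])
```

The second record fails both tests, so it appears twice on the error report and is kept out of the good transactions.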