Data Processing

Data
Data is the term used for facts that serve no useful purpose until they have been
converted into a more meaningful form by data processing operations.
Data processing
The term refers to a class of programs that organize and manipulate data, usually large amounts of
numeric data.
Computer data processing is any process in which a computer program takes in data and
summarizes, analyzes, or otherwise converts it into usable information.
Accounting programs are the prototypical examples of data processing applications. In
contrast, word processors, which manipulate text rather than numbers, are not usually
referred to as data processing applications.
Evolution of data processing
Data processing (the collecting, manipulating, and distributing of data) has been practiced since earliest
recorded history. The methods of data processing have gone through an evolutionary process: from
manual (data processing), to electromechanical (automatic data processing), to electronic or
computerized. Electronic data processing is often referred to simply as data processing.
Purpose of data processing
Whether manual, electromechanical, or electronic, the purpose of data processing remains the same:
to organize raw data into meaningful information needed for decision-making.
In common parlance, data and information are used interchangeably. Strictly speaking, however, they are
distinct. A list of the day's checks and deposit slips (the data) means very little to a bank
manager until it has been manipulated into a summary report (information) giving the total
number and dollar value of deposits and withdrawals. Information, then, is data that has been organized
and processed.
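The bank example above can be sketched in a few lines of Python. This is only an illustration: the transaction list and the `summarize` function are hypothetical names, not part of any banking system, and the sketch assumes each record is a (kind, amount) pair.

```python
from collections import defaultdict

# Hypothetical raw data: one record per check or deposit slip.
transactions = [
    ("deposit", 250.00),
    ("withdrawal", 75.50),
    ("deposit", 1200.00),
    ("withdrawal", 300.00),
]


def summarize(records):
    """Turn raw (kind, amount) records into a count and total per kind."""
    summary = defaultdict(lambda: {"count": 0, "total": 0.0})
    for kind, amount in records:
        summary[kind]["count"] += 1
        summary[kind]["total"] += amount
    return dict(summary)


# The summary report is the "information" derived from the raw "data".
report = summarize(transactions)
for kind, figures in report.items():
    print(f"{kind}: {figures['count']} items, ${figures['total']:.2f}")
```

The individual slips say little on their own; the per-kind counts and totals are what a manager would actually read.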
The purpose of data processing is to evaluate and organize data to produce meaningful information
that can be used in decision-making. To be of value, information must be delivered to the right person
at the right time and in the right place. It must be accurate, timely, complete, concise, and relevant.
Data processing is a specialist activity that is concerned with the systematic recording, arranging, filing,
processing, and dissemination of facts relating to the physical events that occur.
Steps in data processing
Input
Three steps are involved when inputting data into the computer: collection, verification, and coding.
Collection refers to gathering the data from a variety of sources and assembling it. Verification means
checking the data to determine whether it is accurate and complete, and if it should be included for
processing. Coding is translating the data into machine-readable form. Data punched into IBM cards is
one example of coding.
Process
During processing or manipulation, one or more of the following tasks may be performed on the input
data.
a) Classifying. Data are organized by characteristics meaningful to the user. For example, a
student may be identified by Social Security number, class, and exam number.
b) Sorting. In this step, the data may be arranged in a particular sequence to facilitate processing.
c) Calculating. Calculations may be required to determine a patient's account balance or a
student's grade point average.
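The three tasks above can be sketched together as follows. The record layout, the sample values, and the credit-weighted formula for the grade point average are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical exam records: (student_id, class_name, grade_points, credits)
records = [
    ("S002", "Math", 3.0, 4),
    ("S001", "Math", 4.0, 4),
    ("S001", "History", 3.0, 2),
    ("S002", "History", 2.0, 2),
]


def classify(rows):
    """Classifying: group the rows by student identifier."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    return groups


def grade_point_average(rows):
    """Calculating: credit-weighted grade point average (an assumed formula)."""
    total_points = sum(points * credits for _, _, points, credits in rows)
    total_credits = sum(credits for _, _, _, credits in rows)
    return total_points / total_credits


by_student = classify(records)
# Sorting: handle the students in identifier order.
for student_id in sorted(by_student):
    print(student_id, round(grade_point_average(by_student[student_id]), 2))
```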
Output
Output activities include retrieving, converting, storing, and communicating.
a) Retrieving involves pulling information from storage devices for use by the decision-maker.
b) Converting means translating information from the computer form used to store it to a form
understandable by the user (such as a CRT display or printed report).
c) Storing involves transferring the data onto a storage medium, such as a disk or tape file, for
future use.
d) Communicating takes place when the relevant, accurate information is in the right place at the
right time.
The data processing model in more detail
Data processing cycle
This is the sequence of steps performed repeatedly by a computer in the execution of a program. The
computer's central processing unit (CPU) continuously works through a loop: fetching a
program instruction from memory, fetching any data it needs, operating on the data, and storing the
result in memory, before fetching the next program instruction.
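The loop described above can be imitated with a toy interpreter. This is a simplified sketch, not a model of any real CPU: the four opcodes (`LOAD`, `ADD`, `STORE`, `HALT`) and the single accumulator are assumptions chosen to keep the example short.

```python
def run(program, data):
    """Execute a list of (opcode, operand) pairs against a dict used as data memory."""
    accumulator = 0
    program_counter = 0
    while program_counter < len(program):
        # Fetch the next program instruction from (program) memory.
        opcode, operand = program[program_counter]
        program_counter += 1
        if opcode == "LOAD":
            accumulator = data[operand]      # fetch the data it needs
        elif opcode == "ADD":
            accumulator += data[operand]     # operate on the data
        elif opcode == "STORE":
            data[operand] = accumulator      # store the result in memory
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return data


memory = {"a": 2, "b": 3, "sum": 0}
run([("LOAD", "a"), ("ADD", "b"), ("STORE", "sum"), ("HALT", None)], memory)
print(memory["sum"])
```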
Collection
For data to be available, it must first be collected. Collecting data can be very time consuming and
at times tedious, and it is a task that many employees would prefer to avoid. It may involve
travelling or having to sit for a long time.
Preparation
After data is collected, it should be prepared for processing. Raw data cannot be processed. In today's
world, computers are the main tools of data processing. The data collected should therefore be
prepared for the computer, with a code assigned to each type of response or phenomenon.
This is quite technical, and if it is not done well, the results will not be valid.
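Assigning codes to responses might look like the sketch below. The code book, the missing-value code 9, and the function name are hypothetical; a real study would define its own scheme.

```python
# Hypothetical code book: each expected response is given a numeric code.
CODE_BOOK = {"yes": 1, "no": 2, "undecided": 3}
MISSING = 9  # assumed code for blank or unrecognized answers


def code_response(answer):
    """Normalize a raw answer and translate it into its numeric code."""
    key = answer.strip().lower()
    return CODE_BOOK.get(key, MISSING)


raw_answers = ["Yes", " no", "UNDECIDED", "", "maybe"]
coded = [code_response(answer) for answer in raw_answers]
print(coded)
```

Keeping the code book in one place makes it harder for the same response to be coded in two different ways, which is one way results become invalid.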
Methods of Data Collection and Preparation
Data can be collected manually or by using an automated mechanism.
Manual collection of data
Data can be collected manually using the following methods.
1. Observation Method
2. Interview Method
3. Through Questionnaires/Schedules
Observation Method
Observation becomes a scientific tool and the method of data collection, when it serves a formulated
research purpose, is systematically planned and recorded and is subjected to checks and controls on
validity and reliability.
Main advantages are:
- Subjective bias is eliminated
- The information relates to what is currently happening
- This method is independent of respondent’s willingness to respond
Main limitations are:
- It is expensive
- The information provided by this method is very limited
- Unforeseen factors may interfere with the observation task
Interview Method
The Interview Method of collecting data involves the presentation of oral-verbal stimuli and replies in terms
of oral-verbal responses.
Advantages
- More information and in greater depth can be obtained
- Resistance may be overcome by a skilled interviewer
- Greater flexibility: there is an opportunity to restructure questions
- Observation method can also be applied to recording verbal answers
- Personal information can be obtained
- Possibility of spontaneous responses and thus more honest responses
Disadvantages
- Expensive method
- Interviewer bias
- Respondent bias
- Time consuming
- Under the interview method the organization required for selecting, training, and supervising
  the field staff is complex with formidable problems
- Establishing rapport to facilitate free and frank responses is very difficult
Data Collection through Questionnaires
This method is popular in major studies. Briefly, a Questionnaire is sent (by post) to the persons
concerned with a request to answer the questions and return the Questionnaire. A Questionnaire
consists of a number of questions printed in a definite order on a form. The Questionnaire is mailed to
respondents, who are expected to read and understand the questions and write down their replies in the
space provided.
Advantages of this method:
- Low cost, even when the universe is large and widespread
- Free from interviewer bias
- Respondents have adequate time to think through their answers
- Respondents who are not easily approachable can also be reached conveniently
- Large samples can be used
Limitations of this method:
- Low rate of return
- Respondents need to be educated and cooperative
- Inbuilt inflexibility
- Possibility of ambiguous replies or omission of items
- This method is slow
The collected data are then organized into a structured form before the input stage.
Devices that can be used to collect data
Mark Sense Cards
Cards are divided into boxes that can be marked. A mark sense reader then scans the card and detects
where marks have been made.
Examples: Answer sheet, lottery ticket.
Advantages:
- Simple to use
- Fast to enter data
- Data entry is very accurate.
Bar Codes
Bar codes appear on almost every item you can buy, from books and newspapers to tins of beans.
Bar codes are made up of a series of lines used to represent information. The information stored is as
follows:
- The country the product comes from.
- The code for the manufacturer of the product.
- The code for the product name and size.
- A check digit used to ensure the data is entered correctly.
In a shop the bar code is scanned, and further information such as the price and the amount of that product
in stock is accessed from the shop's computerized database.
Using bar codes in shops means that items do not have to be individually priced; to change the
price, only a single field in the database needs to be changed.
Other advantages are that prices cannot be entered incorrectly by the shop staff and that
items are scanned much more quickly, causing less queuing and fewer tills needing to be opened.
Magnetic Stripes
Magnetic stripes can be seen on train tickets or bank or credit cards. These stripes hold a small amount
of data (64 characters) and can be read by a magnetic stripe reader (card reader) that is connected to a
computer system.
These provide a quick and accurate way of entering details into a computer system and are simple to
operate.
Smart Cards
Most bank and credit cards are now smart cards. Cards have their own processor and memory that can
hold up to 64KB of data. The data that is stored can be updated and the processor can process simple
programs.
Magnetic Ink Character Recognition
Special magnetic ink is used to print details that can be read by a magnetic ink reader. Cheques have
the bank account number and sort code printed in magnetic ink.
Advantages of using magnetic ink on cheques include:
- Bundles of cheques can be processed very quickly.
- It is very difficult to forge a cheque.
- The ink can be read by the reader even if the cheque gets marked or dirty.
Optical Character Recognition
Hard copy is scanned and the image is then examined by OCR software, which recognizes text. Most
scanners come with OCR software. Note that OCR software usually recognizes only printed
text, not handwriting, and even then only certain fonts. The main advantage of OCR is that time
is saved by not having to retype documents.
Input
This is another time-consuming process in data processing. In large companies, many people may be
needed for several days to get this work done. The cost involved is high, so many
businesses outsource the work to save cost. The volume of data
that must be entered each day is so large that dedicated data entry companies have been set up for this
purpose, and many freelancers also work in this area. Data entry does not require much
education. It is a simple process; what is required is speed and accuracy.
Input Methods
Data input methods can be direct or indirect.
Direct Input
Direct data are machine-readable data that can be fed to the system directly. Because no data
conversion is needed, direct input avoids a step that is time consuming and error prone. A credit
card is an example of a medium whose data is entered by the direct method.
Example: A credit card reader is a direct input device. The credit card has a magnetic
stripe fixed on it, which contains vital information such as the owner’s code and account
details. The card is inserted into the card reader, which reads the details. The card number
is then noted and the amount is credited.
Indirect Input
Indirect data is in human-readable form, so it has to be converted into machine-readable form before
it can be processed. This data conversion is time consuming and error prone, and it
causes a major bottleneck in data processing. The keyboard, mouse, and joystick are examples of
indirect input devices.
Checking Data Entry
When data is entered into a computer system, it is important to try to ensure that it is correct. There
are two methods used to do this:
1. Validation
The computer system checks that data meets certain requirements. These checks show that the data is
reasonable, but they do not ensure that it is correct.
Validation checks and how they work:
a. Presence Check - Does not allow the user to continue until an entry has been made.
b. Range Check - Ensures data is between a specified lower and upper limit.
c. Field Length Check - Ensures the correct number of characters has been used.
d. Data Type Check - Ensures the correct type of data (number, text, date) has been
entered.
e. Check Digits - This is a common form of validation used in barcodes and other
numeric codes. The check digit is the last digit of the code and it is
calculated from the other digits in the code. When the number
is entered, the computer carries out the same calculation and compares its answer
with the check digit entered. If the two are the same, the number is accepted as
having been entered correctly.
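The five checks above can be sketched as small Python functions. The function names and the date format are hypothetical. The text does not say which check digit scheme is meant, so the sketch assumes the EAN-13 barcode scheme, in which the first twelve digits are weighted alternately by 1 and 3.

```python
from datetime import datetime


def presence_check(value):
    """Presence check: the field must not be empty or blank."""
    return value is not None and str(value).strip() != ""


def range_check(value, low, high):
    """Range check: the value must lie between the lower and upper limit."""
    return low <= value <= high


def length_check(text, expected):
    """Field length check: the entry must have exactly `expected` characters."""
    return len(text) == expected


def date_type_check(text, fmt="%Y-%m-%d"):
    """Data type check for dates (the format string is an assumption)."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


def ean13_check_digit(first12):
    """Compute the EAN-13 check digit from the first twelve digits."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def ean13_is_valid(code):
    """Recompute the check digit and compare it with the one entered."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    return ean13_check_digit(code[:12]) == int(code[12])


print(ean13_is_valid("4006381333931"))
```

A code that passes these checks is plausible, not necessarily correct: a range check accepts any in-range value, including a wrong one.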
2. Verification
Data is checked to ensure that it has been entered exactly as intended. There are two methods of verification:
a. Double Entry - The user enters the same data twice, and if both entries are exactly the
same, it is assumed that the entry is correct.
b. Checks - Once data has been entered, the computer asks the user to read the data and
click YES or NO depending on whether the user thinks the data is correct or not.
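Both verification methods can be sketched briefly. The function names are hypothetical, and the `ask` parameter is an assumption added so the prompt can be replaced in tests; by default it uses Python's built-in `input`.

```python
def double_entry_matches(first_entry, second_entry):
    """Double entry: accept the data only if both entries are exactly the same."""
    return first_entry == second_entry


def confirm_entry(value, ask=input):
    """Check by reading back: show the value and ask the user for YES or NO."""
    answer = ask(f"You entered {value!r}. Is this correct? (YES/NO) ")
    return answer.strip().upper() == "YES"


# Example: an email address typed twice, the second time with a typing slip.
print(double_entry_matches("ann@example.com", "ann@example.com"))
print(double_entry_matches("ann@example.com", "ann@exmaple.com"))
```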
Processing
Once input is complete, the processing itself begins. At this stage, various means and
methods are used to manipulate the data that has been entered. In the past, when computers were not available or
not common, people had to do this by hand, and it was a herculean task. With the
advent of computers, the process has become much easier. Many software programs are available for
processing large volumes of data within very short periods. Some are general purpose, while others are for
specific industries and processes. Often a single click
of a button is enough to produce the result.
Output and interpretation
The purpose of data processing is to provide information that will guide future company policies.
That makes output very important. When the output is available, it should be interpreted in a way that
makes it useful for the company. Without interpretation, the company does not benefit from the whole
process.
The output can be presented using devices such as monitors and printers. These are the most common
methods of data output, but other methods are often used.
Output to File
The data processing cycle shows that output from one process is often used as input for another
process. Output can be saved as a data file that is then stored on backing storage from where it can be
loaded as input for a process when required.
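A minimal sketch of this idea, assuming CSV as the file format and a temporary folder standing in for backing storage. The file name, column names, and account values are hypothetical.

```python
import csv
import os
import tempfile


def write_totals(path, totals):
    """First process: save its output as a data file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["account", "total"])
        for account, total in sorted(totals.items()):
            writer.writerow([account, total])


def read_totals(path):
    """Second process: load the saved file as its input."""
    with open(path, newline="") as f:
        return {row["account"]: float(row["total"]) for row in csv.DictReader(f)}


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "totals.csv")
    write_totals(path, {"A-100": 1450.0, "B-200": 375.5})
    loaded = read_totals(path)
    grand_total = sum(loaded.values())

print(grand_total)
```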
Output activities
Output activities include retrieving, converting, storing, and communicating.
- Retrieving involves pulling information from storage devices for use by the decision-maker.
- Converting means translating information from the computer form used to store it to a form
  understandable by the user (such as a CRT display or printed report).
- Storing involves transferring the data onto a storage medium, such as a disk or tape file, for
  future use.
- Communicating takes place when the relevant, accurate information is in the right place at the
  right time.
Storage
The last stage in data processing is storage. The data entered and the results of processing must be
stored safely so that they can be used again. If the results are not stored, there
will be no sound basis for future comparison.
The input data, like the output information, may be valuable and may need to be recorded or stored
safely.
Data storage is the holding of data in an electromagnetic form for access by a computer processor.
There are two main kinds of storage:
Primary storage is data that is held in random access memory (RAM) and other memory devices that
are built into computers.
Secondary storage is data that is stored on external storage devices such as hard disks, tapes, and CDs.
Hard disks
Often called a disk drive, hard drive or hard disk drive, this method of data storage stores and provides
relatively quick access to large amounts of data. The information is stored on electromagnetically
charged surfaces called 'platters'.
Floppy disks
A floppy disk is a type of magnetic disk memory which consists of a flexible disk with a magnetic
coating. Almost all floppy disks for personal computers now have a capacity of 1.44 megabytes. Floppy
disks are readily portable, and are very popular for transferring software from one PC to another. They
are, however, very slow compared to hard disks and lack storage capacity. Increasingly, therefore,
computer manufacturers are not including floppy disk drives in their products as a built-in storage
option.
Tape storage
Tape is used as an external storage medium. It consists of a loop of flexible celluloid-like material that
can store data in the form of electromagnetic charges. A tape drive is the device that positions, writes
to, and reads from the tape. A tape cartridge is a protectively encased tape that is portable.
Optical disks
An optical disc is a storage medium that can be written to and read using a low-powered laser beam.
Data is recorded as a pattern of tiny dots on the disc surface. A
laser reads these dots, and the data is converted to an electrical signal and finally back into the
original data.
CD-R
Compact Disc-Recordable (CD-R) discs have become a universal data storage medium
worldwide. CD-Rs are becoming increasingly popular for music recording and for file storage or
transfer between personal computers. CD-R discs are write-once media. This means that, once
used, they cannot be erased or re-recorded upon. CD-R discs can be played back in any audio
CD player or CD-ROM drive, as well as many DVD players and drives.
CD-RW
Compact Disc-Rewritable (CD-RW) discs are rewritable and can be erased and re-recorded
upon over and over again. CD-RW discs can only be used on CD players, CD-ROM drives, and
DVD players and drives that are CD-RW playback-compatible.
DVD
A DVD (Digital Versatile Disc or Digital Video Disc) is a high density optical disc with large
capacity for storage of data, pictures and sound. The capacity is 4.7 GB for single sided, single
layer DVD disc - which is approximately 7 times larger than that of a compact disc.
Data Processing systems
A data processing system is a system that processes data which has been captured and
encoded in a format recognizable by the system, or which has been created and
stored.
Data processing systems can be grouped into the following categories according to their
method of operation.
1. Batch Processing
2. Real-Time Processing
Batch processing systems
Batch processing is the non-continuous processing of data, instructions, or materials. In data transmission, batch
processing is used for very large files or where a fast response time is not critical. The files to
be transmitted are gathered over a period and then sent together as a batch.
Most data processing is done using batch processing (also known as serial, sequential, or offline
processing). Batch processing involves processing transactions on the computer at
specified times.
A telephone billing system, for example, is normally processed in batch mode. On a
predetermined date at a predetermined time, the variable information about phone usage
(call records: numbers, time, duration, rate, etc.) is entered and the computer produces all of
the phone bills and related information at the same time. The phone usage information is allowed
to accumulate and is entered as a batch or group at a central computer site or other location.
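The telephone billing example can be sketched as follows. The record layout, the phone numbers, and the rates in cents are hypothetical; the point is only that calls accumulate untouched and are billed together in one run.

```python
from collections import defaultdict

# Call records accumulate here until the scheduled billing run.
# Hypothetical layout: (phone_number, minutes, cents_per_minute)
pending_calls = []


def record_call(number, minutes, cents_per_minute):
    """Store a call record; nothing is billed at this point."""
    pending_calls.append((number, minutes, cents_per_minute))


def run_billing_batch():
    """Process every accumulated record in one run, then empty the batch."""
    bills = defaultdict(int)
    for number, minutes, cents_per_minute in pending_calls:
        bills[number] += minutes * cents_per_minute
    pending_calls.clear()
    return dict(bills)


record_call("555-0101", 10, 5)
record_call("555-0102", 3, 20)
record_call("555-0101", 20, 5)

# On the predetermined date, all bills are produced at the same time.
bills_in_cents = run_billing_batch()
print(bills_in_cents)
```

Amounts are kept in whole cents to avoid floating-point rounding in the totals.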
Advantages
- It allows sharing of computer resources among many users and programs.
- It shifts the time of job processing to when the computing resources are less busy.
- It avoids idling the computing resources with minute-by-minute manual intervention
  and supervision.
- By keeping a high overall rate of utilization, it better amortizes the cost of a computer,
  especially an expensive one.
Disadvantages
- There is always a delay before work is processed and returned.
- Batch processing usually involves an expensive computer and a large number of trained
  staff.
Real-Time Processing Systems
Data processing that appears to take place, or actually takes place, instantaneously upon data
entry or receipt of a command.
In computer science, real-time computing, or reactive computing, is the study of hardware and
software systems that are subject to a "real-time constraint", that is, operational deadlines
from event to system response. Real-time programs must guarantee
response within strict time constraints. Often real-time response times are understood to be in
the order of milliseconds and sometimes microseconds. In contrast, a non-real-time system is
one that cannot guarantee a response time in any situation, even if a fast response is the usual
result.
Examples: Anti-missile defense systems, airplane landing control systems, electronic fund
transfer systems, and ticket reservation systems.
Advantages
- There is no significant delay for response.
- Information is always up-to-date.
- Output from the computer may be used to adjust and improve the input.
Disadvantages
- A computer must be dedicated solely to the task.
- The computer must be continually online.
Other kinds of processing systems include:
Online Processing - This is a method that utilizes Internet connections and equipment directly
attached to a computer. It is used mainly for information recording and research.
Distributed Processing - This method is commonly utilized by remote workstations connected
to one big central workstation or server. ATMs are good examples of this data processing
method.