Unit 2: Organization of Data

Statistical Enquiry
A statistical enquiry is the process of transforming raw data into useful information that tells us more
about a subject and allows us to make recommendations and, possibly, predictions of future
outcomes. It consists of two stages:
1. Planning and Preparation
   1. Object of enquiry
   2. Scope of the enquiry
   3. Units used for collection and measurement
   4. Sources of data
   5. Method of collection of data
   6. Framing a format
   7. Accuracy level
   8. Type of enquiry
2. Execution and Survey
   a) Setting up a team of administrators
   b) Designing of questionnaire
   c) Selection and training of enumerators
   d) Field work by enumerators and supervision
   e) Follow-up work in the case of non-response
   f) Analysis of collected data
   g) Preparation of final report
Collecting Data
Collection of data is the first and most important stage in any statistical survey. The method of
collection depends upon various considerations such as the objective, scope and nature of the
investigation and the availability of resources. Direct personal interviews, third-party agencies and
questionnaires are some of the ways in which data is collected.
Primary data
Data collected for the first time, keeping in view the objective of the survey, is known as primary data.
Such data are likely to be more reliable; however, the cost of collecting them is much higher. Primary data
may be collected by the census method, in which information with respect to each and every individual
of the population is observed, or by a sample survey covering only part of the population.
Collection of primary data can be done by any of the following methods.
a) Direct personal observation
b) Indirect oral interview
c) Information through agencies
d) Information through mailed questionnaires
e) Information through schedule filled by investigators
Direct personal observation
In the direct personal observation method, as illustrated in figure 2.4, the investigator collects data by
having direct contact with units of investigation. The accuracy of data depends upon the ability, training
and attitude of the investigator.
Merits
1. We get the original data, which is more accurate and reliable.
2. Satisfactory information can be extracted by the investigator through indirect questions.
3. Data is homogeneous and comparable.
Demerits
1. This method consumes more cost.
2. This method consumes more time.
3. This method cannot be used when the scope of investigation is wide.
Indirect Oral Interview
Indirect oral interview is used when the area to be covered is large. The investigator collects the data
from a third party, a witness or the head of an institution. This method is generally used by police
departments in cases related to enquiries on the causes of fires, thefts or murders.
Merits
1. Economical in terms of time, cost and manpower.
2. Confidential information can be collected.
3. Information is likely to be unbiased and reliable.
Demerits
1. The degree of accuracy of the information is lower.
Collecting Information Through Agencies
The method of collecting information through local agencies or correspondents is generally adopted by
newspapers and television channels. Local agents are appointed in different parts of the area under
investigation.
Merits
Very cheap and economical
Useful where information is needed regularly
Demerits
Information may be biased
It is difficult to maintain the degree of accuracy and uniformity
Information Through Mailed Questionnaires
Often, information is collected through questionnaires. The questionnaires are filled with questions
pertaining to the investigation. They are sent to the respondents with a covering letter soliciting
cooperation from the respondents (respondents are the people who respond to questions in the
questionnaire).
Merits
Most economical
Saves manpower
Can be widely used
Demerits
Cannot be used if informants are illiterate
Many informants will not respond
In case of non-response, follow-up work is essential
Information through schedule filled by investigators
Information can be collected through schedules filled in by investigators through personal contact.
In order to get reliable information, the investigator should be well trained, tactful, unbiased and
hardworking.
Merits
Useful when informants are illiterate
Rate of non-response is lower
Demerits
Training of investigators is essential
Time consuming
Personal bias of investigators may lead to failure of the enquiry
Secondary Data
Any information that is used for the current investigation but was originally collected and used by some
other agency or person in a separate investigation or survey is known as secondary data. Such data are
available in published or unpublished form.
The various sources of published data are:
1. Reports and official publications of international and national organizations as well as
central and state governments
2. Publications of several local bodies such as municipal corporations and district boards
3. Financial and economic journals
4. Annual reports of various companies
5. Publications brought out by research agencies and research scholars
Questionnaire
A questionnaire is a research instrument consisting of a series of questions and other prompts for the
purpose of gathering information from respondents. Although questionnaires are often designed for
statistical analysis of the responses, this is not always the case.
Guidelines for Construction of a Questionnaire
The following principles are to be considered:
1. The number of questions should be as few as possible.
2. Questions must be simple to understand.
3. Questions should be arranged logically.
4. Answers to questions must be short.
5. As far as possible, questions on personal matters must be avoided.
6. Any clarifications on questions must be provided in a footnote.
7. Necessary instructions must be given to informants.
8. The questionnaire must be attractive.
9. Information supplied must be kept confidential.
Census
A census is the procedure of systematically acquiring and recording information about the members of a
given population. It is a regularly occurring and official count of a particular population.
Merits
Results are accurate and reliable
Data are collected from each and every unit of the population
Provides detailed study of all units in the population
Free from sampling errors
Demerits
Non-sampling errors are likely to be greater
Requires money, labor and time
It isn't possible in some circumstances when the population is vast
Census enumeration is not suitable if units are damaged while procuring data
Sample Survey
In statistics, survey sampling describes the process of selecting a sample of elements from a target
population to conduct a survey. The term "survey" may refer to many different types or techniques of
observation. In survey sampling it most often involves a questionnaire used to measure the
characteristics and/or attitudes of people.
Merits
Requires less labor and time, and is economical
Sample survey is more scientific
Can be applied where testing destroys the units
Less prone to non-sampling errors than a census
Demerits
Requires adoption of appropriate sampling methods and appropriate analysis
If the population is too heterogeneous in nature, use of a sampling procedure is impossible
Sampling errors are always present
Difference between Census and Sample Survey
Census
1. Each and every unit of the population is studied.
2. Requires a large amount of finance, time and labour.
3. Results are quite reliable.
4. It is more suitable if the population is heterogeneous in nature.
5. It cannot be used when part of the population is missing.
Sample Survey
1. Only a few units of the population are studied.
2. A relatively small amount of finance, time and labour is required.
3. Results are less reliable.
4. It is more suitable if the population is homogeneous in nature.
5. It can be used if part of the population is missing.
The following are some of the methods of sampling:
Simple Random Sampling
In a simple random sample (SRS) of a given size, all subsets of the frame of that size are given an equal
probability. Furthermore, any given pair of elements has the same chance of selection as any other such
pair (and similarly for triples, and so on). This minimizes bias and simplifies analysis of results. In
particular, the variance between individual results within the sample is a good indicator of variance in
the overall population, which makes it relatively easy to estimate the accuracy of results.
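As a rough illustration of the equal-probability idea, the sketch below draws a simple random sample in Python. The function name simple_random_sample, the frame of unit IDs 1 to 100 and the seed are assumptions made for this example only; they are not part of the notes.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Return n distinct units drawn from frame without replacement.

    random.Random.sample gives every subset of size n the same chance
    of being selected, which is the defining property of an SRS.
    """
    rng = random.Random(seed)  # seed is optional; fixed here only for repeatability
    return rng.sample(list(frame), n)


# Hypothetical sampling frame: unit IDs 1 to 100.
frame = list(range(1, 101))
sample = simple_random_sample(frame, n=10, seed=42)
print(sample)
```

Drawing without replacement means no unit can appear twice in the sample.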
Systematic Sampling
Systematic sampling (also known as interval sampling) relies on arranging the study population
according to some ordering scheme and then selecting elements at regular intervals through that
ordered list. Systematic sampling involves a random start and then proceeds with the selection of every
kth element from then onwards.
In this case, k = (population size / sample size). It is important that the starting point is not automatically
the first in the list, but is instead randomly chosen from within the first to the kth element in the list. A
simple example would be to select every 10th name from the telephone directory (an 'every 10th'
sample, also referred to as 'sampling with a skip of 10').
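The procedure just described (random start, then every kth element) might be sketched in Python as follows. The function name systematic_sample and the made-up directory of 100 names are assumptions for illustration. Rounding k down with integer division is also an assumption, since the notes do not say what to do when the population size is not a multiple of the sample size.

```python
import random


def systematic_sample(ordered_units, sample_size, seed=None):
    """Pick every kth unit from an ordered list, starting at a random position."""
    population_size = len(ordered_units)
    if not 0 < sample_size <= population_size:
        raise ValueError("sample_size must be between 1 and the population size")

    # Sampling interval k = population size / sample size (rounded down: an assumption).
    k = population_size // sample_size

    # Random start somewhere within the first k elements (0-based index 0..k-1).
    rng = random.Random(seed)
    start = rng.randrange(k)

    # Take every kth element from the start; trim in case the list is not a multiple of k.
    return ordered_units[start::k][:sample_size]


# Hypothetical ordered list standing in for a telephone directory.
directory = [f"name_{i:03d}" for i in range(1, 101)]
picked = systematic_sample(directory, sample_size=10, seed=7)
print(picked)
```

With 100 names and a sample of 10, k is 10, so the selected names are always exactly 10 positions apart.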
Stratified Sampling
Where the population embraces a number of distinct categories, the frame can be organized by these
categories into separate "strata." Each stratum is then sampled as an independent sub-population, out
of which individual elements can be randomly selected. There are several potential benefits to stratified
sampling.
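A minimal Python sketch of sampling each stratum as an independent sub-population is given below. The function name stratified_sample, the "urban" and "rural" strata and the per-stratum sample sizes are all hypothetical. How many units to take from each stratum is left to the caller, because the notes do not prescribe an allocation rule.

```python
import random


def stratified_sample(strata, sizes, seed=None):
    """Draw an independent simple random sample from each stratum.

    strata: dict mapping stratum name -> list of units in that stratum
    sizes:  dict mapping stratum name -> number of units to draw from it
    """
    rng = random.Random(seed)
    result = {}
    for name, units in strata.items():
        if sizes[name] > len(units):
            raise ValueError(f"stratum {name!r} has fewer units than requested")
        # Each stratum is sampled on its own, without replacement.
        result[name] = rng.sample(units, sizes[name])
    return result


# Hypothetical frame organized into two strata.
strata = {
    "urban": list(range(0, 60)),
    "rural": list(range(60, 100)),
}
sizes = {"urban": 6, "rural": 4}  # allocation chosen by the caller (an assumption)
chosen = stratified_sample(strata, sizes, seed=3)
print(chosen)
```

Because each stratum is sampled separately, every category is guaranteed to be represented in the final sample.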
Statistical Error (Sampling Errors)
It is the difference between the estimated value and the actual value.
Causes of Errors
1. Error of origin: due to improper definition of statistical units.
2. Error of inadequacy: due to incomplete data.
3. Error of manipulation: error that occurs during analysis.
Biased and Unbiased Errors
Errors which occur with the knowledge of the investigator are called biased errors. They are prejudiced
errors.
Errors which occur without the knowledge of the investigator are called unbiased errors. Because they
arise by chance, they cannot be controlled.
Measurement of Errors
There are two types of measurements:
1. Absolute error: the arithmetic difference between the actual value and the estimated value:
Absolute Error = Actual Value – Estimated Value, or AE = a – e
2. Relative error: the ratio of the absolute error to the estimated value:
Relative Error = Absolute Error / Estimated Value, or RE = (a – e) / e
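The two measures can be worked through with a small Python sketch. The function names and the figures (an actual value of 1050 against an estimate of 1000) are made up for illustration. The definitions follow the notes: actual minus estimated, and that difference divided by the estimated value.

```python
def absolute_error(actual, estimated):
    """Absolute error: actual value minus estimated value (AE = a - e)."""
    return actual - estimated


def relative_error(actual, estimated):
    """Relative error: absolute error divided by the estimated value."""
    if estimated == 0:
        raise ValueError("relative error is undefined when the estimate is zero")
    return absolute_error(actual, estimated) / estimated


# Hypothetical figures: actual value 1050, estimated value 1000.
a, e = 1050, 1000
ae = absolute_error(a, e)
re = relative_error(a, e)
print(ae, re)  # prints: 50 0.05
```

Here the estimate falls short of the actual value by 50 units, which is 5% of the estimated value.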