Data Collection

• Many economic, social and business
questions need clarification.
• Solutions need to be presented and
criteria agreed for their acceptance or
rejection.
• Decision makers not only need data but
also need to evaluate the quality of those data.
Data Collection contd.
• Quantitative data can describe the size of a
business, its profitability, its product range, and
the characteristics of its workforce and a host of
other factors.
• However, numbers alone are unlikely to give us
the understanding of the business problem that
we require. We need to take account of the
people involved, the culture of the enterprise,
the legal and economic environment.
Data Collection contd.
• Most significant business problems are likely to
require a multi-disciplinary approach.
• In fact, few problems are purely qualitative in
nature.
• Examples:
• If we consider the personnel problem of staff
recruitment we soon begin to describe job
requirements in terms of age, income and other
measurable factors.
Data Collection contd.
• If we consider another personnel problem
of assessing training needs, then we can
become involved in a major statistical
Problems of Data for Decision Maker
• The completeness of data:
The decision maker will need to decide whether
the current data is sufficient for the purpose or
whether additional data should be acquired.
Data collection takes time and can be costly.
• The quality of data:
Data that are biased or misleading can
damage any effective decision-making process.
Questions of data
• A prerequisite of any statistical enquiry is an
understanding of its purpose. The broad groups
of questions are:
• What is the relevant population?
• What are the sources of data?
• How many people were asked and how were
they selected?
• How was the information collected?
• Who did not respond?
• What type of data was selected?
Defining population
• The term population can be used to describe all
the items or organizations of interest.
– For example, an audit is concerned with the
correctness of financial statements. The population of
interest to the auditor could be the accounting
records, invoices or wage sheets.
• In the case of job opportunities, the population
could be all the local businesses or
organizations employing one or more persons.
Identifying relevant population
• Identification of the relevant population is
essential since data collection can be a costly
exercise and contacting large numbers of people
who could have nothing to do with the survey will
only waste these valuable resources.
• If you are interested in why people buy
foreign-built cars, but fail to contact
purchasers of the imported models, then you
might fail to identify the fact that some buyers do
not realize that their car is foreign built.
Sample Frame
• Having considered the relevant population, the next
problem is to identify who these people are and
to get a list of their names and addresses. If this
list can be obtained, it is called a sampling frame.
• Many surveys, particularly in market research,
need a general population of adults and make
use of the Electoral Register.
Sample Frame contd.
• When a list does not exist or is not
available, then those collecting the
information may either try to compile a list,
or use a method of collection which does
not require a sampling frame.
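Where a sampling frame is available, one way to select respondents from it is a simple random sample, in which every listed unit has the same chance of being chosen. The sketch below is only an illustration under that assumption: the slides do not prescribe a selection method, and the frame entries, the function name draw_sample and the seed value are all hypothetical.

```python
import random

def draw_sample(frame, n, seed=None):
    """Draw a simple random sample of n units from a sampling frame, without repeats."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the sampling frame")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(frame, n)

# Hypothetical frame: a list of names, as might be compiled from a register.
frame = [f"Person {i}" for i in range(1, 201)]
sample = draw_sample(frame, 20, seed=1)
print(len(sample), "units selected from a frame of", len(frame))
```

Recording the seed alongside the survey documentation would let someone else reproduce the same selection from the same frame.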
Sample Size for Multivariate Analysis
• Fairly large sample sizes are needed for multivariate
analyses. The large sample size is necessary because
the correlations used to calculate these statistics are not
very stable when based on small samples.
• Tabachnick and Fidell (2001, p. 117)* offer the following
formula for computing the sample size required for a
multiple regression analysis:
N ≥ 50 + 8m
where N is the sample size and m is the number of predictor variables.
* Tabachnick, B. G., & Fidell, L. S. (2001). Using Multivariate Statistics (4th ed.). Boston: Allyn and Bacon.
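The rule N ≥ 50 + 8m is simple arithmetic, and a small helper makes it easy to check for any number of predictors. This is a minimal sketch; the function name min_sample_size is an assumption introduced here for illustration.

```python
def min_sample_size(m):
    """Minimum sample size for multiple regression under the rule N >= 50 + 8m."""
    if m < 1:
        raise ValueError("need at least one predictor variable")
    return 50 + 8 * m

# Five predictor variables: 50 + 8 * 5 = 90
print(min_sample_size(5))
```

The result is a lower bound only; it says nothing about data quality or the strength of the relationships being studied.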
Sample Size for MA
• So, if you have five predictor variables, you would need
a minimum of 90 participants in your sample.
• Larger samples may be needed if your data are
skewed, there is substantial measurement error, or you
anticipate weak relationships among variables.
• Care is also needed about too large a sample. With
overly large samples, very weak relationships that may
have neither theoretical nor practical value can achieve
statistical significance.
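The effect of an overly large sample can be seen with the standard t statistic for a Pearson correlation, t = r·sqrt((n − 2) / (1 − r²)). The sketch below holds a very weak correlation fixed at r = 0.05 (which explains only 0.25% of the variance) and changes only the sample size. The value of r, the two sample sizes and the cut-off of 1.96 (an approximate two-sided 5% threshold for large samples) are assumptions chosen for illustration.

```python
import math

def correlation_t(r, n):
    """t statistic for testing whether a Pearson correlation r differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

CRITICAL = 1.96  # approximate two-sided 5% cut-off for large samples (assumption)

for n in (100, 10_000):
    t = correlation_t(0.05, n)
    verdict = "significant" if abs(t) > CRITICAL else "not significant"
    print(n, round(t, 2), verdict)
```

With 100 cases the statistic is about 0.5, far from significant. With 10,000 cases it is about 5.0, so the same weak relationship passes the test even though its practical value has not changed.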
Sample Size for MA
• Several factors should be considered before
using multivariate statistics.
– Make sure that your data meet the assumptions of the
test you are going to use (that is, normality, linearity,
and homoscedasticity);
– that you have removed any outliers or minimized their
effects through transformation;
– that you have considered error of measurement; and
– that you have gathered a sufficiently large sample.
• If you violate the assumptions of the test or fail to take
into account the other important factors, the results
you obtain may not be valid.
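One simple way to screen a single variable for outliers before analysis is to flag values whose standardized score is unusually large. The sketch below is a rough illustration, not a full assumption check: the cut-off of 3 standard deviations is a common convention assumed here, and the function name flag_outliers and the data are hypothetical.

```python
import statistics

def flag_outliers(values, z_cutoff=3.0):
    """Return the values lying more than z_cutoff sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs(v - mean) / sd > z_cutoff]

# Hypothetical scores: eighteen ordinary values and one extreme value.
scores = [9, 10, 11] * 6 + [100]
print(flag_outliers(scores))
```

Because an extreme value also inflates the standard deviation, this screen is weak in small samples: with ten or fewer observations no value can lie more than 3 sample standard deviations from the mean, so a lower cut-off or a different method would be needed there.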
Critical steps
• Considering the purpose of a statistical enquiry
and defining the relevant population are the
most important and critical steps in a market
research survey.
• If we are not sure about the purpose of the
enquiry, and we are not selective about the
information collected, what is the likely value of
any subsequent, complex statistical analysis?
This is surely similar to the computing saying
GIGO – ‘garbage in, garbage out’.
Data Collection
• The next step is to obtain data on the
population of interest.
• A statistical enquiry may require the
collection of new data, referred to as
primary data, or be able to use existing
data, referred to as secondary data, or
may require some combination of both.
Data Collection Methods
• Interviews:
Face-to-face interviews, telephone
interviews, computer-assisted interviews,
and interviews through electronic media;
• Questionnaires:
Personally administered, sent through the
mail, or electronically administered; and
• Observation:
Of individuals and events, with
or without videotaping or audio recording.
Sources of data
• Primary data sources:
Individuals, focus groups, and panels of
respondents specifically set up by the
researcher, whose opinions may be sought
on specific issues from time to time.
Sources of data contd.
• Secondary data sources:
– Company records or archives, government
publications, industry analysis offered by
the media, web sites, the internet, and so on.
– In some cases, the environment or particular
settings and events may themselves be
sources of data, as for example, studying the
layout of a plant.
Sources of data contd.
• Data can also be collected from case
studies and from any of the many sources
of secondary data, for analysis and
application to solve specific problems.
Sources of data contd.
• In survey research, the three main data
collection methods are:
– interviewing,
– administering questionnaires, and
– observing people and phenomena.
Sources of data contd.
• Secondary data can come from within the
organization, internal secondary data, or
from outside the organization, external
secondary data.
Internal & external secondary
data sources
• Internal data sources include employee
records, payroll information and customer
• External data sources include the official
statistics supplied by the central statistical
office and other government departments.
Exploring Secondary Data:
Exploratory Research
• Expand understanding of management
• Expand understanding of research
• Identify plausible investigative questions
Levels of Information
• Primary sources
• Secondary sources
• Tertiary sources
Types of Information Sources
Indexes and Bibliographies
Secondary Sources by Type
Indexes and Bibliographies
– To find or locate books or articles
– To find authors and topics to use in online searches
– To identify the jargon of an industry, for use in online searches
– To identify bellwether events in an industry
– To identify knowledgeable people to interview
– To identify organizations of influence
– To identify historical or background information
– To find critical dates within an industry
– To find events of significance to the industry or company
– To find facts relevant to the topic
– To identify influential individuals through source citations
– To identify influential people and organizations
– To find addresses, e-mail and other contact information for these people and organizations
Evaluating Information Sources
Evaluating Sources
– What the author is attempting to accomplish
• Identify hidden agenda(s)
• Identify direction of bias
– Seek both biased and unbiased sources
– Identify dates of inclusion and exclusion
– Identify subjects of inclusion and exclusion
– Identify background of author
• Credentials: educational, professional
• Experience: duration, setting, level
– Identify the level of scholarship in content
• Footnotes, endnotes
– Identify knowledge level and background
– Identify orientation and bias
– Seek biased and unbiased sources
– Order of content
– Versatility of use
• Indexed?
• Searchable?
• Downloadable?