THE PRESENTATION AND ANALYSIS OF EMPIRICAL AND QUALITATIVE
DATA
C. Njerekai (Department of Tourism and Hospitality Management)
Midlands State University, Gweru, Zimbabwe
Paper presented at a one-day research workshop held at Chinhoyi University of
Technology on the 17th of November, 2010
1. DATA ANALYSIS
1.1 INTRODUCTION
The analysis of data is carried out to decipher meaning from the data and draw
conclusions that address the research questions and objectives. Data analysis will allow
the researcher to accept or reject hypotheses and to make recommendations with respect
to the research problem.
1.2 STEPS FOR DATA ANALYSIS
Step 1: Edit the data
This process is undertaken to ensure that the data gathered is relevant, accurate,
consistent, complete and usable. Wherever possible, editing should be done in the field,
during and/or soon after the completion of the survey. The following points need to be
noted when editing data:
• Where responses are inconsistent, discard the inconsistent answers. As an
  example, a respondent who indicates that their highest level of education is
  ordinary level cannot be a college professor by profession. In this case, the
  profession is inconsistent with the level of education and hence it should be
  discarded.
• In the case of missing or unclear responses, the respondents can be re-contacted
  to supply or clarify them. However, missing data that is insignificant should be discarded.
• You must also check for responses that are too consistent throughout the
  questionnaire and validate these.
Editing can also be done manually or electronically after you have returned from
the field. Electronically, the computer can check for inconsistent responses or
coded values that lie outside the permissible range. If you decide to employ
research assistants, you must ensure that the necessary field controls are in place
to minimize error during the actual data collection. More precisely, you need to
monitor and validate the data collection process. Monitoring involves ensuring
that the data gathering proceedings are conducted as required. As an example, you
can listen to some of the interviews between the respondent and the research
assistant. Validation involves checking the accuracy of the data collected by the
research assistants. In this case, you can contact a section of the sampled study
subjects to confirm the validity of the data and to confirm that data gathering
actually took place.
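The electronic checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the codebook `PERMISSIBLE`, the numeric codes and the sample records are all hypothetical and are not taken from any actual survey.

```python
# Hypothetical codebook: the permissible coded values for each question.
PERMISSIBLE = {
    "education": {1, 2, 3, 4, 5},   # assumed: 1 = O-Level ... 5 = degree/postgraduate
    "profession": {1, 2, 3},        # assumed: 3 = college professor
}

def out_of_range(record):
    """Return the questions whose coded value is missing or not permissible."""
    return [q for q, allowed in PERMISSIBLE.items() if record.get(q) not in allowed]

def inconsistent(record):
    """Flag the example from the text: O-Level education with a professor's job."""
    return record.get("education") == 1 and record.get("profession") == 3

records = [
    {"id": 1, "education": 5, "profession": 3},
    {"id": 2, "education": 1, "profession": 3},   # inconsistent pair of answers
    {"id": 3, "education": 9, "profession": 2},   # 9 lies outside the permissible range
]

# Map each respondent id to (out-of-range questions, inconsistency flag).
flagged = {r["id"]: (out_of_range(r), inconsistent(r)) for r in records}
for rid, (bad_questions, clash) in flagged.items():
    print(rid, bad_questions, clash)
```

Records that are flagged in this way would still need to be inspected by the researcher before any answer is discarded.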
Step 2: Code the data
Coding refers to the process of assigning numerical values to all response categories in
questions. It involves classifying each response in a questionnaire with a designated
numerical symbol. Coding is necessary for the transfer of data into computers for analysis
involving statistical software packages such as SPSS. The coding needs to be done for
both qualitative and quantitative data as there are also qualitative and quantitative
computer assisted data analysis techniques. Coding can be done before the questionnaire
is administered (pre-coding) or after administration (post-coding). Pre-coding is suitable
for structured or closed questionnaires while post-coding is mostly suitable for
unstructured or open-ended questions. The process of coding requires establishment of
categories. These categories should be:
• Mutually exclusive
• Convenient in number
• Exhaustive, viz. they should include possible responses such as ‘don’t know’,
  ‘not applicable’, ‘other’ etc.
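As an illustration of coding, the sketch below assigns a numerical code to each response category of one closed question. The category labels and code numbers are assumptions chosen for the example; any real study would define its own codebook.

```python
# Hypothetical pre-coded categories for one closed question.
CODES = {
    "o-level": 1,
    "a-level": 2,
    "certificate": 3,
    "diploma": 4,
    "degree": 5,
    "don't know": 98,
    "not applicable": 99,
}
OTHER = 97  # catch-all code, so that the categories remain exhaustive

def code_response(answer):
    """Return the numerical code for a response, or OTHER if it is not listed."""
    return CODES.get(answer.strip().lower(), OTHER)

answers = ["Diploma", "O-Level", "apprenticeship", "Don't know"]
coded = [code_response(a) for a in answers]
print(coded)
```

Each response receives exactly one code (mutually exclusive), and the `OTHER` code catches anything that was not foreseen (exhaustive).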
Step 3: Analyse Qualitative Data
Step 3.1: Transcribe or capture the data
This is the process of inputting data into the computer. It involves transferring the data
from the data collection instruments into a manual or computer readable format.
Although this process can also be done manually as will be discussed later, in today’s
computerized world data analysis is almost always computer based. Given the range and
variety of computerised data analysis packages, the only way to familiarise yourself
with these packages is to practise on dummy or actual data collected from
the field.
Step 3.2: Analyse the qualitative data
There is great variety and diversity in approaches to the analysis of qualitative data.
However, common threads run through this variety, to the extent that a few
qualitative methodological techniques have been
crafted to analyse responses from qualitative research questions or studies. These
techniques are discussed next:
Qualitative data analysis techniques
(a) Computer assisted qualitative data analysis software
Several software packages have been developed specifically to analyse qualitative data.
While such packages may save time, they can also distance the researcher from the data,
and there really can be no substitute for the complex reading and
interpretation powers of the human mind. Examples of CAQDAS include QSR NVivo
7.0 and NUD*IST.
These software packages offer well set out workspaces from which you can:
• Note themes, patterns and connections emanating from the responses to
  each question
• Have an overview of the responses
• Construct flow diagrams, flow charts, networks etc., which can further
  bring out typologies and establish relationships between the responses, and
• Draw, evaluate and verify conclusions.
(b) Use of quasi-statistics
This involves counting those qualitative responses that can be quantified.
(c) Use of topical question list or hermeneutics
Hermeneutics is the study of interpretation theory. The topical questions which can
be used to assist in this process include ‘why’, ‘what’, ‘how’, ‘when’ and ‘where’.
(d) Semiotics. This refers to the study of signs and their meaning, e.g. logos, brands etc.
(e) Taxonomy. This refers to the practice and science of classification. In this case one
is supposed to code and classify the data.
(f) Micro-analysis. This involves the analysis of events as captured in videos and films.
This is done in order to identify critical incidents.
(g) Comparative analysis. This relates to the comparison of current qualitative data with
trends in literature and experience.
(h) Reconnaissance. This involves familiarizing yourself with the data.
(i) Reduce, reduce and reduce to reduce clutter
(j) Use of mind maps, flow charts and diagrams to visualize the data
(k) Hunt for connections and develop theory from the data
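A minimal sketch of quasi-statistics, technique (b) above, is to count how many open-ended responses mention each theme. The themes, keywords and responses below are hypothetical, and simple keyword matching is only an assumption made for the example; it is no substitute for careful reading of the data.

```python
from collections import Counter

# Hypothetical themes and keywords chosen by the researcher after reading the data.
THEMES = {
    "price": ["expensive", "cheap", "price"],
    "service": ["staff", "service", "friendly"],
}

responses = [
    "The staff were friendly but the rooms were expensive.",
    "Good price for what you get.",
    "Service was slow.",
]

def themes_in(text):
    """Return the set of themes whose keywords appear in one response."""
    lowered = text.lower()
    return {theme for theme, words in THEMES.items()
            if any(word in lowered for word in words)}

# Each response counts at most once towards each theme.
theme_counts = Counter(theme for r in responses for theme in themes_in(r))
print(theme_counts)
```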
Step 4: Analyse Quantitative Data
4.1 Transcribe or capture the data
This item has already been discussed under item 3.1 of this paper.
4.2 Undertake Preliminary data analysis
It is advisable to start with simple analysis of data before subjecting it to more complex
and sophisticated data analysis. The preliminary data analyses that can be employed
are as follows:
(a) Simple one-way tabulation
Simple one-way tabulation involves counting the responses in each category and hence
creating a basic data array. In other words, simple one-way tabulation involves assigning
a value to each variable in the array or criteria relating to the sampling unit(s). After this,
frequency distributions are created from the data array. Such frequency distributions help
you to draw meaningful conclusions from the data. Table 1 is an example of a simple
one-way tabulation and the associated frequency distribution.
Table 1: A simple one-way tabulation
QUALIFICATIONS                     NUMBER OF EMPLOYEES
O-Level                            5
A-Level                            3
Certificate                        1
Diploma/Higher National Diploma    4
Degree/Postgraduate                1
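A one-way tabulation of this kind can be produced by counting the responses in each category. The sketch below is illustrative only: the list `raw` is constructed so that its counts match Table 1, and the shortened category labels are assumptions.

```python
from collections import Counter

# Illustrative raw responses whose counts match Table 1.
raw = (["O-Level"] * 5 + ["A-Level"] * 3 + ["Certificate"] * 1
       + ["Diploma/HND"] * 4 + ["Degree/Postgraduate"] * 1)

freq = Counter(raw)          # the frequency distribution
total = sum(freq.values())   # number of employees tabulated

for category, count in freq.items():
    print(f"{category:<22}{count:>3}")
print(f"{'Total':<22}{total:>3}")
```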
(b) Cross-tabulation of data
This involves the construction of data arrays in such a way that the respondent with a
particular response to one variable can be traced, to check how they responded to other
variables. Such tabulations emphasize and bring out the relationship between variables.
As an example, one can cross-tabulate the ages of respondents and the recreational
activities they undertake. Cross tabulation can produce new variables.
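The age and recreation example can be sketched as follows. The age groups, activities and respondents are hypothetical; the point is only that each cell of the cross-tabulation counts the respondents who share a particular pair of responses.

```python
from collections import Counter

# Hypothetical respondents: (age group, recreational activity).
respondents = [
    ("18-30", "hiking"), ("18-30", "hiking"), ("18-30", "museums"),
    ("31-50", "museums"), ("31-50", "hiking"), ("51+", "museums"),
]

cells = Counter(respondents)  # one count per (age group, activity) pair
ages = sorted({age for age, _ in respondents})
activities = sorted({activity for _, activity in respondents})

# Print the table: one row per age group, one column per activity.
print(f"{'':<8}" + "".join(f"{activity:>10}" for activity in activities))
for age in ages:
    print(f"{age:<8}" + "".join(f"{cells[(age, activity)]:>10}" for activity in activities))
```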
(c) Statistical summarization of data
Measures of central tendency and dispersion are the main statistical techniques you can
use to summarize and describe your data, hence the term descriptive statistics. The
measures of central tendency aim at reducing your large volumes of research data to a
single measure, which characterizes the entire set of data. These measures are
summarized in Table 2.
Table 2: Summary of measures of central tendency and dispersion
METHOD                        PURPOSE/USEFULNESS
Measures of central tendency
(i) Mean                      Used as a descriptive statistic to indicate the typical or
                              central figure in a group of items, figures or numbers.
                              The mean can be misleading where the figures are
                              clustered around extremes.
(ii) Mode                     This refers to the most frequently occurring value in a
                              distribution. In layman’s language, this can be referred
                              to as the most fashionable or usual figure. This figure is,
                              however, not necessarily the mean or the median. In a
                              normal distribution, the mean, the mode and the median
                              have the same value.
(iii) Median                  This refers to the middle item in a distribution when the
                              values are arranged in ascending or descending order. It
                              is more useful when the data has extreme values or
                              outliers.
Measures of dispersion
(i) Range                     This reveals the difference between the highest and the
                              lowest values in a distribution.
(ii) Inter-quartile range     The inter-quartile range refers to the difference between
and semi-inter-quartile       the upper and lower quartiles of the data. The
range                         semi-inter-quartile range is half of this difference.
(iii) Mean deviation          This refers to the average of the absolute difference of
                              each value in the distribution from the mean.
(iv) Standard deviation       This refers to the dispersion of the items in the
                              distribution around the mean. Where this figure is low
                              there is more homogeneity or uniformity in the data, and
                              the opposite is true where it is high.
(v) Variance                  This refers to the square of the standard deviation. This
                              measure is rarely used in social science research.
For thorough procedures on the derivation of the measures of central tendency and
dispersion in table 2, please refer to the numerous textbooks on statistics.
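For readers who want to see these measures computed, the sketch below uses Python's standard `statistics` module on a small hypothetical set of ages. It assumes that the data is the whole population of interest (hence `pstdev` and `pvariance`); for a sample, `stdev` and `variance` would be used instead. The quartiles come from the module's default method, and other textbooks may use slightly different conventions.

```python
import statistics as st

ages = [21, 23, 23, 25, 27, 30, 64]   # hypothetical data; 64 is an outlier

# Measures of central tendency
mean = st.mean(ages)
median = st.median(ages)
mode = st.mode(ages)

# Measures of dispersion
value_range = max(ages) - min(ages)
q1, _, q3 = st.quantiles(ages, n=4)      # lower quartile, median, upper quartile
iqr = q3 - q1                            # inter-quartile range
semi_iqr = iqr / 2                       # semi-inter-quartile range
mean_dev = sum(abs(x - mean) for x in ages) / len(ages)
std_dev = st.pstdev(ages)                # population standard deviation
variance = st.pvariance(ages)            # square of the standard deviation

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"range={value_range} IQR={iqr} semi-IQR={semi_iqr}")
print(f"mean deviation={mean_dev:.2f} sd={std_dev:.2f} variance={variance:.2f}")
```

In this data the single outlier pulls the mean above the median, which is why the median is the more useful measure where extreme values are present.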
4.3 Undertake more complex analysis of the data
This is mainly achieved through subjecting the data to statistical data analysis techniques.
Such statistical data analysis techniques are undertaken to help you make intelligent
decisions and conclusions pertaining to the research problem. Measures of association
and inferential statistics are the main techniques used in this endeavour. These are
discussed in more detail below:
(i) Measures of association
These measures double up as descriptive and inferential statistics. They relate to the
relationship between two or more sets of data. This relationship is most vividly shown
through the use of scatter graphs, which clearly bring out the nature and pattern of
associations. The positions of the correlation coefficients between the variables can be
summarized as follows:
(a) When variables are in perfect association, i.e. an increase in one is accompanied by a
proportional increase in another, then there is a perfect positive correlation. In this
case the correlation coefficient is +1.
(b) When there is no correlation between variables, the correlation coefficient
is 0.
(c) Correlation coefficients between 0 and +1 can be described using descriptive
words such as ‘strong’ correlation or ‘weak’
correlation.
(d) When variables are in a perfect negative correlation, an increase in one is accompanied by
a proportional decrease in the other. In this case the correlation coefficient is –1.
NB* Please bear in mind that correlation does not by itself establish causation, even
though causation may be present.
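The three reference points above (+1, 0 and –1) can be illustrated with a small sketch that computes Pearson's correlation coefficient directly from its definition. The function name `pearson_r` and the data are hypothetical and are chosen only so that the three cases come out cleanly.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's product moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

nights = [1, 2, 3, 4, 5]                 # hypothetical length of stay
spend_up = [50, 100, 150, 200, 250]      # rises in proportion to nights
spend_down = [250, 200, 150, 100, 50]    # falls in proportion to nights
unrelated = [2, 1, 3, 1, 2]              # no linear relationship with nights

r_positive = pearson_r(nights, spend_up)
r_negative = pearson_r(nights, spend_down)
r_none = pearson_r(nights, unrelated)
print(round(r_positive, 3), round(r_negative, 3), round(r_none, 3))
```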
(ii) Inferential statistics
Inferential statistics are used to draw inferences or conclusions on
whether certain situations are by chance or are reflective of the true situation. However,
the decision on the statistical tests you will use should be taken at the start of the data collection
process. This is because certain statistical tests only work with certain data collection
techniques and the resultant types of data collected, e.g. nominal, ordinal and interval data.
The sample size and the population from which you collect the sample can also
influence the type of statistical tests to be used. At this point, we would like to reiterate
that a detailed outline of how the various statistical values are computed will not be
given in this paper. Rather, situations and examples where these statistical tests can be
carried out will be given. You are therefore advised to consult the numerous textbooks on
statistics and research methods for a step-by-step derivation of these statistical values.
There are basically two types of tests that you can apply to tourism research viz
parametric and non-parametric tests. Parametric tests are tests that are used to establish
the correlation between continuous random variables. These tests require some
assumptions about the population before application. As an example, one has to assume that
the data is normally distributed. These tests only use interval and ratio data.
Non-parametric tests make no such assumptions about the distribution of the population.
These tests work with ordinal
and nominal data. The various types of parametric and non-parametric tests are
summarized in Table 3.
Table 3: Summary of parametric and non-parametric tests
PARAMETRIC TESTS
The student’s t-test
This test is used to compare means from small samples, each usually less than 30.
The null hypothesis states that there are no significant differences between the means.
There are various types of t-tests depending on whether one is using paired or
unpaired sample means.
Analysis of variance (ANOVA)
This test is used when one wants to compare two or more sample means to
measure significant differences.
Related t-test
This test compares the actual size of the differences between matched scores in each
condition or time. To employ the test, one has to assume that the data was taken from
a random sample, that the samples were taken independent of each other and that
the population was normally distributed.
Pearson’s product moment correlation coefficient
This test determines the degree to which two variables vary together. It is applicable
to ratio data drawn from a normal distribution.
Regression analysis
This is an advanced form of correlation analysis, which can be used to predict
values of one variable, knowing the other. With regression analysis one can estimate
the functional relationship between two variables, and so estimate the value of one
when only the other is known.
NON-PARAMETRIC TESTS
The Chi-square test
This test helps to compare the differences between the observed and the expected
frequencies. It therefore deals with grouped or categorical data. Its values reveal how
close the observed frequencies of the categories are to the corresponding
expected frequencies of the same categories. Chi-square tests use nominal data.
The sign test
This test compares the differences of pairs of nominal scores from repeated measures.
In such cases, the null hypothesis states that there is no difference between the
scores. The alternative hypothesis states that there is a difference.
Wilcoxon signed rank test
This test uses ordinal data from pairs of scores from repeated measures. The same
null and alternative hypotheses as in the sign test are used. The main difference with
the sign test is that the differences in the scores measured are placed in rank order.
Mann Whitney U-test
This test is used to test whether ordinal data collected from two different samples,
taken independently of each other and at random, differ significantly. In this
case, the scores in each sample are compared by ranking the individual scores
and determining whether the ranks are evenly distributed or not. If the distribution
of the ranks is lop-sided, real differences exist between the samples.
Spearman’s rank order correlation coefficient
This is used to determine the degree of association between two sets of ranked
data. The test is more applicable to ordinal data from randomly selected samples. Just
like the other correlation coefficients discussed under item 3.5 of this booklet, its
value ranges between +1 and –1 and the interpretation of these values is as
explained in figure 6.
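To make the comparison of observed and expected frequencies concrete, the sketch below computes the Chi-square statistic for a hypothetical survey in which 60 visitors chose between three attractions and the null hypothesis expects equal popularity. The data and the function name `chi_square` are assumptions made for the example; the figure 5.991 is the standard 5% critical value for 2 degrees of freedom found in statistical tables.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [18, 22, 20]   # hypothetical counts for three attractions
expected = [20, 20, 20]   # equal popularity under the null hypothesis

statistic = chi_square(observed, expected)
degrees_of_freedom = len(observed) - 1
CRITICAL_5_PERCENT_DF2 = 5.991   # from standard Chi-square tables

reject_null = statistic > CRITICAL_5_PERCENT_DF2
print(statistic, degrees_of_freedom, reject_null)
```

Here the statistic is far below the critical value, so the observed frequencies are close to the expected ones and the null hypothesis is not rejected.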
(iii) Computer assisted data analysis and presentation
In this case, the computer helps you to make intelligent decisions and conclusions pertaining to the
research problem using measures of association and inferential statistics. The measures of
association bring out the relationship between two or more sets of data. Using inferential
statistics, the computer can test whether certain situations
are by chance or are reflective of the true situation. However, please be reminded that the
statistical tests that the computer can perform will depend on the data collection process.
This is because certain statistical tests only work with certain data collection techniques
and the resultant types of data collected, e.g. nominal, ordinal and interval
data. The sample size and the population from which you collect the sample can also
influence the type of statistical tests to be used.
Basically, the computer can perform two types of tests on your data, viz. parametric and
non-parametric tests. As already noted, parametric tests require some assumptions about the
population (for example, that the data is normally distributed) and only use interval and
ratio data, while non-parametric tests make no such
assumptions and work with ordinal and nominal data. The
various types of parametric and non-parametric tests have already been summarized in
Table 3.
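The dependence of the test on the type of data can be sketched as a simple lookup. This is a deliberately simplified reading of Table 3, not a complete decision procedure: the design labels ("frequencies", "repeated measures", "independent samples", "association") and the function name `suggest_test` are assumptions introduced for the example.

```python
# Keys are (level of measurement, research design); values are tests named in Table 3.
TEST_GUIDE = {
    ("nominal", "frequencies"): "Chi-square test",
    ("nominal", "repeated measures"): "Sign test",
    ("ordinal", "repeated measures"): "Wilcoxon signed rank test",
    ("ordinal", "independent samples"): "Mann Whitney U-test",
    ("ordinal", "association"): "Spearman's rank order correlation coefficient",
    ("interval/ratio", "independent samples"): "Student's t-test",
    ("interval/ratio", "repeated measures"): "Related t-test",
    ("interval/ratio", "association"): "Pearson's product moment correlation coefficient",
}

def suggest_test(level, design):
    """Look up a candidate test; unknown combinations return a reminder instead."""
    return TEST_GUIDE.get((level, design), "no suggestion; consult a statistics textbook")

print(suggest_test("ordinal", "independent samples"))
print(suggest_test("nominal", "association"))
```

Sample size and the assumptions of each test still have to be checked by the researcher before a suggested test is applied.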
NB* Most students who are not well versed with these computer-based packages have tended to
outsource this service and hence have perpetuated their ignorance of the use of these packages.
This behaviour should be condemned. What is required is for you to become conversant with these
programs through practice. In this case, pilot data can be used for such purposes or you can
adopt a sitting by “Nelly” approach, which is equivalent to on the job training. The use of
computer assisted data presentation and analysis should be wholly encouraged among tourism
students.
2. THE PRESENTATION OF DATA
Step 1: Write a suitable introduction
As usual, this introduction should capture the intention of the section.
Step 2: Conceptually come up with the sub-headings for this
chapter
The conceptual sub-headings you come up with should be based on the research
questions and objectives of the study. It is not advisable to turn all the questions in the
questionnaire into sub-headings, because some of the questions in the
questionnaire only assist in logically and systematically unlocking the
answers to the actual research questions. In addition, suppose the questionnaire had 50
questions: would it be logical to have 50 chapter 4 sub-headings? Certainly not!
Step 3: Present the data for each of the sub-headings you formulated in Step 2
In the presentation of your results, the following points are worth noting:
• The general layout of your results must be clear and pleasing to the eye.
• Even ordinary laymen should understand the results.
• Diagrams, graphs and charts should be appropriately used.
• Identify trends, extremes etc. and comment on them.
• Blend the trends, extremes etc. with existing related literature and new or established
  theories.
• Draw general conclusions for each of your sub-headings in this chapter. These are
  the conclusions you will want to refer to in the last chapter of your dissertation.
• Relate your results to their effects on professional practice.
• Avoid including general points which, although important, do not arise
  from your research results.
• Where various methods of collecting data were employed, as is usually the case,
  the results from these methods should be blended and discussed under the
  appropriate sub-headings for the chapter.
• The use of subjective quantitative terms such as ‘most’, ‘some’, ‘several’, ‘a large
  number’ etc. is wholly discouraged. As much as possible, these terms should be
  replaced by exact figures, percentages etc. as evidence that the variables in
  question were measured.
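As a small illustration of the last point, the sketch below turns a count into an exact statement that can replace a word such as ‘most’. The helper name `exact_share` and the figures are hypothetical.

```python
def exact_share(count, total, noun="respondents"):
    """Phrase a finding with its exact count and percentage."""
    percentage = 100 * count / total
    return f"{count} of {total} {noun} ({percentage:.1f}%)"

# Instead of writing "most respondents held a post-school qualification":
print(exact_share(9, 14))
```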
The general methods of representing the data
Use of tables – a table is a systematic arrangement of data or information in a format
that allows the reader to observe variations or trends and to make comparisons
(Blaxter and Hughes 1996)
Use of figures – bar charts, scatter graphs, pie charts, line graphs and maps. In this
case, you should examine the data carefully and decide what should be illustrated
and which type of illustration best suits the data or the information you wish to
present.
Descriptions/narratives
The various specific methods of representing data and when to use them
• Tables are good for presenting precise numerical data. In this case, you should use
  a table where the exact value is important or where there is need for strict
  testing of results using statistical methods.
• Figures and photos give vivid evidence of research findings. They present
  information in a way that is easy to read and understand quickly, and in a
  form which would otherwise need many words: ‘a picture is
  worth a thousand words’. They must be simple and clear so that readers can
  immediately get the message.
• Do not use a graph to duplicate information already in a table.
• Use a line graph if you wish to demonstrate the relationship between two data sets
  or a dynamic comparison over time.
• Use scatter graphs to portray the relationships or the strength of association
  between two or more variables. In most cases, the variables will consist of a
  dependent and an independent variable. The relationship between these variables
  can be positive, where an increase in the value of one is accompanied by an increase
  in the value of the other. The opposite is true where the relationship is negative.
  Scatter graphs only give a rough estimate of the association. If you want to
  measure the strength of an association in a more precise manner, you then
  have to calculate the correlation coefficients, which at this point we would like to
  assume that you are already familiar with.
• Use a flow chart to explain a process or system.
• Use a pie chart to show components that make up the overall total.
• Use histograms to show frequency distributions of observations for each class of
  variable.
• Use simple bar graphs to show changes in the totals of the dependent variables
  caused by changes in the independent variables. The changes in the dependent
  variables will be shown by changes in the height of each bar.
• Component bar graphs are mostly used where changes in the total and an
  indication of the size of each component figure are required. There are two types
  of component bar charts, viz.:
  (i) Component bar graphs (actuals), in which the length of each component
      of the whole bar indicates actual figures.
  (ii) Percentage component bar graphs, in which each component
      length of a bar indicates the percentage of the overall bar
      total.
• Multiple bar graphs are mostly used where only changes in the actual values of
  the component figures are required and the overall total is of no particular
  importance (Kumar 1996).
Fortunately, most of the methods that have been described in this section can be produced
in a split second by the computer as long as the data for the variables in question have
been captured.
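The difference between the two types of component bar graph comes down to a simple calculation, sketched below. The visitor counts and category names are hypothetical; the sketch only computes the figures behind the bars and prints them as text rather than drawing a chart.

```python
# Hypothetical visitor counts by purpose of visit for two years (the "actuals").
visitors = {
    "2009": {"leisure": 120, "business": 60, "other": 20},
    "2010": {"leisure": 150, "business": 90, "other": 60},
}

def percentage_components(bar):
    """Convert the actual component figures of one bar into percentages of its total."""
    total = sum(bar.values())
    return {part: 100 * value / total for part, value in bar.items()}

shares = {year: percentage_components(bar) for year, bar in visitors.items()}
for year, parts in shares.items():
    line = "  ".join(f"{part} {pct:.0f}%" for part, pct in parts.items())
    print(f"{year}: total {sum(visitors[year].values())} | {line}")
```

In the actuals version each bar has a different height (200 and 300 visitors here), while in the percentage version every bar adds up to 100 and only the proportions change.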
Hints on the use of tables and figures in text
• Tables and figures should never precede the text that refers to them. This means that
  you should not just plant figures or tables without first making reference to them.
• Tables and figures should always immediately follow the text that refers to them
  unless they are part of the appendix.
• Figures and tables should be used to aid understanding and bring out the main
  findings of the research.
• Do not present the same data in different ways for no apparent reason.
• Tables and figures should have titles derived from or related to the research
  problem and be numbered in Arabic numerals. Figures should be fully labeled.
• Tables or figures sourced from secondary data sources must be acknowledged.
• The figure or table title and the relevant figure or table should appear on the same
  page.
• Table or figure titles should preferably be typed in lower case, done in bold,
  aligned left and placed consistently either below or above the table or figure
  throughout the text.
• Tables should not have unnecessary column and row spaces and should not
  protrude beyond the left and right margins of the text.
• A table or figure which cannot fit on the remaining space of a page should be
  placed on the next page. Put text to cover the remaining page space.
• Only those tables and figures that represent relevant information to the research
  problem should be included in the text.
• All tables and figures should be neat, simple, accurate and well labeled.
3. CONCLUSION
This paper has outlined the various ways of analysing and presenting both qualitative and
quantitative data. It has shown that there are numerous ways of analysing and
presenting data, depending on whether the data is qualitative or quantitative.