RESEARCH METHODOLOGY FOR
POSTGRADUATE STUDENTS
Prepared by:
DIRECTORATE OF RESEARCH, PUBLICATIONS AND POSTGRADUATE STUDIES
THE OPEN UNIVERSITY OF TANZANIA
DECEMBER, 2010
TABLE OF CONTENTS
MODULE ONE
INTRODUCTION TO RESEARCH
LECTURE ONE
1.1 Introduction
1.2 Learning Outcomes
1.3 What is Research?
1.4 What is a Theory?
1.5 Importance of Researches
1.6 The Link Between Theory and Research
1.7 Research Hypothesis and Theory
1.8 Deduction and Induction Theory Development Approaches
1.9 Types and Strategies of Researches
1.10 Summary
1.11 Review exercise
1.12 References
LECTURE TWO
2.1 Introduction
2.2 Learning outcomes
2.3 Developing a Research Topic and Problem
2.4 Planning for the Research Project
2.5 The Research Development Process
2.6 Research Ethics
2.7 Summary
2.8 Review Exercise
2.9 References
LECTURE THREE
3.1 Introduction
3.2 Learning outcomes
3.3 Research Proposal
3.4 Summary
LECTURE FOUR
4.1 Introduction
4.2 Learning outcomes
4.3 Research Methodology
4.4 Time Frame and Budget table and Ways of Making Decisions
4.5 References
4.6 Appendices
4.7 Summary
4.8 Review exercise
MODULE TWO
LITERATURE REVIEW AND REFERENCING
LECTURE ONE
1.1 Introduction
1.2 Learning outcomes
1.3 Literature sources
1.4 Importance of conducting literature search effectively
1.5 Planning your literature search strategy
1.6 Plagiarism
1.7 Summary
1.8 Review exercise
1.9 References
LECTURE TWO
2.1 Introduction
2.2 Learning outcomes
2.3 What is critical literature review?
2.4 Importance of critical literature review
2.5 Writing a critical review
2.6 Structure of the critical review
2.7 Identification of research gaps (Concluding the literature review)
2.8 Conceptual and theoretical frameworks
2.9 Summary
2.10 Review exercise
2.11 References
MODULE THREE
RESEARCH DESIGN AND DATA COLLECTION METHODS
LECTURE ONE
1.1 Introduction
1.2 Learning outcomes
1.3 Need for research design
1.4 Features of a good research design
1.5 Types of Research Designs
1.6 Summary
1.7 Review exercise
1.8 References
LECTURE TWO
2.1 Introduction
2.2 Learning outcomes
2.3 Quantitative Data Collection Methods
2.3.1 Sources of data
2.3.2 Steps in data collection
2.3.3 Need for correct sampling
2.3.4 Methods of data collection
2.3.4.2 Collection of primary data
2.4 Qualitative Data Collection Techniques
2.4.1 Observational and quasi-observational techniques
2.4.2 Projective techniques
2.4.3 In-depth interviews
2.4.4 Direct Observation
2.4.5 Standardized tests
2.4.6 Case studies
2.5 Other important aspects in data collection
2.5.1 Data storage
2.5.2 Ethical issues/considerations in data collection
2.5.3 Challenges faced by researchers in data collection
2.6 Summary
2.7 Review exercise
2.8 References
MODULE FOUR
DATA ANALYSIS METHODS
LECTURE ONE
1.1 Introduction
1.2 Learning outcomes
1.3 Basic ideas about data analysis and presentation
1.4 Methods of quantitative data analysis
1.5 Review exercises
1.6 Summary
1.7 Additional review exercises
1.8 References
LECTURE TWO
2.1 Introduction
2.1 Learning outcomes
2.2 Procedures for processing and displaying of qualitative data
2.3 Drawing and verifying conclusions
2.4 Reporting qualitative data
2.5 Further strategies for testing or confirming qualitative findings to prove validity
2.5 Summary
2.6 References
LECTURE THREE
3.1 Introduction
3.2 Learning outcomes
3.3 Data analysis by computer
3.4 Summary
3.5 References
MODULE FIVE
RESEARCH REPORT WRITING
LECTURE ONE
1.1 Introduction
1.2 Learning outcomes
1.3 Rationale for Report Writing
1.4 How to Get Started
1.5 Preliminary Considerations
1.6 Types of Research Reports
1.7 Components of Research Report
1.8 Writing Style
1.9 Layout of the Report
1.10 Drafts
1.11 Review exercises
LECTURE TWO
2.1 Introduction
2.2 Learning outcomes
2.3 Citation
2.4 References
2.4.1 Importance of References
2.4.2 Reference List
2.4.3 Plagiarism
2.4.4 What should you include in reference
2.4.5 How to Collect and Organise References
2.4.6 Author Date Referencing Styles
2.4.6.1 The Harvard Style
2.4.6.1 APA STYLE
2.5 Appendices
2.6 Bibliography
2.7 Review exercise 1
2.8 Review exercise 2
2.9 References
MODULE ONE
INTRODUCTION TO RESEARCH
LECTURE ONE
RESEARCH PHILOSOPHY
(By Dr. M. Kitula)
1.1 Introduction
In this lecture, you will learn about various issues related to the philosophy of research. The
lecture provides you with the basic facts about research. It starts with definitions of research
and theory. You will also learn about the importance of research, the relationship between theory
and research, the features of a theory, and the two approaches to theory testing, namely
deduction and induction.
1.2 Learning Outcomes
At the end of the lecture, you should be able to:
i) Define the concepts of research and theory
ii) Explain the importance of research
iii) Explain the components and features of a theory
iv) Discuss the link between theory and research
v) Discuss the theory testing procedure (deductive and inductive analysis)
1.3 What is Research?
Research is a means of getting answers to our questions. It entails following a framework of
principles based on procedures, methods and techniques that have been tested for validity and
reliability and that are objective and free from prejudice. Once these requirements are adhered
to, one can say that research is being done. Research, therefore, is not necessarily an activity
that involves complicated procedures such as complex statistics and computers. However, no matter
how simple the research may be, it has to adhere to the research procedure, which is what
distinguishes a research activity from a non-research activity. Various people have defined
research in many different ways. Some of the definitions are presented below.
Kerlinger (1986) defined research as a scientific, systematic, controlled, empirical and critical
investigation of propositions about the presumed relationships among various phenomena.
Woody (1990) defined research as an activity which comprises defining and redefining
problems, formulating hypotheses or suggested solutions, collecting, organizing and evaluating
data, making deductions and reaching conclusions.
On the other hand, Ngechu (1998) defined research as a logical, purposeful, formal and critical
activity, as a systematic step-by-step process, and as a method of science which identifies a
problem, gathers data, and analyses and interprets the data, leading to conclusions and/or to
further research questions.
It can generally be said that research is a scientific process of identifying an issue or problem,
collecting information, processing data and reporting the findings. The scientific process is
used to address issues or problems of concern in the community or society at large, to establish
concepts as theories, and to discover new things such as a cure for a disease or new industrial
products.
1.3.1 Features and Characteristics of a Research
The definitions of research imply that research must have specific features. That is, research
has to be scientific, logical, systematic and planned. Research is scientific because it is
logical and systematic, has a plan for collecting data, and has a theory that guides it. It is
logical because it follows a path of reasoning which is analytical and rational, and it is
systematic because it follows a specific procedure.
An activity of inquiry qualifies to be called research if it has certain standard characteristics.
These characteristics are many, but only six of them are mentioned here, as presented by Kumar
(2005). He says that research must be controlled, systematic, rigorous, valid and verifiable,
empirical, and critical.
1.4 What is a Theory?
A theory is a set of ideas or opinions which explains the way things are and/or why they exist.
These ideas and/or opinions have to be explained systematically and scientifically, based on
facts and rules related to the phenomena in question. A theory has various components, but three
are the most important. These are concepts, variables and statements, which, when verified,
form theories (Turner, 1996).
1.5 Importance of Researches
Research is a means of getting answers to our questions. Both practical and theoretical
problems are solved as research describes situations, tests hypotheses, explores new ways of
thinking through rational decisions, and generates new ideas. Research thus performs three
tasks: to describe, to explore and to explain.
Based on this explanation, research is important in various ways, which include:
a. Generation of new knowledge and theories
b. Providing solutions to various societal problems and concerns
c. Developing new methods of inquiry through the experiences of methods used in investigation
d. Providing peaceful methods of challenging authorities on issues of concern of citizens
e. Educating the population on issues that touch their welfare and development in general
Based on the importance of research as indicated above, the main users of research findings are
planners and policy makers, who see to it that development activities are undertaken effectively
while addressing the recommendations of research findings. Other groups which use research
findings are academicians, who always seek new knowledge and search for facts that help to
establish the authenticity of theories through the deductive process; students in higher learning
institutions; and owners of industries, who constantly need to change, improve or create new
products in order to keep pace with market demands.
1.6 The Link Between Theory and Research
Our day-to-day life is largely dictated by theories that have led to sets of principles that
guide our way of living. The activities which we routinely do involve the process of creating
things, ideas and opinions, applying concepts in practice, and evaluating what we do. Kolb (1991)
said that an event that occurs leads an individual to provide explanations of how and why it
happened in the way it did. These explanations or statements lead to the generation of concepts,
guiding principles or even hypotheses that can be extrapolated to new events. A learner then
applies the guiding principles in real life as part of the way of life. Through this application,
the extrapolated guiding principles are tested. In the testing process, whether the rule was
received or was generated out of prior experience and reflection, a new situation is created and
new experiences arise.
The cyclic process described above, as narrated by Kolb (1991), demonstrates a process of
developing a theory. The evaluative explanatory statements built around what is going on
around us provide a trend of facts that gives strength to the development of concepts, which
pave the way to theories. An example is a statement such as “Street children are a consequence
of broken marriages”.
1.7 Research Hypothesis and Theory
Theory and hypothesis are in most cases used interchangeably. This is so because they are
closely related. A theory consists of hypotheses developed to explain a particular phenomenon.
Each hypothesis makes an assertion about the relationships among the variables of several
concepts. Therefore, when related concepts are put together, they build up a complete logical
statement which we can call a theory. Concepts are like building blocks: when put together, they
make a house, which is a theory.
Building up a theory entails following a logical model of scientific inquiry. There are mainly
two such models. These are the deductive and the inductive approaches to theory making.
1.8 Deduction and Induction Theory Development Approaches
1.8.1 Deduction Reasoning Approach
This is a method in which a researcher identifies theories or set principles (laws) and takes
them through a logical, systematic and scientific process to find out the authenticity of the
theory or set principles. This type of procedure goes hand in hand with the left-hand side of
Kolb’s learning cycle. An example of a theoretical statement that may require a deductive
approach of inquiry is: “Short people are always argumentative to draw attention for
recognition and psychological satisfaction”.
In undertaking a deductive approach of inquiry, there are crucial stages to follow. The first is
to have in place a theory or concepts of interest which have to be operationalized. This is
followed by laying down the rules for making observations and then by the operationalization
itself, that is, the actual investigation. It is at this stage that clear and specific guidance
about what to observe and how to observe the variables during the investigation is established,
thus creating a standardized procedure. Once the investigation is done and the analysis is
completed, the findings are compared with the theory statement on which the investigation was
based, as a test of whether the theory is authentic or has lost its worthiness. The deduction
process is shown in the figure below.
[Figure: diagram with the elements THEORIES, HYPOTHESIS, OBSERVATIONS and EMPIRICAL
GENERALIZATION, with the two sides of the diagram labelled DEDUCTION and INDUCTION]
Figure 1.1.1: Deductive and Inductive Methods
Source: Wallace (1971)
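The stages of the deductive approach described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions and not a prescribed procedure: the observations are invented, the cut-off for "short" and the margin required to support the theory are arbitrary choices, and a real study would use a proper statistical test rather than a simple comparison of means.

```python
from statistics import mean

# Hypothetical, invented observations: (height in cm, argumentativeness score 0-10).
OBSERVATIONS = [
    (150, 6), (155, 4), (158, 7), (162, 5),
    (170, 6), (175, 5), (180, 7), (185, 4),
]

SHORT_BELOW_CM = 165   # assumed operational definition of "short"
MIN_DIFFERENCE = 1.0   # assumed margin the findings must reach to support the theory


def split_groups(observations, cutoff):
    """Apply the observation rule: separate the scores of short and other people."""
    short = [score for height, score in observations if height < cutoff]
    others = [score for height, score in observations if height >= cutoff]
    return short, others


def theory_supported(observations, cutoff, margin):
    """Compare the findings with what the theory statement predicts."""
    short, others = split_groups(observations, cutoff)
    difference = mean(short) - mean(others)
    return difference >= margin, difference


supported, difference = theory_supported(OBSERVATIONS, SHORT_BELOW_CM, MIN_DIFFERENCE)
print(f"Mean difference: {difference:.2f}; theory supported: {supported}")
```

With these invented figures the two groups have the same mean score, so the theory statement is not supported. The point of the sketch is the order of the steps: the concept is operationalized first, the rule of observation is fixed before the data are examined, and only then are the findings compared with the theory.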
1.8.2 Induction Reasoning Approach
The approach starts with an idea which has to be consolidated, and then moves from observation
of the empirical world to the construction of explanations, which leads to theory building. It
is the opposite of the deduction approach and belongs to the right-hand side of Kolb’s learning
cycle. Glaser and Strauss (1967) reiterated that theories are worthless if they are not grounded
in observation and experience, hence the birth of grounded theory. In this approach, theories
develop inductively out of systematic empirical research and are grounded in people’s own
voices, views, opinions and actual lives.
1.9 Types and Strategies of Researches
1.9.1 Types of Researches
There are two main categories of research. These are Social Science and Natural Science research.
Within these two categories there are two main types of research, namely Basic and Applied/Action
research. These two main types in turn have branches, which are based on their application,
their purpose, the orientation of the principal researcher (i.e. the practitioner), and the
nature of the data being collected and the related methods of data collection.
1.9.1.1 Basic and Applied/Action Researches
Basic research addresses questions which seek more knowledge, while Applied/Action research
searches for information which addresses operational problems.
The characteristics of both Basic and Applied/Action research are shown in Table 1.1.1 below.
Table 1.1.1: Characteristics of Basic and Applied Researches
S/N  | Basic Research | Applied/Action Research
i.   | Deals with research which advances knowledge. | Identifies existing societal problems and seeks solutions to the problems.
ii.  | Seeks to solve theoretical dilemmas. | Seeks to obtain operational information, skills, knowledge and attitudes.
iii. | Labours to prove the authenticity of existing theories and to develop new ones. | Evaluates projects, programmes, plans and activities; monitors systems and identifies immediate problems that require immediate solutions.
1.9.1.2 Branches of Research Types
The various branches of research types have titles which are related to their application, their
purpose, the orientation of the principal researcher (for example a doctor or an engineer), and
the nature of data collection. The different branches include:
(a) Theoretical/Pure Research
This is concerned with solving theoretical puzzles which scholars have identified or created
within their disciplines to extend knowledge.
(b) Evaluative Research
This deals with the evaluation of performance of programmes in various fields such as education,
health, business, and accountability expenditures.
(c) Action Researches
This type of research is carried out collaboratively by members of various organizations, for
example hospitals and the manufacturing companies that produce the products used by the
hospitals.
(d) Critical and Feminist Researches
These seek the truth about assumptions that are taken for granted. The issues addressed include
gender inequality and oppressive patriarchal systems.
(e) Qualitative and Quantitative Researches
Qualitative and quantitative methods complement each other.
(f) Formative Researches
Practitioners who are professionals, such as lecturers, doctors and engineers, seek to improve
their practice and generate new findings.
(g) Participatory Researches
Based on the theory of Paulo Freire, which focuses on the knowledge of learners and seeks to
enhance the ability of the poor to generate and control their own knowledge and to plan and
participate in implementation.
(h) Summative Research
Similar to evaluative research. It aims at evaluating projects or programmes when their
activities are being wound up.
1.9.2 Strategies of Researches
Strategies of research involve the style of the research process. The style is, in turn,
controlled by various factors, which include the size and scope of the coverage and the type of
data being sought. Other factors may include funding and time limitations. The commonly used
strategies are presented below:
(a) Experimental Strategy
The strategy is used when researchers want to establish causal connections between variables.
(b) Survey Strategy
Used in studies of large samples aiming at producing generalization about populations.
(c) Case Study Strategy
Applied in studies which involve the examination of a single instance of some broader class
of phenomena in order to generate a rich and thorough understanding of the situation.
(d) Ethnography Strategy
Used for researches that involve full participation of community members for purpose of
observing the behaviour, social structure and relations of the people in their natural settings.
(e) The Action Research Strategy
Used in researches for the purpose of promoting changes in organized social practices and
development of knowledge of these changes through various processes and practices.
(f) Longitudinal Research Strategy
Involves carrying out research at different points in time to check for indicators of change or trends in
the study subject. The study can be repeated on the same issue, for example every year or every
five years.
(g) Cross-Sectional Research Strategy
Involves studying a broad range of issues at a single point in time. An example of such
research is a study of the levels of migration, mortality and fertility in a district or region at one
particular time.
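The contrast between the longitudinal and cross-sectional strategies can be illustrated with a small data sketch. The example below is only an illustration: the district names, years and fertility figures are invented, and the function names are hypothetical.

```python
# Hypothetical records: (district, year, fertility_rate). All values are invented.
records = [
    ("District A", 2000, 5.6),
    ("District A", 2005, 5.2),
    ("District A", 2010, 4.9),
    ("District B", 2010, 5.4),
    ("District C", 2010, 4.1),
]


def longitudinal(records, district):
    """Same subject followed over time: (year, value) pairs in year order."""
    return sorted((year, value) for d, year, value in records if d == district)


def cross_sectional(records, year):
    """Many subjects at one point in time: {district: value} for that year."""
    return {d: value for d, y, value in records if y == year}


trend = longitudinal(records, "District A")   # one district across several years
snapshot = cross_sectional(records, 2010)     # several districts in a single year
```

The longitudinal view keeps the subject fixed and lets time vary, which is what allows change or trend to be observed; the cross-sectional view keeps time fixed and lets the subjects vary.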
1.10 Summary
In this lecture you have learned the definitions of two concepts, research and theory. Research is
defined as a scientific, systematic, controlled, empirical and critical investigation of
propositions about presumed relationships among various phenomena. A theory is a set of ideas or
opinions which explain the way things are and/or why they exist. These ideas and/or opinions
have to be explained systematically and scientifically, based on facts and rules related to the
phenomena in question.
You also learned about the characteristics and importance of research: research adds
knowledge and solves societal problems, while theory guides research. You were
also introduced to the two main models of scientific inquiry, the deductive and
inductive models.
Lastly, you learned about the types and strategies of research. You were informed that all
research is either natural science or social science research, and that there
are two main types of research, basic and applied/action research. These main
types have various branches. Strategies of research were also
covered, which include the experimental, survey, case study, ethnography, action research,
longitudinal and cross-sectional research strategies.
1.11 Review exercise
(i) Discuss the various definitions of research given by various authors. In your opinion,
how would you define research?
(ii) Differentiate between a theory and a hypothesis. What is the role of concepts in developing a
theory?
(iii) Discuss the deductive and inductive models of scientific inquiry.
(iv) What is the difference between research types and research strategies? Explain.
1.12 References
• Kumar, R. (2005). Research Methodology: A Step by Step Guide for Beginners. New Delhi: Sage
Publications India Pvt Ltd. Pp. 6-14
• Kothari, C. R. (2010). Research Methodology: Methods and Techniques. New Age
International Limited Publishers. Pp. 1-24
• Gill, J. and Johnson, P. (2002). Research Methods for Managers. London: Sage Publications.
Pp. 13-45
• David, M. and Sutton, C. D. (2004). Social Research: The Basics. London: Sage Publications.
Pp. 3-70
LECTURE TWO
RESEARCH PROCESS AND RESEARCH ETHICS
(By Dr. M. Kitula)
2.1 Introduction
In lecture one you learned about two main research concepts, research and theory.
You also learned about the importance of research, the characteristics of research, the link
between research and theory, the models of scientific inquiry, and the types and strategies of research.
In this lecture, you will learn about the research process and research ethics.
2.2 Learning outcomes
At the end of this lecture, you should be able to:
• Explain how one starts to develop an idea for a research topic
• Narrate plans to develop a research proposal
• Write a research proposal
• Discuss issues related to research ethics
2.3 Developing a Research Topic and Problem
The task of developing a topic for a research project sometimes becomes a problematic exercise.
This may be because of the many related ideas in the current literature; because the idea is
so popular that you cannot be sure your topic will not duplicate studies already done that you are
not aware of; or because the idea you have is still hazy and not yet very specific. For
inexperienced young researchers, it can involve a great deal of floundering around and
sleepless nights spent trying to figure out exactly what the topic and its related research
problem should be. Floundering around is inevitable before you establish a topic. But as you get involved
in this process, you should be aware of the various sources of research topics in order to get
more exposure.
2.3.1 Avenues for Sourcing Research Topics
Ideas that lead to the formulation of a research topic are many. Those with more exposure,
such as academicians, staff in research institutes and policy makers, have an advantage over
others, as they come across a variety of literature and encounter ideas through conferences, speech writing for
politicians and the like. However, every researcher can access sources of research topics
through various means, as indicated by Gill and Johnson (2002):
i) A research topic can be sourced from the workplace, through your personal
interaction with documents and issues raised by colleagues
ii) A research topic can be sourced from academic documents and professional
journals
iii) A research topic can be sourced through advertisements in the media, news
broadcasts and/or statements by authorities
iv) A research topic can be sourced from reports of conferences or public
speeches
v) A research topic can be sourced from reports of projects and programmes
In addition to what Gill and Johnson suggested, a research topic can be sourced from reading
literature in the library. Gay (1992) also said that a research topic can evolve from personal
interest in an issue; that is, an individual can create his/her own topic and conduct research on it.
2.3.2 Procedure to Follow in Generating a Research Topic
Topics for research can be many, as they can come from speeches given in public or in the media,
or from issues of concern within your community or workplace. In order to come up
with a topic that has the characteristics of a good research topic, you need to do the following:
• Brainstorm over the various ideas you have in mind. You can even share your ideas with
friends to identify the topics that interest you
• List the various topics you have identified
• Analyse the listed topics and, using the characteristics of a good research topic, choose
the most viable one as your research topic.
2.3.3 Features of a Good Research Topic
A research topic must have certain features which qualify it as a viable topic for a systematic
and logical search for information. Before a researcher proceeds to prepare a plan for
the research project, consideration has to be given to whether the topic has the qualities of a
research topic. Various characteristics of a good research topic have been identified by
scholars; Gill and Johnson (2002) indicated some of these characteristics, which
state that:
• A research topic should allow access to the places where information can be
obtained
• A research topic should have achievable objectives
• A research topic should be capable of attracting funding agencies
• A research topic should have value, in the sense that its findings will add
knowledge and/or provide solutions to community problems
• A research topic should have symmetry of potential outcomes. That is, it should be a topic
which can produce valuable results despite the risks that a research project can face.
2.4 Planning for the Research Project
A plan for the research project has to be established after a researcher has decided on a viable
research topic. A plan for implementing the project is what we call a research proposal. In a
research proposal, the researcher translates the bright ideas into a statement which frames the
real problem that the researcher will address in the project. The ideas are further translated into a set
of research aims and objectives.
The plan which you prepare (the research proposal) makes the project easier to implement
and also easier to manage, as every activity is planned and the way to implement it is
indicated. Hence the importance of a research proposal in the research project.
2.5 The Research Development Process
There are many stages in the research development process, depending on the type of research, its style
and the orientation of the field of study. Kothari (2010) suggested seven steps of the research process,
while Kumar (2005) suggested eight steps. The two do not differ much,
and therefore I shall list Kumar's steps and present a chart of the steps by Kothari. Either of
the two sets of steps can be used as preferred.
2.5.1 Kumar’s (2005) Eight Steps of Research Process
These include:
• Formulating a research problem
• Conceptualizing a research design
• Constructing an instrument for data collection
• Selecting a sample
• Writing a research proposal
• Collecting data
• Processing data
• Writing a research report
2.5.2 Kothari’s (2010) Seven Steps of Research Process
The chart presenting the steps shows that the researcher has to define the research problem,
review the literature, formulate the hypothesis, design the research, collect data, analyse data,
and interpret and report.
[Figure: flow chart of the research process, with feedback (F) and feed forward (FF) links between the steps]
I. Define research problem
II. Review the literature (review concepts and theories; review previous research findings)
III. Formulate hypothesis
IV. Design research (including sample design)
V. Collect data (execution)
VI. Analyse data (test hypothesis, if any)
VII. Interpret and report
Where
F = feedback (helps in controlling the sub-system to which it is transmitted)
FF = feed forward (serves the vital function of providing criteria for evaluation)
Figure 1.2.1: Steps of Research Process
Source: Kothari, (2010).
2.6 Research Ethics
Ethical issues in research touch on all stakeholders of research: the respondents, the researcher,
the funding agencies and the users.
Institutions which deal with research, including higher learning institutions, are very sensitive
to issues of ethics in relation to research. To ensure that researchers observe ethical issues,
institutions normally have policy forms that commit researchers to observing ethics
in the questions used, in the samples taken (in basic/experimental research), in
interviews, and in the use of information obtained from respondents, so as to maintain
confidentiality.
2.6.1 Respondents and Data collection
On the side of respondents, researchers are obliged to inquire into the ethical issues of their
respondents before they design the questions to ask them, in order to avoid
offending them. They also have to be aware of the people's culture so that they can cope with
the cultural practices of the people at the time they are collecting data.
2.6.2 Seeking Consent from Respondents
Researchers are required to seek consent from respondents before they start collecting data. It is
unethical to collect data without the consent of the respondents and their respective leaders.
2.6.3 Providing Incentives to Respondents
Some researchers provide incentives to respondents for providing information. Some consider
this unethical. Providing incentives might lead respondents to give exaggerated information
in order to impress the researcher. In other cases,
some respondents.
2.6.4 Seeking Sensitive Information from Respondents
Researchers may sometimes need information which could be sensitive or confidential to
respondents. Requests for such information could offend, upset or embarrass them, or be
considered an invasion of privacy. Such information could include sexual behaviour, drug use,
shoplifting, age, salary, rape, battery and the like. The researcher needs to understand all these
ethical issues before embarking on data collection.
2.6.5 Confidentiality of Respondents’ Information
Information provided by respondents, some of which could be confidential, should not be
revealed to any other person unless there is consent from the respondent who provided the
information. Revealing information to others without the consent of the provider is unethical.
2.6.6 Researchers’ Bias on Information Provided and Incorrect Reporting
Sometimes the researcher can deliberately leave out information in order to hide facts and
protect some people's interests or his/her own. The report thus becomes distorted. This type of
manipulation is unethical.
2.6.7 Misuse of Information by Sponsors/Users of Research Findings
Having received the research findings, sponsors sometimes manipulate information to present a
different case, hence distorting the actual findings of the research project. Users can be
tempted to do the same. Treating research findings in this manner is unethical.
2.7 Summary
You have learned about the process of research and research ethics. Avenues for sourcing research topics
were discussed, which include workplaces, academic documents and professional journals, the
media, authority statements and public speeches, programmes and libraries.
You also learned about the procedure to follow in establishing a research topic, and that a
good topic has to be accessible, achievable, valuable and worthwhile enough to attract funds. The steps of the
research process were also discussed, as presented by both Kothari (2010) and Kumar (2005).
Lastly, the ethical issues related to research were discussed. These included
issues of confidentiality, consent, collection of sensitive information, bias, and misuse of
information.
2.8 Review Exercise
• Discuss the causes of difficulty in establishing the research topic and problem
• What is the rationale of research planning?
• Critique the various steps of the research process as presented by both Kothari and Kumar
• Why is it important for the researcher to ensure there is adequate understanding of the
culture and ethical issues of the study population before actual data collection? Explain.
• Explain how researchers, sponsors and users can misuse research findings.
2.9 References
• Kumar, R. (2005). Research Methodology: A Step by Step Guide for Beginners. New Delhi: Sage
Publications India Pvt Ltd. Pp. 6-14
• Kothari, C. R. (2010). Research Methodology: Methods and Techniques. New Age
International Limited Publishers. Pp. 1-24
• Gill, J. and Johnson, P. (2002). Research Methods for Managers. London: Sage Publications.
Pp. 13-45
• David, M. and Sutton, C. D. (2004). Social Research: The Basics. London: Sage Publications.
Pp. 3-70
LECTURE THREE
COMPONENTS OF RESEARCH PROPOSAL – PART 1
(By Dr. E. Swai)
3.1 Introduction
In lecture two you learned about developing an idea for your research topic. You also learnt
about issues related to research ethics. In this third lecture, you are introduced to the components
of research proposal. This topic is also covered in lectures four and five.
3.2 Learning outcomes
At the end of this lecture, you should be able to:
• Identify and list the components of a research proposal
• Explain each of the components of the research proposal
• Develop a research topic and statement of the research problem
• Develop a conceptual and/or theoretical framework
• Prepare a full research proposal
3.3 Research Proposal
A research proposal is a plan that guides the research project by indicating the strategy one has to
follow in doing the research project. It is a detailed plan showing the title, describing the
background of the research, and stating the research problem, the hypothesis/research questions, the
objectives, and the significance/justification of the research. A research proposal is a
serious statement of intent to reach your goal. In short, the research proposal is the roadmap of
any researcher, showing how to conduct the research. Some people think of their research
proposal as their clear star to guide them in a voyage of discovery; the proposal thus is used to
chart a course and avoid undesired detours. Others see the proposal as similar to a proposal to
live with a friend or life partner; it indicates a willingness to engage in a significant undertaking
that has consequences for both parties. A good research proposal includes arrangements for
check-points so that you can make changes when necessary, to help you stay on course given
the realities of life and the requirements of quality scholarship.
3.3.1 Components of Research Proposal
There are several elements in research proposal. The following are popular elements which
researchers address while preparing research proposals.
(i) A Title/Research Title/Topic
(ii) Background or Introduction of research problem
(iii) Statement of the problem
(iv) Objectives
(v) Research Questions/Hypotheses
(vi) Justification/Rationale/Significance
(vii) Conceptual Framework and Literature Review
(viii) Research Design
(ix) Research Methods
(x) Data Analysis Methods
(xi) References
(xii) Budget and time frame
(xiii) Certification
(xiv) Appendices
3.3.2.1 A Title/Research Title/Topic
A research topic is your statement of the problem and major research questions stated in a
summary. It is an open-ended phrase that contains the least number of concepts. Examples of
research topics may be: Role of motivation in learning; Impact of Pollution on Environment;
Relationship between Teaching and Learning, etc.
3.3.2.2 Background of Research Problem
Background of research problem is the "nitty-gritty" of the body of your research proposal. It is
an extensive explanation of the background to the problem, which includes sufficient
information about the problem. The background of the research problem reflects your scholarship and
shows evidence of thorough research on your topic. The background to the problem also
establishes the social significance of your study (the 'who cares' factor) and demonstrates that the
problem is worth the expenditure of effort, time, and resources.
A good background describes the problem in qualitative and quantitative terms, showing its nature and
scope. It shows precisely what the problem is, what the social concerns are,
and how widespread it is. In the background of your research, identify the groups of persons who
are affected and who are likely to care about the matter, and tell us why you think we should care.
Individuals reading this section of your work may disagree with your conclusions, but they
will get a good sense of what you think the concerns are, in both their nature and scope.
3.3.2.3 Statement of the Problem
Statement of the problem is a statement which briefly describes the problem that the research
project is addressing and the reasons for interest in the topic. The statement of the problem
clarifies a question about the problem that has not yet been subjected to research. Researchers who
want to be sure of the feasibility of the problem statement of their research projects use a
checklist to ensure that the projects are viable. Table 1.3.1 shows the checklist used to
ensure research problem feasibility.
Table 1.3.1: Checklist for testing the feasibility of the research problem
ITEMS (each item is answered YES or NO)
1. Is the research problem of current interest?
2. Will the research results have social, educational or scientific value?
3. Will it be possible to apply the results in practice?
4. Does the research contribute to the science of education?
5. Will the research open up new problems and lead to further research?
6. Is the research problem important? Will you be proud of the result?
7. Is there enough scope left within the area of research (field of research)?
8. Can you find an answer to the problem through this research? Will you be able to handle the research problem?
9. Will it be practically possible to undertake the research?
10. Will it be possible for another researcher to repeat the research?
11. Is the research free of any ethical problems and limitations?
12. Will it have any value?
13. Do you have the necessary knowledge and skills to do the research? Are you qualified to undertake the research?
14. Is the problem important to you and are you motivated to undertake the research?
15. Is the research viable in your situation? Do you have enough time and energy to complete the project?
16. Do you have the necessary funds for the research?
17. Will you be able to complete the project within the time available?
18. Do you have access to the administrative, statistical and computer facilities the research necessitates?
3.3.2.4 Research Purpose and Objectives
Research purpose shows the aim of doing the research project. The purpose of the
research is similar to the general objective of the research, as both present the overall statement of
the aim of the project. The objectives of the research, on the other hand, present the specific
goals of the purpose of the research, using action words (verbs) to indicate exactly what
the researcher is supposed to do in order to get the required information. An objective is a statement showing the
activities the researcher carries out to reach the goal, and it comes after you have identified your
research problem.
Both the research purpose and the objectives derive from the statement of the problem, which is
like the heart of the research project. Having a research objective without the aid of a problem
statement is like starting a journey with a very vague idea of where you want to go. An example
of research objectives is as follows:
i) To investigate the performance level of OUT graduands in the labour market.
ii) To examine the adequacy of the sociology curriculum in producing marketable output.
iii) To analyze the effectiveness of the lecturers' motivational strategies at OUT aimed at
ensuring retention of staff.
iv) To investigate whether there is any difference in learning between the ODL mode and the
conventional mode of teaching.
The objectives of a research project normally have three qualities. The objectives have to be
researchable; that is, they enable data to be collected on time without using excessive resources.
The objectives have to be measurable; that is, the information expected to be collected can be
analysed using specific criteria, not necessarily numerical. Lastly, the objectives have to be
specific and/or concrete, so as to show the specific targets and strategies of the
research.
3.3.2.5 Research Questions/Hypothesis
(a) The Hypothesis
This is a statement which shows the predicted outcome. It is a guess which gives
direction to the researcher's thinking about the possible outcome of the research, based on the
reviewed literature and related theory. It is a predicted answer to the problem. The hypothesis is
normally stated in the form of a declarative sentence which is testable, in order to prove the
statement correct or wrong. It is categorized as either a null or an alternative hypothesis.
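As an illustration only, a null and an alternative hypothesis can be written for a study comparing learning under the ODL mode and the conventional mode of teaching. The symbols $\mu_{\text{ODL}}$ and $\mu_{\text{conv}}$ (the mean learning scores of the two groups) are assumptions introduced for this sketch and do not come from the text.

```latex
\begin{align*}
H_0 &: \mu_{\text{ODL}} = \mu_{\text{conv}}
      && \text{(null: no difference in learning between the two modes)} \\
H_1 &: \mu_{\text{ODL}} \neq \mu_{\text{conv}}
      && \text{(alternative: learning differs between the two modes)}
\end{align*}
```

Both statements are declarative and testable: data collected from the two groups would lead the researcher either to reject the null hypothesis or to fail to reject it.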
(b) Research Questions
These are used to guide the study. They are used instead of hypotheses, especially in inductive
research. A researcher interested in the increased number of smoking youth
despite high awareness of the consequences can decide to use research questions instead of
hypotheses.
On the other hand, there are questions which are used to collect data and are derived from the
specific objectives, in order to get information to answer each specific objective. These are
activity-oriented questions. In most cases, activity-oriented questions are confused with research
questions, which are normally applied in social science research instead of hypotheses. It is important to
develop research questions that are analytical rather than descriptive. What do analytical
questions look like? Analytical questions move beyond the "what" and explore the "how" and
the "why". "How do teachers in secondary schools motivate their students?" is analytical because
it will lead you to explore not only the activities used to motivate students, but also how teachers carry them out. A
descriptive question would ask: "What do teachers in secondary schools do to motivate their
students?"
3.3.2.6 Justification/Rationale/Significance
The three concepts, justification, rationale and significance, are normally used interchangeably.
These concepts establish the social significance of the study (the 'who cares' factor).
Justification of your research demonstrates that the expenditure of effort, time, and resources is
worth it, not only for you, but also for others. You should indicate and defend why it is necessary
to undertake the research, showing the benefits that will result from it. For example,
a justification for research on "Role of motivation in learning" may be: An understanding of the
role of motivation in learning is an important step towards the creation of a culture of learning. A
culture of learning can additionally promote critical thinking and can reduce dependency on
witchcraft and mysticism. Significance of a study refers to the uses to which the findings are put.
3.3.2.7 Conceptual Framework and/or Theoretical Framework
(a) Conceptual Framework
A concept is a word or a phrase which symbolizes several interrelated ideas and meanings
(Strauss and Corbin, 1998). A conceptual framework is a broader idea of a research project
that contains the key concepts and issues which a researcher wants to explore in the study.
A conceptual framework is a basic structure of a research project, consisting of certain abstract ideas and
concepts that a researcher wants to observe, experiment on or analyze. When you connect these
abstract concepts, you develop a conceptual framework. Framing the research project
conceptually means placing boundaries around it, thereby ruling in and ruling out certain lines of
thought. The "frame" is the road map or outline that guides the exploration. In developing a
conceptual framework, the researcher chooses terms that connect the work to existing literature.
Developing a conceptual framework is a creative endeavor. It involves thinking deeply and widely
about all the issues and topics you want to explore.
This conceptual framework may be depicted as a funnel, similar to the one described by Marshall
and Rossman (1999, p. 29). The broad end of the funnel (the mouth) represents the broadest idea,
comprising the key issue (motivation). Intermediary ideas (effective learning) are represented in
the middle portion of the funnel. The narrow end represents the seed, or the gap in the literature
(the relationship between motivation and learning) (Figure 1.3.1).
[Figure: Research Funnel: Exploring Relationship between Motivation and Effective Learning. From the broad end to the narrow end: Defining Motivation; Measuring Effective Learning; Exploring Relationship between Motivation and Effective Learning]
Figure 1.3.1: Conceptual Framework: Motivation and Learning
(b) Theory and Theoretical Framework
Strauss and Corbin (1998) define theory as a set of well-developed concepts related through
statements of relationships which together constitute an integrated framework that can be used to
explain or predict phenomena. A theory is more than a set of well-developed concepts; it offers
an explanation about phenomena.
A theoretical framework is a collection of interrelated concepts. Unlike a conceptual framework,
which tries to find connections between the concepts, a theoretical framework comprises the various
theories that you will use to explain issues in your study. A theoretical framework helps you
determine what things and issues you will measure, and what statistical relationships you will
look for.
Developing a theoretical framework means theorizing your research. The term "theorizing"
denotes choosing terms that connect your work to existing literature. It also entails formulating
concepts and putting them into a logical, systematic and explanatory scheme. Developing a
theoretical framework is a complex activity. It involves exploring an idea fully, considering it
from many different angles or perspectives. It also involves thinking deeply about the implications
of certain theories that have been used to explain the issues. Framing the research project
theoretically means making decisions about, and acting in relation to, many questions
throughout the research process.
The process of developing a theoretical framework is as important as the product, as it sharpens
the researchers' senses by forcing them to reveal, critique, and defend the biases and assumptions they
hold about the issue(s) they are exploring. In developing the theoretical framework(s), they must
scrutinize theories and explanations that challenge their own biases and assumptions. This
exercise allows researchers to reveal or bracket, as much as possible, their presuppositions
regarding the phenomena under investigation. The process also introduces uncertainty and doubt
regarding what is being investigated and how best researchers can investigate it. While developing
the theoretical framework, researchers become more and more self-critical, and this
self-criticality prepares them to look for and welcome surprises as the investigation unfolds. A
sample of a theoretical framework is provided in Figure 1.3.2.
[Figure: predictor variables leading to the dependent variable]
Predictor Variables
Individual Domain: Motivation; Theories of Motivation
Program Domain: Teacher characteristics; Students' activities
Classroom Interactions: Peer pressure
Dependent Variable
Teaching and Learning
Figure 1.3.2: Sample Theoretical Framework
The goal at this phase of the study is to develop an instrument (a theoretical framework) that will
sharpen and expand the ability to observe, describe and explain your issue. It is this goal which
also guides the literature review. The theoretical framework is also supposed to help the
researcher to make logical sense of the relationships of the variables and factors that have been
deemed relevant/important to the problem. It provides definition of relationships between all the
variables so the reader can understand the theorized relationships between them.
3.3.2.8 Literature Review
The role of the literature review is to expose the researcher to the problem he/she is
addressing. Researchers become more familiar with the history of the problem, how it began,
the consequences related to it, how other researchers have addressed it,
the theories that have been formulated in relation to it, and the gaps that exist and are
yet to be bridged in order to solve it. The literature sought in order to enrich the
researcher's knowledge and build better ideas and concepts about the problem of interest has no
boundaries. It can be sought electronically from all continents, as societal problems are
cross-cutting; even when problems are localized, researchers can do research in
other continents and publish findings internationally.
The researcher, while reviewing the literature, has to observe what other researchers have said
about the problem, what related research has been done previously, what type of findings
have been produced, and what gaps are emerging in the body of existing research.
3.4 Summary
In this lecture, you have learned about the various components of a research proposal. You were
introduced to definitions of each of the components. The background is an extensive explanation
of the background to the research problem.
The statement of the problem briefly describes the problem which the research project addresses. The purpose
and objectives were also discussed. The purpose gives the general aim, while the objectives provide
the specific issues to be addressed. The hypothesis is a predictive statement that tries to answer
the problem, while research questions guide the study and are used by social scientists in
place of the hypothesis. Justification shows the worthiness of the project, while significance
refers to the use of findings.
A conceptual framework connects key concepts, each of which symbolizes several interrelated ideas and meanings, while a theoretical
framework is a collection of interrelated concepts drawn from the theories used to explain issues in the study.
LECTURE FOUR
COMPONENTS OF RESEARCH PROPOSAL – PART 2
(By Dr. E. Swai)
4.1 Introduction
In Lecture 3, you learned about the components of a research proposal. You were introduced
to the definitions of background to the research, problem statement, purpose and objectives,
research questions/hypothesis, justification of research, and conceptual and theoretical framework. In
this lecture, you will learn more components of a research proposal, which include Research
Design, Data Collection Methods, Data Analysis Methods, References, Budget and Time Frame,
and Certification.
4.2 Learning outcomes
At the end of this lecture, you should be able to:
•
Explain the meanings of various research components related to research methodology
•
Choose appropriate research design and methods for research project
•
Use appropriate data analysis methods
•
Write references appropriately
•
Prepare time frame and related budget, and Certification.
4.3 Research Methodology
Research methodology is often a large section of the research proposal. Research methodology
comprises in addition to other parts, Research Design, Methods of Collecting Data and Data
analysis. Research Methodology application depends on how one understands the methodology
and how to use it, and the characteristics of the specific approach selected.
4.3.1 Research Area
This indicates to you all the possible areas that are supposed to be included in the study.
However, the researcher has to indicate the actual areas which can be reached during field data
collection as not all the areas can be reached due to various limitations.
4.3.2 Research Population
The research population refers to the elements of the research: the elements which are going to be
involved in the study, either respondents, if it involves people as elements of the study, or
animals, insects, plants, etc. The research population statement indicates who the elements are
and then gives the indicative percentage of the whole study population, chosen to be
representative, that will be directly involved in the study.
4.3.3 Sampling Techniques
These are techniques used to identify a representative study area and study population.
There are various techniques of sampling, some of which involve mathematical calculations. These
include simple random sampling, cluster sampling, purposive, systematic, stratified etc.
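As a rough illustration of how three of these techniques select elements, the sketch below draws samples from a made-up list of 100 numbered respondents. The population, the "urban"/"rural" strata, the sample sizes and the function names are all assumptions made for this example; they are not prescribed by this lecture, and the code assumes the sample size is no larger than the population.

```python
import random


def simple_random_sample(population, n, seed=0):
    """Simple random sampling: every element has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(population, n, seed=0):
    """Systematic sampling: random start, then every k-th element."""
    k = len(population) // n            # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)            # random start within the first interval
    return population[start::k][:n]


def stratified_sample(population, stratum_of, fraction, seed=0):
    """Stratified sampling: draw the same fraction at random from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for element in population:
        strata.setdefault(stratum_of(element), []).append(element)
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


def region(respondent):
    """Hypothetical stratum rule: respondents 1-60 are urban, 61-100 rural."""
    return "urban" if respondent <= 60 else "rural"


respondents = list(range(1, 101))       # hypothetical study population

random_pick = simple_random_sample(respondents, 10)
systematic_pick = systematic_sample(respondents, 10)
stratified_pick = stratified_sample(respondents, region, 0.10)
```

With a 10% fraction, the stratified draw takes 6 urban and 4 rural respondents, so each stratum appears in the sample in the same proportion as in the population.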
4.3.4 Research Design
The research design shows the type of the research and strategies to be used. It has three
common elements. These include the research area, the research population and the sampling
techniques to be applied after you have shown the type of research and strategies of research to
be followed. Both the researcher and research advisory committee members need to be
convinced that the selected design is the most suitable for the intended research. The
research design is your general plan of how you will go about answering your research questions
(Saunders, Lewis and Thornhill, 2009). It is an overall plan of your research. The research
design is the structure of your research: how you want your research to look. Like the
structure of a house, research design is influenced by your own preferences in consideration of
your own capacity, time and other resources needed to construct your research.
4.3.5 Methods of Data Collection
Methods of collecting data include specific instruments and procedures to collect information. In
historical and documentary research, for instance, a careful list of sources is expected, with reasons why
some are included while others are rejected. In quantitative research, it is expected that the sample size
and recruitment will be justified on particular standards. In observational naturalist inquiry, the
decisions taken to select what will be observed, when, where, and how need description. There
are two main approaches for data collection. These are inductive and deductive approaches.
Whichever approach one uses, the methods of collecting data are more or less similar. The
methods include questionnaires, interview schedules, Focus Group Discussion (FGD),
Observation, Participatory data collection etc.
4.3.6 Methods of Data Analysis
Here the researcher explains in detail what she/he will do with the data; how the data will be
organized, inspected, entered in statistical programmes, analyzed, compared and interpreted. In
this section the researcher drafts possible tables and charts that will be used to present possible
relationships between several variables or categories. Here the researcher explains the
appropriate statistical tests, relevant analytical processes and possible points of comparison or
juxtaposition of paradoxes that will explicate what is going on in the data.
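A draft table of the kind described above can be sketched before any data are collected. The example below cross-tabulates two categorical variables from made-up responses; the variables (sex and preferred study mode), the records and the function name are purely hypothetical, and a real study would normally rely on a statistical package instead.

```python
from collections import Counter

# Hypothetical responses: (sex, preferred study mode)
responses = [
    ("F", "online"), ("F", "print"), ("M", "online"),
    ("M", "online"), ("F", "online"), ("M", "print"),
]


def cross_tabulate(pairs):
    """Count how often each (row, column) combination occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}


table = cross_tabulate(responses)

# Print the draft table with a header row and one line per category
columns = sorted(next(iter(table.values())))
print("sex".ljust(6) + "".join(c.rjust(8) for c in columns))
for row, cells in table.items():
    print(row.ljust(6) + "".join(str(cells[c]).rjust(8) for c in columns))
```

Laying out the empty table first makes it easier to check that the planned instruments will actually collect every variable the analysis needs.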
4.4 Time Frame and Budget table and Ways of Making Decisions
4.4.1 The Time Frame
This is the section that specifies who is doing what, when, and where. What are the costs in
money and time? What check-points will there be for evaluating how the research is progressing,
what the problems are, and what changes are required? Thinking about time and ways of making
decisions, especially a process for a mid-way check, cannot be ignored. Sometimes writing up a
budget covering the costs of preparing surveys, transcription, honorariums for participants, and
the time needed for observations helps to clarify the feasibility of sampling.
In all research projects, there can be serious disagreements or unexpected problems. There may
also be pleasant surprises, such as finding an unexpected source of excellent data. It is necessary
to think about what processes can be used to resolve concerns, respond to opportunities, and
revise methodological decisions.
4.4.2 The Budget
Dependent upon the sponsor's preferred format, this may be incorporated within the body of
proposal, submitted as a separate document, or contained within an attachment or referenced
appendix. It should include a definitive line-item budget for all direct costs, and administrative or
indirect costs, unless prohibited by the sponsor. The extent of individual cost items should match
the scope of the project, reflect real or estimated cost burdens, and not be padded. Each major
cost item should be accompanied by a narrative explanation of the basis of costs, and avoid
jargon terms. Cost contributions, either "in-kind" or real shillings, may be required to be
explicitly identified by some sponsors. For a multi-year project, a detailed budget sheet should
be provided for each year, plus a consolidated or summary budget page totalling all cost
categories.
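The arithmetic behind a multi-year, line-item budget can be sketched as follows. Every figure here is invented for illustration: the cost items, the amounts in shillings and the 10% indirect-cost rate are assumptions, and the actual rate and format are set by the sponsor.

```python
# Hypothetical direct cost items per project year: (description, shillings)
direct_costs = {
    1: [("Questionnaire printing", 300_000), ("Field travel", 1_200_000)],
    2: [("Interview transcription", 500_000), ("Participant honoraria", 400_000)],
}
INDIRECT_PERCENT = 10  # assumed administrative overhead, not a real sponsor rule


def yearly_budget(items, indirect_percent):
    """Total the direct items for one year and add indirect costs on top."""
    direct = sum(amount for _, amount in items)
    indirect = direct * indirect_percent // 100
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}


# One detailed sheet per year
budget_by_year = {
    year: yearly_budget(items, INDIRECT_PERCENT)
    for year, items in direct_costs.items()
}

# Consolidated summary page totalling all cost categories
summary = {
    category: sum(sheet[category] for sheet in budget_by_year.values())
    for category in ("direct", "indirect", "total")
}

for year, sheet in budget_by_year.items():
    print(f"Year {year}: {sheet}")
print(f"Summary: {summary}")
```

Keeping each item as a separate line makes it straightforward to attach the narrative explanation that should accompany every major cost.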
4.5 References
A working bibliography is essential to write a satisfactory proposal. It is the foundation for the
completed project or thesis. An example of this style shall be provided in the coming lectures.
4.6 Appendices
Appendices may include:
•
Drafts of submissions to ethics;
•
A guide to interviews;
•
A specific questionnaire or instrument;
•
Draft tables indicating how data may be analyzed; and
•
Letters of support of a proposal.
Another useful appendix is a proposed table of contents for the completed project or thesis. Some proposals have few
appendices, as the students and their committees decide to spend their time clarifying particular
details when an ethics review is actually submitted, or drafting the table of contents after data
analysis has begun.
4.7 Summary
In this lecture, the definitions of each of the research components related to research
methodology were provided. You were introduced to research design, which includes
the research area, the research population and sampling techniques. You were also introduced
briefly to the types and strategies of research and to the research approaches. You also learned about
data collection methods, data analysis methods, the time frame, the budget and
how to write references.
4.8 Review exercise
•
What are three key elements of research methodology?
•
What will be the best design for your research project? In one paragraph explain why
you think the design you chose is the best considering the purpose of your research.
•
There are two major methods of collecting data for research project. What are they?
In half a page explain in specific terms how you plan to collect data for your
research, paying particular attention to research participants, specific instruments and
procedure that you will follow to collect information.
•
Prepare a budget and time frame for your research.
4.9 References
Galvan, J. L. (2003). Writing Literature Reviews: A Guide for Students of the Social and
Behavioral Sciences. Pyrczak Publishing.
Machi, L. A. and McEvoy, B. T. (2008). The Literature Review: Six Steps to Success. Corwin Press.
Marshall, C. and Rossman, G. B. (1999). Designing Qualitative Research (3rd ed.). Thousand
Oaks: Sage Publications.
Saunders, M., Lewis, P. and Thornhill, A. (2009). Research Methods for Business Students.
London: FT/Pitman.
Strauss, A. and Corbin, J. (1998). Basics of Qualitative Research: Techniques and Procedures for
Developing Grounded Theory. London: Sage Publications.
Van Manen, M. (1997). Researching Lived Experience: Human Science for an Action Sensitive
Pedagogy. Suny Series in the Philosophy of Education.
MODULE TWO
LITERATURE REVIEW AND REFERENCING
LECTURE ONE
SOURCES OF LITERATURE
(By Dr. P. Ngatuni)
1.1 Introduction
One of the problems we observe from proposals submitted by many of our students is the lack
of authoritative information to support their arguments. This is a signal that while information
is much more available nowadays, our students are unable to locate and interact with the
sources. Consequently, this lecture is designed to introduce you to the sources of literature, and
how you can interact with them to effectively get what you are looking for.
1.2 Learning outcomes
At the end of this lecture you will be able to:
•
Identify and describe a range of primary, secondary and tertiary literature sources
available;
•
Plan and carry out literature search effectively;
•
Identify sources of literature appropriate to your literature needs and develop an
understanding of the need to know how they work.
•
Identify keywords and undertake a literature search using a range of methods including
the internet;
•
Evaluate the relevance, value and sufficiency of the literature found.
•
Describe what plagiarism is and identify various measures to avoid it.
1.3 Literature sources
Before you can search for the relevant information for your research project, you must first
have clear knowledge of the available literature sources. Knowing the sources and their nature
helps you determine the appropriate approach or technique to interact with such sources. For
example, some sources may only be available in print form – say in the cabinets of some
responsible offices or in libraries, while some may be available in electronic form, e.g. in CD
ROMs or in some kind of database accessible through the internet, or in both. So looking for
such information in the library while it is only available electronically (and vice versa) may not
help you much.
The literature sources available to help you develop a good understanding of, and insight on,
previous research can be divided into three categories: primary, secondary and tertiary sources.
Figure 2.1.1 shows these categories. It is important to note that the different sources indicated
in this figure do overlap; and that is what happens in reality.
Figure 2.1.1: Literature sources available
Source: Adapted from Saunders et al (2009:69) figure 3.2
The different categories of literature sources represent the flow of information from the
original source. Saunders et al (2009: 69) argue that as information flows from primary to
secondary and eventually to tertiary sources it becomes (i) less detailed and (ii) less
authoritative, but (iii) more accessible. This is mainly because primary literature sources can be
difficult to trace. The arrows in Figure 2.1.1 show this flow. As you move from tertiary sources
backwards to primary sources, you are moving towards the original ideas, and therefore the
level of detail should increase. On the other hand, as you move from primary sources to
tertiary sources, the time to publish increases and the level of detail decreases. The nature of this
information flow is typical of traditional printed materials. With the current wave of moving
towards electronic publication, thanks to the power of the internet, this situation is changing
rapidly, resulting in more direct means of both publishing and accessing information. With
this move, even what was grey literature, e.g. government publications, is increasingly being
made available via the internet. This is perhaps the main reason why a greater part of this
lecture will focus on electronic sources.
1.3.1 Primary literature sources
Primary literature sources are the first occurrence of a piece of work. They include such
sources as reports, some central and local government publications (e.g. white papers and planning
documents) and unpublished manuscripts. Unpublished manuscripts include sources such as
theses/dissertations, conference proceedings, company reports, letters, memos, e-mail
messages, committee meeting minutes, etc. These primary sources are sometimes referred to as
grey literature, simply because they are difficult to locate. Again, with the advent of the
internet, an increasing number of these sources are now being made available online.
Of these, the most accessible, and the most likely to be of use in showing how your
research relates to that of other people, are reports, conference proceedings and
theses/dissertations.
(a) Reports
Reports are documents produced by various organisations such as consultancy or advisory
firms, government departments or agencies, as well as individuals or academics.
Access to reports may be difficult because (i) they are not as widely available as books and (ii) they are
not well indexed in the tertiary literature, especially in developing countries like Tanzania. For
these reasons, you will need to inquire extensively to learn of their existence and their location.
Even if you are able to locate them you may find it difficult to gain access. Some of the
organisations which produce them are increasingly being challenged to sustain themselves and,
given the high cost of producing primary data, most of these producers charge access fees,
some of which may be prohibitive to individual researchers. The National Bureau of Statistics
(NBS), for example, is the Central Statistical Office of Tanzania and is charged with conducting
censuses and surveys which yield a wide range of economic, social and demographic
statistics¹. NBS produces the Central Register of Enterprises (CRE), a useful resource for
company research, but a CD-ROM version of this resource costs slightly above 400,000
shillings (2010 price).
Individual academics are also increasingly publishing reports and their research on the internet.
Although these can be useful sources of information, their use will require assessment of the
authenticity of the material as well as the authority of the author. This is mainly because these
reports may not have gone through the same review and evaluation process as journal articles
or books.
(b) Conference proceedings
Conference proceedings, sometimes referred to as symposia, are often published as unique
titles within journals or as books. Since most conferences are organized with a specific theme
or a wide range of themes in mind, they make a very useful source of information relevant to
your research project, so long as you can locate them. However, conference proceedings are
not well indexed in the tertiary literature. To locate them you may need some specific search
tools such as index to conference proceedings, library public catalogue or more general search
engines such as Google. It is common practice nowadays for conferences to have a dedicated
Webpage from where abstracts and occasionally the full papers presented at the conference can
be obtained.
¹ http://www.nbs.go.tz
(c) Theses/dissertations
Previous dissertations and theses are unique sources of detailed information and of further
references. The trouble is that they can be difficult to locate and, when found, there may be
only one copy at the awarding institution, with some restricted access. Specific search tools
such as index to theses and library catalogue may be used to locate them. The other trouble is
that in most cases only the Ph.D and M.Phil/MRes theses are covered well in the tertiary
resources; less so for the research undertaken as part of taught masters degrees. The University
of Dar es Salaam library catalogue, for example, can be used to locate electronically, via its
website, theses and dissertations submitted to the University².
1.3.2 Secondary literature sources
Secondary sources include such sources as books and journals. These are subsequent
publications of primary literature, and they are aimed at a wider audience. They are much
easier to locate than the primary literature most certainly because they are better covered by the
tertiary literature. As the number of secondary literature sources is increasing daily, access to
them is also increasingly being made via the internet. Libraries at many universities are
rapidly exploring this avenue, as a way of cutting down subscription cost as well as improving
access. Journals and books constitute the major component of this category.
(a) Journals
Journals, also known as periodicals, serials and magazines, are published on a regular basis.
Although many publishers are moving towards online publishing accessible via the internet
subject to subscription services, most journals are still being published in print form³. Journals
are also well covered by tertiary literature. You are advised to build a database of journals that
publish materials related to your research project and make it a habit to browse these journals
regularly to be sure of finding useful items. Many journals’ content pages can be browsed via
the internet.
² The catalogue can be reached from an internet-connected computer at
http://www.libis.udsm.ac.tz/opac/%28qh5aal45oly2y4rev3ajfa55%29/search.aspx
³ OUT also subscribes to a number of databases containing a good number of refereed journals.
Being an OUT student gives you right of access. Go to
http://www.out.ac.tz/current/Library/jounal1.html for a list of these databases.
(i) Refereed journals
Refereed academic journals are the most important sources when placing your ideas in the
context of earlier research. Books are likely to be more important than professional and trade
journals. Refereed journals (e.g. Huria Journal, JIPE, OUT law journal, Journal of Finance, etc)
are journals whose articles are evaluated by academic peers prior to being published, to assess their
quality and suitability. Their suitability for research projects is also enhanced by the fact that
they contain detailed reports of relevant earlier research. It is important to note that (i) not all
academic journals are refereed; (ii) the articles in them are written by experts in the field and
often for a narrower audience of scholars with particular interest in the field; (iii) the
language used may be technical or highly specialized because prior knowledge of the topic is
assumed; (iv) articles accepted for publication in them may still need to undergo several
serious revisions based on referees’ comments before they actually appear in print; and that (v)
the relevance and usefulness of such journals vary considerably and you may need to worry
about possible personal bias.
(ii) Professional and trade journals
Professional journals are produced by organisations for their members. For example, the
National Board of Accountants and Auditors (NBAA) publishes The Accountant Journal for its
members. They contain a mix of news-related items and articles that are more detailed.
Caution is required, though, as articles in these sources (i) can be biased towards their author’s
or organisation’s views and (ii) are more practical in nature and more closely related to
professional needs than those in academic journals.
Trade journals fulfil a similar function to professional journals. They are published by trade
organizations or aimed at particular industries or trades such as catering or mining. They often
focus on new products or services and news items. They rarely contain articles based on
empirical research, although some provide summaries of research.
(b) Books and monographs
Books and monographs are written for specific audiences. While some are aimed at the
academic market with a theoretical inclination, others are aimed at practicing professionals
and may be more applied in their content. Materials in books are usually presented in a
more accessible manner and cover a wide range of topics. These features make them very
good introductory sources to help clarify your research questions, research objectives or
research methods. With this knowledge in mind you are better able to generate appropriate key
words for your search queries (see a later section for details). Remember also that books may
contain out-dated material even by the time they are published.
(c) Newspapers
Newspapers are another secondary source, a good source of topical events, developments within
business, government and the professions, as well as recent statistical information. For example, you
can obtain exchange rates, treasury yields, prices of ordinary shares, government securities,
commodities, etc. In Tanzania, for example, this information is available in most daily papers
including “Nipashe” and “Majira”, and most of them are also accessible via the publishers’
websites. Caution is required here, though, because (Stewart and Kamins, 1993; cited by
Saunders et al. 2009: 74) (i) newspapers may contain bias in their coverage, be it political,
geographical or personal; (ii) reporting can be inaccurate; (iii) you may not pick up any
subsequent amendments; and (iv) news presented may be filtered depending on events at the time,
with priority given to more headline-grabbing stories.
1.3.3 Tertiary literature sources
These are designed to help information users to either (i) locate primary and secondary
literature or (ii) introduce a topic. So by design, they include indexes and abstracts as well as
encyclopaedias and bibliographies. For example, the University of Dar es Salaam used to
publish a bibliographic index, which showed the publications held by different information
centres in Tanzania, by title, and coverage in terms of year of publication and/or volume. The
use of these literature sources depends on your research question(s) and objectives, the need
for secondary data to answer them, and the time available.
1.4 Importance of conducting literature search effectively
In the preceding section you learned about the different categories of literature sources. In this
section you will learn about the importance of conducting effective searches in these literature
sources for what you are looking for. At this juncture, it is important to be aware that the world
of information is loaded with tonnes and tonnes of information, whether in print or in
electronic form. On the other hand, you need only the information that is relevant to your research
project, and you need to get it at the lowest possible cost in money and time. This therefore creates
the need for you to learn how to search for information effectively.
You will recall from the preceding section that literature sources can be in print or electronic.
Irrespective of whether the source is in print or electronic you need to know how to
search/locate them more effectively and such knowledge and ability will help you to:
•
Find the materials you want from amongst the tonnes of printed materials in various
information centres or from the huge number of online resources available;
•
Make efficient use of limited library hours and space or limited access to PCs and
bandwidth; and
•
Save time and money.
It is important to note that (i) you already have searching skills that are useful in both the print
and electronic worlds; (ii) these searching skills can be enhanced by understanding how both
print and electronic searching work; and (iii) searching should not be done haphazardly; it usually
requires careful planning before you go to the information source and, where electronic sources
are involved, before you even think of locating and switching on an internet computer terminal.
The next sections will take you through the various steps in planning your search.
1.5 Planning your literature search strategy
The preceding section hinted that planning for a literature search begins away
from the library shelves or computer terminals. Students often find their literature search
time-consuming, expensive and frustrating. But planning your search carefully not only relieves you
of these situations but also ensures that you locate relevant, quality and up-to-date literature.
The time spent on planning will be compensated by time saved when running a clear search
strategy. Writing the search strategy down helps you maintain consistency and focus on what
you are looking for while at the same time exhausting the available options. Whenever possible
discuss the strategy with your research supervisor and/or the information centre manager – e.g.
the librarian.
Now planning a search strategy involves a series of activities. Such activities include (i)
defining your information needs; (ii) defining the parameters of your search; (iii) generating
key words; (iv) deciding which sources to use; (v) finding how they function; (vi) designing
your search queries. After these activities, a search can be carried out, after which several
follow-up activities take place. These include (vii) evaluating the literature so obtained on the
basis of relevance, sufficiency and quality; and finally (viii) refining the search strategy as
appropriate until each of the criteria in (vii) above is fulfilled. These activities are detailed in
the following subsections. Where it involves free online materials e.g. Wikipedia, you will
need to validate their authenticity. This will be discussed later in this lecture.
1.5.1 Defining your information needs
The first question you would probably need to ask yourself is what sort of information are you
looking for? The answer to this question will guide you in identifying the possible sources. For
example, if you are looking for specific information like facts or dates, then reference sources
such as data books, encyclopaedia, dictionary, the Web or even textbooks are usually the best.
If instead you are looking for general information such as research areas, you may need to
think more and ask yourself, for example, how much information is needed and at what depth?
Another question could be who is going to use the information? This is because the end user’s
needs e.g. those of professional researchers, academicians, first or last year university students,
etc., might affect the level and the quality of information you may require.
1.5.2 Defining the parameters of your search
Once you are clear about the sort of information you need, you must then establish key
parameters for your search. Since you will have already stated both your research objectives
and your research questions, you may have a clear idea about what subject matter will be
relevant for your search strategy, but not equally clear about several other important
parameters. Defining the parameters within which you can search for information will require
you to be clear about, among other things: the language of the publications, the subject area,
the business sector, the geographical area, the publication period, the literature type, etc. (Bell,
2005; cited in Saunders, et al 2009: 75). To sort this task out you may consult your lecture notes
or textbooks in your research area, while at the same time taking notes about the subject areas
that appear most relevant to your research question as well as the authors’ names.
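Writing the parameters down explicitly makes it easier to apply them consistently. The sketch below records a subset of the parameters listed above and uses them to screen a few bibliographic records; the class names, field names, parameter values and records are all hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class SearchParameters:
    """A written-down subset of the search parameters (hypothetical values)."""
    language: str
    subject_area: str
    geographical_area: str
    first_year: int
    last_year: int
    literature_types: tuple


@dataclass
class Record:
    """A made-up bibliographic record found during a search."""
    title: str
    language: str
    subject_area: str
    geographical_area: str
    year: int
    literature_type: str


def within_parameters(record, params):
    """Return True if the record satisfies every stated parameter."""
    return (
        record.language == params.language
        and record.subject_area == params.subject_area
        and record.geographical_area == params.geographical_area
        and params.first_year <= record.year <= params.last_year
        and record.literature_type in params.literature_types
    )


params = SearchParameters(
    language="English",
    subject_area="Finance",
    geographical_area="East Africa",
    first_year=2000,
    last_year=2010,
    literature_types=("refereed journal", "book"),
)

candidates = [
    Record("Paper A", "English", "Finance", "East Africa", 2006, "refereed journal"),
    Record("Paper B", "English", "Finance", "East Africa", 1995, "refereed journal"),
    Record("Report C", "English", "Finance", "East Africa", 2008, "newspaper"),
]

selected = [r.title for r in candidates if within_parameters(r, params)]
print(selected)
```

If too few records survive the screening, one or more parameters can be widened and the screening repeated.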
Often we hear our Masters students complaining of a lack of literature on a selected topic for
their term paper assignments, or claiming that there is no research done on the topic they have
chosen for their dissertation project. These are quite unsafe claims and they may be attributed
to some of their search parameters being defined too narrowly. So the advice here is to think
broad. For example instead of thinking of ordinary shares, think of common stock, common
equity, equity stocks, equity shares, ordinary equity and so forth so that you increase your
chances of getting literature from different language bases relating to ordinary shares.
1.5.3 Generating your key words
Key words also known as search terms are the basic terms that describe your research
question(s) and objectives and you will definitely need them to search the tertiary literature.
This makes their identification the most important part of planning your search for relevant
literature. The notes you would have taken when re-examining your lecture notes and text
books in the subject areas will provide a good basis for the task. The keywords can be
identified using one or a number of different techniques (Table 2.1.1) in combination
(Saunders et al 2007:71).
Table 2.1.1: Key words generating techniques
Technique: Discussion with colleagues, your research supervisor and librarian
Description: Take every opportunity to share and discuss your research with others
either face-to-face, via emails or letters. The feedbacks you get will help
you refine your ideas, approach and generate effective key words.

Technique: Initial reading, Dictionaries, Thesauruses, encyclopaedias and handbooks
Description: You can improve the quality of the key words you generate by combining
the results of your discussion sessions with initial readings from various
recent review articles and materials like dictionaries, encyclopaedia,
handbooks, and thesauruses, both general and subject specific. These are
also available via the internet or on a CD ROM. You can also Google it.
Google search engine, for example, offers a “define” search option by
typing “define:[enter the term]”. You can also use Wikipedia
http://www.wikipedia.org. These are free resources although you may
be required to authenticate the materials you adopt into your research
project using refereed journal articles or textbooks.

Technique: Brainstorming
Description: Brainstorming is a helpful technique for developing research question
and objectives. It can also prove useful in generating key words. You can
start with each member individually or collectively generating and
writing down all the words and short phrases that come to mind on the
research project. These can then be evaluated and the most relevant key
words and phrases are selected.

Technique: Relevance tree
Description: This is a more structured guide to your search process. Looks like an
organisation chart and it is hierarchical in nature beginning with main
heading followed by subheadings (which may represent the research
objectives and or research questions, authors’ names etc). The following
paragraph expands a little bit more on the technique.

Source: Summarized from Saunders et al (2009)
The relevance tree can be prepared after brainstorming and, according to Jankowicz (2005) cited by Saunders et al (2007: 74), it can help you decide the following:
i) Which key words are directly relevant to your research questions and objectives;
ii) Which areas you will search first and which you will search later; and
iii) Which areas are more important.
You can follow the steps outlined in Saunders et al. (2009: 74) to develop your relevance tree:
• Start with your research questions and objectives at the top level
• Identify two or more subject areas that you think are important
• Further subdivide each major subject area into sub areas that you think are of relevance
• Further subdivide the sub areas into more precise sub areas that you think are of relevance
• Identify those areas that you need to search immediately and those that you particularly need to focus on
• Add more areas as you continuously read and review the literature
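The steps above can be sketched in code. The following is a minimal illustration only, assuming the relevance tree is stored as a nested Python dictionary; the topic names and the helper leaf_paths are hypothetical and are not taken from Saunders et al.

```python
# A relevance tree as a nested dictionary: each key is a heading and
# each value holds its subheadings (an empty dict marks a leaf).
# All topic names below are hypothetical examples.
relevance_tree = {
    "Health implications of water pollution": {
        "Water pollution": {
            "Causes": {},
            "Locations": {},
        },
        "Health": {
            "Diseases": {},
        },
    }
}


def leaf_paths(tree, prefix=()):
    """Return every path from the top level down to a leaf heading."""
    paths = []
    for heading, subtree in tree.items():
        path = prefix + (heading,)
        if subtree:
            paths.extend(leaf_paths(subtree, path))
        else:
            paths.append(path)
    return paths


for path in leaf_paths(relevance_tree):
    print(" > ".join(path))
```

Each printed path is one candidate search area. Adding a new area as your reading progresses is then simply a matter of adding a key to the relevant branch.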
In performing the activity above you may need to ask the following questions:
• What key words do you think will appear on the site or article you want?
• What key concepts is each of the key words a part of, or related to?
• Are there any synonyms for these keywords or concepts?
• Are there any alternative spellings for your keywords/concepts?
• Are plurals or capitalisation involved?
Example:
Assume that you want to find information about “the health implications for water pollution”.
Table 2.1.2: Summary of key word examples
Keywords | ‘water’, ‘pollution’ and ‘health’
Concepts | ‘environmental degradation’ or ‘agricultural management’ or ‘health’
Synonyms | contamination, effluence
Related to location | rivers, lakes, sea, coastal, ‘domestic water’, etc
Related causes | ‘oil spills’, chemical, biological, toxic waste, green house gases, smog, litter, etc
Alternative spellings | none
Plurals | river(s), lake(s), disease(s)
1.5.4 Deciding which sources to use and knowing how they work
After defining your information needs, the next step is to decide which sources to use. You will need to assess which of the following are most appropriate for your information needs:
(i) Individuals’ and organisations’ home pages; (ii) Newspapers and magazines; (iii) Subject
gateways, databases, catalogues; (iv) Journals—titles, abstracts or full text; (v) Reference
resources, e.g., encyclopaedias, dictionaries; (vi) Books; (vii) Grey literature, e.g. government
publications; etc.
Alongside the decision on the appropriate sources will be a determination of whether the sources are accessible in print or electronically. Knowing this will help you determine a number of things, including the kind of resources you will need, e.g. library subscription, photocopy facilities or simply internet connectivity and printers.
Once you have determined the appropriate sources to use, whether in print or in electronic form, you must then find out how they work. Knowing this will help you design your approach so that you get the most out of each source.
1.5.5 Conducting your search
You can use various approaches depending on the type of source identified and how the identified source works. The following approaches are suggested in Saunders, et al (2007: 74):
• Searching using tertiary literature sources
• Obtaining relevant literature referenced in books and journal articles you have read
• Scanning and browsing secondary literature in your library
• Searching using the internet
Books may not provide adequate up-to-date coverage of your research question, but some can give you a lead to further readings in addition to helping you generate key words and concepts relevant to your research objectives and questions. With such insight you can then search the tertiary sources more effectively.
Most learners are tempted to start with the internet straight away. But most of the time this approach may not take you to academic literature. To avoid that, it is advisable to begin with indexes and abstracts. These are among the tertiary sources maintained by most university libraries. Having said that, these tertiary sources are also increasingly becoming accessible through the internet.
An index lists articles from a range of journals and sometimes books, chapters from books, reports, theses, conferences and research. It will contain information like author(s), date of publication, title, journal title, volume and part number of the journal issue and page numbers of the article. An extension to this is the citation index, which provides a list of other authors who have cited that author’s publication. An abstract goes further by providing a summary of the article. The good thing is that many of these sources are now available electronically, and as long as your library subscribes, you can access them on campus or off campus. There is also an increasing number of institutional repositories, which can also be very useful (see for example www.oro.open.ac.uk, a freely accessible repository maintained by the Open University UK, which is very useful for business, technology and education literature).
Saunders, et al (2007: 77) provide the following guidelines when searching electronic databases:
• Ensure your key words match the controlled index language
• Search appropriate printed and database sources
• Note precise details, including the search strings used, of the actual searches you have undertaken for each database
• Note the full reference of each item found (where possible cut and paste)
Noting the search string or search strategy used in sufficient detail is increasingly emphasized nowadays (see for example Tranfield, et al., 2003, cited by Saunders et al 2007:77), as it makes it possible for others to replicate the search.
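One way to keep such a record is a simple search log. The sketch below is only an illustration of the idea: the class SearchRecord, its fields, the function log_to_csv and the sample entries are all hypothetical, and any spreadsheet or notebook that captures the same details would serve equally well.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class SearchRecord:
    """One search run, recorded in enough detail to be repeated."""
    database: str       # where the search was run
    search_string: str  # the exact query typed
    date_run: str       # when the search was run (ISO date)
    hits: int           # number of results reported


def log_to_csv(records):
    """Write the search records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(SearchRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Hypothetical entries, for illustration only.
log = [
    SearchRecord("Database A", '"project management" AND motivation', "2010-12-01", 120),
    SearchRecord("Database B", '"project management" AND motivation', "2010-12-01", 45),
]
print(log_to_csv(log))
```

Because the exact search string and database are stored together with the date, another researcher could repeat the same search and compare the number of hits.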
1.5.6 Designing your search queries
From the keywords and phrases generating process, you will need to construct search queries (using the keywords or phrases alone or in combination), which you will then send to the identified database. These keywords need to match the database’s controlled index language of pre-selected terms and phrases or descriptors. Otherwise, your search strategy may fail.
Examples of items which may cause failure include, but are not limited to, variation in spelling between UK and US English (labour vs. labor), differences in everyday vocabulary (chemist vs. drug store), differences in terminology (redundancy vs. downsizing), acronyms and abbreviations, jargon, etc.
The search queries are created by combining the selected keywords and phrases using Boolean logic. Boolean logic uses link terms in search strings which enable you to combine, limit or widen the variety of items being searched. You can also use them with dates, journal titles, and names of organisations or people. Table 2.1.3 shows examples of the most commonly used link terms.
Table 2.1.3: Common link terms that use Boolean logic
Link term | Purpose | Example | Outcome
AND (+) | Narrows search | Recruitment AND Selection | Returns articles containing both key words
OR | Widens search | Recruitment OR selection | Returns articles containing at least one key word
NOT (-) | Excludes terms from search | Recruitment NOT Selection | Returns articles containing “Recruitment” but not “Selection”
* or $ (truncation) | Uses word stems to pick up different words | Motivat* | Returns articles containing Motivate, motivated, motivating, motivation
* or $ (truncation) | Uses word stems to pick up different words | Litera* | Returns articles containing Literature, literacy, literary
? (wild card) | Picks up different spellings | Organi?ation | Returns articles containing Organisation, organization
? (wild card) | Picks up different spellings | Behavio?r | Returns articles containing Behaviour, behavior
? (wild card) | Picks up different spellings | Labo?r | Returns articles containing Labour, labor
Source: Adapted from Saunders et al. (2009:84)
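The truncation and wild card rows of Table 2.1.3 can be illustrated with a short sketch. It assumes that * or $ stands for any run of letters and that ? stands for zero or one letter, which is what the examples in the table imply; real databases differ in how they treat these symbols, so the functions pattern_to_regex and matches are illustrative helpers and not the behaviour of any particular database.

```python
import re


def pattern_to_regex(pattern):
    """Translate a search pattern with * / $ / ? into a compiled regex.

    Assumptions (databases differ, so check each one's help pages):
    - * or $ stands for any run of letters (truncation)
    - ? stands for zero or one letter, as in the examples of Table 2.1.3
    """
    parts = []
    for char in pattern:
        if char in "*$":
            parts.append(r"[a-z]*")
        elif char == "?":
            parts.append(r"[a-z]?")
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern, word):
    """True when the whole word fits the search pattern."""
    return pattern_to_regex(pattern).fullmatch(word) is not None


for pattern, words in [
    ("Motivat*", ["motivate", "motivation", "motive"]),
    ("Organi?ation", ["organisation", "organization"]),
    ("Behavio?r", ["behaviour", "behavior"]),
]:
    print(pattern, [w for w in words if matches(pattern, w)])
```

Note that "motive" is not picked up by Motivat*, because truncation only extends the stem and never shortens it.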
Worked example
Let us now borrow an example from Burns and Burns (2008) in which one wanted to identify the role that personal employee characteristics, such as personality and motivation, have within successful project management outcomes. This is a simple example in which you can clearly see the key terms and the nature of the link among them. For example, the key words here would be personality and motivation, plus a phrase like “project management”. With these in mind you can then use the Boolean operators OR and AND to search for articles on them. Different combinations of these terms will yield different numbers of results.
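[Figure: Venn diagram of three overlapping circles: 1 = Personality, 2 = Motivation, 3 = Project Management. The pairwise overlaps are labelled “Personality and Motivation”, “Personality and Project Management” and “Motivation and Project Management”; the central overlap is labelled “Personality, Motivation and Project Management”.]
Figure 2.1.2: Boolean operators as a Venn diagram
Source: Adapted from Burns & Burns (2008:55)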
Figure 2.1.2 shows different ways of combining the terms and phrases. Using OR, e.g. “personality” OR “motivation” OR “project management”, will produce a large number of results, represented by the whole area covered by the three circles. Circle 1 contains all search results for the term “personality”, while circles 2 and 3 contain search results for the term “motivation” and the phrase “project management”, respectively. The very centre of the three circles (the intersection) presents results for “personality” AND “motivation” AND “project management”. For each pair of circles, the overlapping part outside the central intersection presents results containing the two key words but not the third.
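The correspondence between Boolean operators and the regions of the Venn diagram can be sketched with Python sets. The six articles below and their tags are made up purely for illustration; only the three key terms come from the worked example.

```python
# Each hypothetical article is tagged with the key terms it contains.
articles = {
    "A1": {"personality"},
    "A2": {"motivation"},
    "A3": {"project management"},
    "A4": {"personality", "motivation"},
    "A5": {"personality", "project management"},
    "A6": {"personality", "motivation", "project management"},
}


def containing(term):
    """Set of article ids that contain the term (one circle of the diagram)."""
    return {aid for aid, terms in articles.items() if term in terms}


personality = containing("personality")
motivation = containing("motivation")
project_mgmt = containing("project management")

# OR corresponds to set union: everything inside any circle.
any_term = personality | motivation | project_mgmt
# AND corresponds to set intersection: the centre of the diagram.
all_terms = personality & motivation & project_mgmt
# Two terms but not the third: an overlap outside the centre.
two_only = (personality & motivation) - project_mgmt

print(sorted(any_term))
print(sorted(all_terms))
print(sorted(two_only))
```

The union can never be smaller than any single circle and the intersection can never be larger, which is why OR widens a search and AND narrows it.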
To demonstrate this point and several others, I used some databases selected from the multitude of databases to which the Open University of Tanzania Library subscribes. You have access to most of them within the campus, and to some both on campus and off campus. A sample of databases was selected: Emerald, JSTOR, Oxford University Press (OUP), Sage Publications and Taylor & Francis. In addition, Google, the search engine most commonly used by students and staff alike, was added and, most importantly, so was Google Scholar. The main purpose here is to show you how important the choice of source is and the kind of expectations you should have. The other purpose is of course to show you what you miss each time you run to search engines like Google rather than to the subject relevant databases. I ran a search based on the different combinations of the key terms identified in Figure 2.1.2, and the results are presented in Table 2.1.4.
The results show that:
(i) as you move from general search engines to academic databases, the number of results falls, indicating that you are getting closer to the specific subject;
(ii) combining the three terms with OR yields the most results (widens the search), while combining them with AND limits the search;
(iii) searching for “project management” as a phrase yields fewer results than searching for project management without inverted commas, indicating the limiting power of searching for a phrase;
(iv) it is important to begin with much wider options to gain confidence that some materials are available and then limit your search using various combinations, while at the same time keeping your research questions in view;
(v) Google Scholar is shown here to be better than general Google, and for the latter it does not matter whether you use google.com or other variants like google.co.uk;
(vi) for phrase searches, it does not matter whether you use single or double inverted commas;
(vii) for some databases, e.g. Taylor and Francis, the pattern demonstrated in the other databases is not maintained, indicating the need to find out from the website itself whether it works the same way as the others;
(viii) Google clearly reports more search results than Google Scholar or the academic databases sampled here, simply because of the differences in the number of data records each search engine compares your search criteria against.
For example, Google claims that it searches more than 20 billion web pages each time you type in a query (http://www.google.com/intl/en/corporate/history.html, November 2007). But research shows that these represent only about 16 percent of what could be accessed electronically. So where then does the remaining 84 percent reside?
Table 2.1.4: Search results for personality, motivation and project management
Boolean operator | Term | Google (a) | Google Scholar | Emerald (b) | Wiley-Blackwell Synergy | JSTOR
- | Personality | 73,400,000 | 1,320,000 | 14,610 | 169,629 | 329,36
- | Motivation | 49,000,000 | 1,420,000 | 24,983 | 180,724 | 225,75
- | Project Management | 189,000,000 | 3,610,000 | 46,249 | 271,508 | 211,11
- | "Project Management" (c) | 35,000,000 | 485,000 | 6,040 | 9,110 | 5,21
OR | Personality, motivation, "project management" | 287,000,000 | 2,890,000 | 37,640 | 310,075 | 513,82
AND | Personality and motivation | 7,420,000 | 888,000 | 6,199 | 46,678 | 45,09
AND | Personality and project management | 29,740,000 | 419,000 | 5,124 | 30,072 | 23,22
AND | Personality and "project management" | 1,100,000 | 18,200 | 530 | 1,033 | 49
AND | Motivation and "project management" | 2,300,000 | 46,800 | 1,530 | 2,224 | 1,15
AND | Personality, motivation and "project management" | 312,000 | 11,000 | 271 | 547 | 25
This table presents results of searches on pre-identified keywords using different Boolean search strategies
University of Tanzania library website: http://www.out.ac.tz/current/Library/jounal1.html - Accessed on 9t
hours.
(a) There is no difference between using google.com and google.co.uk
(b) Based on journal articles found only. Others not reported here are results for books, bibliographical datab
(c) For searches based on phrases it does not matter whether you use single ' ' or double inverted commas " "
The answer would be the online academic databases, newspapers, books, company web pages, dictionaries, encyclopaedias, individual home pages, etc.
Citations from academic databases like those sampled here are from peer reviewed journals and are generally more academically credible. Although “googling” will certainly bring up reputable articles by reputable researchers, there is far more information of dubious authorship (Burns and Burns, 2008:56). These results will need to be authenticated before you can use them in your research. See the next section for the issues you will need to consider.
Another useful Boolean operator is “NOT”. This operator allows you to distinguish more precisely within a term that may incorporate more than what you are interested in. For example, the term motivation has two sides: (i) intrinsic motivation, which comes from inside the individual (doing something because it is personally rewarding and gives you a sense of self esteem); and (ii) extrinsic motivation, which is driven by external factors like pay or fear of being sacked. Now suppose it is intrinsic motivation that you are more interested in. You can either use “AND” while constraining motivation to intrinsic motivation (“project management” AND “intrinsic motivation”) or use the operator “NOT” to exclude extrinsic motivation as follows: “project management” AND motivation NOT “extrinsic motivation”.
As an alternative to explicit Boolean operators, implied Boolean logic can be used, in which “+” replaces “AND” and “-” replaces “NOT”. Please note that in typing your queries, the absence of a Boolean operator is significant because the space between two terms usually defaults to “AND”.
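The two ways of writing the same query can be sketched as follows. This is a minimal illustration: the helper functions quote, explicit_query and implied_query are hypothetical, and the exact syntax accepted varies from one database or search engine to another, so check the help pages of the source you are using.

```python
def quote(term):
    """Wrap multi-word phrases in double quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def explicit_query(required, excluded=()):
    """Join terms with the explicit operators AND and NOT."""
    query = " AND ".join(quote(t) for t in required)
    for term in excluded:
        query += f" NOT {quote(term)}"
    return query


def implied_query(required, excluded=()):
    """Same query in implied Boolean form: + for AND, - for NOT."""
    parts = [f"+{quote(t)}" for t in required]
    parts += [f"-{quote(t)}" for t in excluded]
    return " ".join(parts)


required = ["project management", "motivation"]
excluded = ["extrinsic motivation"]
print(explicit_query(required, excluded))
print(implied_query(required, excluded))
```

Building queries from lists of required and excluded terms also makes it easy to record exactly which combination was sent to each database.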
1.5.7 Reviewing search queries, obtaining and evaluating literature
Tertiary literature search (which may take you a few trials with a number of refinements on the search strategy as described in the preceding section) will give you details of what literature is available and where to locate it. The next step naturally will be to obtain these items. To obtain the items, you are advised to try the following:
Check your library catalogue (both card and electronic catalogues), and remember that most libraries nowadays hold many periodicals on CD-ROMs, provide access to subscribed online databases via the internet, or both. There are other free sources (e.g. institutional repositories) around the world that can also be accessed via the internet.
For those that are held by your library (in print or electronic form), note down their location, then find and scan them in order to discover whether they are likely to be worth reading thoroughly. At this stage the abstract alone could do you a lot of good. You can also browse other books and journals with similar class marks to see whether they may also be of use. For those that are not held by your library, the best way is to discuss with your librarian whether they can be obtained through interlibrary loan from another library. Since this service can be expensive, you are advised to ensure that you really need the item and that it is of high quality (e.g. a refereed journal article).
Once you have obtained the literature you are looking for, you must evaluate it on the basis of relevance, value and sufficiency. To do this you need to be aware of at least two dilemmas (Saunders, et al 2009: 92-93). One is how you know whether you are reading the relevant material; the other is how you know when you have read enough. Answering these questions requires you to set the scope of your review and to have the skills to determine the value of the items. In both cases, the research questions and objectives should be your starting point.
•Assessing relevance
You are advised to read all the literature that is closely related to the research objectives and research questions. At this stage, focus on relevance rather than on critically assessing the ideas contained within. You must have criteria for the inclusion and exclusion of articles so that, when reading the literature you have found, you reflect on the research objectives and questions and at the same time measure the article against the inclusion/exclusion criteria.
•Assessing value
In assessing the quality of the research being reported in an article you should look at such issues as methodological rigour, theory robustness and the quality of the arguments. For example, some articles may contain results of subjective evaluation rather than those of systematic research. Good examples of the former would be managerial autobiographies in which the experiences of successful entrepreneurs are published, or articles in trade magazines which may represent practices in the profession. Remember to make notes about the relevance of each item as soon as you read it and the reasons for the conclusion you drew from it.
•Assessing sufficiency
Although it is virtually impossible to read everything, determining when you have read sufficient literature is one of the challenges researchers face. This is because as a researcher you need to ensure that your critical review discusses what research has already been undertaken while also positioning your research project in the wider context, citing the main writers in the field. Saunders et al (2009:93) suggest that one clue for knowing that you have achieved this is when further searches provide mainly references to items you have already read.
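This clue can be turned into a rough check. The sketch below is only an illustration under stated assumptions: the function share_already_read and the item identifiers are hypothetical, and the source does not give any numerical threshold, so judging what counts as “mainly” remains up to you.

```python
def share_already_read(new_results, already_read):
    """Fraction of a new search's results that you have already read."""
    new_results = set(new_results)
    if not new_results:
        return 0.0
    return len(new_results & set(already_read)) / len(new_results)


# Hypothetical item identifiers, for illustration only.
already_read = {"item01", "item02", "item03", "item04"}
latest_search = ["item02", "item03", "item04", "item09"]

share = share_already_read(latest_search, already_read)
print(f"{share:.0%} of the latest results were already read")
```

A share that stays high across several differently worded searches suggests that further searching is yielding little that is new.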
Saunders et al (2009:93) compile a checklist of relevant items, from their own experience as well as the experience of several other writers, which can be used in the assessment of both relevance and value. These items are reproduced hereunder for your guidance.
Relevance
•How recent is the item?
•Is the item likely to have been superseded?
•Are the research questions or objectives of the article sufficiently close to your own to make it
relevant to your own research?
•Is the context sufficiently different to make it marginal to your research questions and
objectives?
•Have you seen reference to this item in other items that were useful?
•Does the item contradict or support your argument?
Value
•Does the item appear to be biased, i.e. does it use illogical arguments or emotionally toned words, or appear to choose only those cases that support the point being made?
•What are the methodological omissions within the work?
•Is the precision sufficient?
•Does the item provide a guide for future research?
1.5.8 Recording the literature
The relevant literature identified in the preceding processes must be recorded. But when recording it you must also reflect, determining how the item read will contribute to your research questions and objectives, and make notes with such a focus. Even if you print or photocopy all the materials, you are still advised to make notes because as you do so you think through the ideas in the literature in relation to your research. Sharp et al. (2002), cited in Saunders et al (2009:94), identify three sets of information you need to record:
•Bibliographic details
•Brief summary of content
•Supplementary information
Recording the literature has been made easy these days by the availability of general software like Microsoft Access™ and specialist bibliographic software like ProCite™, Reference Manager for Windows™ and EndNote™, which help not only to record but also to organize and generate references automatically.
The bibliographic details should be recorded in sufficient detail to enable you or other readers to locate the original items. For journal articles, for example, you should take note of the names and initials of the authors, year of publication, title of the article, journal title, volume number, part or issue number, and page numbers. For a text book, you should take note of the names and initials of the authors, year of publication, title and subtitle of the book, edition, place of publication and the publisher. For any other item, you will need to take similar notes, bearing in mind what will be required when generating references for your work. For a guide on how to write references see a later chapter in this study manual.
Brief summaries will include such details as the key words used to locate the items, the abstract, etc. These summaries will help you locate the relevant items later, facilitate references to your notes and photocopies, and maintain consistency in your searches.
Supplementary information can be anything you feel will be of value. Saunders et al (2009:96) summarise some of the useful supplementary details, which include the ISBN of the book, class number, quotations, where it was found, the tertiary resource used and the key words used to locate it, evaluative comments, when the item was consulted, and the file name with which it was saved.
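If you prefer to keep your own records rather than use bibliographic software, the three sets of information can be held together in one structure. The sketch below is a minimal illustration: the class JournalArticleRecord, its field names, the sample record and the reference layout are all hypothetical, and the referencing style you must follow is the one set out in the later chapter of this manual.

```python
from dataclasses import dataclass, field


@dataclass
class JournalArticleRecord:
    """Notes on one journal article, grouped into the three sets of information."""
    # Bibliographic details
    authors: list
    year: int
    title: str
    journal: str
    volume: str
    issue: str
    pages: str
    # Brief summary of content
    summary: str = ""
    keywords_used: list = field(default_factory=list)
    # Supplementary information
    where_found: str = ""
    date_consulted: str = ""
    comments: str = ""

    def reference(self):
        """Format the bibliographic details as a single reference line."""
        names = ", ".join(self.authors)
        return (f"{names} ({self.year}) {self.title}. {self.journal}, "
                f"{self.volume}({self.issue}), pp. {self.pages}.")


# A made-up record, for illustration only.
record = JournalArticleRecord(
    authors=["Author, A.", "Writer, B."],
    year=2009,
    title="An example article title",
    journal="Journal of Examples",
    volume="12",
    issue="3",
    pages="45-67",
    keywords_used=["motivation", "project management"],
)
print(record.reference())
```

Keeping the key words and the place where the item was found alongside the bibliographic details means the record also documents how the item was located.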
1.6 Plagiarism
Plagiarism has become an enormously important topic in recent years, largely as a result of the ease with which materials can be copied from the internet and passed off as the work of the individual student. Plagiarism is a deliberate attempt at passing off the ideas or writings of another person as your own (Burns and Burns, 2008:63) without acknowledging the original source of the ideas used (Easterby-Smith et al., 2008 cited by Saunders et al 2009: 97), and it may include (Saunders et al 2009:97):
•Stealing from another source and passing it off as your own e.g. buying a paper from a research
service, essay bank or term paper mill; copying a whole paper from a source text without proper
acknowledgement; submitting another student’s work with or without that student’s knowledge,
etc.
•Submitting a paper written by someone else (e.g. a peer or relative) and passing it off as your
own
•Copying a section of materials from one or more source texts, supplying proper documentation
including full reference, but leaving out quotation marks, thus giving an impression that the
material has been paraphrased rather than directly quoted.
•Paraphrasing material from one or more source texts without supplying appropriate documentation
To avoid being accused of what is really a form of intellectual theft, you are advised to always provide references in the following cases (Burns and Burns, 2008:63):
•Direct quotation from another source;
•Paraphrased text which you have rewritten and/or synthesized but is based on someone else’s
work;
•Information derived from other studies;
•Statistical information;
•Theories and ideas derived from other authors; and
•Interpretations of events or evidence derived from other sources.
1.7 Summary
In this lecture you were introduced to various sources of literature and the steps you need to follow to search them effectively for what you are looking for. To enable you to achieve this, various aspects such as how to identify keywords and structure effective queries were covered. You were also introduced to the measures you should take to evaluate the literature you find for relevance, value and sufficiency. A few hints on what plagiarism is and how you can avoid it were also covered. The next lecture will introduce you to how you can critically review such literature to fit your purpose.
1.8 Review exercise
Assume that you are interested in identifying the factors that influence the decision of Chinese firms to come and invest in Tanzania.
i) Determine the relevant key words.
ii) Assume that the best source is electronic and then create different sets of search queries.
iii) Using the different search queries you developed, visit the online databases available at the OUT website, include google.com as well as scholar.google.com, and run the search. Remember to make notes regarding the types of items that each of these services finds.
iv) Which service do you think is likely to prove most useful to the research project above?
v) Obtain a couple of articles on this subject and evaluate the value and relevance of each to it.
1.9 References
Saunders, M. N. K., Lewis, P. & Thornhill, A. (2009) Research methods for business students. 5th edition. Harlow: Prentice Hall – Financial Times
Burns, R.B. & Burns, R.A. (2008) Business research methods and statistics using SPSS. London: Sage Publications Ltd.
LECTURE TWO
CRITICAL LITERATURE REVIEW
(Dr. P. Ngatuni)
2.1 Introduction
In the process of working on your research project, one thing you will definitely be asked by your supervisor is to write a literature review. This is also one part that will take you time before you can agree with your supervisor. The process will have started with locating, obtaining, evaluating, recording and reading the literature (see discussions in chapter five). After that you will be ready to conduct a literature review. But what is a literature review, why is it so important to any research project, how do we do it, and how is it written? This lecture is designed to shed light on these questions.
2.2 Learning outcomes
At the end of this lecture you will be able to:
•Discuss the importance and purpose of the critical literature to your research project
•Identify what you need to include when writing your critical review of literature
•Write up a collected review of relevant information and document.
•Design both conceptual and theoretical frameworks from a review of literature and differentiate between them
2.3 What is critical literature review?
A (critical) literature review is a detailed and justified analysis and commentary of the merits and faults of the literature within a chosen area, which demonstrates familiarity with what is already known about your research topic (Saunders, et al. 2009:590). For it to be critical, it needs to cover substantial ground, which requires a lot of reading, to make judgements as to the value of each piece of work, and to organise those ideas and findings that are of value into a review.
Saunders et al (2009:59) summarize the results of a discussion with fellow research tutors about what they observe in the literature reviews students submit (observations which I also share from my supervision experience at OUT). While they acknowledged the work that students normally put into it, the following are the common observations:
•The purpose for which the literature review was undertaken is unclear
•The literature review is a summary of the articles and books read, each article or book being
given one paragraph or two, some of which are arranged either chronologically by authors or by
subject.
•Although some of the items reviewed are grouped together, the purpose of grouping is not made
explicit.
Consequently, Saunders et al likened these pieces of work to a shopping catalogue. Reviewing the literature critically should instead help you (i) establish what research has been published in the chosen area, (ii) identify any other research that might currently be in progress, and (iii) identify research gaps that need to be filled, which will help you situate your research project within those gaps. This implies that you should undertake a critical literature review with a view to enhancing your subject knowledge and helping you to clarify your research question(s) further. How this is achieved is discussed in a later section.
2.4 Importance of critical literature review
Although you may feel that you already have a good knowledge of your research area, reviewing the literature is still essential. The main reasons why you must review the literature are as follows.
•Review of literature will help you to develop a good understanding and insight into relevant
previous research i.e. what research has been published in the chosen area; and the trends that
have emerged.
• Review of literature helps you to generate and refine your research ideas. For example, some
articles – review articles - contain both a considered review of the state of knowledge in the
chosen topic area and pointers towards areas where further research needs to be undertaken.
(Saunders 2009:27). This entails the research gap that needs to be filled.
•When the literature review is critically done, it will help you demonstrate awareness of the
current state of knowledge in your subject, its limitations, and how your research fits in this wider
context. This is a key requirement in any research.
Gall et al (2006) cited by Saunders et al. (2009: 61) add the following purposes:
•To highlight research possibilities that have been overlooked implicitly in research to date.
•To discover explicit recommendations for future research which you could then use as a superb justification for your own research questions and objectives.
•To help you avoid simply repeating work that has been done already.
•To sample current opinions in newspapers, professional and trade journals, thereby gaining insights into the aspects of your research questions and objectives that are considered newsworthy.
•To discover and provide an insight into research approaches, strategies and techniques that may be appropriate to your own research questions and objectives.
You must remember that the significance of your research and what you find out will inevitably be
judged in relation to other people’s research and findings. For example, Jankowicz (2005:161) as
quoted in Saunders et al (2009:59), argues as follows.
“There is little point in reinventing the wheel … the work that you do is not done in a vacuum, but
builds on the ideas of other people who have studied the field before you. This requires you to
describe what has been published and to marshal the information in a relevant and critical way”.
The purpose will also depend on the approach you are intending to use in your research i.e.
whether it is deductive or inductive approach (refer to a relevant lecture for a discussion of this).
For example, in a deductive approach you are expected to develop a theoretical or conceptual
framework and then you subsequently test it using data. In inductive approach on the other hand,
you are expected to explore your data and to develop theories from them that you will
subsequently relate to literature.
It is important to note that the purpose of a literature review is not to provide a summary of everything that has been written on your research topic but to review the most relevant and significant research on your topic. Strauss and Corbin (1998), as cited in Saunders et al (2009: 61), argue that if the literature review is effective, new findings and theories will emerge that neither you nor anyone else has thought about.
2.5 Writing a critical review
2.5.1 Critical reading
Conducting a critical literature review requires critical reading; but what is critical reading?
Wallace and Wray (2006), cited by Saunders et al (2009:63), refer to critical reading as the
capacity to evaluate what you read and the capacity to relate what you read to other information.
They suggest five critical questions that you can employ in critical reading
(Table 2.2.1).
Table 2.2.1: Critical questions for critical reading
Critical question | What for
Why am I reading this? | A focusing device; ensures you stick to the purpose of reading; helps you avoid getting sidetracked by the author’s agenda
What is the author trying to do in writing this? | For deciding how valuable the writing may be for your purposes
What is the writer saying that is relevant to what I want to find out? | Reflecting on your work
How convincing is what the author is saying? | Is the author’s argument/conclusion based on evidence?
What use can I make of the reading? | Determining where it fits in your own work
Source: Constructed from Saunders, et al (2009:63)
To answer the questions in the table above you need some key skills for effective reading, which
include, among others, previewing, annotating, summarising, and comparing and contrasting. The
use of each skill is summarised in Table 2.2.2.
Table 2.2.2: Skills for effective reading
Skill | Use
Previewing | Looking around the document’s text before reading it; it helps establish the document’s purpose and how it may inform your literature search
Annotating | Conducting a dialogue with yourself, the author, and the issues and ideas at stake
Summarising | e.g. outlining the argument of the text and eventually being able to state it in your own words
Comparing and contrasting | Asking yourself how your thinking has been altered by this reading or how the article has affected your response to the issues and themes of your research
Source: constructed from Saunders et al (2009:62)
2.5.2 Approaching critical review
Mingers (2000), cited by Saunders et al (2009:64), considers four aspects of a critical approach to the
review of literature: critique of rhetoric, critique of tradition, critique of authority, and critique of
objectivity. A critique of rhetoric involves appraising or evaluating a problem with effective use of
language, that is, making reasoned judgements and arguing effectively in writing. A critique of
tradition involves questioning the conventional wisdom where there is justification to do so. A critique of
authority involves questioning the dominant view portrayed in the literature you are reading.
Finally, a critique of objectivity involves recognising in your review that the knowledge and
information you are discussing are not value free.
To be effectively critical you need reading skills and the right attitude: read the
literature with scepticism and be willing to question what you are reading. Both require you to
have read widely on your research topic, to have a good understanding of the literature, and to be able
to make reasoned judgements that are argued effectively. Thus critical review may be taken to be a phrase
describing the process of providing a detailed and justified analysis of, and commentary on, the merits
and faults of the key literature within your chosen area. You should therefore refer to and assess research
by recognised experts in your chosen area; consider and discuss research that supports and research that
opposes your ideas; make reasoned judgements regarding the value of others’ research, showing clearly
how it relates to your research; justify your arguments with valid evidence in a logical manner; and
distinguish clearly between fact and opinion.
To ensure that your literature review is sufficiently critical you need to ask yourself whether you have:
• shown how your research question relates to previous research reviewed
• assessed the strengths and weaknesses of previous research reviewed
• been objective in your discussion and assessment of other people’s research
• included references to research that is counter to your opinion
• distinguished clearly between facts and opinions
• made reasoned judgements about the value and relevance of others’ research to your own
• justified clearly your own ideas
• justified your arguments by referencing correctly published research
• highlighted clearly those areas where new research is needed to provide fresh insights, and taken
these into account in your arguments, in particular:
1. where there are inconsistencies in current knowledge and understanding
2. where there are omissions or bias in published research
3. where research findings need to be tested further
4. where evidence is lacking, inconclusive, contradictory or limited
Source: (adapted from Saunders et al 2009: 65, Box 3.3)
The more questions to which you can answer yes, the more likely your review is to be critical.
2.5.3 The process of conducting literature review
The process of conducting a literature review starts when you define your research
questions and objectives and continues in a circular form through various stages. At the end of each
stage, a completed circle, a refinement is made and the process continues, forming the spiral shown
in Figure 2.2.1. How long this spiral goes on depends on how clear the
research questions and objectives are. It begins with the literature search (discussed in the preceding
lecture). Its inclusion here should not be taken as misplaced; rather, it shows the
connectedness of literature search and literature review: there is no review of what you do not
have.
Figure 2.2.1: The literature review process
Source: Saunders, Lewis, Thornhill and Jenkins (2003): reproduced
from Saunders et al (2009:60)
2.5.4 Writing the literature review
The writing process begins at the very beginning, when you run your searches, by taking notes on
the articles you obtain and recording their full bibliographical details. When you scan them and assess
their relevance and value to your research project you should also note down the ways in which
they will be useful. In the end, when you begin writing the report of the review, you
really need skill in making your arguments. Many Master’s dissertations presented by our students
suffer from this deficiency and you should try as much as you can to avoid it. For example, Burns
and Burns (2008:61) advise that you should not write a literature review that is either
disorganized rambling or a chain of pointless isolated summaries of each document, with each
sentence beginning with ‘Brown (2001) says …’, ‘Burns (2005) says …’, ‘Green (2003) says …’,
etc. Saunders et al (2009:58) liken such reviews to adjacent pages of a shopping catalogue
rather than a piece of literature review. To avoid this, Burns and Burns suggest that you should
create a coherent argument that paraphrases and evaluates the literature and shows its relevance to
your problem or topic.
Since you make these arguments using what others have written, it is important that you give due
acknowledgement to all your sources of information lest you be accused of plagiarism, the
academic crime discussed in the preceding lecture. A guide on what to cite, how to cite and how to
show bibliographic details of the different types of documents (books, chapters in books, journal
articles, magazines and newspapers, e-mails, web pages, etc.) is given in a separate lecture in
this learning material.
2.5.5 The content of the critical review
When thinking about the content of the critical review, remember that you need to be able to
combine academic theories and ideas. To achieve this, you must (i)
evaluate the research that has already been undertaken in the area of your research; (ii) show and
explain the relationships between published research findings and reference the literature in which
they were reported; (iii) draw out the key points and trends, recognising any omissions and biases;
and (iv) present them in a logical way, also showing the relationship to your own research project.
To this end, Saunders et al (2009:63) recommend that you consider:
• Including the key academic theories within your chosen area of research
• Demonstrating that your knowledge of your chosen area is up-to-date
• Enabling those reading your project report to find the original publications you cite
• Acknowledging the research of others and writing it in the format prescribed in the assessment
criteria
2.6 Structure of the critical review
The literature review needs to be a description and critical analysis of what other authors have written
(Jankowicz, 2005 cited by Saunders et al 2009). Therefore in writing your review you need to
focus on your research question(s) and objectives. Saunders et al (2009:66) suggest that this focus
may be achieved by either:
• thinking of your literature review as discussing how far existing published research goes in
answering your research questions, addressing the shortfalls at a later stage; or
• asking yourself how your review relates to your objectives: if the answer is that it does not, or
that it does only partially, then you know that there is a need for a clearer focus on your objectives.
Although the precise structure of the critical review is usually a matter of choice, you may
need to check the assessment criteria. It can be a single chapter, a series of chapters, or
run through the whole report as you tackle issues. Most researchers have found it useful to
think of the structure of their review as a funnel. This can be achieved by:
• Starting at a more general level before narrowing down to your specific research question(s)
and objectives
• Providing a brief overview of key ideas and themes
• Summarising, comparing and contrasting the research of the key writers
• Narrowing down to highlight previous research work most related to your own research
• Providing a detailed account of the findings of this research and showing how they are related
• Highlighting those aspects where your own research will provide fresh insights
• Leading the reader into subsequent sections of your project report which explore these issues.
Figure 2.1.3 shows an example of the funnel model. The width of the funnel represents the number
of publications that are available and the depth represents increasing or decreasing specialisation or
relevance. At the top of the review funnel, there are publications that cover the general area of the
research e.g. general text books, encyclopaedias, and so on. Next, there are the historical and
seminal research papers which broach the general field of study. However, as one moves down the
review funnel, towards one's specifically chosen field of endeavour, then the number of available
papers decreases because specialisation has increased. Therefore, Toncich (2007) recommends that,
as an effective means of structuring the review section, one should typically commence with
generalities and history and gradually work through to more and more specific issues that
ultimately lead to the directions proposed for the research program. A typical sequence would be
to start with (i) Encyclopaedia or Web-Based (Internet) Search; (ii) General Text Book; (iii)
Specific Text Book; (iv) Journal and Conference Papers; (v) Trade Journals and Marketing
Literature. A high-quality encyclopaedia is often a far better starting point for a research program
than a text book or a journal paper, simply because credible, professionally-edited encyclopaedias
cover a subject to a low depth but with a large breadth – various sections in such encyclopaedias
are generally written by people who are chosen because they are internationally regarded in their
field. Several pages of an encyclopaedia can often reveal the entire spectrum of key words,
phrases and related areas of study that can subsequently be examined through text books and
journal papers. Encyclopaedias also tend to provide historical time-frames and backgrounds that
can be used to search for seminal papers during the course of a literature review. Better still; an
encyclopaedia will often place an entire field of study
Figure 2.1.3: The funnel model of literature structure
Source: Toncich (2007:146)
The next problem is how to write up the results of the literature review, specifically how to structure
the main body of the literature review section. Carnwell and Daly (2001) recommend four different
strategies. The choice is yours, depending on what you want to achieve and what works better for
you as well as for the subject that you are treating. These strategies are (i) examining the
theoretical literature and then the methodological literature underpinning the selected study; (ii)
examining the theoretical literature and then the empirical literature in discrete sections; (iii)
dividing the literature into content themes; and (iv) examining the literature chronologically.
Examining the theoretical literature and then the methodological literature underpinning the selected study
This fits best in situations where empirical literature is absent and the only literature available
is of a theoretical nature. Such subjects often generate the need for qualitative research,
such as grounded theory. The purpose of the literature review in this case is two-fold: first, to
review the theories on the subject, and second, to consider the implications of these theories for
the development of an appropriate methodology to conduct a new study. The theoretical
papers that critically discuss the nature, constituents and dimensions of the topic as portrayed by
various authors could therefore be reviewed, asking questions such as: whether there is a consensus
regarding the meaning, nature or constitution of the topic; whether there are counter-arguments as to the
meaning, nature and constitution of the topic and, if so, what they are; and whether you, the researcher,
agree with these counter-arguments.
Examining the theoretical literature and then the empirical literature in discrete sections
This fits situations where the topic area contains many theoretical works (those that discuss or
describe a concept, construct, or topic without being based on actual research) and empirical papers
(those based on research with identified findings). The literature review in these cases can be
divided into two sections: (i) a section devoted to reviewing the theoretical literature and (ii) a
section devoted to the empirical literature. Analysis and evaluation of the empirical literature,
however, will need to include critical appraisal of the methodologies used within the studies reviewed.
You will need to answer questions such as (i) whether the methodology of one study produced more valid
results than that of another and (ii) whether one study has more practical relevance than another.
Dividing the literature into themes
This strategy fits situations in which the literature can be divided into distinct themes,
which come from within the literature itself. Alternatively, the review can be
categorized methodologically, e.g. (i) studies utilizing a survey approach; (ii) studies utilizing
interview approaches; (iii) studies using experimental approaches; (iv) studies using patient
simulations. This method integrates theoretical and empirical literature, and might serve
to guard against the temptation to be merely descriptive. Questions that need to be answered could
include whether the evidence is conclusive; whether there is theoretical consensus; whether
counter-arguments or counter-evidence exist and, if not, whether you can think of any; whether there are
multiple viewpoints or positions and what your considered view is; etc. Once you have reviewed
and synthesized the literature within each theme, you should write a short summary identifying the
key arguments and how they relate to the next theme. This technique ensures that each theme
flows appropriately on to the next, so that the review has a logical structure.
Examining the literature chronologically
A chronological strategy of reviewing the literature would fit best in subject matter that has
evolved over time periods, in which theories have been developed, tested and refined over several
decades. Quick examples would include the theory of capital structure, or the capital market
efficiency hypotheses in Finance. As with the other methods, you would lay out the literature in a
clear structure, and analyse the literature within each time period. Questions pertaining to the
literature would be the same as those discussed above.
2.7 Identification of research gaps (Concluding the literature review)
The conclusion part of the literature review should integrate all the theme summaries into a broad
terminal conclusion, which would logically lead onto the purpose of a new study and possible
conceptual framework. In formulating a conclusion, it is necessary to draw together conclusions
from both categories into the main conclusions. Gaps and shortcomings in previous works should
now be evident, as should the reasons why these works may not answer a particular research question,
which therefore needs to be investigated. It might equally be justified to replicate one of the studies
reviewed, for example on a different or larger population group. The gaps and shortcomings identified logically
lead onto the purpose of a proposed study. It may also be possible to use the material in the
different sections of the review to formulate either a conceptual or a theoretical framework. These
are discussed in the next section.
2.8 Conceptual and theoretical frameworks
If you have done your literature review critically, then the theory should be clearly delineated,
showing the relationships between factors. From this, frameworks must be developed with
which you will be able to make sense of the relationships and identify the relevant variables.
These frameworks are normally presented graphically, although brief verbal material can be added
to explain (i) why you suggest these relationships and (ii) why you believe the variables are
relevant. There are two types of framework, each of which offers particular support: theoretical
and conceptual frameworks.
2.8.1 Conceptual framework
For many quantitative studies this is quite an important item. Let us say, for example, that
you are trying to establish whether employees’ retention rate is related to leadership style. After
critically reviewing the relevant literature, you realize that the leadership style in
organisations, i.e. the degree to which it is authoritarian or democratic, is a factor that is likely to
cause stress in employees and in turn affect their likelihood of staying in the organisation. This can
be presented graphically (hence conceptually) as shown in Figure 2.1.4, where the concepts are
placed in ovals or ellipses while the arrows indicate the direction of influence.
[Figure: Leadership styles (authoritarian, democratic) → Employee stress → Retention rates]
Figure 2.1.4: Conceptual framework
Source: Adapted from Burns and Burns (2008:71)
So a conceptual framework helps you to link abstract concepts to theory, and it is the first stage in
designing a piece of research. As the phrase implies, the concepts are abstract and may not be directly
measurable, hence the need to turn them into measurable variables.
2.8.2 Theoretical framework
Recall from the preceding section that a conceptual framework is a mechanism through which you
can portray graphically the relationship between abstract concepts proposed by the results of the
critical literature review. A theoretical framework, on the other hand, is a graphical portrayal of
the relationship that is proposed to exist between those concepts, but this time using measurable
variables for the constructs. The added value here is the measurable variables, together with the
definition of the expected relationships between them and the specification of the direction of such
relationships. These in turn are the basis for formulating testable hypotheses. Figure 2.1.5 presents
the same information as Figure 2.1.4 but extends it to show the theoretical
framework, i.e. the relationship in terms of measurable variables.
[Figure: Conceptual framework: Leadership styles (authoritarian, democratic) → Employee stress → Retention rates.
Theoretical framework: Management styles as measured by questionnaires → Number of employees leaving each month]
Figure 2.1.5: Conceptual vs. theoretical framework
Source: Adapted from Burns and Burns (2008:79)
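Once each concept has a measurable variable, the proposed relationship can be stated as a testable
hypothesis and checked against data. The following is only a minimal sketch in Python of how that
might look. It assumes a hypothetical questionnaire score for leadership style (a higher score meaning
a more authoritarian style) and hypothetical monthly counts of leavers for six departments. The data,
the variable names and the helper function pearson_r are invented for illustration and do not come
from Burns and Burns.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data, one value per department (invented for illustration only)
authoritarian_score = [1, 2, 3, 4, 5, 6]   # questionnaire score per department
leavers_per_month = [2, 3, 3, 5, 6, 8]     # employees leaving each month

r = pearson_r(authoritarian_score, leavers_per_month)
print(f"Correlation between leadership score and leavers: r = {r:.2f}")
```

A coefficient close to +1 or -1 would suggest a strong linear association in the hypothesised or the
opposite direction; a coefficient near 0 would suggest none. A correlation alone does not establish
that leadership style causes employees to leave.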
2.9 Summary
In this lecture you have learned the importance of reviewing the literature critically and how it can
be done. You have also learned that if your literature review is done successfully, you should be
able to develop both conceptual and theoretical frameworks; how to develop them, how they differ and
why they are important were explained.
2.10 Review exercise
A classy hotel on the beaches of Kigamboni has three different classes of accommodation –
standard, deluxe and royal. The standard was aimed to provide inexpensive accommodation for
families on holiday. The deluxe was meant for business travellers. The royal provided high quality
services to wealthy international travellers. The hotel’s CEO was wondering how to differentiate
among these three different types of accommodation in order to attract the appropriate type of
client for each. In essence, the CEO felt that the revenues could be increased if clients and
potential clients understood these distinctions better. Keen to develop a strategy to avoid any
potential confusion on these types of accommodation, he commissioned a customer survey of
those who had used each type of the facility. The results showed that many customers were
unaware of the differences; many complained about the age of the buildings and their poor
maintenance; the service quality was rated poor; and rumours were spreading that a name change
was likely and franchise owners were becoming angry. The CEO then realized that he needed to
understand why the different accommodation classifications would be important so as to develop a
marketing strategy. He also recognized that unless the franchise owners cooperated his plans
would never reach fruition.
Required: From this case:
• Identify the problem
• Identify the themes that you could include when writing a literature review on the case
• Develop a conceptual framework
• Develop a theoretical framework
2.11 References
Burns, R.B. & Burns R.A. (2008) Business research methods and statistics using SPSS. London:
Sage publications Ltd.
Carnwell, R. & Daly, W. (2001) Strategies for the construction of a critical review of literature.
Nurse Education in Practice (1) 57-63.
Hart, Chris (2008) Doing literature review: Releasing the social science research imagination.
Los Angeles: Sage Publications Limited
Kothari, C. R. (2008) Research methodology: Methods and techniques. 2nd revised edition. New
Delhi: New Age International (P) Limited
Saunders, M. N. K., Lewis, P. & Thornhill, A. (2009) Research methods for business students. 5th
edition. Harlow: Prentice Hall – Financial Times
Toncich, D.J. (2007) Key factors in postgraduate research – A guide to students Ch 7.
http://www.doctortee.net/files/PHDBK2006-07p.pdf. Accessed on 30th July 2010
MODULE THREE
RESEARCH DESIGN AND DATA COLLECTION METHODS
LECTURE ONE
RESEARCH DESIGN
(By Dr. A. Gimbi)
1.1 Introduction
A problem that follows the task of defining the research problem is the preparation of the design
of the research project, popularly known as the “research design”. The meaning of research design
can be described in various ways. Research design is the conceptual structure within which
research is conducted; it constitutes the blueprint for the collection, measurement and analysis of
data. Research design is a plan for collecting and utilizing data so that desired information can be
obtained with sufficient precision or so that a hypothesis can be tested properly. Research design
provides the glue that holds the research project together. A design is used to structure the
research, to show how all of the major parts of the research project such as the samples or groups,
measures, treatments or programs, and methods of assignment, work together to try to address the
central research questions.
1.2 Learning outcomes
By the end of this lecture you should be able to:
• Explain the importance of research design
• Describe the importance of having thought carefully about your research design
• Outline the main types of research designs and explain why these should not be thought
of as always mutually exclusive
• Outline basic principles of experimental research design
• Describe the main types of experimental designs
• Explain the reasons for adopting multiple designs in the conduct of research
1.3 Need for research design
Research design is required because it facilitates the smooth running of the various research
operations, thereby making research as efficient as possible, yielding maximal information with
minimal expenditure of effort and resources such as time and funds. Just as we need a blueprint
(commonly known as a house map), well thought out and prepared by an expert architect, for the better,
more economical and more attractive construction of a house, so we need a research design or plan
in advance of data collection and analysis for a research project. The research design
must be prepared with great care, since any error in it may upset the entire research project. Research
design in fact has a great bearing on the reliability of the results arrived at and as such constitutes
the firm foundation of the entire edifice of the research work. The design helps the researcher to
organize ideas in a form that makes it possible to look for flaws and
inadequacies. Such a design can even be given to others for their comments and critical
evaluation. In the absence of such a design it will be difficult for a critic to provide a
comprehensive review of the proposed study.
Thoughtlessness in designing the research project may render the research exercise
futile. It is therefore imperative that an efficient and appropriate design be prepared before
starting research operations.
1.4 Features of a good research design
A good research design is frequently characterised by adjectives like flexible, appropriate,
efficient, economical and so on. Generally, the design which minimises bias and maximises the
reliability of the data collected and analysed is considered a good design. The design which gives
the smallest experimental error is supposed to be the best design in many investigations. Similarly,
a design which yields optimal amount of information and provides an opportunity for considering
many different aspects of a problem is considered most appropriate and efficient design in respect
of many research problems. Thus, the question of good design is related to the purpose or
objective of the research problem and also with the nature of the problem to be studied. A design
may be quite suitable in one case, but may be found wanting in one respect or the other in the
context of some other research problems. No single design can serve the purpose of all types
of research problems.
A research design appropriate for a particular research problem must at least contain and involve
the consideration of the following factors:
i. A clear statement of the research problem
ii. The objective of the problem to be studied
iii. Population to be studied
iv. Procedures and techniques to be used for gathering the information
v. Methods to be used in processing and analysing data
vi. The ability and skills of the researcher and his staff, if any
vii. The availability of time and money for the research work
Crucially, when you choose to employ a particular research design, it should reflect the fact that
you have thought carefully about why you are employing your particular research design. You
should be able to answer why you chose to conduct your research in a particular organisation, why
you chose the particular department, and why you chose to talk to one group of staff rather than
the other. You must have valid reasons for all your research design decisions. The justification
should always be based on your research question(s) and objectives, as well as being consistent
with your research philosophy.
1.5 Types of Research Designs
Understanding the types of designs and relationships among them is important in making design
choices and thinking about the strengths and weaknesses of different designs. Research designs
can be categorised as:
1.5.1 Exploratory research (Formulative research) design:
Exploratory research is research into the unknown. It is used when you are investigating
something but really don't understand it at all, or are not completely sure what you are looking for.
It is rather like a journalist whose curiosity is piqued by something and who just starts looking into
it without really knowing what they are looking for. Exploratory research is a valuable
means of finding out 'what is happening; to seek new insights; to ask questions and to assess
phenomena in a new light'. It is particularly useful if you wish to clarify your understanding of a
problem, for instance if you are unsure of its precise nature. The main purpose of such
studies is to formulate a problem for more precise investigation or to develop
working hypotheses from an operational point of view. It may well be that time is well spent on
exploratory research, as it may show that the research is not worth pursuing!
The research design appropriate for such studies must be flexible enough to provide opportunity
for considering different aspects of the problem under study. Built-in flexibility in research design is
needed because the research problem, broadly defined initially, is transformed into one with more
precise meaning in exploratory studies, which may necessitate changes in the research
procedure for gathering relevant data.
There are three principal ways (methods) of conducting exploratory research:
• A search of literature (survey of relevant literature)
The survey of relevant literature is the simplest and most fruitful method of formulating
the research problem precisely or developing hypotheses. Hypotheses stated by earlier workers
may be reviewed and their usefulness evaluated as a basis for further research. It may also be
considered whether the already stated hypotheses suggest new hypotheses. In this case you should
review and build upon the work already done by others, but in cases where hypotheses have not
yet been formulated, your task is to review the available material in order to derive the relevant
hypotheses from it.
• Interviewing 'experts' in the subject (Experience survey)
This method involves a survey of people who have had practical experience with the problem to
be studied. The aim of such a survey is to obtain insight into the relationships between variables
and new ideas relating to the research problem. You must carefully select competent people who
can contribute new ideas as respondents to ensure representation of different types of experience.
The interview must ensure flexibility by allowing respondents to raise issues and questions which
you have not previously considered. Thus an experience survey may enable you to define a
problem more concisely and help in the formulation of the research hypothesis. Such a survey may
also provide information about the practical possibilities for doing different types of research.
• Analysis of 'insight-stimulating' examples
This method is particularly suitable in situations where there is little experience to serve as a
guide. The method consists of intensive study of selected instances of the phenomenon in which
you are interested. In this case you can adopt different approaches such as examination of existing
records, and unstructured interviewing. Your attitude, intensity of the study and your ability to
draw together diverse information into a unified interpretation are the main features which make
this method an appropriate procedure for evoking insights.
1.5.2 Descriptive/Diagnostic/Survey research design
Descriptive research studies are those concerned with describing the characteristics of a particular
individual, group, event or situation, whereas diagnostic research studies determine the
frequency with which something occurs or its association with something else. Studies
concerning whether certain variables are associated are examples of diagnostic research studies. In
contrast, studies concerned with specific predictions, or with the narration of facts and characteristics
concerning an individual, group or situation, are all examples of descriptive research studies.
Descriptive research is thus a type of research that is primarily concerned with describing in detail the
nature, conditions and degree of the present situation. The emphasis is on describing
rather than on judging or interpreting.
In descriptive as well as in diagnostic studies, you must be able to define clearly what you want to
measure and must find adequate methods for measuring it, along with a clear-cut definition of the
population you want to study. Since the aim is to obtain complete and accurate information in the
said studies, the research design must make enough provision for protection against bias and must
maximise reliability, with due concern for the economical completion of the research study. The
design in descriptive/diagnostic studies must be rigid and not flexible and must focus attention on
the following:
a. Problem and objective formulation (what the study is about and why is it being made?)
The research problem being tested should be explicitly formulated in the form of a question.
When formulating the objective of your study, you must specify the objective with sufficient
precision to ensure that the data collected are relevant. If this is not done carefully, the study may
not provide the desired information.
b. Designing/selection of methods of data collection (what techniques of gathering data will be adopted?)
You need to carefully design and/or select methods by which you will obtain the appropriate data.
Several methods such as observation, questionnaires, interviewing, examination of records, etc.,
with their merits and limitations, are available for the purpose and you may use one or more of
these methods. While designing data-collection procedure, you have to ensure adequate
safeguards against bias and unreliability. It is always desirable to pretest your data collection
instruments before you finally use them for study purposes.
c. Selecting the sample (how much material will be needed?)
In most descriptive/diagnostic studies, you will draw one or more samples and then wish to make
statements about the population on the basis of the sample analysis or analyses. Two important
questions arise frequently when you plan to select a sample, namely:
- How big should the test sample be?
- What is the probability of mistakes occurring in the use of test sampling (instead of
the whole population)?
Special care should be taken with the selection of test samples. The results obtained from a survey
can never be more authentic than the standard of the population or the representativeness of the test
sample. The size of the test sample can also be specified by means of statistics. Bear in mind
that it is desirable for the test sample to be as large as possible. The most
important criterion that serves as a guideline here is the extent to which the test sample
corresponds with the qualities and characteristics of the general population being investigated.
Take the following three factors into consideration before you decide on the size of
the test sample:
- What is the grade of accuracy expected between the test sample and the general
population?
- What is the variability of the population? (This, in general terms, is expressed as the
standard deviation.)
- What methods should be used in test sampling?
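The link between accuracy, variability and sample size can be illustrated with a minimal sketch in Python. It assumes simple random sampling from a large population and uses the standard formula n = (z x s / E)^2, where s is the population standard deviation, E is the margin of error you are prepared to tolerate and z is the standard normal value for the chosen confidence level. The function name and the example figures are illustrative assumptions, not values taken from this text.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(std_dev, margin_of_error, confidence=0.95):
    """Smallest n for estimating a mean under simple random sampling.

    Assumes a large population and a known (or guessed) standard deviation.
    """
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * std_dev / margin_of_error) ** 2)


# Hypothetical figures: standard deviation of 12 units,
# tolerated error of 2 units versus 4 units
n_small_error = sample_size_for_mean(std_dev=12, margin_of_error=2)
n_large_error = sample_size_for_mean(std_dev=12, margin_of_error=4)
print(n_small_error, n_large_error)
```

Halving the tolerated error roughly quadruples the required sample, which is why the expected grade of accuracy should be settled before the sample is drawn. Other sampling methods (for example stratified or cluster sampling) need different formulas.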
d. Collecting the data (where can the required data be found and with what time period should the data be related?)
Data collection refers to the gathering of information aimed at proving or refuting some facts. This
is an important issue in the research process, as the validity of the results of a statistical analysis clearly
depends on the reliability and accuracy of the data used. In fact the reliability and accuracy of the
data depend on the method of collection.
Sources of data
There are two major sources of data that you can use: primary and secondary sources.
Primary sources
This is information that you gather directly from experimental studies or respondents using your
research instruments. In experimental studies this information is obtained by measuring the
variable(s) of interest.
Secondary sources
This is information that you gather from other previous studies e.g. published material and
information from internal sources such as raw data and unpublished summaries.
Use of secondary data saves you time and cost and provides insight into the outcomes of similar
studies.
Steps in data collection
• Define your sample
• Reflect on the research design
• Ensure research instruments are ready
• Define the data to be collected and how you are going to analyse them
• Request permission to collect data from the relevant authorities
• Pretest your research instruments
To obtain data free from errors introduced by those responsible for collecting them, it is necessary
for you to supervise closely the staff of field workers as they collect and record information. You
may set up checks to ensure that the data collecting staff perform their duty honestly and without
prejudice. As data are collected, you should examine them for completeness, comprehensibility,
consistency and reliability.
e. Processing and analysing the data
Data processing and analysis includes steps such as coding the interview responses, observations
etc.; tabulating the data; and performing several statistical computations. Statistical measures
such as averages, percentages and various coefficients must be worked out. Probability and
sampling analysis may also be used. The appropriate statistical operations, along with the use
of appropriate tests of significance, should be carried out to safeguard the drawing of conclusions
concerning the study. To the extent possible, make sure you plan in detail the processing and
analysing procedures before you embark on actual work.
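The coding, tabulation and computation steps described above can be sketched in a few lines of Python. The response codes, the ten answers and the ages are hypothetical data invented purely for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical coding scheme for one closed-ended interview question
codes = {1: "agree", 2: "neutral", 3: "disagree"}

# Hypothetical coded answers and ages of ten respondents
responses = [1, 1, 2, 3, 1, 2, 1, 3, 1, 2]
ages = [23, 35, 31, 28, 40, 26, 33, 29, 37, 28]

# Tabulation: how often each code occurs
counts = Counter(responses)

# Percentages of respondents in each category
percentages = {codes[c]: 100 * counts[c] / len(responses) for c in codes}

# A simple average
average_age = mean(ages)

for label, pct in percentages.items():
    print(f"{label}: {pct:.0f}%")
print(f"Average age: {average_age}")
```

Planning such a script before fieldwork begins is one practical way of fixing the processing and analysis procedures in advance.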
f. Reporting the findings
This is the task of communicating the findings to others, and you must do it in an efficient
manner. The report entails the reproduction of factual information, the interpretation of data,
conclusions derived from the research and recommendations. The layout of the report needs to be
well planned so that all things relating to the research study may be well presented in simple and
effective style.
You should make sure that you understand the meaning of the terminology used. Consult the
recommended/other sources for detailed explanations. However, further reference must be made
to aspects related to test sampling.
The differences between Exploratory and Descriptive designs can be summarized as shown in
Table 3.1.1.
Table 3.1.1: Summary of differences between exploratory and descriptive research designs
Research design | Exploratory study | Descriptive study
Overall design | Flexible design (design must provide opportunity for considering different aspects of the problem) | Rigid design (design must make enough provision for protection against bias and must maximise reliability)
(i) Sampling design | Non-probability sampling design (purposive or judgement sampling) | Probability sampling design (random sampling)
(ii) Statistical design | No pre-planned design for analysis | Pre-planned design for analysis
(iii) Observational design | Unstructured instruments for collection of data | Structured or well thought out instruments for collection of data
(iv) Operational design | No fixed decisions about the operational procedures | Advanced decisions about operational procedures
Source: Kothari, (2009)
1.5.3 Experimental (Hypothesis-testing) research design
Hypothesis-testing research studies are those where you test the hypotheses of causal relationships
between variables. Such studies require procedures that will not only reduce bias and increase
reliability, but will permit drawing inferences about causality. This type of research design is
known by a variety of names, including cause-and-consequence, before-and-after,
control-group and laboratory design. It is research designed to study cause and consequence. A
clear distinction between the terms experiment and experimental research should be evident. In the
former there is normally no question about the interpretation of data in the discovery of new
meaning. Experimental research, however, has control as a fundamental characteristic. The selection
of control groups, based on proportional selection, forms the basis of this type of research.
Experimental research is basically the method that you can apply in a research laboratory. The
basic structure of this type of research is elementary: two situations (cause and consequence) are
assessed in order to make a comparison. Following this, attempts should be made to treat the one
situation (cause) from the outside (external variable) to effect change, and then to re-evaluate the
two situations. The perceivable changes that occurred can then be presumed to be caused by the external
variable. Since experimental designs originated in the context of agricultural operations, several
agricultural terms (such as treatment, yield, plot, block, etc.) are still used in experimental
designs.
1.5.3.1 Basic Principles of Experimental designs
a) Principle of replication
According to this principle, the experiment should be repeated more than once. Thus each
treatment is applied to many experimental units instead of one. By doing so, the statistical accuracy
of the experiment is increased.
Conceptually replication does not present any difficulty, but computationally it does. For instance,
if an experiment requiring a two-way analysis of variance is replicated, it will then require a three-way analysis of variance, since replication itself may be a source of variation in the data. However,
you should remember that replication is introduced in order to increase the precision of a study,
that is to say, to increase the accuracy with which the main effects and interactions can be
estimated.
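The gain in precision from replication can be illustrated with the standard error of a treatment mean. Under the usual assumption of independent observations with a common standard deviation, the standard error equals that standard deviation divided by the square root of the number of replications. The function name and the figures below are illustrative assumptions.

```python
import math


def standard_error(std_dev, replications):
    """Standard error of a treatment mean based on `replications` units."""
    return std_dev / math.sqrt(replications)


# Hypothetical plot-to-plot standard deviation of 10 yield units
se_4 = standard_error(10, 4)    # treatment applied to 4 experimental units
se_16 = standard_error(10, 16)  # treatment applied to 16 experimental units
print(se_4, se_16)
```

Quadrupling the number of replications halves the standard error, so each further gain in precision costs progressively more experimental units.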
b) Principle of randomisation
The principle provides protection against the effects of extraneous factors by randomization. In
other words, the principle indicates that you should design or plan the experiment in such a way
that the variations caused by extraneous factors can all be combined under the general heading of
“chance”. For instance, if you grow one variety of rice in the first half of a field
and the other variety in the other half, then it is quite possible that the soil fertility may be different
in the first half in comparison to the other half. If this is so, your result would not be realistic. In
such a situation, you may assign the variety of rice to be grown in different parts of the field on the
basis of some random sampling technique i.e., you may apply randomization principle and protect
yourself against the effects of the extraneous factors (soil fertility differences in the given case).
As such, through the application of the principle of randomization, you can have a better estimate
of the experimental error.
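The rice example above can be sketched as a random allocation of varieties to plots. This is a minimal sketch assuming a hypothetical field of eight plots and equal replication of each variety; the function name, plot count and seed are illustrative assumptions.

```python
import random


def randomise_plots(varieties, n_plots, seed=None):
    """Randomly allocate varieties to plots, with equal replication."""
    if n_plots % len(varieties) != 0:
        raise ValueError("n_plots must be a multiple of the number of varieties")
    rng = random.Random(seed)
    # Start from a balanced layout, then shuffle the plot order
    layout = list(varieties) * (n_plots // len(varieties))
    rng.shuffle(layout)
    return layout


# Hypothetical field of 8 plots and two rice varieties
layout = randomise_plots(["Variety 1", "Variety 2"], n_plots=8, seed=1)
for plot_number, variety in enumerate(layout, start=1):
    print(f"Plot {plot_number}: {variety}")
```

Because chance alone decides which variety goes where, any fertility difference between parts of the field is spread across both varieties instead of favouring one of them.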
c) Principle of Local Control
Under this principle the extraneous factor, the known source of variability, is made to vary
deliberately over as wide a range as necessary and this needs to be done in such a way that the
variability it causes can be measured and hence eliminated from the experimental error. This
means that you should plan the experiment in a manner that you can perform a two-way analysis
of variance, in which the total variability of the data is divided into three components attributed to
treatments (e.g. variety of rice), the extraneous factor (e.g. soil fertility) and experimental error.
Through the principle of local control you can eliminate the variability due to extraneous factor(s)
from the experimental error.
1.5.3.2 Commonly used experimental designs
Research designs can be weak or strong (or quasi which are moderately strong; that is, in between
the weak and the strong designs) depending on the extent to which they control for the influence
of confounding variables.
Experimental design refers to the framework or structure (outline, plan or strategy) of an
experiment. There are several experimental designs which can be classified into two broad
categories, viz., informal experimental designs and formal experimental designs.
Informal experimental designs are those designs that normally use a less sophisticated form of
analysis:
• Before-and-after without control design
• After-only with control design
• Before-and-after with control design
Formal experimental designs:
• Completely randomized design (CRD)
• Randomized block design (RBD)
• Latin square design
• Factorial designs
a) Informal experimental designs
i) Before-and-after without control design/One-group post-test design
In such a design a single test group or area is selected and the dependent variable is measured
before the introduction of the treatment. The treatment is then introduced and the dependent
variable is measured again after the treatment has been introduced. The effect of the treatment
would be equal to the level of the phenomenon after the treatment minus the level of the
phenomenon before the treatment.
Test area: Level of phenomenon before treatment (X) -> Treatment introduced -> Level of phenomenon after treatment (Y)
Treatment Effect = (Y) - (X)
Figure 3.1.1: Before-and-after without control design/One-group post-test design
Source: Kothari, (2009).
This one-group design is a very weak research design with two problems:
• A serious problem with this design is that you do not know whether the treatment
condition had any effect on the dependent variable, because you have no idea what the
response would have been without exposure to the treatment condition. That is, you do not have a
control group to make your comparison with.
• Another problem with this design is that you do not know whether some confounding
extraneous variable affected the dependent variable. With the passage of time, considerable
extraneous variation may enter the measured treatment effect.
Because of the problems with this design it generally gives little evidence as to the effect of the
treatment condition.
ii) After-only with control design
In this design two groups or areas (test area and control area) are selected and the treatment is
introduced into the test area only. The dependent variable is then measured in both areas at the
same time. The treatment impact is assessed by subtracting the value of the dependent variable in
the control area from its value in the test area.
Test area: Treatment introduced -> Level of phenomenon after treatment (Y)
Control area: Level of phenomenon without treatment (Z)
Treatment Effect = (Y) - (Z)
Figure 3.1.2: After-only with control design
Source: Kothari, (2009).
The basic assumption in such a design is that the two areas are identical with respect to their
behaviour towards the phenomenon considered. If this assumption is not true, there is the
possibility of extraneous variation entering into the treatment effect. However, data can be
collected in such a design without the introduction of problems with the passage of time. In this
respect the design is superior to before-and-after without control design.
iii) Before-and-after with control design/Pretest-posttest control-group design
In this design two areas are selected and the dependent variable is measured in both areas for
an identical time-period before the treatment. The treatment is then introduced into the test area
only, and the dependent variable is measured in both areas for an identical time-period after the
introduction of the treatment. The treatment effect is determined by subtracting the change in the
dependent variable in the control area from the change in the dependent variable in the test area.
Area | TIME-PERIOD I | TIME-PERIOD II
Test area | Level of phenomenon before treatment (X) | Treatment introduced; level of phenomenon after treatment (Y)
Control area | Level of phenomenon without treatment (A) | Level of phenomenon without treatment (Z)
Treatment Effect = [(Y) - (X)] - [(Z) - (A)]
Figure 3.1.3: Before-and-after with control design/Pretest-posttest control-group design
Source: Adapted from Kothari, (2009)
This design is superior to the above two designs for the simple reason that it avoids extraneous
variation resulting both from the passage of time and from non-comparability of the test and
control areas. But at times, due to lack of historical data, time or a comparable control area, you
should prefer to select one of the first two informal designs stated above.
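The three treatment-effect formulas of the informal designs can be compared side by side in a short Python sketch. The symbols X, Y, Z and A follow Figures 3.1.1 to 3.1.3; the function names and the measurements are hypothetical values chosen only for illustration.

```python
def before_after_without_control(x, y):
    """Treatment effect = Y - X (single test area)."""
    return y - x


def after_only_with_control(y, z):
    """Treatment effect = Y - Z (test area versus control area)."""
    return y - z


def before_after_with_control(x, y, a, z):
    """Treatment effect = (Y - X) - (Z - A)."""
    return (y - x) - (z - a)


# Hypothetical levels of the phenomenon
x, y = 40, 55  # test area: before and after treatment
a, z = 42, 47  # control area: period I and period II, no treatment

effect_1 = before_after_without_control(x, y)
effect_2 = after_only_with_control(y, z)
effect_3 = before_after_with_control(x, y, a, z)
print(effect_1, effect_2, effect_3)
```

With these invented figures the three designs give different answers. The third design removes both the change that would have happened anyway over time (Z - A) and the initial difference between the two areas.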
b) Formal experimental designs
i) Completely randomized design (CRD)
This design involves only two principles, viz., the principle of replication and the principle of
randomisation. It is the simplest possible design and its procedure of analysis is also the easiest.
Subjects are randomly assigned to experimental treatments (or vice versa). For instance, if you
have 10 subjects and if you wish to test 5 under treatment A and 5 under treatment B, the
randomisation process gives every possible group of 5 subjects selected from a set of 10 an equal
opportunity of being assigned to treatment A and treatment B. One-way analysis of variance (one-way ANOVA) is used to analyse such a design. Even unequal replications can work in this design.
It provides a maximum number of degrees of freedom to the error. Such a design is generally used
when experimental areas happen to be homogeneous. Technically, when all the variations due to
uncontrolled extraneous factors are included under the heading of chance variation, the design is
referred to as CRD. The two forms of CRD are two-group simple randomised design and random
replications design.
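The example of 10 subjects split into two groups of 5 can be sketched as follows. The subject labels and the seed are illustrative assumptions; the point is that every possible group of 5 has the same chance of receiving treatment A.

```python
import random
from math import comb

# Hypothetical subject identifiers 1..10
subjects = list(range(1, 11))

# A seeded generator so the allocation can be reproduced and audited
rng = random.Random(2010)

# Draw 5 subjects at random for treatment A; the rest receive treatment B
group_a = sorted(rng.sample(subjects, 5))
group_b = sorted(set(subjects) - set(group_a))

# Number of distinct groups of 5 that could have been drawn from 10
n_possible_groups = comb(10, 5)

print("Treatment A:", group_a)
print("Treatment B:", group_b)
print("Possible groups:", n_possible_groups)
```

Recording the seed (or the random numbers used) lets you show afterwards that the allocation was left to chance and not to your own judgement.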
• Two-group simple randomised design
In two-group simple CRD, first of all you define the population and then randomly select a
sample from the population. Then you randomly assign the sample items to the experimental and
control groups. This design yields two groups as representatives of the population.
Diagrammatically the design can be shown as in Figure 3.1.4.
POPULATION -> (random selection) -> SAMPLE -> (random assignment) -> Experimental group: Treatment A
                                                                      Control group: Treatment B
(Treatments A and B are the treatments of the independent variable.)
Figure 3.1.4: Two-group simple randomised experimental design
Source: Kothari, (2009).
Since in the simple randomised design the elements constituting the sample are randomly drawn
from the same population and randomly assigned to the experimental and control groups, it
becomes possible to draw conclusions from the samples that are applicable to the population. The
experimental and control groups of such a design are given different treatments of the independent
variable. The merit of such a design is that it is simple and randomises the differences among the
sample items. Its limitation is that the individual differences among the individuals conducting the
treatments are not eliminated, i.e., it does not control the extraneous variable, and as such the result
of the experiment may not depict a correct picture.
• Random replications design
The limitation of the two-group simple randomised design is usually eliminated in the random
replications design. In the random replications design the differences on the dependent variable are
not ignored, i.e. the extraneous variable is controlled. The effect of these differences on the dependent
variable is minimised (reduced) by providing a number of repetitions for each treatment. Each
repetition is technically called a 'replication'.
The random replications design serves two purposes: it provides controls for the differential effects
of the extraneous independent variables and it randomises any individual differences
among the individuals conducting the treatments.
ii) Randomized block design (RBD)
This design is an improvement over the CRD. In this design the principle of local control can be
applied along with the other two principles of experimental designs. In the RBD, subjects are first
divided into groups, known as blocks, such that within each group the subjects are relatively
homogeneous in respect to some selected variable. The variable selected for grouping the subjects
is one that is believed to be related to the measures to be obtained in respect of the dependent
variable. The number of subjects in a given block would be equal to the number of treatments and
one subject in each block would be randomly assigned to each treatment. In general, blocks are the
levels at which you hold the extraneous factor fixed, so that its contribution to the total variability
of data can be measured. The main feature of the RBD is that each treatment appears the same
number of times in each block. The RBD is analysed by the two-way analysis of variance (two-way ANOVA) technique.
Suppose four different forms of standardised test in mathematics were given to each of five
students (selected one from each of the five I.Q. Blocks) and the scores obtained are as shown in
Figure 3.1.5 below.
Test form | Very low I.Q. (Student A) | Low I.Q. (Student B) | Average I.Q. (Student C) | High I.Q. (Student D) | Very high I.Q. (Student E)
Form 1 | 82 | 67 | 57 | 71 | 73
Form 2 | 90 | 68 | 54 | 70 | 81
Form 3 | 86 | 73 | 51 | 69 | 84
Form 4 | 93 | 77 | 60 | 65 | 71
Figure 3.1.5: Randomised block experimental design
Source: Adapted from Kothari, (2009)
If each student separately randomised the order in which they took the four tests by using random
numbers or some similar device, such a design is an RBD. The purpose of this randomisation is to
take care of possible extraneous factors such as fatigue or experience gained from repeatedly
taking the test.
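The two-way analysis of variance mentioned for the RBD splits the total variability into three parts: treatments, blocks and error. The sketch below does this for the scores in Figure 3.1.5, treating the four test forms as treatments and the five students as blocks. The variable names are illustrative, and the sketch stops at the F ratio for treatments; judging its significance would need the F distribution with the stated degrees of freedom.

```python
# Scores from Figure 3.1.5: rows are test forms (treatments),
# columns are students A to E (one per I.Q. block)
scores = [
    [82, 67, 57, 71, 73],  # Form 1
    [90, 68, 54, 70, 81],  # Form 2
    [86, 73, 51, 69, 84],  # Form 3
    [93, 77, 60, 65, 71],  # Form 4
]

t = len(scores)     # number of treatments
b = len(scores[0])  # number of blocks
grand_mean = sum(map(sum, scores)) / (t * b)

treatment_means = [sum(row) / b for row in scores]
block_means = [sum(row[j] for row in scores) / t for j in range(b)]

# Sums of squares: total = treatments + blocks + error
ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
ss_treatments = b * sum((m - grand_mean) ** 2 for m in treatment_means)
ss_blocks = t * sum((m - grand_mean) ** 2 for m in block_means)
ss_error = ss_total - ss_treatments - ss_blocks

# Degrees of freedom and the F ratio for treatments
df_treatments, df_blocks = t - 1, b - 1
df_error = df_treatments * df_blocks
f_treatments = (ss_treatments / df_treatments) / (ss_error / df_error)

print(f"SS treatments = {ss_treatments:.1f}, SS blocks = {ss_blocks:.1f}, "
      f"SS error = {ss_error:.1f}, F = {f_treatments:.2f}")
```

Most of the variability in these scores lies between blocks (students). This is exactly the variability that blocking removes from the error term before the test forms are compared.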
iii) Latin square design (LSD)
This is an experimental design frequently used in agricultural research. The conditions under
which agricultural investigations are carried out are different from those in other studies since
nature plays an important role in agriculture. For instance, suppose an experiment has to be conducted
in which the effects of five different varieties of fertilizer on the yield of a certain crop, say wheat,
are to be judged. In such a case, the varying fertility of the soil in different blocks in which the
experiment has to be performed must be taken into consideration; otherwise the results obtained
may not be very dependable because the output happens to be the effect not only of fertilizers, but
it may also be the effect of fertility of soil. Similarly, there may be impact of varying seeds on the
yield. To overcome such difficulties, the LSD is used when there are two major extraneous factors
such as the varying soil fertility and varying seeds.
The LSD is one wherein each fertilizer, in our example, appears five times but is used only once in
each row and in each column of the design. In other words, the treatments in LSD are so allocated
among the plots that no treatment occurs more than once in any one row or any one column. The
two blocking factors may be represented through rows and columns, one through rows and the
other through columns (Figure 3.1.6).
Rows: seeds differences (X1 to X5); columns: fertility level (I to V)
Seeds | I | II | III | IV | V
X1 | A | B | C | D | E
X2 | B | C | D | E | A
X3 | C | D | E | A | B
X4 | D | E | A | B | C
X5 | E | A | B | C | D
Figure 3.1.6: Latin square experimental design
Source: Kothari, (2009).
Figure 3.1.6 shows that in LSD the field is divided into as many blocks as there are varieties of
fertilizers and then each block is again divided into as many parts as there are varieties of
fertilizers, in such a way that each fertilizer variety is used in each block (whether
column-wise or row-wise) only once. The analysis of the LSD is very similar to the two-way
ANOVA technique.
The merit of this experimental design is that it enables differences in fertility gradients in the field
to be eliminated when comparing the effects of different varieties of fertilizers on the yield of the
crop. But this design suffers from one limitation: although each row and each column
represents equally all fertilizer varieties, there may be considerable difference in the row and
column means both up and across the field. This, in other words, means that in LSD we must
assume that there is no interaction between treatments and blocking factors. This defect can,
however, be removed by taking the means of rows and columns equal to the field's mean by
adjusting the results. Another limitation of the design is that it requires the number of rows, columns
and treatments to be equal. This reduces the utility of this design. In the case of a (2 x 2) LSD, there are
no degrees of freedom available for the mean square error and hence the design cannot be used. If
there are 10 or more treatments, then each row and each column will be larger in size, so that rows and
columns may not be homogeneous. This may make the application of the principle of local control
ineffective. Therefore, LSDs of orders (5 x 5) to (9 x 9) are generally used.
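The Latin square property and the remark about degrees of freedom can both be checked with a short sketch. The cyclic construction below reproduces the layout of Figure 3.1.6. The function names are illustrative assumptions, and in a real experiment the rows, columns and treatment labels of such a square would normally also be randomised.

```python
def cyclic_latin_square(treatments):
    """Build a Latin square by shifting the treatment list one place per row."""
    p = len(treatments)
    return [[treatments[(row + col) % p] for col in range(p)] for row in range(p)]


def is_latin_square(square):
    """True if every treatment occurs exactly once in each row and column."""
    p = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == p and set(row) == symbols for row in square)
    cols_ok = all({row[c] for row in square} == symbols for c in range(p))
    return len(symbols) == p and rows_ok and cols_ok


def error_degrees_of_freedom(p):
    """Error d.f. of a p x p Latin square: (p*p - 1) - 3*(p - 1) = (p - 1)(p - 2)."""
    return (p - 1) * (p - 2)


square = cyclic_latin_square(["A", "B", "C", "D", "E"])
for row in square:
    print(" ".join(row))

print("Valid Latin square:", is_latin_square(square))
for p in (2, 5, 9):
    print(f"{p} x {p} square: error d.f. = {error_degrees_of_freedom(p)}")
```

For a 2 x 2 square the error degrees of freedom come out as zero, which is why that design cannot be analysed.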
iv) Factorial designs
Factorial designs are used in experiments where the effects of varying more than one factor are to
be determined. They are especially important in several economic and social phenomena where
usually a large number of factors affect a particular problem. Factorial designs can be of two
types:
• Simple factorial designs (two-factor factorial design)
In this case, we consider the effects of varying two factors on the dependent variable, but when an
experiment is done with more than two factors, we use complex factorial designs. A simple factorial
design may either be a 2 x 2 simple factorial design, or it may be a 3 x 4 or 5 x 3 or the like type
of simple factorial design.
Control Variables | Experimental Variable: Treatment A | Experimental Variable: Treatment B
Level I | Cell 1 | Cell 3
Level II | Cell 2 | Cell 4
Figure 3.1.7: Two by two simple factorial experimental design illustration
Source: Kothari, (2009).
In this design the extraneous variable to be controlled by homogeneity is called the control
variable and the independent variable, which is manipulated, is called the experimental variable.
Then there are two treatments of the experimental variable and two levels of control variable. As
such there are four cells into which the sample is divided. Each of the four combinations would
provide one treatment or experimental condition. Subjects are assigned at random to each
treatment in the same manner as in a randomized group design. The means for different cells may
be obtained along with the means for different rows and columns. Means of different cells
represent the mean scores for the dependent variable and the column means in the given design
are termed the main effect for treatments without taking into account any differential effect that is
due to the level of the control variable. Similarly, the row means in the said design are termed the
main effects for levels without regard to treatment. Thus, through this design we can study the
main effects of treatments as well as the main effects of levels. An additional merit of this design
is that one can examine the interaction between treatments and levels, through which one may say
whether the treatments and levels are independent of each other or not.
The following examples make clear the interaction effect between treatments and levels. The data
obtained in two (2 x 2) simple factorial studies may be as given in Figures 3.1.8a and
3.1.8b.
STUDY I DATA
Control (Intelligence) | Training: Treatment A | Training: Treatment B | Row mean
Level I (Low) | 15.5 | 23.3 | 19.4
Level II (High) | 35.8 | 30.2 | 33.0
Column mean | 25.6 | 26.7 |
Figure 3.1.8a: Two by two simple factorial study I data
Source: Kothari, (2009).
STUDY II DATA
Control (Intelligence) | Training: Treatment A | Training: Treatment B | Row mean
Level I (Low) | 10.4 | 20.6 | 15.5
Level II (High) | 30.6 | 40.4 | 35.5
Column mean | 20.5 | 30.5 |
Figure 3.1.8b: Two by two simple factorial study II data
Source: Kothari, (2009).
Both the above figures (study I and study II data) represent the respective means.
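The main effects and the interaction in the two studies can be worked out from the four cell means of each study. The sketch below assumes equal numbers of subjects in every cell, so that row and column means are simple averages of cell means. The function name and the "interaction contrast" (the Treatment B minus Treatment A gap at Level I, minus the same gap at Level II) are illustrative choices, not terms used in the source.

```python
def summarise_2x2(cells):
    """cells[level][treatment] holds the four cell means of a 2 x 2 design."""
    (a1, b1), (a2, b2) = cells
    return {
        # Main effects for levels (rows) and for treatments (columns)
        "row_means": [(a1 + b1) / 2, (a2 + b2) / 2],
        "column_means": [(a1 + a2) / 2, (b1 + b2) / 2],
        # Difference between the B - A gaps at Level I and at Level II
        "interaction_contrast": (b1 - a1) - (b2 - a2),
    }


# Cell means of the two studies: [Level I, Level II] x [Treatment A, Treatment B]
study_1 = summarise_2x2([[15.5, 23.3], [35.8, 30.2]])
study_2 = summarise_2x2([[10.4, 20.6], [30.6, 40.4]])

for name, result in (("Study I", study_1), ("Study II", study_2)):
    print(name, {key: [round(v, 2) for v in value] if isinstance(value, list)
                 else round(value, 2) for key, value in result.items()})
```

In Study I the gap between treatments changes sign from Level I to Level II, so the contrast is large (13.4) and the treatment effect depends on the intelligence level. In Study II the gap is almost the same at both levels (contrast 0.4), which suggests no interaction. The computed column means for Study I are 25.65 and 26.75, which the study I data show to one decimal place as 25.6 and 26.7.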
The 2 x 2 design need not be restricted in the manner explained above, i.e., having one
experimental variable and one control variable; it may also have two experimental variables
or two control variables. For example, a college teacher compared the effect
of the class-size as well as the introduction of the new instruction technique on the learning of
research methodology. For this purpose he conducted a study using a 2 x 2 simple factorial design.
His design in graphic form would be as shown in Figure 3.1.9.
Experimental Variable II (Instruction technique) | Experimental Variable I (Class size): Small | Experimental Variable I (Class size): Usual
New | |
Usual | |
Figure 3.1.9: Two by two simple factorial experimental design
Source: Kothari, (2009).
But if the teacher uses a design for comparing males and females and the senior and junior
students in the college as they relate to the knowledge of research methodology, in that case we
will have a 2 x 2 simple factorial design wherein both the variables are control variables, as no
manipulation is involved in respect of either variable.
A factorial design is a design in which two or more independent variables are simultaneously
investigated to determine the independent and interactive influence which they have on the
dependent variable. It also has random assignment to the groups.
• Each combination of independent variables is called a "cell."
• Research participants are randomly assigned to as many groups as there are cells of the factorial
design if both of the independent variables can be manipulated.
• The research participants are administered the combination of independent variables that
corresponds to the cell to which they have been assigned and then they respond to the dependent
variable.
• The data collected from this research give information on the effect of each independent variable
separately and the interaction between the independent variables.
• The effect of each independent variable on the dependent variable is called a main effect. There
are as many main effects in a factorial design as there are independent variables. If a research
design included the independent variables of gender and type of instruction, then there would
potentially be two main effects, one for gender and one for type of instruction.
• An interaction effect between two or more independent variables occurs when the effect which
one independent variable has on the dependent variable depends on the level of the other
independent variable. For example, if gender is one independent variable and method of teaching
mathematics is another independent variable, an interaction would exist if the lecture method was
more effective for teaching males mathematics and individualized instruction was more effective
in teaching females mathematics.
Illustration: 4 x 3 simple factorial design
The 4 x 3 simple factorial design will usually include four treatments of the experimental variable
and three levels of the control variable (Figure 3.1.10).
Control Variable | Experimental Variable: Treatment A | Treatment B | Treatment C | Treatment D
Level I | Cell 1 | Cell 4 | Cell 7 | Cell 10
Level II | Cell 2 | Cell 5 | Cell 8 | Cell 11
Level III | Cell 3 | Cell 6 | Cell 9 | Cell 12
Figure 3.1.10: Four by three simple factorial experimental design illustration
Source: Kothari, (2009).
This model of a simple factorial design includes four treatments viz., A, B, C, and D of the
experimental variable and three levels viz., I, II, and III of the control variable and has 12 different
cells as shown above. This shows that a 2 x 2 simple factorial design can be generalised to any
number of treatments and levels. Accordingly we can name it as such and such (_x_) design. In
such a design the means of the columns provide the researcher with an estimate of the main effects
for treatments and the means for rows provide an estimate of the main effects for the levels. Such a
design also enables the researcher to determine the interaction between treatments and levels.
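As a rough illustration of how column and row means estimate the main effects, the following Python sketch works through a 4 x 3 table of cell means. The numbers are invented for illustration only, equal cell sizes are assumed, and the function names are hypothetical. The interaction residuals are simply what remains of each cell mean after the row and column effects are removed. Non-zero residuals suggest an interaction, but a formal test (e.g. analysis of variance) would be needed to confirm it.

```python
# Hypothetical cell means: rows are Levels I-III of the control variable,
# columns are Treatments A-D of the experimental variable.
cell_means = [
    [12.0, 15.0, 11.0, 18.0],  # Level I
    [14.0, 16.0, 13.0, 17.0],  # Level II
    [10.0, 17.0, 12.0, 19.0],  # Level III
]
treatments = ["A", "B", "C", "D"]
levels = ["I", "II", "III"]


def column_means(table):
    """Mean of each column: estimates the main effect of each treatment."""
    return [sum(col) / len(col) for col in zip(*table)]


def row_means(table):
    """Mean of each row: estimates the main effect of each level."""
    return [sum(row) / len(row) for row in table]


def interaction_residuals(table):
    """Cell mean minus what the row and column means alone would predict."""
    rows = row_means(table)
    cols = column_means(table)
    grand = sum(rows) / len(rows)  # grand mean (balanced table assumed)
    return [
        [table[i][j] - rows[i] - cols[j] + grand for j in range(len(cols))]
        for i in range(len(rows))
    ]


for name, mean in zip(treatments, column_means(cell_means)):
    print(f"Treatment {name}: mean = {mean}")
for name, mean in zip(levels, row_means(cell_means)):
    print(f"Level {name}: mean = {mean}")
for name, row in zip(levels, interaction_residuals(cell_means)):
    print(f"Level {name} residuals:", [round(r, 2) for r in row])
```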
• Complex factorial designs
Experiments with more than two factors at a time involve the use of complex factorial designs. A
design which considers three or more independent variables simultaneously is called a complex
factorial design. In case of three factors with one experimental variable having two treatments and
two control variables, each one of which having two levels, the design used will be termed 2 x 2 x
2 complex design which will contain a total of eight cells as shown in figure 3.1.11 below.
                                      Experimental Variable
                          Treatment A                    Treatment B
                       Control Variable 2             Control Variable 2
                       Level I     Level II           Level I     Level II
Control Variable 1
  Level I              Cell 1      Cell 3             Cell 5      Cell 7
  Level II             Cell 2      Cell 4             Cell 6      Cell 8
Figure 3.1.11: Two by two by two complex factorial design
Source: Kothari (2009).
Factorial designs are used mainly because of two advantages:
• They provide equivalent accuracy (as happens in the case of experiments with only one factor)
with less labour and as such are a source of economy. Using factorial designs, we can determine
the main effects of two (in simple factorial designs) or more (in complex factorial designs) factors
(or variables) in one single experiment.
• They permit various other comparisons of interest. For example, they give information about
effects which cannot be obtained by treating one single factor at a time. The determination
of interaction effects is possible in the case of factorial designs.
1.6 Summary
There are several research designs, and the researcher must decide, in advance of the collection and
analysis of data, which design would prove most appropriate for the research project. Due weight
must be given to various points such as the type of universe and its nature, the objective of the study,
the resource list or the sampling frame, the desired standard of accuracy and the like when deciding
on the design for the research project.
1.7 Review exercise
1. Explain the meaning and significance of a Research design
2. Explain the meaning of the following in context of Research design
a) Extraneous variables
b) Confounded relationship
c) Research hypotheses
d) Experimental and control groups
e) Treatments
3. Describe some of the important research designs used in experimental hypothesis-testing
research study.
4. “Research design in exploratory studies must be flexible but in descriptive studies, it must
minimise bias and maximise reliability.” Discuss.
5. Give your understanding of a good research design. Is a single research design suitable in all
research studies? If not, why?
6. Explain and illustrate the following research designs:
a) Two group simple randomized design
b) Latin square design
c) Random replications design
d) Simple factorial design
e) Informal experimental designs
7. Write a short note on 'Experience Survey' explaining fully its utility in exploratory research
studies.
8. What is research design? Discuss the basis of stratification to be employed in sampling public
opinion on inflation.
1.8 References
Gomez, K.A. and Gomez, A.A. (1984). Statistical Procedures for Agricultural Research (2nd ed.). An
International Rice Research Institute Book. A Wiley-Interscience Publication, John Wiley &
Sons.
Kothari, C.R. (2009). Research Methodology: Methods and Techniques. New Age International
Limited Publishers.
Mason, R.D., Lind, D.A. and Marchal, W.G. (1999). Statistical Techniques in Business and Economics.
Irwin McGraw Hill.
Saunders, M.N.K., Lewis, P. and Thornhill, A. (2009). Research Methods for Business Students (5th ed.).
Harlow: Prentice Hall – Financial Times.
Zar, J.H. (1996). Biostatistical Analysis. Prentice Hall, Inc. New Jersey.
LECTURE TWO
DATA COLLECTION METHODS
(By Dr. S. Massomo and Dr. D. Ngaruko)
2.1 Introduction
Data collection is an important aspect of any type of research study. Inaccurate data collection can
impact the results of a study and ultimately lead to invalid results. Data collection methods for
impact evaluation vary along a continuum. At the one end of this continuum are quantitative
methods and at the other end of the continuum are qualitative methods for data collection. In this
lecture you will be introduced to various quantitative and qualitative methods used in collecting
research data.
2.2 Learning outcomes
At the end of this lecture you should be able to:
• Outline various methods of collecting quantitative and qualitative data and state their
• Describe various data collection instruments/tools
• Distinguish between quantitative and qualitative data collection methods
• Apply various qualitative data collection methods appropriate for a given type of research
• Advantageously use a combination of different data collection techniques.
• Identify ethical issues involved in data collection and ways of ensuring that your research
informants or subjects are not harmed by your study.
2.3 Quantitative Data Collection Methods
Quantitative data collection begins after a research problem has been identified and a research
design/plan has been devised. It refers to the gathering of information aimed at proving or refuting
some facts. This is one of the most important aspects in the research process as the validity of the
results of a statistical analysis clearly depends on reliability and accuracy of the data used. In turn,
the reliability and accuracy of the data depend on the method of collection. Data-collection
techniques allow us to systematically collect information about our objects of study (people,
objects, phenomena) and about the settings in which they occur. The methods rely on random
sampling and structured data collection instruments that fit diverse experiences into predetermined
response categories.
In this section we briefly describe quantitative methods that are used to gather quantitative data –
that is information dealing with numbers and anything that is measurable. Statistics, tables and
graphs, are often used to present the results of these methods (see details in Lecture one (Module
Four) on quantitative data analysis). The methods should therefore be distinguished from
qualitative methods described in section 2.4 of this Lecture. However, you should note that in most
physical and biological sciences, the use of either quantitative or qualitative methods is
uncontroversial, and each is used when appropriate.
2.3.1 Sources of data
There are two major sources of data used by researchers. These are primary and secondary
sources. Primary sources are information gathered directly from experimental studies or
respondents using your research instruments. In experimental studies this information is obtained
by measuring the variable(s) of interest. Secondary sources are information gathered from other
previous studies e.g. published material and information from internal sources such as raw data
and unpublished summaries. Different techniques are employed for collecting data from primary
and secondary sources (see sections 2.3.4.1 and 2.3.4.2).
2.3.2 Steps in data collection
• Define your sample
• Reflect on the research design
• Ensure research instruments are ready
• Define the data to be collected and how you are going to analyse them
• Request permission to collect data from the relevant authorities
• Pre-test your research instruments
2.3.3 Need for correct sampling
2.3.3.1 Sampling
A researcher needs to draw a suitable sample from the population. In most cases it
is costly, time consuming and sometimes impractical to work with the entire population. Imagine a
study that would require dissection and destruction of the hearts of a certain animal species, say
the giraffe, our national symbol. It would be impractical to use the entire population of
giraffes from a certain game park. A carefully drawn sample can provide useful results which
represent the entire population. However, it should be stressed that if you fail to draw your sample
correctly you will end up with wrong conclusions from your study.
2.3.3.2 Choice of sample size
Usually, the larger the absolute size of a sample, the more closely the sampling distribution of its
mean will approximate the normal distribution (the central limit theorem). As a researcher you will
need to consider the following aspects that may influence the choice of sample size:
• The confidence you need to have in your data
• The margin of error that you can tolerate
• The type of analysis you are going to undertake
• The size of the total population from which your sample is being drawn
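To show how confidence level, margin of error and population size interact, here is a minimal Python sketch. It uses one common approach that the text itself does not name: Cochran's formula for estimating a proportion, n0 = z²p(1 − p)/e², followed by the finite population correction n = n0 / (1 + (n0 − 1)/N). The function name, the default p = 0.5 (the most conservative choice) and the example figures are assumptions made for illustration. The calculation does not cover the requirements of particular types of analysis.

```python
import math
from statistics import NormalDist


def sample_size(confidence, margin_of_error, population=None, p=0.5):
    """Approximate sample size for estimating a proportion.

    confidence      -- e.g. 0.95 for a 95% confidence level
    margin_of_error -- e.g. 0.05 for plus or minus 5 percentage points
    population      -- total population size N, or None if very large
    p               -- expected proportion; 0.5 gives the largest sample
    """
    # z-value for a two-sided confidence interval.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    if population is not None:
        # Finite population correction.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


# Hypothetical examples: 95% confidence, 5% margin of error.
print(sample_size(0.95, 0.05))                   # very large population
print(sample_size(0.95, 0.05, population=2000))  # population of 2000
```

Tightening the margin of error or raising the confidence level increases the required sample, while a small total population reduces it.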
2.3.4 Methods of data collection
There are several methods used for collecting data. Each method has its own uses and none is
superior in all situations. Selection of the appropriate method(s) for data collection is influenced by i)
the nature, scope and object of enquiry, ii) availability of resources, iii) the time factor and iv) the
precision required (Kothari, 2009). In the next sections we briefly describe the various data
collection techniques that can be used for collection of primary and secondary data.
2.3.4.1 Collection of secondary data
In most cases there is a large amount of data that has been collected by others. As a starting point,
you should try to locate these sources and retrieve the information. The information may not have
been analysed or published before. Examples include census data, meteorology data, files/reports,
computer databases, Government reports, and documents such as budgets, organisation charts,
policies and procedures. Use of secondary data saves time and cost for the researcher and provides
an insight into the outcomes of similar research. Additionally, secondary data permit examination of
trends over time. However, you need to check on the reliability, suitability and adequacy of the
data.
The demerits of secondary data include the following:
i) Sometimes it is difficult to gain access to the records or reports
ii) Data may not always be complete or exactly what is needed
iii) You have to verify validity and reliability of the data
iv) Ethical issues concerning confidentiality may arise.
2.3.4.2 Collection of primary data
Usually, quantitative data gathering strategies include: i) Experiments/clinical trials, ii) Observing
and recording well-defined events, iii) Obtaining relevant data from management information
systems, iv) Administering surveys (e.g., face-to-face and telephone interviews, questionnaires
etc.). Let us briefly describe each of these strategies.
a) Experiment
If the researcher conducts an experiment, s/he observes and takes some quantitative measurements,
which constitute the data. In the case of a survey, data may be collected by using several methods
such as observation, interviews and administering questionnaires. If you must collect original data it is
important that you:
i) establish procedures and follow them,
ii) maintain accurate records of definitions and coding,
iii) pre-test your instruments and
iv) verify accuracy of coding.
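Keeping a record of definitions and coding, and then verifying the coding, can be partly automated. The Python sketch below is one possible way of doing this and is not a prescribed procedure: the codebook, the variable names and the records are hypothetical. Every coded value is checked against the codebook, and any value that is undefined or missing is reported.

```python
# Hypothetical codebook: variable -> {numeric code: meaning}.
CODEBOOK = {
    "sex": {1: "male", 2: "female"},
    "treatment": {1: "A", 2: "B", 3: "C", 4: "D"},
}


def find_coding_errors(records, codebook):
    """Return (row index, variable, value) for each value not in the codebook.

    A variable that is absent from a record is reported with value None.
    """
    errors = []
    for i, record in enumerate(records):
        for variable, codes in codebook.items():
            value = record.get(variable)
            if value not in codes:
                errors.append((i, variable, value))
    return errors


# Made-up data entry results.
records = [
    {"sex": 1, "treatment": 2},
    {"sex": 3, "treatment": 4},  # 3 is not a defined code for sex
    {"sex": 2},                  # treatment was not recorded
]

errors = find_coding_errors(records, CODEBOOK)
for row, variable, value in errors:
    print(f"Row {row}: {variable} has invalid or missing code {value!r}")
```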
b) Observation
Observation is a commonly used method especially in studies related to behavioural sciences (see
also section 2.4.1). This technique involves systematically selecting, watching and recording
behaviour and characteristics of living beings, objects or phenomena. Observation can be the
major research technique if the interest is to check the presence or absence of an object or
phenomenon, e.g. the presence of latrines and their state of cleanliness, or the condition of
roads/buildings/animals. In this case you obtain actual (real time) data rather than self-reported
behaviour or perceptions. Furthermore, observations can give additional, more accurate
information on the behaviour of people than interviews or questionnaires. They can also be used to
check the information collected through interviews, especially on sensitive topics such as alcohol or
drug use, or stigmatising diseases (Varkevisser et al., 2003).
Observation may be done in three ways: i) Unobtrusive: no one knows you are observing;
ii) Participant: you actually participate in the activity; and iii) Obtrusive: the people being
observed know that you are there to observe them. While planning your observation it is advisable
that you i) develop a checklist to rate your observation, ii) develop a rating scheme and iii) pilot-test
the observation data collection instrument(s).
When observations are made using a defined scale they may be called measurements. For
instance, the Point Count Method is one of the techniques used to assess the abundance of bird
species in the wild. With this technique, the researcher randomly allocates point counts to be used
as representative samples for the area. Point counts are visited over a period of several days or
longer to assess how many and what types of birds are in an area.
c) Tracking
This is a modified observational technique that may be used to gather quantitative data. With
tracking, market researchers are able to monitor the behaviour of customers as they engage in
regular purchase or information gathering activities. Possibly the best-known example of
tracking research is its use by websites to track customer visits. But tracking research also has
offline applications, especially when point-of-purchase scanners are employed, such as tracking
product purchases at grocery stores, automated collections on toll roads and the use of
automated teller machines (ATMs) in banks.
This method of research is expected to grow significantly as more devices are introduced that
provide means for tracking. However, some customers may see tracking devices as intrusive and
many privacy advocates have raised concerns about certain tracking methods, especially if these
are not disclosed to customers.
d) Interview method (face-to-face)
An interview is a data-collection technique that involves oral questioning of respondents, either
individually or as a group. The answers to the questions posed during an interview can be recorded
by writing them down (either during the interview itself or immediately after the interview) or by
tape-recording the responses, or by a combination of both. Interviews can be conducted with
varying degrees of flexibility as described by Varkevisser et al. (2003).
i) High degree of flexibility
A flexible method of interviewing is useful if a researcher has as yet little understanding of the
problem or situation he is investigating, or if the topic is sensitive. The unstructured or loosely
structured method of asking questions can be used for interviewing individuals as well as groups
of key informants. It is frequently applied in exploratory studies of qualitative nature. The
instrument used may be called an interview guide or interview schedule (see footnote 6).
ii) Low degree of flexibility
Less flexible methods of interviewing are useful when the researcher is relatively knowledgeable
about expected answers or when the number of respondents being interviewed is relatively large.
Then questionnaires may be used with a fixed list of questions in a standard sequence, which have
mainly fixed or pre-categorised answers.
Face-to-face interviews enable the researcher to establish rapport with potential participants and
therefore gain their cooperation. These interviews usually yield highest response rates in survey
research. They also allow the researcher to clarify ambiguous answers and when appropriate, seek
follow-up information. Disadvantages include; impractical when large samples are involved, time
consuming and expensive.
Other types of interviews include Telephone interviews and Computer Assisted
Personal Interviewing (CAPI). The latter is a form of personal interviewing, but instead of
completing a questionnaire, the interviewer types the responses directly into her/his laptop
computer.
6 Interview schedule is the term used for loosely structured tools where the interviewer asks the questions and records the answers.
e) Data collection using questionnaires
Questionnaires often make use of checklists and rating scales. These devices help simplify and
quantify people's behaviours and attitudes. A checklist is a list of behaviours, characteristics, or
other entities that the researcher is looking for. Either the researcher or the survey participant simply
checks whether each item on the list is observed, present or true, or not. A rating scale is
more useful when behaviour needs to be evaluated on a continuum. A widely used type of rating
scale is the Likert scale.
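To illustrate how a rating scale turns responses into numbers, the Python sketch below scores answers on a five-point agreement scale. The response labels, the 1 to 5 coding and the function names are assumptions chosen for illustration. Reverse scoring is included because negatively worded items are commonly recoded so that a high score always points in the same direction.

```python
# Assumed five-point agreement scale and its numeric coding.
LIKERT_5 = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def score_item(response, reverse=False, scale=LIKERT_5):
    """Convert one verbal response to its numeric score."""
    value = scale[response.strip().lower()]
    if reverse:
        # Flip the scale for negatively worded items (1 <-> 5, 2 <-> 4).
        value = max(scale.values()) + min(scale.values()) - value
    return value


def total_score(responses, reverse_items=()):
    """Sum the item scores; reverse_items holds indexes of items to flip."""
    return sum(
        score_item(r, reverse=(i in reverse_items))
        for i, r in enumerate(responses)
    )


# Hypothetical respondent; the third item (index 2) is negatively worded.
answers = ["Agree", "Strongly agree", "Disagree"]
print(total_score(answers, reverse_items={2}))
```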
A questionnaire consists of a number of questions printed or typed in a definite order on a form or
set of forms. However, the questionnaire needs to be carefully constructed in order to obtain the
required information. The following four aspects of a questionnaire need to be taken care of:
i) Main aspects of a questionnaire
• General form
The questionnaire can be structured or unstructured. Structured questionnaires are those
with definite and pre-determined questions and a list of possible options/answers.
The questions are presented with exactly the same wording and in the same order to all
respondents. A highly structured questionnaire is one in which all questions and answers are
specified and comments in the respondent's own words are held to a minimum (Kothari, 2009).
With an unstructured (or non-structured) questionnaire, by contrast, the interviewer is provided with a
general guide on the type of information to be obtained, and the formulation of questions is left to the
interviewer. Unstructured questionnaires are ideal when the aim is to probe attitudes and reasons for
certain actions or feelings. They are also useful for obtaining facts with which a
researcher is not familiar.
• Questions sequence
It is important to have a question sequence that is clear, logical and smooth-moving. Questions that
are easiest to answer should be placed at the beginning and the questions should flow from general
to specific questions. Focused questions related to objectives of the study should be placed in the
main body of the questionnaire.
• Questions formulation and wording
The questions should be formulated and worded so that they i) are easily understood
and useful, ii) avoid jargon and overly technical terminology, and iii) are short, simple and clear,
each prompting only one answer.
• Questionnaire layout
Avoid unnecessarily crowding your questions, allocate sufficient space for open-ended questions
and list choices downwards rather than horizontally (across the page). It is not advisable to have
questions on both sides of a paper.
ii) Administering written questionnaires
A written questionnaire (also referred to as self-administered questionnaire) is a data collection
tool in which written questions are presented that are to be answered by the respondents in written
form. A written questionnaire can be administered in different ways, such as by:
• Sending questionnaires by mail with clear instructions on how to answer the questions and asking
for mailed responses. The main advantages of this method are its relatively low cost and its ability to
cater for large samples and reach widely spread respondents. Its demerits include a low
rate of return of duly completed questionnaires, and the fact that it can only be used when
respondents are educated and cooperative.
• Gathering all or part of the respondents in one place at one time, giving oral or written
instructions, and letting the respondents fill out the questionnaires; or
• Hand-delivering questionnaires to respondents and collecting them later.
The questions can be either open-ended or closed (with pre-categorised answers), but the questionnaire
should be short, preferably taking about 20 minutes to complete.
Table 3.2.1: Comparison between data collection techniques and data collection tools
S/N  Data collection techniques             Data collection tools
1    Using secondary data                   Checklist; data compilation forms
2    Observation                            Eyes and other senses, pen/paper, watch, scales, microscope, etc.
3    Interviewing                           Interview guide, checklist, questionnaire, tape recorder
4    Administering written questionnaires   Questionnaire
2.4 Qualitative Data Collection Techniques
Qualitative data collection methods play an important role in impact evaluation by providing
information useful to understand the processes behind observed results and assess changes in
people’s perceptions of their well-being. Furthermore, qualitative methods can be used to improve
the quality of survey-based quantitative evaluations by helping to generate evaluation hypotheses,
strengthening the design of survey questionnaires and expanding or clarifying quantitative
evaluation findings. These methods are characterized by the following attributes:
• They tend to be open-ended and have less structured protocols (i.e., researchers may change the
data collection strategy by adding, refining, or dropping techniques or informants)
• They rely more heavily on iterative interviews; respondents may be interviewed several times to
follow up on a particular issue, clarify concepts or check the reliability of data
• They use triangulation to increase the credibility of their findings (i.e., researchers rely on
multiple data collection methods to check the authenticity of their results)
• Generally their findings are not generalizable to any specific population; rather, each case study
produces a single piece of evidence that can be used to seek general patterns among different
studies of the same issue
Regardless of the kinds of data involved, data collection in a qualitative study takes a great deal of
time. The researcher needs to record any potentially useful data thoroughly, accurately, and
systematically, using field notes, sketches, audiotapes, photographs and other suitable means. The
data collection methods must observe the ethical principles of research. From Craig and Douglas
(2000) and Kombo and Tromp (2006) we can outline six major types of qualitative data
collection techniques:
• observational and quasi-observational techniques;
• projective techniques;
• depth interviews;
• direct observation;
• standardised tests; and
• case studies
We briefly discuss each of these techniques in the following subsections (see footnote 7).
2.4.1 Observational and quasi-observational techniques
Observational techniques involve direct observation of phenomena in their natural settings.
Observational research might be somewhat less reliable than quantitative research, yet it is more
valid and flexible since the marketer is able to change his approach whenever needed.
Its disadvantages are the limited range of behavioural variables that can be observed and the fact
that such data might not be generalisable: we can observe a customer's behaviour at a given moment
and situation but we cannot assume all further customers will act the same.
Quasi-observational techniques, on the other hand, are reported to have increased in usage over
the past decades, due to the large-scale use of surveillance cameras within stores. Such
techniques cost less than pure observational ones since costs associated with video surveillance
and taping are far lower than a researcher's wage; the tape can be viewed and analyzed at a later
time, at the marketer's convenience. When consumers' behaviour is videotaped, they
can be asked to comment on their thoughts and actions, and the conversation
itself can be recorded and further analyzed. The following are some common variants of
observational and quasi-observational data collection techniques.
2.4.1.1 Pure observation
With pure observation, the researcher watches the behaviour of customers in real-life situations, either
in situ or by videotaping the respondents (less intrusive). Videotaping is particularly
recommended when studying behaviour patterns in different cultures, since we can easily compare
the behaviours taped and highlight similarities and/or differences.
2.4.1.2 Trace measures
Trace measures consist of collecting and recording traces of consumers' behaviour. Such traces can
be fingerprints or tears on packages, empty packages, analysis of garbage cans and any other ways a
marketer can imagine (it's all about creativity here!). In e-Marketing, trace measures take
the form of recorded visits and hits - there are numerous professional applications that can help an
e-marketer analyze the behaviour of visitors on his company's website.
7 Content of qualitative data collection techniques is mainly based on the article by Otilia Otlacan titled ‘Overview on
Qualitative Data Collection Techniques in International Marketing Research’. Article Source:
http://EzineArticles.com/?expert=Otilia_Otlacan
2.4.1.3 Archival measures
Archival measures can be any type of historical records, public records, archives, libraries,
collections of personal documents etc. Such data can prove to be of great use in analyzing
behavioral trends and changes over time. Researchers can also identify cultural values and attitudes
of a population at a given moment by studying mass media content and advertising from the
period in question.
2.4.1.4 Entrapment measures
These are indirect techniques (by comparison to the previously mentioned ones) and consist of
asking the respondent to react to a specific stimulus or situation, when the actual subject of
investigation is totally different. The marketer plants the real stimulus among many fake ones and
studies reactions. The method is quite unobtrusive and the marketer can gather valuable, non-reactive
facts. However, if the respondent becomes aware of the true subject under investigation, (s)he
might change behaviour and compromise the study.
2.4.1.5 Protocols
Protocols are yet another observational research technique which asks respondents to think out
loud and verbally express all their thoughts during the decision-making process. Protocols are of
great value for determining the factors of importance for a sale and they can be collected in either
real shopping trips or simulated ones.
2.4.2 Projective techniques
Projective techniques are based on the respondent's performance of certain tasks given by the
researcher. The purpose is to have the respondents express their unconscious beliefs through the
projective stimuli; to express associations towards various symbols, images, signs. Projective
techniques can be successfully employed to:
a) Indicate emotional and rational reactions;
b) Provide verbal and non-verbal communication;
c) Give permission to express novel ideas;
d) Encourage fantasy, idiosyncrasy and originality;
e) Reduce social constraints and censorship;
f) Encourage group members to share and "open up" etc.
Projective research techniques can take the following forms, presented in subsections that follow
below.
2.4.2.1 Collages research technique
The collage projective research technique is used to understand lifestyles and brand perceptions.
Respondents are asked to assemble a collage using images and symbols from selected sets of
stimuli or from magazines and newspapers of their choice.
2.4.2.2 Picture completion
Some pictures can be designed to express and visualize the issue under study and respondents have
to make associations and / or attribute words to the given pictures.
2.4.2.3 Analogies and metaphors
Analogies and metaphors are research techniques used when a larger range of projection is needed,
with more complexity and depth of ideas and thoughts on a given brand, product, service,
organization. The respondents are asked to freely express their association and analogies towards
the object being studied; or they can be asked to select from a set of stimuli (e.g. photos) those that
fit the examined subject.
2.4.2.4 Psycho-drawing
Psycho-drawing is a data collection technique that allows study participants to express a wide
range of perceptions by making drawings of what they perceive the brand is (or product, service).
2.4.2.5 Personalisation
Personalisation consists of asking the respondents to treat the brand or product as if it were a person
and to start making associations or finding images of this person. This technique is especially
recommended in order to understand what kind of personality consumers assign to a brand/
product/service.
2.4.3 In-depth interviews
In-depth interviews are the most common type of qualitative data collection. These research
techniques emphasise verbal communication and are especially efficient when trying to
discover underlying attitudes and motivations towards a product or a specific aspect/situation. Let
us briefly outline a few types of in-depth interviews.
2.4.3.1 Individual in-depth interviews
These interviews are conducted on a person-to-person basis and the interviewer
can obtain very specific and precise answers. Interviews can also be conducted by phone or via
internet-based media, from a centralized location: this can greatly reduce costs associated with
research and the results are nearly as accurate as those of face-to-face interviews. The only
disadvantage would be the lack of non-verbal, visual communication.
2.4.3.2 Focus Group Discussions (FGD)
FGDs are basically discussions conducted by a researcher with a group of respondents who are
considered to be representative of the target population. Such meetings are usually held in an
informal setting and are moderated by the researcher. Videotaping the sessions is common these
days, and it can add more sources of analysis at a later time. Focus groups are perhaps the ideal
technique, if affordable in terms of costs and time, to test new ideas and concepts towards brands
and products; to study customers' response to creative media such as ads and packaging design or
to detect trends in respondents' attitudes and perceptions.
One of the important advantages of focus groups is the presence of several respondents at the same
time, providing a certain synergy. Disadvantages refer mainly to the costs involved and the
scarcity of good professionals to conduct the interviews and discussions.
Review Exercise
Based on the extracts from the readings recommended for this lecture draw up a checklist of at
least 10 common mistakes that many researchers make when undertaking semi structured
interviews.
2.4.4 Direct Observation
Observation is a tool that provides information about actual behaviour. Direct observation is useful
because some behaviour involves habitual routines of which people are hardly aware. Observations
can be made of actual behaviour patterns. Kombo and Tromp (2006) outline the following three
variants of observation techniques: participant observation, unstructured observation, and
structured observation.
2.4.4.1 Participant observation
In participant observation the researcher becomes an active, functioning member of the culture
under study, which helps to reduce reactivity. It gives the researcher an intuitive understanding of
what is happening in a culture. However, the process is highly time consuming.
2.4.4.2 Structured observation
With the structured observation technique the observer is an onlooker who focuses on a small number
of specific behaviour patterns, recording only those that appear on an observation list prepared
before data collection begins. This means that the researcher must already have some idea of the trait
under observation.
2.4.4.3 Unstructured observation
Unstructured observations are helpful in understanding the behaviour patterns in their physical and
social context. Unlike in structured observation, the researcher does not work from a predetermined
list; instead, as an onlooker, the researcher collects data in the form of short descriptive notes.
2.4.5 Standardized tests
Standardized tests are used mainly in education research, where the researcher frequently uses
them to measure one or more of the variables in a study.
There are many types of tests that one may consider using to collect data in educational
research, e.g. achievement tests, personality tests, aptitude tests (such as intelligence tests) etc.
Irrespective of the type of test used to collect data, the tests must be of high validity and of
unquestionable reliability.
2.4.6 Case studies
The case study is a means of obtaining detailed information on a relatively small number of units
of study. Here we consider the characteristics of case studies and the circumstances in which they
are useful. A case study approach is often appropriate when it is necessary to probe deeply into the
systems governing behaviour and interrelationships between people and institutions. In a case
study a single case (hence the name) or, more commonly in the social sciences, relatively few
items, are studied for a period of time and the results recorded. The other research designs involve
studying more than one group, or studying the same group at different times in order to make
comparisons.
A case study may be one or relatively few farmers or households, one group, one village, one
project, one district, one nation. The focus of a case study is on the detailed structures, patterns, or
interrelationships observed within each individual case included in the study, though the cases
themselves may be selected to cover a range of different types of study units.
Although the case study is a very flexible and valuable method of inquiry, its main limitation is
its generalisability. One cannot generalize statistically from a single case study. The case study, like
the experiment, does not represent a “sample” and therefore cannot be used for making statistical
generalizations about the population.
2.5 Other important aspects in data collection
2.5.1 Data storage
You need to decide on how to store your data for the short and long term before you conclude your
study. The two major forms of storage are the electronic and non electronic (paper) form. A
combination of both forms is highly recommended.
2.5.2 Ethical issues/considerations in data collection
Varkevisser et al. (2003) emphasised that when we develop our data collection techniques, we
need to consider whether our research procedures are likely to cause any physical or emotional
harm. The harm may be caused, for example, by:
• Violating informants’ right to privacy by posing sensitive questions or by gaining access to
records which may contain personal data;
• Observing the behaviour of informants without their being aware (concealed observation should
therefore always be crosschecked or discussed with other researchers with respect to ethical
admissibility);
• Allowing personal information to be made public which informants would want to be kept
private; and
• Failing to observe/respect certain cultural values, traditions or taboos valued by your informants,
e.g. dress code.
Several methods for dealing with these issues may be recommended:
• Obtaining informed consent before the study or the interview begins, and ensuring that all
subjects participate voluntarily;
• Not exploring sensitive issues before a good relationship has been established with the informant;
• Ensuring the confidentiality of the data obtained; and
• Learning enough about the culture of informants to ensure it is respected during the data
collection process.
If sensitive questions are asked, for example, about family planning or sexual practices, or about
opinions of patients on the health services provided, it may be advisable to omit names and
addresses from the questionnaires.
2.5.3 Challenges faced by researchers in data collection
Some of the challenges encountered during data collection include the following:
• Inadequate quality control of the work done by assistants
• Failure to carry out a pilot study, leading to haphazard work in the field
• Poor definition and selection of the sample
• Poor implementation: human and technical errors
• Lack of sufficient follow-up on non-respondents
2.6 Summary
When thinking of conducting academic research, surveys and experimental designs are usually
the first techniques that come to mind. However, experimental designs and
surveys are quantitative research approaches and, in order to understand respondents’ behaviour
and the social and cultural context, we need to perform some qualitative research as well.
Qualitative methods are a more appropriate option when we need to research
patterns and attitudes in customer behaviour, understand the depth of the environment around the
customer, and understand the cultural characteristics that influence a customer, especially when
the marketer is not familiar with the country or culture. There are certain situations where
qualitative research alone can provide the researcher with all the insights needed to make decisions
and take actions, while in other cases quantitative research might be needed as well. This lecture
has looked at the main qualitative techniques and at how and where they can be employed in research.
Quantitative data collection methods rely on random sampling and structured data collection
instruments that fit diverse experiences into predetermined response categories. They produce
results that are easy to summarize, compare, and generalize. Quantitative research is concerned
with testing hypotheses derived from theory and/or being able to estimate the size of a
phenomenon of interest. Depending on the research question, participants or experimental units
may be randomly assigned to different treatments. If this is not feasible, the researcher may collect
data on participant and situational characteristics in order to statistically control for their influence
on the dependent, or outcome, variable. If the intent is to generalize from the research participants
to a larger population, the researcher will employ probability sampling to select participants.
You should remember that qualitative research techniques involve the identification and
exploration of a number of often mutually related variables that give insight into human behaviour
(motivations, opinions, attitudes), into the nature and causes of certain problems and into the
consequences of the problems for those affected. ‘Why’, ‘What’ and ‘How’ are important
questions.
Structured questionnaires that enable the researcher to quantify pre- or post-categorised answers to
questions are an example of quantitative research techniques. The answers to questions can be
counted and expressed numerically.
Quantitative research techniques are used to quantify the size, distribution, and association of
certain variables in a study population. ‘How many?’ ‘How often?’ and ‘How significant?’ are
important questions.
The choice of data collection approach depends on the situation. Each technique is more
appropriate in some situations than in others. You may want to consider a combination of
data collection techniques. Oftentimes, both qualitative and quantitative research techniques are
used within a single study, and multiple tools often help you to meet the evaluation needs. You should
choose the tool that best meets the needs of the evaluation (Table 3.2.2).
Table 3.2.2: Summary guide to choice of data collection method
If you | Then use
Want anecdotes or in depth information; are not sure what you want to measure; do not need to quantify | Qualitative approach
Want statistical analysis; know exactly what you want to measure; want to cover a large group | Quantitative approach
2.7 Review exercise
1. State situations under which each of the observational and projective techniques of data
collection can be more appropriate
2. Write brief notes supported with relevant examples to distinguish between the following;
a) Observational and quasi-observational techniques
b) Observational and projective data collection techniques
c) Structured and unstructured observations
d) Focus Group Discussions and In-depth interviews
e) Analogies and metaphors
f) Personalisation and Protocols
3. Entrapment measures complement psycho-drawing data collection methods. The two are
inseparable. Discuss this statement giving relevant examples.
4. Study the four cases presented below, and briefly describe for each case the following
aspects:
• What type(s) of study you would propose.
• From whom (or from what) you would collect the data required for each study (your study
populations).
• Which data collection techniques, or combination of qualitative and quantitative
data collection techniques, you would use.
Case one: You have just been appointed as a Regional Education Officer in a certain region. You note
that in a certain district the number of secondary school drop outs for both girls and boys is on the
increase. You suspect that the increase may be associated with the new ruby gemstone mining activities
in some of the wards. How would you (i) identify the underlying causes of school drop outs and (ii)
determine whether and how students, parents and teachers themselves could contribute to alleviating the
problem? (iii) What methods would you borrow from the qualitative approach?
Case two: You are the new District Agricultural Officer. The district is faced with a maize food security
threat due to extensive damage of crop produce in store by a certain insect pest. Describe how you
would i) determine the perception and understanding of farmers on insect control strategies, ii) Compare
the extent of crop loss in relation to the different methods of maize storage used by farmers, iii) Collect
data to test the efficacy of two rates of insecticide application.
Case three: You are a new supermarket manager and you would like to test whether grouping similar
products on shelves helps to reduce bias in consumer selection.
Case four: You suspect that performance of students in day schools may be influenced by provision of
lunch to students. How would you collect data in a study aimed at testing that hypothesis?
2.8 References
David M. and Sutton C. D. (2004). Social Research: The Basics. Sage Publication, 2004, London. Pp.370
Gill J. and Johnson P. (2002). Research Methods for Managers. Sage Publications, 2002, London. Pp
13-45
http://www.worldbank.org/poverty/impact/methods/qualitative.htm#indepth . Last visited in August
2010.
IPDET (undated). Module 8: Data Collection Methods.
www.worldbank.org/oed/ipdet/presentation/M_08-Pr.pdf . Last visited in August 2010.
Kombo, D.K. and Tromp, D.L.A. (2006). Proposal and Thesis Writing. An Introduction. Paulines
Publications Africa.
Kothari, C. R. (2004). Research Methodology: Methods and Techniques. New Age International
Limited Publishers, Pp. 1-24
Kumar, R. (2005). Research Methodology: A step by Step Guide for Beginners. Sage Publication, India
Pvt Ltd New Delhi 2005. Pp. 6-14
Otlacan, O. (undated). Overview on Qualitative Data Collection Techniques in International
Marketing Research. http://EzineArticles.com/?expert=Otilia_Otlacan. Last visited in August
2010.
Saunders, M. N. K., Lewis, P. & Thornhill, A. (2009). Research methods for business students. (5ed).
Harlow: Prentice Hall – Financial Times.
Varkevisser, C.M., Pathmanathan, I. and Brownlee, A. (2003). Overview of Data Collection Techniques.
Module 10A: in Designing and Conducting Health Systems Research Projects: Volume 1.
Proposal Development and Fieldwork. KIT/IDRC. www.idrc.ca/en/ev-56606-201-1-DO_TOPIC.html.
Last visited in August 2010.
MODULE FOUR
DATA ANALYSIS METHODS
LECTURE ONE
ANALYSIS OF QUANTITATIVE DATA
(By Dr. S. Massomo)
1.1 Introduction
Data analysis is an important stage of the research process. Quantitative data in a raw form convey
very little meaning to most people, hence data need to be processed using quantitative analysis
techniques in order to turn them into useful information. Usually data analysis is preceded by a preprocessing stage where raw data collected may be edited, coded, classified and tabulated.
In this section, we discuss the basics of statistical analysis. However, we do not cover the techniques
in detail. If you are unsure of the techniques you should refer to your notes from undergraduate
studies or consult statistics textbooks.
1.2 Learning outcomes
At the end of this lecture, you should be able to:
• Outline various quantitative data analysis techniques and state their application
• Distinguish between descriptive and inferential statistical methods
• Recognise different types of data and apply various quantitative data analysis techniques
appropriate for a given type of data
• Select the most appropriate statistic to describe individual variables and to examine
relationships between variables and trends in your data
• Perform statistical calculations in a straightforward step-by-step manner
1.3 Basic ideas about data analysis and presentation
1.3.1 Variable
Variable is the word that is used to describe the name of any characteristic of a population or sample
that is of interest to us, e.g. age. Data is the word that describes the actual values (measurements or
observations) of the variable. Data may be either quantitative (numerical) or qualitative
(categorical). Note that the word ‘data’ is the plural form of word ‘datum’.
1.3.2 Scales of measurement
The selection of the type of analysis to use on a set of data depends on the scale of measurement of
the data. These scales are nominal, ordinal and numerical (sub-divided into interval and ratio). The
qualities of the scales of measurement are shown in Table 4.1.1.
Table 4.1.1: Comparison of Qualities of Measurement Scales (After Dunn, 2001).
Scale name | Defining features | Examples
Nominal | Names, labels, categories. Qualitative operations: =, ≠ | Gender (1=male, 2=female), ethnicity or religion of a person, smoker vs non smoker
Ordinal | Observations ordered or ranked. Qualitative operations: <, > | Class rank (1st, 2nd ..), rank such as low, high
Interval | Order or ranking, equal intervals between observations, no true zero point. Qualitative operations: +, -, X, ÷ | Fahrenheit temperature, IQ score
Ratio | Order or ranking, equal intervals between observations, true zero point. Qualitative operations: +, -, X, ÷ | Weight, height, reaction time, speed etc.
Note: moving down the table from nominal to ratio, the scales go from more qualitative (providing less
information) to more quantitative (providing more information).
a). Nominal scale
A nominal scale is where the data can be classified into non-numerical or named categories, and
the order in which these categories can be written or asked is arbitrary. Examples include marital status
and sex. The best you can do with such data is to count the number of observations in each
category and then calculate the proportion or percentage of all observations that fall into each category.
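The counting described above can be sketched in a few lines of Python. This is only an illustration: the function name `category_percentages` and the marital status responses are invented for the example and do not come from any study data.

```python
from collections import Counter

def category_percentages(values):
    """Return {category: (count, percent)} for a list of nominal observations."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: (n, 100.0 * n / total) for category, n in counts.items()}

# Hypothetical marital status responses, invented for illustration only
marital_status = ["single", "married", "married", "widowed", "single", "married"]

summary = category_percentages(marital_status)
for category, (count, percent) in sorted(summary.items()):
    print(f"{category}: n = {count} ({percent:.1f}%)")
```

Because the categories have no inherent order, sorting them alphabetically for printing is an arbitrary presentation choice.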
b) Ordinal scale
An ordinal scale is where the data can be classified into non-numerical or named categories and an
inherent order exists among the response categories. Ordinal scales are seen in questions that call
for ratings of quality (for example, very good, good, fair, poor, very poor) and agreement (for
example, strongly agree, agree, disagree, strongly disagree). Qualitative data that can be ranked may
be analysed by appropriate non-parametric techniques.
c) Numerical scale
A numerical scale is where numbers represent the possible response categories, there is a natural
ranking of the categories and there is a quantifiable difference within
categories and between consecutive categories. On a ratio scale, zero also has meaning (a true zero point).
1.4 Methods of quantitative data analysis
Statistical data analysis divides the methods for analysing data into two categories: exploratory
(descriptive) and confirmatory (inferential) methods. In the following section, we look at the
exploratory methods.
1.4.1 Descriptive statistics
1.4.1.1 Introduction
Descriptive statistics are statistical procedures that describe, organise and summarize the main
characteristics of sample data. These include measures of central tendency (averages - mean,
median and mode) and measures of variability about the average (range and standard deviation).
These give the reader a 'picture' of the data collected and used in the research study.
Often the first step taken towards summarizing a mass of numbers is to form a table of means and/or
a frequency distribution. We use the example of the age of a sample of active OUT students at
Morogoro to illustrate this point. We summarise the age data by using mean and standard
deviation values in a table, and then in a bar graph, histogram and pie chart.
The age data were collected from 368 students and recorded as shown in the first six columns in
Table 4.1.2. Usually some sort of data processing is done before data analysis. The data in Table
4.1.2 were first coded to change sex and programme names into numbers, and then dates of birth
were converted to age in years. Finally, ‘Age at first year’ was obtained by subtracting ‘year of
study’ from ‘Age in year 2010’.
Table 4.1.2: Data recording sheet showing students' names, date of birth, year of study, sex
and programme of study
S/N | Name | Progr. name | Date of birth | Sex | Year of study | Age in year 2010 | Progr. code | Sex code | Age at first year
1 | A | B.Com | 26.06.81 | F | 4 | 29 | 1 | 1 | 25
2 | B | B.Com | 22.02.72 | F | 3 | 38 | 1 | 1 | 35
3 | C | B.Com | 30.11.84 | F | 3 | 26 | 1 | 1 | 23
4 | D | B.Com E | 19.05.55 | M | 10 | 55 | 2 | 2 | 45
5 | E | B.Com E | 18.12.74 | F | 4 | 36 | 2 | 1 | 32
6 | F | B.Ed | 15.12.72 | F | 7 | 38 | 3 | 1 | 31
7 | G | B.Ed | 12.04.70 | F | 8 | 40 | 3 | 1 | 32
8 | H | B.Ed | 03.05.62 | M | 8 | 48 | 3 | 2 | 40
9 | I | B.Ed | 01.01.60 | M | 5 | 50 | 3 | 2 | 45
10 | J | B.Ed | 01.08.71 | F | 5 | 39 | 3 | 1 | 34
11 | K | B.Ed | 28.03.75 | F | 6 | 35 | 3 | 1 | 29
12 | ... | | | | | | | |
367 | ... | | | | | | | |
368 | Z | LLB | 07.05.82 | M | 1 | 28 | 15 | 2 | 27
NB: Age at first year was obtained by subtracting ‘year of study’ from ‘Age in year 2010’
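The derivation of the two age columns can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are invented, dates are assumed to be 'DD.MM.YY' strings with birth years in the 1900s, and 'Age in year 2010' is taken as 2010 minus the year of birth, which is what the visible rows of Table 4.1.2 imply.

```python
def age_in_2010(date_of_birth):
    """Age in 2010 from a 'DD.MM.YY' string, assuming a birth year of 19YY."""
    _day, _month, year = date_of_birth.split(".")
    return 2010 - (1900 + int(year))

def age_at_first_year(date_of_birth, year_of_study):
    """Subtract 'year of study' from 'Age in year 2010', as in the NB above."""
    return age_in_2010(date_of_birth) - year_of_study

# Three rows taken from Table 4.1.2: (name, date of birth, year of study)
records = [("A", "26.06.81", 4), ("D", "19.05.55", 10), ("Z", "07.05.82", 1)]

ages = {name: age_at_first_year(dob, year) for name, dob, year in records}
print(ages)
```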
We first summarize the data by calculating and presenting means, standard deviations and range.
The summary is shown in Table 4.1.3. Later on we show how the information on the variable ‘age
at first year’ can be summarised in a histogram, bar graph and a pie chart.
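As an illustration of these summary measures, the sketch below applies Python's standard `statistics` module to the twelve 'age at first year' values that are visible in Table 4.1.2. It uses only those twelve rows, not the full sample, so its results are not expected to match the full-sample summary.

```python
import statistics

# 'Age at first year' for the twelve rows shown in Table 4.1.2
ages_at_first_year = [25, 35, 23, 45, 32, 31, 32, 40, 45, 34, 29, 27]

mean_age = statistics.mean(ages_at_first_year)
median_age = statistics.median(ages_at_first_year)
modes = sorted(statistics.multimode(ages_at_first_year))  # every most-frequent value
sd_age = statistics.stdev(ages_at_first_year)             # sample SD (n - 1 divisor)
minimum, maximum = min(ages_at_first_year), max(ages_at_first_year)

print(f"n = {len(ages_at_first_year)}")
print(f"mean ± SD = {mean_age:.1f} ± {sd_age:.1f}")
print(f"median = {median_age}, mode(s) = {modes}")
print(f"min = {minimum}, max = {maximum}, range = {maximum - minimum}")
```

The sample standard deviation (divisor n − 1) is used here because the students are treated as a sample from a larger population.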
Table 4.1.3: Summary of Age of students at first year of study across study programmes and sex
(n=366). Means are shown ± standard deviation; a dash means there were no students in that group.
Program | All students: Mean | Min | Max | N | Female students: Mean | Min | Max | N | Male students: Mean | Min | Max | N
B.Com | 27.7 ± 6.4 | 23 | 35 | 3 | 27.7 ± 6.4 | 23 | 35 | 3 | - | - | - | 0
B.Com Ed | 38.5 ± 9.2 | 32 | 45 | 2 | 32.0 | 32 | 32 | 1 | 45.0 | 45 | 45 | 1
B. Ed | 36.2 ± 8.5 | 21 | 54 | 112 | 36.1 ± 8.4 | 21 | 53 | 44 | 36.3 ± 8.7 | 21 | 54 | 68
BA Ed | 32.8 ± 8.8 | 21 | 50 | 78 | 31.9 ± 9.7 | 21 | 50 | 36 | 33.5 ± 8.1 | 23 | 50 | 42
BA Gen | 35.8 ± 6.8 | 24 | 50 | 26 | 37.0 ± 7.3 | 24 | 46 | 6 | 35.5 ± 6.8 | 26 | 50 | 20
BA Jour | 29.3 ± 4.6 | 23 | 34 | 4 | 34.0 | 34 | 34 | 1 | 27.7 ± 4.2 | 23 | 31 | 3
BA Soc | 35.9 ± 7.3 | 24 | 49 | 14 | 32.7 ± 7.6 | 24 | 38 | 3 | 36.8 ± 7.3 | 28 | 49 | 11
BA Tour | 35.3 ± 10.2 | 23 | 50 | 7 | - | - | - | 0 | 35.3 ± 10.2 | 23 | 50 | 7
BBA Ed | 40.3 ± 5.3 | 36 | 48 | 4 | 39.0 | 39 | 39 | 1 | 40.7 ± 6.4 | 36 | 48 | 3
BBA Gen | 32.8 ± 6.2 | 23 | 46 | 24 | 28.5 ± 3.7 | 24 | 32 | 4 | 33.7 ± 6.3 | 23 | 46 | 20
BSc Ed | 29.1 ± 8.9 | 21 | 50 | 22 | 25.2 ± 5.8 | 21 | 38 | 10 | 32.4 ± 9.9 | 22 | 50 | 12
BSc Env. | 39.8 ± 12.4 | 23 | 51 | 4 | - | - | - | 0 | 39.8 ± 12.4 | 23 | 51 | 4
BSc Gen | 38.6 ± 7.3 | 24 | 50 | 36 | 40.6 ± 7.8 | 24 | 50 | 8 | 38.1 ± 7.1 | 24 | 49 | 28
CYP | 32.0 | 32 | 32 | 1 | - | - | - | 0 | 32.0 | 32 | 32 | 1
LLB | 34.3 ± 7.4 | 23 | 50 | 29 | 36.8 ± 13.1 | 24 | 50 | 4 | 34.0 ± 6.4 | 23 | 48 | 25
Total | 34.8 ± 8.4 | 21 | 54 | 366 | 33.7 ± 9.1 | 21 | 53 | 121 | 35.3 ± 8.0 | 21 | 54 | 245
Comments on the data: You will notice that, incidentally, there are no female students in the BA Tour,
BSc Env and CYP programmes; likewise there are no male students in B.Com. Age at first
year ranges from 21 to 54, with a mean of 34.8 and a standard deviation of 8.4. Among all students, the
highest variation in age appears in BSc Env. (standard deviation 12.4). Generally, age at first year seems
to be similar for male and female
students. However, female students seem to start studying at a younger age than male students in
BSc Ed, BBA Ed and BA Soc programmes.
We further use the data in the column of number of all students (i.e. column 5, Table 4.1.3) to plot
Figure 4.1.1. The histogram in Figure 4.1.2 was prepared using seven classes representing the range
of age at first year i.e. 21 – 54. Additionally a pie chart is presented in Figure 4.1.3 to indicate the
relative proportion of the distribution of the students’ age groups.
[Figure: bar chart. x-axis: Programmes; y-axis: Number of students]
Figure 4.1.1: Bar Chart showing number of students at Morogoro in different undergraduate
programmes (n=367)
[Figure: histogram. x-axis: Age groups (21-25, 26-30, 31-35, 36-40, 41-45, 46-50, 51-55); y-axis: Number of students]
Figure 4.1.2: Histogram showing the proportion of students age groups at first year of study
(n=367)
[Figure: pie chart]
Figure 4.1.3: Pie chart showing proportion of students in the seven age groups (n=367)
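The grouping behind a histogram or pie chart of this kind can be sketched as below. The function name `age_group` is invented, the class limits (width 5, starting at 21) follow the seven age groups named in the figures, and the data are only the twelve rows visible in Table 4.1.2, so the counts illustrate the method rather than reproduce the figures.

```python
from collections import Counter

def age_group(age, start=21, width=5):
    """Return the class label (e.g. '21-25') that contains the given age."""
    lower = start + ((age - start) // width) * width
    return f"{lower}-{lower + width - 1}"

# 'Age at first year' for the twelve rows shown in Table 4.1.2
ages = [25, 35, 23, 45, 32, 31, 32, 40, 45, 34, 29, 27]

frequency = Counter(age_group(a) for a in ages)
total = sum(frequency.values())

# Frequencies feed a histogram; relative proportions feed a pie chart
for label in sorted(frequency):
    share = 100.0 * frequency[label] / total
    print(f"{label}: {frequency[label]} students ({share:.1f}%)")
```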
1.4.2 Inferential Statistics
1.4.2.1 Introduction
In the previous section, we have shown how age data can be summarised, organised and presented in
different ways. In this section, we look at another data analysis technique called inferential statistics.
Inferential statistics extend the scope of descriptive statistics by examining the relationships within
a set of data; in particular, inferential statistics enable the researcher to make inference, that is,
judgements about the population based on the relationships within the sample data. This is achieved
by estimation of population parameters and by hypothesis testing.
Most measurement data where the aim is to compare means can be analysed using an analysis of
variance (ANOVA), a t-test or a suitable non-parametric method. Scores and proportions often use a
chi-squared test, while cause-effect/dose-response relationships use regression analysis. Other
methods may be needed when there are multiple outcomes. You should note that the design of an
experiment and its statistical analysis are intimately connected.
1.4.2.2 Hypothesis testing
Hypothesis testing: Also known as testing of hypothesis or test of significance, is a statistical test
that examines a set of sample data and, on the basis of an expected distribution of the data (e.g. t, Z, F
or Chi-square), leads to a decision about whether to reject the null hypothesis (Ho) in favour of the
alternative hypothesis (Hi). It is a procedure for establishing whether a test detected a reliable difference
between two or more groups.
The steps involved in hypothesis testing are:
• State the null hypothesis (Ho) and the alternative hypothesis (Hi)
• Select the level of significance
• Choose the appropriate test to use
• Formulate a decision rule by obtaining a critical value from the table: if
the test statistic falls in the critical region we reject Ho
• Determine the value of the test statistic
• Compare the value of the test statistic with the critical value and conclude either ‘do not reject Ho’ or
‘reject Ho and accept Hi’.
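The decision-rule steps above can be sketched for a two-tailed Z test as follows. This is an illustrative sketch only: the function names are invented, and the critical value is computed with the standard library's `statistics.NormalDist` instead of being read from a printed table.

```python
from statistics import NormalDist

def z_critical(alpha=0.05, two_tailed=True):
    """Critical value of the standard normal distribution for a given alpha."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def decide(test_statistic, critical_value):
    """Two-tailed decision rule: reject Ho if |statistic| exceeds the critical value."""
    if abs(test_statistic) > critical_value:
        return "reject Ho"
    return "fail to reject Ho"

critical = z_critical(alpha=0.05)        # about 1.96
print(f"critical value = {critical:.2f}")
print(decide(2.50, critical))            # statistic falls in the critical region
print(decide(-0.50, critical))           # statistic falls outside the critical region
```

The two test statistics passed to `decide` are arbitrary numbers chosen to show both outcomes.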
a) Terminologies related to hypothesis testing
We start by describing the important terminologies related to hypothesis testing;
i) Hypothesis: A statement about a population that is subject to testing. A hypothesis may be null
or alternative.
ii) Null hypothesis: A statement about a population that is under test. It is denoted Ho, and it states
that there is no difference between means or that there is no effect. It always includes the equal sign ‘=’.
iii) Alternative hypothesis: A statement that is true when Ho is false and answers your research
question. It can assume three possible forms; Hi: X > µ, or X < µ or X ≠ µ. The alternative hypothesis
(Hi) determines whether the test is one tailed (left or right) or two tailed. It is characterised by the presence
of an inequality sign.
iv) Level of significance: Also called alpha (α). It is the probability of
rejecting the null hypothesis when it is true, and it is chosen before the test is carried out. There is no
one level of significance that is applied to all tests; 0.05 and 0.01 are common for most studies.
The related p-value is the probability of observing a sample value
as extreme as, or more extreme than, the value observed, given that the null hypothesis is true. The smaller
the p-value, the stronger the evidence against Ho; Ho is rejected when the p-value is smaller than the
chosen level of significance.
v) Test statistic: this is a value, determined from sample information that is used to determine
whether to reject the null hypothesis, e.g. Z, t, F value.
vi) A critical value: The value(s) which separates the critical region from the non critical region.
The critical values are determined independently of the sample statistics. They are read from
appropriate tables of distribution.
vii) Critical region: also called rejection region, is a set of all values which would cause us to reject
Ho. If the test statistic falls in the rejection region Ho is rejected.
viii) Decision rule/statement: A statement made about the null hypothesis after comparing the test
statistic with the critical value. It is either ‘reject the null hypothesis’ (when the test statistic falls in
the critical region) or ‘fail to reject the null hypothesis’. We never accept the null hypothesis.
ix) Conclusion: A statement which indicates the level of evidence (sufficient or insufficient) at a
specific level of significance and states whether the original claim is rejected (null hypothesis) or
supported (alternative hypothesis).
x) Statistical significance: Refers to whether a statistical test detected a reliable difference between
two or more groups, for instance one caused by the effect of an independent variable on a dependent
measure.
xi) Type I error: Occurs if we reject Ho when it is true. This is usually the more serious error.
xii) Type II error: Occurs if we fail to reject Ho when it is false.
For example: usually defendants are presumed innocent until proven guilty. The purpose of a court
trial is to see whether a null hypothesis of innocence is rejected by the weight of the data (evidence).
The null hypothesis Ho: the person is innocent
The alternative hypothesis Hi: the person is guilty
Which is the more serious error? Convicting an innocent person (Type I) or letting the guilty person go free (Type II)?
b) Testing for the population mean for large samples
i) Introduction
The standard normal distribution, that is the Z distribution, is the appropriate distribution to use when
the sample has at least 30 observations and the population standard deviation is unknown. If the
sample size is less than 30, consider using the t-test described in section 11.2.4. The Z statistical test
uses the sample standard deviation to estimate the unknown population standard deviation. In the
following sections we look at the use of the Z test for inference about one or two populations.
ii) Inference about one population mean
The one sample Z test may be used to test whether there is any significant difference between a
sample mean and a known population mean value. We can verify whether the age of students in first year,
using the sample of Morogoro OUT students, is similar to the starting age in conventional
universities. Suppose the average age of students when they begin studying in conventional
universities is 28 years; the one sample Z test will be used to compare the sample mean with the
population mean of 28 years.
The sample statistics are mean = 34.8, n = 366 and standard deviation = 8.4 (see Table 4.1.3). In this
case Ho: µ = 28 and Hi: µ ≠ 28, where µ is the mean age at first year of the population from which the
sample was drawn. If we decide to use the 0.05 significance level, computation
of the value of the test statistic is done as follows.
The formula is:
$$Z = \frac{\bar{X} - \mu}{SD / \sqrt{n}}$$
Where $\bar{X}$ = Sample mean, µ = Population mean, SD = Sample standard deviation and n = sample
size.
$$Z = \frac{34.8 - 28}{8.4 / \sqrt{366}} = \frac{6.8}{0.44} = 15.45$$
Critical value from table at p = 0.05 is 1.96
Conclusion: Since the test statistic (Z = 15.45) exceeds the critical value (1.96), we reject the null
hypothesis Ho: µ = 28 and accept the alternative hypothesis Hi: µ ≠ 28. In other words the
sample mean (34.8) is significantly different from the population mean (28).
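The one sample Z calculation can be checked with a short Python sketch. The function name `one_sample_z` is invented; the inputs are the sample statistics quoted in this example. Without intermediate rounding the statistic is about 15.49; the small gap from 15.45 is consistent with rounding the standard error to 0.44 before dividing.

```python
import math
from statistics import NormalDist

def one_sample_z(sample_mean, population_mean, sample_sd, n):
    """Z statistic for comparing a sample mean with a known population mean."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error

# Values from the worked example: mean 34.8, SD 8.4, n = 366, population mean 28
z = one_sample_z(sample_mean=34.8, population_mean=28, sample_sd=8.4, n=366)
critical = NormalDist().inv_cdf(0.975)   # two-tailed critical value at the 0.05 level

print(f"Z = {z:.2f}, critical value = {critical:.2f}")
print("reject Ho" if abs(z) > critical else "fail to reject Ho")
```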
iii) Inference about two means using the Z test
We use the data of age of BA Ed students (see Table 4.1.3) to illustrate how Z test can be used to test
for statistical difference between two samples. In this case, we compare the mean age at first year for
female students versus that of male students in the BA Ed programme, using a two sample Z test at
the 0.05 level of significance.
Table 4.1.4: Mean age, standard deviation data for BA Ed students
Group | Mean age | Standard deviation | Sample size
Female students | 31.9 | 9.7 | 36
Male students | 33.5 | 8.1 | 42
Null Hypothesis (Ho): X1 = X2 that is, the age at first year is the same for female and male BA Ed
students. Alternative Hypothesis (Hi) X1 ≠ X2 that is, the age at first year is different for female and
male BA Ed students.
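$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{SD_1^2}{n_1} + \frac{SD_2^2}{n_2}}} = \frac{31.9 - 33.5}{\sqrt{\frac{9.7^2}{36} + \frac{8.1^2}{42}}} = \frac{-1.6}{\sqrt{2.614 + 1.562}} = \frac{-1.6}{2.043} = -0.78$$
(but we use the absolute value 0.78)
Critical value from table at p = 0.05 is 1.96
Conclusion: Since the test statistic (calculated absolute value of Z: 0.78) is less than the critical value
(1.96), we fail to reject the null hypothesis and conclude that the two means for age at first year do
not differ significantly.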
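A sketch of the two sample Z calculation is given below. It assumes the usual unpooled form of the statistic, in which each sample variance is divided by its own sample size; the function name `two_sample_z` is invented. With unrounded arithmetic this form gives a value of about -0.78, well inside ±1.96, so Ho is not rejected.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic for the difference between two independent sample means."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# BA Ed students (Table 4.1.4): female 31.9 ± 9.7 (n = 36), male 33.5 ± 8.1 (n = 42)
z = two_sample_z(31.9, 9.7, 36, 33.5, 8.1, 42)
critical = NormalDist().inv_cdf(0.975)   # two-tailed critical value at the 0.05 level

print(f"Z = {z:.2f}, critical value = {critical:.2f}")
print("reject Ho" if abs(z) > critical else "fail to reject Ho")
```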
c) Testing for the population mean for small samples
i) Introduction
In the previous section, we showed how the Z test statistic can be used in hypothesis testing. In this
section, we look at hypothesis testing using the t test. The t-test, which uses the t distribution, is used
for small samples (n<30) because the Z distribution provides unreliable estimates of differences
between samples when the number of available observations is less than 30.
Characteristics of the t-distribution are:
• It is a continuous distribution
• Mound (note it is not bell) shaped and symmetrical
• Flatter or more spread out than the standard normal distribution
• There is a family of t-distributions depending on the number of d.f.
Application of the t-test
The t test was created to deal with small samples when parameters and variability of larger parent
population are unknown. The t tests are used to compare one or two sample means but not more
than two means. The use of t test in hypothesis testing assumes that the sample was drawn from a
normally distributed population. The t test detects a significant difference between means when the
(i) difference is large, (ii) sample standard deviation is small and/or (iii) sample size is large.
Variations of the t test
I. Single or one sample t test
This is used to compare the observed mean of one sample with a hypothesized value assumed to
represent a population. The t and Z tests both use similar formulas:
$$\text{Test statistic} = \frac{\text{sample mean} - \text{hypothesized mean}}{\text{standard error of the mean}} = \frac{\bar{X} - \mu}{SD / \sqrt{n}}$$
It tries to answer the question: is it likely that a sample with a given mean could have come from a
population with the proposed µ? It is usually used to determine if some set of scores or observations
deviates from some established pattern.
If the population standard deviation, sigma, is unknown, then the standardised sample mean follows a
Student's t distribution, and you will be using the t-score formula for sample means. The test statistic
is very similar to that for the z-score, except that sigma has been replaced by s (the sample standard
deviation) and z has been replaced by t. The critical value is obtained from the t-table. The degrees of
freedom for this test are n − 1.
Worked example of a single or one sample t test
Suppose a retail shop sells an average of 320 units of a certain product per day with a standard
deviation of 40. After an extensive advertising campaign the manager calculated the average sales of
the product for the next 25 days to see whether an improvement has occurred. The average sales
turned out to be 345 units per day.
From this information can we say that the advertisement significantly improved sales of the product?
We cannot say until we perform a statistical test; in this case we use the one sample t-test
and the 0.05 level of significance.
The hypotheses
Null Hypothesis (Ho): X = µ. The advertisement does not improve sales of the product.
Alternative Hypothesis (Hi): X > µ. The advertisement does improve sales of the product.
We use the formula
$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{345 - 320}{40 / \sqrt{25}} = \frac{25}{8} = 3.125$$
Critical value at t 0.05, 24 df = 1.711 (one tailed)
Conclusion:
Since the value of the test statistic (t = 3.125) is greater than the critical value (1.711),
we reject the null hypothesis Ho: X = µ and accept the alternative hypothesis Hi: X > µ. In
other words the sample mean (345) is significantly greater than the population mean (320), hence
the advertisement improved sales of the product.
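The arithmetic of this worked example can be checked with a short Python sketch. It assumes the figures quoted above (hypothesized mean 320, standard deviation 40, n = 25, sample mean 345). The function name one_sample_t is illustrative only, and the critical value 1.711 is simply the table value quoted in the text; the code does not compute it.

```python
import math


def one_sample_t(sample_mean, hypothesized_mean, sd, n):
    """Return (t, degrees of freedom) for a one sample t test."""
    standard_error = sd / math.sqrt(n)  # standard error of the mean
    t = (sample_mean - hypothesized_mean) / standard_error
    return t, n - 1


t_value, df = one_sample_t(sample_mean=345, hypothesized_mean=320, sd=40, n=25)
critical_value = 1.711  # one tailed, 0.05 level, 24 df (read from the t-table)
reject_null = t_value > critical_value

print(f"t = {t_value:.3f}, df = {df}, reject Ho: {reject_null}")
```

Because the standard error is 40/√25 = 8, the statistic is 25/8 = 3.125, which exceeds the critical value of 1.711.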
II. The T test for independent groups (two sample test)
Samples are independent when they are not related. Independent samples may or may not have the
same sample size. Independent sample t test is designed to detect significant difference between one
group e.g. a control group and another group such as the experimental group.
It tries to answer the question: Is X1 different from X2, or could the two sample means come from
identical populations?
Assumptions required for the independent t test:
• The populations are normally distributed
• The populations are independent
• The standard deviations are the same in both populations, therefore the standard deviations are pooled
Worked example of a two sample t test
Suppose we have two samples with the following measurements, and we want to test whether the
two means are statistically different say at the 0.05 level of significance.
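Sample 1:   2   4   9   3   2
Sample 2:   3   7   5   8   4   3
Step 1. Calculate the means and standard deviation for each sample
Sample 1: n1 = 5, X1 = 4.0, SD = 2.9155
Sample 2: n2 = 6, X2 = 5.0, SD = 2.0976
Step 2. Pool the sample variances
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{4(2.9155)^2 + 5(2.0976)^2}{9} = 6.222$$
Step 3. Determine the t value
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{4.0 - 5.0}{\sqrt{6.222 \left( \frac{1}{5} + \frac{1}{6} \right)}} = -0.662, \qquad |t| = 0.662$$
Where
$\bar{X}_1$ is the mean of the first sample
$\bar{X}_2$ is the mean of the second sample
$n_1$ is the number in the first sample
$n_2$ is the number in the second sample
$S_p^2$ is the pooled estimate of the population variance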
Conclusion: Since calculated t value (0.662) is less than the critical value of 2.262 at p=0.05 and 9
degrees of freedom, we fail to reject the null hypothesis and conclude that there is no statistical
difference between the means of the two samples
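As a check on the three steps above, the following Python sketch recomputes the pooled variance and the t value from the two samples. It uses only the standard library; the function name pooled_t_test is an illustrative choice, and the critical value 2.262 is the table value quoted in the text rather than something the code derives.

```python
import math
from statistics import mean, variance


def pooled_t_test(sample1, sample2):
    """Return (t, pooled variance, degrees of freedom) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pool the two sample variances, weighting each by its degrees of freedom
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / standard_error
    return t, pooled_var, df


sample1 = [2, 4, 9, 3, 2]
sample2 = [3, 7, 5, 8, 4, 3]
t_value, pooled_var, df = pooled_t_test(sample1, sample2)
critical_value = 2.262  # two tailed, 0.05 level, 9 df (read from the t-table)
reject_null = abs(t_value) > critical_value

print(f"pooled variance = {pooled_var:.3f}, t = {t_value:.3f}, df = {df}")
print(f"reject Ho: {reject_null}")
```

The sign of t only reflects which sample is listed first; for a two tailed test the absolute value is compared with the critical value.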
III. Dependent samples t test (paired samples t test)
Dependent sample t test is commonly used with samples in which the subjects are paired or matched
in some way. Dependent samples must have the same sample size, but note that it is possible to
have the same sample size without being dependent.
Types of dependent samples are:
(i) Those characterised by a measurement, an intervention of some type, then another measurement.
In this case a paired t test is designed to detect the presence of measurable change in the average
attitude/behaviour of group from one point in time to another point in time.
(ii) Those characterised by a matching or pairing of observations, e.g. newly- wed couples where
husbands are paired with their wives.
It tries to answer the Question: Is the mean one (X1) different from mean two (X2)?
Procedure
We calculate the mean of the differences ($\bar{d}$), not the difference between the two means. The idea
with the dependent t test is to create a new variable, D, which is the difference between the paired
values. You will then be testing the mean of this new variable.
Here are some steps to help you accomplish the hypothesis testing:
• Write down the original claim in simple terms. For example: After > Before.
• Move everything to one side: After - Before > 0.
• Call the difference you have on the left side D: D = After - Before > 0.
• Convert to proper notation: $\mu_D > 0$.
• Compute the new variable D and be sure to follow the order you have defined in the third
step above. Do not simply take the smaller away from the larger. From this point, you can think of
having a new set of values. Technically, they are called D, but you can think of them as X. The
original values from the two samples can be discarded.
• Find the mean and standard deviation of the variable D. Use these as the values in the t-test.
The value of t is
$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$
Note the standard error (S.E.), $s_d / \sqrt{n}$, in the denominator of this formula.
$$s_d = \sqrt{\frac{\sum d^2 - (\sum d)^2 / n}{n - 1}}$$
Note that this formula is similar to the formula for the standard deviation (s.d.).
$$\bar{d} = \frac{\sum d}{n}$$
Where $\bar{d}$ = mean of the differences between paired or related observations,
$s_d$ = standard deviation of the distribution of the differences between paired or related observations,
and n = number of pairs.
Worked example: Paired t test
A study was carried out to measure the effect of a fitness campaign for OUT students at Dodoma
regional centre. Five students were randomly sampled and their weights (in Kg) were recorded
before and after the exercise as presented in the following table. We determine, using the 0.05 level
of significance, whether the campaign had any significant effect on the weights of the students.
Table 4.1.5a: Weight of students in Kilogrammes, before and after the fitness campaign.
Student   Before   After    D (Before - After)   d2
A         88.45    89.35    -0.90                 0.81
B         76.65    73.93     2.72                 7.40
C         83.00    81.65     1.35                 1.82
D         70.30    68.04     2.26                 5.11
E         76.20    72.57     3.63                13.18
Total                        9.06                28.32
Mean                         1.81
$$s_d = \sqrt{\frac{\sum d^2 - (\sum d)^2 / n}{n - 1}} = \sqrt{\frac{28.32 - (9.06)^2 / 5}{4}} = \sqrt{2.976} = 1.73$$
$$t = \frac{\bar{d}}{s_d / \sqrt{n}} = \frac{1.812}{1.725 / \sqrt{5}} = \frac{1.812}{0.771} = 2.35$$
Critical value at 4 degrees of freedom = 2.132 (This is read from the Student's t distribution table at
4 degrees of freedom for a one tailed test)
Conclusion
Since the test statistic (calculated t value: 2.35) > the critical value (2.132), we reject the null
hypothesis and accept the alternative hypothesis. We conclude that the fitness campaign significantly
reduced the weights of students.
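The paired calculation can be reproduced with the Python sketch below, which builds the difference variable D = Before - After from Table 4.1.5a and then tests its mean. Variable names are illustrative, and the critical value 2.132 is the table value quoted in the text. Small differences in the last decimal place relative to hand calculation come from rounding of intermediate values.

```python
import math
from statistics import mean, stdev

before = [88.45, 76.65, 83.00, 70.30, 76.20]
after = [89.35, 73.93, 81.65, 68.04, 72.57]

# Keep the order fixed for every pair: D = Before - After
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
d_bar = mean(differences)            # mean of the differences
s_d = stdev(differences)             # sample standard deviation of the differences
standard_error = s_d / math.sqrt(n)
t_value = d_bar / standard_error

critical_value = 2.132  # one tailed, 0.05 level, 4 df (read from the t-table)
reject_null = t_value > critical_value

print(f"mean D = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t_value:.2f}")
print(f"reject Ho: {reject_null}")
```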
d) The Analysis of Variance/ F-Test
i) Introduction
The F-test, commonly known as the analysis of variance (ANOVA), is the most widely used method
of statistical analysis of quantitative data. It is important that every researcher doing quantitative
studies should be familiar with this technique. The ANOVA calculates the probability that
differences among the observed means could simply be due to chance. The ANOVA is based on the F
statistical distribution. It is closely related to Student's t-test, but whereas the t-test is only suitable
for comparing two treatment means, the ANOVA can be used both for comparing several means and
in more complex situations, provided that certain assumptions are met. The t statistic and F ratio
share a specific relationship when only two means are present. The relationship is
$$\sqrt{F} = t, \quad \text{or} \quad t^2 = F$$
The ANOVA partitions the total variation into a number of parts such as Treatment, Block, Error
and Total, depending on the design of the experiment.
The assumptions that have to be met include:
1. Data are on an interval or ratio scale
2. The populations from which the samples were drawn have equal variances
3. The observations are independent
4. The residuals (deviations from group means) have a normal distribution
5. Data were randomly sampled
In some cases a scale transformation is necessary for the assumptions required for a valid
ANOVA to be met.
ii. Variations of ANOVA
There are several variations of the ANOVA used according to the number of factors, treatments and
other sources of variation. (A factor is an independent variable within the ANOVA. A factor must have
two or more levels within it to be analytically viable.)
• The completely randomised design (CRD): this is the simplest one factor experimental
design and is only used when experimental units are homogeneous. For instance, the nutrient medium in
Petri dishes is a homogeneous mixture that can be used for testing the sensitivity of a certain bacterium to
different drugs or different levels of a single drug.
• Blocked or stratified designs: these are the most commonly used in ANOVA. The designs
include the randomised block and the Latin square. They break the experiment up into smaller
"mini-experiments". These designs are usually analysed by a two-way analysis of variance without
interaction (see worked example).
• Factorial designs: these designs look at the effect of two or more "factors" simultaneously.
There can be any number of factors and any number of levels of each factor. Factorial designs
provide extra information at little or no extra cost. They are commonly used to:
  - Study any interactions among factors. It is often important to know what factors influence
  the outcome of an experiment.
  - Increase the amount of information from an experiment without increasing the numbers of
  experimental units. They are almost like doing two or more experiments simultaneously with the
  same experimental units.
  - Find the combination of factors which produces most sensitivity in subsequent similar
  experiments.
Blocking and factorial designs are not mutually exclusive. Both can be used in a single experiment.
We look at the example of a single factor ANOVA using the Randomised Complete Block Design
(RCBD).
The single-factor ANOVA
The RCB design partitions the total variation into parts associated with treatments, blocks and error.
The design consists of blocks of equal size, each of which contains all treatments. The blocking technique
helps to reduce experimental error by eliminating the contribution of known sources of variation
among the experimental units. This is achieved by grouping experimental units into blocks such that
variability within each block is minimised and variability among the blocks is maximised. In
experiments blocking is most effective when the experimental area has a predictable pattern of
variability, e.g. slope of field, soil fertility gradient etc.
Table 4.1.6: Layout of a Randomised Complete Block experiment with four treatments (A, B,
C & D) and four blocks/replications.
Note that each treatment appears once in each block.
Block I:     A   C   D   B
Block II:    C   B   A   D
Block III:   C   D   B   A
Block IV:    B   A   D   C
(Arrow across the blocks: predictable direction of variation in fertility or soil moisture)
Worked example: the Single factor two way ANOVA
A field experiment was carried out to test the effect of fertilizer application on the yield of rice.
There were three treatments, namely: (i) no fertilizer, (ii) fertilizer A and (iii) fertilizer B. (Treatment
refers to the cause or specific source of variation in the data. In experiments it is a procedure whose
effect is being measured, in this case fertiliser application.) The no fertilizer treatment serves as a
control. The experiment was laid out in a Randomised Complete Block Design with four replications.
In this case we test whether there are any statistical differences among the means for treatments and
blocks/replications. This is an example of a two way ANOVA with two sources of variation, i.e.
treatments and blocks (replications).
Null hypothesis H0: µ1 = µ2 = µ3
Alternative H1: The mean of at least one treatment is different.
Table 4.1.7: Yield of rice (Kg/plot) following application of fertilizer in a field trial laid out in
Randomised Complete Block Design.
Treatments      Rep 1   Rep 2   Rep 3   Rep 4   Treat. totals
No fertilizer   6.0     6.5     5.5     6.4     24.4
Fertilizer A    6.9     7.0     6.6     7.5     28.0
Fertilizer B    7.2     7.8     6.8     7.4     29.2
Rep. totals     20.1    21.3    18.9    21.3    81.6
The grand total: Σx = 6.0 + 6.5 + ... + 7.4 = 81.6
Correction factor (CF) = (Σx)²/n = (81.6)²/12 = 554.88
Total sum of squares
= 6.0² + 6.5² + ... + 7.4² - CF
= 559.56 - CF = 4.68
Replication sum of squares
= ((20.1² + 21.3² + 18.9² + 21.3²) / 3) - CF
= (1668.60 / 3) - CF
= 556.20 - CF = 1.32
Treatment sum of squares
= ((24.4² + 28.0² + 29.2²) / 4) - CF
= (2232.00 / 4) - CF
= 558.00 - CF = 3.12
Error sum of squares
= Total SS - (Rep SS + Treat SS)
= 4.68 - (1.32 + 3.12) = 0.24
Table 4.1.8: Analysis of variance of data in Table 4.1.7 from single factor two way
Randomised Complete Block Design.
Source of variation   Degree of freedom   Sum of squares   Mean SS   F-Value   Table F-value (p ≤ 0.05)
Replication           3                   1.32             0.44      11.00     4.76
Treatment             2                   3.12             1.56      39.00     5.14
Error                 6                   0.24             0.04
Total                 11                  4.68
Conclusion
(i) Since the calculated F value of 39.0 (test statistic) for treatments > table value 5.14 (critical value), we
reject Ho. We accept Hi and conclude that the means for treatments/fertilizers differ
significantly at p ≤ 0.05.
(ii) You will also notice that the calculated F value of 11.0 (test statistic) for replications > table value
4.76 (critical value), so we also reject Ho for replications. We accept Hi and conclude that the means for
replications differ significantly.
From the above example, the ANOVA shows that the probability that the observed differences
among the means for the three treatments (fertilizers) and among the replications arose by chance
sampling variation is at most 0.05 (p ≤ 0.05).
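The sums of squares and F values of this worked example can be recomputed with the Python sketch below. It follows the correction factor method used above on the data of Table 4.1.7. The function name rcbd_anova and the dictionary keys are illustrative choices, and the table F values (4.76 and 5.14) are taken from the text rather than computed.

```python
def rcbd_anova(plots):
    """Two way ANOVA without interaction for a randomised complete block design.

    plots[i][j] is the observation for treatment i in replication (block) j.
    """
    t = len(plots)      # number of treatments
    r = len(plots[0])   # number of replications
    grand_total = sum(sum(row) for row in plots)
    cf = grand_total ** 2 / (t * r)  # correction factor
    total_ss = sum(x ** 2 for row in plots for x in row) - cf
    treat_ss = sum(sum(row) ** 2 for row in plots) / r - cf
    rep_totals = [sum(row[j] for row in plots) for j in range(r)]
    rep_ss = sum(total ** 2 for total in rep_totals) / t - cf
    error_ss = total_ss - treat_ss - rep_ss
    error_ms = error_ss / ((t - 1) * (r - 1))
    return {
        "CF": cf,
        "Total SS": total_ss,
        "Rep SS": rep_ss,
        "Treat SS": treat_ss,
        "Error SS": error_ss,
        "F rep": (rep_ss / (r - 1)) / error_ms,
        "F treat": (treat_ss / (t - 1)) / error_ms,
    }


# Rows: no fertilizer, fertilizer A, fertilizer B; columns: replications 1 to 4
rice_yield = [
    [6.0, 6.5, 5.5, 6.4],
    [6.9, 7.0, 6.6, 7.5],
    [7.2, 7.8, 6.8, 7.4],
]
result = rcbd_anova(rice_yield)
for name, value in result.items():
    print(f"{name}: {value:.2f}")

treatments_differ = result["F treat"] > 5.14  # table F at p <= 0.05 with 2 and 6 df
replications_differ = result["F rep"] > 4.76  # table F at p <= 0.05 with 3 and 6 df
```

The same function could be applied to any complete table in which every treatment appears exactly once in every block.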
Other examples of situations where ANOVA can be used
Case one: A certain supermarket has shops in four locations namely Temeke, Upanga, Kariakoo and
Ubungo. The manager recorded weekly sales of one product for each shop as shown in Table 4.1.9
below.
Table 4.1.9: Number of product sold at four supermarkets over a period of four weeks
Week   The number of product sold in each week
       Temeke   Upanga   Kariakoo   Ubungo
1      124      155      311        170
2      234      208      350        188
3      430      234      298        168
4      120      249      330        174
In this case the manager can use a single factor two way ANOVA to test difference in mean sales
from the four shops across the four weeks and also test whether there is any difference in weekly
sales across all shops.
Case two: In recent years, affordable motorcycles, nicknamed ‘Yebo-yebo’, were imported and sold
throughout the country where some of them are used to carry passengers. However, most regions
have recorded an upward trend in the number of accidents involving motorcycles. As the new Chief
of Police in Dar es Salaam you would like to determine whether there is a difference in the mean
number of accidents involving motorcycles in three districts. Suppose the record of the number of
accidents reported in each district for a sample of seven days is as follows;
Table 4.1.10: Record of motorcycle accidents in three districts
Day   Number of motorcycle accidents reported
      Temeke   Ilala   Kinondoni
1     21       12      18
2     13       14      21
3     18       13      15
4     19       15      20
5     18       12      22
6     11       10      18
7     16       11      16
The Chief of Police can use a single factor two-way ANOVA to test whether there are any differences
in the means for the three districts and for the seven-day period.
e) Mean Separation and Scale transformation
i) Mean separation
The ANOVA technique does not indicate which mean differs from which. Further analysis is
required to determine which means are significantly different. There are two strategies that can be
used: i) Planned, single degree of freedom F- tests (orthogonal contrasts) and ii) Multiple
comparisons (means separation). The mean separation techniques include: Least Significant
Difference, Student-Newman-Keuls, Duncan's Multiple Range Test, etc.
ii) Data transformation (Scale transformation)
A scale transformation may be needed to improve the situation if the assumptions about normality
of the residuals and homogeneity of variances are not met. Unfortunately, there is no general rule on
how much of a departure from normality and homogeneity of variance there has to be to make a
transformation of scale or the use of non-parametric methods necessary. In a borderline case it is
probably sensible to transform the data. Three transformations commonly used are:
• The logarithmic transformation for skewed measurement data
Biological data often have a skewed distribution, particularly when the concentrations of something
are being measured. Concentrations cannot be less than zero, but often there are a few high values.
Taking the logs (to any base, but usually to base 10) will often result in a better fit to the
assumptions. All the statistical analyses would then be done on the logarithm of each observation, but
in presenting the results, the means should be back-transformed by taking antilogs. However,
the standard deviations cannot be treated in this way. If there are some numbers below one, you can
avoid negative numbers by adding one before taking logs, and subtracting it again after taking the
antilogs.
• The square root transformation for counts
Counts where the mean count is low (e.g. where a lot of the counts are 0, 1, 2 or 3) often have a
Poisson distribution where the mean is equal to the variance. A square root transformation will
normalise the residuals, i.e. each count should be replaced by its square root. Sometimes one is added
to each number before taking square roots.
• The logit transformation for percentages
Percentages where a large proportion of the values are either less than 20% or greater than 80% have
a skewed distribution because it is not possible to have values of less than 0% or greater than 100%.
The logit transformation, X = ln{p/(1-p)}, where p is the proportion, should correct this situation.
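The three transformations can be sketched in Python as follows. The helper names and the small data set are hypothetical and serve only to show the mechanics; in particular the concentrations 1, 10 and 100 are invented for illustration and are not taken from any experiment in this module.

```python
import math


def log_transform(x, add_one=False):
    """Base 10 logarithm; optionally add one first so values below one stay non-negative."""
    return math.log10(x + 1 if add_one else x)


def back_transform_log(y, add_one=False):
    """Antilog of a value on the log scale; subtract the one again if it was added."""
    value = 10 ** y
    return value - 1 if add_one else value


def sqrt_transform(count, add_one=False):
    """Square root of a count; optionally add one to the count first."""
    return math.sqrt(count + 1 if add_one else count)


def logit(p):
    """Logit of a proportion p, where 0 < p < 1."""
    return math.log(p / (1 - p))


concentrations = [1.0, 10.0, 100.0]  # hypothetical skewed measurements
logs = [log_transform(x) for x in concentrations]
mean_log = sum(logs) / len(logs)
back_transformed_mean = back_transform_log(mean_log)

print(f"mean on log scale = {mean_log:.2f}")
print(f"back-transformed mean = {back_transformed_mean:.2f}")
print(f"sqrt of count 3 after adding one = {sqrt_transform(3, add_one=True):.2f}")
print(f"logit of 80% = {logit(0.80):.3f}")
```

Note that the back-transformed mean is the geometric mean of the original values, which is why it is smaller than their arithmetic mean when the data are skewed to the right.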
f) Regression and Correlation analyses
In the previous sections, we showed procedures used for comparisons of means and analysis of
variance. The procedures were used to evaluate only one variable at a time. However, when you
suspect interrelationships and association to occur among different variables then you should also
consider possibility of using other data analysis techniques. Interrelationships of quantitative data can
be examined using regression and correlation analysis.
i) Regression analysis
Regression analysis is used when the aim is to study a cause-and-effect relationship between
variables. Regression analysis describes the effect of one or more variables (independent variables)
on a single variable (dependent) by expressing the latter as a function of the former. It is therefore
important that you distinguish between the dependent and the independent variable before you
perform this analysis. The relationship between the dependent variable and independent variable(s)
may be specified based on (i) accepted biological concepts, secondary data or past experience or (ii)
based on the data gathered in the experiment itself.
ii) Correlation analysis
Correlation analysis is used to test the nature and strength of association between two or more
variables even when there is no evidence of a cause-and-effect relationship between variables.
Regression and correlation procedures can be classified according to the number of variables
involved and the form of functional relationship between the dependent and independent variables.
The types are; i) Simple linear regression and correlation, ii) Multiple linear regression and
correlation, iii) Simple non linear regression and correlation and iv) Multiple non linear regression
and correlation (Gomez and Gomez, 1984).
In this section, we describe the procedure used in simple linear regression and correlation because of
its simplicity and its wide usage in research. There is only one independent and one dependent
variable and the functional relationship is assumed to be linear.
• Simple linear Regression analysis
This technique deals with estimation and tests of significance concerning the two parameters α and β
in the equation Y = α + βx when a linear relationship exists. However, it does not provide any test as to
whether the best functional relationship is indeed linear. We illustrate the procedure for simple linear
regression analysis.
Worked example: simple linear regression analysis
We use hypothetical data presented in the following table. (Usually crop response to fertiliser
application becomes negative at higher levels of the fertiliser and hence the relationship
becomes non linear.)
Table 4.1.11: Five rates of fertilizer applied in a carrot farm and the corresponding yield of
carrots realised
Fertilizer rates (Kg/Ha), X    5    10   15   20   25
Carrot yield (Kg/Plot), Y      3    6    8    10   10
(a) Computation of the simple linear regression equation between the two variables
Mean X = 15, Mean Y = 7.4
ΣX² = 1375, ΣY² = 309, ΣXY = 645
$$\beta = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} = \frac{645 - 5(15)(7.4)}{1375 - 5(15)^2} = \frac{90}{250} = 0.36$$
Formula for α:
$$\alpha = \bar{Y} - \beta\bar{X} = 7.4 - 0.36(15) = 2.0$$
Simple linear regression equation: Y = α + βx, that is Y = 2 + 0.36x
This equation can be used to predict the yield of carrot that may be realised following application of
different levels of fertiliser. For instance, to estimate the yield of carrot when 22 kg/ha of
fertiliser is applied: Y = 2 + 0.36 × 22 = 2 + 7.92 = 9.92 kg/plot. Note that the predicted yield of
9.92 kg/plot is lower than the yield of 10 kg/plot obtained with application of 20 kg/ha, implying that the
yield of carrot cannot be improved by increasing the fertiliser rate beyond 20 kg/ha.
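The least squares estimates can be verified with the Python sketch below, which applies the usual formulas for the slope and intercept to the data of Table 4.1.11. The function name fit_simple_linear_regression is an illustrative choice.

```python
def fit_simple_linear_regression(x, y):
    """Return (alpha, beta) of the least squares line Y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi ** 2 for xi in x)
    beta = (sum_xy - n * mean_x * mean_y) / (sum_xx - n * mean_x ** 2)
    alpha = mean_y - beta * mean_x
    return alpha, beta


fertilizer_rate = [5, 10, 15, 20, 25]   # kg/ha
carrot_yield = [3, 6, 8, 10, 10]        # kg/plot

alpha, beta = fit_simple_linear_regression(fertilizer_rate, carrot_yield)
predicted_at_22 = alpha + beta * 22

print(f"Y = {alpha:.2f} + {beta:.2f}x")
print(f"predicted yield at 22 kg/ha = {predicted_at_22:.2f} kg/plot")
```

A prediction of this kind is only as good as the assumption of linearity, so it should be limited to fertiliser rates within the range covered by the data.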
• Simple Linear Correlation analysis
This technique deals with estimation and tests of significance of the simple linear correlation
coefficient r. The correlation coefficient (r) indicates the strength (-1.0 to +1.0) and nature (positive
or negative) of the linear association between variables. You should note that both -1 and +1 equally
indicate perfect linear association. The value of r = 0 indicates no linear relationship, r > 0 indicates
a positive linear relationship and r < 0 indicates a negative linear relationship. For example, an r value
of 0.8 implies that 64% [(100)(r²) = (100)(0.8)² = 64] of the variation in the variable Y can be explained
by the linear function of the variable X.
It should be noted that a zero r value indicates the absence of a linear relationship between two
variables but it does not indicate the absence of any relationship between the variables. It is possible
for two variables to have a non linear relationship, such as quadratic form (Gomez and Gomez, 2004).
Worked example: simple linear correlation analysis
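We use the data from the previous section to illustrate the procedure for estimation and test of
significance of the simple linear correlation coefficient r between two variables X and Y as
follows:
Formula for simple linear correlation coefficient r:
$$r = \frac{\sum XY - n\bar{X}\bar{Y}}{\sqrt{(\sum X^2 - n\bar{X}^2)(\sum Y^2 - n\bar{Y}^2)}} = \frac{645 - 5(15)(7.4)}{\sqrt{(1375 - 1125)(309 - 273.8)}} = \frac{90}{\sqrt{(250)(35.2)}} = \frac{90}{93.81} = 0.9594$$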
Testing the significance of the simple linear correlation coefficient
This can be done by evaluating the ratio of r and the standard error of the estimate and then use the t
test with n-2 degrees of freedom. However, in practice we usually compare the calculated r value to
the r value from a table of Simple linear coefficients (e.g. Gomez and Gomez, 1984: Appendix H;
Zar, 1996: Appendix B). The table values at 3 (i.e. n-2) degrees of freedom are 0.878 at the 0.05 level
of significance and 0.959 at the 0.01 level. The calculated r value is declared significant if it exceeds
the table value at a specific level of significance. Hence, we can conclude that the calculated r value
is significantly different from zero at the 0.01 probability level. This implies that there is strong
evidence that the two variables, i.e. carrot yield and fertiliser applied, are highly associated. You
should note that a significant correlation coefficient r value does not always imply a cause-effect
relationship.
Notes: Pearson r is a correlation coefficient that may be used with interval or ratio scaled data,
whereas Spearman rs is ideal for ranked or ordinal data.
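The correlation coefficient for the carrot data of Table 4.1.11 can be computed directly with the Python sketch below. It also evaluates the t form of the significance test, t = r√(n-2)/√(1-r²) with n-2 degrees of freedom, which is the standard way of expressing the ratio of r to its standard error. The function name pearson_r is illustrative, and the table values 0.878 and 0.959 are those quoted in the text.

```python
import math


def pearson_r(x, y):
    """Simple linear (Pearson) correlation coefficient between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


fertilizer_rate = [5, 10, 15, 20, 25]
carrot_yield = [3, 6, 8, 10, 10]
n = len(fertilizer_rate)

r = pearson_r(fertilizer_rate, carrot_yield)
percent_explained = 100 * r ** 2
t_value = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

significant_at_05 = abs(r) > 0.878  # table r at 0.05 with 3 df
significant_at_01 = abs(r) > 0.959  # table r at 0.01 with 3 df

print(f"r = {r:.4f}, variation explained = {percent_explained:.1f}%")
print(f"t = {t_value:.2f} with {n - 2} df")
```

With only five pairs of observations the coefficient has to be very close to 1 before it can be declared significant, which is why the table values at 3 degrees of freedom are so high.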
g) Non parametric techniques
Non-parametric methods are used when the assumptions of normality of the residuals and equal
variation in each group, required for parametric methods such as the t-test and ANOVA, are not met
and cannot be met by a suitable transformation of scale.
The more widely used methods replace each observation by its rank in the total set of observations.
This comes with a cost. In general, non-parametric tests are not as powerful as parametric tests
because, in transforming the data to ranks, they throw out some useful information. This means that a
non-parametric test may fail to detect a true treatment effect which could be detected by a parametric
method. Therefore, where possible, parametric methods should be used.
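To illustrate what replacing each observation by its rank means, the Python sketch below ranks the pooled observations of two groups, giving tied values the average of the ranks they share. The function name average_ranks is an illustrative choice, and the data are the two samples from the worked example of the two sample t test, reused here purely for illustration.

```python
def average_ranks(values):
    """Return the rank of each value (1 = smallest); ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of tied values starting at sorted position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


sample1 = [2, 4, 9, 3, 2]
sample2 = [3, 7, 5, 8, 4, 3]

pooled_ranks = average_ranks(sample1 + sample2)
rank_sum_1 = sum(pooled_ranks[: len(sample1)])
rank_sum_2 = sum(pooled_ranks[len(sample1):])

print(f"ranks = {pooled_ranks}")
print(f"rank sum of sample 1 = {rank_sum_1}, rank sum of sample 2 = {rank_sum_2}")
```

Rank based tests then work with these rank sums instead of the original measurements, which is where the loss of information mentioned above comes from: the value 9 counts only as the largest observation, however far it lies from the rest.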
i) Advantages of using non parametric test
• Usually distribution free. They make few or no underlying assumptions that must be met before they
can be applied to the data
• Can be used to analyse data that are not precisely numerical, e.g. nominal or ordinal data
• Ideal for analysing data from small samples
• Generally easy to calculate
ii) Disadvantages of non parametric test
• They are less statistically powerful than parametric tests
• The scales of measurement analysed by non parametric tests are less sensitive than those analysed by
parametric tests
iii) Examples of Non Parametric test
The commonly used non parametric tests are:
• The Mann-Whitney test: This is the equivalent of Student's t-test, but it is a test of whether the
medians (rather than the means) are the same in each group. It can be used to test whether two
independent samples have been drawn from the same population. Like the t-test it is only appropriate
when comparing two groups.
• The Kruskal-Wallis test: This is the equivalent of the one-way ANOVA. It can be used to
compare the medians of two or more groups, the null hypothesis being that all are samples from the
same population. If the over-all differences are statistically significant, post-hoc comparisons are
done by comparing each pair of medians in turn using the same test.
• The Friedman's test: This is the equivalent of the two-way ANOVA without interaction,
appropriate for a randomised block experimental design. Again, it tests the null hypothesis that all
treatment groups came from the same population, and is a test of group medians, removing any block
effect.
• Chi-square test: We describe this procedure in detail in the next section.
The Chi square test
The Chi square test is one of the non-parametric tests commonly used for statistical inference. (A non
parametric test is an inferential test that makes few or sometimes no assumptions regarding any numerical
data or the shape of the population from which the observations were drawn (Dunn, 2001). Other non
parametric tests include the Mann-Whitney test and Friedman's test.)
Chi-square test formula:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where O is the observed frequency and E is the expected frequency of each category.
Chi-square distribution
A distribution obtained by multiplying the ratio of sample variance to population variance by
the degrees of freedom when random samples are selected from a normally distributed population.
Application of the χ2 test
Chi square may be used to test whether obtained observations (of categorical data) conform to or
diverge from the population proportions specified by the null hypothesis. In this case goodness of fit
test is used. The Chi-square may also be used to determine whether frequencies associated with two
nominal variables (with two or more categories each) are statistically independent of one another or
dependent on each other. In this case Chi-square test of independence is used. In the next section we
describe the two variations of the Chi-square in detail.
Variation of the χ2 test
I. Chi- square Goodness-of-fit Test
The idea behind the chi-square goodness-of-fit test is to see if the sample comes from the population
with the claimed distribution. Another way of looking at that is to ask if the frequency distribution fits
a specific pattern. Two values are involved, an observed value, which is the frequency of a category
from a sample, and the expected frequency, which is calculated based upon the claimed distribution.
The idea is that if the observed frequency is really close to the claimed (expected) frequency, then the
square of the deviations will be small. The square of the deviation is divided by the expected
frequency to weight frequencies. A difference of 10 may be very significant if 12 was the expected
frequency, but a difference of 10 isn't very significant at all if the expected frequency was 1200.
If the sum of these weighted squared deviations is small, the observed frequencies are close to the
expected frequencies and there would be no reason to reject the claim that it came from that
distribution. Only when the sum is large there is a reason to question the distribution. Therefore, the
chi-square goodness-of-fit test is always a right tail test.
The test statistic has a chi-square distribution when the following assumptions are met:
1) The data are obtained from a random sample.
2) The expected frequency of each category must be at least five (5). This goes back to the
requirement that the data be normally distributed. You are approximating a multinomial experiment (a
discrete distribution) with the goodness-of-fit statistic (a continuous distribution), and if each
expected frequency is at least five then you can use the normal distribution to approximate (much like
the binomial).
The following are properties of the goodness-of-fit test:
• The data are the observed frequencies. This means that there is only one data value for each
category.
• The degrees of freedom are one less than the number of categories, not one less than the sample
size.
• It is always a right tail test.
• It has a chi-square distribution.
• The value of the test statistic doesn't change if the order of the categories is switched.
The test statistic is
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Worked example: Chi Square test for Goodness of fit
Four similar products (e.g. brands of toothpastes/soap/carbonated drinks) are displayed for sale in a
shop. The shop manager wants to find out whether or not the four similar products are equally
preferred by customers. S/He recorded sales of the different products over a specific period and
presented the data in the following table:
Table 4.1.12: Amount of four products sold over a period of four months in a shop
Product                   A     B     C     D     Total
Amount of products sold   235   275   247   243   1000
In this case, if the four products were equally preferred we should expect the sales to conform to a
ratio of 1:1:1:1. (That is 250 units of each product). We perform a test, using the 0.05 level of
significance, to determine if customers equally prefer the four products. We follow the steps
involved in hypothesis testing as follows;
State the null and alternative hypothesis
• Null hypothesis (Ho): the four products are equally preferred, hence X1 = X2 = X3 = X4.
• Alternative hypothesis (Hi): the four products are NOT equally preferred, i.e. at least one of
X1, X2, X3 and X4 differs from the others.
Traditional method
Formula:
$$\chi^2 = \sum \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}}$$
The expected numbers (under the null hypothesis) in each cell are equal to Total × 1/4,
as we are dealing with a ratio of 1:1:1:1
Expected number for cell A above = 1000 × 1/4 = 250
Table 4.1.13: Chi-square calculations
Product   Observed (Obs)   Expected (Exp)   (Obs-Exp)   (Obs-Exp)2   (Obs-Exp)2/Exp
A         235              250              -15         225          0.900
B         275              250               25         625          2.500
C         247              250               -3           9          0.036
D         243              250               -7          49          0.196
Total     1000             1000               0                      3.632
Calculated χ2 value (test statistic) = 3.632
The χ2 table value (critical value) at 0.05 = 7.815 (This value is read from a Chi-Square table in the column
under P=0.05 at 3 degrees of freedom, i.e. the number of categories minus one).
Conclusion:
Since the calculated χ2 value (3.632), the test statistic, is less than the critical value at 0.05 (7.815), we fail to
reject the null hypothesis (Ho) and conclude that the sales of the four products do not deviate significantly
from the expected ratio of 1:1:1:1, that is 250:250:250:250. We conclude that the four products are
equally preferred by consumers.
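The goodness-of-fit statistic can be recomputed with the Python sketch below, using the observed sales from Table 4.1.13 and equal expected frequencies. The function name chi_square_gof is illustrative. The critical value 7.815 used in the code is the standard chi-square table value at the 0.05 level for 3 degrees of freedom (four categories minus one); it is entered by hand, not computed.

```python
def chi_square_gof(observed, expected):
    """Sum of (Obs - Exp)^2 / Exp over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [235, 275, 247, 243]          # products A, B, C and D
total = sum(observed)
expected = [total / len(observed)] * len(observed)  # ratio 1:1:1:1

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1                   # categories minus one
critical_value = 7.815                   # chi-square table, 0.05 level, 3 df
reject_null = chi2 > critical_value

print(f"chi-square = {chi2:.3f}, df = {df}, reject Ho: {reject_null}")

# Reordering the categories does not change the statistic
chi2_reordered = chi_square_gof(observed[::-1], expected[::-1])
```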
Note this example can also be used to test other similar hypotheses such as
i) Test whether heritability of a character is controlled by a single dominant gene based on the number of
different phenotypes in filial one generation of crosses between the two parents. The phenotypes of all
filial one progenies usually conform to a 1:1 ratio and the ratio of Filial 2 progenies should conform to a
3:1 ratio.
II. Chi- Square test for independence
In the test for independence, the claim is that the row and column variables are independent of each
other. This is the null hypothesis. The multiplication rule states that if two events are independent, then
the probability of both occurring is the product of the probabilities of each occurring. This is key to
working the test for independence.
If you end up rejecting the null hypothesis, then the assumption must have been wrong and the row and
column variable are dependent. Remember, all hypothesis testing is done under the assumption the null
hypothesis is true.
The test statistic used is the same as the chi-square goodness-of-fit test. The principle behind the test for
independence is the same as the principle behind the goodness-of-fit test. The test for independence is
always a right tail test.
In fact, you can think of the test for independence as a goodness-of-fit test where the data are arranged
into table form. The table is called a contingency table.
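The test statistic has a chi-square distribution when the following assumptions are met:
• The data are obtained from a random sample.
• The expected frequency of each category must be at least 5.
The test statistic, where Obs is the observed and Exp the expected frequency of a cell, is
\[ \chi^2 = \sum \frac{(Obs - Exp)^2}{Exp} \]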
The following are properties of the test for independence:
• The data are the observed frequencies.
• The data are arranged into a contingency table.
• The degrees of freedom are the degrees of freedom for the row variable times the degrees of freedom for the column variable. It is not one less than the sample size; it is the product of the two degrees of freedom.
• It is always a right tail test.
• It has a chi-square distribution.
The expected value is computed by taking the row total times the column total and dividing by the grand
total. The value of the test statistic doesn't change if the order of the rows or columns is switched. The
value of the test statistic doesn't change if the rows and columns are interchanged (transpose of the
matrix).
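The rules above can be sketched in a few lines of Python. This is only an illustrative sketch: the 2 x 3 table of counts and the function names are made up for the example and are not taken from this lecture.

```python
# Hypothetical 2 x 3 contingency table of observed counts (illustrative numbers only).
observed = [
    [20, 30, 50],
    [30, 30, 40],
]


def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(table):
    """(number of rows - 1) * (number of columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Interchanging rows and columns (transposing) leaves the statistic unchanged.
transposed = [list(col) for col in zip(*observed)]

print(expected_counts(observed))
print(degrees_of_freedom(observed))
print(chi_square(observed), chi_square(transposed))
```

Note that the degrees of freedom depend only on the shape of the table, not on the number of observations.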
Worked example: Chi Square test for Independence
An ecological study was carried out to determine whether there is any association between two plant
species in the Serengeti plains. The researcher randomly threw a 1m x 1m sampling frame several times and
recorded the presence or absence of the two plant species in the samples as follows:
Table 4.1.14: Data collection sheet: Showing the presence or absence of the species A and species B in each sampling frame
S/N | Species A | Species B | Remarks
1   | 1         | 0         | Added in cell B
2   | 0         | 1         | Added in cell C
3   | 1         | 1         | Added in cell A
4   | 0         | 0         | Added in cell D
... | ...       | ...       | ...
450 | 1         | 0         | Added in cell B
Table 4.1.15: Frequency of counts of absence or presence of species A and Species B in sampling frame
                | Plant species A: Present | Plant species A: Absent | Totals
Plant species B |                          |                         |
Present         | 90 (A)                   | 66 (C)                  | 156
Absent          | 181 (B)                  | 113 (D)                 | 294
Totals          | 271                      | 179                     | 450
This table is called a contingency table.
If we want to perform a test, using the 0.05 level of significance, to determine if there is any
association between the two species, then we need to follow the steps involved in hypothesis testing
as follows;
State the null and alternative hypotheses
• Null hypothesis (Ho): the presence of Species A is NOT associated with the presence of Species B (the two species occur independently of each other)
• Alternative hypothesis (Hi): the presence of Species A is associated with the presence of Species B
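Shortcut method
(You are strongly discouraged from using the shortcut method; use the traditional method instead.)
Formula for a 2 x 2 table with cell counts A, B, C and D and grand total N:
\[ \chi^2 = \frac{N(AD - BC)^2}{(A+B)(C+D)(A+C)(B+D)} = \frac{450\,(90 \times 113 - 181 \times 66)^2}{271 \times 179 \times 156 \times 294} \approx 0.638 \]
This agrees, up to rounding of intermediate values, with the 0.639 obtained by the traditional method.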
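Traditional method
Formula:
\[ \chi^2 = \sum \frac{(Obs - Exp)^2}{Exp} \]
The expected numbers (under the null hypothesis) in each cell are equal to (row total x column total) / grand total.
Expected number for cell A above = (156 x 271) / 450 = 93.95; likewise, for cell C = (156 x 179) / 450 = 62.05
Calculations
Cell   | Observed value (Obs) | Expected value (Exp) | Difference (d) | d²    | d²/Exp
A      | 90                   | 93.95                | -3.95          | 15.60 | 0.166
B      | 181                  | 177.05               | 3.95           | 15.60 | 0.088
C      | 66                   | 62.05                | 3.95           | 15.60 | 0.251
D      | 113                  | 116.95               | -3.95          | 15.60 | 0.133
Totals |                      |                      | 0.00           |       | 0.639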
Calculated χ2 value (test statistic) = 0.639
The χ2 table value (critical value) at 0.05 = 3.841 (This value is read from a Chi-Square table in the
column under P=0.05 at 1 degree of freedom)
Conclusion:
Since the calculated value (0.639), the test statistic, is less than the critical value at 0.05 (3.841), we
fail to reject the null hypothesis of no association (Ho), and conclude that there is no significant
association between the two species.
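The worked example can be checked with a short Python sketch such as the one below. The variable names are chosen for illustration only. The shortcut formula used is the standard one for a 2 x 2 table, and the p-value line relies on the fact that, for 1 degree of freedom, the right-tail chi-square probability equals erfc(sqrt(x/2)). Because the script does not round intermediate values, its statistic comes out near 0.638 rather than the 0.639 of the hand calculation.

```python
import math

# Observed counts from Table 4.1.15 (cells A, B, C, D).
a, b, c, d = 90, 181, 66, 113
n = a + b + c + d

# Traditional method: expected = (row total * column total) / grand total.
# Rows: species A present/absent; columns: species B present/absent.
observed = [[a, b], [c, d]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
chi2_traditional = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / n
        chi2_traditional += (observed[i][j] - expected) ** 2 / expected

# Shortcut method for a 2 x 2 table.
chi2_shortcut = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Right-tail probability for 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2_traditional / 2))

CRITICAL_VALUE = 3.841  # chi-square table, P = 0.05, 1 degree of freedom
reject_null = chi2_traditional > CRITICAL_VALUE

print(round(chi2_traditional, 3), round(chi2_shortcut, 3), round(p_value, 3), reject_null)
```

Both methods give the same statistic, and since it is far below the critical value the null hypothesis is not rejected.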
Note this example can also be used to test other similar hypotheses such as whether;
1.5 Review exercises
1. According to the Mendelian genetic model, a certain plant should produce offspring that
have white and red flowers, in the proportion of 75% and 25%, respectively. A sample of 100 such
offspring was coloured as follows: White 79 and Red 21. Using an appropriate test, can you reject
the Mendelian hypothesis at the 5% level?
2. A survey was conducted to investigate whether or not alcohol consumption is associated
with cigarette smoking. The following information was compiled for 600 individuals. Using p=0.05,
test the hypothesis that smoking and alcohol consumption are independent. What are the null and
alternative hypotheses?
           | Drinker | Non Drinker
Smoker     | 193     | 89
Non Smoker | 165     | 153
1.6 Summary
This lecture has illustrated different statistical techniques for various types of research. The diversity
indicates the availability of appropriate statistical techniques for most research problems. However,
the diversity also indicates the difficulty of matching the best technique to a specific research
problem. Choice of the correct statistical procedure for a given research must be based on expertise in
statistics and in the subject matter of the research. You should seek assistance if you are
not comfortable with statistical techniques. Table 4.1.11 and 4.1.17 should orient you on the choice of
the correct statistical procedure to use for the different types of data collected.
It is strongly advised that, before collecting a single datum, you complete an analysis plan that
identifies the necessary statistical tests in advance. This lecture covered descriptive and
inferential statistical techniques commonly used to handle quantitative data.
Descriptive statistics are very useful in most studies. They help to describe, organise and summarize
the main characteristics of sample data. They give a general picture of your data to allow for
additional data manipulation. The arithmetic mean and the standard deviation are the most useful
measures of central tendency and variability, respectively. Other measures of central tendency include
the mode and median. The measures of variability e.g. standard deviation account for the way data in
the sample or population deviate from the relevant measure(s) of central tendency. Variability is low
when the spread of scores around the mean is small.
Inferential statistics are used in hypothesis testing, generally to demonstrate mean differences.
Hypothesis testing compares sample data and statistics to either known or estimated population
parameters. There are two types of hypotheses, conceptual/theoretical and statistical. Conceptual
hypotheses identify predicted relationships among independent variables and dependent measures.
Statistical hypotheses test whether the predicted relationships are mathematically supported by the
existing data, that is, do differences based on sample statistics reflect differences among the
population parameters?
For instance, an experiment usually results in some means or affected proportion of different groups
such as control and treated animals. Means will differ because each animal is different. Proportions
affected could differ by chance. Means and proportions may also differ as a result of the treatment.
The aim of the statistical analysis is to calculate the probability that differences as great as or greater
than those observed could be due to chance. If this probability is high, then chance may be the
explanation; if it is low, then a treatment effect may be the explanation.
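One way to make this probability concrete is an exact permutation test. It is not one of the tests described in this lecture and is used here only as an illustration, with made-up measurements for a control group and a treated group. The sketch counts how often a random relabelling of the animals would give a difference in means at least as large as the one observed.

```python
from itertools import combinations

# Hypothetical measurements for two small groups (illustrative only).
control = [1, 2, 3, 4]
treated = [5, 6, 7, 8]


def mean(values):
    return sum(values) / len(values)


def exact_permutation_p_value(group_a, group_b):
    """Share of all relabellings whose absolute difference in means is
    at least as large as the difference actually observed."""
    pooled = group_a + group_b
    observed_diff = abs(mean(group_a) - mean(group_b))
    as_extreme = 0
    total = 0
    for indices in combinations(range(len(pooled)), len(group_a)):
        chosen = set(indices)
        first = [pooled[i] for i in range(len(pooled)) if i in chosen]
        second = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point round-off.
        if abs(mean(first) - mean(second)) >= observed_diff - 1e-12:
            as_extreme += 1
        total += 1
    return as_extreme / total


p = exact_permutation_p_value(control, treated)
print(p)
```

With these made-up data only 2 of the 70 possible relabellings are as extreme as the observed split, so chance alone is an unlikely explanation.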
Various inferential statistical tests are based on a conceptual model where between-groups variation
(attributed to an independent variable) is divided by within-groups variation (the error term or the
degree of similarity observed in each group). Generally, researchers want to obtain a large amount of
between-group variation relative to a small amount of within group variation.
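As a rough sketch of this conceptual model, the ratio of between-groups to within-groups variation (the F ratio of a one-way ANOVA) can be computed by hand as follows. The scores and the function name are hypothetical and serve only to show the arithmetic.

```python
# Hypothetical scores for three independent groups (illustrative only).
groups = [
    [1, 2, 3],
    [2, 3, 4],
    [5, 6, 7],
]


def mean(values):
    return sum(values) / len(values)


def f_ratio(groups):
    """Between-groups mean square divided by within-groups mean square."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean (error term).
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


print(f_ratio(groups))
```

The larger the ratio, the more the differences between group means stand out against the spread within each group.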
In this lecture, we have illustrated calculations using statistical calculators; however, these days
the actual calculations are almost always done using a computer (see Lecture 12).
Table 4.1.16: Research design and some selected tests available to analyse their data (after Dunn, 2001)
Research Design | Non Parametric test for Nominal data | Non Parametric test for Ordinal data | Parametric tests
One sample | χ2 goodness of fit | - | One sample t or z test
Two independent samples | χ2 test of independence | Mann-Whitney U test | Independent groups t test
Two dependent samples | - | Wilcoxon matched pairs signed rank test | Dependent groups t test
More than three independent samples | χ2 test of independence | - | One way ANOVA
Correlation | - | Spearman rs | Pearson r
Table 4.1.17: Examples of commonly used statistical tests and their application
Type of test | Application
Simple linear regression analysis | Examines changes in level of variable Y relative to changes in the level of variable X. Predicts the value of a dependent variable from one or more independent variables.
Simple linear correlation analysis | Assesses strength and nature of linear relationship between two variables.
Independent group t test | Examines mean differences between sample means from two groups.
Single or one sample t and Z test | Determines whether observed sample mean represents a population.
Dependent group t test | Determines differences between two means drawn from the same sample at two different points in time.
Analysis of Variance (ANOVA) | Determines the differences between means of two or more levels of an independent variable.
Chi-square (goodness of fit) test | Examines whether categorical data conform to proportions specified by a null hypothesis.
Chi-square test of independence | Determines whether frequencies associated with two nominal variables are independent.
Kolmogorov-Smirnov or Mann-Whitney U test | Can be used to test whether two groups (categories) are different.
1.7 Additional review exercises
1. Define the word ‘significant’ in statistical context.
2. Explain the difference between statistical hypothesis and conceptual/theoretical hypothesis.
3. Describe using your own examples when a researcher should use the dependent t-test, one
way ANOVA and Regression analysis.
4. Using your own example, briefly outline the steps involved in testing of hypothesis.
5. The district education officer wants to confirm the allegation that O-level students in the
district performed miserably in last year's English examination. Suppose 10 students are randomly
picked, and their scores are 76, 77, 75, 58, 57, 79, 54, 67, 79 and 94. Test whether or not the district
performance is significantly different from the national average of 82.7 using the 5% level of
significance.
6. Study the data presented in the following table and (a) Determine the mean, Standard
deviation, range and mode. (b) Using five classes, summarize the data in a histogram.
35 | 46 | 63 | 69 | 54 | 50 | 62 | 68 | 38 | 40
55 | 43 | 42 | 59 | 45 | 44 | 57 | 47 | 48 | 46
43 | 64 | 49 | 39 | 59 | 60 | 42 | 60 | 42 | 38
51 | 50 | 66 | 63 | 57 | 56 | 51 | 38 | 61 | 54
7. A random sample of 10 observations from one population revealed a sample mean of 23 and
a sample standard deviation (SD) of 4. A random sample of 8 observations from another population
revealed a sample mean of 26 and a sample SD of 5. At the 0.05 significance level, test if there is a
difference in the population means.
1.8 References
Aczel, A. D. (1999). Complete Business Statistics. Irwin McGraw Hill.
Dunn, D. S. (2001). Statistics and Data Analysis for the Behavioural Sciences. Irwin McGraw Hill.
Gomez, K. A. and Gomez, A. A. (1984). Statistical Procedures for Agricultural Research (2ed). An International Rice Research Institute Book. A Wiley-Interscience Publication, John Wiley & Sons.
James, J. (2009). Introduction to Applied Statistics: lecture notes. http://people.richland.edu/james/lecture/ (last visited in January, 2009).
Kothari, C. R. (2009). Research Methodology: Methods and Techniques. New Age International Limited Publishers.
Mason, R. D., Lind, D. A. and Marchal, W. G. (1999). Statistical Techniques in Business and Economics. Irwin McGraw Hill.
Saunders, M. N. K., Lewis, P. and Thornhill, A. (2009). Research Methods for Business Students (5ed). Harlow: Prentice Hall – Financial Times.
Zar, J. H. (1996). Biostatistical Analysis. Prentice Hall, Inc. New Jersey.
LECTURE TWO
ANALYSIS OF QUALITATIVE DATA
(By Dr. D. Ngaruko)
2.1 Introduction
In previous lectures it was pointed out that we use qualitative research techniques if we wish to
obtain insight into certain situations or problems concerning which we have little knowledge.
Qualitative techniques such as the use of loosely structured interviews with open-ended questions,
(focus) group discussions, observations, projective and participatory approaches will therefore be
appropriate in many studies, especially at the onset. For sensitive topics they may be the only reliable
techniques. Irrespective of how and for what purpose the data has been collected, the researcher
usually ends up with a substantial number of pages of written text that needs to be analysed.
Although procedures and outcomes of qualitative data analysis differ from those of quantitative data
analysis, the principles are not so different. IDRC-Science for humanity 12 identifies five principal
stages of the qualitative data analysis process, i.e.
• describe the sample populations;
• order and reduce/code the data (data processing);
• display summaries of data in such a way that interpretation becomes easy, e.g., by preparing compilation sheets, flowcharts, diagrams or matrices;
• draw conclusions, relate these to the other data sets of the study and decide how to integrate the data in the report; and
• if required, develop strategies for further testing or confirming the (qualitative) data in order to prove their validity.
Throughout this lecture we will look in detail at each of these points.
12
IDRC is a Canadian Crown corporation that works in close collaboration with researchers from the developing world in
their search for the means to build healthier, more equitable, and more prosperous societies: http://www.idrc.ca/en/ev56452-201-1-DO_TOPIC.html
2.1 Learning outcomes
At the end of this session you should be able to:
• Describe efficient ways of ordering and summarising qualitative data.
• Indicate why it is essential to start summarising and analysing during the field work.
• List the major steps in analysing qualitative data and drawing conclusions.
• Make an outline of how you will proceed with the ordering and summarising of your qualitative data, and with the subsequent analysis.
• Plan on how to report your qualitative data, integrated in the most effective way with your other data.
• Indicate, either now or at the end of data analysis, what additional activities you will undertake to test or confirm your findings in order to prove their validity.
2.2 Procedures for processing and displaying of qualitative data
This section covers four important sub-sections describing efficient ways of ordering,
coding and summarising qualitative data: description of the sample population; content
analysis; summarising data in compilation sheets; and summarising data in matrices, figures and tables.
2.2.1 Description of the sample population in relation to sampling procedures
The first step in data processing (as well as in the reporting of findings) is a description of the
informants. If numbers allow, relevant background data may be tabulated, for example on age, sex,
occupation, education or marital status, as is the practice in quantitative studies. However, as
qualitative data originates from small samples (sometimes a handful of key informants or focus
group discussions and observations) more information is required to place the data in its context. For
example, who were the key informants, what made you decide to choose them? Who took part in the
focus group discussions? How were the participants of the groups selected and how representative
are they for your study population? For observations: under what circumstances were they carried
out? Who were observed, and by whom? Unless this type of information is provided, interpretation
of data may appear haphazard.
2.2.2 Content analysis: Ordering and coding of data
The most important purpose of content analysis is to order the data.
By counting the answers under each label, however, the researcher also gains insight into how
common the different reasons are. In this section we will discuss two types of qualitative data:
answers to open questions, and more elaborate narratives from loosely structured interviews or
FGDs. Coding is the conversion of the verbatim answers to categorical data. A simplifying frame
must be imposed on the large variety of real-life situations. One way is to provide precoded answers
for the respondents to choose from. Another way, which is more relevant in qualitative data
analysis, is to record verbatim answers and then to fit them into categories devised after the event (a
coding frame).
2.2.2.1 Answers to open questions
The most commonly collected qualitative data are the answers to open questions. This is especially true of
the more probing questions beginning with “Why ….”. Ordering and coding of data is also common
for questions with an added category like “Other answers, please specify…” in case the list of answers
offered is not exhaustive. The answers in question are then ordered systematically. Let us take as an
example the answers to the question ‘Why are you smoking?’, which we will discuss in
depth to illustrate the different steps of analysing open-ended questions:
STEP 1: Listing - A first, basic step in the analysis of answers to open questions is to list the
answers of a sample of 20-25 informants as they were provided (adding the questionnaire number in
order to avoid losing the connection with the informant’s other data).
STEP 2: Reading - Then read the answers carefully, remembering the purpose of the question. The
question ‘why are you smoking’ was supposed to help nursing students to develop an intervention
against smoking.
STEP 3: Coding - Make rough categories of answers that seem to belong together and code them
with a key word. For example, answer 3 (It gives me pleasure) and answer 14 (I like to blow smoke
rings) could be labelled with the term ‘pleasure’, which could be abbreviated with the code pleas.
STEP 4: List per code - Then list again all answers but now per code, so that you get some 5-7 short
lists, for example:
STEP 5: Interpretation - Then interpret each list, and end up with some 5-7 meaningful categories,
each with a characteristic key word. For example: pleasure, being sociable, giving status, giving
self-confidence, addiction, defiance. There may be discussion on the need to split up some categories or
combine others with few answers. Answers 17 and 18, for example, could be put in a separate
category, reducing stress. In that case there would be seven categories. The category defiance may
have two answers: for instance answer number 4 - I do not see why I would give up smoking and
number 12 - Why not? The exclamation marks indicate that defiance rather than lack of knowledge
forms the motivation for the answer. Without this addition by the interviewer, these answers would
have been difficult to code.
Now you can make a tentative interpretation according to the assumed willingness of your informants
to change their behaviour. For those who smoke for pleasure or to socialise it might be easiest to
give up smoking. Those who are addicted but tried to stop and those who feel they derive status from
smoking might form a middle category, whereas for those who smoke to enhance their
self-confidence and reduce stress, or who are very defiant at the question why they smoke, it might be
most difficult to stop. Now try a next batch of 20-25 answers and check if the labels work. It is quite
possible that at this stage some labels will still be changed, or that you decide to add new categories or
combine others.
STEP 6: Final list - Make a final list of labelled categories and code all data including the data you
already processed with the abbreviated codes. Then discuss whether you will stick to your tentative
interpretation of the data and what this means for the content of the messages to address different
reasons for smoking.
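If the answers are typed into a computer, Steps 1, 3 and 4 can be sketched in Python as below. This is only an illustration: answers 3, 14, 4 and 12 are the ones quoted in the text, answer 7 is hypothetical, and apart from pleas the abbreviated codes (defi, soc) are invented for the example.

```python
from collections import Counter, defaultdict

# STEP 1: answers listed with their questionnaire numbers.
answers = {
    3: "It gives me pleasure",
    14: "I like to blow smoke rings",
    4: "I do not see why I would give up smoking",
    12: "Why not?",
    7: "All my friends smoke",  # hypothetical answer
}

# STEP 3: code assigned by the researcher to each answer.
# 'pleas' is used in the text; 'defi' and 'soc' are hypothetical abbreviations.
codes = {3: "pleas", 14: "pleas", 4: "defi", 12: "defi", 7: "soc"}

# STEP 4: list all answers again, now per code.
answers_per_code = defaultdict(list)
for number, text in sorted(answers.items()):
    answers_per_code[codes[number]].append((number, text))

# Counting the answers under each label shows how common each reason is.
counts = Counter(codes.values())

for code, items in answers_per_code.items():
    print(code, counts[code], items)
```

Keeping the questionnaire number next to each answer preserves the link with the informant's other data when the final list is made in Step 6.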
2.2.2.2 Elaborate narratives
This is another way of undertaking content analysis. The data from interviews with key informants or
focus group discussions (FGDs) are as a rule more bulky than answers to open questions. The
carefully transcribed field notes and tapes may consist of pages of narrative text. When analysing the
texts we usually discover that, no matter how good our guidelines for the discussion were, the data
contain valuable information but also a number of less essential details. In addition, the data is
usually not presented in the order we need for our analysis, since informants may jump from one
topic to the other.
To make the analysis easier, we have to order and reduce the data. Ordering is best done in relation to
the objectives and the discussion topics. Again, let us systematically follow a number of steps:
STEP 1: Reread your objectives and discussion topics and then carefully read a number of the
interviews, FGDs or narrative observations you want to process. Number the material according to
the broad discussion topic it pertains to. Use a yellow marker to highlight particularly illustrative
remarks. Use the margins to define sub-topics.
For example, in a gender and leprosy study carried out in different countries (adopted from
http://www.idrc.ca/en/ev-56451-201-1-DO_TOPIC.html: Module 4) it appeared that the discussion
topic stigma had to be differentiated according to different social settings in which it occurred:
among close relatives (parents-children), spouses, in-laws, and community members. Further, a
distinction had to be made between self-stigmatisation (e.g., a wife diagnosed as a leprosy patient
encouraging her husband to marry a second wife in order to prevent divorce, or a patient not
attending community meetings for fear of being avoided) and stigmatisation by others. Different
degrees of severity in stigmatisation could also be distinguished, varying from slight avoidance to
complete expulsion. If stigma would be topic (6) in your discussion list, you would mark everything
related to stigma with a (6) in the margin, and add key words such as self-stigm., spouse, in-laws,
comm., in the margin, as well as key words such as sleep(ing) sep(arately) or divorce indicating the
severity of the stigma.
STEP 2: List key words - List all key words that belong to a certain topic in the sub-categories that
have been developed under step 1. For instance, everything belonging to stigma could be subdivided
and listed in the four major social settings in which stigma was found to manifest itself.
STEP 3: Interpret the data – At this level distinguish the major forms in which stigma manifests
itself in these different social settings, try to make a ranking order of severity and link it to other
variables (such as degree of deformity, socio-economic status) in order to understand differences in
stigma.
STEP 4: Code data – The fourth step is to code all your qualitative data. If necessary, adapt your
coding scheme as you order, code and interpret more data. In that case, you should again read and
possibly re-code the material you have already processed.
Note: You may already have analysed and coded your qualitative data in the field in order to adjust
and deepen your interview guides or topic lists. In that case it may be possible to develop your final
coding list in one cycle instead of two. However, instead of developing a very detailed coding system
on your rough data, you may also refine your interpretation as you record your roughly coded,
summarised data in compilation sheets, which we are going to cover in the next section.
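For researchers who keep their marked-up text electronically, Steps 1 and 2 for narratives might be sketched as follows. The Segment structure, its field names and the four example passages are assumptions made for this illustration; only the topic number (6) and the margin key words follow the leprosy example above.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One marked passage of an interview, FGD or observation."""
    interview: int   # number of the interview or FGD
    topic: int       # number of the discussion topic, e.g. 6 for stigma
    setting: str     # sub-topic written in the margin
    key_words: list = field(default_factory=list)


# Hypothetical marked-up passages (illustrative only).
segments = [
    Segment(1, 6, "spouse", ["sleep sep"]),
    Segment(2, 6, "comm", ["self-stigm"]),
    Segment(3, 6, "spouse", ["divorce"]),
    Segment(3, 2, "in-laws", []),
]


def key_words_by_setting(segments, topic):
    """STEP 2: list all key words of one topic per sub-category."""
    listing = defaultdict(list)
    for seg in segments:
        if seg.topic == topic:
            listing[seg.setting].extend(seg.key_words)
    return dict(listing)


stigma = key_words_by_setting(segments, topic=6)
print(stigma)
```

If the coding scheme is adapted later (Step 4), only the labels stored with each passage need to be changed before the lists are produced again.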
2.2.3 Summarising data in compilation sheets
This is the next stage in the qualitative data analysis process. After ordering and coding the data you
have to summarise them. A useful first step is summarising all data of each study unit per study
population on separate compilation sheets. Like the master sheets for quantitative data, compilation
sheets for qualitative data consist of a number of columns with the topics covered by the study as
headings. These may be further sub-divided in smaller themes that you identified and coded when
ordering the data (see a hypothetical example in Table 4.2.1 below).
Each interview, FGD or observation gets a number and is successively entered in that sequence on
the relevant compilation sheet. If there are different categories of informants within one study
population, for example, young mothers and an older generation of mothers, or male and female
patients, the data for these groups are entered on separate sheets. If the topics covered in those subgroups are not completely identical, it is important to be systematic and follow roughly the same
sequence of topics for each category of informants. The information inserted is summarised in key
words and key sentences, clear enough to remember the statements informants made. (As the number
of each study unit is entered in the compilation sheet, it is always possible to go back to the original
data and present the full statement, for example in a presentation or in the research report).
Now you have an overview of all data per study population on one or more big sheet(s). If you read
the columns, you have a list of answers of all group members on a certain (sub-)topic. If you read
horizontally, you can relate different topics to each other, or to the personal characteristics of the
informant, for each informant. It also becomes easy to compare the answers of different groups on specific
issues by comparing compilation sheets. Table 4.2.1 is a hypothetical example of the compilation sheet.
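A compilation sheet kept on a computer can be thought of as a table with one row per study unit and one column per topic. The sketch below uses an assumed layout with invented informant numbers, column headings and key words; it is not the content of Table 4.2.1.

```python
# Rows are study units (informant numbers); columns are topics.
# All entries are hypothetical key words for illustration.
compilation_sheet = {
    21: {"sex": "M", "first reaction": "worried", "stigma: community": "none reported"},
    22: {"sex": "F", "first reaction": "big fear", "stigma: community": "avoided by neighbours"},
    23: {"sex": "F", "first reaction": "big fear", "stigma: community": "none reported"},
}


def read_column(sheet, topic):
    """Answers of all informants on one (sub-)topic."""
    return {number: row[topic] for number, row in sheet.items()}


def read_row(sheet, number):
    """All topics for one informant, so they can be related to each other."""
    return sheet[number]


def compare_groups(sheet, group_topic, topic):
    """Answers on one topic, split by a personal characteristic."""
    grouped = {}
    for number, row in sheet.items():
        grouped.setdefault(row[group_topic], []).append((number, row[topic]))
    return grouped


print(read_column(compilation_sheet, "first reaction"))
print(read_row(compilation_sheet, 22))
print(compare_groups(compilation_sheet, "sex", "first reaction"))
```

Because each row carries the informant number, it is always possible to go back to the original notes for the full statement.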
Table 4.2.1: Example of compilation sheet (gender and leprosy)
Let us focus on Table 4.2.1 to describe the information contained therein. Table 4.2.1 presents the
personal data of leprosy patients (recently declared cured) and a number of topics and sub-topics
discussed with them. Stigma actually experienced, which originally was one topic, has in the
compilation sheet been subdivided in the four major social settings in which stigmatisation may
occur: close blood relatives, marriage, wider circle of spouse’s relatives and community. In each of
those still finer distinctions can be made (e.g., community can be neighbours, friends, work mates,
school mates or distant community members). As samples are small, these may all be inserted under
the heading ‘community’. Codes (italics) can be added to the statements presented in key words, for
example big fear and worried under the heading ‘first reaction’. From the three examples presented,
it already appears (confirmed by the analysis of all data in all four countries) that in general the
stigma feared when patients hear the diagnosis of leprosy is bigger than the stigma in reality
experienced. Patient (12) is in this respect an exception. Ironically, the husband who divorced her
had already died from another disease at the moment she was declared cured from leprosy.
Horizontal comparison of the data of patient (1) teaches us that it is highly unlikely that the man’s
friends do not know about the disease, as even after he has been declared cured he has visible signs.
Here the researchers had to interview the friends to find out if indeed this man was (or had not been)
stigmatised at all by the community.
Note that interpretation of data and labeling becomes indeed easy when using compilation sheets, as
a researcher can visualise all aspects of his/her informants even if (s)he looks at one aspect at a time
for the whole study population.
A next step in summarising may be the combination, contrasting or further analysis of important
topics through graphical displays such as matrices, diagrams, flow charts and tables.
2.2.4 Summarising of data in matrices, figures and tables
2.2.4.1 Matrices
Matrices can be used for quantitative as well as qualitative data comparison. In qualitative data we
may compare different groups or data sets on important variables, presented in key words. A
MATRIX is a chart that looks like a cross-table, but contains words (as well as, sometimes,
numbers). Table 4.2.2 is a hypothetical example (adopted from IDRC-Science for humanity, Module 23:
http://www.idrc.ca/en/ev-56467-201-1-DO_TOPIC.html) of a summarised FGD discussion on changing
weaning practices, in which the researchers listed
the answers of young mothers concerning the introduction of soft foods and those of mothers above
childbearing age. They then summarised these answers in a matrix:
Table 4.2.2: Matrix on introduction of soft baby foods among mothers of different age groups
This type of display makes it easy for the researcher to conclude that:
• younger mothers start giving soft foods, on average, 2.5 months earlier than the generation of their own mothers;
• younger mothers use a larger variety of soft weaning foods than women in the preceding generations; and
• younger mothers give soft foods to their babies more frequently, but for the same reasons as their mothers did.
Matrices facilitate data analysis considerably. They are the most common form of graphic display of
qualitative data. They can be used to order and compare information in many ways, for example,
according to:
• time sequence (of procedures being investigated in different periods, for example),
• type of informants (as in the example above), or
• location of data collection (to visualise differences between rural and urban populations).
2.2.4.2 Diagrams
A diagram is a figure with boxes containing variables and arrows indicating the relationships
between these variables. When analysing the problems you wanted to investigate during the
development of your protocols, most groups developed a diagram. In a similar way diagrams can be
developed to summarise findings of a study (see Figures 4.2.1 and 4.2.2). You might use a diagram to
illustrate a crucial issue in your study, combining all available qualitative and quantitative data
collected.
Figure 4.2.1: Reasons for early introduction of soft foods by young mothers
Diagrams, like matrices, can be of great assistance in providing an overview of the data collected and in
guiding data analysis.
Figure 4.2.2: Reasons for late introduction of soft foods by young mothers
2.2.4.3 Flow charts
FLOW CHARTS are special types of diagrams that express the logical sequence of actions or
decisions. Flow charts are especially useful to summarise different flows of events that are mutually
connected.
A counselling team in Bulawayo, Zimbabwe, for example, which interviewed some 95
HIV positive persons in-depth over a period of two years, summarised the roughly 100 pages of
interview material for each informant by drawing five lines (see Figure 4.2.3). One central line
presented the development of the disease over time, with crises and periods of relative well-being.
Another line presented different forms of medical care sought, a third the flaws in economic status
connected to the disease (e.g., loss of job, seeking employment elsewhere), a fourth the possible
changes in social status such as divorce or (re)marriage, whereas a fifth line presented the patient’s
emotional status linked to events occurring in the four other fields (e.g., positive coping, depression).
Figure 4.2.3: Flowchart on coping of HIV+ persons with their condition over time
These flow charts were extremely useful for comparison of data, per informant and between different
groups of informants (e.g. males/females, single/married). They highlighted the impact of the disease
on the lives of different groups of patients and their way of coping with it 14.
2.2.4.4 Tables
Qualitative data can also be categorised, coded, inserted in master sheets or computer and counted,
together with other quantitative data, and displayed in tables. Answers to open-ended questions in
questionnaires will usually be categorised and summarised in this way. However, you will in the first
place want to analyse the content of the individual answers in each category.
2.3 Drawing and verifying conclusions
Drawing and verifying conclusions is the essence of data analysis. It is not an isolated activity,
however. When we start summarising our data in compilation sheets, flowcharts, matrices or
diagrams, we continuously draw conclusions, and modify or reject quite a number of them as we
proceed. Writing helps generate new ideas as well. Therefore writing should start as early as possible,
right from the onset of data processing and analysis, if only for ourselves. No creative insights should
get lost.
Note: Collection, processing, analysis and reporting of qualitative data are closely intertwined, and
not (as is the case with quantitative data) distinct successive steps. It may often be necessary to go
back to the original field notes and verify conclusions, collect additional data if available data appear
controversial, and get feedback from all parties concerned.
2.3.1 Identifying variables and associations between variables
Sometimes we do not know enough about a situation to define variables beforehand. Only during or
at the end of the study will it be possible to define certain variables and search for associations with
other variables, without having the prior aim of measuring them. Many studies have qualitative parts
with open questions, key informant interviews, focus group discussions or observations for the
purpose of identifying these variables. The researcher who uses such a qualitative approach should be
like a detective who searches for evidence, accounts for countervailing evidence, and verifies the
findings by looking for independent, supporting evidence, until (s)he is confident about possible
associations among certain variables which shed light on the problem under investigation.
14 Meursing (1997) A world of silence in ICDR, Canada
Drawing on the example covered in section 8.2, if we find among the mothers who
wean their children early that quite a number have jobs, we may assume that having a job contributes
to early weaning. Similar studies carried out elsewhere with similar findings support this assumption
(independent evidence). Only if there are very few employed women who wean their children late,
however, can we be more certain that our assumption is true, and for each of those exceptions we
should try to find an explanation. Do the mothers take their children with them to their place of work, or
do they work near their homes so that they can feed the baby during breaks? Or do they successfully
combine breast-milk with alternatives? If yes, why don’t more mothers try this combination?
2.3.2 Finding confounding or intervening variables
In some cases variables appear to be related but the association cannot easily be explained. Other
times it seems that variables should logically go together, but you cannot find a relationship. In cases
such as these there may be another variable (‘Q’) influencing the association between the two
variables concerned, that has to be identified.
Figure 4.2.4: A confounding variable Q between variables A and B
For example, one expects a relationship between the quality of drinking water and the incidence of
diarrhoea. It is assumed that the incidence of diarrhoea would decrease as the number of water faucets
in a village increased. If there is no change over time, there might be a confounding variable. People,
for example, may dislike the taste of tap-water so much that they use it for everything, except for
drinking.
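One way to look for such a hidden variable is to split the data by the suspected variable and compare the groups. The Python sketch below illustrates the idea with invented household records; the field layout, the numbers and the helper `mean_cases` are all hypothetical and serve only to show the reasoning.

```python
# Hypothetical household records:
# (has_tap_nearby, drinks_tap_water, diarrhoea_cases_last_month)
households = [
    (True, True, 0), (True, True, 1),
    (True, False, 3), (True, False, 3), (True, False, 2), (True, False, 3),
    (False, False, 2), (False, False, 2),
]

def mean_cases(records):
    """Average number of diarrhoea cases per household in `records`."""
    return sum(r[2] for r in records) / len(records)

with_tap = [h for h in households if h[0]]
without_tap = [h for h in households if not h[0]]

# Crude comparison: having a tap nearby seems to make no difference.
print("with tap:", mean_cases(with_tap), "without tap:", mean_cases(without_tap))

# Splitting the tap households by the suspected variable Q
# (whether the tap water is actually used for drinking) reveals the pattern.
drinkers = [h for h in with_tap if h[1]]
non_drinkers = [h for h in with_tap if not h[1]]
print("drink tap water:", mean_cases(drinkers), "do not:", mean_cases(non_drinkers))
```

In this invented data set the crude comparison shows no effect of taps, while the split shows fewer cases among households that actually drink the tap water.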
Note: Such unexplained associations may appear in any study. The essential characteristic of a
qualitative research approach is that it purposively looks for such associations during the fieldwork,
and that additional questions and tools may be developed to highlight such relationships. In
quantitative surveys that attempt to objectively measure the strength of a presupposed association
between two variables, the tools should not be changed once the fieldwork is ongoing.
2.3.3 Integrating qualitative and quantitative data
Thus far we have discussed the analysis of qualitative data as a separate activity. However, if a
research team has collected qualitative as well as quantitative data, which is the case in most HSR
studies, it would be foolish not to look at them in combination, as this can inspire deeper and more
rewarding analysis.
For example, the Indonesian ‘gender and leprosy’ research team found, when analysing the
registration data of 4500 new leprosy patients who had registered over the past five years, that the
M/F ratio was most unfavourable in the age group of 15-44 years. This was a puzzling finding, as in
Nepal women in this age group were reporting much better (though still less than men). In-depth
interviews with staff revealed that they suspected adolescent girls and young women to hide their
skin patches, because of shameful associations with dirt and ugliness. This provided the incentive for a
further breakdown of the quantitative data, which revealed that the M/F difference in reporting was
indeed most pronounced in the 15-34 age group, and levelled off above 35. The reason(s) for this
relatively large gender difference in the younger age groups were then further explored.
2.3.5 Content analysis of qualitative data for action
Quantitative data serve in the first place to convince health authorities that there is indeed a serious,
sizeable problem; qualitative data help to provide ideas on how to solve it. The FGDs on weaning
foods with young mothers and mothers who had surpassed the childbearing age, for example, will
yield many suggestions on how to develop interventions with the mothers which they are likely to
consider useful and will be able to implement. Likewise, the in-depth interviews with leprosy and
ex-leprosy patients will provide new insights into how best to counsel new patients and their close
relatives/spouses in order to reduce unnecessary fears.
2.3.6 Computer analysis of qualitative data
With the ever-increasing importance of computers in research, strategies for analysing qualitative
data by computer have been/are being developed. There are several possibilities, ranging from simple
word processing programs to highly sophisticated Qualitative Data Management Software including
possibilities for statistical testing of associations. Examples of such software include NVivo,
Qualitan and SPSS, to mention just a few. As numbers are usually small in qualitative studies, and content
analysis, which can be done by hand, is most likely more important than testing of associations, we
will not elaborate on these techniques here.
2.4 Reporting qualitative data
Basically, there are two ways of reporting qualitative data that form part of a study in which different
research techniques were used. One way is summarising the major qualitative results in a separate
section of the findings, with examples and quotations, following the objectives that guided the
collection of this particular data. The results would then be discussed in the chapter ‘Discussion’,
together with the results of other, more quantitative data collection tools and would subsequently be
reflected in the summary of the findings and the recommendations. Another possibility is to fully
integrate different data sets in the chapter of findings, ordered according to the objectives of the
entire study. If quantitative and qualitative data have been analysed and sometimes even collected in
an integrated way, it would also be logical to present them in an integrated fashion. Attention should
be paid that no valuable data get lost. Therefore a rough draft of all important findings is required in
any case, after which it can be decided whether to present the data in separate sections or chopped up for
integration with other data.
2.5 Further strategies for testing or confirming qualitative findings to prove validity
Researchers who use quantitative research designs reduce their data to numbers and apply statistical
tests. This does not necessarily ensure that their research results are valid: something may have gone
wrong during sampling or collection of data or even in the earlier design of the study (overlooking
possible confounding variables). The following strategies will therefore be of use to any researcher.
They are particularly relevant, however, to qualitative research, since the small numbers involved in
qualitative studies often raise questions concerning their validity.
a) Check for representativeness of data.
b) Check for observer bias or the influence of the researcher on the research situation.
c) Cross-check data with evidence from other, independent sources.
d) Compare and contrast data.
e) Use extreme (groups of) informants to the maximum.
f) Do additional research to test the findings of your study:
• to replicate certain findings,
• to rule out (or identify) possible intervening variables,
• to rule out rival explanations by investigating them, or
• to look for negative evidence.
g) Get feedback from your informants.
2.5 Summary
We have reached the end of this lecture. Throughout analysis and reporting of qualitative data you
need to involve all parties concerned in the various stages of the research. This is important not only
for ethical reasons or because it will improve the chances that the results will be implemented, but
also because it will improve the quality of your study design, of your data, and of the conclusions
drawn from these data. Suggestions and additional information collected during feedback sessions
will invariably increase the quality of your research report.
2.6 References
Miles MB and Huberman AM (1984) Qualitative data analysis: a sourcebook of new methods.
Beverly Hills, CA, USA: Sage Publications.
Patton MQ (1990) Qualitative Evaluation and Research Methods. 2nd ed. Newbury Park, CA, USA: Sage
Publications.
Spradley JP (1979) The ethnographic interview. New York, NY, USA: Holt, Rinehart and Winston.
Walker R (ed) (1985) Applied qualitative research. Hants, UK: Gower Publishing Company Ltd.
Willms DG and Johnson NA (1996) Essentials in Qualitative Research: A Notebook for the Field.
Hamilton, Canada: McMaster University.
Yin RK (1984) Case study research: design and methods. Beverly Hills, CA, USA: Sage
Publications.
NB: A major source of inspiration for writing this module was Miles and Huberman’s book. Section
V of this module is a heavily abbreviated and adapted version of their chapter VII.
LECTURE THREE
THE USE OF COMPUTERS IN DATA ANALYSIS
(Dr. D. Ngaruko)
3.1 Introduction
Statistics, as a scientific discipline, is evolving rapidly. In large part, this evolution is a
consequence of the development of the computer. Just about 30 years ago data analysis took place either
by means of a rudimentary (by today’s standards) calculator or on a centrally located computer.
Nowadays, most analysis takes place on a personal computer located on the user’s desk (or, via a
network, on some other personal computer or workstation). Punched tapes and cards have given way
to direct interaction with the computer. However, from a data-analytic viewpoint, these are all
superficial changes. The fundamental changes relate to what data analyses can be done, and the speed
with which they can be done. All of these are, again, the result of the development of powerful
computer software. Most statistical packages also provide facilities for data management. Whereas
some types of statistical software are discipline-specific, others are applicable more
generally across all knowledge disciplines. Table 4.3.1 summarizes the most common packages and the
disciplines where they are most appropriate. We will only be covering SPSS for Windows in this
lecture.
3.2 Learning outcomes
At the end of this lecture you should be able to:
• Explain and appreciate the role of computer software in data analysis
• Undertake quantitative data entry into SPSS for Windows
• Generate frequency tables and descriptive statistics using SPSS for Windows
• Run and interpret Crosstabs and the Chi-Square statistic using SPSS for Windows
• Compute T-Tests for dependent and independent samples using SPSS for Windows
• Run correlations and regression analyses using SPSS for Windows
3.3 Data analysis by computer
A statistical package is a suite of computer programs that are specialised for statistical analysis. It
enables people to obtain the results of standard statistical procedures and statistical significance tests,
without requiring low-level numerical programming. Traditional methods which might have taken
months of painstaking hand calculation can now be tackled almost instantaneously. One
consequence of this is that one does not need to be so sure one is doing the right thing before
undertaking it. This can be both good and bad: one can fit several or even many different models to a
set of data, which is good; but one can also overfit the data (find a model which fits the data so well
that it does not generalize to new data), which is bad. There is also a very real danger here: that modern
sophisticated statistical methods can be used without a proper understanding of the (often deep)
theory underlying them. This has obvious implications for the validity of any conclusions one might
draw. It seems that the development of accessible software has led not to the redundancy of
statisticians (as was once feared might happen) but to an even greater need for them.
Computer power has also provided impetus for the invention of entirely new statistical methods,
methods which would have been completely impracticable or even inconceivable before such
machines were available. Examples of such methods are:
• Resampling methods, e.g. jackknife, bootstrap and cross-validation methods. These tools repeatedly
analyze resamples drawn from the original sample, and so give an idea of the variability which results
from sampling.
• Non-linear models. These are often analytically intractable, and need rapid optimization methods to
estimate the parameters of the models.
• Stochastic optimization methods. These permit global optima to be found, even in the presence of
many local optima. Simulation allows one to explore the properties of estimators or models which
cannot be solved using analytic methods.
• Non-parametric smoothing and curve estimation methods, which are becoming increasingly important.
• New kinds of statistical models, such as graphical models, which are being developed.
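To make the idea of resampling concrete, here is a minimal bootstrap sketch in Python (a language not used elsewhere in this lecture). It estimates the standard error of a mean by repeatedly drawing resamples, with replacement, from one sample. The ages are hypothetical and the function name `bootstrap_se` is our own.

```python
import random
import statistics

def bootstrap_se(sample, n_resamples=2000, seed=0):
    """Estimate the standard error of the mean by bootstrap resampling."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(n_resamples):
        # Draw a resample of the same size as the original, with replacement.
        resample = rng.choices(sample, k=len(sample))
        means.append(statistics.mean(resample))
    # The spread of the resample means approximates the sampling variability.
    return statistics.stdev(means)

ages = [23, 35, 41, 29, 52, 47, 38, 31, 60, 26]  # hypothetical respondent ages
boot_se = bootstrap_se(ages)
analytic_se = statistics.stdev(ages) / len(ages) ** 0.5
print(f"bootstrap SE = {boot_se:.2f}, analytic SE = {analytic_se:.2f}")
```

For a simple mean the bootstrap estimate should land close to the textbook formula; the method becomes valuable for statistics where no such formula is available.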
Table 4.3.1: Common types of computer data analysis software
S/N  Software     Discipline where used
1    SHAZAM       comprehensive econometrics and statistics package
2    SPSS         comprehensive statistics package
3    Stata        comprehensive statistics package
4    StatsDirect  general statistics package mostly used in medical statistics
5    S-PLUS       general statistics package
6    Unistat      general statistics package that can also work as Excel add-in
7    SAS          comprehensive statistical package with programming language
8    RATS         Regression Analysis of Time Series, comprehensive econometric analysis package
9    Quantum      part of the SPSS MR product line, mostly for data validation and tabulation in Marketing and Opinion Research
10   Minitab      general statistics package
11   MATLAB       programming language with statistical features
12   GenStat      general statistics package
13   GAUSS        programming language for statistics
The software packages in Table 4.3.1 represent novel kinds of statistical tools, but even deeper and more
fundamental changes are occurring. Graphical methods are becoming increasingly important.
Whereas not so long ago producing a graph was a slow and painful process, now accurate and
revealing displays can be produced with ease. This has led to a new philosophy of informal data
analysis, moving away from formal inference to informal sifting and examination of data and
making use of the immense power of the human eye to detect patterns. Interactive data analysis in
general is becoming central to the way data analysis is done. We no longer need to search through
mountains of line-printer output seeking the one number we want. Now those numbers will appear
on the screen on command, along with pictures and graphs.
3.3.1 Introduction to the basics of SPSS for Windows
SPSS (originally, Statistical Package for the Social Sciences) is a computer program used for
statistical analysis. Between 2009 and 2010 the premier software for SPSS was called PASW
(Predictive Analytics SoftWare) Statistics. The company announced on July 28, 2009 that it was being
acquired by IBM. As of January 2010, it became "SPSS: An IBM Company". SPSS was released in
its first version in 1968 after being developed by Norman H. Nie and C. Hadlai Hull. Norman Nie
was then a political science postgraduate at Stanford University, and is now Research Professor in the
Department of Political Science at Stanford and Professor Emeritus of Political Science at the
University of Chicago. SPSS is among the most widely used programs for statistical analysis in
social science. It is used by market researchers, health researchers, survey companies, government,
education researchers, marketing organizations and others. The original SPSS manual (Nie, Bent &
Hull, 1970) has been described as one of "sociology's most influential books". In addition to
statistical analysis, data management (case selection, file reshaping, creating derived data) and data
documentation (a metadata dictionary is stored in the datafile) are features of the base software.
Statistics included in the base SPSS software include:
• Descriptive statistics: Cross tabulation, Frequencies, Descriptives, Explore, Descriptive Ratio Statistics
• Bivariate statistics: Means, t-test, ANOVA, Correlation (bivariate, partial, distances), Nonparametric tests
• Prediction for numerical outcomes: Linear regression
• Prediction for identifying groups: Factor analysis, cluster analysis (two-step, K-means, hierarchical), Discriminant
Prior to SPSS 16.0, different versions of SPSS were available for Windows, Mac OS X and Unix.
The Windows version was updated more frequently, and had more features, than the versions for
other operating systems. The following is a list of SPSS releases from version 15 to the current
one, version 18:
• SPSS 15.0.1 - November 2006
• SPSS 16.0.2 - April 2008
• SPSS Statistics 17.0.1 - December 2008
• PASW Statistics 17.0.3 - September 2009
• PASW Statistics 18.0 - August 2009
• PASW Statistics 18.0.1 - December 2009
• PASW Statistics 18.0.2 - April 2010
This part of the lecture takes a brief look at what SPSS for Windows is and what it is capable of
doing. It is not our intention to teach you about statistics in this tutorial. For that you should rely on
your classes in statistics and/or a good textbook. If you're a novice, this tutorial should give you a
feel for the programme and how to navigate through the many options. Beyond that, the SPSS Help
Files should be used as a resource. Further, SPSS sells a number of very good manuals.
3.3.1.1 Starting SPSS for Windows
SPSS for Windows has the same general look and feel as most other programmes for Windows.
Virtually any statistical procedure that you wish to perform can be accomplished by
pointing and clicking on the menus and various interactive dialog boxes. Presumably, SPSS is
already installed on the computer you are using. If you don't have a shortcut on your desktop,
go to the [Start => Programs] menu and start the package by clicking on the SPSS icon.
Once you've clicked on the SPSS icon a new window will appear on the screen. The appearance is
that of a standard programme for Windows with a spreadsheet-like interface. At the bottom left
corner of the screen there are two tabs representing the two primary components of the Data Editor:
Data View and Variable View. The Data View screen is designed to hold raw data for analysis. The
Variable View screen contains information about that data set. Variables are formatted in
Variable View, and data entry is only possible when the SPSS screen is in Data View. The Data
Editor displays the contents of the active data file.
Note that the information in the Data Editor consists of variables and cases. In Data View, columns
represent variables, and rows represent cases (observations). In Variable View, each row is a variable,
and each column is an attribute that is associated with that variable. Variables are used to represent
the different types of data that you have compiled. A common analogy is that of a survey. The response
to each question on a survey is equivalent to a variable. Variables come in many different types,
including numbers, strings, currency, and dates.
As you can see, there are a number of menu options relating to statistics, on the menu bar. There are
also shortcut icons on the toolbar. These serve as quick access to often used options. Holding your
mouse over one of these icons for a second or two will result in a short function description for that
icon. The current display is that of an empty data sheet. Clearly, data can either be entered manually,
or it can be read from an existing data file.
3.3.1.2 Data Entry into SPSS
As noted before, entering data collected with instruments such as questionnaires into SPSS
follows two stages: the first is variable formatting in Variable View mode and the second is the actual data
entry when SPSS is in Data View mode. Formatting the questionnaire in SPSS is
more tedious than data entry, because it involves transforming the questions in the
questionnaire into variables in the computer. To begin the process of adding data, just click on the
first cell that is located in the upper left corner of the datasheet. It's just like a spreadsheet. Enter each
data point then hit [Enter]. Once you're done with one column of data you can click on the first cell
of the next column.
The data view below indicates the first column as "Respondent’s age" and the second column as
"number of siblings".
If you're entering data for the first time, like the above example, the variable names will be
automatically generated (e.g., var00001, var00002,....). They are not very informative. To change
these names, click on the variable name button. For example, double click on the "var00001" button.
Once you have done that, a dialog box will appear. The simplest option is to change the name to
something meaningful. For instance, replace "var00001" in the textbox with an operational variable
name (refer to the SPSS tutorial for details).
Each variable has several characteristics represented on the variable view. Thus in addition to
changing the variable name one has to make changes specific to these characteristics such as [Type],
[Labels], [Missing Values], and [Column Format].
• [Type] - One can specify whether the data are in numeric or string format, in addition to a few more
formats. The default is numeric format.
• [Labels] - Using the labels option can enhance the readability of the output. A variable name is
limited to a length of 8 characters; however, by using a variable label the length can be as much as
256 characters. This provides the ability to have very descriptive labels that will appear in the output.
Often, there is a need to code categorical variables in numeric format. For example, male and female
can be coded as 1 and 2, respectively. To reduce confusion, it is recommended that one uses value
labels. For the example of gender coding, Value: 1 would have a corresponding Value Label: male.
Similarly, Value: 2 would be coded with Value Label: female. (Click on the [Labels] button to verify
the above.)
• [Missing Values] - This option provides a means to code for various types of missing values.
• [Column Format] - The column format dialog provides control over several features of each
column (e.g., width of column).
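The relationship between numeric codes, value labels and missing-value codes can be illustrated outside SPSS with a small Python sketch. Everything here is hypothetical: the `codebook` structure, the use of 9 as a "no answer" code, and the helper `decode` are our own illustrative assumptions, not SPSS features.

```python
# Hypothetical codebook mirroring the Variable View attributes described above.
codebook = {
    "sex": {
        "label": "Sex of respondent",             # variable label
        "value_labels": {1: "male", 2: "female"},  # value labels
        "missing": {9},                            # assumed convention: 9 = no answer
    },
}

raw_sex = [1, 2, 2, 9, 1, 2]  # numeric codes as they would be typed into Data View

def decode(variable, values):
    """Replace numeric codes by value labels; user-missing codes become None."""
    spec = codebook[variable]
    return [None if v in spec["missing"] else spec["value_labels"][v]
            for v in values]

print(decode("sex", raw_sex))
```

Keeping the codes numeric during entry and attaching the labels separately is what allows the output to stay readable without retyping text for every case.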
Once the variables are formatted correctly, click the Data View tab to continue entering the data. The
variable names that you entered in Variable View are now the headings for the columns in Data View (as
seen above). Data are entered in the first row, starting at the first column for the first case (respondent);
hence there will be as many rows as there are cases (respondents) in the Data View. Non-numeric data,
such as strings of text, can also be entered into the Data Editor by selecting “string” as the type of the
variable.
3.3.1.3 Data analysis using SPSS for Windows
a) Frequency tables and descriptive statistics
To begin, click on [Analyze => Descriptive Statistics => Frequencies...] for frequencies and [Analyze =>
Descriptive Statistics => Descriptives...] for simple descriptive statistics. Frequency tables are ideal for
categorical variables whereas descriptives are ideal for scale variables. The result is a new dialog box that
allows the user to select the variables of interest. Also, note the other clickable buttons along the border
of the dialog box. The buttons labelled [Statistics...] and [Charts...] are of particular importance. Select
variables of interest from the list, then click on the arrow pointing right. This
transfers the selected variables to the Variables list. At this point, clicking on the
[OK] button would open an output window with the frequency information for each of the variables in
the form of tables. However, more information can be gathered by exploring the options offered by the
[Statistics...] and [Charts...] buttons. [Statistics...] offers a number of summary statistics. Any statistic that is
selected will be summarized in the output window. As for the options under [Charts...], click on Bar
Charts to replicate the graph in the text.
Once the options have been selected, click on [OK] to run the procedure. The results are then displayed
in an output window. In this particular instance the window will include summary statistics for the
variable in question, and the frequency distribution. You can see all of this by scrolling down the
window. The results should also be identical to those in the text.
You may have noticed from the above that calculating summary statistics requires nothing more than
selecting variables, and then selecting the desired descriptive statistics. The frequency example
allowed us to generate frequency information plus measures of central tendencies and dispersion.
These statistics can be calculated by clicking directly on [Analyze => Descriptive Statistics
=>Descriptives]. Not surprisingly, another dialog box is attached to this procedure. To control the
type of statistics produced, click on the [Options...] button. Once again, the options include the
typical measures of central tendency and dispersion. Each time a statistical procedure is run, like
[Frequencies...] and [Descriptives...], the results are posted to an Output Window. If several
procedures are run during one session, the results will be appended to the same window.
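For readers who want to see what the two procedures compute, the following Python sketch reproduces the arithmetic on a small hypothetical column of data. It is not SPSS output; the data and the function names `frequencies` and `descriptives` are illustrative assumptions.

```python
from collections import Counter
import statistics

siblings = [2, 3, 0, 5, 3, 1, 3, 4, 2, 3]  # hypothetical "number of siblings" column

def frequencies(values):
    """Frequency table as (value, count, percent) rows, sorted by value."""
    counts = Counter(values)
    n = len(values)
    return [(v, c, 100 * c / n) for v, c in sorted(counts.items())]

def descriptives(values):
    """A few of the summary statistics offered by the Descriptives dialog."""
    return {
        "N": len(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Mean": statistics.mean(values),
        "Std. Deviation": statistics.stdev(values),  # sample SD (n - 1 denominator)
    }

for value, count, percent in frequencies(siblings):
    print(f"{value:>5}{count:>6}{percent:>8.1f}")
print(descriptives(siblings))
```

The frequency table suits a variable with few distinct values, while the descriptives summarise a scale variable in a handful of numbers, which matches the advice given above.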
b) Crosstabs and Chi-Square
The cross tabulation (known simply as crosstabs) and its Chi-Square statistic can be
computed by clicking on [Analyze => Descriptive Statistics => Crosstabs...]. This particular
procedure will be your first introduction to coding of data in the data editor. To this point data have
been entered in a column format, that is, one variable per column. However, that method is not
sufficient in a number of situations, including the calculation of Chi-Square, Independent T-tests, and
any Factorial ANOVA design with between-subjects factors. I'm sure there are many other cases,
but they will not be covered in this tutorial. Essentially, the data have to be entered in a specific
format that makes the analysis possible. The format typically reflects the design of the study, as will
be demonstrated in the examples.
For the Chi-Square statistic, the table of data can be coded by indexing the column and row of the
observations. To perform the analysis,
• Select [Analyze => Descriptive Statistics => Crosstabs...] to launch the controlling dialog box.
• At the bottom of the dialog box are four buttons, the most important being the [Statistics...]
button and the [Cells...] button. You must click on the [Statistics...] button and then select the
Chi-square option, otherwise the statistic will not be calculated. Under the [Cells...] button you have the option of
displaying percentages across the rows, down the columns and of the total. Exploring this dialog
box makes it clear that SPSS can be asked to calculate a number of other statistics in conjunction
with Chi-square. For example, one can select the various measures of association (e.g., contingency
coefficient, phi and Cramér's V), among others.
• Move one variable into the Row(s): box and the other variable(s) into the Column(s): box, then
click [OK] to perform the analysis. A subset of the output looks like the following.
The resultant cross tabulation and its associated Chi-Square are as indicated below. It can be seen in
the figure below that the Chi-Square has to be selected for it to be displayed in the same output as the
cross tabulation.
In the previous lectures you should have learnt how to interpret results.
Normally the two (crosstab results and associated Chi-Square) are discussed together. To conclude
from the crosstab that the two variables are related, the p-value reported for the Pearson Chi-Square
(the asymptotic significance, labelled 2-sided in the output) should be less than or equal to the chosen
significance level, conventionally 0.05. From the
illustration below we can see that the Chi-Square value is significant at the 5% level of significance, implying that
there is a significant relationship between the two variables (sex and general happiness).
Although simple, the calculation of the Chi-square statistic is very particular about all the required
steps being followed. More generally, as we enter hypothesis testing, the user should be very careful
and should make use of manuals for the programme and textbooks for statistics.
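To see what lies behind the Pearson Chi-Square figure, the calculation can be sketched in Python as follows. The counts are hypothetical (they are not the figures from the SPSS illustration), the function names are our own, and the p-value helper is deliberately limited to 1 or 2 degrees of freedom, where simple closed forms exist.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def p_value(stat, df):
    """Upper-tail probability, using closed forms that hold for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")

# Hypothetical counts: rows = sex (male, female),
# columns = general happiness (very happy, pretty happy, not too happy).
observed = [[200, 370, 60], [260, 500, 110]]
stat, df = chi_square(observed)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value(stat, df):.3f}")
```

The relationship is declared significant when the printed p-value is at or below the chosen significance level; for larger tables a statistical package or a chi-square table is needed for the p-value.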
c) T-Test
By now, you should know that there are two forms of the t-test, one for dependent (paired) observations and one
for independent observations. To inform SPSS, or any stats package for that matter, of
the type of design, it is necessary to have two different ways of laying out the data. For independent
t-tests, the observations for the two groups must be uniquely coded with a Group variable. Like the
calculation of the Chi-square statistic, these calculations will reinforce the practice of thinking about,
and laying out, the data in the correct format.
i) Dependent T-Test
To calculate the t statistic click on [Analyze => Compare Means => Paired-Samples T Test...],
then select the two variables of interest. To select the two variables, hold the [Shift] key down while
using the mouse for selection. You will note that the selection box requires that variables be selected
two at a time. For the dependent design, the two variables in question must be entered in two
columns. Once the two variables have been selected, move them to the Paired Variables: list. This
procedure can be repeated for each pair of variables to be analyzed. Finally, click the [OK] button.
The critical result for the current analysis will appear in the output window as follows,
Paired Samples Test
Pair 1: Number of Brothers and Sisters - Age of Respondent
Paired Differences
  Mean                                         -41.602
  Std. Deviation                                17.710
  Std. Error Mean                                 .457
  95% Confidence Interval of the Difference
    Lower                                      -42.498
    Upper                                      -40.705
t                                              -91.008
df                                                1500
Sig. (2-tailed)                                   .000
As you can see, an exact t-value is provided along with a p-value, and this p-value (reported as .000,
i.e. smaller than .0005) is well below the conventional cutoff of 0.05 for a two-tailed test. Closer
examination indicates several other statistics are presented in the output window. Quite simply, such
calculations require very little effort!
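The figures in the Paired Samples Test output above are linked by simple arithmetic, which the following Python sketch checks. The mean difference, standard deviation and degrees of freedom are copied from the output; the critical value of 1.96 is a large-sample approximation that we assume here instead of the exact t value.

```python
import math

# Values copied from the Paired Samples Test output above.
mean_diff = -41.602
sd_diff = 17.710
df = 1500
n = df + 1                     # a paired t-test has n - 1 degrees of freedom

se = sd_diff / math.sqrt(n)    # standard error of the mean difference
t = mean_diff / se             # t statistic

# 1.96 is the large-sample (normal) approximation to the critical t value (an assumption).
lower = mean_diff - 1.96 * se
upper = mean_diff + 1.96 * se
print(f"SE = {se:.3f}, t = {t:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The recomputed standard error, t-value and confidence limits agree with the SPSS output up to rounding, which is a useful habit for checking that an output table has been read correctly.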
ii) Independent T-tests
When calculating an independent t-test, the only difference involves the way the data are formatted in
the datasheet. The datasheet must include both the raw data and the group coding for each variable.
To generate the t-statistic, follow this simple procedure:
• Click on [Analyse => Compare Means => Independent-Samples T Test] to launch the
appropriate dialog box.
• Select the dependent variable from the list of variables and move it to the Test Variable(s): box.
• Select "group" (the grouping variable) from the list and move it to the Grouping Variable: box.
• The final step requires that the groups be defined. That is, one must specify that Group1 (the
experimental group in this case) is coded as 1, and Group2 (the control group in this case) is coded
as 2. To do this, click on the [Define Groups...] button, enter the two codes, and then click on the
[Continue] button to return to the controlling dialog box.
• Run the analysis by clicking on the [OK] button.
• The output for the current analysis then appears in the output window.
In this example, the p-value of .004 is well below the cutoff of 0.025, which suggests that the means are
significantly different. In addition, Levene's Test is performed to check whether the two groups have
equal variances, so that the correct set of results is used. In this case the variances are equal;
however, the calculations for unequal variances are also presented, along with some other statistics
not discussed here.
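As a rough illustration of the equal-variances case, the sketch below computes the independent-samples t statistic with a pooled variance, using only the Python standard library. The function name independent_t and the two small groups are hypothetical and stand in for the experimental group (coded 1) and the control group (coded 2); the sketch does not reproduce Levene's Test or the unequal-variances calculation.

```python
import math
import statistics


def independent_t(group1, group2):
    """Return (t, df) for an independent-samples t-test assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)
    var2 = statistics.variance(group2)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se_diff
    return t, n1 + n2 - 2


# Hypothetical scores, for illustration only.
experimental = [5, 7, 9]   # group coded 1
control = [1, 2, 3]        # group coded 2
t_value, df = independent_t(experimental, control)
```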
In the next section we will briefly demonstrate the calculation of correlations and regression, as
discussed in Chapter 9 of Howell. In truth, you should be able to work through many statistics with
your current knowledge base and the help files, including correlations and regressions. Most statistics
can be calculated with a few clicks of the mouse.
d) Correlations and Regression
To calculate a simple correlation matrix, one must use [Analyse => Correlate => Bivariate...], and
[Analyse => Regression => Linear] for the calculation of a linear regression. Let us briefly outline
how the two analyses are performed in SPSS.
i) Simple Correlation
• Click on [Analyse => Correlate => Bivariate...], then select and move "IQ" and "GPA" to the
Variables: list. [Explore the options presented on this controlling dialog box.]
• Click on [OK] to generate the requested statistics.
The results from the output window should look like the following,

Correlations

                                             Number of Brothers   Age of
                                             and Sisters          Respondent
Number of Brothers    Pearson Correlation    1                    .116**
and Sisters           Sig. (2-tailed)                             .000
Age of Respondent     Pearson Correlation    .116**               1
                      Sig. (2-tailed)        .000

**. Correlation is significant at the 0.01 level (2-tailed).
As you can see, the Pearson correlation coefficient is 0.116 and p = .000. The results suggest that the
correlation, although weak, is significant at the 1% level (and therefore also at 5%).
Note: In the above example we only created a correlation matrix based on two variables. The process
of generating a matrix based on more than two variables is no different. That is, if the dataset
consisted of 10 variables, they could all have been placed in the Variables: list. The resulting matrix
would include all the possible pairwise correlations.
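The idea of a correlation matrix can be sketched in a few lines of Python. The code below is a minimal illustration using only the standard library; the function names pearson_r and correlation_matrix, the variable names "x" and "y", and the data are assumptions made for this example and have nothing to do with the SPSS dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Map every pair of variable names to its correlation coefficient."""
    names = list(columns)
    return {(a, b): pearson_r(columns[a], columns[b]) for a in names for b in names}


# Hypothetical data, for illustration only.
data = {"x": [1, 2, 3, 4, 5], "y": [2, 4, 5, 4, 5]}
matrix = correlation_matrix(data)
r_xy = matrix[("x", "y")]
```

With more variables in the dictionary, the same function returns every pairwise correlation, which mirrors placing more variables in the Variables: list.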
ii) Linear Regression analysis
• Initiate the procedure by clicking on [Analyse => Regression => Linear...]
• Select and move the dependent (endogenous) variable into the Dependent: variable box
• Select and move the independent (exogenous) variable into the Independent(s): variable box
• Click on [OK] to generate the statistics.
Note: A variety of options can be accessed via the buttons on the bottom half of this controlling
dialog box (e.g., Statistics, Plots,...). Many more statistics can be generated by exploring the
additional options via the Statistics button.
Some of the results of this analysis are presented below,
Coefficients(a)

                               Unstandardized           Standardized
                               Coefficients             Coefficients
Model                          B         Std. Error     Beta            t         Sig.
1  (Constant)                  3.023     .215                           14.083    .000
   Age of Respondent (AGE)     .020      .004           .116            4.534     .000

a. Dependent Variable: Number of Brothers and Sisters (SIBS)
The resultant statistics are the "Constant", or a from the text, and the "Slope", or B from the text. In the
above output of the regression analysis, the dependent variable is the number of brothers and sisters,
whereas the independent variable is the age of the respondent. As such, one can predict the number of
brothers and sisters with the following equation,
SIBS = 3.023 + 0.020 * AGE
The interpretation of this model is that age has a significant positive effect on the number of
siblings, i.e. each additional year of age is associated with an increase of 0.020 in the predicted
number of brothers and sisters. The value 0.116 in the output is the standardized coefficient (Beta),
not the slope.
Note: Multiple regression analysis involves more than one independent variable, and the B for each
independent variable is interpreted independently of the other variables. The interpretation is the same as
for simple regression, except that the other variables are held constant.
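To make the roles of the constant and the slope concrete, here is a minimal Python sketch of simple least-squares regression using only the standard library. The function names fit_line and predict_sibs and the small dataset are assumptions for illustration; predict_sibs simply plugs an age into the unstandardized B values reported in the Coefficients output (constant 3.023, slope 0.020).

```python
def fit_line(xs, ys):
    """Return (constant, slope) of the least-squares line y = constant + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    constant = mean_y - slope * mean_x
    return constant, slope


def predict_sibs(age, constant=3.023, slope=0.020):
    """Predicted number of siblings, using the unstandardized B values from the output."""
    return constant + slope * age


# Hypothetical data lying exactly on the line y = 1 + 2x, for illustration only.
constant, slope = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Prediction for a 40-year-old respondent: 3.023 + 0.020 * 40.
predicted = predict_sibs(40)
```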
e) One-Way ANOVA
As in the independent t-test datasheet, the data must be coded with a group variable. To complete the
analysis,
• Select [Analyse => Compare Means => One-Way ANOVA...] to launch the controlling dialog box.
• Select and move "Scores" into the Dependent list:
• Select and move "Groups" into the Factor: list
• Click on [OK]
The preceding is a complete specification of the design for this one-way ANOVA. A simple
presentation of the results then appears in the output window.
The analysis that was just performed provides minimal details with regard to the data. If you take a
look at the controlling dialog box, you will find 3 additional buttons on the bottom half: [Contrasts...], [Post Hoc...], and [Options...].
Selecting [Options...], you will find further choices, such as descriptive statistics and a test of homogeneity of variance.
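The F ratio at the heart of a one-way ANOVA can be sketched as follows. This is a minimal standard-library Python illustration under the usual definitions (between-groups and within-groups sums of squares); the function name one_way_anova_f and the three small groups are hypothetical, and the sketch does not compute the p-value or any post hoc tests.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-groups sum of squares: spread of the scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three groups, for illustration only.
f_ratio, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```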
f) Factorial ANOVA
To conduct a Factorial ANOVA, one need only extend the logic of the one-way design. To compute
the relevant statistics, the following simple approach is required:
i) Select [Analyse => General Linear Model => Simple Factorial...]
ii) Select and move "Scores" into the Dependent: box
iii) Select and move "Age" into the Factor(s): box.
iv) Click on [Define Range...] to specify the range of coding for the variable, then click on [Continue].
v) Select and move "Condition" into the Factor(s): box
vi) Click on [Define Range...] to specify the range of the Condition factor.
vii) Under [Options...] activate Hierarchical, or Experimental, then activate Means and counts, and click [Continue]
viii) Click on [OK] to generate the output.
The output is a complete source table with the factors identified by their Variable Labels.
3.4 Summary
SPSS is the statistical package most widely used. There seem to be several reasons for its popularity:
• Force of habit: SPSS has been around since the late 1960s. (Political scientist Norman
Nie, who co-authored The Changing American Voter with Sidney Verba, developed it.) SPSS
originally stood for “Statistical Package for the Social Sciences”, but the name has since been
changed to reflect the marketing of SPSS outside the academic community;
• Of the major packages, it seems to be the easiest to use for the most widely used statistical
techniques;
• One can use it with either a Windows point-and-click approach or through syntax (i.e., writing
out of SPSS commands). Each has its own advantages, and the user can switch between the
approaches;
• Many of the widely used social science data sets come with an easy method to translate them
into SPSS; this significantly reduces the preliminary work needed to explore new data.
There are also two important limitations that deserve mention at the outset:
• SPSS users have less control over statistical output than, for example, Stata or Gauss users. For
novice users, this hardly causes a problem. But, once a researcher wants greater control over the
equations or the output, she or he will need to either choose another package or learn techniques for
working around SPSS’ limitations;
• SPSS has problems with certain types of data manipulations, and it has some built-in quirks that seem
to reflect its early creation. The best known limitation is its weak lag functions, that is, how it
transforms data across cases. For new users working off of standard data sets, this is rarely a problem.
But, once a researcher begins wanting to significantly alter data sets, he or she will have to either
learn a new package or develop greater skills at manipulating SPSS.
Overall, SPSS is a good first statistical package for people wanting to perform quantitative research
in social science because it is easy to use and because it can be a good starting point to learn more
advanced statistical packages.
3.5 References
Burchinal, Lee (1997). Methods for Social Researchers in Developing Countries. Available in PDF
and at www.srmdc.net/. Last accessed on October 15th 2008.
Gerber, S.B. and Finn, K.V. (2005). Using SPSS for Windows. 2nd Ed.
http://www.hmdc.harvard.edu/projects/SPSS_Tutorial. Last accessed in July 2010.
http://stattrek.com - Select Tutorials and then Introduction to Probability and Statistics. Last accessed
in July 2010.
Burns, Robert and Burns, Richard ( ). Business Research and Statistics Using SPSS. SAGE Pub. Ltd.
SPSS for Windows software (any version between Version 14 and most recent)
SPSS programme/manual
MODULE FIVE
RESEARCH REPORT WRITING
LECTURE ONE
WRITING A RESEARCH REPORT, DISSERTATION AND THESIS
(By Prof. S. Mbogo, Dr. L. Kisoza and Ms. H. Mtae)
1.1 Introduction
The purpose of this chapter is to introduce you to how to write research reports, dissertations and
theses. The chapter covers: rationale for writing reports, types of reports and main contents of the
report.
1.2 Learning outcomes
After completion of this chapter you should be able to:
i) Identify the different types of the report
ii) Outline main components of a research report
iii) Outline main components of a report
iv) Write scientific reports
v) Explain the importance of formatting research reports
1.3 Rationale for Report Writing
There are many reasons for reporting research results, including:
• To provide criteria for judging the capacity of research personnel. Research output can only be
judged if it is reported, and reported continuously.
• To disseminate research findings to other researchers and to indicate directions for future research.
One of the purposes of research is to generate new information; research can therefore provide
solutions to problems or open avenues for further research. Reporting also prevents duplication of
research efforts.
• To justify expenditure of public or donor funds. Funding agencies need to satisfy themselves
that the research funds were worth spending.
1.4 How to Get Started
The main assumption here is that you have come up with a good idea for research, had your proposal
approved, collected the data, conducted your analyses and now you're about to start writing the
dissertation. Take your proposal and begin by checking your proposed research methodology. Change
the tense from future tense to past tense and then make any additions or changes so that the
methodology section truly reflects what you did. You have now been able to change sections from the
proposal to sections for the dissertation. Move on to the Statement of the Problem and the Literature
Review in the same manner.
You first have to make an outline for your report. This outline will contain three main parts. The first
part will consist of a description of your problem, within its context (the country and research area),
the objectives of the study and the methodology. This part should not comprise more than one quarter
of the report. The second part will form the bigger part of your report and this will contain the
research findings. The third and final part will consist of the discussion of your data, conclusions and
recommendations.
Then you will have to make your report, dissertation or thesis attractive and user-friendly with a
creative title page, a preface with acknowledgements, a table of contents, and a list of tables, figures
and abbreviations. The references you used for your study will have to be added, and annexes
including your data-collection tools.
Before you start writing, it is therefore essential to group and review the data you have analysed by
objective. Check whether all data has indeed been processed and analysed as planned.
Draw major conclusions and relate these to the literature read. If necessary go back to your raw data
and refine your analysis, or go search for additional literature to answer questions that the analysis of
your data may evoke.
Compile the major conclusions and tables or quotes from qualitative data related to each specific
objective. You are now ready to draft the report.
Reports need to:
• Have a logical, clear structure
• Be to the point, and
• Use simple language and have a pleasant layout
1.5 Preliminary Considerations
When writing research reports, the following issues must be taken into consideration:
a) Knowledge of your audience
• Know who your readers are. You must also take into consideration the needs of the community, as
well as those of policy and programme makers.
• Why do they want to read your report? For instance, it is established that most people would rather
learn the solutions to a problem than be told what the problem is.
b) Knowledge of how the reader reads your report
Most readers want to know about the new information generated by a particular piece of research. The
new knowledge is usually highlighted in the conclusion, so such readers begin with the
conclusion.
c) Complete data analysis before you start writing a report
Before you start writing a report you need to review your data analysis thoroughly and ask yourself
the following:
• Are the conclusions appropriate to the specific objectives?
• Are the analytical tables adequate?
• Have all methods of data collection been included?
1.6 Types of Research Reports
There are different types of research, and therefore different types of research reports to
match. Examples of research reports are:
a) Specific Project reports
These include progress reports and annual reports. Formats for such reports differ depending on the
funding agency. These reports have a very limited circulation.
b) Theses and Dissertations
These are specialized reports prepared by postgraduate students. Their formats differ
depending on the specific requirements of particular universities or institutions.
c) Technical Articles for Journals or Scientific Conferences
These include either short or full-length papers. The formats and styles of presentation depend on
the specifications of the publisher of the journal or proceedings. These articles have the potential to
reach a wider, international audience, depending on their distribution.
1.7 Components of Research Report
The research report should contain the following components:
1.7.1 Title
A good title uses the fewest possible words that accurately describe the content of the paper.
1.7.2 Cover page
The cover page should contain the full title of the research report/dissertation/thesis and the name of
the author. If it is a dissertation/thesis, it should also include the degree and the university, as well as
the year of submission.
1.7.3 Abstract
The abstract (summary) should be written only after the first or even the second draft of the report has been
completed. It should contain:
• A very brief description of the problem (why this study was needed)
• Main objectives (what has been studied)
• Place of study (where)
• Type of study and methods used (how)
• Major findings and conclusions, followed by
• Major (or all) recommendations.
1.7.4 Acknowledgements
It is good practice to thank those who supported you technically or financially in the design and
implementation of your study. Your employer, who allowed you to invest time in the study,
and the respondents may also be acknowledged.
1.7.5 Table of contents
A table of contents is essential. It provides the reader with a quick overview of the major sections of
your report, with page references, so that the reader can go through the report in a different order or
skip certain sections.
1.7.6 List of tables, figures
If you have many tables or figures it is helpful to list these also, in a ‘table of contents’ type of format
with page numbers.
Examples:
Tables
• Table 1.1 means Table 1 in chapter one
• Table 2.1 means Table 1 in chapter two
Figures
• Fig.1.1 means Figure 1 in chapter one
• Fig.2.1 means Figure 1 in chapter two
1.7.7 List of abbreviations
If abbreviations or acronyms are used in the report, these should be stated in full in the text the first
time they are mentioned. If there are many, they should be listed in alphabetical order as well.
The table of contents and lists of tables, figures, abbreviations should be prepared last, as only then
can you include the page numbers of all chapters and sub-sections in the table of contents. Then you
can also finalise the numbering of figures and tables and include all abbreviations.
1.7.8 Introduction
The introductory chapter should give the reader a clear idea of the central issue of concern in your
research and why you thought it was worth studying.
The introduction is a relatively easy part of the report that can best be written after a first draft of the
findings has been made. It should certainly contain some relevant background data about the country,
namely data that relate to the problem that has been studied.
Then the statement of the problem should follow, revised from your research proposal with additional
comments and relevant literature collected during the implementation of the study. It should contain a
paragraph on what you hope(d) to achieve with the results of the study.
Global literature can be reviewed in the introduction to the statement of the problem if you have
selected a problem of global interest. Otherwise, relevant literature from individual countries may
follow as a separate literature review after the statement of the problem. You can also introduce
theoretical concepts or models that you have used in the analysis of your data in a separate section
after the statement of the problem.
1.7.9 Literature review
The main purpose of the literature review is to set your study within its wider context and to show the
reader how your study supplements the work that has already been done by others.
1.7.10 Research Design and Methods/ Methodology
This should be a detailed chapter giving the reader sufficient information to make an estimate of the
reliability and validity of your methods.
The methodology you followed for the collection of your data should be described in detail including
description of:
• The study type
• Major study themes or variables (a more detailed list of variables on which data was collected may
be annexed)
• The study population(s), sampling method(s) and the size of the sample(s)
• Data-collection techniques used for the different study populations
• How the data was collected and by whom
• Procedures used for data analysis, including statistical tests (if applicable).
1.7.11 Results
This is the most straightforward chapter, as you simply have to report the facts that your research
discovered. It includes:
• Tables and graphs that illustrate your findings
• Quotes from interviewees (the qualitative equivalent of tables and graphs)
• Sections of narrative account that illustrate periods of unstructured observation.
1.7.12 Discussion
The main purpose of the discussion chapter is the interpretation of the results that you presented in the
previous chapter. It involves making judgements about the research findings rather than reporting facts.
Findings should be discussed by objectives.
You should state the relation of your findings to the goals, questions and hypotheses you stated earlier.
The discussion also considers the implications of your research for the relevant theories which you
detailed in your literature review. It is usual to discuss the strengths, weaknesses and limitations of
your study.
The discussion may include findings from other related studies that support or contradict your own.
1.7.13. Conclusion and Recommendations
a). Conclusion
A conclusion is a synthesis of findings corresponding to a specific circumstance based on the
researcher’s understanding.
This should be a conclusion to the whole chapter, not just to the research findings, and it should not
include new ideas. The best way is to follow a structure similar to that used in your findings section.
b) Recommendation
A recommendation is a suggested course of action based on the conclusions about a specific
circumstance. It involves suggestions for improving the programme or activity researched.
The conclusions and recommendations should follow logically from the discussion of the findings.
Conclusions can be short, as they have already been elaborately discussed in research findings. As the
discussion will follow the sequence in which the findings have been presented, the conclusions should
logically follow the same order.
Recommendations should be placed in roughly the same sequence as the conclusions or may at the
same time be summarised according to the groups towards which they are directed, for example,
programme makers, policy makers, community or further studies.
In making recommendations, use not only the findings of your study, but also supportive information
from other sources. You should also consider constraints, feasibility and usefulness of the proposed
solutions.
1.7.14 References
This section lists the materials that have been referred to or quoted in the study (a detailed explanation is
presented in the next chapter).
1.7.15 Appendices
The appendices include materials that may be of interest to the reader but are not crucial to the study (a
detailed explanation is presented in the next chapter).
1.8 Writing Style
The major myth in writing a dissertation is that you start writing at Chapter One and then finish your
writing at Chapter Five. This is seldom the case. The most productive approach in writing the
dissertation is to begin writing those parts of the dissertation that you are most comfortable with.
Then move about in your writing by completing various sections as you think of them. At some point
you will be able to spread out in front of you all of the sections that you have written. You will be
able to sequence them in the best order and then see what is missing and should be added to the
dissertation.
Bear in mind the axiom that there are as many writing styles as there are writers. For
that reason there is no single prescription for writing style. Nonetheless, the following should be
taken into consideration when one starts to write a report. Firstly, remember that your reader has many
other urgent matters to attend to and is therefore short of time. Secondly, the reader is probably not
familiar with research jargon.
Therefore consider the following:
• Simplify your report by keeping to essentials; be precise and specific.
• Be clear, logical and systematic. Use adverbs and adjectives sparingly. Also be consistent in the
use of tenses (past, present).
• Justify what you report by making only statements that are based on facts, and use short sentences.
• Always strive to inform, not to impress. Always quantify your results: avoid expressions like "large"
or "small"; instead say "almost 75%" or "one in three".
Dissertation-style writing is not designed to be entertaining. Dissertation writing should be clear
and unambiguous. To do this well you should prepare a list of key words that are important to your
research and then your writing should use this set of key words throughout. There is nothing so
frustrating to a reader as a manuscript that keeps using alternate words to mean the same thing. If
you've decided that a key phrase for your research is "educational workshop", then do not try
substituting other phrases like "in-service program", "learning workshop", "educational institute", or
"educational program." Always stay with the same phrase - "educational workshop." It will be very
clear to the reader exactly what you are referring to.
1.9 Layout of the Report
Ensure that your report has a good layout and meets the specifications of publishers, universities or
institutions. A good layout helps your report to make a good initial impression, encourages the reader,
and gives an idea of the organization of the information.
In order to have a good layout, consider the following:
• An attractive layout for the title page, and clear table of contents
• Consistency in margins and spacing
• Consistency in headings and subheadings e.g. use of bold, italics, underline, lower case, upper case.
• Consistency in numbering for figures and tables
• Accuracy and consistency in quotations and references
• High quality photocopying
Review two or three well organized and presented dissertations. Examine their use of headings,
overall style, typeface and organization. Use them as a model for the preparation of your own
dissertation. In this way you will have an idea at the beginning of your writing what your finished
dissertation will look like. A most helpful perspective!
1.10 Drafts
The starting point is a draft of the report. It is advisable to prepare an outline of the report first, which consists
of the following:
• Headings of main section
• Headings of subsections
• Points to be made for each subsection
• A list of tables and figures to be illustrated
As you get involved in the actual writing of your dissertation you will find that conservation of paper
will begin to fade away as a concern. Just as soon as you print a draft of a chapter there will appear a
variety of needed changes and before you know it another draft will be printed. And, it seems almost
impossible to throw away any of the drafts! After a while it will become extremely difficult to
remember which draft of your chapter you may be looking at. Print each draft of your dissertation
on a different colour paper. With the different colours of paper it will be easy to see which is the
latest draft.
1.11 Review exercises
1. What should one think of before embarking on report writing?
2. There are different types of reports. List all the types you know.
3. Choose a title of your own and write a simple report on it, including all the major components
of a report.
LECTURE TWO
CITATION, REFERENCES AND APPENDICES
(By Prof. S. Mbogo, Dr. L. Kisoza and Ms. H. Mtae)
2.1 Introduction
The main purpose of this chapter is to introduce to you citation, references, quotations and
appendices. It covers the purpose of citation, what to cite, citation styles, references, appendices and
their importance.
2.2 Learning outcomes
At the end of this chapter you are expected to be able to:
i) Differentiate between citations, quotations, references and appendices
ii) Explain the purpose of citation
iii) Identify what to cite and quote
iv) Explain the importance of citation, references and appendices
v) Use citation, references, quotations and appendices appropriately in reports,
dissertation and thesis writing.
2.3 Citation
This means referring to the work of other authors in your own text. Normally it’s done to show
evidence of the background reading work that has been done and to support the contents of your
research report. Each citation requires a reference at the end of your text.
These references may be from work presented in journal or newspaper articles, government reports,
books or specific chapters of books, research dissertations or theses, material from the Internet etc.
2.3.1 Purpose of citation
There are three main reasons for you to include citations in your papers:
• To give credit to the authors of the source materials you used when writing the paper.
• To enable readers to follow up on the source materials.
• To demonstrate that your paper is well-researched.
2.3.2 What to Cite
You should cite all direct quotations, paraphrased factual statements, and borrowed ideas. The only
items that you do not need to cite are facts that seem to be common knowledge.
When you draw a great deal of information from a single source, you should cite that source even if
the information is common knowledge, since the source (and its particular way of organizing the
information) has made a significant contribution to your research report. Failure to give credit to the
words and ideas of an original author is plagiarism.
2.3.3 Citation Styles
Various citation styles exist. They convey the same information, only the presentation of that
information differs. Most style guides fall into two commonly used systems:
• Author-date system (e.g. Harvard)
• Numeric system (e.g. Vancouver)
Some styles, such as MLA (Modern Language Association), instead give the author's name and page
number in the text.
Whichever system you use, it is important that you are consistent in its application.
2.3.4 Citations in the text
All ideas taken from another source regardless of whether directly quoted or paraphrased need to be
referenced in the text. To link the information you use in your text to its source (book, article, etc.),
put the author’s name and the year of publication at the appropriate point in your text. If the author’s
name does not naturally occur in your writing, put the author’s surname and date in brackets.
For example:
There is some evidence that these figures are incorrect (Jones, 1992).
If the author’s name is part of the statement, put only the year in brackets:
For example:
Jones (1992) has provided evidence that these figures are incorrect.
If there are two authors, give both:
For example:
It is claimed that government in the information age will “work better and cost less” (Bellamy and
Taylor, 1998).
Note: if you are giving a direct quotation then you need to include the page number.
If there are more than two authors, cite only the first followed by ‘et al.’ (which means ‘and others’):
For example:
. . .adoptive parents were coping better with the physical demands of parenthood and found family
life more enjoyable (Levy et al. 1991).
Note: up to three author names can be given in your reference list/bibliography.
If an author has published more than one document in the same year, distinguish between them by adding
lower-case letters to the year:
For example:
In recent studies by Smith (1999a, 1999b, 1999c) . . .
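The author-date rules above can be summarised in a short Python sketch. The function name cite, its parameters, the placeholder co-author names and the page number are all assumptions made for illustration, and the exact punctuation (for example the comma before the year) varies between institutions, so check your university's guidelines before relying on it.

```python
def cite(surnames, year, in_sentence=False, page=None):
    """Build an author-date in-text citation from a list of author surnames."""
    if not surnames:
        raise ValueError("at least one author surname is required")
    if len(surnames) == 1:
        names = surnames[0]
    elif len(surnames) == 2:
        names = f"{surnames[0]} and {surnames[1]}"
    else:
        # More than two authors: first author followed by 'et al.'
        names = f"{surnames[0]} et al."
    # A page number is needed when giving a direct quotation.
    detail = str(year) if page is None else f"{year}, p. {page}"
    if in_sentence:
        # The author's name is part of the sentence: only the year goes in brackets.
        return f"{names} ({detail})"
    return f"({names}, {detail})"


one_author = cite(["Jones"], 1992)
in_text = cite(["Jones"], 1992, in_sentence=True)
two_authors = cite(["Bellamy", "Taylor"], 1998)
# "Author2" and "Author3" are placeholders for unnamed co-authors.
many_authors = cite(["Levy", "Author2", "Author3"], 1991)
# The page number 12 is hypothetical.
quoted = cite(["Bellamy", "Taylor"], 1998, page=12)
same_year = cite(["Smith"], "1999a", in_sentence=True)
```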
2.3.5 Citation of work described in another work
When an author quotes or cites another author and you wish to cite the original author you should
first try to trace the original item. However, if this is not possible, you must acknowledge both
sources in the text, but only include the item you actually read in your reference list.
For example:
If Jones discusses the work of Smith you could use:
Smith (2005) as cited by Jones (2008)
or Smith’s 2005 study (cited in Jones 2008, p.156) shows that…
Then cite Jones in full in your reference list.
2.3.6 Information found in more than one source
If you find information in more than one source, you may want to include all the references to
strengthen your argument. In that case, cite all sources in the same brackets, placing them in order
of publication date (earliest first). Separate the references using a semi-colon (;).
For example:
Several writers (Jones 2004; Biggs 2006; Smith 2008) argue…
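The ordering rule can be expressed as a small Python sketch. The function name cite_sources is an assumption for illustration; it takes (surname, year) pairs in any order and returns them sorted by publication year and separated by semi-colons, in the punctuation used in the example above.

```python
def cite_sources(sources):
    """Combine several (surname, year) pairs into one bracketed citation, earliest first."""
    ordered = sorted(sources, key=lambda source: source[1])
    return "(" + "; ".join(f"{name} {year}" for name, year in ordered) + ")"


# The sources may be supplied in any order.
combined = cite_sources([("Smith", 2008), ("Jones", 2004), ("Biggs", 2006)])
```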
2.3.7 Chapter/section of an edited book
For example:
The view proposed by Franklin (2002, p. 88) . . .
1. Journal article
For example:
. . . the customer playing the part of a partial employee (Dawes and Rowley, 1998).
2. Newspaper article
For example:
TGNP (2010) accused the 18th Parliament Session of Tanzania of not delivering as expected.
3. Electronic information
For example:
One commentator (Ben, 2005) questioned whether educators will have time to acquire. . .
Repeating a Citation
▪ After the first complete citation of a work, you may abbreviate subsequent instances by using
either Ibid. or a shortened form of the citation. See the following examples of each style.
Ibid.
Use Ibid. to repeat a footnote that appears immediately before the current footnote. Ibid. takes the
place of the author’s name, the title of the work, and as much of the subsequent information as is
identical.
For example:
50 Thomas Smith, “New Debate over Business Records,” The New York Times, December 31,
1978, sec. 3, p. 5.
51 Ibid., p. 6.
1. Quotations
Quoting involves using exact words, phrases and sentences from a source, setting them off with
quotation marks, and citing where the information was taken from.
For example:
According to Berestein (2003), the Middle Eastern water pipe known as the hookah recently "has
been resurrected in youth-oriented coffee houses, restaurants and bars, supplanting the cigar as the fad
of the moment".
Smoothly incorporate the quote into your document. Try using a compound or complex
sentence in which at least one entire clause includes only your original ideas. The other clause is the
quote.
For example:
Because Lenina is incapable of real love, she misunderstands John's emotion when he tells her, "I
love you more than anything in the world."
2.4 References
This is a list of bibliographic details of all items referred to directly in the text.
There are many styles, but the most popular ones are:
i) Author-Date System (Harvard and American Psychological Association (APA) systems)
ii) Numeric System (e.g. the Vancouver style). The MLA (Modern Language Association) style is sometimes grouped here, although it cites sources in the text by author and page number rather than by number.
Choose the referencing system appropriate to your research report, because universities differ in the
systems they require.
2.4.1 Importance of References
References are used to:
• Enable the reader to locate the sources you have used;
• Help support your arguments and provide your work with credibility;
• Show the scope and breadth of your research;
• Acknowledge the source of an argument or idea. Failure to do so could result in a charge of plagiarism.
2.4.2 Reference List
Full references for the sources you have used should be listed at the end of your work as a reference list. This
list is usually arranged alphabetically by author.
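If your references are stored as plain text entries, alphabetical ordering can be sketched as follows. The function name sort_reference_list is an assumption for illustration, and the approach simply compares the start of each entry, ignoring capitalisation, so it works only when every entry begins with the author (or organisation) name. The shortened entries are taken from the Harvard examples later in this lecture.

```python
def sort_reference_list(entries):
    """Return reference entries in alphabetical order, ignoring case.

    Assumes each entry starts with the author's surname (or the
    organisation name when there is no personal author).
    """
    return sorted(entries, key=str.casefold)


references = [
    "Morris, C. (2003). Quantitative approaches to business studies.",
    "Berman Brown, R. and Saunders, M. (2008). Dealing with statistics.",
    "Mintel Marketing Intelligence (1998). Designer wear.",
]

for entry in sort_reference_list(references):
    print(entry)
```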
2.4.3 Plagiarism
Plagiarism is the submission of an item of assessment containing elements of work produced by
another person(s) in such a way that it could be assumed to be your own work.
Examples of plagiarism are:
• the verbatim copying of another person’s work without acknowledgement
• the close paraphrasing of another person’s work by simply changing a few words or altering the
order of presentation without acknowledgement
• the unacknowledged quotation of phrases from another person’s work and/or the presentation of
another person’s idea(s) as one’s own.
Plagiarised work may be from a published source such as a book, report, journal or material available
on the internet.
2.4.4 What Should You Include in a Reference
For each reference you make in a reference list or bibliography, it is essential that you record the following
pieces of information so that you can keep track of all your references:
• Authors/editors
• Year of publication
• Title
• Edition
• Publisher
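The pieces of information listed above can be kept as a simple record and turned into a formatted reference when needed. The following Python sketch is illustrative only: the class BookReference, the function names, and the extra place field (place of publication) are assumptions made for the example, and the output layout follows just one of the Harvard variants, so check it against the guide your institution uses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BookReference:
    authors: List[str]              # already written as "Surname, Initials"
    year: int
    title: str
    publisher: str
    place: str = ""                 # assumption: place of publication
    edition: Optional[str] = None   # e.g. "6th edn"; None for a first edition


def join_authors(authors):
    """Join author names with commas and a final 'and'."""
    if len(authors) == 1:
        return authors[0]
    return ", ".join(authors[:-1]) + " and " + authors[-1]


def format_harvard_book(ref):
    """Build one reference-list entry for a book (one Harvard variant)."""
    parts = [f"{join_authors(ref.authors)} ({ref.year}).", f"{ref.title}."]
    if ref.edition:
        parts.append(f"({ref.edition}).")
    imprint = f"{ref.place}: {ref.publisher}" if ref.place else ref.publisher
    parts.append(f"{imprint}.")
    return " ".join(parts)


book = BookReference(
    authors=["Morris, C."],
    year=2003,
    title="Quantitative approaches to business studies",
    publisher="Financial Times Pitman Publishing",
    place="London",
    edition="6th edn",
)
print(format_harvard_book(book))
```

Recording the details once in this way means the same record can later be rendered in a different style without retrieving the source again.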
2.4.5 How to Collect and Organise References
It is often not easy (or possible) to retrieve sources after you have written your text. For this reason it
is best to keep a good record of everything that you use. Bibliographic software, such as Endnote,
Procite or Reference Manager, will help you organise your references according to different citation
systems and to add the citations to your text. Alternatively, you could store your references on index
cards.
Start your references section at the beginning of the writing process and add to it as you go along.
Ensure that the reference section lists every source to which you have referred in the text.
Ensure that all data and material taken verbatim from another person’s published or unpublished
written or electronic work are explicitly identified and referenced to their author. This also applies to
work that is referred to in the written work of others, even if the material is not quoted exactly.
2.4.6 Author Date Referencing Styles
2.4.6.1 The Harvard Style
According to Neville (2007), there are variations within the Harvard style, including:
• Name(s) of authors or organisations may or may not be in UPPER CASE
• Where there are more than two authors, the names of the second and subsequent authors may or
may not be replaced by et al. in italics.
• The year of publication may or may not be enclosed in brackets
• The title of the publication may be in italics or may be underlined
Examples:
I) Books and chapters in books
• Book (first edition)
Berman Brown, R. and Saunders, M. (2008). Dealing with statistics: What you need to know.
Maidenhead: Open University Press.
• Book (other than first edition)
Morris, C. (2003). Quantitative approaches to business studies. (6th edn). London: Financial Times Pitman
Publishing.
• Book (no obvious author)
Mintel Marketing Intelligence (1998). Designer wear: Mintel marketing intelligence report. London:
Mintel International Group Ltd.
• Chapter in a book
Robson, C. (2002). Real World Research. (2nd edn). Oxford: Blackwell. Chapter 3.
Tuckman, A. (1999) Labour, skills and training. In: R. Levitt et al., eds. The reorganised National
Health Service. 6th ed. Cheltenham: Stanley Thornes, pp. 135-155.
• Chapter in an edited book containing a collection of articles
King, N. (2004). Using templates in the thematic analysis of text. In C. Cassell and G. Symon (eds)
Essential guide to qualitative methods in organizational research. London: Sage. pp. 256-270.
• Books
Kadolph, S.J. (2007) Textiles, 10th ed. New Jersey: Pearson Prentice Hall.
• Books with two or three authors
Li, X. and Crane, N.B. (1993) Electronic style: a guide to citing electronic information. London:
Meckler.
• Books with more than three authors
Levitt, R. et al. (1999) The reorganised National Health Service. 6th ed. Cheltenham: Stanley
Thornes.
II) Other sources
• Journal article (originally printed but same as found on line)
Storey, J., Cressey, P., Morris, T. and Wilkinson, A. (1997). Changing employment practices in UK
banking: case studies. Personnel Review. Vol.26, No. 1, pp. 24-42.
• Journal article only published online
Illingworth, N. (2001). The Internet matters: exploring the use of the internet as a research tool.
Sociological Research Online, Vol. 6, No. 2. Available at
http://www.socresonline.org.uk/6/2/illingworth.htm [Accessed 14th May 2002].
• Magazine article (no obvious authors)
Quality World (2007). Immigration abuse. Quality World. Vol. 33, No. 12, p. 6.
• Papers in conference proceedings
Gibson, E.J. (1977) The performance concept in building. In: Proceedings of the 7th CIB Triennial
Congress, Edinburgh, September 1977. London: Construction Research International, pp. 129-136.
• Publication from corporate body (e.g. Government publication)
Great Britain. Department of the Environment, Development Commission (1980) 38th Report, 1st
April 1979 to 31st March 1980. London: HMSO, 1979-80 HC. 798, pp. 70-81.
• Newspaper article where the author is identified
Kikwete, J.K. (2010). Time for Tanzania to rid itself of corrupt leaders. This Day, Tuesday March 1st
2010, p. 1.
• Thesis
Tregear, A.E.J. (2001) Speciality regional foods in the UK: An investigation from the perspectives of
marketing and social history. Unpublished PhD thesis. University of Newcastle upon Tyne.
III) Electronic Sources
• Websites
National electronic Library for Health, 2003. Can walking make you slimmer and healthier? (Hitting
the headlines article) [Online] (Updated 16 Jan 2005) Available at: http://www.nhs.uk.hth.walking
[Accessed 10 April 2005].
• Publications
Scottish Intercollegiate Guidelines, 2001. Hypertension in the elderly. (SIGN publication 20)
[internet] Edinburgh : SIGN (Published 2001) Available at: http://www.sign.ac.uk/pdf/sign49.pdf
[Accessed 17 March 2005].
• E-mail correspondence
Available at: http://gog.defer.com/2004_07_01_defer_archive.html [Accessed 7 July 2005].
• Electronic books (e-books)
Grahame, K. (1917). The wind in the willows. NetLibrary (online). Available at
http://www.netlibrary.com [Accessed 14th July 2005].
• Article in electronic journals
Bright, M. (1985). 'The poetry of art', Journal of the History of Ideas, 46(2), pp. 259-277. JSTOR
(online). Available at: http://uk.jstor.org/ [Accessed 16th June 2005].
IV) Other types of documents
• Acts of Parliament
Higher Education Act 2004. (c.8), London: HMSO.
For Acts prior to 1963, the regnal year and parliamentary session are included:
Road Transport Lighting Act 1957. (5&6 Eliz. 2, c.51), London: HMSO.
• Statutory Instruments
Public Offers of Securities Regulations 1995. SI 1995/1537, London: HMSO.
• Command Papers and other official publications
Royal Commission on civil liability and compensation for personal injury, 1978. (Pearson Report)
(Cmnd. 7054) London: HMSO.
Select Committee on nationalised industries (1978-9). Consumers and the nationalised industries:
prelegislative hearings (HC 334 of 1978-9) London: HMSO.
http://libweb.anglia.ac.uk/referencing/harvard.htm
• Law report
R v White (John Henry) [2005] EWCA Crim 689, 2005 WL 104528.
Jones v Lipman [1962] 1 WLR 832.
Saidi v France (1994) 17 EHRR 251, p.245
• Annual report
Marks & Spencer, 2004. The way forward, annual report 2003-2004, London: Marks & Spencer
• For an e-version
Marks & Spencer, 2004. Annual report 2003-2004. [Online]
Available at: http://www-marks-and-spencer.co.uk/corporate/annual2003/ [Accessed 4 June 2005]
N.B. the URL should be underlined
• Map
Ordnance Survey, 2006. Chester and North Wales. Landranger series, Sheet 106, 1:50000,
Southampton: Ordnance Survey.
• Pictures, Images and Photographs
Beaton, C., 1956. Marilyn Monroe. [Photograph] (Marilyn Monroe’s own private collection).
Beaton, C., 1944. China 1944: A mother resting her head on her sick child's pillow in the Canadian
Mission Hospital in Chengtu. [Photograph] (Imperial War Museum Collection).
• Electronic reference
Dean, Roger, 2008. Tales from Topographic Oceans. [electronic print] Available at:
http://rogerdean.com/store/product_info.php?cPath=48&products_id=88 From home
page/store/calendar/august [Accessed 18 June 2008].
V) Unpublished works
• Unpublished works
Woolley, E. & Muncey, T., (in press) Demons or diamonds: a study to ascertain the range of attitudes
present in health professionals to children with conduct disorder. Journal of Adolescent Psychiatric
Nursing. (Accepted for publication December 2002).
• Informal or in-house publications
Anglia Ruskin University, 2007. Using the Cochrane Library. [Leaflet]
• Personal communications
O’Sullivan, S., 2003. Discussion on citation and referencing [Letter] (Personal communication, 5
June 2003).
• Unpublished conference papers
Saunders, M.N.K., Thornhill, A. and Evans, C. (2002). Conceptualising trust and distrust and the role
of boundaries: an organisationally based exploration. Unpublished paper presented at ‘EIASM 4th
Workshop on Trust Within and between organisations’. Amsterdam, 25-26 Oct. 2007.
• Internet site
European Commission (2007). Eurostat-structural indicators. Available at
http://epp.eurostat.ec.europa.eu/portal/page?_pageid=1133_47800773,1133_47802558&_dad=portal&_schema=PORTAL
[Accessed 27 Nov. 2007]
• Internet reports and guides
Browne, L. and Alstrup, P. (2006). What exactly is the Labour Force Survey? Available at
http://www.statistics.gov.uk
2.4.6.2 APA Style
The APA style guide prescribes that the reference section, bibliographies and other lists of names
should be ordered alphabetically by surname, and mandates the inclusion of surname prefixes. For example,
"Martin de Rijke" should be sorted as "de Rijke, M." and "Saif Al-Falasi" should be sorted as "Al-Falasi, S."
Examples:
1. Published Materials
Book by one author
• Sheril, R. D. (1956). The terrifying future: Contemplating color television. San Diego, CA:
Halstead.
Book by two authors
• Kurosawa, J., & Armistead, Q. (1972). Hairball: An intensive peek behind the surface of an enigma.
Hamilton, Ontario, Canada: McMaster University Press.
Chapter in an edited book
• Mcdonalds, A. (1993). Practical methods for the apprehension and sustained containment of
supernatural entities. In G. L. Yeager (Ed.), Paranormal and occult studies: Case studies in
application (pp. 42–64). London, England: OtherWorld Books.
Dissertation (PhD or masters)
• Mcdonalds, A. (1991). Practical dissertation title (Unpublished doctoral dissertation). University of
Florida, Gainesville, FL.
Article in a journal with continuous pagination (nearly all journals use continuous pagination)
• Rottweiler, F. T., & Beauchemin, J. L. (1987). Detroit and Narnia: Two foes on the brink of
destruction. Canadian/American Studies Journal, 54, 66–146.
• Kling, K. C., Hyde, J. S., Showers, C. J., & Buswell, B. N. (1999). Gender
differences in self-esteem: A meta-analysis. Psychological Bulletin, 125, 470–500.
Article in a journal paginated separately
• Crackton, P. (1987). The Loonie: God's long-awaited gift to colourful pocket change? Canadian
Change, 64(7), 34–37.
Article in a weekly magazine
• Henry, W. A., III. (1990, April 9). Making the grade in today's schools. Time, 135, 28–31.
Article in a weekly magazine with DOI
• Hoff, K. (2010, March 19). Fairness in modern society. Science, 327, 1467-1468.
doi:10.1126/science.1188537
Article in a print newspaper
• Wrong, M. (2005, August 17). "Never Gonna Give You Up" says Mayor. Toronto Sol, p. 4.
2. Electronic sources
Online article based on a print source, with DOI (e.g., a PDF of a print source from a database)
• Krueger, R. F., Markon, K. E., Patrick, C. J., & Iacono, W. G. (2005). Externalizing
psychopathology in adulthood: a dimensional-spectrum conceptualization and its implications for
DSM-V. Journal of Abnormal Psychology, 114, 537-550. doi:10.1037/0021-843X.114.4.537
Online article based on a print source, without DOI (e.g., a PDF of a print source from a
database)
• Marlowe, P., Spade, S., & Chan, C. (2001). Detective work and the benefits of colour versus black
and white. Journal of Pointless Research, 11, 123–127.
Online article from a database, no DOI, available ONLY in that database (proprietary
content--not things like Ovid, EBSCO, and PsycINFO)
• Liquor advertising on TV. (2002, January 18). Retrieved from
http://factsonfile.infobasepublishing.com/
OR
• Liquor advertising on TV. (2002, January 18). Retrieved from Issues and Controversies database.
Article in an Internet-only journal
• McDonald, C., & Chenoweth, L. (2009). Leadership: A crucial ingredient in unstable times. Social
Work & Society, 7. Retrieved from http://www.socwork.net/2009/1/articles/mcdonaldchenoweth
Article in an Internet-only newsletter (eight or more authors)
• Paradise, S., Moriarty, D., Marx, C., Lee, O. B., Hassel, E., . . . Bradford, J. (1957, July). Portrayals
of fictional characters in reality-based popular writing: Project update. Off the Beaten Path, 7.
Retrieved from http://www.newsletter.offthebeatenpath.news/otr/complaints.html
Article with no author identified
• Britain launches new space agency. (2010, March 24). Retrieved from
http://news.ninemsn.com.au/technology/1031221/britain-launches-new-space-agency
Article with no author and no date identified (e.g., wiki article)
• Harry Potter. (n.d.). In Wikipedia. Retrieved March 12, 2010, from
http://en.wikipedia.org/wiki/Harry_Potter
Entry in an online dictionary or reference work, no date and no author identified
• Verisimilitude. (n.d.). In Merriam-Webster's online dictionary (11th ed.). Retrieved from
http://www.merriam-webster.com/dictionary/verisimilitude
E-mail or other personal communication (cite in text only)
• (A. Monterey, personal communication, September 28, 2001)
Book on CD
• Nix, G. (2002). Lirael, Daughter of the Clayr [CD]. New York, NY: Random House/Listening
Library.
Book on tape
• Nix, G. (2002). Lirael, Daughter of the Clayr [Cassette Recording No. 1999-1999-1999]. New
York, NY: Random House/Listening Library.
Movie
• Gilby, A. (Producer), & Schlesinger, J. (Director). (1995). Cold comfort farm [Motion picture].
Universal City, CA: MCA Universal.
3. Statistical expressions in APA
Note on Probabilities
There are two ways to report statistical probability: a pre-specified probability given as a
range below the chosen alpha level, and an exact probability given as a calculated p-value. Since most
statistical packages calculate an exact value for p, the Publication Manual
recommends that exact p-values should be reported.
• Example: p < .05
• Example: p = .031 (preferred)
Exceptions, where a pre-specified probability range may be preferred, include large or complex tables
of correlations or when the p-value is particularly small (e.g., p < .001).
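As a rough illustration of the two reporting forms, the Python sketch below formats a calculated probability as an exact value and falls back to the range form only when the value is very small. The function name format_p, the choice of three decimal places and the cut-off of .001 are assumptions made for the example; follow the rules of the journal or institution you are writing for.

```python
def format_p(p):
    """Format a probability in the APA manner, without a leading zero.

    Exact values are given to three decimals (an assumption); values
    below .001 are reported as the range "p < .001".
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a probability must lie between 0 and 1")
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


print(format_p(0.031))    # prints: p = .031
print(format_p(0.0004))   # prints: p < .001
```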
Reporting F-tests
General format: F([df-between], [df-within]) = [F-obtained], p = [p-value], [eta-squared obtained] =
[value].
• Example: F(2, 50) = 9.35, p < .001, η² = .03.
If a p-value is not significant, then the letters ns are substituted, or the precise p-value
is substituted prefaced by an equals sign.
• Example: F(2, 50) = 1.35, ns.
• Example: F(2, 50) = 1.35, p = .18. (preferred)
If an F-value is less than 1, thereby implying that it can never be statistically significant, then neither
the F-value itself, nor the associated p-value, is reported.
• Example: F(2, 50) < 1.
• Example: F < 1.
Reporting t-tests
General format: t([df error]) = [t-obtained], p = [p-value], [Cohen's d obtained] = [value].
• Example: t(9) = 2.35, p = .043, d = .70.
Reporting χ² tests
General format: χ²([df error], N = [total sample size]) = [Chi-squared obtained], p = [p-value].
• Example: χ²(4, N = 24) = 12.4, p = .015.
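The three general formats above can be filled in mechanically once the statistics have been calculated. The Python sketch below is one possible way of doing so; the function names, the two-decimal rounding of test statistics and the three-decimal rounding of p are assumptions made for illustration, not requirements stated in this lecture.

```python
def _p(p):
    """Exact p without a leading zero, or the range form when very small."""
    return "p < .001" if p < 0.001 else "p = " + f"{p:.3f}".lstrip("0")


def _drop_zero(value, places=2):
    """Format a value such as 0.70 as '.70' (values of 1 or more are unchanged)."""
    text = f"{value:.{places}f}"
    return text[1:] if text.startswith("0.") else text


def report_f(df_between, df_within, f_value, p, eta_squared):
    return (f"F({df_between}, {df_within}) = {f_value:.2f}, {_p(p)}, "
            f"η² = {_drop_zero(eta_squared)}.")


def report_t(df_error, t_value, p, cohens_d):
    return f"t({df_error}) = {t_value:.2f}, {_p(p)}, d = {_drop_zero(cohens_d)}."


def report_chi2(df, n, chi2_value, p):
    return f"χ²({df}, N = {n}) = {chi2_value:.2f}, {_p(p)}."


print(report_f(2, 50, 9.35, 0.0004, 0.03))
print(report_t(9, 2.35, 0.043, 0.70))
print(report_chi2(4, 24, 12.4, 0.015))
```

With these inputs the first two lines reproduce the F-test and t-test examples given above; the χ² line differs only in showing the statistic to two decimals (12.40).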
2.5 Appendices
This is a supplement to the research report, dissertation or thesis. It should not normally include
material that is essential for the understanding of the report itself, but additional relevant material in
which the reader may be interested.
Appendices:
• Should be kept to a minimum
• Should hold material that is interesting to know rather than essential to know
• Should include a blank copy of your questionnaire, interview or observation schedule. Where these
were administered in a language different from that of your submitted research
report, you will need to submit both the original version and the translation.
In documents, the term appendix may refer to:
• Addendum, any addition to a document, such as a book or legal contract
• Bibliography, a systematic list of books and other works
• Index, a list of words or phrases with pointers to where related material can be found in a document
• Specifically, a text added to the end of a book or an article, containing information that is important
to, but is not the main idea of, the main text
2.6 Bibliography
This is an alphabetical list of bibliographic details for all relevant items consulted and used, including
those items not referred to directly in the text.
Bibliographies have the following formatting conventions:
• The first author’s name is inverted (last name first), and most elements are separated by periods.
• Entries have a special indentation style in which all lines but the first are indented.
• Entries are arranged alphabetically by the author’s last name, or by the first word of the title if no
author is listed.
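The three conventions above (inverted first author, hanging indentation, alphabetical order) can be sketched in Python as follows. This is a minimal illustration for printed books only; the function names, the four-space indent and the 72-character line width are assumptions, and the name inversion assumes that the last word of a name is the surname.

```python
import textwrap


def invert(name):
    """Turn 'John P. Kotter' into 'Kotter, John P.' (last word = surname)."""
    given, _, surname = name.rpartition(" ")
    return f"{surname}, {given}" if given else surname


def bibliography_entry(authors, title, place, publisher, year, width=72):
    """Build one book entry with only the first author's name inverted."""
    first, rest = invert(authors[0]), authors[1:]
    if not rest:
        names = first
    elif len(rest) == 1:
        names = f"{first}, and {rest[0]}"
    else:
        names = f"{first}, " + ", ".join(rest[:-1]) + f", and {rest[-1]}"
    if not names.endswith("."):
        names += "."
    text = f"{names} {title}. {place}: {publisher}, {year}."
    # Hanging indent: every line after the first is indented.
    return textwrap.fill(text, width=width, subsequent_indent="    ")


entries = [
    bibliography_entry(["John P. Kotter", "James L. Heskett"],
                       "Corporate Culture and Performance",
                       "New York", "Free Press", 1992),
    bibliography_entry(["David A. Garvin"],
                       "Operations Strategy: Text and Cases",
                       "Englewood Cliffs, NJ", "Prentice-Hall", 1992),
]

# Arrange alphabetically by the (inverted) first author's last name.
for entry in sorted(entries, key=str.casefold):
    print(entry)
```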
Examples
1. Books (Printed)
One author
Footnote
10 David A. Garvin, Operations Strategy: Text and Cases (Englewood Cliffs, NJ: Prentice-Hall,
1992), p. 73.
Bibliography
Garvin, David A. Operations Strategy: Text and Cases. Englewood Cliffs, NJ: Prentice-Hall, 1992.
2. Two authors
Footnote
11 John P. Kotter and James L. Heskett, Corporate Culture and Performance (New York: Free Press,
1992), p. 101.
Bibliography
Kotter, John P., and James L. Heskett. Corporate Culture and Performance. New York: Free
Press, 1992.
3. Three authors
Footnote
12 John W. Pratt, Howard Raiffa, and R.O. Schlaifer, Introduction to Statistical Decision Theory
(Cambridge: MIT Press, 1995), p. 45.
Bibliography
Pratt, John W., Howard Raiffa, and R.O. Schlaifer. Introduction to Statistical Decision Theory.
Cambridge: MIT Press, 1995.
4. Unpublished material
Footnote
31 Sarah Dodd, “Transnational Differences in Entrepreneurial Networks,” paper presented at the
Eighth Global Entrepreneurship Research Conference, INSEAD, Fontainebleau, France, June 1998.
Bibliography
Dodd, Sarah. “Transnational Differences in Entrepreneurial Networks.” Paper presented at the Eighth
Global Entrepreneurship Research Conference, INSEAD, Fontainebleau, France, June 1998.
2.7 Review exercise 1
1. Seven publications of various formats are described below. Try expressing the reference
details for each in the Harvard style and putting them into alphabetical order in a reference list.
• A book with the title: 'Occupational health and safety', published in Sydney in 2004 by
McGraw-Hill, with authors M. Stewart and F. Heyes. This is the second edition.
• A book with the title: 'Internal control and corporate governance', with authors K. Adams,
R. Grose, D. Leeson and H. Hamilton, published in Frenchs Forest, NSW by Pearson Education
Australia in 2003.
• An article by M. Scardamalia and C. Bereiter, called 'Schools as knowledge-building
organizations', published in 1999 in a book edited by D. Keating and C. Hertzman, called 'Today's
children, tomorrow's society' in New York by Guilford as pages 274 to 289.
• An article by J. R. Savery and T. M. Duffy, called 'Problem based learning: an instructional
model and its constructivist framework', published on pages 31 to 38 in the journal 'Educational
Technology', volume 35, number 5, in 1995.
• An article called 'Integration and thematic teaching: integration to improve teaching and
learning' by S. Lipson, S. Valencia, K. Wixson and C. Peters, published in 1993 in the journal
'Language Arts', volume 70, number 4, pages 252 to 263.
• A video recording of a television documentary called 'Embers of the sun', produced in 1999
by the Australian Broadcasting Corporation in Sydney.
• A Web page with the title 'Telstra conferencing - video overview', found at the address:
http://www.telstra.com.au/conferlink/videoconf.htm on 11 August 2004. No date on it, though
Mozilla gives a last modified date of 4 July 2004.
2.8 Review exercise 2
• Read the following passage, then choose an appropriate reference from the list below and fill
in the gaps.
• Make sure that the reference you choose makes the sentence grammatical AND makes
sense AND fits the surrounding sentences. In other words, the overall
meaning of the passage must make sense.
• Provide reasons why you chose to select or not select the particular phrases.
(….......................1995, p.6) children between the ages of five and eight who are repeatedly exposed
to violent films are highly likely to commit some form of crime associated with physical violence.
However, …...............................states such a claim is more emotive than reasoned in nature. Citing
the lack of research to support this claim................................, that why the children are watching such
movies in the first place is a far more pressing question that society needs to address.
Phrase | Reason for selecting/not selecting
As Jones states |
According to Jones |
Jones |
Smith (1998, p. 9) |
According to Smith (1998, p. 9) |
Smith (p. 9) argues |
2.9 References
Anglia Ruskin University (2008). Guide to the Harvard Style of referencing. University Library.
http://libweb.anglia.ac.uk [Accessed on 26th August, 2010]
Harvard Business School (2009). Citation guide 2009-2010 academic year.
http://intranet.hbs.edu/dept/drfd/caseservices/styleguide.pdf [Accessed on 26th August, 2010]
Shayo, H.E. (2010). Beginners referencing resource. Handbook for young researchers. The Open
University of Tanzania.
Saunders, M. N. K., Lewis, P. & Thornhill, A. (2009). Research methods for business students. (5th edn).
Harlow: Prentice Hall – Financial Times.