RESEARCH METHODOLOGY FOR POSTGRADUATE STUDENTS

Prepared by:
DIRECTORATE OF RESEARCH, PUBLICATIONS AND POSTGRADUATE STUDIES
THE OPEN UNIVERSITY OF TANZANIA
DECEMBER, 2010

TABLE OF CONTENTS

MODULE ONE
INTRODUCTION TO RESEARCH
LECTURE ONE
1.1 Introduction
1.2 Learning Outcomes
1.3 What is Research?
1.4 What is a Theory?
1.5 Importance of Researches
1.6 The Link Between Theory and Research
1.7 Research Hypothesis and Theory
1.8 Deduction and Induction Theory Development Approaches
1.9 Types and Strategies of Researches
1.10 Summary
1.11 Review exercise
1.12 References
LECTURE TWO
2.1 Introduction
2.2 Learning outcomes
2.3 Developing a Research Topic and Problem
2.4 Planning for the Research Project
2.5 The Research Development Process
2.6 Research Ethics
2.7 Summary
2.8 Review Exercise
2.9 References
LECTURE THREE
3.1 Introduction
3.2 Learning outcomes
3.3 Research Proposal
3.4 Summary
LECTURE FOUR
4.1 Introduction
4.2 Learning outcomes
4.3 Research Methodology
4.4 Time Frame and Budget table and Ways of Making Decisions
4.5 References
4.6 Appendices
4.7 Summary
4.8 Review exercise
  Machi, L. A. & McEvoy, B. T. (2008). The Literature Review: Six Steps to Success. Corwin Press

MODULE TWO
LITERATURE REVIEW AND REFERENCING
LECTURE ONE
1.1 Introduction
1.2 Learning outcomes
1.3 Literature sources
1.4 Importance of conducting literature search effectively
1.5 Planning your literature search strategy
1.6 Plagiarism
1.7 Summary
1.8 Review exercise
1.9 References
LECTURE TWO
2.1 Introduction
2.2 Learning outcomes
2.3 What is critical literature review?
2.4 Importance of critical literature review
2.5 Writing a critical review
2.6 Structure of the critical review
2.7 Identification of research gaps (Concluding the literature review)
2.8 Conceptual and theoretical frameworks
2.9 Summary
2.10 Review exercise
2.11 References

MODULE THREE
RESEARCH DESIGN AND DATA COLLECTION METHODS
LECTURE ONE
1.1 Introduction
1.2 Learning outcomes
1.3 Need for research design
1.4 Features of a good research design
1.5 Types of Research Designs
1.6 Summary
1.7 Review exercise
1.8 References
LECTURE TWO
2.1 Introduction
2.2 Learning outcomes
2.3 Quantitative Data Collection Methods
2.3.1 Sources of data
2.3.2 Steps in data collection
2.3.3 Need for correct sampling
2.3.4 Methods of data collection
2.3.4.2 Collection of primary data
2.4 Qualitative Data Collection Techniques
2.4.1 Observational and quasi-observational techniques
2.4.2 Projective techniques
2.4.3 In-depth interviews
2.4.4 Direct Observation
2.4.5 Standardized tests
2.4.6 Case studies
2.5 Other important aspects in data collection
2.5.1 Data storage
2.5.2 Ethical issues/considerations in data collection
2.5.3 Challenges faced by researchers in data collection
2.6 Summary
2.7 Review exercise
2.8 References

MODULE FOUR
DATA ANALYSIS METHODS
LECTURE ONE
1.1 Introduction
1.2 Learning outcomes
1.3 Basic ideas about data analysis and presentation
1.4 Methods of quantitative data analysis
1.5 Review exercises
1.6 Summary
1.7 Additional review exercises
1.8 References
LECTURE TWO
2.1 Introduction
2.1 Learning outcomes
2.2 Procedures for processing and displaying of qualitative data
2.3 Drawing and verifying conclusions
2.4 Reporting qualitative data
2.5 Further strategies for testing or confirming qualitative findings to prove validity
2.5 Summary
2.6 References
LECTURE THREE
3.1 Introduction
3.2 Learning outcomes
3.3 Data analysis by computer
3.4 Summary
3.5 References

MODULE FIVE
RESEARCH REPORT WRITING
LECTURE ONE
1.1 Introduction
1.2 Learning outcomes
1.3 Rationale for Report Writing
1.4 How to Get Started
1.5 Preliminary Considerations
1.6 Types of Research Reports
1.7 Components of Research Report
1.8 Writing Style
1.9 Layout of the Report
1.10 Drafts
1.11 Review exercises
LECTURE TWO
2.1 Introduction
2.2 Learning outcomes
2.3 Citation
2.4 References
2.4.1 Importance of References
2.4.2 Reference List
2.4.3 Plagiarism
2.4.4 What should you include in reference
  For each reference you make in a reference list or bibliography, it is essential that you record various pieces of information so that you keep track of all your references
2.4.5 How to Collect and Organise References
2.4.6 Author Date Referencing Styles
2.4.6.1 The Harvard Style
2.4.6.1 APA STYLE
2.5 Appendices
2.6 Bibliography
2.7 Review exercise 1
2.8 Review exercise 2
2.9 References

MODULE ONE
INTRODUCTION TO RESEARCH

LECTURE ONE
RESEARCH PHILOSOPHY
(By Dr. M. Kitula)

1.1 Introduction
In this lecture, you will learn about various issues related to the philosophy of research. The lecture provides you with the basic facts about research. It starts with definitions of research and theory. You will also learn about the importance of research, the relationship between theory and research, the features of a theory, and theory testing, which includes deduction and induction.

1.2 Learning Outcomes
At the end of the lecture, you should be able to:
i) Define the concepts of research and theory
ii) Explain the importance of research
iii) Explain the components and features of a theory
iv) Discuss the link between theory and research
v) Discuss the theory testing procedure (deductive and inductive analysis)

1.3 What is Research?
Research is a means of getting answers to our questions. It entails following a framework of principles based on procedures, methods and techniques that have been tested for validity and reliability and that are designed to be objective and unbiased. Once these requirements are adhered to, one can say that research is being done. Research is therefore not necessarily an activity that involves complicated procedures such as complex statistics and computers. However, no matter how simple the research may be, it has to adhere to the research procedure to demonstrate the difference between a research and a non-research activity.

Various people have defined research in many different ways. Some of the definitions are presented hereunder.

Kerlinger (1986) defined research as a scientific, systematic, controlled, empirical and critical investigation of propositions about the presumed relationships among various phenomena.
Woody, C. (1990) defined research as an activity which comprises defining and redefining problems, formulating hypotheses or suggested solutions, collecting, organizing and evaluating data, making deductions and reaching conclusions.

On the other hand, Ngechu (1998) defined research as a logical, purposeful, formal and critical activity, as a systematic step-by-step process, and as a method of science which identifies a problem, gathers data, and analyses and interprets the data, leading to a conclusion and/or to further research questions.

It can generally be said that research is a scientific process of identifying an issue or problem, collecting information, processing data and reporting the findings. The scientific process addresses issues or problems of concern in the community or society at large, seeks to prove concepts as theories, and aims to discover new things such as a cure for a disease or new industrial products.

1.3.1 Features and Characteristics of a Research
The definitions of research imply that research must have specific features. That is, research has to be scientific, logical and systematic, and must have a plan. Research is scientific because it is logical and systematic, has a plan for collecting data, and has a theory that guides it. It is logical because it follows a path of reasoning which is analytical and rational, and it is systematic because it follows a specific procedure.

An activity of inquiry qualifies to be called research if it has certain standard characteristics. These characteristics are many, but only six of them are mentioned here, as presented by Kumar, R. (2005). He says that research must be controlled, systematic, rigorous, valid and verifiable, empirical, and critical.

1.4 What is a Theory?
A theory is a set of ideas or opinions that explains the way things are and/or why they exist.
These ideas and/or opinions have to be explained systematically and scientifically, based on facts and rules related to the phenomena in question. A theory has various components, but there are three most important ones. These are concepts, variables and statements which, when verified, form theories (Turner, 1996).

1.5 Importance of Researches
Research is a means of getting answers to our questions. Both practical and theoretical problems are solved as research describes situations, tests hypotheses, explores new ways of thinking through rational decisions, and generates new ideas. Research thus performs three tasks: to describe, to explore and to explain. Based on this explanation, research is important in various ways, which include:
a. Generating new knowledge and theories
b. Providing solutions to various societal problems and concerns
c. Developing new methods of inquiry through experience with the methods used in investigation
d. Providing peaceful methods of challenging authorities on issues of concern to citizens
e. Educating the population on issues that touch on their welfare and development in general

Given the importance of research as indicated above, the main users of the findings are planners and policy makers, who see to it that development activities are undertaken effectively while addressing the recommendations of research findings. Other groups which use research findings are academicians, as they always seek new knowledge and search for facts that help to prove the authenticity of theories through the deductive process; students in higher learning institutions; and owners of industries, who constantly need to change, improve or create new products in order to keep pace with market demands.

1.6 The Link Between Theory and Research
The day-to-day life we live is basically dictated by theories that have led to sets of principles that guide our way of living.
The activities which we routinely and automatically do involve the process of creating things, ideas and opinions, applying concepts in practice, and evaluating what we do. Kolb (1991) said that an event that occurs leads an individual to provide explanations of how and why something happened in the way it did. These explanations or statements lead to the generation of concepts or guiding principles, or even hypotheses, that can be extrapolated to new events. A learner then applies the guiding principle in real life as part of the way of life. Through this application, the extrapolated guiding principles are tested. In the testing process, whether the rule was received or was generated out of prior experience and reflection, a new situation is created and new experiences arise.

The cyclic process described above, as narrated by Kolb (1991), demonstrates a process of developing a theory. The evaluative explanatory statements built around what is going on around us provide a certain trend of facts that gives strength to developing concepts which pave the way to theories, for example a statement such as “Street children are a consequence of broken marriages”.

1.7 Research Hypothesis and Theory
The terms theory and hypothesis are in most cases used interchangeably. This is so because they are closely related. A theory consists of hypotheses developed to explain a particular phenomenon. Each hypothesis makes an assertion about the relationships among variables of several concepts. In view of this, when related concepts are put together, they build up a complete logical statement which we can call a theory. Concepts are like building blocks: when put together, they make a house, which is a theory. Building up a theory entails following a logical model of scientific inquiry, of which there are mainly two.
These are the deductive and inductive theory-making approaches.

1.8 Deduction and Induction Theory Development Approaches

1.8.1 Deduction Reasoning Approach
This is a method in which a researcher identifies theories or set principles (laws) and then goes through a logical, systematic and scientific process to test the authenticity of the theory or set principles. This type of procedure goes hand in hand with the left-hand side of Kolb’s learning cycle. An example of a theoretical statement that may require a deductive approach of inquiry could be stated as follows: “Short people are always argumentative to draw attention for recognition and psychological satisfaction”.

In undertaking a deductive approach of inquiry, there are crucial stages which are followed. These include having in place a theory or concepts of interest, which have to be operationalized. This is followed by laying down the rules for making observations and then carrying out the process, that is, the actual investigation. It is at this stage that clear and specific guidance about what to observe, and how to observe the variables during the investigation, is established, thus creating a standardized procedure. Once the investigation is done and the analysis is completed, the findings are compared with the theory statement on which the investigation was based, as a test of whether the theory is authentic or has lost its worth. The deduction process is shown in the figure below.

[Figure showing the elements THEORIES, HYPOTHESIS, OBSERVATIONS and EMPIRICAL GENERALIZATION, with the labels DEDUCTION and INDUCTION]
Figure 1.1.1: Deductive and Inductive Methods
Source: Wallace (1971)

1.8.2 Induction Reasoning Approach
The approach is to start with an idea which has to be consolidated, and then to move from observation of the empirical world to the construction of explanations, which leads to theory building.
It is the opposite of the deduction approach and belongs to the right-hand side of Kolb’s learning cycle. Glaser and Strauss (1967) stated that theories become worthless if they are not grounded in observation and experience, hence the birth of grounded theory. In this approach, theories develop inductively out of systematic empirical research and are grounded in people’s own voices, views, opinions and actual lives.

1.9 Types and Strategies of Researches

1.9.1 Types of Researches
There are two main categories of research. These are Social Science and Natural Science research. Within these two categories there are two main types of research: Basic and Applied/Action research. These two main types in turn have branches, which are based on their application, their purpose, the orientation of the principal researcher (i.e. the practitioner), the nature of the data being collected, and the related methods of data collection.

1.9.1.1 Basic and Applied/Action Researches
Basic research addresses questions which seek more knowledge, while Applied/Action research searches for information which addresses operational problems. The characteristics of both Basic and Applied/Action research can be seen in Table 1.1.1 below.

Table 1.1.1: Characteristics of Basic and Applied Researches
S/N | Basic Research | Applied/Action Researches
i. | Deals with researches which advance knowledge. | Identifies existing societal problems and seeks solutions to the problems.
ii. | Seeks to solve theoretical dilemmas. | Seeks to obtain operational information, skills, knowledge and attitudes.
iii. | Labours to prove the authenticity of existing theories and to develop new ones. | Evaluates projects, programmes, plans and activities; monitors systems and identifies immediate problems that require immediate solutions.

1.9.1.2 Branches of Research Types
The various branches of research types have titles which are related to their application, their purpose, the orientation of the principal researcher (for example, a doctor or an engineer), and the nature of data collection. The different branches include:
(a) Theoretical/Pure Research
This is concerned with solving theoretical puzzles which scholars have identified or created within their disciplines to extend knowledge.
(b) Evaluative Research
This deals with the evaluation of the performance of programmes in various fields such as education, health and business, and with accountability for expenditures.
(c) Action Researches
This type of research is carried out collaboratively by members of various organizations, for example hospitals and the manufacturing companies whose products the hospitals consume.
(d) Critical and Feminist Researches
These seek the truth about assumptions that are taken for granted. Such issues include gender inequality and oppressive patriarchal systems.
(e) Qualitative and Quantitative Researches
Qualitative and quantitative methods complement each other.
(f) Formative Researches
Practitioners who are professionals, such as lecturers, doctors and engineers, seek to improve their practice and generate new findings.
(g) Participatory Researches
These are based on the theory of Paulo Freire, which focuses on the knowledge of learners and seeks to enhance the ability of the poor to generate and control their own knowledge, and to plan and participate in implementation.
(h) Summative Research
This is similar to evaluative research. It aims at evaluating projects or programmes when activities are being wound up.

1.9.2 Strategies of Researches
Strategies of research concern the style of the research process. The style is, in turn, controlled by various factors, which include the size and scope of the coverage and the type of data being sought. Other factors may include funding and time limitations.
The commonly used strategies are presented below:

(a) Experimental Strategy
This strategy is used when researchers want to establish causal connections between variables.

(b) Survey Strategy
This is used in studies of large samples that aim at producing generalizations about populations.

(c) Case Study Strategy
This is applied in studies which examine a single instance of some broader class of phenomena in order to generate a rich and thorough understanding of the situation.

(d) Ethnography Strategy
This is used for research that involves full participation among community members for the purpose of observing the behaviour, social structure and relations of the people in their natural settings.

(e) Action Research Strategy
This is used in research that aims to promote changes in organized social practices and to develop knowledge of these changes through various processes and practices.

(f) Longitudinal Research Strategy
This involves carrying out research at different times to check on indicators of change or trend in the study subject. The study can be repeated on the same issue every year, or every five years.

(g) Cross-Sectional Research Strategy
This involves studying a broad range of issues at one single point in time. An example of such research is a study of the levels of migration, mortality and fertility in a district or region at a particular time.

1.10 Summary

In this lecture you have learned the definitions of two concepts, research and theory. Research is defined as a scientific, systematic, controlled, empirical and critical investigation of propositions about presumed relationships among various phenomena. A theory is a set of ideas or opinions which explain the way things are and/or why they exist. These ideas and/or opinions have to be explained systematically and scientifically, based on facts and rules related to the phenomena in question.
You also learned about the characteristics and importance of research: research adds knowledge and solves societal problems, while theory guides research. You were also introduced to the two main models of scientific inquiry, which are the deductive and inductive models.

Lastly, you learned about the types and strategies of research. You were informed that all research is either natural or social science research. You were further informed that there are two main types of research, which are basic and applied/action research. These main types have various branches. The strategies of research covered were the experimental, survey, case study, ethnography, action research, longitudinal and cross-sectional strategies.

1.11 Review exercise

(i) Discuss the various definitions of research given by various authors. In your opinion, how would you define research?
(ii) Differentiate between a theory and a hypothesis. What is the role of concepts in developing a theory?
(iii) Discuss the deductive and inductive models of scientific inquiry.
(iv) What is the difference between research types and research strategies? Explain.

1.12 References

• Kumar, R. (2005). Research Methodology: A Step by Step Guide for Beginners. New Delhi: Sage Publications India Pvt Ltd. Pp. 6-14.
• Kothari, C. R. (2010). Research Methodology: Methods and Techniques. New Age International Limited Publishers. Pp. 1-24.
• Gill, J. and Johnson, P. (2002). Research Methods for Managers. London: Sage Publications. Pp. 13-45.
• David, M. and Sutton, C. D. (2004). Social Research: The Basics. London: Sage Publications. Pp. 3-70.

LECTURE TWO
RESEARCH PROCESS AND RESEARCH ETHICS (By Dr. M. Kitula)

2.1 Introduction

In lecture one you learned about two main research concepts, which are research and theory.
You also learned about the importance of research, the characteristics of research, the link between research and theory, the models of scientific inquiry, and the types and strategies of research. In this lecture, you will learn about the research process and research ethics.

2.2 Learning outcomes

At the end of this lecture, you should be able to:
• Explain how one starts to develop an idea for a research topic
• Narrate plans to develop a research proposal
• Write a research proposal
• Discuss issues related to research ethics

2.3 Developing a Research Topic and Problem

Developing a topic for a research project can be a problematic exercise. There may be many related ideas in the current literature; the idea may be so popular that you cannot be sure your topic will not duplicate studies already done that you are not aware of; and sometimes the idea you have is still hazy and not yet specific. For inexperienced young researchers, it can involve a great deal of floundering around and sleepless nights spent trying to figure out exactly what the topic and its related research problem should be. Floundering around is inevitable before you establish a topic. But as you go through this process, you should be aware of the various sources of research topics in order to get more exposure.

2.3.1 Avenues for Sourcing Research Topics

Many ideas can lead to the formulation of a research topic. Those with more exposure, such as academicians, staff in research institutes and policy makers, have an advantage over others because they come across a wide range of literature and encounter ideas through conferences, speech writing for politicians and the like. However, every researcher can access sources of research topics through various means, as indicated by Gill and Johnson (2002).
i) A research topic can be sourced from the workplace, through your personal interaction with documents and issues raised by colleagues
ii) A research topic can be sourced from academic documents and professional journals
iii) A research topic can be sourced through advertisements in the media, news broadcasts and/or statements by authorities
iv) A research topic can be sourced from reports of conferences or public speeches
v) A research topic can be sourced from reports of projects and programmes

In addition to what Gill and Johnson suggested, a research topic can be sourced from reading literature in the library. Gay (1992) also said that a research topic can evolve from personal interest in an issue; that is, an individual can create his/her own topic and conduct research on it.

2.3.2 Procedure to Follow in Generating a Research Topic

Topics for research can be many, as they can arise from speeches given in public or in the media, or from issues of concern within your community or workplace. In order to come up with a topic that has the characteristics of a good research topic, you need to do the following:
• Brainstorm over the various ideas you have in mind. You can even share your ideas with friends to identify the topics that interest you
• List the various topics you have identified
• Analyse the listed topics and, using the characteristics of a good research topic, choose the most viable one as your research topic.

2.3.3 Features of a Good Research Topic

A research topic must have features which qualify it as a viable topic for a systematic and logical search for information. Before a researcher goes on to prepare a plan for the research project, consideration has to be given to whether the topic has the qualities of a research topic.
Various scholars have identified characteristics of a good research topic. Gill and Johnson (2002) indicated some of these characteristics, which state that:
• A research topic should allow access to the places where information can be obtained
• A research topic should have achievable objectives
• A research topic should be capable of attracting funding agencies
• A research topic should have value, in the sense that its findings will add knowledge and/or provide solutions to community problems
• A research topic should have symmetry of potential outcomes; that is, it should be able to produce valuable results despite the risks that a research project can face.

2.4 Planning for the Research Project

A plan for the research project has to be established after a researcher has decided on a viable research topic. A plan for implementing the project is what we call a research proposal. In a research proposal, the researcher translates the bright ideas into a statement which frames the real problem that the researcher will address in the project. The ideas are further translated into a set of research aims and objectives. The plan which you prepare (the research proposal) makes the project easier both to implement and to manage, as every activity is planned together with an indication of how it will be carried out. This is why the research proposal is important in the research project.

2.5 The Research Development Process

The research development process has many stages, depending on the type of research, the style and the orientation of the field of study. Kothari (2010) suggested seven steps of the research process while Kumar (2005) suggested eight. The two do not differ much, and therefore I shall list Kumar’s steps and present a chart of the steps by Kothari. Either of the two sets of steps can be used as preferred.

2.5.1 Kumar’s (2005) Eight Steps of Research Process

These include:
• Formulating a research problem
• Conceptualizing a research design
• Constructing an instrument for data collection
• Selecting a sample
• Writing a research proposal
• Collecting data
• Processing data
• Writing a research report

2.5.2 Kothari’s (2010) Seven Steps of Research Process

The chart presenting the steps shows that the researcher has to define the research problem, review the literature, formulate the hypothesis, design the research, collect data, analyse data, and interpret and report.

[Figure 1.2.1: Steps of Research Process. Source: Kothari (2010). The chart shows seven steps linked by feedback (F) and feed-forward (FF) arrows: I Define research problem; II Review the literature (review concepts and theories; review previous research findings); III Formulate hypothesis; IV Design research (including sample design); V Collect data (execution); VI Analyse data (test hypothesis if any); VII Interpret and report. F = feedback (helps in controlling the sub-system to which it is transmitted); FF = feed forward (serves the vital function of providing criteria for evaluation).]

2.6 Research Ethics

Ethical issues in research touch on all stakeholders of research: the respondents, the researcher, the funding agencies and the users. Institutions which deal with research, including higher learning institutions, are very sensitive to issues of ethics in relation to research. To ensure that researchers observe ethical issues, institutions normally have policy forms that commit researchers to observing ethics in the questions used, in the samples taken (in basic/experimental research), in interviews, and in the use of information obtained from respondents, so that confidentiality is observed.

2.6.1 Respondents and Data collection

With regard to respondents, researchers are obliged to inquire into the ethical issues that concern their respondents before they design the type of questions to ask them, in order to avoid offending them.
They also have to be aware of the people’s culture so that they can cope with the cultural practices of the people at the time they are collecting data.

2.6.2 Seeking Consent from Respondents

Researchers are required to seek consent from respondents before they start collecting data. It is unethical to collect data without the consent of the respondents and their respective leaders.

2.6.3 Providing Incentives to Respondents

Some researchers provide incentives to respondents for providing information. Some consider this unethical. Providing incentives to respondents might lead them to provide information which is exaggerated in order to impress the researcher. In other cases, some respondents.

2.6.4 Seeking Sensitive Information from Respondents

Researchers may sometimes need information which is sensitive or confidential to respondents. Questions about such information could offend, upset or embarrass them, or be considered an invasion of privacy. Such information could include sexual behaviour, drug use, shoplifting, age, salary, rape, battery and the like. The researcher needs to understand all these ethical issues before embarking on data collection.

2.6.5 Confidentiality of Respondents’ Information

Information provided by respondents, some of which could be confidential, should not be revealed to any other person unless there is consent from the respondent who provided it. Revealing information to others without the consent of the provider is unethical.

2.6.6 Researchers’ Bias on Information Provided and Incorrect Reporting

Sometimes a researcher deliberately leaves out information in order to hide the facts and protect some people’s interests or his/her own. The report thus becomes distorted. This type of manipulation is unethical.

2.6.7 Misuse of Information by Sponsors/Users of Research Findings

Having received the research findings, sponsors sometimes manipulate information to present a different case, hence distorting the actual findings of the research project. Users may be tempted to do the same. Treating research findings in this manner is unethical.

2.7 Summary

You have learned about the process of research and research ethics. Avenues for sourcing research topics were discussed, which include workplaces, academic documents and professional journals, the media, authority statements and public speeches, project and programme reports, and libraries. You also learned about the procedure to follow in establishing a research topic and the features of a good topic: it has to be accessible, achievable, valuable and worthwhile enough to attract funds. The steps of the research process were also discussed as presented by both Kothari (2010) and Kumar (2005). Lastly, the ethical issues related to research were discussed. These included issues of confidentiality, consent, collection of sensitive information, bias, and misuse of information.

2.8 Review Exercise

• Discuss the causes of difficulty in establishing the research topic and problem.
• What is the rationale of research planning?
• Critique the various steps of the research process as presented by both Kothari and Kumar.
• Why is it important for the researcher to ensure there is adequate understanding of the culture and ethical issues of the study population before actual data collection? Explain.
• Explain how researchers, sponsors and users can misuse research findings.

2.9 References

• Kumar, R. (2005). Research Methodology: A Step by Step Guide for Beginners. New Delhi: Sage Publications India Pvt Ltd. Pp. 6-14.
• Kothari, C. R. (2010). Research Methodology: Methods and Techniques. New Age International Limited Publishers. Pp. 1-24.
• Gill, J. and Johnson, P. (2002). Research Methods for Managers. London: Sage Publications. Pp. 13-45.
• David, M. and Sutton, C. D. (2004). Social Research: The Basics. London: Sage Publications. Pp. 3-70.

LECTURE THREE
COMPONENTS OF RESEARCH PROPOSAL – PART 1 (By Dr. E. Swai)

3.1 Introduction

In lecture two you learned about developing an idea for your research topic. You also learned about issues related to research ethics. In this third lecture, you are introduced to the components of a research proposal. This topic is also covered in lectures four and five.

3.2 Learning outcomes

At the end of this lecture, you should be able to:
• Identify and list the components of a research proposal
• Explain each of the components of the research proposal
• Develop a research topic and statement of the research problem
• Develop a conceptual and/or theoretical framework
• Prepare a full research proposal

3.3 Research Proposal

A research proposal is a plan that guides the research project by indicating the strategy one has to follow in doing the research project. It is a detailed plan showing the title, describing the background of the research, and stating the research problem, the hypothesis/research questions, the objectives and the significance/justification of the research. A research proposal is a serious statement of intent to reach your goal. In short, the research proposal is the researcher’s roadmap, showing how to conduct the research.

Some people think of their research proposal as a clear star to guide them in a voyage of discovery; the proposal is thus used to chart a course and avoid undesired detours. Others see the proposal as similar to a proposal to live with a friend or life partner; it indicates a willingness to engage in a significant undertaking that has consequences for both parties. A good research proposal includes arrangements for check-points so that you can make changes when necessary and stay on course, given the realities of life and the requirements for quality scholarship.

3.3.1 Components of Research Proposal

There are several elements in a research proposal. The following are popular elements which researchers address while preparing research proposals.

(i) A Title/Research Title/Topic
(ii) Background or Introduction of research problem
(iii) Statement of the problem
(iv) Objectives
(v) Research Questions/Hypotheses
(vi) Justification/Rationale/Significance
(vii) Conceptual Framework and Literature Review
(viii) Research Design
(ix) Research Methods
(x) Data Analysis Methods
(xi) References
(xii) Budget and time frame
(xiii) Certification
(xiv) Appendices

3.3.2.1 A Title/Research Title/Topic

A research topic is your statement of the problem and major research questions stated in summary form. It is an open-ended phrase that contains the least number of concepts. Examples of research topics may be: Role of Motivation in Learning; Impact of Pollution on Environment; Relationship between Teaching and Learning, etc.

3.3.2.2 Background of Research Problem

The background of the research problem is the "nitty-gritty" of the body of your research proposal. It is an extensive explanation of the background to the problem, which includes sufficient information about the problem. The background reflects your scholarship and shows evidence of thorough research on your topic. It also establishes the social significance of your study (the ‘who cares’ factor) and demonstrates that the problem is worth the expenditure of effort, time and resources. A good background describes the problem in qualitative and quantitative terms, showing its nature and scope. It shows precisely what the problem is, what the social concerns are, and how widespread the problem is. In the background of your research, identify the groups of persons who are affected and who are likely to care about the matter, and tell us why you think we should care.
Individuals reading this section of your work may disagree with your conclusions, but they should get a good sense of what you think the concerns are, in both their nature and scope.

3.3.2.3 Statement of the Problem

The statement of the problem briefly describes the problem which the research project is addressing and the reasons for interest in the topic. It clarifies a question about the problem that has not yet been researched. Researchers who want to be sure that their research problem is feasible use a checklist to confirm that the project is viable. Table 1.3.1 shows the checklist used to test research problem feasibility.

Table 1.3.1: Checklist for testing the feasibility of the research problem

No. | ITEMS | YES | NO
1 | Is the research problem of current interest? | |
2 | Will the research results have social, educational or scientific value? | |
3 | Will it be possible to apply the results in practice? | |
4 | Does the research contribute to the science of education? | |
5 | Will the research open up new problems and lead to further research? | |
6 | Is the research problem important? Will you be proud of the result? | |
7 | Is there enough scope left within the area of research (field of research)? | |
8 | Can you find an answer to the problem through this research? Will you be able to handle the research problem? | |
9 | Will it be practically possible to undertake the research? | |
10 | Will it be possible for another researcher to repeat the research? | |
11 | Is the research free of any ethical problems and limitations? | |
12 | Will it have any value? | |
13 | Do you have the necessary knowledge and skills to do the research? Are you qualified to undertake the research? | |
14 | Is the problem important to you and are you motivated to undertake the research? | |
15 | Is the research viable in your situation? Do you have enough time and energy to complete the project? | |
16 | Do you have the necessary funds for the research? | |
17 | Will you be able to complete the project within the time available? | |
18 | Do you have access to the administrative, statistical and computer facilities the research necessitates? | |

3.3.2.4 Research Purpose and Objectives

The research purpose shows the aim of the research project. The purpose is similar to the general objective of the research, as both present the overall statement of the aim of the project. The objectives of the research, on the other hand, present the specific goals within that purpose, using action words (verbs) to indicate exactly what the researcher is supposed to do in order to get the required information. An objective is a statement showing the activities the researcher carries out to reach the goal, and it comes after you have identified your research problem.

Both the research purpose and the objectives derive from the statement of the problem, which is like the heart of the research project. Having a research objective without the aid of a problem statement is like starting a journey with a very vague idea of where you want to go. Examples of research objectives are as follows:

i) To investigate the performance level of OUT graduands in the labour market.
ii) To examine the adequacy of the sociology curriculum in producing marketable output.
iii) To analyze the effectiveness of the lecturers’ motivational strategies at OUT aimed at ensuring retention of staff.
iv) To investigate whether there is any difference in learning between the ODL mode and the conventional mode of teaching.

The objectives of the research project normally have three qualities. The objectives have to be researchable; that is, they enable data to be collected on time without using excessive resources. The objectives have to be measurable; that is, the information expected to be collected can be analysed using specific criteria, not necessarily numerical.
Lastly, the objectives have to be specific and/or concrete so as to show the specific targets and strategies of the research.

3.3.2.5 Research Questions/Hypothesis

(a) The Hypothesis
This is a statement which shows the predicted outcome. It is a guess which gives direction to the researcher’s thinking about the possible outcome of the research, based on the reviewed literature and related theory. It is a predicted answer to the problem. The hypothesis is normally stated in the form of a declarative sentence which is testable, so that the statement can be shown to be correct or wrong. It is categorized as either a null or an alternative hypothesis.

(b) Research Questions
These are used to guide the study. They are used instead of hypotheses, especially in inductive research. A researcher interested in establishing why the number of youth who smoke has increased despite high awareness of the consequences can decide to use research questions instead of a hypothesis. On the other hand, there are questions which are used to collect data and are derived from the specific objectives, to get information that answers a specific objective. These are activity-oriented questions. Activity-oriented questions are often confused with research questions, which are normally used in social science research instead of hypotheses.

It is important to develop research questions that are analytical rather than descriptive. What do analytical questions look like? Analytical questions move beyond the "what" and explore the "how" and the "why." "How do teachers in secondary schools motivate their students?" is analytical because it will lead you to explore not only the activities used to motivate students, but also how teachers carry them out. A descriptive question would ask: "What do teachers in secondary schools do to motivate their students?"

3.3.2.6 Justification/Rationale/Significance

The three concepts, justification, rationale and significance, are normally used interchangeably.
These concepts establish the social significance of the study (the ‘who cares’ factor). The justification of your research demonstrates that the expenditure of effort, time and resources is worth it, not only for you but also for others. You should indicate and defend why it is necessary to undertake the research, showing the benefits that will result from it. For example, the justification for research on "Role of Motivation in Learning" may be: An understanding of the role of motivation in learning is an important step towards the creation of a culture of learning. A culture of learning can additionally promote critical thinking and can reduce dependency on witchcraft and mysticism. The significance of a study refers to the uses to which the findings are put.

3.3.2.7 Conceptual Framework and/or Theoretical Framework

(a) Conceptual Framework
A concept is a word or a phrase which symbolizes several interrelated ideas and meanings (Strauss and Corbin, 1998). A conceptual framework is the broader idea of a research project, containing the key concepts and issues which a researcher wants to explore in the study. It is the basic structure of the research, consisting of certain abstract ideas and concepts that a researcher wants to observe, experiment on or analyze. When you connect these abstract concepts, you develop a conceptual framework. Framing the research project conceptually means placing boundaries around it, thereby ruling in and ruling out certain lines of thought. The "frame" is the road map or outline that guides the exploration. In developing a conceptual framework, the researcher chooses terms that connect the work to existing literature. Developing a conceptual framework is a creative endeavor. It involves thinking deeply and widely about all the issues and topics you want to explore. The conceptual framework may be depicted as a funnel similar to the one described by Marshall and Rossman (1999, p. 29).
The broad end of the funnel (the mouth) represents the broadest idea comprising the key issue (motivation). Intermediary ideas (effective learning) are represented in the middle portion of the funnel. The narrow end represents the seed, or the gap in the literature (the relationship between motivation and learning) (Figure 1.3.1).

[Figure 1.3.1: Conceptual Framework: Motivation and Learning. A research funnel exploring the relationship between motivation and effective learning, with three levels from broad to narrow: Defining Motivation; Measuring Effective Learning; Exploring the relationship between Motivation and Effective Learning.]

(b) Theory and Theoretical Framework
Strauss and Corbin (1998) define theory as a set of well-developed concepts related through statements of relationships, which together constitute an integrated framework that can be used to explain or predict phenomena. A theory is more than a set of well-developed concepts; it offers an explanation of phenomena. A theoretical framework is a collection of interrelated concepts. Unlike a conceptual framework, which tries to find connections between the concepts, a theoretical framework comprises the various theories that you will use to explain issues in your study. A theoretical framework helps you determine what things and issues you will measure, and what statistical relationships you will look for. Developing a theoretical framework means theorizing your research. The term "theorizing" denotes choosing terms that connect your work to existing literature. It also entails formulating concepts and putting them into a logical, systematic and explanatory scheme. Developing a theoretical framework is a complex activity. It involves exploring an idea fully, considering it from many different angles or perspectives. It also involves thinking deeply about the implications of certain theories that have been used to explain the issues.
Framing the research project theoretically means making decisions about, and acting in relation to, many questions throughout the research process. The process of developing a theoretical framework is as important as the product, as it sharpens researchers’ senses by forcing them to reveal, critique and defend the biases and assumptions they hold about the issue(s) they are exploring. In developing the theoretical framework(s), they must scrutinize theories and explanations that challenge their own biases and assumptions. This exercise allows researchers to reveal or bracket, as much as possible, their presuppositions regarding the phenomena under investigation. The process also introduces uncertainty and doubt regarding what is being investigated and how best researchers can investigate it. While developing the theoretical framework, researchers become more and more self-critical, and this self-criticality prepares them to look for and welcome surprises as the investigation unfolds. A sample theoretical framework is provided in Figure 1.3.2.

[Figure 1.3.2: Sample Theoretical Framework. Predictor variables: Individual Domain (motivation; theories of motivation), Program Domain (teacher characteristics; students’ activities), Classroom Interactions (peer pressure). Dependent variable: Teaching and Learning.]

The goal at this phase of the study is to develop an instrument (a theoretical framework) that will sharpen and expand your ability to observe, describe and explain your issue. It is this goal which also guides the literature review. The theoretical framework is also supposed to help the researcher make logical sense of the relationships among the variables and factors that have been deemed relevant or important to the problem. It defines the relationships between all the variables so that the reader can understand the theorized relationships between them.

3.3.2.8 Literature Review

The role of the literature review is to expose the researcher to the problem he/she is addressing. It familiarizes researchers with the history of the problem, how it began, the consequences related to it, how other researchers have addressed it, the theories that have been formulated in relation to it, and the gaps which are yet to be bridged in order to solve it. The literature sought in order to enrich the researcher’s knowledge and build better ideas and concepts about the problem of interest has no boundaries. It can be sought electronically from all continents, because societal problems are cross-cutting; and even when problems are localized, researchers can do research in other continents and publish findings internationally. While reviewing the literature, the researcher has to observe what other researchers have said about the problem, what related research has been done previously, what type of findings have been produced, and what gaps are emerging in the body of existing research.

3.4 Summary

In this lecture, you have learned about the various components of a research proposal. You were introduced to definitions of each of the components. The background is an extensive explanation of the background to the research problem. The statement of the problem briefly describes the problem which the research project addresses. The purpose and objectives were also discussed. The purpose gives the general aim while the objectives provide the specific issues to be addressed. The hypothesis is a predictive statement that tries to answer the problem, while research questions guide the study and are used by social scientists in place of the hypothesis. Justification shows the worthiness of the project while significance refers to the use of findings.
The conceptual framework symbolizes several interrelated ideas and meanings, while the theoretical framework is a collection of interrelated concepts.

LECTURE FOUR
COMPONENTS OF RESEARCH PROPOSAL – PART 2
(By Dr. E. Swai)

4.1 Introduction

In Lecture 3, you learned about the components of a research proposal. You were introduced to the definitions of background to the research, problem statement, purpose and objectives, research questions/hypothesis, justification of research, and conceptual and theoretical framework. In this lecture, you will learn about further components of a research proposal, which include Research Design, Data Collection Methods, Data Analysis Methods, References, Budget and Time Frame, and Certification.

4.2 Learning outcomes

At the end of this lecture, you should be able to:
• Explain the meanings of various research components related to research methodology
• Choose an appropriate research design and methods for a research project
• Use appropriate data analysis methods
• Write references appropriately
• Prepare a time frame and related budget, and Certification.

4.3 Research Methodology

Research methodology is often a large section of the research proposal. It comprises, in addition to other parts, Research Design, Methods of Collecting Data and Data Analysis. How the methodology is applied depends on how one understands it and how to use it, and on the characteristics of the specific approach selected.

4.3.1 Research Area

This indicates all the possible areas that are supposed to be included in the study. However, the researcher has to indicate the actual areas which can be reached during field data collection, as not all areas can be reached due to various limitations.

4.3.2 Research Population

The research population refers to the elements of the research, that is, the elements which are going to be involved in the study: respondents if the study involves people, or animals, insects, plants, etc.
The research population statement indicates who the elements are and then provides an indicative percentage, representative of the whole study population, which shall be directly involved in the study.

4.3.3 Sampling Techniques

These are techniques used to identify a representative study area and study population. There are various sampling techniques, which involve some mathematical calculations. These include simple random sampling, cluster sampling, purposive sampling, systematic sampling, stratified sampling, etc.

4.3.4 Research Design

The research design shows the type of research and the strategies to be used. It has three common elements: the research area, the research population and the sampling techniques to be applied after you have shown the type of research and the research strategies to be followed. Both the researcher and the research advisory committee members need to be convinced that the selected design is the most suitable for the intended research. The research design is your general plan of how you will go about answering your research questions (Saunders, Lewis and Thornhill, 2009). It is an overall plan of your research. Research design is the structure of your research: how you want your research to look. Like the structure of a house, research design is influenced by your own preferences, taking into account your own capacity, time and the other resources needed to construct your research.

4.3.5 Methods of Data Collection

Methods of collecting data include the specific instruments and procedures used to collect information. In historical and documentary research, for instance, a careful list of sources is expected, together with the reasons why some are included while others are rejected. In quantitative research, the sample size and recruitment are expected to be justified against particular standards. In observational naturalist inquiry, the decisions taken to select what will be observed, when, where and how need description.
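
Three of the sampling techniques named in section 4.3.3 (simple random, systematic and stratified sampling) can be illustrated with a short sketch. The code below is only a minimal illustration in Python, not a prescribed procedure: the function names, the sampling frame of 100 respondents, the "region" attribute and the 10 percent sampling fraction are all hypothetical, and the sample size is assumed not to exceed the population size.

```python
import random
from collections import defaultdict


def simple_random_sample(population, n, seed=0):
    """Draw n elements so that every element has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(population, n, seed=0):
    """Pick a random start, then take every k-th element, where k = N // n."""
    k = len(population) // n            # sampling interval (assumes n <= N)
    start = random.Random(seed).randrange(k)
    return [population[start + i * k] for i in range(n)]


def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction from every stratum (proportional allocation)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in population:
        strata[stratum_of(element)].append(element)
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical sampling frame: 60 urban and 40 rural respondents.
frame = [{"id": i, "region": "urban" if i < 60 else "rural"} for i in range(100)]

srs = simple_random_sample(frame, 10)
systematic = systematic_sample(frame, 10)
stratified = stratified_sample(frame, lambda r: r["region"], 0.10)
```

With this hypothetical frame, the stratified sample contains 6 urban and 4 rural respondents, mirroring the 60/40 split of the population, whereas the simple random and systematic samples do not guarantee that balance. Fixing the seed makes the draw repeatable, which helps when the sampling procedure has to be described and justified in the proposal.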
There are two main approaches for data collection. These are the inductive and deductive approaches. Whichever approach one uses, the methods of collecting data are more or less similar. The methods include questionnaires, interview schedules, Focus Group Discussions (FGD), observation, participatory data collection, etc.

4.3.6 Methods of Data Analysis

Here the researcher explains in detail what she/he will do with the data: how the data will be organized, inspected, entered in statistical programmes, analyzed, compared and interpreted. In this section the researcher drafts possible tables and charts that will be used to present possible relationships between several variables or categories. The researcher also explains the appropriate statistical tests, the relevant analytical processes and the possible points of comparison or juxtaposition of paradoxes that will explicate what is going on in the data.

4.4 Time Frame and Budget Table and Ways of Making Decisions

4.4.1 The Time Frame

This is the section that specifies who is doing what, when and where. It also states the costs in money and time, the check-points for evaluating how the research is progressing, the problems encountered and the changes required. Thinking about time and ways of making decisions, especially a mid-way checking process, cannot be ignored. Sometimes writing up a budget covering the costs of preparing surveys, transcription costs, honorariums for participants and the time needed for observations is helpful in clarifying the feasibility of sampling. In all research projects, there can be serious disagreements or unexpected problems. There may also be pleasant surprises, such as finding an unexpected source of excellent data. It is necessary to think about what processes can be used to resolve concerns, respond to opportunities and revise methodological decisions.
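
The costing exercise mentioned above (survey preparation, transcription, honorariums) can be sketched as a simple line-item tally checked against the funds available. This is only an illustrative sketch in Python: the item names, unit costs, quantities and the funding ceilings are hypothetical figures, not recommended rates.

```python
from dataclasses import dataclass


@dataclass
class CostItem:
    name: str
    unit_cost: int   # shillings per unit (hypothetical figure)
    quantity: int


def total_cost(items):
    """Sum unit cost times quantity over all line items."""
    return sum(item.unit_cost * item.quantity for item in items)


def is_feasible(items, ceiling):
    """Return True when the planned costs fit within the available funds."""
    return total_cost(items) <= ceiling


# Hypothetical line items for a small survey-based study.
items = [
    CostItem("Survey printing", unit_cost=500, quantity=200),
    CostItem("Interview transcription", unit_cost=20_000, quantity=15),
    CostItem("Participant honorariums", unit_cost=10_000, quantity=30),
]

planned = total_cost(items)                    # 700,000 shillings
fits_large_grant = is_feasible(items, 1_000_000)
fits_small_grant = is_feasible(items, 500_000)
```

If the planned total exceeds the available funds, as it does for the smaller hypothetical ceiling, the sample size or the number of interviews can be revisited before the proposal is finalised.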

4.4.2 The Budget

Depending on the sponsor's preferred format, the budget may be incorporated within the body of the proposal, submitted as a separate document, or contained within an attachment or referenced appendix. It should include a definitive line-item budget for all direct costs, and for administrative or indirect costs unless these are prohibited by the sponsor. The extent of individual cost items should match the scope of the project, reflect real or estimated cost burdens, and not be padded. Each major cost item should be accompanied by a narrative explanation of the basis of the costs, free of jargon. Some sponsors may require cost contributions, either "in-kind" or in real shillings, to be explicitly identified. For a multiple-year project, a detailed budget sheet should be provided for each year, plus a consolidated or summary budget page totalling all cost categories.

4.5 References

A working bibliography is essential for writing a satisfactory proposal. It is the foundation for the completed project or thesis. An example of the referencing style shall be provided in the coming lectures.

4.6 Appendices

Appendices may include:
• Drafts of submissions to ethics;
• A guide to interviews;
• A specific questionnaire or instrument;
• Draft tables indicating how data may be analyzed; and
• Letters of support of a proposal.

Another useful appendix is:
• A proposed table of contents for the completed project or thesis.

Some proposals have few appendices, as the students and their committees decide to spend their time clarifying particular details when an ethics review is actually submitted, or drafting the table of contents after data analysis has begun.

4.7 Summary

In this lecture, the definitions of each of the research components related to research methodology were provided. You were also exposed to different research designs, which include sampling techniques. You were exposed briefly to types and strategies of research and to the research approaches.
You were also exposed to different types of research. You also learned about research data collection methods, data analysis methods, the time frame, the budget and how to write references.

4.8 Review exercise

• What are three key elements of research methodology?
• What will be the best design for your research project? In one paragraph, explain why you think the design you chose is the best considering the purpose of your research.
• There are two major methods of collecting data for a research project. What are they? In half a page, explain in specific terms how you plan to collect data for your research, paying particular attention to research participants, specific instruments and the procedure that you will follow to collect information.
• Prepare a budget and time frame for your research.

4.9 References

Galvan, J. L. (2003). Writing Literature Reviews: A Guide for Students of the Social and Behavioral Sciences. Pyrczak Publishing.
Machi, L. A. and McEvoy, B. T. (2008). The Literature Review: Six Steps to Success. Corwin Press.
Marshall, C. and Rossman, G. B. (1999). Designing Qualitative Research (3rd ed.). Thousand Oaks: Sage Publications.
Saunders, M., Lewis, P. and Thornhill, A. (2009). Research Methods for Business Students. London: FT/Pitman.
Strauss, A. and Corbin, J. (1998). Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory. London: Sage Publications.
Van Manen, M. (1997). Researching Lived Experience: Human Science for an Action Sensitive Pedagogy. SUNY Series in the Philosophy of Education.

MODULE TWO
LITERATURE REVIEW AND REFERENCING

LECTURE ONE
SOURCES OF LITERATURE
(By Dr. P. Ngatuni)

1.1 Introduction

One of the problems we observe in proposals submitted by many of our students is the lack of authoritative information to support their arguments. This is a signal that, while information is much more available nowadays, our students are unable to locate and interact with the sources.
Consequently, this lecture is designed to introduce you to the sources of literature and to how you can interact with them to get what you are looking for effectively.

1.2 Learning outcomes

At the end of this lecture you will be able to:
• Identify and describe the range of primary, secondary and tertiary literature sources available;
• Plan and carry out a literature search effectively;
• Identify sources of literature appropriate to your literature needs and develop an understanding of the need to know how they work;
• Identify keywords and undertake a literature search using a range of methods, including the internet;
• Evaluate the relevance, value and sufficiency of the literature found;
• Describe what plagiarism is and identify various measures to avoid it.

1.3 Literature sources

Before you can search for information relevant to your research project, you must first have clear knowledge of the available literature sources. Knowing the sources and their nature helps you determine the appropriate approach or technique for interacting with them. For example, some sources may only be available in print form, say in the cabinets of some responsible offices or in libraries, while others may be available in electronic form, e.g. on CD-ROMs or in some kind of database accessible through the internet, or in both forms. Looking in the library for information that is only available electronically (and vice versa) may therefore not help you much.

The literature sources available to help you develop a good understanding of, and insight into, previous research can be divided into three categories: primary, secondary and tertiary sources. Figure 2.1.1 shows these categories. It is important to note that the different sources indicated in this figure do overlap, and that is what happens in reality.

Figure 2.1.1: Literature sources available
Source: Adapted from Saunders et al. (2009: 69), Figure 3.2

The different categories of literature sources represent the flow of information from the original source. Saunders et al. (2009: 69) argue that as information flows from primary to secondary and eventually to tertiary sources it becomes (i) less detailed and (ii) less authoritative, but (iii) more accessible. This is mainly because primary literature sources can be difficult to trace. The arrows in Figure 2.1.1 show this flow. As you move from tertiary sources backwards to primary sources, you are moving towards the original ideas, and therefore the level of detail should increase. On the other hand, as you move from primary sources to tertiary sources, the time to publish increases and the level of detail decreases.

This kind of information flow is typical of traditional printed materials. With the current move towards electronic publication, thanks to the power of the internet, the situation is changing rapidly, resulting in more direct means of both publishing and accessing information. With this move, even what was grey literature, e.g. government publications, is increasingly being made available via the internet. This is perhaps the main reason why a greater part of this lecture will focus on electronic sources.

1.3.1 Primary literature sources

Primary literature sources are the first occurrence of a piece of work. They include such sources as reports, some central and local government publications (e.g. white papers and planning documents) and unpublished manuscripts. Unpublished manuscripts include sources such as theses/dissertations, conference proceedings, company reports, letters, memos, e-mail messages, committee meeting minutes, etc. These primary sources are sometimes referred to as grey literature, simply because they are difficult to locate.
Again, with the advent of the internet, an increasing number of these sources are now being made available online. Of these, the most accessible, and the most likely to be of use in showing how your research relates to that of other people, are reports, conference proceedings and theses/dissertations.

(a) Reports

Reports are documents produced by various organisations such as consultancy or advisory firms and government departments or agencies, as well as by individuals or academics. Access to reports may be difficult because (i) they are not as widely available as books and (ii) they are not well indexed in the tertiary literature, especially in developing countries like Tanzania. For these reasons, you will need to inquire extensively to learn of their existence and their location. Even if you are able to locate them, you may find it difficult to gain access. Some of the organisations which produce them are increasingly being challenged to sustain themselves, and given the high cost of producing primary data, most of these producers charge access fees, some of which may be prohibitive to individual researchers. The National Bureau of Statistics (NBS), for example, is the Central Statistical Office of Tanzania and is charged with conducting censuses and surveys which yield a wide range of economic, social and demographic statistics [1]. NBS produces the Central Register of Enterprises (CRE), a useful resource for company research, but a CD-ROM version of this resource costs slightly above 400,000 shillings (2010 price).

Individual academics are also increasingly publishing reports and their research on the internet. Although these can be useful sources of information, their use requires assessment of the authenticity of the material as well as the authority of the author. This is mainly because these reports may not have gone through the same review and evaluation process as journal articles or books.

(b) Conference proceedings

Conference proceedings, sometimes referred to as symposia, are often published as unique titles within journals or as books. Since most conferences are organized with a specific theme or a wide range of themes in mind, they make a very useful source of information relevant to your research project, so long as you can locate them. However, conference proceedings are not well indexed in the tertiary literature. To locate them you may need specific search tools such as an index to conference proceedings, a library public catalogue or more general search engines such as Google. It is common practice nowadays for conferences to have a dedicated webpage from which abstracts, and occasionally the full papers presented at the conference, can be obtained.

[1] http://www.nbs.go.tz

(c) Theses/dissertations

Previous dissertations and theses are unique sources of detailed information and of further references. The trouble is that they can be difficult to locate and, when found, there may be only one copy at the awarding institution, with restricted access. Specific search tools such as an index to theses and the library catalogue may be used to locate them. The other trouble is that in most cases only Ph.D. and M.Phil/MRes theses are covered well in the tertiary resources; this is less so for research undertaken as part of taught masters degrees. The University of Dar es Salaam library catalogue, for example, can be used via its website to locate electronically the theses and dissertations submitted to the University [2].

1.3.2 Secondary literature sources

Secondary sources include such sources as books and journals. These are subsequent publications of primary literature, and they are aimed at a wider audience. They are much easier to locate than the primary literature, most certainly because they are better covered by the tertiary literature.
As the number of secondary literature sources is increasing daily, access to them is also increasingly being provided via the internet. Libraries at many universities are rapidly exploring this avenue as a way of cutting down subscription costs as well as improving access. Journals and books constitute the major component of this category.

(a) Journals

Journals are also known as periodicals, serials and magazines and are published on a regular basis. Although many publishers are moving towards online publishing, accessible via the internet subject to subscription, most journals are still being published in print form [3]. Journals are also well covered by the tertiary literature. You are advised to build a database of journals that publish materials related to your research project and to make it a habit to browse these journals regularly to be sure of finding useful items. Many journals' content pages can be browsed via the internet.

[2] Hold the control key on the keyboard of your internet connected computer and click on this link: http://www.libis.udsm.ac.tz/opac/%28qh5aal45oly2y4rev3ajfa55%29/search.aspx
[3] OUT also subscribes to a number of databases containing a good number of refereed journals. Being an OUT student gives you right of access. Go to http://www.out.ac.tz/current/Library/jounal1.html for a list of these databases.

(i) Refereed journals

Refereed academic journals are the most important sources when placing your ideas in the context of earlier research. Books are likely to be more important than professional and trade journals. Refereed journals (e.g. Huria Journal, JIPE, OUT Law Journal, Journal of Finance, etc.) are journals whose articles are evaluated by academic peers prior to publication, to assess their quality and suitability. Their suitability for research projects is also enhanced by the fact that they contain detailed reports of relevant earlier research.
It is important to note that (i) not all academic journals are refereed; (ii) the articles in them are written by experts in the field, often for a narrower audience of scholars with a particular interest in the field; (iii) the language used may be technical or highly specialized because prior knowledge of the topic is assumed; (iv) articles accepted for publication in them may still need to undergo several serious revisions based on referees' comments before they actually appear in print; and (v) the relevance and usefulness of such journals vary considerably, and you may need to worry about possible personal bias.

(ii) Professional and trade journals

Professional journals are produced by organizations for their members. For example, the National Board of Accountants and Auditors (NBAA) publishes The Accountant Journal for its members. They contain a mix of news-related items and more detailed articles. Caution is required, though, as articles in these sources (i) can be biased towards their author's or organisation's views, and (ii) are more practical in nature and more closely related to professional needs than those in academic journals.

Trade journals fulfil a similar function to professional journals. They are published by trade organizations or aimed at particular industries or trades such as catering or mining. They often focus on new products or services and news items. They rarely contain articles based on empirical research, although some provide summaries of research.

(b) Books and monographs

Books and monographs are written for specific audiences. While some are aimed at the academic market and have a theoretical inclination, others are aimed at practising professionals and may be more applied in their content. Materials in books are usually presented in a more accessible manner and cover a wide range of topics.
These features make them very good introductory sources to help clarify your research questions, research objectives or research methods. With this knowledge in mind you are better able to generate appropriate key words for your search queries (see a later section for details). Remember also that books may contain out-dated material even by the time they are published.

(c) Newspapers

Newspapers are another secondary source. They are a good source of topical events and of developments within business, government and the professions, as well as of recent statistical information. For example, you can obtain exchange rates, treasury yields, and prices of ordinary shares, government securities, commodities, etc. In Tanzania, for example, this information is available in most daily papers, including "Nipashe" and "Majira", and most of them are also accessible via the publishers' websites. Caution is required here, though, because (Stewart and Kamins, 1993; cited by Saunders et al., 2009: 74) (i) newspapers may contain bias in their coverage, be it political, geographical or personal; (ii) reporting can be inaccurate; (iii) you may not pick up any subsequent amendments; and (iv) the news presented may be filtered depending on events at the time, with priority given to more headline-grabbing stories.

1.3.3 Tertiary literature sources

These are designed to help information users either (i) to locate primary and secondary literature or (ii) to be introduced to a topic. By design, therefore, they include indexes and abstracts as well as encyclopaedias and bibliographies. For example, the University of Dar es Salaam used to publish a bibliographic index, which showed the publications held by different information centres in Tanzania by title, and their coverage in terms of year of publication and/or volume. The use of these literature sources depends on your research question(s) and objectives, the need for secondary data to answer them, and the time available.

1.4 Importance of conducting literature search effectively

In the preceding section you learned about the different categories of literature sources. In this section you will learn about the importance of searching these literature sources effectively for what you are looking for. At this juncture, it is important to be aware that the world of information is loaded with tonnes and tonnes of information, whether in print or in electronic form. At the same time, you need information that is relevant to your research project, and you need to get it at the lowest possible cost in money and time. This creates the need for you to learn how to search for information effectively.

You will recall from the preceding section that literature sources can be in print or electronic. Irrespective of the form of the source, you need to know how to search for and locate it effectively. Such knowledge and ability will help you to:
• find the materials you want from amongst the tonnes of printed materials in various information centres or from the huge number of online resources available;
• make efficient use of limited library hours and space or limited access to PCs and bandwidth; and
• save time and money.

It is important to note that (i) you already have searching skills that are useful in both the print and electronic worlds; (ii) these searching skills can be enhanced by understanding how both print and electronic searching work; and (iii) searching should not be done haphazardly: it usually requires careful planning before you go to the information source and, where electronic sources are involved, before you even think of locating and switching on an internet computer terminal. The next sections will take you through the various steps in planning your search.

1.5 Planning your literature search strategy

The preceding section hinted that planning a literature search begins away from the library shelves or computer terminals. Students often find their literature search time-consuming, expensive and frustrating. Planning your search carefully not only relieves you of these difficulties but also ensures that you locate relevant, quality and up-to-date literature. The time spent on planning will be compensated by the time saved when running a clear search strategy. Writing the search strategy down helps you maintain consistency and focus on what you are looking for while at the same time exhausting the available options. Whenever possible, discuss the strategy with your research supervisor and/or the information centre manager, e.g. the librarian.

Planning a search strategy involves a series of activities. These include (i) defining your information needs; (ii) defining the parameters of your search; (iii) generating key words; (iv) deciding which sources to use; (v) finding out how they function; and (vi) designing your search queries. After these activities, a search can be carried out, followed by several follow-up activities. These include (vii) evaluating the literature obtained on the basis of relevance, sufficiency and quality; and finally (viii) refining the search strategy as appropriate until each of the criteria in (vii) is fulfilled. These activities are detailed in the following subsections. Where the search involves free online materials, e.g. Wikipedia, you will need to validate their authenticity. This will be discussed later in this lecture.

1.5.1 Defining your information needs

The first question you need to ask yourself is: what sort of information are you looking for? The answer to this question will guide you in identifying the possible sources.
For example, if you are looking for specific information like facts or dates, then reference sources such as data books, encyclopaedias, dictionaries, the Web or even textbooks are usually the best. If instead you are looking for general information such as research areas, you may need to think further and ask yourself, for example, how much information is needed and at what depth.

Another question could be: who is going to use the information? The end user's needs, e.g. those of professional researchers, academicians, or first or final year university students, might affect the level and the quality of information you require.

1.5.2 Defining the parameters of your search

Once you are clear about the sort of information you need, you must then establish the key parameters for your search. Since you will have already stated both your research objectives and your research questions, you may have a clear idea about what subject matter will be relevant for your search strategy, but be less clear about several other important parameters. Defining the parameters within which you can search for information will require you to be clear about, among other things, the language of the publications, the subject area, the business sector, the geographical area, the publication period, the literature type, etc. (Bell, 2005; cited in Saunders et al., 2009: 75). To sort this task out you may consult your lecture notes or textbooks in your research area, while at the same time taking notes about the subject areas that appear most relevant to your research question, as well as the authors' names.

Often we hear our Masters students complaining of a lack of literature on a selected topic for their term paper assignments, or claiming that there is no research done on the topic they have chosen for their dissertation project. These are quite unsafe claims, and they may be attributed to some of the search parameters being defined too narrowly.
So the advice here is to think broadly. For example, instead of thinking only of ordinary shares, think also of common stock, common equity, equity stocks, equity shares, ordinary equity and so forth, so that you increase your chances of getting literature relating to ordinary shares from different language bases.

1.5.3 Generating your key words

Key words, also known as search terms, are the basic terms that describe your research question(s) and objectives, and you will definitely need them to search the tertiary literature. This makes their identification the most important part of planning your search for relevant literature. The notes you took when re-examining your lecture notes and textbooks in the subject areas will provide a good basis for the task. The keywords can be identified using one or a number of different techniques (Table 2.1.1) in combination (Saunders et al., 2007: 71).

Table 2.1.1: Key words generating techniques

Technique: Discussion with colleagues, your research supervisor and librarian
Description: Take every opportunity to share and discuss your research with others, either face-to-face or via emails or letters. The feedback you get will help you refine your ideas and approach and generate effective key words.

Technique: Initial reading, dictionaries, thesauruses, encyclopaedias and handbooks
Description: You can improve the quality of the key words you generate by combining the results of your discussion sessions with initial readings from various recent review articles and materials like dictionaries, encyclopaedias, handbooks and thesauruses, both general and subject specific. These are also available via the internet or on CD-ROM. You can also use Google: the Google search engine, for example, offers a "define" search option by typing "define:[enter the term]". You can also use Wikipedia (http://www.wikipedia.org). These are free resources, although you may be required to authenticate the materials you adopt into your research project using refereed journal articles or textbooks.

Technique: Brainstorming
Description: Brainstorming is a helpful technique for developing research questions and objectives. It can also prove useful in generating key words. You can start with each member, individually or collectively, generating and writing down all the words and short phrases that come to mind on the research project. These can then be evaluated and the most relevant key words and phrases selected.

Technique: Relevance tree
Description: This is a more structured guide to your search process. It looks like an organisation chart and is hierarchical in nature, beginning with a main heading followed by subheadings (which may represent the research objectives and/or research questions, authors' names, etc.). The following paragraph expands a little more on the technique.

Source: Summarized from Saunders et al. (2009)

The relevance tree can be prepared after brainstorming and, according to Jankowicz (2005), cited by Saunders et al. (2007: 74), it can help you decide the following:
i) Which key words are directly relevant to your research questions and objectives;
ii) Which areas you will search first and which your search will use later; and
iii) Which areas are more important.

You can follow the steps outlined in Saunders et al.
(2009: 74) to develop your relevance tree:
• Start with your research questions and objectives at the top level
• Identify two or more subject areas that you think are important
• Further subdivide each major subject area into sub-areas that you think are of relevance
• Further subdivide the sub-areas into more precise sub-areas that you think are of relevance
• Identify those areas that you need to search immediately and those that you particularly need to focus on
• Add more areas as you continuously read and review the literature

In performing the activity above you may need to ask the following questions:
• What key words do you think will appear on the site or in the article you want?
• What key concepts is each key word a part of, or related to?
• Are there any synonyms for these keywords or concepts?
• Are there any alternative spellings for your keywords/concepts?
• Are plurals or capitalisation involved?

Example: Assume that you want to find information about "the health implications of water pollution".

Table 2.1.2: Summary of key word examples

Keywords: 'water', 'pollution' and 'health'
Concepts: 'environmental degradation' or 'agricultural management' or 'health'
Synonyms: contamination, effluence
Related to location: rivers, lakes, sea, coastal, 'domestic water', etc.
Related causes: 'oil spills', chemical, biological, toxic waste, green house gases, smog, litter, etc.
Alternative spellings: none
Plurals: river(s), lake(s), disease(s)

1.5.4 Deciding which sources to use and knowing how they work

After defining your information needs, the next thing is to decide which sources to use. You will need to assess which of the following will be most appropriate for your information need:
(i) Individuals' and organisations' home pages;
(ii) Newspapers and magazines;
(iii) Subject gateways, databases, catalogues;
(iv) Journals: titles, abstracts or full text;
(v) Reference resources, e.g. encyclopaedias, dictionaries;
(vi) Books;
(vii) Grey literature, e.g.
government publications; etc. Alongside the decision on the appropriate source, you will need to determine whether the sources are accessible in print or electronically. Knowing this will help you determine a number of things, including the kind of resources you will need, e.g. library subscription, photocopy facilities or simply internet connectivity and printers. Once you have determined the appropriate sources to use, whether in print or in electronic form, you must then find out how they work. Knowing this will help you design an approach that gets the most out of each source.

1.5.5 Conducting your search
You can use various approaches depending on the type of source identified and how it works. The following approaches are suggested in Saunders et al (2007: 74):
• Searching using tertiary literature sources
• Obtaining relevant literature referenced in books and journal articles you have read
• Scanning and browsing secondary literature in your library
• Searching using the internet

Books may not provide adequate up-to-date coverage of your research question, but some can give you a lead to further readings in addition to helping you generate key words and concepts relevant to your research objectives and questions. With such insight you can then search the tertiary sources more effectively.

Most learners are tempted to start with the internet straight away, but most of the time this approach may not take you to academic literature. To avoid that, it is advisable to begin with indexes and abstracts. These are some of the varieties of tertiary sources maintained by most university libraries, and they are also increasingly becoming accessible through the internet. An index will index articles from a range of journals and sometimes books, chapters from books, reports, theses, conferences and research.
It will contain information such as the author(s), date of publication, title, journal title, volume and part number of the journal issue, and page numbers of the article. An extension to this is the citation index, which provides a list of other authors who have cited that author’s publication. The abstract goes further by providing a summary of the article. The good thing is that many of these sources are now available electronically and, as long as your library subscribes, you can access them on campus or off campus. There is also an increasing number of institutional repositories which can be very useful (see footnote 4).

Footnote 4: See for example www.oro.open.ac.uk, a freely accessible repository maintained by the Open University UK, which is very useful for business, technology and education literature.

Saunders et al (2007: 77) provide the following guidelines when searching electronic databases:
• Ensure your key words match the controlled index language
• Search appropriate printed and database sources
• Note precise details, including the search strings used, of the actual searches you have undertaken for each database
• Note the full reference of each item found (where possible cut and paste)

Noting the search string or search strategy in sufficient detail is increasingly emphasized (see for example Tranfield et al., 2003, cited by Saunders et al 2007: 77), as it makes it possible for others to replicate the search.

1.5.6 Designing your search queries
From the keyword and phrase generating process, you will need to construct search queries (using the keywords or phrases alone or in combination), which you will then send to the identified database. These keywords need to match the database’s controlled index language of pre-selected terms, phrases or descriptors. Otherwise, your search strategy may fail. Examples of items which may cause failure include, but are not limited to, variation in spelling between UK and US English (labour vs.
labor); language correctness (chemist vs. drug store); terminology correctness (redundancy vs. downsizing); acronyms and abbreviations; jargon, etc. The search queries are created by combining the selected keywords and phrases using Boolean logic. Boolean logic uses link terms in search strings that enable you to combine, limit or widen the variety of items being searched. You can also use them with dates, journal titles, and names of organisations or people. Table 2.1.3 shows examples of the most commonly used link terms.

Table 2.1.3: Common link terms that use Boolean logic
Link item | Purpose | Example | Outcome
AND (+) | Narrows search | Recruitment AND Selection | Returns articles containing both key words
OR | Widens search | Recruitment OR Selection | Returns articles containing at least one key word
NOT (-) | Excludes terms from search | Recruitment NOT Selection | Returns articles containing “Recruitment” but not “Selection”
* or $ (truncation) | Uses word stems to pick up different words | Motivat* | Returns articles containing motivate, motivated, motivating, motivation
* or $ (truncation) | Uses word stems to pick up different words | Litera* | Returns articles containing literature, literacy, literary
? (wild card) | Picks up different spellings | Organi?ation | Returns articles containing organisation, organization
? (wild card) | Picks up different spellings | Behavio?r | Returns articles containing behaviour, behavior
? (wild card) | Picks up different spellings | Labo?r | Returns articles containing labour, labor
Source: Adapted from Saunders et al. (2009:84)

Worked example
Let us now borrow an example from Burns and Burns (2008) in which one wanted to identify the role that personal employee characteristics, such as personality and motivation, have within successful project management outcomes. This is a simple example in which you can clearly see the key terms and the nature of the link among them. For example, key words here would be personality, motivation, and a phrase like “project management”. Having these in mind you can then use the Boolean operators OR and AND to search for articles on them.
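The combinations of the three key terms in this worked example can be written out mechanically. The following is only a minimal sketch in Python: the function names quote and build_queries are hypothetical, and it assumes a database that accepts upper-case AND/OR and double quotes for phrases, which you should confirm from the help pages of the database you are using.

```python
from itertools import combinations


def quote(term: str) -> str:
    """Wrap a multi-word phrase in double quotes so it is searched as a phrase."""
    return f'"{term}"' if " " in term else term


def build_queries(terms: list[str]) -> dict[str, str]:
    """Build one OR query over all terms, plus an AND query for every
    combination of two or more terms (hypothetical helper for illustration)."""
    quoted = [quote(t) for t in terms]
    queries = {"any term": " OR ".join(quoted)}  # widest search
    for size in range(2, len(quoted) + 1):
        for combo in combinations(quoted, size):
            # Each AND query narrows the search to items containing all terms
            queries[" + ".join(combo)] = " AND ".join(combo)
    return queries


terms = ["personality", "motivation", "project management"]
for label, query in build_queries(terms).items():
    print(f"{label}: {query}")
```

Keeping the generated strings alongside your notes is one way of recording the exact search strings used for each database.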
Naturally, different combinations of these terms will yield result sets of different sizes.

[Figure 2.1.2: Boolean operators as a Venn diagram. Three overlapping circles: 1 Personality, 2 Motivation, 3 Project Management. The pairwise overlaps are Personality and Motivation, Personality and Project Management, and Motivation and Project Management; the centre is Personality, Motivation and Project Management.]
Source: Adapted from Burns & Burns (2008:55)

Figure 2.1.2 shows different ways of combining the terms and phrases. Using OR, e.g. “personality” OR “motivation” OR “project management”, will produce a large number of results, represented by the whole area covered by the three circles. Circle 1 contains all search results for the term “personality”, while circles 2 and 3 contain search results for the term “motivation” and the phrase “project management”, respectively. The very centre of the three circles (the intersection) presents results for “personality” AND “motivation” AND “project management”. For each pair of circles, the overlapping part outside the central intersection presents results containing those two key words but not the third.

To demonstrate this point and several others, I used some databases selected from the many to which the Open University of Tanzania Library subscribes. You have access to most of them within the campus, and to some both from campus and off campus. A sample of six databases was selected, including Emerald, JSTOR, Oxford University Press (OUP), Sage Publications and Taylor & Francis. In addition, Google, the search engine most commonly used by students and staff alike, was added, as was Google Scholar. The main purpose here is to show you how important the choice of source is and what kind of expectations you should have. The other purpose is, of course, to show you what you miss each time you turn to search engines like Google rather than to subject-relevant databases. I ran a search based on the different combinations of the key terms identified in Figure 2.1.2, and the results are presented in Table 2.1.4.
The results show that:
(i) as you move from general to academic databases, the number of results reduces, indicating that you are getting closer to the specific subject;
(ii) combining the three terms with OR yields the most results (widens the search), while combining them with AND limits the search;
(iii) searching for “project management” yields fewer results than searching for project management without quotation marks, indicating the limiting power of searching for a phrase;
(iv) it is important to begin with much wider options, to gain confidence that some materials are available, and then limit your search using various combinations, while at the same time keeping an eye on your research questions;
(v) Google Scholar is shown here to be better than general Google, and for the latter it does not matter whether you use google.com or other variants like google.co.uk;
(vi) for phrase (proximity) searches, it does not matter whether you use single or double inverted commas;
(vii) for some databases, e.g. Taylor & Francis, the pattern demonstrated in the other databases is not maintained, indicating the need to find out from the website itself whether it works the same way as the others;
(viii) Google clearly reports more search results than Google Scholar or the academic databases sampled here, simply because of the differences in the number of data records each search engine compares your search criteria against.

For example, Google claims that it searches more than 20 billion web pages each time you type in a query (see footnote 5). But research shows that these represent only about 16 percent of what could be accessed electronically. So where does the remaining 84 percent reside?

Footnote 5: http://www.google.com/intl/en/corporate/history.html (November 2007).
Table 2.1.4: Search results for personality, motivation and project management
Term | Google (a) | Google Scholar | Emerald (b) | Wiley-Blackwell Synergy | JSTOR
Single terms:
Personality | 73,400,000 | 1,320,000 | 14,610 | 169,629 | 329,36
Motivation | 49,000,000 | 1,420,000 | 24,983 | 180,724 | 225,75
Project Management | 189,000,000 | 3,610,000 | 46,249 | 271,508 | 211,11
"Project Management" (c) | 35,000,000 | 485,000 | 6,040 | 9,110 | 5,21
Boolean operator OR:
Personality, motivation, "project management" | 287,000,000 | 2,890,000 | 37,640 | 310,075 | 513,82
Boolean operator AND:
Personality and motivation | 7,420,000 | 888,000 | 6,199 | 46,678 | 45,09
Personality and project management | 29,740,000 | 419,000 | 5,124 | 30,072 | 23,22
Personality and "project management" | 1,100,000 | 18,200 | 530 | 1,033 | 49
Motivation and "project management" | 2,300,000 | 46,800 | 1,530 | 2,224 | 1,15
Personality, motivation and "project management" | 312,000 | 11,000 | 271 | 547 | 25
This table presents results of searches on pre-identified keywords using different Boolean search strategies University of Tanzania library website: http://www.out.ac.tz/current/Library/jounal1.html - Accessed on 9t hours.
(a) There is no difference between using google.com and google.co.uk
(b) Based on journal articles found only. Others not reported here are results for books, bibliographical datab
(c) For searches based on phrases it does not matter whether you use single ' ' or double commas " "

The online academic databases, newspapers, databases, books, company web pages, dictionaries, encyclopaedias, individual home pages, etc., would be the answer. Citations from academic databases like those sampled here are from peer-reviewed journals and are generally more academically credible. Although “googling” will certainly bring up reputable articles by reputable researchers, there is far more information of dubious authorship (Burns and Burns, 2008:56). These results will need to be authenticated before you can use them in your research. See the next section for the issues you will need to consider.
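The widening effect of OR and the narrowing effect of AND seen in Table 2.1.4 can be reproduced on a small scale with sets. The sketch below is for illustration only: the article numbers are invented toy data, not results from any database, and Python's set operators (| for union, & for intersection, - for difference) stand in for OR, AND and NOT.

```python
# Hypothetical toy index: the set of article numbers mentioning each key term.
index = {
    "personality": {1, 2, 3, 5},
    "motivation": {2, 3, 4, 6},
    "project management": {3, 5, 6, 7},
}

p = index["personality"]
m = index["motivation"]
pm = index["project management"]

any_term = p | m | pm         # OR: everything covered by the three circles
all_terms = p & m & pm        # AND: the very centre of the Venn diagram
p_and_m_only = (p & m) - pm   # overlap of two circles, excluding the third

print("OR  :", sorted(any_term))
print("AND :", sorted(all_terms))
print("personality AND motivation NOT project management:", sorted(p_and_m_only))
```

Whatever toy data you choose, the AND result can never be larger than any single term's result, and the OR result can never be smaller, which is the pattern to look for when checking whether a database interprets your query as you intended.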
Another useful Boolean operator is “NOT”. This operator allows you to distinguish more precisely within a term that may incorporate more than what you are interested in. For example, the term motivation has two sides: (i) intrinsic motivation, which comes from inside the individual (doing something because it is personally rewarding and gives you a sense of self-esteem); and (ii) extrinsic motivation, which is driven by external factors like pay or fear of being sacked. Now suppose it is intrinsic motivation that you are more interested in. You can either use “AND” while constraining motivation to intrinsic motivation, i.e. “project management” AND “intrinsic motivation”, or use the operator “NOT” to exclude extrinsic motivation as follows: “project management” AND motivation NOT “extrinsic motivation”. As an alternative to Boolean operators, implied Boolean logic can be used, in which “+” replaces “AND” and “-” replaces “NOT”. Please note that, in typing your queries, the absence of a Boolean operator is significant because the space between two terms usually defaults to “AND”.

1.5.7 Reviewing search queries, obtaining and evaluating literature
A tertiary literature search (which may take you a few trials, with a number of refinements of the search strategy as described in the preceding section) will give you details of what literature is available and where to locate it. The next step, naturally, will be to obtain these items. To do so, you are advised to try the following: check your library catalogue (both card and electronic catalogues), and remember that most libraries nowadays hold many periodicals on CD-ROM, provide access to subscribed online databases via the internet, or both. There are other free sources (e.g. institutional repositories) around the world that can also be accessed via the internet.
For those that are held by your library (in print or electronic form), note down their location, then find and scan them to discover whether they are likely to be worth reading thoroughly. At this stage the abstract alone could do you a lot of good. You can also browse other books and journals with similar class marks to see whether they may also be of use. For those that are not held by your library, the best way is to discuss with your librarian whether they can be obtained through interlibrary loan from another library. Since this service can be expensive, you are advised to ensure, first, that you really need the item and, second, that it is of high quality (a refereed journal article).

Once you have obtained the literature you are looking for, you must evaluate it on the basis of relevance, value and sufficiency. To do this you need to be aware of at least two dilemmas (Saunders et al 2009: 92-93). One is how you know whether you are reading the relevant material; the other is how you know when you have read enough. Answering these questions requires you to set the scope of your review and to have the skills to determine the value of the items. In both cases, the research questions and objectives should be your starting point.

• Assessing relevance
You are advised to read all the literature that is closely related to the research objectives and research questions. At this point, focus on relevance and do not critically assess the ideas contained within. You must have criteria for inclusion and for exclusion of articles so that, when reading the literature you have found, you reflect on the research objectives and questions and at the same time measure the article against the inclusion/exclusion criteria.

• Assessing value
In assessing the quality of the research being reported in an article, you should look at such issues as methodological rigour, theory robustness and the quality of the arguments.
For example, some articles may contain results of subjective evaluation rather than those of systematic research. Good examples of the former would be managerial autobiographies, in which the experiences of successful entrepreneurs may be published, or articles in trade magazines, which may represent practices in the profession. Remember to make notes about the relevance of each item as soon as you read it, and the reasons for the conclusion you drew.

• Assessing sufficiency
Although it is virtually impossible to read everything, determining when you have read sufficient literature is one of the challenges researchers face. This is because, as a researcher, you will need to ensure that your critical review discusses what research has already been undertaken and, at the same time, that you have positioned your research project in the wider context, citing the main writers in the field. Saunders et al (2009:93) suggest that one clue that you have achieved this is when further searches provide mainly references to items you have already read.

Saunders et al (2009:93) compile a checklist of relevant items, from their own experience as well as the experience of several other writers, which can be used in the assessment of both relevance and value. These items are reproduced hereunder for your guidance.

Relevance
• How recent is the item?
• Is the item likely to have been superseded?
• Are the research questions or objectives of the article sufficiently close to your own to make it relevant to your own research?
• Is the context sufficiently different to make it marginal to your research questions and objectives?
• Have you seen reference to this item in other items that were useful?
• Does the item contradict or support your argument?

Value
• Does the item appear to be biased, i.e. does it use illogical arguments or emotionally toned words, or appear to choose only those cases that support the point being made?
• What are the methodological omissions within the work?
• Is the precision sufficient?
• Does the item provide guidance for future research?

1.5.8 Recording the literature
The relevant literature identified in the preceding processes must be recorded. When recording it you must also reflect, determining how the item read will contribute to your research questions and objectives, and make notes with such a focus. Even if you print or photocopy all the materials, you are still advised to make notes, because as you do so you think through the ideas in the literature in relation to your research. Sharp et al. (2002), cited in Saunders et al (2009:94), identify three sets of information you need to record:
• Bibliographic details
• Brief summary of content
• Supplementary information

Recording the literature has been made easier these days by the availability of general software like Microsoft Access™ and of specialist bibliographic software like ProCite™, Reference Manager for Windows™ and EndNote™, which help you not only to record but also to organize references and generate them automatically.

The bibliographic details should be recorded in sufficient detail to enable you or other readers to locate the original items. For journal articles, for example, you should take note of the names and initials of the authors, year of publication, title of the article, journal title, volume number, part or issue number, and page numbers. For a textbook, you should take note of the names and initials of the authors, year of publication, title and subtitle of the book, edition, place of publication and the publisher. For any other item, you will need to take similar notes, bearing in mind what will be required when generating references for your work. For a guide on how to write references, see a later chapter in this study manual. Brief summaries will include such details as the key words used to locate them, the abstract, etc.
These summaries will help you locate the relevant items later, facilitate references to your notes and photocopies, and maintain consistency in your searches. Supplementary information can be anything you feel will be of value. Saunders et al (2009:96) summarise some of the useful supplementary details to include: the ISBN of the book, class number, quotations, where it was found, the tertiary resource used and the key words used to locate it, evaluative comments, when the item was consulted, and the file name under which it was saved.

1.6 Plagiarism
Plagiarism has become an enormously important topic in recent years, largely as a result of the ease with which materials can be copied from the internet and passed off as the work of the individual student. Plagiarism is a deliberate attempt at passing off the ideas or writings of another person as your own (Burns and Burns, 2008:63) without acknowledging the original source of the ideas used (Easterby-Smith et al., 2008, cited by Saunders et al 2009: 97), and it may include (Saunders et al 2009:97):
• Stealing from another source and passing it off as your own, e.g. buying a paper from a research service, essay bank or term paper mill; copying a whole paper from a source text without proper acknowledgement; submitting another student’s work with or without that student’s knowledge, etc.
• Submitting a paper written by someone else (e.g. a peer or relative) and passing it off as your own
• Copying a section of material from one or more source texts, supplying proper documentation including the full reference, but leaving out quotation marks, thus giving the impression that the material has been paraphrased rather than directly quoted.
• Paraphrasing material from one or more source texts without supplying appropriate documentation

To avoid being accused of what is really a form of intellectual theft, you are advised to always provide references in the following cases (Burns and Burns, 2008:63):
• Direct quotation from another source;
• Paraphrased text which you have rewritten and/or synthesized but which is based on someone else’s work;
• Information derived from other studies;
• Statistical information;
• Theories and ideas derived from other authors; and
• Interpretations of events or evidence derived from other sources.

1.7 Summary
In this lecture you were introduced to various sources of literature and the steps you need to follow to search them effectively for what you are looking for. To enable you to achieve this, various aspects, such as how to identify keywords and structure effective queries, were covered. You were also introduced to the measures you should take to evaluate the literature you find for relevance, value and sufficiency. A few hints on what plagiarism is and how you can avoid it were covered. The next lecture will introduce you to how you can critically review such literature to fit your purpose.

1.8 Review exercise
Assume that you are interested in identifying the factors that influence the decision of Chinese firms to come and invest in Tanzania.
i) Determine the relevant key words.
ii) Assume that the best source is electronic and then create different sets of search queries.
iii) Using the different search queries you developed, visit the online databases available at the OUT website, include google.com as well as scholar.google.com, and run the search. Remember to make notes regarding the types of items that each of these services finds.
iv) Which service do you think is likely to prove most useful to the research project above?
v) Obtain a couple of articles on this subject and evaluate the value and relevance of each to it.
1.9 References
Saunders, M. N. K., Lewis, P. & Thornhill, A. (2009) Research methods for business students. 5th edition. Harlow: Prentice Hall - Financial Times.
Burns, R.B. & Burns, R.A. (2008) Business research methods and statistics using SPSS. London: Sage Publications Ltd.

LECTURE TWO
CRITICAL LITERATURE REVIEW (Dr. P. Ngatuni)

2.1 Introduction
In the process of working on your research project, one thing your supervisor will definitely ask you to do is write a literature review. This is also one part that will take you time before you and your supervisor can agree on it. The process will have started with locating, obtaining, evaluating, recording and reading the literature (see discussions in chapter five). After that you will be ready to conduct a literature review. But what is a literature review, why is it so important to any research project, how do we do it, and how is it written? This lecture is designed to shed light on these questions.

2.2 Learning outcomes
At the end of this lecture you will be able to:
• Discuss the importance and purpose of the critical literature review to your research project
• Identify what you need to include when writing your critical review of literature
• Write up a collected review of relevant information and documents
• Design both conceptual and theoretical frameworks from a review of literature and differentiate between them

2.3 What is critical literature review?
A (critical) literature review is a detailed and justified analysis of, and commentary on, the merits and faults of the literature within a chosen area, which demonstrates familiarity with what is already known about your research topic (Saunders, et al. 2009:590). For it to be critical, it needs to cover substantial ground, which requires a lot of reading; you also need to make judgements as to the value of each piece of work and to organise those ideas and findings that are of value into a review.
Saunders et al (2009:59) summarize the results of a discussion with fellow research tutors about what they observe in the literature reviews students submit (observations which I also share from my supervision experience at OUT). While they acknowledged the work that students normally put into these reviews, the following are the common observations:
• The purpose for which the literature review was undertaken is unclear
• The literature review is a summary of the articles and books read, each article or book being given one paragraph or two, some of which are arranged either chronologically by author or by subject
• Although some of the items reviewed are grouped together, the purpose of the grouping is not made explicit
Consequently, Saunders et al likened these pieces of work to a shopping catalogue.

Therefore, reviewing the literature critically should help you (i) establish what research has been published in the chosen area, (ii) identify any other research that might currently be in progress, and (iii) identify research gaps that need to be filled, which will help you situate your research project within those gaps. This implies that you should undertake a critical literature review with a view to enhancing your subject knowledge and helping you to clarify your research question(s) further. How this is achieved is discussed in a later section.

2.4 Importance of critical literature review
Although you may feel that you already have a good knowledge of your research area, reviewing the literature is still essential. There are several reasons why you must review the literature.
• Review of literature will help you to develop a good understanding of, and insight into, relevant previous research, i.e. what research has been published in the chosen area and the trends that have emerged.
• Review of literature helps you to generate and refine your research ideas.
For example, some articles (review articles) contain both a considered review of the state of knowledge in the chosen topic area and pointers towards areas where further research needs to be undertaken (Saunders et al 2009:27). These pointers indicate the research gap that needs to be filled.
• When the literature review is done critically, it will help you demonstrate awareness of the current state of knowledge in your subject, its limitations, and how your research fits in this wider context. This is a key requirement in any research.

Gall et al (2006), cited by Saunders et al. (2009: 61), add the following purposes:
• To highlight research possibilities that have been overlooked implicitly in research to date.
• To discover explicit recommendations for future research, which you could then use as a superb justification for your own research questions and objectives.
• To help you avoid simply repeating work that has been done already.
• To sample current opinions in newspapers and in professional and trade journals, thereby gaining insights into the aspects of your research questions and objectives that are considered newsworthy.
• To discover and provide an insight into research approaches, strategies and techniques that may be appropriate to your own research questions and objectives.

You must remember that the significance of your research and what you find out will inevitably be judged in relation to other people’s research and findings. For example, Jankowicz (2005:161), as quoted in Saunders et al (2009:59), argues as follows: “There is little point in reinventing the wheel … the work that you do is not done in a vacuum, but builds on the ideas of other people who have studied the field before you. This requires you to describe what has been published and to marshal the information in a relevant and critical way”.

The purpose will also depend on the approach you intend to use in your research, i.e.
whether it is a deductive or an inductive approach (refer to the relevant lecture for a discussion of this). For example, in a deductive approach you are expected to develop a theoretical or conceptual framework and then subsequently test it using data. In an inductive approach, on the other hand, you are expected to explore your data and to develop theories from them that you will subsequently relate to the literature. It is important to note that the purpose of the literature review is not to provide a summary of everything that has been written on your research topic, but to review the most relevant and significant research on your topic. Strauss and Corbin (1998), as cited in Saunders et al (2009: 61), argue that if the literature review is effective, new findings and theories will emerge that neither you nor anyone else has thought about.

2.5 Writing a critical review

2.5.1 Critical reading
Conducting a critical literature review requires critical reading; but what is critical reading? Wallace and Wray (2006), cited by Saunders et al (2009:63), refer to critical reading as the capacity to evaluate what you read and the capacity to relate what you read to other information. They suggest the following five critical questions that you can employ in critical reading (Table 2.2.1).

Table 2.2.1: Critical questions for critical reading
Critical Question, followed by What For:
• Why am I reading this?
  A focusing device. Ensures you stick to the purpose of reading. Helps you avoid getting sidetracked by the author’s agenda.
• What is the author trying to do in writing this?
  For deciding how valuable the writing may be for your purposes.
• What is the writer saying that is relevant to what I want to find out?
  Reflecting on your work.
• How convincing is what the author is saying?
  Is the author’s argument/conclusion based on evidence?
• What use can I make of the reading?
  Determining where it fits in your own work.
Source: Constructed from Saunders, et al (2009:63)

To answer the questions in the table above you need some key skills for effective reading, which include, among others, previewing, annotating, summarizing, and comparing and contrasting. The use of each skill is summarized in Table 2.2.2.

Table 2.2.2: Skills for effective reading
Previewing: Looking around the document’s text before reading it; it helps establish the document’s purpose and how it may inform your literature search.
Annotating: Conducting a dialogue with yourself, the author, and the issues and ideas at stake.
Summarising: Outlining the argument of the text and eventually being able to state it in your own words.
Comparing and contrasting: Asking yourself how your thinking has been altered by this reading, or how the article has affected your response to the issues and themes of your research.
Source: Constructed from Saunders et al (2009:62)

2.5.2 Approaching critical review
Mingers (2000), cited by Saunders et al (2009: 64), considers four aspects of a critical approach to the review of literature: critique of rhetoric, critique of tradition, critique of authority, and critique of objectivity. A critique of rhetoric, for example, involves appraising or evaluating a problem with effective use of language, which calls on your skills of making reasoned judgements and of arguing effectively in writing. Critique of tradition involves questioning the conventional wisdom where justification exists to do so. Critique of authority involves questioning the dominant view portrayed in the literature you are reading. Finally, the critique of objectivity involves recognising in your review that the knowledge and information you are discussing are not value free. In order to be effectively critical you need reading skills and the right attitude: read the literature with scepticism and be willing to question what you are reading.
Both these require you to have read widely on your research topic, to have a good understanding of the literature, and to be able to make reasoned judgements that are argued effectively. Thus critical review may be taken to describe the process of providing a detailed and justified analysis of, and commentary on, the merits and faults of the key literature within your chosen area. You should therefore refer to and assess research by recognised experts in your chosen area; consider and discuss research that supports and research that opposes your ideas; make reasoned judgements regarding the value of others' research, showing clearly how it relates to your research; justify your arguments with valid evidence in a logical manner; and distinguish clearly between facts and opinion.

To ensure that your literature review is sufficiently critical, ask yourself whether you have:
• shown how your research question relates to previous research reviewed
• assessed the strengths and weaknesses of previous research reviewed
• been objective in your discussion and assessment of other people's research
• included references to research that is counter to your opinion
• distinguished clearly between facts and opinions
• made reasoned judgements about the value and relevance of others' research to your own
• justified clearly your own ideas
• justified your arguments by correctly referencing published research
• highlighted clearly those areas where new research is needed to provide fresh insights, and taken these into account in your arguments, in particular:
  1. where there are inconsistencies in current knowledge and understanding
  2. where there are omissions or bias in published research
  3. where research findings need to be tested further
  4. where evidence is lacking, inconclusive, contradictory or limited
Source: Adapted from Saunders et al (2009:65, Box 3.3)

The more questions to which you can answer yes, the more likely your review will be critical.
2.5.3 The process of conducting literature review
The process of conducting a literature review starts when you define your research questions and objectives, and it continues in a circular form through various stages. At each stage, that is, at each completed circle, a refinement is made and the process continues, forming the spiral process shown in Figure 2.2.1. How long this spiral process goes on will depend on how clear the research questions and objectives are. It begins with the literature search (discussed in the preceding lecture). This should not be taken as misplaced; rather, it shows how closely literature search and literature review are connected: you cannot review what you do not have.

Figure 2.2.1: The literature review process
Source: Saunders, Lewis, Thornhill and Jenkins (2003), reproduced from Saunders et al (2009:60)

2.5.4 Writing the literature review
The writing process begins at the very beginning, when you run your searches, by taking notes on the articles you obtain and recording their full bibliographical details. When you scan them and assess their relevance and value to your research project, you should also note down the ways in which they will be useful. In the end, when you begin writing the report of the review, you really need the skills to make your arguments. Many Master's dissertations presented by our students suffer from a deficiency in this respect, and you should try as much as you can to avoid it. For example, Burns and Burns (2008:61) advise that you should not write a literature review that is either disorganised rambling or a chain of pointless, isolated summaries of each document, with each sentence beginning with 'Brown (2001) says ...', 'Burns (2005) says ...', 'Green (2003) says ...', etc. Saunders et al (2009:58) liken such reviews to adjacent pages of shopping catalogues rather than a piece of literature review.
To avoid these, Burns and Burns suggest that you should create a coherent argument that paraphrases and evaluates the literature and shows its relevance to your problem or topic. Since you make these arguments using what others have written, it is important that you give due acknowledgement to all your sources of information, lest you be accused of plagiarism, the academic crime discussed in the preceding lecture. A guide on what to cite, how to cite and how to show bibliographic details of the different types of documents (books, chapters in books, journal articles, magazines and newspapers, e-mails, web pages, etc.) will be treated in a separate lecture in this learning material.

2.5.5 The content of the critical review
When thinking about the content of the critical review, remember that you need to be able to combine academic theories and ideas. To achieve this, you must (i) evaluate the research that has already been undertaken in the area of your research; (ii) show and explain the relationships between published research findings and reference the literature in which they were reported; (iii) draw out the key points and trends, recognising any omissions and biases; and (iv) present them in a logical way, also showing the relationship to your own research project. To achieve this, Saunders et al (2009:63) recommend that you consider:
• Including the key academic theories within your chosen area of research
• Demonstrating that your knowledge of your chosen area is up to date
• Enabling those reading your project report to find the original publications you cite
• Acknowledging the research of others and writing it in the format prescribed in the assessment criteria

2.6 Structure of the critical review
The literature review needs to be a description and critical analysis of what other authors have written (Jankowicz, 2005, cited by Saunders et al 2009).
Therefore, in writing your review you need to focus on your research question(s) and objectives. Saunders et al (2009:66) suggest that this focus may be achieved in either of two ways: by thinking of your literature review as discussing how far existing published research goes in answering your research questions, addressing the shortfalls at a later stage; or by asking yourself how your review relates to your objectives. If the answer is that it does not, or does so only partially, then you know that there is a need for a clearer focus on your objectives.

Although the precise structure of the critical review is usually a matter of choice, you may need to check the assessment criteria. The review can be a single chapter, a series of chapters, or can run through the whole report as you tackle issues. Most researchers have found it useful to think of the structure of their review as a funnel. This would be achieved by:
• Starting at a more general level before narrowing down to your specific research question(s) and objectives
• Providing a brief overview of key ideas and themes
• Summarising, comparing and contrasting the research of the key writers
• Narrowing down to highlight previous research work most related to your own research
• Providing a detailed account of the findings of this research and showing how they are related
• Highlighting those aspects where your own research will provide fresh insights
• Leading the reader into subsequent sections of your project report, which explore these issues.

Figure 2.1.3 shows an example of the funnel model. The width of the funnel represents the number of publications that are available, and the depth represents increasing or decreasing specialisation or relevance. At the top of the review funnel are publications that cover the general area of the research, e.g. general text books, encyclopaedias, and so on. Next are the historical and seminal research papers which broach the general field of study.
However, as one moves down the review funnel towards one's specifically chosen field of endeavour, the number of available papers decreases because specialisation has increased. Therefore, Toncich (2007) recommends that, to structure the review section effectively, one should typically commence with generalities and history and gradually work one's way through to more and more specific issues that ultimately lead to the directions proposed for the research program. A typical sequence would be to start with (i) Encyclopaedia or Web-Based (Internet) Search; (ii) General Text Book; (iii) Specific Text Book; (iv) Journal and Conference Papers; (v) Trade Journals and Marketing Literature.

A high-quality encyclopaedia is often a far better starting point for a research program than a text book or a journal paper, simply because credible, professionally edited encyclopaedias cover a subject to a low depth but with a large breadth. The various sections in such encyclopaedias are generally written by people who are chosen because they are internationally regarded in their field. Several pages of an encyclopaedia can often reveal the entire spectrum of key words, phrases and related areas of study that can subsequently be examined through text books and journal papers. Encyclopaedias also tend to provide historical time-frames and backgrounds that can be used to search for seminal papers during the course of a literature review. Better still, an encyclopaedia will often place an entire field of study

Figure 2.1.3: The funnel model of literature structure
Source: Toncich (2007:146)

The next problem is how to write up the results of the literature review, specifically how to structure the main body of the literature review section. Carnwell and Daly (2001) recommend four different strategies. The choice is yours, depending on what you want to achieve and what works better for you as well as for the subject that you are treating.
These strategies include (i) examining the theoretical literature and then the methodological literature underpinning the selected study; (ii) examining the theoretical literature and then the empirical literature in discrete sections; (iii) dividing the literature into content themes; and (iv) examining the literature chronologically.

Examining the theoretical literature and then the methodological literature underpinning the selected study
This would fit best in situations where empirical literature is absent and the only literature available is of a theoretical nature. Such subjects often generate the need for qualitative research, such as grounded theory. The purpose of the literature review in this case is two-fold: first, to review the theories on the subject, and second, to consider the implications of these theories for the development of an appropriate methodology to conduct a new study. The theoretical papers that critically discuss the nature, constituents and dimensions of the topic as portrayed by various authors could therefore be reviewed, asking questions such as: whether there is a consensus regarding the meaning, nature or constitution of the topic; whether there are counter-arguments as to the meaning, nature and constitution of the topic and, if so, what they are; and whether you, the researcher, agree with these counter-arguments.

Examining the theoretical literature and then the empirical literature in discrete sections
This would fit situations where the topic area contains many theoretical works (those that discuss or describe a concept, construct or topic and are not based on actual research) and empirical papers (those based on research with identified findings). The literature review in these cases could be divided into these two categories: (i) a section devoted to reviewing the theoretical literature and (ii) a section devoted to the empirical literature.
Analysis and evaluation of empirical literature, however, will need to include critical appraisal of the methodologies used within the studies reviewed. You will need to answer questions such as (i) whether the methodology of one study produced more valid results than that of another study and (ii) whether one study has more practical relevance than another.

Dividing the literature into themes
This strategy would fit situations in which the literature can be divided into distinct themes, which would come from within the literature itself. Alternatively, the review could be categorised methodologically, e.g. (i) studies utilising a survey approach; (ii) studies utilising interview approaches; (iii) studies using experimental approaches; (iv) studies using patient simulations. This method integrates theoretical and empirical literature, and might serve to guard against the temptation merely to describe. Questions that would need to be answered could include: whether the evidence is conclusive; whether there is theoretical consensus; whether counter-arguments or counter-evidence exist and, if not, whether you can think of any; whether there are multiple viewpoints or positions, and what your considered view is; etc. Once you have reviewed and synthesised the literature within each theme, you should write a short summary identifying the key arguments and how they relate to the next theme. This technique ensures that each theme flows appropriately on to the next, so that the review has a logical structure.

Examining the literature chronologically
A chronological strategy of reviewing the literature would fit best in subject matter that has evolved over time, in which theories have been developed, tested and refined over several decades. Quick examples would include the theory of capital structure, or the capital market efficiency hypotheses in Finance.
As with the other methods, you would lay out the literature in a clear structure and analyse the literature within each time period. Questions pertaining to the literature would be the same as those discussed above.

2.7 Identification of research gaps (Concluding the literature review)
The concluding part of the literature review should integrate all the theme summaries into a broad terminal conclusion, which would logically lead on to the purpose of a new study and a possible conceptual framework. In formulating a conclusion, it is necessary to draw together conclusions from both categories into the main conclusions. Gaps and shortcomings in previous works should now be evident, as should the reasons why these works may not answer a particular research question, which therefore needs to be investigated. It might equally be justified to replicate one of the studies reviewed, for example on a different or larger population group. The gaps and shortcomings identified lead logically on to the purpose of a proposed study. It may also be possible to use the material in the different sections of the review to formulate either a conceptual or a theoretical framework. These are discussed in the next section.

2.8 Conceptual and theoretical frameworks
If you have done your literature review critically, then the theory should be clearly delineated, showing the relationships between factors. Out of this, frameworks must be developed with which you will be able to make sense of the relationships and identify the relevant variables. These frameworks are normally presented graphically, although brief verbal material can be added to explain (i) why you suggest these relationships and (ii) why you believe the variables are relevant. There are two frameworks, each of which offers a particular kind of support: theoretical and conceptual frameworks.

2.8.1 Conceptual framework
For many quantitative studies this is quite an important item.
Let us say, for example, that you are trying to establish whether employees' retention rate is related to leadership style. Suppose that, after critically reviewing the relevant literature, you realise that the leadership style in organisations, i.e. the degree to which it is authoritarian or democratic, is a factor that is likely to cause stress in employees, which in turn affects their likelihood of staying in the organisation. These relationships can be presented graphically (hence conceptually) as shown in Figure 2.1.4, where the concepts are placed in ovals or ellipses while the arrows indicate the direction of influence.

Figure 2.1.4: Conceptual framework (Leadership styles: authoritarian, democratic -> Employee stress -> Retention rates)
Source: Adapted from Burns and Burns (2008:71)

So a conceptual framework helps you to link abstract concepts to theory, and it is the first stage in designing a piece of research. As the phrase implies, the concepts are abstract and may not be directly measurable, hence the need to turn them into measurable variables.

2.8.2 Theoretical framework
Recall from the preceding section that a conceptual framework is a mechanism through which you can portray graphically the relationship between abstract concepts that is proposed by the results of a critical literature review. A theoretical framework, on the other hand, is a graphical portrayal of the relationship that is proposed to exist between those abstract concepts, but this time using measurable variables of the constructs. The added value here lies in the measurable variables, the definition of the expected relationships between them, and the specification of the direction of those relationships. These in turn are the basis for formulating testable hypotheses. Figure 2.1.5 presents the same information as Figure 2.1.4 but extends it to show the theoretical framework, i.e. the relationship in terms of measurable variables.
Figure 2.1.5: Conceptual vs. theoretical framework (Conceptual framework: Leadership styles: authoritarian, democratic; Employee stress; Retention rates. Theoretical framework: Management styles as measured by questionnaires; Number of employees leaving each month)
Source: Adapted from Burns and Burns (2008:79)

2.9 Summary
In this lecture you have learned the importance of reviewing the literature critically and how it can be done. You have also learned that if your literature review is done successfully, you should be able to develop both conceptual and theoretical frameworks; how to develop them, how they differ and why they matter were all detailed.

2.10 Review exercise
A classy hotel on the beaches of Kigamboni has three different classes of accommodation: standard, deluxe and royal. The standard class was aimed at providing inexpensive accommodation for families on holiday. The deluxe class was meant for business travellers. The royal class provided high-quality services to wealthy international travellers. The hotel's CEO was wondering how to differentiate among these three types of accommodation in order to attract the appropriate type of client for each. In essence, the CEO felt that revenues could be increased if clients and potential clients understood these distinctions better. Keen to develop a strategy to avoid any potential confusion about these types of accommodation, he commissioned a customer survey of those who had used each type of facility. The results showed that many customers were unaware of the differences; many complained about the age of the buildings and their poor maintenance; the service quality was rated poor; and rumours were spreading that a name change was likely and that franchise owners were becoming angry. The CEO then realised that he needed to understand how the different accommodation classifications would matter, so as to develop a marketing strategy.
He also recognised that unless the franchise owners cooperated, his plans would never reach fruition.

Required: From this case,
• Identify the problem
• Identify the themes that you could include when writing a literature review on the case
• Develop a conceptual framework
• Develop a theoretical framework

2.11 References
Burns, R. B. & Burns, R. A. (2008) Business research methods and statistics using SPSS. London: Sage Publications Ltd.
Carnwell, R. & Daly, W. (2001) Strategies for the construction of a critical review of literature. Nurse Education in Practice (1), 57-63.
Hart, C. (2008) Doing literature review: Releasing the social science research imagination. Los Angeles: Sage Publications Limited.
Kothari, C. R. (2008) Research methodology: Methods and techniques. 2nd revised edition. New Delhi: New Age International (P) Limited.
Saunders, M. N. K., Lewis, P. & Thornhill, A. (2009) Research methods for business students. 5th edition. Harlow: Prentice Hall – Financial Times.
Toncich, D. J. (2007) Key factors in postgraduate research – A guide to students. Ch 7. http://www.doctortee.net/files/PHDBK2006-07p.pdf. Accessed on 30th July 2010.

MODULE THREE
RESEARCH DESIGN AND DATA COLLECTION METHODS

LECTURE ONE
RESEARCH DESIGN (By Dr. A. Gimbi)

1.1 Introduction
The task that follows the definition of the research problem is the preparation of the design of the research project, popularly known as the "research design". The meaning of research design can be described in various ways. Research design is the conceptual structure within which research is conducted; it constitutes the blueprint for the collection, measurement and analysis of data. Research design is a plan for collecting and utilising data so that the desired information can be obtained with sufficient precision, or so that a hypothesis can be tested properly. Research design provides the glue that holds the research project together.
A design is used to structure the research and to show how all of the major parts of the research project, such as the samples or groups, measures, treatments or programs, and methods of assignment, work together to address the central research questions.

1.2 Learning outcomes
By the end of this lecture you should be able to:
• Explain the importance of research design
• Describe the importance of having thought carefully about your research design
• Outline the main types of research designs and explain why these should not be thought of as always mutually exclusive
• Outline basic principles of experimental research design
• Describe the main types of experimental designs
• Explain the reasons for adopting multiple designs in the conduct of research

1.3 Need for research design
Research design is required because it facilitates the smooth running of the various research operations, thereby making research as efficient as possible, yielding maximal information with minimal expenditure of effort and of resources such as time and funds. Just as the better, more economical and more attractive construction of a house needs a blueprint (commonly known as a house map), well thought out and prepared by an expert architect, so a research project needs a research design, or plan, in advance of data collection and analysis. The research design must be prepared with great care, since any error in it may upset the entire research project. Research design in fact has a great bearing on the reliability of the results arrived at, and as such constitutes the firm foundation of the entire edifice of the research work. The design helps researchers to organise their ideas in a form in which it is possible to look for flaws and inadequacies. Such a design can even be given to others for their comments and critical evaluation. In the absence of such a course it will be difficult for the critic to provide a comprehensive review of the proposed study.
Thoughtlessness in designing the research project may render the research exercise futile. It is therefore imperative that an efficient and appropriate design be prepared before starting research operations.

1.4 Features of a good research design
A good research design is frequently characterised by adjectives like flexible, appropriate, efficient, economical and so on. Generally, a design which minimises bias and maximises the reliability of the data collected and analysed is considered a good design. The design which gives the smallest experimental error is supposed to be the best design in many investigations. Similarly, a design which yields the optimal amount of information and provides an opportunity for considering many different aspects of a problem is considered the most appropriate and efficient design for many research problems. Thus, the question of good design is related to the purpose or objective of the research problem and also to the nature of the problem to be studied. A design may be quite suitable in one case, but may be found wanting in one respect or another in the context of some other research problem. No single design can serve the purpose of all types of research problems.

A research design appropriate for a particular research problem must at least involve consideration of the following factors:
i. A clear statement of the research problem
ii. The objective of the problem to be studied
iii. The population to be studied
iv. Procedures and techniques to be used for gathering the information
v. Methods to be used in processing and analysing data
vi. The ability and skills of the researcher and his or her staff, if any
vii. The availability of time and money for the research work

Crucially, your choice of a particular research design should reflect the fact that you have thought carefully about why you are employing it.
You should be able to explain why you chose to conduct your research in a particular organisation, why you chose the particular department, and why you chose to talk to one group of staff rather than another. You must have valid reasons for all your research design decisions. The justification should always be based on your research question(s) and objectives, as well as being consistent with your research philosophy.

1.5 Types of Research Designs
Understanding the types of designs and the relationships among them is important in making design choices and in thinking about the strengths and weaknesses of different designs. Research designs can be categorised as follows.

1.5.1 Exploratory research (Formulative research) design
Exploratory research is research into the unknown. It is used when you are investigating something but really don't understand it at all, or are not completely sure what you are looking for. It is rather like a journalist whose curiosity is piqued by something and who starts looking into it without really knowing what to look for. Exploratory research is a valuable means of finding out 'what is happening; to seek new insights; to ask questions and to assess phenomena in a new light'. It is particularly useful if you wish to clarify your understanding of a problem, for instance if you are unsure of its precise nature. The main purpose of such studies is to formulate a problem for more precise investigation or to develop working hypotheses from an operational point of view. Time spent on exploratory research may well be time well spent, as it may show that the research is not worth pursuing!

The research design appropriate for such studies must be flexible enough to provide opportunity for considering different aspects of the problem under study.
Built-in flexibility in the research design is needed because, in exploratory studies, the research problem, broadly defined initially, is transformed into one with a more precise meaning, which may necessitate changes in the research procedure for gathering relevant data. There are three principal ways (methods) of conducting exploratory research:

• A search of literature (survey of relevant literature)
The survey of relevant literature is the simplest and most fruitful method of formulating the research problem precisely or of developing hypotheses. Hypotheses stated by earlier workers may be reviewed and their usefulness evaluated as a basis for further research. You may also consider whether the hypotheses already stated suggest new hypotheses. In this case you should review and build upon the work already done by others; but in cases where hypotheses have not yet been formulated, your task is to review the available material in order to derive the relevant hypotheses from it.

• Interviewing 'experts' in the subject (Experience survey)
This method involves a survey of people who have had practical experience with the problem to be studied. The aim of such a survey is to obtain insight into the relationships between variables and new ideas relating to the research problem. You must carefully select as respondents competent people who can contribute new ideas, so as to ensure representation of different types of experience. The interview must ensure flexibility by allowing respondents to raise issues and questions which you have not previously considered. Thus an experience survey may enable you to define a problem more concisely and help in the formulation of research hypotheses. Such a survey may also provide information about the practical possibilities for doing different types of research.

• Analysis of 'insight-stimulating' examples
This method is particularly suitable in situations where there is little experience to serve as a guide.
The method consists of intensive study of selected instances of the phenomenon in which you are interested. In this case you can adopt different approaches, such as examination of existing records and unstructured interviewing. Your attitude, the intensity of the study and your ability to draw together diverse information into a unified interpretation are the main features which make this method an appropriate procedure for evoking insights.

1.5.2 Descriptive/Diagnostic/Survey research design
Descriptive research studies are those concerned with describing the characteristics of a particular individual, group, event or situation, whereas diagnostic research studies determine the frequency with which something occurs or its association with something else. Studies concerning whether certain variables are associated are examples of diagnostic research studies. In contrast, studies concerned with specific predictions, or with narration of facts and characteristics concerning an individual, group or situation, are all examples of descriptive research studies. Descriptive research is thus a type of research that is primarily concerned with describing in detail the nature, conditions and degree of the present situation. The emphasis is on describing rather than on judging or interpreting.

In descriptive as well as in diagnostic studies, you must be able to define clearly what you want to measure and must find adequate methods for measuring it, along with a clear-cut definition of the population you want to study. Since the aim is to obtain complete and accurate information, the research design must make enough provision for protection against bias and must maximise reliability, with due concern for the economical completion of the research study. The design in descriptive/diagnostic studies must be rigid, not flexible, and must focus attention on the following:

a. Problem and objective formulation (what is the study about and why is it being made?)
The research problem being tested should be explicitly formulated in the form of a question. When formulating the objective of your study, you must specify it with sufficient precision to ensure that the data collected are relevant. If this is not done carefully, the study may not provide the desired information.

b. Designing/selection of methods of data collection (what techniques of gathering data will be adopted?)
You need to carefully design and/or select the methods by which you will obtain the appropriate data. Several methods, such as observation, questionnaires, interviewing and examination of records, each with its merits and limitations, are available for the purpose, and you may use one or more of them. While designing the data-collection procedure, you have to ensure adequate safeguards against bias and unreliability. It is always desirable to pretest your data collection instruments before you finally use them for study purposes.

c. Selecting the sample (how much material will be needed?)
In most descriptive/diagnostic studies, you will draw one or more samples and then wish to make statements about the population on the basis of the sample analysis or analyses. Two important questions arise frequently when you plan to select a sample, namely:
- How big should the test sample be?
- What is the probability of mistakes occurring through the use of a test sample (instead of the whole population)?
Special care should be taken with the selection of test samples. The results obtained from a survey can never be more authentic than the standard of the population or the representativeness of the test sample. The size of the test sample can also be specified by means of statistics. Bear in mind that it is desirable for the test sample to be as large as possible.
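As one hedged illustration of how the size of a test sample can be specified by means of statistics, the sketch below uses the standard textbook formula n = (z × σ / e)² for estimating a population mean, where σ is the population standard deviation (the variability of the population), e is the acceptable margin of error (the accuracy wanted) and z is the normal critical value for the chosen confidence level. The function name `required_sample_size`, its parameters and the example figures are assumptions made for illustration and do not come from this lecture; the formula also assumes simple random sampling from a large population.

```python
import math
from statistics import NormalDist


def required_sample_size(std_dev, margin_of_error, confidence=0.95):
    """Minimum sample size for estimating a mean (illustrative sketch).

    Assumes simple random sampling from a large population and a
    known (or previously estimated) population standard deviation.
    """
    if std_dev <= 0 or margin_of_error <= 0:
        raise ValueError("std_dev and margin_of_error must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up, because a fraction of a respondent cannot be sampled.
    return math.ceil((z * std_dev / margin_of_error) ** 2)


if __name__ == "__main__":
    # Hypothetical figures: standard deviation of 15 units, margin of error of 2 units.
    print(required_sample_size(std_dev=15, margin_of_error=2))
```

Under this formula, halving the margin of error roughly quadruples the required sample, and a more variable population calls for a larger sample. Other sampling methods (for example stratified or cluster sampling) need different calculations.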
The most important criterion that serves as a guideline here is the extent to which the test sample corresponds with the qualities and characteristics of the general population being investigated. Consider the following three factors before you decide on the size of the test sample:
- What degree of accuracy is expected between the test sample and the general population?
- What is the variability of the population? (This, in general terms, is expressed as the standard deviation.)
- What methods should be used in test sampling?

d. Collecting the data (where can the required data be found and to what time period should the data relate?)

Data collection refers to the gathering of information aimed at proving or refuting some facts. This is an important issue in the research process, as the validity of the results of a statistical analysis clearly depends on the reliability and accuracy of the data used. In turn, the reliability and accuracy of the data depend on the method of collection.

Sources of data

There are two major sources of data that you can use: primary and secondary sources.

Primary sources
This is information that you gather directly from experimental studies or respondents using your research instruments. In experimental studies this information is obtained by measuring the variable(s) of interest.

Secondary sources
This is information that you gather from previous studies, e.g. published material, and information from internal sources such as raw data and unpublished summaries. Use of secondary data saves you time and cost and provides an insight into the outcomes of similar research.
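The first two sample-size factors listed under item c (the expected accuracy and the variability of the population) can be turned into a rough sample size. The following is a minimal sketch in Python, assuming simple random sampling from a large population, an approximately normal sampling distribution and a prior estimate of the standard deviation. The function name and the example figures are hypothetical illustrations and are not taken from the source.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin_of_error, confidence=0.95):
    """Approximate sample size needed to estimate a population mean.

    Uses n = (z * sigma / e)^2, which assumes simple random sampling
    from a large population and a known (or guessed) standard deviation.
    """
    if sigma <= 0 or margin_of_error <= 0:
        raise ValueError("sigma and margin_of_error must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Round up: a fraction of a respondent cannot be sampled.
    return math.ceil((z * sigma / margin_of_error) ** 2)


# Hypothetical example: standard deviation of 10 units, estimate wanted
# to within 2 units of the true mean at 95% confidence.
n = sample_size_for_mean(sigma=10, margin_of_error=2)
print(n)  # 97
```

Under these assumptions, halving the margin of error roughly quadruples the required sample, which is one reason to fix the expected accuracy before fieldwork is planned. Other sampling methods (the third factor) generally need a different calculation.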
Steps in data collection
• Define your sample
• Reflect on the research design
• Ensure research instruments are ready
• Define the data to be collected and how you are going to analyse them
• Request permission to collect data from the relevant authorities
• Pretest your research instruments

To obtain data free from errors introduced by those responsible for collecting them, you need to supervise the field workers closely as they collect and record information. You may set up checks to ensure that the data-collecting staff perform their duty honestly and without prejudice. As data are collected, you should examine them for completeness, comprehensibility, consistency and reliability.

e. Processing and analysing the data

Data processing and analysis includes steps such as coding the interview responses and observations, tabulating the data, and performing statistical computations. Statistics such as averages, percentages and various coefficients must be worked out. Probability and sampling analysis may also be used. The appropriate statistical operations, along with appropriate tests of significance, should be carried out to safeguard the conclusions drawn from the study. To the extent possible, plan the processing and analysis procedures in detail before you embark on the actual work.

f. Reporting the findings

This is the task of communicating the findings to others, and you must do it in an efficient manner. The report entails the reproduction of factual information, the interpretation of data, the conclusions derived from the research, and recommendations. The layout of the report needs to be well planned so that everything relating to the research study is presented in a simple and effective style.

You should make sure that you understand the meaning of the terminology used. Consult the recommended and other sources for detailed explanations.
However, further reference must be made to aspects related to test sampling. The differences between exploratory and descriptive designs are summarised in Table 3.1.1.

Table 3.1.1: Summary of differences between exploratory and descriptive research designs

Overall design
  Exploratory: Flexible design (design must provide opportunity for considering different aspects of the problem)
  Descriptive: Rigid design (design must make enough provision for protection against bias and must maximise reliability)
(i) Sampling design
  Exploratory: Non-probability sampling design (purposive or judgement sampling)
  Descriptive: Probability sampling design (random sampling)
(ii) Statistical design
  Exploratory: No pre-planned design for analysis
  Descriptive: Pre-planned design for analysis
(iii) Observational design
  Exploratory: Unstructured instruments for collection of data
  Descriptive: Structured or well thought out instruments for collection of data
(iv) Operational design
  Exploratory: No fixed decisions about the operational procedures
  Descriptive: Advanced decisions about operational procedures

Source: Kothari (2009)

1.5.3 Experimental (Hypothesis-testing) research design

Hypothesis-testing research studies are those in which you test hypotheses of causal relationships between variables. Such studies require procedures that will not only reduce bias and increase reliability, but will also permit inferences about causality to be drawn.

This type of research design is known by a variety of names, including cause-and-consequence, before-and-after, control-group and laboratory design. It is research designed to study cause and consequence. A clear distinction should be drawn between the terms experiment and experimental research. In the former there is normally no question about the interpretation of data in the discovery of new meaning. Experimental research, however, has control as its fundamental characteristic. The selection of control groups, based on proportional selection, forms the basis of this type of research.
Experimental research is basically the method that you can apply in a research laboratory. The basic structure of this type of research is elementary: two situations (cause and consequence) are assessed in order to make a comparison. Following this, an attempt is made to treat one situation (cause) from the outside (external variable) so as to effect change, and then to re-evaluate the two situations. The perceivable changes that occurred can then be presumed to have been caused by the external variable.

Since experimental designs originated in the context of agricultural operations, several agricultural terms (such as treatment, yield, plot and block) are still used in experimental designs.

1.5.3.1 Basic Principles of Experimental designs

a) Principle of replication

According to this principle, the experiment should be repeated more than once. Thus each treatment is applied to many experimental units instead of one. By doing so the statistical accuracy of the experiment is increased. Conceptually replication does not present any difficulty, but computationally it does. For instance, if an experiment requiring a two-way analysis of variance is replicated, it will then require a three-way analysis of variance, since replication itself may be a source of variation in the data. However, you should remember that replication is introduced in order to increase the precision of a study, that is to say, to increase the accuracy with which the main effects and interactions can be estimated.

b) Principle of randomisation

This principle provides protection against the effects of extraneous factors. In other words, it indicates that you should design or plan the experiment in such a way that the variations caused by extraneous factors can all be combined under the general heading of “chance”.
For instance, if you grow one variety of rice in the first half of a field and another variety in the other half, it is quite possible that the soil fertility differs between the two halves. If this is so, your result would not be realistic. In such a situation, you may assign the varieties of rice to different parts of the field on the basis of some random sampling technique, i.e. you may apply the randomisation principle and protect yourself against the effects of the extraneous factor (soil fertility differences in this case). Through the application of the principle of randomisation, you can thus obtain a better estimate of the experimental error.

c) Principle of local control

Under this principle the extraneous factor, the known source of variability, is made to vary deliberately over as wide a range as necessary, in such a way that the variability it causes can be measured and hence eliminated from the experimental error. This means that you should plan the experiment so that you can perform a two-way analysis of variance, in which the total variability of the data is divided into three components attributed to treatments (e.g. variety of rice), the extraneous factor (e.g. soil fertility) and experimental error. Through the principle of local control you can eliminate the variability due to the extraneous factor(s) from the experimental error.

1.5.3.2 Commonly used experimental designs

Research designs can be weak or strong (or quasi, which are moderately strong, that is, in between the weak and the strong designs) depending on the extent to which they control for the influence of confounding variables. An experimental design is the framework or structure (outline, plan or strategy) of an experiment.
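The three-way split of variability described under the principle of local control can be sketched as follows. This is a minimal illustration in Python, assuming exactly one observation for each combination of treatment and block; the function name and the yield figures are hypothetical and are not data from the source.

```python
def partition_variability(table):
    """Split total variability into treatment, block and error parts.

    table[i][j] is the single observation for treatment i in block j
    (for example, rice variety i grown at soil fertility level j).
    """
    t = len(table)       # number of treatments (rows)
    b = len(table[0])    # number of blocks (columns)
    if any(len(row) != b for row in table):
        raise ValueError("every treatment needs one value per block")

    grand_mean = sum(sum(row) for row in table) / (t * b)
    treatment_means = [sum(row) / b for row in table]
    block_means = [sum(table[i][j] for i in range(t)) / t for j in range(b)]

    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    ss_treatment = b * sum((m - grand_mean) ** 2 for m in treatment_means)
    ss_block = t * sum((m - grand_mean) ** 2 for m in block_means)
    # Whatever is not explained by treatments or blocks is treated as error.
    ss_error = ss_total - ss_treatment - ss_block
    return {
        "total": ss_total,
        "treatment": ss_treatment,
        "block": ss_block,
        "error": ss_error,
    }


# Hypothetical yields: 2 rice varieties (rows) x 3 fertility levels (columns).
yields = [
    [20.0, 24.0, 28.0],
    [22.0, 27.0, 29.0],
]
parts = partition_variability(yields)
for name, value in parts.items():
    print(f"{name}: {value:.2f}")
```

Without blocking, the variability attributed here to the block component would remain inside the error term, which is the sense in which local control reduces the experimental error.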
There are several experimental designs, which can be classified into two broad categories: informal experimental designs and formal experimental designs.

Informal experimental designs are those that normally use a less sophisticated form of analysis:
• Before-and-after without control design
• After-only with control design
• Before-and-after with control design

Formal experimental designs:
• Completely randomized design (CRD)
• Randomized block design (RBD)
• Latin square design
• Factorial designs

a) Informal experimental designs

i) Before-and-after without control design/One-group pretest-posttest design

In such a design a single test group or area is selected and the dependent variable is measured before the introduction of the treatment. The treatment is then introduced and the dependent variable is measured again afterwards. The effect of the treatment is taken to be the level of the phenomenon after the treatment minus the level of the phenomenon before the treatment.

Test area: Level of phenomenon before treatment (X) -> Treatment introduced -> Level of phenomenon after treatment (Y)
Treatment Effect = (Y) - (X)

Figure 3.1.1: Before-and-after without control design/One-group pretest-posttest design
Source: Kothari (2009)

This one-group design is a very weak research design with two problems:
• A serious problem is that you do not know whether the treatment condition itself had any effect on the dependent variable, because you have no idea what the response would have been without exposure to the treatment condition. That is, you do not have a control group with which to make your comparison.
• Another problem is that you do not know whether some confounding extraneous variable affected the dependent variable. With the passage of time, considerable extraneous variation may enter into the measured treatment effect.
Because of these problems, this design generally gives little evidence as to the effect of the treatment condition.

ii) After-only with control design

In this design two groups or areas (test area and control area) are selected and the treatment is introduced into the test area only. The dependent variable is then measured in both areas at the same time. The treatment impact is assessed by subtracting the value of the dependent variable in the control area from its value in the test area.

Test area: Treatment introduced -> Level of phenomenon after treatment (Y)
Control area: Level of phenomenon without treatment (Z)
Treatment Effect = (Y) - (Z)

Figure 3.1.2: After-only with control design
Source: Kothari (2009)

The basic assumption in such a design is that the two areas are identical with respect to their behaviour towards the phenomenon considered. If this assumption is not true, extraneous variation may enter into the treatment effect. However, data can be collected in such a design without the problems introduced by the passage of time. In this respect the design is superior to the before-and-after without control design.

iii) Before-and-after with control design/Pretest-posttest control-group design

In this design two areas are selected and the dependent variable is measured in both areas for an identical time period before the treatment. The treatment is then introduced into the test area only, and the dependent variable is measured in both areas for an identical time period after the introduction of the treatment.
The treatment effect is determined by subtracting the change in the dependent variable in the control area from the change in the dependent variable in the test area.

TIME PERIOD I
  Test area: Level of phenomenon before treatment (X)
  Control area: Level of phenomenon without treatment (A)
TIME PERIOD II
  Test area: Treatment introduced -> Level of phenomenon after treatment (Y)
  Control area: Level of phenomenon without treatment (Z)
Treatment Effect = [(Y) - (X)] - [(Z) - (A)]

Figure 3.1.3: Before-and-after with control design/Pretest-posttest control-group design
Source: Adapted from Kothari (2009)

This design is superior to the above two designs for the simple reason that it avoids extraneous variation resulting both from the passage of time and from non-comparability of the test and control areas. At times, however, a lack of historical data, time or a comparable control area may lead you to select one of the first two informal designs instead.

b) Formal experimental designs

i) Completely randomized design (CRD)

This design involves only two principles: the principle of replication and the principle of randomisation. It is the simplest possible design and its analysis is correspondingly easy. Subjects are randomly assigned to experimental treatments (or vice versa). For instance, if you have 10 subjects and wish to test 5 under treatment A and 5 under treatment B, the randomisation process gives every possible group of 5 subjects selected from the set of 10 an equal opportunity of being assigned to treatment A or treatment B. One-way analysis of variance (one-way ANOVA) is used to analyse such a design. Even unequal replications can work in this design. It provides the maximum number of degrees of freedom for the error. Such a design is generally used when the experimental areas happen to be homogeneous. Technically, when all the variation due to uncontrolled extraneous factors is included under the heading of chance variation, the design is referred to as a CRD.
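The random assignment of 10 subjects to two treatments described for the CRD can be sketched as follows. This is a minimal illustration in Python; the subject labels, the function name and the seed are hypothetical, and the seed is fixed only so that an allocation can be reproduced and audited later.

```python
import math
import random


def assign_completely_at_random(subjects, group_size, seed=None):
    """Randomly pick `group_size` subjects for treatment A; the rest get B."""
    rng = random.Random(seed)
    # Sampling without replacement gives every possible group of this size
    # the same chance of being selected.
    group_a = rng.sample(subjects, group_size)
    chosen = set(group_a)
    group_b = [s for s in subjects if s not in chosen]
    return group_a, group_b


# Hypothetical subject labels S1 ... S10.
subjects = [f"S{i}" for i in range(1, 11)]
group_a, group_b = assign_completely_at_random(subjects, 5, seed=2010)
print("Treatment A:", group_a)
print("Treatment B:", group_b)

# Number of distinct groups of 5 that can be drawn from 10 subjects.
print("Possible groups for treatment A:", math.comb(10, 5))  # 252
```

Each of the 252 possible groups is equally likely to receive treatment A, which is what the text means by giving every group of 5 an equal opportunity.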
The two forms of CRD are the two-group simple randomised design and the random replications design.

• Two-group simple randomised design

In the two-group simple CRD, you first define the population and then randomly select a sample from it. You then randomly assign the sample items to the experimental and control groups. This design yields two groups as representatives of the population. Diagrammatically the design can be shown as in Figure 3.1.4.

POPULATION -> (random selection) -> SAMPLE -> (random assignment) -> Experimental group: Treatment A
                                                                     Control group: Treatment B
(Treatments A and B are the two treatments of the independent variable.)

Figure 3.1.4: Two-group simple randomised experimental design
Source: Kothari (2009)

Since in the simple randomised design the elements constituting the sample are randomly drawn from the same population and randomly assigned to the experimental and control groups, it becomes possible to draw conclusions from the samples that are applicable to the population. The experimental and control groups of such a design are given different treatments of the independent variable. The merit of such a design is that it is simple and randomises the differences among the sample items. Its limitation is that the individual differences among those conducting the treatments are not eliminated, i.e. it does not control the extraneous variable, and as such the result of the experiment may not depict a correct picture.

• Random replications design

The limitation of the two-group simple randomised design is usually eliminated in the random replications design. In this design such differences are not ignored, i.e. the extraneous variable is controlled. Their effect on the dependent variable is minimised (reduced) by providing a number of repetitions for each treatment. Each repetition is technically called a 'replication'. The random replications design serves two purposes: it provides controls for the differential effects of the extraneous independent variables and it randomises any individual differences among those conducting the treatments.

ii) Randomized block design (RBD)

This design is an improvement over the CRD. In this design the principle of local control can be applied along with the other two principles of experimental design. In the RBD, subjects are first divided into groups, known as blocks, such that within each group the subjects are relatively homogeneous with respect to some selected variable. The variable selected for grouping the subjects is one that is believed to be related to the measures to be obtained on the dependent variable. The number of subjects in a given block is equal to the number of treatments, and one subject in each block is randomly assigned to each treatment. In general, blocks are the levels at which you hold the extraneous factor fixed, so that its contribution to the total variability of the data can be measured. The main feature of the RBD is that each treatment appears the same number of times in each block. The RBD is analysed by the two-way analysis of variance (two-way ANOVA) technique.

Suppose four different forms of a standardised test in mathematics were given to each of five students (one selected from each of the five I.Q. blocks) and the scores obtained are as shown in Figure 3.1.5 below.

          Very low I.Q.   Low I.Q.    Average I.Q.   High I.Q.   Very high I.Q.
          Student A       Student B   Student C      Student D   Student E
Form 1    82              67          57             71          73
Form 2    90              68          54             70          81
Form 3    86              73          51             69          84
Form 4    93              77          60             65          71

Figure 3.1.5: Randomised block experimental design
Source: Adapted from Kothari (2009)

If each student separately randomised the order in which they took the four tests, by using random numbers or some similar device, such a design is an RBD.
The purpose of this randomisation is to take care of possible extraneous factors such as fatigue or experience gained from repeatedly taking the test.

iii) Latin square design (LSD)

This is an experimental design frequently used in agricultural research. The conditions under which agricultural investigations are carried out differ from those in other studies, since nature plays an important role in agriculture. Suppose, for instance, that an experiment is to judge the effects of five different varieties of fertilizer on the yield of a certain crop, say wheat. In such a case, the varying fertility of the soil in the different blocks in which the experiment is performed must be taken into consideration; otherwise the results obtained may not be very dependable, because the output may be the effect not only of the fertilizers but also of the fertility of the soil. Similarly, varying seeds may have an impact on the yield.

To overcome such difficulties, the LSD is used when there are two major extraneous factors, such as varying soil fertility and varying seeds. The LSD is one in which each fertilizer, in our example, appears five times but is used only once in each row and in each column of the design. In other words, the treatments in an LSD are allocated among the plots so that no treatment occurs more than once in any one row or any one column. The two blocking factors are represented one through rows and the other through columns (Figure 3.1.6).

                            FERTILITY LEVEL
                        I     II    III   IV    V
SEEDS DIFFERENCES  X1   A     B     C     D     E
                   X2   B     C     D     E     A
                   X3   C     D     E     A     B
                   X4   D     E     A     B     C
                   X5   E     A     B     C     D

Figure 3.1.6: Latin square experimental design
Source: Kothari (2009)
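The layout in Figure 3.1.6 follows a simple cyclic pattern, with each row shifted one place to the left of the row above it. The following is a minimal sketch in Python that builds such a square and checks the defining property (each treatment once per row and once per column); the function names are hypothetical.

```python
def cyclic_latin_square(treatments):
    """Build a Latin square by shifting the treatment list one place per row."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]


def is_latin_square(square):
    """Check that every symbol occurs exactly once in each row and column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


# Five fertilizer varieties labelled A to E, as in Figure 3.1.6.
square = cyclic_latin_square(list("ABCDE"))
for row in square:
    print(" ".join(row))
print("Valid Latin square:", is_latin_square(square))
```

The cyclic square is only a starting point. In practice the rows, columns and treatment labels are usually shuffled at random before the layout is used in the field, so that the principle of randomisation is respected as well.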
Figure 3.1.6 shows that in the LSD the field is divided into as many blocks as there are varieties of fertilizer, and each block is again divided into as many parts as there are varieties of fertilizer, in such a way that each fertilizer variety is used only once in each block (whether column-wise or row-wise). The analysis of the LSD is very similar to the two-way ANOVA technique.

The merit of this experimental design is that it enables differences in fertility gradients in the field to be eliminated when comparing the effects of different varieties of fertilizer on the yield of the crop. But the design suffers from one limitation: although each row and each column represents all fertilizer varieties equally, there may be considerable differences in the row and column means both up and across the field. In other words, in the LSD we must assume that there is no interaction between treatments and blocking factors. This defect can, however, be removed by adjusting the results so that the means of the rows and columns equal the field mean.

Another limitation of the design is that it requires the numbers of rows, columns and treatments to be equal. This reduces the utility of the design. In the case of a (2 x 2) LSD, there are no degrees of freedom available for the mean square error and hence the design cannot be used. If there are 10 or more treatments, then each row and each column will be larger in size, so that rows and columns may not be homogeneous. This may make the application of the principle of local control ineffective. Therefore, LSDs of orders (5 x 5) to (9 x 9) are generally used.

iv) Factorial designs

Factorial designs are used in experiments where the effects of varying more than one factor are to be determined. They are especially important in economic and social phenomena, where a large number of factors usually affect a particular problem.
Factorial designs can be of two types:

• Simple factorial designs (two-factor factorial design)

In this case, we consider the effects of varying two factors on the dependent variable; when an experiment is done with more than two factors, we use complex factorial designs. A simple factorial design may be a 2 x 2 simple factorial design, or it may be a 3 x 4, a 5 x 3 or a similar type of simple factorial design.

                     Experimental Variable
Control Variable     Treatment A    Treatment B
Level I              Cell 1         Cell 3
Level II             Cell 2         Cell 4

Figure 3.1.7: Two by two simple factorial experimental design illustration
Source: Kothari (2009)

In this design the extraneous variable to be controlled by homogeneity is called the control variable, and the independent variable, which is manipulated, is called the experimental variable. There are two treatments of the experimental variable and two levels of the control variable. As such there are four cells into which the sample is divided. Each of the four combinations provides one treatment or experimental condition. Subjects are assigned at random to each treatment in the same manner as in a randomized group design. The means for the different cells may be obtained along with the means for the different rows and columns. The cell means represent the mean scores for the dependent variable. The column means in the given design are termed the main effects for treatments, without taking into account any differential effect that is due to the level of the control variable. Similarly, the row means are termed the main effects for levels, without regard to treatment. Thus, through this design we can study the main effects of treatments as well as the main effects of levels. An additional merit of this design is that one can examine the interaction between treatments and levels, through which one may say whether the treatments and levels are independent of each other or not.
The following examples make clear the interaction effect between treatments and levels. The data obtained in two (2 x 2) simple factorial studies may be as given in Figures 3.1.8a and 3.1.8b.

STUDY I DATA
                          Training
Control (Intelligence)    Treatment A    Treatment B    Row mean
Level I (Low)             15.5           23.3           19.4
Level II (High)           35.8           30.2           33.0
Column mean               25.6           26.7

Figure 3.1.8a: Two by two simple factorial study I data
Source: Kothari (2009)

STUDY II DATA
                          Training
Control (Intelligence)    Treatment A    Treatment B    Row mean
Level I (Low)             10.4           20.6           15.5
Level II (High)           30.6           40.4           35.5
Column mean               20.5           30.5

Figure 3.1.8b: Two by two simple factorial study II data
Source: Kothari (2009)

Both the above figures (study I and study II data) show the respective means. In Study I, Treatment B scores higher than Treatment A at the low level (23.3 against 15.5) but lower at the high level (30.2 against 35.8), so the effect of the treatment depends on the level: treatment and level interact. In Study II, Treatment B scores about 10 points higher than Treatment A at both levels, so there is no appreciable interaction.

The 2 x 2 design need not be restricted to the form explained above, i.e. one experimental variable and one control variable; it may also have two experimental variables or two control variables. For example, a college teacher compared the effect of class size as well as the introduction of a new instruction technique on the learning of research methodology. For this purpose he conducted a study using a 2 x 2 simple factorial design. His design in graphic form would be as shown in Figure 3.1.9.

                                   Experimental Variable I (Class size)
                                   Small          Usual
Experimental Variable II    New
(Instruction technique)     Usual

Figure 3.1.9: Two by two simple factorial experimental design
Source: Kothari (2009)

But if the teacher uses a design for comparing males and females, and senior and junior students in the college, in relation to their knowledge of research methodology, we have a 2 x 2 simple factorial design in which both variables are control variables, as neither variable is manipulated.

A factorial design is a design in which two or more independent variables are simultaneously investigated to determine the independent and interactive influence which they have on the dependent variable.
A factorial design also involves random assignment to the groups.
• Each combination of independent variables is called a "cell."
• Research participants are randomly assigned to as many groups as there are cells of the factorial design, if both of the independent variables can be manipulated.
• The research participants are administered the combination of independent variables that corresponds to the cell to which they have been assigned, and then they respond to the dependent variable.
• The data collected from this research give information on the effect of each independent variable separately and on the interaction between the independent variables.
• The effect of each independent variable on the dependent variable is called a main effect. There are as many main effects in a factorial design as there are independent variables. If a research design included the independent variables of gender and type of instruction, then there would potentially be two main effects, one for gender and one for type of instruction.
• An interaction effect between two or more independent variables occurs when the effect which one independent variable has on the dependent variable depends on the level of the other independent variable. For example, if gender is one independent variable and method of teaching mathematics is another, an interaction would exist if the lecture method was more effective for teaching males mathematics and individualized instruction was more effective for teaching females mathematics.

Illustration: 4 x 3 simple factorial design

The 4 x 3 simple factorial design will usually include four treatments of the experimental variable and three levels of the control variable (Figure 3.1.10).
                     Experimental Variable
Control Variable     Treatment A    Treatment B    Treatment C    Treatment D
Level I              Cell 1         Cell 4         Cell 7         Cell 10
Level II             Cell 2         Cell 5         Cell 8         Cell 11
Level III            Cell 3         Cell 6         Cell 9         Cell 12

Figure 3.1.10: Four by three simple factorial experimental design illustration
Source: Kothari (2009)

This model of a simple factorial design includes four treatments, A, B, C and D, of the experimental variable and three levels, I, II and III, of the control variable, and has 12 different cells as shown above. This shows that a 2 x 2 simple factorial design can be generalised to any number of treatments and levels, and the design is named accordingly as a (_ x _) design. In such a design the means of the columns provide the researcher with an estimate of the main effects for treatments, and the means of the rows provide an estimate of the main effects for levels. Such a design also enables the researcher to determine the interaction between treatments and levels.

• Complex factorial designs

Experiments with more than two factors at a time involve the use of complex factorial designs. A design which considers three or more independent variables simultaneously is called a complex factorial design. In the case of three factors, with one experimental variable having two treatments and two control variables each having two levels, the design used is termed a 2 x 2 x 2 complex factorial design, which contains a total of eight cells as shown in Figure 3.1.11 below.

                                 Experimental Variable
                        Treatment A                       Treatment B
                        Control Variable 2                Control Variable 2
                        Level I        Level II           Level I        Level II
Control Variable 1
  Level I               Cell 1         Cell 3             Cell 5         Cell 7
  Level II              Cell 2         Cell 4             Cell 6         Cell 8

Figure 3.1.11: Two by two by two complex factorial design
Source: Kothari (2009)
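The main effects and the interaction in a 2 x 2 design can be read off the cell means with a few subtractions. The following is a minimal sketch in Python that uses the Study I and Study II cell means given earlier; the function name is hypothetical, and averaging the cell means to obtain row and column means assumes an equal number of subjects in every cell, which the source does not state.

```python
def summarise_2x2(cell_means):
    """Summarise a 2 x 2 table of cell means.

    cell_means = [[A at level I, B at level I],
                  [A at level II, B at level II]]
    Assumes equal numbers of subjects per cell.
    """
    (a1, b1), (a2, b2) = cell_means
    row_means = [(a1 + b1) / 2, (a2 + b2) / 2]      # main effects for levels
    column_means = [(a1 + a2) / 2, (b1 + b2) / 2]   # main effects for treatments
    effect_at_level_1 = b1 - a1   # B minus A at level I
    effect_at_level_2 = b2 - a2   # B minus A at level II
    return {
        "row_means": row_means,
        "column_means": column_means,
        "effect_at_level_1": effect_at_level_1,
        "effect_at_level_2": effect_at_level_2,
        # Zero means the treatment effect is the same at both levels.
        "interaction": effect_at_level_2 - effect_at_level_1,
    }


study_1 = summarise_2x2([[15.5, 23.3], [35.8, 30.2]])
study_2 = summarise_2x2([[10.4, 20.6], [30.6, 40.4]])

for name, result in (("Study I", study_1), ("Study II", study_2)):
    print(
        name,
        "B - A at level I:", round(result["effect_at_level_1"], 1),
        "| B - A at level II:", round(result["effect_at_level_2"], 1),
        "| interaction contrast:", round(result["interaction"], 1),
    )
```

In Study I the treatment difference changes sign between the two levels (an interaction contrast of about -13.4), whereas in Study II it is almost the same at both levels (about -0.4). Whether such a contrast is statistically significant would still have to be judged with an analysis of variance on the underlying scores.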
Factorial designs are used mainly because of two advantages:
• They provide equivalent accuracy (as in the case of experiments with only one factor) with less labour and as such are a source of economy. Using factorial designs, we can determine the main effects of two (in simple factorial designs) or more (in complex factorial designs) factors or variables in one single experiment.
• They permit various other comparisons of interest. For example, they give information about effects which cannot be obtained by treating one single factor at a time. The determination of interaction effects is possible in factorial designs.

1.6 Summary

There are several research designs, and the researcher must decide, in advance of the collection and analysis of data, which design would be most appropriate for the research project. When taking this decision, the researcher must give due weight to points such as the type of universe and its nature, the objective of the study, the resource list or sampling frame, and the desired standard of accuracy.

1.7 Review exercise

1. Explain the meaning and significance of a research design.
2. Explain the meaning of the following in the context of research design:
   a) Extraneous variables
   b) Confounded relationship
   c) Research hypotheses
   d) Experimental and control groups
   e) Treatments
3. Describe some of the important research designs used in experimental hypothesis-testing research studies.
4. “Research design in exploratory studies must be flexible but in descriptive studies, it must minimise bias and maximise reliability.” Discuss.
5. Give your understanding of a good research design. Is a single research design suitable in all research studies? If not, why?
6. Explain and illustrate the following research designs:
   a) Two-group simple randomized design
   b) Latin square design
   c) Random replications design
   d) Simple factorial design
   e) Informal experimental designs
7. Write a short note on 'Experience Survey', explaining fully its utility in exploratory research studies.
8. What is research design? Discuss the basis of stratification to be employed in sampling public opinion on inflation.

1.8 References

Gomez, K. A. and Gomez, A. A. (1984). Statistical Procedures for Agricultural Research (2nd ed.). An International Rice Research Institute Book. A Wiley-Interscience Publication, John Wiley & Sons.
Kothari, C. R. (2009). Research Methodology: Methods and Techniques. New Age International Limited Publishers.
Mason, R. D., Lind, D. A. and Marchal, W. G. (1999). Statistical Techniques in Business and Economics. Irwin McGraw Hill.
Saunders, M. N. K., Lewis, P. and Thornhill, A. (2009). Research Methods for Business Students (5th ed.). Harlow: Prentice Hall, Financial Times.
Zar, J. H. (1996). Biostatistical Analysis. Prentice Hall, Inc. New Jersey.

LECTURE TWO
DATA COLLECTION METHODS
(By Dr. S. Massomo and Dr. D. Ngaruko)

2.1 Introduction

Data collection is an important aspect of any type of research study. Inaccurate data collection can distort the results of a study and ultimately lead to invalid conclusions. Data collection methods for impact evaluation vary along a continuum. At one end of this continuum are quantitative methods and at the other end are qualitative methods for data collection. In this lecture you will be introduced to various quantitative and qualitative methods used in collecting research data.
2.2 Learning outcomes
At the end of this lecture you should be able to:
• Outline various methods of collecting quantitative and qualitative data and state their
• Describe various data collection instruments/tools
• Distinguish between quantitative and qualitative data collection methods
• Apply various qualitative data collection methods appropriate for a given type of research
• Advantageously use a combination of different data collection techniques
• Identify ethical issues involved in data collection and ways of ensuring that your research informants or subjects are not harmed by your study.

2.3 Quantitative Data Collection Methods
Quantitative data collection begins after a research problem has been identified and a research design/plan has been devised. It refers to the gathering of information aimed at proving or refuting some facts. This is one of the most important aspects of the research process, as the validity of the results of a statistical analysis clearly depends on the reliability and accuracy of the data used. In turn, the reliability and accuracy of the data depend on the method of collection. Data-collection techniques allow us to systematically collect information about our objects of study (people, objects, phenomena) and about the settings in which they occur. The methods rely on random sampling and structured data collection instruments that fit diverse experiences into predetermined response categories.

In this section we briefly describe quantitative methods that are used to gather quantitative data, that is, information dealing with numbers and anything that is measurable. Statistics, tables and graphs are often used to present the results of these methods (see details in Lecture one (Module Four) on quantitative data analysis). The methods should therefore be distinguished from the qualitative methods described in section 2.4 of this Lecture.
However, you should note that in most physical and biological sciences, the use of either quantitative or qualitative methods is uncontroversial, and each is used when appropriate.

2.3.1 Sources of data
There are two major sources of data used by researchers. These are primary and secondary sources. Primary sources are information gathered directly from experimental studies or from respondents using your research instruments. In experimental studies this information is obtained by measuring the variable(s) of interest. Secondary sources are information gathered from previous studies, e.g. published material and information from internal sources such as raw data and unpublished summaries. Different techniques are employed for collecting data from primary and secondary sources (see sections 2.3.4.1 and 2.3.4.2).

2.3.2 Steps in data collection
• Define your sample
• Reflect on the research design
• Ensure research instruments are ready
• Define the data to be collected and how you are going to analyse them
• Request permission to collect data from the relevant authorities
• Pre-test your research instruments

2.3.3 Need for correct sampling
2.3.3.1 Sampling
A researcher needs to draw a suitable sample from the population. In most cases it is costly, time consuming and sometimes impractical to work with the entire population. Imagine a study that would require dissection and destruction of the hearts of a certain animal species, say the giraffe, our national symbol; it would be impractical to expect to use the entire population of giraffes from a certain game park. A carefully drawn sample can provide useful results which represent the entire population. However, it should be stressed that if you fail to draw your sample correctly you will end up with wrong conclusions from your study.
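As an illustration of drawing a sample rather than working with the entire population, the following is a minimal sketch in Python of a simple random sample taken without replacement from a sampling frame. The function name, the frame of 500 tagged animals, the sample size of 30 and the seed are all hypothetical and chosen only for the example; simple random sampling is one of several sampling approaches and may not suit every study.

```python
import random


def draw_simple_random_sample(population, sample_size, seed=None):
    """Return `sample_size` distinct units chosen at random, without replacement.

    `population` is the sampling frame (a list of unit identifiers).
    `seed` is optional; fixing it makes the draw reproducible for auditing.
    """
    if sample_size > len(population):
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)  # independent generator, does not touch global state
    return rng.sample(population, sample_size)


# Hypothetical sampling frame: 500 tagged animals in a game park.
sampling_frame = [f"animal_{i:03d}" for i in range(1, 501)]

# Draw 30 units; every unit in the frame has the same chance of selection.
sample = draw_simple_random_sample(sampling_frame, 30, seed=2010)
print(len(sample), "units selected, e.g.", sample[:3])
```

Recording the seed alongside the sampling frame lets another researcher reproduce exactly the same selection, which can help when the sampling procedure has to be documented or checked.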
2.3.3.2 Choice of sample size
Usually, the larger the absolute size of a sample, the more closely the sampling distribution of its mean will approximate the normal distribution (the central limit theorem). As a researcher you will need to consider the following aspects that may influence the choice of sample size:
• The confidence you need to have in your data
• The margin of error that you can tolerate
• The type of analysis you are going to undertake
• The size of the total population from which your sample is being drawn

2.3.4 Methods of data collection
There are several methods used for collecting data. Each method has its own uses and none is superior in all situations. Selection of appropriate method(s) for data collection is influenced by i) the nature, scope and object of enquiry, ii) availability of resources, iii) the time factor and iv) the precision required (Kothari, 2009). In the next sections we briefly describe the various data collection techniques that can be used for collection of primary and secondary data.

2.3.4.1 Collection of secondary data
In most cases there is a large amount of data that has been collected by others. As a starting point, you should try to locate these sources and retrieve the information. The information may not have been analysed or published before. Examples include census data, meteorology data, files/reports, computer databases, Government reports, and documents such as budgets, organisation charts, policies and procedures. Use of secondary data saves time and cost for the researcher and provides an insight into the outcomes of similar research. Additionally, secondary data permit examination of trends over time. However, you need to check on the reliability, suitability and adequacy of the data.
The demerits of secondary data include the following:
i) Sometimes it is difficult to gain access to the records or reports
ii) Data may not always be complete or exactly what is needed
iii) You have to verify the validity and reliability of the data
iv) Ethical issues concerning confidentiality may arise.

2.3.4.2 Collection of primary data
Usually, quantitative data gathering strategies include: i) experiments/clinical trials, ii) observing and recording well-defined events, iii) obtaining relevant data from management information systems, and iv) administering surveys (e.g., face-to-face and telephone interviews, questionnaires etc.). Let us briefly describe each of these strategies.

a) Experiment
If the researcher conducts an experiment, s/he observes and takes quantitative measurements, which constitute the data. In the case of a survey, data may be collected by using several methods such as observations, interviews and administering questionnaires. If you must collect original data it is important that you i) establish procedures and follow them, ii) maintain accurate records of definitions and coding, iii) pre-test your instruments and iv) verify the accuracy of coding.

b) Observation
Observation is a commonly used method, especially in studies related to behavioural sciences (see also section 2.4.1). This technique involves systematically selecting, watching and recording the behaviour and characteristics of living beings, objects or phenomena. Observation can be the major research technique if the interest is to check the presence or absence of an object or phenomenon, e.g. the presence of latrines and their state of cleanliness, or the condition of roads/buildings/animals. In this case you obtain actual (real time) data rather than self-reported behaviour or perceptions. Furthermore, observations can give additional, more accurate information on the behaviour of people than interviews or questionnaires.
They can also check on the information collected through interviews, especially on sensitive topics such as alcohol or drug use, or stigmatising diseases (Varkevisser et al., 2003). Observation may be done in three ways: i) unobtrusive, where no one knows you are observing, ii) participant, where you actually participate in the activity, and iii) obtrusive, where the people being observed know that you are there to observe them. While planning your observation it is advisable that you i) develop a checklist to rate your observation, ii) develop a rating scheme and iii) pilot-test the observation data collection instrument(s).

When observations are made using a defined scale they may be called measurements. For instance, the Point Count Method is one of the techniques used to assess the abundance of bird species in the wild. With this technique, the researcher randomly allocates count points to be used as representative samples for the area. The count points are visited over a period of several days or longer to assess how many and what types of birds are in the area.

c) Tracking
This is a modified observational technique that may be used to gather quantitative data. With tracking, market researchers are able to monitor the behaviour of customers as they engage in regular purchase or information gathering activities. Possibly the best-known example of tracking research is its use by websites to track customer visits. But tracking research also has offline applications, especially when point-of-purchase scanners are employed, such as tracking product purchases at grocery stores, automated collections on toll roads and the use of automated teller machines (ATMs) in banks. This method of research is expected to grow significantly as more devices are introduced that provide means for tracking.
However, some customers may see tracking devices as intrusive, and many privacy advocates have raised concerns about certain tracking methods, especially if these are not disclosed to customers.

d) Interview method (face-to-face)
An interview is a data-collection technique that involves oral questioning of respondents, either individually or as a group. The answers to the questions posed during an interview can be recorded by writing them down (either during the interview itself or immediately after the interview), by tape-recording the responses, or by a combination of both. Interviews can be conducted with varying degrees of flexibility, as described by Varkevisser et al. (2003).

i) High degree of flexibility
A flexible method of interviewing is useful if a researcher has as yet little understanding of the problem or situation being investigated, or if the topic is sensitive. The unstructured or loosely structured method of asking questions can be used for interviewing individuals as well as groups of key informants. It is frequently applied in exploratory studies of a qualitative nature. The instrument used may be called an interview guide or interview schedule6.

ii) Low degree of flexibility
Less flexible methods of interviewing are useful when the researcher is relatively knowledgeable about expected answers or when the number of respondents being interviewed is relatively large. Then questionnaires may be used with a fixed list of questions in a standard sequence, which have mainly fixed or pre-categorised answers.

Face-to-face interviews enable the researcher to establish rapport with potential participants and therefore gain their cooperation. These interviews usually yield the highest response rates in survey research. They also allow the researcher to clarify ambiguous answers and, when appropriate, seek follow-up information. Their disadvantages are that they are impractical when large samples are involved, time consuming and expensive.
Other types of interviews include telephone interviews and Computer Assisted Personal Interviewing (CAPI). The latter is a form of personal interviewing in which, instead of completing a paper questionnaire, the interviewer types the responses directly into a laptop computer.

6 Interview schedule is a term used for loosely structured tools where the interviewer asks the questions and records the answers.

e) Data collection using questionnaires
Questionnaires often make use of checklists and rating scales. These devices help simplify and quantify people's behaviours and attitudes. A checklist is a list of behaviours, characteristics, or other entities that the researcher is looking for. Either the researcher or the survey participant simply checks whether or not each item on the list is observed, present or true. A rating scale is more useful when behaviour needs to be evaluated on a continuum; Likert scales are a widely used form of rating scale.

A questionnaire consists of a number of questions printed or typed in a definite order on a form or set of forms. However, the questionnaire needs to be carefully constructed in order to obtain the required information. The following four aspects of a questionnaire need careful attention:

i) Main aspects of a questionnaire
• General form
The questionnaire can be structured or unstructured. Structured questionnaires are those with definite and pre-determined questions and a list of possible options/answers. The questions are presented with exactly the same wording and in the same order to all respondents. A highly structured questionnaire is one in which all questions and answers are specified and comments in the respondent's own words are held to a minimum (Kothari, 2009). With an unstructured (or non-structured) questionnaire, by contrast, the interviewer is provided with a general guide on the type of information to be obtained, and the formulation of questions is left to the interviewer.
Unstructured questionnaires are ideal when the aim is to probe attitudes and the reasons for certain actions or feelings. They are also useful for obtaining facts with which a researcher is not familiar.
• Question sequence
It is important to have a question sequence that is clear, logical and smooth-moving. Questions that are easiest to answer should be placed at the beginning, and the questions should flow from general to specific. Focused questions related to the objectives of the study should be placed in the main body of the questionnaire.
• Question formulation and wording
Questions should i) be easily understood and useful, ii) avoid jargon and overly technical terminology, and iii) be short, simple and clear, each prompting for only one answer.
• Questionnaire layout
Avoid unnecessarily congesting your questions, allocate sufficient space for open-ended questions, and list choices downwards rather than horizontally (across the page). It is not advisable to have questions on both sides of a sheet of paper.

ii) Administering written questionnaires
A written questionnaire (also referred to as a self-administered questionnaire) is a data collection tool in which written questions are presented that are to be answered by the respondents in written form. A written questionnaire can be administered in different ways, such as by:
• Sending questionnaires by mail with clear instructions on how to answer the questions and asking for mailed responses. The main advantages of this method are its relatively low cost and its ability to cater for large samples and reach widely spread respondents. Its demerits are a low rate of return of duly filled questionnaires and the fact that it can only be used when respondents are educated and cooperative.
• Gathering all or part of the respondents in one place at one time, giving oral or written instructions, and letting the respondents fill out the questionnaires; or
• Hand-delivering questionnaires to respondents and collecting them later.
The questions can be either open-ended or closed (with pre-categorised answers), but the questionnaire should be short, preferably taking about 20 minutes to complete.

Table 3.2.1: Comparison between data collection techniques and data collection tools
S/N  Data collection techniques            Data collection tools
1    Using secondary data                  Checklist; data compilation forms
2    Observation                           Eyes and other senses, pen/paper, watch, scales, microscope, etc.
3    Interviewing                          Interview guide, checklist, questionnaire, tape recorder
4    Administering written questionnaires  Questionnaire

2.4 Qualitative Data Collection Techniques
Qualitative data collection methods play an important role in impact evaluation by providing information useful for understanding the processes behind observed results and for assessing changes in people’s perceptions of their well-being. Furthermore, qualitative methods can be used to improve the quality of survey-based quantitative evaluations by helping to generate evaluation hypotheses, strengthening the design of survey questionnaires, and expanding or clarifying quantitative evaluation findings.
These methods are characterized by the following attributes:
• They tend to be open-ended and have less structured protocols (i.e., researchers may change the data collection strategy by adding, refining, or dropping techniques or informants)
• They rely more heavily on iterative interviews; respondents may be interviewed several times to follow up on a particular issue, clarify concepts or check the reliability of data
• They use triangulation to increase the credibility of their findings (i.e., researchers rely on multiple data collection methods to check the authenticity of their results)
• Generally their findings are not generalizable to any specific population; rather, each case study produces a single piece of evidence that can be used to seek general patterns among different studies of the same issue

Regardless of the kinds of data involved, data collection in a qualitative study takes a great deal of time. The researcher needs to record any potentially useful data thoroughly, accurately, and systematically, using field notes, sketches, audiotapes, photographs and other suitable means. The data collection methods must observe the ethical principles of research. From Craig and Douglas (2000) and Kombo and Tromp (2006) we can outline six major types of qualitative data collection techniques:
• observational and quasi-observational techniques;
• projective techniques;
• depth interviews;
• direct observation;
• standardised tests; and
• case studies
We briefly discuss each of these techniques in the following subsections7.

2.4.1 Observational and quasi-observational techniques
Observational techniques involve direct observation of phenomena in their natural settings. Observational research might be somewhat less reliable than quantitative research, yet it is more valid and flexible since the marketer is able to change his or her approach whenever needed.
Its disadvantages are the limited number of behavioural variables that can be observed and the fact that such data might not be generalisable: we can observe a customer's behaviour at a given moment and in a given situation, but we cannot assume that all other customers will act the same.

Quasi-observational techniques, on the other hand, are reported to have increased in usage over the past decades, due to the large-scale use of surveillance cameras within stores. Such techniques cost less than pure observational ones, since the costs associated with video surveillance and taping are far lower than a researcher's wage; the tape can be viewed and analysed at a later time, at the marketer's convenience. When consumers' behaviour is videotaped, they can be asked to comment on their thoughts and actions, and the conversation itself can be recorded and analysed further. The following are some common variants of observational and quasi-observational data collection techniques.

2.4.1.1 Pure observation
With pure observation, the researcher watches the behaviour of customers in real-life situations, either in situ or by videotaping the respondents (less intrusive). Videotaping can be specifically recommended when studying patterns of different cultures, since we can easily compare the behaviours taped and highlight similarities and/or differences.

2.4.1.2 Trace measures
Trace measures consist of collecting and recording traces of consumers' behaviour. Such traces can be fingerprints or tears on packages, empty packages, the contents of garbage cans and any other traces a marketer can imagine (it's all about creativity here!). In e-Marketing, trace measures come in the form of recorded visits and hits; there are numerous professional applications that can help an e-marketer analyse the behaviour of visitors on the company's website.

7 Content of qualitative data collection techniques is mainly based on the article by Otilia Otlacan titled ‘Overview on Qualitative Data Collection Techniques in International Marketing Research’. Article Source: http://EzineArticles.com/?expert=Otilia_Otlacan

2.4.1.3 Archival measures
Archival measures can be any type of historical records, public records, archives, libraries, collections of personal documents etc. Such data can prove to be of great use in analysing behavioural trends and changes over time. Researchers can also identify the cultural values and attitudes of a population at a given moment by studying the mass media content and advertisements of the period in question.

2.4.1.4 Entrapment measures
These are indirect techniques (by comparison with the previously mentioned ones) and consist of asking the respondent to react to a specific stimulus or situation, when the actual subject of investigation is totally different. The marketer plants the real stimulus among many fake ones and studies the reactions. The method is quite unobtrusive and the marketer can gather valuable, nonreactive facts. However, if the respondent becomes aware of the true subject under investigation, (s)he might change behaviour and compromise the study.

2.4.1.5 Protocols
Protocols are yet another observational research technique, in which respondents are asked to think out loud and verbally express all their thoughts during the decision-making process. Protocols are of great value for determining the factors of importance for a sale, and they can be collected in either real shopping trips or simulated ones.

2.4.2 Projective techniques
Projective techniques are based on the respondent's performance of certain tasks given by the researcher. The purpose is to have the respondents express their unconscious beliefs through the projective stimuli, and to express associations towards various symbols, images and signs.
Projective techniques can be successfully employed to:
a) Indicate emotional and rational reactions;
b) Provide verbal and non-verbal communication;
c) Give permission to express novel ideas;
d) Encourage fantasy, idiosyncrasy and originality;
e) Reduce social constraints and censorship;
f) Encourage group members to share and "open up" etc.
Projective research techniques can take the following forms, presented in the subsections below.

2.4.2.1 Collage research technique
The collage projective research technique is used to understand lifestyles and brand perceptions. Respondents are asked to assemble a collage using images and symbols from selected sets of stimuli or from magazines and newspapers of their choice.

2.4.2.2 Picture completion
Pictures can be designed to express and visualize the issue under study, and respondents have to make associations with and/or attribute words to the given pictures.

2.4.2.3 Analogies and metaphors
Analogies and metaphors are research techniques used when a larger range of projection is needed, with more complexity and depth of ideas and thoughts on a given brand, product, service or organization. The respondents are asked to freely express their associations and analogies towards the object being studied, or they can be asked to select from a set of stimuli (e.g. photos) those that fit the examined subject.

2.4.2.4 Psycho-drawing
Psycho-drawing is a data collection technique that allows study participants to express a wide range of perceptions by making drawings of what they perceive the brand (or product or service) to be.

2.4.2.5 Personalisation
Personalisation consists of asking the respondents to treat the brand or product as if it were a person and to start making associations or finding images of this person. This technique is especially recommended in order to understand what kind of personality consumers assign to a brand/product/service.
2.4.3 In-depth interviews
In-depth interviews are the most common type of qualitative data collection. These research techniques put an accent on verbal communication, and they are especially efficient when trying to discover underlying attitudes and motivations towards a product or a specific aspect/situation. Let us briefly outline a few types of in-depth interviews.

2.4.3.1 Individual in-depth interviews
These interviews are conducted on a person-to-person basis, and the interviewer can obtain very specific and precise answers. Interviews can also be conducted by phone or via internet-based media, from a centralized location: this can greatly reduce the costs associated with research, and the results are almost as accurate as those of face-to-face interviews. The only disadvantage would be the lack of non-verbal, visual communication.

2.4.3.2 Focus Group Discussions (FGDs)
FGDs are basically discussions conducted by a researcher with a group of respondents who are considered to be representative of the target population. Such meetings are usually held in an informal setting and are moderated by the researcher. Videotaping the sessions is common these days, and it can add more sources for analysis at a later time. Focus groups are perhaps the ideal technique, if affordable in terms of costs and time, to test new ideas and concepts towards brands and products, to study customers' responses to creative media such as ads and packaging design, or to detect trends in respondents' attitudes and perceptions. One of the important advantages of focus groups is the presence of several respondents at the same time, which provides a certain synergy. The disadvantages relate mainly to the costs involved and the scarcity of good professionals to conduct the interviews and discussions.
Review Exercise
Based on the extracts from the readings recommended for this lecture, draw up a checklist of at least 10 common mistakes that many researchers make when undertaking semi-structured interviews.

2.4.4 Direct Observation
Observation is a tool that provides information about actual behaviour. Direct observation is useful because some behaviour involves habitual routines of which people are hardly aware. Observations can be made of actual behaviour patterns. Kombo and Tromp (2006) outline the following three variants of observation techniques: participant observation, structured observation, and unstructured observation.

2.4.4.1 Participant observation
In participant observation the researcher becomes an active, functioning member of the culture under study, which helps to reduce reactivity. It gives a researcher an intuitive understanding of what is happening in a culture. However, the process is highly time consuming.

2.4.4.2 Structured observation
With the structured observation technique the observer is an onlooker who focuses on a small number of specific behaviour patterns, and only on those that appear in an observation list drawn up before data collection begins. This suggests that the researcher must already have some idea of the trait under observation.

2.4.4.3 Unstructured observation
Unstructured observations are helpful in understanding behaviour patterns in their physical and social context. Unlike structured observation, there is no predetermined list; the researcher, as an onlooker, collects data in the form of short descriptive notes.

2.4.5 Standardized tests
Standardized tests are used mainly in education research studies, where the researcher frequently uses them to measure one or more of the variables in a study. There are many types of tests that one may consider using to collect data in educational research, e.g.
achievement tests, personality tests, aptitude tests (such as intelligence tests) etc. Irrespective of the type of tests used to collect data, the tests must be of high validity and of unquestionable reliability.

2.4.6 Case studies
The case study is a means of obtaining detailed information on a relatively small number of units of study. Here we consider the characteristics of case studies and the circumstances in which they are useful. A case study approach is often appropriate when it is necessary to probe deeply into the systems governing behaviour and the interrelationships between people and institutions.

In a case study a single case (hence the name) or, more commonly in the social sciences, relatively few items are studied for a period of time and the results recorded. The other research designs involve studying more than one group, or studying the same group at different times in order to make comparisons. A case study may cover one or relatively few farmers or households, one group, one village, one project, one district or one nation. The focus of a case study is on the detailed structures, patterns, or interrelationships observed within each individual case included in the study, though the cases themselves may be selected to cover a range of different types of study units.

Although the case study is a very flexible and valuable method of inquiry, its main limitation is its generalisability. One cannot generalize statistically from a single case study. The case study, like the experiment, does not represent a “sample” and therefore cannot be used for making statistical generalizations about the population.

2.5 Other important aspects in data collection
2.5.1 Data storage
You need to decide how to store your data for the short and long term before you conclude your study. The two major forms of storage are the electronic and the non-electronic (paper) form. A combination of both forms is highly recommended.
2.5.2 Ethical issues/considerations in data collection
Varkevisser et al. (2003) emphasised that when we develop our data collection techniques, we need to consider whether our research procedures are likely to cause any physical or emotional harm. The harm may be caused, for example, by:
• Violating informants’ right to privacy by posing sensitive questions or by gaining access to records which may contain personal data;
• Observing the behaviour of informants without their being aware (concealed observation should therefore always be crosschecked or discussed with other researchers with respect to ethical admissibility);
• Allowing personal information to be made public which informants would want to be kept private; and
• Failing to observe/respect certain cultural values, traditions or taboos valued by your informants, e.g. dress code.

Several methods for dealing with these issues may be recommended:
• Obtaining informed consent before the study or the interview begins, and ensuring that all subjects participate voluntarily;
• Not exploring sensitive issues before a good relationship has been established with the informant;
• Ensuring the confidentiality of the data obtained; and
• Learning enough about the culture of informants to ensure it is respected during the data collection process.
If sensitive questions are asked, for example, about family planning or sexual practices, or about opinions of patients on the health services provided, it may be advisable to omit names and addresses from the questionnaires.

2.5.3 Challenges faced by researchers in data collection
Some of the challenges encountered during data collection include the following:
• Inadequate quality control of the work done by assistants
• Failure to carry out a pilot study, leading to haphazard work in the field
• Poor definition and selection of the sample
• Poor implementation: human and technical errors
• Lack of sufficient follow-up on non-respondents

2.6 Summary
When thinking of conducting academic research, surveys and experimental designs are usually the first techniques that come to one’s mind. However, experimental designs and surveys are quantitative research approaches and, in order to understand respondents’ behaviour and the social and cultural context, we will need to perform some qualitative research as well. Qualitative methods are most certainly a more appropriate option when we need to research patterns and attitudes in customer behaviour, understand the depth of the environment around the customer, and understand the cultural characteristics that influence a customer, especially when the marketer is not familiar with the country or culture. There are certain situations where qualitative research alone can provide the researcher with all the insights needed to make decisions and take actions, while in other cases quantitative research might be needed as well. This lecture has reviewed the main qualitative techniques and how and where they can be employed in research.

Quantitative data collection methods rely on random sampling and structured data collection instruments that fit diverse experiences into predetermined response categories. They produce results that are easy to summarize, compare, and generalize. Quantitative research is concerned with testing hypotheses derived from theory and/or being able to estimate the size of a phenomenon of interest. Depending on the research question, participants or experimental units may be randomly assigned to different treatments. If this is not feasible, the researcher may collect data on participant and situational characteristics in order to statistically control for their influence on the dependent, or outcome, variable.
If the intent is to generalize from the research participants to a larger population, the researcher will employ probability sampling to select participants.

You should remember that qualitative research techniques involve the identification and exploration of a number of often mutually related variables that give insight into human behaviour (motivations, opinions, attitudes), into the nature and causes of certain problems and into the consequences of the problems for those affected. ‘Why’, ‘What’ and ‘How’ are important questions.

Structured questionnaires that enable the researcher to quantify pre- or post-categorised answers to questions are an example of quantitative research techniques. The answers to questions can be counted and expressed numerically. Quantitative research techniques are used to quantify the size, distribution, and association of certain variables in a study population. ‘How many?’, ‘How often?’ and ‘How significant?’ are important questions.

The choice of data collection approach depends on the situation. Each technique is more appropriate in some situations than in others. You may want to consider a combination of data collection techniques. Oftentimes, both qualitative and quantitative research techniques are used within a single study, and multiple tools often help you to meet the evaluation needs. You should choose the correct tool to meet the needs of the evaluation (Table 3.2.2).

Table 3.2.2: Summary guide to choice of data collection method

| If you | Then use |
|---|---|
| Want anecdotes or in-depth information | Qualitative approach |
| Are not sure what you want to measure | Qualitative approach |
| Do not need to quantify | Qualitative approach |
| Want statistical analysis | Quantitative approach |
| Know exactly what you want to measure | Quantitative approach |
| Want to cover a large group | Quantitative approach |

2.7 Review exercise

1. State situations under which each of the observational and projective techniques of data collection can be more appropriate.
2. Write brief notes, supported with relevant examples, to distinguish between the following:
a) Observational and quasi-observational techniques
b) Observational and projective data collection techniques
c) Structured and unstructured observations
d) Focus Group Discussions and In-depth interviews
e) Analogies and metaphors
f) Personalisation and Protocols
3. Entrapment measures complement psycho-drawing data collection methods. The two are inseparable. Discuss this statement giving relevant examples.
4. Study the four cases presented below, and briefly describe for each case the following aspects:
• What type(s) of study you would propose.
• From whom (or from what) you would collect the data required for each study (your study populations).
• Which data collection techniques, or combination of qualitative and quantitative data collection techniques, you would use.

Case one: You have just been appointed as a Regional Education Officer in a certain region. You note that in a certain district the number of secondary school drop-outs for both girls and boys is on the increase. You suspect that the increase may be associated with the new ruby gemstone mining activities in some of the wards. How would you (i) identify the underlying causes of school drop-outs, (ii) determine whether and how students, parents and teachers themselves could contribute to alleviating the problem, and (iii) what methods would you borrow from the qualitative approach?

Case two: You are the new District Agricultural Officer. The district is faced with a maize food security threat due to extensive damage of crop produce in store by a certain insect pest. Describe how you would (i) determine the perception and understanding of farmers on insect control strategies, (ii) compare the extent of crop loss in relation to the different methods of maize storage used by farmers, and (iii) collect data to test the efficacy of two rates of insecticide application.
Case three: You are a new supermarket manager and you would like to test whether grouping of similar products on shelves helps to reduce bias in consumer selection.

Case four: You suspect that performance of students in day schools may be influenced by provision of lunch to students. How would you collect data in a study aimed at testing that hypothesis?

2.8 References

David, M. and Sutton, C. D. (2004). Social Research: The Basics. Sage Publications, London. Pp. 370.

Gill, J. and Johnson, P. (2002). Research Methods for Managers. Sage Publications, London. Pp. 13-45.

http://www.worldbank.org/poverty/impact/methods/qualitative.htm#indepth. Last visited in August 2010.

IPDET (undated). Module 8: Data Collection Methods. www.worldbank.org/oed/ipdet/presentation/M_08-Pr.pdf. Last visited in August 2010.

Kombo, D. K. and Tromp, D. L. A. (2006). Proposal and Thesis Writing: An Introduction. Paulines Publications Africa.

Kothari, C. R. (2004). Research Methodology: Methods and Techniques. New Age International Limited Publishers. Pp. 1-24.

Kumar, R. (2005). Research Methodology: A Step by Step Guide for Beginners. Sage Publications India Pvt Ltd, New Delhi. Pp. 6-14.

Otlacan, O. (undated). Overview on Qualitative Data Collection Techniques in International Marketing Research. http://EzineArticles.com/?expert=Otilia_Otlacan. Last visited in August 2010.

Saunders, M. N. K., Lewis, P. and Thornhill, A. (2009). Research Methods for Business Students (5th ed.). Harlow: Prentice Hall – Financial Times.

Varkevisser, C. M., Pathmanathan, I. and Brownlee, A. (2003). Overview of Data Collection Techniques. Module 10A in Designing and Conducting Health Systems Research Projects: Volume 1. Proposal Development and Fieldwork. KIT/IDRC. www.idrc.ca/en/ev-56606-201-1-DO_TOPIC.html. Last visited in August 2010.
MODULE FOUR
DATA ANALYSIS METHODS

LECTURE ONE
ANALYSIS OF QUANTITATIVE DATA
(By Dr. S. Massomo)

1.1 Introduction

Data analysis is an important stage of the research process. Quantitative data in raw form convey very little meaning to most people, hence data need to be processed using quantitative analysis techniques in order to turn them into useful information. Usually data analysis is preceded by a preprocessing stage where the raw data collected may be edited, coded, classified and tabulated. In this section, we discuss the basics of statistical analysis. However, we do not cover the techniques in detail. If you are unsure of the techniques you should refer to your notes from undergraduate studies or consult statistics textbooks.

1.2 Learning outcomes

At the end of this lecture, you should be able to:

• Outline various quantitative data analysis techniques and state their application.
• Distinguish between descriptive and inferential statistical methods.
• Recognise different types of data and apply various quantitative data analysis techniques appropriate for a given type of data.
• Select the most appropriate statistic to describe individual variables and to examine relationships between variables and trends in your data.
• Perform statistical calculations in a straightforward step-by-step manner.

1.3 Basic ideas about data analysis and presentation

1.3.1 Variable

Variable is the word used for the name of any characteristic of a population or sample that is of interest to us, e.g. age. Data is the word that describes the actual values (measurements or observations) of the variable. Data may be either quantitative (numerical) or qualitative (categorical). Note that the word ‘data’ is the plural form of the word ‘datum’.

1.3.2 Scales of measurement

The selection of the type of analysis to use on a set of data depends on the scale of measurement of the data.
These scales are nominal, ordinal and numerical (sub-divided into interval and ratio). The qualities of the scales of measurement are shown in Table 4.1.1.

Table 4.1.1: Comparison of Qualities of Measurement Scales (After Dunn, 2001). The scales are ordered from nominal (more qualitative, provides less information) to ratio (more quantitative, provides more information).

| Scale name | Defining features | Operations | Examples |
|---|---|---|---|
| Nominal | Names, labels, categories | =, ≠ | Gender (1=male, 2=female), ethnicity or religion of a person, smoker vs non-smoker |
| Ordinal | Observations ordered or ranked | <, > | Class rank (1st, 2nd, ..), rank such as low, high |
| Interval | Order or ranking, equal intervals between observations, no true zero point | +, -, ×, ÷ | Fahrenheit temperature, IQ score |
| Ratio | Order or ranking, equal intervals between observations, true zero point | +, -, ×, ÷ | Weight, height, reaction time, speed etc. |

a) Nominal scale

A nominal scale is where the data can be classified into non-numerical or named categories, and the order in which these categories can be written or asked is arbitrary. Examples include marital status and sex. The best you can do with such data is to count the number of observations in each category and then calculate the proportion or percentage of all observations that fall into each category.

b) Ordinal scale

An ordinal scale is where the data can be classified into non-numerical or named categories and an inherent order exists among the response categories. Ordinal scales are seen in questions that call for ratings of quality (for example, very good, good, fair, poor, very poor) and agreement (for example, strongly agree, agree, disagree, strongly disagree). Qualitative data that can be ranked may be analysed by appropriate non-parametric techniques.
c) Numerical scale

A numerical scale is where numbers represent the possible response categories, there is a natural ranking of the categories, zero on the scale has meaning and there is a quantifiable difference within categories and between consecutive categories.

1.4 Methods of quantitative data analysis

Statistical data analysis divides the methods for analysing data into two categories: exploratory (descriptive) and confirmatory (inferential) methods. In the following section, we look at the exploratory methods.

1.4.1 Descriptive statistics

1.4.1.1 Introduction

Descriptive statistics are statistical procedures that describe, organise and summarize the main characteristics of sample data. These include measures of central tendency (averages: mean, median and mode) and measures of variability about the average (range and standard deviation). These give the reader a 'picture' of the data collected and used in the research study.

Often the first step taken towards summarizing a mass of numbers is to form a table of means and/or a frequency distribution. We use the example of the age of a sample of active OUT students at Morogoro to illustrate this point. We then summarise the age data by using mean and standard deviation values in a table, bar graph, histogram and pie chart. The age data were collected from 368 students and recorded as shown in the first six columns of Table 4.1.2.

Usually some sort of data processing is done before data analysis. The data in Table 4.1.2 were first coded to change sex and programme names into numbers, and then dates of birth were converted to age in years. Finally, ‘Age at first year’ was obtained by subtracting ‘Year of study’ from ‘Age in year 2010’.

Table 4.1.2: Data recording sheet showing students’ names, date of birth, year of study, sex and programme of study

| S/N | Name | Sex | Year of study | Progr. name | Date of birth | Age in year 2010 | Progr. code | Sex code | Age at first year |
|---|---|---|---|---|---|---|---|---|---|
| 1 | A | F | 4 | B.Com | 26.06.81 | 29 | 1 | 1 | 25 |
| 2 | B | F | 3 | B.Com | 22.02.72 | 38 | 1 | 1 | 35 |
| 3 | C | F | 3 | B.Com | 30.11.84 | 26 | 1 | 1 | 23 |
| 4 | D | M | 10 | B.Com Ed | 19.05.55 | 55 | 2 | 2 | 45 |
| 5 | E | F | 4 | B.Com Ed | 18.12.74 | 36 | 2 | 1 | 32 |
| 6 | F | F | 7 | B.Ed | 15.12.72 | 38 | 3 | 1 | 31 |
| 7 | G | F | 8 | B.Ed | 12.04.70 | 40 | 3 | 1 | 32 |
| 8 | H | M | 8 | B.Ed | 03.05.62 | 48 | 3 | 2 | 40 |
| 9 | I | M | 5 | B.Ed | 01.01.60 | 50 | 3 | 2 | 45 |
| 10 | J | F | 5 | B.Ed | 01.08.71 | 39 | 3 | 1 | 34 |
| 11 | K | F | 6 | B.Ed | 28.03.75 | 35 | 3 | 1 | 29 |
| 12 to 367 | (rows not shown) | | | | | | | | |
| 368 | Z | M | 1 | LLB | 07.05.82 | 28 | 15 | 2 | 27 |

NB: Age at first year was obtained by subtracting ‘Year of study’ from ‘Age in year 2010’.

We first summarize the data by calculating and presenting means, standard deviations and range. The summary is shown in Table 4.1.3. Later on we show how the information on the variable ‘age at first year’ can be summarised in a histogram, bar graph and a pie chart.

Table 4.1.3: Summary of age of students at first year of study across study programmes and sex (n=366). Means are shown ± standard deviation; "-" indicates that there were no students in the group.

| Programme | All: Mean | All: Min | All: Max | All: N | Female: Mean | Female: Min | Female: Max | Female: N | Male: Mean | Male: Min | Male: Max | Male: N |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B.Com | 27.7 ± 6.4 | 23 | 35 | 3 | 27.7 ± 6.4 | 23 | 35 | 3 | - | - | - | 0 |
| B.Com Ed | 38.5 ± 9.2 | 32 | 45 | 2 | 32.0 | 32 | 32 | 1 | 45.0 | 45 | 45 | 1 |
| B.Ed | 36.2 ± 8.5 | 21 | 54 | 112 | 36.1 ± 8.4 | 21 | 53 | 44 | 36.3 ± 8.7 | 21 | 54 | 68 |
| BA Ed | 32.8 ± 8.8 | 21 | 50 | 78 | 31.9 ± 9.7 | 21 | 50 | 36 | 33.5 ± 8.1 | 23 | 50 | 42 |
| BA Gen | 35.8 ± 6.8 | 24 | 50 | 26 | 37.0 ± 7.3 | 24 | 46 | 6 | 35.5 ± 6.8 | 26 | 50 | 20 |
| BA Jour | 29.3 ± 4.6 | 23 | 34 | 4 | 34.0 | 34 | 34 | 1 | 27.7 ± 4.2 | 23 | 31 | 3 |
| BA Soc | 35.9 ± 7.3 | 24 | 49 | 14 | 32.7 ± 7.6 | 24 | 38 | 3 | 36.8 ± 7.3 | 28 | 49 | 11 |
| BA Tour | 35.3 ± 10.2 | 23 | 50 | 7 | - | - | - | 0 | 35.3 ± 10.2 | 23 | 50 | 7 |
| BBA Ed | 40.3 ± 5.3 | 36 | 48 | 4 | 39.0 | 39 | 39 | 1 | 40.7 ± 6.4 | 36 | 48 | 3 |
| BBA Gen | 32.8 ± 6.2 | 23 | 46 | 24 | 28.5 ± 3.7 | 24 | 32 | 4 | 33.7 ± 6.3 | 23 | 46 | 20 |
| BSc Ed | 29.1 ± 8.9 | 21 | 50 | 22 | 25.2 ± 5.8 | 21 | 38 | 10 | 32.4 ± 9.9 | 22 | 50 | 12 |
| BSc Env | 39.8 ± 12.4 | 23 | 51 | 4 | - | - | - | 0 | 39.8 ± 12.4 | 23 | 51 | 4 |
| BSc Gen | 38.6 ± 7.3 | 24 | 50 | 36 | 40.6 ± 7.8 | 24 | 50 | 8 | 38.1 ± 7.1 | 24 | 49 | 28 |
| CYP | 32.0 | 32 | 32 | 1 | - | - | - | 0 | 32.0 | 32 | 32 | 1 |
| LLB | 34.3 ± 7.4 | 23 | 50 | 29 | 36.8 ± 13.1 | 24 | 50 | 4 | 34.0 ± 6.4 | 23 | 48 | 25 |
| Total | 34.8 ± 8.4 | 21 | 54 | 366 | 33.7 ± 9.1 | 21 | 53 | 121 | 35.3 ± 8.0 | 21 | 54 | 245 |

Comments on the data: You will notice that, incidentally, there are no female students in the BA Tour, BSc Env and CYP programmes; likewise there are no male students in B.Com. Age at first year ranges from 21 to 54, with a mean of 34.8 ± 8.4 (standard deviation). Among all students, the highest variation in age appears in BSc Env (standard deviation 12.4). Generally, age at first year seems to be similar for male and female students. However, female students seem to start studying at a younger age than male students in the BSc Ed, BBA Ed and BA Soc programmes.

We further use the data in the column for the number of all students (i.e. column 5 of Table 4.1.3) to plot Figure 4.1.1. The histogram in Figure 4.1.2 was prepared using seven classes representing the range of age at first year, i.e. 21 to 54. Additionally, a pie chart is presented in Figure 4.1.3 to indicate the relative proportion of the distribution of the students’ age groups.
[Figure 4.1.1: Bar chart showing number of students at Morogoro in different undergraduate programmes (n=367). Horizontal axis: programmes; vertical axis: number of students.]

[Figure 4.1.2: Histogram showing the proportion of students’ age groups at first year of study (n=367). Horizontal axis: age groups (21-25, 26-30, 31-35, 36-40, 41-45, 46-50, 51-55); vertical axis: number of students.]

[Figure 4.1.3: Pie chart showing proportion of students in the seven age groups (n=367).]

1.4.2 Inferential Statistics

1.4.2.1 Introduction

In the previous section, we showed how age data can be summarised, organised and presented in different ways. In this section, we look at another data analysis technique called inferential statistics. Inferential statistics extend the scope of descriptive statistics by examining the relationships within a set of data; in particular, inferential statistics enable the researcher to make inferences, that is, judgements about the population based on the relationships within the sample data. This is achieved by estimation of population parameters and by hypothesis testing.

Most measurement data where the aim is to compare means can be analysed using an analysis of variance (ANOVA), a t-test or a suitable non-parametric method. Scores and proportions often use a chi-squared test, while cause-effect/dose-response relationships use regression analysis. Other methods may be needed when there are multiple outcomes. You should note that the design of an experiment and its statistical analysis are intimately connected.

1.4.2.2 Hypothesis testing

Hypothesis testing, also known as testing of hypothesis or test of significance, is a statistical test that examines a set of sample data and, on the basis of an expected distribution of the data (e.g. t, Z, F or Chi-square), leads to a decision about whether to reject the null hypothesis (Ho) in favour of the alternative hypothesis (Hi). It is a procedure for establishing whether a test detected a reliable difference between two or more groups.
The steps involved in hypothesis testing are:

• State the null hypothesis (Ho) and the alternative hypothesis (Hi).
• Select the level of significance.
• Choose the appropriate test to use.
• Formulate a decision rule by obtaining a critical value from the table; if the test statistic falls in the critical region we reject Ho.
• Determine the value of the test statistic.
• Compare the value of the test statistic with the critical value and conclude either ‘do not reject Ho’ or ‘reject Ho and accept Hi’.

a) Terminologies related to hypothesis testing

We start by describing the important terminologies related to hypothesis testing:

i) Hypothesis: A statement about a population that is subject to testing. A hypothesis may be null or alternative.

ii) Null hypothesis: A statement about a population that is under test. It is denoted Ho and states that there is no difference between means or that there is no effect. It always includes the equal sign ‘=’.

iii) Alternative hypothesis: A statement that is true when Ho is false and that answers your research question. It can assume three possible forms: Hi: X > µ, X < µ or X ≠ µ. The alternative hypothesis (Hi) determines whether the test is one tailed (left or right) or two tailed. It is characterised by the presence of an inequality sign.

iv) Level of significance: Also called alpha (α). It is the probability of rejecting the null hypothesis when it is true. The related p-value is the probability of observing a sample value as extreme as, or more extreme than, the value observed, given that the null hypothesis is true. The smaller the p-value, the stronger the evidence against Ho, and Ho is rejected when the p-value is below the chosen level of significance. There is no one level of significance that is applied to all tests; 0.05 and 0.01 are common for most studies.

v) Test statistic: A value, determined from sample information, that is used to determine whether to reject the null hypothesis, e.g. the Z, t or F value.
vi) Critical value: The value(s) which separate the critical region from the non-critical region. The critical values are determined independently of the sample statistics. They are read from appropriate tables of distribution.

vii) Critical region: Also called the rejection region, this is the set of all values which would cause us to reject Ho. If the test statistic falls in the rejection region, Ho is rejected.

viii) Decision rule/statement: When the test statistic exceeds the critical value, Ho is rejected and a statement is made based upon the null hypothesis. It is either ‘reject the null hypothesis’ or ‘fail to reject the null hypothesis’. Usually we never accept the null hypothesis.

ix) Conclusion: A statement which indicates the level of evidence (sufficient or insufficient) at a specific level of significance, and whether the original claim is rejected (null hypothesis) or supported (alternative hypothesis).

x) Statistical significance: Refers to whether a statistical test detected a reliable difference between two or more groups, for instance one caused by the effect of an independent variable on a dependent measure.

xi) Type I error: Occurs if we reject Ho when it is true. It is usually the more serious error.

xii) Type II error: Occurs if we fail to reject Ho when it is false, that is, treating Ho as true when it is false.

For example, defendants are usually presumed innocent until proven guilty. The purpose of a court trial is to see whether a null hypothesis of innocence is rejected by the weight of the data (evidence).

The null hypothesis Ho: the person is innocent.
The alternative hypothesis Hi: the person is guilty.

Which is the more serious error: convicting an innocent person or letting the guilty person go free?

b) Testing for the population mean for large samples

i) Introduction

The standard normal distribution, that is the Z distribution, is the appropriate test statistic to use when the sample has at least 30 observations and the population standard deviation is unknown.
If the sample size is less than 30, consider using the t-test described in part (c) of this section. The Z statistical test uses the sample standard deviation to estimate the unknown population standard deviation. In the following sections we look at the use of the Z test for inference about one or two populations.

ii) Inference about one population mean

The one sample Z test may be used to test whether there is any significant difference between a sample mean and a known population mean value. We can check whether the age of students in first year, using the sample of Morogoro OUT students, is similar to the starting age in conventional universities. Suppose the average age of students when they begin studying in conventional universities is 28 years; the one sample Z test will be used to compare the sample mean with the population mean of 28 years. The sample statistics are mean = 34.8, n = 366 and standard deviation = 8.4 (see Table 4.1.3). In this case Ho: µ = 28 and Hi: µ ≠ 28. If we decide to use the 0.05 significance level, the value of the test statistic is computed as follows.

The formula is:

$$Z = \frac{\bar{X} - \mu}{SD / \sqrt{n}}$$

where $\bar{X}$ = sample mean, µ = population mean, SD = sample standard deviation and n = sample size.

$$Z = \frac{34.8 - 28}{8.4 / \sqrt{366}} = \frac{6.8}{0.44} \approx 15.45$$

Critical value from the table at p = 0.05 is 1.96.

Conclusion: Since the test statistic (Z = 15.45) exceeds the critical value (1.96), we reject the null hypothesis Ho: µ = 28 and accept the alternative hypothesis Hi: µ ≠ 28. In other words, the sample mean (34.8) is significantly different from the population mean (28).

iii) Inference about two means using the Z test

We use the data on age of BA Ed students (see Table 4.1.3) to illustrate how the Z test can be used to test for a statistical difference between two samples. In this case, we compare the mean age at first year of female students with that of male students in the BA Ed programme, using a two sample Z test at the 0.05 level of significance.
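As a rough check on the Z calculations in this section, the following Python sketch computes the one sample statistic from the summary figures quoted above (mean 34.8, standard deviation 8.4, n = 366, population mean 28) and the two sample statistic for female and male BA Ed students (figures from Table 4.1.3). The function names are illustrative only and are not part of the course material, and the two-sided p-value is taken from the standard normal distribution. Carrying intermediate values without rounding gives about 15.49 for the one sample case and about -0.78 for the two sample case, so small differences from hand-worked figures reflect rounding of intermediate steps.

```python
import math


def one_sample_z(sample_mean, pop_mean, sd, n):
    """Z statistic for a sample mean against a hypothesised population mean."""
    standard_error = sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic for the difference between two independent sample means."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


CRITICAL_Z = 1.96  # two-tailed critical value at the 0.05 level

# Morogoro sample versus the assumed population mean of 28 years
z_one = one_sample_z(34.8, 28, 8.4, 366)
# Female versus male BA Ed students (summary figures from Table 4.1.3)
z_two = two_sample_z(31.9, 9.7, 36, 33.5, 8.1, 42)

for label, z in [("one sample", z_one), ("two sample", z_two)]:
    decision = "reject Ho" if abs(z) > CRITICAL_Z else "fail to reject Ho"
    print(f"{label}: Z = {z:.2f}, p = {two_sided_p(z):.4f}, {decision}")
```

The decision is made on the absolute value of Z because both tests here are two tailed.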
Table 4.1.4: Mean age, standard deviation and sample size for BA Ed students

| Group | Mean age | Standard deviation | Sample size |
|---|---|---|---|
| Female students | 31.9 | 9.7 | 36 |
| Male students | 33.5 | 8.1 | 42 |

Null hypothesis (Ho): µ1 = µ2, that is, the age at first year is the same for female and male BA Ed students.
Alternative hypothesis (Hi): µ1 ≠ µ2, that is, the age at first year is different for female and male BA Ed students.

$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{SD_1^2}{n_1} + \dfrac{SD_2^2}{n_2}}} = \frac{31.9 - 33.5}{\sqrt{\dfrac{9.7^2}{36} + \dfrac{8.1^2}{42}}} = \frac{-1.6}{\sqrt{2.61 + 1.56}} = \frac{-1.6}{2.04} \approx -0.78$$

(we use the absolute value, 0.78).

Critical value from the table at p = 0.05 is 1.96.

Conclusion: Since the test statistic (calculated absolute value of Z: 0.78) is less than the critical value (1.96), we fail to reject the null hypothesis and conclude that the two means for age at first year do not differ significantly.

c) Testing for the population mean for small samples

i) Introduction

In the previous section, we showed how the Z test statistic can be used in hypothesis testing. In this section, we look at hypothesis testing using the t test. The t-test, which uses the t distribution, is used for small samples (n < 30) because the Z distribution provides unreliable estimates of differences between samples when the number of available observations is less than 30.

Characteristics of the t-distribution are:

• It is a continuous distribution.
• It is mound shaped and symmetrical.
• It is flatter, or more spread out, than the standard normal distribution.
• There is a family of t-distributions, depending on the number of degrees of freedom (d.f.).

Application of the t-test

The t test was created to deal with small samples when the parameters and variability of the larger parent population are unknown. The t tests are used to compare one or two sample means, but not more than two means. The use of the t test in hypothesis testing assumes that the sample was drawn from a normally distributed population. The t test detects a significant difference between means when the (i) difference is large, (ii) sample standard deviation is small and/or (iii) sample size is large.

Variations of the t test

I. Single or one sample t test

This is used to compare the observed mean of one sample with a hypothesized value assumed to represent a population. The t and Z tests both use similar formulas:

$$\text{Test statistic} = \frac{\text{sample mean} - \text{hypothesized mean}}{\text{standard error}}, \qquad t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$

It tries to answer the question: is it likely that a sample with a given mean could have come from a population with the proposed µ? It is usually used to determine whether some set of scores or observations deviates from some established pattern. If the population standard deviation, sigma, is unknown, then the standardised sample mean has a Student's t distribution, and you will be using the t-score formula for sample means. The test statistic is very similar to that for the z-score, except that sigma has been replaced by s and z has been replaced by t. The critical value is obtained from the t-table. The degrees of freedom for this test are n - 1.

Worked example of a single or one sample t test

Suppose a retail shop sells an average of 320 units of a certain product per day with a standard deviation of 40. After an extensive advertising campaign the manager calculated the average sales of the product for the next 25 days to see whether an improvement had occurred. The average sales turned out to be 345 units per day.

From this information can we say that the advertisement significantly improved sales of the product? We cannot say until we perform a statistical test; in this case we use the one sample t-test and the 0.05 level of significance.

The hypotheses

Null hypothesis (Ho): µ = 320. Advertisement does not affect/improve sales of the product.
Alternative hypothesis (Hi): µ > 320. Advertisement does improve sales of the product.

We use the formula

$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{345 - 320}{40 / \sqrt{25}} = \frac{25}{8} = 3.125$$

Critical value at t 0.05, 24 d.f. = 1.711

Conclusion: Since the value of the test statistic (t = 3.125) is greater than the critical value (1.711), we reject the null hypothesis Ho: µ = 320 and accept the alternative hypothesis Hi: µ > 320 units.
In other words, the sample mean (345) is significantly greater than the population mean (320); hence the advertisement improved sales of the product.

II. The t test for independent groups (two sample test)

Samples are independent when they are not related. Independent samples may or may not have the same sample size. The independent sample t test is designed to detect a significant difference between one group, e.g. a control group, and another group such as the experimental group. It tries to answer the question: is X1 different from X2, or could the two sample means come from identical populations?

Assumptions required for the independent t test:

• The populations are normally distributed.
• The populations are independent.
• The standard deviations are the same in both populations.
• Therefore the standard deviations are pooled.

Worked example of a two sample t test

Suppose we have two samples with the following measurements, and we want to test whether the two means are statistically different at, say, the 0.05 level of significance.

| Sample 1 | Sample 2 |
|---|---|
| 2 | 3 |
| 4 | 7 |
| 9 | 5 |
| 3 | 8 |
| 2 | 4 |
| | 3 |
| n = 5, X1 = 4.0, SD = 2.9155 | n = 6, X2 = 5.0, SD = 2.0976 |

Step 1. Calculate the mean and standard deviation for each sample (shown in the last row of the table above).

Step 2. Pool the sample variances:

$$S_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{4(2.9155)^2 + 5(2.0976)^2}{9} = \frac{34 + 22}{9} = 6.222$$

Step 3. Determine the t value:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{4.0 - 5.0}{\sqrt{6.222 \left( \dfrac{1}{5} + \dfrac{1}{6} \right)}} = \frac{-1.0}{1.5105} = -0.662$$

(we use the absolute value, 0.662), where
$\bar{X}_1$ is the mean of the first sample,
$\bar{X}_2$ is the mean of the second sample,
$n_1$ is the number in the first sample,
$n_2$ is the number in the second sample, and
$S_p^2$ is the pooled estimate of the population variance.

Conclusion: Since the calculated t value (0.662) is less than the critical value of 2.262 at p = 0.05 and 9 degrees of freedom, we fail to reject the null hypothesis and conclude that there is no statistical difference between the means of the two samples.

III. Dependent samples t test (paired samples t test)

The dependent sample t test is commonly used with samples in which the subjects are paired or matched in some way.
Dependent samples must have the same sample size, but note that it is possible to have the same sample size without being dependent. Types of dependent samples are:

(i) Those characterised by a measurement, an intervention of some type, then another measurement. In this case a paired t test is designed to detect the presence of measurable change in the average attitude/behaviour of a group from one point in time to another.
(ii) Those characterised by a matching or pairing of observations, e.g. newly-wed couples where husbands are paired with their wives.

It tries to answer the question: is mean one (X1) different from mean two (X2)?

Procedure

We calculate the mean of the differences, not the difference between the two means. The idea with the dependent t test is to create a new variable, D, which is the difference between the paired values. You will then be testing the mean of this new variable. Here are some steps to help you accomplish the hypothesis testing:

• Write down the original claim in simple terms. For example: After > Before.
• Move everything to one side: After - Before > 0.
• Call the difference you have on the left side D: D = After - Before > 0.
• Convert to proper notation: $\mu_D > 0$.
• Compute the new variable D and be sure to follow the order you have defined in the third step above. Do not simply take the smaller away from the larger. From this point, you can think of having a new set of values. Technically, they are called D, but you can think of them as X. The original values from the two samples can be discarded.
• Find the mean and standard deviation of the variable D. Use these as the values in the t-test.

The value of t is

$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$

(note the standard error, $s_d / \sqrt{n}$, in this formula), with

$$s_d = \sqrt{\frac{\sum d^2 - \dfrac{(\sum d)^2}{n}}{n - 1}}$$

(note that this formula is similar to the usual formula for a standard deviation) and

$$\bar{d} = \frac{\sum d}{n}$$

where $\bar{d}$ = mean of the differences between paired or related observations, and $s_d$ = standard deviation of the distribution of the differences between paired or related observations.

Worked example: Paired t test

A study was carried out to measure the effect of a fitness campaign for OUT students at Dodoma regional centre. Five students were randomly sampled and their weights (in kg) were recorded before and after the exercise, as presented in the following table. We determine, using the 0.05 level of significance, whether the campaign had any significant effect on the students.

Table 4.1.5a: Weight of students in kilogrammes, before and after the fitness campaign.

| Student | Before | After |
|---|---|---|
| A | 88.45 | 89.35 |
| B | 76.65 | 73.93 |
| C | 83.00 | 81.65 |
| D | 70.30 | 68.04 |
| E | 76.20 | 72.57 |

Table 4.1.5b: Weight of students in kilogrammes, before and after the fitness campaign, with the differences D = Before - After and their squares.

| Student | Before | After | D | d² |
|---|---|---|---|---|
| A | 88.45 | 89.35 | -0.90 | 0.81 |
| B | 76.65 | 73.93 | 2.72 | 7.40 |
| C | 83.00 | 81.65 | 1.35 | 1.82 |
| D | 70.30 | 68.04 | 2.26 | 5.11 |
| E | 76.20 | 72.57 | 3.63 | 13.18 |
| Total | | | 9.06 | 28.32 |
| Mean | | | 1.81 | |

$$s_d = \sqrt{\frac{28.32 - \dfrac{(9.06)^2}{5}}{5 - 1}} = \sqrt{\frac{28.32 - 16.42}{4}} = \sqrt{2.98} = 1.73$$

$$t = \frac{\bar{d}}{s_d / \sqrt{n}} = \frac{1.81}{1.73 / \sqrt{5}} \approx 2.35$$

(2.35 is obtained when unrounded intermediate values are carried through.)

Critical value at 4 degrees of freedom = 2.132 (this is read from the Student's t distribution table at 4 degrees of freedom for a one tailed test).

Conclusion: Since the test statistic (calculated t value: 2.35) is greater than the critical value (2.132), we reject the null hypothesis and accept the alternative hypothesis. We conclude that the fitness campaign significantly reduced the weights of students.

d) The Analysis of Variance/F-Test

i) Introduction

The F-test, commonly known as the analysis of variance (ANOVA), is the most widely used method of statistical analysis of quantitative data. It is important that every researcher doing quantitative studies should be familiar with this technique. The ANOVA calculates the probability that differences among the observed means could simply be due to chance. The ANOVA is based on the F statistical distribution.
It is closely related to Student's t-test, but whereas the t-test is only suitable for comparing two treatment means, the ANOVA can be used both for comparing several means and in more complex situations, provided that certain assumptions are met. The t statistic and F ratio share a specific relationship when only two means are present. The relationship is $\sqrt{F} = t$, or $t^2 = F$ ratio.

The ANOVA partitions the total variation into a number of parts such as Treatment, Block, Error and Total, depending on the design of the experiment. The assumptions that have to be met include:

• The data are in interval or ratio scale.
• The populations from which the samples were drawn have equal variances.
• The observations are independent.
• The residuals (deviations from group means) have a normal distribution.
• The data were randomly sampled.

In some cases a scale transformation is necessary in order for the assumptions required for a valid ANOVA to be met.

ii) Variations of ANOVA

There are several variations of the ANOVA, used according to the number of factors (see note 8), treatments and other sources of variation.

• The completely randomised design (CRD): This is the simplest one factor experimental design and is only used when experimental units are homogenous. For instance, nutrient medium in Petri dishes is a homogenous mixture that can be used for testing the sensitivity of a certain bacterium to different drugs or to different levels of a single drug.
• Blocked or stratified designs: These are the most commonly used in ANOVA. The designs include the randomised block and the Latin square. They break the experiment up into smaller "mini-experiments". These designs are usually analysed by a two-way analysis of variance without interaction (see worked example).
• Factorial designs: These designs look at the effect of two or more "factors" simultaneously. There can be any number of factors and any number of levels of each factor. Factorial designs provide extra information at little or no extra cost.
They are commonly used to:
- Study any interactions among factors. It is often important to know what factors influence the outcome of an experiment.
- Increase the amount of information from an experiment without increasing the number of experimental units. They are almost like doing two or more experiments simultaneously with the same experimental units.
- Find the combination of factors which produces most sensitivity in subsequent similar experiments.
Blocking and factorial designs are not mutually exclusive. Both can be used in a single experiment. We look at the example of a single factor ANOVA using the Randomised Complete Block Design (RCBD).

Footnote 8: A factor is an independent variable within the ANOVA. A factor must have two or more levels within it to be analytically viable.

The single-factor ANOVA
The RCB design partitions the total variation into parts associated with treatments, blocks and error. The design consists of blocks of equal size, each of which contains all treatments. The blocking technique helps to reduce experimental error by eliminating the contribution of known sources of variation among the experimental units. This is achieved by grouping experimental units into blocks such that variability within each block is minimised and variability among the blocks is maximised. In experiments, blocking is most effective when the experimental area has a predictable pattern of variability, e.g. slope of field, soil fertility gradient etc.

Table 4.1.6: Layout of a Randomised Complete Block experiment with four treatments (A, B, C & D) and four blocks/replications. Note that each treatment appears once in each block.
Block I     A   C   D   B
Block II    C   B   A   D
Block III   C   D   B   A
Block IV    B   A   D   C
(Blocks are arranged along the predictable direction of variation in fertility or soil moisture.)

Worked example: the Single factor two way ANOVA
A field experiment was carried out to test the effect of fertilizer application on the yield of rice.
There were three treatments (see footnote 9), namely: (i) no fertilizer, (ii) fertilizer A and (iii) fertilizer B. The no fertilizer treatment serves as a control. The experiment was laid out in a Randomised Complete Block Design with four replications. In this case we test whether there are any statistical differences among the means for treatments and for blocks/replications. This is an example of a two way ANOVA with two sources of variation, i.e. treatments and blocks (replications).
Null hypothesis H0: $\mu_1 = \mu_2 = \mu_3$
Alternative H1: The mean of at least one treatment is different.

Footnote 9: Treatment refers to the cause or specific source of variation in the data. In experiments it is a procedure whose effect is being measured, in this case fertiliser application.

Table 4.1.7: Yield of rice (Kg/plot) following application of fertilizer in a field trial laid out in Randomised Complete Block Design.
Replication     No fertilizer   Fertilizer A   Fertilizer B   Rep. totals
1               6.0             6.9            7.2            20.1
2               6.5             7.0            7.8            21.3
3               5.5             6.6            6.8            18.9
4               6.4             7.5            7.4            21.3
Treat. totals   24.4            28.0           29.2           81.6

The Grand total: $\sum x = 6.0 + 6.9 + \dots + 7.4 = 81.6$
Correction factor (CF) = $(\sum x)^2/n = (81.6)^2/12 = 554.88$
Total sum of squares = $(6.0^2 + 6.9^2 + \dots + 7.4^2) - CF = 559.56 - 554.88 = 4.68$
Replication sum of squares = $(20.1^2 + 21.3^2 + 18.9^2 + 21.3^2)/3 - CF = 1668.60/3 - CF = 556.20 - 554.88 = 1.32$
Treatment sum of squares = $(24.4^2 + 28.0^2 + 29.2^2)/4 - CF = 2232.00/4 - CF = 558.00 - 554.88 = 3.12$
Error sum of squares = Total SS - (Rep SS + Treat SS) = 4.68 - (1.32 + 3.12) = 0.24

Table 4.1.8: Analysis of variance of the data in Table 4.1.7 from a single factor two way Randomised Complete Block Design.
Source of variation   Degrees of freedom   Sum of squares   Mean SS   F-value   Table F-value (p ≤ 0.05)
Replication           3                    1.32             0.44      11.00     4.76
Treatment             2                    3.12             1.56      39.00     5.14
Error                 6                    0.24             0.04
Total                 11                   4.68

Conclusion
(i) Since the calculated F value of 39.0 (test statistic) for treatments > the table value of 5.14 (critical value), we reject H0.
We accept H1 and conclude that the means for treatments/fertilizers differ significantly at p ≤ 0.05.
(ii) You will also notice that the calculated F value of 11.0 (test statistic) for replications > the table value of 4.76 (critical value), so we reject H0 for replications as well. We accept H1 and conclude that the means for replications differ significantly.
In the above example, the ANOVA indicates that the probability that the observed differences among the means for the three treatments (fertilizers) and for the replications arose by chance sampling variation is no more than 0.05.

Other examples of situations where ANOVA can be used
Case one: A certain supermarket has shops in four locations, namely Temeke, Upanga, Kariakoo and Ubungo. The manager recorded weekly sales of one product for each shop, as shown in Table 4.1.9 below.

Table 4.1.9: Number of product units sold at four supermarkets over a period of four weeks
Week   Temeke   Upanga   Kariakoo   Ubungo
1      124      155      311        170
2      234      208      350        188
3      430      234      298        168
4      120      249      330        174

In this case the manager can use a single factor two way ANOVA to test for differences in mean sales among the four shops and also to test whether there is any difference in weekly sales across all shops.

Case two: In recent years, affordable motorcycles, nicknamed 'Yebo-yebo', were imported and sold throughout the country, where some of them are used to carry passengers. However, most regions have recorded an upward trend in the number of accidents involving motorcycles. As the new Chief of Police in Dar es Salaam you would like to determine whether there is a difference in the mean number of accidents involving motorcycles in three districts.
Suppose the record of the number of accidents reported in each district for a sample of seven days is as follows:

Table 4.1.10: Record of motorcycle accidents reported in three districts
Day   Temeke   Ilala   Kinondoni
1     21       12      18
2     13       14      21
3     18       13      15
4     19       15      20
5     18       12      22
6     11       10      18
7     16       11      16

The Chief of Police can use a single factor two way ANOVA to test whether there are any differences in the means for the three districts and for the seven days.

e) Mean Separation and Scale transformation
i) Mean separation
The ANOVA technique does not indicate which mean differs from which. Further analysis is required to determine which means are significantly different. There are two strategies that can be used: i) planned, single degree of freedom F-tests (orthogonal contrasts) and ii) multiple comparisons (means separation). The mean separation techniques include the Least Significant Difference, Student-Newman-Keuls, Duncan's Multiple Range Test, etc.

ii) Data transformation (Scale transformation)
A scale transformation may be needed to improve the situation if the assumptions about normality of the residuals and homogeneity of variances are not met. Unfortunately, there is no general rule on how much of a departure from normality and homogeneity of variance there has to be to make a transformation of scale or the use of non-parametric methods necessary. In a borderline case it would probably be sensible to transform the data (see below). Three transformations commonly used are:
• The logarithmic transformation for skewed measurement data
Biological data often have a skewed distribution, particularly when the concentrations of something are being measured. Concentrations cannot be less than zero, but often there are a few high values. Taking the logs (to any base, but usually to base 10) will often result in a better fit to the assumptions.
All the statistical analyses would then be done on the logarithm of each observation, but in presenting the results, the means should be back-transformed by taking antilogs. However, the standard deviations cannot be treated in this way. If there are some numbers below one, you can avoid negative numbers by adding one before taking logs, and subtracting it again after taking the antilogs.
• The square root transformation for counts
Counts where the mean count is low (e.g. where a lot of the counts are 0, 1, 2 or 3) often have a Poisson distribution, where the mean is equal to the variance. A square root transformation, i.e. replacing each count by its square root, will often help to normalise the residuals. Sometimes one is added to each number before taking square roots.
• The logit transformation for percentages
Percentages where a large proportion of the values are either less than 20% or greater than 80% have a skewed distribution because it is not possible to have values of less than 0% or greater than 100%. The logit transformation, $X = \ln\{p/(1-p)\}$, where p is the proportion, should correct this situation.

f) Regression and Correlation analyses
In the previous sections, we showed procedures used for comparisons of means and analysis of variance. The procedures were used to evaluate only one variable at a time. However, when you suspect that interrelationships and associations occur among different variables, you should also consider the possibility of using other data analysis techniques. Interrelationships of quantitative data can be examined using regression and correlation analysis.

i) Regression analysis
Regression analysis is used when the aim is to study a cause-and-effect relationship between variables. Regression analysis describes the effect of one or more variables (independent variables) on a single variable (dependent variable) by expressing the latter as a function of the former.
It is therefore important that you distinguish between the dependent and the independent variable before you perform this analysis. The relationship between the dependent variable and the independent variable(s) may be specified based on (i) accepted biological concepts, secondary data or past experience or (ii) the data gathered in the experiment itself.

ii) Correlation analysis
Correlation analysis is used to test the nature and strength of association between two or more variables even when there is no evidence of a cause-and-effect relationship between the variables.
Regression and correlation procedures can be classified according to the number of variables involved and the form of the functional relationship between the dependent and independent variables. The types are: i) simple linear regression and correlation, ii) multiple linear regression and correlation, iii) simple non linear regression and correlation and iv) multiple non linear regression and correlation (Gomez and Gomez, 1984). In this section, we describe the procedure used in simple linear regression and correlation because of its simplicity and its wide usage in research. There is only one independent and one dependent variable and the functional relationship is assumed to be linear.

• Simple linear Regression analysis
This technique deals with estimation and tests of significance concerning the two parameters α and β in the equation Y = α + βx when a linear relationship exists. However, it does not provide any test as to whether the best functional relationship is indeed linear. We illustrate the procedure for simple linear regression analysis.
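Before the worked example, the least-squares estimates of α and β can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `fit_line` and the variable names are hypothetical, and the data are the hypothetical fertilizer and carrot yield figures used in the worked example that follows.

```python
# Hypothetical carrot data from the worked example in this lecture.
x = [5, 10, 15, 20, 25]   # fertilizer rate (kg/ha), independent variable
y = [3, 6, 8, 10, 10]     # carrot yield (kg/plot), dependent variable


def fit_line(x, y):
    """Return (alpha, beta) of the least-squares line Y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Corrected sum of cross products and corrected sum of squares of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    beta = sxy / sxx                    # slope
    alpha = mean_y - beta * mean_x      # intercept
    return alpha, beta


alpha, beta = fit_line(x, y)
predicted_22 = alpha + beta * 22        # predicted yield at 22 kg/ha
print(f"Y = {alpha:.2f} + {beta:.2f}x; prediction at 22 kg/ha = {predicted_22:.2f}")
```

The sketch uses deviations from the means rather than raw sums; both routes give the same slope and intercept, and the deviation form is usually less prone to rounding error.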
Worked example: simple linear regression analysis
We use the hypothetical data (see footnote 10) presented in the following table:

Table 4.1.11: Five rates of fertilizer applied in a carrot farm and the corresponding yield of carrots realised
Fertilizer rates (Kg/Ha)    5    10   15   20   25
Carrot yield (Kg/Plot)      3    6    8    10   10

Footnote 10: Usually crop response to fertiliser application becomes negative at higher levels of the fertiliser and hence the relationship becomes non linear.

(a) Computation of the simple linear regression equation between the two variables
With X = fertilizer rate, Y = carrot yield and n = 5:
$\bar{X} = 15$, $\bar{Y} = 7.4$, $\sum X^2 = 1375$, $\sum Y^2 = 309$, $\sum XY = 645$

Formula for β: $\beta = \dfrac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} = \dfrac{645 - 555}{1375 - 1125} = \dfrac{90}{250} = 0.36$

Formula for α: $\alpha = \bar{Y} - \beta\bar{X} = 7.4 - (0.36)(15) = 2.0$

Simple linear regression equation (Y = α + βx): Y = 2 + 0.36x

This equation can be used to predict the yield of carrots that may be realised following application of different levels of fertiliser. For instance, to estimate the yield of carrots when 22 kg/ha of fertiliser is applied: Y = 2 + 0.36 × 22 = 2 + 7.92 = 9.92. Note that the predicted yield of 9.92 kg/plot is slightly lower than the yield of 10 kg/plot observed with application of 20 kg/ha, which suggests that the yield of carrots levels off at about 10 kg/plot and may not be improved by increasing the fertiliser rate beyond about 20 kg/ha.

• Simple Linear Correlation analysis
This technique deals with estimation and tests of significance of the simple linear correlation coefficient r. The correlation coefficient (r) indicates the strength (-1.0 to +1.0) and nature (positive or negative) of the linear association between variables. You should note that both -1 and +1 equally indicate perfect linear association. The value r = 0 indicates no linear relationship, r > 0 indicates a positive linear relationship and r < 0 indicates a negative linear relationship. For example, an r value of 0.8 implies that 64% [$(100)(r^2) = (100)(0.8)^2 = 64$] of the variation in the variable Y can be explained by the linear function of the variable X.
It should be noted that a zero r value indicates the absence of a linear relationship between two variables but it does not indicate the absence of any relationship between the variables. It is possible for two variables to have a non linear relationship, such as a quadratic form (Gomez and Gomez, 1984).

Worked example: simple linear correlation analysis
We use the data from the previous section (n = 5, $\bar{X} = 15$, $\bar{Y} = 7.4$, $\sum X^2 = 1375$, $\sum Y^2 = 309$, $\sum XY = 645$) to illustrate the procedure for estimation and test of significance of the simple linear correlation coefficient r between two variables X and Y.

Formula for the simple linear correlation coefficient r:
$r = \dfrac{\sum XY - n\bar{X}\bar{Y}}{\sqrt{(\sum X^2 - n\bar{X}^2)(\sum Y^2 - n\bar{Y}^2)}} = \dfrac{645 - 555}{\sqrt{(1375 - 1125)(309 - 273.8)}} = \dfrac{90}{\sqrt{(250)(35.2)}} = \dfrac{90}{93.81} \approx 0.96$

Testing the significance of the simple linear correlation coefficient
This can be done by evaluating the ratio of r to its standard error and then using the t test with n-2 degrees of freedom. However, in practice we usually compare the calculated r value to the r value from a table of simple linear correlation coefficients (e.g. Gomez and Gomez, 1984: Appendix H; Zar, 1996: Appendix B). The table values at 3 (i.e. n-2) degrees of freedom are 0.878 at the 0.05 level of significance and 0.959 at the 0.01 level. The calculated r value is declared significant if it exceeds the table value at a specific level of significance. The calculated r value (0.9594 to four decimal places) clearly exceeds the table value at the 0.05 level and narrowly exceeds the table value at the 0.01 level. Hence, we can conclude that the calculated r value is significantly different from zero at the 0.01 probability level. This implies that there is strong evidence that the two variables, i.e. carrot yield and fertiliser applied, are highly associated. You should note that a significant correlation coefficient r value does not always imply a cause-effect relationship.
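The correlation coefficient and the t ratio described above can also be computed directly. The following is a minimal sketch under the same hypothetical carrot data; the function name `pearson_r` is illustrative only, and the standard error of r is taken as $\sqrt{(1-r^2)/(n-2)}$, the usual form for this test.

```python
import math

# Hypothetical carrot data from the regression worked example.
x = [5, 10, 15, 20, 25]   # fertilizer rate (kg/ha)
y = [3, 6, 8, 10, 10]     # carrot yield (kg/plot)


def pearson_r(x, y):
    """Simple linear (Pearson) correlation coefficient between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(x, y)
n = len(x)
# t ratio: r divided by its standard error, with n - 2 degrees of freedom.
t_stat = r / math.sqrt((1 - r ** 2) / (n - 2))
print(f"r = {r:.4f}, r^2 = {r ** 2:.4f}, t = {t_stat:.3f} on {n - 2} df")
```

The t ratio is then compared with the tabulated Student's t value at n-2 degrees of freedom, which is equivalent to comparing r with the table of correlation coefficients.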
Notes: Pearson r is a correlation coefficient that may be used with interval or ratio scaled data, whereas Spearman rs is ideal for ranked or ordinal data.

g) Non parametric techniques
Non-parametric methods are used when the assumptions of normality of the residuals and equal variation in each group, required for parametric methods such as the t-test and ANOVA, are not met and cannot be met by a suitable transformation of scale. The more widely used methods replace each observation by its rank in the total set of observations. This comes with a cost. In general, non-parametric tests are not as powerful as parametric tests because, in transforming the data to ranks, they throw out some useful information. This means that a non-parametric test may fail to detect a true treatment effect which could be detected by a parametric method. Therefore, where possible, parametric methods should be used.

i) Advantages of using non parametric tests
• Usually distribution free: they generally do not require distributional assumptions to be met before they can be applied to the data
• Can be used to analyse data that are not precisely numerical, e.g. nominal or ordinal (ranked) data
• Ideal for analysing data from small samples
• Generally easy to calculate

ii) Disadvantages of non parametric tests
• They are less statistically powerful than parametric tests
• The scales of measurement analysed by non parametric tests are less sensitive than those analysed by parametric tests

iii) Examples of Non Parametric tests
The commonly used non parametric tests are:
• The Mann-Whitney test: This is the equivalent of Student's t-test, but it is a test of whether the medians (rather than the means) are the same in each group. It can be used to test whether two independent samples have been drawn from the same population. Like the t-test it is only appropriate when comparing two groups.
• The Kruskal-Wallis test: This is the equivalent of the one-way ANOVA.
It can be used to compare the medians of two or more groups, the null hypothesis being that all are samples from the same population. If the over-all differences are statistically significant, post-hoc comparisons are done by comparing each pair of medians in turn using the same test.
• The Friedman's test: This is the equivalent of the two-way ANOVA without interaction, appropriate for a randomised block experimental design. Again, it tests the null hypothesis that all treatment groups came from the same population, and is a test of group medians, removing any block effect.
• Chi-square test: We describe this procedure in detail in the next section.

The Chi square test
The Chi square test is one of the non-parametric (see footnote 11) tests commonly used for statistical inference.
Chi-square test formula: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$, where O is the observed frequency and E is the expected frequency of each category.

Footnote 11: Non parametric test: an inferential test that makes few or sometimes no assumptions regarding any numerical data or the shape of the population from which the observations were drawn (Dunn, 2001). Other non parametric tests include the Mann-Whitney test and Friedman's test.

Chi-square distribution
A distribution obtained by multiplying the ratio of sample variance to population variance by the degrees of freedom when random samples are selected from a normally distributed population.

Application of the χ2 test
Chi square may be used to test whether obtained observations (of categorical data) conform to or diverge from the population proportions specified by the null hypothesis. In this case the goodness of fit test is used. The Chi-square may also be used to determine whether frequencies associated with two nominal variables (with two or more categories each) are statistically independent of one another or dependent on each other. In this case the Chi-square test of independence is used. In the next section we describe the two variations of the Chi-square in detail.

Variation of the χ2 test
I.
Chi-square Goodness-of-fit Test
The idea behind the chi-square goodness-of-fit test is to see if the sample comes from the population with the claimed distribution. Another way of looking at that is to ask if the frequency distribution fits a specific pattern. Two values are involved: an observed value, which is the frequency of a category from a sample, and the expected frequency, which is calculated based upon the claimed distribution. The idea is that if the observed frequency is really close to the claimed (expected) frequency, then the square of the deviations will be small. The square of the deviation is divided by the expected frequency to weight frequencies. A difference of 10 may be very significant if 12 was the expected frequency, but a difference of 10 isn't very significant at all if the expected frequency was 1200. If the sum of these weighted squared deviations is small, the observed frequencies are close to the expected frequencies and there would be no reason to reject the claim that the sample came from that distribution. Only when the sum is large is there a reason to question the distribution. Therefore, the chi-square goodness-of-fit test is always a right tail test.
The test statistic has a chi-square distribution when the following assumptions are met:
1) The data are obtained from a random sample.
2) The expected frequency of each category must be at least five (5). This goes back to the requirement that the data be normally distributed. You're simulating a multinomial experiment (using a discrete distribution) with the goodness-of-fit test (and a continuous distribution), and if each expected frequency is at least five then you can use the normal distribution to approximate (much like the binomial).
The following are properties of the goodness-of-fit test:
• The data are the observed frequencies. This means that there is only one data value for each category.
• The degrees of freedom are one less than the number of categories, not one less than the sample size.
• It is always a right tail test.
• It has a chi-square distribution.
• The value of the test statistic doesn't change if the order of the categories is switched.
The test statistic is $\chi^2 = \sum \dfrac{(O - E)^2}{E}$

Worked example: Chi Square test for Goodness of fit
Four similar products (e.g. brands of toothpaste/soap/carbonated drinks) are displayed for sale in a shop. The shop manager wants to find out whether or not the four similar products are equally preferred by customers. S/he recorded sales of the different products over a specific period and presented the data in the following table:

Table 4.1.12: Amount of four products sold over a period of four months in a shop
Product       A     B     C     D     Total
Amount sold   235   275   247   243   1000

In this case, if the four products were equally preferred we should expect the sales to conform to a ratio of 1:1:1:1 (that is, 250 units of each product). We perform a test, using the 0.05 level of significance, to determine if customers equally prefer the four products. We follow the steps involved in hypothesis testing as follows:
State the null and alternative hypotheses
• Null hypothesis (Ho): the four products are equally preferred, i.e. the expected sales of the four products are equal.
• Alternative hypothesis (Hi): the four products are NOT equally preferred, i.e. the expected sales of at least one product differ from the others.

Traditional method
Formula: $\chi^2 = \sum \dfrac{(Obs - Exp)^2}{Exp}$
The expected numbers (under the null hypothesis) in each cell are equal to Total × 1/4, as we are dealing with a ratio of 1:1:1:1.
Expected number for cell A above = 1000 × 1/4 = 250

Table 4.1.13: Chi-square calculations
Product   Observed (Obs)   Expected (Exp)   (Obs-Exp)   (Obs-Exp)^2   (Obs-Exp)^2/Exp
A         235              250              -15         225           0.900
B         275              250               25         625           2.500
C         247              250               -3           9           0.036
D         243              250               -7          49           0.196
Total     1000             1000               0                       3.632

Calculated χ2 value (test statistic) = 3.632
The χ2 table value (critical value) at 0.05 = 7.815 (this value is read from a Chi-square table in the column under P=0.05 at 3 degrees of freedom, i.e. one less than the four categories).
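The calculated χ2 value above can be checked with a short script. This is a minimal sketch, assuming the observed sales shown in the chi-square calculation table and equal expected frequencies under the 1:1:1:1 ratio; the variable names are illustrative only.

```python
# Observed sales of the four products (from the chi-square calculation table).
observed = {"A": 235, "B": 275, "C": 247, "D": 243}

total = sum(observed.values())
# Under the null hypothesis (ratio 1:1:1:1) every product has the same
# expected frequency: total / number of categories.
expected = total / len(observed)

# Sum of the weighted squared deviations, (Obs - Exp)^2 / Exp.
chi_sq = sum((obs - expected) ** 2 / expected for obs in observed.values())

# Degrees of freedom: one less than the number of categories.
df = len(observed) - 1

print(f"expected per product = {expected:.0f}")
print(f"chi-square = {chi_sq:.3f} on {df} degrees of freedom")
```

The statistic is then compared with the tabulated chi-square value for the degrees of freedom printed by the script.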
Conclusion: Since the calculated χ2 value (3.632), the test statistic, is less than the critical value at 0.05, we fail to reject the null hypothesis (Ho) and conclude that the sales of the four products do not deviate significantly from the expected ratio of 1:1:1:1, that is 250:250:250:250. There is therefore no evidence that consumers prefer any of the four products over the others.
Note that this example can also be used to test other similar hypotheses, such as:
i) Test whether heritability of a character is controlled by a single dominant gene based on the number of different phenotypes in the filial one generation of crosses between the two parents. The phenotypes of all filial one progenies usually conform to a 1:1 ratio and the ratio of filial two progenies should conform to a 3:1 ratio.

II. Chi-Square test for independence
In the test for independence, the claim is that the row and column variables are independent of each other. This is the null hypothesis. The multiplication rule says that if two events are independent, then the probability of both occurring is the product of the probabilities of each occurring. This is key to working the test for independence. If you end up rejecting the null hypothesis, then the assumption must have been wrong and the row and column variables are dependent. Remember, all hypothesis testing is done under the assumption that the null hypothesis is true. The test statistic used is the same as in the chi-square goodness-of-fit test. The principle behind the test for independence is the same as the principle behind the goodness-of-fit test. The test for independence is always a right tail test. In fact, you can think of the test for independence as a goodness-of-fit test where the data are arranged into table form. The table is called a contingency table.
The test statistic has a chi-square distribution when the following assumptions are met:
• The data are obtained from a random sample
• The expected frequency of each category must be at least 5.
The test statistic is $\chi^2 = \sum \dfrac{(O - E)^2}{E}$
The following are properties of the test for independence:
• The data are the observed frequencies.
• The data are arranged into a contingency table.
• The degrees of freedom are the degrees of freedom for the row variable times the degrees of freedom for the column variable. It is not one less than the sample size; it is the product of the two degrees of freedom.
• It is always a right tail test.
• It has a chi-square distribution.
The expected value is computed by taking the row total times the column total and dividing by the grand total. The value of the test statistic doesn't change if the order of the rows or columns is switched. The value of the test statistic doesn't change if the rows and columns are interchanged (transpose of the matrix).

Worked example: Chi Square test for Independence
An ecological study was carried out to determine whether there is any association between two plant species in the Serengeti plains. The researcher randomly threw a 1m x 1m sampling frame several times and recorded the presence or absence of the two plant species in the samples as follows:

Table 4.1.14: Data collection sheet showing the presence (1) or absence (0) of species A and species B in each sampling frame
S/N   Species A   Species B   Remarks
1     1           0           Added in cell B
2     0           1           Added in cell C
3     1           1           Added in cell A
4     0           0           Added in cell D
...
450   1           0           Added in cell B

Table 4.1.15: Frequency of counts of absence or presence of species A and species B in the sampling frames
                    Plant species B
Plant species A     Present    Absent     Totals
Present             90 (A)     181 (B)    271
Absent              66 (C)     113 (D)    179
Totals              156        294        450
This table is called a contingency table.

If we want to perform a test, using the 0.05 level of significance, to determine if there is any association between the two species, then we need to follow the steps involved in hypothesis testing as follows:
State the null and alternative hypotheses
• Null hypothesis (Ho): the presence of Species A is NOT associated with (is independent of) the presence of Species B
• Alternative hypothesis (Hi): the presence of Species A is associated with the presence of Species B

Shortcut method (you are strongly discouraged from using the shortcut method; instead use the traditional method)
Formula: $\chi^2 = \dfrac{N(AD - BC)^2}{(A+B)(C+D)(A+C)(B+D)} = \dfrac{450\,(90 \times 113 - 181 \times 66)^2}{271 \times 179 \times 156 \times 294} \approx 0.64$

Traditional method
Formula: $\chi^2 = \sum \dfrac{(Obs - Exp)^2}{Exp}$
The expected numbers (under the null hypothesis) in each cell are equal to (row total × column total) / grand total.
Expected number for cell A above = (271 × 156) / 450 = 93.95

Calculations
Cell     Observed value (Obs)   Expected value (Exp)   Difference (d)   d^2     d^2/Exp
A        90                     93.95                  -3.95            15.60   0.166
B        181                    177.05                  3.95            15.60   0.088
C        66                     62.05                   3.95            15.60   0.251
D        113                    116.95                 -3.95            15.60   0.133
Totals                                                  0.00                    0.639

Calculated χ2 value (test statistic) = 0.639
The χ2 table value (critical value) at 0.05 = 3.841 (this value is read from a Chi-square table in the column under P=0.05 at 1 degree of freedom).
Conclusion: Since the calculated value (0.639), the test statistic, is less than the critical value at 0.05 (3.841), we fail to reject the null hypothesis (Ho) and conclude that there is no evidence of an association between the two species.
Note this example can also be used to test other similar hypotheses such as whether;

1.5 Review exercises
1. According to the Mendelian genetic model, a certain plant should produce offspring that have white and red flowers, in the proportion of 75% and 25%, respectively. A sample of 100 such offspring was coloured as follows: White 79 and Red 21. Using an appropriate test, can you reject the Mendelian hypothesis at the 5% level?
2.
A survey was conducted to investigate whether or not alcohol consumption is associated with cigarette smoking. The following information was compiled for 600 individuals. Using p=0.05, test the hypothesis that smoking and alcohol consumption are independent. What are the null and alternative hypotheses?
              Drinker   Non Drinker
Smoker        193       89
Non Smoker    165       153

1.6 Summary
This lecture has illustrated different statistical techniques for various types of research. The diversity indicates the availability of appropriate statistical techniques for most research problems. However, the diversity also indicates the difficulty of matching the best technique to a specific research problem. The choice of the correct statistical procedure for a given piece of research must be based on expertise in statistics and in the subject matter of the research. You should seek assistance if you are not comfortable with statistical techniques. Tables 4.1.16 and 4.1.17 should orient you on the choice of the correct statistical procedure to use for the different types of data collected. It is strongly advised that, before collecting any data, you complete an analysis plan that identifies the necessary statistical tests in advance.
This lecture covered descriptive and inferential statistical techniques commonly used to handle quantitative data. Descriptive statistics are very useful in most studies. They help to describe, organise and summarise the main characteristics of sample data. They give a general picture of your data to allow for additional data manipulation. The arithmetic mean and the standard deviation are the most useful measures of central tendency and variability, respectively. Other measures of central tendency include the mode and the median. The measures of variability, e.g. the standard deviation, account for the way data in the sample or population deviate from the relevant measure(s) of central tendency. Variability is low when the spread of scores around the mean is small.
Inferential statistics are used in hypothesis testing, generally to demonstrate mean differences. Hypothesis testing compares sample data and statistics to either known or estimated population parameters. There are two types of hypotheses, conceptual/theoretical and statistical. Conceptual hypotheses identify predicted relationships among independent variables and dependent measures. Statistical hypotheses test whether the predicted relationships are mathematically supported by the existing data, that is, do differences based on sample statistics reflect differences among the population parameters? For instance, an experiment usually results in some means or affected proportions for different groups, such as control and treated animals. Means will differ because each animal is different. Proportions affected could differ by chance. Means and proportions may also differ as a result of the treatment. The aim of the statistical analysis is to calculate the probability that differences as great as or greater than those observed could be due to chance. If this probability is high, then chance may be the explanation; if it is low, then a treatment effect may be the explanation.
Various inferential statistical tests are based on a conceptual model where between-groups variation (attributed to an independent variable) is divided by within-groups variation (the error term or the degree of similarity observed in each group). Generally, researchers want to obtain a large amount of between-group variation relative to a small amount of within-group variation. In this lecture, we have illustrated the calculations using statistical calculators; however, these days the actual calculations are almost always done using a computer (see Lecture 12).
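As a small illustration of such computer-based calculation, three of the worked examples in this lecture (the paired t test, the randomised complete block ANOVA and the chi-square test for independence) can be reproduced with the sketch below. It uses plain Python only; the function names `paired_t`, `rcbd_anova` and `chi_square_independence` are hypothetical helpers written for this illustration and are not taken from any statistical package.

```python
import math


def paired_t(before, after):
    """Paired t statistic and its degrees of freedom (D = before - after)."""
    d = [b - a for b, a in zip(before, after)]
    n = len(d)
    mean_d = sum(d) / n
    sd = math.sqrt(sum((di - mean_d) ** 2 for di in d) / (n - 1))
    return mean_d / (sd / math.sqrt(n)), n - 1


def rcbd_anova(table):
    """F ratios for a randomised complete block design.

    table[i][j] is the observation for treatment i in block (replication) j.
    """
    t, b = len(table), len(table[0])
    grand = sum(sum(row) for row in table)
    cf = grand ** 2 / (t * b)                                   # correction factor
    total_ss = sum(v * v for row in table for v in row) - cf
    treat_ss = sum(sum(row) ** 2 for row in table) / b - cf
    block_ss = sum(sum(col) ** 2 for col in zip(*table)) / t - cf
    error_ms = (total_ss - treat_ss - block_ss) / ((t - 1) * (b - 1))
    return {
        "F_treatment": (treat_ss / (t - 1)) / error_ms,
        "F_block": (block_ss / (b - 1)) / error_ms,
    }


def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    grand = sum(row_tot)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / grand   # row total x column total / grand total
            chi_sq += (obs - exp) ** 2 / exp
    return chi_sq, (len(table) - 1) * (len(table[0]) - 1)


# Data from the worked examples in this lecture.
t_stat, t_df = paired_t([88.45, 76.65, 83.00, 70.30, 76.20],
                        [89.35, 73.93, 81.65, 68.04, 72.57])
anova = rcbd_anova([[6.0, 6.5, 5.5, 6.4],    # no fertilizer
                    [6.9, 7.0, 6.6, 7.5],    # fertilizer A
                    [7.2, 7.8, 6.8, 7.4]])   # fertilizer B
chi_sq, chi_df = chi_square_independence([[90, 181], [66, 113]])

print(f"paired t = {t_stat:.2f} on {t_df} df")
print(f"F (treatments) = {anova['F_treatment']:.2f}, F (replications) = {anova['F_block']:.2f}")
print(f"chi-square = {chi_sq:.3f} on {chi_df} df")
```

The results should agree with the hand calculations up to rounding of intermediate values. The critical values still have to be read from the appropriate statistical tables, or obtained from a statistical package.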
Table 4.1.16: Research design and some selected tests available to analyse their data (after Dunn, 2001)
Research Design                       Non parametric test for nominal data   Non parametric test for ordinal data      Parametric tests
One sample                            χ2 goodness of fit                     -                                         One sample t or z test
Two independent samples               χ2 test of independence                Mann-Whitney U test                       Independent groups t test
Two dependent samples                 -                                      Wilcoxon matched pairs signed rank test   Dependent groups t test
More than three independent samples   χ2 test of independence                -                                         One way ANOVA
Correlation                           -                                      Spearman rs                               Pearson r

Table 4.1.17: Examples of commonly used statistical tests and their application
Type of test: Application
• Simple linear regression analysis: Examines changes in the level of variable Y relative to changes in the level of variable X. Predicts the value of a dependent variable from one or more independent variables.
• Simple linear correlation analysis: Assesses the strength and nature of the linear relationship between two variables.
• Independent group t test: Examines mean differences between sample means from two groups.
• Single or one sample t and Z test: Determines whether an observed sample mean represents a population.
• Dependent group t test: Determines differences between two means drawn from the same sample at two different points in time.
• Analysis of Variance (ANOVA): Determines the differences between the means of two or more levels of an independent variable.
• Chi-square (goodness of fit) test: Examines whether categorical data conform to proportions specified by a null hypothesis.
• Chi-square test of independence: Determines whether frequencies associated with two nominal variables are independent.
• Kolmogorov-Smirnov or Mann-Whitney U test: Can be used to test whether two groups (categories) are different.

1.7 Additional review exercises
1. Define the word 'significant' in a statistical context.
2. Explain the difference between a statistical hypothesis and a conceptual/theoretical hypothesis.
3.
Describe using your own examples when a researcher should use the dependent t-test, one way ANOVA and Regression analysis. 4. Using your own example, briefly outline the steps involved in testing of hypothesis. 5. The district education officer wants to confirm the allegation that O-level students in the district performed miserably in last year English examination. Suppose 10 students are randomly picked, and their scores are 76, 77, 75, 58, 57, 79, 54, 67, 79 and 94. Test whether or not the district performance is significantly different from the national average of 82.7 using the 5% level of significance. 152 6. Study the data presented in the following table and (a) Determine the mean, Standard deviation, range and mode. (b) Using five classes, summarize the data in a histogram. 35 55 43 51 7. 46 43 64 50 63 42 49 66 69 59 39 63 54 45 59 57 50 44 60 56 62 57 42 51 68 47 60 38 38 48 42 61 40 46 38 54 A random sample of 10 observations from one population revealed a sample mean of 23 and a sample standard deviation (SD) of 4, A random sample of8 observation from another population revealed a sample mean of 26 and a sample SD of 5. At the 0.05 significance level, test if there is a difference in the population means. 1.8 References Aczel, A. D. (1999). Complete Business Statistics. Irwin McGraw Hill. Dunn, D. S. (2001). Statistics and Data Analysis for the Behavioural Sciences. Irwin McGraw Hill. Gomez, K. A. and Gomez, A. A. (1984). Statistical Procedures for Agricultural Research (2ed). An International Rice Reaseerch Institute Book. A Wiley-Interscience Publication, John Wiley & Sons. James, J. (2009). Introduction to Applied Statistics: lecture notes http://people.richland.edu/james/lecture/ last visited in January, 2009. Kothari, C. R. (2009). Research Methodology: Methods and Techniques. New Age International Limited Publishers. Mason, R. D., Lind, D. A and Marchal, W. G. (1999). Statistical Techniques in Business and Economics. Irwin McGraw Hill. 
Saunders, M. N. K., Lewis, P. & Thornhill, A. (2009). Research methods for business students (5th ed.). Harlow: Prentice Hall – Financial Times.
Zar, J. H. (1996). Biostatistical Analysis. Prentice Hall, Inc. New Jersey.

LECTURE TWO
ANALYSIS OF QUALITATIVE DATA
(By Dr. D. Ngaruko)

2.1 Introduction
In previous lectures it was pointed out that we use qualitative research techniques if we wish to obtain insight into situations or problems about which we have little knowledge. Qualitative techniques such as loosely structured interviews with open-ended questions, (focus) group discussions, observations, and projective and participatory approaches will therefore be appropriate in many studies, especially at the onset. For sensitive topics they may be the only reliable techniques. Irrespective of how and for what purpose the data have been collected, the researcher usually ends up with a substantial number of pages of written text that need to be analysed. Although the procedures and outcomes of qualitative data analysis differ from those of quantitative data analysis, the principles are not so different. IDRC-Science for humanity (see footnote 12) identifies five principal stages of the qualitative data analysis process:
• describe the sample populations;
• order and reduce/code the data (data processing);
• display summaries of data in such a way that interpretation becomes easy, e.g., by preparing compilation sheets, flowcharts, diagrams or matrices;
• draw conclusions, relate these to the other data sets of the study and decide how to integrate the data in the report; and
• if required, develop strategies for further testing or confirming the (qualitative) data in order to prove their validity.
In this lecture we will look at each of these points in detail.
Footnote 12: IDRC is a Canadian Crown corporation that works in close collaboration with researchers from the developing world in their search for the means to build healthier, more equitable, and more prosperous societies: http://www.idrc.ca/en/ev56452-201-1-DO_TOPIC.html

2.1 Learning outcomes
At the end of this session you should be able to:
• Describe efficient ways of ordering and summarising qualitative data.
• Indicate why it is essential to start summarising and analysing during the field work.
• List the major steps in analysing qualitative data and drawing conclusions.
• Make an outline of how you will proceed with the ordering and summarising of your qualitative data, and with the subsequent analysis.
• Plan how to report your qualitative data, integrated in the most effective way with your other data.
• Indicate, either now or at the end of data analysis, what additional activities you will undertake to test or confirm your findings in order to prove their validity.

2.2 Procedures for processing and displaying of qualitative data
This section has four sub-sections describing efficient ways of ordering, coding and summarising qualitative data: description of the sample population; content analysis; summarising data in compilation sheets; and summarising data in matrices, figures and tables.

2.2.1 Description of the sample population in relation to sampling procedures
The first step in data processing (as well as in the reporting of findings) is a description of the informants. If numbers allow, relevant background data may be tabulated, for example on age, sex, occupation, education or marital status, as is the practice in quantitative studies. However, as qualitative data originate from small samples (sometimes a handful of key informants or focus group discussions and observations), more information is required to place the data in their context. For example, who were the key informants, and what made you decide to choose them?
Who took part in the focus group discussions? How were the participants of the groups selected and how representative are they of your study population? For observations: under what circumstances were they carried out? Who was observed, and by whom? Unless this type of information is provided, interpretation of the data may appear haphazard.

2.2.2 Content analysis: Ordering and coding of data
Content analysis is the core of the analysis, and it principally involves ordering the data. By counting the answers under each label, however, the researcher also gains insight into how common the different reasons are. In this section we will discuss two types of qualitative data: answers to open questions, and more elaborate narratives from loosely structured interviews or focus group discussions (FGDs). Coding is the conversion of the verbatim answers to categorical data. A simplifying frame must be imposed on the large variety of real-life situations. One way is to provide precoded answers for the respondents to choose from. Another way, which is more relevant in qualitative data analysis, is to record verbatim answers and then to fit them into categories devised after the event (a coding frame).

2.2.2.1 Answers to open questions
The most commonly collected qualitative data are the answers to open questions. This is especially true of the more probing questions beginning with “Why ...”. Ordering and coding is also needed for precoded questions that add a category such as “Other answers, please specify…” in case the precoded list is not exhaustive. In both cases the answers have to be ordered systematically.
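The idea of a coding frame, and of counting the answers under each label, can be sketched as follows. This is a minimal illustration only: the answers, the questionnaire numbers and the codes pleas, social and addict are invented, and the codes are assumed to have been assigned by the researcher after reading the answers (the program does not do the coding itself).

```python
from collections import defaultdict

# Hypothetical data: (questionnaire number, verbatim answer, code given by the researcher)
coded_answers = [
    (1, "I enjoy a cigarette after meals", "pleas"),
    (2, "All my friends smoke", "social"),
    (3, "I cannot stop any more", "addict"),
    (4, "It is nice with a cup of coffee", "pleas"),
    (5, "It is easier to chat when we smoke together", "social"),
]

def list_per_code(coded):
    """Group the answers into one short list per code, keeping the questionnaire numbers."""
    lists = defaultdict(list)
    for number, answer, code in coded:
        lists[code].append((number, answer))
    return dict(lists)

def count_per_code(coded):
    """Count how many answers fall under each code."""
    return {code: len(items) for code, items in list_per_code(coded).items()}

for code, items in list_per_code(coded_answers).items():
    print(code, items)
print(count_per_code(coded_answers))
```

Keeping the questionnaire number with every answer makes it possible to go back from a code to the informant's other data.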
Let us take as an example a study with the question ‘Why are you smoking?’, which we will use to go through the different steps of analysing answers to open-ended questions:

STEP 1: Listing - A first, basic step in the analysis of answers to open questions is to list the answers of a sample of 20-25 informants as they were provided (adding the questionnaire number in order to avoid losing the connection with the informant’s other data).

STEP 2: Reading - Then read the answers carefully, remembering the purpose of the question. The question ‘Why are you smoking?’ was supposed to help nursing students to develop an intervention against smoking.

STEP 3: Coding - Make rough categories of answers that seem to belong together and code them with a key word. For example, answer 3 (‘It gives me pleasure’) and answer 14 (‘I like to blow smoke rings’) could be labelled with the term ‘pleasure’, which could be abbreviated with the code pleas.

STEP 4: List per code - Then list all answers again, but now per code, so that you get some 5-7 short lists.

STEP 5: Interpretation - Then interpret each list, and end up with some 5-7 meaningful categories with a characteristic key word. For example: pleasure, being sociable, giving status, giving self-confidence, addiction, defiance. There may be discussion on the need to split up some categories or combine others with few answers. Answers 17 and 18, for example, could be put in a separate category, reducing stress. In that case there would be seven categories. The category defiance may have two answers: for instance answer number 4 (‘I do not see why I would give up smoking’) and number 12 (‘Why not?’). The exclamation marks indicate that defiance rather than lack of knowledge forms the motivation for the answer. Without this addition by the interviewer, these answers would have been difficult to code.
Now you can make a tentative interpretation according to the assumed willingness of your informants to change their behaviour. For those who smoke for pleasure or to socialise it might be easiest to give up smoking. Those who are addicted but have tried to stop and those who feel they derive status from smoking might form a middle category, whereas for those who smoke to enhance their self-confidence and reduce stress, or who are very defiant when asked why they smoke, it might be most difficult to stop. Now try a next batch of 20-25 answers and check if the labels work. It is quite possible that at this stage some labels will still be changed, or that you decide to add new categories or combine others.

STEP 6: Final list - Make a final list of labelled categories and code all data, including the data you have already processed, with the abbreviated codes. Then discuss whether you will stick to your tentative interpretation of the data and what this means for the content of the messages to address different reasons for smoking.

2.2.2.2 Elaborate narratives
This is another way of undertaking content analysis. The data from interviews with key informants or focus group discussions (FGDs) are as a rule more bulky than answers to open questions. The carefully transcribed field notes and tapes may consist of pages of narrative text. When analysing the texts we usually discover that, no matter how good our guidelines for the discussion were, the data contain valuable information but also a number of less essential details. In addition, the data are usually not presented in the order we need for our analysis, since informants may jump from one topic to another. To make the analysis easier, we have to order and reduce the data. Ordering is best done in relation to the objectives and the discussion topics.
Again, let us systematically follow a number of steps:

STEP 1: Reread your objectives and discussion topics and then carefully read a number of the interviews, FGDs or narrative observations you want to process. Number the material according to the broad discussion topic it pertains to. Use a yellow marker to highlight particularly illustrative remarks. Use the margins to define sub-topics. For example, in a gender and leprosy study carried out in different countries (adapted from http://www.idrc.ca/en/ev-56451-201-1-DO_TOPIC.html: Module 4) it appeared that the discussion topic stigma had to be differentiated according to the different social settings in which it occurred: among close relatives (parents-children), spouses, in-laws, and community members. Further, a distinction had to be made between self-stigmatisation (e.g., a wife diagnosed as a leprosy patient encouraging her husband to marry a second wife in order to prevent divorce, or a patient not attending community meetings for fear of being avoided) and stigmatisation by others. Different degrees of severity in stigmatisation could also be distinguished, varying from slight avoidance to complete expulsion. If stigma were topic (6) in your discussion list, you would mark everything related to stigma with a (6) in the margin, and add key words such as self-stigm., spouse, in-laws, comm., in the margin, as well as key words such as sleep(ing) sep(arately) or divorce indicating the severity of the stigma.

STEP 2: List key words - List all key words that belong to a certain topic in the sub-categories that have been developed under step 1. For instance, everything belonging to stigma could be subdivided and listed in the four major social settings in which stigma was found to manifest itself.
STEP 3: Interpret the data - At this level, distinguish the major forms in which stigma manifests itself in these different social settings, try to make a ranking order of severity and link it to other variables (such as degree of deformity, socio-economic status) in order to understand differences in stigma.

STEP 4: Code data - The fourth step is to code all your qualitative data. If necessary, adapt your coding scheme as you order, code and interpret more data. In that case, you should again read and possibly re-code the material you have already processed.

Note: You may already have analysed and coded your qualitative data in the field in order to adjust and deepen your interview guides or topic lists. In that case it may be possible to develop your final coding list in one cycle instead of two. However, instead of developing a very detailed coding system on your raw data, you may also refine your interpretation as you record your roughly coded, summarised data in compilation sheets, which we are going to cover in the next section.

2.2.3 Summarising data in compilation sheets
This is the next stage in the qualitative data analysis process. After ordering and coding the data you have to summarise them. A useful first step is summarising all data of each study unit per study population on separate compilation sheets. Like the master sheets for quantitative data, compilation sheets for qualitative data consist of a number of columns with the topics covered by the study as headings. These may be further sub-divided into smaller themes that you identified and coded when ordering the data (see the hypothetical example in Table 4.2.1 below). Each interview, FGD or observation gets a number and is successively entered in that sequence on the relevant compilation sheet.
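A compilation sheet as described above can be thought of as a small table with one numbered row per study unit and one column per (sub-)topic. The sketch below is only an illustration of that layout: the column names, the three informants and all entries are invented, and the helper names read_column and read_row are our own.

```python
# Hypothetical compilation sheet: one row per study unit, one column per (sub-)topic.
# Entries are key words, not full statements.
compilation_sheet = [
    {"unit": 1, "sex": "M", "first reaction": "big fear", "stigma: community": "none reported"},
    {"unit": 2, "sex": "F", "first reaction": "worried", "stigma: community": "avoided by neighbours"},
    {"unit": 3, "sex": "F", "first reaction": "big fear", "stigma: community": "none reported"},
]

def read_column(sheet, topic):
    """All entries on one topic, each paired with the number of its study unit."""
    return [(row["unit"], row[topic]) for row in sheet]

def read_row(sheet, unit):
    """Everything recorded for one study unit, so that topics can be related to each other."""
    for row in sheet:
        if row["unit"] == unit:
            return row
    raise KeyError(f"no study unit numbered {unit}")

print(read_column(compilation_sheet, "first reaction"))
print(read_row(compilation_sheet, 2))
```

Because every row carries the number of its study unit, a key word on the sheet can always be traced back to the full statement in the original notes.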
If there are different categories of informants within one study population, for example, young mothers and an older generation of mothers, or male and female patients, the data for these groups are entered on separate sheets. If the topics covered in those subgroups are not completely identical, it is important to be systematic and follow roughly the same sequence of topics for each category of informants. The information inserted is summarised in key words and key sentences, clear enough to remember the statements informants made. (As the number of each study unit is entered in the compilation sheet, it is always possible to go back to the original data and present the full statement, for example in a presentation or in the research report.) Now you have an overview of all data per study population on one or more big sheet(s). If you read the columns, you have a list of the answers of all group members on a certain (sub-)topic. If you read horizontally, you can, per informant, relate different topics to each other or to personal characteristics of the informant. It also becomes easy to compare the answers of different groups on specific issues by comparing compilation sheets. Table 4.2.1 is a hypothetical example of a compilation sheet.

Table 4.2.1: Example of compilation sheet (gender and leprosy)

Let us focus on Table 4.2.1 to describe the information contained therein. Table 4.2.1 presents the personal data of leprosy patients (recently declared cured) and a number of topics and sub-topics discussed with them. Stigma actually experienced, which originally was one topic, has in the compilation sheet been subdivided into the four major social settings in which stigmatisation may occur: close blood relatives, marriage, wider circle of spouse’s relatives and community. In each of those, still finer distinctions can be made (e.g., community can be neighbours, friends, work mates, school mates or distant community members).
As samples are small, these may all be inserted under the heading ‘community’. Codes (italics) can be added to the statements presented in key words, for example big fear and worried under the heading ‘first reaction’. From the three examples presented, it already appears (confirmed by the analysis of all data in all four countries) that in general the stigma feared when patients hear the diagnosis of leprosy is bigger than the stigma actually experienced. Patient (12) is in this respect an exception. Ironically, the husband who divorced her had already died from another disease at the moment she was declared cured of leprosy. Horizontal comparison of the data of patient (1) teaches us that it is highly unlikely that the man’s friends do not know about the disease, as even after he has been declared cured he has visible signs. Here the researchers had to interview the friends to find out whether or not this man had been stigmatised by the community. Note that interpretation and labelling of data become easy when using compilation sheets, as a researcher can visualise all aspects of his/her informants even if (s)he looks at one aspect at a time for the whole study population. A next step in summarising may be the combination, contrasting or further analysis of important topics through graphical displays such as matrices, diagrams, flow charts and tables.

2.2.4 Summarising of data in matrices, figures and tables

2.2.4.1 Matrices
Matrices can be used for quantitative as well as qualitative data comparison. In qualitative data we may compare different groups or data sets on important variables, presented in key words. A MATRIX is a chart that looks like a cross-table, but contains words (as well as, sometimes, numbers).
Table 4.2.2 is a hypothetical example (adapted from IDRC-Science for humanity, Module 23; see footnote 13) of a summarised FGD discussion on changing weaning practices, in which the researchers listed the answers of young mothers concerning the introduction of soft foods and those of mothers above childbearing age. They then summarised these answers in a matrix:

Footnote 13: http://www.idrc.ca/en/ev-56467-201-1-DO_TOPIC.html

Table 4.2.2: Matrix on introduction of soft baby foods among mothers of different age groups

This type of display makes it easy for the researcher to conclude that:
• younger mothers start giving soft foods, on average, 2.5 months earlier than the generation of their own mothers;
• younger mothers use a larger variety of soft weaning foods than women in the preceding generations; and
• younger mothers give soft foods to their babies more frequently, but for the same reasons as their mothers did.
Matrices facilitate data analysis considerably. They are the most common form of graphic display of qualitative data. They can be used to order and compare information in many ways, for example, according to:
• time sequence (of procedures being investigated in different periods, for example),
• type of informants (as in the example above), or
• location of data collection (to visualise differences between rural and urban populations).

2.2.4.2 Diagrams
A diagram is a figure with boxes containing variables and arrows indicating the relationships between these variables. When analysing the problems you wanted to investigate during the development of your protocols, most groups developed a diagram. In a similar way diagrams can be developed to summarise the findings of a study (see Figures 4.2.1 and 4.2.2). You might use a diagram to illustrate a crucial issue in your study, combining all available qualitative and quantitative data collected.
Figure 4.2.1: Reasons for early introduction of soft foods by young mothers

Diagrams, like matrices, can be of great assistance in providing an overview of the data collected and in guiding data analysis.

Figure 4.2.2: Reasons for late introduction of soft foods by young mothers

2.2.4.3 Flow charts
FLOW CHARTS are special types of diagrams that express the logical sequence of actions or decisions. Flow charts are especially useful to summarise different flows of events that are mutually connected. For instance, a counselling team in Bulawayo, Zimbabwe, which interviewed some 95 HIV positive persons in depth over a period of two years, summarised the roughly 100 pages of interview material for each informant by drawing five lines (see Figure 4.2.3). One central line presented the development of the disease over time, with crises and periods of relative well-being. Another line presented different forms of medical care sought, a third the flaws in economic status connected to the disease (e.g., loss of job, seeking employment elsewhere), a fourth the possible changes in social status such as divorce or (re)marriage, whereas a fifth line presented the patient’s emotional status linked to events occurring in the four other fields (e.g., positive coping, depression).

Figure 4.2.3: Flowchart on coping of HIV+ persons with their condition over time

These flow charts were extremely useful for comparison of data, per informant and between different groups of informants (e.g. males/females, single/married). They highlighted the impact of the disease on the lives of different groups of patients and their way of coping with it (see footnote 14).

2.2.4.4 Tables
Qualitative data can also be categorised, coded, inserted in master sheets or a computer and counted, together with other quantitative data, and displayed in tables. Answers to open-ended questions in questionnaires will usually be categorised and summarised in this way.
However, you will in the first place want to analyse the content of the individual answers in each category.

2.3 Drawing and verifying conclusions
Drawing and verifying conclusions is the essence of data analysis. It is not an isolated activity, however. When we start summarising our data in compilation sheets, flowcharts, matrices or diagrams, we continuously draw conclusions, and modify or reject quite a number of them as we proceed. Writing helps generate new ideas as well. Therefore writing should start as early as possible, right from the onset of data processing and analysis, if only for ourselves. No creative insights should get lost.

Note: Collection, processing, analysis and reporting of qualitative data are closely intertwined, and not (as is the case with quantitative data) distinct successive steps. It may often be necessary to go back to the original field notes and verify conclusions, collect additional data if the available data appear controversial, and get feedback from all parties concerned.

2.3.1 Identifying variables and associations between variables
Sometimes we do not know enough about a situation to define variables beforehand. Only during or at the end of the study will it be possible to define certain variables and search for associations with other variables, without having the prior aim of measuring them. Many studies have qualitative parts with open questions, key informant interviews, focus group discussions or observations for the purpose of identifying these variables. The researcher who uses such a qualitative approach should be like a detective who searches for evidence, accounts for countervailing evidence, and verifies the findings by looking for independent, supporting evidence, until (s)he is confident about possible associations among certain variables which shed light on the problem under investigation.

Footnote 14: Meursing (1997) A world of silence, in IDRC, Canada
Drawing on the weaning example in section 2.2.4.1, if we find among the mothers who wean their children early that quite a number have jobs, we may assume that having a job contributes to early weaning. Similar studies carried out elsewhere with similar findings support this assumption (independent evidence). Only if there are very few employed women who wean their children late, however, can we be more certain that our assumption is true, and for each of those exceptions we should try to find an explanation. Do the mothers take their children with them (to their place of work) or do they work near their homes so that they can feed the baby during breaks? Or do they successfully combine breast-milk with alternatives? If yes, why don’t more mothers try this combination? etc.

2.3.2 Finding confounding or intervening variables
In some cases variables appear to be related but the association cannot easily be explained. At other times it seems that variables should logically go together, but you cannot find a relationship. In cases such as these there may be another variable (‘Q’) influencing the association between the two variables concerned, which has to be identified.

Figure 4.2.4: A confounding variable Q between variables A and B

For example, one expects a relationship between the quality of drinking water and the incidence of diarrhoea. It is assumed that the incidence of diarrhoea would decrease as the number of water faucets in a village increased. If there is no change over time, there might be a confounding variable. People, for example, may dislike the taste of tap-water so much that they use it for everything except drinking.

Note: Such unexplained associations may appear in any study. The essential characteristic of a qualitative research approach is that it purposively looks for such associations during the fieldwork, and that additional questions and tools may be developed to highlight such relationships.
In quantitative surveys that attempt to objectively measure the strength of a presupposed association between two variables, the tools should not be changed once the fieldwork is ongoing.

2.3.3 Integrating qualitative and quantitative data
Thus far we have discussed the analysis of qualitative data as a separate activity. However, if a research team has collected qualitative as well as quantitative data, which is the case in most HSR (health systems research) studies, it would be foolish not to look at them in combination, as this can inspire deeper and more rewarding analysis. For example, the Indonesian ‘gender and leprosy’ research team found, when analysing the registration data of 4500 new leprosy patients who had registered over the past five years, that the M/F ratio was most unfavourable in the age group of 15-44 years. This was a puzzling finding, as in Nepal women in this age group were reporting much better (though still less than men). In-depth interviews with staff revealed that they suspected adolescent girls and young women of hiding their skin patches because of shameful associations with dirt and ugliness. This provided the incentive for a further breakdown of the quantitative data, which revealed that the M/F difference in reporting was indeed most pronounced in the 15-34 age group, and levelled off above 35. The reason(s) for this relatively large gender difference in the younger age groups were then further explored.

2.3.5 Content analysis of qualitative data for action
Quantitative data serve in the first place to convince health authorities that there is indeed a serious, sizeable problem; qualitative data help to provide ideas on how to solve it. The FGDs on weaning foods with young mothers and mothers who had passed childbearing age, for example, will yield many suggestions on how to develop interventions with the mothers which they are likely to consider useful and will be able to implement.
Likewise, the in-depth interviews with leprosy and ex-leprosy patients will provide new insights into how best to counsel new patients and their close relatives/spouses in order to reduce unnecessary fears.

2.3.6 Computer analysis of qualitative data
With the ever-increasing importance of computers in research, strategies for analysing qualitative data by computer have been, and are being, developed. There are several possibilities, ranging from simple word processing programs to highly sophisticated qualitative data management software that includes possibilities for statistical testing of associations. Examples of such software include NVivo, Qualitan and SPSS, to mention just a few. As numbers are usually small in qualitative studies, and content analysis, which can be done by hand, is most likely more important than testing of associations, we will not elaborate on these techniques here.

2.4 Reporting qualitative data
Basically, there are two ways of reporting qualitative data that form part of a study in which different research techniques were used. One way is to summarise the major qualitative results in a separate section of the findings, with examples and quotations, following the objectives that guided the collection of this particular data. The results would then be discussed in the chapter ‘Discussion’, together with the results of other, more quantitative data collection tools, and would subsequently be reflected in the summary of the findings and the recommendations. Another possibility is to fully integrate the different data sets in the chapter of findings, ordered according to the objectives of the entire study. If quantitative and qualitative data have been analysed, and sometimes even collected, in an integrated way, it would also be logical to present them in an integrated fashion. Care should be taken that no valuable data get lost.
Therefore a rough draft of all important findings is required in any case, after which it can be decided whether to present the data in separate sections or chopped up for integration with other data.

2.5 Further strategies for testing or confirming qualitative findings to prove validity
Researchers who use quantitative research designs reduce their data to numbers and apply statistical tests. This does not necessarily ensure that their research results are valid: something may have gone wrong during sampling or collection of data, or even in the earlier design of the study (overlooking possible confounding variables). The following strategies will therefore be of use to any researcher. They are particularly relevant, however, to qualitative research, since the small numbers in qualitative data often generate questions concerning their validity.
a) Check for representativeness of data.
b) Check for observer bias or the influence of the researcher on the research situation.
c) Cross-check data with evidence from other, independent sources.
d) Compare and contrast data.
e) Use extreme (groups of) informants to the maximum.
f) Do additional research to test the findings of your study:
• to replicate certain findings,
• to rule out (or identify) possible intervening variables,
• to rule out rival explanations by investigating them, or
• to look for negative evidence.
g) Get feedback from your informants.

2.5 Summary
We have reached the end of this lecture. Throughout the analysis and reporting of qualitative data you need to involve all parties concerned in the various stages of the research. This is important not only for ethical reasons or because it will improve the chances that the results will be implemented, but also because it will improve the quality of your study design, of your data, and of the conclusions drawn from these data.
Suggestions and additional information collected during feedback sessions will invariably increase the quality of your research report.

2.6 References
Miles MB and Huberman AM (1984) Qualitative data analysis, a sourcebook of new methods. Beverly Hills, CA, USA: Sage Publications.
Patton MQ (1990) Qualitative Evaluation and Research Methods. 2nd ed. Newbury Park, CA: Sage Publications.
Spradley JP (1979) The ethnographic interview. New York, NY, USA: Holt, Rinehart and Winston.
Walker R (ed) (1985) Applied qualitative research. Hants, UK: Gower Publishing Company Ltd.
Willms DG and Johnson NA (1996) Essentials in Qualitative Research: A Notebook for the Field. Hamilton, Canada: McMaster University.
Yin RK (1984) Case study research: design and methods. Beverly Hills, CA, USA: Sage Publications.
NB: A major source of inspiration for writing this module was Miles and Huberman’s book. Section V of this module is a heavily abbreviated and adapted version of their chapter VII.

LECTURE THREE
THE USE OF COMPUTERS IN DATA ANALYSIS
(Dr. D. Ngaruko)

3.1 Introduction
Statistics, as a scientific discipline, is evolving rapidly. A large part of this evolution is a consequence of the development of the computer. Just about 30 years ago data analysis took place either by means of a rudimentary (by today’s standards) calculator or on a centrally-located computer. Nowadays, most analysis takes place inside a personal computer located on the user’s desk (or, via a network, on some other personal computer or workstation). Punched tapes and cards have given way to direct interaction with the computer. However, from a data analytic viewpoint, these are all superficial changes. The fundamental changes relate to what data analyses can be done, and the speed with which data analysis can be done. All of these are, again, a result of the development of powerful computer software. Most statistical packages also provide facilities for data management.
Whereas some types of statistical software are discipline-specific, others are applicable more generally across all knowledge disciplines. Table 4.3.1 summarizes the most common packages and the disciplines where they are more appropriate. We will only be covering SPSS for Windows in this lecture.

3.2 Learning outcomes

At the end of this lecture you should be able to:
• Explain and appreciate the role of computer software in data analysis
• Undertake quantitative data entry into SPSS for Windows
• Generate frequency tables and descriptive statistics using SPSS for Windows
• Run and interpret Crosstabs and the Chi-Square statistic using SPSS for Windows
• Compute T-Tests for dependent and independent samples using SPSS for Windows
• Run correlations and regression analyses using SPSS for Windows

3.3 Data analysis by computer

A statistical package is a suite of computer programs that are specialised for statistical analysis. It enables people to obtain the results of standard statistical procedures and statistical significance tests, without requiring low-level numerical programming. Calculations which, by traditional methods, might have taken months of painstaking hand calculation can now be tackled almost instantaneously. One consequence of this is that one does not need to be so sure one is doing the right thing before undertaking it. This can be both good and bad: one can fit several or even many different models to a set of data, which is good; but one can also overfit the data (find a model which fits the data at hand so well that it does not generalize to new data), which is bad. There is also a very real danger here: that modern sophisticated statistical methods can be used without a proper understanding of the (often deep) theory underlying them. This has obvious implications for the validity of any conclusions one might draw.
It seems that the development of accessible software has led not to the redundancy of statisticians (as was once feared might happen) but to an even greater need for them. Computer power has also provided impetus for the invention of entirely new statistical methods, methods which would have been completely impracticable or even inconceivable before such machines were available. Examples of such methods are:
• Resampling methods, e.g. jackknife, bootstrap, and cross-validation methods. These tools repeatedly analyse resamples drawn from the original sample, and so give an idea of the variability which results from sampling.
• Non-linear models. These are often analytically intractable, and need rapid optimization methods to estimate the parameters of the models.
• Stochastic optimization methods. These permit global optima to be found, even in the presence of many local optima.
• Simulation. This allows one to explore the properties of estimators or models which cannot be solved using analytic methods.
• Non-parametric smoothing and curve estimation methods, which are becoming increasingly important.
• New kinds of statistical models, such as graphical models, which are being developed.
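To make the idea of resampling concrete, the following is a minimal sketch of the bootstrap in Python, written for illustration only and not taken from SPSS. The function name `bootstrap_standard_error`, the number of resamples, the seed and the list of ages are all assumptions made for this example.

```python
import random
import statistics


def bootstrap_standard_error(sample, statistic=statistics.mean,
                             n_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by resampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        # Draw a resample of the same size as the original sample, with replacement.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    # The spread of the resampled estimates approximates the sampling variability.
    return statistics.stdev(estimates)


# Hypothetical ages of ten respondents (illustrative values only).
ages = [23, 35, 41, 29, 52, 38, 47, 31, 60, 44]

se_boot = bootstrap_standard_error(ages)
se_formula = statistics.stdev(ages) / len(ages) ** 0.5  # textbook formula for the mean

print(f"bootstrap SE of the mean: {se_boot:.2f}")
print(f"formula SE of the mean:   {se_formula:.2f}")
```

For the mean, the two figures should be of similar size, which is a useful check. The practical value of the bootstrap is that the same loop works for statistics such as the median, for which no simple formula is available.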
Table 4.3.1: Common types of computer data analysis software

S/N  Software     Discipline where used
1    SHAZAM       comprehensive econometrics and statistics package
2    SPSS         comprehensive statistics package
3    Stata        comprehensive statistics package
4    StatsDirect  general statistics package mostly used in medical statistics
5    S-PLUS       general statistics package
6    Unistat      general statistics package that can also work as Excel add-in
7    SAS          comprehensive statistical package with programming language
8    RATS         Regression Analysis of Time Series, comprehensive econometric analysis package
9    Quantum      part of the SPSS MR product line, mostly for data validation and tabulation in Marketing and Opinion Research
10   Minitab      general statistics package
11   MATLAB       programming language with statistical features
12   GenStat      general statistics package
13   GAUSS        programming language for statistics

The software packages in Table 4.3.1 represent novel kinds of statistical tools, but even deeper and more fundamental changes are occurring. Graphical methods are becoming increasingly important. Whereas not so long ago producing a graph was a slow and painful process, now accurate and revealing displays can be produced with ease. This has led to a new philosophy of informal data analysis, moving away from formal inference to informal sifting and examination of data, making use of the immense power of the human eye to detect patterns. Interactive data analysis in general is becoming central to the way data analysis is done. We no longer need to search through mountains of line-printer output seeking the one number we want. Now those numbers will appear on the screen on command, along with pictures and graphs.

3.3.1 Introduction to the basics of SPSS for Windows

SPSS (originally, Statistical Package for the Social Sciences) is a computer program used for statistical analysis. Between 2009 and 2010 the premier software for SPSS was called PASW (Predictive Analytics SoftWare) Statistics.
The company announced on July 28, 2009 that it was being acquired by IBM. As of January 2010, it became "SPSS: An IBM Company". SPSS was released in its first version in 1968 after being developed by Norman H. Nie and C. Hadlai Hull. Norman Nie was then a political science postgraduate at Stanford University, and is now Research Professor in the Department of Political Science at Stanford and Professor Emeritus of Political Science at the University of Chicago. SPSS is among the most widely used programs for statistical analysis in social science. It is used by market researchers, health researchers, survey companies, government, education researchers, marketing organizations and others. The original SPSS manual (Nie, Bent & Hull, 1970) has been described as one of "sociology's most influential books". In addition to statistical analysis, data management (case selection, file reshaping, creating derived data) and data documentation (a metadata dictionary is stored in the data file) are features of the base software.

Statistics included in the base SPSS software include:
• Descriptive statistics: Cross tabulation, Frequencies, Descriptives, Explore, Descriptive Ratio Statistics
• Bivariate statistics: Means, t-test, ANOVA, Correlation (bivariate, partial, distances), Nonparametric tests
• Prediction for numerical outcomes: Linear regression
• Prediction for identifying groups: Factor analysis, cluster analysis (two-step, K-means, hierarchical), Discriminant

Prior to SPSS 16.0, different versions of SPSS were available for Windows, Mac OS X and Unix. The Windows version was updated more frequently, and had more features, than the versions for other operating systems.
The following is the release history of SPSS from version 15 to the current one, version 18:
• SPSS 15.0.1 - November 2006
• SPSS 16.0.2 - April 2008
• SPSS Statistics 17.0.1 - December 2008
• PASW Statistics 17.0.3 - September 2009
• PASW Statistics 18.0 - August 2009
• PASW Statistics 18.0.1 - December 2009
• PASW Statistics 18.0.2 - April 2010

This part of the lecture takes a brief look at what SPSS for Windows is and what it is capable of doing. It is not our intention to teach you about statistics in this tutorial. For that you should rely on your classes in statistics and/or a good textbook. If you are a novice, this tutorial should give you a feel for the programme and how to navigate through the many options. Beyond that, the SPSS Help Files should be used as a resource. Further, SPSS sells a number of very good manuals.

3.3.1.1 Starting SPSS for Windows

SPSS for Windows has the same general look and feel as most other programmes for Windows. Virtually any statistic that you wish to compute can be obtained by pointing and clicking on the menus and the various interactive dialog boxes. Presumably, SPSS is already installed on the computer you are using. If you don't have a shortcut on your desktop, go to the [Start => Programs] menu and start the package by clicking on the SPSS icon.

Once you've clicked on the SPSS icon a new window will appear on the screen. The appearance is that of a standard programme for Windows with a spreadsheet-like interface. At the bottom left corner of the screen there are two tabs representing the two primary components of the data editor: Data View and Variable View. The Data View screen is designed to hold raw data for analysis. The Variable View screen contains information about that data set. In fact, the variables are formatted in Variable View, and data entry is only possible when the SPSS screen is in Data View. The Data Editor displays the contents of the active data file.
Note that the information in the Data Editor consists of variables and cases. In Data View, columns represent variables, and rows represent cases (observations). In Variable View, each row is a variable, and each column is an attribute that is associated with that variable. Variables are used to represent the different types of data that you have compiled. A common analogy is that of a survey. The response to each question on a survey is equivalent to a variable. Variables come in many different types, including numbers, strings, currency, and dates.

As you can see, there are a number of menu options relating to statistics on the menu bar. There are also shortcut icons on the toolbar. These serve as quick access to often-used options. Holding your mouse over one of these icons for a second or two will result in a short function description for that icon. The current display is that of an empty data sheet. Data can either be entered manually, or it can be read from an existing data file.

3.3.1.2 Data Entry into SPSS

As noted before, entering data collected using instruments such as questionnaires into SPSS follows two stages: the first is variable formatting in Variable View mode and the second is the actual data entry when SPSS is in Data View mode. The process of formatting the questionnaire into SPSS is more tedious than data entry, because it involves transforming the questions in the questionnaire into variables in the computer.

To begin the process of adding data, just click on the first cell that is located in the upper left corner of the datasheet. It's just like a spreadsheet. Enter each data point then hit [Enter]. Once you're done with one column of data you can click on the first cell of the next column.

The data view below indicates the first column as "Respondent's age" and the second column as "number of siblings".
If you're entering data for the first time, as in the above example, the variable names will be automatically generated (e.g., var00001, var00002, ...). They are not very informative. To change these names, click on the variable name button. For example, double click on the "var00001" button. Once you have done that, a dialog box will appear. The simplest option is to change the name to something meaningful. For instance, replace "var00001" in the textbox with an operational variable name (refer to the SPSS tutorial for details).

Each variable has several characteristics represented in Variable View. Thus, in addition to changing the variable name, one has to make changes specific to these characteristics, such as [Type], [Labels], [Missing Values], and [Column Format].
• [Type] - One can specify whether the data are in numeric or string format, in addition to a few more formats. The default is numeric format.
• [Labels] - Using the labels option can enhance the readability of the output. In older versions of SPSS a variable name is limited to a length of 8 characters (recent versions allow longer names); by using a variable label, however, the length can be as much as 256 characters. This provides the ability to have very descriptive labels that will appear in the output. Often, there is a need to code categorical variables in numeric format. For example, male and female can be coded as 1 and 2, respectively. To reduce confusion, it is recommended that one uses value labels. For the example of gender coding, Value: 1 would have a corresponding Value Label: male. Similarly, Value: 2 would be coded with Value Label: female. (Click on the [Labels] button to verify the above.)
• [Missing Values] - This option provides a means to code for various types of missing values.
• [Column Format] - The column format dialog provides control over several features of each column (e.g., width of column).

Once the variables are formatted correctly, click the Data View tab to continue entering the data.
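The relationship between numeric codes, value labels and missing-value codes can be pictured with a small data dictionary. The sketch below is written in Python purely for illustration; it is not SPSS syntax, and the variable description, the missing-value code 9 and the helper `describe` are hypothetical.

```python
# Hypothetical description of one variable, mirroring the Variable View attributes.
sex = {
    "name": "sex",                              # short variable name
    "type": "numeric",                          # [Type]
    "label": "Sex of respondent",               # variable label shown in output
    "value_labels": {1: "male", 2: "female"},   # [Labels]: code -> value label
    "missing_values": {9},                      # [Missing Values]: 9 = no answer (assumed)
}


def describe(code, variable):
    """Translate a stored numeric code into the text shown in output tables."""
    if code in variable["missing_values"]:
        return "missing"
    return variable["value_labels"].get(code, f"unlabelled code {code}")


raw_codes = [1, 2, 2, 9, 1]  # what is typed into Data View
labelled = [describe(code, sex) for code in raw_codes]
print(labelled)  # ['male', 'female', 'female', 'missing', 'male']
```

The data file stores only the numbers; the labels live in the dictionary, which is why changing a value label never requires re-entering the data.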
The variable names that you entered in Variable View are now the headings for the columns in Data View (as seen above). Data are entered in the first row, starting at the first column, for the first case (respondent); hence there will be as many rows as there are cases (respondents) in the data view. Non-numeric data, such as strings of text, can also be entered into the data editor by selecting "string" as the type of the variable.

3.3.1.3 Data analysis using SPSS for Windows

a) Frequency tables and descriptive statistics

To begin, click on [Analyze => Descriptive Statistics => Frequencies...] for frequencies and [Analyze => Descriptive Statistics => Descriptives...] for simple descriptive statistics. Frequency tables are ideal for categorical variables whereas descriptives are ideal for scale variables. The result is a new dialog box that allows the user to select the variables of interest. Also, note the other clickable buttons along the border of the dialog box. The buttons labelled [Statistics...] and [Charts...] are of particular importance. Select the variables of interest from the list and then click on the arrow pointing right. This transfers the selected variables to the Variables list. At this point, clicking on the [OK] button would open an output window with the frequency information for each of the variables in the form of tables. However, more information can be gathered by exploring the options offered by [Statistics...] and [Charts...]. [Statistics...] offers a number of summary statistics. Any statistic that is selected will be summarized in the output window. As for the options under [Charts...], click on Bar Charts to replicate the graph in the text.

Once the options have been selected, click on [OK] to run the procedure. The results are then displayed in an output window. In this particular instance the window will include summary statistics for the variable in question, and the frequency distribution.
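For readers who want to see what lies behind the Frequencies and Descriptives output, here is a minimal sketch in Python (an illustration only, not how SPSS itself computes the tables). The function names and the two small data lists are assumptions made for this example.

```python
from collections import Counter
import statistics


def frequency_table(values):
    """Return (category, count, percent) rows, as in a Frequencies table."""
    counts = Counter(values)
    total = len(values)
    return [(value, count, 100.0 * count / total)
            for value, count in sorted(counts.items())]


def descriptives(values):
    """Return common measures of central tendency and dispersion."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "minimum": min(values),
        "maximum": max(values),
    }


# Hypothetical data: a categorical variable and a scale variable.
sex = ["male", "female", "female", "male", "female", "female"]
siblings = [2, 4, 3, 0, 5, 4]

for category, count, percent in frequency_table(sex):
    print(f"{category:<8}{count:>4}{percent:>8.1f}%")

print(descriptives(siblings))
```

The split mirrors the advice above: counts and percentages for the categorical variable, and means and standard deviations for the scale variable.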
You can see all of this by scrolling down the window. The results should also be identical to those in the text.

You may have noticed from the above that calculating summary statistics requires nothing more than selecting variables, and then selecting the desired descriptive statistics. The frequency example allowed us to generate frequency information plus measures of central tendency and dispersion. These statistics can also be calculated directly by clicking on [Analyze => Descriptive Statistics => Descriptives...]. Not surprisingly, another dialog box is attached to this procedure. To control the type of statistics produced, click on the [Options...] button. Once again, the options include the typical measures of central tendency and dispersion. Each time a statistical procedure is run, like [Frequencies...] and [Descriptives...], the results are posted to an Output Window. If several procedures are run during one session the results will be appended to the same window.

b) Crosstabs and Chi-Square

The computation of the cross tabulation (simply known as crosstabs) and the Chi-Square statistic can be accomplished by clicking on [Analyze => Descriptive Statistics => Crosstabs...]. This particular procedure will be your first introduction to the coding of data in the data editor. To this point data have been entered in a column format, that is, one variable per column. However, that method is not sufficient in a number of situations, including the calculation of Chi-Square, independent t-tests, and any factorial ANOVA design with between-subjects factors. There are many other such cases, but they will not be covered in this tutorial. Essentially, the data have to be entered in a specific format that makes the analysis possible. The format typically reflects the design of the study, as will be demonstrated in the examples. For the Chi-Square statistic, the table of data can be coded by indexing the column and row of the observations.
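The coding idea just described can be sketched as follows: each case is one row of data carrying a code for its row category and a code for its column category, and the crosstab is simply a tally of those pairs. The Python below is an illustrative sketch only; the variable names `sex` and `happy`, their codes and the six cases are hypothetical.

```python
from collections import Counter


def crosstab(cases, row_var, col_var):
    """Tally cases into a table of counts indexed by two coded variables."""
    counts = Counter((case[row_var], case[col_var]) for case in cases)
    row_codes = sorted({r for r, _ in counts})
    col_codes = sorted({c for _, c in counts})
    # A Counter returns 0 for combinations that never occur.
    table = [[counts[(r, c)] for c in col_codes] for r in row_codes]
    return row_codes, col_codes, table


# One dictionary per respondent: sex (1 = male, 2 = female),
# happy (1 = happy, 2 = not happy). Codes are assumed for illustration.
cases = [
    {"sex": 1, "happy": 1},
    {"sex": 1, "happy": 2},
    {"sex": 2, "happy": 1},
    {"sex": 2, "happy": 1},
    {"sex": 2, "happy": 2},
    {"sex": 1, "happy": 1},
]

row_codes, col_codes, table = crosstab(cases, "sex", "happy")
print(row_codes, col_codes)  # [1, 2] [1, 2]
print(table)                 # [[2, 1], [2, 1]]
```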
To perform the analysis,
• Select [Analyze => Descriptive Statistics => Crosstabs...] to launch the controlling dialog box.
• At the bottom of the dialog box are four buttons, the most important being the [Statistics...] button and the [Cells...] button. You must click on the [Statistics...] button and then select the Chi-square option, otherwise the statistic will not be calculated. Under the [Cells...] button you have the option of displaying percentages across the rows, along the columns and of the total. Exploring this dialog box makes it clear that SPSS can calculate a number of other statistics in conjunction with Chi-square. For example, one can select the various measures of association (e.g., contingency coefficient, phi and Cramer's V), among others.
• Move one variable into the Row(s): box, and the other variable(s) into the Column(s): box, then click [OK] to perform the analysis. A subset of the output looks like the following.

The resultant cross tabulation and its associated Chi-Square are as indicated below. It can be seen in the figure below that the Chi-Square has to be selected for it to be displayed in the same output as the cross tabulation. In the previous lectures you should have learnt how to interpret results. Normally the two (crosstab results and associated Chi-Square) are discussed together. For well-informed conclusions from the crosstab, the observed p-value of Pearson's Chi-Square (the asymptotic two-sided significance reported by SPSS) should be less than or equal to the chosen significance level, conventionally 0.05. From the illustration below we can see that the Chi-Square value is significant at the 5% level of significance, implying that there is a significant relationship between the two variables (sex and general happiness).

Although simple, the calculation of the Chi-square statistic is very particular about all the required steps being followed.
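The arithmetic behind Pearson's Chi-Square can be sketched in a few lines. The Python below is an illustration under stated assumptions, not SPSS's own routine: the 2 x 2 table of counts is hypothetical, and the helper `p_value_df1` is valid only for one degree of freedom (a 2 x 2 table); larger tables need the general chi-square distribution, which is not implemented here.

```python
import math


def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts: rows = sex, columns = happy / not happy.
observed = [[30, 20],
            [20, 30]]

chi2, df = pearson_chi_square(observed)
p = p_value_df1(chi2)
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p:.4f}")
```

In this made-up table every expected count is 25, so the statistic works out to 4.0 with 1 degree of freedom, and the p-value falls just below 0.05.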
More generally, as we enter hypothesis testing, the user should be very careful and should make use of manuals for the programme and textbooks for statistics.

c) T-Test

By now, you should know that there are two forms of the t-test, one for dependent samples and one for independent samples (observations). To inform SPSS, or any statistics package for that matter, of the type of design, it is necessary to have two different ways of laying out the data. For independent t-tests, the observations for the two groups must be uniquely coded with a Group variable. Like the calculation of the Chi-square statistic, these calculations will reinforce the practice of thinking about, and laying out, the data in the correct format.

i) Dependent T-Test

To calculate the t statistic click on [Analyze => Compare Means => Paired-Samples T Test...], then select the two variables of interest. To select the two variables, hold the [Shift] key down while using the mouse for selection. You will note that the selection box requires that variables be selected two at a time. For the dependent design, the two variables in question must be entered in two columns. Once the two variables have been selected, move them to the Paired Variables: list. This procedure can be repeated for each pair of variables to be analyzed. Finally, click the [OK] button. The critical result for the current analysis will appear in the output window as follows:

Paired Samples Test
Pair 1: Number of Brothers and Sisters - Age of Respondent
  Paired Differences
    Mean                                          -41.602
    Std. Deviation                                 17.710
    Std. Error Mean                                  .457
    95% Confidence Interval of the Difference
      Lower                                       -42.498
      Upper                                       -40.705
  t                                               -91.008
  df                                                 1500
  Sig. (2-tailed)                                    .000

As you can see, an exact t-value is provided along with a p-value, and this p-value (displayed as .000, meaning smaller than .0005) is less than the 0.05 significance level for a two-tailed assessment. Closer examination indicates that several other statistics are presented in the output window.
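The figures in the Paired Samples Test output are linked by simple arithmetic, which the following Python sketch reproduces. The mean and standard deviation of the differences are taken from the output above; the number of pairs (1501) is inferred from df = 1500 and is therefore an assumption of this example, as is the function name.

```python
import math


def paired_t_from_summary(mean_diff, sd_diff, n_pairs):
    """Recompute the standard error, t and df of a paired-samples t-test."""
    se = sd_diff / math.sqrt(n_pairs)  # standard error of the mean difference
    t = mean_diff / se                 # t statistic
    df = n_pairs - 1                   # degrees of freedom
    return se, t, df


# Mean difference and standard deviation as printed in the output table;
# n_pairs = df + 1 = 1501 (inferred, not printed in the table).
se, t, df = paired_t_from_summary(mean_diff=-41.602, sd_diff=17.710, n_pairs=1501)
print(f"Std. Error Mean = {se:.3f}, t = {t:.2f}, df = {df}")
```

The recomputed standard error and t agree with the SPSS output up to rounding of the printed inputs, which is a quick way of checking that a table has been read correctly.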
Quite simply, such calculations require very little effort!

ii) Independent T-tests

When calculating an independent t-test, the only difference involves the way the data are formatted in the datasheet. The datasheet must include both the raw data and the group coding for each case. To generate the t-statistic follow this simple procedure:
• Click on [Analyze => Compare Means => Independent-Samples T Test...] to launch the appropriate dialog box.
• Select the dependent variable from the list of variables and move it to the Test Variable(s): box.
• Select "group" from the variable list and move it to the Grouping Variable: box.
• The final step requires that the groups be defined. That is, one must specify that Group 1 (the experimental group in this case) is coded as 1, and Group 2 (the control group in this case) is coded as 2. To do this, click on the [Define Groups...] button. Click on the [Continue] button to return to the controlling dialog box.
• Run the analysis by clicking on the [OK] button.
• The output for the current analysis, extracted from the output window, looks like the following.

The p-value of .004 is well below the 0.05 significance level for a two-tailed test, and that suggests that the means are significantly different. Further, Levene's Test is performed to indicate which set of results should be used. In this case the variances are equal; however, the calculations for unequal variances are also presented, along with some other statistics, some of which are not shown here.

In the next section we will briefly demonstrate the calculation of correlations and regression, as discussed in Chapter 9 of Howell. In truth, you should be able to work through many statistics with your current knowledge base and the help files, including correlations and regressions. Most statistics can be calculated with a few clicks of the mouse.
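The layout required for the independent t-test, and the two sets of results that appear in the output ("equal variances assumed" and "equal variances not assumed"), can be sketched as follows. This Python example is illustrative only: the scores are hypothetical, the function name is an assumption, and p-values and Levene's test are deliberately left out.

```python
import math
import statistics


def independent_t(group1, group2):
    """Return (t, df) for the pooled-variance and the Welch versions of the test."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)

    # Equal variances assumed: pool the two sample variances.
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    df_pooled = n1 + n2 - 2

    # Equal variances not assumed (Welch): keep the variances separate.
    se_welch = math.sqrt(v1 / n1 + v2 / n2)
    t_welch = (m1 - m2) / se_welch
    df_welch = se_welch ** 4 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))

    return {"pooled": (t_pooled, df_pooled), "welch": (t_welch, df_welch)}


# One (group, score) pair per case: group 1 = experimental, group 2 = control.
records = [(1, 12), (1, 15), (1, 14), (1, 16), (1, 13),
           (2, 10), (2, 11), (2, 9), (2, 12), (2, 8)]

experimental = [score for group, score in records if group == 1]
control = [score for group, score in records if group == 2]

result = independent_t(experimental, control)
print(result)
```

Note that the data are stored as one score column plus one group column, exactly the layout the dialog box expects; the two groups are separated only at analysis time.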
d) Correlations and Regression

To calculate a simple correlation matrix, one must use [Analyze => Correlate => Bivariate...], and [Analyze => Regression => Linear...] for the calculation of a linear regression. Let us briefly outline how the two analyses are performed in SPSS.

i) Simple Correlation
• Click on [Analyze => Correlate => Bivariate...], then select and move "IQ" and "GPA" to the Variables: list. [Explore the options presented on this controlling dialog box.]
• Click on [OK] to generate the requested statistics.

The results from the output window should look like the following:

Correlations
                                                      Number of Brothers   Age of
                                                      and Sisters          Respondent
Number of Brothers and Sisters   Pearson Correlation  1                    .116**
                                 Sig. (2-tailed)                           .000
Age of Respondent                Pearson Correlation  .116**               1
                                 Sig. (2-tailed)      .000
**. Correlation is significant at the 0.01 level (2-tailed).

As you can see, the Pearson Correlation coefficient = 0.116, and p = .000. The results suggest that the correlation is significant at the 1% level (and therefore also at 5%).

Note: In the above example we only created a correlation matrix based on two variables. The process of generating a matrix based on more than two variables is not different. That is, if the dataset consisted of 10 variables, they could all have been placed in the Variables: list. The resulting matrix would include all the possible pairwise correlations.

ii) Linear Regression analysis
• Initiate the procedure by clicking on [Analyze => Regression => Linear...]
• Select and move the endogenous variable into the Dependent: variable box
• Select and move the exogenous variable into the Independent(s): variable box
• Click on [OK] to generate the statistics.

Note: A variety of options can be accessed via the buttons on the bottom half of this controlling dialog box (e.g., Statistics, Plots, ...). Many more statistics can be generated by exploring the additional options via the Statistics button.
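The quantities produced by these two procedures can be computed by hand for a small data set, which helps in reading the output. The Python sketch below is illustrative only; the five (age, siblings) pairs and the function names are hypothetical. It also shows a fact worth remembering when reading regression output: with a single independent variable, the standardized coefficient (Beta) equals the Pearson correlation, whereas the unstandardized coefficient (B) is the slope in the original units.

```python
import math


def sums_of_squares(x, y):
    """Return the means and the sums of squares and cross-products of x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return mx, my, sxy, sxx, syy


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    _, _, sxy, sxx, syy = sums_of_squares(x, y)
    return sxy / math.sqrt(sxx * syy)


def simple_regression(x, y):
    """Least-squares intercept (Constant) and slope (B) for y = a + B * x."""
    mx, my, sxy, sxx, _ = sums_of_squares(x, y)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Hypothetical data: age of respondent and number of siblings.
age = [20, 30, 40, 50, 60]
sibs = [2, 3, 3, 5, 4]

r = pearson_r(age, sibs)
intercept, slope = simple_regression(age, sibs)

# Standardized coefficient: slope rescaled by the spread of x and of y.
_, _, _, sxx, syy = sums_of_squares(age, sibs)
beta = slope * math.sqrt(sxx / syy)

print(f"r = {r:.3f}, Constant = {intercept:.3f}, B = {slope:.3f}, Beta = {beta:.3f}")
```

Predictions are made with the Constant and B, for example `intercept + slope * 35` for a 35-year-old respondent, never with Beta.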
Some of the results of this analysis are presented below:

Coefficients(a)
                                 Unstandardized Coefficients   Standardized Coefficients
Model 1                          B         Std. Error          Beta                        t        Sig.
(Constant)                       3.023     .215                                            14.083   .000
Age of Respondent (AGE)          .020      .004                .116                        4.534    .000
a. Dependent Variable: Number of Brothers and Sisters (SIBS)

The resultant statistics are "Constant", or a from the text, and "Slope", or B from the text. In the above output of the regression analysis, the dependent variable is number of brothers and sisters whereas the independent variable is the age of respondent. As such, one can predict the number of brothers and sisters with the following equation, which uses the unstandardized coefficients:

SIBS = 3.023 + 0.020 * AGE

The interpretation of this model is that there is a significant positive effect of age on the number of siblings, i.e. each additional year of age is associated with an increase of about 0.020 in the predicted number of brothers and sisters. The standardized coefficient (Beta = .116) expresses the same relationship in standard deviation units.

Note: Multiple regression analysis involves more than one independent variable, and the B for each independent variable is interpreted independently of the other variables. The interpretation is the same as for simple regression, but with the other variables held unchanged.

e) One-Way ANOVA

As in the independent t-test datasheet, the data must be coded with a group variable. To complete the analysis,
• Select [Analyze => Compare Means => One-Way ANOVA...] to launch the controlling dialog box.
• Select and move "Scores" into the Dependent List: box
• Select and move "Groups" into the Factor: box
• Click on [OK]

The preceding is a complete specification of the design for this one-way ANOVA. The simple presentation of the results, as taken from the output window, will look like the following.

The analysis that was just performed provides minimal details with regard to the data. If you take a look at the controlling dialog box, you will find 3 additional buttons on the bottom half: [Contrasts...], [Post Hoc...], and [Options...]. Selecting [Options...]
you will find,

f) Factorial ANOVA

To conduct a Factorial ANOVA one need only extend the logic of the one-way design. To compute the relevant statistics, the following simple approach is required:
i) Select [Analyze => General Linear Model => Simple Factorial...]
ii) Select and move "Scores" into the Dependent: box
iii) Select and move "Age" into the Factor(s): box
iv) Click on [Define Range...] to specify the range of coding for the variable, then click on [Continue]
v) Select and move "Condition" into the Factor(s): box
vi) Click on [Define Range...] to specify the range of the Condition factor
vii) Under [Options...] activate Hierarchical, or Experimental, then activate Means and counts, and click [Continue]
viii) Click on [OK] to generate the output.

The output is a complete source table with the factors identified by their Variable Labels.

3.4 Summary

SPSS is among the most widely used statistical packages. There seem to be several reasons for its popularity:
• Force of habit: SPSS has been around since the late 1960s. (Political scientist Norman Nie, who co-authored The Changing American Voter with Sidney Verba, developed it.) SPSS originally stood for "Statistical Package for the Social Sciences", but the name has since been changed to reflect the marketing of SPSS outside the academic community.
• Of the major packages, it seems to be the easiest to use for the most widely used statistical techniques;
• One can use it with either a Windows point-and-click approach or through syntax (i.e., writing out SPSS commands). Each has its own advantages, and the user can switch between the approaches;
• Many of the widely used social science data sets come with an easy method to translate them into SPSS; this significantly reduces the preliminary work needed to explore new data.

There are also two important limitations that deserve mention at the outset:
• SPSS users have less control over statistical output than, for example, Stata or Gauss users.
For novice users, this hardly causes a problem. But, once a researcher wants greater control over the equations or the output, she or he will need to either choose another package or learn techniques for working around the limitations of SPSS;
• SPSS has problems with certain types of data manipulations, and it has some built-in quirks that seem to reflect its early creation. The best known limitation is its weak lag functions, that is, how it transforms data across cases. For new users working with standard data sets, this is rarely a problem. But, once a researcher begins wanting to significantly alter data sets, he or she will have to either learn a new package or develop greater skills at manipulating SPSS.

Overall, SPSS is a good first statistical package for people wanting to perform quantitative research in social science because it is easy to use and because it can be a good starting point for learning more advanced statistical packages.

3.5 References

Burchinal, Lee (1997) Methods for Social Researchers in Developing Countries. Available in PDF at www.srmdc.net/ Last accessed on October 15th 2008.
Gerber SB and Finn KV (2005) Using SPSS for Windows. 2nd ed. http://www.hmdc.harvard.edu/projects/SPSS_Tutorial. Last accessed in July 2010.
http://stattrek.com - Select Tutorials and then Introduction to Probability and Statistics. Last accessed in July 2010.
Robert Burns and Richard Burns ( ) Business Research and Statistics using SPSS. SAGE Pub. Ltd.
SPSS for Windows software (any version between Version 14 and the most recent)
SPSS programme/manual

MODULE FIVE
RESEARCH REPORT WRITING

LECTURE ONE
WRITING A RESEARCH REPORT, DISSERTATION AND THESIS
(By Prof. S. Mbogo, Dr. L. Kisoza and Ms. H. Mtae)

1.1 Introduction

The purpose of this chapter is to introduce you to how to write research reports, dissertations and theses. The chapter covers: the rationale for writing reports, types of reports and the main contents of the report.
1.2 Learning outcomes

After completion of this chapter you should be able to:
i) Identify the different types of report
ii) Outline the main components of a research report
iii) Outline the main components of a report
iv) Write scientific reports
v) Explain the importance of formatting research reports

1.3 Rationale for Report Writing

There are many reasons for reporting research results, including:
• To provide criteria for judging the capacity of research personnel. Research output can only be judged if it is reported continuously.
• To disseminate research findings to other researchers and to indicate directions for future research. One of the purposes of research is to generate new information; research can thus provide solutions to problems or avenues for further research. Reporting also prevents duplication of research efforts.
• To justify expenditure of public or donor funds. The funding agencies need to satisfy themselves that the research funds were worth spending.

1.4 How to Get Started

The main assumption here is that you have come up with a good idea for research, had your proposal approved, collected the data, conducted your analyses and are now about to start writing the dissertation. Take your proposal and begin by checking your proposed research methodology. Change the tense from future to past and then make any additions or changes so that the methodology section truly reflects what you did. You have now turned sections of the proposal into sections of the dissertation. Move on to the Statement of the Problem and the Literature Review in the same manner.

You first have to make an outline for your report. This outline will contain three main parts. The first part will consist of a description of your problem within its context (the country and research area), the objectives of the study and the methodology. This part should not comprise more than one quarter of the report.
The second part will form the bigger part of your report and will contain the research findings. The third and final part will consist of the discussion of your data, conclusions and recommendations. Then you will have to make your report, dissertation or thesis attractive and user-friendly with a creative title page, a preface with acknowledgements, a table of contents, and a list of tables, figures and abbreviations. The references you used for your study will have to be added, as well as annexes including your data-collection tools.

Before you start writing, it is therefore essential to group and review the data you have analysed by objective. Check whether all data have indeed been processed and analysed as planned. Draw major conclusions and relate these to the literature read. If necessary go back to your raw data and refine your analysis, or search for additional literature to answer questions that the analysis of your data may raise. Compile the major conclusions and the tables or quotes from qualitative data related to each specific objective. You are now ready to draft the report. Reports need to:
• Have a logical, clear structure,
• Be to the point, and
• Use simple language and have a pleasant layout.

1.5 Preliminary Considerations

When writing research reports the following issues must be taken into consideration:

a) Knowledge of your audience
• Know who your readers are. You must also take into consideration the needs of the community, as well as those of policy and programme makers.
• Why do they want to read your report? For instance, it is established that most people would like to know the solutions to a problem rather than being told what the problem is.

b) Knowledge of how the reader reads your report
Most readers want to know about the new information generated by a particular piece of research. The new knowledge is usually highlighted in the conclusion. For that reason such readers begin with the conclusion.
c) Complete data analysis before you start writing a report
Before you start writing a report you need to review your data analysis thoroughly and ask yourself the following:
• Are the conclusions appropriate to the specific objectives?
• Are the analytical tables adequate?
• Have all methods of data collection been included?

1.6 Types of Research Reports

There are different types of research, and therefore different types of research reports to match them. Examples of research reports are:

a) Specific project reports
These include progress reports and annual reports. Formats for such reports differ depending on the funding agency. These reports have a very limited circulation.

b) Theses and dissertations
These are specialized reports prepared by postgraduate students. Their formats differ depending on the specific requirements of particular universities or institutions.

c) Technical articles for journals or scientific conferences
These include either short or full-length papers. The formats and styles of presentation depend on the specifications of the publisher of the journal or proceedings. These articles have the potential to circulate to a wider, international audience, depending on the distribution.

1.7 Components of Research Report

The research report should contain the following components.

1.7.1 Title
A good title is one that uses the minimum possible number of words to describe accurately the content of the paper.

1.7.2 Cover page
The cover page should contain the full title of the research report, dissertation or thesis and the name of the author. If it is a dissertation or thesis, it should include the degree and the university as well as the year of submission.

1.7.3 Abstract
The abstract (summary) should be written only after the first or even the second draft of the report has been completed.
It should contain:
• A very brief description of the problem (why this study was needed)
• Main objectives (what has been studied)
• Place of study (where)
• Type of study and methods used (how)
• Major findings and conclusions, followed by
• Major (or all) recommendations.

1.7.4 Acknowledgements
It is good practice to thank those who supported you technically or financially in the design and implementation of your study. Your employer, who has allowed you to invest time in the study, and the respondents may also be acknowledged.

1.7.5 Table of contents
A table of contents is essential. It provides the reader with a quick overview of the major sections of your report, with page references, so that the reader can go through the report in a different order or skip certain sections.

1.7.6 List of tables, figures
If you have many tables or figures it is helpful to list these as well, in a ‘table of contents’ type of format with page numbers.
Examples:
Tables
• Table 1.1 means Table 1 in chapter one
• Table 2.1 means Table 1 in chapter two
Figures
• Fig. 1.1 means Figure 1 in chapter one
• Fig. 2.1 means Figure 1 in chapter two

1.7.7 List of abbreviations
If abbreviations or acronyms are used in the report, these should be stated in full in the text the first time they are mentioned. If there are many, they should be listed in alphabetical order as well.

The table of contents and the lists of tables, figures and abbreviations should be prepared last, as only then can you include the page numbers of all chapters and sub-sections in the table of contents. Then you can also finalise the numbering of figures and tables and include all abbreviations.

1.7.8 Introduction
The introductory chapter should give the reader a clear idea of the central issue of concern in your research and why you thought that it was worth studying. The introduction is a relatively easy part of the report that can best be written after a first draft of the findings has been made.
It should certainly contain some relevant background data about the country that are related to the problem that has been studied. Then the statement of the problem should follow, revised from your research proposal with additional comments and relevant literature collected during the implementation of the study. It should contain a paragraph on what you hope(d) to achieve with the results of the study.

Global literature can be reviewed in the introduction to the statement of the problem if you have selected a problem of global interest. Otherwise, relevant literature from individual countries may follow as a separate literature review after the statement of the problem. You can also introduce theoretical concepts or models that you have used in the analysis of your data in a separate section after the statement of the problem.

1.7.9 Literature review
The main purpose of the literature review is to set your study within its wider context and to show the reader how your study supplements the work that has already been done by others.

1.7.10 Research Design and Methods/Methodology
This should be a detailed chapter giving the reader sufficient information to estimate the reliability and validity of your methods. The methodology you followed for the collection of your data should be described in detail, including a description of:
• The study type
• Major study themes or variables (a more detailed list of variables on which data were collected may be annexed)
• The study population(s), sampling method(s) and the size of the sample(s)
• Data-collection techniques used for the different study populations
• How the data were collected and by whom
• Procedures used for data analysis, including statistical tests (if applicable).

1.7.11 Results
This is the most straightforward chapter, as you just have to report the facts that your research discovered.
It includes:
• Tables and graphs that illustrate your findings
• Quotes from interviewees (the qualitative equivalent of tables and graphs)
• Sections of narrative account that illustrate periods of unstructured observation.

1.7.12 Discussion
The main purpose of the discussion chapter is to interpret the results that you presented in the previous chapter. It involves making judgements about the research findings rather than reporting facts. Findings should be discussed by objective. You should state how your findings relate to the goals, questions and hypotheses you stated earlier. The discussion also considers the implications of your research for the relevant theories that you detailed in your literature review. It is usual to discuss the strengths, weaknesses and limitations of your study. The discussion may include findings from other related studies that support or contradict your own.

1.7.13 Conclusion and Recommendations

a) Conclusion
A conclusion is a synthesis of findings corresponding to a specific circumstance, based on the researcher’s understanding. This should be a conclusion to the whole chapter, not just to the research findings, and it should not include new ideas. The best way is to follow a structure similar to that used in your findings section.

b) Recommendation
A recommendation is a suggested course of action based on the conclusions about a specific circumstance. It involves suggestions for improving the programme or activity researched.

The conclusions and recommendations should follow logically from the discussion of the findings. Conclusions can be short, as they have already been discussed at length in the research findings. As the discussion follows the sequence in which the findings have been presented, the conclusions should logically follow the same order.
Recommendations should be placed in roughly the same sequence as the conclusions, or may at the same time be summarised according to the groups towards which they are directed, for example programme makers, policy makers, the community or further studies. In making recommendations, use not only the findings of your study but also supportive information from other sources. You should also consider the constraints, feasibility and usefulness of the proposed solutions.

1.7.14 References
This section covers material that has been referred to or quoted in the study (a detailed explanation is presented in the next chapter).

1.7.15 Appendices
These include materials that may be of interest to the reader but are not crucial to the study (a detailed explanation is presented in the next chapter).

1.8 Writing Style

The major myth in writing a dissertation is that you start writing at Chapter One and finish at Chapter Five. This is seldom the case. The most productive approach is to begin writing those parts of the dissertation that you are most comfortable with. Then move about in your writing, completing various sections as you think of them. At some point you will be able to spread out in front of you all of the sections that you have written. You will be able to sequence them in the best order and then see what is missing and should be added to the dissertation.

It is an axiom that there are as many writing styles as there are writers. For that reason there is no single prescription for writing style. Nonetheless, the following should be taken into consideration when you start to write a report. Firstly, remember that your reader has many other urgent matters to attend to and is therefore short of time. Secondly, the reader is probably not familiar with research jargon. Therefore consider the following:
• Simplify your report by keeping to essentials; be precise and specific.
• Be clear, logical and systematic. Use adverbs and adjectives sparingly, and be consistent in the use of tenses (past, present).
• Justify what you report by making only statements that are based on facts, and use short sentences.
• Always strive to inform, not to impress. Always quantify your results: avoid expressions like "large" or "small"; instead say "almost 75%" or "one in three".

Dissertation-style writing is not designed to be entertaining. Dissertation writing should be clear and unambiguous. To do this well you should prepare a list of key words that are important to your research, and your writing should then use this set of key words throughout. There is nothing so frustrating to a reader as a manuscript that keeps using alternate words to mean the same thing. If you've decided that a key phrase for your research is "educational workshop", then do not try substituting other phrases like "in-service program", "learning workshop", "educational institute", or "educational program". Always stay with the same phrase, "educational workshop". It will then be very clear to the reader exactly what you are referring to.

1.9 Layout of the Report

Ensure that your report has a good layout and meets the specifications of publishers, universities or institutions. A good layout helps your report to make a good initial impression, encourages the reader, and gives an idea of how the information is organized. In order to have a good layout, take the following into consideration:
• An attractive layout for the title page, and a clear table of contents
• Consistency in margins and spacing
• Consistency in headings and subheadings, e.g. use of bold, italics, underline, lower case, upper case
• Consistency in numbering for figures and tables
• Accuracy and consistency in quotations and references
• High-quality photocopying

Review two or three well-organized and well-presented dissertations. Examine their use of headings, overall style, typeface and organization.
Use them as a model for the preparation of your own dissertation. In this way you will have an idea at the beginning of your writing what your finished dissertation will look like. A most helpful perspective!

1.10 Drafts

A starting point is the report draft. It is advisable to prepare an outline of the report first, which consists of the following:
• Headings of main sections
• Headings of subsections
• Points to be made for each subsection
• A list of tables and figures to be included

As you get involved in the actual writing of your dissertation you will find that conservation of paper begins to fade away as a concern. As soon as you print a draft of a chapter, a variety of needed changes will appear, and before you know it another draft will be printed. And it seems almost impossible to throw away any of the drafts! After a while it will become extremely difficult to remember which draft of your chapter you are looking at. Print each draft of your dissertation on a different colour of paper. With the different colours it will be easy to see which is the latest draft.

1.11 Review exercises

1. What should one think of before embarking on report writing?
2. There are different types of reports. List all the types you know.
3. Choose a title of your own and write a simple report on it, including all the major components of a report.

LECTURE TWO
CITATION, REFERENCES AND APPENDICES
(By Prof. S. Mbogo, Dr. L. Kisoza and Ms. H. Mtae)

2.1 Introduction

The main purpose of this chapter is to introduce you to citation, references, quotations and appendices. It covers the purpose of citation, what to cite, citation styles, references, appendices and their importance.

2.2 Learning outcomes

At the end of this chapter you are expected to be able to:
i) Differentiate between citations, quotations, references and appendices
ii) Explain the purpose of citation
iii) Identify what to cite and quote
iv)
Explain the importance of citation, references and appendices
v) Use citation, references, quotations and appendices appropriately in reports, dissertation and thesis writing.

2.3 Citation

Citation means referring to the work of other authors in your own text. It is normally done to show evidence of the background reading that has been done and to support the contents of your research report. Each citation requires a reference at the end of your text. These references may be to work presented in journal or newspaper articles, government reports, books or specific chapters of books, research dissertations or theses, material from the Internet, etc.

2.3.1 Purpose of citation
There are three main reasons for you to include citations in your papers:
• To give credit to the authors of the source materials you used when writing the paper.
• To enable readers to follow up on the source materials.
• To demonstrate that your paper is well-researched.

2.3.2 What to Cite
You should cite all direct quotations, paraphrased factual statements, and borrowed ideas. The only items that you do not need to cite are facts that seem to be common knowledge. When you draw a great deal of information from a single source, you should cite that source even if the information is common knowledge, since the source (and its particular way of organizing the information) has made a significant contribution to your research report. Failure to give credit to the words and ideas of an original author is plagiarism.

2.3.3 Citation Styles
Various citation styles exist. They convey the same information; only the presentation of that information differs. Most style guides fall into two commonly used systems:
• Author-date system (e.g. Harvard)
• Numeric system (e.g. Vancouver)
Note that the MLA (Modern Language Association) style, which is sometimes grouped with the numeric systems, cites by author and page number rather than by number.
Whichever system you use, it is important that you are consistent in its application.
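To make the difference between the two systems concrete, the following is a minimal Python sketch, not a full implementation of any style guide. It renders the same sources once as author-date citations and once as numeric citations. The `Source` class, the function names and the co-author placeholders are hypothetical and introduced only for illustration; the numeric form assumes the reference list is numbered in order of first citation, and the comma before the year is one of several accepted author-date variants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A cited work, reduced to what an in-text citation needs."""
    surnames: tuple  # author surnames in publication order
    year: int


def author_date(source: Source) -> str:
    """Author-date form: one author, two authors, or 'et al.' for more."""
    names = source.surnames
    if len(names) == 1:
        who = names[0]
    elif len(names) == 2:
        who = f"{names[0]} and {names[1]}"
    else:
        who = f"{names[0]} et al."
    return f"({who}, {source.year})"


def numeric(source: Source, reference_list: list) -> str:
    """Numeric form: the position of the source in the numbered list."""
    return f"[{reference_list.index(source) + 1}]"


# Assumed to be listed in order of first citation in the text.
# "CoauthorB" and "CoauthorC" are placeholders, not real names.
references = [
    Source(("Jones",), 1992),
    Source(("Bellamy", "Taylor"), 1998),
    Source(("Levy", "CoauthorB", "CoauthorC"), 1991),
]

for ref in references:
    print(author_date(ref), numeric(ref, references))
```

Both outputs point to the same entry in the reference list; only the presentation in the running text changes, which is why mixing the two systems in one report confuses the reader.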
2.3.4 Citations in the text
All ideas taken from another source, whether directly quoted or paraphrased, need to be referenced in the text. To link the information you use in your text to its source (book, article, etc.), put the author’s name and the year of publication at the appropriate point in your text.

If the author’s name does not naturally occur in your writing, put the author’s surname and date in brackets.
For example:
There is some evidence that these figures are incorrect (Jones, 1992).

If the author’s name is part of the statement, put only the year in brackets.
For example:
Jones (1992) has provided evidence that these figures are incorrect.

If there are two authors, give both.
For example:
It is claimed that government in the information age will “work better and cost less” (Bellamy and Taylor, 1998).
Note: if you are giving a direct quotation then you need to include the page number.

If there are more than two authors, cite only the first followed by ‘et al.’ (which means ‘and others’).
For example:
. . . adoptive parents were coping better with the physical demands of parenthood and found family life more enjoyable (Levy et al. 1991).
Note: up to three author names can be given in your reference list/bibliography.

If an author has published more than one document in the same year, distinguish between them by adding lower-case letters.
For example:
In recent studies by Smith (1999a, 1999b, 1999c) . . .

2.3.5 Citation of work described in another work
When an author quotes or cites another author and you wish to cite the original author, you should first try to trace the original item. However, if this is not possible, you must acknowledge both sources in the text, but only include the item you actually read in your reference list.
For example:
If Jones discusses the work of Smith you could use:
Smith (2005) as cited by Jones (2008)
or
Smith’s 2005 study (cited in Jones 2008, p. 156) shows that…
Then cite Jones in full in your reference list.

2.3.6 Information found in more than one source
If you find information in more than one source, you may want to include all the references to strengthen your argument. In that case, cite all sources in the same brackets, placing them in order of publication date (earliest first). Separate the references using a semi-colon (;).
For example:
Several writers (Jones 2004; Biggs 2006; Smith 2008) argue…

2.3.7 Chapter/section of an edited book
For example:
The view proposed by Franklin (2002, p. 88) . . .

1. Journal article
For example:
. . . the customer playing the part of a partial employee (Dawes and Rowley, 1998).

2. Newspaper article
For example:
TGNP (2010) accused the 18th Parliament Session of Tanzania of not delivering as expected.

3. Electronic information
For example:
One commentator (Ben, 2005) questioned whether educators will have time to acquire . . .

Repeating a Citation
After the first complete citation of a work, you may abbreviate subsequent instances by using either Ibid. or a shortened form of the citation. See the following examples of each style.

Ibid.
Use Ibid. to repeat a footnote that appears immediately before the current footnote. Ibid. takes the place of the author’s name, the title of the work, and as much of the subsequent information as is identical.
For example:
50 Thomas Smith, “New Debate over Business Records,” The New York Times, December 31, 1978, sec. 3, p. 5.
51 Ibid., p. 6.

1. Quotations
Quoting involves using exact words, phrases and sentences from a source, setting them off with quotation marks, and citing where the information was taken from.
For example:
According to Berestein (2003), the Middle Eastern water pipe known as the hookah recently "has been resurrected in youth-oriented coffee houses, restaurants and bars, supplanting the cigar as the fad of the moment."

Smoothly incorporate the quote into your document. Try using a compound or complex sentence in which at least one entire clause includes only your original ideas. The other clause is the quote.
For example:
Because Lenina is incapable of real love, she misunderstands John's emotion when he tells her, "I love you more than anything in the world."

2.4 References

This is a list of the bibliographic details of all items referred to directly in the text. There are many styles, but the most popular ones are:
i) Author-date systems (Harvard and American Psychological Association (APA) styles)
ii) Numeric systems (Vancouver style)
The MLA (Modern Language Association) style is also widely used; it cites by author and page number rather than by number.
You have to choose the appropriate referencing system for your research report, as different universities use different systems of referencing.

2.4.1 Importance of References
References are used to:
• Enable the reader to locate the sources you have used;
• Help support your arguments and give your work credibility;
• Show the scope and breadth of your research;
• Acknowledge the source of an argument or idea. Failure to do so could result in a charge of plagiarism.

2.4.2 Reference List
Full references of the sources used should be listed at the end of your work as a reference list. This list of references is arranged alphabetically, usually by author.

2.4.3 Plagiarism
Plagiarism is the submission of an item of assessment containing elements of work produced by another person(s) in such a way that it could be assumed to be your own work.
Examples of plagiarism are:
• the verbatim copying of another person’s work without acknowledgement
• the close paraphrasing of another person’s work by simply changing a few words or altering the order of presentation without acknowledgement
• the unacknowledged quotation of phrases from another person’s work and/or the presentation of another person’s idea(s) as one’s own.
Plagiarised work may be from a published source such as a book, report, journal or material available on the internet.

2.4.4 What should you include in a reference
For each reference you make in a reference list or bibliography, it is essential that you record the following pieces of information so that you keep track of all your references:
• Authors/editors
• Year of publication
• Title
• Edition
• Publisher

2.4.5 How to Collect and Organise References
It is often not easy (or possible) to retrieve sources after you have written your text. For this reason it is best to keep a good record of everything that you use. Bibliographic software, such as Endnote, Procite or Reference Manager, will help you organise your references according to different citation systems and add the citations to your text. Alternatively, you could store your references on index cards.

Start your references section at the beginning of the writing process and add to it as you go along. Ensure that the reference section lists all the sources to which you have referred in the text. Ensure that all data and material taken verbatim from another person’s published or unpublished written or electronic work are explicitly identified and referenced to their author. This also applies to work that is referred to in the written work of others, even if the material is not quoted verbatim.
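The check that every source cited in the text also appears in the reference section can be partly automated. The following is a rough Python sketch under simplifying assumptions: it recognises only the simple author-date forms "(Jones, 1992)", "(Bellamy and Taylor, 1998)", "(Levy et al. 1991)" and "Jones (1992)", it identifies a source by first-author surname and year only, and it does not handle several sources inside one pair of brackets. The function names and the sample text are hypothetical.

```python
import re

# Parenthetical form: (Jones, 1992), (Bellamy and Taylor, 1998), (Levy et al. 1991)
PAREN = re.compile(
    r"\(([A-Z][\w'-]+)(?: and [A-Z][\w'-]+| et al\.)?,? (\d{4}[a-z]?)\)"
)
# Narrative form: Jones (1992)
NARRATIVE = re.compile(r"\b([A-Z][\w'-]+) \((\d{4}[a-z]?)\)")


def cited_keys(text: str) -> set:
    """Return the (first surname, year) pairs cited in the text."""
    keys = set()
    for pattern in (PAREN, NARRATIVE):
        keys.update(pattern.findall(text))
    return keys


def missing_references(text: str, reference_keys) -> list:
    """List citations in the text that have no entry in the reference list."""
    return sorted(cited_keys(text) - set(reference_keys))


draft = (
    "These figures may be incorrect (Jones, 1992). Government will "
    '"work better and cost less" (Bellamy and Taylor, 1998). '
    "Smith (1999a) disagrees."
)
# Hypothetical reference list, keyed by first surname and year.
reference_list = {("Jones", "1992"), ("Bellamy", "1998")}

print(missing_references(draft, reference_list))
```

A check of this kind only flags candidates for manual review; bibliographic software such as that mentioned above remains the more reliable way to keep citations and references in step.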
2.4.6 Author Date Referencing Styles

2.4.6.1 The Harvard Style

According to Neville (2007), there are variations within the Harvard style, including:
• Name(s) of authors or organisations may or may not be in UPPER CASE
• Where there are more than two authors, the names of the second and subsequent authors may or may not be replaced by et al. in italics
• The year of publication may or may not be enclosed in brackets
• The title of the publication may be in italics or may be underlined

Examples:

I) Books and chapters in books
• Book (first edition)
Berman Brown, R. and Saunders, M. (2008). Dealing with statistics: What you need to know. Maidenhead: Open University Press.
• Book (other than first edition)
Morris, C. (2003). Quantitative approaches to business studies. (6th edn). London: Financial Times Pitman Publishing.
• Book (no obvious author)
Mintel Marketing Intelligence (1998). Designer wear: Mintel marketing intelligence report. London: Mintel International Group Ltd.
• Chapter in a book
Robson, C. (2002). Real World Research. (2nd edn). Oxford: Blackwell. Chapter 3.
Tuckman, A. (1999) Labour, skills and training. In: R. Levitt et al., eds. The reorganised National Health Service. 6th ed. Cheltenham: Stanley Thornes, pp. 135-155
• Chapter in an edited book containing a collection of articles
King, N. (2004). Using templates in the thematic analysis of text. In C. Cassel and J. Symon (eds) Essential guide to qualitative methods in an organizational research. London: Sage. pp. 256-270
• Books
Kadolph, S.J. (2007) Textiles, 10th ed. New Jersey: Pearson Prentice Hall
• Books with two or three authors
Li, X. and Crane, N.B. (1993) Electronic style: a guide to citing electronic information. London: Meckler
• Books with more than three authors
Levitt, R. et al. (1999) The reorganised National Health Service. 6th ed. Cheltenham: Stanley Thornes.

II) Other sources
• Journal article (originally printed but same as found on line)
Storey, J., Cressey, P., Morris, T.
and Wilkinson, A. (1997). Changing employment practices in UK banking: case studies. Personnel Review. Vol. 26, No. 1, pp. 24-42.
• Journal article only published online
Illingworth, N. (2001). The Internet matters: exploring the use of the internet as a research tool. Sociological Research Online, Vol. 6, No. 2. Available at http://www.socresonline.org.uk/6/2/illingworth.htm [Accessed 14th May 2002].
• Magazine article (no obvious authors)
Quality World. (2007). Immigration abuse. Quality World. Vol. 33, No. 12, p. 6
• Papers in conference proceedings
Gibson, E.J. (1977) The performance concept in building. In: Proceedings of the 7th CIB Triennial Congress, Edinburgh, September 1977. London: Construction Research International, pp. 129-136
• Publication from corporate body (e.g. Government publication)
Great Britain. Department of the Environment, Development Commission (1980) 38th Report, 1st April 1979 to 31st March 1980. London: HMSO, 1979-80 HC. 798, pp. 70-81
• Newspaper article where the author is identified
Kikwete J.K. (2010). Time for Tanzania to rid itself of corrupt leaders. This Day, Tuesday March 1st 2010, p. 1
• Thesis
Tregear, A.E.J. (2001) Speciality regional foods in the UK: An investigation from the perspectives of marketing and social history. Unpublished PhD thesis. University of Newcastle upon Tyne.

III) Electronic Sources
• Websites
National electronic Library for Health, 2003. Can walking make you slimmer and healthier? (Hitting the headlines article) [Online] (Updated 16 Jan 2005) Available at: http://www.nhs.uk.hth.walking [Accessed 10 April 2005].
• Publications
Scottish Intercollegiate Guidelines, 2001. Hypertension in the elderly. (SIGN publication 20) [internet] Edinburgh: SIGN (Published 2001) Available at: http://www.sign.ac.uk/pdf/sign49.pdf [Accessed 17 March 2005].
• E-mail correspondence
Available at: http://gog.defer.com/2004_07_01_defer_archive.html [Accessed 7 July 2005].
• Electronic books (e-books)
Grahame, K. (1917). The wind in the willows. Netlibrary (online). Available at http://www.netlibrary.com [Accessed 14th July, 2005]
• Article in electronic journals
Bright, M. (1985). 'The poetry of art', Journal of the History of Ideas, 46(2), pp. 259-277. JSTOR (online). Available at: http://uk.jstro.org/ [Accessed 16th June 2005]

IV) Other types of documents
• Acts of Parliament
Higher Education Act 2004. (c.8), London: HMSO.
For Acts prior to 1963, the regnal year and parliamentary session are included:
Road Transport Lighting Act 1957. (5&6 Eliz. 2, c.51), London: HMSO.
• Statutory Instruments
Public Offers of Securities Regulations 1995. SI 1995/1537, London: HMSO.
• Command Papers and other official publications
Royal Commission on civil liability and compensation for personal injury, 1978. (Pearson Report) (Cmnd. 7054) London: HMSO.
Select Committee on nationalised industries (1978-9). Consumers and the nationalised industries: prelegislative hearings (HC 334 of 1978-9) London: HMSO.
http://libweb.anglia.ac.uk/referencing/harvard.htm
• Law report
R v White (John Henry) [2005] EWCA Crim 689, 2005 WL 104528.
Jones v Lipman [1962] 1 WLR 832.
Saidi v France (1994) 17 EHRR 251, p. 245
• Annual report
Marks & Spencer, 2004. The way forward, annual report 2003-2004, London: Marks & Spencer
• For an e-version
Marks & Spencer, 2004. Annual report 2003-2004. [Online] Available at: http://www-marks-and-spencer.co.uk/corporate/annual2003/ [Accessed 4 June 2005]
N.B. the URL should be underlined
• Map
Ordnance Survey, 2006. Chester and North Wales. Landranger series, Sheet 106, 1:50000, Southampton: Ordnance Survey
• Pictures, Images and Photographs
Beaton, C., 1956. Marilyn Monroe. [Photograph] (Marilyn Monroe’s own private collection).
Beaton, C., 1944. China 1944: A mother resting her head on her sick child's pillow in the Canadian Mission Hospital in Chengtu. [Photograph] (Imperial War Museum Collection).
Electronic reference:
Dean, Roger, 2008. Tales from Topographic Oceans. [electronic print] Available at: http://rogerdean.com/store/product_info.php cPath=48&products_id=88 From home page/store/calendar/august [Accessed 18 June 2008].

V) Unpublished works
• Unpublished works
Woolley, E. & Muncey, T., (in press) Demons or diamonds: a study to ascertain the range of attitudes present in health professionals to children with conduct disorder. Journal of Adolescent Psychiatric Nursing. (Accepted for publication December 2002).
• Informal or in-house publications
Anglia Ruskin University, 2007. Using the Cochrane Library. [Leaflet]
• Personal communications
O’Sullivan, S., 2003. Discussion on citation and referencing [Letter] (Personal communication, 5 June 2003).
• Unpublished conference papers
Saunders, M.N.K., Thornhill, A. and Evans, C. (2002). Conceptualising trust and distrust and the role of boundaries: an organisationally based exploration. Unpublished paper presented at ‘EIASM 4th Workshop on Trust Within and between organisations’. Amsterdam, 25-26 Oct. 2007.
• Internet site
European Commission. (2007). Eurostat-structural indicators. Available at http://epp.eurostat.ec.europa.eu/portal/page?_pageid=1133_47800773, 1133_47802558&_dad=portal &_schema=PORTAL [Accessed 27 Nov. 2007]
• Internet reports and guides
Browne, L. and Alstrup, P. (2006). What exactly is the Labour Force Survey? Available at http://www.statistics.gov.uk

2.4.6.2 APA Style

The APA style guide prescribes that the Reference section, bibliographies and other lists of names should be ordered by surname first, and mandates the inclusion of surname prefixes. For example, "Martin de Rijke" should be sorted as "de Rijke, M." and "Saif Al-Falasi" should be sorted as "Al-Falasi, S."

Examples:

1.
Published Materials

Book by one author
• Sheril, R. D. (1956). The terrifying future: Contemplating color television. San Diego, CA: Halstead.

Book by two authors
• Kurosawa, J., & Armistead, Q. (1972). Hairball: An intensive peek behind the surface of an enigma. Hamilton, Ontario, Canada: McMaster University Press.

Chapter in an edited book
• Mcdonalds, A. (1993). Practical methods for the apprehension and sustained containment of supernatural entities. In G. L. Yeager (Ed.), Paranormal and occult studies: Case studies in application (pp. 42–64). London, England: OtherWorld Books.

Dissertation (PhD or masters)
• Mcdonalds, A. (1991). Practical dissertation title (Unpublished doctoral dissertation). University of Florida, Gainesville, FL.

Article in a journal with continuous pagination (nearly all journals use continuous pagination)
• Rottweiler, F. T., & Beauchemin, J. L. (1987). Detroit and Narnia: Two foes on the brink of destruction. Canadian/American Studies Journal, 54, 66–146.
• Kling, K. C., Hyde, J. S., Showers, C. J., & Buswell, B. N. (1999). Gender differences in self-esteem: A meta-analysis. Psychological Bulletin, 125, 470–500.

Article in a journal paginated separately
• Crackton, P. (1987). The Loonie: God's long-awaited gift to colourful pocket change? Canadian Change, 64(7), 34–37.

Article in a weekly magazine
• Henry, W. A., III. (1990, April 9). Making the grade in today's schools. Time, 135, 28–31.

Article in a weekly magazine with DOI
• Hoff, K. (2010, March 19). Fairness in modern society. Science, 327, 1467–1468. doi:10.1126/science.1188537

Article in a print newspaper
• Wrong, M. (2005, August 17). "Never Gonna Give You Up" says Mayor. Toronto Sol, p. 4.

2. Electronic sources

Online article based on a print source, with DOI (e.g., a PDF of a print source from a database)
• Krueger, R. F., Markon, K. E., Patrick, C. J., & Iacono, W. G. (2005).
Externalizing psychopathology in adulthood: A dimensional-spectrum conceptualization and its implications for DSM-V. Journal of Abnormal Psychology, 114, 537–550. doi:10.1037/0021-843X.114.4.537
Online article based on a print source, without DOI (e.g., a PDF of a print source from a database)
• Marlowe, P., Spade, S., & Chan, C. (2001). Detective work and the benefits of colour versus black and white. Journal of Pointless Research, 11, 123–127.
Online article from a database, no DOI, available ONLY in that database (proprietary content, not things like Ovid, EBSCO, and PsycINFO)
• Liquor advertising on TV. (2002, January 18). Retrieved from http://factsonfile.infobasepublishing.com/
OR
• Liquor advertising on TV. (2002, January 18). Retrieved from Issues and Controversies database.
Article in an Internet-only journal
• McDonald, C., & Chenoweth, L. (2009). Leadership: A crucial ingredient in unstable times. Social Work & Society, 7. Retrieved from http://www.socwork.net/2009/1/articles/mcdonaldchenoweth
Article in an Internet-only newsletter (eight or more authors)
• Paradise, S., Moriarty, D., Marx, C., Lee, O. B., Hassel, E., . . . Bradford, J. (1957, July). Portrayals of fictional characters in reality-based popular writing: Project update. Off the Beaten Path, 7. Retrieved from http://www.newsletter.offthebeatenpath.news/otr/complaints.html
Article with no author identified
• Britain launches new space agency. (2010, March 24). Retrieved from http://news.ninemsn.com.au/technology/1031221/britain-launches-new-space-agency
Article with no author and no date identified (e.g., wiki article)
• Harry Potter. (n.d.). In Wikipedia. Retrieved March 12, 2010, from http://en.wikipedia.org/wiki/Harry_Potter
Entry in an online dictionary or reference work, no date and no author identified
• Verisimilitude. (n.d.). In Merriam-Webster's online dictionary (11th ed.).
Retrieved from http://www.merriam-webster.com/dictionary/verisimilitude
E-mail or other personal communication (cite in text only)
• (A. Monterey, personal communication, September 28, 2001)
Book on CD
• Nix, G. (2002). Lirael, Daughter of the Clayr [CD]. New York, NY: Random House/Listening Library.
Book on tape
• Nix, G. (2002). Lirael, Daughter of the Clayr [Cassette Recording No. 1999-1999-1999]. New York, NY: Random House/Listening Library.
Movie
• Gilby, A. (Producer), & Schlesinger, J. (Director). (1995). Cold comfort farm [Motion picture]. Universal City, CA: MCA Universal.
3. Statistical expressions in APA
Note on probabilities
There are two ways to report statistical probability: a pre-specified probability, given as a range below the chosen alpha level, and an exact probability, given as a calculated p-value. Since most statistical packages calculate an exact value for p, the Publication Manual recommends that exact p-values be reported.
• Example: p < .05
• Example: p = .031 (preferred)
Exceptions, where a pre-specified probability range may be preferred, include large or complex tables of correlations and cases where the p-value is particularly small (e.g., p < .001).
Reporting F-tests
General format: F([df-between], [df-within]) = [F-obtained], p = [p-value], [eta-squared obtained] = [value].
• Example: F(2, 50) = 9.35, p < .001, η² = .03.
If a p-value is not significant, then either the letters ns are substituted or the precise p-value is given, prefaced by an equals sign.
• Example: F(2, 50) = 1.35, ns.
• Example: F(2, 50) = 1.35, p = .18. (preferred)
If an F-value is less than 1, thereby implying that it can never be statistically significant, then neither the F-value itself nor the associated p-value is reported.
• Example: F(2, 50) < 1.
• Example: F < 1.
Reporting t-tests
General format: t([df error]) = [t-obtained], p = [p-value], [Cohen's d obtained] = [value].
• Example: t(9) = 2.35, p = .043, d = .70.
Reporting χ² tests
General format: χ²([df error], N = [total sample size]) = [Chi-squared obtained], p = [p-value].
• Example: χ²(4, N = 24) = 12.4, p = .015.
2.5 Appendices
An appendix is a supplement to the research report, dissertation or thesis. It should not normally include material that is essential for understanding the report itself, but rather additional relevant material in which the reader may be interested.
• Appendices should be kept to a minimum.
• Material that is interesting to know rather than essential to know should be placed in appendices.
• Appendices should include a blank copy of your questionnaire, interview or observation schedule. Where these were administered in a different language from that in which you write your submitted research report, you will need to submit both the original version and the translation.
In documents, the term appendix may refer to:
• Addendum, any addition to a document, such as a book or legal contract
• Bibliography, a systematic list of books and other works
• Index, a list of words or phrases with pointers to where related material can be found in a document
• Specifically, a text added to the end of a book or an article, containing information that is important to, but is not the main idea of, the main text
2.6 Bibliography
This is an alphabetical list of bibliographic details for all relevant items consulted and used, including those items not referred to directly in the text. Bibliographies have the following formatting conventions:
• The first author’s name is inverted (last name first), and most elements are separated by periods.
• Entries have a special indentation style in which all lines but the first are indented.
• Entries are arranged alphabetically by the author’s last name, or by the first word of the title if no author is listed.
Examples
1. Books (Printed): One author
Footnote
10 David A. Garvin, Operations Strategy: Text and Cases (Englewood Cliffs, NJ: Prentice-Hall, 1992), p. 73.
Bibliography
Garvin, David A. Operations Strategy: Text and Cases. Englewood Cliffs, NJ: Prentice-Hall, 1992.
2. Two authors
Footnote
11 John P. Kotter and James L. Heskett, Corporate Culture and Performance (New York: Free Press, 1992), p. 101.
Bibliography
Kotter, John P., and James L. Heskett. Corporate Culture and Performance. New York: Free Press, 1992.
3. Three authors
Footnote
12 John W. Pratt, Howard Raiffa, and R.O. Schlaifer, Introduction to Statistical Decision Theory (Cambridge: MIT Press, 1995), p. 45.
Bibliography
Pratt, John W., Howard Raiffa, and R.O. Schlaifer. Introduction to Statistical Decision Theory. Cambridge: MIT Press, 1995.
4. Unpublished material
Footnote
31 Sarah Dodd, “Transnational Differences in Entrepreneurial Networks,” paper presented at the Eighth Global Entrepreneurship Research Conference, INSEAD, Fontainebleau, France, June 1998.
Bibliography
Dodd, Sarah. “Transnational Differences in Entrepreneurial Networks.” Paper presented at the Eighth Global Entrepreneurship Research Conference, INSEAD, Fontainebleau, France, June 1998.
2.7 Review exercise 1
1. Seven publications of various formats are described below. Try expressing the reference details for each in the Harvard style and putting them into alphabetical order in a reference list.
• A book with the title: 'Occupational health and safety', published in Sydney in 2004 by McGraw-Hill, with authors M. Stewart and F. Heyes. This is the second edition.
• A book with the title: 'Internal control and corporate governance', with authors K. Adams, R. Grose, D. Leeson and H. Hamilton, published in Frenchs Forest, NSW by Pearson Education Australia in 2003.
• An article by M. Scardamalia and C. Bereiter, called 'Schools as knowledge-building organizations', published in 1999 in a book edited by D. Keating and C. Hertzman, called 'Today's children, tomorrow's society' in New York by Guilford as pages 274 to 289.
• An article by J. R. Savery and T. M.
Duffy, called 'Problem based learning: an instructional model and its constructivist framework', published on pages 31 to 38 in the journal 'Educational Technology', volume 35, number 5, in 1995.
• An article called 'Integration and thematic teaching: integration to improve teaching and learning' by S. Lipson, S. Valencia, K. Wixson and C. Peters, published in 1993 in the journal 'Language Arts', volume 70, number 4, pages 252 to 263.
• A video recording of a television documentary called 'Embers of the sun', produced in 1999 by the Australian Broadcasting Corporation in Sydney.
• A Web page with the title 'Telstra conferencing - video overview', found at the address: http://www.telstra.com.au/conferlink/videoconf.htm on 11 August 2004. No date on it, though Mozilla gives a last modified date of 4 July 2004.
2.8 Review exercise 2
• Read the following passage, then choose an appropriate reference from the list below and fill in the gaps.
• Make sure that the reference you choose leaves the sentence grammatical AND meaningful AND consistent with the surrounding sentences. In other words, the overall meaning of the passage must make sense.
• Provide reasons why you chose to select or not select the particular phrases.
(…....................... 1995, p. 6) children between the ages of five and eight who are repeatedly exposed to violent films are highly likely to commit some form of crime associated with physical violence. However, …............................... states such a claim is more emotive than reasoned in nature. Citing the lack of research to support this claim ................................, that why the children are watching such movies in the first place is a far more pressing question that society needs to address.
Phrase                              Reason for selecting/not-selecting
As Jones states
According to Jones
Jones
Smith (1998, p. 9)
According to Smith (1998, p. 9)
Smith (p. 9) argues
2.9 References
Anglia Ruskin University (2008).
Guide to the Harvard Style of referencing. University Library. http://libweb.anglia.ac.uk [Accessed on 26th August, 2010]
Harvard Business School (2009). Citation guide 2009-2010 academic year. http://intranet.hbs.edu/dept/drfd/caseservices/styleguide.pdf [Accessed on 26th August, 2010]
Shayo, H.E. (2010). Beginners referencing resource. Handbook for young researchers. The Open University of Tanzania.
Saunders, M. N. K., Lewis, P., & Thornhill, A. (2009). Research methods for business students (5th ed.). Harlow: Prentice Hall – Financial Times.
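Supplementary sketch for "3. Statistical expressions in APA" (Section 2.4.6.1). The general formats given there for p-values, F-tests, t-tests and χ² tests can be assembled mechanically from the computed statistics. The Python functions below are only a minimal illustration under stated assumptions; they are not part of the APA manual or of any statistics package. All function names (apa_number, apa_p, report_f, report_t, report_chi_square) are hypothetical, the use of three decimals for p is an assumption, and dropping the leading zero for d simply mirrors the example in this lecture.

```python
def apa_number(value, decimals=2):
    """Format a number and drop a leading zero (0.03 -> .03), as in the lecture's examples."""
    text = f"{value:.{decimals}f}"
    if text.startswith("0."):
        return text[1:]
    if text.startswith("-0."):
        return "-" + text[2:]
    return text


def apa_p(p):
    """Exact p-value (preferred); very small values are reported as p < .001."""
    if p < 0.001:
        return "p < .001"
    # Assumption: three decimals are used for every exact p-value.
    return "p = " + apa_number(p, 3)


def report_f(df_between, df_within, f_value, p, eta_squared):
    """F([df-between], [df-within]) = [F], p = [p], eta-squared = [value]."""
    if f_value < 1:
        # For F below 1, neither the F-value nor the p-value is reported.
        return f"F({df_between}, {df_within}) < 1"
    return (f"F({df_between}, {df_within}) = {f_value:.2f}, "
            f"{apa_p(p)}, η² = {apa_number(eta_squared)}")


def report_t(df, t_value, p, cohens_d):
    """t([df error]) = [t], p = [p], d = [value]."""
    return f"t({df}) = {t_value:.2f}, {apa_p(p)}, d = {apa_number(cohens_d)}"


def report_chi_square(df, n, chi2, p, decimals=2):
    """Chi-squared([df], N = [sample size]) = [value], p = [p]."""
    return f"χ²({df}, N = {n}) = {chi2:.{decimals}f}, {apa_p(p)}"


if __name__ == "__main__":
    print(report_t(9, 2.35, 0.043, 0.70))
    print(report_f(2, 50, 9.35, 0.0004, 0.03))
    print(report_chi_square(4, 24, 12.4, 0.015, decimals=1))
```

Called with the values from the lecture's examples, report_t(9, 2.35, 0.043, 0.70) produces "t(9) = 2.35, p = .043, d = .70". A non-significant result is reported with its exact p-value (the preferred form) rather than with "ns"; note that this sketch would print p = .180 where the lecture's example shows p = .18.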