RESEARCH

a. The systematic investigation into and study of materials, sources, etc., in order to establish facts and reach new conclusions.
b. An endeavour to discover new facts or collate old ones by the scientific study of a subject or by a course of critical investigation.

TYPES OF RESEARCH

From the viewpoint of objectives, research can be classified as:
• Descriptive
• Correlational
• Explanatory
• Exploratory

Descriptive research
Descriptive research attempts to describe systematically a situation, problem, phenomenon, service or programme, or provides information about, say, the living conditions of a community, or describes attitudes towards an issue.

Correlational research
Correlational research attempts to discover or establish the existence of a relationship or interdependence between two or more aspects of a situation.

Explanatory research
Explanatory research attempts to clarify why and how there is a relationship between two or more aspects of a situation or phenomenon.

Exploratory research
Exploratory research is undertaken to explore an area where little is known or to investigate the possibilities of undertaking a particular research study (feasibility study or pilot study).

From the point of view of application, there are two broad categories of research:
• Pure research
• Applied research

Pure research
It involves developing and testing theories and hypotheses that are intellectually challenging to the researcher but may or may not have practical application at the present time or in the future. The knowledge produced through pure research is sought in order to add to the existing body of knowledge of research methods.

Applied research
Applied research is done to answer specific, practical questions, for policy formulation, administration and the understanding of a phenomenon. It can be exploratory, but is usually descriptive. It almost always builds on basic (pure) research. Applied research can be carried out by academic or industrial institutions.
Often, an academic institution such as a university will have a specific applied research programme funded by an industrial partner interested in that programme.

From the point of view of the process adopted to find answers to research questions (the inquiry mode), the two approaches are:
• Structured approach
• Unstructured approach

Structured approach
The structured approach to inquiry is usually classified as quantitative research. Everything that forms the research process (objectives, design, sample, and the questions that you plan to ask of respondents) is predetermined. It is more appropriate for determining the extent of a problem, issue or phenomenon by quantifying the variation, e.g. How many people have a particular problem? How many people hold a particular attitude?

Unstructured approach
The unstructured approach to inquiry is usually classified as qualitative research. This approach allows flexibility in all aspects of the research process.

Steps in Research Process
1. Formulating the Research Problem
2. Extensive Literature Review
3. Developing the Objectives
4. Preparing the Research Design, including Sample Design
5. Collecting the Data
6. Analysis of Data
7. Generalisation and Interpretation
8. Preparation of the Report or Presentation of Results: formal write-up of the conclusions reached

Considerations in selecting a research problem:
• Interest: A research endeavour is usually time-consuming, and involves hard work and possibly unforeseen problems. Select a topic of great interest to you in order to sustain the required motivation.
• Magnitude: It is extremely important to select a topic that you can manage within the time and resources at your disposal. Narrow the topic down to something manageable, specific and clear.
• Measurement of concepts: Make sure that you are clear about the indicators and measurement of the concepts (if used) in your study.
• Level of expertise: Make sure that you have an adequate level of expertise for the task you are proposing, since you need to do the work yourself.
• Relevance: Ensure that your study adds to the existing body of knowledge, bridges current gaps and is useful in policy formulation. This will help you to sustain interest in the study.
• Availability of data: Before finalizing the topic, make sure that data are available.
• Ethical issues: How ethical issues can affect the study population and how ethical problems can be overcome should be thoroughly examined at the problem-formulation stage.

Steps in formulation of a research problem
Working through these steps presupposes a reasonable level of knowledge in the broad subject area within which the study is to be undertaken. Without such knowledge it is difficult to clearly and adequately ‘dissect’ a subject area.

Step 1: Identify a broad field or subject area of interest to you.
Step 2: Dissect the broad area into sub-areas.
Step 3: Select what is of most interest to you.
Step 4: Raise research questions.
Step 5: Formulate objectives.
Step 6: Assess your objectives.
Step 7: Double check.

SOURCES OF DATA

INTRODUCTION
There are two types of data sources: primary and secondary.

PRIMARY DATA
Data collected by the investigator for his or her own purpose, for the first time, from beginning to end, are called primary data. In other words, data originally collected in the process of investigation are known as primary data. Primary data are original.

Primary data have not yet been published and are generally regarded as more reliable, authentic and objective. Because they have not been changed or altered, their validity is generally greater than that of secondary data.
SOURCES OF PRIMARY DATA
Sources of primary data are limited, and at times it becomes difficult to obtain data from them because of either a small population or a lack of cooperation. Despite the difficulties one may face in collecting them, primary data are the most authentic and reliable. Sources of primary data include:

EXPERIMENTS
Experiments require an artificial or natural setting in which to carry out a logical study to collect data. Experiments are more suitable for medicine, psychological studies, nutrition and other scientific studies. In an experiment, the experimenter has to control the influence of any extraneous variables on the results.

SURVEY
The survey is a commonly used method in the social sciences. A survey can be conducted by different methods such as questionnaire, interview and observation.

Questionnaire
The questionnaire is the most commonly used method in surveys. A questionnaire is a list of questions, either open-ended or closed-ended, to which the respondent gives answers. A questionnaire can be administered by telephone, by mail, live in a public area or in an institution, by electronic mail, by fax and by other methods.

Interview
An interview is a face-to-face conversation with the respondent. The main problem in an interview arises when the respondent deliberately hides information; otherwise it is an in-depth source of information. The interviewer can not only record the statements the interviewee makes but can also observe body language, expressions and other reactions to the questions. This makes it easier for the interviewer to draw conclusions.

Observation
Observation can be done either while letting the person being observed know that he is being observed, or without letting him know. Observations can be made in natural settings as well as in artificially created environments.

SECONDARY DATA
Secondary data is information which is already in existence, and which has been collected for some purpose other than answering the question at hand.
In other words, data collected by other persons are called secondary data. The data are therefore called second-hand data. They are available in published or unpublished form.

The review of literature in any research is based on secondary data, mostly from books, journals and periodicals.

BOOKS
Books are available today on any topic that you want to research. The use of books starts even before you have selected the topic. Books are the most reliable secondary source.

PUBLISHED SOURCES
Journals/Periodicals
Journals and periodicals are becoming more important as far as data collection is concerned. The reasons are that journals provide up-to-date information which at times books cannot, and that journals can give information on the very specific topic you are researching rather than talking about more general topics.

Magazines/Newspapers
Magazines are also effective but not very reliable. Newspapers, on the other hand, are more reliable, and in some cases the information can only be obtained from newspapers, as in the case of some political studies.

Electronic Sources
Internet: Information that is not available in printed form may be available on the internet in the form of:
• E-journals
• Websites
• Blogs

OTHER SOURCES
Personal Records: Some unpublished data may also be useful in some cases.
Diaries: Diaries are personal records and are rarely available, but if you are conducting descriptive research they might be very useful.
Letters: Letters, like diaries, are also a rich source but should be checked for their reliability before being used.

EDITING DATA
Editing is the process of checking data for errors such as omissions, illegibility and inconsistency, and correcting the data where and when the need arises.

Example 1: A questionnaire meant to be answered by adults over the age of 30 years has also been answered by some persons under the age of 30 years.
Example 2: A respondent gives her year of birth as 1865, or claims to have car insurance but says she doesn't own a car.
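The two examples above can be turned into simple automated edit checks. The following is only a minimal sketch in Python: the field names (`age`, `birth_year`, `has_car_insurance`, `owns_car`) and the cutoff year 1900 are hypothetical choices made for illustration, not part of any particular questionnaire.

```python
def find_edit_problems(record, min_age=30, earliest_birth_year=1900):
    """Return a list of editing problems found in one questionnaire record.

    All field names and the birth-year cutoff are hypothetical.
    """
    problems = []

    age = record.get("age")
    if age is None:
        # Omission: the question was left unanswered.
        problems.append("omission: age not answered")
    elif age <= min_age:
        # Example 1: the questionnaire is meant for adults over min_age.
        problems.append("ineligible: respondent is not over the minimum age")

    birth_year = record.get("birth_year")
    if birth_year is not None and birth_year < earliest_birth_year:
        # Example 2: a year of birth such as 1865 is implausible.
        problems.append("implausible: year of birth is too early")

    if record.get("has_car_insurance") and not record.get("owns_car"):
        # Example 2: two related answers contradict each other.
        problems.append("inconsistent: car insurance without a car")

    return problems


if __name__ == "__main__":
    respondent = {
        "age": 25,
        "birth_year": 1865,
        "has_car_insurance": True,
        "owns_car": False,
    }
    for problem in find_edit_problems(respondent):
        print(problem)
```

A check like this only flags records; deciding whether to correct, query or discard a flagged answer remains the data editor's judgement.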
Basic Principles of Editing:
• Checking the number of schedules/questionnaires
• Completeness (all questions have been filled in)
• Legibility
• Avoiding inconsistencies in answers
• Maintaining a degree of uniformity
• Eliminating irrelevant responses

Data Consistency and Completeness
• The data obtained from a questionnaire must be logically consistent, especially when questions are related.
• Sometimes inconsistency of data may not be readily apparent. In this case, the data editor must judge what action to take (example: the salary of the CEO of a big corporation is given as USD 25,000 per annum).
• Circumstances permitting, the data editor may have to insert data where the respondent has omitted an answer that can be inferred from the other data obtained (example: a respondent does not answer a question asking whether his organization has a website, but later answers that the organization has three websites).

Non-Responses and Out-of-Order Answers
• Often, questions are left unanswered by respondents (item non-response). In such cases, where data must be inserted, the data editor has some options, such as using a "plug value" according to some prespecified rule.
• Sometimes respondents give answers to (open-ended) questions under other questions. In such cases, the data have to be moved to the questions they belong to.

There are two types of editing:
1. Field Editing
2. Central Editing

Field Editing
Field editing is a form of data editing undertaken by the field supervisor while data collection is in progress, with a view to finding omissions, checking the legibility of handwriting, and clarifying responses that are logically or conceptually inconsistent.

Precautions you must take while using Secondary Data
The investigator should take precautions before using secondary data. In this connection, the following precautions should be taken into account:
1. Suitable Purpose of Investigation: The investigator must ensure that the data are suitable for the purpose of the enquiry.
2. Adequacy of Data: The adequacy of the data is to be judged in the light of the requirements of the survey as well as the geographical area covered by the available data.
3. Definition of Units: The investigator must ensure that the definitions of the units he uses are the same as in the earlier investigation.
4. Degree of Accuracy: The investigator should keep in mind the degree of accuracy maintained by each earlier investigator.
5. Time and Condition of Collection of Facts: Before making use of available data, it should be ascertained to which period and under which conditions the data were collected.
6. Comparison: The investigator should check whether the secondary data are reasonable, consistent and comparable.
7. Test Checking: The user of secondary data must carry out test checks and see that totals and rates have been correctly calculated.
8. Homogeneous Conditions: It is not safe to take published statistics at their face value without knowing their meaning, values and limitations.

Sampling

INTRODUCTION
Sampling is the selection of a part of a group or an aggregate with a view to obtaining information about the whole. This aggregate, or the totality of all members, is known as the Population. The selected part, which is used to ascertain the characteristics of the population, is called the Sample.

The total number of members of the population and the number included in the sample are called the Population Size and the Sample Size respectively.

Sampling methodology can be used by an auditor or an accountant to estimate the value of the total inventory in the stores without actually inspecting all the items physically. Opinion polls based on samples are used to forecast the result of a forthcoming election.

Advantages of sampling over Census
A census, or complete enumeration, consists in collecting data from each and every unit of the population.
Sampling has a number of advantages over complete enumeration:

Less Expensive
The first obvious advantage of sampling is that it is less expensive. If we want to study consumer reaction before launching a new product, it will be much less expensive to carry out a consumer survey based on a sample than to study the entire population of potential customers.

Less Time Consuming
The smaller size of the sample enables us to collect the data more quickly than surveying all the units of the population, even if we are willing to spend the money. This matters particularly when the decision is time bound.

Greater Accuracy
Complete enumeration may result in inaccuracies in the data. Consider an inspector who is visually inspecting the quality of finishing of a certain type of machinery. After observing a large number of such items he can no longer distinguish items with a defective finish from good ones. Once such inspection fatigue develops, the accuracy of examining the population completely is considerably decreased. On the other hand, if a small number of items is observed, the basic data will be much more accurate.

Physical Impossibility of Complete Enumeration
In many situations the elements being studied are destroyed while being tested, so complete enumeration is not feasible.

TYPES OF SAMPLING
There are two basic types of sampling, depending on who or what is allowed to govern the selection of the sample:
• Probability Sampling
• Non-Probability Sampling

Classification of Sampling Methods
• Probability samples: Simple Random, Stratified, Systematic, Cluster
• Nonprobability samples: Convenience, Judgment, Quota, Snowball

PROBABILITY SAMPLING
In probability sampling the decision whether a particular element is included in the sample or not is governed by chance alone. All probability sampling designs ensure that each element in the population has some non-zero probability of being included in the sample.
This means defining a procedure for picking the sample that is based on chance.

In the category of probability sampling, we have:
• Simple random sampling
• Stratified sampling
• Systematic sampling
• Cluster sampling

SIMPLE RANDOM SAMPLING
In simple random sampling the selected items are drawn “at random” from the population. It ensures that:
• Each of the possible samples of size n has an equal probability of being picked as the chosen sample.
• Each element of the population has an equal probability of being included in the sample.

Simple random sampling is the most widely used probability sampling method because it is easy to implement and easy to analyze.

It is imperative to have a list of all members of the population before a simple random sample can be picked. Such an exhaustive list of all population members is called a sampling frame.

One way to obtain a simple random sample is the lottery method. Each of the N population members is assigned a unique number (or marked). The numbers are placed in a bowl and thoroughly mixed. Then a blindfolded researcher selects n numbers. Population members having the selected numbers are included in the sample.

Random Sampling With and Without Replacement
Suppose we use the lottery method described above to select a simple random sample. After we pick a number from the bowl, we can put the number aside or we can put it back into the bowl. If we put the number back in the bowl, it may be selected more than once; if we put it aside, it can be selected only one time. When a population element can be selected more than one time, we are sampling with replacement. When a population element can be selected only one time, we are sampling without replacement.

Random Sampling Numbers
Random sampling numbers are collections of digits generated through a probabilistic mechanism.
The numbers have the following properties:
• The probability that each of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 will appear at any particular place is the same, namely 1/10.
• The occurrences of any two digits in any two places are independent of each other.

When reading from random number tables you can begin anywhere (choose a starting point at random), but having once started you should continue to read across the line or down a column.

An extract from a table of random sampling numbers:

3680 2231 8846 5418 0498 5245 7071 2597

If we were doing market research and wanted to sample two houses from a street containing houses numbered 1 to 48, we would read off the digits in pairs

36 80 22 31 88 46 54 18 04 98 52 45 70 71 25 97

and take the first two pairs between 01 and 48, which gives house numbers 36 and 22. If we wanted to sample two houses from a much longer road with 140 houses in it, we would need to read the digits off in groups of three:

368 022 318 846 541 804 985 245 707 125 97

The first two groups between 001 and 140 are the ones to visit: houses 22 and 125.

STRATIFIED RANDOM SAMPLING
When heterogeneity is present in the population with regard to the subject matter under consideration, it is often a good idea to divide the population into groups (segments or strata). Stratified random sampling consists of selecting a certain number of sampling units from each stratum, to ensure representation from all relevant segments and so increase efficiency.

Examples:
• In designing a suitable marketing strategy for a consumer durable, the population of consumers may be divided into strata by income level, and a certain number of consumers can be randomly selected from each stratum.
• Landscapes: stratified by habitat characteristics.
• People: stratified by characteristics (such as sex, occupation, etc.).

In stratified sampling, the population of N units is first divided into L sub-populations of $N_1, N_2, \dots, N_L$ units respectively.
These sub-groups are non-overlapping and together comprise the whole population, so that
\[ N_1 + N_2 + \dots + N_L = N \]
A sample is selected independently from each of the different strata, with sizes $n_1, n_2, \dots, n_L$. The collection of these samples constitutes a stratified sample of total size
\[ n = n_1 + n_2 + \dots + n_L \]
If a simple random sample selection scheme is used in each stratum, then the corresponding sample is called a stratified random sample.

The stratification should be performed in such a way that the strata are homogeneous within themselves with respect to the characteristic under study. On the other hand, strata should be heterogeneous between themselves.

NOTATION
The suffix h (h = 1, 2, ..., L) denotes the stratum and i the unit within the stratum.
• $N_h$: total number of population units in stratum h.
• $n_h$: total number of sample units in stratum h.
• $X_{hi}$: value of the characteristic for the i-th unit in stratum h.
• $X_h = \sum_{i=1}^{N_h} X_{hi}$: the total of the observations in stratum h.
• $\bar{X}_h = \frac{1}{N_h}\sum_{i=1}^{N_h} X_{hi}$: the population mean of stratum h.
• $W_h = \frac{N_h}{N}$: the h-th stratum weight.
• $S_h^2 = \frac{1}{N_h - 1}\sum_{i=1}^{N_h} (X_{hi} - \bar{X}_h)^2$: the population variance of stratum h.

Allocation of Sample Size to Different Strata
In stratified sampling, the sample is allocated to the different strata on the basis of three considerations:
• The total number of units in the stratum, i.e. the stratum size
• The variability within the stratum
• The cost of taking an observation per sampling unit in each stratum

Allocation of a Sample to Strata
Equal Allocation: If the strata are presumed to be of roughly equal size, and there is no additional information regarding the variability or distribution of the response in the strata, equal allocation to the strata is probably the best choice:
\[ n_h = \frac{n}{L} \]
Proportional Allocation: If the strata differ in size, sample sizes might be allocated to strata in proportion to the stratum sizes:
\[ n_h = n\,\frac{N_h}{N} \]
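The equal and proportional allocation rules above can be sketched in Python as follows. This is an illustrative sketch only: the function names and the stratum sizes in the usage example are hypothetical, and the results are left unrounded, so in practice each $n_h$ would still have to be rounded to a whole number of units.

```python
def equal_allocation(n, num_strata):
    """Equal allocation: n_h = n / L for every stratum."""
    return [n / num_strata] * num_strata


def proportional_allocation(n, stratum_sizes):
    """Proportional allocation: n_h = n * N_h / N."""
    total = sum(stratum_sizes)  # N = N_1 + N_2 + ... + N_L
    return [n * size / total for size in stratum_sizes]


if __name__ == "__main__":
    # Hypothetical population of N = 1000 units in three strata.
    sizes = [200, 350, 450]
    print(equal_allocation(150, len(sizes)))
    print(proportional_allocation(200, sizes))
```

Under proportional allocation every stratum is sampled at the same rate n/N, so larger strata contribute proportionally more units to the sample.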
Optimum Allocation: The allocation which minimizes the variance of the estimator of the mean (and total). Here $S_h$ is the population standard deviation of stratum h and $C_h$ is the cost of taking an observation in stratum h.

Optimum allocation (with equal cost):
\[ n_h = n\,\frac{W_h S_h}{\sum_{h=1}^{L} W_h S_h} \]
Optimum allocation (with unequal cost):
\[ n_h = n\,\frac{W_h S_h / \sqrt{C_h}}{\sum_{h=1}^{L} W_h S_h / \sqrt{C_h}} \]

SYSTEMATIC SAMPLING
Systematic sampling is commonly used and simple to apply; it consists of taking every k-th sampling unit after a random start.

Example: Suppose you want to sample 8 houses from a street of 120 houses. 120/8 = 15, so every 15th house is chosen after a random starting point between 1 and 15. If the random starting point is 11, then the houses selected are 11, 26, 41, 56, 71, 86, 101 and 116.

If there were 125 houses, 125/8 = 15.625, so should you take every 15th house or every 16th house? If you take every 16th house, 8 × 16 = 128, so there is a risk that the last house chosen does not exist. To overcome this, the random starting point should be between 1 and 13. On the other hand, if you take every 15th house, 8 × 15 = 120, so with a starting point between 1 and 15 the last five houses would never be selected. The random starting point should then be between 1 and 20 to ensure that every house has some chance of being selected.

Example: Select a sample of size 5 from a population of size 30 by using systematic sampling.

Solution: We first compute the sampling interval
\[ k = \frac{N}{n} = \frac{30}{5} = 6 \]
The numbers from 1 to 30 are then written as follows:

 1  2  3  4  5  6
 7  8  9 10 11 12
13 14 15 16 17 18
19 20 21 22 23 24
25 26 27 28 29 30

A number is then selected at random from the first row, i.e. from 1 to 6. If the selected number (the random start) is 4, then the sample will be 4, 10, 16, 22 and 28.

CLUSTER SAMPLING
Cluster sampling may be used when it is either impossible or impractical to compile an exhaustive list of the elements that make up the target population. Usually, however, the population elements are already grouped into subpopulations, and lists of those subpopulations already exist or can be created.

Example: Let’s say the target population in a study was KNUST students.
Suppose there is no list of all KNUST students in the school. The researcher could, however, create a list of departments in the school, choose a sample of departments, and then obtain lists of students from those departments.

MULTISTAGE SAMPLING
The multistage sampling procedure is used for large-scale enquiries covering a large geographical area such as a region.

An illustration: A bank may like to gather information regarding the quality of customer service it is offering in a region. A random sample of districts is selected from the list of districts. From each of the selected districts a number of branches are randomly selected. From each of the selected branches a number of depositors, who are the ultimate sampling units, are selected randomly for collecting information.

• The districts are called the first stage units.
• The branches are known as the second stage units.
• The depositors are regarded as the third stage units.

This is an illustration of three-stage sampling, the third stage units being the ultimate sampling units.

NON-PROBABILITY SAMPLING
Non-probability sampling is a sampling procedure which does not provide any basis for estimating the probability that each item in the population has of being included in the sample. In such a case, the sampling error is not measurable, and the error in the estimator tends to increase sharply because the representativeness of the sample members is questionable.

Nevertheless, non-probability samples are useful in certain situations. This is the case when representativeness is not the primary issue. In general, the types of non-probability sampling methods include:
• Convenience sampling
• Judgement sampling
• Quota sampling
• Snowball sampling

CONVENIENCE SAMPLING
Under convenience sampling, the samples are selected at the convenience of the researcher or investigator. We have no way of determining the representativeness of the sample. This results in biased estimates.
Therefore, it is not possible to make an estimate of the sampling error, as the difference between the sample estimate and the population parameter is unknown both in magnitude and in direction. It is therefore suggested that convenience sampling should not be used in either descriptive or causal studies, as it is not possible to make any definitive statements about the results from such a sample.

Convenience sampling:
• may be quite useful in exploratory designs as a basis for generating hypotheses.
• is also useful in testing questionnaires etc. at the pretest phase of the study.
• is extensively used in marketing studies and elsewhere.

JUDGEMENT SAMPLING
Judgement sampling is also called purposive sampling. The researcher deliberately or purposively draws a sample from the population which he thinks is representative of the population. Not all members of the population are given a chance to be selected in the sample.

The personal bias of the investigator has a great chance of entering the sample. If the investigator chooses a sample to give results which favour his viewpoint, the entire study may be vitiated.

However, if personal biases are avoided, then the relevant experience of the investigator and his acquaintance with the population may help him to choose a relatively representative sample from the population. It is not possible to make an estimate of the sampling error, as we cannot determine how precise our sample estimates are.

ILLUSTRATION
Suppose we have a panel of experts to decide about the launching of a new product in the next year. If for some reason a member drops out of the panel, the chairman of the panel may suggest the name of another person whom he thinks has the same expertise and experience to be a member of the panel. This new member is chosen deliberately: a case of judgement sampling.

QUOTA SAMPLING
This is a very commonly used sampling method in marketing research studies.
The sample is selected on the basis of certain basic parameters, such as age, sex, income and occupation, that describe the nature of the population, so as to make the sample representative of the population.

The investigators or field workers are instructed to choose a sample that conforms to these parameters. The field workers are assigned quotas of the numbers of units satisfying the required characteristics on which data should be collected. However, before collecting data on these units, the investigators are supposed to verify that the units do have these characteristics.

Example: Suppose that in our population 20% are in the high income group, 35% in the middle income group and 45% in the low income group, and that we decide to select a sample of size 200 from the population. Then samples of size 40, 70 and 90 should come from the high income, middle income and low income groups respectively.

SNOWBALL SAMPLING
• The sampling procedure in which the initial respondents are chosen by probability or nonprobability methods, and additional respondents are then obtained from information provided by the initial respondents.

Determining Sample Size
• For the Mean
• For the Proportion

Sampling Error
• The required sample size can be found to reach a desired margin of error (e) with a specified level of confidence $(1 - \alpha)$.
• The margin of error is also called sampling error:
  - the amount of imprecision in the estimate of the population parameter
  - the amount added to and subtracted from the point estimate to form the confidence interval

Determining Sample Size For the Mean
• The confidence interval for the mean is
\[ \bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]
• Thus the sampling error (margin of error) is
\[ e = Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]
Making n the subject:
\[ n = \frac{Z_{\alpha/2}^{2}\,\sigma^{2}}{e^{2}} \]
To determine the required sample size for the mean, you must know:
• The desired level of confidence $(1 - \alpha)$, which determines the critical value $Z_{\alpha/2}$
• The acceptable sampling error, e
• The standard deviation, σ

Required Sample Size Example
• If σ = 45, what sample size is needed to estimate the mean within ±5 with 90% confidence?
\[ n = \frac{Z_{\alpha/2}^{2}\,\sigma^{2}}{e^{2}} = \frac{(1.645)^{2}(45)^{2}}{5^{2}} = 219.19 \]
• So the required sample size is n = 220.
• Note: Always round up.

If σ is unknown
• If unknown, σ can be estimated when using the required sample size formula:
  - Use a value for σ that is expected to be at least as large as the true σ
  - Select a pilot sample and estimate σ with the sample standard deviation, S

Determining Sample Size For the Proportion
The sampling error (margin of error) is
\[ e = Z_{\alpha/2}\,\sqrt{\frac{p(1-p)}{n}} \]
Making n the subject:
\[ n = \frac{Z_{\alpha/2}^{2}\,p(1-p)}{e^{2}} \]
To determine the required sample size for the proportion, you must know:
• The desired level of confidence $(1 - \alpha)$, which determines the critical value $Z_{\alpha/2}$
• The acceptable sampling error, e
• The true proportion of events of interest, p
• p can be estimated with a pilot sample if necessary (or conservatively use 0.5 as an estimate of p)

Required Sample Size Example
How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence? (Assume a pilot sample yields p = 0.12.)

Solution: For 95% confidence, use $Z_{\alpha/2} = 1.96$; e = 0.03; the pilot sample gives p = 0.12, which is used as the estimate of the true proportion.
\[ n = \frac{Z_{\alpha/2}^{2}\,p(1-p)}{e^{2}} = \frac{(1.96)^{2}(0.12)(1-0.12)}{(0.03)^{2}} \approx 450.75 \]
Thus n = 451.
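
The two sample size formulas above can be checked numerically. The sketch below is one possible Python version, not a definitive implementation: the function names are hypothetical, and the critical value is taken from the standard library's `statistics.NormalDist` instead of a printed table, so it carries more decimal places than the 1.645 and 1.96 used in the worked examples.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def sample_size_mean(sigma, e, confidence):
    """n = Z^2 * sigma^2 / e^2, always rounded up."""
    z = z_critical(confidence)
    return math.ceil((z * sigma / e) ** 2)


def sample_size_proportion(p, e, confidence):
    """n = Z^2 * p * (1 - p) / e^2, always rounded up."""
    z = z_critical(confidence)
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)


if __name__ == "__main__":
    # The two worked examples from the text.
    print(sample_size_mean(sigma=45, e=5, confidence=0.90))
    print(sample_size_proportion(p=0.12, e=0.03, confidence=0.95))
    # Conservative choice p = 0.5 when no pilot estimate is available.
    print(sample_size_proportion(p=0.5, e=0.03, confidence=0.95))
```

Using `math.ceil` implements the "always round up" rule, and for both worked examples the more precise critical values lead to the same required sample sizes as the table values.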