SAMPLING AND DATA COLLECTION
Dr. Zia Farooqi

Population
• A collection of a specified group of individuals or objects that share common observable characteristics
• Its members are called elementary units

Types of population
• Finite (countable)
  – Books in a college library
• Infinite (indefinitely large)
  – Temperature at different times
• Real (true or existent)
  – Employees of UOL
• Hypothetical
  – Time and death

Sampling
• A sample is a part of the population that is selected according to some rule or plan in order to draw a conclusion about the population
• Sampling is a process used in statistical analysis in which a predetermined number of observations are taken from a larger population.
• The methodology used to sample from a larger population depends on the type of analysis being performed.
• Sample size

Characteristics of sample
• Representativeness
• Homogeneity
• Adequacy
• Independence
• Similar regulatory conditions

Sampling methods
• Sampling methods can be classified into one of two categories:
  – Probability sampling: the sample has a known probability of being selected
  – Non-probability sampling: the sample does not have a known probability of being selected, as in convenience or voluntary response surveys

Probability Sampling
• In probability sampling it is possible to determine both which sampling units belong to which sample and the probability that each sample will be selected. The following sampling methods are examples of probability sampling:
  – Simple Random Sampling (SRS)
  – Stratified Sampling
  – Cluster Sampling
  – Systematic Sampling
  – Multistage Sampling

Simple Random Sampling (SRS)
• A simple random sample is a subset of a population in which each member of the population has an equal probability of being chosen.
• Suggest examples!!
• Advantage of SRS
  – There is no need to divide the population into subpopulations or take any other additional steps before selecting members of the population at random.
• Disadvantage of SRS
  – A sampling error can occur with a simple random sample if the sample does not end up accurately reflecting the population it is supposed to represent.
  – For example, in a simple random sample of 25 employees, it would be possible to draw 25 men even if the population consisted of 125 women and 125 men.

Techniques for SRS
• Assign numbers to individuals and pick randomly or by computer draw
• It is difficult for large populations
• Make a list
• Arrange sequentially
• Use a selection method
  – Random number table
  – Lottery
  – Throwing of dice
  – Throwing of coins
  – Blindfold methods
  – Sieve method

Stratified Sampling
• Stratified sampling is possible when it makes sense to partition a heterogeneous population into fairly homogeneous groups.
• These groups are called strata. An individual group is called a stratum. With stratified sampling one should:
  – partition the population into groups (strata)
  – obtain a simple random sample from each group (stratum)
• Example:
  – Population: All students in the FG schools of the Federal Capital
  – Strata: 11 different FG schools in Islamabad
  – Random sample: 20 students from each of the 11 elementary schools
  – Sampled: 11 × 20 = 220 selected students

Cluster Sampling
• Cluster sampling is very different from stratified sampling. With cluster sampling one should:
  – divide the population into groups (clusters)
  – obtain a simple random sample of clusters from all possible clusters
  – include every unit in each selected cluster
• It is important to note that, unlike the strata in stratified sampling, the clusters should be microcosms, rather than subsections, of the population.
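The contrast between the two designs can be sketched in Python. This is only an illustration under stated assumptions: the school names, the student identifiers and the enrolment of 50 students per school are hypothetical, and only the figures 11 schools, 20 students per school and 3 selected schools come from the slides.

```python
import random

# Hypothetical population: 11 schools with 50 students each
# (the enrolment figure and all identifiers are assumptions).
schools = {
    f"school_{i}": [f"s{i}_{j}" for j in range(50)]
    for i in range(1, 12)
}

def stratified_sample(strata, per_stratum, rng):
    """Draw a simple random sample of `per_stratum` units from EVERY stratum."""
    return {name: rng.sample(units, per_stratum) for name, units in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick `n_clusters` clusters and keep EVERY unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
strat = stratified_sample(schools, 20, rng)
clust = cluster_sample(schools, 3, rng)

strat_total = sum(len(units) for units in strat.values())  # 11 * 20 students
clust_total = sum(len(units) for units in clust.values())  # 3 schools * 50 students
print(strat_total, clust_total)
```

In the stratified design every school contributes some students, while in the cluster design only the selected schools contribute, but they contribute all of their students.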
• Cluster sampling example:
  – Population: All students in the FG schools of the Federal Capital
  – Clusters: 11 different FG schools in Islamabad
  – Random sample: 3 FG schools from the 11 possible FG schools
  – Sampled: Every student in the 3 selected FG schools

For Your Ease…
• Stratified sampling
  – Population: All students in the FG schools of the Federal Capital
  – Strata: 11 different FG schools in Islamabad
  – Random sample: 20 students from each of the 11 elementary schools
  – Sampled: 11 × 20 = 220 selected students
• Cluster sampling
  – Population: All students in the FG schools of the Federal Capital
  – Clusters: 11 different FG schools in Islamabad
  – Random sample: 3 FG schools from the 11 possible FG schools
  – Sampled: Every student in the 3 selected FG schools

Multistage sampling

Systematic Sampling
• Sample members from a larger population are selected according to a random starting point and a fixed periodic interval.
• This interval, called the sampling interval, is calculated by dividing the population size by the desired sample size.
• Despite the sample population being selected in advance, systematic sampling is still thought of as being random if:
  – the periodic interval is determined beforehand, and
  – the starting point is random

How to conduct systematic sampling?
• k = N / n
  – N = total population
  – n = sample size
• For example, if you wanted to select a random group of 1,000 people from a population of 50,000 using systematic sampling, all of the potential participants must be placed in a list and a starting point would be selected.
• Once the list is formed, every 50th person on the list, starting the count at the selected starting point, would be chosen as a participant, since 50,000 / 1,000 = 50.
• For example, if the selected starting point was 20, the 20th person on the list would be chosen first, followed by the 70th, the 120th, and so on.
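The procedure above can be sketched in a few lines of Python. The function name `systematic_positions` and its parameters are hypothetical, and the sketch assumes that N is a multiple of n and that the starting point lies within the first interval.

```python
import random

def systematic_positions(N, n, start=None, rng=None):
    """Return the 1-based list positions chosen by systematic sampling.

    N: population size, n: desired sample size.
    start: optional fixed starting position (1..k); drawn at random if omitted.
    """
    k = N // n  # sampling interval k = N / n (assumes N is a multiple of n)
    if start is None:
        rng = rng or random.Random()
        start = rng.randint(1, k)  # random start within the first interval
    if not 1 <= start <= k:
        raise ValueError("start must lie within the first interval (1..k)")
    # Take the starting position, then every k-th position after it.
    return [start + i * k for i in range(n)]

# The slide's example: 1,000 people from 50,000, starting at position 20.
positions = systematic_positions(50_000, 1_000, start=20)
print(positions[:3])
```

With a starting point of 20 the first positions are 20, 70 and 120, matching the example. Leaving `start` unset lets the function draw the starting point at random, which is the condition the slides give for the sample to count as random.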
Multistage sampling

Nonprobability sampling
• Convenience sampling
• Quota sampling
• Judgement (purposive) sampling
• Snowball sampling
• Haphazard sampling

Non-probability Sampling
• The following sampling methods are types of non-probability sampling that should be avoided:
  – volunteer samples
  – haphazard (convenience) samples
• Since such non-probability sampling methods are based on human choice rather than random selection, they can be biased.
• Therefore, the two types of non-probability samples listed above are called "sampling disasters."

Convenience sampling
• Units which are accessible
• Less cost and time
• Consecutive sampling
• Sample, research, and move to the next study group
• Small population

Quota sampling
• The researcher divides the population to obtain equal or proportionate representation of subjects
• Age, gender, race, education, religion, etc.
• If we take socioeconomic status:
  – Upper class = 30
  – Upper middle class = 30
  – Middle class = 30
  – Lower middle class = 30
  – Poor = 30

Judgment sampling
• Subjects are chosen with a specific purpose in mind
• Fit for the study

Snowball sampling
• The researcher selects the initial subjects, who pass the study on to others
• Example: students of some classes, or targeting the CR

The Process
1. Identify the population of interest.
2. Specify a sampling frame.
3. Specify a sampling method.
4. Determine the sample size.
5. Implement the plan.
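Step 4 of the process, determining the sample size, can be sketched in Python using the formulas from the sample size calculation slide that follows. The function names are hypothetical. The sketch also assumes a proportion p = 0.5 and a margin of error of 0.05, which is the reading that reproduces the slide's figure of 384.16.

```python
def finite_sample_size(N, e=0.05):
    """n = N / (1 + N * e^2), for a known population size N and error e."""
    return N / (1 + N * e ** 2)

def infinite_sample_size(z=1.96, p=0.5, m=0.05):
    """S = Z^2 * p * (1 - p) / M^2, for a very large (infinite) population."""
    return z ** 2 * p * (1 - p) / m ** 2

def adjust_for_population(s, population):
    """Adjusted S = S / (1 + (S - 1) / population)."""
    return s / (1 + (s - 1) / population)

n_kmu = finite_sample_size(25_000)         # the KMU question: N = 25,000, e = 0.05
s = infinite_sample_size()                 # 95% confidence, Z = 1.96
s_adj = adjust_for_population(s, 100_000)  # population of 100,000

print(round(n_kmu, 1), round(s, 2), round(s_adj))
```

Under these assumptions the KMU example comes to about 393.7, the infinite-population figure to 384.16, and the adjusted figure to 383. In practice a fractional result is usually rounded up to the next whole person.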
Sample size calculation
• n = N / (1 + N·e²)
• N = population
• e = error (5% or 0.05)
• If the number of employees of KMU is 25,000, what would be the sample size?

Infinite population calculation
• S = Z² × p(1 − p) / M²
• S = sample size
• p = proportion, 50% (0.5)
• M = margin of error, 5% (0.05)
• Z = z-score for the confidence level
  – 95% = 1.96
  – 90% = 1.645
  – 99% = 2.576
• S = 1.96² × 0.5 × 0.5 / 0.05² = 384.16

Adjusted sample
• If the population is 100,000
• Adjusted S = S / (1 + (S − 1) / population)
• Adjusted S = 384.16 / (1 + 383.16 / 100,000) ≈ 383

Sample size from prevalence
• 95% Z (or α)
• n = Z² × (p × q) / l²
• Z = z-score for the confidence level
• p = prevalence of disease (%)
• q = 100 − p
• l = relative precision
• p = 10
• l = 20%
• Now take 20% of the 10% prevalence, so l = 2

Data Collection
• The process by which the researcher collects the information needed to answer the research problem.
• In collecting the data, the researcher must decide:
  – Which data to collect
  – How to collect the data
  – Who will collect the data
  – When to collect the data
• The selection of the data collection method should be based on the following:
  – The identified hypothesis or research problem
  – The research design
  – The information gathered about the variables

RESEARCH INSTRUMENTS
• The type of instrument used by the researcher depends on the data collection method selected.
• Types of research instruments:
  – Questionnaire
  – Checklist
  – Distribution
  – Interview
  – Observation
  – Records
  – Experimental Approach
  – Survey Approach

QUESTIONNAIRE
• A series of questions designed to elicit information, which is filled in by all participants in the sample.
• This can be gathered either by oral interview or by written questionnaire.
• This is the most common type of research instrument.
• Advantages include:
  – Relatively simple method of obtaining data.
  – Less time is consumed.
  – Researcher is able to gather data from a widely scattered sample.

Disadvantages of a Questionnaire
• Responses to a questionnaire lack depth.
• The respondent may omit or disregard any item he chooses.
• Some items may force the subject to select responses that are not his actual choice.
• Length of the questionnaire is limited according to the respondent’s interest.
• Printing may be costly, especially if the questionnaire is lengthy.
• Data are limited to the information that is voluntarily supplied by the respondents.
• Some items may be misunderstood.
• The sample is limited to those who are literate.

Types of Questions!
• Open Ended: This gives the respondents the ability to respond in their own words.
• Closed Ended: This allows the subject to choose one of the given alternatives.

Criteria of a Good Questionnaire
• Clarity of Language
• Singleness of Objective
• One-to-One Correspondence
• Correct Grammar, Spelling, and Construction
• The questionnaire must be constructed using grammatically correct sentences and correctly spelled words

Types of Questions
• Dichotomous questions: These require the respondent to make a choice between two responses, such as yes/no, male/female, or married/unmarried.
• Multiple-choice questions: The respondents are asked to select a response according to their own point of view.
  Example: People have different views on "Co-education". Which of the following best represents your views?
  1. Co-education is necessary to provide adequate exposure.
  2. Co-education is immoral and should be totally banned.
  3. Co-education has undesirable side effects that suggest a need for caution.
  4. Co-education has beneficial effects that merit its practice.
  5. Co-education is moral and should be practiced.

Types of Questions - Continued
• Rank-order questions: The respondents are asked to rank the responses from the "most" to the "least".
  – Example: Why must family planning be practiced? Rank your answers from 1 (most reasonable) to 5 (least reasonable).
    ___ Limits maternal disabilities.
    ___ Gives parents more time to meet family needs.
    ___ Helps maintain financial viability of the family.
    ___ Affords more working hours for couples.
    ___ Ensures family capability to educate all the children in the future.

Types of Questions - Continued
• Rating questions: The respondents are asked to judge something along an ordered dimension.
  Example: On a scale of 1 to 5, where 1 means strongly disagree and 5 means strongly agree, the Emergency at PIMS provides you with the necessary services.
  Scale
    ___ 5 - Strongly agree
    ___ 4 - Agree
    ___ 3 - Uncertain
    ___ 2 - Disagree
    ___ 1 - Strongly disagree

Home Work for Today!!!
• Prepare a questionnaire on the topic assigned to you. The questionnaire must include two parts:
  – Personal Information
  – Specific Information
• Include at most 15 questions
• Make sure you include each type of question that you have studied:
  – Open Ended
  – Closed Ended
    • Dichotomous
    • MCQ
    • Rating Question
    • Rank Order Question

CHECKLIST
Check List

Hygiene Practices                  | Of Great Importance | Of Little Importance | Have No Effect | Have Poor Impact
Washing hands before meals         |                     |                      |                |
Bathing every day                  |                     |                      |                |
Using sanitizers                   |                     |                      |                |
Using antiseptic soaps             |                     |                      |                |
Trimming nails regularly           |                     |                      |                |
Not sharing items like towels etc. |                     |                      |                |

INTERVIEWS
Interview
• This involves either structured or unstructured verbal communication between the researcher and the subject, during which information is obtained for a study.
• Advantages of Interview
  – Data from interviews are usable
  – Depth of response can be assured
  – In an exploratory study, the interview technique provides a basis for the formulation of a questionnaire
  – Clarification is possible
  – No items are overlooked
  – A higher proportion of responses is obtained
  – A greater amount of flexibility is allowed
• Disadvantages of Interview
  – Time element
  – Biases may result
  – Costly

OBSERVATION
Observation
• Advantages of Observation
  – Produces large quantities of data.
  – All data obtained from observation are usable.
  – Relatively inexpensive.
  – Subjects are usually available.
  – The observation technique can be stopped or begun at any time.
• Disadvantages of Observation
  – Accurate prediction of a situation or event to be observed is unlikely.
  – Interviewing selected subjects may provide more information than waiting for the spontaneous occurrence of the situation.
  – Observed events are subject to biases.
  – Extensive training is needed.

RECORDS
Records
• Records refer to all the numbers and statistics that institutions, organizations, and people keep as a record of their activities.
• Sources: census data, educational records, hospital/clinic records
• Advantages of Records
  – Records are unbiased
  – Records often cover a long period of time
  – Inexpensive
• Disadvantages of Records
  – All the researcher can have is what is there. If the record is incomplete, there is no way it can be completed.
  – No one can be sure of the conditions under which the records were collected.
  – There is no assurance of the accuracy of the records.

EXPERIMENTAL APPROACH
Experimental Approach
• Two groups in the experimental approach
  – Treatment / experimental group
  – Control group
• Let us gather the advantages and disadvantages

Testing the Reliability of Research Instrument
• Stability
  – This refers to the extent to which the same results are obtained with repeated use of an instrument.
• There are two categories of tests of stability:
  – Test / retest
  – Repeated observations
• Internal consistency
  – This refers to the extent to which all parts of the measurement technique are measuring the same concept.
• Test of equivalence
  – This refers to the consistency of the results obtained by different investigators or by similar tests at the same time.