Foundations of Research
Psychology 242, Dr. McKirnan
Research Sampling
• Defining your target population
• Probability & Non-Probability sampling methods
Dr. David J. McKirnan, University of Illinois at Chicago, Psychology; mckirnanuic@gmail.com

The big picture: Research sampling
Define your target population
• What group do you want to generalize to?
• Who is / is not a member of the group?
• What is your sampling frame?

Sampling: Who do you want to generalize to?
• Any study assesses only a sample of the population.
• There are many different ways we may collect a sample.
• There are many different populations or subpopulations we may be interested in.
• The size and breadth of a sample can affect the internal or external validity of the study.
Psychology 242, Dr. McKirnan Week 6; Sampling

Define the target population: Who do you want to generalize to?
Candidate populations, from broadest to most specific:
Mammals; Humans; All Western people; All Americans; Young Americans; College students; UIC students; This class
• Breadth of population to sample from (i.e., size of sampling frame): represents increasing external validity.
• Specificity (and ease) of sampling frame: generally increases internal validity.

Who do you want to generalize to?
Samples typically represent targeted sub-populations:
• Demographic or ‘status’ groups: ethnic or socio-economic status groups
• Geography; e.g., urban dwellers…
• Medical / clinical groups; specific diagnosis
• Behavioral groups: registered voters, home owners, ever used marijuana…
Targeting specific groups increases internal validity by decreasing the complexity of the sample.
• Groups defined by self-identification or subjective state: “Conservatives” vs.
“Liberals”; above a ‘cut point’ on a stress or depression scale; views oneself as “highly likely to vote…”.

Research samples & validity (EXAMPLE)
Clinical drug trials illustrate the conflict between internal vs. external validity in sampling.
• People with diverse symptoms and backgrounds see physicians for depression.
• To enhance internal validity, drug researchers use exclusion criteria to select only participants who fit a specific definition of depression.
• Zimmerman et al. suggest that too many exclusion criteria compromise the validity of this research area.
Zimmerman, M., Mattia, J.I., & Posternak, M.A. (2002). Are Subjects in Pharmacological Treatment Trials of Depression Representative of Patients in Routine Clinical Practice? Am J Psychiatry, 159, 469–473.

Exclusion criteria & validity (EXAMPLE)
• The study begins with a large number of people self-referred for depression.
• They exclude those with serious mental illness, drug abuse or personality disorder…
• …whose symptoms are not severe enough, who are suicidal, or who have other affective disorders…
• …whose symptoms are too recent OR too long-standing…
• …and end up with a small, carefully selected sub-set of patients (8.4% of general depression patients).

External vs. internal validity in sampling (EXAMPLE)
• Applying rigorous study selection criteria for drug trials excludes the great majority of routine depression patients.
• Rigorous participant selection for internal validity seriously compromises external validity in these studies.
• This leaves the actual usefulness of anti-depressant (and other) medications for the general population in doubt.
• To be useful, research must balance the need for careful subject selection with the need for representativeness.

Who is a group member?
Are you between 14 and 30 and have a computer or smart phone available?
A = Yes. B = No. C = Not sure – lost count.

Who is a group member?
Do you use Facebook or other media 5 times a week or more?
A = Yes. B = No. C = Not sure – lost count.

Who is a group member?
Are you a “Facebook user”?
A = Yes. B = No. C = Not sure – let me Facebook that.

Who is a group member?
Do you live in Pilsen, Humboldt Park or another neighborhood that is mostly Latino?
A = Yes. B = No. C = Maybe – I’m not sure.

Who is a group member?
Do you speak Spanish?
A = Yes. B = No. C = ¿cuál era la pregunta?

Who is a group member?
Are you Latino?
A = Yes. B = No. C = Maybe – I’m not sure.

Define the target population
Who do you want to generalize to: who is in the group?
Once we choose our sampling group, we must decide on criteria for membership…
To sample social media users do I use a …
• Rough demographic criterion?
• Behavioral criterion (which behavior?)
• Self-identification?
To sample “Latinos”…
• Is geographic status specific enough?
• Is Spanish language the defining characteristic?
• Can / must one call oneself “Latino” (even if you do not speak Spanish…)?
Clearer and narrower group criteria increase internal validity by making the sample more homogeneous.
Some of these criteria are easier to reliably measure than others:
• Demographic variables are often available in census data.
• Behavioral or subjective criteria require direct assessment, and can be less reliable.
Of course, different criteria may yield very different samples.
Our choice of sampling criteria must be based on our theory, hypothesis, or research question.

Sampling criteria
Who is “Latino”?
• Demographic or ‘status’ marker: Neighborhood residence?
• Behavioral: Spanish speaking? Cultural practices?
• Subjective / self-identification: Self-description?
Who is a “Student”?
• Demographic or ‘status’ marker: Lives on a campus
• Behavioral: # Hours registered
• Subjective / self-identification: Describes occupation as ‘student’
Who is “gay” or “lesbian”?
• Demographic or ‘status’ marker: Lives in same-sex 2-person household?
• Behavioral: Sexual or other patterns?
• Subjective / self-identification: Self-identification as gay / lesbian?
Who is “depressed”?
• Demographic or ‘status’ marker: Received a diagnosis from MH professional?
• Behavioral: Pattern of behaviors and feelings? Presents at doctor’s office for general malaise?
• Subjective / self-identification: Describes self as “depressed”?
• Each criterion may meet the goals of a particular hypothesis or empirical question.
• Of course, different choices may lead to very different samples.
• Some criteria are easy to assess but may be only approximate.
• Others may require relatively difficult assessments.

Overview: From research question to sample
• What is the research question? Are we describing some natural process? …testing a theory?
• What is the population of interest? What population or subpopulation is relevant to our research question? Whom do we want to generalize to?
• Category of participant criterion? Demographic or “status” criteria? Behavioral criterion? Self-identification, attitudes or beliefs?
• Operational definition of enrollment criteria? Specific measures.
• Actual recruitment? Concrete processes to recruit and enroll participants.

From theory to sample: Asthma among African-Americans (EXAMPLE)
Study structure & research question:
• Adherence to a medication regimen is key to health among people with asthma.
• Medication adherence is generally low, particularly among African-American adolescents, who have high rates of asthma.
• Self-determination theory proposes that autonomous motivation (being self-directed), self-confidence, and relatedness (family routines & parental support) underlie adherence.
• This study tests the hypothesis that the three variables comprising self-determination theory will be associated with patients’ adherence to medications.
• Because young African-Americans have a significant health burden from asthma, the study focuses on them.
Bruzzese, J., Idalski C., Lam, P., Deborah A., & Naar-King, S. (2014). Adherence to asthma medication regimens in urban African American adolescents: Application of self-determination theory. Health Psychology, 33(5), 461-464.

From theory to sample: Asthma among African-Americans (EXAMPLE)
Population of interest?
• Young African-Americans who suffer from poorly controlled asthma.
Category of participant criterion?
• Demographic or status criteria: African-American adolescents; poorly controlled asthma.
• Self-identification / attitudes: not a criterion in this study.
• Behavioral criterion: already participating in a long-term asthma control study.
Operational definition of enrollment criteria?
• “Adolescent”: age 10 – 18.
• “Poorly controlled”: at least one asthma-related hospitalization or two asthma-related emergency department visits in the last 12 months.
Actual recruitment?
• Recruited from the hospital’s outpatient immunology clinic after an asthma-related clinic visit or hospitalization.

Results (EXAMPLE)
• Having asthma regulation embedded in the family routine was the only predictor of medication adherence.
• Multiple regression analysis (all variables are tested simultaneously).

Who do you want to generalize to: Your “Sampling Frame”
What is known about your larger population?
• Are there Census or survey data?
• E.g., are there “population” data on depressed people? Do we know the demographic profiles of Facebook users?
• Data about your target population will help you determine how well your sample represents that population.
• What is its size, sub-groups, location…?
Where / how can I best recruit members of the population?
• Will some sub-groups require different recruitment methods than others?
• Will different recruitment methods be biased in favor of some subgroups? E.g., internet surveys are biased against less computer-oriented people.

Having a broad population helps with…
A = Avoiding confounds.
B = External validity.
C = Internal validity.
D = Specificity of the design.

Having a narrower population helps with…
A = Avoiding confounds.
B = External validity.
C = Internal validity.
D = Specificity of the design.

It is not true that using a behavioral or objective versus subjective or psychological (ψ) sample criterion…
A = Must be based on the theory or hypothesis you are testing.
B = Typically leads to the same sample characteristics.
C = Requires different measures to screen participants.
D = Can substantially affect the results of your study.

What is a sampling frame?
A = Sample of the different stimuli that will be used in the experiment.
B = The decision to use a behavioral versus a self-identification or subjective criterion for group membership.
C = The list of sub-populations we plan to study.
D = Census, survey or other data about the target population that allows us to know if our sample is representative.

Research sampling
• Defining your target population
• Probability & Non-Probability sampling methods

Major forms of sampling
Probability (Random) Sampling
• Recruit (or select) participants to maximize the representativeness of the sample to a known population.
• Uses some form of random selection.
• Requires that each member of the population has a known (often equal) probability of being selected.
• Most externally valid approach to sampling general populations.
Non-Probability Sampling
• Use available samples for convenience, or targeted outreach to unusual or small populations.
• Selection may be either systematic or haphazard, but is not random.
• Often the most externally valid approach to unusual, small, or extreme groups, or groups where little is known.
• When used only for convenience it is the least externally valid.

Watch that word ‘random’!
• Participant Selection: Random Selection or a Random Sample refers to how we recruit participants; who is in the sample.
• Participant Assignment: Random Assignment is how we (should) assign participants to different groups.
[Diagram: the sample is assigned to Group A, Group B, (Group C); each group receives a procedure and an experimental manipulation (Treatment, Control, or (Alternate Treatment?)), followed by an outcome.]

Probability / Random Sampling
• Core feature: all members of the study population have an equal (or known) chance of being sampled.
• Procedure: Choose participants in a systematic, random fashion.
  • e.g., every 100th student ID,
  • every 1000th person on a voter registration record.
• Advantages: eliminates obvious biases of convenience sampling.
• Limitations:
  • May under-sample unusual / hard to reach participants.
  • Some may be unavailable in, e.g., telephone lists, computer files.

Basic forms of random sampling
• Simple Random Sampling: Select a specific % of a target population; all members of the population have about equal chance of selection.
• Multi-Stage: Randomly select population units (census tracts, households, schools…), then randomly select individuals within each unit.
• Stratified: Random within population sub-blocks, e.g., gender (randomly select 50 women and randomly select 50 men), ethnicity, etc.
• Cluster: Random within (potentially convenience) clusters, e.g., specific locations or “venues”, events, times of day, etc.

Simple Random sampling
• Objective: Attempts to truly represent the general population; absolute minimal selection bias.
• Procedure: Recruitment method where all members of the population have an approximately equal chance of being selected.
• Examples:
  • Gallup polls using random digit dialing surveys
  • “Long form” of the census sent to a small % of U.S. households
• Advantages: Most representative sampling frame for a general (non-targeted) population.
• Disadvantages: Any recruitment method excludes some people (no telephone, no stable address, etc.).

Multi-Stage Random sampling
• Objective: Focused & efficient random sample.
• Procedure: Concentrate recruitment in specific locations or venues.
• Examples:
  NIDA household drug surveys:
  1) Randomly select a moderate number of census tracts nationally.
  2) Randomly select a small % of households within each tract.
  3) Interview the 1st adult who answers the phone in each household.
  “CITY” HIV study among youth:
  1) Randomly select bars, clubs, other venues across the city.
  2) Randomly select days & times to recruit in them.
  3) Randomly approach every 4th person who enters the venue for an interview.
• Advantage: Much more efficient than simple random sampling.
• Disadvantage: Same as above; necessarily excludes some people.
  • Bias in who answers the phone in drug-using households…?
  • Not all young gay men go to bars or similar venues…

Stratified or cluster sampling
• Objective: Represent every key segment of the population.
• Procedure: Decide which population segments are important (e.g., ethnic groups, geographic areas, self-identification). This decision must be based on your hypothesis or empirical question.
• Randomly select from each segment.
  • Proportionate: Same sampling fraction from each segment; approximates the overall population (e.g., sample 1% of all African-Americans, 1% of all Latinos…).
  • Disproportionate: Unequal sampling fraction across segments, to over-represent smaller groups (e.g., select a larger % of recent immigrants…).

Non-Probability Sampling
Useful for populations that:
• Cannot be randomly sampled; “hidden” or difficult to reach.
• Have no sampling frame available, such as census data, describing their size, composition, etc.
• Examples: drug users, recent immigrants, gay men…
Limitations:
• Likely to misrepresent the population.
• May be difficult or impossible to detect this misrepresentation.
• Often over-sensitive to incentives: paying participants attracts more poor people.
“Respondent Driven” sampling (RDS) allows for “targeted” population estimates.

Non-Probability methods (1)
Haphazard Sampling; “Man on the street”
• College psychology majors
• Available medical / therapy clients
• Volunteer samples
• Problem: No evidence for representativeness.
• Advantage: Availability of participants.
Modal Instance Sampling; “Typical” case
• Typical New Yorker describing trade tower tragedy
• Typical voter
• Problem: May not represent the modal group.
• Advantage: Describes a simple, “typical” case.
Haphazard / modal instance sampling is often used by journalists or in qualitative-descriptive studies; see NYT “down low” article.

Non-Probability methods (2)
Venue & time / space Sampling
• Sample a specific, well-defined, often hard to reach group.
• Assume group members are well represented at specific locations or settings (“venues”).
• Use “intercept” methods for reaching participants:
  • Use indigenous outreach workers from the population.
  • Develop a standard recruitment script.
  • Collect / distribute contact information for later participation.
• Time / space randomization lessens bias due to choice of venue:
  • Randomly approach different venues at different times.
  • Randomly select participants within the venue (e.g., every 4th person…).
• Strategy must be based on a clear epidemiological or theory question.
• Examples: Shopping mall intercepts, gay recruitment.

Outreach / venue sampling: examples of palm cards
[Image: palm cards]

Outreach lead sheet
[Image: lead sheet]

Non-Probability methods (3)
Targeted Multi-Frame Sampling
• Sample a specific, hard to reach group.
• No census or similar data for sampling frame.
• Uses multiple (convenience) sampling “frames”:
  • Direct outreach to places where population members are available (venue sampling).
  • Newsletters, internet lists & chat rooms.
  • Organizations or meeting places.
• Strategy must be based on a clear epidemiological or theory question.
• Most common & valid convenience sample.
• Examples: “MTV” market segments, shoplifters,
  people who have risky sex, homeless people…

Non-Probability methods (4)
Snowball / “Respondent Driven” Sampling (RDS)
• Early participants are paid to recruit others, who recruit others, etc.
• Choice of seeds.
• Form of targeted sampling: Recruit a network of “linked” people tracked by referrals.
• Problem: Eligibility criteria; sensitive to incentives!
• Advantage: Access unusual or “hidden” people related by a common behavior.
• With enough “generations” of links it can represent a target population well.
• Often part of a multi-frame approach.
• With RDS we can show the “chain” of referrals / links.
• Useful for people who mistrust research or where personal contact is necessary for recruitment (HIV, drug use).
• Portrays a “chain” of influence or, e.g., infectious disease.

RDS coupon examples
[Image: RDS coupons]

RDS; chain description
[Image: RDS referral chains]
Heckathorn, D.D. & Magnani, R. (2004). Snowball and Respondent-Driven Sampling. In: Behavioral Surveillance Surveys: Guidelines for Repeated Behavioral Surveys in Populations at Risk of HIV.

Example of social network sampling: Bearman et al., Romantic ties among adolescents
• A substantial majority of students are in an extended, linked chain of relationships.
• With a number of smaller chains.
• And a small % in 2 to 4 person chains.
• From a sampling perspective, several “seeds” access most of the population.
• Findings suggest a clear potential for STI transmission.
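
The wave-by-wave recruitment behind snowball / respondent-driven sampling can be sketched in a few lines of Python. This is only an illustrative sketch, not a procedure taken from the slides or from Heckathorn & Magnani: the function name `rds_recruit`, the `coupons` limit, and the toy contact network are all hypothetical.

```python
import random

def rds_recruit(network, seeds, coupons=3, max_waves=5, rng=None):
    """Simulate referral waves over a contact network (hypothetical helper).

    network: dict mapping each person to the list of people they know.
    seeds: initial participants chosen by the researcher (wave 0).
    coupons: maximum number of peers each participant may recruit.
    Returns a list of waves; each wave is a list of newly enrolled people.
    """
    rng = rng or random.Random()
    enrolled = set(seeds)
    waves = [list(seeds)]
    for _ in range(max_waves):
        next_wave = []
        for recruiter in waves[-1]:
            # Only contacts who are not yet enrolled can receive a coupon.
            contacts = dict.fromkeys(network.get(recruiter, []))
            candidates = [p for p in contacts if p not in enrolled]
            rng.shuffle(candidates)
            for peer in candidates[:coupons]:
                enrolled.add(peer)
                next_wave.append(peer)
        if not next_wave:
            break  # the referral chain has died out
        waves.append(next_wave)
    return waves

# Toy network (assumed for illustration): one linked chain plus an isolated pair.
network = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "E"],
    "D": ["B"],
    "E": ["C", "F"],
    "F": ["E"],
    "X": ["Y"],  # X and Y have no link to the main chain
    "Y": ["X"],
}

waves = rds_recruit(network, seeds=["A"], coupons=2, rng=random.Random(0))
for number, wave in enumerate(waves):
    print("wave", number, sorted(wave))
```

In this toy network a single seed reaches everyone in the linked chain, but the isolated pair X and Y is never recruited, which is one way to see why the choice of seeds matters.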

Non-Probability methods
Quota Sampling
• Select people non-randomly according to quotas.
• Similar to cluster sampling, except you cannot randomly sample each population segment.
• Must have a clear theory / research question to pick the relevant population characteristic(s).
• Proportional quota sampling
  • Represent major characteristics of a population. If gender is important, and the proportion of women :: men in your population = 65% :: 35%, the sample must meet that quota.
• Non-proportional quota sampling
  • Sample enough members of each group to test the hypothesis, even if the sample is not proportional (e.g., recruit 50 women & 50 men, even though the real proportion is 65 :: 35).
  • Helps assure that you have good representation of smaller population groups.

Non-Probability methods
Web sampling
• Typically highly targeted samples: gay / bisexual men…, adolescents…, “gamers”…
• Typically access through existing venues:
  • Users of specific web sites
  • List-serves, e-mail lists
  • Active recruitment in “chat rooms”
• Problem: Inherent bias in computer literacy(?)
• Advantage: Cheap, large national sample; access unusual or “hidden” people who reach others via the internet.

Non-Probability methods; Heterogeneity Sampling
• Sample every sector of a population (at least several of everyone) without worrying about proportions.
  • At least some members of each geographic area
  • …ethnic group
  • …behavioral group (voters & non-voters…)
• Assume that a few people are a good proxy for the group.
• Examples: focus groups or qualitative interviews about products, social issues...
• Problem: Cannot be sure a few people really represent their sub-group.
• Advantage: At least some representation of all subgroups.

A probability sample is…
A = Based on some form of random selection.
B = Always representative of the population.
C = Best for any population.
D = Usually easier to collect than other sample approaches.

A non-probability sample…
A = Is perfectly OK if you have limited resources.
B = Just consists of grabbing the most convenient possible participants.
C = Is never adequate to generalize from.
D = Can be best for hard to reach or unusual participants.

A Gallup poll or telephone survey is a…
A = Simple random sample.
B = Multi-stage random sample.
C = Social network or “snowball” sample.
D = Haphazard sample.

Respondent-driven sampling, where target people recruit people like them, is a…
A = Simple random sample.
B = Multi-stage random sample.
C = Social network or “snowball” sample.
D = Haphazard sample.

My distributing a survey to this class is a…
A = Simple random sample.
B = Multi-stage random sample.
C = Social network or “snowball” sample.
D = Haphazard sample.

Selecting every 100th registered voter and contacting them for a survey is a…
A = Simple random sample.
B = Multi-stage random sample.
C = Social network or “snowball” sample.
D = Haphazard sample.

Randomly selecting classes across the university, then sampling each 3rd person, is a…
A = Simple random sample.
B = Multi-stage random sample.
C = Social network or “snowball” sample.
D = Haphazard sample.

Sampling overview (Summary)
Who do you want to generalize to?
• Who is the target population?
  • broad: external validity
  • narrow: internal validity
• How do you decide who is a member?
  • demographic / behavioral criteria?
  • subjective / attitudinal?
• What do you know about the population already; what is the “sampling frame”?
• Is a probability or random sample possible?
  • “Hidden” population?
  • Socially undesirable research topic?
  • Easily available via telephone, door-to-door?
  • Sampling frame adequate to choose selection method?

Overview, 2 (Summary)
Types of Non-probability Samples
• Haphazard
• Modal instance
• Venue – time / space
• Multi-frame
• Snowball / Respondent driven
• Web
• Quota
• Heterogeneity

Overview, 3 (Summary)
Probability sampling (simple; multi-stage; cluster or stratified)
• Most externally valid
• Assumes: clear sampling frame; population is available
• Less externally valid for hidden groups
Non-probability sampling (targeted / multi-frame; snowball; quota, etc.)
• Less externally valid
• High “convenience”
• Best when: no clear sampling frame; hidden / avoidant population
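
The probability methods summarized above can be made concrete with a short Python sketch. It is a minimal illustration under stated assumptions, not a recruitment protocol from the course: the population of 1,000 people (650 women, 350 men, echoing the 65 :: 35 example in the quota slide) and the helper names `simple_random_sample`, `systematic_sample`, and `stratified_sample` are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def simple_random_sample(population, n, rng):
    """Every member has an equal chance of selection."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Every k-th member of the list, starting from a random offset."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(population, stratum_of, fractions, rng):
    """Random selection within each stratum.

    fractions maps stratum name to its sampling fraction; equal fractions
    give a proportionate sample, unequal fractions a disproportionate one.
    """
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for name, members in strata.items():
        n = round(len(members) * fractions[name])
        sample.extend(rng.sample(members, n))
    return sample

# Hypothetical sampling frame: (id, gender) records, 65% women and 35% men.
genders = ["woman"] * 650 + ["man"] * 350
population = list(enumerate(genders))
rng = random.Random(242)

def gender(person):
    return person[1]

srs = simple_random_sample(population, 100, rng)
every_100th = systematic_sample(population, 100, rng)
proportionate = stratified_sample(
    population, gender, {"woman": 0.10, "man": 0.10}, rng)
disproportionate = stratified_sample(
    population, gender, {"woman": 50 / 650, "man": 50 / 350}, rng)

print(Counter(map(gender, srs)))               # varies by chance around 65 / 35
print(len(every_100th))                        # 10 people from a list of 1000
print(Counter(map(gender, proportionate)))     # mirrors the population split
print(Counter(map(gender, disproportionate)))  # equal groups by design
```

All three functions presuppose a complete list of the population, which is the sense in which probability sampling "assumes a clear sampling frame"; for hidden populations no such list exists, so these methods cannot be applied directly.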