Sampling and Selection Biases

There are a number of variables researchers can control in a study; however, there are other factors that might influence the sample or results over which the researchers have no control. The goal of any study is to minimize the uncontrollable factors as much as possible. In this set of notes we’ll look at possible biases that could be encountered.

Selection Bias – This type of bias occurs when part of the target population is ________ sampled or sampled at a different rate than intended. A ________________________ sample is usually biased since the units which are easiest to select or more likely to respond are often not representative of the target population.

Example:

Deliberately or _________________________ selecting a “representative” sample. For example, suppose we want to estimate the average amount a shopper spends at the Mall of America in a shopping trip. What if the researcher simply looks around and samples shoppers who look like they’ve spent an “average” amount? The researcher has deliberately imposed their own judgment of what “average” means.

Example:

_________________________ the target population can also create a selection bias. If the target population hasn’t been defined correctly, then there may be large gaps in the representation of the true target population.

Example:

Using a sampling ______________ that fails to include all of the target population may create undercoverage. For example, the U.S. Behavioral Risk Factor Surveillance System survey suffers from undercoverage since telephone interviews are the main method of collecting information. This method automatically eliminates those individuals who don’t have a phone, are incarcerated, or perhaps live in a facility such as a nursing home.

Example:

A sample may suffer from _________________________. This occurs when units are included in the sample even though they do not belong to the target population. For example, most surveys require a participant to be at least 18 years of age.
If this has not been checked beforehand, then individuals may be surveyed who are not part of the target population.

Example:

_________________________ listings in the sampling frame could also cause problems with the selection of your sample. For example, if you are using the telephone directory as your sampling frame, households with more than one telephone line are more likely to be chosen than those with only one. Therefore, not everyone has the same chance of participating in the study.

Example:

_________________________ a convenient member of a population for the designated participant in the study. For example, if the person selected for the study isn’t home, you might just go next door rather than coming back and collecting information on the selected participant.

Example:

_________________________ occurs when researchers fail to obtain responses from all of the individuals chosen for the sample or responses are not returned by selected participants.

Example:

Allowing the sample to consist of only ________________________ creates information that is not reliable. Typically, individuals who have a strong opinion are going to be the ones to respond, and the views of those “in the middle” may be completely ignored.

Example:

Question: For the given scenario, describe the target population and sampling frame. Also, discuss any possible sources of selection bias or inaccuracy of responses.

Entries in the online encyclopedia Wikipedia can be written or edited by anyone with Internet access. This has given rise to concern about the accuracy of the information. Giles (2005) reports on a Nature study assessing the accuracy of Wikipedia science articles. Fifty subjects were chosen on a “broad range of scientific disciplines.” For each subject, the entries from Wikipedia and Encyclopaedia Britannica were sent to a relevant expert; 42 sets of usable reviews were returned. The editors of Nature then tallied the number of errors reported for each type of encyclopedia.
(source: Sampling: Design & Analysis by Lohr, page 20, Question 9)
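
Illustration: The telephone directory example earlier in these notes says that households with more than one telephone line are more likely to be chosen. A small simulation can make that concrete. The sketch below is only an illustration under made-up assumptions: the household counts (900 with one line, 100 with two lines), the number of draws, and the function name simulate_directory_sampling are all hypothetical and do not come from the notes or from any real directory.

```python
import random


def simulate_directory_sampling(n_single=900, n_double=100, draws=100_000, seed=0):
    """Draw listings uniformly from a directory where some households appear twice.

    All counts are hypothetical values chosen for illustration only.
    Returns the estimated selection rate per household for each group.
    """
    rng = random.Random(seed)

    # The sampling frame has one entry per telephone line, so a household
    # with two lines is listed twice.
    frame = ["single"] * n_single + ["double"] * (2 * n_double)

    counts = {"single": 0, "double": 0}
    for _ in range(draws):
        counts[rng.choice(frame)] += 1

    # Convert group totals into a per-household chance of being selected.
    rate_single = counts["single"] / draws / n_single
    rate_double = counts["double"] / draws / n_double
    return rate_single, rate_double


rate_single, rate_double = simulate_directory_sampling()
ratio = rate_double / rate_single
print(f"A two-line household is selected about {ratio:.2f} times as often.")
```

Under these assumptions the ratio should come out close to 2: each listing is equally likely to be drawn, so a household listed twice has roughly twice the chance of selection. This is why not everyone has the same chance of participating when the frame contains duplicate listings.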