TYPES OF BIAS
Section 4.1A

Remember…
• Population
  • Consists of all objects that we wish to describe
  • Census – a survey of the entire population
• Sample
  • Subset of the population
  • Used to predict the population

Sample Survey
We often draw conclusions about a whole population on the basis of a sample. Choosing a sample from a large, varied population is not that easy.

Choosing a Sample Survey
1. Define the population we want to describe.
2. Say exactly what we want to measure.
3. Decide how to choose a sample from the population.
*A “sample survey” is a study that uses an organized plan to choose a sample that represents some specific population.

Current Population Survey
• Contacts 60,000 households each month
• Produces the monthly unemployment rate and a lot of other economic and social information
• Population is defined as all U.S. residents (legal or not) 16 years of age and over who are civilians and are not in an institution such as prison.
• Unemployed – you are available for work and you actually looked for work in the last 4 weeks.

Bad Sampling…
• A sample that does not represent the population.

Convenience Sampling
• Choosing individuals who are easiest to reach.
• Example: Questioning the first 100 people to come to the store
• Example: Mall interview
• What problems can you see?
Studies that use convenience sampling generally have suspect results because the selection of individuals is not random. The results should be viewed with extreme skepticism.

Voluntary Response Sample
• Consists of people who choose themselves by responding to a general appeal.
• Those most likely to respond are the people with strong opinions, especially negative opinions.
• Example: television call-in polls
• FACT: Only about 15% of the public has ever responded to a call-in poll. That is not a representative sample of the population as a whole!

Example
• The American Family Association (AFA) is a conservative group that stands for “traditional family values.”
It ran a poll on its Web site in 2004 about same-sex marriage. Of the 850,000 people who responded, 60% said they favored same-sex marriage. This did not support AFA’s position. What do you think happened?

What type of Sampling?
• A farmer brings a juice company several crates of oranges each week. A company inspector looks at 10 oranges from the top of each crate before deciding whether to buy all the oranges.
Convenience Sampling: This could lead the inspector to think that the oranges are of better quality than they really are, if the farmer puts the best oranges on top.

What type of Sampling?
• The ABC program Nightline once asked whether the United Nations should continue to have its headquarters in the United States. Viewers were invited to call one telephone number to respond “Yes” and another for “No”. There was a charge for calling either number. More than 186,000 callers responded and 67% said “No”.
Voluntary Response Sample: In this case, those who are happy that the United Nations has its headquarters in the U.S. already have what they want and so are less likely to bother responding to the question.

Activity #1
• Guess the length of my string.

Activity #2 - M&M Rectangles
• I want to know the average size of the rectangles on this page.
• Pick 10 rectangles that you believe are representative of all of the rectangles on the page.
• Find the area of each of the 10 you chose.
• Find the average area of all 10.

Bias
The sampling method is biased if it systematically favors certain outcomes.
Exam Tip: Always tell which way it is biased.
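The definition of bias above can be illustrated with the orange-crate scenario. The following is a minimal sketch under made-up assumptions: the crate size, the quality scores, and the seed are all hypothetical and are not data from the slides. It compares the inspector's "top 10" convenience sample with repeated simple random draws of 10 oranges.

```python
import random
from statistics import mean

# Hypothetical crate: 100 oranges with quality scores 1-100.
# Assumption: the farmer has put the best oranges on top, so index 0 is the top.
crate = sorted(range(1, 101), reverse=True)

true_mean = mean(crate)          # average quality of the whole crate: 50.5
top_ten_mean = mean(crate[:10])  # inspector's convenience sample: 95.5

# Repeat a random draw of 10 oranges many times and average the results.
rng = random.Random(1)  # fixed seed (arbitrary) so the run is repeatable
srs_means = [mean(rng.sample(crate, 10)) for _ in range(1000)]
avg_srs_mean = mean(srs_means)

print("True mean quality:       ", true_mean)
print("Top-of-crate sample mean:", top_ten_mean)
print("Average of random means: ", round(avg_srs_mean, 1))
```

In this sketch the top-of-crate sample always overestimates the true quality, which is the direction of the bias. Individual random samples still vary, but their average lands close to the true mean.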
Ex: “Explain how using a convenience sample of students in your stats class to estimate the proportion of all high school students who own a GDC could result in bias.”
You might respond: “This sample would probably include a much higher proportion of students with a GDC than in the population at large because a GDC is required for the stats class.” In other words, this method would probably lead to an overestimate of the actual population proportion.

3 Sources of Bias
• Selection Bias
• Nonresponse Bias
• Response Bias

Selection Bias
• The method of selecting the sample systematically excludes some part of the population of interest.
• Examples – the M&Ms lab, voluntary response

Measurement / Response
• Method of observation tends to produce values that systematically differ from the true value in some way.
• Measurement – improperly calibrated scale, the string activity
• Response – improperly worded questions
Example: “Should illegal immigrants be prosecuted and deported for being in the U.S. illegally, or shouldn’t they?” 69% favored deportation.
vs.
“Should illegal immigrants who have worked in the U.S. for two years be given a chance to keep their jobs and eventually apply for legal status?” 62% favored allowing them to stay.

Response Bias - More
• Response bias occurs when the answers on a survey do not reflect the true feelings of the respondent. It can occur in many different ways.
• Interviewer Error
• Misrepresented Answers
• Wording of Questions
• Ordering of Questions or Words
• Type of Question
• Data-Entry Error

Interviewer Error
An interviewer should be trained to get truthful responses from people. If people don’t feel they can trust the interviewer, they will give questionable answers. Also, be aware of interviewers who have a vested interest in the results of the survey.
Ex: Would you trust a survey conducted by a car dealer that reports 90% of customers say they would buy another car from the dealer?

Misrepresented Answers
Some survey questions result in responses that misrepresent facts or are flat-out lies.
Ex: A survey of recent college graduates may find their self-reported salaries are inflated.
Also, people may overestimate their abilities.
Ex: Ask people how many pushups they can do in 1 minute and then ask them to do the pushups. How accurate were they?

Wording of Questions
• Balanced
• Not Too Vague
Questions should be asked in a balanced form to prevent bias.
Ex: The yes/no question “Do you oppose the reduction of estate taxes?” should be changed to “Do you favor or oppose the reduction of estate taxes?”
Another consideration in wording a question is not to be vague.
Ex: “How much do you study?” is too vague. It should be changed to “How many hours do you study statistics each week?”

Ordering of Questions or Words
Many surveys rearrange the order of the questions, or of words within a question, so that responses are not affected by prior questions or by word order.
Ex: The Gallup organization routinely asks the following question of 1017 adults aged 18 years or older: “Do you (rotated) approve or disapprove of the job Barack Obama is doing as president?” The words approve and disapprove are rotated to remove the effect that may occur by writing the word approve first in the question.

Nonresponse Bias
• Responses are not actually obtained from all individuals selected for inclusion in the sample.
• Example – failure to return polls
Nonresponse occurs when an individual chosen for the sample can’t be contacted or refuses to participate.
All surveys suffer from nonresponse bias. Nonresponse bias can be controlled by
1) using callbacks
2) using rewards
  a) cash for completing the survey
  b) incentives that state responses have an impact on future policy

What type of Bias?
• Bill is assigned by his editor to determine what most Americans think about a new law that will place a federal tax on all modems and computers purchased.
The revenues from the tax will be used to enforce new online decency laws. Bill, being technically inclined, decides to use an email poll. In his poll, 95% of those surveyed opposed the tax. Bill was quite surprised when 65% of all Americans voted for the taxes.
SELECTION BIAS: The email poll excluded people who are NOT technically inclined!

What type of Bias?
• The United Pacifists of America decide to run a poll to determine what Americans think about guns and gun control. Jane is assigned the task of setting up the study. To save mailing costs, she includes the survey form in the group's newsletter mailing. She is very pleased to find out that 95% of those surveyed favor gun control laws and she tells her friends that the vast majority of Americans favor gun control laws.
SELECTION BIAS: Pacifists will most likely favor gun control laws. A large portion of the population was left out!

What type of Bias?
• Large-scale polls were taken in Florida, California, and Maine, and it was found that an average of 55% of those polled spent at least fourteen days a year near the ocean. So, it can be safely concluded that 55% of all Americans spend at least fourteen days near the ocean each year.
SELECTION BIAS: The states chosen for the polls have easier access to the ocean. A large part of the population was left out.

Simple Random Sample
• Good sampling technique.
• A simple random sample (SRS) of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected.

Let’s look at those M&M rectangles again!
• Let’s randomly pick 10 rectangles.
• Press MATH
• PRB
• randInt(1,100,10)
• Find the area of the ten rectangles that match the numbers you generated.

Choosing an SRS…using a Random Number Chart
• Be specific about how you select.
Ex: I’m going to start with line 100 and pick two digits going across the row. The number will represent the group number I will sample.
• Indicate the stopping rule.
Ex: I will stop this process when I have selected 10 groups.
• Tell whether you sample with or without replacement.
Ex: I do not want to repeat numbers because I need 10 distinct groups; therefore, I will sample without replacement.
• Use labels to identify subjects selected to be in the sample.

How to Choose an SRS
Sampling and Surveys
Definition: A table of random digits is a long string of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 with these properties:
• Each entry in the table is equally likely to be any of the 10 digits 0 - 9.
• The entries are independent of each other. That is, knowledge of one part of the table gives no information about any other part.

How to Choose an SRS Using Table D
Step 1: Label. Give each member of the population a numerical label of the same length.
Step 2: Table. Read consecutive groups of digits of the appropriate length from Table D. Your sample contains the individuals whose labels you find.

We are planning an article on family-friendly places to stay over spring break at a nearby beach town. The editors intend to call 4 randomly chosen hotels to ask about their amenities for families with children. They have an alphabetized list of all 28 hotels in the town.

01 Aloha Kai       08 Captiva         15 Palm Tree     22 Sea Shell
02 Anchor Down     09 Casa del Mar    16 Radisson      23 Silver Beach
03 Banana Bay      10 Coconuts        17 Ramada        24 Sunset Beach
04 Banyan Tree     11 Diplomat        18 Sandpiper     25 Tradewinds
05 Beach Castle    12 Holiday Inn     19 Sea Castle    26 Tropical Breeze
06 Best Western    13 Lime Tree       20 Sea Club      27 Tropical Shores
07 Cabana          14 Outrigger       21 Sea Grape     28 Veranda

Random digits from Table D:
69051 64817 87174 09517 84534 06489
87201 97245 88221 22356 77183 88725

Reading two digits at a time:
69 05 16 48 17 87 17 40 95 17 84 53 40 64 89 87 20
Skip any pair that is not a label from 01 to 28, and skip repeats (17 appears three times). The first four usable labels are 05, 16, 17, and 20.

Our SRS of 4 hotels for the editors to contact is: 05 Beach Castle, 16 Radisson, 17 Ramada, and 20 Sea Club.

Activity – Table T5
• Which rectangle would you choose if you used the random number table?
(Table D – back of the book)
• How would you do it?

Homework
• Page 226 (1-11) odd
• Worksheet – Bias
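For anyone who wants to check the SRS work by computer, the two selection methods from this section can be sketched in Python. This is only an illustrative sketch: the function name `srs_from_digit_table` and the seed are made up for this example, and `random.sample` stands in for the calculator. Note that `random.sample` never repeats a label, whereas randInt(1,100,10) can repeat numbers, so on the calculator you would generate extra numbers to replace any repeats.

```python
import random

def srs_from_digit_table(digits, population_size, sample_size, label_width=2):
    """Read fixed-width labels from a string of random digits (Table D style).

    Skips labels outside 1..population_size and skips repeats, which is
    sampling without replacement. Stops once sample_size labels are found.
    """
    digits = "".join(ch for ch in digits if ch.isdigit())  # drop the spaces
    chosen = []
    for start in range(0, len(digits) - label_width + 1, label_width):
        label = int(digits[start:start + label_width])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:  # stopping rule
            return chosen
    raise ValueError("ran out of digits before the sample was complete")

# Digits used in the hotel example (28 hotels, sample of 4).
table_d_digits = "69051 64817 87174 09517 84534 06489 87201 97245"
hotel_labels = srs_from_digit_table(table_d_digits, population_size=28, sample_size=4)
print(hotel_labels)  # [5, 16, 17, 20]

# Rectangle activity: 10 distinct labels from 1 to 100.
rng = random.Random(2024)  # hypothetical seed, only so the run is repeatable
rectangle_labels = rng.sample(range(1, 101), 10)
print(sorted(rectangle_labels))
```

The first result matches the hotels chosen by hand: 05 Beach Castle, 16 Radisson, 17 Ramada, and 20 Sea Club.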