REVIEW OF UNIT 3 PACKET

In a simulation, a person guesses on a four-choice multiple-choice quiz.

5. If a person were to get 30% correct in this experiment, would you be fairly convinced that he or she does better than just guessing? Explain clearly, using the results of your simulation.
6. If a person were to get 50% correct in this experiment, would you be fairly convinced that he or she does better than just guessing? Explain clearly, using the results of your simulation.

where p is the population proportion of interest and p0 is replaced by the conjectured value of interest. The null hypothesis is typically a statement of "no effect" or "no difference." The test of significance is designed to measure the strength of evidence against the null hypothesis.

7. In the present context, the null hypothesis is that the subject is just guessing. Change this verbal statement into a null hypothesis. H0: p = ____
8. If a student has some knowledge of the subject of the multiple-choice question and answers without guessing, then the proportion of correct answers the student gives will, in the long run, exceed what proportion?
9. Write the alternative hypothesis: Ha: p ____
10. Would a person who is just guessing always guess correctly on 1/4 of the trials? What phenomenon is this?
11. Go to the “quiz” linked above and answer the multiple-choice question. Report to the class spreadsheet whether or not you got it correct.
12. Is there sufficient evidence to convince you that your class can clearly do better than guessing on a multiple-choice question? Explain.

1. What is the difference between an observational study and an experiment?
2. What is the difference between an independent and a dependent variable?
3. How do you avoid bias?
4. How do you get rid of bias?
5. List a few impersonal mechanisms you can use to select a random sample:
6. If you are not working with a random sample, what can you not do confidently?
7. If you are working with a random sample, what is it reasonable to do?
8. If your sampling method is biased and you take a larger sample, you will ___ reduce the bias and you will produce a more _______ estimate that is still not close to the population value.

True/False

_____1. In a simple random sample, every member of the population has an equal chance of being included in the sample.
_____2. In a stratified sample, strata are designed so that members in each stratum are heterogeneous, i.e., mixed and unalike.
_____3. The purpose of an experiment is to determine the effect of the independent response variable on a dependent explanatory variable.
_____4. A control group generally consists of observational units who are not to receive the treatment that is the focus of the experiment.
_____5. It is fair to say that in every experiment, there are lurking variables; in some cases, they may be a source of confounding while in other cases, they may not.
_____6. An experiment is blind if neither the observational units nor the person administering the treatment knows which units are in the control group and which units are in the treatment group.
_____7. When an investigator expects that one specific characteristic of the experimental units will likely affect the results of the experiment, a block design is appropriate.
_____8. An observational study can produce causal results whereas an experiment can only identify an association.
_____9. A statistic is said to provide unbiased estimates of a population parameter if values of the statistic from different random samples are centered at the actual parameter value.
_____10. Sampling variability refers to the fact that the values of sample statistics vary from sample to sample.
_____11. Sample statistics from smaller samples are more precise and closer together than those from larger samples.
_____12. The tendency of a sample statistic refers to how much the values vary from sample to sample.
_____13. As long as the population is at least 10 times as large as the sample size, the precision of a sample statistic depends on the sample size, not the population size.
_____14. Taking a larger sample reduces bias.
_____15. Comparison groups are especially important in medical studies because subjects often respond positively simply to being given a treatment. This phenomenon is known as the controlled factor effect.

Short Answer.

16. The owner of a club with 1000 members wants to survey 50 members about the friendliness of the staff. Three sampling methods are described below. One is a simple random sample, one is a convenience sample, and one is a voluntary response sample. Which is which?
   a. Ask the first 50 members who enter the club one morning.
   b. Leave a stack of response cards by the sign-in desk with a sign asking members to participate.
   c. Put each name on a single slip of paper. Place all of the slips in a hat and mix well. Draw one slip out and note the name. Continue picking and noting the names until 50 different names are selected.

17. Agricultural scientists for a chemical company want to determine if a newly developed fertilizer produces heavier tomatoes than the fertilizer they currently manufacture. For their first pilot study, they have 24 healthy young tomato plants growing in individual pots, numbered from 1 to 24. Describe the design of a completely randomized, controlled experiment to test whether the new fertilizer produces heavier tomatoes. Then construct a graphic illustrating your design.

18. Does using a calculator improve understanding of mathematical concepts? All 200 fifth-graders at a school are randomly assigned to one of two groups. One group studies addition of fractions with the aid of a calculator; the other studies the same topic without a calculator. Scores on a fractions test are compared after two weeks.
Comment on the extent to which inferences can be drawn about a larger population and whether cause and effect can be established.

19. Adam is your school’s star soccer player. When he takes a shot on goal, he typically scores half of the time. Suppose that he takes six shots in a game. To estimate the probability of the number of goals Adam makes, use simulation with a number cube. One roll of a number cube represents one shot.
   a. Specify what outcome of a number cube you want to represent a goal scored by Adam in one shot.
   b. For this problem, what represents one trial of taking six shots?
   c. Perform and list the results of ten trials of this simulation.
   d. Identify the number of goals Adam made in each of the ten trials you did in part (c).
   e. Based on your ten trials, what is your estimate of the probability that Adam scores three goals if he takes six shots in a game?

1. Sample size refers to how many ________________ are in a sample. The number of samples for most of what we do in class is the number of students in our class; each student collects a sample.
2. In an actual study, you would only take ___ sample from a given population. As a learning tool, you have taken many samples from the same population to study how sample results ____ from sample to sample.

1. Identify the population, sample, and sample size in each of the following settings.
   a. A quality control engineer at a factory that produces TVs selects 10 TVs from the production line each hour for 8 hours. The engineer inspects each TV for defects in construction and performance.
      Population:
      Sample:
      Sample size:
   b. Prior to an election, a local news organization surveys 1000 registered voters to predict which candidate will be elected as governor.
      Population:
      Sample:
      Sample size:
   c. Describe a possible sampling frame for (b) above.

2. Identify the type of bias(es) in each of the following scenarios.
   a. David hosts a podcast and he is curious how much his listeners like his show. He decides to start with an online poll. He asks his listeners to visit his website and participate in the poll. The poll shows that 89% of the respondents "love" his show.
   b. David hosts a podcast and he is curious how much his listeners like his show. He decides to poll the next 1000 listeners who send him fan emails. They don't all respond, but 94 of the 97 listeners who responded said they "loved" his show.
   c. A senator wanted to know how people in her state felt about internet privacy issues. She conducted a poll by calling 1000 people whose names were randomly sampled from the phone book (note that mobile phones and unlisted numbers aren't in phone books). The senator's office called those numbers until they got a response from all 1000 people chosen. The poll showed that 42% of respondents were "very concerned" about internet privacy.
   d. A senator wanted to know how people in her state felt about internet privacy issues. She conducted a poll by calling people using random digit dialing, where computers randomly generate phone numbers so unlisted and mobile numbers can still be reached. They called over 1000 random phone numbers (most people didn't answer) until they had reached 1000 respondents. The poll showed that 42% of respondents were "very concerned" about internet privacy.

1. Even though we can’t establish a cause-and-effect relationship from an observational study, how can it still help researchers?
2. When suggesting a confounding variable, what should you clearly link it to?

1. An expected value is interpreted as the long-run _____________ of a numerical random process.
2. Many people fall into the trap of believing that probabilities should also hold in the ______________. Remember, probability is a long-term property.

2a. A study is to be conducted of the effectiveness of a new diet called Fatbegone.
The treatment group will go on the new diet for a period of three months. The control group will not receive any information about Fatbegone. Instead, they will have weekly counseling sessions on topics such as healthy eating habits, exercise, sleep, etc. Each person’s weight will be recorded at the beginning of the study and at the end of three months. The change in weight for each subject will then be recorded. One hundred adults are available for the study. Describe or create a graphic for a completely randomized design.

2b. Suppose researchers believe that a person’s response to the new diet is affected by how overweight the person is to begin with. The researchers have determined that 40 of the subjects are slightly overweight, 44 of the subjects are moderately overweight, and 36 of the subjects are extremely overweight. Explain how you would design an experiment that blocks on initial weight.

An Introduction to Confounding Variables: Night Lights and Near-Sightedness
(From Teaching Statistical Concepts with Activities, Data, and Technology, a presentation by Chance and Rossman, September 1, 2010, and Introduction to Statistical Investigations by Tintle et al., 2019)

Near-sightedness typically develops during the childhood years. Recent studies have explored whether there is an association between development of myopia and the use of night-lights with infants. Quinn, Shin, Maguire, and Stone (1999) examined the type of light children aged 2-16 were exposed to. Between January and June 1998, the parents of 479 children who were seen as outpatients in a university pediatric ophthalmology clinic completed a questionnaire. One of the questions asked was “Under which lighting condition did/does your child sleep at night?” before the age of 2 years. The parents chose between “room lighting,” “a night light,” and “darkness.” Based on each child’s most recent eye examination, the children were separated into two groups: near-sighted and not near-sighted.
(a) Identify the observational units and the two variables in this study. For each variable, specify whether it is quantitative or categorical.
    Observational units:
    Variables:
(b) Which variable is being considered the explanatory variable and which is being considered the response variable?
(c) Is this an observational study or an experiment? Explain how you can tell.

The following table displays the sample data:

                     Darkness   Night light   Room light   Total
Not near-sighted        154         154           34        342
Near-sighted             18          78           41        137
Total                   172         232           75        479

(d) Compute the following conditional probabilities:
    i. P(near-sighted | darkness)
    ii. P(near-sighted | night light)
    iii. P(near-sighted | room light)

1. Random ________ aims to produce a _______ that is representative of the population. Random ________ eliminates ______.
2. Random ________ aims to produce _______________________ that are similar in all respects except for the treatment imposed. Random ______________ eliminates ________________.
3. An expected value is interpreted as the long-run _____________ of a numerical random process.
4. Many people fall into the trap of believing that probabilities should also hold in the ______________. Remember, probability is a long-term property.

What is the difference between confounding and lurking variables?

Example: Decide if each situation uses matched pairs or creates two samples.
   a. From Activity 1-16: “Smokers who participated in the study were randomly assigned to receive either the nicotine lozenge being tested or a placebo lozenge”
   b. From Activity 7-4: “States are classified by whether they are east or west of the Mississippi River”
   c. From Activity 5-5: “Every person received the same sequence of 30 letters, but they were presented in two different groupings”

1. If a participant does not have any idea what the correct answer is and therefore guesses each time, what proportion would he or she get correct in the long run?
2. Describe how you could use a spinner to simulate this experiment over and over for a person who just guesses for each of the thirty trials.
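
The spinner simulation in the last question can also be approximated with software. The sketch below is one possible approach and is not part of the packet. It assumes that each of the thirty trials has four equally likely choices, so a guess is correct with probability 1/4. The function names, the 1000 simulated guessers, and the seed are illustrative choices only.

```python
import random


def count_correct_guesses(num_trials, num_choices, rng):
    """Simulate one person guessing on every trial; return how many are correct."""
    # Treat choice 0 as the correct answer (an arbitrary assumption);
    # each guess picks one of the choices with equal probability.
    return sum(1 for _ in range(num_trials) if rng.randrange(num_choices) == 0)


def simulate_guessers(num_people=1000, num_trials=30, num_choices=4, seed=0):
    """Repeat the guessing experiment for many simulated people."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [count_correct_guesses(num_trials, num_choices, rng)
            for _ in range(num_people)]


def share_at_least(counts, minimum_correct):
    """Fraction of simulated guessers with at least minimum_correct correct."""
    return sum(c >= minimum_correct for c in counts) / len(counts)


results = simulate_guessers()
print("Average number correct:", sum(results) / len(results))
print("Share with at least 9 of 30 correct (30%):", share_at_least(results, 9))
print("Share with at least 15 of 30 correct (50%):", share_at_least(results, 15))
```

If pure guessers often reach a given score, that score is weak evidence of doing better than guessing; if they almost never reach it, the evidence is much stronger. Comparing the two printed shares is one way to frame answers about the 30% and 50% results.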