Unit 3: Collecting Data - Observational Studies and Experiments - Key (Topics 3-5, 11)

Work for Oct 14/15 starts here:

In this unit, we will study the vocabulary of sampling, experimentation, and simulation. We will learn what a simple random sample is and how to design a good experiment or observational study. Furthermore, we will learn how to use simulation to explore the possible outcomes of an experiment when it is difficult (or costly) to run the experiment itself.

Basic Experimental Design Vocabulary (including some review of basic vocabulary from Unit 1)

The person or thing to which a number or category is assigned is called the observational unit, as we already know. In an experiment, we collect information from the observational units. We refer to the information collected as data.

A variable is any characteristic of an observational unit that can be assigned a number or category. Variables can be classified as quantitative or categorical (qualitative). A quantitative variable measures a numerical characteristic, whereas a categorical (qualitative) variable records a group designation. Binary variables are categorical variables with only two possible categories, for example, right or left.

In addition, variables are independent or dependent (just as you learned in algebra). The independent variable is also called the explanatory variable. This is the variable whose effect you wish to study. The dependent variable is also called the response variable. This is the outcome you record. Often the explanatory variable comes first in a time sequence.

To start an experiment, write a hypothesis of what you think will happen in the experiment. Write an if-then statement in words that includes your explanatory and response variables:

If ______ (observational unit) is changed by ______ (statement including explanatory variable), then I expect that ______.
(statement including response variable)

The overall hypothesis of the experiment is similar to a research question. After establishing a hypothesis, you will set up a null hypothesis and an alternative hypothesis. The null hypothesis is typically a statement of “no effect” or “no difference.” It states what is assumed to be true before you conduct the experiment. The null hypothesis should reflect that the explanatory variable will not affect the observational units. The alternative hypothesis should reflect the change you expect.

Any time a change or treatment is actively imposed by the researcher, an experiment is being conducted. Alternatively, if a researcher is passively observing and recording information, an observational study is being conducted. Observational studies can also use previously collected data. If the researcher does not directly impact the subjects or change the set-up of the situation, it’s an observational study.

There are advantages and disadvantages to observational studies. One advantage is that they generally occur in natural settings. Many animal studies are observational studies. The major disadvantage is that no definite cause-and-effect relationship can be concluded. At most, an association can be found between two variables.

In an experiment, the researcher deliberately imposes some treatment on individuals to measure their responses. You can draw a cause-and-effect conclusion based on the results of a well-designed experiment.

Practice: Read the following article and then answer the questions:

Dolled Up or Working, Barbie Reduces Girls’ Career Dreams, but Mrs. Potato Head Doesn’t

An unexpected role model for girls. (Anne Farrar/The Washington Post)

By Los Angeles Times, March 10, 2014

In a psychology lab at Oregon State University, 37 girls ages 4 to 7 have finally demonstrated what feminists have long warned: Playing with Barbie dolls drives home cultural stereotypes about a woman’s place and suppresses a little girl’s career ambitions.
But here’s an unexpected though preliminary finding from the same experiment: Playing with Mrs. Potato Head appears to have the effect on little girls of attending a “Lean In” circle. After spending just five minutes with Jane Potato Head, girls believed they could grow up to do pretty much anything a boy could do. Don’t be fooled by those career-girl Barbies dressed up as doctors, astronauts, politicians and ocean explorers. The authors of the study report that whether girls played with “Doctor Barbie” — decked out in a white coat and jeans with a sparkly applique — or “Fashion Barbie” — dressed in a formfitting minidress and high heels — they were likely to judge themselves capable of plying, on average, 1.5 fewer occupations than a boy could. Those who played with Mrs. Potato Head foresaw basically the same range of career opportunities as boys did. (The researchers presented the girls with 10 jobs, half of them traditionally male-dominated and half traditionally female-dominated.) “Although the marketing slogan suggest[s] that Barbie can ‘Be Anything,’ girls playing with Barbie appear to believe that there are more careers for boys than for themselves,” wrote authors and psychology professors Aurora M. Sherman of Oregon State University and Eileen L. Zurbriggen of the University of California at Santa Cruz. The results of this experiment were published in the journal Sex Roles. (https://today.oregonstate.edu/archives/2014/mar/playing-barbie-dolls-could-limit-girls%E2%80%99career-choices-study-shows) 1. Why is the study classified as an experiment, rather than an observational study? The girls were given either a Barbie or Mrs. Potato Head, so a treatment was imposed. 2. What research question do you think the researchers were trying to answer? Do girls who play with Mrs. Potato Head think they are capable of having more careers than girls who play with Barbie? 3. What/who are the observational units and the sample size? girls, ages 4 to 7; n = 37 4. 
What are the variables? Classify the variables as categorical (binary?) or quantitative.
Explanatory: toy (Mrs. Potato Head or Barbie), binary categorical
Response: number of career options, quantitative

5. Whom might this research affect now or within the next five years? Explain.
Answers will vary. Girls may have experienced this personally, so they may want to be aware of the issue. Boys may have friends whom they could help understand the issue. If you’re watching a younger child, you could think about which toys they play with after reading this. It could affect the gifts that parents give to younger children, or the advice parents give to their children as they are looking at colleges and careers.

The Vocabulary of Sampling

The population in a study refers to the entire group of people or objects (observational units) of interest. A sample is a (typically small) part of the population from whom or which data are gathered to learn about the population as a whole. If the sample is selected carefully, so that it is representative of (has similar characteristics to) the population, you can learn useful information. The number of observational units (people or objects) studied in a sample is the sample size.

1. Identify the population, sample, and sample size in each of the following settings.

a. A quality control engineer at a factory that produces TVs selects 10 TVs from the production line each hour for 8 hours. The engineer inspects each TV for defects in construction and performance.
Population: TVs produced at that particular factory
Sample: TVs selected from the production line each hour
Sample size: 80 (10 TVs per hour for 8 hours)

b. Prior to an election, a local news organization surveys 1000 registered voters to predict which candidate will be elected as governor.
Population: Registered voters Sample: Registered voters chosen for survey by local news organization Sample size: 1000 In selecting a sample from a population, ideally we would start with a sampling frame, an actual list of every member of the population that we want to sample from. c. Describe a possible sampling frame for (b) above. List of all registered voters in the state From the definition above, a sample is any subgroup of the population. But, if we wish to be able to draw inferences about the population, the sample must reflect or represent the population characteristics. We also want the sample to be random. For a sample to be a random sample, every member of the population must have an equal chance of being selected. This involves using a chance process to determine which members of a population are included in the sample. We will look at processes to select a random sample including using your calculator, a random digits table, and more. A sampling procedure is said to have sampling bias if it tends systematically to overrepresent certain segments of the population and to underrepresent others; that is, the sample will not be representative of the population. There are several types of biased sampling. The first is convenience sampling, so called because the data are collected from people whom the researchers can easily contact. For example, if you stand at the door of the cafeteria and ask the first 30 students how much time they spent on homework, it is unlikely that this accurately represents the homework habits of all students. A second type of sampling bias is voluntary response sampling. This type of sample allows people to choose to be in the sample by responding to a general invitation. People who volunteer for such surveys cannot be thought of as representative of the population; voluntary response sampling is always a poor sampling method. A problem that plagues even well-designed surveys is that of nonresponse bias. 
When people selected for inclusion in a study refuse to participate, their opinions are not represented in the collected data.

Finally, the behavior of the respondent or interviewer can cause response bias in sample results.

[Picture: an interviewer at a dog show asks respondents whether they prefer cats or dogs.]

Due to the location of the interview, there is most likely response bias.

2. Identify the type of bias(es) in each of the following scenarios.

a. David hosts a podcast and he is curious how much his listeners like his show. He decides to start with an online poll. He asks his listeners to visit his website and participate in the poll. The poll shows that 89% of the respondents "love" his show.
Voluntary response

b. David hosts a podcast and he is curious how much his listeners like his show. He decides to poll the next 1000 listeners who send him fan emails. They don't all respond, but 94 of the 97 listeners who responded said they "loved" his show.
Convenience sample; non-response bias

c. A senator wanted to know how people in her state felt about internet privacy issues. She conducted a poll by calling 1000 people whose names were randomly sampled from the phone book (note that mobile phones and unlisted numbers aren't in phone books). The senator's office called those numbers until they got a response from all 1000 people chosen. The poll showed that 42% of respondents were "very concerned" about internet privacy.
Sampling bias: the phone book leaves out people with only mobile or unlisted numbers, so they are underrepresented in the sample.

d. A senator wanted to know how people in her state felt about internet privacy issues. She conducted a poll by calling people using random digit dialing, where computers randomly generate phone numbers so unlisted and mobile numbers can still be reached. They called over 1000 random phone numbers (most people didn't answer) until they had reached 1000 respondents. The poll showed that 42% of respondents were "very concerned" about internet privacy.
Non-response bias

A parameter is a number that describes a population, whereas a statistic is a number that describes a sample.

3. For each statement, identify whether the percentage given is a statistic or a parameter.
a. Of all U.S. kindergarten teachers, 32% say that knowing the alphabet is an essential skill.
Parameter
b. Of the 800 U.S. kindergarten teachers polled, 34% say that knowing the alphabet is an essential skill.
Statistic

4. Of the U.S. adult population, 36% has an allergy. A sample of 1200 randomly selected adults resulted in 33.2% reporting an allergy.
a. Who is the population? The U.S. adult population
b. What is the sample? The 1200 selected adults
c. Identify the statistic and give its value. Proportion of U.S. adults in the 1200-person sample who have an allergy; 33.2%
d. Identify the parameter and give its value. Proportion of all U.S. adults who have an allergy; 36%

5. In your own words, explain why the parameter is fixed and the statistic varies.
The parameter describes the entire population, so it has a single value that does not depend on who is sampled. A statistic is computed from a sample, and different samples give different values (sampling variability).

The purpose of a sample survey is to gather information about a population without disturbing the population in the process. Sample surveys are one type of observational study.

An example of an observational study would be a researcher trying to determine the effects that eating strictly organic foods has on overall health. The researcher finds 200 individuals, 100 of whom have eaten organically for the past three years and 100 of whom have not. The researcher then gives each subject an overall health assessment. Lastly, the researcher analyzes the data and uses it to draw conclusions about how eating organically is related to overall health. This is an observational study because the researcher hasn't done anything other than observe the individuals in the study.

6.
One study of cell phones and the risk of brain cancer looked at a group of 469 people who have brain cancer. The investigators matched each cancer patient with a person of the same age, sex, and race who did not have brain cancer, then asked about the use of cell phones. Result: “Our data suggest that the use of hand-held cellular phones is not associated with risk of brain cancer.”
a. Is this an observational study or an experiment? Justify your answer.
Observational study; no treatment was imposed.
b. What is the explanatory variable? Response variable?
The explanatory variable is the use of hand-held cellular phones. The response variable is the incidence of brain cancer.
c. Based on this study, would you conclude that cell phones do not increase the risk of brain cancer? Why or why not?
No. An observational study can suggest an association (or the lack of one) but cannot establish or rule out a cause-and-effect relationship.

7. An educational software company wants to compare the effectiveness of its computer animation for teaching biology with that of a textbook presentation. The company gives a biology pretest to a group of high school juniors, and then divides them into two groups. One group uses the animation, and the other studies the text. The company retests all students and compares the increase in biology test scores in the two groups.
a. Is this an observational study or an experiment? Justify your answer.
This is an experiment. A treatment was imposed: use of the textbook vs. use of the computer animation.
b. What is the explanatory variable? Response variable?
The explanatory variable is the delivery method of the material (text or animation). The response variable is the increase in biology test scores.
c. If the group using the computer animation has a much higher average increase in test scores than the group using the textbook, what conclusions, if any, could the company draw?
Because this is an experiment, a cause-and-effect conclusion can be drawn, provided the students were divided into the two groups at random.
The conclusion is that computer animation is more effective than traditional textbook delivery.

** Now turn to pages 39-40 in your book and complete Activity 3-4: Candy and Longevity **

In the Candy and Longevity Study, the presence of additional variables is introduced. Variables that are not considered in the study but that may also be related to the response variable are called lurking variables. Some of these variables may also differ between the explanatory variable groups in such a way that we cannot distinguish between their effects and those of the explanatory variable. These confounding variables prevent us from drawing a valid cause-and-effect conclusion about the explanatory and response variables.

An Introduction to Confounding Variables: Night Lights and Near-Sightedness
(From Teaching Statistical Concepts with Activities, Data, and Technology, a presentation by Chance and Rossman, September 1, 2010, and Introduction to Statistical Investigations by Tintle et al., 2019)

Near-sightedness typically develops during the childhood years. Recent studies have explored whether there is an association between development of myopia and the use of night-lights with infants. Quinn, Shin, Maguire, and Stone (1999) examined the type of light children aged 2-16 were exposed to. Between January and June 1998, the parents of 479 children who were seen as outpatients in a university pediatric ophthalmology clinic completed a questionnaire. One of the questions asked, about the period before the age of 2 years, was “Under which lighting condition did/does your child sleep at night?” The parents chose between “room lighting,” “a night light,” and “darkness.” Based on each child’s most recent eye examination, the children were separated into two groups: near-sighted and not near-sighted.

(a) Identify the observational units and the two variables in this study. For each variable, specify whether it is quantitative or categorical.
Observational units: 479 children seen as outpatients at the clinic
Variables: type of light children were exposed to (categorical), nearsightedness (categorical)

(b) Which variable is being considered the explanatory variable and which is being considered the response variable?
Explanatory: type of light; Response: nearsightedness

(c) Is this an observational study or an experiment? Explain how you can tell.
Observational study: the researchers did not impose a treatment on the children; they simply collected information about them.

The following table displays the sample data:

                    Darkness   Night light   Room light   Total
Not near-sighted      154         154            34        342
Near-sighted           18          78            41        137
Total                 172         232            75        479

(d) Compute the following conditional probabilities:
i. P(nearsighted | darkness) = 18/172 = 0.104 or 0.105
ii. P(nearsighted | night light) = 78/232 = 0.336
iii. P(nearsighted | room light) = 41/75 = 0.546 or 0.547

Notice that as the amount of light in the room increases, the percentage of children who are nearsighted also increases. So, we can say that an association exists between whether or not a child is nearsighted and the amount of light used in the child’s room before the age of 2.

Association raises questions about cause-and-effect. This observational study cannot answer the question “Does sleeping with a night light cause a child’s chance of being nearsighted to increase?” Is there another explanation for the higher percentage of near-sightedness in children using the room light and night light compared to those with no light?

The association between amount of light in the room and eyesight condition appears to be real. An alternative explanation is that nearsightedness is often an inherited trait. But does this explain the observed association? It does if parents who are nearsighted are more likely to use a room light or a night light with their infants because they need the extra light to check on their child at night.
This appears to be a reasonable explanation. The problem is that we don’t know if this explanation is better or worse than the idea that providing more light causes a child’s chance of being nearsighted to increase. Both explanations are consistent with the data we have and the association we observed. If the parents’ eyesight tends to differ among the children with well-lit rooms, night lights, and darkness, we have no way of separating out the “effects” of this variable from the lighting condition. In this case, the parents’ eyesight is considered a confounding variable (see below).

When another variable has a potential influence on the response, but its effects cannot be separated from those of the explanatory variable, we say the two variables are confounded.

When we classify subjects into different groups based on existing conditions (i.e., in an observational study), there is always the possibility that there are other differences between the groups apart from the explanatory variable that we are focusing on. Therefore, we cannot draw cause-and-effect conclusions between the explanatory and response variables from an observational study.

There will always be “other variables” floating around in any study. What we are concerned about are the potential confounding variables that prevent us from isolating the explanatory variable as the only influence on the response variable. For example, eye color is another variable, but it is unlikely to be related to both eye condition and lighting use.

The key to discounting a cause-and-effect explanation is to identify a potential confounding variable and explain how it is linked to both the explanatory variable and the response variable in a way that also explains the observed association. Keep in mind that the association revealed is real; we are just saying there could be an alternative explanation.

Here is a classic example of confounding variables: There is an association between murder rate and the sale of ice cream.
As the murder rate rises, so does the sale of ice cream. One suggestion for this could be that murders cause people to buy ice cream. This is highly unlikely. A second suggestion is that purchasing ice cream causes people to commit murder, which is also highly unlikely. Then there is a third suggestion, which includes a confounding variable. It is distinctly possible that the weather accounts for the correlation. When the weather is icy cold, fewer people are out interacting with others, and people are less likely to purchase ice cream. Conversely, when it is hot outside, there is more social interaction and more ice cream is purchased. In this example, the weather is the variable that confounds the relationship between ice cream sales and murder.

Exercise 1: An article about handwriting appeared in the October 11, 2006 issue of the Washington Post. The article mentioned that among students who took the essay portion of the SAT exam in 2005-06, those who wrote in cursive style scored significantly higher on the essay, on average, than students who used printed block letters.

(a) Identify the explanatory and response variables in this study. Classify each as categorical or quantitative.
Explanatory: type of writing (categorical); Response: essay score (quantitative)

(b) Is this an observational study or an experiment? Explain briefly.
Observational study: researchers did not impose a treatment on the students taking the SAT; they simply observed the essays once written and collected information.

(c) Would you conclude from this study that using cursive style causes students to score better on the essay? If so, explain why. If not, identify a potential confounding variable, and explain how it provides an alternative explanation for why the cursive writing group would have a significantly higher average essay score.
No, you can’t draw a cause-and-effect conclusion from an observational study.
There could be confounding variables such as wealth of the students which could influence their choice of writing and their essay score if they have better access to education or test prep. The article also mentioned the following which could be confounding (it seems to indicate that printed block letters came from typing and not handwriting): Forming letters with the hand by using a pen or pencil is cognitively different than pushing a physical or virtual key on a keyboard. When learning, forming letters by hand creates a connection with the movement of the hand to the visual response of seeing the letter on the page. There are multiple processes coexisting simultaneously: the movement of the hand, the thought of the letter, and the visual cue of the letter. This is reading and writing concurrently, which is a necessary skill. Children need to go through this process to fully understand the English language and connect words to motor memory. Learning cursive handwriting is important for spelling skills, enabling children to recognize words when they read them later. Typing doesn’t have the same effect on the brain, as it doesn’t require the same fine motor skills and simultaneous activity. The article also mentioned a different study in which the same exact essay was given to many graders. Some graders were shown a cursive version of the essay and the other graders were shown a version with printed block letters. Researchers randomly decided which version of the essay the grader would receive. The average score assigned to the essay with the cursive style was significantly higher than the average score assigned to the essay with the printed block letters. (d) What conclusion would you draw from this second study? Be clear about how this conclusion would differ from that of the first study, and why that conclusion is justified. The second study is an experiment because the graders received either a cursive essay or a block letter essay, so a treatment was imposed. 
The researchers controlled the experiment by giving every grader the same essay. They also randomly assigned which graders would receive the cursive version and which would receive the block-letter version. Since this is a well-designed experiment, we can conclude that the cursive style likely caused the essay to receive a higher average score.

Exercise 2: Sports teams prefer to play in front of their own fans rather than at the opposing team’s site. Having a sell-out crowd should provide even more excitement and lead to an even better performance, right? Well, consider the Oklahoma City Thunder, an NBA team, in its second season (2008-2009) after moving from Seattle. This team had a win-loss record that was actually worse for home games with a sell-out crowd (3 wins and 15 losses) than for home games without a sell-out crowd (12 wins and 11 losses). (These data were noted in the April 20, 2009, issue of Sports Illustrated in the Go Figure column.)

(a) Identify the explanatory and response variables in this study. Classify each as categorical or quantitative.
Explanatory: whether the crowd was a sell-out (categorical); Response: whether the team won (categorical)

(b) Is this an observational study or an experiment? Explain briefly.
Observational study: researchers did not impose a treatment; they looked at what happened after the 2008-2009 season was over.

(c) Would you conclude from this study that playing at home with a sell-out crowd causes worse performance? If so, explain why. If not, identify a potential confounding variable, and explain how it provides an alternative explanation for why the performances in front of sell-out crowds would be worse than at games without a sell-out.
You can’t conclude that the sell-out crowds caused the worse performances since this is an observational study. Another variable recorded for these data was whether or not the opponent had a winning record the previous season. Of the Thunder’s 41 home games, 22 were against teams that won more than half of their games.
Let’s refer to the opponents in those 22 games as strong opponents. Of those 22 games, 13 were sell-outs. Of the 19 games against opponents that won less than half of their games (weak opponents), only 5 were sell-outs. The confounding variable was the strength of the opponent (or the opponent’s winning record from the previous season); sell-out crowds tended to be against stronger opponents.

The possible presence of confounding variables is the reason that association alone does not justify a conclusion that differences in the explanatory variable cause differences in the response variable.

Turn to page 42 and use the Watch Out to answer the following questions:
1. Even though we can’t establish a cause-and-effect relationship from an observational study, how can it still help researchers?
Researchers can establish a relationship between variables, which could also point to other factors to investigate.
2. When suggesting a confounding variable, what should you clearly link it to?
Both the explanatory variable and the response variable.

** Turn to page 49 in your textbook and complete Exercises 3-20 and 3-21 **

Work for Oct 19/20 starts here:

Activity 4-1: Sampling Words

Consider the population of 268 words in Lincoln’s Gettysburg Address, which is on the following page.

a. Select a sample by circling 10 words that you believe to be representative of this population of words.
Answers will vary. The answers given below are one example.

b. In the chart, record which words you selected and the number of letters in each word.
[Table: completed sample tables of selected words and number of letters]

c. Create a dotplot of your sample results below (number of letters in each word). Also indicate what the observational units and variable are in this dotplot. Is the variable categorical or quantitative?
[Dotplot based on the sample given above]
Observational units: Words from the Gettysburg Address
Variable: Number of letters per word
Type: Quantitative

d.
Calculate the average (mean) number of letters per word in your sample. Is this number a parameter or a statistic? Also identify (in words) the other value.
The average number of letters per word is five. This number is a statistic. The parameter would be the average number of letters per word in the entire Gettysburg Address.

e. Combine your sample average with those of your classmates to produce a dotplot of sample averages. Be sure to label the horizontal axis appropriately.
Answers will vary.
[Recording table: student (1-30) and mean number of letters/word (sample mean)]
[Sample dotplot of 10 responses with the mean marked; horizontal axis: mean number of letters/word (sample mean). Your dotplot should have more dots.]

f. Indicate what the observational units and variable are in the dotplot in part e. [Hint: To identify the observational units, ask yourself what each dot on the plot represents. The answer is different from the answer for part c.]
Observational units: Samples of 10 words
Variable: Average number of letters per word

g. The average number of letters per word in the population of all 268 words is 4.29 letters. Mark this value on the dotplot in part e. How many students produced a sample average greater than the actual population average? What proportion of the students does this represent?
In this example, 8/10 or 0.8 of the students produced a sample average greater than 4.29 letters per word.

h. Would you say that this sampling method (asking people simply to circle 10 representative words) is biased? If so, in which direction? Explain how you can tell this from the dotplot.
Yes, this sampling method appears to be biased. It appears to overestimate the population mean.
This is evident from the dotplot because it is centered at about 5.7 (rather than 4.29), and it indicates that a large proportion of the class selected samples with averages greater than 4.29. i. Suggest some reasons why this sampling method turns out to be biased. Your eyes are most likely drawn to the longer words, and you tend to overlook the short, common words such as a, and, is, and or. Thus when you try to choose representative samples, you do not select enough short words in your sample. j. Consider a different sampling method: Close your eyes and point to the page 10 times to select the words for your sample. Explain why this method would also be biased towards overestimating the average number of letters per word in the population. If you use this method you would also be likely to select too many long words in your sample because the long words take up more space on the page and therefore have a greater chance of being selected when you blindly point to a word. k. Would using this same sampling method with a larger sample size (say, 20 words) eliminate the sampling bias? Explain. No, increasing the sample size will not make up for the biased sampling method. You will just end up with a larger sample of long words. l. Suggest how you might employ a different sampling method that would be unbiased. You need to employ a truly random method to select the words. You could write each word on the same size slip of paper, put each slip in a hat, mix them thoroughly, and then draw ten slips from the hat. Or number each word and then randomly generate numbers to choose a random sample. Activity 4-2: Sampling Words a. Use the Random Digits Table (Table 1 in the back of the textbook or page 669 in the electronic textbook) to select a simple random sample of 5 words from the Gettysburg Address. Do this by starting in the table at any point (it does not have to be at the beginning of a line) and reading off the first 5 three-digit numbers that you come across. 
(When you get a number greater than 268, skip it and move on. If you happen to get repeats, keep going until you have 5 different numbers between 001 and 268.) Record the random digits that you selected, the corresponding words (on page 58 in the textbook), and the lengths (number of letters) of the words.
Many answers are possible. The following was obtained from the Random Digits Table, starting at the beginning of line 60:
Now find a second sample of five different words using your calculator.
Similar results as above.

b. Determine the average length in your sample of ten words (you did five with the random digits table and five with the calculator; combine those) and record it here.
Answers will vary.

c. Again, combine your sample mean with those of your classmates to produce a dotplot. Be sure to label the horizontal axis appropriately.
[Recording table for students 1-30. Columns: student; average number of letters.]
Sample dot plot of 10 responses with the mean marked and proper label. Your dotplot should have more dots.

d. Comment on how the distribution of sample averages from these random samples compares to those of your “circle 10 words” samples.
This distribution is much closer to being centered at 4.29 and has a smaller horizontal spread than the previous one did (though the latter is not always the case).

e. Do the sample averages from the random samples tend to over- or underestimate the population average, or are they roughly split evenly on both sides?
Answers will vary. The sample averages are roughly split evenly on both sides of 4.29.

f. Does random sampling appear to have produced unbiased estimates of the average word length in the population, which is 4.29 letters? Explain.
Yes, random sampling appears to have produced unbiased estimates of the average word length in the population.
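The table-reading procedure from part a of Activity 4-2 can also be mimicked in code. What follows is only a minimal sketch: it assumes Python's built-in pseudorandom generator as a stand-in for the printed Random Digits Table, and the function name `srs_from_digits` and the seed value are illustrative choices, not part of the textbook activity.

```python
import random

def srs_from_digits(population_size=268, sample_size=5, seed=None):
    """Pick distinct word numbers by reading three-digit numbers one at a time.

    Assumes population_size is at most 999 so that three digits are enough.
    """
    rng = random.Random(seed)  # stand-in for the printed random digits table
    chosen = []
    while len(chosen) < sample_size:
        # Read the next three digits as a single number from 000 to 999.
        number = rng.randrange(1000)
        # Skip 000 and any number greater than the population size.
        if number < 1 or number > population_size:
            continue
        # Skip repeats so the sample contains different words.
        if number in chosen:
            continue
        chosen.append(number)
    return chosen

# Five different word numbers between 001 and 268.
print(srs_from_digits(seed=1))
```

Because every three-digit number is equally likely to come up, every word number from 001 to 268 has the same chance of being chosen, which is the property the hand procedure relies on.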
Activity 4-3: Sampling Words
For this activity, you will be using the Sampling Words applet on the Wiley Resources. (Search for “Rossman Chance Sampling Words.”) Click “Show Sampling Options” to start.

a. Specify 5 as the Sample Size and click on Draw Samples. Record the selected words, the number of letters in each word, and the average for the 5-word sample.
Answers will vary. The following are from one particular running of the applet.
Average number of letters: 4.2

b. Click Draw Samples again. Did you obtain the same sample of words this time? Did your samples have the same average length?
You will probably not obtain the same sample of words or the same average length the second time.

c. Change the number of samples from 1 to 498 in the Number of Samples field. Then click on the Draw Samples button. The applet now takes 498 more simple random samples from the population (for a total of 500) and adds the resulting sample averages to the graph in the lower-right panel. Record the average of the 500 sample averages here:
Answers will vary. Average of the 500 sample averages is 4.31 letters per word.

d. If the sampling method is unbiased, the sample averages should be centered “around” the population average of 4.29 letters. Does this appear to be the case?
Yes, the sample averages appear to be centered “around” 4.29 letters.

e. What do you suspect would happen to the distribution of sample averages if you took a random sample of 20 words rather than 5? Explain briefly. [Hint: Comment on center (central tendency) and spread (consistency).]
Answers will vary according to student expectation.

f. Change the sample size in the applet to 20 and take 500 random samples of 20 words each. Summarize how this distribution of average word lengths compares to the distribution when the sample size was 5 words per sample.
The center of this distribution should also be near 4.29, but the horizontal spread should be much smaller.

g.
Which of the two distributions (sample size 5 or sample size 20) has less variability (more consistency) in the values of the sample average word length?
The distribution of the samples of size 20 has less variability (more consistency) in the values of the sample average word length.

h. In which case (sample size 5 or sample size 20) is the result of a single sample more likely to be close to the actual population value?
The result of a single sample is more likely to be close to 4.29 with a sample of size 20 than with a sample of size 5.

i. Would taking a larger sample using a biased sampling method tend to reduce the bias? Explain.
No, increasing the sample size when using a biased sampling method will not reduce the bias. It will only create a larger biased sample. The results from different samples will tend to be closer together but will still be centered in the wrong location (not around the parameter value of interest). If you want to reduce the bias you must change the sampling method.

Turn to page 62 and use the Watch Out to answer the following questions:
1. Sample size refers to how many observational units are in a sample. The number of samples for most of what we do in class is the number of students in our class; each student collects a sample.
2. In an actual study, you would only take one sample from a given population. As a learning tool, you have taken many samples from the same population to study how sample results vary from sample to sample.

Activity 4-4: Sampling Words
For this activity, you will be using the Sampling Words applet on the Wiley Resources. Now that you have explored the effect of sample size on sampling variability, you will investigate the effect of population size.

a. Return to the Sampling Words applet. Again, ask for 500 samples of 5 words each, and look at the distribution of average word lengths in those 500 samples. Draw a rough sketch of the resulting distribution.
The following is one example set of results.

b.
Change the setting from x1 to x4. Now the population consists of four copies of the 268 words, for a total population size of 1072 words. Predict the sample size you will need for the consistency/spread of the distribution of sample means to match part a. In other words, how much larger do you think the sample size needs to be to obtain the same level of precision?
Answers will vary by student expectation.

c. Once again, ask for 500 samples of 5 words each, and examine the distribution of average word lengths in those 500 samples. Comment on features of the resulting distribution, and compare this distribution to the one from part a.
Both distributions should be roughly symmetric, centered at about 4.29 letters with a horizontal spread from roughly 2 to 7 letters.

d. Do these distributions seem to have similar variability (consistency)?
Yes, these distributions seem to have similar variability.

e. Did, in fact, much change at all when you sampled from the larger population?
No, not much changed at all when you sampled from the larger population.

f. Read the top of page 64. How large should a population be relative to the sample size?
At least 10 times the sample size.

Turn to page 64 and use the Watch Out to answer the following questions:
1. List a few impersonal mechanisms you can use to select a random sample:
Random digits table, calculator, computer
2. If you are not working with a random sample, what can you not do confidently?
Consider your sample as representative of the population, meaning you can’t confidently generalize.
3. If you are working with a random sample, what is it reasonable to do?
Generalize results from the sample to the population.
4. If your sampling method is biased and you take a larger sample, you will not reduce the bias, and you will produce a more precise estimate that is still not close to the population value.

Work for Oct 21/22 starts here:
** Complete Activity 5-1 from page 78 in the textbook **

Activity 5-2: Testing Strength Shoes
a.
Your teacher will provide you with 12 slips of paper with names on them. Shuffle the slips of paper (name side down) and randomly deal out 6 for the strength group and 6 for the ordinary shoe group. Record the names assigned to each group in this table, along with their genders and heights (which you can find on page 80 of the textbook):
Answers will vary. Below is one example.

b. Calculate and report the proportion of men in each group. Also subtract these two proportions (taking the strength shoe group’s proportion minus the ordinary shoe group’s proportion).
Answers based on sample given in part a.
Strength shoe group: 0.667
Ordinary shoe group: 0.667
Difference (strength – ordinary): 0.000

c. Calculate and report the average height in each group. Also subtract these two averages (taking the strength shoe group’s average minus the ordinary shoe group’s average).
Answers based on sample given in part a.
Strength shoe group: 68.667 inches
Ordinary shoe group: 67.333 inches
Difference (strength – ordinary): 1.333 inches

d. Are the two groups identical with regard to both of these variables? Are they similar?
No, the two groups are not identical with regard to both of these variables but they are similar.

e. Combine your results with those of your classmates. Produce a dotplot of the differences in proportions of men. Also produce a dotplot of the differences in average heights. Be sure to label the axes of the dotplots clearly and identify the observational units in those plots.
Answers will vary from class to class. The dotplot of differences in proportions should be roughly symmetric and centered around zero. The larger the class, the more symmetric the plot should be. The horizontal axis should be labeled “difference in sample proportions” with a scale from –1 to 1, and the vertical axis should display the count/tally of each difference. The dotplot of the differences in average heights should also be roughly symmetric and centered at zero.
The horizontal axis should be labeled “difference in sample heights” with a scale from approximately –6 to 6, and the vertical axis should display the count/tally of each difference. For both plots, the observational units are the random assignments.

f. Do both dotplots appear to be centered around zero? Explain why this indicates that random assignment is effective.
Both plots should appear to be roughly centered near zero. This indicates that random assignment is effective because it is “balancing out” the proportion of men/women and the heights in both groups. In the long run, both groups are roughly the same with regard to these variables because the difference between them is zero. In particular, you have no prior suspicion that one group will have certain characteristics that differ from the other group.

It is often helpful to diagram an experiment (see page 41 of your book). The diagram of the strength shoes problem would look like this:

People
    -> Control Group: Ordinary shoes
    -> Treatment Group: Strength shoes
Compare mean heights.

The straight line after people (the observational/experimental units) and the treatment groups indicates that random assignment occurred in this experiment (as it always should). As we learn more about experimental design, our diagrams will become more detailed and complicated.

** Complete Activity 5-4 from p 84 in the textbook **

Activity 5-5: Memorizing Letters
Record the memory scores for you and your classmates. The score is the number of letters remembered in the correct order before the first mistake. Record the version the student had, too.
[Recording table for students 1-30. Columns: student; version; number of letters.]

a. Explain why this study is an experiment and not an observational study.
This is an experiment because the teacher actively imposed the treatment (grouping of letters) on each subject/student.
b. Identify and classify the explanatory and response variables in this study.
explanatory: which sequence of letters a subject was given; type: binary categorical
response: number of letters correctly memorized; type: quantitative

c. Produce a graphic showing the design of this study.

d. Explain how random assignment was implemented and why it was important in this study.
The instructor randomly decided which grouping of letters each student would receive. This was important because it prevented self-selection and controlled for confounding variables. You should not expect any differences between the two groups prior to the treatment.

e. Explain how blindness was implemented and why it was important in this study.
The students were blind to the fact that there were two different groupings of letters given out initially. They were unaware that you were trying to compare the effect of these groupings, so they could not unintentionally influence the results.

f. Create dotplots of the memory scores, comparing the two treatment groups. (Remember to label the horizontal axis.)
Answers will vary. Here is one representative set of answers.

g. Comment on whether these experimental data appear to support the conjecture that those who receive the letters in familiar 3-letter chunks tend to memorize more letters.
Yes, these data appear to support the conjecture that those who receive the letters in convenient 3-letter chunks tend to memorize more letters. The center of the JFK plot is about six letters higher than that of the JFKC plot.

h. If the JFK group does substantially better than the JFKC group, could you legitimately conclude that the grouping of letters caused the higher scores? Explain how you would respond to the argument that perhaps the good memorizers were in the JFK group and the poor memorizers were in the JFKC group.
Yes, because this was a well-designed, randomized, controlled experiment you could legitimately conclude that the grouping of letters caused the higher scores. Because you randomly assigned the students to each type of grouping, there should have been roughly an equal number of good memorizers in both the JFK and JFKC groups, so the randomization controlled for this potentially confounding variable.

Turn to page 87 and use the Watch Out to answer the following questions:
1. Random sampling aims to produce a sample that is representative of the population. Random sampling eliminates bias.
2. Random assignment aims to produce treatment groups that are similar in all respects except for the treatment imposed. Random assignment eliminates confounding.

In the previous activities, we have seen the importance of randomization and the reduction of variability. The term experimental design refers to a plan for assigning experimental units to treatment conditions. A good experimental design serves three purposes:
▪ Causation. It allows the experimenter to make causal inferences about the relationship between explanatory variables and a response variable.
▪ Control. It allows the experimenter to rule out alternative explanations due to the confounding effects of extraneous variables (i.e., variables other than the explanatory variables).
▪ Variability. It reduces variability within treatment conditions, which makes it easier to detect differences in treatment outcomes.

There are four key principles of experimental design:
• Comparison. Use a design that compares two or more treatments.
• Random assignment. Use chance to assign experimental units to treatments. Doing so helps create roughly equivalent groups of experimental units by balancing the effects of other variables among treatment groups. This can be done with a completely randomized design or a randomized block design, which is explained below.
• Control. Keep other variables the same for all groups, especially variables that are likely to affect the response variable. Control helps avoid confounding and reduces variability in the response variable.
• Replication. Use enough experimental units in each group so that any differences in the effects of the treatments can be distinguished from chance differences between groups. When the choice is up to us, we will use 30 experimental units.

The goal of randomization or random assignment is to create groups that are as similar as possible before administering treatment. If we identify a variable that is not of interest in the study but has an effect on the response variable, then we should impose an additional level of control called blocking. Using this technique, we separate the experimental units into homogeneous groups first, and make the random assignment within each group. This extra step creates treatment groups even more alike, making it easier to observe the effects of the treatment. We call the design without blocking a "completely randomized design," and we call the design that includes blocking a "randomized block design."

Work for Oct 26/27 starts here:
Blocking: Introduction by Examples
To illustrate the concepts, we use a hypothetical experiment. Suppose a new medication has been developed to treat a virus that recently appeared in tropical areas of South America. High fever is one of this disease’s symptoms. The new medication, call it Drug Z, is an anti-viral drug, designed to reduce the release of viral particles in people already infected. The drug’s manufacturer will conduct a clinical trial with 600 men and women aged 18 to 70 to test the safety and effectiveness of Drug Z. One response variable is change in body temperature 8 hours after first dose.

Example 1 – Completely Randomized Design
As people become infected and are enrolled in the clinical trial, the researchers randomly assign each to one of two treatment groups.
Subjects in Group 1 will receive Drug Z at a 325 mg dose, and subjects in Group 2 will receive the best drug available until now, call it Drug X. (It would be unethical to give subjects in Group 2 no drug at all, that is, to give them a placebo.) Each subject’s body temperature will be recorded before the first dose of medication and again 8 hours after first dose. The change in body temperature will be recorded, and the mean change for each treatment group will be calculated. Finally, the two means will be compared. The researchers hope that the observed difference between the means is too large a difference to be due to chance. The treatment of interest (Drug Z) is called an intervention and the Drug Z group is called the intervention group. This is a one-factor experiment, i.e., there is only one explanatory variable, namely the antiviral medication. Figure 1 below is an outline of this design.

Figure 1 Completely randomized design (example 1)
600 subjects -> random assignment
    Group 1: 300 subjects, Treatment 1 (New Drug)
    Group 2: 300 subjects, Treatment 2 (Old Drug)
    -> compare mean change in temperature

Completely randomized designs can be inferior to more elaborate designs, because random assignment alone may not balance every potentially confounding variable between the groups. For example, men and women respond differently to medication. In the completely randomized design in Example 1, the random assignment to treatment groups was done without regard to gender. It ignored the differences between men and women. Even though the patients were assigned by random chance to the treatment groups, it is possible that one treatment group received many more men than the other. A better design will look separately at the responses of men and women. In other words, the researchers will separate the men and the women and then randomly assign the subjects within each gender group to the different treatment groups. This is called the randomized block design.
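The difference between the two designs can be seen by carrying out both assignments in code. The following is a rough sketch rather than a prescribed procedure: the function names, the seed, and the roster of 280 men and 320 women are assumptions made for illustration.

```python
import random

def completely_randomized(subjects, rng):
    """Shuffle all subjects and split them into two equal-sized groups."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"Drug Z": shuffled[:half], "Drug X": shuffled[half:]}

def randomized_block(subjects, block_of, rng):
    """Sort subjects into blocks, then randomly assign within each block."""
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)
    # Forming the blocks is NOT random; only the split inside each block is.
    return {name: completely_randomized(members, rng)
            for name, members in blocks.items()}

rng = random.Random(0)  # illustrative seed so the run can be repeated
# Hypothetical roster of (id, gender) pairs: 280 men and 320 women (assumed).
roster = [(i, "M") for i in range(280)] + [(i, "F") for i in range(280, 600)]

crd = completely_randomized(roster, rng)
for drug, group in crd.items():
    men = sum(1 for _, gender in group if gender == "M")
    print("completely randomized,", drug, "- men:", men, "of", len(group))

rbd = randomized_block(roster, lambda subject: subject[1], rng)
for gender, groups in rbd.items():
    for drug, group in groups.items():
        print("block", gender, "-", drug, ":", len(group), "subjects")
```

In the completely randomized run, the number of men in each drug group is left to chance and changes with the seed. In the blocked run, each gender is split evenly between the two drugs by construction, so gender cannot be confounded with the treatment.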
Example 2 – Randomized Block Design
The 600 subjects are assigned to blocks, based on gender. Then subjects within each block are randomly assigned to the two treatment groups (Drug Z and Drug X). The variable of gender is called a blocking variable. Eight hours after taking the treatments, the researchers compare the change in body temperature between the treatment groups within each block. Figure 2 below outlines this randomized block design.

Figure 2 Randomized block design (example 2)
600 subjects
    280 men -> random assignment
        Group 1: 140 subjects, Treatment 1 (New Drug)
        Group 2: 140 subjects, Treatment 2 (Old Drug)
        -> compare mean change in body temperature
    320 women -> random assignment
        Group 3: 160 subjects, Treatment 1 (New Drug)
        Group 4: 160 subjects, Treatment 2 (Old Drug)
        -> compare mean change in body temperature

Note that assignment to blocks is not random assignment! The randomized block design in this example is an improvement over the completely randomized design in Example 1. In both Example 1 and Example 2, comparison of treatment groups is used to implicitly prevent confounding. However, the randomized block design in Example 2 explicitly controls the variable of gender.

Example 3 – Randomized Block Design with Additional Groups
Suppose that the researchers in our hypothetical drug trial were not sure of the best dosage for Drug Z. Perhaps 500 mg is an effective dose; perhaps a 325 mg dose works just as well. The researchers must test both dosages. If they continue to use gender as a blocking variable, how many treatment groups will they require? (We say that the factor (Drug Z) has two levels (325 mg and 500 mg).)

Summary – Randomized Block Design
A block is a group of experimental units that are known, prior to the experiment, to be similar with respect to some variables that are expected to affect the response to the treatments. In the randomized block design, the random assignment to treatments is carried out separately within each block.
Blocks are another form of control. The block design controls the variables that are used to form the blocks (these variables are called the blocking variables). In Example 2, the blocking variable is gender.

Conclusion
One important advantage of experiments over observational studies is that well designed experiments can provide good evidence for causation. In an experiment, an intervention (Drug Z in our examples) is applied to enough experimental units to ensure that the results of the experiments will not be dependent on chance variation (the principle of replication). The experimental units are randomly assigned to an intervention group (Drug Z) and a control group (Drug X). This refers to the principles of randomization and control, which help reduce the potential of bias and prevent confounding by increasing the chance that confounding variables will operate equally on the intervention group and the control group. Then the only difference between the two groups is the intervention. When the intervention group experiences favorable results, we can be confident that the intervention makes the difference.

References
1. Moore, D. S., McCabe, G. P., Craig, B. A., Introduction to the Practice of Statistics, 6th ed., W. H. Freeman and Company, New York, 2009.
2. Design of Experiments. (March 9, 2011) https://introductorystats.wordpress.com/2011/03/09/design-of-experiments/

Note: "Blocks" may seem like an odd term to refer to groups that share a common characteristic. The term has its origin in agricultural experiments, where the experimental units were plots of land that had been subdivided into "blocks."

Exercises:
For each experiment in problems 1 and 2 below, you are to create a diagram that describes your process in performing the experiment. All of them will be done with a completely randomized design and then a blocking design.

1. You believe that self-checkouts in supermarkets are faster than traditional checkouts with a cashier.
Design an experiment that uses random assignment. It is further thought that the age of the customer affects time to check out. Design an experiment that also uses age as a blocking variable.

Completely randomized design (Grocery Store Experiment)
200 subjects -> random assignment
    Group 1: 100 subjects, Treatment 1 (Regular Cashier)
    Group 2: 100 subjects, Treatment 2 (Self Checkout)
    -> compare mean difference in check out times

Figure 2 Randomized block design by Age
Subjects
    100 below 40 -> random assignment
        Group 1: 50 subjects, Regular Cashier
        Group 2: 50 subjects, Self Check Out
        -> compare mean check out time
    100 over 40 -> random assignment
        Group 3: 50 subjects, Regular Cashier
        Group 4: 50 subjects, Self Check Out
        -> compare mean check out time

2a. A study is to be conducted of the effectiveness of a new diet called Fatbegone. The treatment group will go on the new diet for a period of three months. The control group will not receive any information about Fatbegone. Instead they will have weekly counseling sessions on topics such as healthy eating habits, exercise, sleep, etc. Each person’s weight will be recorded at the beginning of the study and at the end of three months. The change in weight for each subject will then be recorded. One hundred adults are available for the study. Describe or create a graphic for a completely randomized design.

Completely randomized design (Weight Loss Experiment)
100 subjects -> random assignment
    Group 1: 50 subjects, Treatment 1 (Fatbegone Diet)
    Group 2: 50 subjects, Treatment 2 (Counseling on Healthy Lifestyle)
    -> compare mean change in weight after three months

2b. Suppose researchers believe that a person’s response to the new diet is affected by how much overweight the person is to begin with. The researchers have determined that 40 of the subjects are slightly overweight, 44 of the subjects are moderately overweight, and 36 of the subjects are extremely overweight. Explain how you would design an experiment that blocks on initial weight.
Subjects will initially be grouped into three groups according to their weight: slightly overweight, moderately overweight, and extremely overweight. Each of these groups will be randomly divided into two groups according to treatment (Fatbegone or Counseling). There will be 6 treatment groups.

Extra Practice: Blocking Homework
1. When students take math exams, the problems are usually in order of difficulty with easier problems first and more difficult problems towards the end. Does order of difficulty make a difference? Ninety-six Algebra 2 students taught by the same teacher are part of an experiment.
a. Design a randomized experiment where students take exams whose problem difficulty ranges from easy to hard, hard to easy, or completely randomized.
b. Suppose the researcher believes the response to order of difficulty depends on the student’s grade going into the exam, and she has classified each student as A/B or C/D or failing. Describe a design for this experiment that blocks on “current grade.” How many treatment groups will be required? What is a likely problem with this design?
There will be 9 groups. Students will first be placed into three blocks by current grade: A/B, C/D, or failing. Within each block, students will be randomly assigned to the same three exam orders as in part a. These blocks could be very disparate in numbers. For example, at our school there will be very few failing grades.

2. A medical study of heart surgery investigates the effect of a drug called a beta-blocker on the pulse rate of the patient during surgery. The pulse rate will be measured at a specific point during the operation. The investigators will use 20 patients facing heart surgery as subjects. You have a list of these patients, numbered 1 to 20, in alphabetical order.
a. Describe the design of a completely randomized, controlled experiment to test the effect of beta-blockers on pulse rate during surgery. Include a diagram.

Figure 1 Completely randomized design (Pulse Rate during Surgery)
20 subjects -> random assignment
    Group 1: 10 subjects, Treatment 1 (Beta Blocker)
    Group 2: 10 subjects, Treatment 2 (Control)
    -> compare mean change in pulse rate

b. Describe how you will use the random number table to select your subjects. Then, use the section from the random digits table below to carry out the randomization required by your design and list the outcome of the randomization. Begin with the first line and continue across. Mark on the digits to help explain your procedure.
One method is to use the random number table to choose ten 2-digit numbers from 01 to 20, ignoring repeats. The patients with these numbers will receive the beta-blocker during their operation. The remaining 10 people will act as a control group and will not receive the beta-blocker. Measure the pulse rate of all patients at the specified point in the operation and compare the difference in the mean pulse rate for the two groups.
This requires a lot of numbers. Since the subjects are already numbered, we can instead devise a system where we essentially toss a coin. If we come to an even digit, the subject will be assigned to Treatment Group 1 (Beta Blocker); if we get an odd digit, the subject will be assigned to Treatment Group 2 (Control). Once one treatment group is filled, we will put the remaining subjects in the other treatment group. This means we will need no more than 20 digits.

Subject   Digit   Treatment
1         1       Control
2         9       Control
3         2       Beta Blocker
4         2       Beta Blocker
5         3       Control
6         9       Control
7         5       Control
8         7       Control
9         3       Control
10        4       Beta Blocker
11        0       Beta Blocker
12        5       Control
13        7       Control
14        5       Control
15        -       Beta Blocker
16        -       Beta Blocker
17        -       Beta Blocker
18        -       Beta Blocker
19        -       Beta Blocker
20        -       Beta Blocker

c. Who is assigned to each group?
Group 1 (Beta Blocker) includes subjects 3, 4, 10, 11, 15, 16, 17, 18, 19, 20.
Group 2 (Control) includes subjects 1, 2, 5, 6, 7, 8, 9, 12, 13, 14.

d. Suppose you suspect that gender affects the response to the drug.
You now receive two lists, 10 men and 10 women, who will be used as subjects. Describe a randomized block design that takes gender into account.
Block first according to gender. Then create two treatment groups for each block and randomly assign 5 in each treatment group.

Work for Oct 28/29 starts here:
Activity 11-2: Random Babies
Turn to page 230 to get help completing this activity.

a. How many different arrangements are there for returning the four babies to the four mothers?
24

b. For each of these arrangements, indicate how many mothers get the correct baby. The first column has already been entered.

1234 __4__    2134 __2__    3124 __1__    4123 __0__
1243 __2__    2143 __0__    3142 __0__    4132 __1__
1324 __2__    2314 __1__    3214 __2__    4213 __1__
1342 __1__    2341 __0__    3241 __1__    4231 __2__
1423 __1__    2413 __0__    3412 __0__    4312 __0__
1432 __2__    2431 __1__    3421 __0__    4321 __0__

c. How many of these arrangements result in 0 matches, 1 match, and so on? Record your answers in the first row of the table.

Number of Matches         0       1       2       3    4       Total
Number of Arrangements    9       8       6       0    1       24
Exact probability         .3750   .3333   .2500   0    .0417   1.00

d. Calculate the theoretical probabilities by dividing your answers to part c by your answer to part a. Record your answers in the bottom row of the table.

e. Comment on how closely the empirical probabilities from your class simulation analysis in the previous unit approximated these probabilities. Then comment on how closely the applet simulation analysis, using a greater number of repetitions or trials, approximated these probabilities.
The empirical probabilities from this class are reasonably close to these theoretical probabilities. The applet simulation probabilities are even closer to the theoretical probabilities.

f. For your class simulation results from the previous activity, calculate the average (mean) number of matches per repetition of the process.
[Hint: Multiply each number of matches outcome by the number of occurrences for that number, sum those products, and then divide by the total number of trials.]
Results will vary.

g. Calculate the expected value for the number of matches from the probability distribution, and compare that value to the average number of matches from the simulated data found in part f.
E(X) = 0(.375) + 1(1/3) + 2(.25) + 3(0) + 4(1/24) = 1 match

h. What is the probability that the result of this random process equals this expected value? Based on this probability, would you say that you “expect” this outcome to occur most of the time? Explain.
The probability that any trial results in 1 match is 1/3. So we don’t expect this outcome “most” of the time. After many, many trials, however, we “expect” the average number of matches to be very close to 1.

Turn to page 232 and use the Watch Outs (there are 2) to answer the following questions:
1. An expected value is interpreted as the long-run average value of a numerical random process.
2. Many people fall into the trap of believing that probabilities should also hold in the short run. Remember, probability is a long-term property.

Activity 11-3: Family Births
Turn to page 232 to get help completing this activity.
Suppose a couple has two children. Assume that each child is equally likely to be a boy or a girl, regardless of the outcome of previous births.

a. Suppose someone argues that the couple is guaranteed to have one boy and one girl, because that’s what a 0.5 probability means: 50% should be boys and 50% should be girls, and 50% of two children is one child. Do you believe this argument? (Do you know of any two-child families with two boys or two girls?) How would you respond to this argument to help the person see his/her faulty reasoning?
Probability does not have a memory. With each birth, there is a 50% chance of a girl, and that probability does not change after the first child is born.
If the first child is a girl, the probability that the second child is a girl is still ½, not zero. This probability applies “in the long run”: about half of all children born will be girls, but that is not true for every family.

b. Now, suppose someone else claims that there are three possible outcomes for this family: two boys, two girls, or one of each. Equal likeliness therefore establishes that the probability of each of these outcomes is one-third. Do you believe this argument? If so, why? If not, how would you respond to it?
There are four equally likely outcomes (not three) for a family with two children. The outcomes are: Boy first and then Boy; Boy first and then Girl; Girl first and then Boy; and Girl first and then Girl. So, the probability the couple has one boy and one girl is 2/4 or 50%, and the probability they have 2 children of the same gender (BB or GG) is also 2/4 or 50%.

c. Use the Random Digits Table (Table I) found at the back of the book to simulate the children’s genders for four families of two children each. Start at any line and let each even digit represent a girl and each odd digit represent a boy. Record the random digits and the corresponding genders in the table:
Answers will vary. The following were obtained using row 31 of the Random Digits Table:

d. Continue until you have simulated the gender breakdowns for a total of 20 two-child families. Record how many and what proportion you have of each type.
The following table indicates the gender breakdowns for a total of 20 two-child families:

e. Based on your (very small) simulation analysis, does it appear that the probability of each of these three outcomes is one-third? Explain.
Based on this simulation, it does not appear that the probability of each of these outcomes is ⅓. It appears that the probability of one child of each gender is about twice that of two girls or two boys.

f. How could you obtain better empirical estimates of these probabilities?
You could obtain a better empirical estimate of these probabilities if you simulate more families.

g. Combine your simulation results with those of your classmates:
Here are the results from one class simulation:

h. Based on these (more extensive) simulation results, does it appear that the probability of each of these outcomes is one-third? Explain.
It does not appear that the probability of each of these outcomes is ⅓. It appears that the probability of one girl and one boy is roughly .5, whereas the probability of having two girls or of having two boys is about .25.

Because each of the two children is equally likely to be a boy or girl, the correct way to list the sample space of equally likely outcomes is {B1B2, B1G2, G1B2, G1G2}, where the subscript indicates the first or second child born. These four outcomes are equally likely to occur.

i. Use this sample space to determine the exact probabilities of a two-child couple having two girls, two boys, or one of each. Are these probabilities reasonably close to the empirical estimates from your class simulation?
Two girls: P(GG) = ¼ = .25
Two boys: P(BB) = ¼ = .25
One of each: P(BG or GB) = ½ = .5
Yes, these probabilities are reasonably close to the empirical estimates from the class simulations.

Work for Nov 3-5 starts here:

Simulations
When what we want to investigate is not easy to carry out (costly, dangerous, unnecessary, etc.), we can create a simulation to run the experiment. Remember the seven steps below:

Steps in Creating a Simulation
1. Identify the real-world activity that is to be repeated.
2. Link the activity to one or more random numbers.
3. Describe how you will use the random number assignment to complete a full trial.
4. State the response variable.
5. Run several trials.
6. Collect and summarize the results of all the trials.
7. State your conclusion.

We will illustrate the creation of simulations utilizing these steps for the following examples.

1.
Look back at Activity 11-1 from page 226 in the book and write out how the first 4 simulation steps were used:
   1. Returning newborn babies to their mothers.
   2. 1 – Jerry Johnson, 2 – Marvin Miller, 3 – Sam Smith, 4 – Willy Williams
   3. Randomly generate the numbers 1-4 with no repeats. Generating the list of four numbers is one trial. A number in the correct place in the randomly generated list indicates the baby and mom were matched correctly. A number in the incorrect place in the randomly generated list indicates the baby and mom were matched incorrectly. For example, if the randomly generated list is 1, 2, 3, 4, all babies were matched correctly, but if the list is 1, 3, 2, 4, only babies 1 and 4 were matched correctly.
   4. Number of correct matches (0, 1, 2, or 4). This is a quantitative variable.

2. Explain how you could conduct a simulation to determine the probability of these situations:

a. Guessing the correct answer on at least 7 out of 10 true/false questions.
   1. Guessing the correct answer on at least 7 out of 10 True/False questions
   2. 0 – incorrect guess, 1 – correct guess
   3. Randomly generate a list of ten numbers, consisting of only 0 and 1. Each list of ten numbers is one trial. If the list contains seven or more 1s, it simulates at least 7/10 correct answers. If the list contains fewer than seven 1s, it simulates fewer than 7/10 correct answers.
   4. At least 7/10 correct answers (yes, no). This is a binary categorical variable.
   5. Run at least 10 trials.
   6. Make a table to summarize results; yes or no for each trial.
   7. State the conclusion: the proportion of trials that were yes estimates the probability of guessing at least 7 out of 10 True/False questions correctly.

b. Choosing a yellow tulip bulb from a bin if one in six of the bulbs in the bin is yellow.
   1. Choosing a yellow tulip bulb from six bulbs if one is yellow
   2. 1 – yellow bulb, 2-6 – not yellow bulb
   3. Randomly generate a number from 1-6. Each number generated is one trial.
If the number is 1, a yellow bulb was chosen. If the number is not 1, a yellow bulb was not chosen.
   4. Yellow (yes, no). This is a binary categorical variable.
   5. Run at least 10 trials.
   6. Make a table to summarize results; yes or no for each trial.
   7. State the conclusion: the proportion of trials that were yellow estimates the probability of picking a yellow bulb from six bulbs when one is yellow.

3. What is the probability of scoring 80% or better on a five-question true/false quiz if you guess at every answer?

a. Design a simulation using your calculator that enables you to estimate this probability.
   1. Guessing the correct answer on at least 4 out of 5 True/False questions
   2. 0 – incorrect guess, 1 – correct guess
   3. Randomly generate a list of five numbers, consisting of only 0 and 1. Each list of five numbers is one trial. If the list contains four or more 1s, it simulates scoring 80% or better. If the list contains fewer than four 1s, it simulates scoring less than 80%.
   4. At least 80% (yes, no). This is a binary categorical variable.
   5. Run at least 10 trials.
   6. Make a table to summarize results; yes or no for each trial.
   7. State the conclusion: the proportion of trials that were yes estimates the probability of scoring at least 80% on a True/False quiz by guessing.

b. Run 20 trials of your simulation and record the number of times you score 80% or better.
Answers will vary.

c. Calculate and interpret the experimental probability of scoring 80% or better if you guess every answer.
P(≥ 80%) = (number of trials in which at least four 1s occur) / 20
“Based on the simulation, I estimate that I will score 80% or higher on the quiz (your answer) of the time on average.”

4. a. Use your graphing calculator to conduct a simulation of tossing a coin 100 times. Start by writing out the first four steps of the simulation process. Complete a chart like the one below using your results. Do 10 trials at a time and record the cumulative frequencies.
   1. Tossing a coin
   2. 0 – heads, 1 – tails
   3.
Randomly generate a 0 or a 1. Each randomly generated number is a trial. If the number is a 0, the coin landed heads up. If the number is a 1, the coin landed tails up.
   4. Heads up (yes, no). This is a binary categorical variable.

Number of Coin Flips       10    20    30      40      50    60    70       80       90    100
Total Number of Heads      5     10    17      21      27    33    36       41       45    50
Experimental Probability   50%   50%   56.7%   52.5%   54%   55%   51.43%   51.25%   50%   50%

b. What happens to the observed probability as the number of trials increases?
The probability gets closer to 50%.

5. a. Use your graphing calculator to conduct a simulation of drawing a queen from a standard deck of cards. The card is replaced after each draw. Start by writing out the first four steps of the simulation process. Complete a chart like the one below using your results to help determine the experimental probability of drawing a queen. Do 10 trials at a time and record the cumulative frequencies.
   1. Drawing a queen
   2. 1 – queen, 2-13 – not queen OR 1-4 – queen, 5-52 – not queen
   3. Randomly generate a number from 1 to 13. Each randomly generated number is a trial. If the number is a 1, the card is a queen. If the number is 2-13, the card is not a queen. Since the probability is the same, using 1-13 with only 1 for the queen card works just as well as using 1-52 with 1-4 as queen cards. It is easier to look for just 1 than to look for 1-4.
   4. Queen (yes, no). This is a binary categorical variable.

Number of Cards Drawn           10    20    30       40   50   60   70       80      90       100
Total Number of Queens Drawn    1     2     2        2    2    3    3        3       3        4
Experimental Probability        10%   10%   6.667%   5%   4%   5%   4.286%   3.75%   3.333%   4%

b. Use your results to determine the experimental probability of drawing a queen.
P(queen) = (number of times a queen occurred) / (number of trials) = 4/100 = 0.04 or 4%

6. The following spreadsheet shows the results of a simulation experiment in which three coins were tossed simultaneously.

a.
Explain how a collection of randomly generated ones and zeros can simulate the experiment described above.
0 = tails
1 = heads

b. Suppose that 1 represents a head and 0 represents a tail. Based on the results of this simulation, what is the experimental probability of getting three heads in one toss of the three coins?
P(3 heads) = (number of times 3 heads occurred) / (number of trials) = 2/18 = 1/9 ≈ 0.111 or 11.1%

c. Describe what you would do to get a better estimate for the probability in part (b).
Increase the number of trials.

d. Explain how this experiment could also be used to simulate the sequence of the gender of three children born to a given family.
0 = male, 1 = female

e. Describe the advantages of simulating this experiment instead of repeatedly tossing coins.
Faster
Better organized
Easy to record data

Extra Practice: Simulation Homework

1. There is a new set of cards of famous mathematicians included in Andromeda candy bars. The set includes Conway, Descartes, Euler, Fermat, and Gauss. You wish to collect a complete set and wish to estimate how many candy bars you must purchase in order to get the complete set. The five cards were not uniformly produced. The distribution is as follows: Conway (10%), Descartes (10%), Euler (20%), Fermat (30%), and Gauss (30%). Create a simulation to estimate how many candy bars you should purchase based on the seven steps outlined above.

a. What is the real-world activity that is being simulated?
Purchasing a candy bar and identifying which card is in the wrapper.

b. What mathematician should you assign to each group of numbers to represent the distribution?
0: Conway
1: Descartes
2,3: Euler
4,5,6: Fermat
7,8,9: Gauss

c. What will make up one full trial of this simulation?
A full trial will be the set, S, of numbers necessary to collect a complete set of five different cards.

d. What is the response variable? In other words, what are we collecting from each trial?
The response variable is the size of each set S, n(S), described in part c.

e. Above is an excerpt from a random number table. (Calculator or computer simulations are also possible.) We will begin reading on the third line (the letter next to each number records the mathematician):
5F 0C 3E 4F 9G 7G 1D
A complete trial required only 7 candy bar purchases. The response in this case is 7. Do you believe this is typical?
No; we know that there is variability between samples.

f. Continue reading in the table for the second trial:
1D 4F 6F 9G 7G 6F 6F 8G 8G 6F 5F 2E 3E 8G 5F 6F 7G 6F 1D 0C
This trial required the purchase of 20 candy bars. The response in this case is 20. Conduct an additional 3 trials and record the results below. Include evidence of the trial and the response.
0 0 5 0 8 2 1: required 7 purchases
6 2 6 1 1 5 3 7 0: required 9 purchases
5 9 0 6 0 2 4 9 7 2 9 1: required 12 purchases

g. Once all 5 trials are complete, the results can be collected and summarized. What measure would be a good summary measure for this simulation? Explain. Write a summary statement.
For these five trials, the mean number of candy bar purchases required in order to collect all 5 in the set is 11. The median is 9. Because it may occasionally be necessary to purchase a large number of bars in order to complete the set, which pulls the mean up, the median may be the better estimate.

2. A field-goal kicker for a high school football team has an 80% success rate based on his attempts this year. Design and describe a simulation that will help you determine the experimental probability that he might miss three field goals in a row.
   1. Field-goal kicker with an 80% success rate misses 3 kicks in a row
   2. 0-7 success, 8-9 miss
   3. Randomly generate a list of 3 numbers from 0-9. Each list of 3 numbers is one trial. If the list has all 8s and 9s, the kicker missed three in a row. If the list has any number other than 8 or 9, the kicker didn’t miss all three.
   4.
Three missed kicks (yes, no). This is a binary categorical variable.
   5. Run at least 10 trials.
   6. Make a table to summarize results; yes or no for each trial.
   7. State the conclusion: the proportion of trials that were yes estimates the probability of missing three kicks in a row if the kicker has an 80% success rate.

3. Ten percent of the keyboards a computer company manufactures are defective. Design and describe a simulation that will determine the experimental probability that one or more of the next three keyboards to come off the assembly line will be defective.
   1. Getting one or more defective keyboards out of three if 10% are defective overall.
   2. 0 = defective, 1-9 = not defective
   3. Randomly generate a list of 3 numbers from 0-9. Each list of 3 numbers is one trial. If the list has at least one 0, that simulates getting at least one defective keyboard. If the list has no 0s, then there are no defective keyboards.
   4. Defective keyboard (yes, no). This is a binary categorical variable.
   5. Run at least 10 trials.
   6. Make a table to summarize results; yes or no for each trial.
   7. State the conclusion: the proportion of trials that were yes estimates the probability of getting at least one defective keyboard out of three when overall 10% are defective.

4. Design a simulation that will allow you to estimate the probability of having five boys in a family of five children. Carry out the simulation and estimate the probability of this event. Complete at least 25 trials of the experiment.
   1. Having five boys in a family of five children.
   2. 1 = boy, 2 = girl
   3. Randomly generate a list of five numbers from 1-2. Each list of five numbers is one trial. If the list is all 1s, the family has five boys. If the list has any 2s, the family doesn’t have five boys.
   4. Five boys (yes, no). This is a binary categorical variable.
   5. Run 25 trials.
   6. Make a table to summarize results; yes or no for each trial.
   7.
State the conclusion: the proportion of trials that were yes estimates the probability of having five boys in a family of five children.

Multiple Choice Guessing Experiment
In an experiment to determine whether people can do better than guessing on a random multiple-choice question, participants are asked to read and answer a given question with four answer choices. Participants do not need to have any outside knowledge of the topic the question is related to, but on average should still perform better than simply guessing which is the correct answer. Suppose that an experiment consists of thirty of these trials.

Part 1: Preliminary Analysis and Simulation

1. If a participant does not have any idea what the correct answer is and therefore guesses each time, what proportion would he or she get correct in the long run?
1/4

2. Describe how you could use a spinner to simulate this experiment over and over for a person who just guesses for each of the thirty trials.
Use a spinner with four equal sections. Decide which section of the spinner represents the correct answer. Record which section the spinner stops at for 30 spins. Calculate the proportion of spins that landed in the section representing the correct answer.

3. Translate your simulation to a calculator exercise and run the simulation 30 times. What was your result?
RandInt(1, 4, 30)→L1:SortA(L1)
Then look at L1 to find how many 1s were generated out of 30 trials if 1 represents guessing the correct answer.

4. Combine your results with the rest of the class. How many times did each result happen (frequency)?
Answers will vary. We “expect” 7-8 correct guesses out of 30, but we also anticipate variability in the simulated data. Here are the resulting proportions from an example series of 25 trials.

Number of correct guesses   5     6     7     8     9     10    11    12
Proportion correct          .17   .20   .23   .27   .30   .33   .37   .40
Frequency                   5     5     4     2     3     3     0     3

Graph is the sample frequency distribution.

5.
If a person were to get 30% correct in this experiment, would you be fairly convinced that he or she does better than just guessing? Explain clearly, using the results of your simulation.
No. In a series of 25 simulated trials, there were 9 trials in which the person had 30% or more correct “guesses.” Something that occurs in 9 out of 25 trials just by chance is not unusual, so there’s no reason to think a person with 9 correct decisions isn’t just guessing.

6. If a person were to get 50% correct in this experiment, would you be fairly convinced that he or she does better than just guessing? Explain clearly, using the results of your simulation.
Yes. In a series of 25 simulated trials, there were 0 trials in which the person had at least 50% correct “guesses.” A person who makes 15 or more correct decisions has done something that isn’t likely to happen just due to chance. It’s more likely that the person has some idea what the correct answer is.

Part II: The Experiment

1. The class will take a multiple-choice “quiz” to see if students can do better than just guessing. Each student will read the question and choose what they think is the correct answer without any assistance.

As we will learn in more detail in an upcoming unit, it is important to identify the hypotheses of an experiment. The first is the null hypothesis and it is denoted by H0. It states that the parameter of interest (the value we are studying; proportion for categorical data and mean for quantitative data) is equal to a specific, hypothesized value. In the context of a population proportion, H0 has the form:
H0: p = p0,
where p is the population proportion of interest and p0 is replaced by the conjectured value of interest. The null hypothesis is typically a statement of "no effect" or "no difference." The test of significance is designed to measure the strength of evidence against the null hypothesis.

7.
In the present context, the null hypothesis is that the subject is just guessing. Change this verbal statement into a null hypothesis.
H0: p = 0.25

8. If a student has some knowledge of the subject of the multiple-choice question and answers without guessing, then the proportion of correct answers the student gives will, in the long run, exceed what proportion?
In the long run, the proportion of correct answers will exceed .25.

The second hypothesis that we identify is the alternative hypothesis, denoted by Ha. It states what the researchers or pollsters suspect or hope is true about the parameter of interest. In this case, if we believe that the students can do better than guessing on the multiple-choice question, then our alternative hypothesis is:
Ha: p > p0
There are other forms of the alternative hypothesis that we will study at a later time.

9. Write the alternative hypothesis:
Ha: p > 0.25

10. Would a person who is just guessing always guess correctly on 1/4 of the trials? What phenomenon is this?
No. This is an example of sampling variability.

11. Go to the “quiz” linked above and answer the multiple-choice question. Report to the class spreadsheet if you got it correct or not.
Answers vary.

12. Is this sufficient evidence to convince you that your class can clearly do better than guessing on a multiple-choice question? Explain.
Answers vary; the explanation should refer to the results of the class simulation.

In our upcoming unit, we will learn what “convincing” evidence is. We will also learn more about what type of data we are collecting, but in general there are three ways we will look at data:
▪ Collect a single piece of information from each observational unit, which leads to one sample. From that sample we will calculate a sample mean and sample standard deviation. We will then use those statistics to decide whether our sample shows us something meaningful or significant or convincing. This will be a 1-sample test.
▪ Collect two independent samples, still collecting a single piece of information from each observational unit, but there are now two groups. From that we will calculate two sample means and two sample standard deviations. We will then use those statistics to decide whether there is a meaningful/significant/convincing difference between the two groups. This will be a 2-sample test.
▪ Collect two pieces of information from each observational unit and calculate a difference in the observations, which leads to one sample. Two common examples of this type of data collection are a pair of observational units with similar characteristics or qualities, such as a couple, and a single person observed before and after a treatment. The differences calculated between the pairs are the sample we analyze. We find the sample mean and sample standard deviation of the differences to decide if there is a meaningful/significant/convincing result. This will be a matched pairs or paired test and will be conducted like the 1-sample test.

Deciding if you have two independent samples or matched pairs is something that you will practice in Unit 5. You may need to decide this about your project data though, so you can look at Topics 22 and 23 in the book to help. Or you can ask your teacher for help with this concept as it relates to your project data. Here are a few examples to think about from earlier problems in RS1:

Example: Decide if each situation uses matched pairs or creates two samples.
a. From Activity 1-16: “Smokers who participated in the study were randomly assigned to receive either the nicotine lozenge being tested or a placebo lozenge”
Two samples – there are different smokers randomly assigned to each treatment group
b. From Activity 7-4: “States are classified by whether they are east or west of the Mississippi River”
Two samples – the states are either east of the river or west of the river
c.
From Activity 5-5: “Every person received the same sequence of 30 letters, but they were presented in two different groupings”
Two samples – each student only received one of the groupings of 30 letters to try to memorize

If you thought those were all two-sample examples, you are correct! Matched pairs are less common in the work we do in RS1, so if you think your experiment is matched pairs, please check with your teacher to make sure you are doing everything correctly.

Extra Practice: Review Problems

True/False
_____1. In a simple random sample, every member of the population has an equal chance of being included in the sample.
_____2. In a stratified sample, strata are designed so that members in each stratum are heterogeneous, i.e., mixed and unalike.
_____3. The purpose of an experiment is to determine the effect of the independent response variable on a dependent explanatory variable.
_____4. A control group generally consists of observational units who are not to receive the treatment that is the focus of the experiment.
_____5. It is fair to say that in every experiment, there are lurking variables; in some cases, they may be a source of confounding while in other cases, they may not.
_____6. An experiment is blind if neither the observational units nor the person administering the treatment knows which units are in the control group and which units are in the treatment group.
_____7. When an investigator expects that one specific characteristic of the experimental units will likely affect the results of the experiment, a block design is appropriate.
_____8. An observational study can produce causal results whereas an experiment can only identify an association.
_____9. A statistic is said to provide unbiased estimates of a population parameter if values of the statistic from different random samples are centered at the actual parameter value.
_____10.
Sampling variability refers to the fact that the values of sample statistics vary from sample to sample.
_____11. Sample statistics from smaller samples are more precise and closer together than those from larger samples.
_____12. The tendency of a sample statistic refers to how much the values vary from sample to sample.
_____13. As long as the population is at least 10 times as large as the sample size, the precision of a sample statistic depends on the sample size, not the population size.
_____14. Taking a larger sample reduces bias.
_____15. Comparison groups are especially important in medical studies because subjects often respond positively simply to being given a treatment. This phenomenon is known as the controlled factor effect.

Short Answer

16. The owner of a club with 1000 members wants to survey 50 members about the friendliness of the staff. Three sampling methods are described below. One is a simple random sample, one is a convenience sample, and one is a voluntary response sample. Which is which?
a. Ask the first 50 members who enter the club one morning.
b. Leave a stack of response cards by the sign-in desk with a sign asking members to participate.
c. Put each name on a single slip of paper. Place all of the slips in a hat and mix well. Draw one slip out and note the name. Continue picking and noting the names until 50 different names are selected.

17. Agricultural scientists for a chemical company want to determine if a newly developed fertilizer produces heavier tomatoes than the fertilizer they currently manufacture. For their first pilot study, they have 24 healthy young tomato plants growing in individual pots, numbered from 1 to 24. Describe the design of a completely randomized, controlled experiment to test whether the new fertilizer produces heavier tomatoes. Then construct a graphic illustrating your design.

18. Does using a calculator improve understanding of mathematical concepts?
All 200 fifth-graders at a school are randomly assigned to one of two groups. One group studies addition of fractions with the aid of a calculator; the other studies the same topic without a calculator. Scores on a fractions test are compared after two weeks. Comment on the extent to which inferences can be drawn about a larger population and whether cause and effect can be established.

19. Adam is your school’s star soccer player. When he takes a shot on goal, he typically scores half of the time. Suppose that he takes six shots in a game. To estimate the probability of the number of goals Adam makes, use simulation with a number cube. One roll of a number cube represents one shot.
a. Specify what outcome of a number cube you want to represent a goal scored by Adam in one shot.
b. For this problem, what represents one trial of taking six shots?
c. Perform and list the results of ten trials of this simulation.
d. Identify the number of goals Adam made in each of the ten trials you did in part (c).
e. Based on your ten trials, what is your estimate of the probability that Adam scores three goals if he takes six shots in a game?

Answers:
1T, 2F, 3F, 4T, 5T, 6F, 7T, 8F, 9T, 10T, 11F, 12T, 13T, 14F, 15F
16a-Convenience, 16b-Voluntary, 16c-SRS
17-Use a random number generator to choose twelve 2-digit numbers from 01 to 24, ignoring repeats. Tomato plants with these numbers will receive the new fertilizer. The remaining 12 plants will act as a control group and will receive the old fertilizer. Measure the total weight of tomatoes produced by each plant, and compare the mean weight in the two groups.
[Design diagram: Tomato Plants -> Random Assignment -> Control group (old fertilizer) and Experimental group (new fertilizer) -> Compare mean weights]
18-There is random assignment, so cause and effect can be inferred. But there is no random sampling (as a matter of fact, it was a census of the entire school), so we cannot generalize beyond that particular school.
19- a.
Answers will vary; students need to determine which three numbers on the number cube represent scoring a goal.
b. Rolling the cube six times represents taking six shots on goal or 1 simulated trial.
c. Answers will vary. Performing only ten trials is a function of time. Ideally, many more trials should be done.
d. Answers will vary.
e. Answers will vary; the probability of scoring per shot is ½.

Once you have completed the review questions above, complete the Self-Check problem at the end of each topic. The textbook page numbers and necessary calculator lists are given below. The solution to each Self-Check follows in the textbook so you can check your answers immediately.
• Self-Check 3-5, textbook page 42
• Self-Check 4-5, textbook page 64
• Self-Check 5-6, textbook page 86
• Self-Check 11-5, textbook page 236

Glossary

Alternative hypothesis: states what the researchers suspect or hope to be true about the parameter of interest and can take on three forms (< hypothesized value, > hypothesized value, ≠ hypothesized value). The specific form of the alternative hypothesis is determined by the research question, before the sample data are examined.
Anecdotal Evidence: results from situations that come to mind easily and is of little value in scientific research
Bias: tendency of samples to differ from the corresponding population as the result of systematic exclusion of some part of the population
Blind: when the subjects are not aware which treatment they actually receive
Blocking: a technique in experimental design that filters out the effects of some extraneous factors; blocking creates groups that are similar with respect to a blocking factor; for example, gender, age, grade level
Cause-and-effect: conclusions based on experiments where it is established that the explanatory variable is the most likely cause of the results in the response variable
Comparison: the process used to determine whether there is a statistically significant difference between two or more groups
Confounding Variable: a variable that may differ among the explanatory variable groups in such a way that we cannot distinguish its effects from those of the explanatory variable; it may prevent us from drawing a cause-and-effect conclusion between the explanatory and response variables
Constant: a value that remains unchanged
Control: holding extraneous factors constant so that their effects do not confound experimental conditions
Convenience Sample: samples that are easily accessible
Double Blind: when neither the person evaluating the subjects nor the subjects are aware of which treatment they actually receive
Expected Value: the long-run average result of a numerical random process
Experiment: a study in which the experimenter actively imposes the treatment on the subjects
Explanatory Variable: the variable whose effect you want to study
Factor: each independent variable that we assume to influence the dependent variable of interest
Intervention: the treatment that is being tested, i.e.
imposed on the experimental group Level (of a factor) The logical categories or intensities of factors or treatments Nonresponse members of the population did not have the opportunity to respond Null hypothesis states that the parameter of interest is equal to a specific value Observational Study when the investigator passively observes and records information on observational units Parameter a number that describes a population Placebo a treatment with no active ingredient or benefit (such as a sugar pill) Placebo Effect subjects often respond positively simply by being given a treatment Population the entire group of people or objects (observational units) of interest Page 41 Precision (of a sample statistic) how much the values vary from sample to sample Prospective Study a study that starts with a group and watches for outcomes (for example, the development of cancer or remaining cancer-free) during the study period and relates this to suspected risk or protective factors that might be linked to the outcomes. Probability Distribution a way of representing the likelihood of all the possible results of a statistical event. Random Assignment it creates treatment groups that are similar in all respects expect for the treatment imposed so that lurking and potentially confounding variables tend to balance out between the two groups. Representative Sample a sample that has similar characteristic to the population so that you can learn useful information Response Variable the variable that you suspect is affected by the other variable; often considered to be the outcome of interest Retrospective Study a study that starts with an outcome and then looks back to examine exposures to suspected risk or protective factors that might be linked to that outcome. 
Sample: the part of the population from which data are gathered to learn about the population as a whole.
Sampling Bias: the systematic tendency of a sampling method to overrepresent some parts of the population.
Sampling Frame: an actual list of every member of the population that we want to sample from.
Sampling Variability: the tendency of values of a sample statistic to vary from sample to sample.
Simple Random Sampling (SRS): when every possible sample of size n has an equal chance of being the sample ultimately selected.
Simulation: a way to model random events.
Statistic: a number that describes a sample.
Statistically significant: when the difference in values of the response variable between two groups is so large that such an extreme difference would rarely occur by random assignment alone.
Stratified Sample: separate simple random samples are independently selected from a set of non-overlapping subgroups.
Treatment: the explanatory variable.
Trial: the single performance of an experiment.
Voluntary Response: refers to samples collected in such a way that members of the population decide for themselves whether or not to participate in the study.
Unbiased Statistic: a statistic whose values from different random samples are centered at the actual parameter value.
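
To illustrate the Simulation, Trial, and Expected Value entries, the number-cube model from the review answers (three faces count as a goal, six rolls make one trial) can be sketched in Python. This is only a sketch under stated assumptions: the choice of faces 1, 2, and 3 as "goal" faces, the function names, the seed values, and the use of the average number of goals as the summary are all illustrative and do not come from the key.

```python
import random

# Assumption: faces 1, 2, and 3 count as a goal. Any three faces would work,
# because each choice gives a 3/6 = 1/2 chance of scoring per shot.
GOAL_FACES = {1, 2, 3}
SHOTS_PER_TRIAL = 6  # one trial = six rolls = six shots on goal


def simulate_trial(rng):
    """Roll the number cube six times and count the rolls that are goals."""
    rolls = [rng.randint(1, 6) for _ in range(SHOTS_PER_TRIAL)]
    return sum(1 for roll in rolls if roll in GOAL_FACES)


def simulate(num_trials, seed=None):
    """Run num_trials independent trials and return the goals scored in each."""
    rng = random.Random(seed)
    return [simulate_trial(rng) for _ in range(num_trials)]


def mean(values):
    """Average of a non-empty list of numbers."""
    return sum(values) / len(values)


if __name__ == "__main__":
    ten_trials = simulate(10, seed=1)
    many_trials = simulate(10_000, seed=1)
    print("Goals in each of 10 trials:", ten_trials)
    print("Average over 10 trials:", mean(ten_trials))
    print("Average over 10,000 trials:", round(mean(many_trials), 2))
```

With only ten trials the average number of goals can land noticeably above or below 3, which is sampling variability; with many more trials it tends to settle near the expected value of 6 × ½ = 3 goals per trial.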