Stat 4220 homework

1) Santa noticed that a lot of people put candy canes on their trees. He wants to evaluate whether the color of a candy cane is related to the type of Christmas tree. After randomly selecting 3054 houses, Santa collected the following data.

Observations (rows: color of candy cane; columns: type of Christmas tree)

                          Classic   Flocked   Artificial   Aluminum
Red/White                     298       301          300        293
Blue/White                    201       210          203        212
Solid Red                     132       162          148        132
Brown                          97       101          112        103
Black with Green dots          19        15           13          2

Expected Counts (rows: color of candy cane; columns: type of Christmas tree)

                          Classic   Flocked   Artificial   Aluminum
Red/White                   291.6     308.0        302.9      289.6
Blue/White                  202.0     213.4            A      200.7
Solid Red                   140.4     148.3        145.8      139.5
Brown                       101.0     106.7        104.9      100.3
Black with Green dots        12.0      12.7         12.5       11.9

Partial Chi-Squared (rows: color of candy cane; columns: type of Christmas tree)

                          Classic   Flocked   Artificial   Aluminum
Red/White                  0.1423    0.1570       0.0274     0.0397
Blue/White                 0.0053    0.0541     CENSORED     0.6380
Solid Red                  0.5024    1.2670       0.0317     0.3990
Brown                      0.1599    0.3043       0.4749     0.0704
Black with Green dots           B    0.4329       0.0242     8.2410

What are the missing values for A and B?

2) A journalist wants to estimate the proportion of students who are in debt. The current best guess is that the proportion should be about 75%. She wants to get a 60% confidence interval for the true proportion with a margin of error that is less than 0.013. How many students should she survey?

3) The Daily Stat Fact reports that 10% of engineers get a job that is not engineering related. I think the report is way high. I randomly sample 625 engineers and find that 49 of them got a job that was not engineering related. Test whether the percentage reported really is too high.

4) Harry Potter believes that he can tell if a person is a bad guy by listening to the background music when they come near. To find out if this is the case, Harry records what type of music he hears around 114 random people. Then Harry performs the Cruciatus Curse to determine if the person is a good guy or bad guy.
Based on the following data, determine if the type of background music is related to the person’s allegiance.

                     Allegiance
Background Music     Good Guys   Bad Guys
Ominous Music               45         38
Happy Music                 13         18

Show all the steps of the hypothesis test using specifically a χ² test of independence!

5) Katelyn has discovered that salt-licks from the Great Salt Lake are normally distributed, but they contain trace amounts of arsenic. She asks four of her friends to buy a salt-lick and measure the amount of arsenic. Here are their results:

Raul: 28 cc
Blaine: 44 cc
Madison: 32 cc
Leanne: 20 cc

Using their data, find a 98% CI for the average amount of arsenic in a salt-lick.

6) Suppose you are testing whether green runts cause cancer. You have a large group of people who regularly eat runts and a large group that never eat runts, and you will mark which ones develop cancer before they die. The Willy Wonka Candy Company is worried that finding a link to cancer would be devastating. They ask you to be extra cautious not to hurt the company’s image unless you’re absolutely certain about the results. Choose an α level besides 0.05 and explain why.

7) Some buildings in Laramie have been having problems with insects nesting inside the walls. A supervisor has suggested that it could be based on whether the building has iron supports or steel supports. Based on the data below, use any method you like to test whether that could be true.

           Insect problems   No insect problems   Total
Iron                   120                  250     370
Steel                  140                  230     370
Total                  260                  480

8) Donald Trump just finished studying 96 businesses and has classified them according to the amount of risk the companies take (high, medium, or low) and the type of company (large, small, personal, or not-for-profit). His final conclusion is that the amount of risk a company takes does not depend on the type of company. Bill Gates says that is so not true. He says different types of companies have different types of risk levels.
To keep the two from arguing, you decide to compute the χ² Test of Independence. When you hand the paper to Donald and Bill, they fight over it and tear the corner of the report (see the picture below). Determine statistically who you would say the data supports. As a hint, the partial χ² values that you can see add up to 13.19, and the assumptions are met for the test.

9) Doctor Ann randomly selects 40 people to crack their knuckles daily, and 40 people to never crack their knuckles. Doctor Bob selects 40 pairs of twins; one twin will crack their knuckles daily and the other will not. After 10 years they measure the amount of arthritis. Who will have a more powerful test?
a) Dr. Ann’s test is more powerful because Doctor Bob’s 80 subjects are only 40 pairs of twins so his results will be similar to having a smaller sample size.
b) Dr. Bob’s test is more powerful because taking the difference between twins will take out variability due to the genetics of each subject.
c) Dr. Bob’s test is more powerful because it is very unlikely that two different sets of twins will be related to each other, which increases the chance that they were selected randomly.
d) Dr. Ann’s test is more powerful because the people who do not crack their knuckles will act as a control group in the experiment where they are not twins.
e) Dr. Ann’s test is more powerful because the subjects do know which treatment they are getting beforehand and it will reduce the risk of a placebo effect.

10) Dr. Carl asks 1000 people to rate whether they “crack their knuckles frequently”, “crack their knuckles sometimes”, or “almost never crack their knuckles”. Then he evaluates if they have arthritis in their hands. What kind of test should Dr. Carl run to analyze this data, assuming the conditions are met?
A) 2 proportions z test
B) One mean t-test
C) Regression
D) Matched Pairs
E) Chi-squared

11) A genetics test is attempting to see if there is a relationship between nose type (Long, Medium, and Flat) and diet (Poor, Somewhat Healthy, and Healthy). Below is the data and output from a computerized χ² program.

OBS     Poor    Some    Healthy
Long      10      12         15
Med       15      16          9
Flat       8       2          4

χ²      Poor    Some    Healthy
Long    0.87   0.003       1.15
Med     0.02    0.60       0.89
Flat    1.68    1.48       0.02

EXP     Poor    Some    Healthy
Long    13.4    12.2       11.4
Med     14.5    13.2       12.3
Flat     5.1     4.6        4.3

Test whether there is a relationship between nose type and diet.

12) A test to determine if major is related to social skills looks at 4 different majors and whether the student has social skills. The test has a p-value of 0.55. What is the conclusion?
A) Because the number of majors is less than 5, no conclusions can be drawn.
B) The p-value is less than α, so there is evidence to suggest a link between major and social skills.
C) The p-value is greater than α, so there is not evidence to suggest a link between major and social skills.
D) The p-value is greater than α, so there is evidence to suggest a link between major and social skills.
E) The p-value cannot be greater than ½, so an error was made.

13) The NYTimes did a study on the proportion of football players that have sustained a head injury. Their 95% confidence interval based on 109 random NFL players was (0.571, 0.629). Check which of the following (if any) are true.
   There is a 95% probability that the proportion is between 0.571 and 0.629
   95% of the time the true proportion will be between 0.571 and 0.629
   This sample was not large enough to be able to use the normal distribution by the Central Limit Theorem
   95% of all confidence intervals from 109 NFL players will correctly contain the true proportion
   The true proportion is between 0.571 and 0.629 with 95% confidence
   For a new CI there is a 95% probability of the sample proportion being between 0.571 and 0.629

14) A sociologist wants to show that the food you eat actually changes your perception of how other people are feeling. She gathered 1000 volunteers and randomly selected what food they would eat. Then she asked them to look at a photograph (of a person showing no emotion) and to mark what emotion they thought the person was experiencing. The data is shown below. Test at the 1% significance level (with all 7 steps of a hypothesis test) if the food they ate is related to the emotion chosen.

             Chocolate   Oranges   Breadstick   Salad   Steak   Total
Happy               22        25           33      31      11     122
Angry               16        32           29      46       8     131
Sad                 30        48           32      65       7     182
Surprised            9         8           11      21       6      55
Sleepy              39        66           51      98       8     262
Scared              44        65           25     102      12     248
Total              160       244          181     363      52    1000

15) Google wants to know if the type of browser you use determines what you do on the internet. They installed spyware on 400 random computers and got the following data.

                Firefox    IE   Chrome   Total
Social Media         31    32       18      81
Games                49    57       15     121
Work                 80    70       48     198
Total               160   159       81     400

Test whether what you do on the computer is related to the type of browser you use.

16) Nick knows the UW football team is better than CSU, but he wants to compare their average rushing yards. He is fairly certain that the rushing yards are normally distributed with the same variance for both teams. He randomly selects 11 UW games, and the average rushing yards were 110. He randomly selects 7 CSU games, and the average rushing yards were 93.
Can Nick say with 99% confidence that UW has more rushing yards than CSU?

Standard deviation of one game for UW: 16 yards
Standard deviation of one game for CSU: 13 yards
Pooled standard deviation for one game: 15 yards
Matched Pairs standard deviation for one game: 7.5 yards
Average standard deviation for both teams: 14.5 yards
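
Several of the problems above (1, 4, 7, 11, 14, and 15) rely on expected counts and partial χ² values from a two-way table. As a way to check hand arithmetic, here is a minimal Python sketch. It assumes the usual independence formula (row total × column total ÷ grand total); the function names are hypothetical and not part of the assignment. It uses the browser table from problem 15 as sample input and does not look up a p-value, so the statistic still has to be compared against a χ² table.

```python
def expected_counts(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def partial_chi_squared(observed):
    """Per-cell contributions (observed - expected)^2 / expected."""
    expected = expected_counts(observed)
    return [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]


def chi_squared_statistic(observed):
    """Sum of all partial chi-squared values."""
    return sum(sum(row) for row in partial_chi_squared(observed))


def degrees_of_freedom(observed):
    """(number of rows - 1) * (number of columns - 1)."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Problem 15 counts. Rows: Social Media, Games, Work. Columns: Firefox, IE, Chrome.
browser_table = [
    [31, 32, 18],
    [49, 57, 15],
    [80, 70, 48],
]

for row in expected_counts(browser_table):
    print([round(value, 1) for value in row])
print("chi-squared:", round(chi_squared_statistic(browser_table), 4))
print("df:", degrees_of_freedom(browser_table))
```

The same functions accept any rectangular table of counts without the total row or column, so the tables in problems 1 and 14 can be passed in the same way.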