Modified Jeopardy

Categories: Name that Chi-square, Analyst Choice, Pesky Assumptions, Interpretation Mania, Regression Junction
Point values: 100, 200, 300, 400, 500

Name that Chi-square
100: The only chi-square test where there are multiple populations of interest.
200: The analysis undertaken by a sports enthusiast who knows the percentages of singles, doubles, triples, and home runs hit during the regular season and wants to know whether a similar pattern exists for spring training games.
300: The analysis undertaken by an airline to investigate whether males and females were equally bothered by their opposite gender's use of the common armrest between two seats. (Assume people were either not bothered, bothered, or highly bothered.)
400: The analysis undertaken by park rangers who relocated troublesome bears, recorded their gender, and observed whether the bears remained in their new location, moved back, or moved to another location, and who want to know if there is an association between gender and outcome of relocation.
500: The analysis undertaken by scientists looking to compare responses to whistle calls (alarms) for squirrels in three different age groups, where each squirrel either enters or runs to its burrow or stands still (freezes), to see whether the responses differ among the three age groups.

Analyst Choice
100: The test procedure that could be used to determine if deer that live directly below designated military air space have higher heart rates than the deer population average, thought to be 51.2 beats per minute.
200: The test used to determine whether lions stalk zebras and wildebeests for differing amounts of time on average. The lions are observed stalking one of the animals only, not both. (Stalking refers to the reduction in predator-prey distance when the prey is unaware of or minimally alarmed by the predator.)
300: The test used to determine whether students in a special program at a school exhibited the expected 1 point change (growth) based on their third and fourth grade test scores for a standardized exam.
Analyst Choice (continued)
400: The test used to determine whether a predictor is a significant predictor of the response in regression.
500: The test used when trying to investigate the effectiveness of mosquito repellent across four different brands, where each observation has the brand and length of time it was effective recorded.

Pesky Assumptions
100: Requires assumptions about a population of differences.
200: Not a boxplot, this plot is often utilized to check the assumption of normality (in its many forms), and you cannot determine the number of peaks in the distribution from this plot.
300: Require assumptions about expected counts.
400: Requires the assumptions that two samples be independent and satisfy the randomization and independence conditions, as well as the populations either being normally distributed or having large sample sizes for both samples, or some combination of those.
500: The assumption that either a population be normally distributed or the sample size be large is because we are actually interested in the distribution of this quantity.

Interpretation Mania
100: The distribution of a statistic when thinking about taking many random samples from a distribution and calculating the value of that statistic for each sample.
200: The estimated average magnitude of residuals in regression.
300: An indication of the success of a confidence interval in terms of capturing the population parameter of interest in the long term over many different samples.
400: The estimated standard deviation of a statistic. (Example: the estimated standard deviation of a sample mean.)
500: The estimated number of standard errors that a sample statistic differs from a hypothesized value when dealing with tests for means.

Regression Junction
100: A regression line is provided as ŷ = 13.42 + 5.6x. The quantity 13.42 in this equation is what quantity?
200: The quantity interpreted as the average change in y (the response) for a one unit increase in x (the predictor).
300: The proportion of observed variation in the response explained by the linear relationship between the response and the predictor.
Regression Junction (continued)
400: The plot used to check the assumption that there is a constant standard deviation.
500: Most of the regression assumptions are checked using the residuals, which are estimates of these quantities.

Categories: Test or CI?, ANOVA, Design/Sampling, CLT and Related, Probability, General
Point values: 200, 400, 600, 800, 1000

Test or CI?
200: Research question: What proportion of adults in the U.S. are in favor of the death penalty for persons convicted of murder?
400: Research question: Is the mean human body temperature less than 98.6 degrees Fahrenheit?
600: Research question: On average, how much difference is there between the adult heights of a father and his son?
800: Research question: Is the mean number of tapeworms in the stomach of medicated sheep less than the mean number of tapeworms in the stomach of unmedicated sheep?
1000: Research question: Does response to a new treatment (yes = responds, no = doesn't respond) for cancer depend on gender of the patient receiving treatment?

ANOVA
200: The test statistic for an ANOVA is of this type.
400: The appropriate alternative hypothesis for any ANOVA.
600: Each ANOVA test statistic has two of these as associated quantities.
800: This is the assumption about populations in ANOVA that is not related to normality.
1000: This procedure is performed after an ANOVA null hypothesis has been rejected to determine where the differences in means are among the different populations.

Design/Sampling
200: A sampling method where every sample of size n from the population has an equal chance of being selected.
400: A sampling method in which every k-th individual in the population is chosen for the sample.
600: The optional principle of experimental design.
800: The type of bias that could occur when a question is worded a certain way or in any case where some part of the survey design could influence responses.
1000: One of the three required principles of experimental design, besides control and replication.
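
The two sampling procedures described in the first two Design/Sampling clues can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the seed, and the population of 100 ID numbers are assumptions made for the example and do not come from the slides. The names are kept neutral so they do not give away the answers.

```python
import random

def draw_equal_chance_sample(population, n, seed=None):
    """Draw n individuals so that every possible sample of size n
    is equally likely to be selected (sampling without replacement)."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def draw_every_kth(population, k, seed=None):
    """Pick a random starting point among the first k individuals,
    then take every k-th individual after it."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random start in 0..k-1
    return population[start::k]

# Hypothetical population: ID numbers 1 through 100.
population = list(range(1, 101))

equal_chance = draw_equal_chance_sample(population, 10, seed=1)
every_tenth = draw_every_kth(population, 10, seed=1)

print("Equal-chance sample:", sorted(equal_chance))
print("Every 10th individual:", every_tenth)
```

In the second function only the starting point is random, so the selected IDs are always exactly k apart, whereas the first function can return any set of 10 IDs.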
CLT and Related
200: A random sample taken from a right-skewed parent population will likely have this distribution.
400: The sample mean of a random sample taken from a right-skewed parent population will likely have this distribution.
600: The standard rule of thumb for when the Central Limit Theorem applies.
800: If the population distribution is normal, then this distribution for the sample mean will already be normal. (I.e., what distribution do the CLT and related results talk about?)
1000: The CLT is not relevant for these tests. (Multiple correct answers.)

Probability
200: An example of a discrete probability distribution.
400: The probability that a randomly selected tree is older than 50 years given that it is on Amherst College property is an example of this type of probability.
600: The relationship between two events when knowing that one occurs does not affect the probability that the other occurs.
800: The three basic probability rules we covered.
1000: Suppose X is Binomial(n = 1000, p = 0.25), and you want to know P(X > 275). You could compute that probability using this method.

General
200: The error associated with incorrectly rejecting the null hypothesis when it is in fact true.
400: The quantity which is 1 minus the probability of a Type II error.
600: A 95 percent confidence interval will always be wider than a confidence interval of this level for a given random sample (i.e., the only thing you are changing is the level). (Multiple answers possible. Give one example.)
800: This quantity is the probability of obtaining your test statistic or something more extreme, assuming that the null hypothesis is true.
1000: Significance level in a hypothesis test is equivalent to the probability of this error type.

Final Exam is Monday, May 9th, 9 am to 12 noon, in SM 207.
You can bring a two-sided page of notes and a calculator, plus pens/pencils.
Office Hours:
◦ Thursday: 2-4
◦ Friday: 1-4
◦ Sunday: 2-4 pm, SM 206 or 207
Good luck studying!
Math dept. end of semester picnic is tomorrow (Saturday) from 12-2 at the Alumni House.