MATH STAT TRIVIAL PURSUIT (SORT OF) FOR REVIEW (MATH 30)

COLORS AND CATEGORIES
Blue – Basics of Estimation
Pink – Properties of Estimators and Methods for Estimation
Yellow – Hypothesis Testing
Brown – Bayesian Methods
Green – Regression
Orange – Nonparametric Procedures and Categorical Data Analysis

BLUE 1 Suppose you have an estimator theta-hat, and you want to know its bias. How is bias computed?
BLUE 2 How is the MSE of an estimator computed?
BLUE 3 What is a common unbiased point estimator for a population mean, and what is its standard error?
BLUE 4 What is a common unbiased point estimator of a difference in two population proportions, and what is its standard error?
BLUE 5 A very important result related to samples from a normal distribution is that:
    The sample mean is ____________ distributed.
    The sample variance, appropriately scaled, is ____________ distributed.
    The sample mean and sample variance are ____________________.
    (Fill in all three blanks for credit.)
BLUE 6 What are the 2 properties of pivot quantities, and what are pivots used for?
BLUE 7 How would you use the asymptotic normal distribution of many unbiased point estimators to create a confidence interval for their respective parameters? (You can just give the formula.) Hint: Think of a specific case and generalize.
BLUE 8 How is a t distribution formed?
BLUE 9 How is an F distribution formed?
BLUE 10 How do you form a small-sample confidence interval for a population mean?

PINK 1 If relative efficiency is computed between two estimators, it means that both estimators were _______________, and if the numerical value of the relative efficiency is 2, then it means that the _____________ (first or second) estimator is better.
PINK 2 What is the definition of consistency for an estimator? Bonus: What concept of convergence is this equivalent to?
PINK 3 For an unbiased estimator, what is the “fast” way of showing consistency? Bonus: Do you remember what convergence result this was derived from?
PINK 4 If you have an RS of n observations from a distribution with unknown parameter theta, and T is sufficient for theta, what does that mean?
PINK 5 What is the result you can use to show sufficiency without resorting to computing conditional pdfs?
PINK 6 What does the Rao-Blackwell Theorem say? Bonus: What’s the fast way of finding the quantity RB refers to in the end?
PINK 7 Describe how the method of moments works.
PINK 8 Describe how the method of ML estimation works.
PINK 9 A main property of MLEs is that they are _____________, which means that ….
PINK 10 If an estimator is NOT admissible (i.e. inadmissible), what does that mean? Give an example of an inadmissible estimator.

YELLOW 1 What is the difference between simple and composite hypotheses?
YELLOW 2 Describe the relationship between the two types of error in a hypothesis test, as well as their connection to power.
YELLOW 3 If you have a test statistic, you can use either a rejection region approach or a p-value approach to determine if the null hypothesis should be rejected. What is the difference between the 2 approaches? (Describe.)
YELLOW 4 For the common large-sample asymptotically normal z-tests, what is the rejection region for a 2-tailed test? Bonus: If the significance level is .05 for this test, what is the range of test statistics where you would NOT reject the null hypothesis (numerical values)?
YELLOW 5 How are hypothesis tests and confidence intervals related?
YELLOW 6 What is the difference between the pooled and unpooled t-tests for 2 independent samples when considering tests for means?
YELLOW 7 In order to determine which 2-sample t-test for small sample sizes is appropriate, you might have to run a test to check for equality of _______________, and in order to control your overall significance level, you might have to use a ____________ _____________.
YELLOW 8 What does the Neyman-Pearson Lemma say? (Get the gist of it: what does it let you find, and how?)
YELLOW 9 How do you determine if a most powerful test is UMP?
YELLOW 10 How do you construct a likelihood ratio test? What is the asymptotic distribution related to LRTs?

BROWN 1 What is the major difference between Frequentist and Bayesian approaches to statistics in terms of how the parameter theta is treated?
BROWN 2 What is the difference between a proper and improper prior? What is the difference between an informative and uninformative prior?
BROWN 3 How do you find the posterior density of theta?
BROWN 4 What are conjugate priors? Give an example of a conjugate prior.
BROWN 5 How would you find the Bayes estimate of (a) theta and (b) theta(1-theta) if you had the posterior density of theta?
BROWN 6 A Bayes estimator is ALWAYS a function of a _______________ statistic because of the _______________ ________________.
BROWN 7 How is a Bayesian credible interval different from a Frequentist confidence interval?
BROWN 8 Is it possible for Bayesian and Frequentist intervals to agree? If yes, how might this happen?
BROWN 9 Bayesian hypothesis testing is performed using ______ ________, which are Bayesian analogues of ________ test procedures, and which can allow you to find evidence in favor of your ___________ hypothesis.
BROWN 10 What are some of the issues related to working with Bayes factors?

GREEN 1 Relationships between two variables, X and Y, can be deterministic or ________________. Regression is used when the relationship is _______________. This means that ….
GREEN 2 When first developing regression models, this is the only constraint on the error terms.
GREEN 3 If your regression model was $E(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$, then how many parameters do you need to estimate?
GREEN 4 In least squares solutions for regression, what quantity is minimized to find the solution? (You can just give the simple LR quantity.)
GREEN 5 The least squares estimates are all ____________, and their variances are functions of _____________, which in turn can be estimated by _______, which is equal to (1/(n-2))SSE.
GREEN 6 What is the full set of conditions on the error terms in order to get normal sampling distributions for the parameter estimates if sigma is known?
GREEN 7 Why do we end up using a t distribution for inference about slope parameters in regression instead of a normal distribution?
GREEN 8 What is the main difference between a confidence interval for a mean response and a prediction interval for an individual response in regression?
GREEN 9 How are CIs for mean responses and prediction intervals for individual responses affected as the chosen x moves further from the mean of the x’s?
GREEN 10 What is correlation, and how do we test hypotheses about it?

ORANGE 1 Describe the two-sample shift model.
ORANGE 2 Describe how the sign test works.
ORANGE 3 Describe how the signed rank test works.
ORANGE 4 Describe how the Wilcoxon Rank Sum/Mann-Whitney U test works.
ORANGE 5 How does a Kolmogorov-Smirnov one-sample test work? Is the null hypothesis in the procedure simple or composite?
ORANGE 6 How does the 2-sample Kolmogorov-Smirnov test work?
ORANGE 7 When performing categorical data analysis, the main distribution you need to understand for the theoretical setup of problems is the ______________ distribution, but the test statistics turn out to have a different distribution, which is the ________________ distribution.
ORANGE 8 How is a chi-square goodness of fit test performed? When should you perform one?
ORANGE 9 How (and when) does a chi-square test of independence work?
ORANGE 10 For 2x2 tables, inference is also possible with:
    _________ exact test for small sample sizes
    _________ ratios, which rely on an asymptotic ______ distribution for their natural log.

REMINDER: Takehome Final Exam is due Friday, May 13th at 5 p.m. SHARP.
Office Hours (see front cover of exam):
    Monday 9-12 (during my other course’s exam)
    Tuesday 10-12
    Wednesday 1-3
    Thursday 1-3
    Any other time by appt. – just send me an email!

THANKS FOR A GREAT SEMESTER!

Math dept. end of semester picnic is Saturday from 12-2 at the Alumni House.