Statistics Trivial Pursuit (Sort of) for Review (Math 17)

COLORS AND CATEGORIES
Blue – Basic Graphs and Descriptive Statistics
Pink – Assumptions (cumulative)
Yellow – Statistical Theory and History
Brown – Interpretations
Green – Last 1/3 Inference
Orange – Other Hypothesis Testing Related

BLUE 1

What are the descriptive statistics that are
sensitive to outliers?
BLUE 2

Provide the name and primary purpose of this graph.
BLUE 3

Provide a basic description of the distribution of this
variable from its graph (remember there are 3 things to
describe).
BLUE 4

What are the descriptive statistics used in the
creation of a boxplot?
BLUE 5

Name the rule used to compute outliers, and
describe how to apply it.
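One common answer here is the 1.5 × IQR rule: a point is flagged if it falls more than 1.5 IQRs below Q1 or above Q3. The sketch below shows one way to apply it. The data values are made up, and the quartiles come from Python's `statistics.quantiles` with the "inclusive" method, which is an assumption; a textbook or other software may compute quartiles slightly differently and so give slightly different fences.

```python
from statistics import quantiles

def iqr_fences(data):
    """Return the (lower, upper) fences from the 1.5 * IQR rule."""
    # Quartile convention is an assumption; other conventions exist.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def flag_outliers(data):
    """Return the values that fall outside the fences."""
    low, high = iqr_fences(data)
    return [x for x in data if x < low or x > high]

sample = [2, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data
print(iqr_fences(sample))     # (-0.5, 13.5)
print(flag_outliers(sample))  # [30]
```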
BLUE 6

Name graphs that are appropriate to display
categorical variables, and state whether or not
you should discuss the shape of distributions
based on those graphs.
BLUE 7

Compare/contrast these 2 distributions based on
the plot.
BLUE 8

The standard deviation of a measurement is 3.4
feet, from a sample with a mean of 29.2 feet.
Interpret the standard deviation.
BLUE 9

This plot is part of the preliminary analysis for
….
BLUE 10

If there was a high outlier in the distribution of a
particular variable, and it was removed, what
descriptive statistics are likely (or certain) to
change to a significant extent?
PINK 1

What is the assumption that all chi-square tests
have in common?
PINK 2

What is the assumption related to sample sizes
for a 2 sample z-test?
PINK 3

What is the assumption related to sample size
when constructing a confidence interval for p?
PINK 4

What are the specifics of the nearly normal
condition for a paired t-test?
PINK 5

What are the specifics of the nearly normal
condition for ANOVA?
PINK 6

What are the specifics of the 2 assumptions in
regression related to error terms?
PINK 7

You are told that the randomization and
independence condition is met for a sample of
high school students who were asked how much
money they received for their most recent
birthday. Describe what the randomization and
independence assumption means in this context.
PINK 8

What are some example tests where assumptions
related to normality are NOT required?
PINK 9

What are the specifics of the nearly normal
condition for a 2-sample t-test?
PINK 10

What is the assumption that all tests/CIs have in
common but which (since it is common to all)
Prof. Wagaman doesn’t require that you write
down when you list assumptions?
YELLOW 1

What is a sampling distribution for a statistic?
(conceptually)
YELLOW 2
(Fill in at least 3 of the blanks for credit)
The t distribution was discovered by ___________
who published under the pseudonym
____________. He discovered the t distribution
while working for _____________ in Ireland.
Specifically he was working in the field of
______________ (2 words, but one blank) and was
primarily responsible for checking out _________,
one of their many products.

YELLOW 3

What does the Central Limit Theorem say?
YELLOW 4

How are z-scores computed, and what are they
useful for? (variety of answers)
YELLOW 5

When sampling distributions have standard
deviations that involve unknown parameters,
and we plug in estimates for those parameters,
we obtain what value(s)?
YELLOW 6
Suppose 2 random variables X and Y are
independent. X has mean 6 and standard
deviation 3. Y has mean 14 and standard
deviation 4.
 What are the values of the mean and standard
deviation of X+Y?
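A quick way to check an answer: for independent random variables the means add and the variances (not the standard deviations) add. The helper below is a minimal sketch of that arithmetic; the function name is just for illustration.

```python
import math

def sum_of_independent(mean_x, sd_x, mean_y, sd_y):
    """Mean and SD of X + Y, assuming X and Y are independent."""
    mean_sum = mean_x + mean_y
    # Variances add for independent variables; take the root at the end.
    sd_sum = math.sqrt(sd_x ** 2 + sd_y ** 2)
    return mean_sum, sd_sum

print(sum_of_independent(6, 3, 14, 4))  # (20, 5.0)
```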

YELLOW 7

What are the differences between a chi-square
test of homogeneity and a chi-square test of
independence?
YELLOW 8

What are the three types of bias in sampling?
YELLOW 9

If you are designing an experiment and you have
3 different drugs you want to try, and you want
to try them at 2 different doses each (1 pill or 2
pills daily), and you want to include (a) placebo
group(s), how many treatments are there in your
experiment?
YELLOW 10

Name and describe two different sampling
techniques.
BROWN 1

Running a hypothesis test for slope equal to 0 or
not, you obtain a t-test statistic value of -2.14.
Interpret this test statistic.
BROWN 2

A linear regression results in an R-squared value
of .81. Assuming linear regression was
appropriate, interpret this R-squared in terms of
general X and Y variables.
BROWN 3

A random sample of n=16 observations yields an
s=24 (sample standard deviation). What is the
numerical value of the standard error of the
sample mean? Also, interpret this value.
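The numerical part can be checked with the usual formula SE = s / sqrt(n). The sketch below only does the arithmetic; the interpretation is still up to you.

```python
import math

def standard_error_of_mean(s, n):
    """Standard error of the sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)

print(standard_error_of_mean(24, 16))  # 6.0
```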
BROWN 4

Describe what is wrong with the statement:

“A p-value is the probability that the null hypothesis
is true.”
BROWN 5

A 95% confidence interval for a mean weight of a
new dog breed goes from (25.2, 34.6) pounds.
Interpret the confidence interval given here.
BROWN 6

A regression results in an s_e value of 3.46. The
y-axis goes from 36 to 109. What does the s_e
value represent, and what does it tell you about
how well the regression does?
BROWN 7

A p-value for an ANOVA testing for equality of 5
means with an F of 24.56 is .0359. Interpret this
p-value.
BROWN 8

A 95% confidence interval for a mean weight of a
new dog breed goes from (25.2, 34.6) pounds.
Interpret the confidence level used here.
BROWN 9

A conclusion in a t-test of mu=150 vs. mu>150 is given
as:
Our evidence is not inconsistent with our null hypothesis.
How should this conclusion be changed to be correct?
BROWN 10

A p-value for a two-sided two sample z-test is
.1470 based on a Z of 1.45. Interpret this p-value.
GREEN 1

Which set(s) of graphs indicate it would NOT
be appropriate to perform an ANOVA?
Explain.
GREEN 2


You want to know if the distribution of class year
among Reunion workers is equally split among
first-years, sophomores, and juniors. What test is
appropriate?
(Note: I am assuming that seniors can’t get hired
to work Reunion. If they can, change this to
equally split among all four class years.)
GREEN 3

An ANOVA where the null hypothesis is rejected
results in multiple comparisons of:
     Estimate        lwr       upr
2-1  4.146737  -2.737867 11.031342
3-1 -3.742933 -10.627537  3.141671
3-2 -7.889670 -14.774274 -1.005066
Summarize what these multiple comparisons show
you.
GREEN 4

If you wanted to know whether or not there is a
significant association between heart rate and
weight in rats, what statistical procedure would
you perform?
GREEN 5

You want to compare the means of 4 groups.
Describe why you would want to do an ANOVA
rather than 6 t-tests to compare all pairs of
means.
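One way to see the issue numerically: with 4 groups there are 6 pairs, and the chance of at least one false rejection grows with the number of tests. The sketch below uses 1 - (1 - alpha)^k, which assumes the k tests are independent. Pairwise t-tests on the same groups are not truly independent, so treat the result as a rough illustration only.

```python
import math

def familywise_error(groups, alpha=0.05):
    """Approximate chance of at least one Type I error across all
    pairwise tests, ASSUMING the tests are independent."""
    k = math.comb(groups, 2)  # number of pairs of means
    return k, 1 - (1 - alpha) ** k

pairs, rate = familywise_error(4)
print(pairs, round(rate, 3))  # 6 0.265
```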
GREEN 6

You want to know if there is an association
between t-shirt size (S,M,L,etc.) and class year at
Amherst. What is the appropriate statistical
procedure to perform?
GREEN 7

You want to know if a higher proportion of
underclassmen have corrective lenses compared
to upperclassmen. Explain why there is no
appropriate chi-square test for this situation.
What analysis could you run?
GREEN 8

A balanced ANOVA is an ANOVA where….
GREEN 9

Describe the similarities and differences in
finding p-values for ANOVA and chi-square.
GREEN 10



A scatterplot for regression is given as:
R also reports an R-squared value of .81.
What is the correlation between X and Y?
ORANGE 1

What is power and how would you increase it for
a hypothesis test?
ORANGE 2

What is a Type I error?
ORANGE 3

If given a significance level of .035, for what
p-values would you reject the null hypothesis?
ORANGE 4

Explain the difference between practically
significant results and statistically significant
results.
ORANGE 5


Most of the tests we learned in class were
______________ tests. If certain assumptions
related to them are not met, you can run
________________ tests, one example of which is
________________________.
(Fill in at least 1 blank.)
ORANGE 6

Hypothesis tests and confidence intervals are
based on an understanding of the
__________________ _______________________
(two words) of statistics.
ORANGE 7

You are performing a t-test for mu=60 versus a
2-sided alternative and all conditions are satisfied.
What is the expected value of your test statistic
under the null hypothesis?
ORANGE 8

You are testing for p=.4 vs. p>.4 and all
conditions are satisfied. Your sample results in
30 yes replies out of 100 responses. What can you
say about your p-value for this test?
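The question can be answered without computing anything, but a quick check is sketched below. It uses the usual one-proportion z statistic with the null value in the standard error, and takes the upper-tail area because the alternative is p > .4. The function name is just for illustration.

```python
import math
from statistics import NormalDist

def upper_tail_p_value(successes, n, p0):
    """One-proportion z-test of p = p0 vs. p > p0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)  # SE uses the null value p0
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value

z, p_value = upper_tail_p_value(30, 100, 0.4)
print(round(z, 2), round(p_value, 3))
```

Because the sample proportion (.30) falls below the null value, z is negative and the upper-tail p-value comes out larger than .5.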
ORANGE 9

In order to use a confidence interval to do a
one-sided t-test with a significance level of .05, what
confidence level would need to be used?
ORANGE 10

You are testing for mu=50 vs. mu>50, and the
appropriate confidence interval is (52,64). Can
you reject your null hypothesis? Explain.
REMINDER:



Final Exam is Monday, May 9th, 9 am - 12 noon in SM 207.
You can bring a two-sided page of notes and calculator,
plus pen/pencils.
Office Hours:
Thursday – 2-4
Friday – 1-4
Sunday – 2-4 pm, SM 206 or 207


Good luck studying!
THANKS FOR A GREAT SEMESTER!
Math dept. end of semester picnic is Saturday from 12-2 at
the Alumni House