Statistics Jeopardy Review

Modified Jeopardy
Categories: Name that Chi-square | Analyst Choice | Pesky Assumptions | Interpretation Mania | Regression Junction
Point values in each category: 100, 200, 300, 400, 500

The only chi-square test where there are
multiple populations of interest

The analysis undertaken by a sports
enthusiast who knows the percentage of
singles, doubles, triples, and home runs hit
during the regular season and wants to know
whether a similar pattern exists for spring
training games.


The analysis undertaken by an airline to
investigate whether males and females were
equally bothered by their opposite gender’s
use of the common armrest between two
seats.
(Assume people were either not bothered,
bothered, or highly bothered).

The analysis undertaken by park rangers who
relocated troublesome bears, recorded their
gender, and observed whether the bears
remained in their new location, moved back,
or moved to another location, and want to
know if there is an association between
gender and outcome of relocation.

The analysis undertaken by scientists looking
to compare response to whistle calls (alarms)
(either enter or run to burrow or stand
still/freeze) for squirrels in three different
age groups and see if the responses differ
between the three age groups.

The test procedure that could be used to
determine if deer that live directly below
designated military air space have higher
heart rates than the deer population average,
thought to be 51.2 beats per minute.


The test used to determine whether lions
stalk zebras and wildebeests for differing
amounts of time on average. The lions are
observed stalking one of the animals only,
not both.
(stalk refers to the reduction in predator-prey
distance when the prey is unaware or
minimally alarmed by the predator)

The test used to determine whether students
in a special program at a school exhibited the
expected 1 point change (growth) based on
their third and fourth grade test scores for a
standardized exam.

The test used to determine whether a
predictor is a significant predictor of the
response in regression

The test used when trying to investigate the
effectiveness of mosquito repellent across
four different brands, where each observation
has the brand and length of time it was
effective recorded.

Requires assumptions about a population of
differences

Not a boxplot, this plot is often utilized to
check the assumption of normality (in its
many forms) and you cannot determine the
number of peaks in the distribution from this
plot.

Require assumptions about expected counts
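
As a rough illustration of where expected counts come from in a two-way table, the sketch below computes them as (row total × column total) / grand total and then forms the chi-square statistic. The observed counts are invented for illustration (loosely modeled on the bear relocation clue) and are not data from this review.

```python
# Hypothetical observed counts, invented for illustration only.
# Rows: female bears, male bears.
# Columns: stayed, moved back, moved to another location.
observed = [
    [12, 7, 6],
    [9, 14, 8],
]


def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_counts(observed)
smallest_expected = min(min(row) for row in expected)
statistic = chi_square_statistic(observed)

print("Expected counts:", expected)
print("Smallest expected count:", smallest_expected)
print("Chi-square statistic:", round(statistic, 3))
```

The smallest expected count is the number usually compared against whatever minimum the course's expected-count condition specifies.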

Requires that the two samples be independent
of each other, that each sample satisfy the
randomization and independence conditions,
and that each population be normally
distributed or each sample be large (or some
combination of those)

We assume that either the population is
normally distributed or the sample size is
large because we are actually interested in
the distribution of this quantity.

The distribution of a statistic when thinking
about taking many random samples from a
distribution and calculating the value of that
statistic for each sample

The estimated average magnitude of
residuals in regression

An indication of the success of a confidence
interval in terms of capturing the population
parameter of interest in the long-term over
many different samples


The estimated standard deviation of a
statistic
(Example: the estimated standard deviation of
a sample mean)

The estimated number of standard errors that
a sample statistic differs from a hypothesized
value when dealing with tests for means
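
A minimal sketch of the last two clues, assuming a one-sample test for a mean: it estimates the standard deviation of the sample mean as s / sqrt(n) and then counts how many of those units separate the sample mean from the hypothesized value. The heart-rate values are hypothetical; only the hypothesized mean of 51.2 beats per minute comes from the deer clue earlier in the review.

```python
import math
import statistics

# Hypothetical deer heart rates in beats per minute (invented for illustration).
heart_rates = [54.1, 49.8, 57.3, 52.6, 55.0, 50.9, 58.2, 53.4]
mu_0 = 51.2  # hypothesized population mean from the deer clue

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)  # sample SD, divides by n - 1

# Estimated standard deviation of the sample mean.
standard_error = sample_sd / math.sqrt(n)

# Number of estimated standard errors between the sample mean and mu_0.
t_statistic = (sample_mean - mu_0) / standard_error
degrees_of_freedom = n - 1

print(f"mean = {sample_mean:.4f}, SE = {standard_error:.4f}")
print(f"t = {t_statistic:.3f} on {degrees_of_freedom} df")
```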

A regression line is provided as:
ŷ = 13.42 + 5.6x

The quantity 13.42 in this equation is what
quantity?

The quantity interpreted as the average
change in y (the response) for a one unit
increase in x (the predictor)

The proportion of observed variation in the
response explained by the linear relationship
between the response and the predictor.
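
The regression quantities in these clues can be computed by hand for a small data set. The sketch below is one way to do it, assuming simple linear regression fit by least squares; the x and y values are hypothetical and are not taken from this review.

```python
import math

# Hypothetical predictor and response values (invented for illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [19.5, 24.1, 30.8, 35.2, 41.9, 46.6]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx                  # average change in y per one-unit increase in x
intercept = y_bar - slope * x_bar    # predicted y when x = 0

# Residuals: observed response minus fitted response.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sse = sum(r ** 2 for r in residuals)          # unexplained variation
sst = sum((yi - y_bar) ** 2 for yi in y)      # total variation in the response

r_squared = 1 - sse / sst                     # proportion of variation explained
residual_sd = math.sqrt(sse / (n - 2))        # typical size of a residual

print(f"y-hat = {intercept:.2f} + {slope:.2f} x")
print(f"R^2 = {r_squared:.4f}, residual SD = {residual_sd:.3f}")
```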

The plot used to check the assumption that
there is a constant standard deviation

Most of the regression assumptions are
checked using the residuals which are
estimates of these quantities

Categories: Test or CI? | ANOVA | Design/Sampling | CLT and Related | Probability | General
Point values in each category: 200, 400, 600, 800, 1000

Research question: What proportion of adults
in the U.S. are in favor of the death penalty
for persons convicted of murder?

Research question: Is the mean human body
temperature less than 98.6 degrees
Fahrenheit?

Research question: On average, how much
difference is there between the adult heights
of a father and his son?

Research question: Is the mean number of
tapeworms in the stomach of medicated
sheep less than the mean number of
tapeworms in the stomach of unmedicated
sheep?

Research question: Does response to a new
treatment (yes = responds, no = doesn’t
respond) for cancer depend on gender of the
patient receiving treatment?

The test statistic for an ANOVA is of this type

The appropriate alternative hypothesis for
any ANOVA

Each ANOVA test statistic has two of these as
associated quantities.

This is the assumption about populations in
ANOVA that is not related to normality.

This procedure is performed after an ANOVA
null hypothesis has been rejected to
determine where the differences in means are
among the different populations.

A sampling method where every sample of
size n from the population has an equal
chance of being selected

A sampling method in which every k-th
individual in the population is chosen for the
sample

The optional principle of experimental design

The type of bias that could occur when a
question is worded a certain way or in any
case where some part of the survey design
could influence responses

One of the three required principles of
experimental design, besides control and
replication.

A random sample taken from a right-skewed
parent population will likely have this
distribution.

The sample mean of a random sample taken
from a right-skewed parent population will
likely have this distribution.
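
A small simulation can make the contrast between these two clues concrete. The sketch below is an illustration under assumptions, not part of the original review: it uses an exponential distribution as the right-skewed parent population, samples of size 30, and 5000 repetitions, then compares the skewness of raw observations with the skewness of sample means.

```python
import random
import statistics


def skewness(values):
    """Average cubed z-score; near 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)


rng = random.Random(0)   # fixed seed so the run is repeatable
sample_size = 30         # assumed sample size for illustration
num_samples = 5000       # assumed number of repeated samples

# Individual observations from a right-skewed (exponential, mean 1) population.
raw_sample = [rng.expovariate(1.0) for _ in range(num_samples)]

# One sample mean per repeated sample of size `sample_size`.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(num_samples)
]

print("Skewness of raw observations:", round(skewness(raw_sample), 2))
print("Skewness of sample means:   ", round(skewness(sample_means), 2))
```

The raw observations stay clearly right-skewed, while the sample means are much closer to symmetric and centered near the population mean.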

The standard rule of thumb for when the
Central Limit Theorem applies


If the population distribution is normal, then
this distribution for the sample mean will
already be normal.
(i.e., what distribution do the CLT and related
results describe?)

The CLT is not relevant for these tests.

(Multiple correct answers)

An example of a discrete probability
distribution

The probability that a randomly selected tree
is older than 50 years given that it is on
Amherst College property is an example of
this type of probability.

The relationship between two events when
knowing that one occurs does not affect the
probability that the other occurs.

The three basic probability rules we covered.

Suppose X is Binomial (n=1000, p=.25), and
you want to know P(X>275). You could
compute that probability using this method.
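
One possible approach to this clue is sketched below: a normal approximation with a continuity correction, checked against the exact binomial tail. Whether this is the method the clue has in mind is an assumption; the function names are made up for this example.

```python
import math


def binomial_upper_tail(n, p, threshold):
    """Exact P(X > threshold) for X ~ Binomial(n, p), summed on the log scale."""
    total = 0.0
    for k in range(threshold + 1, n + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p)
        )
        total += math.exp(log_pmf)
    return total


def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


n, p, threshold = 1000, 0.25, 275
mean = n * p                       # 250
sd = math.sqrt(n * p * (1 - p))    # sqrt(187.5)

# Continuity correction: P(X > 275) = P(X >= 276), approximated by P(Y > 275.5).
z = (threshold + 0.5 - mean) / sd
approx = normal_upper_tail(z)
exact = binomial_upper_tail(n, p, threshold)

print(f"z = {z:.3f}")
print(f"normal approximation = {approx:.4f}, exact = {exact:.4f}")
```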

The error associated with incorrectly rejecting
the null hypothesis when it is in fact true.

The quantity which is 1 minus the probability
of a Type II error


A 95 percent confidence interval will always
be wider than a confidence interval of this
level for a given random sample (i.e. the only
thing you are changing is the level).
(Multiple answers possible. Give one
example).
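
To see how the confidence level alone changes the width of an interval, the sketch below builds intervals at several levels from the same sample summary. It assumes a z-interval for a mean with a known standard deviation, and the summary numbers are hypothetical; `z_interval` is a name made up for this example.

```python
import math
from statistics import NormalDist


def z_interval(sample_mean, sd, n, level):
    """Two-sided z-interval for a mean at the given confidence level."""
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # critical value
    margin = z_star * sd / math.sqrt(n)
    return (sample_mean - margin, sample_mean + margin)


# Hypothetical sample summary (invented for illustration).
sample_mean, sd, n = 98.2, 0.7, 50

widths = {}
for level in (0.90, 0.95, 0.99):
    low, high = z_interval(sample_mean, sd, n, level)
    widths[level] = high - low
    print(f"{level:.0%} interval: ({low:.3f}, {high:.3f}), width {widths[level]:.3f}")
```

With the sample held fixed, only the critical value changes, so the width grows as the level rises.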

This quantity is the probability of obtaining
your test statistic or something more extreme
assuming that the null hypothesis is true.

Significance level in a hypothesis test is
equivalent to the probability of this error
type.



Final Exam is Monday, May 9th, 9 am - 12 noon in
SM 207.
You can bring a two-sided page of notes and a
calculator, plus pens/pencils.
Office Hours:
◦ Thursday – 2-4
◦ Friday – 1-4
◦ Sunday – 2-4 pm, SM 206 or 207

Good luck studying!
Math dept. end-of-semester picnic
is tomorrow (Saturday) from 12-2
at the Alumni House.