This guide is a brief overview of the course topics. Spend additional time looking over your notes for any units that you need to review further. Unit review packets and tests should provide a good overview as well. Do not hesitate to ask questions!
Parameter vs. Statistic
Stem and Leaf Plot
Box and Whisker Plot
Four things to look for in a scatterplot:
Correlation (r): a numerical measurement (between -1 and 1) of the strength of the linear relationship between two quantitative variables.
Conditions to check:
CORRELATION ≠ _________________________________
Lurking variable: a hidden variable that stands behind a relationship and determines it by simultaneously affecting the other two variables.
Be able to interpret the slope and the intercept.
Coefficient of Determination: R²
values for “new”
Means vary less than individual values.
If the scatterplot is not straight, try _______________________________________.
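The correlation, slope, intercept, and R² items above can be checked by hand on a small data set. The sketch below is only an illustration: the `hours` and `scores` lists are made-up data, and the formulas used are the usual least-squares ones (slope = r · s_y / s_x, intercept = ȳ − slope · x̄), which this guide does not spell out.

```python
import math

# Hypothetical data: hours studied (x) and test score (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 71, 78, 89]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Sample standard deviations (divide by n - 1).
sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in hours) / (n - 1))
sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in scores) / (n - 1))

# Correlation: sum of products of z-scores, divided by n - 1.
r = sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
        for x, y in zip(hours, scores)) / (n - 1)

slope = r * sd_y / sd_x              # predicted change in y per 1 unit of x
intercept = mean_y - slope * mean_x  # predicted y when x = 0
r_squared = r ** 2                   # fraction of variation in y accounted for

print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
print(f"predicted score = {intercept:.1f} + {slope:.1f} * hours")
```

When interpreting the output, state the slope in context (predicted points per additional hour) and ask whether the intercept (predicted score at 0 hours) is meaningful for the data.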
Steps to running a simulation:
Identify the __________________________ to be repeated.
Explain how you will model the component’s ____________________.
Explain how you will combine the components to model a ____________________.
State clearly what the _______________________________________ is.
Run several trials.
Collect and summarize the results of the trials (CUSS).
State your conclusion, _____ ___________________!
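The simulation steps above can be sketched in code. The scenario here is hypothetical and not from the course materials: a cereal box contains one of three equally likely prizes, and we want to know how many boxes it takes, on average, to collect all three.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

PRIZES = 3       # hypothetical: three equally likely prizes
TRIALS = 10_000  # number of trials to run


def boxes_until_full_set():
    """One trial: open boxes until every prize has been seen."""
    seen = set()
    boxes = 0
    while len(seen) < PRIZES:
        seen.add(random.randint(1, PRIZES))  # one box, prize chosen at random
        boxes += 1
    return boxes  # number of boxes opened in this trial


# Run many trials and summarize the results.
results = [boxes_until_full_set() for _ in range(TRIALS)]
average_boxes = sum(results) / TRIALS

print(f"Average boxes needed over {TRIALS} trials: {average_boxes:.2f}")
print(f"Fewest: {min(results)}, most: {max(results)}")
```

Each part of the code lines up with one of the steps: the single box, the random prize, the full set, the count of boxes, and the summary at the end.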
Simple Random Sample (SRS):
Every possible sample of the size we plan to draw has an equal chance to be selected. Each individual and each combination of people has an equal chance of being selected.
Stratified Random Sample:
Slice the population into __________________________________ (strata), then take an SRS within each stratum.
Cluster Sample:
Split the population into similar parts or clusters. Select a few clusters and perform a __________________ in each.
Multistage Sample:
Combining several methods to sample.
Systematic Sample:
Select individuals systematically. Start from a randomly selected individual. Use a pattern from there. (If the order of the list can be associated with the responses sought, then this is not an appropriate sampling method.)
Convenience Sample:
Members of the population are chosen based on the convenience of including them.
Voluntary Response Sample:
Members of the population are invited to respond and all who do are included in the sample.
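Three of the sampling methods described above can be compared side by side in a short sketch. Everything specific here is an assumption for illustration: the population is just ID numbers 1 to 100, and the two strata (`"underclass"` and `"upperclass"`) are invented.

```python
import random

random.seed(2)  # fixed seed so the run is repeatable

population = list(range(1, 101))  # hypothetical ID numbers 1..100

# Simple random sample: every group of 10 IDs is equally likely.
srs = random.sample(population, 10)

# Stratified sample: take a separate SRS inside each (hypothetical) stratum.
strata = {"underclass": population[:60], "upperclass": population[60:]}
stratified = {name: random.sample(members, 5)
              for name, members in strata.items()}

# Systematic sample: random starting point, then every 10th individual.
step = len(population) // 10
start = random.randrange(step)
systematic = population[start::step]

print("SRS:       ", sorted(srs))
print("Stratified:", {name: sorted(ids) for name, ids in stratified.items()})
print("Systematic:", systematic)
```

Notice that the systematic sample is evenly spaced through the list, which is exactly why it fails when the list order is related to the response.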
Forms of Bias
Voluntary Response bias
Observational Study:
Researchers don’t assign treatments, but simply observe what is going on.
Valuable for discovering trends.
Not possible to demonstrate a __________________________ _________________________.
A study design that allows us to prove a cause-and-effect relationship. An experiment:
Manipulates factor levels to create _______________________________.
________________________ ____________________ subjects to these treatment levels.
Compares the responses of the subject groups across treatment levels.
Four Principles of Experimental Design:
Confounding Variables: when the levels of one factor are associated with the levels of another factor.
The best experiments are:
Law of Large Numbers:
As the number of independent trials increases, the long-run relative frequency of repeated events gets closer and closer to a single value.
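The Law of Large Numbers is easy to watch in action. This is a minimal sketch using a simulated fair coin; the function name `heads_proportion` and the choice of flip counts are just for illustration.

```python
import random


def heads_proportion(flips, seed=3):
    """Relative frequency of heads in `flips` simulated fair coin flips."""
    rng = random.Random(seed)  # private generator so results are repeatable
    heads = sum(rng.random() < 0.5 for _ in range(flips))
    return heads / flips


# The relative frequency settles toward 0.5 as the number of flips grows.
for flips in (10, 100, 10_000, 100_000):
    print(f"{flips:>7} flips: proportion of heads = {heads_proportion(flips):.4f}")

p_large = heads_proportion(100_000)
```

The law describes long-run behavior only. It says nothing about the next few flips "evening out."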
Be able to calculate probabilities from tree diagrams, Venn diagrams, and tables.
A random variable assumes a value based on the outcome of a random event. Discrete random variables can take one of a countable number of distinct outcomes. A continuous random variable can take any numeric value within a range of values.
Expected Value: $E(X) = \mu = \sum x \cdot P(x)$
Standard deviation: $\sigma = SD(X) = \sqrt{\sum (x - \mu)^2 \cdot P(x)}$
Adding or subtracting a constant from data shifts the mean but doesn’t change the variance or standard deviation: $E(X \pm c) = E(X) \pm c$ and $Var(X \pm c) = Var(X)$
In general, multiplying each value of a random variable by a constant multiplies the mean by that constant and the variance by the square of the constant: $E(aX) = aE(X)$ and $Var(aX) = a^2 Var(X)$
In general, the mean of the sum of two random variables is the sum of the means, and the mean of the difference of two random variables is the difference of the means.
If the random variables are independent, the variance of their sum or difference is always the sum of the variances: $Var(X \pm Y) = Var(X) + Var(Y)$
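The rules for means and variances of random variables can be verified numerically. In this sketch the two probability models `X` and `Y` are made up, and they are treated as independent (the joint probability is the product of the individual probabilities).

```python
import math


def mean(dist):
    """Expected value of a discrete model given as {value: probability}."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Variance of a discrete model given as {value: probability}."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def combine(dist_x, dist_y, op):
    """Model for op(X, Y), ASSUMING X and Y are independent."""
    out = {}
    for x, px in dist_x.items():
        for y, py in dist_y.items():
            value = op(x, y)
            out[value] = out.get(value, 0) + px * py
    return out


# Hypothetical probability models.
X = {0: 0.2, 1: 0.5, 2: 0.3}
Y = {1: 0.5, 3: 0.5}

shifted = {x + 10: p for x, p in X.items()}  # X + 10
scaled = {3 * x: p for x, p in X.items()}    # 3X
total = combine(X, Y, lambda x, y: x + y)    # X + Y
diff = combine(X, Y, lambda x, y: x - y)     # X - Y

print("E(X), Var(X), SD(X):", mean(X), variance(X), math.sqrt(variance(X)))
print("Var(X + 10):", variance(shifted))  # same as Var(X)
print("Var(3X):    ", variance(scaled))   # 9 times Var(X)
print("Var(X + Y): ", variance(total))    # Var(X) + Var(Y)
print("Var(X - Y): ", variance(diff))     # also Var(X) + Var(Y)
```

The last two lines are the ones students most often get wrong: variances add even when the random variables are subtracted.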
Bernoulli Trials: 3 conditions…
____________________
____________________
____________________
Geometric Probability Model:
This model tells us the probability for a random variable that counts the number of Bernoulli trials until the first success.
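A short sketch of the Geometric model may help. The formula used, P(X = k) = q^(k−1) · p with q = 1 − p, is the standard one and is not written out in this guide; the value p = 0.2 is a made-up example.

```python
def geometric_pmf(k, p):
    """P(first success happens on trial k) for independent trials."""
    return (1 - p) ** (k - 1) * p


def geometric_mean(p):
    """Expected number of trials until the first success."""
    return 1 / p


p = 0.2  # hypothetical probability of success on each trial

prob_third = geometric_pmf(3, p)  # two failures, then a success
prob_within_three = sum(geometric_pmf(k, p) for k in (1, 2, 3))

print(f"P(first success on trial 3)   = {prob_third:.4f}")
print(f"P(first success by trial 3)   = {prob_within_three:.4f}")
print(f"Expected trials until success = {geometric_mean(p):.1f}")
```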
The 10% Condition:
Bernoulli trials must be independent. If that assumption is violated, it is still okay to proceed as long as the sample is smaller than 10% of the population.
Binomial Probability Model:
This model tells us the probability for a random variable that counts the number of successes in a fixed number of Bernoulli trials.
Mean: $\mu = np$
Standard Deviation: $\sigma = \sqrt{npq}$, where $q = 1 - p$
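The Binomial model can be computed directly. The sketch below uses the standard formulas P(X = k) = C(n, k) · p^k · q^(n−k), mean np, and standard deviation √(npq); the values n = 10 and p = 0.5 are a made-up example.

```python
import math


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent Bernoulli trials)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 10, 0.5  # hypothetical: 10 trials, 50% chance of success each

prob_five = binomial_pmf(5, n, p)
binom_mean = n * p
binom_sd = math.sqrt(n * p * (1 - p))

print(f"P(X = 5) = {prob_five:.4f}")
print(f"mean = {binom_mean}, SD = {binom_sd:.3f}")
```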
When dealing with a large number of trials in a Binomial situation, we can apply the Normal model as an approximation of the Binomial model if the Success/Failure Condition is satisfied.
The Success/Failure Condition:
A Binomial model is approximately Normal if we expect at least 10 successes and 10 failures.
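To see how well the Normal approximation works when the Success/Failure Condition holds, the sketch below compares an exact Binomial probability with the Normal model's answer. The numbers (n = 100, p = 0.5, at most 55 successes) are made up, and no continuity correction is applied.

```python
import math
from statistics import NormalDist


def success_failure_ok(n, p):
    """True if we expect at least 10 successes and at least 10 failures."""
    return n * p >= 10 and n * (1 - p) >= 10


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for a Binomial model."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k + 1))


n, p, k = 100, 0.5, 55  # hypothetical example values

exact = binomial_cdf(k, n, p)
normal_model = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
approx = normal_model.cdf(k)

print("Condition met?", success_failure_ok(n, p))
print(f"Exact Binomial P(X <= {k}) = {exact:.4f}")
print(f"Normal approximation      = {approx:.4f}")
```

Try a case that fails the condition, such as n = 20 and p = 0.1, to see why the check matters.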
Sampling Distribution Model for Proportions:
Assumptions and Conditions
1. The Independence Assumption: The sampled values must be independent of each other.
2. The Sample Size Assumption: The sample size must be large enough.
Sampling Distribution Model for Means:
Same assumptions and conditions hold (randomization and 10% condition, but no success/failure condition). The sample needs to be large enough—think about the context of the situation and decide whether you believe the condition has been met.
Central Limit Theorem:
The sampling distribution model of the sample mean from a random sample is approximately Normal for large “n”, regardless of the distribution of the population, as long as the observations are independent. The larger the sample, the better the approximation will be.
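The Central Limit Theorem can be seen with a quick simulation. This sketch assumes a strongly right-skewed population (an exponential model with mean 1 and standard deviation 1, chosen only for illustration) and draws many samples of size 40.

```python
import math
import random
import statistics

random.seed(4)  # fixed seed so the run is repeatable

N = 40          # size of each sample
SAMPLES = 5000  # number of samples drawn

# Each entry is the mean of one random sample from the skewed population.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(N))
    for _ in range(SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sd = 1 / math.sqrt(N)  # population SD divided by sqrt(n)

print(f"Mean of the sample means: {mean_of_means:.3f} (population mean is 1)")
print(f"SD of the sample means:   {sd_of_means:.3f}")
print(f"Predicted sigma/sqrt(n):  {predicted_sd:.3f}")
```

A histogram of `sample_means` would look roughly symmetric and bell-shaped even though the population is skewed, and the sample means vary far less than individual values do.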
Use your two summary sheets for all tests and confidence intervals, assumptions and conditions, df, etc.
Name the Test!!
Mechanics (Show test statistic, p-value, and picture of the distribution)
P-value: The probability of getting results at least as unusual as the observed statistic, given that:
Significance level (α): Threshold for
Type 1 Error:
Type 2 Error:
Power of a test: The probability that the test correctly rejects a false null hypothesis.
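Power can be estimated by simulation: generate many samples from a world where the null hypothesis is false and count how often the test rejects. The setup below is hypothetical: a one-sided one-proportion z-test of H0: p = 0.5 at α = 0.05, with a true proportion of 0.6.

```python
import math
import random
from statistics import NormalDist

random.seed(5)  # fixed seed so the run is repeatable

ALPHA = 0.05
P0 = 0.5                                 # null hypothesis value (hypothetical)
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA)  # one-sided critical value


def rejects_null(n, true_p):
    """Draw one sample of size n and run the one-sided z-test."""
    successes = sum(random.random() < true_p for _ in range(n))
    p_hat = successes / n
    z = (p_hat - P0) / math.sqrt(P0 * (1 - P0) / n)
    return z > Z_CRIT


def rejection_rate(n, true_p, reps=2000):
    """Fraction of simulated samples in which H0 is rejected."""
    return sum(rejects_null(n, true_p) for _ in range(reps)) / reps


# H0 is false here (true p = 0.6), so the rejection rate estimates power.
power_small = rejection_rate(100, 0.6)
power_large = rejection_rate(400, 0.6)

print(f"Estimated power with n = 100: {power_small:.2f}")
print(f"Estimated power with n = 400: {power_large:.2f}")
```

Changing the sample size, the significance level, or the true proportion in this sketch is a good way to explore what makes a test more or less powerful.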
Factors affecting power:
Know the difference between, and when to apply, z-models, t-models, and χ²-models.