SSG5 230

Okun
PSY 230
STUDY GUIDE NUMBER 5
Probability, Sampling, and Sampling Distributions
1. What is inferential statistics?
Inferential statistics involves testing hypotheses about a population parameter based
upon a sample statistic.
I. Probability
2. What role does probability play in inferential
statistics?
Statisticians compute the probability of obtaining whatever difference is observed between the
hypothesized value of the population parameter and the sample statistic. If this probability is
sufficiently low, then they will reject the hypothesis with which they started.
3. How can probability be defined and calculated?
The probability of a given event, P(event), is determined by
dividing the number of outcomes in which the event occurs [NEO]
by the total number of outcomes [NTO]: P(event) = NEO/NTO.
Probability is therefore a proportion, that is, the number of
event outcomes divided by the total number of outcomes.
Deck of Playing Cards
Spades    Hearts    Clubs     Diamonds
Ace       Ace       Ace       Ace
King      King      King      King
Queen     Queen     Queen     Queen
Jack      Jack      Jack      Jack
10        10        10        10
9         9         9         9
8         8         8         8
7         7         7         7
6         6         6         6
5         5         5         5
4         4         4         4
3         3         3         3
2         2         2         2
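As an illustration of P(event) = NEO/NTO, the short Python sketch below counts outcomes in the deck of playing cards. The helper name `probability` and the two example events (drawing a King, drawing a Heart) are chosen here for illustration and are not part of the study guide.

```python
from fractions import Fraction

SUITS = ["Spades", "Hearts", "Clubs", "Diamonds"]
RANKS = ["Ace", "King", "Queen", "Jack", "10", "9", "8",
         "7", "6", "5", "4", "3", "2"]

# Every (rank, suit) pair is one possible outcome of drawing a single card.
DECK = [(rank, suit) for suit in SUITS for rank in RANKS]

def probability(event, outcomes):
    """Return NEO / NTO as an exact fraction."""
    neo = sum(1 for outcome in outcomes if event(outcome))  # event outcomes
    nto = len(outcomes)                                     # total outcomes
    return Fraction(neo, nto)

# Illustrative events (assumed examples)
p_king = probability(lambda card: card[0] == "King", DECK)     # 4 of 52
p_heart = probability(lambda card: card[1] == "Hearts", DECK)  # 13 of 52

print("P(King)  =", p_king, "=", round(float(p_king), 3))
print("P(Heart) =", p_heart, "=", round(float(p_heart), 3))
```

Using `Fraction` keeps the result as an exact proportion; converting to a decimal is only for display.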
4. How can the probability of either of two events be defined and calculated?
The probability of either of two events, P (A or B) = P(A) + P(B) - P (A and B)
If Event A and Event B are mutually exclusive, i.e., they cannot occur simultaneously, then this
formula simplifies to P (A or B) = P(A) + P(B).
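The two versions of the formula can be checked with a small Python sketch. The card events used here (King or Heart, King or Queen) are assumed examples drawn from a standard 52-card deck, and `p_either` is a hypothetical helper name.

```python
from fractions import Fraction

def p_either(p_a, p_b, p_a_and_b=Fraction(0)):
    """P(A or B) = P(A) + P(B) - P(A and B).

    For mutually exclusive events P(A and B) is 0, so the default
    argument gives the simplified formula P(A) + P(B).
    """
    return p_a + p_b - p_a_and_b

# Not mutually exclusive: a card can be both a King and a Heart.
p_king = Fraction(4, 52)
p_heart = Fraction(13, 52)
p_king_of_hearts = Fraction(1, 52)
p_king_or_heart = p_either(p_king, p_heart, p_king_of_hearts)

# Mutually exclusive: a single card cannot be both a King and a Queen.
p_queen = Fraction(4, 52)
p_king_or_queen = p_either(p_king, p_queen)

print("P(King or Heart) =", p_king_or_heart)
print("P(King or Queen) =", p_king_or_queen)
```

Subtracting P(A and B) prevents the King of Hearts from being counted twice, once as a King and once as a Heart.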
5. How is the absolute reduction in risk calculated?
Absolute reduction in risk is the difference in the proportion of cases in two groups.
Typically, the smaller proportion is subtracted from the larger proportion.
6. How is relative risk calculated? What does it tell us?
Relative risk is calculated by dividing the proportion of cases
in one group by the proportion of cases in the other group.
Typically the larger proportion is divided by the smaller
proportion.
Relative risk tells us how much more likely a person in the control group is to have a
condition than a person in the intervention group.
7. What does the base rate tell us?
The base rate is the proportion of people in the control group who have the condition.
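The three quantities in questions 5 to 7 can be computed together. The sketch below uses made-up counts (20 of 100 cases in a control group and 10 of 100 in an intervention group); these numbers are purely hypothetical and serve only to show the arithmetic.

```python
from fractions import Fraction

# Hypothetical data (assumed for illustration, not from the study guide)
control_cases, control_n = 20, 100
intervention_cases, intervention_n = 10, 100

p_control = Fraction(control_cases, control_n)
p_intervention = Fraction(intervention_cases, intervention_n)

larger = max(p_control, p_intervention)
smaller = min(p_control, p_intervention)

# Base rate: proportion of the control group with the condition
base_rate = p_control

# Absolute reduction in risk: larger proportion minus smaller proportion
absolute_risk_reduction = larger - smaller

# Relative risk: larger proportion divided by smaller proportion
relative_risk = larger / smaller

print("Base rate               =", float(base_rate))
print("Absolute risk reduction =", float(absolute_risk_reduction))
print("Relative risk           =", float(relative_risk))
```

With these assumed counts, a person in the control group is twice as likely to have the condition, yet the absolute reduction is only 10 percentage points. The two measures answer different questions, which is why the base rate is reported alongside them.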
II. Sampling
8. What role does sampling play in inferential statistics?
Sampling plays a key role in inferential statistics. Whether or not a sample provides a
sound basis for making an inference about a population parameter from a sample statistic
depends on the method by which the sample is selected and recruited.
9. What is a census? What is a sample? What is sampling?
When a researcher uses a census, every element in the population
is included in the study.
In practice, researchers rarely collect data from every element
in the population. Rather data are collected from a subset or
portion of all elements, called a sample. Sampling is a process
of selecting and recruiting a subset of all of the elements that
constitute a population.
10. What is the definition of a random sample?
A sample is random if each element in the population has an
equal chance of being selected.
11. How can a random sample be distinguished from a stratified (modified)
random sample? What are the advantages and disadvantages of each type of sample?
Stratified (modified) random sampling involves sorting elements
of a population according to a characteristic and making
separate lists of the elements of the population. Then the same
number of elements is drawn randomly from each list.
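The difference between the two procedures can be sketched in Python. The population below (1,000 students sorted by class year) is hypothetical, as are the function names; the standard-library call `random.Random.sample` does the random drawing without replacement.

```python
import random
from collections import Counter, defaultdict

rng = random.Random(230)  # fixed seed (arbitrary) so the run is repeatable

# Hypothetical population: (id, class year) pairs with unequal group sizes
population = (
    [(i, "freshman") for i in range(600)]
    + [(600 + i, "sophomore") for i in range(300)]
    + [(900 + i, "junior") for i in range(100)]
)

def simple_random_sample(elements, n, rng):
    """Every element has an equal chance of being selected."""
    return rng.sample(elements, n)

def stratified_random_sample(elements, per_stratum, rng, key):
    """Sort elements into separate lists by a characteristic, then draw
    the same number of elements at random from each list."""
    lists = defaultdict(list)
    for element in elements:
        lists[key(element)].append(element)
    sample = []
    for stratum in sorted(lists):
        sample.extend(rng.sample(lists[stratum], per_stratum))
    return sample

simple = simple_random_sample(population, 60, rng)
stratified = stratified_random_sample(population, 20, rng, key=lambda e: e[1])

print("Simple:    ", Counter(year for _, year in simple))
print("Stratified:", Counter(year for _, year in stratified))
```

In the simple random sample the number of students from each class year varies by chance, whereas the stratified sample contains exactly 20 from each list by construction.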
12. What is systematic sampling error?
Systematic errors can occur when the sampling is not done on
a completely random basis. Flaws in sampling can be critical
because nonrandom samples can result in statistics that provide
biased estimates of population parameters.
III. Sampling Distributions
13. How is a sampling distribution created?
The sampling distribution of a statistic is created by
drawing all possible samples of a given size from a
population and each time computing the sample statistic.
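Drawing all possible samples is only feasible for a very small population, but doing so once makes the definition concrete. The sketch below assumes a toy population of 6 cards (3 red, 3 black) and a sample size of 2; both choices are illustrative, not taken from the study guide.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Toy population (assumed): 1 = red card, 0 = black card
population = [1, 1, 1, 0, 0, 0]
sample_size = 2

population_proportion = Fraction(sum(population), len(population))

# Draw every possible sample of the given size and compute the
# sample statistic (the proportion of red cards) each time.
sample_proportions = [
    Fraction(sum(sample), sample_size)
    for sample in combinations(population, sample_size)
]

# The sampling distribution: how often each sample proportion occurs
sampling_distribution = Counter(sample_proportions)
mean_of_proportions = sum(sample_proportions) / len(sample_proportions)

print("Number of possible samples:", len(sample_proportions))
for proportion in sorted(sampling_distribution):
    print(f"p = {float(proportion):.2f} occurs "
          f"{sampling_distribution[proportion]} times")
print("Mean of sample proportions:", float(mean_of_proportions))
print("Population proportion:     ", float(population_proportion))
```

Even in this tiny case the sample proportions scatter around the population proportion, and their mean equals it.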
14. What does sampling error refer to?
Sampling error is the discrepancy, or amount of error,
between a sample statistic and its corresponding population
parameter. With respect to a proportion, sampling error
refers to the amount on average that we should expect a
sample proportion to deviate by chance from the population
proportion.
15. What role does a sampling distribution play in
inferential statistics?
Sampling distributions enable us to determine the amount of
sampling error that we should expect when we draw a single
random sample of a given size and have specified a value of π.
16. What obstacle does a statistician have to overcome in
order to use inferential statistics?
Distinguishing among a Population, a Sample, and a Distribution of Sample Proportions
A population consists of all elements that meet the criterion established by the
researcher (say, all registered Democrats in the U.S.). There is only 1 value for the
population proportion, π. In inferential statistics, we start with a hypothesized
specific value of π, typically that π = .50. For example, we might hypothesize that the
proportion of Democrats in the population who intend to vote for Senator Joe Biden
(versus Senator Christopher Dodd) in the Democratic Presidential primary = .50. We
don't know the actual value of π, but we want to determine whether it is plausible that
the population proportion of Democrats who will vote for Biden = .50.
A sample consists of a subset of the elements of our population, say 300 registered
Democratic voters in the U.S. There are many possible samples of 300 cases that can
be drawn from the population of registered Democratic voters in the U.S. In inferential
statistics, we have only 1 sample of n cases and want to test whether our hypothesized
value of π is plausible given the value of our single sample proportion, p_BIDEN, which
equals, say, .55.
A distribution of sample proportions (also called the sampling distribution of the
proportion) consists of the sample proportions (or ps) computed from all possible
samples of a given size (say 300) of registered Democratic voters. The distribution of
sample proportions is needed for us to estimate, assuming our hypothesized value of π
(say .50) is true, how likely or unlikely it is for us to obtain a sample proportion that
deviates by, say, 5 percentage points. It is impractical for researchers to
create the distribution of sample proportions for two reasons: (1) the researcher would
need to collect data from the population to compute the sample proportion for all
possible samples of a given size; (2) the computations involved would be very labor
intensive. To engage in inferential statistics, statisticians had to overcome this obstacle
by figuring out how to estimate the variability in the distribution of sample proportions
(sampling error) when the researcher has data from a single sample.
Illustrating the Process of Creating a Sampling Distribution for
a Proportion
Deck of Cards
Sample #    # of Red Cards Drawn in a Sample of 5 Cards    Proportion of Red Cards
1           ____________________                           ____________________
2           ____________________                           ____________________
3           ____________________                           ____________________
4           ____________________                           ____________________
5           ____________________                           ____________________
6           ____________________                           ____________________
7           ____________________                           ____________________
8           ____________________                           ____________________
9           ____________________                           ____________________
10          ____________________                           ____________________
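The card-drawing exercise can also be simulated. The sketch below draws 10 samples of 5 cards from a 52-card deck (26 red, 26 black) and records the count and proportion of red cards in each. It assumes each sample is drawn without replacement from a full, reshuffled deck, and the seed value is arbitrary.

```python
import random

rng = random.Random(5)  # arbitrary fixed seed so the run is repeatable

# 52-card deck: hearts and diamonds are red, spades and clubs are black
deck = ["red"] * 26 + ["black"] * 26

SAMPLE_SIZE = 5
NUM_SAMPLES = 10

rows = []
for sample_number in range(1, NUM_SAMPLES + 1):
    # Each sample starts from the full deck (assumption: cards are
    # returned and the deck reshuffled between samples).
    hand = rng.sample(deck, SAMPLE_SIZE)
    red_count = hand.count("red")
    rows.append((sample_number, red_count, red_count / SAMPLE_SIZE))

print("Sample #  # Red  Proportion Red")
for sample_number, red_count, proportion in rows:
    print(f"{sample_number:>8}  {red_count:>5}  {proportion:>14.2f}")
```

Ten samples give only a rough picture. The full sampling distribution would require every possible 5-card sample, which is the obstacle described in question 16.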
17. What does the population standard deviation of a proportion for a variable with
two categories equal?
$\sigma_x = \sqrt{\pi(1-\pi)}$
If π is hypothesized to be .50, then $\sigma_x = \sqrt{(.50)(.50)} = .50$.
If π is hypothesized to be .75, then $\sigma_x = \sqrt{(.75)(.25)} = .43$.
18. How can σ_p, the standard deviation of the distribution of sample proportions, be
computed?
$\sigma_p = \sqrt{\pi(1-\pi)} / \sqrt{n} = \sqrt{\pi(1-\pi)/n}$
If π is hypothesized to be .50, and n = 300, then
$\sigma_p = \sqrt{(.50)(.50)/300} = .0289$
19. How does σ_x differ from σ_p?
For σ_x, observations consist of the categories that each individual in a population
is assigned to (e.g., male or female).
For σ_p, observations consist of all possible sample proportions for a designated
value of π and a particular sample size (n).
20. How can we interpret the meaning of σ_p?
σ_p is also known as the standard error of the proportion. The standard error of the
proportion represents an approximate measure of how far, on average, sample
proportions are located from π. σ_p is an index of the amount of sampling error.
21. What factors affect σ_p? How does the size of σ_p change as (a) the deviation of
π from .50 increases; and (b) sample size (n) increases?
As the deviation of π from .50 increases and as n increases, σ_p decreases.
If π is hypothesized to be .75, and n = 300, then
$\sigma_p = \sqrt{(.75)(.25)/300} = .025$
If π is hypothesized to be .50, and n = 500, then
$\sigma_p = \sqrt{(.50)(.50)/500} = .0224$
By increasing n, a researcher can decrease the size of σ_p, which is desirable.
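The worked values in questions 17, 18, and 21 can be reproduced with a few lines of Python. The function names `sigma_x` and `sigma_p` are chosen here to mirror the symbols; they are not from any library.

```python
import math

def sigma_x(pi):
    """Population standard deviation for a two-category variable."""
    return math.sqrt(pi * (1 - pi))

def sigma_p(pi, n):
    """Standard error of the proportion: sqrt(pi * (1 - pi) / n)."""
    return math.sqrt(pi * (1 - pi) / n)

# Question 17
print(f"sigma_x, pi = .50: {sigma_x(0.50):.2f}")
print(f"sigma_x, pi = .75: {sigma_x(0.75):.2f}")

# Questions 18 and 21: vary pi and n one at a time
print(f"sigma_p, pi = .50, n = 300: {sigma_p(0.50, 300):.4f}")
print(f"sigma_p, pi = .75, n = 300: {sigma_p(0.75, 300):.4f}")
print(f"sigma_p, pi = .50, n = 500: {sigma_p(0.50, 500):.4f}")
```

Because n sits under the square root, cutting the standard error in half requires roughly four times as many cases, not twice as many.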
22. How can the sampling distribution of the proportion be created given (a) a value of π; and (b)
a sample size (n)?
[Figure: Sampling Distribution of p when π = .50 and n = 300]
[Figure: Sampling Distribution of p when π = .50 and n = 500]
23. What are the 3 rules for constructing the sampling distribution of proportions?
1. The mean of the distribution of sample proportions will equal the population proportion
(π).
2. The standard deviation of the distribution of sample proportions (σ_p) will equal
$\sqrt{\pi(1-\pi)/n}$
3. If n is equal to or greater than 100, the shape of the distribution of sample proportions
will be approximately normal.
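Taken together, the three rules allow the sampling distribution to be built from π and n alone. The sketch below applies them to the voter example (hypothesized π = .50, n = 300, sample proportion .55) using `statistics.NormalDist` from the Python standard library. Treating the distribution as exactly normal and applying no continuity correction are simplifying assumptions.

```python
import math
from statistics import NormalDist

pi_hypothesized = 0.50   # hypothesized population proportion
n = 300                  # sample size (rule 3 requires n >= 100)
p_sample = 0.55          # observed sample proportion

# Rule 1: the mean of the sampling distribution equals pi.
# Rule 2: its standard deviation is sqrt(pi * (1 - pi) / n).
standard_error = math.sqrt(pi_hypothesized * (1 - pi_hypothesized) / n)

# Rule 3: with n >= 100, treat the shape as normal (an approximation).
sampling_dist = NormalDist(mu=pi_hypothesized, sigma=standard_error)

# Probability of a sample proportion at least this far from pi,
# in either direction, if the hypothesized value is true.
deviation = abs(p_sample - pi_hypothesized)
prob_deviation = (
    sampling_dist.cdf(pi_hypothesized - deviation)
    + (1 - sampling_dist.cdf(pi_hypothesized + deviation))
)

print(f"Standard error: {standard_error:.4f}")
print(f"P(deviation of {deviation:.2f} or more): {prob_deviation:.3f}")
```

Under these assumptions the probability comes out at roughly .08, so a deviation of 5 percentage points is uncommon but not extremely rare when π = .50 and n = 300.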