A Distribution of Sample Means

OUTLINE
PROBABILITY AND THE NORMAL CURVE
I. Probability
A. Probability and inferential statistics
B. What is probability?
II. The Normal Curve
A. Probability and the Normal Curve
B. Properties of the Standard Normal Curve
C. The Unit Normal Table
III. Solving Problems with the Normal Curve
A. Problem Type 1
B. Problem Type 2
C. Cautions
I. PROBABILITY
A. Probability and Inferential Statistics
• Transition to inferential statistics
• Why is probability so important?
• Links samples and populations
Example 1:
The jar is a “population”
One marble is a “sample”
How likely to get BLACK?
But, isn’t the goal of inferential stats the opposite?
Example 2:
Choose 10 marbles, blindfolded
“Sample” has 8 BLACK & 2 WHITE
Which jar did marbles come from?
This is inferential statistics!
“Judgments under uncertainty”
B. What is probability?
• Likelihood of an “event” occurring
• Can range from 0 (never) to 1.0 (always)
• Defined in terms of a fraction, proportion, or percentage
p(A) = (Number of outcomes classified as A) / (Total number of possible outcomes)
Example #1:
Toss a coin, what is probability of heads?
p(Heads) = 1/2 = .50 = 50%
1 = one way to get heads
2 = two possible outcomes (heads or tails)
Example #2:
Select a card from a deck of 52 cards
What is probability of selecting a king?
p(King) = 4/52 = .077 = 7.7%
4 = four ways to get a king
52 = 52 possible outcomes
• Compute probability from a frequency distribution
p = f/N
X    f
9    1
8    3
7    4
6    2
Σf = N = 10
What is the probability of selecting a score with x = 8?
p(x = 8) = f/N = 3/10 = .30 = 30%
What is the probability of selecting a score with x < 8?
p(x < 8) = f/N = 6/10 = .60 = 60%
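The p = f/N rule can be checked with a short sketch. The dictionary `freq` holds the slide's frequency table; the helper name `probability` is illustrative and not part of the original material.

```python
from fractions import Fraction

# Frequency distribution from the slide: score X -> frequency f
freq = {9: 1, 8: 3, 7: 4, 6: 2}

def probability(freq, condition):
    """Return p = f/N for the scores that satisfy `condition` (hypothetical helper)."""
    n_total = sum(freq.values())                               # N = sum of all f
    f_match = sum(f for x, f in freq.items() if condition(x))  # f for matching scores
    return Fraction(f_match, n_total)

p_eq_8 = probability(freq, lambda x: x == 8)   # f = 3
p_lt_8 = probability(freq, lambda x: x < 8)    # f = 4 + 2 = 6

print(float(p_eq_8), float(p_lt_8))            # 0.3 0.6
```

Using `Fraction` keeps the result exact until it is converted to a decimal or percentage.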
II. THE NORMAL CURVE
A. Probability and the Normal Curve
• Special statistical tool called the Normal Curve
• Theoretical curve defined by mathematical formula
• Known proportions/areas under the curve
• Used to solve problems when we don’t know the population
B. Properties of the Standard Normal Curve
• Theoretical, idealized curve
• Based on mathematical formula
• Bell-shaped, symmetrical, unimodal
• µ = Md = Mo
• 50% of scores above µ, 50% below
• Standardized: µ = 0, σ = 1
• A probability distribution, tails not anchored to axis
• Total area under the curve will sum to 1.0
• Exact percentiles associated with each z-score
• Area under curve provided in Unit Normal Table
• Can be applied to any normal distribution once the distribution is standardized (converted to z-scores)
Why is the normal curve so important?
(1) Many variables normally distributed in population
(2) Can use normal curve to solve many problems
Two types of problems:
(1) What proportion of dist’n falls above, below, or
between particular z-scores?
(2) What z-score is associated with particular
proportions/probabilities under the curve?
C. The Unit Normal Table (UNT)
Appendix B.1:
A = z-scores
B = Proportion in body (larger portion)
C = Proportion in tail (smaller portion)
D = Proportion between µ and z
• Curve is symmetrical, only + z scores shown
• Columns B & C always sum to 1.0
• Proportions/probabilities are always positive
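The table columns can be reproduced numerically. This is a minimal sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist` as a stand-in for the printed table; the function name `unit_normal_row` is hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def unit_normal_row(z):
    """Return (body, tail, between_mean_and_z), mirroring columns B, C, and D."""
    z = abs(z)                         # the table lists only positive z-scores
    body = STANDARD_NORMAL.cdf(z)      # column B: larger portion
    tail = 1.0 - body                  # column C: smaller portion
    between = body - 0.5               # column D: area between the mean and z
    return body, tail, between

body, tail, between = unit_normal_row(1.00)
print(round(body, 4), round(tail, 4), round(between, 4))   # 0.8413 0.1587 0.3413
```

Because the curve is symmetrical, a negative z-score uses the same row; only the side of the curve on which the body and tail fall changes.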
APPENDIX B.1
UNIT NORMAL TABLE
z-score cuts curve into two portions (B & C)
Let’s Practice!
Tip: Always sketch a curve first!
• Examples 1:
What proportion of distribution falls above z = 1.5?
p (z > 1.5)
What proportion falls below z = -.5?
p (z < -0.5)
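One way to check these two practice answers, sketched with the standard-library `statistics.NormalDist` instead of the printed table (the variable names are illustrative):

```python
from statistics import NormalDist

z_dist = NormalDist()                   # standard normal: mean 0, SD 1

p_above_1_5 = 1.0 - z_dist.cdf(1.5)     # tail beyond z = +1.5
p_below_neg_0_5 = z_dist.cdf(-0.5)      # tail below z = -0.5

print(round(p_above_1_5, 4))            # 0.0668
print(round(p_below_neg_0_5, 4))        # 0.3085
```

Both answers are tail proportions (column C): the first for z = 1.50, the second for z = 0.50 mirrored to the negative side.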
Examples 2:
What z-score separates the lower 75% from the
upper 25%? (same as 75th percentile)
What z-score separates the middle 60% of the
distribution from the rest of the distribution?
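These lookups run in the other direction, from a proportion to a z-score. A hedged sketch using `NormalDist.inv_cdf` from the standard library (Python 3.8+); the printed table gives the same values only to two decimals.

```python
from statistics import NormalDist

z_dist = NormalDist()

# 75th percentile: 75% of the distribution lies below this z-score
z_75 = z_dist.inv_cdf(0.75)

# Middle 60%: 20% is left in each tail, so the boundaries are the
# 20th and 80th percentiles, i.e. -z and +z
z_middle_60 = z_dist.inv_cdf(0.80)

print(round(z_75, 2))                                  # 0.67
print(round(-z_middle_60, 2), round(z_middle_60, 2))   # -0.84 0.84
```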
III. SOLVING PROBLEMS WITH THE NORMAL CURVE
HINTS AND TIPS
• Two types of problems:
(1) Finding proportion associated with X or z
(2) Finding X or z associated with proportion
• Problem Type #1 Steps to Follow:
(a) Sketch curve
(b) Convert raw score to z-score
(c) Look up proportion for this z-score
(d) Sometimes add/subtract proportions
• Problem Type #2 Steps to Follow:
(a) Sketch curve
(b) Look up z-score associated with proportion
(c) Convert z-score back to a raw score (X)
**Always sketch a normal curve first!
A. PROBLEM TYPE 1: FINDING AREA UNDER THE
CURVE
Problem #1:
Exam: µ = 60, σ = 10
What percentage will score below 70?
(1) Sketch a normal curve
(2) Convert raw score to z-score
z=
(3) Plan your strategy
(4) Refer to Unit Normal Table (Appendix B.1)
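A possible worked check for Problem #1, with `statistics.NormalDist` standing in for the Unit Normal Table. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 60, 10        # exam parameters given in the problem
x = 70                    # raw score of interest

z = (x - mu) / sigma                  # step 2: raw score -> z-score
p_below = NormalDist().cdf(z)         # step 4: proportion in the body for z = +1.00
percent_below = 100 * p_below

print(f"z = {z:.2f}, percent below = {percent_below:.2f}%")
# z = 1.00, percent below = 84.13%
```

Since 70 is above the mean, "below 70" is the larger portion of the curve, so the strategy is to read column B.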
Problem #2:
Exam: µ = 60, σ = 10
What is percentile rank of student who
scored 55?
(1) Sketch a normal curve
(2) Convert raw score to z-score
z=
(3) Plan your strategy
(4) Refer to UNT (Appendix B.1)
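A sketch of Problem #2 that follows the table logic for a negative z-score. It assumes percentile rank means the percentage of scores at or below the given score; names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 60, 10
x = 55

z = (x - mu) / sigma                       # -0.50, below the mean

# The table lists only positive z, so look up |z| and take the tail (column C)
tail = 1.0 - NormalDist().cdf(abs(z))
percentile_rank = 100 * tail

print(f"z = {z:.2f}, percentile rank = {percentile_rank:.2f}")
# z = -0.50, percentile rank = 30.85
```

The z-score is negative, but the percentile rank is still a positive number.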
Problem #3:
Exam: µ = 60, σ = 10
What proportion of people will score
between 60 and 80?
(1) Sketch a normal curve
(2) Convert raw scores to z-scores
z1 =
z2 =
(3) Plan your strategy
(4) Refer to UNT (Appendix B.1)
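For a "between" question, one option is to subtract two cumulative proportions. A minimal sketch; `proportion_between` is a hypothetical helper, not something defined in the course material.

```python
from statistics import NormalDist

def proportion_between(x_low, x_high, mu, sigma):
    """Proportion of a normal distribution falling between two raw scores."""
    exam = NormalDist(mu=mu, sigma=sigma)
    return exam.cdf(x_high) - exam.cdf(x_low)

p_60_to_80 = proportion_between(60, 80, mu=60, sigma=10)
print(round(p_60_to_80, 4))   # 0.4772
```

Here z1 = 0 (the mean itself) and z2 = +2.00, so the answer is simply column D for z = 2.00.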
Problem #4:
Exam: µ = 60, σ = 10
What proportion of people will score
between 50 and 80?
(1) Sketch a normal curve
(2) Convert raw scores to z-scores
z1 =
z2 =
(3) Plan your strategy
(4) Refer to UNT (Appendix B.1)
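When the two scores sit on opposite sides of the mean, the table strategy is to add the two "between µ and z" areas. A sketch under that reading, again with `NormalDist` replacing the printed table:

```python
from statistics import NormalDist

mu, sigma = 60, 10
std = NormalDist()

z1 = (50 - mu) / sigma        # -1.00, below the mean
z2 = (80 - mu) / sigma        # +2.00, above the mean

# Column D (area between the mean and z) for each side, then add
area_mean_to_z1 = std.cdf(abs(z1)) - 0.5     # about .3413
area_mean_to_z2 = std.cdf(abs(z2)) - 0.5     # about .4772
p_50_to_80 = area_mean_to_z1 + area_mean_to_z2

print(round(p_50_to_80, 4))   # 0.8186
```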
B. PROBLEM TYPE 2: FINDING A SCORE ASSOCIATED
WITH A PROPORTION OR PERCENTILE
Problem #5: Standardized Exam: µ = 60, σ = 10
Assign A+ to the 95th percentile
What is cut-off score for earning an A+?
(1) Sketch curve
(2) Plan your solution
(3) Refer to UNT (Appendix B.1)
z=
(4) Convert z-score back to raw score:
x = µ + zσ
x=
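A hedged check for Problem #5, using `NormalDist.inv_cdf` where the slide uses a reverse table lookup. The table is only tabulated to two decimals in z, so a hand answer may differ slightly.

```python
from statistics import NormalDist

mu, sigma = 60, 10

z_95 = NormalDist().inv_cdf(0.95)     # z-score with 95% of the curve below it
cutoff_a_plus = mu + z_95 * sigma     # x = mu + z * sigma

print(round(z_95, 3), round(cutoff_a_plus, 2))   # 1.645 76.45
```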
Problem #6:
Exam: µ = 60, σ = 10
Assign F to 15th percentile (and below)
What is cut-off score for earning an F?
(1) Sketch curve
(2) Plan your solution
(3) Refer to UNT (Appendix B.1)
z=
(4) Convert z-score back to raw score:
x = µ + zσ
x=
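The same conversion works for a low percentile, where the z-score comes out negative. A sketch; the helper name `cutoff_for_percentile` is an assumption made for illustration.

```python
from statistics import NormalDist

def cutoff_for_percentile(percentile, mu, sigma):
    """Raw score below which `percentile` percent of a normal distribution falls."""
    z = NormalDist().inv_cdf(percentile / 100)
    return z, mu + z * sigma

z_15, cutoff_f = cutoff_for_percentile(15, mu=60, sigma=10)
print(round(z_15, 2), round(cutoff_f, 2))   # -1.04 49.64
```

With the printed table, look up a tail proportion of .15, read z = 1.04, and attach the negative sign because the 15th percentile lies below the mean.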
C. Cautions
In order to use the UNT to solve problems, you must:
• have known µ and σ
• assume your variable is normally distributed
Why?
(1) If you don’t know µ & σ, can’t compute a z-score
(2) If variable is not normally distributed, percentages given by UNT won’t apply!
• z-scores can be negative but proportions/percentiles cannot!
• Pay close attention to the words…
Above, Below, Within, Beyond
THE DISTRIBUTION OF SAMPLE MEANS
Inferential statistics:
Generalize from a sample to a population
Statistics vs. Parameters
Why?
Measuring the whole population is often not possible
Limitation:
Sample won’t precisely reflect population
Samples from same population vary
“sampling variability”
Sampling error = discrepancy between sample
statistic and population parameter
• Extend z-scores and normal curve to SAMPLE MEANS rather than individual scores
• How well will a sample describe a population?
• What is probability of selecting a sample that has a certain mean?
• Sample size will be critical
• Larger samples are more representative
• Larger samples = smaller error
THE DISTRIBUTION OF SAMPLE MEANS
Population of 4 scores: 2, 4, 6, 8
µ = 5
4 random samples (n = 2):
X̄1 = 4, X̄2 = 6, X̄3 = 5, X̄4 = 3
• X̄ is rarely exactly µ
• Most X̄ a little bigger or smaller than µ
• Most X̄ will cluster around µ
• Extreme low or high values of X̄ are relatively rare
• With larger n, X̄s will cluster closer to µ (the distribution of sample means, or DSM, will have smaller error, smaller variance)
A Distribution of Sample Means
[Figure: frequency distribution of the sample means, X̄]
The distribution of sample means for n = 2. This distribution shows the 16
sample means obtained by taking all possible random samples of size n=2 that
can be drawn from the population of 4 scores. The known population mean
from which these samples were drawn is µ = 5.
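The 16 sample means can be enumerated directly. This sketch assumes sampling with replacement where order counts, which is what a total of 4 × 4 = 16 samples implies; the variable names are illustrative.

```python
from itertools import product
from collections import Counter

population = [2, 4, 6, 8]
n = 2

# Every possible sample of size n (with replacement, order counted): 4**2 = 16
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

counts = Counter(sample_means)
mean_of_means = sum(sample_means) / len(sample_means)

for value in sorted(counts):
    print(value, "X" * counts[value])    # crude text histogram of the DSM
print("mean of sample means:", mean_of_means)   # 5.0
```

The counts pile up around 5 (four of the 16 samples) and thin out toward 2 and 8 (one sample each), and the mean of all sample means equals µ = 5.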
THE DISTRIBUTION OF SAMPLE MEANS
• A distribution of sample means (X̄)
• All possible random samples of size n
• A distribution of a statistic (not raw scores)
“Sampling Distribution” of X̄
• Probability of getting an X̄, given known µ and σ
• Important properties
(1) Mean
(2) Standard Deviation
(3) Shape
PROPERTIES OF THE DSM
• Mean?
µ_X̄ = µ
Called expected value of X̄
X̄ is an unbiased estimate of µ
• Standard Deviation?
Any X̄ can be viewed as a deviation from µ
σ_X̄ = Standard Error of the Mean
σ_X̄ = σ / √n
Variability of X̄ around µ
Special type of standard deviation, type of “error”
Average amount by which X̄ deviates from µ
Less error = better, more reliable, estimate of population parameter
• σ_X̄ influenced by two things:
(1) Sample size (n)
Larger n = smaller standard errors
Note: when n = 1 → σ_X̄ = σ
σ as “starting point” for σ_X̄,
σ_X̄ gets smaller as n increases
(2) Variability in population (σ)
Larger σ = larger standard errors
Note: σ_X̄ = σ_M
The distribution of sample means for random samples of size (a) n = 1, (b) n = 4, and (c) n = 100
obtained from a normal population with µ = 80 and σ = 20. Notice that the size of the standard error
decreases as the sample size increases.
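The three standard errors in this figure follow from the rule that the standard error is σ divided by the square root of n. A short sketch (the function name is illustrative):

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

sigma = 20    # population standard deviation from the figure caption
errors = {n: standard_error(sigma, n) for n in (1, 4, 100)}

print(errors)   # {1: 20.0, 4: 10.0, 100: 2.0}
```

With n = 1 the standard error equals σ itself; quadrupling the sample size halves it.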
• Shape of the DSM?
Central Limit Theorem = DSM will approach a normal dist’n as n approaches infinity
Very important!
True even when raw scores NOT normal!
True regardless of µ or σ
What about sample size?
(1) If raw scores ARE normal, any n will do
(2) If raw scores NOT normal, n must be “sufficiently large”
For most distributions → n ≥ 30
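The Central Limit Theorem can be illustrated with a small simulation. This is a sketch under assumptions that are not in the slides: the population is an exponential distribution (strongly skewed, with µ = 1 and σ = 1), n = 30, and 5,000 samples are drawn with a fixed seed so the run is repeatable.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(0)          # fixed seed: illustrative, repeatable run

MU, SIGMA = 1.0, 1.0            # mean and SD of an exponential population, rate 1
N = 30                          # sample size
NUM_SAMPLES = 5000              # number of samples drawn

def draw_sample_mean(n):
    """Mean of one random sample of size n from the skewed population."""
    return sum(rng.expovariate(1.0) for _ in range(n)) / n

sample_means = [draw_sample_mean(N) for _ in range(NUM_SAMPLES)]

mean_of_means = mean(sample_means)      # should land near MU
sd_of_means = stdev(sample_means)       # should land near SIGMA / sqrt(N)
expected_se = SIGMA / math.sqrt(N)

print(f"mean of sample means: {mean_of_means:.3f} (mu = {MU})")
print(f"SD of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {expected_se:.3f})")
```

Even though the raw scores are far from normal, the sample means center on µ with a spread close to σ/√n; a histogram of `sample_means` would look roughly bell-shaped.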
Why are Sampling Distributions important?
• Tells us probability of getting X̄, given µ & σ
• Distribution of a STATISTIC rather than raw scores
• Theoretical probability distribution
• Critical for inferential statistics!
• Allows us to estimate likelihood of making an error when generalizing from sample to popl’n
• Standard error = variability due to chance
• Allows us to estimate population parameters
• Allows us to compare differences between sample means: due to chance or to experimental treatment?
• Sampling distribution is the most fundamental concept underlying all statistical tests
Working with the
Distribution of Sample Means
• If we assume DSM is normal
• If we know µ & σ
• We can use Normal Curve & Unit Normal Table!
z = (X̄ − µ) / σ_X̄
Example #1:
µ = 80, σ = 12
What is probability of getting X̄ ≥ 86 if n = 9?
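A possible worked check for Example #1, reading the question as the probability of a sample mean of 86 or higher. That reading is an assumption, since the comparison symbol is not legible in the source; names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 80, 12, 9
sample_mean = 86

standard_error = sigma / math.sqrt(n)          # 12 / 3 = 4
z = (sample_mean - mu) / standard_error        # (86 - 80) / 4 = 1.50
p_at_least_86 = 1.0 - NormalDist().cdf(z)      # upper tail beyond z = 1.50

print(f"SE = {standard_error}, z = {z:.2f}, p = {p_at_least_86:.4f}")
# SE = 4.0, z = 1.50, p = 0.0668
```

The only change from the individual-score problems is the denominator: the standard error replaces σ.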
Example #1b:
µ = 80, σ = 12
What if we change n = 36?
What is probability of getting X̄ ≥ 86?
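Comparing the two sample sizes side by side shows the effect of n. A sketch, again assuming the question concerns the upper tail (a mean of 86 or higher); `p_mean_at_least` is a hypothetical helper.

```python
import math
from statistics import NormalDist

def p_mean_at_least(sample_mean, mu, sigma, n):
    """Upper-tail probability of a sample mean, assuming a normal DSM."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu) / standard_error
    return z, 1.0 - NormalDist().cdf(z)

results = {n: p_mean_at_least(86, mu=80, sigma=12, n=n) for n in (9, 36)}

for n, (z, p) in results.items():
    print(f"n = {n:2d}: z = {z:.2f}, p = {p:.4f}")
# n =  9: z = 1.50, p = 0.0668
# n = 36: z = 3.00, p = 0.0013
```

Quadrupling n halves the standard error (4 to 2), doubles the z-score, and makes a sample mean of 86 far less likely.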
Example #2:
µ = 80, σ = 12
What X̄ marks the point beyond which sample means are likely to occur only 5% of the time? (n = 9)
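A sketch for Example #2, assuming "beyond" refers to the upper tail, so the cut-off leaves 5% of sample means above it. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 80, 12, 9

standard_error = sigma / math.sqrt(n)        # 4.0
z_cut = NormalDist().inv_cdf(0.95)           # z with 5% of the curve beyond it
mean_cutoff = mu + z_cut * standard_error    # sample mean = mu + z * standard error

print(round(z_cut, 3), round(mean_cutoff, 2))   # 1.645 86.58
```

This is a Type 2 problem applied to sample means: look up z for the proportion, then convert back using the standard error in place of σ.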