AP Class Notes 2-15-13 - Kenston Local Schools

Advanced Placement Statistics
Friday February 15, 2013
1
Daily Agenda
1. Welcome to class
2. Please find your folder and take your seat.
3. Test Review for Probability test
4. Chapter 9 Sampling Distributions continued
5. Valentine and paper return
6. Collect Folders
2
3
4
OTL C9#4
page 588: 9.19
milk in cereal bowl
please follow the directions on slide 14
(example vouchers for schooling)
5
9.19 Do you drink the cereal milk?
A USA Today poll asked a random sample of 1012 U.S. adults what they do with the milk in the bowl after they have eaten the cereal. Of the respondents, 67% said that they drink it. Suppose that 70% of the U.S. adults actually drink the cereal milk.
(1) State the given information.
ρ = the true proportion of U.S. adults who drink the leftover milk in the cereal bowl.
n = 1012, ρ = 0.70, p̂ = 0.67
(2) Check the assumptions (3 conditions).
• p̂ comes from a random sample, so p̂ is an unbiased estimator of ρ: the sampling distribution of p̂ is centered at ρ = 0.70.
• 1012(0.70) = 708.4 ≥ 10 and 1012(0.30) = 303.6 ≥ 10, so the sampling distribution is approximately Normal.
• 10(1012) = 10,120 is less than the total number of U.S. adults, so the standard deviation of the sampling distribution is
$\sigma_{\hat{p}} = \sqrt{\rho(1-\rho)/n}$
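(3) Calculate the probability that 67% or less of a random sample of 1012 U.S. adults drink the leftover milk.
The sampling distribution of p̂ is approximately N(0.70, 0.014):
$\sigma_{\hat{p}} = \sqrt{\frac{0.7(1-0.7)}{1012}} \approx 0.014$
$z = \frac{0.67 - 0.70}{0.014} \approx -2.14$
[Figure: Normal curve with axis ticks at 0.658, 0.672, 0.686, 0.700, 0.714, 0.728, 0.742 and the value 0.67 marked]
$P(\hat{p} \le 0.67) = P(z \le -2.14) \approx 0.0162$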
(4) State your conclusion.
If I take a random sample of size 1012 from a population where ρ = 0.70, the chance of getting a sample proportion of 0.67 or smaller is only 0.0162, or about 1.6 chances out of 100.
MY CONCLUSION: the sample is somehow "biased"; for some reason the researcher did not get a good sample proportion ???
Either
OR
http://www.thatvideosite.com/v/1016/unlucky-bird-gets-slapped
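The 9.19 calculation can be checked with a few lines of Python. This is only a sketch that assumes the standard library's `statistics.NormalDist`; the variable names are made up for illustration. It runs the calculation twice: once with the standard deviation rounded to 0.014 as in the notes, and once unrounded, which gives a slightly larger probability (a little under 0.02).

```python
from math import sqrt
from statistics import NormalDist

n = 1012        # sample size from the poll
rho = 0.70      # assumed true proportion of adults who drink the milk
p_hat = 0.67    # sample proportion reported by the poll

# Rule of thumb 2: both expected counts should be at least 10.
counts_ok = n * rho >= 10 and n * (1 - rho) >= 10

# Standard deviation of the sampling distribution of p-hat.
sd_exact = sqrt(rho * (1 - rho) / n)   # about 0.0144
sd_rounded = round(sd_exact, 3)        # 0.014, the value used in the notes

# z-scores and left-tail probabilities for each version of the sd.
z_rounded = (p_hat - rho) / sd_rounded   # about -2.14
z_exact = (p_hat - rho) / sd_exact       # about -2.08
prob_rounded = NormalDist().cdf(z_rounded)
prob_exact = NormalDist(rho, sd_exact).cdf(p_hat)

print(f"conditions met: {counts_ok}")
print(f"rounded sd: z = {z_rounded:.2f}, P = {prob_rounded:.4f}")
print(f"exact sd:   z = {z_exact:.2f}, P = {prob_exact:.4f}")
```

Either way the probability is small, so the conclusion in step (4) does not depend on the rounding.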
6
7
8
OTL Example page 588: 9.22 HARLEY MOTORCYCLES
Harley Davidson motorcycles make up 14% of all motorcycles registered in the United States. You plan to interview an SRS of 500 motorcycle owners.
a) What is the approximate distribution of the proportion of your sample who own Harleys?
mean of the sampling distribution: $\mu_{\hat{p}} = \rho = 0.14$
standard deviation of the sampling distribution: $\sigma_{\hat{p}} = \sqrt{\frac{0.14(1-0.14)}{500}} \approx 0.0155$
So the sampling distribution of p̂ for n = 500 is approximately Normal in shape, with mean 0.14 and standard deviation 0.0155, or N(0.14, 0.0155).
Prove that we can "do this":
Explain why you can use the formula for the standard deviation of p̂ in this setting (Rule of Thumb 1).
We must check whether N ≥ 10(500).
10(500) = 5,000, and this is definitely less than the number of motorcycles registered in the United States, so N ≥ 5,000. Yes.
Check that you can use the Normal approximation for the distribution of p̂ (Rule of Thumb 2).
We must check whether nρ ≥ 10 and n(1 − ρ) ≥ 10.
500(0.14) = 70 and 70 ≥ 10, yes
500(0.86) = 430 and 430 ≥ 10, yes
We can use the Normal approximation.
b) How likely is your sample to contain 20% or more who own
Harleys? Do a normal probability calculation to answer this question.
N(0.14, 0.0155)
[Figure: Normal curve with axis ticks at 0.0935, 0.109, 0.1245, 0.14, 0.1555, 0.171, 0.1865, 0.202 and the value 0.20 marked]
$z = \frac{0.20 - 0.14}{0.0155} \approx 3.87$
$P(\hat{p} \ge 0.20) = P(z \ge 3.87) = 1 - P(z < 3.87) \approx 0.00$ (less than 0.0001)
Conclusion: There is almost no chance that a random sample of size 500 would produce a sample proportion of 20% or higher.
c) How likely is your sample to contain at least 15% who own
Harleys? Do a normal probability calculation to answer this question.
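N(0.14, 0.0155)
[Figure: Normal curve with axis ticks at 0.0935, 0.109, 0.1245, 0.14, 0.1555, 0.171, 0.1865 and the value 0.15 marked]
$z = \frac{0.15 - 0.14}{0.0155} \approx 0.645$
$P(\hat{p} \ge 0.15) = P(z \ge 0.65) = 1 - 0.7422 \approx 0.258$
Conclusion: There is about a 25.8% chance that a random sample of size 500 contains at least 15% Harley owners.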
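As a check on parts b) and c) of 9.22, here is a short Python sketch. It assumes the standard library's `statistics.NormalDist`, and the variable names are illustrative only. Because it does not round z to two decimals, the part c) answer comes out near 0.26 rather than exactly the table value 0.258.

```python
from math import sqrt
from statistics import NormalDist

rho = 0.14   # proportion of registered motorcycles that are Harleys
n = 500      # size of the SRS

# Rule of Thumb 2: expected counts of Harley and non-Harley owners.
normal_ok = n * rho >= 10 and n * (1 - rho) >= 10

# Sampling distribution of p-hat: approximately N(0.14, 0.0155).
sd = sqrt(rho * (1 - rho) / n)
sampling_dist = NormalDist(rho, sd)

# Right-tail probabilities: "20% or more" and "at least 15%".
prob_20_or_more = 1 - sampling_dist.cdf(0.20)
prob_15_or_more = 1 - sampling_dist.cdf(0.15)

print(f"Normal approximation allowed: {normal_ok}")
print(f"sd = {sd:.4f}")
print(f"P(p-hat >= 0.20) = {prob_20_or_more:.5f}")
print(f"P(p-hat >= 0.15) = {prob_15_or_more:.3f}")
```

Both questions ask for "or more", so each probability is the area to the right of the marked value.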
9
OTL C9#5
page 589: 9.23
please follow the directions on
worksheet
Read and Notes: 9.3 591-607
10
9.23 On-Time Shipping pg 589
Your mail order company advertises that it ships 90% of its orders within three working days. You select an SRS of 100 of the 500 orders received in the past week for an audit. The audit reveals that 86 of these orders were shipped on time.
a) What is the sample proportion of orders shipped on time?
• sample proportion: p̂ = 86/100 = 0.86
• population parameter (proportion): ρ = 0.90
• standard deviation of the sampling distribution of p̂: $\sqrt{(0.9)(0.1)/100} = 0.03$
b) If the company really ships 90% of its orders on time, what is the
probability that the proportion in the SRS of 100 orders is as small as the proportion in your sample or smaller?
N(.9,.03)
11
c) A critic says, "Aha! You claim 90%, but in your sample the on-time percent is lower than that. So the 90% claim is wrong." Explain in simple language why your probability calculation in b) shows that the result of the sample does not refute the 90% claim.
12
13
14
OTL C9#3
page 578: 9.9, 9.10
page 579: 9.13, 9.17
9.9, 9.13, 9.17 sentences please
(You should have a total of 5 sentences)
9.10 choose (low or high bias) and (low or high
variability)
15
16
Exercise 9.9 page 578 IRS Audits
The Internal Revenue Service plans to examine an SRS of individual federal income tax returns from each state. One variable of interest is the proportion of returns claiming itemized deductions. The total number of tax returns in each state varies from over 15 million in California to about 240,000 in Wyoming.
a) Will the sampling variability of the sample proportion change from state to state if an SRS of 2000 tax returns is selected in each state? Explain your answer.
The sampling variability of the sample proportion refers to the spread of the sampling distribution.
If the sample size is the same (n = 2000) for each state, then the sampling variability will be approximately equal. It will not change from state to state, because the spread depends on the sample size n, not on the population size (every state has far more than 10(2000) = 20,000 returns).
b) Will the sampling variability of the sample proportion change from state to state if an SRS of 1% of all tax returns is selected from each state? Explain your answer.
If the sample size is 1% of each state's population size, the sample size will vary from about n = 2,400 in Wyoming to more than n = 150,000 in California.
The sampling variability will change from state to state. It will be smaller for the bigger sample sizes.
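To see the sizes involved in 9.9, here is a small Python sketch. The exercise does not give the true proportion of returns with itemized deductions, so `p_assumed = 0.30` is a made-up value used only for illustration; the function name is also hypothetical. The point is how the spread changes with n, not the exact numbers.

```python
from math import sqrt

def sd_of_p_hat(p, n):
    """Standard deviation of the sampling distribution of p-hat."""
    return sqrt(p * (1 - p) / n)

p_assumed = 0.30                 # hypothetical proportion, NOT from the exercise
wyoming_returns = 240_000        # "about 240,000" in the exercise
california_returns = 15_000_000  # "over 15 million" in the exercise

# Part a): the same n = 2000 in every state gives the same spread.
sd_fixed_n = sd_of_p_hat(p_assumed, 2000)

# Part b): a 1% sample makes n depend on the state's size.
n_wyoming = wyoming_returns // 100
n_california = california_returns // 100
sd_wyoming = sd_of_p_hat(p_assumed, n_wyoming)
sd_california = sd_of_p_hat(p_assumed, n_california)

print(f"n = 2000 in every state: sd = {sd_fixed_n:.4f}")
print(f"Wyoming    n = {n_wyoming:>7}: sd = {sd_wyoming:.4f}")
print(f"California n = {n_california:>7}: sd = {sd_california:.4f}")
```

With the 1% rule the California sample is 62.5 times larger than the Wyoming sample, so its spread is smaller by a factor of √62.5, or about 8.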
17
page 574: 9.10
[Four sampling-distribution figures; fill in low or high for each]
______ bias, ______ variability
______ bias, ______ variability
______ bias, ______ variability
______ bias, ______ variability
18
19
20
9.13 A sample of teens page 579
A study of the health of teenagers plans to measure the blood cholesterol level of an SRS of youths aged 13 to 16. The researchers will report the mean x̄ from their sample as an estimate of the mean cholesterol level μ in this population.
a) Explain to someone who knows no statistics what it means to say that x̄ is an unbiased estimator of μ.
"Unbiased" means the sample was selected by a method that does not systematically push its result (the sample average) above or below the average of the population.
If many other samples of the same size are selected in the same way, the average of all of those sample averages will equal the population average we are trying to estimate.
b) The sample result x̄ is an unbiased estimator of the population mean μ no matter what size SRS the study chooses. Explain to someone who knows no statistics why a large sample gives more trustworthy results than a small sample.
A sampling distribution contains all samples of a given size n, so the center of the entire sampling distribution will be exactly the center of the population. However, smaller sample sizes have more variability than larger ones: the histograms for small n spread out more from left to right than the histograms for big n. So choosing a large n reduces your chances of missing the population center by a large amount.
Individual samples will most likely miss the true center.
Samples with a large n will miss by less "distance" than samples with a small n.
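The idea in 9.13 can be tried out with a simulation. The sketch below is only an illustration: the "population" of cholesterol levels is invented (Normal with mean 170 and standard deviation 30), and the seed, sample sizes, and number of repeats are arbitrary choices. It draws many SRSs of size 10 and of size 100 and compares how far the sample means scatter.

```python
import random
from statistics import mean, pstdev

random.seed(13)  # fixed seed so the run is repeatable

# Hypothetical population of cholesterol levels; not real data.
population = [random.gauss(170, 30) for _ in range(20_000)]
mu = mean(population)

def sample_means(n, repeats=2000):
    """Mean of each of `repeats` simple random samples of size n."""
    return [mean(random.sample(population, n)) for _ in range(repeats)]

small = sample_means(10)    # many small samples
large = sample_means(100)   # many large samples

# Unbiased: both sets of sample means are centered near mu.
center_small = mean(small)
center_large = mean(large)

# Variability: the large samples miss mu by less "distance".
spread_small = pstdev(small)
spread_large = pstdev(large)

print(f"population mean mu = {mu:.1f}")
print(f"n = 10:  center = {center_small:.1f}, spread = {spread_small:.2f}")
print(f"n = 100: center = {center_large:.1f}, spread = {spread_large:.2f}")
```

Both centers land close to μ, which is the "unbiased" part; the spread for n = 100 is roughly a third of the spread for n = 10, which is the "more trustworthy" part.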
21
22
9.17 School Vouchers page 580
A national opinion poll recently estimated that 44% (p̂ = 0.44) of all adults agree that parents of school age children should be given vouchers good for education at any public or private school of their choice. The polling organization used a probability sampling method for which the sample proportion p̂ has a Normal distribution with a standard deviation of about 0.015. If a sample were drawn by the same method from the state of New Jersey (population 8.7 million) instead of from the entire United States (population about 300 million), would this standard deviation be larger, about the same, or smaller? Explain your answer.
TWO ASSUMPTIONS must be made: (1) the sample size is the same for NJ and the USA; (2) the opinions of people in NJ are about the same as those of all people in the USA.
The sampling variability would be about the same for both samples, because the standard deviation of the sampling distribution depends on the sample size, NOT the population size. (Also assume n ≤ 870,000, which is 10% of the NJ population.)
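To put numbers on the 9.17 answer, here is a Python sketch under two loud assumptions that are not in the exercise: the poll behaves like an SRS, so its sample size can be backed out of the reported 0.015, and the finite population correction factor √((N − n)/(N − 1)) is used to show how little the population size matters. That correction is not part of these notes; it is included only to make the comparison visible.

```python
from math import sqrt

p_hat = 0.44          # sample proportion from the poll
sd_reported = 0.015   # standard deviation given in the exercise

# Sample size implied by sd = sqrt(p(1-p)/n), ASSUMING an SRS.
n_implied = round(p_hat * (1 - p_hat) / sd_reported ** 2)   # 1095

def sd_with_correction(p, n, N):
    """sd of p-hat with the finite population correction (assumption)."""
    return sqrt(p * (1 - p) / n) * sqrt((N - n) / (N - 1))

sd_usa = sd_with_correction(p_hat, n_implied, 300_000_000)
sd_nj = sd_with_correction(p_hat, n_implied, 8_700_000)

print(f"implied sample size: {n_implied}")
print(f"USA        (N = 300 million): sd = {sd_usa:.6f}")
print(f"New Jersey (N = 8.7 million): sd = {sd_nj:.6f}")
```

The two standard deviations agree to several decimal places, which is why the answer is "about the same" as long as the sample size is unchanged.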
23
Section 9.3 Sample Means
Notable Changes in the x̄, μ "world"
• Different Notation
• Shape guaranteed by CLT
24
NOTATION for Means
For "both" worlds ...
Population Parameters
N = population size
Sample Statistics
n = sample size
For "quantitative" worlds ... (measurement)
Population Parameters
μ = population mean
σ = population standard deviation
Sample Statistics
x̄ = sample mean
s = sample standard deviation
25
There are three "center" means to understand.
Population mean: μ (pronounced mu)
• It is the mean of the population measurement.
• It is usually unknown, or I call it the (God knows) value.
• This value is permanent for the population.
• Example: The average NBA player is 6'7" or 79 inches tall.
Sample mean: x̄ (pronounced x-bar)
• It is the mean of the sample measurement.
• It is always calculated from one sample.
• This value varies from one sample to the next.
• Example: The average height of a random sample of 10 rookie players in the NBA is 6'9" or 81 inches.
Sampling distribution mean: pronounced master x-bar (our book uses μ_x̄)
• It is the center of the sampling distribution that has the given characteristic.
• It is NEVER calculated from scratch but always known by theory.
• This value is always equal to the population mean. It does not vary.
26
Section 9.3 Book Definitions ... CLT is our focus
27
The CENTRAL LIMIT THEOREM guarantees
the approximately Normal shape of the
sampling distribution.
CENTRAL LIMIT THEOREM APPLET DEMONSTRATION
http://onlinestatbook.com/stat_sim/sampling_dist/index.html
28
29
A Sampling Distribution is a histogram of a sample statistic (x̄ or p̂) over all possible samples of one size.
Every population has a population size equal to N.
There are N total possible sampling distributions, one for each sample size n = 1, 2, ..., N.
Choose a sample size n. This determines the characteristics of the sampling distribution.
How many samples will make up the actual sampling distribution? "N choose n" samples, written NCn: $\binom{N}{n} = \frac{N!}{n!(N-n)!}$
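For a tiny population the complete sampling distribution can actually be listed. The sketch below uses a made-up population of six values (not from the notes) and the standard library's `itertools.combinations` and `math.comb`. It counts the NCn possible samples and checks that the mean of all the sample means equals the population mean.

```python
from itertools import combinations
from math import comb
from statistics import mean

population = [2, 4, 6, 8, 10, 12]   # hypothetical population, N = 6
N = len(population)
n = 3                               # chosen sample size

# Every possible sample of size n (order does not matter).
samples = list(combinations(population, n))
number_of_samples = comb(N, n)      # NCn = 6C3 = 20

# The complete sampling distribution of the sample mean.
sample_means = [mean(sample) for sample in samples]

population_mean = mean(population)
mean_of_sample_means = mean(sample_means)

print(f"number of samples: {len(samples)} (NCn = {number_of_samples})")
print(f"population mean:       {population_mean}")
print(f"mean of sample means:  {mean_of_sample_means}")
```

For a real population NCn is astronomically large, which is why the sampling distribution is described by theory rather than built sample by sample.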
30
RESULTS
If n is "large" enough ... (n is approaching N, n is getting bigger)
Then three things occur ...
1. The shape of the distribution becomes approximately normal
(bell shaped)
2. The mean of the sampling distribution equals the mean of the population.
3. The standard deviation of the sampling distribution
will become smaller (determined by your choice of n)
31
Sampling Distributions of MEANS
WHY the RESULTS
If n is "large" enough ... (n is approaching N, n is getting bigger)
Then three things occur ...
1. The mean of the sampling distribution equals the mean of the population. Guaranteed by RANDOM sampling
2. The shape of the distribution becomes approximately normal
(bell shaped) Guaranteed by CENTRAL LIMIT THEOREM
3. The standard deviation of the sampling distribution
will become smaller (determined by your choice of n)
VARIABILITY is reduced by a bigger sample size n.
Guaranteed by N ≥ 10n (it is THUMB RULE 1). When it holds, we can use the following formula for the standard deviation of the sampling distribution:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
32
CONCLUSION for Sampling Distribution Theory
Once n is selected, the population distribution is no longer appropriate to use as a comparison, because the sample mean x̄ comes from a distribution of sample means, not of individual data points.
A sampling distribution is a "theory" thing: we never actually create it from scratch, because it would be easier to just ask the entire population. HOWEVER ...
We can describe what it would look like as a complete sampling distribution.
1. The distribution is bell shaped (if n is large enough).
2. The center (mean or xmaster) of the sampling distribution equals the mean of the population.
3. The standard deviation of the sampling distribution is less than the standard deviation of the population.
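These three results can be watched in a simulation. The sketch below is an illustration only: the population is an invented right-skewed one (exponential with mean 10), and the seed, n = 40, and the 3000 repeats are arbitrary choices. It compares the center and spread of the simulated sample means with μ and with σ/√n, the standard formula for the standard deviation of a sample mean, and does a rough bell-shape check.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(9)  # fixed seed so the run is repeatable

# Hypothetical right-skewed population; clearly NOT bell shaped.
population = [random.expovariate(1 / 10) for _ in range(20_000)]
mu = mean(population)
sigma = pstdev(population)

n = 40
means = [mean(random.sample(population, n)) for _ in range(3000)]

center = mean(means)            # result 2: should be close to mu
spread = pstdev(means)          # result 3: should be close to sigma / sqrt(n)
predicted_spread = sigma / sqrt(n)

# Result 1 (rough shape check): about 68% of a Normal distribution
# lies within one standard deviation of its mean.
share_within_one_sd = sum(
    abs(m - mu) <= predicted_spread for m in means
) / len(means)

print(f"mu = {mu:.2f}, center of sample means = {center:.2f}")
print(f"sigma / sqrt(n) = {predicted_spread:.2f}, observed spread = {spread:.2f}")
print(f"share within one sd: {share_within_one_sd:.2f}")
```

Even though individual values from this population are strongly skewed, the sample means pile up symmetrically around μ with a much smaller spread.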
33
34
Example page 609: 9.56 HOW MANY PEOPLE IN A CAR ?
A study of rush-hour traffic in San Francisco counts the number of people in each car entering a freeway at a suburban interchange. Suppose that this count has mean 1.5 and standard deviation 0.75 in the population of all cars that enter at this interchange during rush hours.
Given μ = 1.5 and σ = 0.75
a) Could the exact distribution of the count be Normal?
b) Traffic engineers estimate that the capacity of the interchange is 700 cars per hour. According to the central limit theorem, what is the approximate distribution of the mean number of persons x̄ in 700 randomly selected cars at this interchange? Given n = 700
c) What is the probability that 700 cars will carry more than 1075 people? Show all required work and test assumptions. N( , )
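One way to set up parts b) and c) of 9.56 is sketched below in Python. It is not a worked answer from the notes: it assumes the 700 cars act like an SRS, uses σ/√n for the standard deviation of x̄, and turns "more than 1075 people in 700 cars" into "x̄ greater than 1075/700". The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 1.5       # mean people per car (given)
sigma = 0.75   # standard deviation of people per car (given)
n = 700        # number of cars (given)

# Part b): by the CLT, x-bar is approximately N(mu, sigma / sqrt(n)).
sd_xbar = sigma / sqrt(n)   # about 0.0283

# Part c): more than 1075 people in 700 cars means an average
# of more than 1075 / 700 people per car.
threshold = 1075 / n        # about 1.536
z = (threshold - mu) / sd_xbar
prob_more_than_1075 = 1 - NormalDist(mu, sd_xbar).cdf(threshold)

print(f"x-bar is approximately N({mu}, {sd_xbar:.4f})")
print(f"threshold = {threshold:.4f}, z = {z:.2f}")
print(f"P(more than 1075 people) = {prob_more_than_1075:.3f}")
```

The key step is the conversion from a total (1075 people) to a mean per car, so that the question is about x̄ and the sampling distribution applies.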
35
36
The mean of all of the sample means, let's call it xmaster is the same value as the population mean called μ.
xmaster = μ
This fact is the result of random sampling.
Some books use the symbol μ_x̄
37
How big is "large" enough?
a second thumb rule: N ≥ 10n (the population is at least 10 times the sample size)
If this condition holds ... then the standard deviation of the sampling distribution is less than the standard deviation of the population distribution. ??? how much less ??? (see sections 9.2 and 9.3)
38
CONCLUSION for NOTATION
For "both" worlds ...
Population Parameters
N = population size
Sample Statistics
n = sample size
For "quantitative" worlds ... (measurement)
Population Parameters
μ = population mean
σ = population standard deviation
Sample Statistics
x̄ = sample mean
s = sample standard deviation
For "qualitative" worlds ... (surveys and proportions)
Population Parameters
ρ = population proportion
Sample Statistics
p̂ = sample proportion
39
Conclusion
TO REDUCE BIAS: ______
TO REDUCE VARIABILITY: ______
40
41
OTL C9#6 DUE ON TUESDAY February 18
Pg 602: 9.41 a and b, 9.42 b only
ASKIP 9.41
42