DIRECTIONS: - Mustang Public Schools

AP Statistics
Name _______________________
Sampling Distributions, CLT, and Confidence Intervals
Homework Assignments for Chapter 8:
a. 2, 3, 7, 9
b. 10, 14, 17, 21
c. 23, 25, 29, 31
Homework Assignments for Chapter 9:
a. 1, 5, 7, 9
b. 11, 14, 16, 18
c. 25, 27, 29
d. 30, 33, 35, 39, 41
e. 47, 49, 50, 53, 55, 73
What is a sampling distribution?
Why are they used?
What is sampling variability?
What is the difference between a characteristic (parameter) and a statistic?
Do problems 2, 3, 7, and 9 from chapter 8.
NOTES: Statistics v. Characteristics, Sampling Distributions
Sampling Distribution of Sample Means
A statistic, such as the sample mean (x̄) or the sample proportion (p̂), is a random variable: its value changes from sample to sample.
Suppose MHS has the following population of seven senior football players and their weights in pounds are
listed:
Aaron—220, Brad—200, Chris—170, Doug—180,
Eric—190, Frank—210, George—160.

$\mu = \dfrac{220 + 200 + 170 + 180 + 190 + 210 + 160}{7} = 190$ lbs
$\sigma = 20$ lbs
To create a sampling distribution of sample means, we will find every simple random sample of a specified
sample size, calculate the sample mean, and plot on the dotplot. Since there are seven players, we will look at
all possible combinations of two players resulting in 21 pairs.
Sample   Sample mean      Sample   Sample mean      Sample   Sample mean
AB       210              BD       190              CG       165
AC       195              BE       195              DE       185
AD       200              BF       205              DF       195
AE       205              BG       180              DG       170
AF       215              CD       175              EF       200
AG       190              CE       180              EG       175
BC       185              CF       190              FG       185
Now what if we looked at every possible sample of size three?
Sample  x̄        Sample  x̄        Sample  x̄        Sample  x̄        Sample  x̄
ABC     196.7    ACF     200      AFG     196.7    BDG     180      CEF     190
ABD     200      ACG     183.3    BCD     183.3    BEF     200      CEG     173.3
ABE     203.3    ADE     196.7    BCE     186.7    BEG     183.3    CFG     180
ABF     210      ADF     203.3    BCF     193.3    BFG     190      DEF     193.3
ABG     193.3    ADG     186.7    BCG     176.7    CDE     180      DEG     176.7
ACD     190      AEF     206.7    BDE     190      CDF     186.7    DFG     183.3
ACE     193.3    AEG     190      BDF     196.7    CDG     170      EFG     186.7
What do you notice about the sampling distributions of sample means as the sample size increases from
the parent population?
1) What type of distribution is the parent population?
2) What is the mean (center) of the parent population?
3) What is the standard deviation of the parent population?
4) What shape (type of distribution) is the sampling distribution of sample means of sample size 2?
5) What shape (type of distribution) is the sampling distribution of sample means of sample size 3?
6) What appears to be the center (mean) value of the sampling distribution of sample means of sample size
2?
7) What appears to be the center (mean) value of the sampling distribution of sample means of sample size
3?
8) The standard deviation of the sampling distribution of sample means of sample size 2 is 13.2288 and for
sample size 3 is 9.5665. How does this compare to the parent population standard deviation of 20?
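The two tables above can be rebuilt by brute force. The following is a minimal sketch, not part of the original handout: it lists every possible sample of size 2 and size 3 from the seven weights and summarizes the resulting sample means. The function name `sample_means` is made up for illustration. Both versions of the standard deviation are printed because the values quoted in question 8 appear to come from the "divide by count − 1" formula.

```python
from itertools import combinations
from statistics import mean, pstdev, stdev

# Weights (lbs) of the seven players from the example above.
weights = {"A": 220, "B": 200, "C": 170, "D": 180, "E": 190, "F": 210, "G": 160}

def sample_means(n):
    """Return the mean of every possible sample of n different players."""
    return [mean(combo) for combo in combinations(weights.values(), n)]

for n in (2, 3):
    means = sample_means(n)
    print(f"n={n}: {len(means)} samples, "
          f"center={mean(means):.1f}, "
          f"SD (divide by count)={pstdev(means):.4f}, "
          f"SD (divide by count-1)={stdev(means):.4f}")
```

Comparing the printed centers with the parent mean of 190, and the printed spreads with the parent standard deviation of 20, is one way to check your answers to questions 6 through 8.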
Four Basic Rules of Sampling Distributions of Sample Means with
Central Limit Theorem
Let x̄ represent the sample mean of a simple random sample of sample size n from a parent population having
population mean μ and population standard deviation σ. Then the following will hold true:
1. $\mu_{\bar{x}}$ (sampling distribution) $= \mu_{x}$ (parent population)
2. $\sigma_{\bar{x}}$ (sampling distribution) $= \dfrac{\sigma_{x} \text{ (parent population)}}{\sqrt{n}}$ (as long as n ≤ .05N)
3. When the parent population is normal, the sampling distribution is normal for any sample size, n.
4. By the Central Limit Theorem (CLT): When the sample size, n, is sufficiently large, the sampling
distribution of sample means, x̄, is approximated by a normal (bell-shaped) curve, even if the parent
population distribution is not itself normally distributed.
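As a rough illustration of how the four rules are used together, the sketch below finds the center and spread of the sampling distribution of x̄ and then uses a normal curve to estimate a probability. The population values (mean 50, standard deviation 12) and the sample size of 36 are hypothetical numbers chosen only for this example.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical population values, chosen only for illustration.
mu = 50.0      # population mean
sigma = 12.0   # population standard deviation
n = 36         # sample size

# Rules 1 and 2: center and spread of the sampling distribution of x-bar.
mean_xbar = mu
sd_xbar = sigma / sqrt(n)

# Rules 3 and 4: treat x-bar as (approximately) normal.
xbar_dist = NormalDist(mean_xbar, sd_xbar)
prob_above_53 = 1 - xbar_dist.cdf(53)

print(f"mean of x-bar = {mean_xbar}, SD of x-bar = {sd_xbar}")
print(f"P(x-bar > 53) is about {prob_above_53:.4f}")
```

The same three steps (center, spread, normal curve) apply to any "probability that the sample mean exceeds..." question, provided rule 3 or rule 4 justifies the normal curve.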
Do problems 10, 14, 17, and 21 from chapter 8 in your textbook.
Sampling Distribution of Sample Proportions
Suppose the MHS senior girls were asked if they would ask a high school boy out on a date (including the girl
picks up the tab). Twenty-four senior girls were randomly selected.
Girl       Vote       Girl       Vote       Girl        Vote       Girl     Vote
Andrea     Yes (1)    Grace      No (0)     Melissa     No (0)     Tina     Yes (1)
Becky      Yes (1)    Hillary    Yes (1)    Nancy       No (0)     Urma     No (0)
Cathy      No (0)     Ima        Yes (1)    Opa         No (0)     Velma    No (0)
Delia      Yes (1)    Jenny      Yes (1)    Patty       Yes (1)    Wendy    Yes (1)
Emily      No (0)     Kayla      No (0)     Risa        No (0)     Yami     Yes (1)
Frankie    No (0)     Laura      Yes (1)    Stephanie   Yes (1)    Zelda    No (0)
If we created a sampling distribution of sample proportions, p̂, of sample size 2, there would be
$\binom{24}{2} = \dfrac{24!}{2!\,(24-2)!} = \dfrac{24 \cdot 23 \cdot 22!}{2 \cdot 1 \cdot 22!} = \dfrac{24 \cdot 23}{2} = 276$
possible combinations. For example, counting the Yes votes in each pair: AB = 2, AC = 1, AD = 2,
AE = 1, …, VW = 1, VY = 1, VZ = 0.
We construct the frequency distribution and relative frequency distribution of sampling proportions.
Sample proportion, p̂        Frequency      Relative Frequency
p̂ = 0 where (0+0)/2         66             66/276 = .2391
p̂ = .5 where (0+1)/2        144            144/276 = .5217
p̂ = 1 where (1+1)/2         66             66/276 = .2391
                            Total = 276    = 1.00
[Bar chart: frequency (0 to 200) of each sample proportion p̂ = 0, 0.5, 1 for samples of size 2]
The center (mean) of the sampling distribution of sample proportions of sample size 2 is calculated by
$\sum (\text{value of } \hat{p}) \cdot (\text{probability of } \hat{p}) = 0 \cdot \dfrac{66}{276} + .5 \cdot \dfrac{144}{276} + 1 \cdot \dfrac{66}{276} = .5$.
If we created a sampling distribution of sample proportions, p̂, of sample size 4, there would be 10626
possible combinations. For example: ABCD = 3, ABCE = 2, ABCF = 2,…,VWYZ = 2.
We construct the frequency distribution and relative frequency distribution of sampling proportions.
Sample proportion, p̂             Frequency        Relative Frequency
p̂ = 0 where (0+0+0+0)/4          495              495/10626 = .0466
p̂ = .25 where (0+0+0+1)/4        2640             2640/10626 = .2484
p̂ = .5 where (0+0+1+1)/4         4356             4356/10626 = .4099
p̂ = .75 where (0+1+1+1)/4        2640             2640/10626 = .2484
p̂ = 1 where (1+1+1+1)/4          495              495/10626 = .0466
                                 Total = 10626    = 1.00
[Bar chart: frequency (0 to 5000) of each sample proportion p̂ = 0, 0.25, 0.5, 0.75, 1 for samples of size 4]
The center (mean) of the sampling distribution of sample proportions of sample size 4 is calculated by
$\sum (\text{value of } \hat{p}) \cdot (\text{probability of } \hat{p}) = 0 \cdot \dfrac{495}{10626} + .25 \cdot \dfrac{2640}{10626} + .5 \cdot \dfrac{4356}{10626} + .75 \cdot \dfrac{2640}{10626} + 1 \cdot \dfrac{495}{10626} = .5$
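The frequencies in both tables can be checked without listing all 276 or 10626 samples. The sketch below is an illustration rather than part of the original handout: it relies on the tally of 12 Yes votes and 12 No votes in the list of 24 girls and counts the ways to pick k Yes voters and n − k No voters. The function name `p_hat_distribution` is made up for this example.

```python
from math import comb

YES, NO = 12, 12   # tallied from the 24 votes listed above

def p_hat_distribution(n):
    """Map each possible p-hat to the number of size-n samples that give it."""
    return {k / n: comb(YES, k) * comb(NO, n - k) for k in range(n + 1)}

for n in (2, 4):
    counts = p_hat_distribution(n)
    total = sum(counts.values())
    # Weighted average of the p-hat values, using relative frequency as weight.
    center = sum(p * c for p, c in counts.items()) / total
    print(f"n={n}: counts={counts}, total={total}, center={center}")
```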
What do you notice about the sampling distributions of sample proportions as the sample size increases
from the parent population?
1) What shape (type of distribution) is the sampling distribution of sample proportions of sample size 2?
2) What shape (type of distribution) is the sampling distribution of sample proportions of sample size 4?
3) What appears to be the center (mean) value of the sampling distribution of sample proportions of sample
size 2?
4) What appears to be the center (mean) value of the sampling distribution of sample proportions of sample
size 4?
Three Basic Rules of Sampling Distributions of Sample Proportions
Let p̂ represent the sample proportion of a simple random sample of sample size n from a parent population
having population proportion π. Then the following will hold true:
1. $\mu_{\hat{p}}$ (sampling distribution) $= \pi$ (parent population)
2. $\sigma_{\hat{p}}$ (sampling distribution) $= \sqrt{\dfrac{\pi(1-\pi)}{n}}$ (as long as n ≤ .05N)
3. When the sample size, n, is sufficiently large AND π is not too close to 0 or 1, the sampling distribution
of sample proportions, p̂, is approximated by a normal (bell-shaped) curve. As a rule of thumb, n is
sufficiently large when both nπ ≥ 10 and n(1 − π) ≥ 10.
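The three rules can be bundled into one small helper. This is only a sketch: the function name `p_hat_summary` and the input values are hypothetical, and it does not check the "n is a small fraction of N" condition, since the population size is not an input.

```python
from math import sqrt

def p_hat_summary(pi, n):
    """Center, spread, and rule-of-thumb check for the distribution of p-hat."""
    mean_p_hat = pi                                     # rule 1
    sd_p_hat = sqrt(pi * (1 - pi) / n)                  # rule 2
    large_enough = n * pi >= 10 and n * (1 - pi) >= 10  # rule 3
    return mean_p_hat, sd_p_hat, large_enough

# Hypothetical values, not taken from the problems in this packet.
print(p_hat_summary(0.40, 200))
print(p_hat_summary(0.02, 100))   # n * pi = 2, so the normal curve is doubtful
```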
Now do problems 23, 25, 29, and 31 from chapter 8 in your textbook.
Practice Problems on Sampling Distributions
DIRECTIONS: Free Response questions need to have proper statistical notation and explanations containing
complete sentences. Multiple Choice questions have only one correct answer that needs to be circled.
Multiple Choice
For questions 1-2: A phone-in poll conducted by a newspaper reported that 73% of those who called in liked
business tycoon Donald Trump.
1. The number 73% is a
A. statistic
B. sample
C. parameter
D. population
E. size
2. The unknown true percentage of American citizens that like Donald Trump is a
A. statistic
B. sample
C. parameter
D. population
E. size
3. Which of the following are true?
A. The mean of a population depends on the particular sample chosen.
B. The standard deviations of two different samples from the same population will be the same.
C. Statistical inferences can be used to draw conclusions about samples based on population data.
D. Statistical inferences can be used to draw conclusions about populations based on sample data.
E. None of the above statements are true.
4. Which best describes a sampling distribution of a statistic?
A. It is the probability that the sample statistic equals the parameter of interest.
B. It is the probability distribution of all the values that are contained in all possible samples of the same
size.
C. It is the distribution of all of the statistics calculated from all possible samples of the same size from the
same population.
D. It is the histogram of sample statistics from all possible samples of the same size.
E. It is none of these.
5. A random sample of 50 U.S. working adults was asked to reveal their (gross) annual incomes. The variance
of this sample:
A. is always smaller than the variance of the population.
B. cannot be computed since the population size is not given.
C. equals the variance of the population.
D. is an estimate of the variance in the sampling distribution of the sample means of the gross annual
incomes of all possible samples of any sample size.
E. is an estimate of the variance of the population but may differ from the variance of the population.
For questions 6-8: A survey will ask a random sample of 1500 adults in the OKC area if they support an increase in
the sales tax from 8.25% to 9% with the additional revenue going to education. Suppose we know that π, the
proportion of all OKC adults that support the increase = .30.
6. The mean, $\mu_{\hat{p}}$, of all possible values of the sample proportion in support of the increase will be
A. 8.25%
B. 30%  8.25%
C. 0.30
D. 1500
E. 450
7. The standard deviation of p̂ (known as $\sigma_{\hat{p}}$) is
A. 0.4582
B. 0.2100
C. 0.0118
D. 0.0141
E. 0.0000
8. The probability that p̂ is more than 0.40 is
A. less than 0.0001
B. about 0.100
C. 0.4549
D. 0.50
E. 0.8918
9. A sample of size 49 is drawn from a normal population with a mean of 63 and a standard deviation of 14.
What are the mean and standard deviation of the sampling distribution of sample means?
A. µ = 9, σ = 2
B. µ = 63, σ = .286
C. µ = 63, σ = 2
D. µ = 1.286, σ = 3.5
E. µ = 9, σ = 14
10. The distribution of SAT Math scores of students taking Calculus I at a large university is skewed left with a
mean of 625 and a standard deviation of 44.5. If random samples of 100 students are repeatedly taken, which
statement best describes the sampling distribution of sample means?
A. Normal with a mean of 625 and standard deviation of 44.5.
B. Normal with a mean of 625 and standard deviation of 4.45.
C. Shape unknown with a mean of 625 and standard deviation of 44.5.
D. Shape unknown with a mean of 625 and standard deviation of 4.45.
E. No conclusion can be drawn since the population is not normally distributed.
11. Which of the following statements regarding the sampling distribution of sample means is incorrect?
A. The sampling distribution is approximately normal when the population is normal or the sample size is
sufficiently large.
B. The mean of the sampling distribution is the mean of the population.
C. The standard deviation of the sampling distribution is the standard deviation of the population.
D. The sampling distribution is found by taking repeated samples of the same size from the population of
interest and computing the mean of each sample.
E. All of these are correct.
12. After repeated observations, it has been determined that the waiting time at the drive-through window at a
local bank on Friday afternoons between 12:00 noon and 6:00 pm is skewed left with a mean of 3.5 minutes and
standard deviation of 1.9 minutes. A sample of 100 customers is to be taken next Friday. What is the probability
that the mean of the sample will exceed 4 minutes?
A. .0042
B. .0396
C. .0420
D. .3960
E. The probability cannot be determined using a normal curve approximation.
For questions 13-17: The distribution of actual weights of chocolate bars produced by a certain machine is
normal with mean 8.1 ounces and standard deviation of 0.1 ounces.
13. If a sample of five of these chocolate bars is selected, the probability that their average weight is less than 8
ounces is
A. 0.0125
B. 0.1853
C. 0.4871
D. 0.9873
E. Not enough information provided to answer the question.
14. If a sample of five of these chocolate bars is selected, there is only 5% chance that the average weight of the
sample of five of the chocolate bars will be below
A. 7.94 ounces
B. 8.03 ounces
C. 8.08 ounces
D. 8.20 ounces
E. 8.29 ounces
15. The company sells eight individually wrapped (and independently selected) chocolate bars as a special packaged
product. What is the expected weight of the special packaged product (chocolate only)?
A. 0.8 ounces
B. 8.1 ounces
C. 40.5 ounces
D. 64.8 ounces
E. cannot be determined.
16. What is the standard deviation of the sampling distribution of the weight of the special packaged product of
chocolate only?
A. 0.035 ounces
B. 0.08 ounces
C. 0.2828 ounces
D. 0.8944 ounces
E. cannot be determined.
17. What is the probability that a special packaged product contains more than 66 ounces?
A. 0.0000
B. 0.0010
C. 0.1100
D. 1.0000
E. cannot be determined.
Free-Response
18. For the following situation label the x-axis of the population distribution and the sampling distribution of
sample means. Show how the CLT supports your work in determining the values of the sampling distribution.
A fisherman takes tourists out fishing and has noticed that over the long run the weight of a certain type of fish
typically caught is approximately normal with a mean of 12 pounds and a standard deviation of 4 pounds.
During a good day, the boatload of 16 people catches the limit of 4 fish each, or a total of 64 fish. The statistic
of interest is the mean weight, x̄, of the 64 fish caught on a good day.
Next: Chapter 9
Things to know:
Point Estimate
Mean
Median
Midrange
Mode
Trimmed Mean
Standard Deviation
Variance
Proportion
Statistic vs. Parameter (or Characteristic)
What constitutes a good point estimate?
Create and interpret a large sample confidence interval for the population proportion.
Determine the sample size needed to create a confidence interval with a given error and level of confidence.
Create and interpret a confidence interval for the population mean using z or t as appropriate.
NOTES:
Chapter nine begins with a discussion of point estimates. A point estimate is a statistic computed from a
sample that is representative of the population; that statistic is a point estimate of its associated population
parameter. For example, a value of x-bar from a good random sample is a point estimate of the population value
of mu, and a value of p-hat from a random sample is a point estimate of the population value of pi. Any statistic
taken from a good random sample is a point estimate of its corresponding population parameter.
What makes a good point estimate?
Good point estimates are a combination of two things: they are unbiased and they have a small standard
deviation. However, sometimes a biased point estimate might be better than an unbiased one, if its standard
deviation is much smaller. Your textbook has some good pictures of the relationship balance between bias and
standard deviation.
So what is bias?
Bias when we talk about point estimates is different from bias when we talk about sampling. Sampling bias has
to do with mistakes made when taking a random sample. There may be selection, measurement, or nonresponse
bias. But bias in the context of point estimates has to do with the balance of overestimates and
underestimates by the statistic. In other words, an unbiased statistic is one whose sampling distribution is
centered on the parameter, so that its overestimates and underestimates balance out on average.
Most of the common statistics are unbiased estimators of their associated parameters (x-bar and p-hat are unbiased).
However, s, the sample standard deviation, is a biased estimator of sigma, the population standard deviation.
We still use s because it is the best option we have for estimating the population standard deviation.
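To see the bias of s in a small case, the sketch below reuses the seven football weights from Chapter 8 and lists every ordered sample of size 2 drawn with replacement. Drawing with replacement is an assumption made here so that the two draws are independent. It then compares the average of s² and the average of s with the population values.

```python
from itertools import product
from statistics import mean, pstdev, stdev, variance

population = [220, 200, 170, 180, 190, 210, 160]   # football weights from Chapter 8
sigma = pstdev(population)                          # population standard deviation

# Every ordered sample of size 2 drawn WITH replacement (an assumption made
# here so that the two draws are independent).
samples = list(product(population, repeat=2))

avg_s_squared = mean(variance(s) for s in samples)  # average sample variance
avg_s = mean(stdev(s) for s in samples)             # average sample SD

print(f"sigma^2 = {sigma ** 2:.1f}, average of s^2 = {avg_s_squared:.1f}")
print(f"sigma   = {sigma:.2f}, average of s   = {avg_s:.2f}")
```

In this small setup the average of s² lands on σ², while the average of s falls below σ. That is the sense in which s tends to underestimate the population standard deviation.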
Textbook Homework: 1, 5, 7, 9
 Confidence interval
Confidence Interval for the population proportion
$\pi = \hat{p} \pm E$   - or -   $\hat{p} - E < \pi < \hat{p} + E$
p̂ = Sample Proportion
$E = z_{critical} \cdot \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
1 − α = Confidence Level (α is the percentage in the tails)
Sample is random
n p̂ ≥ 10 and n(1 − p̂) ≥ 10
n is less than 5% of the population
If all of these are true then the distribution is approximately normal and you can
create a Confidence Interval.
Steps to follow…
1. Identify the parameter you are estimating.
2. List the important statistics and their values, including level of confidence.
3. State and verify the requirements.
4. State your intentions.
5. Show the appropriate formula and substitute into it. Find the confidence
interval on the calculator and record the answer.
6. State the confidence interval in a sentence.
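The arithmetic in step 5 can be sketched as follows. This is one possible implementation, not the calculator routine referred to above: the function name `proportion_interval` and the sample counts are hypothetical, and the check for a random sample and for the 5% condition still has to be done by hand.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Large-sample z interval for a population proportion."""
    p_hat = successes / n
    # Requirement from the notes: at least 10 successes and 10 failures.
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("n*p-hat and n*(1 - p-hat) must both be at least 10")
    alpha = 1 - confidence
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_critical * sqrt(p_hat * (1 - p_hat) / n)   # this is E
    return p_hat - margin, p_hat + margin

# Hypothetical data: 120 "yes" answers in a random sample of 400.
low, high = proportion_interval(120, 400)
print(f"95% CI: ({low:.4f}, {high:.4f})")
```

Raising the `confidence` argument makes the critical value larger, so the interval widens.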
Confidence interval
Confidence interval for the population mean
$\bar{x} \pm E$   - or -   $\bar{x} - E < \mu < \bar{x} + E$
$E = z_{critical} \cdot \dfrac{\sigma}{\sqrt{n}}$   - or -   $E = t_{critical} \cdot \dfrac{s}{\sqrt{n}}$
If you know the standard deviation of the population, σ, then create
a z-interval. If you don't know the standard deviation of the population,
but you do know the standard deviation of your sample, s, then
create a t-interval.
Level of confidence = 1 − α
Sample is random
n ≥ 30 so the sample size is large enough for x̄ to be approximately normally
distributed
- or -
The sample can be demonstrated to be consistent with a sample from a population
that is approximately normal. Thus x̄ is approximately normally distributed.
(Demonstrate this using graphical analysis of the sample data or analysis of the
sample statistics.)
If both of these conditions are satisfied you can create a confidence interval.
The steps are the same as before.
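Both versions of the interval share the same arithmetic, which the sketch below makes explicit. The function name `mean_interval` and all of the numbers are hypothetical. The t critical value is typed in from a t table (2.064 is the usual table entry for 95% confidence with 24 degrees of freedom) rather than computed, because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import NormalDist

def mean_interval(x_bar, sd, n, critical):
    """Return x-bar - E and x-bar + E, where E = critical * sd / sqrt(n)."""
    margin = critical * sd / sqrt(n)
    return x_bar - margin, x_bar + margin

# z interval: population sigma known (all numbers hypothetical).
z_critical = NormalDist().inv_cdf(0.975)          # 95% confidence
print(mean_interval(x_bar=72.0, sd=8.0, n=64, critical=z_critical))

# t interval: sigma unknown, so use s and a t critical value from a table.
# 2.064 is the table value for 95% confidence with 24 degrees of freedom.
print(mean_interval(x_bar=72.0, sd=8.0, n=25, critical=2.064))
```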
One Proportion Z Confidence Interval
Finding sample size needed:
1. Joey Boatright and Chris Mersinger are running for class president at MHS. You conduct a sample survey of
the senior class on who plans on voting for Joey instead of Chris (note: there are no other candidates or write-in
options). You will tolerate a 4% margin of error with a 95% confidence level. Assume each candidate is
equally likely to be favored in the senior class (hint: use 0.50 for the sample proportion). How large a sample is
needed?
2. An MHS teacher intends to verify reliable information that the illiteracy rate at MHS is about 2%. How many
randomly selected subjects should be tested if we want 96% confidence that the sample is in error by not more
than one percentage point?
Constructing Confidence Interval for a Sample Proportion:
Example: An Associated Press article on potential violent behavior reported the results of a survey of 750
workers who were employed full time (San Luis Obispo Tribune, Sept. 7 1999). Of those surveyed, 125
indicated that they were so angered by a coworker during the past year that they felt like hitting the person
(but didn’t). Construct a 95% confidence interval based on the information presented.
3. Mrs. Ima Mean considers a multiple choice test to be easy if at least 85% of the responses are correct. A
sample of 175 student responses to one question indicates that 146 of those student responses were correct.
Construct the 98% confidence interval for the true proportion of correct responses. Is it likely that this particular
test question is really easy?
4. A chemistry teacher at MHS has created a wonder spray that he claims eliminates sophomore bad behavior
when 10th graders come into contact with the potion. If 300 10th graders were sprayed, and 47% of them exhibit
no bad sophomore behavior afterwards, would you conclude that his mixture was satisfactory (assuming
satisfactory is at least 50%)? Construct AND interpret the 99% confidence interval.
Verify assumptions:
5. Jaci makes 996 free throws out of 1000 attempts. Verify if a valid confidence interval could be constructed.
Verify assumptions are met AND construct CI:
6. Kim repeatedly selected a card at random from a well shuffled regular 52 card deck. Out of the 50 trials, 17 red
cards were drawn. Construct the 95% confidence interval and use it to decide whether Kim was playing with a full deck.
Textbook Homework: 11, 14, 16, 18
Around the World in 45 Minutes
Sample Size and Confidence Intervals
We will “randomly” select a location on the earth by tossing and catching the globe. Record whether the mark
on your finger lands on land, L, or water, W.
1. How many times do we need to toss and catch the earth to create a “good” confidence interval?
Well…
2. What level of confidence do we want?
3. How close do we want to be to the true proportion?
4. Do we know what the true proportion is or is supposed to be?
If not then estimate it to be .5. This gives us the largest sample size we could possibly need.
Now use the last part of the C.I. formula.
Bound on error (E) = Critical value × standard error
- or -
$E = 1.96 \sqrt{\dfrac{.5(1-.5)}{n}}$ when the desired confidence level is 95% and π is unknown.
Substitute how close you want your estimate to be to π for E, and solve for n.
5. Now toss the world and create a 95% confidence interval for the proportion of the earth covered with
land. Interpret the interval in the context of the problem.
6. What three factors involved in the process of creating a confidence interval determine its width?
7. Can any of these factors be controlled?
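Solving the bound-on-error formula for n gives n = π(1 − π)(z/E)², rounded up to the next whole number. The sketch below carries out that calculation. The function name `sample_size_for_proportion` and the 3-percentage-point target are hypothetical choices for illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence=0.95, pi_guess=0.5):
    """Smallest n with z * sqrt(pi(1 - pi)/n) <= margin."""
    z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = pi_guess * (1 - pi_guess) * (z_critical / margin) ** 2
    return ceil(n)   # always round up to a whole number of tosses

# Hypothetical target: estimate the proportion to within 3 percentage points.
print(sample_size_for_proportion(0.03))

# A guess other than .5 never asks for a larger sample.
print(sample_size_for_proportion(0.03, pi_guess=0.1))
```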
Textbook Homework: 25, 27, 29
Z and t Confidence Interval for Mu
$\mu = \bar{x} \pm z^{*} \dfrac{\sigma}{\sqrt{n}}$   - or -   $\mu = \bar{x} \pm t^{*} \dfrac{s}{\sqrt{n}}$
Normal or t-distribution:
1. Mr. Moore wants to estimate the mean number of Gummi Bears a student can consume in one 55 minute
class period. He randomly selects 30 students using the student rolls and a random number generator,
buys multiple bags of Gummies and starts the timer. He gets an average of 85 GB with a standard
deviation of 9.3 GB.
2. Suppose we know from the M&M/Mars website that the standard deviation of the diameter of an m&m
is .235mm. Estimate the average diameter of an m&m. Your sample of 53 m&ms produces an average
diameter of 9.8mm.
Finding sample size needed:
3. Joey Boatright and Chris Mersinger are running for class president at MHS. You want to know how
smart the average person voting for Joey is. So, you conduct a sample survey of students in the senior
class of those who plan on voting for Joey and measure their IQ. You will tolerate a 4 point margin of
error with a 95% confidence level. Assume that the smartest person planning to vote for Joey has an IQ of
122 and the “least smart” person has an IQ of 76. How large a sample is needed?
4. An MHS teacher intends to verify reliable information that the average “big toe” length at MHS is about
2 inches. Information from previous years has consistently shown that “big toe” lengths for young adults
have a standard deviation of .34 in. How many randomly selected subjects should be tested if we want
96% confidence that the sample is in error by not more than .1 inch?
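For a mean, solving E = z·σ/√n for n gives n = (z·σ/E)², rounded up. The sketch below is a hypothetical illustration with made-up numbers. The "range divided by 4" guess for σ is a common rough rule of thumb for when only the largest and smallest values are known, not a formula from these notes.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin."""
    z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_critical * sigma / margin) ** 2)

# Hypothetical values: sigma = 15, estimate the mean to within 2 units.
print(sample_size_for_mean(sigma=15, margin=2))

# When sigma is unknown, one common rough guess is range / 4.
rough_sigma = (100 - 40) / 4    # hypothetical largest and smallest values
print(sample_size_for_mean(sigma=rough_sigma, margin=2))
```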
Constructing Confidence Interval For Estimating the Population Mean:
5. Five randomly selected students win a free visit to the dentist during National Dental Health week!
They were asked how many months it had been since their last visit. They responded: 6, 17, 11, 22, and
29. Construct a 95% confidence interval for the mean number of months elapsed since the last visit to a
dentist for the population of students that were eligible for the free visit.
6. A chemistry teacher at MHS has created a wonder spray that he claims eliminates sophomore bad
behavior when 10th graders come into contact with the potion but wants to know how long to expect the
potion to work. 30 randomly selected naughty 10th graders were sprayed, and they exhibited no bad
sophomore behavior afterwards for an average of 4 hours with a standard deviation of .5 h. Construct
AND interpret the 99% confidence interval for the teacher.
Verify assumptions:
The following data are the calories per half-cup serving for 16 popular chocolate ice cream brands: 270, 150,
170, 140, 160, 160, 160, 290, 190, 190, 160, 170, 150, 110, 180, 170. Is it reasonable to use a t-confidence
interval to compute a confidence interval for μ, the true mean calories per half-cup serving of chocolate ice
cream?
Textbook Homework: 30, 33, 35, 39, 41
Student t Distribution
Small sample inference for population mean: n < 30
If we know that the original distribution is normal and we know the standard deviation of the population, then
we can construct a confidence interval based on the Standard Normal Distribution.
If we don’t know σ we can sometimes use a Student t distribution to determine the critical values.
Degrees of freedom - the number of scores that can vary after certain restrictions have been imposed on all
scores. Example: If 10 scores have a mean of 80 we can freely assign random values of x to the first 9 scores.
But, the tenth will have to be a specific value to result in the mean of 80.
Degrees of freedom = n-1
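The ten-score example can be checked directly. In this sketch the nine "free" scores are arbitrary hypothetical values. Whatever they are, the tenth score is forced.

```python
# Nine scores chosen freely (hypothetical values).
free_scores = [72, 85, 90, 64, 78, 88, 81, 95, 70]
n = 10
required_mean = 80

# The tenth score has no freedom left: it must make the total equal n * mean.
tenth_score = n * required_mean - sum(free_scores)
all_scores = free_scores + [tenth_score]

degrees_of_freedom = n - 1
print(tenth_score, sum(all_scores) / n, degrees_of_freedom)
```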
Student t requirements
Chapter 9: More practice problems
So, why do people really do statistics? Well, people really do statistics to answer
really important questions about trends and patterns within populations. Now,
some of that can be accomplished by looking at graphs and summary statistics of
samples, but not very accurately because of sampling variability (gasp). You
know. What if you get one of those really weird samples that can just happen by
chance and predict that the average height for 18-year-old females is 6ft? Well, if
you’re working for Aeropostale, you may soon be out of a job.
One alternative might be to study the whole population with which we are
concerned. But as a colleague of mine once said, “We really don’t care that
much…plus we’ve got a life that has a time limit.” So, we need to use a sample,
but in some way that allows us to account for some of the dreaded sampling
variability (gasp).
Enter confidence intervals. In class we discussed using a point estimate from a
sample, creating a margin of error using a certain level of confidence, and combining
the two to create a confidence interval. We also discussed the conditions necessary
for creating confidence intervals for estimating a proportion and for estimating a
mean. Below are some sample problems to help you with the process of writing out
the solution to a confidence interval question.
Jeans: The Aeropostale Corporation is determining what styles of jeans will be
popular next school year. Since skinny jeans have surged in popularity, Aeropostale
is considering carrying more “skinny” styles than in previous years. A randomly
selected sample of 2500 women between the ages of 16 and 21 were asked what style
of jeans they planned to buy for back-to-school next year. 52% responded they would buy skinny jeans. Since
Aeropostale will only increase production if it is clear that a majority of women want the skinny jean look,
should they increase production? Use a confidence interval to support your decision.
Categorical or numerical data?
Requirements: (why?)
List the statistics you know or will need:
Construct the interval: (write the formula)
Write the answer in context:
We are _____% confident that the true (proportion/mean) of
______________________________________is between _________and _________.
What does the confidence level mean?
If this procedure were repeated many times with the same sample size, we would expect about
________% of the resulting intervals to include the true population (proportion/mean) of
_____________________________.
What will you tell the Aeropostale Corporation?
How would raising the confidence level to 98% change the interval?
Grades: This year’s students seem to be doing better than last year’s students. A random sample of semester
grades from this year’s students is listed below. Construct a 95% confidence interval for the mean semester
grade of current students.
Current scores 82 75 68 87 95 86 95 82 74 88 92 90 89 91
Categorical or numerical data?
Requirements: (why?)
List the statistics you know or will need:
Construct the interval: (write the formula)
What does the interval mean? (Write it in context.)
We are _____% confident that the true (proportion/mean) of
______________________________________is between _________and _________.
What does the confidence level mean?
If this procedure were repeated many times with the same sample size, we would expect about
________% of the resulting intervals to include the true population (proportion/mean) of
_____________________________.
How would raising the confidence level to 98% change the interval?
Textbook Homework: 47, 49, 50, 53, 55, 73
He’s angry, but is he right? The following excerpt is taken from a letter written by a corporation president and
sent to the Associated Press.
When you or anyone else attempts to tell me and my associates that 1223 persons account
for our opinions and tastes here in America, I get mad! How dare you! When you or
anyone else tells me that 1223 people represent America, it is astounding and unfair and
should be outlawed.
The writer then goes on to claim that because the sample size of 1223 people represents 120 million people, his
letter represents 98,000 people (120 million divided by 1223) who share the same views.
a. Given that the sample size is 1223 and the degree of confidence is 95%, find the margin of error for the
proportion. Assume that there is no prior knowledge about the value of that proportion.
b. The writer of the letter is taking the position that a sample size of 1223 taken from a population of 120
million people is too small to be meaningful. Do you agree or disagree? Base your answer on your findings
from part a.
It’s Frappy Time!
2002 AP FR Form B Q#4
Each person in a random sample of 1,026 adults in the United States was asked the following question. "Based
on what you know about the Social Security system today, what would you like Congress and the President to
do during this next year?"
The response choices and the percentages selecting them are shown below.
Completely overhaul the system          19%
Make some major changes                 39%
Make some minor changes                 30%
Leave the system the way it is now      11%
No opinion                               1%
a. Find a 95% confidence interval for the proportion of all United States adults who would respond "Make
some major changes" to the question. Give an interpretation of the confidence interval and give an
interpretation of the confidence level.
b. An advocate for leaving the system as it is now commented, "Based on this poll, only 39% of adults in the
sample responded that they want some major changes made to the system, while 41% responded that they
want only minor changes or no changes at all. Therefore, we should not change the system."
Explain why this statement, while technically correct, is misleading.
2003 Form B Q6
Researchers at a large health maintenance organization (HMO) are planning a study of a certain mild illness.
They will select a random sample of patients who are ages 35 to 54 and see if they contract the illness in the
next year. The researchers are interested in estimating the proportions of men and of women who are likely to
develop the illness in each of 4 age-groups: 35-39, 40-44, 45-49, and 50-54.
The researchers plan to include 2,000 patients in the study. Suppose the researchers draw a random sample
from all of the patients at this HMO who are ages 35 to 54 and find the following numbers within each gender
and age group.
           ---------------Age-Group---------------
           35-39     40-44     45-49     50-54
Male       350       230       150       60
Female     445       370       245       150
a) Suppose that at the end of the study, 10 percent of the females in the 40-44 age group contracted the
illness. Calculate a 95 percent confidence interval to estimate the population proportion of females in this
age-group that contracted the illness.
Interpret this confidence interval in the context of this situation.
Interpret the confidence level of 95 percent.
b) Suppose that at the end of the study, 10 percent of the males in the 40-44 age group contracted the
illness. The corresponding 95 percent confidence interval to estimate the population proportion of males in this
age-group that contracted the illness is (0.061, 0.139).
Note that this interval and the interval in part a) are of different lengths even though the two sample
proportions were identical. What would be an alternative way to allocate a sample of 2,000 subjects so that
the 95 percent confidence interval widths for all male age-groups and for all female age-groups (i.e., for all
8 groups) would be the same when the sample proportions are the same? Justify your answer.
c) Based on previous studies, researchers believe that the percentage of those who contract the illness will
be similar for males and females, and therefore plan to ignore gender when selecting a sample for this study.
Previous studies also indicate that the percentage of adults who will contract this illness in the 35-39, 40-44,
45-49, and 50-54 age-groups are anticipated to be 5%, 8%, 20%, and 35%, respectively. How should the sample of
2,000 subjects be allocated with respect to age-groups so that the widths of the 95 percent confidence intervals
for the four groups will be approximately the same? Justify your answer.