Self-Assessment Quiz 8 Answers

STATISTICS
1. The online social networking site Facebook has found that the amount of time spent by American high school
students on their site on weeknights is bimodal with mean 118 minutes per night and standard deviation of 14
minutes. What is the probability of collecting a random sample of fifty high school students and finding that the
mean amount of time these students spend on Facebook per weeknight is at most 110 minutes? Answer this
question with a complete English sentence.
---------------------------------------------------------------------------------------------------------------------------
Let $x$ represent the number of minutes per weeknight a high school student spends on Facebook.
Want: $P(\bar{x} \le 110) = P(0 \le \bar{x} \le 110)$, where $\bar{x}$ is the mean Facebook time (in minutes) of a sample of $n = 50$
high school students.
Since a good (unbiased) sampling strategy was employed, and assuming that there are more than 500 high
school students, $10n = 10 \cdot 50 = 500 \le N$. In addition, since $n = 50 \ge 40$, the sampling distribution
of sample means (i.e. the distribution of all possible $\bar{x}$'s) is approximately normal with mean $\mu_{\bar{x}} = \mu = 118$
minutes and standard deviation $\sigma_{\bar{x}} = \frac{14}{\sqrt{50}} \approx 1.98$ minutes.

Hence, $P(\bar{x} \le 110) = P(0 \le \bar{x} \le 110) = \mathrm{normalcdf}\left(0, 110, 118, \frac{14}{\sqrt{50}}\right) \approx 2.7 \times 10^{-5} \approx 0.000027 \approx 0.003\%$.

[Figure: normal curve of the mean Facebook time (in minutes) of a sample of 50 high school students, with 110 and 118 marked on the axis]

Answer: The probability of randomly selecting 50 high school students, from a population of high school
students who spend an average of 118 minutes per weeknight on Facebook, and finding that this sample
spends an average of at most 110 minutes is about 0.003%.
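The calculator result above can be cross-checked in software. The following is a minimal sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the original solution. It ignores the lower bound of 0 used in the normalcdf call, since the area below 0 is negligible here.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 118, 14, 50      # population mean, population SD, sample size (from the problem)
se = sigma / sqrt(n)            # standard deviation of the sample mean, about 1.98

# Sampling distribution of the sample mean, assumed approximately normal (n >= 40)
xbar_dist = NormalDist(mu, se)

# Left-tail probability P(x-bar <= 110)
p_at_most_110 = xbar_dist.cdf(110)

print(f"SE = {se:.2f} minutes")
print(f"P(x-bar <= 110) = {p_at_most_110:.1e}")
```

The printed probability should agree with the calculator value to the precision quoted in the answer.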
-------------------------------------------------------------------------------------------------------------------------------------------
2. Premieres of children’s movies are often accompanied by special promotional offers through one of the leading
fast-food chains. During the many years that McDonald’s has run such promotions, the company has found that
15% of all its customers will buy the specialty item. When Disney re-released its movie “Snow White and the
Seven Dwarfs” on DVD last year, McDonald’s simultaneously initiated a promotional offer enabling its
customers to purchase a specially designed set of drink glasses featuring each of the seven dwarfs. Halfway
through the promotional period, a study was performed at several restaurants throughout the country; 17.6% of
the 500 customers sampled had purchased a set of these glasses. What was the probability of having randomly
sampled 500 customers and finding at least 17.6% of them had purchased a set of the collectible glassware?
WANT $P(\hat{p} \ge 0.176) = P(0.176 \le \hat{p} \le 1)$.
Thus, we need information about the distribution of all possible values of $\hat{p}$.
GIVEN Population: All McDonald’s customers (whether nationwide or worldwide is not clear here)
Parameter: $p = 0.15 = 15\%$
Statistic: With $x$ representing the number of customers who purchased the collectible glassware
amongst 500 McDonald’s patrons, then since $n = 500$, $\hat{p} = \frac{x}{n} = \frac{88}{500} = 0.176 = 17.6\%$.
FACT: The sampling distribution of the sample proportion --- the distribution of all sample proportions ($\hat{p}$'s)
--- will be approximately normally distributed with mean $\mu_{\hat{p}} = p$ and standard deviation
$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{p \cdot (1 - p)}{n}}$
PROVIDED THAT ALL THREE OF THESE CONDITIONS HOLD:
$N \ge 10 \cdot n$ (In English, the population must be at least 10 times larger than the sample to ensure
the trials can be treated as if they’re independent even though sampling is done without replacement)
$np \ge 10$
$nq = n \cdot (1 - p) \ge 10$
Checking all three of these conditions for this specific problem …
Since $n = 500$, the condition $N \ge 10 \cdot n$ can be restated as $N \ge 10 \cdot 500 \Rightarrow N \ge 5{,}000$.
Although I have no actual value for N, the total number of all McDonald’s customers (whether nationwide
or worldwide), I am convinced that there are at least 5,000 of them in this country alone (and certainly
worldwide). Therefore, $N \ge 10 \cdot n$.
$np = (500) \cdot (0.15) = 75 \ge 10$
$nq = n \cdot (1 - p) = (500) \cdot (1 - 0.15) = (500) \cdot (0.85) = 425 \ge 10$
Therefore, we can deduce (i.e. conclude) that the distribution of sample proportions, $\hat{p}$, is approximately
normally distributed with mean $\mu_{\hat{p}} = p = 0.15$ and standard deviation
$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.15)(0.85)}{500}} \approx 0.016$.
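Hence,
$P\left(\hat{p} \ge \frac{88}{500}\right) = P(\hat{p} \ge 0.176) = P(0.176 \le \hat{p} \le 1)$
$= \mathrm{normalcdf}\left(0.176, 1, 0.15, \sqrt{\frac{(0.15)(0.85)}{500}}\right)$
$\approx 0.0517430168$
$\approx 0.0517$
[Figure: normal curve of the sample proportion $\hat{p}$, with 0.15 and 0.176 marked on the axis]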
Answer: Assuming that the proportion of special glassware purchasers amongst all McDonald’s patrons is 0.15,
the proportion of all random samples of 500 McDonald’s customers which contain at least 88 special
glassware purchasers is approximately 0.0517. This is equivalent to saying that the probability of
having sampled 500 McDonald’s customers and finding at least 17.6% (i.e. 88 or more) of them who
had purchased a set of this collectible glassware is approximately 0.0517; that is, only about 5 out of 100 such
groups of 500 McDonald’s customers would contain 17.6% or more people who purchased the specially
designed set of drink glasses featuring each of the seven dwarfs.
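The condition checks and the tail probability for Problem 2 can be reproduced outside the calculator. This is a minimal sketch, assuming Python 3.8+ with the standard-library `statistics.NormalDist`; the variable names are illustrative. The condition $N \ge 10n$ cannot be checked in code because N is unknown, so it is only noted in a comment.

```python
from math import sqrt
from statistics import NormalDist

p, n, buyers = 0.15, 500, 88     # population proportion, sample size, observed purchasers
p_hat = buyers / n               # sample proportion, 0.176

# Success/failure conditions (N >= 10n is assumed, since N is not given)
assert n * p >= 10 and n * (1 - p) >= 10

se = sqrt(p * (1 - p) / n)       # standard deviation of p-hat, about 0.016

# Right-tail probability P(p-hat >= 0.176) under the normal approximation
prob = 1 - NormalDist(p, se).cdf(p_hat)

print(f"SE = {se:.3f}, P(p-hat >= {p_hat}) = {prob:.4f}")
```

The sketch uses the full right tail rather than stopping at 1, which makes no visible difference at four decimal places.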
3. A drug manufacturer has developed a drug that is said to cure postnatal depression. A random sample of 150
women who gave birth, suffered postnatal depression, and who used the drug in a two-year period revealed that
120 of them found it effective. Address parts (a) - (c); you do NOT have to answer with a sentence.
(a) Construct a 99% confidence interval for the percentage of all postnatal depression cases that are cured by
this new drug.
Since a good (unbiased) sampling strategy was employed and since the population of interest here is
comprised of all cases of postnatal depression, the size of this population is at least in the hundreds of
thousands, and so $10n = 10 \cdot 150 = 1500 \le N$. $\hat{p} = \frac{120}{150} = 0.80$.
Also, $n\hat{p} = (150)\left(\frac{120}{150}\right) = (150)(0.80) = 120 \ge 10$ and $n\hat{q} = (150)(1 - 0.80) = (150)(0.20) = 30 \ge 10$.
Consequently, the sampling distribution of proportions is approximately normally distributed, and so we can
use the calculator’s 1-PropZInt function to construct the requested confidence interval. We get the
following rounded values: (0.71587, 0.88413) ≈ (71.6%, 88.4%) .
Answer: Based on this sample, we are 99% confident that the percentage of all postnatal depression cases
that are cured by this new drug is in the (approximate) interval (71.6%, 88.4%) .
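For readers without a TI calculator, the interval can be rebuilt by hand. The sketch below assumes that 1-PropZInt uses the standard one-proportion z-interval, $\hat{p} \pm z^* \sqrt{\hat{p}\hat{q}/n}$, and that Python 3.8+ is available; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n, conf = 120, 150, 0.99      # successes, sample size, confidence level (from the problem)
p_hat = x / n                    # 0.80

# Critical value: leaves (1 - conf)/2 in each tail of the standard normal curve
z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)

half_width = z_star * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - half_width, p_hat + half_width

print(f"z* = {z_star:.3f}")
print(f"99% CI = ({lower:.5f}, {upper:.5f})")
```

Under that assumption, the endpoints should match the calculator output quoted above to about five decimal places.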
(b) Identify the margin of error (to the nearest whole percent) that’s in the parameter’s estimate you gave in
part (a).
Answer: The margin of error is (approximately) 8.4%, or 8% to the nearest whole percent. The interval
(71.6%, 88.4%) is equivalent to 80% ± 8.4%, where $\hat{p} = \frac{120}{150} = 0.80 = 80\%$ and the margin of error is
$ME_{\hat{p}} = 8.4\%$ (since 80% − 8.4% = 71.6% and 80% + 8.4% = 88.4%). Alternatively, the confidence interval is
$\hat{p} \pm ME_{\hat{p}}$, where $\hat{p} = \frac{120}{150} = 0.80 = 80\%$ and
$ME_{\hat{p}} = z^* \sqrt{\frac{\hat{p}\hat{q}}{n}} = 2.58 \sqrt{\frac{(0.8)(0.2)}{150}} \approx 0.084 = 8.4\%$,
where $z^* = \mathrm{InvNorm}\left(0.99 + \frac{0.01}{2}, 0, 1\right) = \mathrm{InvNorm}(0.995, 0, 1) \approx 2.58$.
(c) Suppose p represents the percentage of all postnatal depression cases that are cured by this new drug. Is the
probability that p is in the interval you created in part (a) equal to 99%? Justify your answer clearly and
completely.
Answer: No, the parameter p (the percentage of all postnatal depression cases that are cured by this new
drug) is either in the specific 99% confidence interval (71.6%, 88.4%) , or it’s not.
The method used in part (a) to construct a 99% confidence interval here is what had a 99%
chance of succeeding. That is, before any particular sample was drawn (and before any specific
statistics were available), there was a 99% chance of creating a 99% confidence interval that
captures the parameter p. Whether the specific interval (71.6%, 88.4%) is one of the successful
intervals, or not, is unknown.
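The idea that the 99% belongs to the method, not to any one interval, can be illustrated with a small simulation. This is only a sketch under loud assumptions: the true cure rate `true_p = 0.80` is a hypothetical value chosen for the demonstration (the real p is unknown), the seed and number of trials are arbitrary, and the interval is the standard one-proportion z-interval.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(0)                          # fixed seed so the run is repeatable
true_p, n, conf = 0.80, 150, 0.99       # true_p is HYPOTHETICAL; n and conf match part (a)
z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
trials = 2000                           # number of simulated samples (arbitrary)

captured = 0
for _ in range(trials):
    # Draw one sample of n patients; each is "cured" with probability true_p
    x = sum(random.random() < true_p for _ in range(n))
    p_hat = x / n
    half_width = z_star * sqrt(p_hat * (1 - p_hat) / n)
    # Each individual interval either contains true_p or it does not
    if p_hat - half_width <= true_p <= p_hat + half_width:
        captured += 1

coverage = captured / trials
print(f"Share of intervals capturing true_p: {coverage:.3f}")
```

Each simulated interval is simply a hit or a miss; only the long-run share of hits is close to 99%, which is the distinction the answer to part (c) draws.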
4. The number of hours per week that high school seniors spend on homework is normally distributed
with a mean of 10 hours and a standard deviation of 3 hours. Address parts (a) and (b) below.
(a) What is the probability that one randomly chosen high school senior spends more than
15 hours per week on his or her homework?
Let x represent the number of hours per week that one high school senior spends on homework.
Want: P ( x > 15 ) = P (15 < x ≤ 168 ) since there are at most 168 hours per week ( 7 ⋅ 24 = 168 ) .
Given: The variable I called x is Normally Distributed with mean µ = 10 hours and standard deviation
σ = 3 hours.
Hence, P ( x > 15 ) = P (15 < x ≤ 168 )
= Area under the given Normal
curve between x = 15 & x = 168
= normalcdf (15,168,10, 3)
≈ 0.0477903304 ≈ 0.05
[Figure: normal curve of x, the hours per week a high school senior spends on homework, with 10 and 15 marked on the axis]
Answer: The probability that one randomly chosen high school senior spends more than 15
hours per week on his or her homework is approximately 5%. That is, about 5% of all
high school seniors spend more than 15 hours per week on their homework.
(b) What is the probability that the mean number of hours spent on homework per week of 36 randomly
chosen high school seniors is greater than 15 hours?
Want: $P(\bar{x} > 15)$, where $\bar{x}$ is the mean number of hours spent on homework per week of a sample of
$n = 36$ high school seniors.
Since there is a maximum of 168 hours per week, we want $P(15 < \bar{x} \le 168)$.
Although no actual value for N, the total number of all high school seniors, is given, I believe that it is at least
$10n = 10 \cdot 36 = 360$. Therefore, $10n \le N$. Even though $n = 36 < 40$, we are still able to conclude that the
sampling distribution of the sample mean (i.e. the distribution of all possible $\bar{x}$'s) is approximately normally
distributed because we were told that x is normally distributed in the population.
Specifically, the sampling distribution of the sample mean is approximately normally distributed with mean
$\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. In other words, the sampling distribution here is
$N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) = N\left(10, \frac{3}{\sqrt{36}}\right) = N\left(10, \frac{3}{6}\right) = N\left(10, \frac{1}{2}\right)$,
and so we can use the TI calculator’s normalcdf function.
So, $P(\bar{x} > 15) = P(15 < \bar{x} \le 168)$
= Area under the deduced curve between $\bar{x} = 15$ and $\bar{x} = 168$
$= \mathrm{normalcdf}\left(15, 168, 10, \frac{1}{2}\right)$
≈ 7.77 E − 24 ≈ 8 E − 24
≈ 0.000000000000000000000008
≈ 0.0000000000000000000008% (a very small number)
[Figure: normal curve of $\bar{x}$, the mean number of hours per week spent on homework by a group of 36 random high school seniors, with 10 and 15 marked on the axis]
Answer: The probability that the mean number of hours spent on homework per week of 36 randomly
chosen high school seniors exceeds 15 hours is approximately 0.0000000000000000000008%
… it’s near 0%, but it is not equal to 0% (for this would imply that it was never possible).
That is, it’s highly unlikely to randomly select 36 high school seniors and find that their
average weekly homework time exceeds 15 hours.
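Parts (a) and (b) differ only in the standard deviation used, which a short sketch can make concrete. It assumes Python 3 and uses `math.erfc` rather than `1 - cdf`, because subtracting a value that close to 1 would round to exactly 0 in floating point. The helper name `upper_tail` is illustrative, and the upper bound of 168 hours is ignored because the area beyond it is negligible.

```python
from math import erfc, sqrt

def upper_tail(x, mean, sd):
    """P(X > x) for X ~ Normal(mean, sd), via the complementary error function."""
    return 0.5 * erfc((x - mean) / (sd * sqrt(2)))

mu, sigma, n = 10, 3, 36                        # values given in the problem

p_single = upper_tail(15, mu, sigma)            # part (a): one senior, SD = 3
p_mean = upper_tail(15, mu, sigma / sqrt(n))    # part (b): mean of 36 seniors, SD = 3/6 = 0.5

print(f"one senior:  {p_single:.4f}")
print(f"mean of 36:  {p_mean:.2e}")
```

For part (b) the result is on the order of $10^{-24}$; with tails this extreme the leading digits depend on the numerical method used, so the order of magnitude is the meaningful comparison.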
5. Suppose a 95% confidence interval is accurately computed for µ resulting in the interval (112.4, 121.6).
Identify those statements that are definitely true. Write the number of each true statement in your Blue
Book. If none of the statements are true, write NONE .
(1) 95% of the time, µ falls within the interval (112.4, 121.6).
(2) One can have 95% confidence that µ is 117.
(3) 95% of all possible values for µ will fall within the interval (112.4, 121.6).
(4) 95% of the time, p falls within the interval (112.4, 121.6).
(5) Using this method, 95% of all the possible samples will produce the interval (112.4, 121.6) for µ .
(6) The standard error is 4.6.
(7) µ is 117.
(8) There is a 95% chance that µ will fall within the interval (112.4, 121.6).
-----------------------------------------------------------------------------------------------------------------------------------
The correct conclusions here are:
• One can be 95% confident that µ is in the interval (112.4, 121.6).
• The 95% confidence interval for µ, (112.4, 121.6), can be expressed as 117 ± 4.6.
• $\bar{x} = 117$ and $ME = 4.6$, where $ME = t^* \cdot SE_{\bar{x}}$ with $t^* \doteq 1.960$; thus, $SE_{\bar{x}} = \frac{ME}{t^*} = \frac{4.6}{1.960} \approx 2.35$.
• Before any sample was selected, the method employed here had a 95% chance of creating a 95%
confidence interval that successfully captures µ. Whether the specific interval, (112.4, 121.6), is one
of these, or not, is unknown.
• Any statement involving 95% and the specific interval (112.4, 121.6) that does not contain the phrase
“95% confident” is a false statement.
• Any statement involving the phrase “95% confident” that doesn’t involve an interval of infinitely many
numbers is false.
Answer: NONE
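The quantities quoted in the conclusions (center, margin of error, standard error) follow directly from the interval's endpoints. A minimal sketch, assuming Python 3 and the critical value 1.960 used in the answer key; the variable names are illustrative:

```python
lower, upper = 112.4, 121.6      # the given 95% confidence interval

x_bar = (lower + upper) / 2      # center of the interval = sample mean
me = (upper - lower) / 2         # half-width = margin of error
t_star = 1.960                   # critical value as used in the answer key
se = me / t_star                 # standard error recovered from ME = t* x SE

print(f"x-bar = {x_bar:.1f}, ME = {me:.1f}, SE = {se:.2f}")
```

This shows why "the standard error is 4.6" is false: 4.6 is the margin of error, and the standard error is that value divided by the critical value.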
6. A newspaper reports that the governor’s approval rating stands at 65%. The article adds that the poll is based
on a random sample of 972 adults and has a margin of error of 2.5%. What level of confidence did the pollsters
use?
Since a good sampling strategy (SRS) was used and $10n = 10(972) = 9720 \le N$, where N, the number of adults
in any one state in America (the specific state was not given here), is in the millions, $n\hat{p} = (972)(0.65) = 631.8 \ge 10$,
and $n\hat{q} = (972)(0.35) = 340.2 \ge 10$, the sampling distribution of sample proportions is approximately normal. Therefore,
the margin of error in any confidence interval for the true proportion is $ME_{\hat{p}} = z^* \sqrt{\frac{\hat{p}\hat{q}}{n}}$.
$ME_{\hat{p}} = z^* \sqrt{\frac{\hat{p}\hat{q}}{n}} \Rightarrow 0.025 = z^* \sqrt{\frac{(0.65)(0.35)}{972}} \Rightarrow z^* = \frac{0.025}{\sqrt{\frac{(0.65)(0.35)}{972}}} \approx 1.63$.
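[Figure: standard normal curve with −1.63, 0, and 1.63 marked on the z axis; the area of the region between −1.63 and 1.63 is the level of confidence. On the matching $\hat{p}$ axis, the value of p is unknown.]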
P ( −1.63 ≤ z ≤ 1.63) = normalcdf ( −1.63, 1.63, 0, 1) ≈ 0.89689 ≈ 90%
Answer: Rounded to the nearest whole percent, the pollsters used a 90% confidence level.
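Working backward from a margin of error to a confidence level can also be sketched in code. This assumes Python 3.8+ with the standard-library `statistics.NormalDist`; the variable names are illustrative. The sketch keeps $z^*$ unrounded, so its result can differ from 0.89689 in the third decimal place while still rounding to 90%.

```python
from math import sqrt
from statistics import NormalDist

p_hat, n, me = 0.65, 972, 0.025          # reported approval, sample size, margin of error

se = sqrt(p_hat * (1 - p_hat) / n)       # standard error of p-hat
z_star = me / se                         # solve ME = z* x SE for z*

# Confidence level = central area between -z* and +z* under the standard normal curve
std_normal = NormalDist()
confidence = std_normal.cdf(z_star) - std_normal.cdf(-z_star)

print(f"z* = {z_star:.2f}, confidence level = {confidence:.1%}")
```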