Stats Assignment 3

ENGT 3010-002
11/08/2012
Assignment #2
Ryan Rapini
Take Home Assignment III
1. Several million lottery tickets are sold, and 60% of the tickets are held by women.
Five winning tickets will be drawn at random.
a. What is the probability that three or fewer of the winners will be women?
We can model the probability using a binomial distribution function.
$$P(x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x}$$
The probability of three or fewer women winners plus the probability of more than three must equal 1, so we can work with the complement (to save time):
$$P(X \le 3) = 1 - P(X > 3) = 1 - P(X = 4) - P(X = 5)$$
Now we solve using the equation:
$$P(X \le 3) = 1 - \frac{5!}{4!\,1!}(0.6)^{4}(0.4)^{1} - \frac{5!}{5!\,0!}(0.6)^{5}(0.4)^{0} = 0.6630$$
Therefore, there is a 66.3% chance that 3 or fewer of the winners will be women.
b. What is the probability that three of the winners will be of one gender and two of the winners
will be of the other gender?
Using the equation for binomial distribution:
$$P(X = 2) + P(X = 3) = \frac{5!}{2!\,3!}(0.6)^{2}(0.4)^{3} + \frac{5!}{2!\,3!}(0.6)^{3}(0.4)^{2} = 0.5760$$
Therefore, there is a 57.6% chance that 3 of the winners will be of one gender and 2 will be of
the other gender.
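The two binomial results above can be checked numerically. This is a minimal sketch using only Python's standard library; the helper name `binom_pmf` and the variable names are illustrative and are not part of the assignment.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, 0.6  # five winners; each is a woman with probability 0.6

# Part (a): P(X <= 3) through the complement 1 - P(X = 4) - P(X = 5)
p_at_most_3 = 1 - binom_pmf(4, n, p) - binom_pmf(5, n, p)

# Part (b): three winners of one gender and two of the other
p_three_two = binom_pmf(2, n, p) + binom_pmf(3, n, p)

print(f"P(X <= 3)            = {p_at_most_3:.4f}")
print(f"P(X = 2) + P(X = 3)  = {p_three_two:.4f}")
```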
2. Geologists estimate the time since the most recent cooling of a mineral by counting
the number of uranium fission tracks on the surface of the mineral. A certain mineral
specimen is of such an age that there should be an average of 6 tracks per cm² of
surface area. Assume the number of tracks in an area follows a Poisson distribution.
Let X represent the number of tracks counted in 1 cm² of surface area. Find:
a. P(X = 7)
The formula for a Poisson probability mass function is:
$$P(X = x) = e^{-\lambda}\,\frac{\lambda^{x}}{x!}$$
$$P(X = 7) = e^{-6}\,\frac{6^{7}}{7!} = 0.1377$$
Therefore, the probability of there being 7 tracks in 1 cm² of surface area is 0.1377.
b. P(X ≥ 3)
$$P(X \ge 3) = 1 - P(X = 0) - P(X = 1) - P(X = 2)$$
$$= 1 - e^{-6}\,\frac{6^{0}}{0!} - e^{-6}\,\frac{6^{1}}{1!} - e^{-6}\,\frac{6^{2}}{2!}$$
$$= 1 - 0.00248 - 0.01487 - 0.04462 = 0.9380$$
Therefore, the probability of there being 3 or more tracks in 1 cm² of surface area is 0.9380.
c. P(2 < X < 7)
$$P(2 < X < 7) = P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6)$$
$$= e^{-6}\,\frac{6^{3}}{3!} + e^{-6}\,\frac{6^{4}}{4!} + e^{-6}\,\frac{6^{5}}{5!} + e^{-6}\,\frac{6^{6}}{6!}$$
$$= 0.0892 + 0.1339 + 0.1606 + 0.1606 = 0.5443$$
Therefore, the probability of there being more than 2 and fewer than 7 tracks in 1 cm² of surface
area is 0.5443.
d. μ_X
If X ∼ Poisson(λ), then the mean is given by
$$\mu_X = \lambda$$
Therefore, the mean number of tracks is 6.
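e. σ_X
If X ∼ Poisson(λ), then the variance is given by
$$\sigma_X^{2} = \lambda$$
so the standard deviation is
$$\sigma_X = \sqrt{\lambda} = \sqrt{6} \approx 2.4495$$
Therefore, the standard deviation of the number of tracks is 2.4495.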
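As a cross-check on the Poisson parts of this problem, the sketch below evaluates the probability mass function directly with Python's standard library. The function name `poisson_pmf` and the variable names are illustrative choices, not something the assignment specifies.

```python
from math import exp, factorial, sqrt


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)


lam = 6  # average number of tracks per square centimeter

p_eq_7 = poisson_pmf(7, lam)                               # P(X = 7)
p_ge_3 = 1 - sum(poisson_pmf(k, lam) for k in range(3))    # P(X >= 3)
p_between = sum(poisson_pmf(k, lam) for k in range(3, 7))  # P(2 < X < 7)
mean = lam                                                 # mean of a Poisson is lambda
std_dev = sqrt(lam)                                        # standard deviation is sqrt(lambda)

print(f"P(X = 7)     = {p_eq_7:.4f}")
print(f"P(X >= 3)    = {p_ge_3:.4f}")
print(f"P(2 < X < 7) = {p_between:.4f}")
print(f"mean = {mean}, standard deviation = {std_dev:.4f}")
```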
3. Find the area under the normal curve
a. To the left of z = 0.56.
Using the table, the area to the left of z = 0.56 is 0.7123.
b. Between z = −2.93 and z = −2.06.
The area to the left of z = −2.93 is 0.0017, and the area to the left of z = −2.06 is 0.0197.
0.0197 − 0.0017 = 0.0180
c. Between z = −1.08 and z = 0.70.
The area to the left of z = −1.08 is 0.1401, and the area to the left of z = 0.70 is 0.7580.
0.7580 − 0.1401 = 0.6179
d. Outside z = 0.96 to z = 1.62.
The area to the left of z = 0.96 is 0.8315, and the area to the left of z = 1.62 is 0.9474.
1 − (0.9474 − 0.8315) = 0.8841
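The table lookups in this problem can be reproduced with the standard normal cumulative distribution function. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` is available; the variable names are illustrative. Results may differ from the table in the fourth decimal place because the table is rounded.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

area_a = Z.cdf(0.56)                      # left of z = 0.56
area_b = Z.cdf(-2.06) - Z.cdf(-2.93)      # between z = -2.93 and z = -2.06
area_c = Z.cdf(0.70) - Z.cdf(-1.08)       # between z = -1.08 and z = 0.70
area_d = 1 - (Z.cdf(1.62) - Z.cdf(0.96))  # outside z = 0.96 to z = 1.62

for label, area in [("a", area_a), ("b", area_b), ("c", area_c), ("d", area_d)]:
    print(f"{label}. {area:.4f}")
```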
4. The lifetime of a lightbulb in a certain application is normally distributed with mean
μ = 1400 hours and standard deviation σ = 200 hours.
a. What is the probability that a lightbulb will last more than 1800 hours?
$$z = \frac{1800 - 1400}{200} = 2$$
Using the table, the area to the left of a z score of 2 is 0.9772.
1 − 0.9772 = 0.0228
b. Find the 10th percentile of the lifetimes.
The 10th percentile corresponds most closely to a z score of −1.28 in the book's table.
$$-1.28 = \frac{x - 1400}{200}$$
x = 1144
The 10th percentile of the lifetimes is 1144 hours.
c. A particular light bulb lasts 1645 hours. What percentile is its lifetime on?
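$$z = \frac{1645 - 1400}{200} = 1.225 \approx 1.23$$
From the table, the area to the left of z = 1.23 is 0.8907, so the light bulb is at approximately the 89th percentile.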
d. What is the probability that the lifetime of a lightbulb is between 1350 and 1550 hours?
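Using the formula:
$$z = \frac{x - 1400}{200}$$
$$z = \frac{1350 - 1400}{200} = -0.25 \qquad z = \frac{1550 - 1400}{200} = 0.75$$
The area to the left of a z score of 0.75 is 0.7734.
The area to the left of a z score of −0.25 is 0.4013.
0.7734 − 0.4013 = 0.3721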
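The four lightbulb questions can also be worked without a table. This sketch assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative. Because it does not round z to two decimals, its answers can differ slightly from table-based ones.

```python
from statistics import NormalDist

life = NormalDist(mu=1400, sigma=200)  # lifetime in hours

p_over_1800 = 1 - life.cdf(1800)              # (a) lasts more than 1800 hours
tenth_pct = life.inv_cdf(0.10)                # (b) 10th percentile of lifetimes
pct_1645 = life.cdf(1645)                     # (c) fraction of bulbs lasting less than 1645 hours
p_between = life.cdf(1550) - life.cdf(1350)   # (d) between 1350 and 1550 hours

print(f"(a) P(X > 1800)        = {p_over_1800:.4f}")
print(f"(b) 10th percentile    = {tenth_pct:.1f} hours")
print(f"(c) percentile of 1645 = {100 * pct_1645:.1f}")
print(f"(d) P(1350 < X < 1550) = {p_between:.4f}")
```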
5. A film-coating process produces films whose thicknesses are normally distributed
with a mean of 110 microns and a standard deviation of 10 microns. For a certain
application, the minimum acceptable thickness is 90 microns.
a. What proportion of films will be too thin?
$$P(T < 90), \qquad z = \frac{90 - 110}{10} = -2$$
For a normal distribution, the area to the left of a z score of −2 is 0.0228, so 2.28% of the films will be
too thin. This checks out, since about 95% of the values in a normal distribution lie within 2 standard
deviations of the mean, leaving roughly 2.3% in each tail.
b. To what value should the mean be set so that only 1% of the films will be too thin?
A z score of −2.33 gives a probability of 0.0099, or approximately 1% of the films. Now we
solve backwards to find the appropriate mean, x.
$$z = -2.33 = \frac{90 - x}{10}$$
$$x = 113.3$$
The mean should be set to 113.3 microns.
c. If the mean remains at 110, what must the standard deviation be so that only 1% of the films
will be too thin?
A z score of −2.33 gives a probability of 0.0099, or approximately 1% of the films. Now we
solve backwards to find the appropriate standard deviation, x.
$$z = -2.33 = \frac{90 - 110}{x}$$
$$x = 8.58369$$
The standard deviation must be about 8.58 microns.
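The back-solving in parts (b) and (c) can be sketched in code. This assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative. The code uses the exact 1% quantile (about −2.326) rather than the table value −2.33, so its results differ slightly from the hand calculation.

```python
from statistics import NormalDist

min_ok = 90.0          # minimum acceptable thickness, microns
mean, sd = 110.0, 10.0

# (a) proportion of films thinner than the minimum
p_too_thin = NormalDist(mean, sd).cdf(min_ok)

# z value that leaves 1% of the area to its left
z01 = NormalDist().inv_cdf(0.01)

# (b) solve z01 = (min_ok - new_mean) / sd for new_mean
new_mean = min_ok - z01 * sd

# (c) solve z01 = (min_ok - mean) / new_sd for new_sd
new_sd = (min_ok - mean) / z01

print(f"(a) proportion too thin = {p_too_thin:.4f}")
print(f"(b) required mean       = {new_mean:.2f} microns")
print(f"(c) required std. dev.  = {new_sd:.2f} microns")
```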
6. The molarity of a solute in solution is defined to be the number of moles of solute per
liter of solution (1 mole = 6.02 × 10²³ molecules). If X is the molarity of a solution of
sodium chloride (NaCl), and Y is the molarity of a solution of sodium carbonate
(Na₂CO₃), the molarity of sodium ion (Na⁺) in a solution made of equal parts NaCl and
Na₂CO₃ is given by M = 0.5X + Y. Assume X and Y are independent and normally
distributed, and that X has mean 0.450 and standard deviation 0.050, and Y has mean
0.250 and standard deviation 0.025.
d. What is the distribution of M?
One property of normal distributions is that a linear combination of independent normal random
variables is itself normally distributed. Therefore, M is normally distributed.
We can compute the mean and variance by:
μX = 0.450 and μY = 0.250
μM = 0.5μX + μY = 0.5 ( 0.450 ) + ( 0.250 ) = 0.475
and
σX = 0.050 and σY = 0.025
σ²M = 0.5² σ²X + σ²Y = 0.5² ( 0.050 )² + ( 0.025 )² = 0.00125
M ~ N ( 0.475, 0.00125 )
e. Find P(M > 0.5).
Z = ( 0.5 − 0.475 ) / √0.00125 = 0.7071
According to the table, the area to the left of a z score of 0.71 is 0.7611, and since we’re looking for
the right side of the curve we subtract that from 1:
1 − 0.7611 = 0.2389
P ( M > 0.5 ) = 0.2389
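The distribution of M can be checked numerically. This is a sketch under the problem's stated assumptions (X and Y independent and normal), using Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative. The variance of M is built from the standard deviations 0.050 and 0.025, not from the means, and the code uses the unrounded z value, so its probability differs slightly from a two-decimal table lookup.

```python
from math import sqrt
from statistics import NormalDist

mu_x, sd_x = 0.450, 0.050  # molarity of NaCl solution
mu_y, sd_y = 0.250, 0.025  # molarity of Na2CO3 solution

# M = 0.5 X + Y, with X and Y independent
mu_m = 0.5 * mu_x + mu_y
var_m = 0.5**2 * sd_x**2 + sd_y**2  # coefficients are squared for the variance

M = NormalDist(mu_m, sqrt(var_m))
p_m_over_half = 1 - M.cdf(0.5)

print(f"mean of M     = {mu_m:.3f}")
print(f"variance of M = {var_m:.5f}")
print(f"P(M > 0.5)    = {p_m_over_half:.4f}")
```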
Part B
1. An insurance company offers a discount to homeowners who install smoke detectors
in their homes. A company representative claims that 80% or more of policyholders
have smoke detectors. You draw a random sample of eight policyholders. Let X be the
number of policyholders in the sample who have smoke detectors.
a. If exactly 80% of the policyholders have smoke detectors (so the representative’s claim is true,
but just barely), what is P(X ≤ 1)?
With X ∼ Bin(8, 0.8), P(X ≤ 1) = P(X = 0) + P(X = 1). The probability that one or fewer
policyholders own smoke detectors is 0.00008, or 0.008%.
b. Based on the answer to part (a), if 80% of the policyholders have smoke detectors, would one
policyholder with a smoke detector in a sample of size 8 be an unusually small number?
Yes. If 80% of policyholders have a smoke detector, the probability that seven or more of the
eight sampled have none is very low (about 0.008%).
c. If you found that one of the eight sample policyholders had a smoke detector, would this be
convincing evidence that the claim is false? Explain.
Yes, because the probability of this happening if the representative's claim is true is
very low.
d. If exactly 80% of the policyholders have smoke detectors, what is P(X ≤ 6)?
With X ∼ Bin(8, 0.8), P(X ≤ 6) = 1 − P(X = 7) − P(X = 8). The probability that six or fewer
policyholders own smoke detectors is 0.49668, or 49.7%.
e. Based on the answer to part (d), if 80% of the policyholders have smoke detectors, would six
policyholders with smoke detectors in a sample of size 8 be an unusually small number?
No. If the claim is true, there is almost a 50% chance of finding six or fewer smoke
detectors in the sample, and almost a 30% chance of finding exactly six.
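The probabilities quoted in this problem follow from the binomial distribution with n = 8 and p = 0.8. The sketch below reproduces them using only the standard library; `binom_cdf` is an illustrative helper name, not part of the assignment.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Bin(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n, p = 8, 0.8  # eight policyholders; each has a detector with probability 0.8

p_le_1 = binom_cdf(1, n, p)                       # parts (a) to (c)
p_le_6 = binom_cdf(6, n, p)                       # parts (d) and (e)
p_eq_6 = binom_cdf(6, n, p) - binom_cdf(5, n, p)  # exactly six detectors

print(f"P(X <= 1) = {p_le_1:.5f}")
print(f"P(X <= 6) = {p_le_6:.5f}")
print(f"P(X = 6)  = {p_eq_6:.5f}")
```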
2. If X ∼ N(2, 9), compute:
In the N(μ, σ²) notation the second parameter is the variance, so μ = 2, σ = 3, and z = (x − 2)/3.
a. P(X ≥ 2)
Since 2 is the mean, there is a probability of 0.5, or a 50% chance, that X will be greater
than or equal to 2.
b. P(1 ≤ X < 7)
P(X < 1) = P(Z < −0.3333) = 0.369441
P(X < 7) = P(Z < 1.6667) = 0.952210
P(X < 7) − P(X < 1) = 0.952210 − 0.369441 = 0.582769
There is a probability of 0.582769, or a 58.3% chance, that X will
be greater than or equal to 1 and less than 7.
c. P(−2.5 ≤ X < −1)
P(X < −2.5) = P(Z < −1.5) = 0.066807
P(X < −1) = P(Z < −1) = 0.158655
P(X < −1) − P(X < −2.5) = 0.158655 − 0.066807 = 0.091848
There is a probability of 0.091848, or a 9.2% chance, that X will be
greater than or equal to −2.5 and less than −1.
d. P(−3 ≤ X − 2 < 3)
Adding 2 throughout gives −1 ≤ X < 5.
P(X < −1) = P(Z < −1) = 0.158655
P(X < 5) = P(Z < 1) = 0.841345
P(X < 5) − P(X < −1) = 0.841345 − 0.158655 = 0.682690
There is a probability of 0.682690, or a 68.3% chance, that X will
be greater than or equal to −1 and less than 5.
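The four probabilities for X ∼ N(2, 9) can be computed directly. This sketch rests on one assumption that should be checked against the course's convention: the second parameter, 9, is read as the variance, so the standard deviation passed to the code is 3. It requires Python 3.8 or later for `statistics.NormalDist`; `prob_between` is an illustrative helper name.

```python
from statistics import NormalDist

# Assumption: N(2, 9) means mean 2 and variance 9, so sigma = 3.
X = NormalDist(mu=2, sigma=3)


def prob_between(a: float, b: float) -> float:
    """P(a <= X < b); for a continuous variable the endpoints do not matter."""
    return X.cdf(b) - X.cdf(a)


p_a = 1 - X.cdf(2)             # P(X >= 2)
p_b = prob_between(1, 7)       # P(1 <= X < 7)
p_c = prob_between(-2.5, -1)   # P(-2.5 <= X < -1)
p_d = prob_between(-1, 5)      # P(-3 <= X - 2 < 3), i.e. -1 <= X < 5

for label, value in [("a", p_a), ("b", p_b), ("c", p_c), ("d", p_d)]:
    print(f"{label}. {value:.4f}")
```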
3. As part of a quality-control study aimed at improving a
production line, the weights (in ounces) of 50 bars of soap
are measured. The results are as follows, sorted from
smallest to largest.
a. Compute mean and standard deviation of data.
The mean of the data is 16.03 ounces.
The standard deviation of the data is 1.949 ounces.
b. Check if these data fit a Normal Distribution Curve. Print the probability plot and answer if the
data came from a normal distribution.
[Figure: Probability Plot of x (Normal, 95% CI). Mean = 16.03, StDev = 1.949, N = 50, AD = 0.327, P-Value = 0.512. Vertical axis: Percent; horizontal axis: x.]
The data are consistent with a normal distribution: the Anderson-Darling p-value of 0.512 is well above 0.05, so normality is not rejected.
c. Construct a normal distribution curve with the mean and standard deviation you computed.
[Figure: Distribution Plot for Problem 2, Section 4.10. Normal, Mean = 16.034, StDev = 1.949. Vertical axis: Density; horizontal axis: Weights (in ounces).]
d. Compute the z-score for each data value using the
mean and standard deviation you computed.
Okay.
e. Compute probability value for z-score column and
store in the next column
Okay.
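Steps (d) and (e) can be sketched in code as follows. The mean and standard deviation are the values reported for this data set (16.034 and 1.949), but the soap weights themselves are not reproduced in this text, so the `weights` list below is a hypothetical stand-in, not the assignment data. It is also an assumption which "probability value" the worksheet stored, so the sketch computes both the standard normal density and the cumulative probability for each z-score. Python 3.8 or later is assumed for `statistics.NormalDist`.

```python
from statistics import NormalDist

mean, sd = 16.034, 1.949  # values reported for the 50 soap bars

# Hypothetical weights in ounces, used only to illustrate the calculation.
weights = [12.5, 14.8, 16.0, 17.3, 19.6]

std_normal = NormalDist()

# (d) z-score for each data value
z_scores = [(w - mean) / sd for w in weights]

# (e) probability value for each z-score: density and cumulative probability
densities = [std_normal.pdf(z) for z in z_scores]
cum_probs = [std_normal.cdf(z) for z in z_scores]

for w, z, d, c in zip(weights, z_scores, densities, cum_probs):
    print(f"weight={w:5.1f}  z={z:6.3f}  density={d:.4f}  cumulative={c:.4f}")
```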
f. Print worksheet
It’s right there. [Worksheet printout]
g. Plot, using scatter plot, the data column vs the
probability value column. Do the data points lie on a
straight line? Is your answer consistent with b?
[Figure: Scatterplot of Probability vs Data. Vertical axis: Probability; horizontal axis: Data.]
Looks good to me.