Probability Distributions

Chapter 5: Probability Distributions
1.
a.
A random variable is a variable whose values occur at random, following a probability distribution.
b.
An observation is the actual realization of a random variable.
c.
x̄ is the average of the sample observations, while μ is the theoretical mean of a probability
distribution.
2.
The 50 woman-owned businesses do not constitute a random sample, since they are not randomly selected
from the population of woman-owned businesses. They are the top 50 (in terms of annual sales) businesses
owned by women. Thus, we cannot make accurate inferences about woman-owned businesses based on
this sample.
3.
a.
A Poisson distribution.
b.
Use the Poisson distribution with y = 0 and λ = 5. The probability of no births will be:
$p(y) = \frac{\lambda^{y}}{y!} e^{-\lambda}$
$p(0) = \frac{5^{0}}{0!} e^{-5} = 0.0067$
c.
A Normal distribution.
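The calculation in 3(b) can be checked with a few lines of Python. This is only a minimal sketch; the function name poisson_pmf is an illustrative choice, not something taken from the text.

```python
import math

def poisson_pmf(y: int, lam: float) -> float:
    """Probability of exactly y events when the mean count is lam."""
    return lam ** y * math.exp(-lam) / math.factorial(y)

# Question 3(b): no births (y = 0) when the mean is 5
p0 = poisson_pmf(0, 5.0)
print(round(p0, 4))  # 0.0067
```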
4.
a.
mean = 0.5. standard deviation = 0.5
b.
mean = 1/3. standard deviation = √2/3 = 0.4714
c.
mean = 1/6. standard deviation = √5/6 = 0.3727
5.
a.
mean = 12.5. standard deviation = 2.5
b.
mean = 25. standard deviation = 3.5355
c.
mean = 3.33. standard deviation = 1.49
d.
mean = 1.67. standard deviation = 1.1785.
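The means and standard deviations in questions 4 and 5 are consistent with the Binomial formulas mean = np and standard deviation = √(np(1−p)). The sketch below assumes the parameter values (p = 1/2, 1/3, 1/6 for a single trial, and the (n, p) pairs listed) because the questions themselves are not reproduced here; those values are inferred from the answers, and the function name is hypothetical.

```python
import math

def binomial_mean_sd(n: int, p: float) -> tuple[float, float]:
    """Mean and standard deviation of a Binomial(n, p) count."""
    return n * p, math.sqrt(n * p * (1 - p))

# Question 4: a single trial (n = 1); the p values are assumptions
single_trial = [binomial_mean_sd(1, p) for p in (1 / 2, 1 / 3, 1 / 6)]

# Question 5: the (n, p) pairs are assumptions inferred from the answers
several_trials = [binomial_mean_sd(n, p)
                  for n, p in ((25, 0.5), (50, 0.5), (10, 1 / 3), (10, 1 / 6))]

for mean, sd in single_trial + several_trials:
    print(f"mean = {mean:.4f}  standard deviation = {sd:.4f}")
```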
6.
a.
p(n=10) = 0.1762. p(n<=10) = 0.588
b.
p(n=15) = 0.0916. p(n>15) = 0.059
c.
p(n=3) = 0.2601. p(n<=3) = 0.5593
d.
p(n=20) = 0.0679. p(n<=20) = 0.8481
e.
p(n=4,5,6) = 0.6563
7.
a.
λ = 10. standard deviation = √10 = 3.1623
b.
average = 10. standard error = 0.6325
8.
a.
p(x=2) = 0.2707. p(x<=2) = 0.6767
b.
p(x=3) = 0.1804. p(x<=3) = 0.8571
c.
p(x=4) = 0.0902. p(x<=4) = 0.9473
d.
p(x=5) = 0.0361. p(x<=5) = 0.9834
9.
a.
0.6915
b.
0.8413
c.
0.9505
d.
0.9750
10.
a.
-1.6449
b.
-1.2816
c.
0
d.
1.2816
e.
1.6449
f.
1.96
g.
2.3263
11.
a.
0.975
b.
0.99996
c.
1.0000
d.
0.025
e.
0.5
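Values like the cumulative probabilities and Normal scores listed above can be reproduced with the standard library's statistics.NormalDist. The inputs below (z = 0.5, 1, 1.65, 1.96 and the probabilities 0.05 through 0.99) are assumptions chosen because they match the answers; the original questions are not shown here.

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal: mean 0, standard deviation 1

# Cumulative probabilities P(Z <= x); the x values are assumed
cdf_values = [round(z.cdf(x), 4) for x in (0.5, 1.0, 1.65, 1.96)]

# Normal scores (quantiles); the probabilities are assumed
quantiles = [round(z.inv_cdf(p), 4)
             for p in (0.05, 0.10, 0.50, 0.90, 0.95, 0.975, 0.99)]

print(cdf_values)
print(quantiles)
```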
12.
a.
2.4369
b.
3.3168
c.
5.0000
d.
7.5631
e.
8.2897
f.
9.6527
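The six values in question 12 are consistent with percentiles of a Normal distribution with mean 5 and standard deviation 2. That choice of parameters, and the probabilities used below, are assumptions inferred from the answers rather than taken from the question.

```python
from statistics import NormalDist

dist = NormalDist(mu=5.0, sigma=2.0)  # assumed parameters

# Percentiles at the assumed probabilities 10%, 20%, 50%, 90%, 95%, 99%
percentiles = [dist.inv_cdf(p) for p in (0.10, 0.20, 0.50, 0.90, 0.95, 0.99)]
for value in percentiles:
    print(f"{value:.4f}")
```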
13.
a.
p = 0.0568
b.
14 players batted 0.300 or better. This is 5.32% of the sample, which is close to the prediction
based on the Normal distribution.
c.
The histogram of the salary data is:
[Figure: histogram "Baseball Salaries"; x-axis: Salaries, y-axis: Counts]
d.
The Normal P-plot is:
[Figure: Normal probability plot "Salary P-Plot"]
Based on the histogram and the P-plot, there is reason to doubt that salary data follows a Normal
distribution.
e.
The average salary is 541.48. The standard deviation of the salary data is 450.16. The largest
Normal score = 2.824. Therefore, the predicted maximum salary = (2.824)(450.16) + 541.48 or
1812.73. The observed largest salary is 2460, so the Normal scores underestimate this value.
f.
The salary data is positively skewed.
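The arithmetic in 13(e) converts the largest Normal score back to the salary scale using predicted value = (Normal score)(standard deviation) + mean. A short sketch using the figures quoted in the answer (the variable names are illustrative):

```python
mean_salary = 541.48          # average salary from 13(e)
sd_salary = 450.16            # standard deviation from 13(e)
largest_normal_score = 2.824  # largest Normal score from 13(e)
observed_max = 2460           # observed largest salary

# Salary that a Normal distribution would predict for the largest observation
predicted_max = largest_normal_score * sd_salary + mean_salary
shortfall = observed_max - predicted_max

print(round(predicted_max, 2))  # 1812.73
print(round(shortfall, 2))      # 647.27
```

The observed maximum exceeds the prediction by several hundred units, which is what a long right tail would produce.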
14.
a.
The histogram of the price data appears as follows:
[Figure: histogram "Histogram of Prices"; x-axis: Prices, y-axis: Counts]
The Normal P-plot is:
[Figure: Normal probability plot "Price P-Plot"]
The data does not appear to follow a Normal distribution.
b.
The average house price = $106,273.50. The standard deviation of the price data = $38,043.70 and
the standard error = $3,517.14. Based on the Central Limit theorem, we should be 95% confident
that the true mean value falls in the range ($99,239.22 , $113,307.79).
c.
The first few values of the log(Price) variable are:
Price      LogPrice
87,400     4.942
110,900    5.045
95,000     4.978
87,000     4.940
73,900     4.869
77,000     4.886
133,000    5.124
116,000    5.064
102,000    5.009
94,000     4.973
The histogram of the log(Price) variable is:
[Figure: histogram "Log(Price) Histogram"; x-axis: Log(Prices), y-axis: Counts]
The Normal P-plot is:
[Figure: Normal probability plot "Log(Price) P-Plot"]
The transformed data follows the Normal distribution more closely than the raw price values.
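The interval in 14(b) and the transformed values in 14(c) can be checked as sketched below. Two things are assumptions here: the interval is taken to be the mean plus or minus 2 standard errors (which reproduces the quoted limits to within a cent), and the logarithm is taken to be base 10 (which reproduces the tabulated LogPrice values).

```python
import math

mean_price = 106_273.50
sd_price = 38_043.70
std_error = 3_517.14  # as reported in 14(b)

# Sample size implied by standard error = sd / sqrt(n)
n_implied = (sd_price / std_error) ** 2  # roughly 117

# Approximate 95% interval: mean +/- 2 standard errors (assumed multiplier)
lower = mean_price - 2 * std_error
upper = mean_price + 2 * std_error
print(f"({lower:,.2f}, {upper:,.2f})")

# Base-10 logs of the first three prices (assumed base)
log_prices = [round(math.log10(p), 3) for p in (87_400, 110_900, 95_000)]
print(log_prices)  # [4.942, 5.045, 4.978]
```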
15.
a.
Because the shots are selected from a population with a standard deviation of 0.2 (a good
marksman), there is a 68% probability that a shot will fall within one standard deviation of the
mean, i.e. within 0.2 units of 0. Similarly, 95% of the values should be within two standard
deviations of the mean, or within 0.4 units of 0.
b.
To be 95% sure that her sample mean is within ± 0.2 of the population mean, she would need a
sample size of 100. The standard deviation of the sample mean would then be 0.1, so it is 95%
likely that the sample mean is within 2 standard deviations (0.2) of the population mean, 0.
c.
A marksman with the highest accuracy could take 1 shot and have 95% confidence that the sample
mean, the vertical displacement of the shot, was within 0.2 of the bull's eye, since 0.1 is the
standard deviation of the sample mean.
16.
a.
Enter the formula =AVERAGE(A2:I2) into the first cell and then fill the rest of the values in the
column down.
b.
The two histograms will appear as follows (answers will vary):
[Figure: histogram "Column 1 Histogram"]
[Figure: histogram "Sample Average Histogram"]
The sample average histogram looks more like a Normal distribution (aside from being discrete) than the
values from the first column.
c.
The descriptive statistics are:
                  Count    Average    Standard Deviation
Column 1          100      0.1600     0.4197
Sample Average    100      0.2344     0.1464
The sampling distribution of the mean of a Poisson sample, where λ=0.25 and n=9, should approximately follow a
Normal distribution with mean = 0.25 and standard deviation = 0.167. This is reasonably close to the
observed values from the sample average.
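The theoretical values quoted for question 16 come from the fact that a Poisson distribution has variance λ, so the standard error of a sample mean is √(λ/n). A minimal check, taking n = 9 because the formula =AVERAGE(A2:I2) averages nine columns:

```python
import math

lam = 0.25  # Poisson mean
n = 9       # columns A through I in each row

population_sd = math.sqrt(lam)              # 0.5
standard_error = population_sd / math.sqrt(n)

print(round(population_sd, 3))   # 0.5
print(round(standard_error, 3))  # 0.167
```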
17.
a.
Enter the formula, =AVERAGE(A2:I2), into the first cell and then fill the rest of the values in the
column down.
b.
The two histograms will appear as follows (answers will vary):
[Figure: histogram "Column 1 Histogram"]
[Figure: histogram "Sample Average Histogram"]
Both distributions follow a bell-shaped curve, though this is more pronounced with the sample average.
c.
The descriptive statistics are:
                  Count    Average    Standard Deviation
Column 1          100      4.320      1.858
Sample Average    100      4.020      0.599
The distribution of the Binomial for n=16 and p=0.25 has a mean value of 4 and a standard deviation of
1.732. The mean of the sampling distribution is 4 and the standard error is 0.577. These values are closely
matched in the random data.
18.
a.
Enter the formula, =AVERAGE(A2:I2), into the first cell and then fill the rest of the values in the
column down.
b.
The two histograms will appear as follows (answers will vary):
[Figure: histogram "Column 1 Histogram"]
[Figure: histogram "Sample Average Histogram"]
Column 1 contains dichotomous output consisting of zeroes and ones. The sample average, while still
discrete, is starting to show some of the features of the bell-shaped curve.
c.
The descriptive statistics for this set of random data are:
                  Count    Average    Standard Deviation
Column 1          100      0.170      0.378
Sample Average    100      0.238      0.152
For the Bernoulli distribution with p = 0.25, the mean is 0.25 and the standard deviation is
√(0.25 × 0.75) = 0.433. For a sample with 9 observations, the mean of the sample average is 0.25 and the
standard error is 0.433/3 = 0.144.
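The spreadsheet exercise in question 18 can be imitated in Python as a rough illustration: 100 rows of 9 Bernoulli values, with a row average playing the role of =AVERAGE(A2:I2). The seed, the value p = 0.25, and the variable names are assumptions, and as in the spreadsheet the simulated figures will vary from run to run if the seed is changed. The theoretical values use the Bernoulli standard deviation √(p(1−p)).

```python
import random
import statistics

random.seed(5)  # arbitrary seed so the run is repeatable
p, rows, cols = 0.25, 100, 9

# Each row holds 9 zero/one values, like columns A through I
data = [[1 if random.random() < p else 0 for _ in range(cols)]
        for _ in range(rows)]

column1 = [float(row[0]) for row in data]
row_means = [sum(row) / cols for row in data]

col1_mean, col1_sd = statistics.mean(column1), statistics.stdev(column1)
avg_mean, avg_sd = statistics.mean(row_means), statistics.stdev(row_means)

# Theoretical values for comparison
theory_sd = (p * (1 - p)) ** 0.5
theory_se = theory_sd / cols ** 0.5

print(f"Column 1:       mean {col1_mean:.3f}, sd {col1_sd:.3f} (theory {theory_sd:.3f})")
print(f"Sample average: mean {avg_mean:.3f}, sd {avg_sd:.3f} (theory {theory_se:.3f})")
```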
19.
a.
Enter the formula, =AVERAGE(A2:I2), into the first cell and then fill the rest of the values in the
column down.
b.
The two histograms will appear as follows (answers will vary):
[Figure: histogram "Column 1 Histogram"]
[Figure: histogram "Sample Average Histogram"]
Column 1 does not follow the Normal curve well, particularly in the tails. The sample average follows the
bell-curve reasonably well.
c.
The descriptive statistics for this set of random values are:
                  Count    Average    Standard Deviation
Column 1          100      49.163     29.056
Sample Average    100      47.263     9.401
For the Uniform distribution over the range [0, 100] the mean is 50 and the standard deviation is 28.868.
For a sample size of 9, the mean of the sample average is 50, and the standard error is 9.623.
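The Uniform figures follow from the standard formulas mean = (a + b)/2 and standard deviation = (b − a)/√12, with the standard error obtained by dividing by √n. A brief sketch (variable names are illustrative):

```python
import math

a, b = 0.0, 100.0  # range of the Uniform distribution
n = 9              # observations averaged in each row

mean = (a + b) / 2
sd = (b - a) / math.sqrt(12)
standard_error = sd / math.sqrt(n)

print(mean)                      # 50.0
print(round(sd, 3))              # 28.868
print(round(standard_error, 3))  # 9.623
```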
20.
False. The Central Limit Theorem states that the distribution of the sample means, not of the sample values,
will approach the Normal distribution as the sample size increases. The observations will be distributed
following the underlying probability distribution from which they were selected.
21.
a.
To have 95% confidence that the mean of a sample is within 2 units of the population mean, the
standard error must be one. Thus 100 observations are required.
b.
The standard deviation of the sample mean should be 1, so 100 observations are needed.
c.
0.159.
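The reasoning in question 21 can be sketched as follows. The population standard deviation of 10 is an assumption: it is not stated in the answer, but it is the value for which a standard error of 1 requires 100 observations. Likewise, reading 0.159 as the probability of falling more than one standard deviation above the mean is an assumption about part (c).

```python
import math
from statistics import NormalDist

sigma = 10.0   # assumed population standard deviation
margin = 2.0   # desired half-width of the 95% interval
z = 2.0        # two-standard-error rule of thumb for 95%

required_se = margin / z                        # standard error must be 1
n_required = math.ceil((sigma / required_se) ** 2)
print(n_required)  # 100

# Upper-tail probability beyond one standard deviation (assumed reading of part c)
upper_tail = 1 - NormalDist().cdf(1.0)
print(round(upper_tail, 3))  # 0.159
```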
22.
a.
The histogram of the reaction times is:
[Figure: histogram "Reaction Times Histogram"; x-axis: Reaction Times, y-axis: Counts]
The value 0.1 does not appear on the histogram. The histogram would have to extend to the left in order to
include the value 0.1.
b.
Mean reaction time = 0.1723. Standard deviation = 0.0206. The probability that a reaction time of
0.1 or less could be observed is 0.022%
c.
The Normal P-plot shown below does not give a compelling reason to discount an assumption of
Normality in the data.
[Figure: Normal probability plot "Reaction P-Plot"]
d.
Based on these findings, it would appear unlikely for a sprinter to achieve a reaction time of 0.1 or
less. The extremely low probability (0.022%) would place the odds against such an occurrence at about
5000 to 1.
This does not conclusively prove that Christie must have anticipated the starter's gun. Christie,
being the reigning world champion, may have an extremely quick reaction time and it may be
wrong to classify him along with other, lesser, sprinters. Moreover, since these reaction times are
taken from the first heat in which the competition is less keen, it could be that reaction times are not
as fast. The greater sprinters might conserve their energy for the finals, where their reaction times
might be quicker.
To further investigate this issue, it would be better to analyze the reaction times of the 100-meter
champions in controlled conditions to see whether their reaction times are quicker, in which case a 0.1
reaction time would not be as unlikely.
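The probability quoted in 22(b) can be reproduced by treating reaction times as Normal with the sample mean and standard deviation, which is the assumption the answer itself makes. A minimal sketch using the standard library:

```python
from statistics import NormalDist

# Sample mean and standard deviation from 22(b)
reaction = NormalDist(mu=0.1723, sigma=0.0206)

# Probability of a reaction time of 0.1 or less under the Normal model
p_fast = reaction.cdf(0.1)

print(f"{p_fast:.4%}")    # a little over 0.022%
print(round(1 / p_fast))  # about one chance in this many
```

The result depends heavily on the Normal model holding far out in the left tail, where the sample provides no observations.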