Ex03DescriptMeasures

MTCS3063/MT1613 - Probability and Statistics
Exercise 3
Q 1. The following are figures on a well’s daily production of oil in barrels:
214, 203, 226, 198, 243, 225, 207, 203, 208, 200, 217, 202, 208, 212, 205 and 220
Calculate the variance $s^2$ and the standard deviation of the data. Display the
well’s daily production on a boxplot.
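A minimal sketch of the computation asked for in Q 1, using the definition of the sample variance (squared deviations from the mean divided by n − 1) on the sixteen values above. The function name `sample_variance` is an illustrative choice. The quartiles come from `statistics.quantiles`, whose default "exclusive" method is only one of several quartile conventions, so the hinges of a textbook boxplot may differ slightly.

```python
import math
import statistics

# Daily oil production in barrels, copied from Q 1.
production = [214, 203, 226, 198, 243, 225, 207, 203,
              208, 200, 217, 202, 208, 212, 205, 220]


def sample_variance(values):
    """Sample variance: squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


s2 = sample_variance(production)
s = math.sqrt(s2)

# Five-number summary that a boxplot displays.
# Assumption: quartiles follow the default convention of statistics.quantiles.
q1, q2, q3 = statistics.quantiles(production, n=4)
five_number = (min(production), q1, q2, q3, max(production))

print(f"s^2 = {s2:.2f}, s = {s:.2f}")
print("min, Q1, median, Q3, max =", five_number)
```

The five-number summary is enough to draw the box and whiskers by hand. Whether the largest value is plotted as an outlier depends on the fence rule used in the course.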
Q 2. Consider the following data on type of health complaint (J = joint swelling, F
= fatigue, B = back pain, M = muscle weakness, C = coughing, N = nose
running/irritation, O = other) made by tree planters. Obtain frequencies and
relative frequencies for the various categories, and draw a bar graph.
Q 3. In five tests, one student averaged 63.2 with a standard deviation of 3.3, whereas
another student averaged 78.8 with a standard deviation of 5.3. Which student
is relatively more consistent?
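One common way to compare relative consistency is the coefficient of variation, the standard deviation expressed as a percentage of the mean. The sketch below assumes that this is the measure intended in Q 3. The function name is illustrative.

```python
def coefficient_of_variation(mean, std_dev):
    """Relative spread: standard deviation as a percentage of the mean."""
    return 100.0 * std_dev / mean


# Means and standard deviations quoted in Q 3.
cv_first = coefficient_of_variation(63.2, 3.3)
cv_second = coefficient_of_variation(78.8, 5.3)

print(f"first student:  CV = {cv_first:.2f}%")
print(f"second student: CV = {cv_second:.2f}%")
```

The smaller coefficient indicates the relatively more consistent set of scores. The same function applies to any comparison of spreads measured in different units or around very different means.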
Q 4. Every score in the following batch of exam scores is in the 60s, 70s, 80s, or 90s
as given in the table below:
74 89 81 85 89 81 95 98 80 81
84 84 93 71 81 68 64 74 80 90
67 82 70 82 72 85 69 69 70 63
66 72 66 72 60 87 85 81 83 88
Compute the mean, median and mode of the exam scores.
Q 5. If k sets of data consist, respectively, of $n_1, n_2, \ldots, n_k$ observations and have the
means $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$, then the overall mean of all the data is given by the formula
$$\bar{x} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$$
i) The average annual salaries paid to top-level management in three companies are $94,000, $102,000, and $99,000. If the respective numbers of
top-level executives in these companies are 4, 15, and 11, find the average
salary paid to these 30 executives.
ii) In a nuclear engineering class there are 22 juniors, 18 seniors, and 10 graduate students. If the juniors averaged 71 in the midterm examination, the
seniors averaged 78, and the graduate students averaged 89, what is the
mean for the entire class?
Q 6. The formula in the preceding exercise is a special case of the following formula for
the weighted mean
$$\bar{x}_w = \frac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i}$$
where $w_i$ is a weight indicating the relative importance of the i-th observation.
i) If an instructor counts the final examination in a course four times as much
as each 1-hour examination, what is the weighted average grade of a student
who received grades of 69, 75, 56, and 72 in four 1-hour examinations and
a final examination grade of 78?
ii) From 1999 to 2004 the cost of food increased by 53% in a certain city, the
cost of housing increased by 40% and the cost of transportation increased
by 34%. If the average salaried worker spent 28% of his or her income on
food, 35% on housing, and 14% on transportation, what is the combined
percentage increase in the cost of these items?
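The weighted-mean formula of Q 6 can be sketched as a short Python function. The name `weighted_mean` and its argument order are illustrative choices. Passing the group sizes as weights gives the overall-mean formula of Q 5.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Q 6 i): four 1-hour examinations with weight 1 each, final examination with weight 4.
exam_average = weighted_mean([69, 75, 56, 72, 78], [1, 1, 1, 1, 4])

# Q 5 i): the weights are the numbers of executives in each company.
salary_average = weighted_mean([94_000, 102_000, 99_000], [4, 15, 11])

print(exam_average, round(salary_average, 2))
```

The weights do not need to sum to 1 or to 100, because the denominator normalizes them. This matters in Q 6 ii), where the three spending shares cover only part of total income.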
Q 7. The National Highway Traffic Safety Administration reported the relative
speed (rounded to the nearest 5 mph) of automobiles involved in accidents
one year. The percentages at different speeds were as recorded in the given table.
Speed             Percentage
20 mph or less     2.0%
25 or 30 mph      29.7%
35 or 40 mph      30.4%
45 or 50 mph      16.5%
55 mph            19.2%
60 or 65 mph       2.2%
i) From these data can we conclude that it is quite safe to drive at high speeds?
Why or why not?
ii) Why do most accidents occur in the 35 or 40 mph and in the 25 or 30 mph
ranges?
iii) Construct a density histogram using the endpoints 0, 22.5, 32.5, 42.5, 52.5,
57.5, 67.5 for the intervals.
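For part iii), the height of each bar in a density histogram is the relative frequency divided by the class width, so that the bar areas sum to 1. The sketch below computes those heights from the endpoints and percentages given in Q 7. The function name `density_heights` is illustrative, and the actual plotting is left out.

```python
# Class endpoints and percentages from Q 7.
endpoints = [0, 22.5, 32.5, 42.5, 52.5, 57.5, 67.5]
percentages = [2.0, 29.7, 30.4, 16.5, 19.2, 2.2]


def density_heights(edges, percents):
    """Bar height = relative frequency / class width, one height per class."""
    heights = []
    for left, right, pct in zip(edges, edges[1:], percents):
        width = right - left
        heights.append((pct / 100.0) / width)
    return heights


heights = density_heights(endpoints, percentages)

# Sanity check: the total area under a density histogram should be 1.
total_area = sum(h * (right - left)
                 for h, left, right in zip(heights, endpoints, endpoints[1:]))

for left, right, h in zip(endpoints, endpoints[1:], heights):
    print(f"[{left}, {right}): height = {h:.4f}")
print(f"total area = {total_area:.4f}")
```

Because the classes have unequal widths, comparing heights rather than raw percentages is what makes the narrow 55 mph class stand out.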
Q 8. According to Chebyshev’s theorem, what can we assert about the percentages of
any set of data that must lie within k standard deviations of the mean when
a) k = 2; b) k = 2.5; c) k = 3.1; d) k = 9; e) k = 12?
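Chebyshev's theorem states that at least 1 − 1/k² of any data set lies within k standard deviations of the mean, for k > 1. A minimal sketch that evaluates this bound for the values of k in Q 8 (the function name is illustrative):

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is informative only for k > 1")
    return 1.0 - 1.0 / k ** 2


for k in (2, 2.5, 3.1, 9, 12):
    bound = chebyshev_lower_bound(k)
    print(f"k = {k}: at least {100 * bound:.2f}% of the data")
```

The bound is a guarantee for every distribution, so real data sets usually have a larger share inside the interval than the theorem promises.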
Q 9. Twenty power failures lasted
18 125 45 33 44 96 89 12 31 103
26 80 49 75 40 80 125 63 61 28
minutes. Find the mean, median, and standard deviation. What proportion of
the data lies in each of the intervals $(\bar{x} - s, \bar{x} + s)$ and $(\bar{x} - 2s, \bar{x} + 2s)$?
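A sketch of how the proportions in Q 9 could be counted, using the twenty durations listed above and the sample standard deviation (divisor n − 1). The helper name `proportion_within` is illustrative, and the intervals are treated as open, as written in the question.

```python
import statistics

# Durations of the twenty power failures in minutes, copied from Q 9.
durations = [18, 125, 45, 33, 44, 96, 89, 12, 31, 103,
             26, 80, 49, 75, 40, 80, 125, 63, 61, 28]


def proportion_within(values, k):
    """Fraction of values strictly inside (mean - k*s, mean + k*s)."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation
    inside = [x for x in values if mean - k * s < x < mean + k * s]
    return len(inside) / len(values)


print("mean   =", statistics.mean(durations))
print("median =", statistics.median(durations))
print("s      =", round(statistics.stdev(durations), 2))
for k in (1, 2):
    print(f"within {k} s: {proportion_within(durations, k):.2f}")
```

The two proportions can then be compared with the Chebyshev bound and with the Empirical Rule figures of roughly 68% and 95%.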
Q 10. Measurements made with one micrometer of the diameter of a ball bearing have
a mean of 3.92 mm and a standard deviation of 0.0152 mm, whereas measurements
made with another micrometer of the unstretched length of a spring have a mean
of 1.54 inches and a standard deviation of 0.0086 inch. Which of these two
measuring instruments is relatively more precise?
Q 11. In a factory or office, the time during
working hours in which a machine is
not operating as a result of breakage or
failure is called a downtime. The table
displays the distribution of a sample of
the lengths of the downtimes of a certain machine.
Downtime (minutes)   Frequency
0 − 9                 2
10 − 19              15
20 − 29              17
30 − 39              13
40 − 49               3
Total                50
Find
i) the mean and the median,
ii) lower and upper quartiles,
iii) the standard deviation,
iv) the percentiles $\pi_{10}$, $\pi_{30}$, $\pi_{65}$, $\pi_{95}$.
v) Calculate Pearson’s coefficient of skewness for the distribution and discuss the symmetry or skewness of the data.
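The grouped-data calculations in Q 11 can be sketched as follows. Several conventions here are assumptions rather than something the exercise states: class midpoints stand in for the observations, class boundaries are taken 0.5 below the lower limit (so the first class runs from −0.5 to 9.5), percentiles use linear interpolation within the class, and Pearson's coefficient is taken in the form 3(mean − median)/s. All function names are illustrative.

```python
import math

# (lower class limit, upper class limit, frequency) from the downtime table.
classes = [(0, 9, 2), (10, 19, 15), (20, 29, 17), (30, 39, 13), (40, 49, 3)]


def grouped_mean(table):
    """Mean with every observation replaced by its class midpoint."""
    n = sum(f for _, _, f in table)
    return sum(f * (lo + hi) / 2 for lo, hi, f in table) / n


def grouped_std(table):
    """Sample standard deviation (divisor n - 1) from class midpoints."""
    n = sum(f for _, _, f in table)
    mean = grouped_mean(table)
    ss = sum(f * ((lo + hi) / 2 - mean) ** 2 for lo, hi, f in table)
    return math.sqrt(ss / (n - 1))


def grouped_percentile(table, p):
    """p-th percentile (0 < p < 100) by interpolation inside the class."""
    n = sum(f for _, _, f in table)
    target = p / 100 * n
    cumulative = 0
    for lo, hi, f in table:
        if cumulative + f >= target:
            lower_boundary = lo - 0.5   # assumed class boundary
            width = hi - lo + 1         # boundary-to-boundary width
            return lower_boundary + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("p must lie strictly between 0 and 100")


mean = grouped_mean(classes)
median = grouped_percentile(classes, 50)
s = grouped_std(classes)
skewness = 3 * (mean - median) / s  # one common form of Pearson's coefficient

print(f"mean = {mean:.2f}, median = {median:.2f}, s = {s:.2f}")
print(f"Pearson skewness = {skewness:.3f}")
```

The quartiles and the other percentiles follow from the same function, for example `grouped_percentile(classes, 25)` for the lower quartile.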
Q 12. The following are measurements of the breaking strength (in ounces) of a sample
of 60 linen threads:
i) Find the mean breaking strength of this sample data. Calculate the standard
deviation and describe the population using Chebyshev’s Rule.
ii) Compute again the mean and s from the grouped data and then compare
with the results of Part i) above.
iii) Find the interquartile range.
iv) What are the 20th and 97th percentiles of the given data?
v) Draw the boxplot of the given data set.
Q 13. The compressive strength of high-performance concrete had previously been investigated, but not much was known about flexural strength (a measure of ability to resist failure in bending). The sample data on flexural strength (in megapascals (MPa),
where 1 Pa (pascal) = 1.45 × 10^−4 psi) are given below:
5.9 7.9 9.7 7.2 7.3 9.0 7.0 7.8 7.7
6.3 8.1 6.8 6.5 8.2 8.7 7.8 9.7 11.6
11.3 11.8 10.7 7.0 6.3 7.4 7.7 7.6 6.8
i) Compute the mean flexural strength.
ii) Analyse the data by constructing its boxplot.
Q 14. In a 2-week study of the productivity of workers, the following data were obtained on the total number of acceptable pieces which 100 workers produced:
Use the frequency distribution and cumulative frequency distribution of this
data set from Exercise 1 and estimate the mean, median and mode of this grouped
data.
Q 15. Derive the following equivalent computing formula for the variance:
$$s^2 = \frac{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}{n(n-1)}$$
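Before attempting the derivation in Q 15, it can help to confirm numerically that the computing formula agrees with the definition of the sample variance. The sketch below is a check, not a proof. It uses exact rational arithmetic so that rounding cannot hide a discrepancy, it assumes integer-valued data, and the sample values are made up for illustration.

```python
from fractions import Fraction


def variance_definition(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def variance_shortcut(values):
    """Computing formula: (n * sum(x^2) - (sum x)^2) / (n * (n - 1))."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return Fraction(n * sum_x2 - sum_x ** 2, n * (n - 1))


sample = [3, 7, 7, 19]  # hypothetical data, chosen only for the check
print(variance_definition(sample), variance_shortcut(sample))
```

The shortcut needs only the running totals of x and x², which is why it was popular for hand and calculator work. In floating-point arithmetic it can lose accuracy when the mean is large relative to the spread.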
Q 16. Find and interpret the z-scores of the
following exam scores:
i) 67, ii) 95, iii) 45, iv) 100.
Q 17. Answer the following questions briefly:
i) Identify the two most commonly used measures of center for quantitative
data. Explain the relative advantages and disadvantages of each.
ii) Among the measures of central tendency discussed, which is the only one
appropriate for qualitative data?
iii) Data Set A has more variation than Data Set B. Decide which of the following statements are necessarily true.
a) Data Set A has a larger mean than Data Set B.
b) Data Set A has a larger standard deviation than Data Set B.
iv) What is meant by the z-score of an observed value of a variable?
v) Identify the statistic that is used to estimate a) a population mean, and b) a
population standard deviation.
vi) What are a) the Empirical Rule and b) Chebyshev’s Rule for the distribution
of statistical data?