251y0012 10/11/00

Part I.
ECO251 QBA1
FIRST HOUR EXAM
OCTOBER 7, 2000
Name ______KEY__________
SECTION MWF 10 11 TR 11 12:30
Multiple Choice (10 points)
1.(S-2) Inferential statistics is
a. The display of characteristics of a sample in a graph with summary measures.
b. The display of characteristics of a population in a graph with summary measures.
*c. The process of estimating facts (parameters) about a population from a sample taken from the
population.
d. A branch of mathematics devoted to the collection, display and analysis of data.
e. None of the above.
2.(S-3) The Fortune 500 listing of the 500 largest companies in the US in order of their annual sales is an
example of
*a. Ordinal data.
b. Nominal data.
c. Interval data.
d. Ratio data.
e. None of the above.
3. A used automobile dealer lists cars in the following classes. A - 100,000 miles or more on the odometer,
B - less than 100,000 miles on the odometer, C - Diesel. Are these three categories
*a. Collectively exhaustive?
b. Mutually exclusive?
c. Both mutually exclusive and collectively exhaustive?
d. Neither mutually exclusive nor collectively exhaustive?
e. Can't tell with the information given.
4. (D7-9) If a distribution is skewed to the left, we can say that it is likely that
a. Mean > median > mode
b. Median > mean > mode
*c. Mode > median > mean
d. Mode > mean > median
e. Mode = mean = median
(Most people got this backwards - make a diagram!)
5. A graph that connects points, each of which represents the cumulative frequency F, is called a
a. Histogram
*b. Ogive
c. Frequency Polygon
d. Pie chart
e. None of the above
Part II. Compute an appropriate answer, showing your work (except in a)) (15 Points maximum - if you do
more than 15 points, only your right answers will be counted.):
a) Fill in the following table (3)

Class       f     f rel   F
50-59.99    __    .12     __
60-69.99    3     __      __
70-79.99    __    __      12
80-89.99    7     __      __
90-99.99    6     __      __
Total       25    __

Solution: Note that \(n = 25\).

Class       f     f rel   F
50-59.99    3     .12     3
60-69.99    3     .12     6
70-79.99    6     .24     12
80-89.99    7     .28     19
90-99.99    6     .24     25
Total       25    1.00
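The completed table can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the frequencies are taken from the solution table.

```python
from itertools import accumulate

# Class frequencies f from the solution table (50-59.99 up to 90-99.99).
f = [3, 3, 6, 7, 6]
n = sum(f)                                # total number of observations

f_rel = [count / n for count in f]        # relative frequency: f / n
F = list(accumulate(f))                   # cumulative frequency: running total of f

for count, rel, cum in zip(f, f_rel, F):
    print(f"{count:3d}  {rel:4.2f}  {cum:3d}")
```

The last entry of F must equal n, and the relative frequencies must sum to 1.00, which is a quick way to catch a wrong entry.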
b) Assume that we have sold 1000 life insurance policies in amounts between $5300 and $9800. If
this data is to be presented in seven classes, what intervals would you use? Explain your reasoning
using the appropriate formula and make a table showing the class intervals you would actually use.
(3)
Solution: \(\frac{9800 - 5300}{7} = 642.86\), so use 650 or 700. This is only a suggestion. Any number somewhat above or equal to 643 will work.

Class   from    to
A       5000    5699.99
B       5700    6399.99
C       6400    7099.99
D       7100    7799.99
E       7800    8499.99
F       8500    9199.99
G       9200    9899.99
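The interval calculation can be sketched in Python as follows. The function name is hypothetical, and the width of 700 and starting point of 5000 are simply the convenient values used in the table above.

```python
def suggested_width(low, high, classes):
    """Minimum class width: the range of the data divided by the number of classes."""
    return (high - low) / classes

raw = suggested_width(5300, 9800, 7)      # about 642.86

width = 700                               # a convenient number at or above raw
start = 5000                              # a convenient number at or below the lowest value

# Lower limits of the seven classes, plus the lower limit of a hypothetical eighth.
edges = [start + i * width for i in range(8)]
print(round(raw, 2), edges)
```

The seven classes cover the data only if the first edge is no larger than 5300 and the last edge is above 9800.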
c) (S-30)If a population of 1000 items with an unknown distribution has a mean of 12 and a
standard deviation of 1.5, what is the approximate minimum number of items that must be (i)
between 6 and 18? (ii) between 12 and 18? Note: there was an error here - (ii) was a harder
question than I intended to give - I will thus give 3 points for a correct answer for (i). (ii) should
have read (iic) What is the maximum that could be over 18? (3)
Solution: (i) If we use the formula \(k = z = \frac{x - \mu}{\sigma}\), we find that \(\frac{6 - 12}{1.5} = -4\) and \(\frac{18 - 12}{1.5} = 4\). According to the Chebyshev inequality, the minimum fraction of the data that must be between \(\mu \pm 4\sigma\) is \(1 - \frac{1}{k^2} = 1 - \frac{1}{16} = \frac{15}{16}\). Fifteen sixteenths of 1000 is about 938. (ii) Since we can't pick sides here, the answer can't really be found. (iic) The answer is the complement of the answer to (i). There are about 1000 - 938 = 62 items left over. All of these could be above 18.
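A short Python sketch of the Chebyshev calculation, with an illustrative function name, makes the arithmetic explicit. The unrounded results are 937.5 and 62.5, which the solution rounds to 938 and 62.

```python
def chebyshev_min_fraction(k):
    """Chebyshev lower bound on the fraction of data within k standard deviations (k > 1)."""
    return 1 - 1 / k**2

mu, sigma, N = 12, 1.5, 1000

k = (18 - mu) / sigma                          # 18 is 4 standard deviations above the mean
min_inside = chebyshev_min_fraction(k) * N     # at least this many items between 6 and 18
max_outside = N - min_inside                   # at most this many outside, possibly all above 18

print(k, min_inside, max_outside)
```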
d) Do c) again assuming that the distribution is unimodal and symmetric.(2)
Solution: (i and iic) Since the Empirical Rule says that almost all points must be between \(\mu \pm 3\sigma\), we would expect almost all of the 1000 points to be between 6 and 18, since these points are \(\mu \pm 4\sigma\), and we would be quite surprised if even one point is above 18. (ii) If the distribution is symmetric, we would expect half of the 1000 points, or 500, on one side. Again there will be 2 points for a correct answer to (i).
e) For the numbers 11.1, 13.2, 15.1 and 11.5, compute the i) Root-mean-square ii) Harmonic
mean, iii) geometric mean (2 each)
Solution: Note that \(\sum x = 50.9\). This is not used in any of the following calculations and there is no reason why you should have computed it!
(i) The Root-Mean-Square.
\[ x_{rms}^2 = \frac{1}{n}\sum x^2 = \frac{1}{4}\left(11.1^2 + 13.2^2 + 15.1^2 + 11.5^2\right) = \frac{1}{4}\left(123.21 + 174.24 + 228.01 + 132.25\right) = \frac{1}{4}(657.71) = 164.4275 \]
So \(x_{rms} = \sqrt{\frac{1}{n}\sum x^2} = \sqrt{164.4275} = 12.823\).
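(ii) The Harmonic Mean.
\[ \frac{1}{x_h} = \frac{1}{n}\sum \frac{1}{x} = \frac{1}{4}\left(\frac{1}{11.1} + \frac{1}{13.2} + \frac{1}{15.1} + \frac{1}{11.5}\right) = \frac{1}{4}\left(0.090090 + 0.075758 + 0.066225 + 0.086957\right) = \frac{1}{4}(0.319029) = 0.079757 \]
So \(x_h = \frac{1}{\frac{1}{n}\sum \frac{1}{x}} = \frac{1}{0.079757} = 12.5380\).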
(iii) The Geometric Mean.
\[ x_g = \left(x_1 \cdot x_2 \cdot x_3 \cdots x_n\right)^{1/n} = \sqrt[n]{\prod x} = \sqrt[4]{(11.1)(13.2)(15.1)(11.5)} = \sqrt[4]{25443.198} = (25443.198)^{1/4} = (25443.198)^{0.25} = 12.6297 \]
Or
\[ \ln(x_g) = \frac{1}{n}\sum \ln(x) = \frac{1}{4}\left(\ln 11.1 + \ln 13.2 + \ln 15.1 + \ln 11.5\right) = \frac{1}{4}\left(2.40695 + 2.58022 + 2.71469 + 2.44234\right) = \frac{1}{4}(10.14420) = 2.53605 \]
So \(x_g = e^{2.53605} = 12.6297\). I got the last result by putting 2.53605 into the calculator and pressing 'inverse' and then 'ln x.'
Or
\[ \log(x_g) = \frac{1}{n}\sum \log(x) = \frac{1}{4}\left(\log 11.1 + \log 13.2 + \log 15.1 + \log 11.5\right) = \frac{1}{4}\left(1.04532 + 1.12057 + 1.17898 + 1.06070\right) = \frac{1}{4}(4.40557) = 1.10139 \]
So \(x_g = 10^{1.10139} = 12.6297\). I got the last result by putting 1.10139 into the calculator and pressing 'inverse' and then 'log x.'
Notice that the original numbers and all the means are between 11.1 and 15.1. In spite of everything that I said, there are many of you who think that: (i) You can find a sum of squares by summing numbers and squaring the sum; (ii) You can find the sum of \(1/x\) by adding up the numbers and taking the reciprocal; (iii) You can find an nth root by dividing by n. I can only recommend a remedial math class (unless, of course, you want to try listening in class and checking out the homework very carefully.)
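The three means in e) can be reproduced with the Python standard library. This is a minimal sketch with illustrative variable names. The geometric mean is computed through natural logs, the second of the three routes in the solution.

```python
import math

x = [11.1, 13.2, 15.1, 11.5]
n = len(x)

# Root-mean-square: square each number first, then average, then take the root.
rms = math.sqrt(sum(v**2 for v in x) / n)

# Harmonic mean: average the reciprocals first, then take the reciprocal.
harmonic = 1 / (sum(1 / v for v in x) / n)

# Geometric mean: average the logs first, then exponentiate.
geometric = math.exp(sum(math.log(v) for v in x) / n)

print(round(rms, 3), round(harmonic, 4), round(geometric, 4))
```

For positive numbers that are not all equal, the harmonic mean is below the geometric mean, which is below the root-mean-square, and all three lie between the smallest and largest observation.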
Part III. Do the following problems (25 Points)
1. In a period of 7 days you make the following numbers of sales (in millions):

Day:    1     2     3     4     5     6     7
Sales:  9.2   10.2  9.2   11.2  19.2  12.2  14.2
Compute the following (assuming that the numbers are a sample):
a) Mean Sales (1)
b) The Median (1)
c) The Standard Deviation (3)
d) The 2nd Quintile (2)
Solution: Compute the following. Note that x is in order.
\(n = 7\), \(\sum x = 85.4\), \(\sum x^2 = 1117.88\), \(\sum (x - \bar{x}) = 0.00\), \(\sum (x - \bar{x})^2 = 76.00\).

Index   \(x\)     \(x^2\)      \(x - \bar{x}\)    \((x - \bar{x})^2\)
1       9.2       84.64        -3.0               9.00
2       9.2       84.64        -3.0               9.00
3       10.2      104.04       -2.0               4.00
4       11.2      125.44       -1.0               1.00
5       12.2      148.84       0.0                0.00
6       14.2      201.64       2.0                4.00
7       19.2      368.64       7.0                49.00
Sum     85.4      1117.88      0.0                76.00

Isn't it wonderful how predictable so many of you are! I strongly recommended that you compute the variance by the computational formula in both this and the next problem. Many of you ignored me. Two thirds of those who used the definitional formula got the problem wrong because they had not checked out the method enough so that they knew what the formula meant.
\(\sum x^2\) is not \(\left(\sum x\right)^2 = (85.4)^2\), as some of you seem to have fooled yourselves into believing. Nor is \(\sum (x - \bar{x})^2\) equal to \(\left(\sum x - \bar{x}\right)^2 = (85.4 - 12.2)^2\).
If you had tried these in any of the homework problems, you would have found that these tricks didn't work.
Note that, to be reasonable, the mean, median and 2nd quintile must fall between 9.2 and 19.2.
a) \(\bar{x} = \frac{\sum x}{n} = \frac{85.4}{7} = 12.2\)
b) Just put the numbers in order and pick the middle number, 11.2.
Or formally: position \(= p(n+1) = a.b = .5(8) = 4.0\)
\(x_{1-p} = x_a + .b(x_{a+1} - x_a)\), so \(x_{1-.5} = x_{.5} = x_4 + .0(x_5 - x_4) = 11.2\)
c) \(s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{1117.88 - 7(12.2)^2}{6} = 12.6667\) or \(s^2 = \frac{\sum (x - \bar{x})^2}{n-1} = \frac{76.00}{6} = 12.6667\)
\(s = \sqrt{12.6667} = 3.55903\)
d) The 2nd quintile has 40% below it. position \(= p(n+1) = a.b = .4(8) = 3.2\)
\(x_{1-p} = x_a + .b(x_{a+1} - x_a)\), so \(x_{1-.4} = x_{.6} = x_3 + .2(x_4 - x_3) = 10.2 + .2(11.2 - 10.2) = 10.4\)
I warned you about quintiles - they are fifths, not fourths. This is an excellent warning! You can't answer a question that you haven't read carefully!
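The whole problem can be checked with a short Python sketch. The helper name `quantile` is hypothetical. It implements the position rule \(p(n+1) = a.b\) used in b) and d) and assumes the position falls between 1 and n.

```python
import math

def quantile(sorted_x, p):
    """Value with fraction p of the data below it, using position p(n+1) = a.b.

    Assumes 1 <= p(n+1) <= n, so that x_a exists.
    """
    pos = p * (len(sorted_x) + 1)
    a = int(pos)                                  # whole part of the position
    b = pos - a                                   # fractional part of the position
    lower = sorted_x[a - 1]                       # x_a (the text counts from 1)
    upper = sorted_x[min(a, len(sorted_x) - 1)]   # x_{a+1}, capped at the last value
    return lower + b * (upper - lower)

x = sorted([9.2, 10.2, 9.2, 11.2, 19.2, 12.2, 14.2])
n = len(x)

mean = sum(x) / n
var = (sum(v**2 for v in x) - n * mean**2) / (n - 1)   # computational formula
sd = math.sqrt(var)
median = quantile(x, 0.5)
second_quintile = quantile(x, 0.4)

print(mean, median, round(var, 4), round(sd, 5), second_quintile)
```

Note that the sum of squares is built by squaring each value before adding, which is exactly the step the comments above warn about.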
2. A bank finds that the amounts overdue on its credit cards are the following. (Assume that the numbers are a sample.)

Are there reasons why so many of you (i) totally ignored the classes, (ii) decided that the frequency column was both \(f\) and \(x\), (iii) computed the \(fx^2\) column by taking each value of \(fx\) and squaring it after I had specifically warned you not to?

amount (thousands)    frequency
0-$1.99999            70
$2.000-3.99999        40
$4.000-5.99999        40
$6.000-7.99999        30
$8.000-9.99999        20
$10.000 and up        0

a. Calculate the Cumulative Frequency (1)
b. Calculate the Mean (1)
c. Calculate the Median (2)
d. Calculate the Mode (1)
e. Calculate the Variance (3)
f. Calculate the Standard Deviation (2)
g. Calculate the Interquartile Range (3)
h. Calculate a Statistic showing Skewness and Interpret it (3)
i. Make an histogram of the Data (Neatness Counts!) (2)
Solution: \(x\) is the midpoint of the class. Our convention is to use the midpoint of 0 to 2, not of 0 to 1.99999.

class             f     F     \(x\)   \(fx\)   \(fx^2\)   \(fx^3\)   \(x-\bar{x}\)   \(f(x-\bar{x})\)   \(f(x-\bar{x})^2\)   \(f(x-\bar{x})^3\)
$0-$1.99999       70    70    1.0     70       70         70         -2.9            -203               588.7                -1707.23
$2.000-3.99999    40    110   3.0     120      360        1080       -0.9            -36                32.4                 -29.16
$4.000-5.99999    40    150   5.0     200      1000       5000       1.1             44                 48.4                 53.24
$6.000-7.99999    30    180   7.0     210      1470       10290      3.1             93                 288.3                893.73
$8.000-9.99999    20    200   9.0     180      1620       14580      5.1             102                520.2                2653.02
Total             200                 780      4520       31020                      0                  1478.0               1863.60

\(n = \sum f = 200\), \(\sum fx = 780\), \(\sum fx^2 = 4520\), \(\sum fx^3 = 31020\), \(\sum f(x - \bar{x}) = 0\), \(\sum f(x - \bar{x})^2 = 1478.0\), and \(\sum f(x - \bar{x})^3 = 1863.60\). Note that, to be reasonable, the mean, median and quartiles must fall between 0 and 10. And no, I did not get the 1.0 in the \(x\) column by rounding 0.999995, or, for that matter, by rounding anything else - Think!
a. Calculate the Cumulative Frequency (1): (See above) The cumulative frequency is the whole F column.
b. Calculate the Mean (1): \(\bar{x} = \frac{\sum fx}{n} = \frac{780}{200} = 3.9\)
c. Calculate the Median (2): position \(= p(n+1) = .5(201) = 100.5\). This is above 70 and below 110, so the interval is 2-3.99999. \(x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right]w\), so \(x_{1-.5} = x_{.5} = 2 + \left[\frac{.5(200) - 70}{40}\right]2 = 3.500\)
d. Calculate the Mode (1): The mode is the midpoint of the largest group. Since 70 is the largest frequency,
the modal group is 0 to 1.99999 and the mode is 1.000.
e. Calculate the Variance (3): \(s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n-1} = \frac{4520 - 200(3.9)^2}{199} = 7.42714\) or \(s^2 = \frac{\sum f(x - \bar{x})^2}{n-1} = \frac{1478.0}{199} = 7.42714\)
f. Calculate the Standard Deviation (2): \(s = \sqrt{7.42714} = 2.72528\)
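The grouped mean, variance and standard deviation can be checked with the following Python sketch. Variable names are illustrative, and the class midpoints follow the convention stated in the solution (midpoint of 0 to 2 is 1.0, and so on).

```python
import math

f = [70, 40, 40, 30, 20]            # class frequencies
x = [1.0, 3.0, 5.0, 7.0, 9.0]       # class midpoints

n = sum(f)
sum_fx = sum(fi * xi for fi, xi in zip(f, x))
# f times (x squared), NOT (f times x) squared.
sum_fx2 = sum(fi * xi**2 for fi, xi in zip(f, x))

mean = sum_fx / n
var = (sum_fx2 - n * mean**2) / (n - 1)    # computational formula for a sample
sd = math.sqrt(var)

print(n, sum_fx, sum_fx2, mean, round(var, 5), round(sd, 5))
```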
g. Calculate the Interquartile Range (3): First Quartile: position \(= p(n+1) = .25(201) = 50.25\). This is above \(F = 0\) and below \(F = 70\), so the group is 0 to 1.99999. \(x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right]w\) gives us \(Q_1 = x_{1-.25} = x_{.75} = 0 + \left[\frac{.25(200) - 0}{70}\right]2 = 1.4286\).
Third Quartile: position \(= p(n+1) = .75(201) = 150.75\). This is above 150 and below 180, so the group is 6.000 to 7.99999. \(Q_3 = x_{1-.75} = x_{.25} = 6 + \left[\frac{.75(200) - 150}{30}\right]2 = 6.000\).
\(IQR = Q_3 - Q_1 = 6.000 - 1.4286 = 4.5714\)
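The grouped-data quantile formula used for the median and the quartiles can be sketched in Python as below. The function name and argument layout are hypothetical. The class is located with the position \(p(n+1)\) and the interpolation uses \(pN\), as in the solution.

```python
def grouped_quantile(p, lower_limits, freqs, width):
    """L_p + ((pN - F) / f_p) * w, where F is the cumulative frequency below the class."""
    N = sum(freqs)
    position = p * (N + 1)          # used only to find the class containing the quantile
    F = 0                           # cumulative frequency below the current class
    for L, f in zip(lower_limits, freqs):
        if F + f >= position:
            return L + (p * N - F) / f * width
        F += f
    raise ValueError("p must be between 0 and 1")

lower_limits = [0, 2, 4, 6, 8]      # lower class limits, in thousands
freqs = [70, 40, 40, 30, 20]

q1 = grouped_quantile(0.25, lower_limits, freqs, 2)
median = grouped_quantile(0.50, lower_limits, freqs, 2)
q3 = grouped_quantile(0.75, lower_limits, freqs, 2)
iqr = q3 - q1

print(round(q1, 4), median, q3, round(iqr, 4))
```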
h. Calculate a Statistic showing Skewness and interpret it (3):
\[ k_3 = \frac{n}{(n-1)(n-2)}\left[\sum fx^3 - 3\bar{x}\sum fx^2 + 2n\bar{x}^3\right] = \frac{200}{(199)(198)}\left[31020 - 3(3.9)(4520) + 2(200)(3.9)^3\right] = 0.00507588(1863.6) = 9.45940 \]
or \(k_3 = \frac{n}{(n-1)(n-2)}\sum f(x - \bar{x})^3 = \frac{200}{(199)(198)}(1863.60) = 9.4594\)
or \(g_1 = \frac{k_3}{s^3} = \frac{9.45942}{(2.72528)^3} = 0.467339\)
or Pearson's Measure of Skewness \(SK = \frac{3(\text{mean} - \text{mode})}{\text{std. deviation}} = \frac{3(3.9 - 1.0)}{2.72528} = 3.1923\)
Because of the positive sign, the measures imply skewness to the right.
i. Make an histogram of the Data (Neatness Counts!) (2): A histogram is a bar graph of the frequency. The first bar is between 0 and 2 on the x axis (or has a midpoint at 1) and has a height of 70.
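The skewness measures can be verified with a short Python sketch. Variable names are illustrative, and the Pearson measure is written exactly as in the solution, with 3(mean - mode) in the numerator. Computed directly, the sum of the cubed deviations comes to 1863.60, which agrees with the bracket in the computational formula.

```python
import math

f = [70, 40, 40, 30, 20]            # class frequencies
x = [1.0, 3.0, 5.0, 7.0, 9.0]       # class midpoints
mode = 1.0                          # midpoint of the class with the largest frequency

n = sum(f)
mean = sum(fi * xi for fi, xi in zip(f, x)) / n
s = math.sqrt(sum(fi * (xi - mean)**2 for fi, xi in zip(f, x)) / (n - 1))

# Third moment about the mean: cube each deviation before weighting by f.
third = sum(fi * (xi - mean)**3 for fi, xi in zip(f, x))

k3 = n / ((n - 1) * (n - 2)) * third
g1 = k3 / s**3                      # relative skewness
pearson = 3 * (mean - mode) / s     # Pearson's measure as used in the solution

print(round(third, 2), round(k3, 4), round(g1, 5), round(pearson, 4))
```

All three statistics are positive, which is consistent with skewness to the right.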