251y0311 - West Chester University`s

251y0311 2/19/03
ECO251 QBA1
FIRST HOUR EXAM
February 21, 2003
Name: ______Key____________
Social Security Number: _____________________
Part I. (34 points)
1.
Which of the following is NOT a reason for the need for sampling?
a) It is usually too costly to study the whole population.
b) It is usually too time consuming to look at the whole population.
c) It is sometimes destructive to observe the entire population.
d) *It is always more informative to investigate a sample than the entire population.
2.
Most analysts focus on the cost of tuition as the way to measure the cost of a college education.
But incidentals, such as textbook costs, are rarely considered. A researcher at Drummand
University wishes to estimate the textbook costs of first-year students at Drummand. To do so, she
monitored the textbook cost of 250 first-year students and found that their average textbook cost
was $300 per semester. Identify the population of interest (target population) to the researcher.
a) All Drummand University students.
b) All college students.
c) *All first-year Drummand University students.
d) The 250 students that were monitored.
3.
Which of the following is a continuous quantitative variable?
a) The color of a student’s eyes
b) The number of employees of an insurance company
c) *The amount of milk produced by a cow in one 24-hour period
d) The number of gallons of milk sold at the local grocery store yesterday
4.
If I describe the place where a number is in a table as column 3, row 5, the location of that number
is a:
a) Field
b) *Cell
c) Stub
d) Label
TABLE 2-1
An insurance company evaluates many numerical variables about a person
before deciding on an appropriate rate for automobile insurance. A
representative from a local insurance agency selected a random sample of
insured drivers and recorded, X, the number of claims each made in the last
3 years, with the following results.
X        f     fx
1       14     14
2       18     36
3       12     36
4        5     20
5        1      5
Total   50    111
5. Referring to Table 2-1, how many drivers are represented in the sample?
a) 5
b) 15
c) 18
d) *50
6. Referring to Table 2-1, how many total claims are represented in the sample?
a) 15
b) 50
c) *111
d) 250
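The two totals asked for in questions 5 and 6 are the column sums of Table 2-1. A minimal sketch of that arithmetic follows; the variable names are illustrative and not part of the exam.

```python
# Table 2-1: number of claims X -> number of drivers f with that many claims
claims_freq = {1: 14, 2: 18, 3: 12, 4: 5, 5: 1}

# Question 5: the number of drivers is the sum of the f column
n_drivers = sum(claims_freq.values())

# Question 6: the total number of claims is the sum of the fx column
total_claims = sum(x * f for x, f in claims_freq.items())

print(n_drivers, total_claims)
```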
7. When constructing charts, the following is plotted at the class midpoints:
a) *frequency histograms.
b) percentage polygons.
c) cumulative relative frequency ogives.
d) All of the above.
8. Which of the following is NOT a reason for drawing a sample?
a) A sample is less time consuming than a census.
b) A sample is less costly to administer than a census.
c) *A sample is usually a good representation of the target population.
d) A sample is less cumbersome and more practical to administer.
TABLE 2-4
A survey was conducted to determine how people rated the quality of
programming available on television. Respondents were asked to rate the
overall quality from 0 (no quality at all) to 100 (extremely good quality).
The stem-and-leaf display of the data is shown below.
Stem | Leaves
3    | 24
4    | 03478999
5    | 0112345
6    | 12566
7    | 01
8    |
9    | 2
9. Referring to Table 2-4, what percentage of the respondents rated overall television quality with a
rating of 50 or below?
a) 0.11
b) 0.40
c) *0.44 (11 out of 25)
d) 0.56
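The count behind question 9 can be checked by expanding the stem-and-leaf display back into individual ratings. This is only a sketch; the dictionary layout is an assumption made for illustration.

```python
# Table 2-4: each stem is the tens digit, each leaf character is a units digit
stem_leaf = {3: "24", 4: "03478999", 5: "0112345", 6: "12566",
             7: "01", 8: "", 9: "2"}

# Rebuild the individual ratings from stems and leaves
ratings = [10 * stem + int(leaf)
           for stem, leaves in stem_leaf.items()
           for leaf in leaves]

n = len(ratings)
at_most_50 = sum(1 for r in ratings if r <= 50)
proportion = at_most_50 / n

print(n, at_most_50, proportion)
```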
TABLE 2-11
The ordered array below resulted from taking a sample of 25 batches of 500
computer chips and determining how many in each batch were defective.
Defects
1, 2, 4, 4, 5, 5, 6, 7, 9, 9, 12, 12, 15, 17, 20, 21, 23, 23, 25, 26, 27, 27, 28, 29, 29

Class          Frequency (f)   Rel Frequency (f rel)
0 – 4.99             4               .16
5 – 9.99             6               .24
10 – 14.99           2               .08
15 – 19.99           2               .08
20 – 24.99           4               .16
25 – 29.99           7               .28
Total               25              1.00
10. Referring to Table 2-11, if a frequency distribution for the defects data is constructed, using "0 but
less than 5" as the first class, the frequency of the “20 but less than 25” class would be __4______.
11. Referring to Table 2-11, if a frequency distribution for the defects data is constructed, using "0 but
less than 5" as the first class, the relative frequency of the “15 but less than 20” class would be
___.08_____.
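The frequency distribution used in questions 10 and 11 can be rebuilt from the ordered array by counting how many values fall in each "lower limit but less than lower limit + 5" class. A sketch, with hypothetical variable names:

```python
# Ordered array from Table 2-11 (defective chips per batch of 500)
defects = [1, 2, 4, 4, 5, 5, 6, 7, 9, 9, 12, 12, 15, 17, 20, 21,
           23, 23, 25, 26, 27, 27, 28, 29, 29]

width = 5
lower_limits = range(0, 30, width)  # 0, 5, 10, 15, 20, 25

# Frequency: values with lower <= value < lower + width
freq = {lo: sum(1 for d in defects if lo <= d < lo + width)
        for lo in lower_limits}

# Relative frequency: class frequency divided by the sample size
rel_freq = {lo: f / len(defects) for lo, f in freq.items()}

print(freq[20], rel_freq[15])
```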
TABLE 2-13
A research analyst was directed to arrange raw data collected on the yield of
wheat, ranging from 40 to 93 bushels per acre, in a frequency distribution.
12. If the researcher was directed to present the data in 5 classes, what should the class interval be?
Show your calculations. Solution: \(\frac{93 - 40}{5} = 10.6\). Use 11 or more.
13. Show the actual intervals you might use. (I used 12 as my width; some used 15.)
Class   From   to
A       40     51.99
B       52     63.99
C       64     75.99
D       76     87.99
E       88     99.99
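The width calculation and the resulting class limits can be sketched as follows. The chosen width of 12 is the one used in the key; any convenient width of 11 or more would also cover the data, and the names below are illustrative.

```python
low, high, k = 40, 93, 5   # smallest value, largest value, number of classes

# Minimum width needed so that k classes span the range
min_width = (high - low) / k

# A convenient width at or above the minimum (12 is the key's choice)
width = 12

# Each class runs from its lower limit up to, but not including, the next one
classes = [(low + i * width, low + (i + 1) * width) for i in range(k)]

for label, (start, end) in zip("ABCDE", classes):
    print(label, start, "to under", end)
```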
14. Which of the following is NOT sensitive to extreme values?
a) The range.
b) The standard deviation.
c) *The interquartile range.
d) The coefficient of variation.
15. In right-skewed distributions, which of the following is the correct statement? (Q2 and the
median are the same.)
a) The distance from Q1 to Q2 is larger than the distance from Q2 to Q3.
b) *The distance from Q1 to Q2 is smaller than the distance from Q2 to Q3.
c) The mean is smaller than the median.
d) The mode is larger than the mean.
16. In a perfectly symmetrical distribution
a) the range equals the interquartile range.
b) the interquartile range equals the mean.
c) *the median equals the mean.
d) the variance equals the standard deviation
17. Evaluate the following statements:
(i) If every individual in the population is equally likely to be chosen to be in the sample, we must be
taking a simple random sample.
(ii) The sum of cumulative frequencies in a distribution always equals 1.
(iii) The Chebyshev inequality says that at least 1/9 of the data must be 3 standard deviations or more from
the mean.
a) Only the first is true.
b) Only the second is true.
c) Only the third is true.
d) *None are true.
e) All are true.
A simple random sample of n must also have all possible samples of n equally likely. The
relative frequencies add to 1. At most 1/9 of the data is 3 or more standard deviations from the
mean.
Part II.
My Social Security Number is 265398248. If I write it in clumps of 2 numbers I get:
26, 53, 98, 24, 8.
Write your social security number the same way.
Compute the following:
a) The Median (1)
b) The Standard Deviation (3)
c) The 2nd Quintile (2)
Solution: The numbers in order are 8, 24, 26, 53, 98.
          x      x^2
x1        8       64
x2       24      576
x3       26      676
x4       53     2809
x5       98     9604
Total   209    13729
a) The middle number is 26.
b) \(n = 5\), \(\bar{x} = \frac{\sum x}{n} = \frac{209}{5} = 41.80\),
\[ s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1} = \frac{13729 - 5(41.80)^2}{4} = \frac{4992.8}{4} = 1248.2. \]
So \(s = \sqrt{1248.2} = 35.3299\).
c) \(p(n + 1) = .4(6) = 2.4\). So \(a = 2\) and \(.b = .4\).
\(x_{1-p} = x_a + .b(x_{a+1} - x_a)\), so
\[ x_{1-.4} = x_{.6} = x_2 + 0.4(x_3 - x_2) = 24 + 0.4(26 - 24) = 24.8. \]
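The three Part II statistics can be checked with a short script. This is a sketch under stated assumptions: the helper `quantile_by_position` is a hypothetical name, it applies the position rule p(n + 1) with linear interpolation between neighboring order statistics, and the clamping at the two ends is an added assumption the exam does not address.

```python
import statistics

data = sorted([26, 53, 98, 24, 8])

median = statistics.median(data)   # middle value of the ordered data
s = statistics.stdev(data)         # sample standard deviation (n - 1 divisor)

def quantile_by_position(sorted_x, p):
    """Value at position p(n + 1), interpolating between order statistics."""
    pos = p * (len(sorted_x) + 1)
    a = int(pos)        # whole part of the position (1-based)
    b = pos - a         # fractional part of the position
    if a < 1:                       # assumption: clamp below the first value
        return sorted_x[0]
    if a >= len(sorted_x):          # assumption: clamp above the last value
        return sorted_x[-1]
    return sorted_x[a - 1] + b * (sorted_x[a] - sorted_x[a - 1])

second_quintile = quantile_by_position(data, 0.4)

print(median, round(s, 4), round(second_quintile, 4))
```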
ECO251 QBA1
FIRST EXAM
February 21, 2003
TAKE HOME SECTION
Name: _________________________
Social Security Number: _________________________
Throughout this exam show your work! Please indicate clearly what sections of the problem you are
answering and what formulas you are using.
Part III. Do all the Following (11 Points) Show your work!
1. My Social Security Number is 265398248. If I use each digit as a frequency in the intervals below, I
get:
Class            frequency
$300- 399.99         2
$400- 499.99         6
$500- 599.99         5
$600- 699.99         3
$700- 799.99         9
$800- 899.99         8
$900- 999.99         2
$1000-1099.99        4
$1100-1199.99        8
Replace my social security number with your own in the frequency column. To make the problem easier,
you may replace all zeros in your new frequency column with 10s.
Assume that this data represents a sample of rents paid in Chester County.
a. Calculate the Cumulative Frequency (0.5)
b. Calculate the Mean (0.5)
c. Calculate the Median (1)
d. Calculate the Mode (0.5)
e. Calculate the Variance (1.5)
f. Calculate the Standard Deviation (1)
g. Calculate the Interquartile Range (1.5)
h. Calculate a Statistic showing Skewness and Interpret it (1.5)
i. Make a histogram of the Data showing relative or percentage frequency (Neatness Counts!) (1)
j. Extra credit: Put a (horizontal) box plot below the histogram using the same scale. (1)
Solution: x is the midpoint of the class. Our convention is to use the midpoint of 0 to 2, not 1.99999. Note
also that the midpoints have been divided by 10. Most numbers should be multiplied by 10, the variance
should be multiplied by 100 and \(k_3\) by 1000.
class              f    F     x     fx     fx^2       fx^3
A  300- 399.99     2    2    35     70     2450      85750
B  400- 499.99     6    8    45    270    12150     546750
C  500- 599.99     5   13    55    275    15125     831875
D  600- 699.99     3   16    65    195    12675     823875
E  700- 799.99     9   25    75    675    50625    3796875
F  800- 899.99     8   33    85    680    57800    4913000
G  900- 999.99     2   35    95    190    18050    1714750
H 1000-1099.99     4   39   105    420    44100    4630500
I 1100-1199.99     8   47   115    920   105800   12167000
Total (n)         47              3695   318775   29510375

class    x - x̄      f(x - x̄)   f(x - x̄)^2   f(x - x̄)^3
A      -43.6170     -87.234      3804.9      -165958
B      -33.6170    -201.702      6780.6      -227944
C      -23.6170    -118.085      2788.8       -65864
D      -13.6170     -40.851       556.3        -7575
E       -3.6170     -32.553       117.7         -426
F        6.3830      51.064       325.9         2080
G       16.3830      32.766       536.8         8794
H       26.3830     105.532      2784.2        73457
I       36.3830     291.064     10589.8       385287
Total                 0.001     28285.0         1851

\(\sum f = n = 47\), \(\sum fx = 3695\), \(\sum fx^2 = 318775\), \(\sum fx^3 = 29510375\),
\(\sum f(x - \bar{x}) = 0\), \(\sum f(x - \bar{x})^2 = 28285\), and \(\sum f(x - \bar{x})^3 = 1851\).
Note that, to be reasonable, the mean, median and quartiles must fall between 0 and 180.
a. Calculate the Cumulative Frequency (1): (See above) The cumulative frequency is the whole F column.
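The cumulative frequencies and the column totals of the worksheet can be reproduced in a few lines. A sketch, assuming the key's frequencies (the digits 2, 6, 5, 3, 9, 8, 2, 4, 8) and the midpoints divided by 10; the variable names are illustrative.

```python
from itertools import accumulate

freqs = [2, 6, 5, 3, 9, 8, 2, 4, 8]             # f, one per class A..I
midpoints = [35 + 10 * i for i in range(9)]     # x = 35, 45, ..., 115

n = sum(freqs)
cumulative = list(accumulate(freqs))            # the F column

# Column totals used throughout the solution
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, midpoints))
sum_fx3 = sum(f * x ** 3 for f, x in zip(freqs, midpoints))

mean = sum_fx / n

print(cumulative)
print(n, sum_fx, sum_fx2, sum_fx3, round(mean, 4))
```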
b. Calculate the Mean (1): \(\bar{x} = \frac{\sum fx}{n} = \frac{3695}{47} = 78.6170\)
c. Calculate the Median (2): \(position = p(n + 1) = .5(48) = 24\). This is above \(F = 16\) and below \(F = 25\), so
the interval is E, 70-79.999 in tens of dollars. \(x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w\), so
\[ x_{1-.5} = x_{.5} = 70 + \left[\frac{.5(47) - 16}{9}\right] 10 = 70 + 0.83333(10) = 78.3333. \]
d. Calculate the Mode (1): The mode is the midpoint of the largest group. Since 9 is the largest frequency,
the modal group is E, 700 to 799.99, and the mode is 75 (in tens of dollars, that is, $750).
e. Calculate the Variance (3):
\[ s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} = \frac{\sum fx^2 - n\bar{x}^2}{n - 1} = \frac{318775 - 47(78.6170)^2}{46} = \frac{28285.264}{46} = 614.897 \]
or \(s^2 = \frac{28285.0}{46} = 614.891\). The computer got 614.894.
f. Calculate the Standard Deviation (2): \(s = \sqrt{614.897} = 24.7971\) or \(s = \sqrt{614.891} = 24.7970\)
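The small differences between 614.897, 614.891 and the computer's 614.894 come from rounding the mean before squaring. A sketch that evaluates both variance formulas at full precision, using the key's frequencies and midpoints (in tens of dollars); the names are illustrative.

```python
import math

freqs = [2, 6, 5, 3, 9, 8, 2, 4, 8]
midpoints = [35 + 10 * i for i in range(9)]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Definitional formula: sum of f(x - mean)^2 over n - 1
var_definitional = sum(f * (x - mean) ** 2
                       for f, x in zip(freqs, midpoints)) / (n - 1)

# Computational formula: (sum of f x^2 - n mean^2) over n - 1
sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, midpoints))
var_computational = (sum_fx2 - n * mean ** 2) / (n - 1)

std_dev = math.sqrt(var_definitional)

print(round(var_definitional, 3), round(var_computational, 3), round(std_dev, 4))
```

Multiply the variance by 100 and the standard deviation by 10 to return to dollars.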
g. Calculate the Interquartile Range (3): First Quartile: \(position = p(n + 1) = .25(48) = 12\). This is above
\(F = 8\) and below \(F = 13\), so the interval is C, 500-599.99. \(x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w\) gives us, in tens of dollars,
\[ Q1 = x_{1-.25} = x_{.75} = 50 + \left[\frac{.25(47) - 8}{5}\right] 10 = 50 + 0.75(10) = 57.5. \]
Third Quartile: \(position = p(n + 1) = .75(48) = 36\). This is above \(F = 35\) and below \(F = 39\), so the interval
is H, 1000-1099.99.
\[ Q3 = x_{1-.75} = x_{.25} = 100 + \left[\frac{.75(47) - 35}{4}\right] 10 = 100.625. \]
\(IQR = Q3 - Q1 = 100.625 - 57.5 = 43.125\).
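The grouped-data interpolation used for the median and the quartiles can be written as one function, which is handy for checking the arithmetic. This is a sketch, not the course's software: `grouped_quantile` is a hypothetical helper that locates the class by the position p(n + 1) and then applies L + ((pN - F) / f) w, with limits expressed in tens of dollars.

```python
from itertools import accumulate

def grouped_quantile(lower_limits, freqs, width, p):
    """Interpolated quantile for grouped data.

    The class is the first one whose cumulative frequency reaches the
    position p(n + 1); F is the cumulative frequency below that class.
    """
    n = sum(freqs)
    position = p * (n + 1)
    cumulative = list(accumulate(freqs))
    for i, upper_count in enumerate(cumulative):
        if position <= upper_count:
            below = cumulative[i - 1] if i > 0 else 0
            return lower_limits[i] + (p * n - below) / freqs[i] * width
    raise ValueError("position p(n + 1) lies beyond the last class")

freqs = [2, 6, 5, 3, 9, 8, 2, 4, 8]
lower_limits = [30 + 10 * i for i in range(9)]   # 30, 40, ..., 110

q1 = grouped_quantile(lower_limits, freqs, 10, 0.25)
median = grouped_quantile(lower_limits, freqs, 10, 0.50)
q3 = grouped_quantile(lower_limits, freqs, 10, 0.75)
iqr = q3 - q1

print(q1, round(median, 4), q3, iqr)
```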
h. Calculate a Statistic showing Skewness and interpret it (3):
\[ k_3 = \frac{n}{(n - 1)(n - 2)}\left[\sum fx^3 - 3\bar{x}\sum fx^2 + 2n\bar{x}^3\right] = \frac{47}{(46)(45)}\left[29510375 - 3(78.6170)(318775) + 2(47)(78.6170)^3\right] = 0.0227053(1836) = 41.687 \]
or
\[ k_3 = \frac{n}{(n - 1)(n - 2)}\sum f(x - \bar{x})^3 = \frac{47}{(46)(45)}(1851) = 42.028. \]
The computer gets 42.062 and 41.959.
or \(g_1 = \frac{k_3}{s^3} = \frac{42}{24.797^3} = .00275\)
or Pearson's Measure of Skewness \(SK = \frac{3(mean - mode)}{std.\ deviation} = \frac{3(78.6170 - 75)}{24.797} = 0.4376\)
Because of the positive sign, the measures imply skewness to the right.
i. A histogram is a simple bar graph with frequency on the y-axis and the numbers 300-1200 on the x-axis.
j. The box plot should show the median and the quartiles.
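The three skewness measures in part h can be evaluated without intermediate rounding, which shows why the hand results differ slightly from the computer's. A sketch using the key's frequencies and midpoints (in tens of dollars); the Pearson measure follows the key's formula, 3(mean - mode)/s, and the variable names are illustrative.

```python
import math

freqs = [2, 6, 5, 3, 9, 8, 2, 4, 8]
midpoints = [35 + 10 * i for i in range(9)]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
s = math.sqrt(sum(f * (x - mean) ** 2
                  for f, x in zip(freqs, midpoints)) / (n - 1))

# k3: third-moment statistic with the n / ((n - 1)(n - 2)) correction
third_moment_sum = sum(f * (x - mean) ** 3 for f, x in zip(freqs, midpoints))
k3 = n / ((n - 1) * (n - 2)) * third_moment_sum

# g1: relative skewness, k3 divided by the cube of the standard deviation
g1 = k3 / s ** 3

# Pearson's measure as written in the key, with the mode at the midpoint of class E
mode = midpoints[freqs.index(max(freqs))]
pearson_sk = 3 * (mean - mode) / s

print(round(k3, 3), round(g1, 5), round(pearson_sk, 4))
```

All three values are positive, which is the sign the key uses to conclude skewness to the right.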
251x0311 2/13/03
2. My Social Security Number is 265398248. If I write it in clumps of 2 numbers I get:
26, 53, 98, 24, 8.
Write your social security number the same way.
For these five numbers, compute the a) Geometric Mean, b) Harmonic Mean, c) Root-mean-square (1 point
each). Label each clearly. If you wish, d) compute the geometric mean using natural or base 10 logarithms
(1 point extra credit each). While you're at it, compute the sample mean and bring it to the exam (no credit,
but it won't hurt).
Solution: Note that \(\sum x = 209\). This is not used in any of the following calculations and there is no reason
why you should have computed it!
a) The Geometric Mean.
\[ x_g = (x_1 \cdot x_2 \cdot x_3 \cdots x_n)^{1/n} = \sqrt[n]{\prod x} = \sqrt[5]{(26)(53)(98)(24)(8)} = \sqrt[5]{25928448} = (25928448)^{1/5} = (25928448)^{0.2} = 30.3917. \]
b) The Harmonic Mean.
\[ \frac{1}{x_h} = \frac{1}{n}\sum\frac{1}{x} = \frac{1}{5}\left(\frac{1}{26} + \frac{1}{53} + \frac{1}{98} + \frac{1}{24} + \frac{1}{8}\right) = \frac{1}{5}(0.0384615 + 0.0188679 + 0.0102041 + 0.0416667 + 0.125) = \frac{1}{5}(0.2342002) = 0.0468400. \]
So \(x_h = \frac{1}{\frac{1}{n}\sum\frac{1}{x}} = \frac{1}{0.0468400} = 21.3493\).
c) The Root-Mean-Square.
\[ x_{rms}^2 = \frac{1}{n}\sum x^2 = \frac{1}{5}\left(26^2 + 53^2 + 98^2 + 24^2 + 8^2\right) = \frac{1}{5}(676 + 2809 + 9604 + 576 + 64) = \frac{1}{5}(13729) = 2745.8. \]
So \(x_{rms} = \sqrt{\frac{1}{n}\sum x^2} = \sqrt{2745.8} = 52.4004\).
d) (i)
\[ \ln(x_g) = \frac{1}{n}\sum\ln(x) = \frac{1}{5}\left(\ln 26 + \ln 53 + \ln 98 + \ln 24 + \ln 8\right) = \frac{1}{5}(3.25809 + 3.97029 + 4.58497 + 3.17805 + 2.07944) = \frac{1}{5}(17.0709) = 3.4142. \]
So \(x_g = e^{3.4142} = 30.3917\).
(ii)
\[ \log(x_g) = \frac{1}{n}\sum\log(x) = \frac{1}{5}\left(\log 26 + \log 53 + \log 98 + \log 24 + \log 8\right) = \frac{1}{5}(1.41497 + 1.72428 + 1.99123 + 1.38021 + 0.90309) = \frac{1}{5}(7.41378) = 1.48276. \]
So \(x_g = 10^{1.48276} = 30.3917\).
Notice that the original numbers and all the means are between 8 and 98.
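The geometric, harmonic and root-mean-square means, and the logarithmic route to the geometric mean, can be checked numerically. A sketch with illustrative variable names; `math.prod` assumes Python 3.8 or later.

```python
import math

x = [26, 53, 98, 24, 8]
n = len(x)

geometric = math.prod(x) ** (1 / n)              # nth root of the product
harmonic = n / sum(1 / v for v in x)             # reciprocal of the mean reciprocal
rms = math.sqrt(sum(v * v for v in x) / n)       # square root of the mean square

# Geometric mean again, through natural and base 10 logarithms
geometric_ln = math.exp(sum(math.log(v) for v in x) / n)
geometric_log10 = 10 ** (sum(math.log10(v) for v in x) / n)

arithmetic = sum(x) / n

print(round(harmonic, 4), round(geometric, 4), arithmetic, round(rms, 4))
```

For positive data these means are always ordered harmonic, geometric, arithmetic, root-mean-square from smallest to largest, which gives a quick sanity check on any hand calculation.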