Uploaded by abumuhammad.sani

SUMMARY ON STATISTIC C.B.T

Practice Test 1
Directions: For each question, choose the best answer from the options provided. There is only one correct
answer.
1.
Facts and figures that are collected, analyzed and summarized for presentation and interpretation
are
a.
data
b.
variables
c.
elements
d.
Both variables and elements are correct.
ANSWER:
a
2.
All the data collected in a particular study are referred to as the
a.
census
b.
inference
c.
variable
d.
data set
ANSWER:
d
3.
Quantitative data
a.
are always nonnumeric
b.
may be either numeric or nonnumeric
c.
are always numeric
d.
are always labels
ANSWER:
c
4.
Qualitative data
a.
are always nonnumeric
b.
may be either numeric or nonnumeric
c.
are always numeric
d.
indicate either how much or how many
ANSWER:
b
5.
Arithmetic operations are inappropriate for
a.
qualitative data
b.
quantitative data
c.
both qualitative and quantitative data
d.
large data sets
ANSWER:
a
6.
In a questionnaire, respondents are asked to record their age in years. Age is an example of a
a.
qualitative variable
b.
quantitative variable
c.
qualitative or quantitative variable, depending on how the respondents answered the
question
d.
ratio variable
ANSWER:
b
7.
In an application for a credit card, potential customers are asked for their social security numbers. A
social security number is an example of a
a.
qualitative variable
b.
quantitative variable
c.
qualitative or quantitative variable, depending on how the respondents answered the question
d.
ratio variable
ANSWER:
a
8.
Data collected at the same, or approximately the same, point in time are
a.
time series data
b.
approximate time series data
c.
cross-sectional data
d.
approximate data
ANSWER:
c
9.
Data collected over several time periods are
a.
time series data
b.
time controlled data
c.
cross-sectional data
d.
time cross-sectional data
ANSWER:
a
10.
Statistical studies in which researchers do not control variables of interest are
a.
experimental studies
b.
uncontrolled experimental studies
c.
not of any value
d.
observational studies
ANSWER:
d
11.
Statistical studies in which researchers control variables of interest are
a.
experimental studies
b.
control observational studies
c.
non experimental studies
d.
observational studies
ANSWER:
a
12.
A frequency distribution is
a.
a tabular summary of a set of data showing the fraction of items in each of several
nonoverlapping classes
b.
a graphical form of representing data
c.
a tabular summary of a set of data showing the number of items in each of several
nonoverlapping classes
d.
a graphical device for presenting qualitative data
ANSWER:
c
13.
The sum of frequencies for all classes will always equal
a.
1
b.
the number of elements in a data set
c.
the number of classes
d.
a value between 0 and 1
ANSWER:
b
Exhibit 2-1
The numbers of hours worked (per week) by 400 statistics students are shown below.
Number of hours   Frequency
0 - 9             20
10 - 19           80
20 - 29           200
30 - 39           100
14.
Refer to Exhibit 2-1. The class width for this distribution
a.
is 9
b.
is 10
c.
is 39, which is: the largest value minus the smallest value or 39 - 0 = 39
d.
varies from class to class
ANSWER:
b
15.
Refer to Exhibit 2-1. The midpoint of the last class is
a.
50
b.
34
c.
35
d.
34.5
ANSWER:
d
16.
Refer to Exhibit 2-1. The number of students working 19 hours or less
a.
is 80
b.
is 100
c.
is 180
d.
is 300
ANSWER:
b
17.
Refer to Exhibit 2-1. The relative frequency of students working 9 hours or less
a.
is 20
b.
is 100
c.
is 0.95
d.
is 0.05
ANSWER:
d
18.
Refer to Exhibit 2-1. The cumulative relative frequency for the class of 20 - 29
a.
is 300
b.
is 0.25
c.
is 0.75
d.
is 0.5
ANSWER:
c
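The answers to the Exhibit 2-1 questions can be checked with a short script. This is a minimal sketch in Python; the variable names are illustrative and are not part of the original test.

```python
# Exhibit 2-1: hours worked per week by 400 statistics students.
classes = [(0, 9), (10, 19), (20, 29), (30, 39)]
frequencies = [20, 80, 200, 100]

n = sum(frequencies)  # total number of students

# Class width: distance between consecutive lower class limits.
class_width = classes[1][0] - classes[0][0]

# Class midpoint: average of the lower and upper class limits.
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# Relative frequency: class frequency divided by n.
relative = [f / n for f in frequencies]

# Cumulative frequency: running total of the class frequencies.
cumulative = []
running = 0
for f in frequencies:
    running += f
    cumulative.append(running)

# Cumulative relative frequency: cumulative frequency divided by n.
cumulative_relative = [c / n for c in cumulative]

for (lo, hi), f, rf, cf, crf in zip(classes, frequencies, relative,
                                    cumulative, cumulative_relative):
    print(f"{lo:>2} - {hi:<2}  f={f:<4} rel={rf:<5} cum={cf:<4} cum_rel={crf}")
```

Reading the rows against questions 14 to 18: the width is 10, the last midpoint is 34.5, the cumulative frequency through 10 - 19 is 100, the relative frequency of 0 - 9 is 0.05, and the cumulative relative frequency through 20 - 29 is 0.75.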
19.
A graphical device for presenting qualitative data summaries based on subdivision of a circle into
sectors that correspond to the relative frequency for each class is a
a.
histogram
b.
stem-and-leaf display
c.
pie chart
d.
bar graph
ANSWER:
c
20.
Qualitative data can be graphically represented by using a(n)
a.
histogram
b.
frequency polygon
c.
ogive
d.
bar graph
ANSWER:
d
21.
The mean of a sample is
a.
always equal to the mean of the population
b.
always smaller than the mean of the population
c.
computed by summing the data values and dividing the sum by (n - 1)
d.
computed by summing all the data values and dividing the sum by the number of items
ANSWER:
d
22.
After the data has been arranged from smallest value to largest value, the value in the middle is called
the
a.
range
b.
median
c.
mean
d.
None of the other answers are correct.
ANSWER:
b
23.
The 75th percentile is also the
a.
first quartile
b.
second quartile
c.
third quartile
d.
fourth quartile
ANSWER:
c
24.
Which of the following is NOT a measure of location?
a.
mean
b.
median
c.
variance
d.
mode
ANSWER:
c
25.
The measure of location that is the most likely to be influenced by extreme values in the data set is the
a.
range
b.
median
c.
mode
d.
mean
ANSWER:
d
26.
The difference between the largest and the smallest data values is the
a.
variance
b.
interquartile range
c.
range
d.
coefficient of variation
ANSWER:
c
27.
The variance of the sample
a.
can never be negative
b.
can be negative
c.
cannot be zero
d.
cannot be less than one
ANSWER:
a
28.
The variance of a sample of 81 observations equals 64. The standard deviation of the sample equals
a.
0
b.
4096
c.
8
d.
6,561
ANSWER:
c
29.
Which of the following symbols represents the size of the sample
a.
σ²
b.
σ
c.
N
d.
n
ANSWER:
d
30.
Which of the following symbols represents the mean of the sample?
a.
σ²
b.
σ
c.
μ
d.
x̄
ANSWER:
d
Exhibit 3-2
A researcher has collected the following sample data. The mean of the sample is 5.
3   5   12   3   2
31.
Refer to Exhibit 3-2. The variance is
a.
80
b.
4.062
c.
13.2
d.
16.5
ANSWER:
d
32.
Refer to Exhibit 3-2. The standard deviation is
a.
8.944
b.
4.062
c.
13.2
d.
16.5
ANSWER:
b
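The two Exhibit 3-2 answers follow from the sample variance formula, which divides by n - 1. The snippet below is a small sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

# Exhibit 3-2 sample data (the exhibit states the mean is 5).
data = [3, 5, 12, 3, 2]

mean = statistics.mean(data)

# Sample variance: sum of squared deviations divided by (n - 1).
squared_deviations = [(x - mean) ** 2 for x in data]
variance_by_hand = sum(squared_deviations) / (len(data) - 1)

# The library functions use the same (n - 1) divisor.
variance = statistics.variance(data)
std_dev = statistics.stdev(data)

print(mean, variance_by_hand, variance, round(std_dev, 3))
```

The squared deviations are 4, 0, 49, 4 and 9, which sum to 66, so the variance is 66/4 = 16.5 and the standard deviation is its square root, about 4.062.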
33.
For any continuous random variable, the probability that the random variable takes on exactly a specific
value is
a.
1.00
b.
0.50
c.
any value between 0 to 1
d.
zero
ANSWER:
d
34.
The highest point of a normal curve occurs at
a.
one standard deviation to the right of the mean
b.
two standard deviations to the right of the mean
c.
approximately three standard deviations to the right of the mean
d.
the mean
ANSWER:
d
35.
A standard normal distribution is a normal distribution with
a.
a mean of 1 and a standard deviation of 0
b.
a mean of 0 and a standard deviation of 1
c.
any mean and a standard deviation of 1
d.
any mean and any standard deviation
ANSWER:
b
36.
Z is a standard normal random variable. The P(1.20 ≤ z ≤ 1.85) equals
a.
0.4678
b.
0.3849
c.
0.8527
d.
0.0829
ANSWER:
d
37.
Z is a standard normal random variable. The P(1.05 < z < 2.13) equals
a.
0.8365
b.
0.1303
c.
0.4834
d.
None of the alternative answers is correct.
ANSWER:
b
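Questions 36 and 37 are normally answered from a standard normal table. As a cross-check, here is a sketch that uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of the table; the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

# P(a <= z <= b) is the difference of the cumulative probabilities.
p_q36 = z.cdf(1.85) - z.cdf(1.20)
p_q37 = z.cdf(2.13) - z.cdf(1.05)

print(round(p_q36, 4))
print(round(p_q37, 4))
```

For a continuous variable the strict and non-strict inequalities give the same probability, which is why question 36 (with ≤) and question 37 (with <) are computed the same way.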
38.
A numerical measure of linear association between two variables is the
a.
variance
b.
covariance
c.
standard deviation
d.
coefficient of variation
ANSWER:
b
39.
Positive values of covariance indicate
a.
a positive variance of the x values
b.
a positive variance of the y values
c.
the standard deviation is positive
d.
a positive relation between the x and the y variables
ANSWER:
d
40.
Positive values of covariance indicate
a.
a positive variance of the x values
b.
a positive variance of the y values
c.
the standard deviation is positive
d.
a positive relation between the x and the y variables
ANSWER:
d
41. In regression analysis, the variable that is being predicted is the
a.
dependent variable
b.
independent variable
c.
intervening variable
d.
None of these answers is correct.
ANSWER:
a
42. Regression analysis was applied between sales (in $1,000) and advertising (in $100), and the following
regression function was obtained.
Ŷ = 80 + 6.2x
Based on the above estimated regression line, if advertising is $10,000, then the point estimate for sales
(in dollars) is
a.
$62,080
b.
$142,000
c.
$700
d.
$700,000
ANSWER:
d
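The trap in question 42 is the units: x is measured in $100 and Ŷ in $1,000. The sketch below makes both conversions explicit; the function name is hypothetical and only the coefficients 80 and 6.2 come from the question.

```python
def predicted_sales_dollars(advertising_dollars):
    """Point estimate of sales in dollars from Y-hat = 80 + 6.2x."""
    x = advertising_dollars / 100      # advertising expressed in $100 units
    y_hat = 80 + 6.2 * x               # estimated sales in $1,000 units
    return y_hat * 1000                # convert back to dollars


estimate = predicted_sales_dollars(10_000)
print(round(estimate))
```

With $10,000 of advertising, x = 100, so Ŷ = 80 + 620 = 700 (thousand dollars), which is $700,000.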
43. If the coefficient of correlation is 0.8, the percentage of variation in the dependent variable explained by the
estimated regression equation is
a.
0.80%
b.
80%
c.
0.64%
d.
64%
ANSWER:
d
44. If the coefficient of determination is equal to 1, then the coefficient of correlation
a.
must also be equal to 1
b.
can be either -1 or +1
c.
can be any value between -1 to +1
d.
must be -1
ANSWER:
b
Short Answer: Answer all of the following questions. Make sure to show all work. Solutions with no work
will receive no credit.
1.You are given the following data on the price/earnings (P/E) ratios for twelve companies.
Construct a stem-and-leaf display. Specify the leaf unit for the display.
23   8   25   36   39   48   47   28   22   37   37   26
ANSWER: Leaf unit = 1 (each stem represents tens).
Place the values in ascending order and then put each one under the appropriate stem.
Ordered: 8, 22, 23, 25, 26, 28, 36, 37, 37, 39, 47, 48
0 | 8
1 |
2 | 2 3 5 6 8
3 | 6 7 7 9
4 | 7 8
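The construction above (sort, split each value into a tens stem and a units leaf, keep empty stems) can be sketched in Python as follows. The function name and its `leaf_unit` parameter are illustrative assumptions, not something the original answer defines.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Return the rows of a stem-and-leaf display as strings."""
    stems = defaultdict(list)
    for v in sorted(values):
        scaled = v // leaf_unit                     # express value in leaf units
        stems[scaled // 10].append(scaled % 10)     # stem = tens, leaf = last digit
    rows = []
    # Walk every stem in the range so that empty stems are still shown.
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        rows.append(f"{stem} | {leaves}".rstrip())
    return rows


pe_ratios = [23, 8, 25, 36, 39, 48, 47, 28, 22, 37, 37, 26]
display = stem_and_leaf(pe_ratios)
print("\n".join(display))
```

Keeping the empty stem 1 in the output matters: it shows the gap between 8 and the cluster of values in the twenties.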
2. The grades of 10 students on their first management test are shown below.
94   61   96   66   92   68   75   85   84   78
a.
Construct a frequency distribution. Let the first class be 60 - 69.
b.
Construct a cumulative frequency distribution.
c.
Construct a relative frequency distribution.
ANSWER: Recall that frequency is simply the count of items in each class. Cumulative frequency is the running
total of the counts up to and including each class. Relative frequency is the class frequency divided by the
total number of items, expressed as a proportion.
Class      a. Frequency   b. Cumulative Frequency   c. Relative Frequency
60 - 69    3              3                         0.3
70 - 79    2              5                         0.2
80 - 89    2              7                         0.2
90 - 99    3              10                        0.3
Total      10                                       1.0
Additional Question to Consider: Could you construct a cumulative frequency distribution? Show it above.
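The three distributions for the grades can be tallied in one pass. This is a minimal sketch under the problem's stated class limits (60 - 69 first, width 10); the names are illustrative.

```python
grades = [94, 61, 96, 66, 92, 68, 75, 85, 84, 78]
lower_limits = [60, 70, 80, 90]   # each class covers lower .. lower + 9

rows = []
running_total = 0
for lower in lower_limits:
    upper = lower + 9
    # Frequency: how many grades fall inside this class.
    frequency = sum(1 for g in grades if lower <= g <= upper)
    # Cumulative frequency: running total through this class.
    running_total += frequency
    # Relative frequency: share of all grades in this class.
    relative = frequency / len(grades)
    rows.append((f"{lower} - {upper}", frequency, running_total, relative))

for label, frequency, cumulative, relative in rows:
    print(f"{label}  {frequency}  {cumulative}  {relative}")
```

The last cumulative frequency must equal the number of grades (10), and the relative frequencies must sum to 1; both are quick checks on a hand-built table.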
3. The hourly wages of a sample of eight individuals is given below.
Individual   Hourly Wage (dollars)
A            27
B            25
C            20
D            10
E            12
F            14
G            17
H            19
For the above sample, determine the following measures:
a.
The mean.
b.
The standard deviation.
c.
The 25th percentile.
ANSWERS:
a. To get the mean we use the formula: x̄ = ∑xi / n = (27 + 25 + … + 17 + 19)/8 = 144/8 = 18
b. To get the standard deviation we use the following formula:
s = √s², where s² = ∑(xi - x̄)² / (n - 1)
So we subtract the mean from each value, square each difference, add the squares, divide the sum by (n - 1), and
then take the square root, as follows:
s² = [(27 - 18)² + (25 - 18)² + … + (17 - 18)² + (19 - 18)²] / (8 - 1) = 252/7 = 36
Since s² = 36, s = √36 = 6
c. To find Q1 we first order the data and then find the position L1 = 0.25 × n = 0.25 × 8 = 2.
- Ordered data: 10, 12, 14, 17, 19, 20, 25, 27
- Since L1 is a whole number, Q1 is halfway between the 2nd and 3rd values. In this case those are 12 and 14, so
Q1 = 13.
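The percentile rule used in part c (average two neighbours when the position is a whole number, otherwise round the position up) differs from the default in many software packages, so it is worth writing out. The sketch below is one way to code that rule; `textbook_percentile` is a hypothetical helper name, and it assumes 0 < p < 100.

```python
import math
import statistics


def textbook_percentile(values, p):
    """p-th percentile using the position rule i = (p / 100) * n."""
    ordered = sorted(values)
    i = p * len(ordered) / 100
    if i == int(i):
        # Whole-number position: average the i-th and (i + 1)-th values.
        k = int(i)
        return (ordered[k - 1] + ordered[k]) / 2
    # Otherwise round the position up and take that value.
    return ordered[math.ceil(i) - 1]


wages = [27, 25, 20, 10, 12, 14, 17, 19]

mean_wage = statistics.mean(wages)
sd_wage = statistics.stdev(wages)          # sample standard deviation (n - 1)
q1_wage = textbook_percentile(wages, 25)

print(mean_wage, sd_wage, q1_wage)
```

Library routines such as `statistics.quantiles` interpolate differently and may return a slightly different first quartile for the same data, so results should be compared against the rule the course actually uses.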
4. Exhibit 6-4: The starting salaries of individuals with an MBA degree are normally distributed
with a mean of $40,000 and a standard deviation of $5,000.
a. Refer to Exhibit 6-4. What is the random variable in this experiment?
b. Refer to Exhibit 6-4. What is the probability that a randomly selected individual with an MBA degree will get
a starting salary of at least $30,000?
c. Refer to Exhibit 6-4. What is the probability that a randomly selected individual with an MBA degree will get
a starting salary of at least $47,500?
ANSWER:
a. The random variable is the starting salary of an individual with an MBA degree. It is a variable because it
takes on a range of values and we are collecting data on it.
b. We want P(X > 30,000). To find this probability we first convert to a z-score and then use the standard
normal probability table to find the value.
Z = (x - μ) / σ = (30,000 - 40,000) / 5,000 = -2
So we want P(z > -2) = 1 - P(z < -2) = 1 - 0.0228 = 0.9772
Graphically, this probability is all the area under the standard normal curve to the right of z = -2.
c. We use the same idea, but now we want P(X > 47,500). Once again we convert our x to a z-score to find the
probability:
Z = (x - μ) / σ = (47,500 - 40,000) / 5,000 = 1.5
So we want P(z > 1.5) = 1 - P(z < 1.5) = 1 - 0.9332 = 0.0668
Graphically, this probability is all the area to the right of z = 1.5, which is just the upper tail region.
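Both parts of Exhibit 6-4 can be verified without a table. The following sketch uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

# Exhibit 6-4: starting salaries ~ Normal(mean 40,000, sd 5,000).
salary = NormalDist(mu=40_000, sigma=5_000)

# Convert each cutoff to a z-score: z = (x - mean) / sd.
z_b = (30_000 - salary.mean) / salary.stdev
z_c = (47_500 - salary.mean) / salary.stdev

# "At least x" is the area to the right of x: 1 - P(X < x).
p_b = 1 - salary.cdf(30_000)
p_c = 1 - salary.cdf(47_500)

print(z_b, round(p_b, 4))
print(z_c, round(p_c, 4))
```

Working on the salary scale and working on the z scale give the same probabilities, since the z-score conversion only relabels the horizontal axis.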
5. A sample of 9 mothers was taken. The mothers were asked the age of their oldest child. You
are given their responses below.
3   12   4   7   14   6   2   9   11
a. Compute the mean.
b. Compute the variance.
c. Compute the standard deviation.
d. Compute the coefficient of variation.
e. Determine the 25th percentile.
f. Determine the median
g. Determine the 75th percentile.
h. Determine the range.
ANSWERS:
Since several of these measures require the values to be in order, the first step is to order the data from
lowest to highest.
So we have: 2, 3, 4, 6, 7, 9, 11, 12, 14
a. To get the mean we use the formula: x̄ = ∑xi / n = (2 + 3 + … + 12 + 14)/9 = 68/9 = 7.56
b. To get the variance we use the following formula: s² = ∑(xi - x̄)² / (n - 1)
So we subtract the mean from each value, square each difference, and divide the sum of the squares by (n - 1):
s² = [(2 - 7.56)² + (3 - 7.56)² + … + (12 - 7.56)² + (14 - 7.56)²] / (9 - 1) = 17.78
c. Standard deviation = s = √s² = √17.78 = 4.22
d. Coefficient of variation = CV = (s / x̄) × 100 = (4.22 / 7.56) × 100 = 0.558 × 100 = 55.8
e. For Q1 the position is 0.25 × n = 0.25 × 9 = 2.25. Recall that we round up, so Q1 is the 3rd term, or 4
f. The median or Q2 is exactly the 5th term or 7
g. For Q3 the position is 0.75 × n = 6.75. We round up to 7 and note the 7th term is 11
h. The range = (largest value - smallest value) = (14 - 2) = 12
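The coefficient of variation and the range are the two measures in this problem that have not appeared earlier, so here is a short sketch that computes them alongside the mean and variance. It uses Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

# Ages of the oldest child reported by the 9 mothers.
ages = [3, 12, 4, 7, 14, 6, 2, 9, 11]

mean_age = statistics.mean(ages)
sample_variance = statistics.variance(ages)   # divides by (n - 1)
sample_sd = statistics.stdev(ages)

# Coefficient of variation: standard deviation as a percentage of the mean.
cv_percent = sample_sd / mean_age * 100

# Range: largest value minus smallest value.
data_range = max(ages) - min(ages)

print(round(mean_age, 2), round(sample_variance, 2), round(sample_sd, 2))
print(round(cv_percent, 1), data_range)
```

Because the CV is a ratio, it has no units, which is what makes it suitable for comparing the dispersion of data sets with different means.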
6. The following observations are given for two variables.
y     x
5     2
8     12
18    3
20    6
22    11
30    19
10    18
7     9
a.
Compute and interpret the sample covariance for the above data.
b.
Compute and interpret the sample correlation coefficient.
ANSWERS:
Recall that to get the covariance and the correlation coefficient we need the means of both X and Y as well as
the standard deviations of X and Y. I am not going to run through the calculations (as they are the same method
as above), but make sure you can do them. The respective means of X and Y are:
x̄ = 10
ȳ = 15
The respective standard deviations of X and Y are
sx = 6.32
sy = 8.83
a. So now we just apply the covariance formula, which is:
Covariance = sxy = [1/(n - 1)] ∑(xi - x̄)(yi - ȳ) = (1/7) × [(2 - 10)(5 - 15) + (12 - 10)(8 - 15) +
… + (18 - 10)(10 - 15) + (9 - 10)(7 - 15)] = 135/7 = 19.286 (rounded). Since the covariance is positive, it
indicates a positive relationship between x and y.
b. Now if we want to get the correlation coefficient we recall the formula is:
rxy = sxy / (sx × sy)
Now plugging all numbers into the correlation coefficient formula we get:
rxy = 19.286 / [(6.32)(8.83)] = 0.345. There is a positive relationship between x and y. The
relationship is not very strong.
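The calculations that the answer skips can be reproduced in a few lines. This is a minimal sketch that codes the sample covariance and correlation formulas directly; the variable names are illustrative, and the (x, y) pairs are read row by row from the data table.

```python
import math

x = [2, 12, 3, 6, 11, 19, 18, 9]
y = [5, 8, 18, 20, 22, 30, 10, 7]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample covariance: sum of cross-products of deviations over (n - 1).
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations, also with the (n - 1) divisor.
s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))

# Sample correlation coefficient.
r_xy = s_xy / (s_x * s_y)

print(x_bar, y_bar, round(s_x, 2), round(s_y, 2))
print(round(s_xy, 3), round(r_xy, 3))
```

The covariance depends on the units of x and y, while the correlation coefficient is unit-free and always lies between -1 and +1, which is why the strength of the relationship is judged from r rather than from the covariance.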
Descriptive Statistics Practice Exercises
Work these exercises without using a computer. Do use your calculator. At the end of the
document you will find the answers. If you need more practice, please work the exercises at the end
of the chapters in Howell.
Exercise 1
Students in my undergraduate statistics class, Summer, 2010, were asked to rate how fearful
they were of the course (statophobia), using a scale from 0 (absolutely no fear) to 10 (extreme
sympathetic arousal and crippling emotions). Here are the data for the male students:
Statoph (a)   Frequency
5             1
7             2
10            1
Total         4
a. Gender = Male
For these 4 scores, compute the mean, median, mode, range, sample variance, and sample
standard deviation. Compare the mean to the median and then comment on the shape of the
distribution.
Y     (Y-M)    (Y-M)²
5     -2.25    5.0625
7     -0.25    .0625
7     -0.25    .0625
10    2.75     7.5625
Answer
Exercise 1
Sum = 5+7+7+10=29
Mean = 29/4 = 7.25
Sum of squared deviations from the mean = 5.0625+.0625+.0625+7.5625 = 12.75
Sample variance = 12.75/3 = 4.25
Sample standard deviation = √4.25 = 2.062
The median location is (N+1)/2 = 5/2 = 2.5. The 2.5th score from either tail falls between one 7
and the other 7. The mean of 7 and 7 is 7. The median is 7.
The mean is a bit higher than the median, indicating a bit of positive skewness. If you used
SAS or SPSS to compute the g1 estimate of skewness, it would be +0.713.
Exercise 2
Here are the data for the female students in that same class:
Statoph (a)   Frequency
5             3
6             4
7             2
8             3
9             2
Total         14
a. Gender = Female
For these 14 scores, compute the mean, median, mode, range, sample variance, and sample
standard deviation.
Y     (Y-M)     (Y-M)²
5     -1.786    3.190
5     -1.786    3.190
5     -1.786    3.190
6     -0.786    .618
6     -0.786    .618
6     -0.786    .618
6     -0.786    .618
7     0.21      .044
7     0.21      .044
8     1.214     1.474
8     1.214     1.474
8     1.214     1.474
9     2.214     4.902
9     2.214     4.902
Answer
Exercise 2
Sum = 3(5)+4(6)+2(7)+3(8)+2(9) = 95
Mean = 95/14 = 6.786
Sum of squared deviations of scores from their mean = 3(3.19) + 4(.618) + 2(.044) + 3(1.474)
+ 2(4.902) = 26.356.
Sample variance = 26.356/13 = 2.027
Sample standard deviation = √2.027 = 1.424
The median location is 15/2 = 7.5. The 7.5th score from either tail falls between a 6 and a 7.
The mean of 6 and 7 is 6.5. The median is 6.5.
The mean is slightly greater than the median, indicating a little bit of positive skewness. If
you used SAS or SPSS to compute the g1 estimate of skewness, it would be +0.25.
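Exercise 2 starts from a frequency table rather than raw scores, so the first step in code is to expand the table back into the 14 individual scores. The sketch below does that and then computes the requested measures, including the mode and range; the variable names are illustrative.

```python
import statistics

# Frequency table for the female students: score -> frequency.
frequency_table = {5: 3, 6: 4, 7: 2, 8: 3, 9: 2}

# Expand the table into one entry per student.
scores = [score for score, freq in frequency_table.items() for _ in range(freq)]

n = len(scores)
total = sum(scores)
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)       # average of the 7th and 8th scores
modes = statistics.multimode(scores)           # most frequent score(s)
score_range = max(scores) - min(scores)
sample_variance = statistics.variance(scores)  # divides by (n - 1)
sample_sd = statistics.stdev(scores)

print(n, total, round(mean_score, 3), median_score, modes, score_range)
print(round(sample_variance, 3), round(sample_sd, 3))
```

The mode (6) and the range (9 - 5 = 4) are read directly from the frequency table; the code simply confirms them.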
Exercise 3
Imagine that Sue Cash is a female student in your statistics class and she gets a score of 9 on
the measure of statophobia. Using the sample mean and standard deviation for the female students
in that summer class, convert her score of 9 to a z score.
Suppose that we wanted to convert the statophobia scores to a standard score system with a
mean of 100 and a standard deviation of 15 (like IQ scores). What would Sue’s score be?
Suppose that Sue is not a woman but rather a man – A Boy Named Sue. Recalculate his z
score and IQ-like score using the sample mean and standard deviation for the male students in that
summer class.
Exercise 3
z = (9 - 6.786) / 1.424 = +1.555
IQ-like Standard Score = 100 + (1.555)(15) = 123.325
The boy named Sue:
z = (9 - 7.25) / 2.062 = +0.849
IQ-like Standard Score = 100 + (.849)(15) = 112.735
The means and standard deviations we have used here are what psychologists call
“normative statistics.” That is, they estimate the characteristics of a particular population of scores.
When computing a standard score, it is important that one use the appropriate norms. Notice that
Sue’s standard score is greatly affected by whether we use the norms for female students or the
norms for male students.
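The two-step conversion in Exercise 3 (raw score to z, then z to a scale with a chosen mean and standard deviation) can be sketched as two small functions. The function names are hypothetical; the means and standard deviations are the rounded values from Exercises 1 and 2.

```python
def z_score(score, mean, sd):
    """Number of standard deviations the score lies above the mean."""
    return (score - mean) / sd


def standard_score(z, new_mean=100, new_sd=15):
    """Re-express a z score on a scale with the given mean and sd."""
    return new_mean + z * new_sd


# Sue's raw score of 9 against the female and male norms.
z_female = z_score(9, mean=6.786, sd=1.424)
z_male = z_score(9, mean=7.25, sd=2.062)

# The worked answer rounds z to three decimals before rescaling.
iq_like_female = standard_score(round(z_female, 3))
iq_like_male = standard_score(round(z_male, 3))

print(round(z_female, 3), round(iq_like_female, 3))
print(round(z_male, 3), round(iq_like_male, 3))
```

Rounding z before rescaling is done here only to match the worked answer; rescaling the unrounded z would change the result in the second decimal place.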
And Now For A Little Fun
To find the answer to each of the below, do the indicated calculation on an eight-digit floating
decimal point calculator and then invert the calculator to read the answer from the upside-down
display.
1. An evil German, Nazi minister of propaganda, 1933-1945.
11(2284² + 463)
2. German phrase often said by person in item # 1:
15[168³ + 1153(13)] / 10,000
3. Magazines printed on glossy paper (British)
53.000001(1,002,926)
4. What roosters always are, hens never are:
2353² + √1,915,456
5. What does the Eskimo's fiancée do after accepting his proposal of marriage
First two words: (7334² + 2789) / [1.25(800)]
Second two words: .0912(.867) + .000081
6. Mr. Potatohead's hometown:
4(95² - 248)
Answers Calculator Fun
1. 57388309 → GOEBBELS
2. 71349315 → SIEG hEIL
3. 53155079 → GLOSSIES
4. 5537993 → EGGLESS
5. 53790.345 → ShE OGLES
   0.0791514 → hIS IGLOO
6. 35108 → BOISE
CHAPTER THREE
DESCRIPTIVE STATISTICS: NUMERICAL METHODS
In the following multiple choice questions, circle the correct answer.
1.
A numerical value used as a summary measure for a sample, such as sample mean, is known as a
a. population parameter
b. sample parameter
c. sample statistic
d. population mean
2.
μ is an example of a
a. population parameter
b. sample statistic
c. population variance
d. mode
3.
The hourly wages of a sample of 130 system analysts are given below.
mean = 60
range = 20
mode = 73
variance = 324
median = 74
The coefficient of variation equals
a. 0.30%
b. 30%
c. 5.4%
d. 54%
4.
The median of a sample will always equal the
a. mode
b. mean
c. 50th percentile
d. all of the above answers are correct
Exhibit 3-1
The following data show the number of hours worked by 200 statistics students.
Number of Hours   Frequency
0 - 9             40
10 - 19           50
20 - 29           70
30 - 39           40
5.
Refer to Exhibit 3-1. The number of students working 19 hours or less
a. is 40
b. is 50
c. is 90
d. cannot be determined without the original data
6.
Refer to Exhibit 3-1. The cumulative relative frequency for the class of 10 - 19
a. is 90
b. is .25
c. is .45
d. cannot be determined from the information given
7.
The 75th percentile is referred to as the
a. first quartile
b. second quartile
c. third quartile
d. fourth quartile
8.
The difference between the largest and the smallest data values is the
a. variance
b. interquartile range
c. range
d. coefficient of variation
9.
In computing the hinges for data with an odd number of items, the median position is included
a. only in the computation of the lower hinge
b. only in the computation of the upper hinge
c. both in the computation of the lower and upper hinges
d. None of these alternatives is correct.
10.
If a data set has an even number of observations, the median
a. cannot be determined
b. is the average value of the two middle items
c. must be equal to the mean
d. is the average value of the two middle items when all items are arranged in ascending order
11.
The value which has half of the observations above it and half the observations below it is called the
a. range
b. median
c. mean
d. mode
12.
The interquartile range is
a. the 50th percentile
b. another name for the variance
c. the difference between the largest and smallest values
d. the difference between the third quartile and the first quartile
13.
The heights (in inches) of 25 individuals were recorded and the following statistics were calculated
mean = 70
range = 20
mode = 73
variance = 784
median = 74
The coefficient of variation equals
a. 11.2%
b. 1120%
c. 0.4%
d. 40%
14.
The variance of a sample of 81 observations equals 64. The standard deviation of the sample equals
a. 9
b. 4096
c. 8
d. 6561
Exhibit 3-2
A researcher has collected the following sample data
5   12   6   8   5   6   7   5   12   4
15.
Refer to Exhibit 3-2. The mode is
a. 5
b. 6
c. 7
d. 8
16.
Refer to Exhibit 3-2. The 75th percentile is
a. 5
b. 6
c. 7
d. 8
Exhibit 3-3
A researcher has collected the following sample data. The mean of the sample is 5.
3   5   12   3   2
17.
Refer to Exhibit 3-3. The standard deviation is
a. 8.944
b. 4.062
c. 13.2
d. 16.5
18.
Refer to Exhibit 3-3. The range is
a. 1
b. 2
c. 10
d. 12
Exhibit 3-4
The following is the frequency distribution for the speeds of a sample of automobiles traveling on an interstate
highway.
Speed (Miles per Hour)   Frequency
50 - 54                  2
55 - 59                  4
60 - 64                  5
65 - 69                  10
70 - 74                  9
75 - 79                  5
Total                    35
19.
Refer to Exhibit 3-4. The mean is
a. 35
b. 670
c. 10
d. 67
20.
Refer to Exhibit 3-4. The standard deviation is
a. 6.969
b. 7.071
c. 48.570
d. 50.000
21.
The interquartile range is used as a measure of variability to overcome what difficulty of the range?
a. the sum of the range variances is zero
b. the range is difficult to compute
c. the range is influenced too much by extreme values
d. the range is negative
22.
In computing descriptive statistics from grouped data,
a. data values are treated as if they occur at the midpoint of a class
b. the grouped data result is more accurate than the ungrouped result
c. the grouped data computations are used only when a population is being analyzed
d. None of these alternatives is correct.
23.
When should measures of location and dispersion be computed from grouped data rather than from
individual data values?
a. as much as possible since computations are easier
b. only when individual data values are unavailable
c. whenever computer packages for descriptive statistics are unavailable
d. only when the data are from a population
24.
The measure of location which is the most likely to be influenced by extreme values in the data set is
the
a. range
b. median
c. mode
d. mean
25.
The numerical value of the standard deviation can never be
a. larger than the variance
b. zero
c. negative
d. smaller than the variance
26.
The coefficient of variation is
a. the same as the variance
b. the standard deviation divided by the mean times 100
c. the square of the standard deviation
d. the mean divided by the standard deviation
27.
If two groups of numbers have the same mean, then
a. their standard deviations must also be equal
b. their medians must also be equal
c. their modes must also be equal
d. None of these alternatives is correct
28.
Which of the following symbols represents the standard deviation of the population?
a. σ²
b. σ
c. μ
d. x̄
29.
Which of the following symbols represents the variance of the population?
a. σ²
b. σ
c. μ
d. x̄
30.
Which of the following symbols represents the mean of the sample?
a. σ²
b. σ
c. μ
d. x̄
31.
The symbol σ is used to represent
a. the variance of the population
b. the standard deviation of the sample
c. the standard deviation of the population
d. the variance of the sample
32.
The mean of the sample
a. is always smaller than the mean of the population from which the sample was taken
b. can never be zero
c. can never be negative
d. None of these alternatives is correct.
33.
The measure of dispersion which is not measured in the same units as the original data is the
a. median
b. standard deviation
c. coefficient of determination
d. variance
34.
Positive values of covariance indicate
a. a positive variance of the x values
b. a positive variance of the y values
c. the standard deviation is positive
d. positive relation between the independent and the dependent variables
35.
The coefficient of correlation ranges between
a. 0 and 1
b. -1 and +1
c. minus infinity and plus infinity
d. 1 and 100
36.
The value of the sum of the deviations from the mean, i.e., ∑(x - x̄) must always be
a. less than zero
b. negative
c. either positive or negative depending on whether the mean is negative or positive
d. zero
37.
Since the median is the middle value of a data set it
a. must always be smaller than the mode
b. must always be larger than the mode
c. must always be smaller than the mean
d. None of these alternatives is correct.
38.
In a five number summary, which of the following is not used for data summarization?
a. the smallest value
b. the largest value
c. the median
d. the 25th percentile
e. the mean
39.
The relative frequency of a class is computed by
a. dividing the midpoint of the class by the sample size
b. dividing the frequency of the class by the midpoint
c. dividing the sample size by the frequency of the class
d. dividing the frequency of the class by the sample size
40.
Which of the following is not a measure of dispersion?
a. mode
b. standard deviation
c. range
d. interquartile range
41.
Given the following information:
Standard deviation = 8
Coefficient of variation = 64%
The mean would then be
a. 12.5
b. 8
c. 0.64
d. 1.25
PROBLEMS
1.
The hourly wages of a sample of eight individuals are given below.
Individual   Hourly Wage (dollars)
A            27
B            25
C            20
D            10
E            12
F            14
G            17
H            19
For the above sample, determine the following measures:
a. The mean.
b. The standard deviation.
c. The 25th percentile.
2.
Consider the data in the following frequency distribution. Assume the data represent a population.
Class      Frequency
2 - 6      2
7 - 11     3
12 - 16    4
17 - 21    1
For the above data, compute the following.
a. The mean
b. The variance
c. The standard deviation
3.
In 1998, the average donation to the Help Way was $225 with a standard deviation of $45. In 1999, the
average donation was $400 with a standard deviation of $60. The donations in which year show a more
dispersed distribution?
4.
The following data show the yearly salaries of football coaches at some state supported universities.
University   Salary (in $1,000)
A            53
B            44
C            68
D            47
E            62
F            59
G            53
H            94
For the above sample, determine the following measures.
a. The mean yearly salary
b. The standard deviation
c. The mode
d. The median
e. The 70th percentile
5.
The grade point average of the students at UTC is 2.80 with a standard deviation of 0.84. The grade
point average of students at UTK is 2.4 with a standard deviation of 0.84. Which university shows a
more dispersed grade distribution?
6.
A local university administers a comprehensive examination to the recipients of a B.S. degree in
Business Administration. A sample of examinations is selected at random and scored. The results are
shown below.
Grade
93   65   80   97   85   87   97   60
For the above data, determine
a. The mean
b. The median
c. The mode
d. The standard deviation
e. The coefficient of variation
7.
The frequency distribution below shows the monthly expenditure on gasoline for a sample of 14
individuals.
Expenditure   Frequency
55 - 59       2
60 - 64       3
65 - 69       4
70 - 74       3
75 - 79       2
a. Compute the mean.
b. Compute the standard deviation.
8.
A researcher has obtained the number of hours worked per week during the summer for a sample of
fifteen students.
40 25 35 30 20 40 30 20 40 10 30 20 10 5 20
Using this data set, compute the
a. median
b. mean
c. mode
d. 40th percentile
e. range
f. sample variance
g. standard deviation
9.
The following is a frequency distribution of grades for a statistics examination.
Examination Grade   Frequency
40 - 49             3
50 - 59             5
60 - 69             11
70 - 79             22
80 - 89             15
90 - 99             6
Treating these data as a sample, compute the following:
a. The mean
b. The standard deviation
c. The variance
d. The coefficient of variation
10.
For the following frequency distribution,
Class      Frequency
45 - 47    3
48 - 50    6
51 - 53    8
54 - 56    2
57 - 59    1
a. Compute the mean.
b. Compute the standard deviation. (Assume the data represent a population.)
11.
A sample of 9 mothers was taken. The mothers were asked the age of their oldest child. You are given
their responses below.
3   12   4   7   14   6   2   9   11
a. Compute the mean.
b. Compute the variance.
c. Compute the standard deviation.
d. Compute the coefficient of variation.
e. Determine the 25th percentile.
f. Determine the median
g. Determine the 75th percentile.
h. Determine the range.
12.
The starting salaries of a sample of college students are given below.
Starting Salary (In Thousands)   Frequency
10 - 14                          2
15 - 19                          3
20 - 24                          5
25 - 29                          7
30 - 34                          2
35 - 39                          1
a. Compute the mean.
b. Compute the variance.
c. Compute the standard deviation.
d. Compute the coefficient of variation.
13.
A sample of charge accounts at a local drug store revealed the following frequency distribution of
unpaid balances.
Unpaid Balance   Frequency
10 - 29          5
30 - 49          10
50 - 69          6
70 - 89          9
90 - 109         20
a. Determine the mean unpaid balance.
b. Determine the standard deviation.
c. Compute the coefficient of variation.
14.
In 2000, the average donation to the Community Kitchen was $900 with a standard deviation of $180.
In 2001, the average donation was $1,600 with a standard deviation of $240. In which year do the
donations show a more dispersed distribution?
15.
The following observations are given for two variables.
y     x
5     2
8     12
18    3
20    6
22    11
30    19
10    18
7     9
a. Compute and interpret the sample covariance for the above data.
b. Compute and interpret the sample correlation coefficient.
16.
Compute the weighted mean for the following data.
xi    Weight (wi)
19    12
17    30
14    28
13    10
18    10
17.
The following data show the yearly salaries of a random sample of Chattanooga residents.
Resident   Salary (In $1,000)
A          97
B          48
C          69
D          85
E          92
F          48
G          79
H          74
For the above sample, determine the following measures (Give your answer in dollars):
a. The mean yearly salary.
b. The standard deviation.
c. The mode.
d. The median.
e. The 70th percentile
CHAPTER THREE
DESCRIPTIVE STATISTICS II: NUMERICAL METHODS
MULTIPLE CHOICE QUESTIONS
In the following multiple choice questions, circle the correct answer.
1.
A numerical value used as a summary measure for a sample, such as sample mean, is known as a
a. population parameter
b. sample parameter
c. sample statistic
d. population mean
e. None of the above answers is correct.
2.
Since the population size is always larger than the sample size, then the sample statistic
a. can never be larger than the population parameter
b. can never be equal to the population parameter
c. can never be zero
d. can never be smaller than the population parameter
e. None of the above answers is correct.
3.
μ is an example of a
a. population parameter
b. sample statistic
c. population variance
d. mode
e. None of the above answers is correct.
4.
The mean of a sample
a. is always equal to the mean of the population
b. is always smaller than the mean of the population
c. is computed by summing the data values and dividing the sum by (n - 1)
d. is computed by summing all the data values and dividing the sum by the number of items
e. None of the above answers is correct.
5.
When the smallest and largest percentage of items are removed from a data set and the mean is
computed, the mean of the remaining data is
a. the median
b. the mode
c. the trimmed mean
d. any of the above
e. None of the above answers is correct.
6.
In a five number summary, which of the following is not used for data summarization?
a. the smallest value
b. the largest value
c. the median
d. the 25th percentile
e. the mean
7.
Since the mode is the most frequently occurring data value, it
a. can never be larger than the mean
b. is always larger than the median
c. is always larger than the mean
d. must have a value of at least two
e. None of the above answers is correct.
Exhibit 3-1
The following data show the number of hours worked by 200 statistics students.
Number of Hours    Frequency
0 - 9              40
10 - 19            50
20 - 29            70
30 - 39            40
8.
Refer to Exhibit 3-1. The class width for this distribution
a. is 9
b. is 10
c. is 11
d. varies from class to class
e. None of the above answers is correct.
9.
Refer to Exhibit 3-1. The number of students working 19 hours or less
a. is 40
b. is 50
c. is 90
d. cannot be determined without the original data
e. None of the above answers is correct.
10.
Refer to Exhibit 3-1. The relative frequency of students working 9 hours or less
a. is .2
b. is .45
c. is 40
d. cannot be determined from the information given
e. None of the above answers is correct.
11.
Refer to Exhibit 3-1. The cumulative relative frequency for the class of 10 - 19
a. is 90
b. is .25
c. is .45
d. cannot be determined from the information given
e. None of the above answers is correct.
12.
The 50th percentile is the
a. mode
b. median
c. mean
d. third quartile
e. None of the above answers is correct.
13.
The 75th percentile is referred to as the
a. first quartile
b. second quartile
c. third quartile
d. fourth quartile
e. None of the above answers is correct.
14.
The lower hinge is essentially the same as the
a. 10th percentile
b. third quartile
c. second quartile
d. 25th percentile
e. None of the above answers is correct.
15.
The difference between the largest and the smallest data values is the
a. variance
b. interquartile range
c. range
d. coefficient of variation
e. None of the above answers is correct.
16.
The first quartile
a. contains at least one third of the data elements
b. is the same as the 25th percentile
c. is the same as the 50th percentile
d. is the same as the 75th percentile
e. None of the above answers is correct.
17.
In computing the hinges for data with an odd number of items, the median position is included
a. only in the computation of the lower hinge
b. only in the computation of the upper hinge
c. both in the computation of the lower and upper hinges
d. None of the above answers is correct.
18.
Which of the following is not a measure of central location?
a. mean
b. median
c. variance
d. mode
e. None of the above answers is correct.
19.
If a data set has an even number of observations, the median
a. cannot be determined
b. is the average value of the two middle items
c. must be equal to the mean
d. is the average value of the two middle items when all items are arranged in ascending order
e. None of the above answers is correct.
20.
Which of the following is a measure of dispersion?
a. percentiles
b. quartiles
c. interquartile range
d. all of the above are measures of dispersion
e. None of the above answers is correct.
21.
The value which has half of the observations above it and half the observations below it is called the
a. range
b. median
c. mean
d. mode
e. None of the above answers is correct.
22.
The most frequently occurring value of a data set is called the
a. range
b. mode
c. mean
d. median
e. None of the above answers is correct.
23.
The interquartile range is
a. the 50th percentile
b. another name for the variance
c. the difference between the largest and smallest values
d. the difference between the third quartile and the first quartile
e. None of the above answers is correct.
24.
The weights (in pounds) of a sample of 36 individuals were recorded and the following statistics were
calculated.
mean = 160
range = 60
mode = 165
variance = 324
median = 170
The coefficient of variation equals
a. 0.1125%
b. 11.25%
c. 203.12%
d. 0.20312%
e. None of the above answers is correct.
25.
The median of a sample will always equal the
a. mode
b. mean
c. 50th percentile
d. all of the above answers are correct
e. None of the above answers is correct.
26.
The standard deviation of a sample of 100 observations equals 64. The variance of the sample equals
a. 8
b. 10
c. 6400
d. 4,096
e. None of the above answers is correct.
27.
The variance of a sample of 81 observations equals 64. The standard deviation of the sample equals
a. 9
b. 4096
c. 8
d. 6561
e. None of the above answers is correct.
Exhibit 3-2
A researcher has collected the following sample data
5  12  6  8  5  6  7  5  12  4
28.
Refer to Exhibit 3-2. The median is
a. 5
b. 6
c. 7
d. 8
e. None of the above answers is correct.
29.
Refer to Exhibit 3-2. The mode is
a. 5
b. 6
c. 7
d. 8
e. None of the above answers is correct.
30.
Refer to Exhibit 3-2. The mean is
a. 5
b. 6
c. 7
d. 8
e. None of the above answers is correct.
31.
Refer to Exhibit 3-2. The 75th percentile is
a. 5
b. 6
c. 7
d. 8
e. None of the above answers is correct.
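The Exhibit 3-2 measures can be verified with Python's standard library. This is a sketch rather than an official answer key: the 75th percentile uses the textbook index rule i = (p / 100) n with rounding up when i is not an integer, which is an assumption about the convention intended here.

```python
import math
import statistics

# Sample data from Exhibit 3-2.
data = [5, 12, 6, 8, 5, 6, 7, 5, 12, 4]
ordered = sorted(data)
n = len(ordered)

mean = statistics.mean(data)
median = statistics.median(data)  # average of the two middle values when n is even
mode = statistics.mode(data)      # most frequently occurring value

# Index rule for the 75th percentile: i = 75 * n / 100 = 7.5 is not an
# integer, so round up and take the value in that 1-based position.
position = math.ceil(75 * n / 100)
p75 = ordered[position - 1]

print(f"ordered data = {ordered}")
print(f"mean = {mean}, median = {median}, mode = {mode}, 75th percentile = {p75}")
```

For these ten values the script gives a mean of 7, a median of 6, a mode of 5, and a 75th percentile of 8.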
Exhibit 3-3
A researcher has collected the following sample data. The mean of the sample is 5.
3  5  12  3  2
32.
Refer to Exhibit 3-3. The variance is
a. 80
b. 4.062
c. 13.2
d. 16.5
e. None of the above answers is correct.
33.
Refer to Exhibit 3-3. The standard deviation is
a. 8.944
b. 4.062
c. 13.2
d. 16.5
e. None of the above answers is correct.
34.
Refer to Exhibit 3-3. The coefficient of variation is
a. 72.66%
b. 81.24%
c. 264%
d. 330%
e. None of the above answers is correct.
35.
Refer to Exhibit 3-3. The range is
a. 1
b. 2
c. 10
d. 12
e. None of the above answers is correct.
36.
Refer to Exhibit 3-3. The interquartile range is
a. 1
b. 2
c. 10
d. 12
e. None of the above answers is correct.
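The dispersion measures asked about for Exhibit 3-3 can be sketched as below, assuming the five values are a sample (n - 1 in the variance) and that quartiles follow the textbook index rule i = (p / 100) n. The helper name `percentile` is illustrative, and other quartile conventions can give a different interquartile range.

```python
import math
import statistics

# Sample data from Exhibit 3-3.
data = [3, 5, 12, 3, 2]


def percentile(values, p):
    """p-th percentile (integer p from 1 to 99) by the index rule i = p * n / 100."""
    ordered = sorted(values)
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder:
        # i is not an integer: round up and take that ordered value.
        return ordered[whole]
    # i is an integer: average the i-th and (i + 1)-th ordered values.
    return (ordered[whole - 1] + ordered[whole]) / 2


mean = statistics.mean(data)
variance = statistics.variance(data)  # sample variance, n - 1 denominator
std_dev = math.sqrt(variance)
cv_percent = std_dev / mean * 100
data_range = max(data) - min(data)
iqr = percentile(data, 75) - percentile(data, 25)

print(f"mean = {mean}")
print(f"variance = {variance}")
print(f"standard deviation = {std_dev:.3f}")
print(f"coefficient of variation = {cv_percent:.2f}%")
print(f"range = {data_range}, interquartile range = {iqr}")
```

Under these assumptions the variance is 16.5, the standard deviation about 4.062, the coefficient of variation about 81.24%, the range 10, and the interquartile range 2.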
Exhibit 3-4
The following is the frequency distribution for the speeds of a sample of automobiles traveling on an interstate
highway.
Speed (Miles per Hour)    Frequency
50 - 54                   2
55 - 59                   4
60 - 64                   5
65 - 69                   10
70 - 74                   9
75 - 79                   5
Total                     35
37.
Refer to Exhibit 3-4. The mean is
a. 35
b. 670
c. 10
d. 67
e. None of the above answers is correct.
38.
Refer to Exhibit 3-4. The variance is
a. 6.969
b. 7.071
c. 48.570
d. 50.000
e. None of the above answers is correct.
39.
Refer to Exhibit 3-4. The standard deviation is
a. 6.969
b. 7.071
c. 48.570
d. 50.000
e. None of the above answers is correct.
40.
Which of the following is not a measure of dispersion?
a. the range
b. the 50th percentile
c. the standard deviation
d. the interquartile range
e. the variance
41.
The interquartile range is used as a measure of variability to overcome what difficulty of the range?
a. the sum of the range variances is zero
b. the range is difficult to compute
c. the range is influenced too much by extreme values
d. the range is negative
e. None of the above answers is correct.
42.
If the variance of a data set is correctly computed with the formula using n - 1 in the denominator,
which of the following is true?
a. the data set is a sample
b. the data set is a population
c. the data set could be either a sample or a population
d. the data set is from a census
e. None of the above answers is correct.
43.
In computing descriptive statistics from grouped data,
a. data values are treated as if they occur at the midpoint of a class
b. the grouped data result is more accurate than the ungrouped result
c. the grouped data computations are used only when a population is being analyzed
d. all of the above answers are correct
e. None of the above answers is correct.
44.
The measure of dispersion that is influenced most by extreme values is
a. the variance
b. the standard deviation
c. the range
d. the interquartile range
e. None of the above answers is correct.
45.
When should measures of location and dispersion be computed from grouped data rather than from
individual data values?
a. as much as possible since computations are easier
b. only when individual data values are unavailable
c. whenever computer packages for descriptive statistics are unavailable
d. only when the data are from a population
e. None of the above answers is correct.
46.
The descriptive measure of dispersion that is based on the concept of a deviation about the mean is
a. the range
b. the interquartile range
c. both a and b
d. the standard deviation
e. None of the above answers is correct.
47.
The measure of location which is the most likely to be influenced by extreme values in the data set is
the
a. range
b. median
c. mode
d. mean
e. None of the above answers is correct.
48.
The most important statistical descriptive measure of the location of a data set is the
a. mean
b. median
c. mode
d. variance
e. None of the above answers is correct.
49.
The numerical value of the standard deviation can never be
a. larger than the variance
b. zero
c. negative
d. all of the above statements are correct
e. None of the above answers is correct.
50.
The sample variance
a. is always smaller than the true value of the population variance
b. is always larger than the true value of the population variance
c. could be smaller, equal to, or larger than the true value of the population variance
d. can never be zero
e. both c and d are correct answers
51.
The coefficient of variation is
a. the same as the variance
b. the square root of the variance
c. the square of the standard deviation
d. the mean divided by the standard deviation
e. None of the above answers is correct.
52.
The variance can never be
a. zero
b. larger than the standard deviation
c. negative
d. all of the above are correct
e. None of the above answers is correct.
53.
If two groups of numbers have the same mean, then
a. their standard deviations must also be equal
b. their medians must also be equal
c. their modes must also be equal
d. their variances must also be equal
e. None of the above answers is correct.
54.
The sum of deviations of the individual data elements from their mean is
a. always greater than zero
b. always less than zero
c. sometimes greater than and sometimes less than zero, depending on the data elements
d. always equal to zero
e. None of the above answers is correct.
55.
Which of the following symbols represents the standard deviation of the population?
a. σ²
b. σ
c. μ
d. x̄
e. N
56.
Which of the following symbols represents the mean of the population?
a. σ²
b. σ
c. μ
d. x̄
e. N
57.
Which of the following symbols represents the variance of the population?
a. σ²
b. σ
c. μ
d. x̄
e. N
58.
Which of the following symbols represents the size of the population?
a. σ²
b. σ
c. μ
d. x̄
e. N
59.
Which of the following symbols represents the mean of the sample?
a. σ²
b. σ
c. μ
d. x̄
e. N
60.
Which of the following symbols represents the size of the sample?
a. σ²
b. σ
c. N
d. x̄
e. n
61.
The symbol σ is used to represent
a. the variance of the population
b. the standard deviation of the sample
c. the standard deviation of the population
d. the variance of the sample
e. None of the above answers is correct.
62.
The symbol σ² is used to represent
a. the variance of the population
b. the standard deviation of the sample
c. the standard deviation of the population
d. the variance of the sample
e. None of the above answers is correct.
63.
The mean of the sample
a. is always larger than the mean of the population from which the sample was taken
b. is always smaller than the mean of the population from which the sample was taken
c. can never be zero
d. can never be negative
e. None of the above answers is correct.
64.
The variance of the sample
a. can never be negative
b. can be negative
c. cannot be zero
d. cannot be less than one
e. None of the above answers is correct.
65.
The measure of dispersion which is not measured in the same units as the original data is the
a. median
b. standard deviation
c. coefficient of determination
d. variance
e. None of the above answers is correct.
66.
A numerical measure of linear association between two variables is the
a. variance
b. covariance
c. standard deviation
d. coefficient of variation
e. None of the above answers is correct.
67.
Positive values of covariance indicate
a. a positive variance of the x values
b. a positive variance of the y values
c. the standard deviation is positive
d. positive relation between the independent and the dependent variables
e. None of the above answers is correct.
68.
A numerical measure of linear association between two variables is the
a. variance
b. coefficient of variation
c. correlation coefficient
d. standard deviation
e. None of the above answers is correct.
69.
The coefficient of correlation ranges between
a. 0 and 1
b. -1 and +1
c. minus infinity and plus infinity
d. 1 and 100
e. None of the above answers is correct.
70.
The coefficient of correlation
a. is the same as the coefficient of determination
b. can be larger than 1
c. cannot be larger than 1
d. cannot be negative
e. None of the above answers is correct.
71.
The value of the sum of the deviations from the mean, i.e., Σ(x - x̄), must always be
a. less than the mean
b. negative
c. either positive or negative depending on whether the mean is negative or positive
d. zero
e. None of the above answers is correct.
72.
The numerical value of the variance
a. is always larger than the numerical value of the standard deviation
b. is always smaller than the numerical value of the standard deviation
c. is negative if the mean is negative
d. can be larger or smaller than the numerical value of the standard deviation
e. None of the above answers is correct.
73.
Since the median is the middle value of a data set it
a. must always be smaller than the mode
b. must always be larger than the mode
c. must always be smaller than the mean
d. must always be larger than the mean
e. None of the above answers is correct.
74.
During a cold winter, the temperature stayed below zero for ten days
(ranging from -20 to -5). The variance of the temperatures of the ten day period
a. is negative since all the numbers are negative
b. must be at least zero
c. cannot be computed since all the numbers are negative
d. can be either negative or positive
e. None of the above answers is correct.
75.
Since the population is always larger than the sample, the value of the sample mean
a. is always smaller than the true value of the population mean
b. is always larger than the true value of the population mean
c. is always equal to the true value of the population mean
d. could be larger, equal to, or smaller than the true value of the population mean
e. None of the above answers is correct.
76.
The relative frequency of a class is computed by
a. dividing the midpoint of the class by the sample size
b. dividing the frequency of the class by the midpoint
c. dividing the sample size by the frequency of the class
d. dividing the frequency of the class by the sample size
e. None of the above answers is correct.
77.
Which of the following is not a measure of dispersion?
a. mode
b. standard deviation
c. range
d. interquartile range
e. None of the above answers is correct.