Business Statistics, 9e (Groebner/Shannon/Fry)
Chapter 13
Goodness‐of‐Fit Tests and Contingency Analysis
1) A goodness‐of‐fit test can be used to determine whether a set of sample data comes from a specific hypothesized
population distribution.
Answer: TRUE
2) If the test statistic for a chi‐square goodness‐of‐fit test is larger than the critical value, the null hypothesis should be rejected.
Answer: TRUE
3) The logic behind the chi‐square goodness‐of‐fit test is based on determining how far the actual observed frequencies are
from the expected frequencies.
Answer: TRUE
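As an illustration of item 3, the chi‐square statistic adds up the squared gaps between observed and expected frequencies, each divided by the expected frequency. The Python sketch below is not part of the test bank; the function name chi_square_statistic and the sample counts are assumptions made up for the example.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 60 observations over 3 equally likely categories.
observed = [18, 22, 20]
expected = [20, 20, 20]
statistic = chi_square_statistic(observed, expected)
print(round(statistic, 2))
```

A statistic near zero means the observed counts sit close to the expected ones; larger values mean a poorer fit.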
4) The goodness‐of‐fit test is always a one‐tail test with the rejection region in the upper tail.
Answer: TRUE
5) When the expected cell frequencies are smaller than 5, the cells should be combined in a meaningful way such that the
expected cell frequencies do exceed 5.
Answer: TRUE
6) The reason that a decision maker might want to combine groups before performing a goodness‐of‐fit test is to avoid
accepting the null hypothesis due to an inflated value of the test statistic.
Answer: FALSE
7) In a goodness‐of‐fit test, when the null hypothesis is true, the expected value for the chi‐square test statistic is zero.
Answer: TRUE
8) The Conrad Real Estate Company recently conducted a statistical test to determine whether the number of days that
homes are on the market prior to selling is normally distributed with a mean equal to 50 days and a standard deviation equal
to 10 days. The sample of 200 homes was divided into 8 groups to form a grouped data frequency distribution. The degrees
of freedom for the test will be 7.
Answer: TRUE
9) The Conrad Real Estate Company recently conducted a statistical test to determine whether the number of days that
homes are on the market prior to selling is normally distributed with a mean equal to 50 days and a standard deviation equal
to 10 days. The sample of 200 homes was divided into 8 groups to form a grouped data frequency distribution. If a chi‐square
goodness‐of‐fit test is to be conducted using an alpha = .05, the critical value is 14.0671.
Answer: TRUE
10) A business with 5 copy machines keeps track of how many copy machines need service on a given day. It believes this is
binomially distributed with a probability of p = 0.2 of each machine needing service on any given day. It has collected the
following based on a random sample of 100 days.
X            0     1     2     3     4 or 5
Frequency    28    38    22    7     5
Given this information, assuming that all expected values are sufficiently large to use the classes as shown above, the critical
value for testing the hypothesis will be based on 5 degrees of freedom.
Answer: FALSE
11) Given this information, the expected number of days on which exactly 1 machine breaks down is 40.96.
Answer: TRUE
12) Given this information, assuming that all expected values are sufficiently large to use the classes as shown above, the
critical value based on a 0.05 level of significance is 9.4877.
Answer: TRUE
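The figures in items 10 through 12 can be checked with a short calculation. The sketch below is an illustration only; the helper binomial_pmf and the variable names are assumptions, and the classes follow the table as printed (0, 1, 2, 3, "4 or 5").

```python
from math import comb

n_machines, p, n_days = 5, 0.2, 100

def binomial_pmf(x, n, prob):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * prob**x * (1 - prob) ** (n - x)

probs = [binomial_pmf(x, n_machines, p) for x in range(n_machines + 1)]

# Collapse x = 4 and x = 5 into the single "4 or 5" class used in the table.
class_probs = probs[:4] + [probs[4] + probs[5]]
expected = [n_days * q for q in class_probs]

# p is specified in the hypothesis, so no parameter is estimated.
degrees_of_freedom = len(expected) - 1

print([round(e, 2) for e in expected])
print(degrees_of_freedom)
```

Note that the "4 or 5" class has an expected count below 1, which is why the items explicitly assume the classes can be used as shown.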
13) It is believed that the number of drivers who are ticketed for speeding on a particular stretch of highway is a Poisson
distribution with a mean of 3.5 per hour. A random sample of 100 hours is selected with the following results:
X            0     1     2     3     4     5     6     7     8     9
Frequency    5     10    20    18    20    15    4     6     1     2
Given this information, and without regard to whether there is a need to combine cells due to expected cell frequencies, the
critical value for testing whether the distribution is Poisson with a mean of 3.5 per hour at an alpha level of .05 is χ² = 15.5073.
Answer: FALSE
Copyright © 2014 Pearson Education, Inc.
14) Given this information, it can be seen that the cells will need to be combined since the actual number of occurrences at
some levels of x is less than 5.
Answer: FALSE
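Items 13 and 14 turn on expected frequencies rather than observed ones. The sketch below computes the Poisson expected counts for the table as printed; it treats x = 0 through 9 as ten separate classes and ignores the small probability above 9, which is a simplifying assumption, and the helper poisson_pmf is a name introduced for the example.

```python
from math import exp, factorial

mean, n_hours = 3.5, 100

def poisson_pmf(x, mu):
    """Probability of exactly x events when the mean count is mu."""
    return exp(-mu) * mu**x / factorial(x)

expected = {x: n_hours * poisson_pmf(x, mean) for x in range(10)}

# Cells whose EXPECTED frequency is below 5 are the candidates for combining.
small_cells = [x for x, e in expected.items() if e < 5]

# The mean is specified in the hypothesis, so df = classes - 1.
df = len(expected) - 1

print({x: round(e, 2) for x, e in expected.items()})
print(small_cells, df)
```

With ten classes and no estimated parameter the test has 9 degrees of freedom, and the decision to combine cells rests on the expected counts, not on observed counts below 5.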
15) If the sample size is large, the standard normal distribution can be used in place of the chi‐square in a goodness‐of‐fit test
for testing whether the population is normally distributed.
Answer: FALSE
16) By combining cells we guard against having an inflated test statistic that could have led us to incorrectly accept the null
hypothesis.
Answer: FALSE
17) If any of the observed frequencies are smaller than 5, then categories should be combined until all observed frequencies
are at least 5.
Answer: FALSE
18) A lube and oil change business believes that the number of cars that arrive for service is the same each day of the week. If
the business is open six days a week (Monday ‐ Saturday) and a random sample of n = 200 customers is selected, the expected
number that will arrive on Monday is about 33.33.
Answer: TRUE
19) The sum of the expected frequencies over the six days cannot be determined without seeing the actual sample data.
Answer: FALSE
20) The critical value for testing the hypothesis using a goodness‐of‐fit test is χ² = 9.2363 if the alpha level for the test is set
at .10.
Answer: TRUE
21) A goodness‐of‐fit test can decide whether a set of data comes from a specific hypothesized distribution.
Answer: TRUE
22) If the calculated chi‐square statistic is large, this is evidence to suggest the fit of the actual data to the hypothesized
distribution is not good, and H0 should be rejected.
Answer: TRUE
23) The goodness‐of‐fit test is essentially determining if the test statistic is significantly larger than zero.
Answer: FALSE
24) By combining cells we guard against having an inflated test statistic that could have led us to incorrectly reject the H0.
Answer: TRUE
25) If one or more parameters are left unspecified in a goodness‐of‐fit test, they must be estimated from the sample data and
one degree of freedom is lost for each parameter that must be estimated.
Answer: TRUE
26) The sampling distribution for a goodness‐of‐fit test is the Poisson distribution.
Answer: FALSE
27) Contingency analysis helps to make decisions when multiple proportions are involved.
Answer: TRUE
28) Contingency analysis is used only for numerical data.
Answer: FALSE
29) Managers use contingency analysis to determine whether two categorical variables are independent of each other.
Answer: TRUE
30) A survey was recently conducted in which males and females were asked whether they owned a laptop personal
computer. The following data were observed:
               Males    Females
Have Laptop    120      70
No Laptop      50       60
Given this information, the sample size in the survey was 300 people.
Answer: TRUE
31) Given this information, if having a laptop is independent of gender, the expected number of males with laptops in this
survey is 150.
Answer: FALSE
32) Given this information, if an alpha level of .05 is used, the critical value for testing whether the two variables are
independent is χ² = 3.8415.
Answer: TRUE
33) Given this information, if an alpha level of .05 is used, the sum of the expected cell frequencies will be equal to the sum of
the observed cell frequencies.
Answer: TRUE
34) Given this information, if an alpha level of .05 is used, the test statistic for determining whether having a laptop is
independent of gender is approximately 14.23.
Answer: FALSE
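The laptop survey in items 30 through 34 can be worked through numerically. The sketch below is a minimal illustration, assuming the usual independence formula (row total × column total ÷ sample size) for expected counts; the function names are introduced here and are not from the text.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(table):
    """Chi-square statistic comparing a table with its expected counts."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Rows: Have Laptop, No Laptop. Columns: Males, Females.
laptop = [[120, 70], [50, 60]]
n = sum(sum(row) for row in laptop)
expected = expected_counts(laptop)
statistic = chi_square(laptop)

print(n, round(expected[0][0], 2), round(statistic, 2))
```

The expected number of males with laptops comes out near 107.67 rather than 150, and the statistic is about 8.89 rather than 14.23, consistent with the FALSE answers to items 31 and 34.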
35) When the variables of interest are both categorical and the decision maker is interested in determining whether a
relationship exists between the two, a statistical technique known as contingency analysis is useful.
Answer: TRUE
36) In conducting a test of independence for a contingency table that has 4 rows and 3 columns, the number of degrees of
freedom is 11.
Answer: FALSE
37) A study was recently conducted in which people were asked to indicate which news medium was their preferred choice
for national news. The following data were observed:
               radio    television    newspaper
under 21       30       50            5
21‐40          20       25            30
41 and over    30       30            50
Given this data, if we wish to test whether the preferred news source is independent of age, the expected frequency in the
radio, under 21 cell is 30.
Answer: FALSE
38) A cell phone company wants to determine if the use of text messaging is independent of age. The following data has been
collected from a random sample of their customers.
               Regularly use text messaging    Do not regularly use text messaging
Under 21       82                              38
21‐39          57                              34
40 and over    6                               83
Using the data above, in order to test for the independence of age and the use of text messaging, the expected value for the
"under 21 and regularly use text messaging" cell is 82.
Answer: FALSE
39) A study was recently conducted in which people were asked to indicate which news medium was their preferred choice
for national news. The following data were observed:
               radio    television    newspaper
under 21       30       50            5
21‐40          20       25            30
41 and over    30       30            50
Given this data, if we wish to test whether the preferred news source is independent of age with an alpha equal to .05, the
critical value will be a chi‐square value with 9 degrees of freedom.
Answer: FALSE
40) Given this data, if we wish to test whether the preferred news source is independent of age, the cell with the largest
expected cell frequency is also the cell with the largest observed frequency.
Answer: FALSE
41) Given this data, if we wish to test whether the preferred news source is independent of age, for an alpha = .05 level, the
critical value from the chi‐square table is based on 8 degrees of freedom.
Answer: FALSE
42) Given this data, if we wish to test whether the preferred news source is independent of age, for an alpha = .05 level, the
critical value from the chi‐square table is 9.4877.
Answer: TRUE
43) Given this data, if we wish to test whether the preferred news source is independent of age, for an alpha = .05 level, the
test statistic is computed to be approximately 40.70.
Answer: TRUE
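The degrees of freedom and test statistic quoted in items 39 through 43 can be reproduced from the news-source table. This is a sketch under the standard independence formulas; the function name independence_test is an assumption made for the example.

```python
def independence_test(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Rows: under 21, 21-40, 41 and over. Columns: radio, television, newspaper.
news = [[30, 50, 5], [20, 25, 30], [30, 30, 50]]
statistic, df = independence_test(news)

print(round(statistic, 2), df)
```

A 3 × 3 table gives (3 − 1)(3 − 1) = 4 degrees of freedom, and the statistic lands close to the 40.70 quoted in item 43.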
44) A cell phone company wants to determine if the use of text messaging is independent of age. The following data has been
collected from a random sample of customers.
                                       Under 21    21‐39    40 and over
Regularly use text messaging           82          57       6
Do not regularly use text messaging    38          34       83
Using this data, if we wish to test whether the use of text messaging is independent of age using a 0.05 level of significance,
the critical value is 5.9915.
Answer: TRUE
45) In order to apply the chi‐square contingency methodology for quantitative variables, we must first break the quantitative
variable down into discrete categories.
Answer: TRUE
46) A study was recently done in the United States in which car owners were asked to indicate whether their most recent car
purchase was a U.S. car, a German car, or a Japanese car. The people in the survey were divided by geographic region in the
United States. The following data were recorded.
               US     Japanese    German
East Coast     200    200         50
Central        250    100         20
West Coast     80     300         40
Given this situation, the sample size used in this study was nine.
Answer: FALSE
47) Given this situation, the null hypothesis to be tested is that the car origin is dependent on the geographical location of the
buyer.
Answer: FALSE
48) Given this situation, to test whether the car origin is independent of the geographical location of the buyer, the sum of the
expected cell frequencies will equal 1,240.
Answer: TRUE
49) Given this situation, to test whether the car origin is independent of the geographical location of the buyer, the critical
value for alpha = .10 is 14.6837.
Answer: FALSE
50) Given this situation, to test whether the car origin is independent of the geographical location of the buyer, the expected
number of people in the sample who bought a German made car and who lived on the East Coast is just under 40 people.
Answer: TRUE
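The sample size and expected cell count in items 46 through 50 follow directly from the table's margins. The sketch below is illustrative; the dictionary layout and variable names are assumptions chosen for readability.

```python
cars = {
    "East Coast": {"US": 200, "Japanese": 200, "German": 50},
    "Central": {"US": 250, "Japanese": 100, "German": 20},
    "West Coast": {"US": 80, "Japanese": 300, "German": 40},
}

# Sample size is the sum of all cells, not the number of cells.
n = sum(sum(region.values()) for region in cars.values())

# Expected count under independence: row total * column total / n.
east_total = sum(cars["East Coast"].values())
german_total = sum(region["German"] for region in cars.values())
expected_german_east = east_total * german_total / n

# Degrees of freedom for a 3 x 3 table.
df = (len(cars) - 1) * (len(cars["East Coast"]) - 1)

print(n, round(expected_german_east, 2), df)
```

The expected count of about 39.92 is "just under 40", and with 4 degrees of freedom the critical value in item 49 cannot be the 14.6837 that belongs to 9 degrees of freedom.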
51) A cell phone company wants to determine if the use of text messaging is independent of age. The following data has been
collected from a random sample of customers.
               Regularly use text messaging    Do not regularly use text messaging
Under 21       82                              38
21‐39          57                              34
40 and over    6                               83
To conduct a test of independence, the expected value for the "40 and over and regularly use text messaging" cell is
just over 43 people.
Answer: TRUE
52) Contingency analysis can be used when the level of data measurement is nominal or ordinal.
Answer: TRUE
53) To employ contingency analysis, we set up a 2‐dimensional table called a contingency table.
Answer: TRUE
54) A contingency table and a cross‐tabulation table are two separate things and should not be used for the same purpose.
Answer: FALSE
55) In a contingency analysis, we expect the actual frequencies in each cell to approximately match the corresponding
expected cell frequencies when H0 is true.
Answer: TRUE
56) In a chi‐square contingency test, the number of degrees of freedom is equal to the number of cells minus 1.
Answer: FALSE
57) In a chi‐square contingency analysis application, the expected cell frequencies will be equal in all cells if the null
hypothesis is true.
Answer: FALSE
58) Unlike the case of goodness‐of‐fit testing, with contingency analysis there is no restriction on the minimum size for an
expected cell frequency.
Answer: FALSE
59) In a contingency analysis the expected values are based on the assumption that the two variables are independent of each
other.
Answer: TRUE
60) If a contingency analysis test is performed with a 4 × 6 design, and if alpha = .05, the critical value from the chi‐square
distribution is 24.9958.
Answer: TRUE
61) If a contingency analysis test performed with a 4 × 6 design results in a test statistic value of 18.72, and if alpha = .05, the
null hypothesis that the row and column variable are independent should be rejected.
Answer: FALSE
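Items 60 and 61 combine the degrees-of-freedom rule with the upper-tail decision rule. A minimal sketch, using the tabulated critical value quoted in item 60 rather than computing it; the function name contingency_df is an assumption.

```python
def contingency_df(rows, columns):
    """Degrees of freedom for a test of independence on an r x c table."""
    return (rows - 1) * (columns - 1)

df = contingency_df(4, 6)

critical_value = 24.9958  # chi-square table value quoted in item 60 (alpha = .05)
test_statistic = 18.72    # value given in item 61

# Upper-tail test: reject H0 only when the statistic exceeds the critical value.
reject_h0 = test_statistic > critical_value

print(df, reject_h0)
```

Because 18.72 is below 24.9958, the null hypothesis of independence is not rejected.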
62) If the null hypothesis is not rejected, you do not need to worry when the expected cell frequencies drop below 5.0
Answer: TRUE
63) The degrees of freedom for the chi‐square goodness‐of‐fit test are equal to ________, where k is the number of categories.
A) k + 1
B) k ‐ 1
C) k + 2
D) k ‐ 2
64) Which of the following statements is true in the context of a chi‐square goodness‐of‐fit test?
A) The degrees of freedom for determining the critical value will be the number of categories minus 1.
B) The critical value will come from the standard normal table if the sample size exceeds 30.
C) The null hypothesis will be rejected for a small value of the test statistic.
D) A very large test statistic will result in the null not being rejected.
65) A walk‐in medical clinic believes that arrivals are uniformly distributed over weekdays (Monday through Friday). It has
collected the following data based on a random sample of 100 days.
         Frequency
Mon      25
Tue      22
Wed      19
Thu      18
Fri      16
Total    100
Based on this information, how many degrees of freedom are involved in this goodness‐of‐fit test?
A) 99
B) 100
C) 4
D) 5
66) Assuming that a goodness‐of‐fit test is to be conducted using a 0.10 level of significance, the critical value is:
A) 9.4877
B) 11.0705
C) 7.7794
D) 9.2363
67) To conduct a goodness‐of‐fit test, what is the expected value for Friday?
A) 20
B) 25
C) 16
D) 100
68) What is the value of the test statistic needed to conduct a goodness‐of‐fit test?
A) 8.75
B) 7.7794
C) 2.46
D) 2.50
69) Based on these data, conduct a goodness‐of‐fit test using a 0.10 level of significance. Which conclusion is correct?
A) Arrivals are not uniformly distributed over the weekday because (test statistic) > (critical value).
B) Arrivals are uniformly distributed over the weekday because (test statistic) > (critical value).
C) Arrivals are not uniformly distributed over the weekday because (test statistic) < (critical value).
D) Arrivals are uniformly distributed over the weekday because (test statistic) < (critical value).
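The clinic example in items 65 through 69 can be traced step by step. The sketch below is one way to lay out the arithmetic; it hard-codes 7.7794 as the chi-square table value for 4 degrees of freedom and an upper-tail area of 0.10, and the variable names are assumptions.

```python
observed = {"Mon": 25, "Tue": 22, "Wed": 19, "Thu": 18, "Fri": 16}

n = sum(observed.values())
expected = n / len(observed)  # uniform hypothesis: same count every day

statistic = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1

critical_value = 7.7794  # chi-square table, 4 df, alpha = 0.10
reject_h0 = statistic > critical_value

print(expected, round(statistic, 2), df, reject_h0)
```

Under these assumptions the statistic is 2.50 on 4 degrees of freedom, which falls below the critical value, so the uniform hypothesis is not rejected.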
70) In a chi‐square goodness‐of‐fit test, by combining cells we guard against having an inflated test statistic that could have
caused us to:
A) incorrectly reject the H0.
B) incorrectly accept the H0.
C) incorrectly reject the H1.
D) incorrectly accept the H1.
71) In a goodness‐of‐fit test about a population distribution, if one or more parameters are left unspecified in H0, they must
be estimated from the sample data. This will reduce the degrees of freedom by ________ for each estimated parameter.
A) 1
B) 2
C) 3
D) None of the above
72) If a sample with n = 60 subjects distributed over 3 categories was selected, a chi‐square test for goodness‐of‐fit will be
used. How many degrees of freedom will be used in determining the chi‐square test statistic?
A) 1
B) 2
C) 16
D) 64
73) For a goodness‐of‐fit test with a computed chi‐square value of 1.273 and a critical value of 13.388, the appropriate
conclusion would be to:
A) reject H0.
B) fail to reject H0.
C) take a larger sample.
D) take a smaller sample.
74) A researcher is using a chi‐square test to determine whether there are any preferences among 4 brands of orange juice.
With alpha = 0.05 and n = 30, the critical region for the hypothesis test would have a boundary of:
A) 7.81
B) 8.71
C) 8.17
D) 42.25
75) A chi‐square test for goodness‐of‐fit is used to test whether or not there are any preferences among 3 brands of peas. If the
study uses a sample of n = 60 subjects, then the expected frequency for each category would be:
A) 20
B) 30
C) 60
D) 33
76) We are interested in determining whether the opinions of the individuals on gun control (as to Yes, No, and No Opinion)
are uniformly distributed.
A sample of 150 was taken and the following data were obtained.
Do you support gun control    Number of Responses
Yes                           40
No                            60
No Opinion                    50
The conclusion of the test with alpha = 0.05 is that the views of people on gun control are:
A) uniformly distributed.
B) not uniformly distributed.
C) inconclusive.
D) None of the above
77) To use contingency analysis for numerical data, which of the following is true?
A) Contingency analysis cannot be used for numerical data.
B) Numerical data must be broken up into specific categories.
C) Contingency analysis can be used for numerical data only if both variables are numerical.
D) Contingency analysis can be used for numerical data only if it is interval data.
78) What does the term observed cell frequencies refer to?
A) The frequencies found in the population being examined
B) The frequencies found in the sample being examined
C) The frequencies computed from H0
D) The frequencies computed from H1
79) What does the term expected cell frequencies refer to?
A) The frequencies found in the population being examined
B) The frequencies found in the sample being examined
C) The frequencies computed from H0
D) The frequencies computed from H1
80) We expect the actual frequencies in each cell to approximately match the corresponding expected cell frequencies when:
A) H0 is false.
B) H0 is true.
C) H0 is falsely accepted.
D) the variables are related to each other.
81) In a contingency analysis, the greater the difference between the actual and the expected frequencies, the more likely:
A) H0 should be rejected.
B) H0 should be accepted.
C) we cannot determine H0.
D) the smaller the test statistic will be.
82) In a chi‐square contingency analysis, when expected cell frequencies drop below 5, the calculated chi‐square value tends
to be inflated and may inflate the true probability of ________ beyond the stated significance level.
A) committing a Type I error
B) committing a Type II error
C) Both A and B
D) All of the above
83) In performing chi‐square contingency analysis, to overcome a small expected cell frequency problem, we:
A) combine the categories of the row and/or column variables.
B) increase the sample size.
C) Both A and B
D) None of the above
84) How can the degrees of freedom be found in a contingency table with cross‐classified data?
A) df equals the number of rows minus the number of columns.
B) df equals the number of rows multiplied by the number of columns.
C) df equals (rows minus 1) multiplied by (columns minus 1).
D) df equals the total number of cells minus 1.
85) A cell phone company wants to determine if the use of text messaging is independent of age. The following data has been
collected from a random sample of customers.
               Regularly use text messaging    Do not regularly use text messaging
Under 21       82                              38
21‐39          57                              34
40 and over    6                               83
Based on the data above, what is the expected value for the "under 21 and regularly use text messaging" cell?
A) 82
B) 50
C) 120
D) 58
86) To conduct a contingency analysis, the number of degrees of freedom is:
A) 6
B) 5
C) 3
D) 2
87) To conduct a contingency analysis using a 0.01 level of significance, the value of the critical value is:
A) 15.0863
B) 5.9915
C) 9.2104
D) 11.0705
88) To conduct a contingency analysis, the value of the test statistic is:
A) 9.2104
B) 88.3
C) 275.02
D) 14.6
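For the text-messaging table in items 85 through 88, it can help to see how much each cell contributes to the statistic. The sketch below is an illustration under the usual independence formulas; the labels and variable names are assumptions introduced for the example.

```python
rows = ["Under 21", "21-39", "40 and over"]
cols = ["Regularly use", "Do not regularly use"]
observed = [[82, 38], [57, 34], [6, 83]]

row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
n = sum(row_totals)

# Expected counts under independence.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Per-cell contribution (o - e)^2 / e to the chi-square statistic.
contributions = {
    (rows[i], cols[j]): (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(len(rows))
    for j in range(len(cols))
}
statistic = sum(contributions.values())
df = (len(rows) - 1) * (len(cols) - 1)
largest_cell = max(contributions, key=contributions.get)

print(round(expected[0][0], 2), df, round(statistic, 1), largest_cell)
```

The "under 21, regularly use" cell has an expected count of 58, the table has 2 degrees of freedom, and the statistic is roughly 88.3, driven mostly by the "40 and over" row.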
89) For a chi‐square test involving a contingency table, suppose H0 is rejected. We conclude that the two variables are:
A) curvilinear.
B) linear.
C) related.
D) not related.
90) When testing for independence in a contingency table with 3 rows and 4 columns, there are ________ degrees of freedom.
A) 5
B) 6
C) 7
D) 12
91) In testing a hypothesis that two categorical variables are independent using the χ² test, the expected cell frequencies are
based on assuming:
A) the null hypothesis.
B) the alternative hypothesis.
C) the normal distribution.
D) the variables are related.
92) A study published in the American Journal of Public Health was conducted to determine whether the use of seat belts in
motor vehicles depends on ethnic status in San Diego County. A sample of 792 children treated for injuries sustained from
motor vehicle accidents was obtained, and each child was classified according to (1) ethnic status (Hispanic or non‐Hispanic)
and (2) seat belt usage (worn or not worn) during the accident. The number of children in each category is given in the table
below.
                       Hispanic    Non‐Hispanic
Seat belts worn        31          148
Seat belts not worn    283         330
Referring to these data, which test would be used to properly analyze the data in this experiment?
A) χ² test for independence in a two‐way contingency table
B) χ² test for equal proportions in a one‐way table
C) ANOVA F‐test for interaction in a 2 × 2 factorial design
D) χ² goodness‐of‐fit test
93) Referring to these data, the calculated test statistic is:
A) approximately ‐0.9991
B) nearly ‐0.1368
C) about 48.1849
D) approximately 72.8063
94) Referring to these data, which of the following conclusions should be reached if the appropriate hypothesis is conducted
using an alpha = .05 level?
A) The mean value for Hispanics is the same as for Non‐Hispanics.
B) There is no relationship between whether someone is Hispanic and whether they wear a seat belt.
C) The use of seat belts and whether a person is Hispanic or not are statistically related.
D) None of the above
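For a 2 × 2 table such as the seat belt data, the chi‐square statistic can also be computed with the algebraically equivalent shortcut n(ad − bc)² divided by the product of the four marginal totals. The sketch below uses that shortcut without a continuity correction; the function name is an assumption, and 3.8415 is the tabulated value for 1 degree of freedom at alpha = .05.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Rows: belts worn, belts not worn. Columns: Hispanic, Non-Hispanic.
statistic = chi_square_2x2(31, 148, 283, 330)

critical_value = 3.8415  # chi-square table, 1 df, alpha = .05
reject_h0 = statistic > critical_value

print(round(statistic, 4), reject_h0)
```

The statistic of about 48.18 is far above the critical value, which points to a statistical relationship between ethnic status and seat belt use in this sample.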
95) Many companies use well‐known celebrities as spokespeople in their TV advertisements. A study was conducted to
determine whether brand awareness of female TV viewers and the gender of the spokesperson are independent. Each of a
sample of 300 female TV viewers was asked to identify a product advertised by a celebrity spokesperson. The gender of the
spokesperson and whether or not the viewer could identify the product was recorded. The numbers in each category are
given below.
                      Male Celebrity    Female Celebrity
Identified product    41                61
Could not identify    109               89
Referring to these sample data, which test would be used to properly analyze the data in this experiment?
A) χ² test for independence in a two‐way contingency table
B) χ² test for equal proportions in a one‐way table
C) ANOVA F‐test for main treatment effect
D) χ² goodness‐of‐fit test
96) Referring to these sample data, if the appropriate hypothesis test is to be conducted using a .05 level of significance,
which of the following is the correct critical value?
A) 9.4877
B) 3.8415
C) 1.96
D) 7.8147
97) Referring to these sample data, which of the following values is the correct value of the test statistic?
A) Approximately 9.48
B) Nearly 23.0
C) About 3.84
D) Approximately 5.94
98) Referring to these sample data, if the appropriate null hypothesis is tested using a significance level equal to .05, which of
the following conclusions should be reached?
A) There is a relationship between gender of the celebrity and product identification.
B) There is no relationship between gender of the celebrity and product identification.
C) The mean number of products identified for males is different than the mean number for females.
D) Females have higher brand awareness than males.
99) The degrees of freedom for a contingency table with 11 rows and 10 columns is:
A) 11
B) 10
C) 110
D) 90
100) We want to test whether type of car owned (domestic or foreign) is independent of gender. A contingency table is
obtained from a sample of 990 people as
At alpha = 0.05 level, we conclude that:
A) χ² = 3.34 and type of car owned is independent of gender.
B) χ² = 3.34 and type of car owned is dependent on gender.
C) χ² = 3.84 and type of car owned is independent of gender.
D) χ² = 3.84 and type of car owned is dependent on gender.
101) The billing department of a national cable service company is conducting a study of how customers pay their monthly
cable bills. The cable company accepts payment in one of four ways: in person at a local office, by mail, by credit card, or by
electronic funds transfer from a bank account. The cable company randomly sampled 400 customers to determine if there is a
relationship between the customer's age and the payment method used. The following sample results were obtained:
Based on the sample data, can the cable company conclude that there is a relationship between the age of the customer and
the payment method used? Conduct the appropriate test at the alpha= 0.01 level of significance.
A) Because χ² = 42.2412 > 21.666, do not reject the null hypothesis. Based on the sample data conclude that age and type of
payment are independent.
B) Because χ² = 42.2412 > 21.666, reject the null hypothesis. Based on the sample data conclude that age and type of payment
are not independent.
C) Because χ² = 50.3115 > 21.666, do not reject the null hypothesis. Based on the sample data conclude that age and type
of payment are independent.
D) Because χ² = 50.3115 > 21.666, reject the null hypothesis. Based on the sample data conclude that age and type of
payment are not independent.
104) Explain why, in performing a goodness‐of‐fit test, it is sometimes necessary to combine categories.
Answer: Because of the way in which the chi‐square test statistic is computed by squaring the difference between the
observed and expected frequencies, when the expected frequencies are small (less than 5), the calculated test statistic can
become artificially large and therefore may lead to an increased chance of committing a Type I statistical error. That is, a true
null hypothesis may be rejected at a higher rate than indicated by the selected significance level. By combining categories, the
small expected frequencies are grouped to become larger than five and thus the issue of inflated Type I error probability
dissolves.
Note: An alternative to combining categories is to increase the sample size. Large sample sizes result in greater expected cell
frequencies in all categories.
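The combining step described in this answer can be sketched in code. The routine below is only one possible scheme, merging adjacent categories from left to right until each merged cell has an expected frequency of at least 5; the function name, the merging order, and the sample counts are all hypothetical and chosen for illustration.

```python
def combine_small_cells(observed, expected, minimum=5.0):
    """Merge adjacent cells until every merged cell has expected >= minimum."""
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= minimum:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0, 0.0
    if acc_exp > 0:
        # Leftover cells at the end: fold them into the last merged cell.
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp

# Hypothetical counts for six ordered categories.
observed = [4, 12, 20, 9, 3, 2]
expected = [3.0, 11.0, 19.0, 10.0, 4.5, 2.5]

merged_obs, merged_exp = combine_small_cells(observed, expected)
df_after = len(merged_exp) - 1

print(merged_obs, merged_exp, df_after)
```

Combining cells leaves the totals unchanged but reduces the number of categories, so the degrees of freedom drop along with it.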