answer - Hatem Masri

Problem (1)
Recently, a survey was conducted involving customers of a fitness center in Dallas, Texas.
Participants were asked to indicate how often they use the club by checking one of the following
categories: 0-1 times per week; 2-3 times per week; 4-5 times per week; more than 5 times per week. The
following data show how males and females responded to this question.
One of the purposes of the survey was to determine whether there is a relationship between
the gender of the customer and the number of visits made each week.
a. State the appropriate null and alternative hypothesis.
ANSWER:
$H_0$: number of visits is independent of gender
$H_A$: number of visits is related to gender
b. What test procedure is appropriate to use to conduct this test?
ANSWER:
Because the data are observed frequencies in various discrete categories, the appropriate test
to use is the chi-square contingency analysis. This involves determining the expected frequencies
assuming the null hypothesis is true and then comparing these expected frequencies, cell by cell, to the
observed frequencies. If these closely match, then the null hypothesis should not be rejected.
However, if there is a big difference between the expected and observed cell frequencies, we should
reject the null hypothesis.
c. Conduct the hypothesis test using an alpha = .05 level.
ANSWER:
The test statistic for performing a chi-square contingency analysis is computed as
follows:
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}$$
with d.f. = (r – 1)(c – 1).
The first step needed is to compute the expected cell frequencies. This is done under the
assumption that the null hypothesis is true and that the proportion of customers in each use level is the
same regardless of gender. The expected frequencies can be computed using:
$$\text{Expected Frequency} = \frac{\text{row total} \times \text{column total}}{\text{grand total}}.$$
For example for the cell corresponding to males who use the center 0-1 times per week, we get:
$$\text{Expected Frequency} = \frac{172 \times 150}{475} = 54.3158.$$
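The expected-frequency rule can be checked with a few lines of Python. This is a minimal sketch: the function name `expected_frequency` is illustrative, and the only inputs used are the three totals quoted in the worked example above (172, 150, and 475).

```python
def expected_frequency(row_total, column_total, grand_total):
    """Expected cell count under independence: row total x column total / grand total."""
    return row_total * column_total / grand_total


# Totals quoted in the worked example for males who visit 0-1 times per week.
e_male_0_1 = expected_frequency(172, 150, 475)
print(round(e_male_0_1, 4))
```

The same function would be applied to every cell, using that cell's own row and column totals.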
The following shows the expected cell frequencies for each cell:
Next, for each cell we compute $\frac{(o - e)^2}{e}$. For example, in the cell for males who use the center 0-1 times per week, we get:
$$\frac{(41 - 54.3158)^2}{54.3158} = 3.26.$$
Below we show the computation for each cell:
           0-1        2-3        4-5        over 5
Males      3.264433   0.822572   2.59585    0.531475
Females    1.853077   0.466939   1.473552   0.301696
The chi-square test statistic is computed by summing these values giving
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(o_{ij} - e_{ij})^2}{e_{ij}} = 11.309.$$
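As a quick arithmetic check, the summation can be reproduced in Python. This is only a sketch: the dictionary layout and variable names are illustrative, the eight per-cell values are copied from the table above, and the single-cell cross-check uses the figures quoted in the text (observed 41, totals 172, 150, and 475).

```python
# Per-cell (o - e)^2 / e values copied from the table above.
contributions = {
    "Males": [3.264433, 0.822572, 2.59585, 0.531475],
    "Females": [1.853077, 0.466939, 1.473552, 0.301696],
}

# The chi-square statistic is the sum over all cells.
chi_square = sum(sum(row) for row in contributions.values())

# Cross-check one cell from the figures quoted in the text:
# observed 41, expected 172 * 150 / 475.
expected = 172 * 150 / 475
first_cell = (41 - expected) ** 2 / expected

print(round(chi_square, 4), round(first_cell, 6))
```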
The critical value for the contingency analysis test with (2 - 1) x (4 - 1) = 3 degrees of freedom and alpha
equal to .05 is found in the chi-square table to be 7.8147.
The decision rule is:
If $\chi^2$ > 7.8147, reject the null hypothesis
Otherwise, do not reject.
Since $\chi^2$ = 11.309 > 7.8147, reject the null hypothesis and conclude that use rate is
related to gender of the customer.
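The decision rule can also be sketched in Python using only the standard library. The critical value and test statistic below are taken from the text. The helper `chi2_sf_3df` is an illustrative name, and the p-value it returns is not part of the original answer; it is included only as a cross-check. It relies on a closed form that holds for 3 degrees of freedom specifically.

```python
import math


def chi2_sf_3df(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 d.f.

    Closed form valid for 3 degrees of freedom only:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


alpha = 0.05
critical_value = 7.8147  # table value quoted in the text
chi_square = 11.309      # test statistic from the text

# Decision rule: reject H0 when the statistic exceeds the critical value.
reject_null = chi_square > critical_value

# Equivalent check via the p-value (an addition, not in the original answer).
p_value = chi2_sf_3df(chi_square)

print(reject_null, p_value < alpha)
```

Both comparisons should agree, since the critical value is by construction the point whose upper-tail probability equals alpha.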
Problem (2)
The following regression output is the result of a multiple regression application in which we are
interested in explaining the variation in retail price of personal computers based on three independent
variables, CPU speed, RAM, and hard drive capacity. However, some of the regression output has been
omitted.
Given this information and your knowledge of multiple regression, what is the value for the standard
error of the estimate?
ANSWER:
The standard error of the estimate is a measure of the dispersion of the actual y values around
the fitted y values. The formula for the standard error is:
$$SEE = \sqrt{MSE}$$
However, the MSE is not given directly in the above output. We can compute it using the following
formula:
$$MSE = \frac{SSE}{n - k - 1}$$
where SSE = Sum of Squares Error or Sum of Squares Residual and n = 36 and k = 3. The Sum of Squares
Residual is not given either, but can be computed as:
$$SSE = TSS - SSR$$
and TSS and SSR are given in the ANOVA section of the regression output. Thus, we get:
SSE = 49,327,250 - 34,335,283 = 14,991,967
Therefore, we get:
$$MSE = \frac{14{,}991{,}967}{36 - 3 - 1} = 468{,}499$$
Then, the standard error of the estimate is:
$$SEE = \sqrt{MSE} = \sqrt{468{,}499} = 684.47$$
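The chain of calculations above can be verified with a short Python sketch. The variable names are illustrative. The inputs are the TSS, SSR, n, and k values stated in the text.

```python
import math

tss = 49_327_250  # total sum of squares (from the text)
ssr = 34_335_283  # regression sum of squares (from the text)
n, k = 36, 3      # observations and independent variables (from the text)

sse = tss - ssr          # residual sum of squares
mse = sse / (n - k - 1)  # mean square error, with n - k - 1 degrees of freedom
see = math.sqrt(mse)     # standard error of the estimate

print(sse, round(mse), round(see, 2))
```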