MATH 2560 C F03 Elementary Statistics I Solutions to Assignment N3

Total points: 50 (2.5 percent).

Question 1: 12 points (1 point is equal to 0.05 percent);

Question 2: 20 points;

Question 3: 4 points;

Question 4: 12 points.

Question 5: 2 points.

1 Solution to Question 1.

a). (5 points) Using Excel with the Statistical package and the CORREL function, we obtain the following value for the correlation r: r = 0.50430229. The squared correlation is r^2 = 0.254321.

We can see that there is a positive association between US returns and overseas returns. The association is not strong; it is rather weak.

More precisely, r^2 gives us more information about the strength of the association. The number r^2 = 0.25 tells us that only about 1/4 of the variation is accounted for by the linear relationship.
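The correlation reported above comes from Excel; as a rough illustration of what such a function computes, here is a minimal Python sketch of the Pearson correlation. The function name `correl` is only a stand-in chosen to mirror the Excel function, and the raw return data are not reproduced here, so the sketch only squares the reported value of r.

```python
import math

def correl(xs, ys):
    """Pearson correlation of two equal-length samples (hypothetical helper)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Value of r reported in part a); its square is the fraction of the
# variation accounted for by the linear relationship.
r = 0.50430229
r_squared = r ** 2
print(round(r_squared, 6))
```

Squaring r this way reproduces the 0.254321 quoted in part a), which is where the "about 1/4 of the variation" reading comes from.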

b). (5 points) Let x be the US return and y the overseas return. Using Excel we find the following values (the means and standard deviations of x and y, the ratio s_y/s_x, the slope b and the intercept a):

x̄ = 14.4, ȳ = 14.33333, s_x = 16.4455, s_y = 21.64216, s_y/s_x = 1.315993, b = 0.663658, a = 4.776657.

Here b = r × s_y/s_x and a = ȳ − b × x̄.

Therefore, the equation for the least-squares regression line is:

ŷ = 4.77665 + 0.663658 × x.

From here, the predicted value ŷ of the overseas return for 1993, when the US return was x = 10.1, is:

ŷ = 4.776657 + 0.663658 × (10.1) = 11.479603.

The observed value of the overseas return for 1993 is 32.9.
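The slope, intercept and 1993 prediction can be re-derived from the summary statistics alone. The sketch below assumes the pairing s_x = 16.4455 and s_y = 21.64216, which is the assignment consistent with the reported ratio 1.315993 and slope 0.663658; the variable and function names are illustrative only.

```python
# Summary statistics from parts a) and b); x = US return, y = overseas return.
r = 0.50430229
x_bar, y_bar = 14.4, 14.33333
s_x, s_y = 16.4455, 21.64216  # assumed pairing, see the note above

# Least-squares slope and intercept from the summary statistics.
b = r * s_y / s_x
a = y_bar - b * x_bar

def predict(x):
    """Predicted overseas return for a given US return x."""
    return a + b * x

y_hat_1993 = predict(10.1)  # the US return in 1993 was 10.1
print(f"b = {b:.6f}, a = {a:.5f}, prediction = {y_hat_1993:.4f}")
```

Up to rounding in the inputs, this agrees with the Excel values b = 0.663658, a = 4.776657 and the prediction 11.479603.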

c). (2 points) From b) we can see that predictions from the regression line cannot be trusted to be very accurate: the difference between the observed value 32.9 and the predicted value 11.479603 is large, approximately 32.9 − 11.5 = 21.4. This difference is greater than even the mean overseas return.
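As a quick arithmetic check, the prediction error (residual) for 1993 can be computed without rounding the prediction first; it comes to about 21.4. The comparison value 14.33333 is taken to be the mean overseas return from part b), which is an assumption about how the garbled summary values pair up.

```python
observed = 32.9          # observed overseas return for 1993
predicted = 11.479603    # value predicted by the regression line in part b)
mean_overseas = 14.33333 # assumed: mean of y from part b)

# Residual = observed value minus predicted value.
residual = observed - predicted
print(round(residual, 2))
print(residual > mean_overseas)
```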

Below you can see the scatterplot and regression line for our data.

2 Solution to Question 2.

a). (5 points) Using Excel with the Statistical package, we obtain the following calculations:

Therefore, the equation of the least-squares regression line takes the following form:

ŷ = 11.058 − 0.01466 × x.

b). (5 points) Using the data from part a) and Excel's Chart Wizard, with "XY (Scatter)" for the scatterplot of the raw data and the "line" function for the regression line, we obtain the accompanying plot.

c). (5 points) The residuals and their sum appear with the calculations in part a) (bottom left corner). The sum of the residuals is 0.002, that is, almost equal to zero. The residuals from a least-squares line always sum to exactly zero, so the small discrepancy is roundoff error.
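The assignment's data table is not reproduced in this text, so the following sketch uses small made-up numbers purely to illustrate why a residual sum such as 0.002 is attributable to rounding: the exact residuals of a least-squares fit sum to zero, while residuals rounded to three decimals need not. The helper `least_squares` and the data are hypothetical.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Illustrative data only, NOT the data from the assignment.
xs = [1.0, 2.0, 4.0, 7.0, 9.0]
ys = [10.8, 11.2, 10.1, 10.9, 9.7]

a, b = least_squares(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

residual_sum = sum(residuals)                        # zero up to float error
rounded_sum = sum(round(e, 3) for e in residuals)    # may differ slightly from zero
print(residual_sum, rounded_sum)
```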

d). (5 points) Using the data from part a) and Excel's Chart Wizard with "XY (Scatter)", we obtain the following plot of the residuals against the values of x.

The residuals show the same pattern about the zero line as the data points show about the regression line in the scatterplot in b).

3 Solution to Question 3.

a). (2 points) The data were collected after the anesthesia had been administered. The researchers had no control over which anesthetic was used in each surgery, and no randomization was used to assign anesthetics to operations. The National Halothane Study is therefore an observational study rather than an experiment.

b). (2 points) Some possible variables that are mixed up with which anesthetic a patient receives are: the patient's allergy to certain anesthetics, the type of surgery, and the patient's condition, such as how healthy the patient was before the surgery.

4 Solution to Question 4.

a). (1 point) The explanatory variable is diet group and the response variable is whether or not subjects get colon cancer.

b). (4 points) The outline of the experiment looks as follows:

c). (4 points) We assume that the groups have equal size, namely 216 subjects in each of the 4 groups. First, we sort the subjects (in alphabetical order, for example) and label them 001, 002, 003, 004, ..., 863, 864. We then use Table B starting at line 118. The first 216 distinct three-digit groups between 001 and 864 identify the subjects for the beta carotene group. The first three-digit groups from Table B, line 118, are: 731, 903, 253, 304, 470, 269. We omit 903 because it is greater than 864, so the subjects labeled 731, 253, 304, 470, 269 are the first five chosen for the first group. We continue in the same way, keeping the treatment groups equal in size.
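The selection rule described above (read three-digit groups, skip any label outside 001 to 864 and any repeat, stop when the group is full) can be sketched as follows. The function name `select_labels` is hypothetical, and only the six groups quoted in the solution are used as input; the rest of Table B is not reproduced here.

```python
def select_labels(groups, max_label, group_size):
    """Pick labels in order, skipping out-of-range values and repeats."""
    chosen = []
    seen = set()
    for g in groups:
        if 1 <= g <= max_label and g not in seen:
            seen.add(g)
            chosen.append(g)
            if len(chosen) == group_size:
                break  # the treatment group is full
    return chosen

# Three-digit groups quoted in the solution for Table B, line 118.
quoted_groups = [731, 903, 253, 304, 470, 269]

first_chosen = select_labels(quoted_groups, max_label=864, group_size=216)
print(first_chosen)
```

With only six groups supplied, the call returns the five labels named in the solution; a full run would keep reading the table until 216 labels are collected.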

d). (1 point) Double-blind means that neither the subject nor the researcher knew which supplement (or placebo) the subject was receiving.

e). (1 point) "No significant difference" means that, although the cancer rates may not be exactly the same among the treatment groups, the differences are no larger than might have happened by chance.

f). (1 point) Possible lurking variables: people who eat more fruits and vegetables may be more health conscious. They may exercise more and see the doctor regularly. They may also eat fewer "harmful" foods, etc.

5 Solution to Question 5.

(2 points) The chance of being interviewed is 6/60 for people over age 42 and 4/40 for people under age 42, that is, 1/10 in both cases. Every person therefore has the same chance of being interviewed. It is not an SRS, however, because not all combinations of people have an equal chance of being interviewed (an important characteristic of an SRS). For example, a group of ten people all over age 42 has no chance of being the sample.
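The argument can be checked by counting. The sketch below assumes, from the fractions 6/60 and 4/40, a population of 60 people over age 42 and 40 under, with 6 and 4 of them sampled respectively; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Assumed setup, read off the fractions in the solution.
over, under = 60, 40        # people in each age group
n_over, n_under = 6, 4      # people sampled from each group

# Each individual's chance of being interviewed.
p_over = Fraction(n_over, over)
p_under = Fraction(n_under, under)

# An SRS of size 10 would make every 10-person subset possible;
# the stratified design only allows subsets with exactly 6 over and 4 under.
all_samples = comb(over + under, n_over + n_under)
allowed_samples = comb(over, n_over) * comb(under, n_under)

print(p_over, p_under)
print(allowed_samples < all_samples)
```

Both individual chances equal 1/10, yet only a fraction of all 10-person subsets can occur, which is why the design is not an SRS.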
