Solutions to Assignment Due November 5, 2012

EDF 6472 Introduction to Data Analysis in Educational Research
Hinkle, et al.
Chapter 8
4. The scores on a physical-performance test for boys of junior high school age have
been standardized with a mean of 175 and a standard deviation of 12 for the general
population. In a large city school system, a random sample of 225 boys is tested.
The sample mean is 173.6.
a. Find the standard error of the mean. A standard error is a standard deviation.
Describe the distribution of which this standard error is the standard deviation.

The standard error of the mean is found using the formula
$\sigma_{\bar{X}} = \sigma / \sqrt{N}$, where $\sigma_{\bar{X}}$ is the
standard error of the mean, σ is the standard deviation of the population, and N is
the sample size. Note that this is one of those unusual cases where we do know the
standard deviation of the population and do not have to estimate it using the
standard deviation of the sample (s) and calculating $s_{\bar{X}}$. So,
$\sigma_{\bar{X}} = \sigma / \sqrt{N} = 12 / \sqrt{225} = 12 / 15 = .80$.
This standard error is the standard deviation of the sampling distribution of the
mean, that is, the distribution of the means of all possible samples of size
N = 225 that could be taken from the population of boys who have taken this test.

b. Test the hypothesis that the mean performance test scores for the population of
junior high school boys of this school system is 175 against the alternative
hypothesis that it is not 175. Use α = .05. Identify the four steps in testing the
hypothesis, H0: μ = 175.
Step 1: State the Hypothesis
H0: μ = 175
Ha: μ ≠ 175
Step 2: Set the Criterion for Rejecting H0
This statistical test will be a two-tailed test since we will conclude that the null
hypothesis is false if the population mean is greater than 175 or less than 175. In
both these cases the mean will not equal 175. Hence, the area of rejection (5% of
the area under the curve) must be split between the two tails of the distribution, as
shown in Figure 8.4 in the Hinkle, et al. textbook.
Since we know the population standard error, and the Central Limit Theorem tells us
that the sampling distribution of the mean is normal, we need to determine the
values of z that cut off the top and bottom 2.5% of the distribution. These will
define our area of rejection. Going into the table of Areas under the Standard
Normal Curve (Table C.1 in Hinkle, et al.) we can look at the “Area beyond z”
column and find .0250. We see that the z value given for this area of 2.5% of the
total area under the curve is z = 1.96. Since the curve is symmetrical, the value of
z that cuts off 2.5% at the lower end of the curve is z = -1.96. So, we will reject
the null hypothesis if the value of z obtained from our sample mean is greater than
1.96 or less than -1.96, since in either of these cases the chance of getting a value
of z that extreme in that tail if the null were true would be less than 2.5% (.025).
Step 3: Compute the Test Statistic
For a sampling distribution of means, $z = (\bar{X} - \mu) / \sigma_{\bar{X}}$,
where $\bar{X}$ equals the sample mean and μ is the population mean hypothesized
under the null hypothesis. So,
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{173.6 - 175}{.8} = \frac{-1.4}{.8} = -1.75$.
Step 4: Decide about H0
-1.75 is not greater than 1.96 or less than -1.96. We can conclude that the
probability of obtaining a sample mean at least as far from 175 as 173.6 if the null
were true (that is, if μ = 175) is greater than .05. Therefore, we will not reject
the null hypothesis.
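The four steps above can be checked numerically. What follows is a minimal Python sketch using only the standard library. The function name `two_tailed_z_test` is an illustrative assumption (it does not come from the textbook), and the critical value is computed from the normal distribution rather than read from Table C.1.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z test for a mean when the population sigma is known."""
    std_error = sigma / sqrt(n)                    # sigma / sqrt(N)
    z = (sample_mean - mu0) / std_error            # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = .05
    p_value = 2 * NormalDist().cdf(-abs(z))        # area in both tails
    return std_error, z, z_crit, p_value, abs(z) > z_crit


# Values from problem 4: sample mean 173.6, mu0 = 175, sigma = 12, N = 225.
std_error, z, z_crit, p_value, reject = two_tailed_z_test(173.6, 175, 12, 225)
print(f"SE = {std_error:.2f}, z = {z:.2f}, critical z = +/-{z_crit:.2f}")
print(f"two-tailed p = {p_value:.4f}, reject H0: {reject}")
```

The two-tailed p value reported by the sketch is the probability statement asked for in part c, expressed as a number instead of as "greater than .05".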
c. After completing the test in part b, give the conclusion and the probability
statement.
We cannot reject the null hypothesis because, if the population mean were 175, the
probability of obtaining a sample mean at least as far from 175 as 173.6 would be
greater than .05 (p > .05).
d. Suppose school officials are interested in testing the hypothesis that the population
of junior high school boys in this school system performs lower on the test than the
general population does. Test this hypothesis, again using α = .05.
In this case we are defining a one-tailed hypothesis since we will only be
concerned with the lower half of the distribution where the sample means are less
than the population mean. Therefore, H0: μ = 175 and Ha: μ < 175. The area of
rejection is now entirely in the lower tail of the sampling distribution (see Figure
8.8 in Hinkle, et al.). Using the table of areas under the normal distribution we
find that the value of z that cuts off 5% (.0500) of the area under the curve is
between 1.64 and 1.65 (feel free to choose either of them). We’ll use z = 1.65 for
this example. So, since we are only concerned with the lower (the negative) half
of the normal curve, we will reject the null hypothesis if z < -1.65. From Part b,
Step 3, we know that, for this sample, z = -1.75. This is less than -1.65, so we
conclude that the chance of obtaining a z this low if the null hypothesis were true
is less than 5% and decide to reject the null hypothesis. We will decide that the
population mean is less than 175.
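The one-tailed decision can be sketched the same way. This is a minimal Python illustration using the standard library. The function name `lower_tail_z_test` is an assumption made for this example, and the critical value is computed exactly (about -1.645) rather than taken as -1.65 from the table.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """One-tailed (lower) z test for a mean with known population sigma."""
    std_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / std_error
    z_crit = NormalDist().inv_cdf(alpha)   # about -1.645 for alpha = .05
    p_value = NormalDist().cdf(z)          # area below the observed z
    return z, z_crit, p_value, z < z_crit


# Same sample as in part b, now tested against Ha: mu < 175.
z, z_crit, p_value, reject = lower_tail_z_test(173.6, 175, 12, 225)
print(f"z = {z:.2f}, critical z = {z_crit:.3f}")
print(f"one-tailed p = {p_value:.4f}, reject H0: {reject}")
```

The observed z is the same as in part b. Only the rejection region changes, which is why the decision can differ.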
e. Is there an inconsistency between the results of parts b and d? Explain.
No, there is no inconsistency here. In the two-tailed test we assumed, in stating
the alternative hypothesis, that we didn’t know whether the actual population mean
was greater than or less than the mean hypothesized in the null hypothesis. In the
one-tailed test we had more information. That is, we knew that the actual
population mean was not greater than the one we hypothesized (that is, 175) or, if
it were, we didn’t really care because, for our purposes, it would be the same as if
the mean were equal to 175. Having this extra information allows us to be more
confident that any sample mean less than 175 indicates a population mean of less
than 175. Hence, we needed less of a difference between the hypothesized
population mean and the sample mean to conclude, at the .05 level, that the
population mean differed from the mean hypothesized under the null hypothesis.
f. Suppose that, instead of a sample size of 225, one of size 40 had been selected.
Would a t distribution now be the appropriate sampling distribution for the mean?
Why or why not?
No. It is not the sample size that matters here. As long as we know the population
standard deviation, we can calculate the population standard error directly. There
is no need to estimate it from the sample standard deviation, so the t distribution
is not needed. The Central Limit Theorem tells us that the sampling distribution of
the mean is normally distributed, NOT distributed as a t distribution.
g. If a sample of size 40 had been selected, would attaining statistical significance
require a difference between $\bar{X}$ and the hypothesized value of μ larger or
smaller than if a sample size of 225 had been selected? Explain.
It would have required a larger difference. A sample size of 40 would have
resulted in a larger standard error than the sample of 225, since
$\sigma_{\bar{X}} = \sigma / \sqrt{N}$. A larger standard error would have
decreased the size of z, since $z = (\bar{X} - \mu) / \sigma_{\bar{X}}$, for a
given value of $\bar{X} - \mu$. So you would need a larger difference between the
sample mean and the hypothesized population mean to obtain the same value of z when
the sample size was 40 than when it was 225.
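To make the comparison concrete, the smallest difference that reaches two-tailed significance is the critical z times the standard error. The short Python sketch below works this out for both sample sizes. The function name `min_significant_difference` is an illustrative assumption, and σ = 12 and α = .05 are taken from the problem.

```python
from math import sqrt
from statistics import NormalDist


def min_significant_difference(sigma, n, alpha=0.05):
    """Smallest |sample mean - mu| that reaches two-tailed significance."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z_crit * sigma / sqrt(n)


diff_225 = min_significant_difference(12, 225)
diff_40 = min_significant_difference(12, 40)
print(f"N = 225: difference needed = {diff_225:.2f}")
print(f"N = 40:  difference needed = {diff_40:.2f}")
```

With these inputs the required difference is roughly 1.57 points for N = 225 and roughly 3.72 points for N = 40, so the smaller sample needs more than twice the difference.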
14. A study on the reaction time of children with cerebral palsy reports a mean of 1.6
seconds on a particular task. A researcher believes that the reaction time can be
reduced by using a motivating set of directions. An equivalent set of children is
located, and they complete the same task with the motivating set of directions. The
reaction times for the 12 children follow. Test the null hypothesis (H0: μ = 1.6)
against the directional alternative (Ha: μ < 1.6). Use α = .01.
Child   Reaction Time      Child   Reaction Time
A       1.4                G       1.5
B       1.8                H       2.0
C       1.1                I       1.4
D       1.3                J       1.9
E       1.6                K       1.8
F       0.8                L       1.3
We do not know the standard deviation of the scores of the population of children
with cerebral palsy on this task, so we must estimate it from the sample standard
deviation. Since this is true, the sampling distribution of the means will be
distributed as a t distribution with N – 1 = 12 – 1 = 11 degrees of freedom. Table
C.3, Critical Values of the t Distribution, in Hinkle, et al. will give us the critical
value of t for a directional (one-tailed) test at the .01 level of significance. We find
that tcv = -2.718 (negative since we are only concerned with values less than the
hypothesized mean). We will reject the null hypothesis if our calculated value of t is
less than -2.718.
Now, $t = (\bar{X} - \mu) / s_{\bar{X}}$, but we don’t know the value of
$s_{\bar{X}}$. We do know that $s_{\bar{X}} = s / \sqrt{n}$.
We can find the standard deviation using the formula
$s = \sqrt{\sum (X - \bar{X})^2 / (N - 1)}$.
Child   X      $X - \bar{X}$   $(X - \bar{X})^2$
A       1.4    -.09            .01
B       1.8     .31            .10
C       1.1    -.39            .15
D       1.3    -.19            .04
E       1.6     .11            .01
F       0.8    -.69            .48
G       1.5     .01            .00
H       2.0     .51            .26
I       1.4    -.09            .01
J       1.9     .41            .17
K       1.8     .31            .10
L       1.3    -.19            .04
ΣX = 17.9                      $\sum (X - \bar{X})^2 = 1.37$

$\bar{X} = 17.9 / 12 = 1.49$

$s = \sqrt{\sum (X - \bar{X})^2 / (N - 1)} = \sqrt{1.37 / 11} = .35$

Now we can find the standard error of the mean using
$s_{\bar{X}} = s / \sqrt{n} = .35 / \sqrt{12} = .35 / 3.46 = .10$.
Knowing this, we can calculate the appropriate value of t in this manner:
$t = (\bar{X} - \mu) / s_{\bar{X}} = (1.49 - 1.6) / .10 = -.11 / .10 = -1.1$.
This value is not less than -2.718, so we will fail to reject the null hypothesis
and conclude that we cannot say that the mean time it takes the children to
accomplish the task with the motivational instructions is less than the time
required by the children who worked without the motivational directions.
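The hand calculation can be reproduced from the raw reaction times. The following Python sketch uses only the standard library. The critical value -2.718 is the Table C.3 value quoted in the solution and is entered by hand here, since the standard library has no t distribution. Because the sketch does not round intermediate results, its t (about -1.07) differs slightly from the -1.1 obtained above with rounded values.

```python
from math import sqrt
from statistics import mean, stdev

# Reaction times for children A through L, from the problem statement.
times = [1.4, 1.8, 1.1, 1.3, 1.6, 0.8, 1.5, 2.0, 1.4, 1.9, 1.8, 1.3]
mu0 = 1.6          # mean hypothesized under H0
t_crit = -2.718    # one-tailed, alpha = .01, df = 11 (Table C.3, as quoted above)

n = len(times)
x_bar = mean(times)
s = stdev(times)               # sample standard deviation (N - 1 in the denominator)
std_error = s / sqrt(n)        # estimated standard error of the mean
t = (x_bar - mu0) / std_error
reject = t < t_crit

print(f"mean = {x_bar:.2f}, s = {s:.2f}, SE = {std_error:.2f}")
print(f"t = {t:.2f}, df = {n - 1}, reject H0: {reject}")
```

The decision is the same either way: t is nowhere near -2.718, so the null hypothesis is not rejected.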


Chapter 9
5. The 95-percent confidence interval for a population mean is (4.12, 10.88). Without
any further information, find the 99-percent and 90-percent confidence intervals. The
sample size is greater than 120. (Hint: Find $\bar{X}$, the critical value of t or z,
and the standard deviation of the sampling distribution of the mean – in other words,
the standard error of the mean.)
The 95% confidence interval is $\bar{X} \pm 1.96(s_{\bar{X}})$ since the value of z
that cuts off .025 of the scores under the normal distribution is ±1.96 according to
Table C.1 in the textbook. We can use the table of the normal distribution since we
are told that N > 120, where the t distribution would be just about normal. Because
the interval is symmetric about the sample mean, the mean of the sample is in the
middle of the 95% confidence interval. The center of the interval (4.12, 10.88) can
be found by taking the distance between the two limits, dividing the distance by
two, and adding this result to the lower limit of the interval.
So, 10.88 – 4.12 = 6.76. Half of 6.76 is 6.76/2 = 3.38, and 4.12 + 3.38 = 7.50. An
alternative method of finding the sample mean, $\bar{X}$, is to find the mean of the
upper and lower limits of the interval. So, (10.88 + 4.12)/2 = 15/2 = 7.50. It works
either way.
Since we now know the mean of the sample, we can find the standard error of the mean
($s_{\bar{X}}$) by solving the equation $7.50 + 1.96(s_{\bar{X}}) = 10.88$ for
$s_{\bar{X}}$, since the upper limit of the 95% confidence interval is equal to the
mean plus 1.96 standard errors. This gives us $1.96(s_{\bar{X}}) = 3.38$, so
$s_{\bar{X}} = 3.38 / 1.96 = 1.72$.
The 99% confidence interval is found using the same formula except that we now need
to use the value of z that cuts off .005 (that is, .01/2) of the area under the curve.
Table C.1 shows us that this value is z = 2.57 (note that 2.58 would be just as
appropriate since each is within .0001 of .005). Using z = 2.58, the formula for the
99% confidence interval is
$\bar{X} \pm 2.58(s_{\bar{X}}) = 7.50 \pm 2.58(1.72) = 7.50 \pm 4.44$. Therefore, the
upper limit of the 99% confidence interval is 7.50 + 4.44 = 11.94 and the lower limit
is 7.50 – 4.44 = 3.06.
In a similar way, the 90% confidence interval is found using the same formula except
that we now need to use the value of z that cuts off .05 (that is, .10/2) of the area
under the curve. Table C.1 shows us that this value is z = 1.64 (note that 1.65 would
be just as appropriate since each is within .0005 of .05). So, the formula for the 90%
confidence interval is
$\bar{X} \pm 1.64(s_{\bar{X}}) = 7.50 \pm 1.64(1.72) = 7.50 \pm 2.82$. Therefore, the
upper limit of the 90% confidence interval is 7.50 + 2.82 = 10.32 and the lower limit
is 7.50 – 2.82 = 4.68.
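The same rescaling can be sketched in Python with the standard library. The function name `rescale_interval` is an assumption made for this illustration. It uses exact normal quantiles instead of the rounded table values (1.96, 2.58, 1.64) and does not round the standard error to 1.72, so its limits can differ from the hand calculation by a few hundredths.

```python
from statistics import NormalDist


def rescale_interval(lower, upper, old_level, new_level):
    """Convert a z-based confidence interval from one confidence level to another."""
    center = (lower + upper) / 2                            # sample mean
    z_old = NormalDist().inv_cdf(1 - (1 - old_level) / 2)   # e.g. about 1.96 for 95%
    z_new = NormalDist().inv_cdf(1 - (1 - new_level) / 2)
    std_error = (upper - center) / z_old                    # half-width / z
    half_width = z_new * std_error
    return center - half_width, center + half_width


ci_99 = rescale_interval(4.12, 10.88, 0.95, 0.99)
ci_90 = rescale_interval(4.12, 10.88, 0.95, 0.90)
print(f"99% CI: ({ci_99[0]:.2f}, {ci_99[1]:.2f})")
print(f"90% CI: ({ci_90[0]:.2f}, {ci_90[1]:.2f})")
```

As expected, the 99% interval is wider than the original 95% interval and the 90% interval is narrower, with all three centered on 7.50.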
14. The researcher tests H0: μ = 95 against the alternative hypothesis, Ha: μ ≠ 95. He
selects a sample size of 200 and computes the statistical test with the data. The
sample mean is 96.5 and he rejects H0 with α = .05. Then a 95-percent confidence
interval is constructed going from 93.5 to 97.5. Identify an inconsistency in the
results and an error in computation.
An inconsistency in the results: The researcher rejected the null hypothesis that
μ = 95 at the α = .05 level of significance, yet the value 95 is within the
95-percent confidence interval (93.5, 97.5) that he constructed. If H0: μ = 95 is
rejected in a two-tailed test at α = .05, 95 should not be within the 95% confidence
interval.
An error in calculation: The 95-percent confidence interval was miscalculated. We
can see this immediately since X (96.5) should be in the center of the confidence
interval and it isn’t in this case. If the interval is 93.5 to 97.5, the center of the
interval is 95.5.
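Both checks are mechanical, so they can be written as a small Python helper. This is only a sketch: the function name `check_interval` and its return values are assumptions made for this illustration.

```python
def check_interval(sample_mean, lower, upper, mu0, rejected):
    """Check a reported confidence interval against the reported test result."""
    center = (lower + upper) / 2
    # A z- or t-based interval for the mean must be centered on the sample mean.
    centered_on_mean = abs(center - sample_mean) < 1e-9
    # Rejecting H0 (two-tailed) at alpha should match mu0 lying outside
    # the corresponding (1 - alpha) confidence interval.
    mu0_inside = lower <= mu0 <= upper
    consistent_with_test = rejected != mu0_inside
    return center, centered_on_mean, consistent_with_test


center, centered_on_mean, consistent_with_test = check_interval(
    sample_mean=96.5, lower=93.5, upper=97.5, mu0=95, rejected=True
)
print(f"center = {center}, centered on sample mean: {centered_on_mean}")
print(f"consistent with the test decision: {consistent_with_test}")
```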
Green, et al.
Lesson 22
1. Compute total scores for the algebra test from the item scores. A one-sample t test
will be computed on the total scores.
We’ll bring the file Lesson 22 Exercise File 1 into the computer’s memory from the
web site. When this is accomplished the Data View screen should look like the one
below.
We can compute the total scores using the COMPUTE function in SPSS. To do this,
go to the Transform menu at the top of the Data View screen and click on it. Click on
the first choice in the menu which is Compute. You will see a dialog screen similar to
the one shown on the next page.
We will store the sum of the scores on each of the test items in a variable called
TOTAL. To do this, type “total” (leave out the quotation marks) in the window labeled
Target Variable:. Now, since this variable will be the sum of the variables item1 to
item8, type “sum(item1 to item8)” (again leaving out the quotation marks) in the
window labeled Numeric Expression:.
The Compute Variable dialog box should look like the one shown below.
Now, click on the OK button and the new variable will appear in the data view screen
as noted on the next page.
2. What is the test value for this problem?
Okay, I’ll admit it. This one is tricky! However, if we think about it a little we can
probably figure it out. What the researcher wants to find out is if, “the Involvement
Technique is effective in teaching algebra to first graders.” The question never defines
“effective,” but let’s assume the strategy is effective if students learn anything at all.
Since there are eight questions with four choices for each item, a group of students
who learned no algebra and were guessing randomly should have scored two items
correctly, just by luck. So, let us assume that the new technique has been effective for
any student who scores higher than two items correct. This makes the null hypothesis
μ = 2 with the alternative hypothesis being μ > 2. Therefore, the test value is 2.
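The chance-level score used as the test value is just the number of items times the probability of guessing one item correctly. A tiny Python sketch of that arithmetic follows. The variable names are illustrative, and it assumes, as the solution does, that every item has four equally likely choices.

```python
n_items = 8      # items on the algebra test
n_choices = 4    # answer choices per item

p_guess = 1 / n_choices                 # chance of guessing one item correctly
expected_correct = n_items * p_guess    # expected score from pure guessing

print(f"Expected score under random guessing: {expected_correct}")
```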
3. Conduct a one-sample t test on the total scores. On the output, identify the following:
a. Mean algebra score
b. t-test value
c. p value
To conduct the t-test we click on the Analyze menu at the top of the Data View
window. Next we click on the Compare Means submenu and receive the choices
shown on the next page.
Click on the choice One-Sample T Test and look at the dialog box that should be
similar to the one shown below.
Highlight the variable total in the left-hand box and move it to the Test
Variable(s): box by clicking on the right arrow. Now type “2” (without the quotation
marks) into the Test Value: box. The One-Sample T Test dialog box should look like
the one shown below.
Now, click on the OK button to receive the output shown below.
T-Test

One-Sample Statistics
         N    Mean    Std. Deviation    Std. Error Mean
total    6    6.00    1.414             .577

One-Sample Test (Test Value = 2)
                                                              95% Confidence Interval
                                                              of the Difference
         t        df    Sig. (2-tailed)    Mean Difference    Lower    Upper
total    6.928    5     .001               4.000              2.52     5.48
The upper box shows us that the mean of the algebra scores is 6.00.
The obtained t-value is 6.928 as shown in the One-Sample Test box.
The One-Sample Test box tells us that the p value is .001. However, as noted in the
box, this is the value for a 2-tailed (non-directional) test. Our hypothesis is
directional and requires a one-tailed test. So, what is the one-tailed probability that
we could get a sample mean of 6 or higher if the population mean were equal to 2 (the
null hypothesis)? This is simple enough. Because the sample mean falls in the
direction predicted by the alternative hypothesis, the one-tailed probability is
simply one-half of the two-tailed probability. So, in this case, the one-tailed
probability is one-half of .001 or .0005. This is considerably less than .05, so we
will reject the null hypothesis because a sample mean this high would be very
unlikely if the null hypothesis were true.
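The t value in the SPSS output can be recovered from the summary statistics, and the one-tailed p value follows from the reported two-tailed value. The Python sketch below uses only the standard library, so it takes the two-tailed p value (.001) from the output rather than computing it. Because the standard deviation is entered as the rounded 1.414, the computed t matches the reported 6.928 only up to rounding.

```python
from math import sqrt

# Summary statistics copied from the One-Sample Statistics box.
n = 6
mean_total = 6.00
sd_total = 1.414
test_value = 2
p_two_tailed = 0.001   # Sig. (2-tailed) as reported by SPSS

std_error = sd_total / sqrt(n)               # should be close to .577
t = (mean_total - test_value) / std_error    # should be close to 6.928
df = n - 1

# Halving is appropriate here because the sample mean lies in the
# direction stated by the alternative hypothesis (mu > 2).
p_one_tailed = p_two_tailed / 2

print(f"SE = {std_error:.3f}, t = {t:.3f}, df = {df}")
print(f"one-tailed p = {p_one_tailed:.4f}")
```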
4. Given the results of the children’s performance on the test, what should John
conclude? Write a Results section based on your analysis.
John would conclude that his method is effective in teaching algebra to first graders.