EXPERIMENTAL DESIGN

EXPERIMENTAL DESIGN
• Random assignment
• Who gets assigned to what?
• How does it work?
• What are the limits to its efficacy?
RANDOM ASSIGNMENT
• Equal probability of assignment to each
condition (treatment, control, etc.)
– or fixed, known probability if other design
conditions are included
• Use of a random number table or a computer-generated random number to make
assignments
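A minimal Python sketch of computer-generated assignment with equal probability for each condition. The function name, the unit labels, and the choice to deal units out in turn after shuffling are assumptions made for illustration, not part of the original slides.

```python
import random

def assign_conditions(units, conditions=("treatment", "control"), seed=None):
    """Randomly assign units to conditions in (nearly) equal-sized groups."""
    rng = random.Random(seed)   # seeded generator so an assignment can be reproduced
    shuffled = list(units)
    rng.shuffle(shuffled)       # every ordering of the units is equally likely
    assignment = {}
    for i, unit in enumerate(shuffled):
        # Deal units out in turn so group sizes differ by at most one.
        assignment[unit] = conditions[i % len(conditions)]
    return assignment

# Hypothetical unit labels, used only to show the call.
students = ["s01", "s02", "s03", "s04", "s05", "s06"]
print(assign_conditions(students, seed=1))
```

Shuffling first and then dealing keeps group sizes balanced; drawing an independent random number per unit would also give equal probabilities but could produce unequal group sizes.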
WHO GETS ASSIGNED
• Primary units (such as students, patients,
or clients) assigned individually without
additional personal information used
• Assignment within personal or
demographic categories: gender,
psychological diagnosis, etc.
• Multiple levels of assignment: pools used
for selection
How Randomization Works
• Distributes various causal conditions,
variables equally across assignment
conditions
• Generates random differences in initial
conditions, pretest scores whose variance
can be estimated in probability
• Creates individual variation (“error”) that is
independent of treatment
Limits of Efficacy
• Randomization does not last forever; groups begin to change over time in
unknown ways
• History is uncontrolled
• Maturation is uncontrolled over long
periods of time
• Testing effects are not controlled
• Mortality effects are not controlled
TWO GROUP MEANS
TESTS
Two independent groups
experiments
• Randomization distributions.
• 6 scores (persons, things) can be randomly
split into 2 groups of 3 in 20 ways:
123|456  135|246  234|156  256|134
124|356  136|245  235|146  345|126
125|346  145|236  236|145  346|125
126|345  146|235  245|136  356|124
134|256  156|234  246|135  456|123
Two independent groups
experiments
• Differences between groups
can be arranged as follows:
         -3 -1  1  3
      -5 -3 -1  1  3  5
-9 -7 -5 -3 -1  1  3  5  7  9
• look familiar?
[Figure: histogram of the 20 differences; x-axis VAR00001 from -8 to 8, y-axis Count from 0 to 3]
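The randomization distribution above can be reproduced by brute force. This sketch assumes the six scores are the values 1 through 6 and that the difference is taken between the two group sums, which is what yields the values from -9 to 9; the function name is illustrative.

```python
from collections import Counter
from itertools import combinations

def randomization_distribution(scores, group_size):
    """Count each possible difference in group sums over all splits."""
    total = sum(scores)
    diffs = Counter()
    for group in combinations(scores, group_size):
        first = sum(group)
        # Difference = (sum of first group) - (sum of the remaining scores).
        diffs[first - (total - first)] += 1
    return diffs

dist = randomization_distribution([1, 2, 3, 4, 5, 6], 3)
for diff in sorted(dist):
    # Text histogram: one asterisk per split producing this difference.
    print(f"{diff:>3}: {'*' * dist[diff]}")
```

The counts rise to a plateau in the middle and fall off symmetrically, which is the shape the t-distribution approximates.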
t-distribution
• Gosset discovered it
• similar to normal, heavier tails
• different for each sample size, based on N-2
for two groups (degrees of freedom)
• randomization distribution of differences is
approximated by t-distribution
t-distribution assumptions
• NORMALITY
– Shapiro-Wilk W test in SPSS
• HOMOGENEITY OF VARIANCES IN BOTH
GROUPS’ POPULATIONS
– Levene’s test in SPSS
• INDEPENDENCE OF ERRORS
– logical evaluation
– Durbin-Watson test in serial data
Null hypothesis for test of means for two
independent groups
• $H_0: \mu_0 - \mu_1 = 0$
• $H_1: \mu_0 - \mu_1 \neq 0$
• Fix a significance level, $\alpha$.
• Then we select a sample statistic. In this
case we choose the sample mean for each
group, and the test statistic is the sample
difference
$d = \bar{y}_0 - \bar{y}_1$.
Null hypothesis for test of means for two
independent groups
• $t = d / s_d = (\bar{y}_0 - \bar{y}_1) / \sqrt{\{[(n_0 - 1)s_0^2 + (n_1 - 1)s_1^2] / (n_0 + n_1 - 2)\}\{1/n_0 + 1/n_1\}}$
• The variance of a difference of two scores
is:
• $s^2_{(y_1 - y_2)} = s_1^2 + s_2^2 - 2 r_{12} s_1 s_2$
Standard deviation of differences
• $s^2_{(y_1 - y_2)} = s_1^2 + s_2^2 - 2 r_{12} s_1 s_2$
• Example: $s_1^2 = 100$, $s_2^2 = 144$, $r_{12} = .7$
• $s^2_{(y_1 - y_2)} = 100 + 144 - 2(.7)(10)(12) = 244 - 168 = 76$
• $s_{(y_1 - y_2)} = 8.72$
Standard deviation of
differences
• $s^2_{(y_1 - y_2)} = s_1^2 + s_2^2 - 2 r_{12} s_1 s_2$
• Example: $s_1^2 = 100$, $s_2^2 = 144$, $r_{12} = 0$, $n_1 = 24$, $n_2 = 16$
• $s^2_{(y_1 - y_2)} = 100 + 144 = 244$
• $s_{(y_1 - y_2)} = 15.62$
[Figure: t-distribution, df = 24 + 16 - 2 = 38, centered at 0, SD = 15.62]
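Both worked examples of the variance of a difference can be checked with a few lines of Python. This is a sketch with an illustrative function name; the inputs are the variances and correlations given above.

```python
import math

def sd_of_difference(var1, var2, r12):
    """Standard deviation of y1 - y2 from the two variances and their correlation."""
    s1, s2 = math.sqrt(var1), math.sqrt(var2)
    variance = var1 + var2 - 2 * r12 * s1 * s2
    return math.sqrt(variance)

print(round(sd_of_difference(100, 144, 0.7), 2))  # correlated scores: 8.72
print(round(sd_of_difference(100, 144, 0.0), 2))  # independent scores: 15.62
```

A positive correlation between the two scores shrinks the variance of their difference, which is why the first result is much smaller than the second.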
Standard error of mean
difference score
• Standard error of the sample difference.
It is the square root of the
average (pooled) variance of the two samples,
• $s^2 = [(n_0 - 1)s_0^2 + (n_1 - 1)s_1^2] / (n_0 + n_1 - 2)$
• multiplied by the sample-size factor $(1/n_0 + 1/n_1)$:
$s_d^2 = s^2 (1/n_0 + 1/n_1)$
• Same concept as seen in sampling
distribution of single mean
Example
Willson (1997) studied two groups of college freshman engineering students, one group
having participated in an experimental curriculum while the other was a random sample
of the standard curriculum. One outcome of interest was performance on the Mechanics
Baseline Test, a physics measure (Hestenes & Swackhammer, 1992). The data for the
two groups is shown below. A significance level of .01 was selected for the hypothesis
that the experimental group performed better than the standard curriculum group
(a directional test):
Group      Mean   SD   Sample size
Exper      47     15   75
Std Cur    37     16   50
$t = (47 - 37) / \sqrt{[(74 \times 15^2 + 49 \times 16^2) / (75 + 50 - 2)][1/75 + 1/50]}$
$= 10 / \sqrt{[(16650 + 12544) / 123][1/75 + 1/50]}$
$= 10 / 2.813$
$= 3.555$
The t-statistic is compared with the tabled value for a t-statistic with 123 degrees
of freedom at the .01 significance level, 2.358. The observed t exceeds the tabled value,
so the observed probability of occurrence is less than the intended level of significance.
The conclusion is that the experimental curriculum group significantly outperformed the
standard curriculum group.
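A minimal sketch of the pooled-variance t computation from the summary statistics in the table above (means 47 and 37, SDs 15 and 16, sample sizes 75 and 50). The function name and return layout are illustrative assumptions.

```python
import math

def pooled_t(mean0, sd0, n0, mean1, sd1, n1):
    """Two-independent-groups t from summary statistics, assuming equal variances."""
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n0 - 1) * sd0**2 + (n1 - 1) * sd1**2) / (n0 + n1 - 2)
    # Standard error of the difference between the two sample means.
    se = math.sqrt(pooled_var * (1 / n0 + 1 / n1))
    t = (mean0 - mean1) / se
    return t, se, n0 + n1 - 2

t, se, df = pooled_t(47, 15, 75, 37, 16, 50)
print(f"t = {t:.3f}, standard error = {se:.3f}, df = {df}")
```

The resulting t is then compared with the tabled critical value for the chosen significance level and degrees of freedom.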
experimentwise error
• probability of a Type I error in any of the
tests, called the experimentwise error
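• Rough approximation: $\alpha_{EW} < k\alpha$ for $k$ tests, each run at level $\alpha$
• Example, if we run 3 t-tests at p=.05,
experimentwise error rate < .15
• limit by setting experimentwise error to
some value, like .05, then $\alpha = .05/k$ for each test
• Called the Bonferroni correction; calculated
exactly (for independent tests), the experimentwise error is $1 - (1 - \alpha)^k$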
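The three quantities on this slide can be sketched in Python as follows. The function names are illustrative, and the exact formula assumes the k tests are independent.

```python
def experimentwise_bound(alpha, k):
    """Rough upper bound on the experimentwise error rate: k * alpha."""
    return k * alpha

def experimentwise_exact(alpha, k):
    """Exact experimentwise error rate, assuming the k tests are independent."""
    return 1 - (1 - alpha) ** k

def bonferroni_alpha(experimentwise, k):
    """Per-test significance level that keeps the experimentwise rate at the target."""
    return experimentwise / k

# Three t-tests, each at the .05 level.
print(round(experimentwise_bound(0.05, 3), 4))   # rough bound
print(round(experimentwise_exact(0.05, 3), 4))   # exact rate under independence
print(round(bonferroni_alpha(0.05, 3), 4))       # per-test level for a .05 target
```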
Confidence interval around d
• $d \pm t \sqrt{\{[(n_0 - 1)s_0^2 + (n_1 - 1)s_1^2] / (n_0 + n_1 - 2)\}\{1/n_0 + 1/n_1\}}$
• Thus, for the example, the standard error is
$\sqrt{(29194/123)(1/75 + 1/50)} = 2.813$, and using the .01
significance level the confidence interval is
• $10 \pm 2.358(2.813) = (3.37, 16.63)$.
• Thus, the population mean difference is
somewhere between about 3 and 17
• This excludes 0 (zero) so we reject the null
hypothesis.
Wilcoxon rank sum test for
two independent groups.
• While the t-distribution is the
randomization distribution of standardized
differences of sample means for large
sample sizes, for small samples it is not
the best procedure for all unknown
distributions. If we do not know that the
population is normally distributed, a better
alternative is the Wilcoxon rank sum
test.
Wilcoxon rank sum test for two
independent groups.
Heart Rejected?   Survival days
Yes               624, 46, 64, 1350, 280, 10, 1024, 39, 730, 136
No                15, 3, 127, 23, 1, 44, 551, 12, 48, 26

Ranks for data above                          Sum
Yes               17, 10, 12, 20, 15, 3, 19, 8, 18, 14    136
No                5, 2, 13, 6, 1, 9, 16, 4, 11, 7         74

Test Statistics (RANKDAY)
Mann-Whitney U                    19.000
Wilcoxon W                        74.000
Z                                 -2.343
Asymp. Sig. (2-tailed)            .019
Exact Sig. [2*(1-tailed Sig.)]    .019
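The rank sums, U, and Z reported above can be recomputed from the survival data with the standard library alone. This is a sketch under the assumption that there are no tied values (true for these data); the function name is illustrative and no continuity or tie correction is applied.

```python
import math

def rank_sum_test(group_a, group_b):
    """Wilcoxon rank sum / Mann-Whitney U for data without ties."""
    combined = sorted(group_a + group_b)
    # Rank 1 goes to the smallest value; valid only when all values are distinct.
    rank = {value: i + 1 for i, value in enumerate(combined)}
    sum_a = sum(rank[v] for v in group_a)
    sum_b = sum(rank[v] for v in group_b)
    n_a, n_b = len(group_a), len(group_b)
    u_a = sum_a - n_a * (n_a + 1) / 2
    u_b = sum_b - n_b * (n_b + 1) / 2
    u = min(u_a, u_b)
    # Normal approximation: U has mean n_a*n_b/2 under the null hypothesis.
    sd = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - n_a * n_b / 2) / sd
    return sum_a, sum_b, u, z

rejected = [624, 46, 64, 1350, 280, 10, 1024, 39, 730, 136]
not_rejected = [15, 3, 127, 23, 1, 44, 551, 12, 48, 26]
sum_yes, sum_no, u, z = rank_sum_test(rejected, not_rejected)
print(sum_yes, sum_no, u, round(z, 3))
```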
Confidence interval for S
• Confidence interval for S.
• While S (or U) may not be an obvious statistic to think
about, both have the same standard deviation
• $s_S = \sqrt{n_1 n_2 (n + 1)/12}$, where $n = n_1 + n_2$
• so that for the asymptotic normality condition (with n1 and
n2 at least 8 each), for alpha = .05,
• $S \pm 1.96 s_S$
• gives a 95% confidence interval. For the data above $s_S$ =
13.23, and the 95% confidence interval is
• $74 \pm 25.93 = (48.07, 99.93)$.
Correlation representation of the two
independent groups experiment
• $t^2 = r_{pb}^2 / [(1 - r_{pb}^2) / (N - 2)]$
• $r_{pb}^2 = t^2 / (t^2 + N - 2)$
• $N = n_1 + n_2$
Correlation representation of the two
independent groups experiment
• $r_{pb} = t / (t^2 + N - 2)^{1/2}$
Path model representation
of two group experiment
[Figure: path diagram from x to y labeled rpb, with error term e]
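The conversion between t and the point-biserial correlation can be sketched in Python. The function names are illustrative; the formulas are the ones given above, with N the total sample size.

```python
import math

def t_to_rpb(t, n_total):
    """Point-biserial correlation implied by a two-group t with N total cases."""
    return t / math.sqrt(t**2 + n_total - 2)

def rpb_to_t(r, n_total):
    """Two-group t implied by a point-biserial correlation with N total cases."""
    return r * math.sqrt((n_total - 2) / (1 - r**2))

r = t_to_rpb(2.0, 20)
print(round(r, 4), round(rpb_to_t(r, 20), 4))  # converting back recovers t = 2.0
```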
Test of point biserial=0
• $H_0: \rho_{pb} = 0$
• $H_1: \rho_{pb} \neq 0$
• is equivalent to t-test for difference for two
means.
Fig. 6.4: Scatterplot of ranks of days of survival
for persons who experienced tissue
rejection (1) or not (0)
[Figure: y-axis RANKDAY (0 to 30); x-axis REJECT (0 = no rejection, 1 = yes)]
Dependent groups experiments
• $d = y_1 - y_0$
• for each pair. Now the hypotheses about the new scores
become
• $H_0: \delta = 0$
• $H_1: \delta \neq 0$
• where $\delta$ is the population mean difference. The sample statistic is simply the
sample mean difference. The standard error of the difference can be computed from
the standard deviation of the difference scores divided
by the square root of n, the number of pairs
Dependent groups experiments
• $s_d = \sqrt{[s_0^2 + s_1^2 - 2 r_{12} s_0 s_1]/n}$
• Then the t-statistic is
• $t = \bar{d} / s_d$
Dependent groups experiments
• In a study of the change in grade point average for a group of college
engineering freshmen, Willson (1997) recorded the following data over two
semesters for a physics course:
Variable   N     Mean       Std Dev
PHYS1      128   2.233333   1.191684
PHYS2      128   2.648438   1.200983
Correlation Analysis: r12 = .5517
To test the hypothesis that the grade average changed after the second
semester from the first, for a significance level of .01, the dependent samples t-statistic is
$t = [2.648 - 2.233] / \sqrt{[1.192^2 + 1.201^2 - 2(.5517)(1.192)(1.201)]/128}$
= .415 / .1001
= 4.145
This is greater than the tabled t-value of 2.616 (df = n - 1 = 127). Therefore, it was
concluded the students averaged higher the second semester than the first.
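A minimal sketch of the dependent-samples t computed from the summary statistics and correlation reported above. The function name and argument order are illustrative assumptions.

```python
import math

def dependent_t(mean0, sd0, mean1, sd1, r01, n_pairs):
    """Paired t from the two means, SDs, their correlation, and the number of pairs."""
    # Variance of the difference scores, divided by n, gives the squared standard error.
    se = math.sqrt((sd0**2 + sd1**2 - 2 * r01 * sd0 * sd1) / n_pairs)
    t = (mean1 - mean0) / se
    return t, se

t, se = dependent_t(2.233333, 1.191684, 2.648438, 1.200983, 0.5517, 128)
print(f"t = {t:.3f}, standard error = {se:.4f}, df = {128 - 1}")
```

Because the two semesters are positively correlated, the standard error is smaller than it would be for two independent groups of the same size.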
Nonparametric test of difference in
dependent samples.
• sign test. A count of the positive (or negative) difference
scores is compared with a binomial sign table. This sign
test is identical to deciding if a coin is fair by flipping it n
times and counting the number of heads. Within a
standard error of $.5 n^{1/2}$ the number should be equal to
n/2. As n becomes large, the distribution of the number
of positive difference scores minus n/2, divided by the standard
error, is normal.
•
An alternative to the sign test is the
Wilcoxon signed rank test or symmetry
test
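The coin-flip logic of the sign test can be sketched as follows. The difference scores are made up for illustration, the function name is an assumption, and zero differences are dropped (a common convention, not stated in the slides).

```python
import math

def sign_test(differences):
    """Two-sided sign test: exact binomial p-value plus the large-sample z."""
    nonzero = [d for d in differences if d != 0]   # zero differences carry no sign
    n = len(nonzero)
    positives = sum(1 for d in nonzero if d > 0)
    # Exact two-sided p-value under the fair-coin null (probability .5 per sign).
    tail = min(positives, n - positives)
    p = min(1.0, 2 * sum(math.comb(n, k) for k in range(tail + 1)) / 2**n)
    # Normal approximation: mean n/2, standard error .5 * sqrt(n).
    z = (positives - n / 2) / (0.5 * math.sqrt(n))
    return positives, n, p, z

# Hypothetical paired differences: 9 positive, 1 negative.
diffs = [1, 2, 3, -1, 4, 5, 6, 7, 2, 3]
positives, n, p, z = sign_test(diffs)
print(positives, n, round(p, 4), round(z, 3))
```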
Summary of two group
experimental tests of hypothesis
• Table is a compilation of last two chapters:
– sample size
– one or two groups
– normal distribution or not
– known or unknown population variance(s)
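Table 6.2: Summary of one and two group experimental or
observational studies
• One group; normal distribution assumed; $\sigma^2$ known
– Hypotheses: $H_0: \mu = a$, $H_1: \mu \neq a$
– Test statistic: $z = (\bar{y}_{.} - a) / [\sigma^2/n]^{1/2}$
– Distribution: normal
• One group; normal distribution assumed; $\sigma^2$ unknown
– Hypotheses: $H_0: \mu = a$, $H_1: \mu \neq a$
– Test statistic: $t = (\bar{y}_{.} - a) / [s^2/n]^{1/2}$
– Distribution: t with n-1 df
• One group; normal distribution not assumed; $\sigma^2$ unknown
– Hypotheses: $H_0: \mu = a$, $H_1: \mu \neq a$
– Test statistic: $S = \sum R^+_i$, $y_i > a$ (Wilcoxon rank sum)
– or $n^+ = \sum i^+$, $i^+ = 1$ if $y_i > a$, 0 else (binomial, sign test)
• Two independent groups; normal distribution assumed; $\sigma_0^2 = \sigma_1^2 = \sigma^2$, known
– Hypotheses: $H_0: \mu_0 - \mu_1 = 0$, $H_1: \mu_0 - \mu_1 \neq 0$
– Test statistic: $z = (\bar{y}_{0.} - \bar{y}_{1.}) / [\sigma^2 (1/n_0 + 1/n_1)]^{1/2}$
– Distribution: normal
• Two independent groups; normal distribution assumed; $\sigma_0^2 = \sigma_1^2$, unknown
– Hypotheses: $H_0: \mu_0 - \mu_1 = 0$, $H_1: \mu_0 - \mu_1 \neq 0$
– Test statistic: $t = (\bar{y}_{0.} - \bar{y}_{1.}) / [s^2 (1/n_0 + 1/n_1)]^{1/2}$, with $s^2 = [(n_0 - 1)s_0^2 + (n_1 - 1)s_1^2] / (n_0 + n_1 - 2)$
– Distribution: t with $n_0 + n_1 - 2$ df
• Two independent groups; normal distribution not assumed; $\sigma_0^2 = \sigma_1^2$, unknown
– Hypotheses: $H_0: \mu_0 - \mu_1 = 0$, $H_1: \mu_0 - \mu_1 \neq 0$
– Test statistic: $S = \sum R^+_i$ for one of the groups
– Distribution: Wilcoxon rank sum
• Two dependent groups; normal distribution assumed; $\sigma_0^2 = \sigma_1^2 = \sigma^2$, known
– Hypotheses: $H_0: \mu_0 - \mu_1 = 0$, $H_1: \mu_0 - \mu_1 \neq 0$
– Test statistic: $z = (\bar{y}_{0.} - \bar{y}_{1.}) / [2\sigma^2 (1 - \rho)/n]^{1/2}$, where $\rho$ = population correlation between $y_0$ and $y_1$
– Distribution: normal
• Two dependent groups; normal distribution not assumed; $\sigma_0^2 = \sigma_1^2 = \sigma^2$, unknown
– Hypotheses: $H_0: \mu_0 - \mu_1 = 0$, $H_1: \mu_0 - \mu_1 \neq 0$
– Test statistic: $S = \sum R^+_i$
– Distribution: Wilcoxon rank sum for positive differences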