
SP801 Session 6

1. Review of MRA Topics: Mediation / Suppression / Nominal Predictors

2. Experimental Designs
   2.1 Prologue
   2.2 Purpose and Definition of the Experiment
   2.3 Limitations and Criticism
   2.4 Some Terminology

3. Comparing Means: t-Tests and the Analysis of Variance
   3.1 t-Test for Independent Samples
   3.2 t-Test for Matched Pairs of Observations
   3.3 One-Factor ANOVA

1. Review of MRA Topics:

Testing mediation
   Definition of mediator
   How is mediation tested in MRA?
   Necessary conditions for inferring mediation

Suppression
   What indicates suppression in an MRA result?
   How can suppressor variables be identified?

Coding nominal variables
   Why create new variables at all?
   Alternative coding strategies

Assessment:

You should be able to


(a) differentiate mediation from moderation,

(b) give examples for mediator and moderator variables,

(c) describe and perform the procedures involved in the analysis of mediation (including hand computation of the z-test),

(d) explain the concept of suppression and give examples,

(e) explain the rationale behind dummy, effects, contrast and nonsense coding; apply these codings to a nominal IV in MRA.
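For revising point (c), the mediation z-test can be sketched numerically. The sketch below assumes the z-test referred to is the Sobel test (the usual hand-computed test of an indirect effect); the coefficients a, b and their standard errors are made-up illustration values, not data from this course.

```python
import math

def sobel_z(a, se_a, b, se_b):
    """Sobel test: z for the indirect effect a*b, where a is the
    IV -> mediator slope and b the mediator -> DV slope (controlling
    for the IV), each with its standard error."""
    return (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)

# Hypothetical coefficients, for illustration only:
z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(round(z, 2))  # 3.12 -> |z| > 1.96, indirect effect significant at alpha = .05
```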


2. Experimental and Quasi-Experimental Designs

2.1 Prologue

Traditional domains of statistical models:

MRA: Correlational research. Relationships between variables are examined. The researcher does not intervene or create conditions.

ANOVA: Experimental research. Means of a DV are compared between groups that are defined by the levels of actively created and manipulated IVs.

MRA and ANOVA are mathematically identical and can be seen as special cases of a General Linear Model (GLM). However, each procedure highlights different aspects of analysis.

Before we look at t-tests and ANOVA models, we will briefly discuss the experimental method.

2.2 Purpose and Definition of the Experiment

The experiment:

 major aim is causal analysis.

 promoted as the method of choice in psychology (e.g., Aronson et al., 1990; Cook & Campbell, 1979).

 less ambiguous than other methods in terms of alternative causal pathways underlying a relationship between variables.

When to use nonexperimental methods:

 when causal explanation is unimportant (is it ever?);

 when experimentation is ethically or practically impossible;

 to complement and cross-validate experimental analyses.

(For reviews of quasi-experimental designs, see Aronson et al., 1990, ch. 5; Cook & Campbell, 1979)


How is an experiment defined?

Common features of many experiments:

 random assignment

 manipulation of the IV

 control of extraneous variables

 laboratory setting

The main (and sufficient) defining feature is random assignment of units of observation to conditions.

Why is random assignment the key to causal inference?

Causation requires that

 the IV covaries with the DV,

 the IV is temporally prior to the DV,

 that alternative causation of the DV (by something other than the IV) can be ruled out.

In nonexperimental studies, we can establish covariation and temporal sequence, and also control for potential rival causes: variables can be eliminated, held constant, or measured and statistically partialled out (as in MRA).

However, to control or statistically partial out the effects of extraneous variables, these need to be known and assessed.

Random assignment is the only way to exert control over extraneous variables whose potential influence is unknown !
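A small simulation can make this argument concrete (a sketch with made-up numbers, not data from any study): give participants an unmeasured confound, assign them to two conditions at random many times, and check that the confound difference between the groups averages out to zero, even though the confound was never measured or controlled.

```python
import random

random.seed(1)

# Hypothetical unmeasured confound (e.g. trait anxiety) for 100 participants.
confound = [random.gauss(50, 10) for _ in range(100)]

# Randomly assign the same 100 people to two groups of 50, many times,
# and record the between-group difference in the confound each time.
diffs = []
for _ in range(1000):
    pool = confound[:]
    random.shuffle(pool)                  # random assignment
    treatment, control = pool[:50], pool[50:]
    diffs.append(sum(treatment) / 50 - sum(control) / 50)

mean_diff = sum(diffs) / len(diffs)
print(abs(mean_diff) < 0.5)  # True: no systematic confounding in expectation
```

Any single randomisation leaves some imbalance, but the imbalance is random rather than systematic, which is exactly what the significance test takes into account.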

Example: Schachter’s affiliation experiments (“Dr. Zilstein”)

Naturalistic observation as an alternative?

Problems of control conditions, confounding of IV and other variables, self-selection of participants.


NB: Random assignment is not the same as drawing a random sample from a population!

Indeed, experimenters often prefer homogeneous samples (e.g., male undergraduate students) to random samples from a larger population (of people). Why?

Between-subjects and within-subjects designs:

In a between-subjects design, participants are randomly assigned to groups representing various levels of the IV (or combinations of IV levels). The DV is assessed after the treatment has been applied in each group; DV means are compared between groups.

In a within-subjects design, the DV is measured at two or more points in time (e.g. before and after treatment), and the differences in scores within each participant are analysed.

What are the advantages and disadvantages of each approach?

Both techniques can be combined, yielding mixed designs.

(For discussion of design issues, see Aronson et al., 1990, ch. 4)

2.3 Limitations and Criticism

Ethical and practical limitations to experimentation.

Examples of causal research questions that have typically been addressed non-experimentally:

Does TV violence cause an increase in violent behaviour of viewers?

Does clinical depression cause deficits in social behaviour?

Criticism of experimental research:

 artificial

 can’t be generalised to real life


How valid are these criticisms?

 experimental versus mundane realism (Milgram)

 aim: testing hypotheses by operationalising constructs of a theory as variables (not: simulating reality)

 conceptual replications

Ethical considerations

 informed consent

 confidentiality

 deception: avoid / minimise! If unavoidable: debriefing!

2.4 Some Terminology

(see textbooks by Aronson et al. and Cook & Campbell for revision)

 randomisation = random assignment

 field experiment : an experiment conducted in a natural setting

 quasi-experiment : a study comparing nonequivalent groups where random assignment is impossible

 factor = IV

 conditions, treatments = levels of IV or combinations of levels of IVs

 operationalisation = empirical realisation of theoretical constructs, including measurement in the case of DVs

 manipulation check = a DV designed to check if a factor has the intended effect; often used as a mediator in analysis

 factorial design / crossing / nesting

 control group (control condition) = a baseline condition with which a treatment or combination of treatments is compared


 between-subjects design / within-subjects design / mixed design

 elimination / systematic variation / matching = different ways of controlling for known or suspected confounds

 interaction effect = present when the effect of an IV on a DV changes across levels or combinations of levels of other IVs

 conceptual replication = repeating a study with different empirical realisations of the same conceptual variables

 the concept of error :

 random error

 systematic error

 internal and external validity

 artefact versus alternative explanation

3. Comparing Means: t-Tests and the Analysis of Variance

3.1 t-Test for Independent Samples

We have already used the t-statistic in testing the significance of correlation and regression coefficients. It is also used for comparing means from independent samples (which, as we have seen, can be reconstructed as a problem of correlation). t can be defined as the difference between the two means divided by an estimate of the standard error of this difference (see Howell, 1997, Chapter 7, for equations and examples).

t = (mean between-groups difference) / (standard error of mean between-groups difference)

Null hypothesis: The two population means are equal (i.e. their difference is equal to zero).


Under H0, the t statistic is distributed in a “Student’s t-distribution” -- actually a family of distributions defined by their df parameter. It is symmetrical and bell-shaped, but with heavier tails than a normal distribution.

df = N1 + N2 – 2

The larger the df parameter (i.e. the larger the sample sizes), the more similar the t-distribution is to a standard normal distribution. Formally, t → z for df → ∞ (see Table of t, e.g. Howell, 1997, p. 683).

Larger absolute values of t indicate a larger effect.

The sign of t indicates the direction of the mean difference (as the standard error in the denominator of the t equation is positive by definition, the sign of t depends on which mean is subtracted from the other).

Assumptions underlying the t-test for independent samples:

1. Independence of observations within and between groups.

2. Normal distribution of population raw scores.

3. Equal (homogeneous) variances of the scores in the two populations.

(A computational alternative for unequal variances exists.)

Simple computational example:

        Group 1   Group 2
          3         5
          4         5
          5         7
          6         7
__________________________
Mean     4.5       6.0

Mean difference: -1.5   Standard error of difference: 0.866

t(6) = -1.732
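The hand computation can be replicated in a few lines of Python (standard library only). The grouping assumed below, Group 1 = {3, 4, 5, 6} and Group 2 = {5, 5, 7, 7}, reproduces the reported means of 4.5 and 6.0.

```python
import math

group1 = [3, 4, 5, 6]   # mean 4.5
group2 = [5, 5, 7, 7]   # mean 6.0

def mean(xs):
    return sum(xs) / len(xs)

def ss(xs):
    """Sum of squared deviations from the group mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

n1, n2 = len(group1), len(group2)
# Pooled variance: within-group SS summed, divided by pooled df
pooled_var = (ss(group1) + ss(group2)) / (n1 + n2 - 2)
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t = (mean(group1) - mean(group2)) / se_diff
df = n1 + n2 - 2

print(round(se_diff, 3), round(t, 3), df)  # 0.866 -1.732 6
```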


Computation via SPSS using the “Compare Means” – “Independent Samples T-Test” option with default settings (the grouping variable or IV is called GROUP, the DV is called DV):

Syntax:

T-TEST

GROUPS=group(1 2)

/MISSING=ANALYSIS

/VARIABLES=dv

/CRITERIA=CIN(.95) .

Output:

T-Test

[Output table omitted: Independent Samples Test, reporting Levene’s test for equality of variances and t-test results with equal variances assumed and not assumed.]

Note that a test of equality of variances is provided, as well as two t-tests (one assuming homogeneity, one not assuming it).
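The “equal variances not assumed” line is Welch’s t-test, which replaces the pooled standard error with separate per-group variances and adjusts the df. A sketch using the same example data follows; the Welch-Satterthwaite df formula below is the standard one, offered here as an illustration rather than as a transcript of the SPSS table.

```python
import math

group1 = [3, 4, 5, 6]
group2 = [5, 5, 7, 7]

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    """Unbiased sample variance."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

n1, n2 = len(group1), len(group2)
v1, v2 = var(group1) / n1, var(group2) / n2   # per-group variance of the mean

t = (mean(group1) - mean(group2)) / math.sqrt(v1 + v2)
# Welch-Satterthwaite approximation to the df
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(round(t, 3), round(df, 2))  # -1.732 5.93
```

With equal group sizes the Welch t equals the pooled t; only the df shrink slightly.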

For comparison: A point-biserial correlation between the variables GROUP and DV yields the following result:

[Output table omitted: Correlations between GROUP and DV, with two-tailed significance.]


The identical p-values indicate that t-tests for the difference between means and for the correlation between the grouping variable and the DV are indeed equivalent.

t = r · √df / √(1 – r²)

Just as t is a significance test for r , r (defined as above) is a convenient effect-size measure for the difference between independent means.
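Rearranging that identity gives the effect-size r directly from a reported t and its df, r = √(t² / (t² + df)). For the example above:

```python
import math

def r_from_t(t, df):
    # Effect-size r recovered from an independent-samples t and its df
    return math.sqrt(t ** 2 / (t ** 2 + df))

print(round(r_from_t(-1.732, 6), 3))  # 0.577
```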

Another common effect size measure is Cohen’s d, which is defined as the difference between means divided by an estimate of the population standard deviation of scores:

d = (M1 – M2) / s

For groups of equal size (N1 = N2):

d = 2t / √df

This can be used for comparing effect sizes of published studies, even if just the df are reported but the n’s of the groups being compared are unknown. If n’s are equal, d is estimated accurately. If sample sizes are unequal, d is a conservative estimate, as it underestimates the true effect size.
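Applied to the example data (t(6) = -1.732 with equal group sizes), the df-based formula gives:

```python
import math

def d_from_t(t, df):
    # Cohen's d estimated from t and df (equal group sizes assumed)
    return 2 * t / math.sqrt(df)

print(round(abs(d_from_t(-1.732, 6)), 2))  # 1.41
```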


3.2 t-Test for Matched Pairs of Observations

Examples:

 before-after measurements of one variable

 comparison of two variables assessed from the same participants (e.g. attitudes toward different objects)

 correlated observations (e.g. relationship satisfaction of husbands and wives with couples as the unit of observation)

 yoked experimental designs

t is now defined as the mean of the differences between the two measurements in each pair divided by the standard error of that mean (see Howell, 1997, Chapter 7, for equation).

The standard error of the mean is defined here as usual: standard deviation of differences divided by the square root of N (N being the number of pairs of scores, not the number of scores).

For this application of the t-test, df = N – 1.

Assume that the data in the above example were measurements of 4 persons’ attitudes before and after an influence attempt:

        before   after   difference
P1        3        5        -2
P2        4        5        -1
P3        5        7        -2
P4        6        7        -1
____________________________________
Mean     4.5      6.0      -1.5

The mean of differences is of course identical to the difference of means, so the numerator of the t ratio remains the same as in the case of independent samples, but both the denominator and the df parameter change .
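Both of those changes can be seen by computing the matched-pairs t for these data in a few lines of Python (standard library only):

```python
import math

before = [3, 4, 5, 6]
after = [5, 5, 7, 7]
diffs = [b - a for b, a in zip(before, after)]       # [-2, -1, -2, -1]

n = len(diffs)
mean_d = sum(diffs) / n                               # -1.5, same numerator
sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
se_d = sd_d / math.sqrt(n)                            # new denominator
t = mean_d / se_d
df = n - 1                                            # new df

print(round(se_d, 3), round(t, 3), df)  # 0.289 -5.196 3
```

The numerator is still -1.5, but the standard error drops from 0.866 to 0.289, so t grows from -1.732 to -5.196 (at the cost of df falling from 6 to 3).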


Cohen’s d can again be used as an effect size measure, but for matched pairs it is defined as:

d = t / √df
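Using the t value that follows from the before/after data above (t = -5.196 with df = 3), the matched-pairs d works out to about 3:

```python
import math

# Cohen's d for matched pairs, computed from t and df
t, df = -5.196, 3          # values derived from the before/after example data
d = t / math.sqrt(df)
print(round(abs(d), 2))  # 3.0
```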

Computation via SPSS, using the “Compare Means” – “Paired-Samples T-Test” option with default settings (variables named BEFORE and AFTER) yields the following:

Syntax:

T-TEST

PAIRS= before WITH after (PAIRED)

/CRITERIA=CIN(.95)

/MISSING=ANALYSIS.

Output:

T-Test

[Output tables omitted: Paired Samples Statistics; Paired Samples Correlations (r = .894); Paired Samples Test.]


Note that the mean difference is significantly different from zero in this analysis, while the difference between the means was not in the analysis that treated the observations as independent.

This is because the before-after observations are highly correlated across persons, and this correlation is ignored by the independent-samples analysis.

Generally, a matched-pairs test has greater power than an independent-samples test to the extent that the paired observations are actually correlated. If this correlation is low, however, matched-pairs tests can be weaker, especially with small samples, because their df parameter is half that of a comparable independent-samples test.

The reported correlation coefficient of .894 is not identical to the effect size measure r in the case of independent samples (which was .577 for the same data treated as independent). The former is the correlation between the before and after scores across four paired observations; the latter is the correlation between the DV scores and the dichotomous grouping variable across eight independent observations.
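Both coefficients can be checked from the raw data, since each is just an ordinary Pearson correlation over a different arrangement of the scores:

```python
import math

def pearson_r(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

before = [3, 4, 5, 6]
after = [5, 5, 7, 7]

# (1) correlation between before and after across 4 paired observations
r_pairs = pearson_r(before, after)

# (2) point-biserial: DV against the dichotomous grouping variable
#     across 8 independent observations
group = [1, 1, 1, 1, 2, 2, 2, 2]
dv = before + after
r_pb = pearson_r(group, dv)

print(round(r_pairs, 3), round(r_pb, 3))  # 0.894 0.577
```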

3.3 One-Factor ANOVA

To compare more than two groups or the differences among more than two related measurements, we use analysis of variance (ANOVA). This also exists in two versions, one for k independent groups and one for k repeated measurements.

(For k = 2, these ANOVAs are equivalent to t -tests.)

We will first discuss the case of k independent groups: one-factor between-subjects ANOVA (or one-way analysis of variance; see Howell, 1997, Chapter 11):

The null hypothesis tested here is that all k group means come from populations with the same mean μ:

μ1 = μ2 = μ3 = ... = μk = μ


Briefly, this is tested by computing a ratio of “variance between the groups” divided by “variance within the groups” (hence “analysis of variance”, although mean differences are tested). The former variance represents variation due to both chance and systematic effects, the latter represents variation due to chance alone. Thus the larger this ratio, called F, the more likely there is in fact some systematic difference among the group means.

How is this done?

Sums of squares:

This concept should be familiar from MRA.

In one-factor ANOVA, we compute three sums of squares (SS):

SS_total, based on the deviation of observed scores from the Grand Mean;

SS_treatment, based on the deviation of “predicted” scores from the grand mean (“predicted” here simply means that each score is replaced by its group mean); and

SS_error, based on the deviation of each score from its group mean.

These SS are additive:

SS_total = SS_treatment + SS_error

Each score’s deviation from the grand mean (M) can be written as a sum of two deviations:

y_ij – M = (y_ij – M_j) + (M_j – M)

Squaring both sides of the equation and summing across all scores in each condition yields:

Σ_i Σ_j (y_ij – M)² = Σ_i Σ_j [(y_ij – M_j) + (M_j – M)]²


The right-hand side of this equation can be rewritten as:

Σ_i Σ_j (y_ij – M_j)² + Σ_i Σ_j (M_j – M)² + 2 Σ_i Σ_j (y_ij – M_j)(M_j – M)

Fortunately, the rightmost element in the sum equals zero (because simple deviations from a mean, summed over all scores, equal zero by definition).

So, indeed, we get two SS terms that are additive in that they sum up to the total SS:

Σ_i Σ_j (y_ij – M)² = Σ_i Σ_j (y_ij – M_j)² + Σ_i Σ_j (M_j – M)²

or

Σ_i Σ_j (y_ij – M)² = Σ_i Σ_j (y_ij – M_j)² + Σ_j n_j (M_j – M)²
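The additivity can be verified numerically. Using the two-group data from the t-test example (Group 1: 3, 4, 5, 6; Group 2: 5, 5, 7, 7):

```python
groups = [[3, 4, 5, 6], [5, 5, 7, 7]]
scores = [y for g in groups for y in g]
grand_mean = sum(scores) / len(scores)            # 5.25

ss_total = sum((y - grand_mean) ** 2 for y in scores)

group_means = [sum(g) / len(g) for g in groups]   # 4.5, 6.0
ss_error = sum((y - m) ** 2
               for g, m in zip(groups, group_means) for y in g)
ss_treatment = sum(len(g) * (m - grand_mean) ** 2
                   for g, m in zip(groups, group_means))

print(ss_total, ss_treatment + ss_error)  # 13.5 13.5
```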

Note the similarities to MRA:

 The error SS is defined exactly as in MRA, only the “best guess” defining a predicted score is not a point on the regression line, but the mean of the group the score is in.

 The treatment SS corresponds to the increase in prediction we gain from knowing a score’s group membership (if we didn’t know, our best guess would be the grand mean).

To derive a significance test, we first compute variances or mean squares (MS), dividing each SS by its associated df:

df_treatment = k – 1
df_error = N – k

(k being the number of independent groups)

MS_treatment = SS_treatment / (k – 1)
MS_error = SS_error / (N – k)


Finally, the ratio of the two mean squares is computed, yielding F, the test statistic in the analysis of variance:

F = MS_treatment / MS_error

The distribution of the F ratio is defined by the two df parameters. The df_treatment are often called numerator df and the df_error are often called denominator df (e.g. in SPSS output).

When reporting an F value, be sure to report both associated df parameters. APA style also requires reporting of MS_error (abbreviated MSE) along with the ANOVA results. Example:

“An analysis of variance on the agreement index (MSE = 1.22) revealed that smokers agreed less with the proposal to cut down on smoking (M = 4.53, SD = 1.56) than did nonsmokers (M = 5.94, SD = 1.61), F(1, 212) = 46.24, p < .001.”

The F distribution is defined only in the range of zero to positive infinity. With numerator df = 1, it is identical to a squared t distribution:

F(1, x) = t(x)²

F distributions are positively skewed. With increasing numerator df, they become more symmetrical and bell-shaped.
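The identity can be checked directly on the two-group example: the one-way ANOVA F equals the square of the independent-samples t(6) = -1.732.

```python
groups = [[3, 4, 5, 6], [5, 5, 7, 7]]
k = len(groups)
scores = [y for g in groups for y in g]
n_total = len(scores)
grand_mean = sum(scores) / n_total

group_means = [sum(g) / len(g) for g in groups]
ss_treatment = sum(len(g) * (m - grand_mean) ** 2
                   for g, m in zip(groups, group_means))
ss_error = sum((y - m) ** 2
               for g, m in zip(groups, group_means) for y in g)

f = (ss_treatment / (k - 1)) / (ss_error / (n_total - k))
t = -1.732  # from the independent-samples t-test example above

print(round(f, 2), round(t ** 2, 2))  # 3.0 3.0
```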

Assumptions of ANOVA:

Similar to the assumptions of the t-test; normal distribution and equality of variances now simply apply to all groups to be compared.

Computational example for a three-group experiment

Using the scores of the two groups from our independent t-test example and adding a third group with the four scores 2, 3, 5, 6, I ran SPSS “Compare Means” – “Oneway ANOVA”, with default options plus descriptive statistics and a test of homogeneity of variances:


Syntax:

ONEWAY

dv BY group

/STATISTICS DESCRIPTIVES HOMOGENEITY

/MISSING ANALYSIS .

Output:

Oneway

[Output tables omitted: Descriptives; Test of Homogeneity of Variances; ANOVA.]
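The ANOVA for this three-group example follows directly from the data; a Python sketch of the full sums-of-squares computation:

```python
groups = [[3, 4, 5, 6], [5, 5, 7, 7], [2, 3, 5, 6]]
k = len(groups)
scores = [y for g in groups for y in g]
n_total = len(scores)                                 # 12
grand_mean = sum(scores) / n_total

group_means = [sum(g) / len(g) for g in groups]       # 4.5, 6.0, 4.0
ss_treatment = sum(len(g) * (m - grand_mean) ** 2
                   for g, m in zip(groups, group_means))
ss_error = sum((y - m) ** 2
               for g, m in zip(groups, group_means) for y in g)

df_treatment, df_error = k - 1, n_total - k           # 2, 9
ms_treatment = ss_treatment / df_treatment
ms_error = ss_error / df_error
f = ms_treatment / ms_error

print(round(ss_treatment, 2), round(ss_error, 2), round(f, 2))
# 8.67 19.0 2.05
```

With F(2, 9) = 2.05 the critical value at α = .05 is not reached, so the three group means do not differ significantly here.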
