The Analysis of Population-Based Survey Experiments

Diana C. Mutz

University of Pennsylvania

Simple, straightforward

No fancy statistical techniques required

Very few questions required

Comparison of means (analysis of variance)

Many problems result from using observational analysis techniques on experimental data

People make it more complicated than it needs to be!
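As an illustration of how little machinery a comparison of means requires, here is a minimal sketch of a one-way analysis of variance in plain Python. The function name `one_way_anova_f` and the outcome scores are hypothetical, made up for this example; only the three condition labels echo the slides.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    group_means = [mean(g) for g in groups]
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)

    # Variation of the condition means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of respondents around their own condition mean
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical outcome scores for three experimental conditions
control = [4, 5, 6, 5, 4, 6]
positive = [6, 7, 8, 7, 6, 8]
negative = [3, 4, 5, 4, 3, 5]

f_stat, df_between, df_within = one_way_anova_f([control, positive, negative])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

The F statistic would then be compared against the F distribution with the reported degrees of freedom; a statistics package can supply the p-value.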

1. Well measured Dependent Variable(s)

2. Manipulation check (to ensure that the Independent Variable was successfully manipulated by the experimental treatment)

Causality requires meeting only 3 conditions:

1. Association (The easy part!)

2. Precedence in Time of Independent Variable

(We manipulate the Independent Variable)

3. Non-spuriousness of relationship

(Random assignment eliminates this problem)

1. Well measured Dependent Variable

2. Manipulation checks (to ensure that the Independent Variable was successfully manipulated by the experimental treatment)

OPTIONAL:

1. Potential Moderators/Contingent conditions

2. Covariates

Does Social Trust Influence Willingness to Engage in Online Economic Transactions?

Experimental conditions: Control; Positive Social Trust; Negative Social Trust

1. Randomization checks/Balance tests

2. Statistical models for analysis

3. Weighting data to population parameters

4. Use and misuse of covariates

Randomization checks/balance tests: They can’t tell us what we want to know, and they can lead to inferior model choices

Statistical models for analyzing population-based survey experiments often altogether ignore the fact that they are, indeed, experiments.

We assume:

Researcher has control over assignment to conditions

Respondents do not undergo attrition differentially as a result of assignment to a specific experimental condition

Researcher can ensure that those assigned to a given treatment are, in fact, exposed to treatment.

If any one of those 3 requirements is not met, then balance tests can make sense

If the randomization mechanism requires pretesting, then balance tests make sense

Otherwise, not.

Rationales for balance tests

Credibility of findings

Efficiency of analyses

Lack of faith in, or lack of a thorough understanding of, probability theory

Confusion between frequentist and Bayesian paradigms

Mistakenly applying methods for observational analyses to experimental results

Field experimental literature in which exposure to treatment cannot always be controlled

What does it mean for a randomization to “succeed”?

A well-executed random assignment to experimental conditions does not promise to make experimental groups equal on all possible characteristics, or even a specified subset of them.

“Because the null hypothesis here is that the samples were randomly drawn from the same population, it is true by definition, and needs no data.” (Abelson)

Randomization checks are “philosophically unsound, of no practical value, and potentially misleading.” (Senn)

“Any other purpose [than to test the randomization mechanism] for conducting such a test is fallacious.” (Imai et al.)

“p<.05” already includes the probability that randomization might have produced an unlikely result

Thus experimental findings are credible without any balance tests at all.
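The point about “p<.05” can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not a procedure from the slides: it repeatedly runs a two-condition experiment in which the treatment has no effect at all, using a large-sample z-test as a stand-in for the significance test. All names (`two_sided_p`, `simulate_null_experiments`) and settings (2000 experiments, 200 respondents, a covariate `x` that strongly predicts the outcome `y`) are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance

def two_sided_p(a, b):
    """Large-sample z-test p-value for a difference between two group means."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def simulate_null_experiments(n_experiments=2000, n_subjects=200, seed=1):
    rng = random.Random(seed)
    false_positives = 0
    imbalanced = 0
    for _ in range(n_experiments):
        # x is a background characteristic; y depends on x but NOT on treatment
        x = [rng.gauss(0, 1) for _ in range(n_subjects)]
        y = [xi + rng.gauss(0, 1) for xi in x]

        # Simple random assignment: shuffle respondents, split in half
        ids = list(range(n_subjects))
        rng.shuffle(ids)
        treated, control = ids[: n_subjects // 2], ids[n_subjects // 2:]

        if two_sided_p([y[i] for i in treated], [y[i] for i in control]) < 0.05:
            false_positives += 1
        if two_sided_p([x[i] for i in treated], [x[i] for i in control]) < 0.05:
            imbalanced += 1
    return false_positives / n_experiments, imbalanced / n_experiments

false_positive_rate, imbalance_rate = simulate_null_experiments()
print(f"false positives on y: {false_positive_rate:.3f}")
print(f"'failed' balance tests on x: {imbalance_rate:.3f}")
```

Both rates should land near .05. Chance imbalances on `x` occur at the expected rate, and the false-positive rate on `y` stays at its nominal level without any balance test being consulted.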

Can balance tests profitably inform the analyses of results?

What should one do if a balance test fails?

Inclusion of covariates

Post-stratification

Re-randomization

Is a failed balance test useful for purposes of choosing covariates?

Covariates should be chosen in advance, not based on the data.

Covariates are chosen for anticipated relationship with the DV; balance tests evaluate the relationship with the IV.

So is a balance test informative for model selection?

NO!

If inclusion of a variable as a covariate in the model will increase the efficiency of an analysis, then it would have done so, and to a slightly greater extent, had it not failed the balance test.

Thus balance tests are uninformative when it comes to the selection of covariates.

“Failed” randomization with respect to a covariate should not lead a researcher to include that covariate in the model. If the researcher plans to include a covariate for the sake of efficiency, it should be included in the model regardless of the outcome of a balance test.
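To illustrate why a covariate is chosen for its relationship with the DV, the following sketch compares the sampling variability of the estimated treatment effect with and without a pre-specified covariate. It is a simulation under assumed values (true effect of 0.5, a covariate `x` with a coefficient of 1, n = 100), and the helper names are hypothetical. The adjusted estimate is the coefficient on treatment from a regression of `y` on treatment and `x`, computed here by residualizing both on `x`.

```python
import random
from statistics import mean, stdev

def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals from regressing y on x (with an intercept)."""
    b = slope(x, y)
    mx, my = mean(x), mean(y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]

def simulate(n_reps=500, n=100, effect=0.5, seed=7):
    rng = random.Random(seed)
    unadjusted, adjusted = [], []
    for _ in range(n_reps):
        x = [rng.gauss(0, 1) for _ in range(n)]   # pretest covariate
        t = [1] * (n // 2) + [0] * (n - n // 2)   # treatment indicator
        rng.shuffle(t)                            # random assignment
        y = [effect * ti + xi + rng.gauss(0, 1) for ti, xi in zip(t, x)]

        # With a binary regressor, the slope equals the difference in means
        unadjusted.append(slope(t, y))
        # Coefficient on t in y ~ t + x, via residualizing t and y on x
        adjusted.append(slope(residuals(x, t), residuals(x, y)))
    return stdev(unadjusted), stdev(adjusted)

sd_unadjusted, sd_adjusted = simulate()
print(f"SD of unadjusted estimates: {sd_unadjusted:.3f}")
print(f"SD of covariate-adjusted estimates: {sd_adjusted:.3f}")
```

The gain in precision comes entirely from how well `x` predicts `y`. Nothing in the calculation depends on whether `x` happened to differ across conditions in a given sample.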

Changes the appropriate p-value

Always excludes X: $p_1$

Always includes X: $p_2$

Not the same p-value that should result after the 2-stage process

But most researchers simply report $p_1$ or $p_2$

If they have no implications for the credibility of our findings…

If they cannot improve the efficiency of our analyses…

They can’t tell us what we want to know

They can lead to inferior model choices

They can lead to unjustified changes in the interpretation of findings

Balance tests do not provide rationales for including additional variables

Three examples of model and analysis choices made for the wrong reasons

EXAMPLE 1: “In order to ensure that the experimental conditions were randomly distributed—thus establishing the internal validity of our experiment—we performed difference of means tests on the demographic composition of the subjects assigned to each of the three experimental conditions.”

“Having established the random assignment of experimental conditions, regression analysis of our data is not required; we need only perform an analysis of variance (ANOVA) to test our hypotheses as the control variables that would be employed in a regression were randomly distributed between the three experimental conditions.”

EXAMPLE 2:

Five dummies for 6 conditions

Regressions run amok with survey-experimental findings!

Regression versus analysis of variance is a red herring. So are balance tests.

Especially in an experimental analysis, everything needs a reason for being there.

True experiments should not have “control” variables! (A few covariates are OK.)

The presence of unnecessary variables in a statistical model should be viewed with suspicion; they can hurt and bias results.

Should population-based experiments use population weights supplied by survey houses?

Some studies do, some don’t; no particular rationale typically given

No one correct answer but need to consider:

Possibility of heterogeneous effects

Power needs

Emphasis on generalizability

1. No use of weights

2. Weighting sample as a whole to underlying population parameters

3. Weighting formulated so that individual experimental conditions reflect population parameters

Either form of weighting, whole-sample or within-condition, increases generalizability to the full population; within-condition weighting is better at reducing noise due to uneven randomization

But all weighting sacrifices power.

If all the full-sample weights are squared for a sample of size n, and then summed across all subjects, this sum (call it $M_1$) provides a sense of just how much power is lost through weighting:

$$\text{proportion of power lost} = 1 - \frac{n}{M_1}$$

If $M_1 = 3000$ and $n = 2000$, then the equation will come out to .33.

Weighting in this example lowers power as if we had reduced the sample size by one-third.

Instead of a sample of 2000, we effectively have the power of a sample size of 1340.
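The calculation can be sketched in a few lines of Python. The function names and example weights below are hypothetical. One assumption is added that the slides do not state: the weights are rescaled to average 1 (that is, to sum to n) before squaring, since the formula only gives a loss between 0 and 1 under that scaling.

```python
def power_loss(n, m):
    """Proportion of power lost, given sample size n and sum of squared weights m."""
    return 1 - n / m

def power_loss_from_weights(weights):
    """Return (proportion of power lost, effective sample size) for raw weights."""
    n = len(weights)
    total = sum(weights)
    # Assumption: rescale so the weights average 1 (sum to n)
    scaled = [w * n / total for w in weights]
    m = sum(w ** 2 for w in scaled)
    loss = power_loss(n, m)
    effective_n = n * (1 - loss)
    return loss, effective_n

# The example from the slides: M = 3000, n = 2000
print(round(power_loss(2000, 3000), 2))

# Hypothetical weights: half the sample weighted 0.5, half weighted 1.5
loss, effective_n = power_loss_from_weights([0.5] * 1000 + [1.5] * 1000)
print(round(loss, 2), round(effective_n))
```

Without rounding, the effective sample size in the slides' example is 2000 × (1 − 1/3), about 1333; the figure of 1340 follows from rounding the loss to .33 first. The same function can be run on whole-sample and within-condition weights to compare the two losses.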

Calculate via the same formula for within-condition weights

Compare loss of power in within versus whole sample weighting

Request both whole-sample and within-condition weights

Decision can be made on basis of importance of power relative to generalizability

Ultimately depends on expectations about heterogeneity of effects.

Because population-based survey experiments involve survey data, they are often analyzed as if they were observational studies

Mistaken use of unnecessary “control” variables

Not a cure for an unlucky randomization (which isn’t necessary in any case)

But what’s the harm? Biased results

EXAMPLE 3:

Treatment effects and their interactions with other variables

But then what are these?

“Control variables”

To improve efficiency when selected in advance from pretest measures based on advance knowledge of predictors of dependent variable

Better yet, use blocking if equality across conditions on that particular variable is THAT important.
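A blocked random assignment can be sketched as follows. This is one simple way to do it, not a procedure taken from the slides: respondents are grouped by the blocking variable, shuffled within each block, and dealt out to conditions in turn, so each condition gets an equal share of every block (up to remainders). The function name, respondent ids, and the "low_trust"/"high_trust" blocks are hypothetical.

```python
import random
from collections import Counter, defaultdict

def blocked_assignment(block_of, conditions, seed=None):
    """Randomly assign respondents to conditions separately within each block.

    block_of: dict mapping respondent id -> block label (from a pretest measure)
    conditions: list of condition names
    """
    rng = random.Random(seed)
    by_block = defaultdict(list)
    for rid, block in block_of.items():
        by_block[block].append(rid)

    assignment = {}
    for members in by_block.values():
        rng.shuffle(members)            # random order within the block
        order = list(conditions)
        rng.shuffle(order)              # so no condition always gets leftovers
        for i, rid in enumerate(members):
            assignment[rid] = order[i % len(order)]
    return assignment

# Hypothetical sample: 12 respondents, blocked on a pretest trust measure
block_of = {f"r{i}": ("high_trust" if i % 2 else "low_trust") for i in range(12)}
assignment = blocked_assignment(
    block_of, ["control", "positive", "negative"], seed=42
)

# Count respondents in each block-by-condition cell
counts = Counter((block_of[rid], cond) for rid, cond in assignment.items())
for cell, count in sorted(counts.items()):
    print(cell, count)
```

With six respondents per block and three conditions, every block-by-condition cell holds exactly two respondents, so equality on the blocking variable is guaranteed by design rather than checked afterward.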

Having too many available variables leads to suboptimal data analysis practices.

Researchers need to rely more on the elegance and simplicity of their experimental designs.

Equations chock full of “control” variables demonstrate a fundamental misunderstanding of how experiments work.

Failed randomization checks should never be used as a rationale for inclusion of a particular covariate.
