SS10.4

Resolving the Goldilocks
problem: Presenting results
Jane E. Miller, PhD
The Chicago Guide to Writing about Multivariate Analysis, 2nd edition.
Overview
• Labeling of variables
• Identifying model specifications
• Contents of descriptive statistics tables
• Prose interpretation of multivariate coefficients
– In the results section
– In the discussion section
Guidelines for effective labeling:
Conveying levels of measurement,
units and categories
Basic attributes of variables
• For every variable in your analysis, convey
– Units of measurement
• System of measurement (e.g., British or metric?)
• Scale (e.g., millimeters or meters?)
• Level of aggregation (e.g., weekly or annual income?)
– Names of categories, if nominal or ordinal
• Label them accordingly in
– Text
– Tables of univariate, bivariate, and multivariate
statistics
– Charts
Common errors in labeling units
• Calling a proportion a percentage (or vice
versa)
– Check axis scales that are automatically generated
by Excel or other graphing programs.
• If your data are stored as proportions, that is how they
will be graphed.
• When you add labels, make them match the actual
units.
• Forgetting to specify the level of aggregation
– E.g., deaths per 1,000 persons is not the same as
deaths per 100,000.
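The two labeling checks above can be sketched in a few lines of Python. This is only an illustration: the function names and the counts (25 deaths in a population of 50,000) are made up for the example and do not come from the book.

```python
def proportion_to_percentage(proportion):
    """Convert a proportion (0 to 1) to a percentage (0 to 100)."""
    return proportion * 100


def rate_per(events, population, per):
    """Return the number of events per `per` persons (the level of aggregation)."""
    return events * per / population


def labeled_rate(events, population, per):
    """Build a label that states the level of aggregation explicitly."""
    return f"{rate_per(events, population, per):.1f} deaths per {per:,} persons"


# Hypothetical data: 25 deaths in a population of 50,000.
print(labeled_rate(25, 50_000, 1_000))
print(labeled_rate(25, 50_000, 100_000))

# A value stored as a proportion must be converted before it is labeled "%".
print(f"{proportion_to_percentage(0.25):.0f}%")
```

The same hypothetical counts give 0.5 deaths per 1,000 persons and 50.0 deaths per 100,000 persons, so a number reported without its denominator cannot be interpreted.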
Tables of multivariate statistics
• To provide the information needed to
correctly interpret βs for each variable in the
multivariate model, carefully label
– level of aggregation
– units
– categories (ranges)
• Labels and titles should reflect whatever
version of the variable is included in the
specification.
– Transformed or original
Units for transformed variables
• Label the units in multivariate tables, charts,
and prose as they appear in the model.
– This matters whenever you have transformed one or more
continuous independent or dependent variables before
specifying the model. E.g.,
• Changed the scale
• Taken logs
• Changed the level of aggregation
• Created a categorical version of a continuous variable
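As a rough sketch of the four transformations listed above, the Python below keeps each version of a variable next to the label that should accompany it. The income values, the category cutoffs, and the label wording are hypothetical choices made for this example.

```python
import math

# Hypothetical annual family incomes in dollars.
income_dollars = [25_000, 40_000, 100_000]


def income_category(dollars):
    """Assign an illustrative (made-up) income range."""
    if dollars < 30_000:
        return "<$30,000"
    if dollars < 60_000:
        return "$30,000-$59,999"
    return "$60,000+"


# Each key is the label to use in tables, charts, and prose for that
# version of the variable; only one version enters a given model.
versions = {
    "Annual family income ($)": income_dollars,
    "Annual family income ($1,000s)": [d / 1_000 for d in income_dollars],   # changed scale
    "Ln(annual family income in $)": [math.log(d) for d in income_dollars],  # taken logs
    "Monthly family income ($)": [d / 12 for d in income_dollars],           # changed aggregation
    "Annual family income category": [income_category(d) for d in income_dollars],
}

for label, values in versions.items():
    print(label, values)
```

Storing the label with the transformed values is one way to keep the table heading in step with the version actually used in the specification.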
Conveying model specifications in
multivariate tables and charts
• Clearly name types of model specification in
tables and charts.
– Mention type of model in the title.
• E.g., “Standardized coefficients from a model of . . .”
– Or label columns or axes to identify:
• Standardized coefficients
• Logged dependent variables
– Identify logged independent variables in their respective row
labels.
Contents of tables and charts
Background information for contrasts
• Anticipate the metrics and contrasts you will
use as you write about your βs. E.g.,
– Interquartile range
– Multiples of standard deviations
– Percentiles in a reference distribution
• Report the corresponding statistics in tables
about your own data.
• Cite a source for complex reference data.
• E.g., Federal Poverty Levels for different sizes and age
compositions of household.
Tables of descriptive statistics
• To provide readers with the basis of the numeric
contrasts in the associated prose:
• If your models include logged versions of variable(s),
report descriptive statistics in both
– The original, untransformed units
– The logged units
• If you use empirically based contrasts to interpret βs,
include those values in the table of descriptive
statistics. E.g.,
– Interquartile range
– Standard deviation and mean
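One possible way to assemble such a table is sketched below with Python's standard `statistics` module. The income values are hypothetical, and the choice of quartile method (the module's default) is an assumption; other software may compute quartiles slightly differently.

```python
import math
import statistics


def describe(values):
    """Return the descriptive statistics that empirically based contrasts rely on."""
    q1, _median, q3 = statistics.quantiles(values, n=4)  # three quartile cut points
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
    }


# Hypothetical incomes in $1,000s.
income = [12, 18, 25, 31, 40, 52, 67, 85, 120]

# Report both the original and the logged units, each under its own label.
table = {
    "Income ($1,000s)": describe(income),
    "Ln(income in $1,000s)": describe([math.log(v) for v in income]),
}

for label, stats in table.items():
    print(label, {name: round(value, 2) for name, value in stats.items()})
```

Readers can then see the interquartile range or standard deviation that a later sentence applies to a β, in the same units as the model.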
Interpreting multivariate
coefficients
Presenting results to minimize
Goldilocks problems
• Explicitly identify the nature and size of the
contrast for each independent variable as you
interpret its estimated coefficient.
– Continuous or categorical?
– Units or categories being compared?
– Size of numeric contrast applied to each β?
– Units of the effect measure?
Common pitfalls in reporting of
multivariate coefficients
• Simply reporting βs increases the chances of
Goldilocks mistakes by failing to remind readers to
consider
– variable types
– units
– range and scale
– categories
• The βs should be reported in your multivariate table.
• Don’t simply repeat (report) those values without
substantive interpretation.
Prose to avert Goldilocks errors in
interpretation of coefficients
• Apply carefully selected “right-sized” contrasts to
each β.
– Where needed, explain the criteria used to identify fitting
numeric contrasts for each of your key variables. E.g.,
• Cite sources of substantively relevant contrasts.
• Identify empirical contrasts by name. E.g.,
– Interquartile range
– Standard deviation difference
• Convey the units or categories of both your IV and
DV as you interpret the direction and magnitude of
the βs.
Reporting coefficients
on nominal variables
• Name the categories being compared.
– Such wording helps avoid implying that
• More than a 1-unit change in a dummy variable is
possible;
• Directionality of movement across categories pertains
to nominal variables.
– This distinction is especially important when
interpreting βs for both continuous and
categorical independent variables.
Interpreting β for a nominal variable
• Poor: “The β for boy was 116.0.”
– If reported in a series of coefficients, this version invites
readers to compare them directly, without factoring in that
only a “1-unit” contrast is possible for gender.
• Poor, version 2: “Gender is positively
associated with birth weight.”
– Cannot specify direction of association for a nominal
variable.
• Best: “At birth, boys weigh on average 116 grams
more than girls (p < 0.01).”
– Concepts, categories, units, direction, size, and statistical
significance.
Reporting coefficients
on ordinal variables
• For ordinal variables, including composite
scales or indexes composed from
ordinal items such as Likert scales:
– Name the categories and specify the reference
category.
– Can write about directionality of the association
because the categories are ordered.
• However, distance between ordinal categories cannot
be treated as equal.
• Numeric codes don’t have mathematical meaning.
Interpreting β for an ordinal variable
• Poor: “Self-rated health and mortality are
correlated.”
– Fails to convey direction or magnitude.
• Better: “Among middle-aged men, self-rated
health was inversely related to mortality.”
– Conveys directionality.
– Doesn’t name categories being compared or size
of mortality difference between them.
• Best: “Among middle-aged men, self-rated health
was inversely related to mortality. Relative risks of
mortality were 2.8, 2.2, and 1.9 for those who rated
their health ‘poor/fair’, ‘good’, and ‘very good’, when
each was compared with ‘excellent’ health (all p <
0.05).”
– Concepts, direction, magnitude, and statistical significance.
– By naming the categories, conveys what a move up the
self-rated scale from one category to the next represents
conceptually.
Interpreting coefficients on
continuous variables: Units
• Mention the units of your independent and
dependent variables as used in the regression.
– Units
• Original units or logged?
• Unstandardized βs or standardized βs?
– Level of aggregation, e.g.,
• Weekly or monthly income?
• Income in $1s or $1,000s?
• Individual or family or household income?
• Might differ from their original form in the data.
Interpreting β for a continuous variable
• Poor: “The β of income on mortality was
0.02.”
– By not mentioning the units of either income or
mortality:
• Leaves the size of the β open to misinterpretation.
• Makes it difficult to compare against βs on the same
topic by other authors.
• Better: “Each additional $10,000 in annual
family income was associated with a 2%
decrease in the age-standardized mortality
rate.”
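The slide does not say how the model behind this example was specified. Purely as an illustration, suppose the dependent variable was the natural log of the age-standardized mortality rate, income was measured in units of $10,000, and the estimated coefficient was -0.02. Under those assumptions, the percentage change implied by the coefficient could be computed as follows (the function name is hypothetical):

```python
import math


def percent_change_from_log_beta(beta, contrast=1.0):
    """Percent change in the DV for a `contrast`-unit increase in the IV,
    when the DV was transformed with a natural log before estimation."""
    return (math.exp(beta * contrast) - 1) * 100


# Assumed for illustration: beta = -0.02 per $10,000 of annual family income.
pct = percent_change_from_log_beta(-0.02)
print(f"Each additional $10,000 in annual family income: {pct:.1f}% change in the mortality rate")
```

With other specifications (for example, an unlogged rate or income in $1s), the same coefficient of 0.02 in absolute value would call for a very different sentence, which is why the units of both variables belong in the prose.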
Interpreting coefficients using
contrasts other than a 1-unit increase
• If you apply a contrast other than a 1-unit increase,
report the size of that contrast as you interpret the
pertinent β. E.g.,
– “Each five-year increase in mother’s age is associated with
a 53 gram increase in birth weight.”
– “Students whose SAT scores were one standard deviation
above the mean had 22% higher chances of graduating
from college within six years than those with SAT scores at the
population mean.”
– “Adults at the 25th percentile of BMI were only one-third
as likely to die as those at the 75th percentile.”
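The arithmetic behind the mother's age example can be sketched as below. It assumes a linear specification, so that 53 grams per five years corresponds to 10.6 grams per year; that per-year value is inferred for this illustration and is not reported on the slide. The function names are hypothetical.

```python
def effect_for_contrast(beta_per_unit, contrast):
    """Scale a per-unit coefficient to a chosen contrast (linear specification)."""
    return beta_per_unit * contrast


def contrast_sentence(iv, contrast_label, dv, dv_units, beta_per_unit, contrast):
    """Name the contrast, the units, the direction, and the size in one sentence."""
    effect = effect_for_contrast(beta_per_unit, contrast)
    direction = "increase" if effect > 0 else "decrease"
    return (
        f"Each {contrast_label} increase in {iv} is associated with a "
        f"{abs(effect):.1f} {dv_units} {direction} in {dv}."
    )


# Assumption: 53 grams per five years implies 10.6 grams per year if linear.
beta_per_year = 53 / 5

print(contrast_sentence("mother's age", "one-year", "birth weight", "gram", beta_per_year, 1))
print(contrast_sentence("mother's age", "five-year", "birth weight", "gram", beta_per_year, 5))
```

Both sentences describe the same coefficient; stating the size of the contrast is what lets readers tell a 10.6 gram difference from a 53 gram one.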
Goldilocks guidelines for
the discussion section
• Put your results in context in terms of both
– Topic
– Data
• Context
• Specific measures of the concepts under study
• Reiterate theoretical criteria that identify
meaningful contrasts for your topic. E.g.,
– Clinical cutoffs
– Program eligibility thresholds
Statistical significance versus
substantive importance
• Summarize what those contrasts show about your
key findings, differentiating between
– statistical significance
– substantive importance
• The contrasts could show that, once the metrics of the
variables were considered, one or more independent variables
did not have a substantively meaningful association
with the dependent variable,
– even if that association was
• statistically significant
• in the expected direction.
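A small sketch of keeping the two judgments separate is shown below. The effect size, p-value, and the 100 gram threshold are hypothetical; in practice the threshold would come from topic-specific criteria such as clinical cutoffs or program eligibility thresholds.

```python
def assess_finding(effect, p_value, meaningful_threshold, alpha=0.05):
    """Report statistical and substantive significance as separate judgments."""
    return {
        "statistically significant": p_value < alpha,
        "substantively significant": abs(effect) >= meaningful_threshold,
    }


# Hypothetical finding: a 4 gram difference in birth weight with p = 0.01,
# judged against an assumed threshold of 100 grams for a meaningful difference.
result = assess_finding(effect=4.0, p_value=0.01, meaningful_threshold=100.0)
print(result)
```

Here the hypothetical association is statistically significant but falls well short of the assumed threshold, which is the situation described above.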
Differentiating between statistical
and substantive significance
• Use a modifier such as “statistically” or
“substantively” before the term “significant” so
readers know which type of “significant” you mean.
– Doing so might also remind you to discuss BOTH of those
aspects of your findings.
• Rather than vague reference to “substantive
significance,” explain that aspect with reference to
topic-specific criteria.
– See chapter and podcast on substantive significance.
Summary
• To present βs from multivariate models
effectively, you need to convey
– Units, categories, and descriptive statistics on all
variables as they are specified in the model
– Type of model specification
– The size of the contrast applied to each β as it is
• Interpreted
• Compared with other coefficients in the model or in
other papers.
• Close the narrative by putting the size of
major findings back in substantive context.
Suggested resources
• Miller, J. E., 2013. The Chicago Guide to Writing about
Multivariate Analysis, 2nd Edition. (“WAMA II”)
– Chapter 3, on statistical significance, substantive
significance, and causality
– Chapter 9, on quantitative comparisons for
multivariate models
– Chapter 10, on the Goldilocks problem
• Miller, J. E. and Y. V. Rodgers, 2008. “Economic
Importance and Statistical Significance: Guidelines for
Communicating Empirical Research.” Feminist
Economics 14 (2): 117–49.
Suggested online resources
• Podcasts on
– Statistical significance, substantive significance,
and causality
– Interpreting coefficients from multivariate models
– Defining the Goldilocks problem
– Resolving the Goldilocks problem
• Measurement and variables
• Model specification
Suggested practice exercises
• Study guide to The Chicago Guide to Writing
about Multivariate Analysis, 2nd Edition.
– Problem sets for
• chapter 3, question #4.
• chapter 10, questions #5 through 8.
Suggested extensions
• Study guide to The Chicago Guide to Writing
about Multivariate Analysis, 2nd Edition.
– Suggested course extensions for
• chapter 3
– “Reviewing” exercise 3.a.v and 3.b.
– “Writing and revising” exercises #2 and 3.
• chapter 10
– “Reviewing” exercises #1 and 2.
– “Applying statistics and writing” question #2.
– “Revising” questions #1 through 5, and 9.
Contact information
Jane E. Miller, PhD
jmiller@ifh.rutgers.edu
Online materials available at
http://press.uchicago.edu/books/miller/multivariate/index.html