Resolving the Goldilocks problem: Presenting results
Jane E. Miller, PhD
The Chicago Guide to Writing about Multivariate Analysis, 2nd edition.

Overview
• Labeling of variables
• Identifying model specifications
• Contents of descriptive statistics tables
• Prose interpretation of multivariate coefficients
  – In the results section
  – In the discussion section

Guidelines for effective labeling: Conveying levels of measurement, units and categories

Basic attributes of variables
• For every variable in your analysis, convey
  – Units of measurement
    • System of measurement (e.g., British or metric?)
    • Scale (e.g., millimeters or meters?)
    • Level of aggregation (e.g., weekly or annual income?)
  – Names of categories, if nominal or ordinal
• Label them accordingly in
  – Text
  – Tables of univariate, bivariate, and multivariate statistics
  – Charts

Common errors in labeling units
• Calling a proportion a percentage (or vice versa)
  – Check axis scales that are automatically generated by Excel or other graphing programs.
    • If your data are stored as proportions, that is how they will be graphed.
    • When you add labels, make them match the actual units.
• Forgetting to specify the level of aggregation
  – E.g., deaths per 1,000 persons is not the same as deaths per 100,000.

Tables of multivariate statistics
• To provide the information needed to correctly interpret βs for each variable in the multivariate model, carefully label
  – level of aggregation
  – units
  – categories (ranges)
• Labels and titles should reflect whatever version of the variable is included in the specification.
  – Transformed or original
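The unit-labeling errors described above (proportion versus percentage, deaths per 1,000 versus per 100,000) can be illustrated with a minimal Python sketch. The function names and the numeric values are hypothetical, chosen only for illustration; they do not come from the book or its data.

```python
def proportion_to_percentage(proportion):
    """Convert a proportion (0 to 1) to a percentage (0 to 100)."""
    return proportion * 100.0


def rescale_rate(rate, from_base, to_base):
    """Re-express a rate given per `from_base` persons as a rate per `to_base` persons."""
    return rate * to_base / from_base


def label_rate(rate, base):
    """Build a label that states the level of aggregation explicitly."""
    return f"{rate:,.1f} deaths per {base:,} persons"


# Hypothetical values, for illustration only.
stored_proportion = 0.125
print(f"{proportion_to_percentage(stored_proportion):.1f}%")

rate_per_1000 = 2.5
rate_per_100000 = rescale_rate(rate_per_1000, 1_000, 100_000)
print(label_rate(rate_per_1000, 1_000))
print(label_rate(rate_per_100000, 100_000))
```

Keeping the conversion and the label in the same place is one way to make sure the number shown and the unit named stay in step.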

Units for transformed variables
• Label the units in multivariate tables, charts, and prose as they appear in the model.
  – This applies if you have transformed one or more continuous independent or dependent variables before specifying the model. E.g.,
    • Changed the scale
    • Taken logs
    • Changed the level of aggregation
    • Created a categorical version of a continuous variable

Conveying model specifications in multivariate tables and charts
• Clearly name types of model specification in tables and charts.
  – Mention type of model in the title.
    • E.g., “Standardized coefficients from a model of . . .”
  – Or label columns or axes to identify:
    • Standardized coefficients
    • Logged dependent variables
  – Identify logged independent variables in their respective row labels.

Contents of tables and charts

Background information for contrasts
• Anticipate the metrics and contrasts you will use as you write about your βs. E.g.,
  – Interquartile range
  – Multiples of standard deviations
  – Percentiles in a reference distribution
• Report the corresponding statistics in tables about your own data.
• Cite a source for complex reference data.
  – E.g., Federal Poverty Levels for different sizes and age compositions of household.

Tables of descriptive statistics
• To provide readers with the basis of the numeric contrasts in the associated prose:
• If your models include logged versions of variable(s), report descriptive statistics in both
  – The original, untransformed units
  – The logged units
• If you use empirically based contrasts to interpret βs, include those values in the table of descriptive statistics.
  E.g.,
  – Interquartile range
  – Standard deviation and mean

Interpreting multivariate coefficients

Presenting results to minimize Goldilocks problems
• Explicitly identify the nature and size of the contrast for each independent variable as you interpret its estimated coefficient.
  – Continuous or categorical?
  – Units or categories being compared?
  – Size of numeric contrast applied to each β?
  – Units of the effect measure?

Common pitfalls in reporting of multivariate coefficients
• Simply reporting βs increases the chances of Goldilocks mistakes by failing to remind readers to consider
  – variable types
  – units
  – range and scale
  – categories
• The βs should be reported in your multivariate table.
• Don’t simply repeat (report) those values without substantive interpretation.

Prose to avert Goldilocks errors in interpretation of coefficients
• Apply carefully selected “right-sized” contrasts to each β.
  – Where needed, explain the criteria used to identify fitting numeric contrasts for each of your key variables. E.g.,
    • Cite sources of substantively relevant contrasts.
    • Identify empirical contrasts by name. E.g.,
      – Interquartile range
      – Standard deviation difference
• Convey the units or categories of both your IV and DV as you interpret the direction and magnitude of the βs.

Reporting coefficients on nominal variables
• Name the categories being compared.
  – Such wording helps avoid implying that
    • More than a 1-unit change in a dummy variable is possible;
    • Directionality of movement across categories pertains to nominal variables.
  – This distinction is especially important when interpreting βs for both continuous and categorical independent variables.

Interpreting β for a nominal variable
• Poor: “The β for boy was 116.0.”
  – If reported in a series of coefficients, this version invites readers to compare them directly, without factoring in that only a “1-unit” contrast is possible for gender.
• Poor, version 2: “Gender is positively associated with birth weight.”
  – Cannot specify direction of association for a nominal variable.
• Best: “At birth, boys weigh on average 116 grams more than girls (p < 0.01).”
  – Concepts, categories, units, direction, size, and statistical significance.

Reporting coefficients on ordinal variables
• For ordinal variables, including composite scales or indexes composed from ordinal variables such as Likert scales:
  – Name the categories and specify the reference category.
  – Can write about directionality of the association because the categories are ordered.
    • However, distance between ordinal categories cannot be treated as equal.
    • Numeric codes don’t have mathematical meaning.

Interpreting β for an ordinal variable
• Poor: “Self-rated health and mortality are correlated.”
  – Fails to convey direction or magnitude.
• Better: “Among middle-aged men, self-rated health was inversely related to mortality.”
  – Conveys directionality.
  – Doesn’t name categories being compared or size of mortality difference between them.
• Best: “Among middle-aged men, self-rated health was inversely related to mortality.
  Relative risks of mortality were 2.8, 2.2, and 1.9 for those who rated their health ‘poor/fair’, ‘good’, and ‘very good’, when each was compared with ‘excellent’ health (all p < 0.05).”
  – Concepts, direction, magnitude, and statistical significance.
  – By naming the categories, conveys what a move up the self-rated scale from one category to the next represents conceptually.

Interpreting coefficients on continuous variables: Units
• Mention the units of your independent and dependent variables as used in the regression.
  – Units
    • Original units or logged?
    • Unstandardized βs or standardized βs?
  – Level of aggregation, e.g.,
    • Weekly or monthly income?
    • Income in $1s or $1,000s?
    • Individual or family or household income?
• These might differ from their original form in the data.

Interpreting β for a continuous variable
• Poor: “The β of income on mortality was 0.02.”
  – By not mentioning the units of either income or mortality:
    • Leaves the size of the β open to misinterpretation.
    • Makes it difficult to compare against βs on the same topic by other authors.
• Better: “Each additional $10,000 in annual family income was associated with a 2% decrease in the age-standardized mortality rate.”

Interpreting coefficients using contrasts other than a 1-unit increase
• If you apply a contrast other than a 1-unit increase, report the size of that contrast as you interpret the pertinent β.
  E.g.,
  – “Each five-year increase in mother’s age is associated with a 53 gram increase in birth weight.”
  – “Students whose SAT scores were one standard deviation above the mean had 22% higher chances of graduating from college within six years than those with SAT scores at the population mean.”
  – “Adults at the 25th percentile of BMI were only one-third as likely to die as those at the 75th percentile.”

Goldilocks guidelines for the discussion section
• Put your results in context in terms of both
  – Topic
  – Data
    • Context
    • Specific measures of the concepts under study
• Reiterate theoretical criteria that identify meaningful contrasts for your topic. E.g.,
  – Clinical cutoffs
  – Program eligibility thresholds

Statistical significance versus substantive importance
• Summarize what those contrasts show about your key findings, differentiating between
  – statistical significance
  – substantive importance
• This could mean that once the metrics of the variables were considered, one or more independent variables did not have a substantively meaningful association with the dependent variable,
  – even if that association was
    • statistically significant
    • in the expected direction.

Differentiating between statistical and substantive significance
• Use a modifier such as “statistically” or “substantively” before the term “significant” so readers know which type of “significant” you mean.
  – Doing so might also remind you to discuss BOTH of those aspects of your findings.
• Rather than vague reference to “substantive significance,” explain that aspect with reference to topic-specific criteria.
  – See chapter and podcast on substantive significance.
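As a rough illustration of the contrast-based interpretation discussed in this section, the sketch below scales a per-unit coefficient by contrasts of different sizes and states the units alongside each result. It assumes a linear specification, so the effect of a contrast is simply β times the size of the contrast. The per-year coefficient of 10.6 grams is only what the 53 gram, five-year example would imply under that assumption; the list of ages and all function names are hypothetical.

```python
import statistics


def effect_for_contrast(beta_per_unit, contrast_size):
    """Effect of a contrast in a linear model: beta times the contrast size."""
    return beta_per_unit * contrast_size


def empirical_contrasts(values):
    """Contrasts taken from the data: 1 unit, 1 standard deviation, interquartile range."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return {
        "1 unit": 1.0,
        "1 standard deviation": statistics.stdev(values),
        "interquartile range": q3 - q1,
    }


def describe(beta_per_unit, iv_unit, dv_unit, contrasts):
    """One line per contrast, naming its size and the units of both variables."""
    lines = []
    for name, size in contrasts.items():
        effect = effect_for_contrast(beta_per_unit, size)
        lines.append(f"{name} ({size:.1f} {iv_unit}): {effect:+.1f} {dv_unit}")
    return lines


# Hypothetical mother's ages in years, for illustration only.
ages = [19, 22, 24, 26, 27, 29, 31, 33, 36, 40]

# Assumed per-year coefficient implied by the 53 gram, five-year example.
beta_grams_per_year = 10.6

for line in describe(beta_grams_per_year, "years", "grams", empirical_contrasts(ages)):
    print(line)
```

Reporting the size of each contrast next to its effect lets readers see which contrast produced which number, and makes it easier to judge substantive importance separately from statistical significance.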

Summary
• To present βs from multivariate models effectively, you need to convey
  – Units, categories, and descriptive statistics on all variables as they are specified in the model
  – Type of model specification
  – The size of the contrast applied to each β as it is
    • Interpreted
    • Compared with other coefficients in the model or in other papers.
• Close the narrative by putting the size of major findings back in substantive context.

Suggested resources
• Miller, J. E., 2013. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. (“WAMA II”)
  – Chapter 3, on statistical significance, substantive significance, and causality
  – Chapter 9, on quantitative comparisons for multivariate models
  – Chapter 10, on the Goldilocks problem
• Miller, J. E. and Y. V. Rodgers, 2008. “Economic Importance and Statistical Significance: Guidelines for Communicating Empirical Research.” Feminist Economics 14 (2): 117–49.

Suggested online resources
• Podcasts on
  – Statistical significance, substantive significance, and causality
  – Interpreting coefficients from multivariate models
  – Defining the Goldilocks problem
  – Resolving the Goldilocks problem
    • Measurement and variables
    • Model specification

Suggested practice exercises
• Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
  – Problem sets for
    • chapter 3, question #4.
    • chapter 10, questions #5 through 8.

Suggested extensions
• Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
  – Suggested course extensions for
    • chapter 3
      – “Reviewing” exercises 3.a.v and 3.b.
      – “Writing and revising” exercises #2 and 3.
    • chapter 10
      – “Reviewing” exercises #1 and 2.
      – “Applying statistics and writing” question #2.
      – “Revising” questions #1 through 5, and 9.

Contact information
Jane E. Miller, PhD
jmiller@ifh.rutgers.edu
Online materials available at http://press.uchicago.edu/books/miller/multivariate/index.html