SS10.2

Resolving the Goldilocks problem:
Variables and measurement
Jane E. Miller, PhD
The Chicago Guide to Writing about Multivariate Analysis, 2nd edition.
Overview
• Identifying criteria for choosing fitting
contrasts for each variable
• Understanding conceptual and contextual
aspects of your variables
• Becoming familiar with the distributions of
your variables
• Transforming variables
• Describing your variables in the methods
section
Criteria for choosing pertinent-sized
contrasts for each of your variables
• Theoretical criteria
• Empirical criteria
• Measurement issues
Theoretical criteria
for choosing fitting contrasts
• Theoretical criteria relate to how that concept
is measured and compared in the literature or
real-world context.
• Examples:
– Multiples of the poverty level that correspond
with program eligibility criteria for that place and
time.
– Multiples of standard deviations of weight-for-height,
based on international child growth standards.
Identifying theoretical criteria
for your topic
• Start by reading the literature to identify
which ones pertain to each of your
– Independent variables (IVs)
– Dependent variables (DV)
• Also identify real-world factors pertaining to
your variables. E.g.,
– Physical properties (e.g. freezing point of water)
– Clinically meaningful contrasts
– Socially relevant contrasts
Empirical criteria for choosing
fitting contrasts
• Based on the observed distribution of values
in your data.
• Examples:
– Multiples of standard deviations
• Comparing values at the mean, and ±1 standard
deviation in the IV
– Interquartile range
• Comparing values at the 25th and 75th percentiles of
the IV.
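As a rough sketch of the two empirical contrasts above, the following Python computes the mean ± 1 standard deviation and the interquartile range for an independent variable. The data values and the variable names (`iv_values`, `sd_contrast`, `iqr`) are hypothetical, chosen only for illustration.

```python
import statistics

# Hypothetical values of an independent variable (illustration only).
iv_values = [8, 10, 12, 12, 12, 13, 14, 16, 16, 18]

mean = statistics.mean(iv_values)
sd = statistics.stdev(iv_values)  # sample standard deviation

# Contrast 1: values at the mean and at one standard deviation on either side.
sd_contrast = (mean - sd, mean, mean + sd)

# Contrast 2: values at the 25th and 75th percentiles (interquartile range).
# statistics.quantiles uses the "exclusive" method by default; other software
# may use a different percentile convention and give slightly different cut points.
q1, median, q3 = statistics.quantiles(iv_values, n=4)
iqr = q3 - q1

print("Mean and +/- 1 SD:", sd_contrast)
print("25th and 75th percentiles:", (q1, q3), "IQR:", iqr)
```

Either contrast can then be used in place of a one-unit increase when interpreting the coefficient on that variable.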
When to use empirical criteria
• Best used if theoretical criteria are not
available for your topic.
• Or possibly to compare with other studies that
have used the same criteria.
Measurement issues
and choice of contrast size
• For some variables, a one-unit contrast is too
small to be measured accurately.
• Examples:
– Difficult for most individuals to accurately recall
their annual income to the nearest dollar.
– Difficult to measure blood pressure to the nearest
1 mm Hg (millimeter of mercury)
• In such situations, use a larger contrast.
Getting to know your variables
Understanding the context
• Become familiar with the range of values that
make sense for each of your variables:
– When, where, and to whom the data pertain.
• E.g., pertinent values for family income will be
different:
– Now versus 200 years ago.
– In the US versus in a developing country today.
– For a low-income sample of the US versus the
entire US population.
Understanding conceptual
attributes of your measures
• Become familiar with the ranges of values that
make sense for each of your variables
– A birth weight of 9,999 grams is too high
• ≈22 lb., about the weight of an average 12-month-old!
– In this case, problems arose due to ignoring
• System of measurement (metric, not British)
• Units
• Real-world meaning of the number.
Identifying the valid theoretical
range of values
• Different types of measures have different
valid ranges:
– Proportions must fall between 0.0 and 1.0.
– Temperature in degrees Fahrenheit can be either positive
or negative, but temperature in kelvins cannot be negative.
– Number of children in a family has a narrower
theoretical range than does annual family income.
• Identify the pertinent limits for each of your
variables.
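One way to apply these limits is to flag any observed value that falls outside the valid theoretical range before analysis. The sketch below assumes a hypothetical helper, `out_of_range`, and made-up data; the limits shown (0 to 1 for proportions, a lower bound of 0 for kelvins) come from the examples above.

```python
def out_of_range(values, low=None, high=None):
    """Return the values that fall outside [low, high]; None means no limit."""
    flagged = []
    for v in values:
        if (low is not None and v < low) or (high is not None and v > high):
            flagged.append(v)
    return flagged

# Hypothetical data, for illustration only.
proportions = [0.12, 0.5, 1.3, -0.1]
kelvin = [273.15, 0.0, -5.0]

# Proportions must fall between 0.0 and 1.0.
bad_proportions = out_of_range(proportions, low=0.0, high=1.0)

# Temperatures in kelvins cannot be negative; there is no upper limit here.
bad_kelvin = out_of_range(kelvin, low=0.0)

print("Invalid proportions:", bad_proportions)
print("Invalid kelvin values:", bad_kelvin)
```

Flagged values are candidates for checking against the codebook (units, missing-value codes) rather than automatic deletion.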
Examining the range
of observed values
• Examine the distributions of the variables in
your data set to become familiar with the
– Units
– Range
– Distribution of values
– Categories
• Of nominal variables
• Ordinal versions of continuous variables
Identifying variables for which
a 1-unit contrast is not suitable
• Based on your theoretical, contextual, and
empirical investigations of each variable in your
model, identify those for which
– A one-unit contrast is too big
• E.g., those with low values or a very narrow range
– A one-unit contrast is too small
• E.g., those with very high values or a wide range
– A one-unit contrast is just right
• See podcast on defining the Goldilocks problem
Defining variables to address
the Goldilocks problem
• Many Goldilocks issues can be addressed by
modifying one or more variables before
specifying the multivariate model:
– Rescaling
– Using a different level of aggregation
– Creating a categorical version of a continuous
variable.
Transforming your variables
• These transformations can:
– Make a one-unit increase in Xi align better with
the research question.
– Shift the scale of the βs to be more consistent
across the set of variables in the model.
• For any of these approaches, retain the
original variable and create a new variable
with the transformed version.
– Never overwrite the original data!
Rescaling your variables
• For some research questions, a simple change
of scale can help make a one-unit contrast in
the independent variable align better with the
research question.
• For example, working with
– annual income in $10,000s instead of $1s.
– ozone concentration in parts per thousand instead
of parts per million.
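To illustrate the income example, the sketch below creates a rescaled copy of a hypothetical income variable (leaving the original untouched) and fits a simple bivariate least-squares slope to each version. The data and the helper `ols_slope` are assumptions for illustration; a real analysis would use a multivariate model, but the effect of rescaling on the coefficient is the same: dividing the variable by 10,000 multiplies its coefficient by 10,000.

```python
def ols_slope(x, y):
    """Slope from a simple bivariate least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

# Hypothetical data, for illustration only.
income_dollars = [20000, 35000, 50000, 65000, 80000]
outcome = [2.0, 2.9, 4.1, 5.0, 6.0]

# New variable in $10,000s; the original variable is retained, not overwritten.
income_10k = [v / 10_000 for v in income_dollars]

slope_per_dollar = ols_slope(income_dollars, outcome)
slope_per_10k = ols_slope(income_10k, outcome)

print("Change in outcome per $1:", slope_per_dollar)
print("Change in outcome per $10,000:", slope_per_10k)
```

The fit is identical in both versions; only the size of the contrast that the coefficient describes has changed.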
Rescaling and the decimal system
• Rescaling variables involves dividing or multiplying
the original variable by some value
• Often a multiple of ten, e.g.,
– Multiply by 1,000
– Divide by 100
• Although changing the scale of a variable by an order
of magnitude or two is mathematically convenient, it
is also arbitrary and in many cases unrelated to the
topic or data under study.
– E.g., increments of 10 or 100 days don’t correspond to
common usage as well as increments of 7 or 30 or 365
days.
Changing the level of aggregation
• An alternative way to make the scale of
variables fit better with a one-unit increase is
to change the level of aggregation.
– If a one-unit change in the original variable is too
small, shift to a lower level of aggregation, e.g.,
• weekly income instead of annual income;
• population at the county instead of state level.
– If a one-unit change is too large, shift to a higher
level of aggregation, e.g.,
• cost per dozen instead of per piece.
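The two directions of change described above can be sketched as follows. The data are hypothetical, and treating a year as exactly 52 weeks is a simplifying assumption. A $1 difference is a larger share of weekly income than of annual income, and a smaller share of the cost per dozen than of the cost per piece.

```python
WEEKS_PER_YEAR = 52  # simplifying assumption for illustration

# One-unit ($1) contrast too small: move to a lower level of aggregation.
annual_income = [26000, 52000, 78000]  # hypothetical values
weekly_income = [v / WEEKS_PER_YEAR for v in annual_income]

# One-unit ($1) contrast too large: move to a higher level of aggregation.
cost_per_piece = [0.25, 0.5, 1.5]  # hypothetical values, in dollars
cost_per_dozen = [12 * c for c in cost_per_piece]

print("Weekly income:", weekly_income)
print("Cost per dozen:", cost_per_dozen)
```

As with rescaling, both new variables are stored alongside the originals rather than replacing them.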
Creating a categorical version
of a continuous variable
• For topics for which standard ranges or cutoffs
are commonly used, consider creating a
categorical version of a continuous variable.
E.g.,
– Age ranges that relate to developmental,
economic, social, or health phenomena
• 0–17 years (children), 18–64 years, 65+ years
– Clinically meaningful ranges of blood pressure
• <120 mm Hg; 120–139 mm Hg; 140+ mm Hg
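A minimal sketch of creating these categorical versions, using the cutoffs listed above. The function names (`age_group`, `bp_category`) and the data are hypothetical, and the code assumes age is recorded in completed years and blood pressure in whole mm Hg.

```python
def age_group(age_years):
    """Classify age in completed years into the three ranges listed above."""
    if age_years < 18:
        return "0-17"
    if age_years < 65:
        return "18-64"
    return "65+"

def bp_category(mm_hg):
    """Classify blood pressure (whole mm Hg) into the three ranges listed above."""
    if mm_hg < 120:
        return "<120"
    if mm_hg < 140:
        return "120-139"
    return "140+"

# Hypothetical data, for illustration only.
ages = [3, 17, 18, 64, 65, 90]
pressures = [110, 119, 120, 139, 140, 165]

# New categorical variables; the continuous originals are retained.
age_groups = [age_group(a) for a in ages]
bp_categories = [bp_category(p) for p in pressures]

print(age_groups)
print(bp_categories)
```

Testing the values on either side of each cutoff, as in the sample data here, is a quick check that the boundaries were coded as intended.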
Describing exploratory work
in your methods section
• In the methods section, describe the behind-the-scenes
work you did to address Goldilocks issues.
• Explain the reasons for those transformations
given your research question and data.
– Exploratory analysis of distributions of your
variables in your data set.
– Background reading on commonly used cutoffs or
calculations for the variables you are using.
Defining newly created variables
in your methods section
• If you transformed variables or created
categorical versions of continuous variables,
– Report units and levels of aggregation for all
transformed variables. E.g.,
• Income in $10,000s.
• Logged(income in $1s).
– Specify cutoffs used to define categories. E.g.,
• Ranges of BMI used to define overweight or obesity.
• Poverty thresholds (multiples of the Federal Poverty
Level) for different years or household compositions.
Summary
• Transforming one or more of your variables before
specifying your multivariate model can
– Make a one-unit increase in each independent variable
align better with the research question.
– Shift the scale of the βs to be more consistent across
independent variables in the model.
• In your methods section, describe
– Exploratory data analysis to become familiar with observed
values and distributions of each variable in your model.
– The calculations and criteria used to create new variables.
– Citations for those criteria and calculations.
Suggested resources
• Miller, J. E. 2013. The Chicago Guide to Writing
about Multivariate Analysis, 2nd Edition.
– Chapter 10, on the Goldilocks problem
– Chapter 4, on types of variables, units and
distribution
– Chapter 7, on choosing effective examples
– Chapter 13, on the data and methods section
Suggested online resources
• Podcasts on
– Defining the Goldilocks problem
– Resolving the Goldilocks problem using
• Model specification
• Effective ways of presenting results
Suggested practice problems
• Study guide to The Chicago Guide to Writing
about Multivariate Analysis, 2nd Edition.
– Problem sets for
• chapter 7, question #6
• chapter 10, questions #1 through 5.
Suggested extensions
• Study guide to The Chicago Guide to Writing
about Multivariate Analysis, 2nd Edition.
– Suggested course extensions for
• chapter 4
– “Reviewing” questions #1 and 3.
• chapter 10
– “Reviewing” exercises #1 and 2.
– “Applying statistics and writing” questions #1, 2, 3, and 5.
– “Revising” questions #1, 2, 3, and 9.
• chapter 13, “writing” exercises #3 and 4.
– “Getting to know your variables” assignment
Contact information
Jane E. Miller, PhD
jmiller@ifh.rutgers.edu
Online materials available at
http://press.uchicago.edu/books/miller/multivariate/index.html