Indexes, Scales, and Typologies
Edgar Degas: The Absinthe Drinker (detail), 1875-76
Indexes versus Scales
1. Index
• A composite measure based upon the summed
responses to dichotomous indicators.
• This composite might be the sum of
responses to the indicator items or some
other calculation, such as the mean.
• Each indicator is given equal weight in
measuring the construct.
Indexes versus Scales
1. Index (Continued)
• An “exercise” index, for example, might be the
total score (0-5) of whether one participated in
five different physical activities in the past week.
• In this example, we want to know only
whether one participated in the physical
activity at all during the past week.
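As a minimal sketch of the exercise index described above: each activity is a yes/no indicator, and the index is the count of "yes" answers. The five activity names and the respondent's answers are hypothetical, chosen only for illustration.

```python
# Hypothetical list of five activities; the slides do not name them.
ACTIVITIES = ["running", "walking", "swimming", "cycling", "stretching"]

def exercise_index(responses):
    """Sum dichotomous indicators: 1 if the activity was done at all, else 0."""
    return sum(1 for activity in ACTIVITIES if responses.get(activity, False))

# Hypothetical respondent: did three of the five activities last week.
respondent = {"running": True, "walking": True, "swimming": False,
              "cycling": False, "stretching": True}
score = exercise_index(respondent)
print(score)  # 3
```

Every indicator contributes either 0 or 1, so each carries equal weight and the score ranges from 0 to 5.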
Indexes versus Scales
2. Scale
• A composite measure based upon multiple
continuous-level indicators.
• This composite might be the sum of
responses to the indicator items or some
other calculation, such as the mean.
• Responses to each indicator vary in their
strength and therefore in their contribution to
the total score for the construct.
Indexes versus Scales
2. Scale (Continued)
• An “exercise” scale, for example, might be the
total times (0-35) one engaged in five different
physical activities last week.
• In this example, we want to know how many
times the person engaged in each of the five
activities on each day of the week.
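A comparable sketch for the exercise scale follows. It assumes that each activity is recorded as the number of days (0-7) on which the person did it, which is one way to arrive at the 0-35 range; the activity names and data are hypothetical.

```python
# Hypothetical list of five activities; the slides do not name them.
ACTIVITIES = ["running", "walking", "swimming", "cycling", "stretching"]
DAYS_PER_WEEK = 7

def exercise_scale(days_active):
    """Sum the number of days (0-7) each activity was done; range 0-35."""
    total = 0
    for activity in ACTIVITIES:
        days = days_active.get(activity, 0)
        if not 0 <= days <= DAYS_PER_WEEK:
            raise ValueError(f"{activity}: days must be between 0 and 7")
        total += days
    return total

# Hypothetical respondent.
respondent = {"running": 3, "walking": 7, "swimming": 0,
              "cycling": 1, "stretching": 5}
total = exercise_scale(respondent)
print(total)  # 16
```

Unlike the index, each indicator here varies in strength, so walking every day contributes more to the total than cycling once.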
Notes Regarding the Term “Scale”
1. Confusing Language
• Unfortunately, in social science literature, the
term “scale” is used to refer both to the metric by
which responses are recorded (e.g., a Likert
scale or a Guttman scale) and to a measure
that is constructed from multiple questions.
• These notes might clarify the difference in the
use of the term “scale.”
Notes Regarding the Term “Scale”
2. “Scale” as a Metric
• Suppose we wanted to measure a property of a concrete
object, such as the length of a table.
• We might measure its length in inches by using a ruler. In
this sense, the ruler is the measuring instrument and the
length is measured on a “scale” of inches. The use of the
word “scale” in this context refers to the metric in which we
measure the table: inches.
• We could use some other metric to measure the table,
such as centimeters.
Notes Regarding the Term “Scale”
2. “Scale” as a Metric (Continued)
• Suppose instead that we wanted to measure an abstract
concept, such as marital satisfaction.
• We might measure marital satisfaction by simply asking the
married person: “Are you satisfied with your marriage?”
And we might record the responses to this question on a
Likert scale, wherein 1 = very unsatisfied, 2 = unsatisfied, 3
= neither satisfied nor unsatisfied, 4 = satisfied, 5 = very
satisfied. In this context, the question, “Are you satisfied
with your marriage?” is the measuring instrument and
satisfaction is measured using the metric: 1-5.
Notes Regarding the Term “Scale”
2. “Scale” as a Metric (continued)
• We might have used some other approach, such as a
semantic differential, to measure marital satisfaction with
the instrument “Are you satisfied with your marriage?”
• We have not created a “scale” to measure marital
satisfaction because we are using just one question to
measure it: “Are you satisfied with your marriage?”
• And we cannot assess reliability because we have just one
measure of marital satisfaction: “Are you satisfied with your
marriage?”
Notes Regarding the Term “Scale”
2. “Scale” as a Metric (continued)
• The use of the word “scale” to refer to the metric of
measurement (e.g., a Likert scale) is unfortunate because it
creates confusion.
• The confusion is heightened because of the very common
use of the word “scale” in journal articles to refer to a
measurement metric.
• But there you have it!
Notes Regarding the Term “Scale”
3. “Scale” as a Constructed Variable
• Suppose we use a series of three questions to measure
marital satisfaction:
• Overall I am happy with my marriage.
• I am satisfied with my marriage.
• My marriage is a source of happiness to me.
• Suppose that we record the answers to each of these
questions using a Likert scale, wherein 1 = strongly
disagree, 2 = disagree, 3 = neither agree nor disagree, 4 =
agree, and 5 = strongly agree.
Notes Regarding the Term “Scale”
3. “Scale” as a Constructed Variable (continued)
• Suppose for each person we interview, we calculate the
mean score across the three questions about marital
satisfaction.
• For example, suppose that Mary answered the three
questions, respectively, with the numbers: 5, 4, and 3.
Mary’s average score for the three questions equals 4.
Thus, her score on the three-question marital satisfaction
scale equals 4.
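Mary's score can be reproduced in a couple of lines. This is only an illustration of the arithmetic, using the three answers given above.

```python
from statistics import mean

# Mary's answers to the three marital satisfaction questions (1-5 each).
mary_answers = [5, 4, 3]

# Her scale score is the mean of her three item responses.
mary_score = mean(mary_answers)
print(mary_score)  # 4
```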
Notes Regarding the Term “Scale”
3. “Scale” as a Constructed Variable (continued)
• In this sense, we have created a “scale.”
• And because we have more than one measurement that is
included within the marital satisfaction scale (i.e., three
questions), we can calculate the reliability of this scale with
Cronbach’s alpha.
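As a rough sketch of how Cronbach's alpha could be computed for a three-question scale, the function below uses the standard formula: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The five respondents and their answers are hypothetical; only the first row matches Mary's answers from the example.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same length each)."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # one tuple of scores per item
    item_variances = [variance(col) for col in items]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical answers from five respondents to the three questions.
responses = [
    [5, 4, 3],
    [4, 4, 4],
    [2, 1, 2],
    [3, 3, 2],
    [5, 5, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))  # 0.93
```

With only one question there is no set of item variances to compare against the total, which is why reliability cannot be assessed for a single-item measure.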
Indexes versus Scales
3. Weighting
• Both indexes and scales can be unweighted or
weighted.
• In an unweighted index or scale, each item is
treated equally in the calculation of the
measure.
• In a weighted index or scale, some items are
given more weight than are others.
• We might decide, for example, that “running”
should be given more weight than “stretching”
in our measure of “exercise.”
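The difference between an unweighted and a weighted measure can be sketched as follows. The weights are hypothetical (running counts twice as much as stretching) and are not taken from the slides.

```python
# Hypothetical weights: running is given twice the weight of stretching.
WEIGHTS = {"running": 2.0, "stretching": 1.0}

def unweighted_score(responses):
    """Each item counts equally."""
    return sum(responses.get(item, 0) for item in WEIGHTS)

def weighted_score(responses, weights=WEIGHTS):
    """Each item's response is multiplied by its weight before summing."""
    return sum(weights[item] * responses.get(item, 0) for item in weights)

# Hypothetical respondent who did both activities (1 = yes, 0 = no).
respondent = {"running": 1, "stretching": 1}
plain = unweighted_score(respondent)
weighted = weighted_score(respondent)
print(plain, weighted)  # 2 3.0
```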
Index Construction
1. Item Selection
• Face Validity: The extent to which items seem
to correspond to the definition of the construct
(see also: content validity, logical validity).
• Dimensionality: Items intended to measure a
construct should be exclusive to that construct.
• There are special exceptions to this criterion,
such as in multi-trait, multi-method models.
• Variance: Items should have a strong
correlation with the construct being measured.
Index Construction
2. Examination of Empirical Relationships
• Bivariate Relationships: This direct relationship
between the item and the construct indicates
the extent to which the item measures the
construct.
• Multivariate Relationships: Sometimes, once the
bivariate relationship between one item and a
construct is taken into account, the relationship
between a second item and the construct is seen
as a weak one.
Index Construction
4. Handling Missing Data
• Missing data always present empirical,
conceptual, and even ethical problems.
• There are no perfect solutions to the problem.
• One might:
1. Omit cases with missing data on any items.
2. Substitute the item's mean across all cases
for each missing value on that item.
3. Use specialized statistical procedures to
estimate a value for the missing data.
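The first two options can be sketched as below, with `None` standing in for a missing answer. The data are hypothetical, and the third option (specialized estimation procedures) is not shown.

```python
from statistics import mean

# Hypothetical responses to three items; None marks a missing answer.
data = [
    [5, 4, 3],
    [4, None, 4],
    [2, 1, 2],
]

def listwise_delete(rows):
    """Option 1: omit any case with missing data on any item."""
    return [row for row in rows if None not in row]

def mean_impute(rows):
    """Option 2: replace a missing value with that item's observed mean."""
    n_items = len(rows[0])
    item_means = []
    for j in range(n_items):
        observed = [row[j] for row in rows if row[j] is not None]
        item_means.append(mean(observed))
    return [[item_means[j] if value is None else value
             for j, value in enumerate(row)]
            for row in rows]

complete_cases = listwise_delete(data)
imputed = mean_impute(data)
print(complete_cases)  # [[5, 4, 3], [2, 1, 2]]
print(imputed[1])      # [4, 2.5, 4]
```

Deletion shrinks the sample, while mean imputation keeps every case but understates the item's variability; neither is a perfect solution.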
Index Construction
5. Index Validation
• Item Analysis: Examination of the empirical
contribution of each item to measuring the
construct (see also: internal validity).
• External Validation: Examination of the extent to
which the construct is correlated with related
constructs (see also: empirical validity).
• Bad Index or Bad Validators?: A lack of external
validation might occur because the validating
construct is not measured well.
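One common way to carry out an item analysis, offered here as an assumption rather than as the method the slides intend, is to correlate each item with the sum of the remaining items. The sketch below uses hypothetical data and a hand-written Pearson correlation so that it needs only the standard library.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def item_rest_correlations(rows):
    """Correlate each item with the sum of all the other items."""
    n_items = len(rows[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in rows]
        rest = [sum(row) - row[j] for row in rows]
        correlations.append(pearson(item, rest))
    return correlations

# Hypothetical answers from five respondents to three items.
responses = [[5, 4, 3], [4, 4, 4], [2, 1, 2], [3, 3, 2], [5, 5, 4]]
item_rest = item_rest_correlations(responses)
print([round(r, 2) for r in item_rest])
```

An item with a low or negative correlation contributes little to the composite and is a candidate for removal.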
Scale Construction
1. Similarities to Index Construction
• As with the construction of indexes, scale
construction must address issues of item
selection, examination of the data, scoring,
weighting, missing data, and validation.
• Central to scale construction is the
development of items with a range of
responses that can accurately reflect different
levels of correspondence with the construct
being measured.
Types of Scales
2. Thurstone Scale
• This technique assesses the extent of
agreement among a group of judges about the
content validity of proposed items for a scale.
• For example, one might ask a group of
persons to judge how closely 25 different
items come to measuring self-esteem.
Then, one might select the 10 items that
received the highest average scores for
having content validity with self-esteem.
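The selection step described above can be sketched as follows: average the judges' ratings for each item and keep the highest-rated ones. The item names, the ratings, and the rating range are hypothetical, and the example keeps 3 of 5 items rather than 10 of 25 to stay short.

```python
from statistics import mean

def select_top_items(ratings, n):
    """ratings: dict mapping an item to the list of judges' scores.
    Returns the n items with the highest average rating."""
    ranked = sorted(ratings, key=lambda item: mean(ratings[item]),
                    reverse=True)
    return ranked[:n]

# Hypothetical ratings from three judges of how well each item
# measures self-esteem (higher = closer to the construct).
ratings = {
    "item_a": [9, 8, 9],
    "item_b": [4, 5, 3],
    "item_c": [7, 8, 8],
    "item_d": [2, 3, 2],
    "item_e": [8, 9, 8],
}
top_items = select_top_items(ratings, 3)
print(top_items)  # ['item_a', 'item_e', 'item_c']
```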
Types of Scales
2. Thurstone Scale (Continued)
• This technique can help find the best
questions to ask to measure an abstract
concept.
• The technique does not specify how a
question or set of questions should be
formatted on a questionnaire.
Types of Scales
3. Likert Scale
• This technique assesses the extent of the
subject’s agreement with items that have been
judged (by some method) to have content
validity with the construct being measured.
• For example, the researcher might ask the
respondent to 1) strongly disagree; 2)
disagree; 3) neither agree nor disagree; 4)
agree; or 5) strongly agree with the
statement, “I am a person of worth” as a
means of measuring self-esteem.
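Recording Likert responses as numbers might look like the sketch below. The 1-5 coding follows the example above; the sample answers are hypothetical.

```python
# Numeric codes for the five response options named above.
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither agree nor disagree": 3,
    "agree": 4,
    "strongly agree": 5,
}

def code_response(label):
    """Convert a response label to its 1-5 code, ignoring case and spacing."""
    return LIKERT_CODES[label.strip().lower()]

# Hypothetical answers from three respondents to "I am a person of worth."
answers = ["Agree", "Strongly agree", "Neither agree nor disagree"]
codes = [code_response(answer) for answer in answers]
print(codes)  # [4, 5, 3]
```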
Types of Scales
3. Likert Scale (Continued)
• This technique can be used to ask many
questions in a short amount of space (mailed
survey) or time (telephone survey).
• The technique is intuitively appealing to most
persons.
• The technique provides continuous-level data.
• The technique can become tiresome if used
too extensively on a questionnaire.
Types of Scales
4. Semantic Differential Scale
• This technique assesses the extent of the
subject’s agreement with items, where the
response for each item is shown on a
continuum.
Example:
Skipping class in Sociology 302 is...
Good for me.
1___2___3___4___5
Bad for me.
Types of Scales
4. Semantic Differential Scale (Continued)
• This technique can be used to ask many
questions in a short amount of space (mailed
survey) or time (telephone survey).
• The technique is intuitively appealing to most
persons.
• The technique provides continuous-level data.
• The technique can become tiresome if used
too extensively on a questionnaire.
• It sometimes can be difficult to label the
endpoints of a semantic scale.
Types of Scales
5. Guttman Scale
• This technique assesses the extent of the
subject’s agreement with items, where the
items are meant to represent a continuum.
• For example, one might ask these questions:
1. Do you drink alcohol?
2. Do you smoke marijuana?
3. Do you use cocaine?
One might anticipate that all persons who answer
“yes” to #3 would also answer “yes” to #1 and #2,
and so forth.
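The expected response pattern can be checked with a short function. This sketch assumes the answers are ordered from the least to the most extreme item, as in the three questions above; the example respondents are hypothetical.

```python
def is_guttman_consistent(answers):
    """answers: yes/no values ordered from least to most extreme item.
    Consistent means no "yes" appears after an earlier "no"."""
    seen_no = False
    for answer in answers:
        if not answer:
            seen_no = True
        elif seen_no:
            return False
    return True

def guttman_score(answers):
    """For a consistent pattern, the score is simply the number of "yes"."""
    return sum(1 for answer in answers if answer)

# Order: alcohol, marijuana, cocaine.
print(is_guttman_consistent([True, True, False]))   # True
print(is_guttman_consistent([False, False, True]))  # False
print(guttman_score([True, True, False]))           # 2
```

A respondent who answers "yes" to #3 but "no" to #1 breaks the assumed continuum; many such cases suggest the items do not form a Guttman scale.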
Types of Scales
5. Guttman Scale (Continued)
• This technique can be used to ask many
questions in a short amount of space (mailed
survey) or time (telephone survey).
• The technique is intuitively appealing to most
persons.
• The technique provides continuous-level and
ranked data.
• The items have to form a continuum that is
accepted by respondents and the community
of scholars.
Typologies
1. Definition
• A nominal-level variable that summarizes two or
more variables.
• For example, one might create a typology of
males:
1. The “man’s man”: outdoor-type, strong, beer
drinker.
2. The “womanizer”: suave, good-looking, wine
drinker.
3. And so forth...
Typologies
2. Critique
• As with stereotypes, typologies do not
accurately reflect anyone and provide
oversimplifications of everyone.
• One should use typologies mainly to organize
one’s thinking as part of exploratory research.
• It is extremely difficult to analyze a typology as
a dependent variable because too much
variation exists within each category.
Validity and Reliability
1. Introduction
• Validity: The extent to which a measuring
device measures what it is intended to
measure.
• Reliability: The extent to which a measuring
device provides consistent values for the same
phenomenon being measured.
• These terms are described in detail at this
accompanying web site: Validity and Reliability.