Measurement and Scaling
• Measurement is the process of observing and
recording the observations that are collected
as part of a research effort.
• It is the assignment of numbers to aspects of
objects or events according to one or another
rule or convention.
Conceptual definitions help us to be explicit about what our theories are
talking about. That way, we know we are measuring the right thing.
Conceptual definition: The concept of ……………….is defined as the extent to
which…………………. exhibit the characteristic of ………… .
Example 1: What would be a good definition for the following concepts?
• Being politically informed (unit:……….)
• Being economically developed (unit:………..)
• Example: Ecological Fallacy (Berkeley Gender Bias Case)
Evidence of (the concept of) gender bias?
• Operational definitions describe how we
convert the concept into a variable with
numerical codes. What could be a simple
example of an operational definition for sex?
1. Systematic measurement error occurs when
the operational definition fails to match the
conceptual definition in a systematic manner.
Lack of systematic error is called validity.
2. Random measurement error occurs when
temporary or haphazard factors affect the
measurement. Lack of this random error is
called reliability.
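The difference between the two kinds of error can be sketched with a small simulation. The true weight of 70 kg, the bias, and the noise levels below are made-up assumptions, not values from the lecture. A biased but steady instrument shows systematic error with little random error; an unbiased but noisy one shows the opposite.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the sketch is repeatable
TRUE_VALUE = 70.0         # hypothetical true weight in kg

def measure(bias, noise_sd, n=10_000):
    """Return n simulated readings: true value + constant bias + random noise."""
    return [TRUE_VALUE + bias + rng.gauss(0, noise_sd) for _ in range(n)]

# Systematic error: every reading is shifted by about 2 kg, but readings agree closely.
biased_scale = measure(bias=2.0, noise_sd=0.1)
# Random error: readings scatter widely, but they centre on the true value.
noisy_scale = measure(bias=0.0, noise_sd=2.0)

biased_mean = statistics.mean(biased_scale)
noisy_mean = statistics.mean(noisy_scale)
biased_sd = statistics.stdev(biased_scale)
noisy_sd = statistics.stdev(noisy_scale)

print(f"biased scale: mean={biased_mean:.2f}, sd={biased_sd:.2f}")
print(f"noisy scale:  mean={noisy_mean:.2f}, sd={noisy_sd:.2f}")
```

Averaging many readings reduces the random error of the noisy scale but does nothing about the constant shift of the biased one.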
• The sine qua non of measurement is that the numbers
assigned to objects reflect the relations among the
objects with respect to the aspect being measured.
This idea, referred to as isomorphism, means a one-to-one
correspondence between elements of two classes.
• A prime example of isomorphism is that between a
map and the geographic region it depicts. There is a
one-to-one correspondence between, say, towns and
the points used to represent them on the map, such
that relations among the points on the map (e.g.,
distances) reflect relations among the geographic
locations they represent. Hence, the great usefulness
and convenience of maps.
• Measurement consists of mapping a set of
objects onto a set of numbers, such that there is
isomorphism between the objects measured and
the numbers assigned to them.
• An obvious example is measurement of weight,
where each of a set of objects is assigned a
number such that the relations among the
numbers reflect the relations among the objects
with respect to weight (e.g., one object being
twice as heavy as another).
• An appreciation of the benefits of measurement may
be attained when it is contrasted with alternative
approaches to the description of or the differentiation
among a set of objects with respect to a given aspect.
• Contrast the limitations, ambiguities, and potential
inconsistencies when attempting to describe verbally
the weight of a set of objects (e.g., heavy, very heavy,
very very heavy, not so heavy, light, lighter, much
lighter) with statements about the numerical weights
of the same objects.
• A great advantage in using measurement is
that one may apply the powerful tools of
mathematics to the study of phenomena.
Operating on sets of numbers that are
isomorphic with aspects of sets of objects
enables one to arrive at concise and precise
statements of regularities, or laws, regarding
phenomena to a degree unattainable without
the benefits of measurement.
• Suppose, for example, that one wants to study and
describe the relation between mental ability and
achievement. Relying on observations and verbal
descriptions, one is limited to unwieldy and potentially
ambiguous statements (e.g., people high on mental
ability generally manifest greater achievement than
those low on mental ability).
• In contrast, measuring mental ability and achievement
of a sample of people, one can calculate an index of
the relation between the two variables (e.g., the
correlation coefficient), and state the direction and
strength of the relation with clarity and conciseness
unattainable through verbal descriptions.
• Moreover, the index of the relation can in turn
be used for various purposes (e.g., to
determine whether and by how much the
relation between mental ability and
achievement differs across various racial
groups), or, along with other statistics, it may
be used to develop an equation to predict
achievement from mental ability.
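As a rough illustration of the point above, the sketch below computes a correlation coefficient and a least-squares prediction equation. The five pairs of scores are invented for the example, and the function names are hypothetical.

```python
import math

# Hypothetical scores for five people (not real data).
ability = [90, 100, 110, 120, 130]
achievement = [55, 60, 70, 72, 83]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def least_squares(x, y):
    """Return (intercept, slope) of the line that predicts y from x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

r = pearson_r(ability, achievement)
intercept, slope = least_squares(ability, achievement)

def predict(ability_score):
    """Predicted achievement for a given mental ability score."""
    return intercept + slope * ability_score

print(f"r = {r:.3f}; predicted achievement at ability 115: {predict(115):.1f}")
```

A single number (r) states the direction and strength of the relation, and the fitted line turns it into a prediction rule.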
Scales of Measurement
• Stevens (1951) proposed the following four
types of measurement (also referred to as
levels of measurement) in ascending order,
from the crudest to the most elaborate:
nominal, ordinal, interval, and ratio.
Nominal Scale (1)
• A nominal scale entails the assignment of
numbers as labels to objects or classes of objects.
• Examples…
• Referring to the nominal level of measurement,
Coombs (1953) stated: "This level of
measurement is so primitive that it is not always
recognized as measurement, but it is a necessary
condition for all higher levels of measurement"
Nominal Scale (2)
• To satisfy the requirements of nominal scaling,
subjects have to be classified into a set of
mutually exclusive and exhaustive categories.
What this means is that each subject is assigned
to one category only and that all subjects are
classifiable into the categories used.
• For example, classifying people according to their
political party affiliation, each person is classified
as a member of one party only, and each person
must fit into one of the categories used.
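A minimal sketch of nominal coding, assuming a hypothetical three-category scheme (the party labels and codes are placeholders). Each respondent carries exactly one label, which gives mutual exclusivity, and any label outside the scheme is rejected, which enforces exhaustiveness.

```python
# Hypothetical category scheme; the numbers are labels only, not quantities.
PARTY_CODES = {"Party A": 1, "Party B": 2, "Other/None": 3}

def code_affiliations(responses):
    """Map each respondent's single party label to its numeric code.

    Raises ValueError when a label is not covered by the categories,
    i.e. when the category set is not exhaustive for the data.
    """
    coded = []
    for label in responses:
        if label not in PARTY_CODES:
            raise ValueError(f"no category for response: {label!r}")
        coded.append(PARTY_CODES[label])
    return coded

codes = code_affiliations(["Party A", "Other/None", "Party B", "Party A"])
print(codes)
```

Because the codes are only labels, swapping them (say, 1 and 3) would change nothing of substance, as long as the mapping stays one-to-one.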
Ordinal Scale (1)
• An ordinal scale entails the assignment of
numbers to persons or objects so that they
reflect their rank ordering on an attribute in
question. If person A, say, is viewed as kinder (or
smarter, better looking) than person B, then he or
she may be assigned a "2," whereas B may be
assigned a "1."
• The numbers thus assigned do not reflect by how
much A exceeds B on the attribute in question
but rather the relation "greater than," or "more
than," symbolized by >.
Ordinal Scale (2)
• On an ordinal scale, it must be true that for any
pair of objects, A and B, if A is greater than B,
then B is not greater than A. This is referred to as
an asymmetric, or nonsymmetric, relation. It is, of
course, possible for A to be equal to B, reflecting
a symmetric relation.
• Under such circumstances, A and B would be
assigned the same number, referred to as a tied rank.
Ordinal Scale (3)
• For any three objects, A, B, and C on an
ordinal scale, it must be true that if A > B, and
B > C, then A > C. This is referred to as transitivity.
• An asymmetric relation is not necessarily
transitive. For example, person A may beat
person B in a game of chess, and B may beat
C. From this, it does not follow that A will beat C.
• Because the numbers assigned to objects on an ordinal
scale reflect only the relation "greater than," invariance
will be maintained under any monotonic
transformation of the scale values.
• A monotonic transformation is one in which the rank
ordering of the numbers does not change. Following
are examples of monotonic transformations:
adding a constant to all the numbers,
raising the numbers to a positive power (for positive numbers),
taking the square root of the numbers (for nonnegative numbers), and
multiplying the numbers by a positive constant.
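The invariance of rank order can be checked directly. The sketch below uses made-up positive scale values and a hypothetical helper, `rank_order`; it also shows one transformation that is not order-preserving.

```python
import math

def rank_order(values):
    """Return the rank of each value (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

scores = [3.0, 1.0, 4.0, 1.5, 9.0]   # hypothetical positive scale values
original_ranks = rank_order(scores)

# Order-preserving transformations for positive values.
monotonic = {
    "add a constant": [x + 10 for x in scores],
    "raise to a power": [x ** 3 for x in scores],
    "square root": [math.sqrt(x) for x in scores],
    "multiply by a positive constant": [2.5 * x for x in scores],
}
for name, transformed in monotonic.items():
    print(name, rank_order(transformed) == original_ranks)

# Multiplying by a negative constant reverses the order, so it is not admissible.
reversed_ranks = rank_order([-1 * x for x in scores])
```

All four transformations leave the ranks unchanged, which is all an ordinal scale promises to preserve.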
Ordinal Scale (4)
• Limitations of an ordinal scale as well as the
potential for misinterpretation of the scale
values will be illustrated through two examples.
• Assume that two groups, each consisting of eight people,
were rank ordered with respect to height. The results are
depicted in the figure below, where the letters above the line
refer to the people, and the numbers below the line refer
to their rank ordering on height; (a) and (b) refer to the two
groups. Notice that the people are not evenly distributed
on the height continuum.
What is the problem here?
• It is obvious that no meaningful comparisons can
be made between ranks assigned in separate groups.
• The fact that two people have the same rank in
distinct groups obviously does not mean that
they are of the same height.
• It is possible, for example, for the person ranked
as the shortest in group (a) to be taller than the
person ranked as the tallest in group (b).
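A small numerical sketch of this problem, assuming two made-up groups of three people each (the lecture's figure uses eight per group; the names and heights here are invented):

```python
def rank_by_height(heights):
    """Return ranks (1 = shortest) for a dict of name -> height in cm; assumes no ties."""
    ordered = sorted(heights, key=heights.get)
    return {name: rank for rank, name in enumerate(ordered, start=1)}

# Hypothetical heights in centimetres.
group_a = {"A": 181, "B": 185, "C": 192}
group_b = {"P": 152, "Q": 160, "R": 170}

ranks_a = rank_by_height(group_a)
ranks_b = rank_by_height(group_b)

shortest_a = min(group_a, key=group_a.get)
tallest_b = max(group_b, key=group_b.get)

# A and P share rank 1 within their own groups, yet their heights differ by 29 cm.
print(ranks_a["A"], ranks_b["P"], group_a["A"] - group_b["P"])
# The shortest person in group (a) is taller than the tallest in group (b).
print(group_a[shortest_a] > group_b[tallest_b])
```

Ranks carry information only relative to the group in which they were assigned.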
Interval Scale (1)
• An interval level of measurement is achieved
when numbers are assigned to objects so that, in
addition to satisfying the requirements of the
ordinal level, differences between the numbers
may be meaningfully interpreted with respect to
the attribute being measured.
• In other words, on an interval scale, constant
units of measurement are used, affording
meaningful expressions of differences between
objects, comparisons of such differences, as well
as the conversion of differences into ratios.
Interval Scale (2)
• The example of an interval scale most often given is that of
a measure of temperature. On a Celsius scale, for example,
60° centigrade is not merely more than 50°, but it is 10° more.
• Because the units on the scale are constant, it is also true
that the difference between 60° and 50° is equal to the
difference between, say, 90° and 80°, or the difference
between 60° and 50° is twice that between 37° and 32°.
• An interval scale is invariant under linear (affine)
transformations of the form X′ = a + bX, with b > 0.
Consider converting Celsius to Fahrenheit.
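The Celsius-to-Fahrenheit conversion is an affine transformation with a = 32 and b = 9/5. The sketch below (function names are illustrative) checks that the ratio of two differences, using the temperatures from the slide, is the same on both scales.

```python
import math

def celsius_to_fahrenheit(c):
    """Affine transformation X' = a + b*X with a = 32 and b = 9/5."""
    return 32.0 + 9.0 * c / 5.0

def difference_ratio(hi1, lo1, hi2, lo2, transform=lambda x: x):
    """Ratio of the difference (hi1 - lo1) to the difference (hi2 - lo2) after a transform."""
    return (transform(hi1) - transform(lo1)) / (transform(hi2) - transform(lo2))

# (60 - 50) compared with (37 - 32), first in Celsius, then in Fahrenheit.
ratio_celsius = difference_ratio(60, 50, 37, 32)
ratio_fahrenheit = difference_ratio(60, 50, 37, 32, transform=celsius_to_fahrenheit)

print(ratio_celsius, round(ratio_fahrenheit, 6))
```

The constant a cancels in every difference and the factor b cancels in every ratio of differences, which is why such ratios survive the change of scale.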
Interval Scale (3)
• Note carefully that although it is meaningful
to express differences in scores on an interval
scale as ratios, it is not meaningful to do so for
the scores themselves.
• The reason is that the zero point on an
interval scale is arbitrary, hence the
admissibility of adding a constant to scores on
such a scale.
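To see why ratios of the scores themselves are not meaningful, the sketch below applies an admissible transformation to two made-up temperature readings and compares the score ratios before and after. The helper `rescale` is hypothetical.

```python
import math

def rescale(scores, a, b):
    """Apply the interval-scale transformation X' = a + b*X to each score."""
    return [a + b * x for x in scores]

celsius = [20.0, 10.0]                         # hypothetical readings
fahrenheit = rescale(celsius, a=32.0, b=1.8)   # same temperatures, different zero point

ratio_c = celsius[0] / celsius[1]              # looks like "twice as warm"
ratio_f = fahrenheit[0] / fahrenheit[1]        # about 1.36 on the Fahrenheit scale

print(ratio_c, round(ratio_f, 2))
```

The same two temperatures give a ratio of 2 on one scale and about 1.36 on the other, so the statement "twice as warm" depends on where the arbitrary zero was placed.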
Interval Scale (4)
• Turning to examples of sociobehavioral measures,
consider the following:
(a) On an interval scale of intelligence,
individual A has a score of 120 and individual
B has a score of 60. The zero point on the
intelligence scale is necessarily arbitrary (how
would one define zero intelligence, in an
absolute sense? As being dead?); thus, it is
erroneous to conclude that person A is
twice as intelligent as B.
Interval Scale (5)
(b) On an interval scale of achievement in social
studies, person A answered correctly 60
multiple-choice items and person B answered
correctly 15 such items. Although it is true that
person A answered correctly four times as many
items as B, this does not mean that A knows four
times as much in social studies as does B.
Ratio Scale
• A ratio level of measurement is achieved when, in
addition to the requirements of the interval level, a
true, or absolute, zero point can be determined. That
is, zero means no amount of the attribute measured.
• The term ratio refers to the fact that, on such a scale,
the ratio of any two scores is independent of the units
of the scale.
• Ratio scales are not often encountered in
sociobehavioral sciences, although they are not
unheard of. The measurement of reaction time (e.g.,
on perceptual-motor tasks) is an example of a ratio
scale used in psychological research.
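On a ratio scale the only admissible change is a change of unit, X′ = bX with b > 0. A brief sketch with invented reaction times shows that the ratio of two scores does not depend on the unit chosen.

```python
import math

# Hypothetical reaction times for two participants, in milliseconds.
rt_ms = {"A": 450.0, "B": 300.0}

# Change of unit: milliseconds to seconds (multiply by b = 0.001).
rt_s = {name: 0.001 * value for name, value in rt_ms.items()}

ratio_ms = rt_ms["A"] / rt_ms["B"]
ratio_s = rt_s["A"] / rt_s["B"]

print(ratio_ms, round(ratio_s, 6))
```

Because zero means no elapsed time in either unit, it is meaningful to say that A took one and a half times as long as B.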
Levels of Measurement and Method of
• The literature on the relation between levels
of measurement and statistics is extensive,
with some authors strongly defending and
expounding Stevens's position, and others
rejecting it. Stevens argued that means and
standard deviations should not be calculated
for measures that are on an ordinal level.
(What do you think?)
• The major source of the controversy regarding measurement and statistics
in sociobehavioral research is whether most of the measures used are on
an ordinal or an interval level.
• The pragmatists (e.g., Borgatta, 1968; Borgatta & Bohrnstedt, 1981;
Gardner, 1975; Labovitz, 1967, 1972; Nunnally, 1978) argued cogently
that, although most measures used in sociobehavioral research are not
clearly on an interval level, they are not strictly on an ordinal level either.
• In other words, most of the measures used are not limited to signifying
"more than," or "less than," as an ordinal scale is, but also signify degrees
of differences, although these may not be expressible in equal interval units.
• Prime examples are summated measures of achievement, mental ability,
attitudes, and the like. Such measures occupy an intermediate, "grey"
(Gardner, 1975, p. 53) region between an interval and an ordinal level, and
to treat them as if they were on an ordinal level may lead to a serious loss
of information.
From Ordinal to Interval –
unidimensional scaling
• Likert Scale
• Rensis Likert was an American psychologist.
• What became known as the Likert method of attitude
measurement was formulated in his doctoral thesis,
and an abridged version appeared in a 1932 article in
the Archives of Psychology.
• At the time, many psychologists believed that their
work should be confined to the study of observable
behaviour, and rejected the notion that unobservable
(or ‘latent’) phenomena like attitudes could be
measured. Like his contemporary, Louis Thurstone,
Likert disagreed.
• They argued that attitudes vary along a dimension from
negative to positive, just as heights vary along a dimension
from short to tall, or wealth varies from poor to rich.
• For Likert, the key to successful attitude measurement was
to convey this underlying dimension to survey respondents,
so that they could then choose the response option that
best reflects their position on that dimension.
• Research suggests that data from Likert items (and
those with similar rating scales) become significantly
less accurate when the number of scale points falls
below five or rises above seven.
• The standard practice, again following Likert’s original
example, is to include a neutral midpoint. While Likert
labelled this point as ‘Undecided’, the more common
version is now ‘Neither agree nor disagree’. The
purpose of this option is evidently to avoid forcing
respondents into expressing agreement or
disagreement when they may lack such a clear opinion.
Forcing a choice might not only annoy respondents but
also put data quality at risk.
1. Generate Likert items (lots of them)
2. Have a group of judges rate the items. Notice that, as in other
scaling methods, the judges are not telling you what they believe;
they are judging how favorable each item is with respect to the
construct of interest.
Ex: If the focus is to measure attitudes that people might have towards
persons with AIDS then you want judges to rate the "favorableness" of
each statement in terms of an attitude towards AIDS, where 1 =
"extremely unfavorable attitude towards people with AIDS" and 5 =
"extremely favorable attitude towards people with AIDS."
-People with AIDS deserve what they got (1)
-If you have AIDS you can still lead a normal life (4)
-People with AIDS should be treated just like everybody else. (5)
and so on…
3. Select the items for the scale
Throw out any items that have a low correlation with the total
(summed) score across all items
In most statistics packages it is relatively easy to compute this type
of Item-Total correlation. First, you create a new variable which is
the sum of all of the individual items for each respondent. Then,
you include this variable in the correlation matrix computation (if
you include it as the last variable in the list, the resulting Item-Total
correlations will all be the last line of the correlation matrix and will
be easy to spot). How low should the correlation be for you to
throw out the item? There is no fixed rule here -- you might
eliminate all items with a correlation with the total score less than
.6, for example. (This is called internal consistency – Cronbach's alpha.)
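The item-total procedure described above can be sketched in plain Python. The ratings matrix is invented (six raters, four items), the .6 cutoff is the example value from the text rather than a fixed rule, and the total here includes the item itself, as in the description.

```python
import math

def pearson_r(x, y):
    """Pearson correlation; assumes both lists have non-zero variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ratings: one row per rater, one column per item (1-5 scale).
ratings = [
    [5, 4, 5, 3],
    [4, 4, 5, 2],
    [2, 2, 1, 3],
    [1, 2, 2, 3],
    [3, 3, 3, 3],
    [5, 5, 4, 2],
]

# Step 1: the new variable, the sum of all items for each rater.
totals = [sum(row) for row in ratings]

# Step 2: correlate every item with the total score.
n_items = len(ratings[0])
item_total_r = [pearson_r([row[i] for row in ratings], totals) for i in range(n_items)]

# Step 3: keep items whose item-total correlation reaches the chosen cutoff.
CUTOFF = 0.6
kept_items = [i for i, r in enumerate(item_total_r) if r >= CUTOFF]

for i, r in enumerate(item_total_r):
    print(f"item {i}: r = {r:.2f}")
print("kept:", kept_items)
```

In this toy data the fourth item runs against the total score and would be dropped.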
3. Select the items for the scale
For each item, get the average rating for the top quarter
of judges and for the bottom quarter. Then run a t-test on
the difference between the item's mean rating among the
top-quarter and the bottom-quarter judges.
Higher t-values mean that there is a greater difference
between the highest and lowest judges. In more practical
terms, items with higher t-values are better discriminators, so
you want to keep these items. In the end, you will have to use
your judgment about which items are most sensibly retained.
You want a relatively small number of items on your final
scale (e.g., 10-15) and you want them to have high Item-Total
correlations and high discrimination (e.g., high t-values).
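A sketch of the discrimination check, under two assumptions the text does not spell out: judges are ordered by their total score to form the top and bottom quarters, and the t statistic is computed in Welch's form. The ratings (eight judges, three items) are invented.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """t statistic for the difference between two group means (Welch's form)."""
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    std_err = math.sqrt(statistics.variance(group_a) / len(group_a)
                        + statistics.variance(group_b) / len(group_b))
    return mean_diff / std_err

def discrimination_t_values(ratings):
    """Return one t value per item, comparing top- and bottom-quarter judges."""
    ranked = sorted(ratings, key=sum, reverse=True)   # highest total score first
    quarter = len(ranked) // 4
    top, bottom = ranked[:quarter], ranked[-quarter:]
    n_items = len(ratings[0])
    return [welch_t([row[i] for row in top], [row[i] for row in bottom])
            for i in range(n_items)]

# Hypothetical ratings: one row per judge, one column per item.
ratings = [
    [5, 4, 3], [4, 5, 2], [4, 3, 3], [3, 3, 3],
    [3, 2, 3], [2, 2, 3], [2, 1, 3], [1, 2, 2],
]

t_values = discrimination_t_values(ratings)
for i, t in enumerate(t_values):
    print(f"item {i}: t = {t:.2f}")
```

Here the first two items separate the high and low judges clearly, while the third gets the same mean rating in both groups and so discriminates poorly.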
• DO NOT FORGET! Likert scales are ‘summated’
scales, so called because a respondent’s
answers on each question are summed to give
their overall score on the attitude or value.