Principles of Questionnaire Construction

Addendum to Quantitative Measurement
The process of creating measurable concrete variables from the abstract concepts that characterize a research problem is
known as operationalization. Concepts of interest are operationalized into an empirical format that can be used to ask
people questions in order to get data for analysis and interpretation.
Measurement Steps:
identify concept of interest
develop conceptual definition
operationalize to create a variable
Considerations in Quantitative Measurement In Creating a Scale or Index:
1. Types of Variables to be created in Scale Construction:
a) Continuous vs. Discrete categories
b) Levels of measurement:
nominal
ordinal
interval/ratio
2. Scale or Index construction:
Any type of quantitative measurement that you do may include a scale or an index (a composite measure of a variable). Simply
conceptualizing a variable and then creating empirical indicators does not necessarily guarantee that the variable of
interest has been successfully operationalized. Often, the creation of a collection of items will give a more precise
measurement of a concept. In this case, several measures or items are combined to arrive at a single scale score. The
properties of validity, reliability, unidimensionality, and reproducibility as they relate to scale construction are the
primary standards for evaluating the operationalization and measurement of empirical variables.
3. Types of Measures:
a) Composite Measures: Indexes and scales are composite measures of variables: Several empirical
indicators of a variable are combined into a single measure.
b) Index vs. Scales
- Indexes and scales are not the same thing, though they share some common characteristics.
- Both scales and indexes are typically ordinal measures of variables.
- Scales and indexes are composite measures of variables, which means that measurement is based on
more than one data item or indicator.
- The major distinction between indexes and scales is the manner in which scores are assigned.
- An index is constructed through simple accumulation of scores assigned to individual responses.
- A scale is constructed through the assignment of scores to patterns of attributes (responses). A scale
measures the intensity structure that may exist among attributes. The most potent measure of a variable
is scored the highest, followed by the rest in descending order. The total score thus reflects a pattern of
answers, not just the sum of individual responses.
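The difference between the two scoring rules can be sketched in code. This is only an illustration under stated assumptions: the three items are hypothetical, responses are coded 1 (agree) or 0 (disagree), and the items are assumed to be ordered from least to most intense. The scale rule shown (the most intense item endorsed sets the score) is one simple way of scoring a pattern, not the only one.

```python
# Hypothetical items, assumed to be ordered from least to most intense.
ITEMS = ["signed_petition", "attended_rally", "ran_for_office"]

def index_score(responses):
    """Index: simple accumulation of the scores given to each item."""
    return sum(responses[item] for item in ITEMS)

def scale_score(responses):
    """Scale: score the pattern; the most intense item endorsed sets the score."""
    score = 0
    for weight, item in enumerate(ITEMS, start=1):
        if responses[item] == 1:
            score = weight  # later (more intense) items overwrite earlier ones
    return score

a = {"signed_petition": 1, "attended_rally": 1, "ran_for_office": 0}
b = {"signed_petition": 0, "attended_rally": 0, "ran_for_office": 1}

print(index_score(a), scale_score(a))  # 2 2
print(index_score(b), scale_score(b))  # 1 3
```

Respondent b endorses fewer items than respondent a, so b has the lower index score, but b endorses the most intense item and so has the higher scale score.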
4. Reliability and Validity
In creating a scale or an index, the following must be considered:
a. Reliability
Reliability refers to the consistency of a measure, or whether a scale measures the same thing, in the
same way, time after time. The more common social science definition of "reliability" refers to the consistency of
measurement: i.e. do the various items of the scale, which are thought to measure the same thing, actually do so?
Reliability can be statistically assessed. Coefficients of reproducibility and reliability such as Cronbach's alpha and the
Guttman coefficient of reproducibility can be calculated using a statistical software program such as SPSS.
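For readers without a statistics package, Cronbach's alpha can also be computed by hand. The sketch below uses the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The data are made up for illustration, and the function name is an assumption, not part of any particular package.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same items, same order)."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical responses: 5 respondents, 3 items scored 1 to 5.
data = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]
print(round(cronbach_alpha(data), 2))  # 0.98
```

Sample variances are used for both the items and the totals; what matters is that the same variance definition is used for both.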
b. Validity
Validity refers to whether or not the questions or empirical indicators are actually measuring what
they are supposed to be measuring. Validity cannot be determined statistically. For face validity, you can look at the
items and assess whether or not the items seem to be measuring what they intend to measure. For content validity, a
judgment is made as to whether the items seem to adequately represent all aspects or dimensions of the concept being
measured. Validity is fundamentally more important than the issue of reliability. A scale can be a reliable measure (e.g.
shoe size as a measure of IQ) but not a valid one. Assess your measures carefully to ensure they are valid.
c. Unidimensionality
This refers to the property that the items making up a scale measure one and only one dimension or concept at a time.
Typically, complex concepts such as political orientation, authoritarianism, feminism, marital satisfaction, and other
concepts are measured with scales and not by single questions or empirical indicators.
Note that single-item scales (i.e. single questions on a survey) must be unidimensional as well.
d. Reproducibility
A final issue in the construction of scales is that of reproducibility. Knowing a respondent's scale score, the researcher
should be able to predict those items with which the respondent most likely agreed and those with which the respondent
did not agree. Reproducibility is difficult to achieve
when scales lack reliability, validity, and unidimensionality.
5. General criteria for a good index or scale:
1. clear instructions
2. items are simple, free of jargon
3. scale or index should be neat and easy to read
4. should include response bias questions
5. scales should be unidimensional
6. should have face validity
7. response categories should be balanced
8. answers should fit questions
9. behavioural indicators are preferred over attitudinal
10. must be reliable
6. Commonly Used Indexes:
a) Likert:
- This is a summated scale consisting of a series of items to which the subject responds.
- The respondent indicates agreement or disagreement with each item on an intensity scale.
- The Likert technique produces an ordinal scale.
- The scale is highly reliable when it comes to a rough ordering of people with regard to a particular
attitude or attitude complex.
- The score includes a measure of intensity as expressed on each statement.
- Because identical response categories are used for several items intended to measure a given
variable, each item can be scored in a uniform manner.
- A response category for each item is provided, typically a 5-point response composed of (1) strongly
agree, (2) agree, (3) undecided, (4) disagree, (5) strongly disagree.
- Analysis of the data is accomplished by scoring the various responses and summating them. A
summated score is possible by assigning a numerical value to each response--usually a value of 1 to 5.
Once the scoring procedure has been devised, a respondent's score is determined by adding the
individual numbers for each item.
- Respondents can then be ranked according to the overall score obtained.
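The scoring and ranking steps above can be sketched as follows. The response coding (1 = strongly agree through 5 = strongly disagree) is taken from the text; the respondents and their answers are hypothetical.

```python
# Response coding from the text: 1 = strongly agree ... 5 = strongly disagree.
CODES = {
    "strongly agree": 1,
    "agree": 2,
    "undecided": 3,
    "disagree": 4,
    "strongly disagree": 5,
}

def likert_score(answers):
    """Summated score: add the numeric code of each item response."""
    return sum(CODES[a] for a in answers)

# Hypothetical respondents answering the same three items.
respondents = {
    "R1": ["agree", "strongly agree", "agree"],
    "R2": ["disagree", "undecided", "strongly disagree"],
    "R3": ["undecided", "agree", "agree"],
}

scores = {name: likert_score(ans) for name, ans in respondents.items()}
# Rank from lowest total (most agreement under this coding) to highest.
ranking = sorted(scores, key=scores.get)

print(scores)   # {'R1': 5, 'R2': 12, 'R3': 7}
print(ranking)  # ['R1', 'R3', 'R2']
```

Because the result is an ordinal measure, the ranking is the meaningful output; the size of the gap between two totals should not be over-interpreted.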
b) Semantic differential:
- This is not as well known or as widely used as summated or unidimensional scales.
- The semantic differential format is flexible and can be used to measure a variety of attitudes. A
100-item test can be administered in about 10 to 15 minutes; a 400-item test takes about an hour.
- This index attempts to measure attitudes toward some phenomenon by having respondents check a
point along a continuum between two opposite positions. It uses a 7-point continuum
between the two opposite points.
- The basic rationale of the semantic differential format is to measure respondents' reactions to some
property using opposite adjective ratings.
- The semantic differential seeks to understand behavior by studying language concepts and the
meaning projected on the concepts. Most sociologists agree that how a person behaves
in a situation depends on that person's perception of the situation; the semantic differential is particularly
useful in measuring this type of meaning.
- It is constructed by preparing a list of concepts appropriate to the theory guiding the variable to be
measured. Pairs of polar adjectives, to which the respondent is asked to respond, are selected
according to the theory.
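One simple way to summarize semantic differential data is to average the 7-point ratings for each adjective pair, giving a profile of how respondents see a concept. The sketch below assumes three hypothetical adjective pairs and a coding of 1 for the left-hand pole through 7 for the right-hand pole; neither is prescribed by the format itself.

```python
from statistics import mean

# Hypothetical adjective pairs; 1 = left-hand pole, 7 = right-hand pole.
PAIRS = [("bad", "good"), ("weak", "strong"), ("passive", "active")]

def profile(ratings):
    """Mean 1-7 rating for each adjective pair across respondents."""
    for row in ratings:
        if len(row) != len(PAIRS) or any(not 1 <= r <= 7 for r in row):
            raise ValueError("each row needs one 1-7 rating per adjective pair")
    columns = zip(*ratings)  # one tuple of ratings per adjective pair
    return {f"{left}-{right}": mean(col)
            for (left, right), col in zip(PAIRS, columns)}

# Hypothetical ratings of one concept by three respondents.
ratings_for_concept = [
    [6, 5, 4],
    [7, 4, 4],
    [5, 6, 4],
]
print(profile(ratings_for_concept))
# {'bad-good': 6, 'weak-strong': 5, 'passive-active': 4}
```

A mean near 4 (the midpoint) suggests the concept is seen as neutral on that pair; means near 1 or 7 suggest it is strongly associated with one pole.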
7. Commonly Used Scales:
a) Bogardus social distance scale:
- Measures the "distance" that respondents perceive between themselves and members of different
social categories (nationalities, racial groups, deviants, etc.).
- The Bogardus social distance scale is weighted according to the type of interaction that the subject is
willing to engage in with members of a group or of different groups.
- Theoretically, an individual who would readily accept a member of another ethnic group as a relative
would have no objection to working alongside that person or to that person's becoming a neighbour or
in-law.
b) Guttman scale:
- In Guttman scaling, both respondents and index items are ranked, according to the actual answers
given.
- Guttman scaling is effective at determining the unidimensionality of a scale.
- If a scale is unidimensional, then a person who has a more favorable attitude than another should
respond to each statement with equal or greater favorableness than the other person.
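The coefficient of reproducibility mentioned under Reliability can be sketched for dichotomous items (1 = agree, 0 = disagree). The version below rests on one common error-counting convention, which is an assumption here: items are ordered from most to least endorsed, each respondent's answers are compared with the ideal cumulative pattern implied by their total score, and every mismatch counts as one error. The coefficient is then 1 - errors / (respondents x items). Other conventions for counting errors exist and give different values.

```python
def coefficient_of_reproducibility(rows):
    """rows: one list of 0/1 item responses per respondent (same item order)."""
    n_items = len(rows[0])
    # Rank items from most endorsed ("easiest") to least endorsed ("hardest").
    endorsements = [sum(col) for col in zip(*rows)]
    order = sorted(range(n_items), key=lambda i: -endorsements[i])

    errors = 0
    for row in rows:
        ordered = [row[i] for i in order]
        score = sum(ordered)
        # Ideal pattern for this score: agree with the `score` easiest items only.
        ideal = [1] * score + [0] * (n_items - score)
        errors += sum(o != e for o, e in zip(ordered, ideal))

    return 1 - errors / (len(rows) * n_items)

# Hypothetical data: the last respondent breaks the cumulative pattern.
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 0, 1],
]
print(round(coefficient_of_reproducibility(answers), 3))  # 0.867
```

When every respondent follows the cumulative pattern, the coefficient is 1.0, and each scale score then identifies exactly which items the respondent agreed with.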