Operational Definitions

Psy 1191 Research Methods Workshop: Operational Definitions
Introduction: The science of psychology tries to develop explanations of human behavior through objective
observations. The procedures or operations that we use to objectively measure a variable are known as its
operational definition. The operational definition gives the variable meaning within a particular study. Because the
meaning of our study rests on how we objectively observe the construct or behavior of interest, developing a
reliable and valid set of procedures for measuring our variables is crucial for the validity of any research study.
Good operational definitions require that we first specify our constructs; developing reliable and valid operations is
the last step of specifying constructs when we are designing our own studies. It is always easier to use an existing
measure than to develop a new one. Be sure to check the literature for measures that have been successfully used
in similar research. A careful reading of the "Procedures" and "Measures" sections of articles will give us
information that will help us identify and evaluate the operational definitions used in published research studies.
The features of a good operational definition vary depending on the study design. We will examine operational
definitions for variables measured in observational, survey, and experimental studies. (For the first step of this
process, specifying constructs, see the Specifying Constructs workshop.)
Behavioral Observation: Observational research requires careful attention to specifying where and how
observations are made, what is observed, and how it is recorded. As a result, operational definitions in this type of
research may be quite lengthy with multiple components. Let's say that you recently read Nancy Henley's theory of
status, power, and dominance and want to study whether those in higher status positions are more likely to initiate
informal, friendly touches and those in lower status positions are more likely to initiate more formal touches. Let's
develop each part of our operational definition:
You decide to attend a series of professional meetings sponsored by local businesses and observe members
during the social hour before the meeting is called to order. What are the advantages and disadvantages of making
observations at this setting? Next you need to decide how to do the observations. Will you pick "targets",
unobtrusively observe them for the whole social hour, and count how many and what kind of touches they make
(Strategy #1)? Will you wait until you observe a member touch someone and then record what kind of touch
(Strategy #2)? List an advantage and disadvantage of each strategy. Before deciding on a strategy, you will
definitely want to review the empirical literature on touch. Most of this literature uses touch as the unit of analysis
(Strategy #2) because it yields a wider variety of touches.
Now you need to decide what you will observe. Based on your reading of the literature, you want to identify formal
and informal touches. You need to either find an existing measure or develop your own. In either case, you should
have some idea of the content that needs to be included to adequately measure your variable. What types of
touches should be included in the categories of "informal" and "formal" touches? If you develop your own coding
system for touches, you need to run a pilot test to make sure that your raters can reliably identify
the behaviors of interest. The pilot test will tell you if you are missing types of touches, if you need to eliminate
types that are never seen, if you need to combine categories, etc.
Once you have a complete and usable set of codes/behaviors, you are ready to conduct and record your
observations. When you are reading an observational study, look for pilot studies or descriptions of how this coding
system was used in previous studies. The final step of developing your observations is to decide how you will
record them. In this study, you want the observation process to be as natural and unobtrusive as possible. Let's say
you have three choices: paper and pencil tucked in a program or on a clipboard; a handheld organizer; or a small
tape recorder. What are the advantages and disadvantages of each strategy?
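Whatever recording method you choose, the data eventually reduce to a tally of touches by type and initiator status. The sketch below shows one way to do that tally in Python; the touch codes and status labels are hypothetical stand-ins, not categories from Henley's work or any published coding scheme:

```python
from collections import Counter

# Hypothetical touch codes: a real coding scheme would come from the
# literature or from your own pilot testing, not from this sketch.
INFORMAL = {"pat_on_back", "touch_on_arm", "shoulder_squeeze"}
FORMAL = {"handshake", "elbow_tap"}

def tally_touches(events):
    """Count touches by initiator status and touch category.

    `events` is a list of (initiator_status, touch_code) pairs, one per
    observed touch (Strategy #2: the touch is the unit of analysis).
    """
    counts = Counter()
    for status, code in events:
        if code in INFORMAL:
            counts[(status, "informal")] += 1
        elif code in FORMAL:
            counts[(status, "formal")] += 1
        else:
            counts[(status, "uncodable")] += 1  # flag for pilot review
    return counts

# Invented observations from one social hour
observed = [
    ("high", "pat_on_back"),
    ("high", "touch_on_arm"),
    ("low", "handshake"),
    ("low", "handshake"),
    ("high", "handshake"),
]
print(tally_touches(observed))
```

Note the "uncodable" catch-all: touches that fit no category are exactly what a pilot test should surface before the real observations begin.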
Reliability and validity are issues for all operational definitions. We want accurate and reliable observations and
we want our observations to validly reflect the variable of interest. If choosing an existing coding scheme, look for
good inter-rater reliability. Remember that the more complex the behavior that is recorded, the more difficult it is to
achieve good inter-rater reliability. Coding schemes that have been successfully used in other studies demonstrate
good construct validity. If you are developing your own measure, be sure to assess inter-rater reliability. At
minimum, you should have good content validity. Let's go on to our next type of study: Surveys.
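Inter-rater reliability for a coding scheme like this is commonly summarized with Cohen's kappa, which corrects raw agreement for the agreement two raters would reach by chance. A minimal sketch, with invented rater data for illustration:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    # Observed agreement: proportion of items coded identically
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: expected overlap given each rater's marginal frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Invented codes assigned by two raters to the same six touches
rater_a = ["informal", "informal", "formal", "formal", "informal", "formal"]
rater_b = ["informal", "informal", "formal", "informal", "informal", "formal"]
print(round(cohens_kappa(rater_a, rater_b), 3))  # 0.667
```

Here the raters agree on 5 of 6 touches (83%), but kappa is only .667 once chance agreement is removed, which is why kappa is preferred over simple percent agreement.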
From Wadsworth Publishing:
http://www.wadsworth.com/psychology_d/templates/student_resources/workshops/res_methd/science/science_07.
html
Surveys: Operational definitions for survey studies address the survey method, the type of question, and the
question content. We will briefly address each issue. For a comprehensive view of all aspects of surveys see the
Surveys Workshop. In addition, the Survey Design Workshop addresses how to put the survey together.
Methods: There are three methods for obtaining survey data that are commonly used in the literature: face-to-face
interviews, telephone interviews, and self-administered questionnaires. Each strategy has its advantages and
disadvantages.
Face-to-face interviews are the best choice when you need to establish rapport with your participant. In a face-to-face interview, you can show respect by attending carefully to the participant's responses, offering encouragement,
and answering questions. The disadvantage of the face-to-face interview is that the social situation created might
bias the participants' responses. They might not want to disappoint you or might feel hesitant to answer a question
on a sensitive topic. Face-to-face interviews are also very expensive to administer.
Telephone interviews have the advantage of offering some social distance since the participant cannot see you. It
might be easier to answer a sensitive question when you cannot see the interviewer's reaction to your response.
You can also answer participants' questions easily in a telephone interview and the cost is considerably less than a
face-to-face interview. In addition, random digit dialing makes it very easy to recruit a random sample from the
general population. A major disadvantage of telephone interviews is that it is much easier to refuse a telephone
request than a face-to-face request. Caller ID also makes it easier to refuse by simply not answering the call.
Self-administered surveys have the great advantages of privacy and low cost. Participants can choose when it is
convenient to sit down and answer the survey, which gives them time to give more considered responses. This is a
great advantage when asking sensitive questions. Self-administered surveys also are a very low-cost alternative,
especially if given using the Internet. A major disadvantage of self-administered surveys is that participants cannot
ask questions. The response rates for this method are often very low. Let's say your counseling center decided to
do a survey to find out what students know about the services offered and whether students ever used their
services personally or recommended them to a peer. What method would you recommend and why?
Types of questions: There are three types of survey questions: open-ended, closed-ended, and partially closed.
Use open-ended items when it is important to have complete answers in the participants' own words. Open-ended
items are particularly useful when questions are sensitive and you want to let the participant know that you are
interested in the response, no matter what it is. For example, the question "What do you think are the best ways to
discipline young children?" permits participants to have a wide range of responses. There is no suggestion in the
question that there is a preferred method of discipline. Open-ended questions are particularly useful when
beginning a new area of research. You need to have a good sense of the entire range of responses in order to
create valid, closed-ended items. The major disadvantage of open-ended items is that they require more effort
from the participants and take a great deal of time to score. Closed-ended questions limit responses to specific
alternatives written by the researcher. Closed-ended items can use multiple choice, rankings, or Likert ratings.
Closed-ended items are easy for participants to answer and require the least effort of researchers to code and
analyze. Closed-ended items are not appropriate when the expected responses are too complex to fit into a small
number of categories. Extensive pilot research may be necessary to develop good items.
Partially closed items are favored when you have a good idea of the range of expected
responses but want to give the participant the opportunity to give an answer that is rare or that you did not consider.
Participants write in their responses.
Which of the following forms of discipline do you think are best for young children? (Check as many as apply)
___ Time out ___ Spanking ___ Redirection ___ No!, plus explanation ___ Other, please specify
Content: The content and wording of items are critical for effective survey research. Review the specification of
your construct and make sure that all of the dimensions are covered by survey questions. For example, if you were
studying post-traumatic stress, you would want to make sure that the instrument that you selected included
questions about symptoms of intrusion like nightmares and flashbacks, symptoms of avoidance like emotional
numbing and going out of your way to avoid settings similar to the one where the trauma occurred, and symptoms
of hyperarousal like vigilance and irritability. Make sure that each item addresses only one issue. Sometimes we
are tempted to include more than one idea in an item in order to "soften" the statement. This can result in problems
when trying to interpret the answers. For example, in a personality test, participants are asked to agree or disagree
with the following statement, "I am a warm and friendly person". For most people, these characteristics will go
together and a "yes" or "no" response will reflect their true character. It is possible, though, that people see
themselves as warm to others but not particularly friendly. Should they agree because they have the characteristic
of warmth? Or should they disagree because they are not friendly? Two separate items listing warmth and
friendliness would be a better alternative. Avoid bias in the wording of survey questions. Biased items limit the
range of responses you receive. For example, the question "Should American citizens have the right to protect their
families from harm?" is a question that often appears on questionnaires by those opposing gun control legislation.
Most adults would answer "yes" to this question. However, if the question was worded "Should American citizens
have unrestricted access to guns in order to protect their families from harm?" the same men and women may not
answer "yes" to this question. Biased items should also be avoided because they may reveal the study hypothesis.
Response alternatives should be clear, mutually exclusive, and exhaustive. Pilot testing can help you determine
whether your questions and responses meet these criteria. Participants should fit into one and only one category. If
you were asking participants to rate the number of times they read the newspaper in the past month, the following
categories would not be mutually exclusive [0; 1-5; 5-10; 10-15; 15-20; 20-25; 25-30]. Someone who read the
newspaper 15 times could accurately fit in the 10-15 and the 15-20 categories. For the categories to be exhaustive,
responses must fall into at least one alternative. If you assumed that everyone reads the newspaper, you might
omit the "0" category. This would be a problem for someone who prefers to obtain their news from the television or
Internet rather than the newspaper.
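Both problems can be caught mechanically before a survey goes out. The sketch below checks the newspaper categories from the example above, treating each alternative as an inclusive numeric range:

```python
def check_categories(bins, responses):
    """Check that numeric response bins are mutually exclusive and
    exhaustive for the responses actually observed.

    `bins` is a list of (low, high) inclusive ranges.
    Returns a list of problem descriptions (empty if the bins are OK).
    """
    problems = []
    for r in responses:
        matches = [b for b in bins if b[0] <= r <= b[1]]
        if len(matches) == 0:
            problems.append(f"{r}: no category (not exhaustive)")
        elif len(matches) > 1:
            problems.append(f"{r}: {len(matches)} categories (not mutually exclusive)")
    return problems

# The overlapping bins from the newspaper example:
bad_bins = [(0, 0), (1, 5), (5, 10), (10, 15), (15, 20), (20, 25), (25, 30)]
print(check_categories(bad_bins, [15]))   # 15 fits both 10-15 and 15-20

# A non-overlapping, exhaustive alternative:
good_bins = [(0, 0), (1, 5), (6, 10), (11, 15), (16, 20), (21, 25), (26, 30)]
print(check_categories(good_bins, [0, 15, 30]))  # no problems
```

Running the check with a handful of boundary values (0, 15, 30) is a cheap complement to pilot testing the wording itself.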
You try it. Your school is considering moving alumni weekend to the same weekend as graduation. Most students
think this will be chaotic but the administration believes that this will be a positive experience for alumni and it will
give graduating seniors a chance to meet those who have successfully launched careers. The administration asks
the psychology department to help design a questionnaire. What type of questions would you recommend for this
project? Try writing one open-ended and one closed-ended question.
Reliability: The reliability of survey instruments is usually assessed over items and occasions. Internal consistency
estimates like Cronbach's alpha tell us how well multiple items assess the same underlying construct or dimension.
A high Cronbach's alpha means that if a person scored high on one item they also tended to score high on the
other items. When you have only a single item that measures your content, reliability is usually tested over time.
The item is given on two different occasions and the responses are correlated. A high correlation means that the
same or similar responses were given both times and the instrument or question is relatively stable. When choosing
existing measures for a survey, look at how reliability was established and the level of reliability.
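Both forms of reliability can be computed directly from their formulas. The sketch below uses only the Python standard library and invented ratings: Cronbach's alpha is (k/(k-1))(1 - sum of item variances / variance of total scores), and test-retest reliability is a Pearson correlation between the two occasions.

```python
from statistics import mean, variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-item score lists.

    item_scores[i][j] = respondent j's score on item i.
    """
    k = len(item_scores)
    item_vars = sum(variance(item) for item in item_scores)
    totals = [sum(col) for col in zip(*item_scores)]  # per-respondent total
    return (k / (k - 1)) * (1 - item_vars / variance(totals))

def pearson_r(x, y):
    """Test-retest reliability: correlate time-1 and time-2 responses."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical 3-item scale answered by 5 respondents
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 4, 2, 4, 3],
]
print(round(cronbach_alpha(items), 2))

# The same single item given on two occasions
time1 = [4, 5, 3, 4, 2]
time2 = [4, 4, 3, 5, 2]
print(round(pearson_r(time1, time2), 2))
```

With these made-up data, respondents who score high on one item tend to score high on the others, so alpha comes out high, and the two occasions correlate strongly, so the item looks stable over time.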
Validity: It is important to establish the construct validity of survey measures. We often do this by correlating
instruments with similar measures. Construct validity can also be established by predicting a specific behavior or
criterion. Construct validity for existing measures is often established by being successfully used in a wide range of
studies. Examining the results of the studies can tell us the various types of factors or behaviors associated with the
construct assessed by this measure.
Experiments: The operational definition of our independent variable is central in experimental research. How we
set up the laboratory and the experimental manipulation is critical to the success of our experiment. Many
experiments use music to induce a positive or negative mood. Energizing music like waltzes and mazurkas or
calming music like sonatas is often used for the happy or positive condition. Complex music with many changes in
tempo and tone (e.g., Wagner) or atonal music (e.g., Glass) is often used to induce negative moods. Typically,
classical music is used. You want to use this manipulation for a class experiment but are worried that participants
will not like the classical music that has been reported in the literature. If you wanted a genre that is more popular
among college students, what would you use? Write down some of the factors you would need to consider to
develop a manipulation that used more modern music. To go about developing the operational definition of positive
and negative music, you would need to do a number of pilot studies in which students rate the music on a wide
range of characteristics. You would need to have them compare different songs from different genres and use this
information to develop music that will have the desired effect. Reliability and validity are usually established in
experiments through pilot tests and manipulation checks. Test-retest reliability is often calculated. Manipulation
checks reveal whether the experimental manipulation produced the desired group differences. This establishes the
construct validity of the operational definition.
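A manipulation check often amounts to comparing mean ratings across conditions. The sketch below computes the mean difference and Cohen's d (a standardized effect size) for hypothetical 1-7 mood ratings; a large standardized difference would suggest the music manipulation produced the intended group difference:

```python
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Standardized mean difference between two conditions (pooled SD)."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * stdev(group1) ** 2 +
                  (n2 - 1) * stdev(group2) ** 2) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / pooled_var ** 0.5

# Invented 1-7 mood ratings collected as a manipulation check
positive_music = [6, 5, 7, 6, 5, 6]
negative_music = [3, 2, 4, 3, 3, 2]
diff = mean(positive_music) - mean(negative_music)
d = cohens_d(positive_music, negative_music)
print(f"mean difference = {diff:.2f}, d = {d:.2f}")
```

In a real study you would follow this with an inferential test (e.g., an independent-samples t-test), but the direction and size of the difference are what tell you whether the operational definition of "positive" versus "negative" music worked as intended.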