Trochim 2006

http://www.socialresearchmethods.net/kb/contents.php
Five Big Words
Research involves an eclectic blending of an enormous range of skills and activities. To be a
good social researcher, you have to be able to work well with a wide variety of people,
understand the specific methods used to conduct research, understand the subject that you
are studying, be able to convince someone to give you the funds to study it, stay on track and
on schedule, speak and write persuasively, and on and on.
Here, I want to introduce you to five terms that I think help to describe some of the key
aspects of contemporary social research. (This list is not exhaustive. It's really just the first
five terms that came into my mind when I was thinking about this and thinking about how I
might be able to impress someone with really big/complex words to describe fairly
straightforward concepts).
I present the first two terms -- theoretical and empirical -- together because they are often
contrasted with each other. Social research is theoretical, meaning that much of it is
concerned with developing, exploring or testing the theories or ideas that social researchers
have about how the world operates. But it is also empirical, meaning that it is based on
observations and measurements of reality -- on what we perceive of the world around us. You
can even think of most research as a blending of these two terms -- a comparison of our
theories about how the world operates with our observations of its operation.
The next term -- nomothetic -- comes (I think) from the writings of the psychologist Gordon
Allport. Nomothetic refers to laws or rules that pertain to the general case (nomos in Greek)
and is contrasted with the term "idiographic" which refers to laws or rules that relate to
individuals (idios means 'self' or 'characteristic of an individual' in Greek). In any event, the
point here is that most social research is concerned with the nomothetic -- the general case -- rather than the individual. We often study individuals, but usually we are interested in generalizing to more than just the individual.
In our post-positivist view of science, we no longer regard certainty as attainable. Thus, the
fourth big word that describes much contemporary social research is probabilistic, or based on
probabilities. The inferences that we make in social research have probabilities associated with
them -- they are seldom meant to be considered covering laws that pertain to all cases. Part
of the reason we have seen statistics become so dominant in social research is that it allows
us to estimate probabilities for the situations we study.
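To make the idea concrete, here is a minimal sketch in Python (the survey numbers are invented): we estimate a proportion and attach a 95% confidence interval to it -- a probabilistic statement, not a certain one.

import math

# Hypothetical survey: 520 of 1,000 respondents favor some policy.
n, favor = 1000, 520
p_hat = favor / n  # point estimate of the population proportion

# 95% confidence interval via the normal approximation. The claim is
# probabilistic: the procedure covers the true proportion about 95%
# of the time, not with certainty.
se = math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - 1.96 * se, p_hat + 1.96 * se
print(f"estimate = {p_hat:.3f}, 95% CI = ({low:.3f}, {high:.3f})")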
The last term I want to introduce is causal. You've got to be very careful with this term. Note
that it is spelled causal not casual. You'll really be embarrassed if you write about the "casual
hypothesis" in your study! The term causal means that most social research is interested (at
some point) in looking at cause-effect relationships. This doesn't mean that most studies
actually study cause-effect relationships. There are some studies that simply observe -- for
instance, surveys that seek to describe the percent of people holding a particular opinion. And,
there are many studies that explore relationships -- for example, studies that attempt to see
whether there is a relationship between gender and salary. Probably the vast majority of
applied social research consists of these descriptive and correlational studies. So why am I
talking about causal studies? Because for most social sciences, it is important that we go
beyond just looking at the world or looking at relationships. We would like to be able to
change the world, to improve it and eliminate some of its major problems. If we want to
change the world (especially if we want to do this in an organized, scientific way), we are
automatically interested in causal relationships -- ones that tell us how our causes (e.g.,
programs, treatments) affect the outcomes of interest.
Types of Questions
There are three basic types of questions that research projects can address:
1. Descriptive. When a study is designed primarily to describe what is going on or what exists.
Public opinion polls that seek only to describe the proportion of people who hold various
opinions are primarily descriptive in nature. For instance, if we want to know what percent of
the population would vote for a Democrat or a Republican in the next presidential election,
we are simply interested in describing something.
2. Relational. When a study is designed to look at the relationships between two or more
variables. A public opinion poll that compares what proportion of males and females say they
would vote for a Democratic or a Republican candidate in the next presidential election is
essentially studying the relationship between gender and voting preference.
3. Causal. When a study is designed to determine whether one or more variables (e.g., a
program or treatment variable) causes or affects one or more outcome variables. If we did a
public opinion poll to try to determine whether a recent political advertising campaign changed
voter preferences, we would essentially be studying whether the campaign (cause) changed
the proportion of voters who would vote Democratic or Republican (effect).
The three question types can be viewed as cumulative. That is, a relational study assumes
that you can first describe (by measuring or observing) each of the variables you are trying to
relate. And, a causal study assumes that you can describe both the cause and effect variables
and that you can show that they are related to each other. Causal studies are probably the
most demanding of the three.
Time in Research
Time is an important element of any research design, and here I want to introduce one of the
most fundamental distinctions in research design nomenclature: cross-sectional versus
longitudinal studies. A cross-sectional study is one that takes place at a single point in time. In
effect, we are taking a 'slice' or cross-section of whatever it is we're observing or measuring. A
longitudinal study is one that takes place over time -- we have at least two (and often more)
waves of measurement in a longitudinal design.
A further distinction is made between two types of longitudinal designs: repeated measures
and time series. There is no universally agreed upon rule for distinguishing these two terms,
but in general, if you have two or a few waves of measurement, you are using a repeated
measures design. If you have many waves of measurement over time, you have a time series.
How many is 'many'? Usually, we wouldn't use the term time series unless we had at least
twenty waves of measurement, and often far more. Sometimes the way we distinguish these
is with the analysis methods we would use. Time series analysis requires that you have at
least twenty or so observations. Repeated measures analyses (like repeated measures ANOVA)
aren't often used with as many as twenty waves of measurement.
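If it helps to see the shapes of the data, here is a small sketch using pandas (all values invented): a repeated measures design with three waves per subject alongside a 24-wave monthly time series.

import pandas as pd

# Repeated measures: a few waves of measurement per subject.
repeated = pd.DataFrame({
    "subject": [1, 2, 3],
    "wave_1": [12, 15, 11],  # e.g., pre-test
    "wave_2": [14, 16, 13],  # post-test
    "wave_3": [15, 18, 14],  # follow-up
})

# Time series: many waves -- here, 24 monthly observations of one unit.
months = pd.date_range("2005-01-01", periods=24, freq="MS")
series = pd.Series(range(24), index=months, name="monthly_measure")

print(repeated.shape[1] - 1)  # 3 waves: a repeated measures design
print(len(series))            # 24 waves: enough to call it a time series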
Types of Relationships
A relationship refers to the correspondence between two variables. When we talk about types
of relationships, we can mean that in at least two ways: the nature of the relationship or the
pattern of it.
The Nature of a Relationship
While all relationships tell about the correspondence between two variables, there is a special
type of relationship that holds that the two variables are not only in correspondence, but that
one causes the other. This is the key distinction between a simple correlational relationship
and a causal relationship. A correlational relationship simply says that two things perform in a
synchronized manner. For instance, we often talk of a correlation between inflation and
unemployment. When inflation is high, unemployment also tends to be high. When inflation is
low, unemployment also tends to be low. The two variables are correlated. But knowing that
two variables are correlated does not tell us whether one causes the other. We know, for
instance, that there is a correlation between the number of roads built in Europe and the
number of children born in the United States. Does that mean that if we want fewer children
in the U.S., we should stop building so many roads in Europe? Or, does it mean that if we
don't have enough roads in Europe, we should encourage U.S. citizens to have more babies?
Of course not. (At least, I hope not). While there is a relationship between the number of
roads built and the number of babies, we don't believe that the relationship is a causal one.
This leads to consideration of what is often termed the third variable problem. In this example, it may be that a third variable is causing both the building of roads and the birthrate, and that this shared cause produces the correlation we observe. For instance, perhaps the general world
economy is responsible for both. When the economy is good more roads are built in Europe
and more children are born in the U.S. The key lesson here is that you have to be careful
when you interpret correlations. If you observe a correlation between the number of hours
students use the computer to study and their grade point averages (with high computer users
getting higher grades), you cannot assume that the relationship is causal: that computer use
improves grades. In this case, the third variable might be socioeconomic status -- richer
students who have greater resources at their disposal tend to both use computers and do
better in their grades. It's the resources that drive both computer use and grades, not computer use that causes the change in the grade point average.
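You can watch the third variable problem happen in a simulation. In this sketch (all numbers invented), a "world economy" variable drives both outcomes, and the two end up strongly correlated even though neither causes the other:

import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# The hypothetical third variable: the state of the world economy.
economy = rng.normal(size=n)

# Both outcomes are driven by the economy, not by each other.
roads_built = 2.0 * economy + rng.normal(size=n)  # roads built in Europe
births = 1.5 * economy + rng.normal(size=n)       # births in the U.S.

# The correlation is induced entirely by the shared cause.
r = np.corrcoef(roads_built, births)[0, 1]
print(f"correlation = {r:.2f}")  # roughly 0.74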
Patterns of Relationships
We have several terms to describe the major different types of patterns one might find in a
relationship. First, there is the case of no relationship at all. If you know the values on one
variable, you don't know anything about the values on the other. For instance, I suspect that
there is no relationship between the length of the lifeline on your hand and your grade point
average. If I know your GPA, I don't have any idea how long your lifeline is.
Then, we have the positive relationship. In a positive relationship, high values on one variable
are associated with high values on the other and low values on one are associated with low
values on the other. For example, imagine an idealized positive relationship between years of education and the salary one might expect to be making.
On the other hand, a negative relationship implies that high values on one variable are associated with low values on the other. This is also sometimes termed an inverse relationship. Imagine, for example, an idealized negative relationship between a measure of self-esteem and a measure of paranoia in psychiatric patients.
These are the simplest types of relationships we might typically estimate in research. But the
pattern of a relationship can be more complex than this. Consider, for instance, a curvilinear relationship between the dosage of a drug for an illness and a measure of illness severity. As dosage rises, severity of illness goes down.
But at some point, the patient begins to experience negative side effects associated with too
high a dosage, and the severity of illness begins to increase again.
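Here is a small simulated version of that dose-response pattern (the numbers are invented). Because Pearson's r only measures linear association, it can come out near zero even when the curved relationship is strong:

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dose-response data: severity falls as dosage rises,
# then rises again once side effects kick in (a U-shaped pattern).
dose = np.linspace(0, 10, 200)
severity = (dose - 5) ** 2 + rng.normal(scale=2.0, size=dose.size)

# A straight-line summary misses the pattern almost entirely.
r = np.corrcoef(dose, severity)[0, 1]
print(f"Pearson r = {r:.2f}")  # near zero despite a strong relationship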
Variables
You won't be able to do very much in research unless you know how to talk about variables. A
variable is any entity that can take on different values. OK, so what does that mean? Anything
that can vary can be considered a variable. For instance, age can be considered a variable
because age can take different values for different people or for the same person at different
times. Similarly, country can be considered a variable because a person's country can be
assigned a value.
Variables aren't always 'quantitative' or numerical. The variable 'gender' consists of two text
values: 'male' and 'female'. We can, if it is useful, assign quantitative values in place of the text values, but we don't have to assign numbers in order for something to be a
variable. It's also important to realize that variables aren't only things that we measure in the
traditional sense. For instance, in much social research and in program evaluation, we
consider the treatment or program to be made up of one or more variables (i.e., the 'cause'
can be considered a variable). An educational program can have varying amounts of 'time on
task', 'classroom settings', 'student-teacher ratios', and so on. So even the program can be
considered a variable (which can be made up of a number of sub-variables).
An attribute is a specific value on a variable. For instance, the variable sex or gender has two
attributes: male and female. Or, the variable agreement might be defined as having five
attributes:
•1 = strongly disagree
•2 = disagree
•3 = neutral
•4 = agree
•5 = strongly agree
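In code, assigning values to attributes is just a mapping. A minimal sketch in Python, mirroring the agreement scale above (the responses are hypothetical):

# The variable "agreement" and its five attributes, coded 1-5.
AGREEMENT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

responses = ["agree", "neutral", "strongly agree"]  # hypothetical data
codes = [AGREEMENT[r] for r in responses]
print(codes)  # [4, 3, 5]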
Another important distinction having to do with the term 'variable' is the distinction between
an independent and dependent variable. This distinction is particularly relevant when you are
investigating cause-effect relationships. It took me the longest time to learn this distinction.
(Of course, I'm someone who gets confused about the signs for 'arrivals' and 'departures' at
airports -- do I go to arrivals because I'm arriving at the airport or does the person I'm picking
up go to arrivals because they're arriving on the plane!). I originally thought that an
independent variable was one that would be free to vary or respond to some program or
treatment, and that a dependent variable must be one that depends on my efforts (that is, it's
the treatment). But this is entirely backwards! In fact the independent variable is what you (or
nature) manipulates -- a treatment or program or cause. The dependent variable is what is
affected by the independent variable -- your effects or outcomes. For example, if you are
studying the effects of a new educational program on student achievement, the program is the
independent variable and your measures of achievement are the dependent ones.
Finally, there are two traits of variables that should always be achieved. Each variable should be exhaustive: it should include all possible answerable responses. For instance, if the variable
is "religion" and the only options are "Protestant", "Jewish", and "Muslim", there are quite a
few religions I can think of that haven't been included. The list does not exhaust all possibilities. On the other hand, if you exhaust all the possibilities with some variables -- religion being one of them -- you would simply have too many responses. The way to deal with this is to explicitly list the most common attributes and then use a general category like
"Other" to account for all remaining ones. In addition to being exhaustive, the attributes of a
variable should be mutually exclusive, no respondent should be able to have two attributes
simultaneously. While this might seem obvious, it is often rather tricky in practice. For
instance, you might be tempted to represent the variable "Employment Status" with the two
attributes "employed" and "unemployed." But these attributes are not necessarily mutually
exclusive -- a person who is looking for a second job while employed would be able to check
both attributes! But don't we often use questions on surveys that ask the respondent to
"check all that apply" and then list a series of categories? Yes, we do, but technically speaking,
each of the categories in a question like that is its own variable and is treated dichotomously
as either "checked" or "unchecked", attributes that are mutually exclusive.
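A sketch of that last point in Python (the category names are hypothetical): each "check all that apply" category becomes its own variable with exactly two mutually exclusive attributes, checked or unchecked.

# One respondent's answers to a check-all-that-apply employment item.
# Each category is its own dichotomous variable.
respondent = {
    "employed_full_time": True,
    "employed_part_time": True,  # holding (or seeking) a second job
    "unemployed": False,
    "retired": False,
}

# Every variable has exactly two mutually exclusive attributes:
for variable, checked in respondent.items():
    print(variable, "=", "checked" if checked else "unchecked")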
Hypotheses
An hypothesis is a specific statement of prediction. It describes in concrete (rather than
theoretical) terms what you expect will happen in your study. Not all studies have hypotheses.
Sometimes a study is designed to be exploratory (see inductive research). There is no formal
hypothesis, and perhaps the purpose of the study is to explore some area more thoroughly in
order to develop some specific hypothesis or prediction that can be tested in future research.
A single study may have one or many hypotheses.
Actually, whenever I talk about an hypothesis, I am really thinking simultaneously about two
hypotheses. Let's say that you predict that there will be a relationship between two variables
in your study. The way we would formally set up the hypothesis test is to formulate two
hypothesis statements, one that describes your prediction and one that describes all the other
possible outcomes with respect to the hypothesized relationship. Your prediction is that
variable A and variable B will be related (you don't care whether it's a positive or negative
relationship). Then the only other possible outcome would be that variable A and variable B
are not related. Usually, we call the hypothesis that you support (your prediction) the
alternative hypothesis, and we call the hypothesis that describes the remaining possible
outcomes the null hypothesis. Sometimes we use a notation like HA or H1 to represent the
alternative hypothesis or your prediction, and HO or H0 to represent the null case. You have to
be careful here, though. In some studies, your prediction might very well be that there will be
no difference or change. In this case, you are essentially trying to find support for the null
hypothesis and you are opposed to the alternative.
If your prediction specifies a direction, and the null therefore is the no difference prediction
and the prediction of the opposite direction, we call this a one-tailed hypothesis. For instance,
let's imagine that you are investigating the effects of a new employee training program and
that you believe one of the outcomes will be that there will be less employee absenteeism.
Your two hypotheses might be stated something like this:
The null hypothesis for this study is:
HO: As a result of the XYZ company employee training program, there will either be no
significant difference in employee absenteeism or there will be a significant increase.
which is tested against the alternative hypothesis:
HA: As a result of the XYZ company employee training program, there will be a significant
decrease in employee absenteeism.
Pictured graphically, the alternative hypothesis -- your prediction that the program will decrease absenteeism -- occupies one tail of a hypothetical distribution of absenteeism differences, while the null accounts for the other two possible conditions: no difference, or an increase in absenteeism. The term "one-tailed" refers to that single tail of the distribution on the outcome variable.
When your prediction does not specify a direction, we say you have a two-tailed hypothesis.
For instance, let's assume you are studying a new drug treatment for depression. The drug
has gone through some initial animal trials, but has not yet been tested on humans. You
believe (based on theory and the previous research) that the drug will have an effect, but you
are not confident enough to hypothesize a direction and say the drug will reduce depression
(after all, you've seen more than enough promising drug treatments come along that
eventually were shown to have severe side effects that actually worsened symptoms). In this
case, you might state the two hypotheses like this:
The null hypothesis for this study is:
HO: As a result of 300mg./day of the ABC drug, there will be no significant difference in
depression.
which is tested against the alternative hypothesis:
HA: As a result of 300mg./day of the ABC drug, there will be a significant difference in
depression.
Pictured the same way, this two-tailed prediction occupies both tails of the hypothetical distribution of outcome differences. Again, notice that the term "two-tailed" refers to the tails of the distribution for your outcome variable.
The important thing to remember about stating hypotheses is that you formulate your
prediction (directional or not), and then you formulate a second hypothesis that is mutually
exclusive of the first and incorporates all possible alternative outcomes for that case. When
your study analysis is completed, the idea is that you will have to choose between the two
hypotheses. If your prediction was correct, then you would (usually) reject the null hypothesis
and accept the alternative. If your original prediction was not supported in the data, then you
will accept the null hypothesis and reject the alternative. The logic of hypothesis testing is
based on these two basic principles:
•the formulation of two mutually exclusive hypothesis statements that, together, exhaust all
possible outcomes
•the testing of these so that one is necessarily accepted and the other rejected
OK, I know it's a convoluted, awkward and formalistic way to ask research questions. But it
encompasses a long tradition in statistics called the hypothetico-deductive model, and
sometimes we just have to do things because they're traditions. And anyway, if all of this
hypothesis testing was easy enough so anybody could understand it, how do you think
statisticians would stay employed?
Unit of Analysis
One of the most important ideas in a research project is the unit of analysis. The unit of
analysis is the major entity that you are analyzing in your study. For instance, any of the
following could be a unit of analysis in a study:
•individuals
•groups
•artifacts (books, photos, newspapers)
•geographical units (town, census tract, state)
•social interactions (dyadic relations, divorces, arrests)
Why is it called the 'unit of analysis' and not something else (like, the unit of sampling)?
Because it is the analysis you do in your study that determines what the unit is. For instance,
if you are comparing the children in two classrooms on achievement test scores, the unit is the
individual child because you have a score for each child. On the other hand, if you are
comparing the two classes on classroom climate, your unit of analysis is the group, in this
case the classroom, because you only have a classroom climate score for the class as a whole
and not for each individual student. For different analyses in the same study you may have
different units of analysis. If you decide to base an analysis on student scores, the individual is
the unit. But you might decide to compare average classroom performance. In this case, since
the data that goes into the analysis is the average itself (and not the individuals' scores) the
unit of analysis is actually the group. Even though you had data at the student level, you use
aggregates in the analysis. In many areas of social research these hierarchies of analysis units
have become particularly important and have spawned a whole area of statistical analysis
sometimes referred to as hierarchical modeling. This is true in education, for instance, where we often compare classroom performance but collect achievement data at the individual student level.
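The student-versus-classroom distinction is easy to see with a toy dataset in pandas (the scores are invented):

import pandas as pd

# Achievement data collected at the individual student level.
scores = pd.DataFrame({
    "classroom": ["A", "A", "A", "B", "B", "B"],
    "student": [1, 2, 3, 4, 5, 6],
    "score": [78, 85, 90, 70, 75, 72],
})

# Unit of analysis = the student: six data points go into the analysis.
print(len(scores))  # 6

# Unit of analysis = the classroom: the class means themselves are the
# data, so only two data points go into the analysis.
class_means = scores.groupby("classroom")["score"].mean()
print(class_means)  # A = 84.33, B = 72.33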
Levels of Measurement
The level of measurement refers to the relationship among the values that are assigned to the
attributes for a variable. What does that mean? Begin with the idea of the variable, in this
example "party affiliation." That variable has a number of attributes. Let's assume that in this
particular election context the only relevant attributes are "republican", "democrat", and
"independent". For purposes of analyzing the results of this variable, we arbitrarily assign the
values 1, 2 and 3 to the three attributes. The level of measurement describes the relationship
among these three values. In this case, we simply are using the numbers as shorter
placeholders for the lengthier text terms. We don't assume that higher values mean "more" of
something and lower numbers signify "less". We don't assume that the value of 2 means that
democrats are twice something that republicans are. We don't assume that republicans are in
first place or have the highest priority just because they have the value of 1. In this case, we
only use the values as a shorter name for the attribute. Here, we would describe the level of
measurement as "nominal".
Why is Level of Measurement Important?
First, knowing the level of measurement helps you decide how to interpret the data from that
variable. When you know that a measure is nominal (like the one just described), then you
know that the numerical values are just short codes for the longer names. Second, knowing
the level of measurement helps you decide what statistical analysis is appropriate on the
values that were assigned. If a measure is nominal, then you know that you would never
average the data values or do a t-test on the data.
There are typically four levels of measurement that are defined:
•Nominal
•Ordinal
•Interval
•Ratio
In nominal measurement the numerical values just "name" the attribute uniquely. No ordering
of the cases is implied. For example, jersey numbers in basketball are measures at the
nominal level. A player with number 30 is not more of anything than a player with number 15,
and is certainly not twice whatever number 15 is.
In ordinal measurement the attributes can be rank-ordered. Here, distances between
attributes do not have any meaning. For example, on a survey you might code Educational
Attainment as 0=less than H.S.; 1=some H.S.; 2=H.S. degree; 3=some college; 4=college
degree; 5=post college. In this measure, higher numbers mean more education. But is
the distance from 0 to 1 the same as the distance from 3 to 4? Of course not. The interval between values is not
interpretable in an ordinal measure.
In interval measurement the distance between attributes does have meaning. For example,
when we measure temperature (in Fahrenheit), the distance from 30 to 40 is the same as the distance from 70 to 80. The interval between values is interpretable. Because of this, it makes sense to
compute an average of an interval variable, whereas it doesn't make sense to do so for ordinal
scales. But note that in interval measurement ratios don't make any sense - 80 degrees is not
twice as hot as 40 degrees (although the attribute value is twice as large).
Finally, in ratio measurement there is always an absolute zero that is meaningful. This means
that you can construct a meaningful fraction (or ratio) with a ratio variable. Weight is a ratio
variable. In applied social research most "count" variables are ratio, for example, the number
of clients in past six months. Why? Because you can have zero clients and because it is
meaningful to say that "...we had twice as many clients in the past six months as we did in the
previous six months."
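To pull the four levels together, here is a short sketch (all data invented) of which operations are and aren't meaningful at each level:

# Nominal: values only name attributes; counts and modes are fine,
# but averaging codes like 1, 2, 3 would be meaningless.
party = ["republican", "democrat", "independent", "democrat"]

# Ordinal: rank order is meaningful, distances are not, so a median
# is safer than a mean. Codes follow the education example above.
education = [0, 2, 3, 5, 4]  # 0 = less than H.S. ... 5 = post college

# Interval: distances are meaningful, so a mean makes sense...
temps_f = [30, 40, 70, 80]
print(sum(temps_f) / len(temps_f))  # 55.0
# ...but there is no true zero, so 80F is not "twice as hot" as 40F.

# Ratio: a true zero exists, so ratios are meaningful.
clients = [10, 20]  # clients in two successive six-month periods
print(clients[1] / clients[0])  # 2.0 -- "twice as many" is meaningful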
It's important to recognize that there is a hierarchy implied in the level of measurement idea.
At lower levels of measurement, assumptions tend to be less restrictive and data analyses
tend to be less sensitive. At each level up the hierarchy, the current level includes all of the
qualities of the one below it and adds something new. In general, it is desirable to have a
higher level of measurement (e.g., interval or ratio) rather than a lower one (nominal or
ordinal).