13 Validity in research

Dr Ayaz Afsar
1
Introduction

There are many different types of validity and reliability. Threats to
validity and reliability can never be erased completely; rather the effects
of these threats can be attenuated by attention to validity and reliability
throughout a piece of research.

I will discuss validity and reliability in quantitative and qualitative,
naturalistic research. Both terms can be applied to these two types of
research, though how validity and reliability are addressed in the two
approaches varies.

Finally, validity and reliability are addressed in relation to the different
instruments used for data collection. It is suggested that reliability is a
necessary but insufficient condition for validity in research; that is,
reliability is a necessary precondition of validity, and validity may be a
sufficient but not necessary condition for reliability.
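
As a rough illustration of why reliability alone does not guarantee validity, the sketch below compares two hypothetical weighing scales. The readings, the true weight and the `describe` helper are all invented for the example; the analogy to a research instrument is an assumption made here, not part of the lecture.

```python
import statistics


def describe(readings, true_value):
    """Return (spread, bias) for repeated readings of a known quantity.

    A small spread suggests consistency (reliability); a small bias
    suggests the instrument measures what it claims to (validity).
    """
    spread = statistics.stdev(readings)
    bias = statistics.mean(readings) - true_value
    return spread, bias


TRUE_WEIGHT = 70.0  # hypothetical true value, in kg

# Hypothetical scale A: very consistent, but always about 5 kg too high.
scale_a = [75.1, 74.9, 75.0, 75.2, 74.8]
# Hypothetical scale B: equally consistent, and accurate.
scale_b = [70.1, 69.9, 70.0, 70.2, 69.8]

spread_a, bias_a = describe(scale_a, TRUE_WEIGHT)
spread_b, bias_b = describe(scale_b, TRUE_WEIGHT)

print(f"Scale A: spread={spread_a:.2f}, bias={bias_a:.2f}")
print(f"Scale B: spread={spread_b:.2f}, bias={bias_b:.2f}")
```

Both scales are equally consistent, yet only scale B is close to the true value: scale A is reliable without being valid.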
2
Defining validity

Validity is an important key to effective research. If a piece of research
is invalid then it is worthless. Validity is thus a requirement for both
quantitative and qualitative/naturalistic research.

While earlier versions of validity were based on the view that it was
essentially a demonstration that a particular instrument in fact
measures what it purports to measure, more recently validity has taken
many forms. For example, in qualitative data validity might be
addressed through the honesty, depth, richness and scope of the data
achieved, the participants approached, the extent of triangulation and
the disinterestedness or objectivity of the researcher (Winter 2000).

In quantitative data, validity might be improved through careful
sampling, appropriate instrumentation and appropriate statistical
treatments of the data.
3
Cont…Defining validity

It is impossible for research to be 100 per cent valid; that is the optimism
of perfection.

Quantitative research possesses a measure of standard error which is
inbuilt and which has to be acknowledged. In qualitative data the
subjectivity of respondents, their opinions, attitudes and perspectives
together contribute to a degree of bias.
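
To make the idea of an inbuilt standard error concrete, here is a minimal sketch that computes the standard error of the mean for a small sample. The scores are hypothetical, and the standard error of the mean is only one of several standard-error measures a quantitative study might report.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    if len(sample) < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(sample) / math.sqrt(len(sample))


scores = [52, 48, 55, 50, 45]  # hypothetical test scores
mean = statistics.mean(scores)
sem = standard_error(scores)

print(f"mean = {mean}, standard error = {sem:.2f}")
```

The larger the sample, the smaller this error tends to become, but it never reaches zero, which is one reason why validity is a matter of degree.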

Validity, then, should be seen as a matter of degree rather than as an
absolute state (Gronlund 1981). Hence at best we strive to minimize
invalidity and maximize validity.

There are several different kinds of validity.
4
Kinds of Validity
• content validity
• criterion-related validity
• construct validity
• internal validity
• external validity
• concurrent validity
• face validity
• jury validity
• predictive validity
• consequential validity
• systemic validity
• catalytic validity
• ecological validity
• cultural validity
• descriptive validity
• interpretive validity
• theoretical validity
• evaluative validity
5
Cont…Kinds of Validity

It is not my intention to discuss all of these terms in depth. Rather the
main types of validity will be addressed. The argument will be made
that, while some of these terms are more comfortably the preserve of
quantitative methodologies, this is not exclusively the case.

Indeed, validity is the touchstone of all types of educational research.
That said, it is important that validity in different research traditions is
faithful to those traditions; it would be absurd to declare a piece of
research invalid if it were not striving to meet certain kinds of validity,
e.g. generalizability, replicability and controllability.

Hence the researcher will need to locate discussions of validity within
the research paradigm that is being used. This is not to suggest,
however, that research should be paradigm-bound; that is a recipe for
stagnation and conservatism.
6
Cont…Kinds of Validity
Nevertheless, validity must be faithful to its premises and positivist
research has to be faithful to positivist principles, for example:

controllability

replicability

predictability

the derivation of laws and universal statements of behaviour

context-freedom

fragmentation and atomization of research

randomization of samples

observability.
7
Naturalistic research
By way of contrast, naturalistic research has several principles (Lincoln
and Guba 1985; Bogdan and Biklen 1992):
• The natural setting is the principal source of data.
• Context-boundedness and ‘thick description’ are important.
• Data are socially situated, and socially and culturally saturated.
• The researcher is part of the researched world.
• As we live in an already interpreted world, a doubly hermeneutic
exercise (Giddens 1979) is necessary to understand others’
understandings of the world; the paradox here is that the most
sufficiently complex instrument to understand human life is another
human, but that this risks human error in all its forms.
• There should be holism in the research.
• The researcher, rather than a research tool, is the key instrument of
research.
• The data are descriptive.
• There is a concern for processes rather than simply with outcomes.
8

Data are analysed inductively rather than using a priori categories.

Data are presented in terms of the respondents rather than researchers.

Seeing and reporting the situation should be through the eyes of
participants – from the native’s point of view (Geertz 1974).

Respondent validation is important.

Catching meaning and intention are essential.

Indeed, Maxwell (1992) argues that qualitative researchers need to be
cautious not to be working within the agenda of the positivists in arguing
for the need for research to demonstrate concurrent, predictive,
convergent, criterion-related, internal and external validity.

The claim is made (Agar 1993) that, in qualitative data collection, the
intensive personal involvement and in-depth responses of individuals
secure a sufficient level of validity and reliability.
9
Maxwell (1992) argues for five kinds of validity in qualitative methods
that explore his notion of ‘understanding’:
• Descriptive validity (the factual accuracy of the account, that it is not
made up, selective or distorted): in this respect validity subsumes
reliability.
• Interpretive validity (the ability of the research to catch the meaning,
interpretations, terms and intentions that situations and events, i.e. data,
have for the participants/subjects themselves, in their terms).
• Theoretical validity (the theoretical constructions that the researcher
brings to the research, including those of the researched).
• Generalizability (the view that the theory generated may be useful in
understanding other similar situations).
• Evaluative validity (the application of an evaluative, judgemental
framework to that which is being researched, rather than a descriptive,
explanatory or interpretive framework).

Both qualitative and quantitative methods can address internal and
external validity.
10
Internal validity
Internal validity seeks to demonstrate that the explanation of a particular
event, issue or set of data which a piece of research provides can actually
be sustained by the data. The findings must describe accurately the
phenomena being researched.

In ethnographic research internal validity can be addressed in several
ways:

using low-inference descriptors

using multiple researchers

using participant researchers

using peer examination of data

using mechanical means to record, store and retrieve data.
In ethnographic, qualitative research there are several overriding kinds of
internal validity (LeCompte and Preissle 1993: 323–4):
11
Internal validity

confidence in the data

the authenticity of the data (the ability of the research to report a
situation through the eyes of the participants)

the cogency of the data

the soundness of the research design

the credibility of the data

the auditability of the data

the dependability of the data

the confirmability of the data.
12
External validity

External validity refers to the degree to which the results can be
generalized to the wider population, cases or situations.

The issue of generalization is problematical.

For positivist researchers generalizability is a sine qua non, while this is
attenuated in naturalistic research.

For positivists variables have to be isolated and controlled, and samples
randomized, while for ethnographers human behaviour is infinitely
complex, irreducible, socially situated and unique.

Generalizability in naturalistic research is interpreted as comparability
and transferability.

Schofield (1990: 200) suggests that it is important in qualitative research
to provide a clear, detailed and in-depth description so that others can
decide the extent to which findings from one piece of research are
generalizable to another situation, i.e. to address the twin issues of
comparability and translatability.
13
External validity

Lincoln and Guba (1985: 316) caution the naturalistic researcher against
this; they argue that it is not the researcher’s task to provide an index of
transferability; rather, they suggest, researchers should provide
sufficiently rich data for the readers and users of research to determine
whether transferability is possible.

In this respect transferability requires thick description.

Positivist researchers are more concerned to derive universal
statements of general social processes than to provide accounts
of the degree of commonality between various social settings (e.g.
schools and classrooms).
14
In naturalistic research threats to external validity include (Lincoln and
Guba 1985: 189, 300):

selection effects: where constructs selected in fact are only
relevant to a certain group

setting effects: where the results are largely a function of their
context

history effects: where the situations have been arrived at by unique
circumstances and, therefore, are not comparable

construct effects: where the constructs being used are peculiar to
a certain group.
15
Content validity

To demonstrate this form of validity the instrument must show that it
fairly and comprehensively covers the domain or items that it purports to
cover. It is unlikely that each issue will be able to be addressed in its
entirety simply because of the time available or respondents’ motivation
to complete, for example, a long questionnaire.

If this is the case, then the researcher must ensure that the elements of
the main issue to be covered in the research are both a fair
representation of the wider issue under investigation and that the
elements chosen for the research sample are themselves addressed in
depth and breadth.

Careful sampling of items is required to ensure their representativeness.
16

For example, if the researcher wished to see how well a group of
students could spell 1,000 words in French but decided to have a
sample of only 50 words for the spelling test, then that test would have
to ensure that it represented the range of spellings in the 1,000 words –
maybe by ensuring that the spelling rules had all been included, or that
possible spelling errors had been covered in the test in the proportions
in which they occurred in the 1,000 words.
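
One way to picture sampling items 'in the proportions in which they occurred' is proportional allocation. The sketch below is only an illustration: the four spelling-rule categories and their counts are hypothetical, and the largest-remainder rounding used in `allocate_items` is one reasonable choice among several, not a method taken from the lecture.

```python
def allocate_items(category_counts, sample_size):
    """Allocate test items to categories in proportion to their share of
    the full item pool, using largest-remainder rounding so that the
    allocations add up exactly to sample_size."""
    total = sum(category_counts.values())
    exact = {c: n * sample_size / total for c, n in category_counts.items()}
    allocation = {c: int(q) for c, q in exact.items()}  # round down first
    leftover = sample_size - sum(allocation.values())
    # Give the remaining items to the categories with the largest fractions.
    by_remainder = sorted(exact, key=lambda c: exact[c] - allocation[c],
                          reverse=True)
    for c in by_remainder[:leftover]:
        allocation[c] += 1
    return allocation


# Hypothetical breakdown of the 1,000 words by spelling rule.
word_pool = {"rule_a": 400, "rule_b": 325, "rule_c": 175, "rule_d": 100}

test_plan = allocate_items(word_pool, 50)
print(test_plan)
```

Each rule then appears in the 50-word test in roughly the same proportion as in the full list of 1,000 words.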
17
Construct validity

A construct is an abstract; this separates it from the previous types of
validity which dealt in actualities – defined content.

In this type of validity agreement is sought on the ‘operationalized’
forms of a construct, clarifying what we mean when we use this
construct.

Hence in this form of validity the articulation of the construct is
important; is the researcher’s understanding of this construct similar to
that which is generally accepted to be the construct?
18
Construct validity
For example, let us say that the researcher wished to assess a child’s
intelligence (assuming, for the sake of this example, that it is a unitary
quality). The researcher could say that he or she construed intelligence to
be demonstrated in the ability to sharpen a pencil. How acceptable a
construction of intelligence is this? Is not intelligence something else (e.g.
that which is demonstrated by a high result in an intelligence test)?
To establish construct validity the researcher would need to be assured
that his or her construction of a particular issue agreed with other
constructions of the same underlying issue, e.g. intelligence, creativity,
anxiety, motivation. …
In qualitative/ethnographic research construct validity must demonstrate
that the categories that the researchers are using are meaningful to the
participants themselves, i.e. that they reflect the way in which the
participants actually experience and construe the situations in the
research, that they see the situation through the actors’ eyes.
19
Ecological validity

In quantitative, positivist research variables are frequently isolated,
controlled and manipulated in contrived settings.

For qualitative, naturalistic research a fundamental premise is that the
researcher deliberately does not try to manipulate variables or
conditions, that the situations in the research occur naturally.

The intention here is to give accurate portrayals of the realities of social
situations in their own terms, in their natural or conventional settings. In
education, ecological validity is particularly important and useful in
charting how policies are actually happening ‘at the chalk face’.
20

For ecological validity to be demonstrated it is important to include and
address in the research as many characteristics in, and factors of, a
given situation as possible. The difficulty with this is that the more
characteristics are included and described, the more difficult it is to
abide by central ethical tenets of much research: non-traceability,
anonymity and non-identifiability.
21
Cultural validity

A type of validity related to ecological validity is cultural validity (Morgan
1999). This is particularly an issue in cross-cultural, intercultural and
comparative kinds of research, where the intention is to shape research
so that it is appropriate to the culture of the researched, and where the
researcher and the researched are members of different cultures.

Cultural validity is defined as ‘the degree to which a study is
appropriate to the cultural setting where research is to be carried out’
(Joy 2003: 1).

Cultural validity applies at all stages of the research, and affects its
planning, implementation and dissemination. It involves a degree of
sensitivity to the participants, cultures and circumstances being studied.
22
Questions the researchers may face

Is the research question understandable and of importance to the
target group?

Is the researcher the appropriate person to conduct the research?

Are the sources of the theories that the research is based on
appropriate for the target culture?

How do researchers in the target culture deal with the issues related to
the research question (including their method and findings)?

Are appropriate gatekeepers and informants chosen?

Are the research design and research instruments ethical and
appropriate according to the standards of the target culture?

How do members of the target culture define the salient terms of the
research?

Are documents and other information translated in a culturally
appropriate way?

Are the possible results of the research of potential value and benefit to
the target culture?
23
Cont.

Does interpretation of the results include the opinions and views of
members of the target culture?

Are the results made available to members of the target culture for
review and comment?

Does the researcher accurately and fairly communicate the results in
their cultural context to people who are not members of the target
culture?
24
Catalytic validity


Catalytic validity embraces the paradigm of critical theory.
Put neutrally, catalytic validity simply strives to ensure that research
leads to action. However, the story does not end there, for discussions
of catalytic validity are substantive; like critical theory, catalytic validity
suggests an agenda. The agenda for catalytic validity is to help
participants to understand their worlds in order to transform them.

The agenda is explicitly political, for catalytic validity suggests the need
to expose whose definitions of the situation are operating in the
situation.
25

Lincoln and Guba (1986) suggest that the criterion of ‘fairness’
should be applied to research, meaning that it should not only
augment and improve the participants’ experience of the world, but
also improve the empowerment of the participants.
26
Cont.

Catalytic validity – a major feature in feminist research which needs to
permeate all research – requires solidarity in the participants, and an
ability of the research to promote emancipation, autonomy and freedom
within a just, egalitarian and democratic society, and to reveal the
distortions, ideological deformations and limitations that reside in
research, communication and social structures (see also LeCompte and
Preissle 1993).

Validity, it is argued (Mishler 1990; Scheurich 1996), is no longer an
ahistorical given, but contestable, suggesting that the definitions of valid
research reside in the academic communities of the powerful.

Catalytic validity reasserts the centrality of ethics in the research process,
for it requires researchers to interrogate their allegiances, responsibilities
and self-interestedness (Burgess 1989).
27
Consequential validity
• Partially related to catalytic validity is consequential validity, which
argues that the ways in which research data are used (the
consequences of the research) are in keeping with the capability or
intentions of the research, i.e. the consequences of the research do not
exceed the capability of the research and the action-related
consequences of the research are both legitimate and fulfilled.
• Clearly, once the research is in the public domain, the researcher has
little or no control over the way in which it is used.
• However, and this is often a political matter, research should not be
used in ways in which it was not intended to be used, for example by
exceeding the capability of the research data to make claims, by acting
on the research in ways that the research does not support (e.g. by
using the research for illegitimate epistemic support), by making
illegitimate claims by using the research in unacceptable ways (e.g. by
selection, distortion) and by not acting on the research in ways that were
agreed, i.e. errors of omission and commission.
28

A clear example of consequential validity is formative assessment.
This is concerned with the extent to which students improve as a
result of feedback given. Hence, if there is insufficient feedback for
students to improve, or if students are unable to improve as a result
of – a consequence of – the feedback, then the formative
assessment has little consequential validity.
29
Criterion-related validity

This form of validity endeavours to relate the results of one particular
instrument to another external criterion. Within this type of validity there
are two principal forms: predictive validity and concurrent validity.

Predictive validity is achieved if the data acquired at the first round of
research correlate highly with data acquired at a future date.
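
A minimal sketch of what 'correlate highly' can mean in practice is the Pearson correlation coefficient between two rounds of data. The two score lists below are hypothetical, and how large a coefficient counts as 'high' is a judgement the lecture does not specify.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for the same five students at two points in time.
first_round = [40, 55, 60, 72, 85]
later_round = [45, 50, 65, 70, 90]

r = pearson_r(first_round, later_round)
print(f"r = {r:.2f}")
```

A coefficient close to 1 would support a claim of predictive validity for the first instrument; a coefficient near 0 would not.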

A variation on this theme is encountered in the notion of concurrent
validity. To demonstrate this form of validity the data gathered from
using one instrument must correlate highly with data gathered from
using another instrument. For example, suppose it was decided to
research a student’s problem-solving ability. The researcher might
observe the student working on a problem, or might talk to the student
about how s/he is tackling the problem, or might ask the student to
write down how s/he tackled the problem.
30

Here the researcher has three different data-collecting instruments –
observation, interview and documentation respectively. If the results all
agreed – concurred – that, according to given criteria for
problem-solving ability, the student demonstrated a good ability to solve a
problem, then the researcher would be able to say with greater
confidence (validity) that the student was good at problem-solving than
if the researcher had arrived at that judgement simply from using one
instrument.

An important partner to concurrent validity, which is also a bridge into
later discussions of reliability, is triangulation.
32
Triangulation
• Triangulation may be defined as the use of two or more methods of
data collection in the study of some aspect of human behaviour.
• The use of multiple methods, or the multi-method approach as it is
sometimes called, contrasts with the ubiquitous but generally more
vulnerable single-method approach that characterizes so much of
research in the social sciences.
• In its original and literal sense, triangulation is a technique of physical
measurement: maritime navigators, military strategists and surveyors,
for example, use (or used to use) several locational markers in their
endeavours to pinpoint a single spot or objective.
• By analogy, triangular techniques in the social sciences attempt to map
out, or explain more fully, the richness and complexity of human
behaviour by studying it from more than one standpoint and, in so
doing, by making use of both quantitative and qualitative data.
• Triangulation is a powerful way of demonstrating concurrent validity,
particularly in qualitative research (Campbell and Fiske 1959).
33
Cont…Triangulation

The advantages of the multi-method approach in social research are
manifold and I will examine two of them.

First, whereas the single observation in fields such as medicine,
chemistry and physics normally yields sufficient and unambiguous
information on selected phenomena, it provides only a limited view of
the complexity of human behaviour and of situations in which human
beings interact.

It has been observed that, as research methods act as filters through
which the environment is selectively experienced, they are never
atheoretical or neutral in representing the world of experience (Smith
1975). Exclusive reliance on one method, therefore, may bias or distort
the researcher’s picture of the particular slice of reality being
investigated. The researcher needs to be confident that the data
generated are not simply artefacts of one specific method of collection
(Lin 1976).
34

I come now to a second advantage: some theorists have been
sharply critical of the limited use to which existing methods of inquiry
in the social sciences have been put.

The use of triangular techniques, it is argued, will help to overcome
the problem of ‘method-boundedness’, as it has been termed; indeed
Gorard and Taylor (2004) demonstrate the value of combining
qualitative and quantitative methods.
35
Types of triangulation & their characteristics

We have just seen how triangulation is characterized by a multi-method
approach to a problem in contrast to a single-method approach.

Denzin (1970b) has, however, extended this view of triangulation to
take in several other types as well as the multi-method kind which he
terms ‘methodological triangulation’:

Time triangulation: this type attempts to take into consideration the
factors of change and process by utilizing cross-sectional and
longitudinal designs.

Space triangulation: this type attempts to overcome the parochialism
of studies conducted in the same country or within the same subculture
by making use of cross-cultural techniques.

Combined levels of triangulation: this type uses more than one level
of analysis from the three principal levels used in the social sciences,
namely, the individual level, the interactive level (groups), and the level
of collectivities (organizational, cultural or societal).
36
Cont…Types of triangulation and their characteristics

Theoretical triangulation: this type draws upon alternative or
competing theories in preference to utilizing one viewpoint only.

Investigator triangulation: this type engages more than one
observer; data are discovered independently by more than one
observer (Silverman 1993: 99).

Methodological triangulation: this type uses either the same
method on different occasions, or different methods on the same
object of study.
37

Many studies in the social sciences are conducted at one point only
in time, thereby ignoring the effects of social change and process.
Time triangulation goes some way to rectifying these omissions by
making use of cross-sectional and longitudinal approaches.
Cross-sectional studies collect data at one point in time; longitudinal
studies collect data from the same group at different points in the
time sequence.
38
Ensuring validity

It is very easy to slip into invalidity; it is both insidious and pernicious as
it can enter at every stage of a piece of research. The attempt to build
out invalidity is essential if the researcher is to be able to have
confidence in the elements of the research plan, data acquisition, data
processing, analysis, interpretation and the ensuing judgement.

At the design stage, threats to validity can be minimized by:

choosing an appropriate time scale

ensuring that there are adequate resources for the required research to
be undertaken

selecting an appropriate methodology for answering the research
questions

selecting appropriate instrumentation for gathering the type
of data required

using an appropriate sample (e.g. one which is representative, not too
small or too large)
39
Cont…Ensuring validity
• demonstrating internal, external, content, concurrent and construct
validity and ‘operationalizing’ the constructs fairly
• ensuring reliability in terms of stability (consistency, equivalence,
split-half analysis of test material)
• selecting appropriate foci to answer the research questions
• devising and using appropriate instruments:
• ensuring that readability levels are appropriate; avoiding any ambiguity
of instructions, terms and questions; using instruments that will catch
the complexity of issues
• avoiding leading questions
• ensuring that the level of test is appropriate, e.g. neither too easy nor
too difficult; avoiding test items with little discriminability
• avoiding making the instruments too short or too long
• avoiding too many or too few items for each issue
• avoiding a biased choice of researcher or research team (e.g. insiders
or outsiders as researchers).
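
The split-half analysis mentioned in the list above can be sketched as follows. This is an illustrative assumption about how such an analysis might be run, not the lecturer's procedure: the odd/even split, the 0/1 item scores and the function names are all hypothetical, and the Spearman-Brown formula is used to step the half-test correlation up to full test length.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def split_half_reliability(item_scores):
    """Estimate reliability from one test sitting.

    item_scores holds one list of item scores per respondent. Items are
    split into odd- and even-positioned halves, the half totals are
    correlated, and the Spearman-Brown formula adjusts for test length.
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    return 2 * r_half / (1 + r_half)


# Hypothetical results: 5 respondents, 6 items scored 1 (right) or 0 (wrong).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

reliability = split_half_reliability(responses)
print(f"split-half reliability = {reliability:.2f}")
```

A value close to 1 suggests that the two halves of the test rank respondents consistently.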