validity reliabil..






Variables are characteristics or conditions that change or
have different values for different individuals.
Age
Gender
Score
Elapsed Time
Mr. Fox compared the effectiveness of alcohol and
opium on the speed of reaction of teenagers and
adults to 3 types of insects.
Variables are well defined, easily observed, and
easily measured.
age, time, gender, score
Constructs are intangible, abstract attributes
such as intelligence, motivation, self-esteem, or academic achievement.
An operational definition specifies a measurement
procedure (a set of operations) for measuring a
construct.
Researchers have developed two general criteria
for evaluating the quality of any measurement
procedure: validity and reliability


To establish validity, you must demonstrate
that the measurement procedure is actually
measuring what it claims to be measuring.
How do we know if these tests actually
measure intelligence?


Face validity is the simplest and least scientific
definition of validity.
Face validity concerns the superficial
appearance, or face value, of a measurement
procedure.

Concurrent validity: the scores obtained from the new
measurement technique are directly related to
the scores obtained from another, better-established
procedure for measuring the same variable.
Examples
• Teacher’s agreement
• Standard tests
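A relationship of this kind is often summarized with a correlation coefficient. The following is a minimal sketch, assuming made-up scores for six students on a hypothetical new test and on an established test; the helper name pearson_r and all numbers are illustrative and not taken from the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for the same six students (invented for illustration).
new_test = [12, 15, 9, 20, 17, 11]
established_test = [48, 55, 40, 70, 62, 45]

r = pearson_r(new_test, established_test)
print(f"Correlation between new and established test: r = {r:.2f}")
```

A correlation close to 1 would count as evidence of concurrent validity; a correlation near 0 would suggest the new test measures something else.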


When the measurements of a construct
accurately predict behavior (according to the
theory), the measurement procedure is said to
have predictive validity.
Example: a medical science test that predicts passing the
Medical Board Exam.
Construct validity is established if you can
demonstrate that your measure matches what
theories and other studies say about that variable.
Example:
• You would need to study all the past research
on aggression and show that the measurement
procedure produces scores that behave in
accordance with everything that is known
about the construct “aggression.”
You search for all the symptoms or questions
that have been used in earlier research to measure
aggression, and then you do a factor analysis to
see which questions are not related to the
construct.
Convergent validity involves creating two
different methods to measure the same
construct, then showing a strong relationship
between the measures obtained from the two
methods.

Divergent validity, on the other hand, involves
demonstrating that we are measuring one
specific construct and not combining two
different constructs in the same measurement
process.
(Diagram labels: Self-Esteem, IQ, Math)
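Convergent and divergent validity can both be checked with correlations. The sketch below assumes two hypothetical self-esteem measures (a questionnaire and an observer rating) plus IQ scores for the same six people; every number is invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data for six participants (not from the slides).
self_esteem_questionnaire = [10, 14, 8, 18, 12, 16]
self_esteem_observer = [11, 15, 9, 17, 12, 17]
iq = [105, 98, 104, 101, 95, 108]

# Convergent: two methods, same construct, should correlate strongly.
convergent_r = pearson_r(self_esteem_questionnaire, self_esteem_observer)
# Divergent: measures of different constructs should correlate weakly.
divergent_r = pearson_r(self_esteem_questionnaire, iq)

print(f"Convergent (questionnaire vs. observer): r = {convergent_r:.2f}")
print(f"Divergent (questionnaire vs. IQ):        r = {divergent_r:.2f}")
```

With these invented numbers the two self-esteem measures agree closely while self-esteem and IQ are nearly unrelated, which is the pattern the two definitions describe.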
1. Face validity
2. Criterion-based validity
   • Concurrent validity
   • Predictive validity
3. Construct validity
   • Convergent
   • Divergent

A measurement procedure is said to have
reliability if it produces identical (or nearly
identical) results when it is used repeatedly to
measure the same individual under the same
conditions.
• Successive measurements (test-retest and
parallel-forms reliability)
• Simultaneous measurements (inter-rater
reliability)
• Internal consistency (split-half reliability,
Kuder-Richardson, and Cronbach’s alpha)
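As one illustration of internal consistency, Cronbach's alpha can be computed from the item variances and the variance of the total scores: alpha = k/(k-1) * (1 - sum of item variances / variance of totals), where k is the number of items. This is a minimal sketch, assuming a hypothetical four-item questionnaire answered by five respondents; the data and the function name cronbach_alpha are illustrative.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose: one tuple per item
    item_variances = [variance(item) for item in items]
    total_scores = [sum(person) for person in responses]
    return (k / (k - 1)) * (1 - sum(item_variances) / variance(total_scores))

# Hypothetical answers: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [1, 2, 1, 2],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items vary together, which is what internal consistency means; test-retest and inter-rater reliability would instead compare two sets of scores for the same individuals.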



Validity and reliability are partially related and
partially independent.
Reliability is a prerequisite for validity.
However, consistency of measurement is no
guarantee of validity.
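The two statements above can be illustrated with a small sketch. It assumes a hypothetical person whose true weight is 70 kg and two imaginary scales: one that is consistent but reads about 5 kg too high, and one that is erratic. All readings are invented.

```python
from statistics import mean, pstdev

TRUE_WEIGHT = 70.0  # hypothetical true value in kg

# Consistent but systematically wrong: reliable, not valid.
readings_biased = [75.1, 74.9, 75.0, 75.0, 75.1]
# Erratic readings: not reliable, so no single reading can be trusted.
readings_noisy = [62.0, 79.5, 68.0, 74.5, 66.0]

def summarize(readings):
    """Return (average error, spread) for repeated readings."""
    return mean(readings) - TRUE_WEIGHT, pstdev(readings)

bias_b, spread_b = summarize(readings_biased)
bias_n, spread_n = summarize(readings_noisy)

print(f"Biased scale: error = {bias_b:+.2f} kg, spread = {spread_b:.2f} kg")
print(f"Noisy scale:  error = {bias_n:+.2f} kg, spread = {spread_n:.2f} kg")
```

The first scale shows that consistency alone does not guarantee validity. The second shows why reliability is a prerequisite: when repeated readings disagree by several kilograms, an individual measurement cannot be counted on to reflect the true value.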
In very general terms, measurement is a procedure
for classifying individuals. The set of categories
used for classification is called the scale of
measurement.
Nominal: the categories simply represent qualitative (not
quantitative) differences in the variable measured.
Ordinal: the categories form a series of ranks, often with
verbal labels such as small, medium, and large.
Interval and Ratio (Scale): the categories are organized
sequentially and all categories are the same size.




Nominal (gender, ethnicity, major, religion)
Ordinal (age groups, grade level, Likert scale, ranks)
Interval (BC, AD, Celsius)
Ratio (time, number, age, height, weight)
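The scale of measurement limits which comparisons are meaningful. The sketch below illustrates one common rule of thumb that is not stated in the slides: count categories for nominal data, use order for ordinal data, and use arithmetic for interval and ratio data. All values are hypothetical.

```python
from statistics import mean, mode

# Hypothetical data, one variable per scale of measurement.
major = ["psych", "bio", "psych", "math", "psych"]       # nominal
size = ["small", "large", "medium", "medium", "large"]   # ordinal
temp_celsius = [18.0, 21.5, 19.0, 22.5, 20.0]            # interval
reaction_seconds = [0.42, 0.55, 0.38, 0.61, 0.47]        # ratio

# Nominal: categories only differ in kind, so report the most frequent one.
most_common_major = mode(major)

# Ordinal: categories can be ranked, so a middle (median) category exists.
rank = {"small": 1, "medium": 2, "large": 3}
ordered_sizes = sorted(size, key=rank.get)
median_size = ordered_sizes[len(ordered_sizes) // 2]

# Interval: equal-sized categories make differences and means meaningful.
mean_temp = mean(temp_celsius)

# Ratio: a true zero also makes statements like "1.6 times as long" meaningful.
slowest_vs_fastest = max(reaction_seconds) / min(reaction_seconds)

print(most_common_major, median_size, round(mean_temp, 1),
      round(slowest_vs_fastest, 2))
```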
The external expressions of a construct are
traditionally classified into three categories:
• Self-report
• Physiological
• Behavioral
Advantage of self-report measures
No one knows more about an individual than that
individual.
Disadvantage of self-report measures
A participant may deliberately lie to create a better
self-image.
A response may also be influenced subtly by:
• The presence of a researcher.
• The wording of the questions.
• Other aspects of the research situation.


Fear, for example, reveals itself by increased
heart rate.
Brain imaging techniques such as positron
emission tomography (PET) are another example.
One advantage of physiological measures is that
they are extremely objective.



One disadvantage of such measures is that they
typically require equipment that may be
expensive or unavailable.
In addition, the presence of monitoring devices
creates an unnatural situation that may cause
participants to react differently.
Example: a lie detector.
The behaviors may be completely natural events
such as laughing, playing, eating, sleeping,
arguing, or speaking.


One method of obtaining a more complete
measure of a construct is to use two (or more)
different procedures to measure the same
variable.
For example, we could record both heart rate
and behavior as measures of fear.
In general, if we expect fairly small, subtle
changes in a variable, then the measurement
procedure must be sensitive enough to detect
the changes.
Which one is the most sensitive?
• Pass-Fail
• A-B-C-D
• 1-10
• 1-100
Typically, a researcher knows the predicted
outcome of a research study and is in a position to
influence the results, either intentionally or
unintentionally.
Even the most highly trained interviewers can influence the results:
• by paralinguistic cues (variations in tone of
voice) that influence the participants to give the
expected or desired responses
• by kinesthetic cues (body posture or facial
expressions)
• by misjudgment of participants’ responses
• by not recording participants’ responses
• by verbal reinforcement of expected or desired
responses


If we observe or measure an inanimate object
such as a table or a block of wood, we do not
expect the object to have any response such as
“Whoa! I’m being watched. I had better be on
my best behavior.”
Unfortunately, this kind of reactivity can
happen with human participants.
Four different subject roles have been identified:
• The good subject role (knows what we want)
• The negativistic subject role (works against us)
• The apprehensive subject role (wants to appear desirable)
• The faithful subject role (pro-science)


One researcher is comparing two brands of fertilizer to
determine whether one produces larger pumpkins than
the other. A second researcher is comparing two
different elementary school physical education
programs to determine whether one program produces
higher self-esteem than the other. Explain why the first
researcher is probably not concerned about reliability
and validity of measurement, whereas the second
researcher probably is.
Explain what is meant by the reliability of measurement, and
describe three methods for measuring reliability.