
Ph.D. COURSE WORK (2010-11)
PAPER-I
ADVANCED RESEARCH METHODOLOGY
ASSIGNMENT
ON
RELIABILITY: MEANING, CHARACTERISTICS, METHODS OF
DETERMINING RELIABILITY AND FACTORS
INFLUENCING RELIABILITY
Submitted to: DR. KULWINDER SINGH
Submitted by: HARSUKHJINDER SINGH, ROLL NO. 9040
DEPARTMENT OF EDUCATION AND COMMUNITY SERVICE
PUNJABI UNIVERSITY
PATIALA
RELIABILITY
The literal meaning of reliability is consistency, dependability or
trust. Reliability is the degree of consistency that an instrument or
procedure demonstrates: whatever it is measuring, it does so
consistently. A data collection test is considered reliable if it
yields consistent results on successive administrations. So by the
reliability of a test we mean how dependable, trustworthy or faithful
the test is.
DEFINITIONS OF RELIABILITY
Gronlund and Linn: Reliability refers to the consistency of
measurement, that is, how consistent test scores or other evaluation
results are from one measurement to another.
Anastasi: Reliability refers to the consistency of scores obtained by
the same individuals when re-examined with the same test on
different occasions or with different sets of equivalent items or under
variable examining conditions.
CHARACTERISTICS OF RELIABILITY
1) It is consistency of a test score.
2) It refers to the accuracy or precision of a measuring instrument.
3) It refers to the test results, not the test itself.
4) It is the coefficient of internal consistency.
5) It is the measure of variable error.
6) It does not ensure the validity or truthfulness or purposiveness
of a test.
7) It is the self-correlation.
8) It is a matter of degree.
METHODS OF DETERMINING RELIABILITY
There are four procedures generally used for computing the
reliability coefficient (sometimes called the self-correlation) of a test.
These are:
1) Test-retest (repetition)
2) Alternate or parallel forms
3) Split-half technique
4) Rational equivalence
All these methods furnish estimates of the reproducibility of test
scores; sometimes one method and sometimes another will provide the
better measure.
1) Test-retest Method
This method involves (i) repetition of a test on the same group
immediately or after a lapse of time, and (ii) computation of the
correlation between the first and the second set of scores. The
correlation coefficient thus obtained indicates the extent or magnitude
of the agreement between the two sets of scores and is often called the
coefficient of stability. The estimate of reliability in this case varies
according to the length of the time interval allowed between the two
administrations. The product-moment method of correlation is commonly
used to estimate reliability from the two sets of scores. A high
correlation between the two sets of scores indicates that the test is
reliable. In other words, it shows that the scores obtained in the first
administration resemble the scores obtained in the second
administration of the same test.
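The computation in step (ii) can be sketched in a few lines of Python. This is only an illustration: the two score lists are made-up data for six pupils, and the helper name pearson_r is an assumption introduced here, not something taken from the sources.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores of the same six pupils on two administrations of one test.
first_administration = [12, 15, 9, 20, 17, 11]
second_administration = [13, 14, 10, 19, 18, 12]

# The test-retest reliability (coefficient of stability) is this correlation.
stability = pearson_r(first_administration, second_administration)
print(round(stability, 2))
```

A value close to 1 means the pupils kept nearly the same relative positions on both occasions; a value near 0 means the two sets of scores hardly agree.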
In this method the time interval plays an important role.
Immediate repetition of a test may involve (i) immediate memory
effects, (ii) practice effects and (iii) confidence effects induced by
familiarity with the contents. Intervals of six months or longer may
show a ‘maturity effect’.
The factors of intervening learning and unlearning may lead to
lowering of self-correlation. Owing to difficulties in controlling
conditions which influence scores on retest, the test-retest method is
generally less useful than are the other methods.
Advantages
1. It is commonly used for estimating the reliability coefficient.
2. It can be used conveniently in a variety of situations.
3. A test of adequate length can be used, with an interval of
many days between successive testings.
Limitations
1. If the test is repeated immediately or after a short time gap,
carry-over, transfer, memory and practice effects, together with
the confidence induced by familiarity with the material, will
almost certainly affect scores when the test is administered a
second time.
2. The index of reliability so obtained is less accurate.
3. If the interval between tests is rather long (more than six
months), growth and maturity affect the scores and tend to
lower the reliability index.
4. Repeating the same test on the same group a second time
makes the students lose interest, so they do not take part
wholeheartedly.
2) Alternate or Parallel Forms Method
This method involves the administration of equivalent or parallel
forms of the test instead of repetition of a single test. The two
equivalent forms are so constructed as to make them similar (but not
identical) in content, mental processes involved, number of items,
difficulty level and other aspects. Parallel tests have equal mean
scores, variances and intercorrelations among items. That is, the two
parallel forms must be homogeneous or similar in all respects, without
duplicating test items. The subjects take one form of the test and
then, as soon as possible, the other form. The reliability coefficient may
be looked upon as the coefficient of correlation between the scores on
the two equivalent forms of the test.
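A rough check of two supposedly parallel forms might look like the sketch below. It compares the means and variances of the forms and then correlates the scores. The scores for Form A and Form B are invented for illustration, and the helper pearson_r is an assumed name.

```python
from math import sqrt
from statistics import mean, variance

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores of the same six pupils on two forms of a test.
form_a = [14, 18, 11, 22, 16, 9]
form_b = [16, 17, 10, 22, 15, 10]

# Parallel forms should have (nearly) equal means and variances.
print("means:", mean(form_a), mean(form_b))
print("variances:", variance(form_a), variance(form_b))

# The correlation between the forms is the parallel-forms reliability estimate.
equivalence = pearson_r(form_a, form_b)
print("reliability estimate:", round(equivalence, 2))
```

If the means or variances differ markedly, the forms are not truly parallel and the correlation should be interpreted with caution.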
Advantages
1. Memory, practice and carry-over effects are minimised and so
have little effect on the scores.
2. The reliability coefficient obtained by this method is a measure
of both temporal stability and consistency of response to
different item samples or test forms.
3. It is useful for estimating the reliability of achievement tests.
Limitations
1. Practice and carry-over factors cannot be completely controlled.
2. When the two forms are not exactly equivalent, comparison
between the two sets of scores obtained from them may lead to
erroneous decisions.
3. Administering two forms in immediate succession creates boredom.
4. The testing conditions while administering Form B may not
be the same as those for Form A.
5. Scores on the second form of the test are generally higher.
3) Split-half Method
In this method the test is administered once on the sample and
it is the most appropriate method for homogeneous tests. This method
provides a measure of the internal consistency of test scores. All the
items of the test are generally arranged in increasing order of
difficulty and administered once to the sample. After the test has been
administered, it is divided into two comparable or equal halves. The
test is divided into two halves only for the purpose of scoring, not for
administration. Two sets of scores are obtained, one from the
odd-numbered items and one from the even-numbered items. The
odd-numbered items (1, 3, 5, 7, etc.) and the even-numbered items
(2, 4, 6, 8, etc.) form two different sets of items for scoring. After the
two scores on the odd and even items are obtained, the coefficient of
correlation between them is calculated. It is really a correlation
between two equivalent halves of scores obtained in one sitting. To
estimate the reliability of the whole test, the Spearman-Brown prophecy
formula is used:
\[
r_{1t} = \frac{2\, r_{\frac{1}{2}\frac{1}{2}}}{1 + r_{\frac{1}{2}\frac{1}{2}}}
\]
Where
$r_{1t}$ = reliability coefficient of the whole test.
$r_{\frac{1}{2}\frac{1}{2}}$ = reliability coefficient of the half test, found experimentally.
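The odd-even split and the Spearman-Brown correction described above can be sketched as follows. The response matrix (six examinees, eight items scored 1 for right and 0 for wrong) is invented for illustration, and the function names are assumptions made for this sketch.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_half_reliability(item_scores):
    """Return (half-test correlation, Spearman-Brown whole-test estimate).

    item_scores holds one list per examinee, with items in test order.
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    r_whole = 2 * r_half / (1 + r_half)  # Spearman-Brown prophecy formula
    return r_half, r_whole

# Hypothetical right/wrong (1/0) responses: 6 examinees x 8 items.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 0, 1, 0, 0],
]

r_half, r_whole = split_half_reliability(responses)
print(round(r_half, 2), round(r_whole, 2))
```

For these made-up data the half-test correlation is about 0.81 and the corrected whole-test estimate about 0.90. The corrected value is higher because the whole test is twice as long as either half.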
Advantages
1. There is no carry-over or practice effect, as the testee is
not tested twice.
2. Fluctuations in an individual’s ability because of environmental
or physical conditions are minimised.
3. The difficulty of constructing parallel forms of the test is eliminated.
Limitations
1. A test can be divided into two equal halves in a number of ways,
and the coefficient of correlation may be different in each case.
2. As the test is administered once, chance errors may affect
the scores on the two halves in the same way, thus tending
to make the reliability coefficient too high.
3. This method cannot be used in power tests and heterogeneous
tests.
4. This method cannot be used for estimating the reliability of speed
tests.
4) Rational Equivalence Method
It is a method based on the consistency of responses to all items.
This method makes it possible to compute the inter-correlations of the
items of the test and the correlation of each item with all the items of
the test. In this method, it is assumed that all items have the same
difficulty value, that the correlations between the items are equal, that
all the items measure essentially the same ability and that the test is
homogeneous in nature. Like the split-half method, this method also
provides a measure of internal consistency. The most popular formula is
the Kuder-Richardson formula:
\[
r_{1t} = \frac{n}{n - 1} \times \frac{\sigma_t^{2} - \sum pq}{\sigma_t^{2}}
\]
Where
$r_{1t}$ = reliability coefficient of the whole test.
$n$ = number of items in the test
$\sigma_t$ = the SD of the test scores
$p$ = the proportion of the group answering a test item correctly
$q = (1 - p)$ = the proportion of the group answering a test item incorrectly.
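A minimal sketch of this Kuder-Richardson computation is given below. It assumes items scored 1 (right) or 0 (wrong) and uses the variance of total scores computed with N in the denominator, which is a common convention but an assumption here. The response matrix and the function name kr20 are likewise illustrative.

```python
def kr20(item_scores):
    """Kuder-Richardson reliability for right/wrong (1/0) items.

    item_scores holds one list per examinee, one entry per item.
    """
    n_people = len(item_scores)
    n_items = len(item_scores[0])

    # Variance of total scores (sigma_t squared), dividing by N.
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people

    # Sum of p*q over items, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_scores) / n_people
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (var_total - sum_pq) / var_total

# Hypothetical right/wrong responses: 6 examinees x 8 items.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 0, 1, 0, 0],
]

reliability = kr20(responses)
print(round(reliability, 2))
```

For these invented data the coefficient comes to roughly 0.70. No splitting of the test is needed, since every item contributes through its own p and q.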
Advantages
1. This coefficient provides some indication of how internally
consistent or homogeneous the items of the test are.
2. The split-half method simply measures equivalence, but the
rational equivalence method measures both equivalence and
homogeneity.
3. It requires neither the administration of two equivalent forms of
the test nor the splitting of the test into two equal halves.
Limitations
1. The coefficient obtained by this method is generally somewhat
lower than the coefficients obtained by other methods.
2. If the items of the test are not highly homogeneous, this
method will yield a lower reliability coefficient.
3. The Kuder-Richardson and split-half methods are not appropriate
for speed tests.
FACTORS INFLUENCING THE RELIABILITY
There are some intrinsic and extrinsic factors which affect the
reliability of test scores:
(A) Intrinsic Factors
The intrinsic factors are those factors which lie within the test
itself. The major intrinsic factors which affect the reliability are:
(i) Length of the test: Other things being equal, the reliability of a
test is a function of its length. Longer tests tend to be more
reliable than shorter tests: the more items the test contains, the
greater its reliability will be, and vice versa. Logically, the larger
the sample of items we take from a given area of knowledge, skill
and the like, the more reliable the test will be. However, it is
difficult to fix the length a test must have to ensure an
appropriate value of reliability.
(ii) Homogeneity of items: Homogeneity of items has two aspects:
item reliability and the homogeneity of traits measured from one
item to another. If the items measure different functions and the
inter-correlations of items are ‘zero’ or near to it, then the
reliability is ‘zero’ or very low, and vice versa.
(iii) Difficulty level of items: If the test items are too easy or too
difficult, the test will tend to produce scores of low reliability,
because in both cases the scores have a restricted spread.
(iv) Test Instruction: Clear and concise instructions increase
reliability. Complicated and ambiguous directions give rise to
difficulties in understanding the questions and the nature of the
response expected from the testee, ultimately leading to low
reliability.
(v) Item selection: If there are too many interdependent items in a
test, the reliability is found to be low.
(vi) Reliability of the Scorer: If the scorer is moody or
inconsistent, the scores will vary from one situation to another.
Thus the reliability of the scorer also influences the reliability of
the test.
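The effect of test length described under factor (i) can be illustrated with the general form of the Spearman-Brown prophecy formula, of which the split-half correction is the special case where the length is doubled. The sketch below assumes that the added items are comparable to the existing ones. The function name and the example values are illustrative only.

```python
def predicted_reliability(r_current, length_factor):
    """General Spearman-Brown formula.

    r_current: reliability of the test at its present length.
    length_factor: new length divided by present length (2 = doubled).
    """
    return length_factor * r_current / (1 + (length_factor - 1) * r_current)

# A hypothetical test with reliability 0.60 at its present length.
for factor in (0.5, 1, 2, 3):
    print(factor, round(predicted_reliability(0.60, factor), 2))
```

Doubling a test of reliability 0.60 gives a predicted reliability of 0.75, while halving it lowers the prediction to about 0.43. The gain from each further lengthening becomes smaller, which is one reason a test cannot simply be lengthened indefinitely.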
(B) Extrinsic Factors
Extrinsic factors are those factors which lie outside the test
itself. The important extrinsic factors influencing reliability are:
(i) Group Variability: The greater the variability of the group
tested, the higher the reliability, and vice versa.
(ii) Guessing and Chance errors: Guessing in a test gives rise to
increased error variance and as such reduces reliability.
(iii) Testing Conditions: The conditions in which the test is
administered and scored may affect reliability in either direction.
As far as practicable, the testing environment should be uniform.
(iv) Momentary Fluctuations: These may raise or lower the
reliability of the test scores.
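The influence of group variability, factor (i) above, can be seen in a small simulation. The sketch below rests on an assumed model in which each observed score is a true score plus random error. The group size, means and standard deviations are arbitrary illustrative choices, not figures from the sources.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

random.seed(0)  # fixed seed so the sketch is repeatable

# Assumed model: observed score = true score + random error of measurement.
true_scores = [random.gauss(50, 10) for _ in range(2000)]
test_1 = [t + random.gauss(0, 5) for t in true_scores]
test_2 = [t + random.gauss(0, 5) for t in true_scores]

# Reliability estimated on the whole, heterogeneous group.
r_full_group = pearson_r(test_1, test_2)

# Reliability estimated on a homogeneous sub-group (true scores 45 to 55).
narrow = [(a, b) for t, a, b in zip(true_scores, test_1, test_2) if 45 <= t <= 55]
r_narrow_group = pearson_r([a for a, _ in narrow], [b for _, b in narrow])

print(round(r_full_group, 2), round(r_narrow_group, 2))
```

The error of measurement is the same in both cases, yet the coefficient for the narrow group is much lower, because there is less true variation among its members for the test to detect.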