Can Subject Matter Experts’ Rating of Statement Extremity be Used to Streamline the
Development of Unidimensional Pairwise Preference Scales?
Stark, S.
Chernyshenko, O. S.
Guenole, N.
Noncognitive computerized adaptive testing (CAT) is rarely applied in organizational
settings, for two reasons. First, sample sizes in this context are relatively small.
Second, it is hard to develop an item pool of more than 50 items for a noncognitive
construct. Consequently, estimation with traditional methods such as marginal
maximum likelihood (MML) is neither accurate nor efficient.
The authors argue that the first concern is less critical. To address the second,
they adopt a pairwise preference format, in which items are created by pairing
statements. The Zinnes-Griggs (ZG) ideal point model is used because of its
psychometric properties for noncognitive testing.
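As a rough illustration of how a pairwise preference item behaves under the ZG model, the sketch below computes the probability of preferring statement s to statement t. The formula is the form in which the model is commonly written in the IRT literature; it is not quoted in this summary, so treat it as an assumption. The names `normal_cdf`, `zg_prob`, `theta`, `mu_s`, and `mu_t` are chosen here for illustration only.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def zg_prob(theta: float, mu_s: float, mu_t: float) -> float:
    """Probability of preferring statement s to statement t.

    Assumed form of the ZG response function (not quoted from the paper):
        P = 1 - Phi(a) - Phi(b) + 2 * Phi(a) * Phi(b)
        a = (2 * theta - mu_s - mu_t) / sqrt(3),  b = mu_s - mu_t
    """
    a = (2.0 * theta - mu_s - mu_t) / math.sqrt(3.0)
    b = mu_s - mu_t
    phi_a, phi_b = normal_cdf(a), normal_cdf(b)
    return 1.0 - phi_a - phi_b + 2.0 * phi_a * phi_b


if __name__ == "__main__":
    # Compare a small and a large gap between the two statement locations.
    for gap in (0.5, 2.0):
        mu_s, mu_t = gap / 2.0, -gap / 2.0
        probs = [round(zg_prob(th, mu_s, mu_t), 3) for th in (-2.0, 0.0, 2.0)]
        print(f"gap={gap}: {probs}")
```

Under this form, a respondent located at the midpoint of the two statements prefers either one with probability 0.5, and a larger gap between the statement locations makes the response curve more sharply separating. That is the property the first comment below relies on.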
Subject matter experts (SMEs) were used to develop adaptive unidimensional pairwise
preference (UPP) scales, on the premise that SME ratings are an effective way to
streamline CAT development. Two studies were conducted: the first examined the
consistency between SME and MML location estimates, and the second used
simulation to examine the recovery of known trait scores.
Results showed that error in SME-based location estimates had little harmful
effect on score accuracy or validity; hence, SME ratings of location can substitute
for MML estimates.
Questions & Comments
1. The finding that SME-based locations are comparable to the MML estimates rests
on the condition that the locations of the two paired statements differ by a large
enough amount ($|\mu_s - \mu_t| \geq 2$). Only under this condition can the experts
accurately 'estimate' the trait score. For items whose paired statements do not
differ this much, the SME-based estimates are expected to be less accurate than
shown in this study.
2. Despite the correlation between the location estimates obtained with the SME
and MML approaches, 19 of the 49 parameter estimates differ by about 1.5 logits
or more between the two approaches (Table 2).
3. I think it is more appropriate to apply the SME approach to CTT, in which the
main purpose is to classify participants into several groups rather than to
estimate trait scores.
4. The experts' rating performance is not taken into account in this study. It would
be of interest to incorporate this effect into the model in future studies.
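The point in comment 2, that a strong correlation can coexist with large differences in individual estimates, can be checked with a small sketch like the one below. The location values are hypothetical and are not taken from Table 2; the function names and the 1.5-logit threshold argument are likewise illustrative choices.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def count_large_gaps(x, y, threshold=1.5):
    """Number of paired estimates whose absolute difference is >= threshold."""
    return sum(1 for a, b in zip(x, y) if abs(a - b) >= threshold)


# Hypothetical values for illustration only; NOT taken from Table 2.
sme_locations = [-2.0, -1.0, 0.0, 1.0, 2.0]
mml_locations = [-3.6, -1.2, 0.1, 1.1, 3.7]

r = pearson_r(sme_locations, mml_locations)
n_large = count_large_gaps(sme_locations, mml_locations)
print(f"r = {r:.2f}; differences >= 1.5 logits: {n_large} of {len(sme_locations)}")
```

In this made-up example the two sets of estimates are almost perfectly correlated, yet the extreme statements still differ by more than 1.5 logits. A correlation alone therefore does not show that the SME and MML locations agree in absolute terms.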