Rescuing the 2PL model

Cognitive psychology meets psychometric theory: On the relation between process models for decision making and latent variable models for individual differences
$\theta \ge 0$
Han van der Maas, Gunter Maris, Denny Borsboom
University of Amsterdam, Cito
Twente, 2009

2PL model in item response theory

$$P_{jk}(+ \mid \theta_k) = \frac{e^{\alpha_j(\theta_k - \beta_j)}}{1 + e^{\alpha_j(\theta_k - \beta_j)}}$$

• Main model for educational and psychological measurement in psychometrics
• Justifications
– Logistic equation does the job (Toolbox)
– Derivation from ‘desirable’ statistical properties (Fischer)
– Derivation from ‘desirable’ measurement properties (Roskam)
– Derivation from psychological process model (Tuerlinckx & De Boeck)

Diffusion model

• Stochastic accumulation of evidence stops when decision threshold is reached
• Implements SPRT (optimal accuracy given RT)
• Very influential and very well studied model
• More complex, biologically realistic models reduce to this model
• Explains the speed accuracy trade-off

Relation to IRT

• Two main parameters:
– Boundary separation (decision criterion): $a$
– Drift rate (rate of evidence accumulation): $v$

[Figure: diffusion process over time $t$ with drift $v$, starting point $z$, and boundaries at $a$ ($X = 1$) and $0$ ($X = 0$); RT marked]

If $z = a/2$:

$$P(X = 1) = \frac{e^{av}}{1 + e^{av}}$$

$$E(DT) = \frac{a}{2v} \cdot \frac{1 - e^{-av}}{1 + e^{-av}} \approx \frac{a}{2v} \quad \text{if } P(X = 1) \text{ is large}$$

Tuerlinckx & De Boeck (2005):
• Drift = ability − difficulty ($v = \theta - \beta$)
• Discrimination = boundary separation ($a = \alpha$)

$$P(X = 1) = \frac{e^{\alpha(\theta - \beta)}}{1 + e^{\alpha(\theta - \beta)}}, \qquad E(DT) \approx \frac{\alpha}{2(\theta - \beta)}$$

Three qualitative predictions

1. Subjects are slowest when $v = 0$, that is, when $\theta = \beta$
– When $\theta \ll \beta$, responding is very fast
2. For negative $v$, i.e. $\theta < \beta$, allowing more time to think reduces the probability correct.
3. By reducing time to think, $P(+) \to .5$, irrespective of $\theta$

[Figure: curves for a long time limit and a short time limit]

Special psychological meaning of $\theta = \beta$

Attitudes: “I stick to my decisions” (agree/not agree)

1. RT decreases with $|\theta - \beta|$
– OK: extreme answers are fast
2. If $\theta < \beta$ and time limit increases, $P(\text{agree}) \to 0$
– OK: the longer I think about it: I don’t stick to my decisions
3. If time to think ($a$) $\to 0$, $P(\text{agree}) \to 1/2$
– OK: I just say something

Ability: “24+79” (93/114/103/130)

1. RT decreases with $|\theta - \beta|$
– NO: best subjects are fastest
2. If $\theta < \beta$ and time limit increases, $P(+) \to 0$
– NO: they guess because they know that they don’t know
3. If time to think ($a$) $\to 0$, $P(+) \to 1/2$
– NO: $P(+) \to 0$ for open questions and $\to 1/M$ for multiple choice questions with $M$ alternatives

Thus

• For two choice attitude/personality tests the diffusion IRT model works well!
• But not for multiple choice ability tests:
1. Problem of multiple choice
– The diffusion model is a two choice model; IRT (2PL) is dichotomously scored, not dichotomous choice! Guessing normally does not give $P(+) = 1/2$
2. Problem of guessing
– In ability testing subjects often guess and don’t score below chance level

Solving the first problem: Multiple choice

Correction for MC in 2PL

• Assuming equal attractiveness of incorrect alternatives in Bock’s nominal response model

$$P_m(\theta) = \frac{e^{b_m^* + a_m^* \theta}}{\sum_{k=1}^{M} e^{b_k^* + a_k^* \theta}}$$

• gives:

$$P(+ \mid \theta) = \frac{e^{\alpha(\theta - \beta) - \ln(M - 1)}}{1 + e^{\alpha(\theta - \beta) - \ln(M - 1)}}$$

$$P(+ \mid \theta = \beta) = \frac{e^{-\ln(M - 1)}}{1 + e^{-\ln(M - 1)}} = \frac{1}{M}$$

Solving the second problem: guessing

• Subjects don’t score below chance level
– Typical solution: 3PL
• We propose a restricted 2PL model
• Main idea: ability can be zero but cannot be negative, so $v \ge 0$
– subjects do not score below chance level
– increasing time limit will not decrease $P(+)$

Ability ≥ 0

• In IRT, bipolar traits such as attitudes and unipolar traits such as abilities are not distinguished
• But ability is very special (think of walking)
– No negative ability (no negative difficulty)
– Real zero point (no ability)
– With sufficient time and ability > 0 any item can be solved
• We need a diffusion IRT model with positive $v$

Idea: a and v both have an item and a person part

• Drift $v$
– Person part: ability $v^p$ (> 0)
– Item part: difficulty $v^i$ (> 0)
– Requirement: $v = f(v^p, v^i) > 0$
– One sensible solution: $v = v^p / v^i$
• Boundary separation $a$
– Person part: response caution $a^p$ (> 0)
– Item part: time pressure $a^i$ (> 0)
– Requirement: $a = g(a^p, a^i) > 0$
– One sensible solution: $a = a^p / a^i$

Quotient diffusion 2PLM (Q-diffusion model)

$$P_{jk}(+ \mid \theta_k) = \frac{\exp\left(\frac{a_k^p v_k^p}{a_j^i v_j^i} - \ln(M_j - 1)\right)}{1 + \exp\left(\frac{a_k^p v_k^p}{a_j^i v_j^i} - \ln(M_j - 1)\right)}$$

• $a_k^p$: response caution; $v_k^p$: ability; $a_j^i$: time pressure; $v_j^i$: difficulty
• All $a$ and $v$ are positive

So if $\theta_k = a_k^p v_k^p$, $\alpha_j = 1/(a_j^i v_j^i)$ and $\beta_j = \ln(M_j - 1)$:

$$P_{jk}(+ \mid \theta_k) = \frac{e^{\alpha_j \theta_k - \beta_j}}{1 + e^{\alpha_j \theta_k - \beta_j}}$$

Or, if $\theta_k = a_k^p v_k^p$, $\alpha_j = 1/(a_j^i v_j^i)$ and $\beta_j = \ln(M_j - 1)\, a_j^i v_j^i$:

$$P_{jk}(+ \mid \theta_k) = \frac{e^{\alpha_j(\theta_k - \beta_j)}}{1 + e^{\alpha_j(\theta_k - \beta_j)}}$$

Simple case

• $\theta_k = a_k^p v_k^p$ ⇒ ignore the $a$, $v$ distinction (RTs would be required to separate them)
• $\alpha_j = 1/(a^i v_j^i)$ ⇒ equal time limits for all items in the test

$$P_{jk}(+ \mid \theta_k) = \frac{e^{\alpha_j \theta_k - \ln(M_j - 1)}}{1 + e^{\alpha_j \theta_k - \ln(M_j - 1)}}$$

• $\theta_k$: ability > 0; $\alpha_j$: easiness > 0; $\ln(M_j - 1)$: correction for multiple choice

[Figure: ICC for dichotomous items]

[Figure: ICC for MC (4 choices)]

[Figure: ICC for MC (10 choices)]

[Figure: ICC for ‘open’ questions (many choices)]

Advantages

• Process model for attitude and ability traits
• ‘Mechanistic’ interpretations of IRT parameters
– $\theta$ is the product of drift rate and boundary separation
– $\alpha$ is the easiness parameter, the reciprocal of the product of item drift rate and time pressure
– $\beta$ is the intercept parameter (guessing probability)
• Meaningful zero point (ratio properties)
• Speed accuracy trade-off incorporated
• Model of accuracy, but also model of RT
• Guessing explained by restricting the 2PL
• Simple extension to MC

Relation to other IRT models

• Ramsay’s Q model
• Person fit
• van der Linden’s IRT RT model
• 2PL and multidimensional IRT
• Guessing models

Ramsay’s quotient model

• Ramsay (1989) investigates simple models for $S_{kj}$:

$$P_{jk}(+ \mid \theta_k) = \frac{e^{S_{kj}}}{1 + e^{S_{kj}}}$$

– Difference model ($\theta - \beta$): Rasch model
– Quotient model ($\theta / \beta$):

$$P_{jk}(+ \mid \theta_k) = \frac{e^{\theta_k / \beta_j}}{K + e^{\theta_k / \beta_j}}, \quad \theta_k > 0,\ \beta_j > 0$$

– Rasch model with guessing
• The quotient model fits better than the Rasch model and the Rasch model with guessing in 4 examples
• If $\theta = a^p v^p$, $\beta = a^i v^i$ and $K = M - 1$, the Q-model and the Q-diffusion model are equivalent
• Note: the model in Cressie & Holland (’83) is also equivalent:

$$P_{jk}(+ \mid \theta_k) = \frac{c \exp\left(e^{\theta_k^* - \beta_j^*}\right)}{(1 - c) + c \exp\left(e^{\theta_k^* - \beta_j^*}\right)}$$

Person fit

Q-diffusion for ability:

$$P_{jk}(+ \mid \theta_k) = \frac{\exp\left(\frac{a_k^p v_k^p}{a_j^i v_j^i} - \ln(M_j - 1)\right)}{1 + \exp\left(\frac{a_k^p v_k^p}{a_j^i v_j^i} - \ln(M_j - 1)\right)}$$

D-diffusion for attitude:

$$P_{jk}(+ \mid \theta_k) = \frac{\exp\left(\frac{a_k^p}{a_j^i}(v_k^p - v_j^i) - \ln(M_j - 1)\right)}{1 + \exp\left(\frac{a_k^p}{a_j^i}(v_k^p - v_j^i) - \ln(M_j - 1)\right)}$$

• $a_j^i$ is item discrimination (time pressure)
• $a_k^p$ is person discrimination (response caution)
• Person discrimination can be estimated by varying time pressure over items and by using RT data

IRT model for RT

• General model: van der Linden’s hierarchical model
• Fundamental equation for RT modeling, with item time intensity $\lambda_j$ and person speed $\tau_k$:

$$E(DT) = \frac{\lambda_j^*}{\tau_k^*} \;\Rightarrow\; E(\ln DT) = \lambda_j - \tau_k$$

Translation of the item and person parameters of van der Linden’s model to diffusion model parameters

$$E(DT) \approx \frac{a}{2v} = \frac{1}{2} \cdot \frac{a_k^p}{a_j^i} \cdot \frac{v_j^i}{v_k^p}$$

$$E(\ln DT) \approx -\ln 2 + \ln\frac{v_j^i}{a_j^i} - \ln\frac{v_k^p}{a_k^p} = -\ln 2 + \lambda_j - \tau_k$$

$$\lambda_j = \ln\frac{v_j^i}{a_j^i}, \qquad \tau_k = \ln\frac{v_k^p}{a_k^p}$$

$$\theta_k = a_k^p v_k^p, \qquad \alpha_j = \frac{1}{a_j^i v_j^i}, \qquad \beta_j = \ln(M_j - 1)$$

• If speed and ability parameters at the second level in van der Linden’s model are positively correlated, then individual differences are primarily due to differences in drift rate (for an example, see van der Linden, 2007).
• If these parameters correlate negatively, drift rates are probably similar across subjects, and individual differences are mainly due to differences in response caution (see example 2 of Klein Entink, Fox, & van der Linden, 2009).

2PL

• The Q-model and Q-diffusion model are restricted 2PL models
• In the Q-diffusion model:
– In the end all items will be passed: for all items, $P(+ \mid \theta > 0, a_i \to \infty) = 1$, e.g. one Guttman item
• If not (because the item also requires a jump), it measures another, additional ability. The test is not unidimensional!
– So any ability test with some items that I can solve and others that I will never solve ($P(+)$ does not increase with a longer time limit) tests for more than one ability
– This requires a conjunctive multidimensional Q-diffusion model

Guessing

• Two process models
– 3PL (p and g process)
– 1PL-AG (p and ability dependent g process): San Martin, del Pino, and De Boeck
– DINA
• One state models
– Difficulty dependent guessing: Hessen 2005, d model
– Q-diffusion model: one state ability based guessing

Fitting

• Ramsay fit the QM, DM and DM-G (Rasch model with guessing) to four datasets and found that QM fitted best in all cases
• For the full diffusion model several programs are available
– Fast-DM (Voss & Voss, 2007)
– DMAT (Vandekerckhove & Tuerlinckx, 2007; Voss & Voss, 2007)
• Hierarchical IRT RT model of van der Linden
– CIRT: Fox, J. P., Klein Entink, R. H., & van der Linden, W. J. (2007).

Discussion

• The diffusion model is a simplistic model of simple perceptual discriminations. It is too simple for the complex processes involved in typical ability items in IRT
– But 2PL is simplistic too
– Better a simple model than no model at all
• We don’t need a process model
– Then IRT is curve fitting without explanatory value
• 3PL will fit better
– 3PL is difficult to fit and uses many more parameters than the Q-diffusion IRT model
– The point of the Q-diffusion IRT model is not goodness of fit but theoretical clarity

Thanks

Note: Capacities versus abilities

• For Lumsden’s sticks for measuring length the Guttman model is OK
• This also applies to STM
– Digit span items

[Figure: QM versus DM-G]

[Figure: ICCs of new guessing model]
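The item characteristic curves shown as figures in the deck can be approximated numerically. The following is a minimal Python sketch, not the authors' code: it assumes the unbiased diffusion process (starting point halfway between the boundaries, unit variance) and the Q-diffusion form in which the log-odds of a correct response equal (person caution × person drift) / (item time pressure × item drift) minus ln(M − 1). The function and argument names (`diffusion_p_upper`, `diffusion_mean_dt`, `q_diffusion_p`, `ap`, `vp`, `ai`, `vi`, `n_choices`) are hypothetical and chosen only for illustration.

```python
import math


def _logistic(x):
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def diffusion_p_upper(a, v):
    """P(X = 1) for an unbiased diffusion process (start at a/2, unit variance)."""
    return _logistic(a * v)


def diffusion_mean_dt(a, v):
    """Mean decision time (a / 2v) * tanh(a * v / 2); the v -> 0 limit is a^2 / 4."""
    if abs(v) < 1e-12:
        return a * a / 4.0
    return a / (2.0 * v) * math.tanh(a * v / 2.0)


def q_diffusion_p(ap, vp, ai, vi, n_choices=2):
    """P(+) under the Q-diffusion form assumed above.

    ap: person response caution (> 0), vp: person ability (>= 0),
    ai: item time pressure (> 0), vi: item difficulty (> 0),
    n_choices: number of response alternatives M (>= 2).
    """
    if ap <= 0 or ai <= 0 or vi <= 0 or vp < 0 or n_choices < 2:
        raise ValueError("parameters must be positive (vp may be zero), M >= 2")
    theta = ap * vp             # person parameter
    alpha = 1.0 / (ai * vi)     # item easiness
    beta = math.log(n_choices - 1)  # multiple choice correction
    return _logistic(alpha * theta - beta)


# Tabulate ICCs for 2, 4 and 10 response alternatives (ai = vi = ap = 1).
for m in (2, 4, 10):
    row = [round(q_diffusion_p(1.0, vp, 1.0, 1.0, m), 3) for vp in (0.0, 1.0, 2.0, 4.0)]
    print(f"M={m}: {row}")
```

With zero ability the curve starts at the chance level 1/M rather than below it, which is the restriction the deck argues for. The mean decision time is largest at zero drift, matching the first qualitative prediction of the diffusion model.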
