AN OVERVIEW OF THE FAMILY OF RASCH MODELS

Elena Kardanova

University of Ostrava

Czech Republic

26-31 March 2012

"The family of Rasch measurement models is a way to make sense of the world."

Benjamin D. Wright

Advantages of Rasch Models

The simplest models that provide parameter invariance

Include a minimal number of parameters

Parameters have a simple interpretation and can be easily estimated (on an interval scale, with an estimate of precision)

Can be applied to all item types used in educational and psychological tests

Theory of item and examinee analysis is well developed

All specific testing problems can be easily solved

Family of Rasch models:

• Dichotomous Rasch Model

• Partial Credit Model

• Rating Scale Model

• Binomial and Poisson Models

• Many-Facet Rasch Model

• Multidimensional Rasch Model

Criteria for model choice:

The number of response categories: two vs. more than two

The structure of response alternatives in polytomous items: common vs. individual

The number of attempts to an item: one attempt vs. more than one

The number of examinee parameters: one ability vs. more than one

The number of factors influencing the examinee performance: only item difficulty vs. plus additional factors

Relationship between basic Rasch models

Software

• Winsteps (Dichotomous Model, PCM, RSM, Binomial and Poisson Models)

• ConQuest (all models except Binomial and Poisson Models)

• Facets (Many-Facet Model)

• Other IRT software (depends on the software)

Dichotomous Rasch Model

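$$P_{ni} = P(X_{ni} = 1 \mid \theta_n, \delta_i) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}$$

$P(X_{ni} = 1 \mid \theta_n, \delta_i)$ is the probability that an examinee $n$ ($n = 1, \dots, N$) with ability $\theta_n$ answers item $i$ ($i = 1, \dots, I$) with difficulty $\delta_i$ correctly.

• The model is called one-parameter because each item is described by a single parameter, its difficulty: the probability $P_{ni}$ is a function of the difference $(\theta_n - \delta_i)$ alone.

• The model is also called logistic because the function is logistic.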

Item Characteristic Curve in Rasch Dichotomous model

$\delta_i$ is the point on the ability scale where the probability of a correct response is 0.5. The greater the value of this parameter, the greater the ability that is required for an examinee to have a 50% chance of getting the item correct; hence, the harder the item.

In theory the $\delta_i$ parameter can vary from $-\infty$ to $+\infty$, but typically values of $\delta_i$ vary from about -3 to +3.
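As a minimal sketch (not part of the original slides), the dichotomous model and the interpretation of the difficulty parameter can be checked numerically. The function name `rasch_probability` and the sample difficulties are illustrative assumptions.

```python
import math


def rasch_probability(theta: float, delta: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model."""
    # Equivalent to exp(theta - delta) / (1 + exp(theta - delta))
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


# Assumed item difficulties, chosen only for illustration
for delta in (-1.0, 0.0, 1.0):
    # An examinee whose ability equals the item difficulty has a 50% chance
    print(delta, rasch_probability(theta=delta, delta=delta))
```

Because the probability depends only on the difference between ability and difficulty, shifting both by the same amount leaves it unchanged.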

ICCs of three items in the Rasch model with difficulties $\delta_1 = -1$, $\delta_2 = 0$ and $\delta_3 = +1$

Assumptions of Rasch model

• ICCs differ only in their location along the ability scale; they don’t cross (are parallel).

• Item difficulty is the only item characteristic that influences examinee performance.

• All items are equally discriminating.

• The lower asymptote of the ICC is zero: examinees of very low ability have zero probability of correctly answering the item (no guessing).

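Parameter Interpretation in Dichotomous Rasch Model

The ability level of any examinee is defined as the log odds of this examinee answering correctly an item with difficulty 0:

$$\theta_n = \ln \frac{P_{n1}}{1 - P_{n1}},$$

where $P_{n1}$ is the probability that examinee $n$ answers such an item correctly.

The difficulty level of any item is defined as the negative log odds of a correct answer to this item by an examinee with ability 0, that is, the log odds of an incorrect answer:

$$\delta_i = \ln \frac{1 - P_{1i}}{P_{1i}}$$

where $P_{1i}$ is the probability that such an examinee answers item $i$ correctly.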

Parameter Separation in Rasch Model

$$\ln \frac{P_{ni}}{1 - P_{ni}} = \theta_n - \delta_i$$

The log odds that a person passes an item is just the difference between the examinee ability level and the item difficulty.

Item and examinee parameters are completely separated, making it possible to estimate examinee ability independently of item difficulty, and to estimate item difficulty independently of examinee ability.

Item and examinee parameters lie on the same linear scale.

The unit of measurement on this scale is one logit (short for log-odds unit).

Concept of “Specific Objectivity” in Rasch Model

• Comparisons between objects must be invariant over the specific conditions under which they were observed:

- comparisons between persons must be invariant over the specific items used to measure them,

- comparisons between items must be invariant over the specific persons used to calibrate them.

• Only Rasch models guarantee this property.

Invariant-Person Comparisons: the same differences are observed regardless of the items

Consider the Rasch model predictions for the log odds of two persons with abilities $\theta_1$ and $\theta_2$ on an item with difficulty $\delta_i$:

$$\ln \frac{P_{1i}}{1 - P_{1i}} = \theta_1 - \delta_i, \qquad \ln \frac{P_{2i}}{1 - P_{2i}} = \theta_2 - \delta_i$$

Subtracting the second equation from the first yields the following:

$$\ln \frac{P_{1i}}{1 - P_{1i}} - \ln \frac{P_{2i}}{1 - P_{2i}} = (\theta_1 - \delta_i) - (\theta_2 - \delta_i) = \theta_1 - \theta_2$$

Thus, the difference in log odds for any item is simply the difference between the two abilities: the item difficulty $\delta_i$ dropped out of the equation.

So, the same difference in performance between the two persons is expected, regardless of item difficulty.
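The invariance of person comparisons can be illustrated with a short numerical sketch. The helper names (`rasch_probability`, `log_odds`, `person_gap`) and all numeric values are assumptions made for this example only.

```python
import math


def rasch_probability(theta, delta):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


def log_odds(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


def person_gap(theta_1, theta_2, delta):
    """Difference in log odds between two persons on the same item."""
    return (log_odds(rasch_probability(theta_1, delta))
            - log_odds(rasch_probability(theta_2, delta)))


# Assumed abilities 1.5 and 0.3; the gap should not depend on the item used
for delta in (-2.0, 0.0, 2.0):
    print(delta, round(person_gap(1.5, 0.3, delta), 6))
```

For every item difficulty tried, the gap equals the difference between the two abilities, up to floating-point error.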

Invariant-Item Comparisons: differences between items don’t depend on the particular persons used to compare them

Consider two items with difficulties $\delta_1$ and $\delta_2$ and the following two equations for the log odds of the two items for any person $n$:

$$\ln \frac{P_{n1}}{1 - P_{n1}} = \theta_n - \delta_1, \qquad \ln \frac{P_{n2}}{1 - P_{n2}} = \theta_n - \delta_2$$

Subtracting the second equation from the first yields:

$$\ln \frac{P_{n1}}{1 - P_{n1}} - \ln \frac{P_{n2}}{1 - P_{n2}} = (\theta_n - \delta_1) - (\theta_n - \delta_2) = \delta_2 - \delta_1$$

The ability level dropped out of the equation. So, the expected difference in performance on the two items is the same for any examinee: it is the difference between the item difficulties.

Other IRT Models (2PL and 3PL) fail to meet “specific objectivity”:

For example, comparison of two persons in the framework of the 2PL model yields the following:

$$\ln \frac{P_{1i}}{1 - P_{1i}} = a_i(\theta_1 - \delta_i), \qquad \ln \frac{P_{2i}}{1 - P_{2i}} = a_i(\theta_2 - \delta_i)$$

And further:

$$\ln \frac{P_{1i}}{1 - P_{1i}} - \ln \frac{P_{2i}}{1 - P_{2i}} = a_i(\theta_1 - \delta_i) - a_i(\theta_2 - \delta_i) = a_i(\theta_1 - \theta_2)$$

The right-hand side of this equation contains the discrimination parameter $a_i$ of the item. So, unlike the Rasch model, the expected difference in performance does not depend only on abilities; it is proportional to their difference, with the factor $a_i$ depending on the particular item.
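A small sketch makes the contrast with the Rasch model concrete. The function names and the numeric values (abilities, difficulty, discriminations) are assumptions chosen for illustration.

```python
import math


def two_pl_probability(theta, delta, a):
    """2PL model: logistic function of a * (theta - delta)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - delta)))


def log_odds(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


def person_gap_2pl(theta_1, theta_2, delta, a):
    """Difference in log odds between two persons on one 2PL item."""
    return (log_odds(two_pl_probability(theta_1, delta, a))
            - log_odds(two_pl_probability(theta_2, delta, a)))


# Same two persons, two items that differ only in discrimination
print(person_gap_2pl(1.5, 0.3, 0.0, a=0.5))
print(person_gap_2pl(1.5, 0.3, 0.0, a=2.0))
```

The two printed gaps differ by the ratio of the discriminations, so the comparison between the two persons depends on which item is used.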

Parameter Estimation in Rasch Model

The total number of parameters to be estimated in the dichotomous Rasch model is N+I, where N is the number of examinees and I is the number of items.

Methods of mathematical statistics are used for parameter point estimation. Most estimation methods employ some form of the method of maximum likelihood (without distributional assumptions or with distributional assumptions regarding the parameters).

Under the Rasch model raw scores are sufficient statistics for both item and person measures. This means that all examinees with the same raw score get the same ability estimate, and similarly all items with the same raw score get the same difficulty estimate. Due to this property, all measures can be estimated simultaneously.
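The sufficiency of the raw score can be illustrated with a simplified sketch. It assumes the item difficulties are already known and estimates only one examinee's ability by maximum likelihood with Newton-Raphson iterations; this is not the joint estimation procedure of any particular program. The function name `estimate_ability`, the difficulties and the response patterns are hypothetical.

```python
import math


def estimate_ability(responses, deltas, iterations=25):
    """Maximum likelihood ability estimate for known item difficulties."""
    raw_score = sum(responses)
    if raw_score == 0 or raw_score == len(responses):
        raise ValueError("no finite estimate for a zero or perfect raw score")
    theta = 0.0
    for _ in range(iterations):
        probs = [1.0 / (1.0 + math.exp(-(theta - d))) for d in deltas]
        # Gradient of the log-likelihood: observed minus expected raw score
        gradient = sum(x - p for x, p in zip(responses, probs))
        # Test information: sum of item variances p * (1 - p)
        information = sum(p * (1.0 - p) for p in probs)
        theta += gradient / information
    return theta


deltas = [-1.0, -0.5, 0.0, 0.5, 1.0]  # assumed known item difficulties
# Two different response patterns with the same raw score of 3
print(estimate_ability([1, 1, 1, 0, 0], deltas))
print(estimate_ability([0, 0, 1, 1, 1], deltas))
```

The gradient depends on the responses only through their sum, which is why both patterns lead to the same estimate.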

Probability Curves for Rasch Dichotomous Model

$\pi_{ni0}$ and $\pi_{ni1}$ are the probabilities that examinee $n$ scores 0 and 1 on item $i$.

In the dichotomous case $\pi_{ni1} = P_{ni}$ and $\pi_{ni0} = 1 - \pi_{ni1} = 1 - P_{ni}$.

Partial Credit Model

A simple extension of the dichotomous Rasch model: one or more intermediate levels of performance are allowed.

Different levels of performance are labelled 0 (no steps taken), 1, 2, …, m (the highest level of performance possible).

In order to reach the highest category m, an examinee must complete m steps consecutively, getting 1 point for each of them. Each step can be taken only if the previous step has been completed.

The difficulty of each step doesn’t depend on the difficulties of the other steps.

Two-step item (m = 2)

Performance levels: 0 (incorrect, poor quality), 1 (partially correct, good quality) and 2 (absolutely correct, superior quality).

The item has an intermediate scoring level, which makes it possible to award a point for a partially completed item.

Such an item has three possible categories and two steps.

The probability of completing each step can be described by a Rasch model:

$$P_{ni1} = \frac{\exp(\theta_n - \delta_{i1})}{1 + \exp(\theta_n - \delta_{i1})}, \qquad P_{ni2} = \frac{\exp(\theta_n - \delta_{i2})}{1 + \exp(\theta_n - \delta_{i2})}$$

$P_{ni1}$ - probability of person $n$ scoring 1 rather than 0 on item $i$

$P_{ni2}$ - probability of person $n$ scoring 2 rather than 1 on item $i$

$\theta_n$ - ability level of examinee $n$

$\delta_{i1}$ and $\delta_{i2}$ - step difficulties in item $i$.

Item Operating Curves for two-step item (Step Characteristics Curves)

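Partial Credit Model

$$\pi_{nik} = \frac{\exp \sum_{j=0}^{k} (\theta_n - \delta_{ij})}{\sum_{l=0}^{m_i} \exp \sum_{j=0}^{l} (\theta_n - \delta_{ij})},$$

$\pi_{nik}$ is the probability of examinee $n$ with ability $\theta_n$ to get score $k$ for item $i$. $k$ is the count of the completed item steps, $k = 0, 1, \dots, m_i$, where $m_i$ is the number of item steps. The $j = 0$ term of each sum is taken to be zero, $\theta_n - \delta_{i0} \equiv 0$, as is conventional for this form of the model.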
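The category probabilities of the Partial Credit Model can be computed directly from the formula. The following is a minimal sketch; the function name `pcm_category_probabilities` and the step difficulties are assumptions for illustration, and the score-0 term is set to zero by the usual convention.

```python
import math


def pcm_category_probabilities(theta, step_difficulties):
    """Probabilities of scores 0..m on one item under the Partial Credit Model."""
    # Cumulative sums of (theta - delta_ij); the score-0 term is 0 by convention
    cumulative = [0.0]
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Two-step item with assumed step difficulties -1 and 1
probabilities = pcm_category_probabilities(theta=0.0, step_difficulties=[-1.0, 1.0])
print([round(p, 3) for p in probabilities])
```

With a single step the function reduces to the dichotomous Rasch model, which is a convenient sanity check.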

Category Probability Curves for Two-Step Item

Category Probability Curves for Two-Step Item for the case $\delta_{i1} > \delta_{i2}$

• When the second step is easier than the first, the probability curve for the middle response category doesn’t dominate on any part of the ability scale.

• Even though the second step is easier than the first, the defined order of the response categories requires that this easier second step be undertaken only after the harder first step has been successfully completed.

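PCM can be written as:

$$\ln \frac{\pi_{nik}}{\pi_{ni(k-1)}} = \theta_n - \delta_{ik}$$

• For any step $k$, the log odds of this examinee scoring $k$ rather than $k-1$ are determined, apart from the examinee’s ability, only by the difficulty of the step $\delta_{ik}$.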
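The log-odds form of the PCM can be verified numerically, including the case in which the second step is easier than the first. This is a sketch under assumed values; `pcm_category_probabilities` and `adjacent_log_odds` are hypothetical helper names.

```python
import math


def pcm_category_probabilities(theta, step_difficulties):
    """Probabilities of scores 0..m on one item under the Partial Credit Model."""
    cumulative = [0.0]  # the score-0 term is 0 by convention
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def adjacent_log_odds(theta, step_difficulties):
    """Log odds of score k over score k - 1, for k = 1..m."""
    pi = pcm_category_probabilities(theta, step_difficulties)
    return [math.log(pi[k] / pi[k - 1]) for k in range(1, len(pi))]


# Assumed values: the first step is harder than the second
theta, steps = 0.5, [1.0, -1.0]
print(adjacent_log_odds(theta, steps))
print([theta - delta for delta in steps])  # ability minus step difficulty
```

The two printed lists agree up to floating-point error. With these reversed step difficulties, the middle category is never the most probable one at any ability level.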

Step Characteristics Curves in PCM (Operating Curves)

These operating curves have the same slope (so don’t cross) and differ only in their location on the ability continuum.

Item Characteristic Curve for two-step item

The ICC for a polytomous item represents the expected score on the item as a function of examinee ability level.

ICCs of several two-step items

Unlike ICCs in the dichotomous Rasch model, ICCs of different polytomous items are not parallel; they can cross.
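The expected-score curve of a polytomous item can be sketched as the probability-weighted sum of the possible scores. The two items below are hypothetical and were chosen so that their curves cross; `expected_score` is an illustrative helper, not a function of any Rasch software.

```python
import math


def pcm_category_probabilities(theta, step_difficulties):
    """Probabilities of scores 0..m on one item under the Partial Credit Model."""
    cumulative = [0.0]  # the score-0 term is 0 by convention
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta, step_difficulties):
    """Expected item score: sum over k of k times the probability of score k."""
    pi = pcm_category_probabilities(theta, step_difficulties)
    return sum(k * p for k, p in enumerate(pi))


item_a = [-2.0, 2.0]  # assumed: widely spaced steps
item_b = [0.0, 0.0]   # assumed: both steps at the same location
for theta in (-1.0, 0.0, 1.0):
    print(theta,
          round(expected_score(theta, item_a), 3),
          round(expected_score(theta, item_b), 3))
```

Item A has the higher expected score below the crossing point and item B above it, so the two curves are not parallel.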

Rating Scale Model

Can be considered as a particular case of PCM when all items have a common response format (for example, a Likert scale)

Usually used to collect attitude data

Each item is provided with a stem (or statement of attitude) and a few response alternatives, of which the respondent is required to choose one, indicating the extent to which the statement in the stem is endorsed

Thus, all items have the same response alternatives, separated by m steps. Completing the k-th step can be considered as choosing the k-th alternative over the (k-1)-th in response to the item.

Example: Likert Scale

Has 4 or 5 categories: Strongly Disagree, Disagree, Undecided (or Neutral; may be omitted), Agree, Strongly Agree:

SD D N A SA

Response alternatives are ordered to represent a respondent’s increasing inclination towards the concept questioned.

A person who chooses to Agree with a statement on an attitude questionnaire can be considered to have chosen Disagree over Strongly Disagree (1st step taken), and also Neutral over Disagree (2nd step taken), and also Agree over Neutral (3rd step taken), but to have failed to choose Strongly Agree over Agree (4th step not taken).

All responses are coded as 1, 2, 3, 4, 5, where the higher number indicates a higher degree of agreement with the statement.

Concept of item difficulty

• Consider two statements from the test of computer anxiety:

I am so afraid of computers I avoid using them SD D N A SA

I am afraid that I’ll make mistakes when I use my computer SD D N A SA

• It is more than likely that the first stem indicates much higher levels of computer anxiety than does the second stem. Indeed, the children who respond SA on the “mistakes” stem might endorse N on the “avoid using” stem. So the two response scales should be offset against each other on the variable, for example so that SA on the “mistakes” item sits level with N on the “avoid using” item:

“avoid using” item:         SD  D   N   A   SA
“mistakes” item:    SD  D   N   A   SA

• The first item can be considered as more difficult than the second item. So each item can be accorded a difficulty estimate (the location of the item on the variable axis).

Concept of threshold parameter

As the same set of rating points is used with every item, it is usually thought that the relative difficulties of the steps in each item should not vary from item to item.

The pattern of item steps around an item location is supposed to be determined by a fixed set of threshold parameters, that is, by the fixed set of rating points used with all items.

These threshold parameters are estimated once for the entire item set.

The difficulty of any step can be resolved into two components:

$$\delta_{ik} = \delta_i + \tau_k$$

$\delta_{ik}$ - difficulty of completing the k-th step or choosing the k-th alternative in the response to item $i$

$\delta_i$ - the location of item $i$ (item difficulty)

$\tau_k$ - the location of the k-th step in each item relative to that item’s location (threshold parameter for the k-th step)

The only difference between items is the difference in their location on the variable (or difference in their difficulty). The pattern of item steps around this location is described by the threshold parameters $\tau_k$, $k = 1, \dots, m$, that is, by the fixed set of rating points used with all items.

Probabilities of passing each threshold can be described by a Rasch model (for two two-step items $i$ and $j$):

$$P_{ni1} = \frac{\exp(\theta_n - (\delta_i + \tau_1))}{1 + \exp(\theta_n - (\delta_i + \tau_1))}, \qquad P_{nj1} = \frac{\exp(\theta_n - (\delta_j + \tau_1))}{1 + \exp(\theta_n - (\delta_j + \tau_1))},$$

$$P_{ni2} = \frac{\exp(\theta_n - (\delta_i + \tau_2))}{1 + \exp(\theta_n - (\delta_i + \tau_2))}, \qquad P_{nj2} = \frac{\exp(\theta_n - (\delta_j + \tau_2))}{1 + \exp(\theta_n - (\delta_j + \tau_2))}$$

$P_{nik}$ - probability of person $n$ scoring $k$ rather than $k-1$ (choosing the k-th alternative over the (k-1)-th) in response to item $i$; $k = 1, 2$.

$\theta_n$ - ability level of examinee $n$

$\delta_i$ - the location of item $i$ on the variable axis (item difficulty)

$\tau_1$ and $\tau_2$ - threshold parameters, common to both items.

Item Operating Curves (Step Characteristics Curves) for two Rating Scale Items with Three Response Categories

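Rating Scale Model

$$\pi_{nik} = \frac{\exp \sum_{j=0}^{k} (\theta_n - (\delta_i + \tau_j))}{\sum_{l=0}^{m} \exp \sum_{j=0}^{l} (\theta_n - (\delta_i + \tau_j))},$$

$\pi_{nik}$ is the probability of examinee $n$ with ability $\theta_n$ to get score $k$ for item $i$ (to choose the k-th alternative). $k = 0, 1, \dots, m$, where $m$ is the number of item steps in any item. The $j = 0$ term of each sum is taken to be zero, as is conventional for this form of the model.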
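The Rating Scale Model can be sketched in the same way as the PCM, with each step difficulty built from the item location plus a threshold shared by all items. The function name `rsm_category_probabilities`, the thresholds and the item locations are assumptions for illustration.

```python
import math


def rsm_category_probabilities(theta, item_difficulty, thresholds):
    """Probabilities of categories 0..m under the Rating Scale Model."""
    cumulative = [0.0]  # the category-0 term is 0 by convention
    for tau in thresholds:
        # Step difficulty = item location + threshold shared by all items
        cumulative.append(cumulative[-1] + (theta - (item_difficulty + tau)))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Assumed thresholds shared by all items, and two assumed item locations
thresholds = [-1.0, 1.0]
for item_difficulty in (-0.5, 0.5):
    pi = rsm_category_probabilities(0.0, item_difficulty, thresholds)
    print(item_difficulty, [round(p, 3) for p in pi])
```

Because the thresholds are shared, the two items produce the same pattern of category curves, shifted along the ability scale by the difference in their locations.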

RSM can be written as:

$$\ln \frac{\pi_{nik}}{\pi_{ni(k-1)}} = \theta_n - (\delta_i + \tau_k)$$

• For any step $k$ (or the k-th response category), the log odds of this examinee choosing the category over the previous adjacent one are determined, apart from the examinee’s ability, only by the difficulty of the item $\delta_i$ and the difficulty of the k-th step $\tau_k$.
