Lecture 11 - Rasch Modeling Bond & Fox Ch 6


Bond and Fox, Chapter 6 – Likert Scales

The authors spend pages 102-103 spreading what might be termed lies about Likert scaling.

They suggest that the response of 5 on a 5-point scale is taken by researchers to mean 5 times the value of the response of 1 on the same 5-point scale. No one who knows anything about measurement would presume that.

There are some assumptions underlying common usage of Likert scales that can be questioned, however.

For example,

Does a 5 response to Item 1 represent the same amount of Extraversion as a 5 response to Item 10?

Does a 5 response to Item 1 represent more extraversion than a 3 response to Item 3?

These are the kinds of issues that Rasch modeling attempts to address.

The relationship between Likert Scales and Right/Wrong tests

Consider the IPIP Extraversion item, “Am the life of the party.”

A person who is extraverted will agree with this item.

Thus, an agreement response is the “correct” response for Extraverts to this item.

So, if we count the number of agreement responses to the Extraversion items, that’s counting the number of “correct” responses to those items. The person with the higher count has more Extraversion, i.e., greater “ability” on the Extraversion dimension.

So, if we were to score all Likert items as Disagree=0 and Agree=1 and if we were to count the 1s to get the person’s total Extraversion score, then scoring them would be no different from scoring a Right/Wrong test by counting the 1s.

That is, a Likert item can be viewed as simply a Right/Wrong item in which the Agreement response is the correct answer.

PSY 5950 BF6 - 1

Example of scoring a personality test as Right/Wrong, i.e., treating responses as a dichotomy

The data here are taken from a study in which the IPIP and the NEO-FFI Big Five instruments were compared.

There were 189 respondents – UTC undergraduate and graduate students. Half filled out the IPIP first and the NEO second. The order was reversed for the other half.

The focus here will be on the IPIP Extraversion scale.

The Extraversion items are

1 I am the life of the party.

2 I don't talk a lot.

3 I feel comfortable around people.

4 I keep in the background.

5 I start conversations.

6 I have little to say.

7 I talk to a lot of different people at parties.

8 I don't like to draw attention to myself.

9 I don't mind being the center of attention.

10 I am quiet around strangers.

The standard SPSS stuff on the data is below. Alpha was .890. Negatively worded items were reverse-scored prior to this and all following analyses.

This is the histogram of Extraversion mean-of-responses scores.

It’s approximately unimodal and symmetric.

Item-Total Statistics

        Scale Mean if   Scale Variance if   Corrected Item-     Cronbach's Alpha
        Item Deleted    Item Deleted        Total Correlation   if Item Deleted
ie1     30.63492        45.467              .631                .879
ie2     29.87831        43.076              .703                .874
ie3     29.73016        47.411              .577                .883
ie4     30.15344        44.705              .717                .874
ie5     29.73545        45.196              .711                .875
ie6     29.71429        46.822              .581                .883
ie7     30.14286        42.251              .690                .875
ie8     30.84127        46.358              .569                .884
ie9     30.26455        45.940              .524                .887
ie10    30.71429        43.748              .623                .881

PSY 5950 BF6 - 2

Scoring as a Right/Wrong test.

To score a test as a right/wrong test, the agreement responses must be dichotomized into Disagreement vs Agreement.

The dichotomization should be made near the middle of the response scale.

I’ll actually try all possible cut points here, in order to show you why the middle is better.

First, I dichotomized the items as 1=0 and 2,3,4,5=1.

Then I dichotomized them as 1,2=0 and 3,4,5=1.

Next was 1,2,3=0 and 4,5=1.

Finally, it was 1,2,3,4=0 and 5=1. (Boy, I had so much energy when I was young.)

The correlations of the 4 new right/wrong dichotomies with the original summated mean-of-responses Extraversion scale scores are

Correlations with Extraversion scale scores (biasiext)

                      biasid1iext   biasid2iext   biasid3iext   biasid4iext
Pearson Correlation   .673          .900          .929          .671
Sig. (2-tailed)       .000          .000          .000          .000
N                     189           189           189           189

As you can see, the two “middle” dichotomizations – 1,2=0 vs 3,4,5=1 and 1,2,3=0 vs 4,5=1 – yielded “right/wrong” scale scores that were very highly correlated with the original summated scores.

I’m not recommending that you do this. But in a pinch, you can get a fairly good estimate of a personality summated scale score from dichotomized variables treated as if they were right/wrong items, as long as the dichotomization is in the “middle” of the Likert response format.
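If you want to try the four dichotomizations yourself, here is a minimal sketch in Python. The data are randomly generated stand-ins, NOT the Bias study responses, and the names `dichotomize`, `pearson`, and `fake_person` are made up for this illustration, so the correlations it prints will not match the table above.

```python
import random
from statistics import mean

def pearson(x, y):
    """Plain Pearson correlation; returns NaN if either variable is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5

def dichotomize(responses, cut):
    """Score responses above `cut` as 1 ("agree"), all others as 0."""
    return [1 if r > cut else 0 for r in responses]

# Synthetic 189 x 10 response matrix (an assumption, not the study data).
random.seed(1)

def fake_person():
    trait = random.gauss(3.3, 0.8)
    return [min(5, max(1, round(random.gauss(trait, 0.9)))) for _ in range(10)]

people = [fake_person() for _ in range(189)]
likert_means = [mean(p) for p in people]

# cut=1 is 1 vs 2-5, cut=2 is 1,2 vs 3-5, cut=3 is 1,2,3 vs 4,5, cut=4 is 1-4 vs 5
for cut in (1, 2, 3, 4):
    counts = [sum(dichotomize(p, cut)) for p in people]
    print(f"cut after {cut}: r = {pearson(counts, likert_means):.3f}")
```

With real data you would replace `people` with the reverse-scored item responses, one list per respondent.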

Here’s a screen shot of the first few cases of the original responses and the 1,2,3=0 / 4,5=1 dichotomized responses . . .

As you can see, the scale scores based on the dichotomized items, biasd3iext, look different from the scale scores based on actual responses, biasiext. However, they’re quite highly correlated.

PSY 5950 BF6 - 3

Rasch analysis of personality items dichotomized so that they’re like right/wrong items.

Here’s the Rasch analysis of the above 1,2,3=0 vs 4,5=1 items as if they were right/wrong items.

Item information

TABLE 14.1 Bias Study Dichotomized (1,2,3vs4,5)Ex ZOU872WS.TXT Apr 1 19:38 2012

INPUT: 189 Persons 10 Items MEASURED: 189 Persons 10 Items 2 CATS 1.0.0

--------------------------------------------------------------------------------

Person: REAL SEP.: 1.49 REL.: .69 ... Item: REAL SEP.: 6.30 REL.: .98

Item STATISTICS: ENTRY ORDER

+-----------------------------------------------------------------------------------------------------------+

|ENTRY TOTAL MODEL| INFIT | OUTFIT |PTMEA|EXACT MATCH| |

|NUMBER SCORE COUNT MEASURE S.E. |MNSQ ZSTD|MNSQ ZSTD|CORR.| OBS% EXP%| Item |

|------------------------------------+----------+----------+-----+-----------+------------------------------|

| 1 58 189 1.74 .21|1.01 .1| .85 -.4| .63| 81.3 81.3| 01 Life of party |

| 2 121 189 -.80 .20| .94 -.6|1.00 .1| .65| 78.3 78.4| 02R Don't talk a lot |

| 3 138 189 -1.54 .22|1.06 .6| .78 -.6| .60| 78.3 81.4| 03 Comfortable around people|

| 4 94 189 .26 .20| .84 -1.7| .74 -1.7| .71| 81.3 77.6| 04R Keep in background |

| 5 137 189 -1.49 .22| .80 -1.9| .66 -1.1| .67| 84.9 81.1| 05 Start conversations |

| 6 140 189 -1.63 .22|1.05 .5| .81 -.5| .59| 81.9 81.9| 06R Have little to say |

| 7 111 189 -.40 .20| .92 -.8|1.16 .9| .66| 79.5 77.6| 07 Talk to diff people |

| 8 42 189 2.55 .24|1.11 .9|1.63 1.4| .54| 84.9 84.8| 08R Don't draw attention |

| 9 97 189 .14 .20|1.36 3.4|1.28 1.5| .55| 63.9 77.4| 09 Don't mind being center |

| 10 71 189 1.17 .20| .97 -.3| .83 -.8| .66| 79.5 79.3| 10R Quiet around strangers |

|------------------------------------+----------+----------+-----+-----------+------------------------------|

| MEAN 88.9 166.0 .00 .21|1.01 .0| .97 -.1| | 79.4 80.1| |

| S.D. 33.1 .0 1.38 .01| .15 1.4| .29 1.0| | 5.6 2.3| |

+-----------------------------------------------------------------------------------------------------------+

There is one item in this analysis that is very poorly fitting – “Don’t mind being the center of attention”.

Same information ordered by Extraversion

Top items – Respondent must have a LOT of E to endorse – lots of E required to get this “right”

Bottom items – Respondent may endorse with just a little E – little E required to get this “right”

TABLE 13.1 Bias Study Dichotomized (1,2,3vs4,5)Ex ZOU872WS.TXT Apr 1 19:38 2012

INPUT: 189 Persons 10 Items MEASURED: 189 Persons 10 Items 2 CATS 1.0.0

--------------------------------------------------------------------------------

Person: REAL SEP.: 1.49 REL.: .69 ... Item: REAL SEP.: 6.30 REL.: .98

Item STATISTICS: MEASURE ORDER

+-----------------------------------------------------------------------------------------------------------+

|ENTRY TOTAL MODEL| INFIT | OUTFIT |PTMEA|EXACT MATCH| |

|NUMBER SCORE COUNT MEASURE S.E. |MNSQ ZSTD|MNSQ ZSTD|CORR.| OBS% EXP%| Item |

|------------------------------------+----------+----------+-----+-----------+------------------------------|

| 8 42 189 2.55 .24|1.11 .9|1.63 1.4| .54| 84.9 84.8| 08R Don't draw attention |

| 1 58 189 1.74 .21|1.01 .1| .85 -.4| .63| 81.3 81.3| 01 Life of party |

| 10 71 189 1.17 .20| .97 -.3| .83 -.8| .66| 79.5 79.3| 10R Quiet around strangers |

| 4 94 189 .26 .20| .84 -1.7| .74 -1.7| .71| 81.3 77.6| 04R Keep in background |

| 9 97 189 .14 .20|1.36 3.4|1.28 1.5| .55| 63.9 77.4| 09 Don't mind being center |

| 7 111 189 -.40 .20| .92 -.8|1.16 .9| .66| 79.5 77.6| 07 Talk to diff people |

| 2 121 189 -.80 .20| .94 -.6|1.00 .1| .65| 78.3 78.4| 02R Don't talk a lot |

| 5 137 189 -1.49 .22| .80 -1.9| .66 -1.1| .67| 84.9 81.1| 05 Start conversations |

| 3 138 189 -1.54 .22|1.06 .6| .78 -.6| .60| 78.3 81.4| 03 Comfortable around people|

| 6 140 189 -1.63 .22|1.05 .5| .81 -.5| .59| 81.9 81.9| 06R Have little to say |

|------------------------------------+----------+----------+-----+-----------+------------------------------|

| MEAN 88.9 166.0 .00 .21|1.01 .0| .97 -.1| | 79.4 80.1| |

| S.D. 33.1 .0 1.38 .01| .15 1.4| .29 1.0| | 5.6 2.3| |

+-----------------------------------------------------------------------------------------------------------+

The item “Negation of: Don’t draw attention to myself” is the item only the most extraverted “got correct”. Those respondents who were the most highly extraverted disagreed with this – they DO draw attention to themselves.

The item “Negation of: Have little to say” is the item most people, from the least extraverted to the most extraverted, “got correct”. People even just slightly extraverted disagree with this – they DO have something to say.
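To see what the item measures imply, here is a small sketch of the dichotomous Rasch model, in which the probability of a “correct” response depends only on the person measure minus the item measure (both in logits). The two item measures are copied from the table above; the three person values and the function name `p_endorse` are just illustrative choices.

```python
import math

def p_endorse(person, item):
    """Dichotomous Rasch model: P("correct") = exp(B - D) / (1 + exp(B - D))."""
    return 1.0 / (1.0 + math.exp(-(person - item)))

hardest = 2.55   # 08R Don't draw attention (highest item measure)
easiest = -1.63  # 06R Have little to say (lowest item measure)

for person in (-2.0, 0.0, 2.0):
    print(f"person {person:+.1f}: "
          f"P(08R) = {p_endorse(person, hardest):.2f}, "
          f"P(06R) = {p_endorse(person, easiest):.2f}")
```

A person whose measure equals the item measure has a 50% chance of getting that item “correct”, which is why only people near the top of the scale are likely to endorse 08R.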

PSY 5950 BF6 - 4

Item Map of dichotomized items

TABLE 12.2 Bias Study Dichotomized (1,2,3vs4,5)Ex ZOU872WS.TXT Apr 1 19:38 2012

INPUT: 189 Persons 10 Items MEASURED: 189 Persons 10 Items 2 CATS 1.0.0

--------------------------------------------------------------------------------

Persons MAP OF Items

<more>|<rare>

3 ###### +

.############ |

|T

|

|

| 08R Don't draw attention

|

|

|

|

2 S+

########## |

|

| 01 Life of party

|

|

|S

|

| 10R Quiet around strangers

######### |

1 +

|

|

|

|

######## |

|

| 04R Keep in background

M|

| 09 Don't mind being center

0 +M

########### |

|

|

| 07 Talk to diff people

|

######## |

|

| 02R Don't talk a lot

|

-1 +

|

.########## |

|

|S

S| 03 Comfortable around people 05 Start conversations

| 06R Have little to say

|

|

.######## |

-2 +

|

|

|

|

|

|

|

.##### |T

|

-3 .##### +

<less>|<frequ>

EACH '#' IS 2.

The persons on the left extend beyond the items in the high extraversion direction (upwards), so there were people who likely got all items “correct”. They also extend beyond the items in the low extraversion direction (downwards), so there were people who didn’t get many items “correct” in this dichotomization. The fact that the distribution of person “Extraversions” extends beyond the item values is good.

PSY 5950 BF6 - 5

Comparison of person measures.

I pasted the Rasch person measures into SPSS and created a scatterplot of the original Likert summated scores of the original responses vs. the Rasch person measures from the analysis of dichotomized items. Note that the person measures are of the (1,2,3 vs 4,5) dichotomized values treated as right/wrong answers. Here’s the scatterplot . . .

[Scatterplot: Rasch measures of dichotomized items (1,2,3 vs 4,5) plotted against the mean of responses to E items (the regular Likert scale score); r = .928]

There are multiple Likert scale scores for each Rasch measure value because a given proportion “correct” could be obtained by means of different combinations of 10 responses.

The r-squared is .861, which means the r is .928, essentially the same as the correlation of the sum of “correct” responses found above (.929, p 3 of this lecture).

The correlation of Rasch measures of dichotomized items with the traditional Likert summated scale scores is very large. In spite of this, inspection of the scatterplot yields some reason to not treat them as being completely equivalent. For example, the points circled in the scatterplot all have the same Rasch score of -2.2 or so. Yet their Likert means range from 2 to 3.2, a wide range of Likert scores all for persons with the same Rasch score. The same criticism applies to the highest and lowest Rasch measures. Each single Rasch value represents a range of summated score values. This is due to the dichotomization.
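Here is a tiny illustration of why that happens. In the Rasch model the raw count of “correct” responses is all that determines the person measure (for people who answered the same items), so any two people with the same count get the same measure no matter what their original responses were. The three response patterns below are hypothetical, not taken from the data set, and `dichotomized_count` is a made-up helper name.

```python
from statistics import mean

def dichotomized_count(responses, cut=3):
    """Number of responses above the cut (here 4 or 5), i.e. number "correct"."""
    return sum(1 for r in responses if r > cut)

# Three hypothetical respondents (illustration only).
patterns = {
    "a": [4, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "b": [4, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    "c": [5, 3, 3, 3, 3, 3, 3, 3, 3, 2],
}

for name, resp in patterns.items():
    print(name, "count =", dichotomized_count(resp), "Likert mean =", mean(resp))
```

All three have a count of 1, so they would share one Rasch measure, yet their Likert means run from 1.3 to 3.1.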

So we won’t continue to dichotomize the Likert responses and analyze them as right/wrong answers. This was what is called an intellectual exercise – an exercise in “what if”.

PSY 5950 BF6 - 6

Using all the response values in analyses of Likert items – Analysis as Dimension Scales

Start here on 3/25/15.

We hardly ever dichotomize Likert items and score them as right/wrong.

Since most Likert items use response formats with multiple values (1 thru 5 or 1 thru 7), appropriate analyses of such items must take all the information in the responses into account.

Dichotomizing as done above ignores differences within the categories.

For example, a response of 1 is not the same as a response of 2 or 3, even though they were all categorized as 0 in the above scheme. On a 7-point format, a response of 7 is more agreement than a response of 5, so treating them both as simply “Agree” jus’ ain’t right.

Rasch analysis can incorporate multiple responses to each item.

The ideas behind these analyses date to the seminal work of L. L. Thurstone in the 1920s.

Rasch analysis of Likert data uses a model of respondent behavior that assumes the reading of an item, e.g., “I am the life of the party,” results in an internal “amount” of agreement.

This internal “amount” is evaluated against the perceived response alternatives. The response alternative to which the internal “amount” is closest is picked as the response.

This means that both the perception of the item and the perceptions of the response alternatives are internalized and placed on an internal continuum.

The assumption is made that there is a dividing line between the internalization of adjacent response alternatives. These dividing lines are called thresholds in much of the literature on scaling of Likert data.

PSY 5950 BF6 - 7

For example, if a participant responds A (for Agree) to an item, under this model this is taken as evidence that the internal perception of the item is between the T3 and the T4 thresholds as shown in the figure below – closer to the internalization of A than to any other response alternatives.

              T1              T2              T3              T4
---------------|---------------|---------------|---------------|---------------
      SD               D               N               A              SA

So, the program estimates K-1 thresholds for a K-response format. This means that for a 5-response format with SD, D, N, A, SA, it would estimate 4 thresholds.

The program reports the threshold values on the same scale as people and items, in case you want to use them.

The program estimates the following for Likert data . . .

1. Person “ability” values. More appropriately, person positions on the dimension.

2. Item “difficulty” values. Position of items on the same personality dimension.

Items with high numeric positions are those only endorsed by persons with the most of whatever it is that the items represent, e.g., Extraversion.

3. Threshold values relative to each item.

We won’t pay much attention to these in the hope of avoiding sensory overload.

Trying to deal with 4 thresholds X 50 items, for example, would mean that we would have to try to interpret 200 different values. It’s probably the case that individual item thresholds are about the same from item to item, so . . .

4. Average threshold values.

5. Response “marker” values – locations on the internal continuum corresponding to each response category.

PSY 5950 BF6 - 8

Rasch Control file for the analysis – Bond&FoxChapter6B.txt.

The data are Extraversion items from the “Bias” study conducted several years ago. 189 persons responded to both the IPIP-50 and the NEO-FFI Big 5 questionnaires. Responses were on a 5-point scale due to the fact that the NEO questionnaire packets were set up for 5-point scales.

&INST ; initial line (can be omitted)

TITLE = "Bias Study Extraversion Items"

PERSON = Person ; persons are ...

ITEM = Item ; items are ...

ITEM1 = 5 ; column of response to first item in data record

NI = 10 ; number of items

NAME1 = 1 ; column of first character of person label

NAMELEN = 4 ; length of person label

XWIDE = 1 ; number of columns per item response

TOTAL = Yes ; show total raw scores

CHART = Yes ; produce across-pathway picture

MNSQ = No ; use Standardized fit statistics

STBIAS = Yes ; Adjust for estimation bias

MAXPAGE = 60 ; Maximum lines per page

IREFER= FFFFFFFFFF ; Forward and Reversed items

CODES = 12345 ; valid codes in data file

IVALUEF = 12345 ; Forward items

IVALUER = 54321 ; Reversed items

CLFILE = * ; category label file for category naming

1+1 "STD Strongly Disagree" ; Item 1 is a forward item

1+2 "D Disagree"

1+3 "N Neither A nor D"

1+4 "A Agree"

1+5 "STA Strongly Agree"

* ; end of CLFILE=* list

&END

01 Life of party

02R Don't talk a lot

03 Comfortable around people

04R Keep in background

05 Start conversations

06R Have little to say

07 Talk to diff people

08R Don't draw attention

09 Don't mind being center

10R Quiet around strangers

END LABELS

30013343433333

30034544444244

N = 189 for this dataset.
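To make the column settings concrete, here is a sketch of how a data record is split up under NAME1 = 1, NAMELEN = 4, ITEM1 = 5, NI = 10 and XWIDE = 1. This is only my Python illustration of the layout, not anything the Rasch program itself runs, and `parse_record` is a made-up name. It is applied to the second data record shown above.

```python
def parse_record(line, name1=1, namelen=4, item1=5, ni=10, xwide=1):
    """Split one fixed-width record into (person label, list of responses).

    Column numbers are 1-based, as they are in the control file.
    """
    label = line[name1 - 1 : name1 - 1 + namelen]
    start = item1 - 1
    responses = [
        int(line[start + i * xwide : start + (i + 1) * xwide])
        for i in range(ni)
    ]
    return label, responses

label, responses = parse_record("30034544444244")
print(label, responses)   # 3003 [4, 5, 4, 4, 4, 4, 4, 2, 4, 4]
print(sum(responses))     # 39, the person's total raw score
```

The total raw score is what TOTAL = Yes asks the program to show.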

PSY 5950 BF6 - 9

Traditional Analysis of these data.

Here are the first 10 cases.

(We’ve seen this before, so it’s included here to refresh our memories.)

origorder   ie1   ie2   ie3   ie4   ie5   ie6   ie7   ie8   ie9   ie10
    1        3     3     4     3     4     3     3     3     3     3
    2        4     5     4     4     4     4     4     2     4     4
    3        4     4     4     4     4     4     4     3     4     3
    4        2     4     4     4     4     4     5     1     1     2
    5        2     2     4     3     3     3     2     2     3     2
    6        4     3     3     4     4     2     4     3     3     4
    7        3     5     4     3     4     4     4     3     3     2
    8        1     2     2     2     2     2     2     1     1     2
    9        4     5     4     4     5     5     5     4     5     4
   10        2     4     2     4     4     2     2     2     3     2

Here’s the RELIABILITY Output

Reliability Statistics

Cronbach's Alpha   Cronbach's Alpha Based on Standardized Items   N of Items
.890               .892                                           10

Summary Item Statistics

             Mean    Minimum   Maximum   Range   Maximum / Minimum   Variance   N of Items
Item Means   3.353   2.693     3.820     1.127   1.418               .183       10

Item-Total Statistics

        Scale Mean if   Scale Variance if   Corrected Item-     Squared Multiple   Cronbach's Alpha
        Item Deleted    Item Deleted        Total Correlation   Correlation        if Item Deleted
ie1     30.63           45.467              .631                .476               .879
ie2     29.88           43.076              .703                .566               .874
ie3     29.73           47.411              .577                .373               .883
ie4     30.15           44.705              .717                .542               .874
ie5     29.74           45.196              .711                .550               .875
ie6     29.71           46.822              .581                .428               .883
ie7     30.14           42.251              .690                .590               .875
ie8     30.84           46.358              .569                .434               .884
ie9     30.26           45.940              .524                .454               .887
ie10    30.71           43.748              .623                .458               .881

Here’s a dot plot of the scores.

Mean = 3.35

Median = 3.40

SD = 0.74

Skewness = -0.45

So the center of the distribution is above the midpoint (3) of the response range.

The distribution is packed on the right and skewed to the left.
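For anyone who wants to check a Cronbach's alpha like the one in the RELIABILITY output by hand, here is a minimal sketch of the usual formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). It is run on only the first 10 cases listed above, so it will NOT reproduce the .890 obtained from all 189 respondents. The function name `cronbach_alpha` is my own.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item responses per person (already reverse-scored)."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# The first 10 cases from the listing above (ie1 ... ie10).
first_ten = [
    [3, 3, 4, 3, 4, 3, 3, 3, 3, 3],
    [4, 5, 4, 4, 4, 4, 4, 2, 4, 4],
    [4, 4, 4, 4, 4, 4, 4, 3, 4, 3],
    [2, 4, 4, 4, 4, 4, 5, 1, 1, 2],
    [2, 2, 4, 3, 3, 3, 2, 2, 3, 2],
    [4, 3, 3, 4, 4, 2, 4, 3, 3, 4],
    [3, 5, 4, 3, 4, 4, 4, 3, 3, 2],
    [1, 2, 2, 2, 2, 2, 2, 1, 1, 2],
    [4, 5, 4, 4, 5, 5, 5, 4, 5, 4],
    [2, 4, 2, 4, 4, 2, 2, 2, 3, 2],
]

print(round(cronbach_alpha(first_ten), 3))
```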

PSY 5950 BF6 - 10

Application of Rasch model to a Dimension scale – Bias study Extraversion responses

(File is Bond&FoxChapter6B.txt; Negatively worded items have been reverse-scored.)

Note that the Person reliability estimate of .85 indicates that if the sample of persons were given an equivalent set of items, we’d expect a correlation of .85 between the person measures from the two sets of items.

Correlation of .85 between scores of the same people using equivalent items – so a good test.

The item reliability of .98 indicates that if the same items were used on an equivalent sample of persons, the correlation between item measures for the two samples would be expected to be .98. Correlation of .98 between measures of the same items using equivalent people.
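The program prints a separation index (REAL SEP.) next to each reliability in the table headers. The two are tied together by reliability = separation² / (1 + separation²), so either one can be recovered from the other. Here is a quick check using the values reported in this lecture; the function name is just for illustration.

```python
def reliability_from_separation(sep):
    """Reliability implied by a separation index: SEP^2 / (1 + SEP^2)."""
    return sep ** 2 / (1 + sep ** 2)

reported = [
    ("5-category analysis, Person", 2.42),
    ("5-category analysis, Item", 7.03),
    ("Dichotomized analysis, Person", 1.49),
    ("Dichotomized analysis, Item", 6.30),
]
for label, sep in reported:
    print(label, round(reliability_from_separation(sep), 2))
# prints 0.85, 0.98, 0.69 and 0.98, matching the REL. values in the table headers
```

Notice that person reliability dropped from .85 to .69 when the responses were dichotomized, another cost of throwing away information.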

Item Statistics -Ordered by Item number.

TABLE 14.1 Bias Study Extraversion Items ZOU268WS.TXT Apr 2 12:27 2012

INPUT: 189 Persons 10 Items MEASURED: 189 Persons 10 Items 5 CATS 1.0.0

--------------------------------------------------------------------------------

Person: REAL SEP.: 2.42 REL.: .85 ... Item: REAL SEP.: 7.03 REL.: .98

Item STATISTICS: ENTRY ORDER

+-----------------------------------------------------------------------------------------------------------+

|ENTRY TOTAL MODEL| INFIT | OUTFIT |PTMEA|EXACT MATCH| |

|NUMBER SCORE COUNT MEASURE S.E. |MNSQ ZSTD|MNSQ ZSTD|CORR.| OBS% EXP%| Item |

|------------------------------------+----------+----------+-----+-----------+------------------------------|

| 1 548 189 .80 .09| .86 -1.4| .93 -.7| .69| 50.8 48.1| 01 Life of party |

| 2 691 189 -.52 .10|1.13 1.2|1.06 .6| .74| 46.5 52.9| 02R Don't talk a lot |

| 3 719 189 -.82 .10| .96 -.3| .88 -1.1| .65| 59.4 56.0| 03 Comfortable around people|

| 4 639 189 -.01 .10| .68 -3.6| .75 -2.6| .75| 64.2 49.8| 04R Keep in background |

| 5 718 189 -.81 .10| .75 -2.5| .74 -2.6| .75| 61.0 55.2| 05 Start conversations |

| 6 722 189 -.86 .10|1.03 .3|1.01 .2| .65| 59.9 56.2| 06R Have little to say |

| 7 641 189 -.03 .10|1.20 1.9|1.17 1.6| .73| 47.6 49.9| 07 Talk to diff people |

| 8 509 189 1.14 .09| .98 -.2|1.03 .3| .65| 52.9 48.3| 08R Don't draw attention |

| 9 618 189 .18 .09|1.32 2.9|1.28 2.6| .62| 48.1 49.0| 09 Don't mind being center |

| 10 533 189 .93 .09|1.16 1.6|1.16 1.5| .70| 46.5 47.6| 10R Quiet around strangers |

|------------------------------------+----------+----------+-----+-----------+------------------------------|

| MEAN 627.8 187.0 .00 .10|1.01 .0|1.00 .0| | 53.7 51.3| |

| S.D. 76.7 .0 .72 .00| .19 1.9| .17 1.6| | 6.4 3.2| |

+-----------------------------------------------------------------------------------------------------------+

Note that 3 items have Infit ZSTD values beyond ±2, an indication that responses to them did not fit the model as well as we would like. Two of those Infit values (#4 and #5) are negative, indicating “very good, perhaps too good, consistency”. Only one, #9, is large and positive. This is an item for which some high E people might disagree and low E people might agree.

Why? The items are supposed to measure Extraversion, but #9, “Don’t mind being the center of attention”, may reflect characteristics other than extraversion, e.g., narcissism, in some people.

PSY 5950 BF6 - 11

Same Items - Ordered by measure.

TABLE 13.1 Bias Study Extraversion Items ZOU268WS.TXT Apr 2 12:27 2012

INPUT: 189 Persons 10 Items MEASURED: 189 Persons 10 Items 5 CATS 1.0.0

--------------------------------------------------------------------------------

Person: REAL SEP.: 2.42 REL.: .85 ... Item: REAL SEP.: 7.03 REL.: .98

Item STATISTICS: MEASURE ORDER

+-----------------------------------------------------------------------------------------------------------+

|ENTRY TOTAL MODEL| INFIT | OUTFIT |PTMEA|EXACT MATCH| |

|NUMBER SCORE COUNT MEASURE S.E. |MNSQ ZSTD|MNSQ ZSTD|CORR.| OBS% EXP%| Item |

|------------------------------------+----------+----------+-----+-----------+------------------------------|

| 8 509 189 1.14 .09| .98 -.2|1.03 .3| .65| 52.9 48.3| 08R Don't draw attention |

| 10 533 189 .93 .09|1.16 1.6|1.16 1.5| .70| 46.5 47.6| 10R Quiet around strangers |

| 1 548 189 .80 .09| .86 -1.4| .93 -.7| .69| 50.8 48.1| 01 Life of party |

| 9 618 189 .18 .09|1.32 2.9|1.28 2.6| .62| 48.1 49.0| 09 Don't mind being center |

| 4 639 189 -.01 .10| .68 -3.6| .75 -2.6| .75| 64.2 49.8| 04R Keep in background |

| 7 641 189 -.03 .10|1.20 1.9|1.17 1.6| .73| 47.6 49.9| 07 Talk to diff people |

| 2 691 189 -.52 .10|1.13 1.2|1.06 .6| .74| 46.5 52.9| 02R Don't talk a lot |

| 5 718 189 -.81 .10| .75 -2.5| .74 -2.6| .75| 61.0 55.2| 05 Start conversations |

| 3 719 189 -.82 .10| .96 -.3| .88 -1.1| .65| 59.4 56.0| 03 Comfortable around people|

| 6 722 189 -.86 .10|1.03 .3|1.01 .2| .65| 59.9 56.2| 06R Have little to say |

|------------------------------------+----------+----------+-----+-----------+------------------------------|

| MEAN 627.8 187.0 .00 .10|1.01 .0|1.00 .0| | 53.7 51.3| |

| S.D. 76.7 .0 .72 .00| .19 1.9| .17 1.6| | 6.4 3.2| |

+-----------------------------------------------------------------------------------------------------------+

Items with large positive measures are those to which only the most extraverted agreed.

Items with negative measures are those to which the most extraverted most likely agreed and to which moderately or minimally extraverted respondents also agreed.

This information is essentially the same as we got from the analysis of the dichotomized items above. In fact, here is the scatterplot of item measures from this analysis vs. the item measures from the analysis of the 1,2,3 vs 4,5 dichotomy above . . .

Note that the item measure values are not the same for the two analyses. These values are unique up to a linear transformation, as are the person measures. But they’re clearly very highly related: r = .978.
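The r = .978 can be checked directly from the two item tables in this lecture. The sketch below copies the ten item measures from each analysis (in entry order, items 1 to 10) and computes the Pearson correlation by hand; the variable and function names are my own.

```python
from statistics import mean

# Item measures in entry order (items 1..10), copied from the two Table 14.1 listings.
dichotomized = [1.74, -0.80, -1.54, 0.26, -1.49, -1.63, -0.40, 2.55, 0.14, 1.17]
five_category = [0.80, -0.52, -0.82, -0.01, -0.81, -0.86, -0.03, 1.14, 0.18, 0.93]

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

r = pearson(dichotomized, five_category)
print(round(r, 3))   # 0.978
```

The values differ in spread (the dichotomized measures are more spread out), but the ordering and relative spacing of the items are nearly the same, which is what a near-linear relationship means.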

PSY 5950 BF6 - 12

Thresholds – Average (Table 3.2 Rating (partial credit) scale)

This is new stuff.

+-------------------------------------------------------------------------+
|CATEGORY   STRUCTURE    | SCORE-TO-MEASURE    | 50% CUM.| COHERENCE|ESTIM|
| LABEL   MEASURE   S.E. | AT CAT. ----ZONE----|PROBABLTY| M->C C->M|DISCR|
|------------------------+---------------------+---------+----------+-----|
|   1     NONE           |( -3.66)  -INF  -2.75|         |  64%  16%|     | 1 STD Strongly Disagree
|   2     -2.47     .12  |  -1.61  -2.75   -.82|  -2.58  |  54%  48%|  .87| 2 D Disagree
|   3      -.43     .07  |   -.16   -.82    .56|   -.68  |  36%  54%|  .97| 3 N Neither A nor D
|   4       .02     .06  |   1.57    .56   3.02|    .34  |  62%  69%| 1.05| 4 A Agree
|   5      2.87     .08  |(  4.02)  3.02   +INF|   2.93  |  76%  29%| 1.06| 5 STA Strongly Agree
+-------------------------------------------------------------------------+

M->C = Does Measure imply Category?

C->M = Does Category imply Measure?

    SD                   D             N                 A                      SA
            -2.75              -0.82         0.56                    3.02
              |                  |             |                       |
-.----.----.----.----.----.----.----.----.----.----.----.----.----.----.----.----.-
-4       -3.0      -2.0      -1.0       0.0       1.0       2.0       3.0       4.0

In English . . .

The leftmost category, SD, label has value -3.66. The interval of “internal perceptions” that would yield a response of SD extends from –infinity to -2.75.

The D label has value -1.61. The interval of “internal perceptions” that would yield a response of D extends from -2.75 to -0.82.

The N label has value of -.16. The interval of “internal perceptions” that would yield a response of N extends from -.82 to +.56.

The A label has value of 1.57. The interval of “internal perceptions” that would yield a response of A extends from +.56 to +3.02.

The SA label has value of 4.02. The interval of “internal perceptions” that would yield a response of SA extends from +3.02 to +infinity.
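The “in English” reading above amounts to a lookup: find which interval a value falls in and report that interval's response. Here is a sketch of that lookup using the four zone boundaries from the table. It follows the simplified reading given in this lecture (a value in a zone yields that response), and `expected_category` is a made-up name.

```python
from bisect import bisect_right

# K-1 = 4 boundaries for K = 5 response categories (ZONE column of the table above).
ZONE_BOUNDARIES = [-2.75, -0.82, 0.56, 3.02]
LABELS = ["SD", "D", "N", "A", "SA"]

def expected_category(value):
    """Return the response label whose zone contains `value`."""
    return LABELS[bisect_right(ZONE_BOUNDARIES, value)]

# The five "label" values from the table each fall in their own zone.
for value in (-3.66, -1.61, -0.16, 1.57, 4.02):
    print(value, expected_category(value))
```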

PSY 5950 BF6 - 13

Items+Thresholds – Probabilities of giving each possible response vs Person measure

(Table 3.2)

This is a figure in which the probability of making a given response, such as SD – the 1s in the figure below - is plotted vs. Person Extraversion values.

CATEGORY PROBABILITIES: MODES - Structure measures at intersections

[Plot: PROBABILITY OF RESPONSE (y-axis, .0 to 1.0) vs. Person [MINUS] Item MEASURE (x-axis, -4 to 5), with one curve for each response category, drawn with the digits 1 through 5.]

For example, persons low in E should have low probability of choosing response 5. That probability should increase for persons higher in E, as the “5” curve shows.

The x-axis of this graph represents amount of Extraversion (strictly, the person measure minus the item measure), from -4 to +5.

The probability of making each response on the Y-axis is plotted as a function of the amount of person extraversion. The extreme response curves (1 and 5) should be highest at the extreme and go to zero toward the interior of the graph.

The interior response curves (2,3,4) should start small, increase to a max, then fall off again.

The 1s, 2s, 3s, 4s, and 5s should be ordered regularly from left to right.

For example, the 2 plot should be to the left of the 3 plot which should be to the left of the 4 plot.

If they weren’t that would mean that people with high Extraversion were more likely to disagree than to agree with E items, something that wouldn’t make sense.

These plots are nearly exactly what they should be.

People with low E are more likely to respond 1 or 2 than to respond 4 or 5.

People with high E are more likely to respond 4 or 5 than to respond 1 or 2.

People with medium amounts of E are most likely to respond 3, just as we would expect.
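Curves like these can be computed from the four structure measures in the thresholds table (-2.47, -.43, .02, 2.87) using the Andrich rating scale form of the Rasch model, in which the log-odds of choosing category k rather than k-1 is (person − item − threshold k). The sketch below is my own illustration of that formula, not output from the program, and `category_probs` is a made-up name.

```python
import math

# Structure measures (Andrich thresholds) from the MEASURE column of the thresholds table.
THRESHOLDS = [-2.47, -0.43, 0.02, 2.87]

def category_probs(person_minus_item, thresholds=THRESHOLDS):
    """Probabilities of responses 1..5 under the rating scale model."""
    logits = [0.0]                      # category 1 is the reference
    for tau in thresholds:
        logits.append(logits[-1] + (person_minus_item - tau))
    top = max(logits)                   # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

for x in (-4.0, -2.47, 0.0, 2.87, 5.0):
    probs = category_probs(x)
    print(f"{x:+.2f}", " ".join(f"{p:.2f}" for p in probs))
```

At x = -2.47 the probabilities of responses 1 and 2 are equal, and at x = 2.87 the probabilities of 4 and 5 are equal. That is what “structure measures at intersections” in the plot title refers to.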

The program also plots item-specific measures in Table 14.3, with an estimate of “where” each response is for each item. Too much for this presentation.

PSY 5950 BF6 - 14

Item Map

For comparison, the item map from our study of dichotomized items was shown earlier in this lecture.

TABLE 12.2 Bias Study Extraversion Items ZOU268WS.TXT Apr 2 12:27 2012

INPUT: 189 Persons 10 Items MEASURED: 189 Persons 10 Items 5 CATS 1.0.0

--------------------------------------------------------------------------------


Persons MAP OF Items

<more>|<rare>

5 . +

|

|

|

4 .# +

|

.# |

T|

3 ## +

.# |

# |

### |

2 ####### S+

.### |

###### |T

##### | 08R Don't draw attention

1 .### + 10R Quiet around strangers

.######## |S 01 Life of party

.###### M|

####### | 09 Don't mind being center

0 ##### +M 04R Keep in background 07 Talk to diff people

.######### |

.###### | 02R Don't talk a lot

### S|S 03 Comfortable around people 05 Start conversations 06R Have little to say

-1 ### +

## |

## |T

.# |

-2 # T+

# |

# |

. |

-3 +

|

|

|

-4 . +

<less>|<frequ>

EACH '#' IS 2.

As can be seen from this display, there are people both above and below the items.

This means that there are people so extraverted that they would probably respond 5 to all items and that there are people so unextraverted that they might respond 1 to all items.

As we saw above, the ordering of items is essentially the same for the analysis based on 5 responses per item vs. the dichotomized items.

PSY 5950 BF6 - 15

Person information

Obviously, there are too many respondents to usefully list them all. But it is useful to identify those whose response patterns were not consistent with the assumptions of the Rasch model.

These are persons with high Extraversion on the average who too often disagreed with “Low E” items or persons who had low Extraversion on the average who too often agreed with “High E” items.

Here they are from Table 6.1. They might be persons who could be treated as outliers???!!!

Person STATISTICS: MISFIT ORDER

+------------------------------------------------------------------------------------+

|ENTRY TOTAL MODEL| INFIT | OUTFIT |PTMEA|EXACT MATCH| |

|NUMBER SCORE COUNT MEASURE S.E. |MNSQ ZSTD|MNSQ ZSTD|CORR.| OBS% EXP%| Person|

|------------------------------------+----------+----------+-----+-----------+-------|

| 102 38 10 1.22 .45|7.06 6.0|7.53 6.1|A .02| .0 55.0| 3118 |

| 14 31 10 .01 .39|2.90 3.2|2.92 3.2|B-.11| 30.0 45.1| 3016 |

| 171 27 10 -.59 .39|2.91 3.4|2.87 3.2|C .23| 20.0 40.7| 3193 |

| 176 45 10 3.09 .60|2.73 2.8|2.37 2.5|D .32| 60.0 63.4| 3209 |

| 105 30 10 -.14 .39|2.69 3.0|2.70 3.0|E .46| 20.0 46.6| 3121 |

| 116 39 10 1.43 .47|2.68 2.6|2.52 2.4|F .25| 40.0 56.1| 3132 |

| 123 41 10 1.90 .50|2.63 2.5|2.36 2.3|G .75| 30.0 58.6| 3140 |

| 19 38 10 1.22 .45|2.49 2.4|2.60 2.5|H .54| 10.0 55.0| 3021 |

| 112 38 10 1.22 .45|2.57 2.5|2.55 2.4|I .62| 20.0 55.0| 3128 |

| 57 42 10 2.16 .52|2.54 2.4|2.06 1.9|J .77| 50.0 57.5| 3070 |

| 165 21 10 -1.53 .42|2.54 2.7|2.36 2.4|K .39| 20.0 45.3| 3187 |

| 61 33 10 .33 .40|2.49 2.6|2.51 2.6|L .37| 30.0 47.0| 3074 |

| 185 41 10 1.90 .50|2.45 2.3|2.10 1.9|M .61| 50.0 58.6| 3219 |

| 138 23 10 -1.20 .40|2.40 2.6|2.34 2.5|N .14| 40.0 45.6| 3157 |

| 15 40 10 1.65 .48|2.35 2.2|2.19 2.0|O .43| 40.0 57.4| 3017 |

| 101 27 10 -.59 .39|2.21 2.4|2.30 2.5|P-.01| 20.0 40.7| 3117 |

| 170 43 10 2.44 .54|2.18 2.0|2.26 2.2|Q .30| 50.0 59.6| 3192 |

| 156 26 10 -.74 .39|1.98 2.1|2.18 2.3|R .28| 50.0 42.4| 3178 |

| 43 30 10 -.14 .39|2.06 2.1|2.15 2.2|S-.30| 40.0 46.6| 3053 |

| 98 28 10 -.44 .39|2.12 2.3|2.05 2.1|T .18| 30.0 40.4| 3114 |

| 25 39 10 1.43 .47|2.00 1.8|1.83 1.5|U .41| 50.0 56.1| 3027 |

| 177 44 10 2.75 .57|1.94 1.8|1.75 1.6|V .67| 70.0 61.2| 3210 |

| 49 46 10 3.46 .64|1.85 1.7|1.46 1.0|W .67| 70.0 67.5| 3061 |

| 111 23 10 -1.20 .40|1.80 1.7|1.72 1.6|X .31| 50.0 45.6| 3127 |

| 129 40 10 1.65 .48|1.77 1.4|1.80 1.5|Y .53| 20.0 57.4| 3146 |

| 84 25 10 -.89 .39|1.74 1.7|1.68 1.5|Z .59| 30.0 45.5| 3100 |

| BETTER FITTING OMITTED +----------+----------+-----| | |

| 5 26 10 -.74 .39| .38 -1.9| .35 -2.0| .66| 60.0 42.4| 3006 |

| 71 27 10 -.59 .39| .36 -2.0| .33 -2.1|z .78| 60.0 40.7| 3085 |

| 153 35 10 .66 .42| .35 -1.8| .29 -2.0|y .85| 70.0 51.3| 3175 |

| 180 30 10 -.14 .39| .35 -2.0| .34 -2.0|x .80| 60.0 46.6| 3213 |

| 64 36 10 .84 .43| .34 -1.9| .29 -2.0|w .60| 70.0 54.6| 3077 |

| 1 32 10 .17 .40| .31 -2.2| .34 -1.9|v .57| 70.0 46.4| 3001 |

| 39 19 10 -1.90 .45| .26 -2.2| .34 -1.9|u .43| 90.0 53.2| 3047 |

| 67 29 10 -.29 .39| .31 -2.3| .32 -2.2|t .64| 60.0 44.1| 3081 |

| 76 41 10 1.90 .50| .28 -1.9| .30 -1.9|s .40| 90.0 58.6| 3090 |

| 122 39 10 1.43 .47| .29 -1.9| .24 -2.2

|r .37| 70.0 56.1| 3139 |

| 150 36 10 .84 .43| .28 -2.2

| .22 -2.3

|q .80| 90.0 54.6| 3172 |

| 113 37 10 1.02 .44| .26 -2.2

| .22 -2.3

|p .64| 80.0 55.2| 3129 |

| 132 37 10 1.02 .44| .26 -2.2

| .22 -2.3

|o .64| 80.0 55.2| 3149 |

| 155 35 10 .66 .42| .25 -2.3

| .21 -2.4

|n .83| 90.0 51.3| 3177 |

| 148 35 10 .66 .42| .24 -2.4

| .21 -2.4

|m .83| 70.0 51.3| 3170 |

| 45 33 10 .33 .40| .24 -2.5

| .22 -2.5

|l .91| 80.0 47.0| 3055 |

| 27 40 10 1.65 .48| .23 -2.2

| .22 -2.3

|k .00| 90.0 57.4| 3029 |

| 48 40 10 1.65 .48| .23 -2.2

| .22 -2.3

|j .00| 90.0 57.4| 3060 |

| 149 40 10 1.65 .48| .23 -2.2

| .22 -2.3

|i .00| 90.0 57.4| 3171 |

| 183 40 10 1.65 .48| .23 -2.2

| .22 -2.3

|h .00| 90.0 57.4| 3217 |

| 135 26 10 -.74 .39| .22 -2.9

| .20 -2.9

|g .81| 80.0 42.4| 3152 |

| 44 39 10 1.43 .47| .19 -2.5

| .17 -2.6

|f .53| 90.0 56.1| 3054 |

| 77 39 10 1.43 .47| .19 -2.5

| .17 -2.6

|e .53| 90.0 56.1| 3092 |

| 35 38 10 1.22 .45| .18 -2.6

| .15 -2.7

|d .67| 90.0 55.0| 3043 |

| 182 23 10 -1.20 .40| .13 -3.4

| .14 -3.3

|c .75| 90.0 45.6| 3215 |

| 3 38 10 1.22 .45| .14 -2.9

| .12 -2.9

|b .72| 90.0 55.0| 3004 |

| 68 36 10 .84 .43| .10 -3.4

| .10 -3.2

|a .87| 90.0 54.6| 3082 |

Those persons with negative Infit ZSTD values (Infit MNSQ well below 1) were actually persons for whom the model fit “too well”.


Comparison of Person measures with Likert summated scale scores

R-squared = .950, so r = .975.

As we have come to expect, the Rasch measures expand the tails, identifying differences between respondents in the tails whose summated scores are closer together.

Overall, the relationship is very very strong. So I would surmise that summated scales would suffice for this scale except in those rare instances where more precise distinctions among persons in the extremes of the distribution were of interest.
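One way to see why the tails get expanded: Rasch measures are in logits, and a logit is a nonlinear function of the proportion of the possible score obtained. In the misfit table above, for instance, summated scores of 30 and 31 correspond to measures of -.14 and .01, while scores of 45 and 46 correspond to 3.09 and 3.46. The sketch below is only an illustration in Python, not the Winsteps estimation procedure: it applies a plain log-odds transformation to a rescaled summated score, and the function name and the 10-item, 1-to-5 defaults are assumptions made for the example.

```python
import math

def rough_logit(total, n_items=10, low=1, high=5):
    """Log-odds of the proportion of the possible score range obtained.

    Illustration only: Winsteps estimates person measures iteratively,
    it does not use this shortcut.
    """
    min_total = n_items * low
    max_total = n_items * high
    p = (total - min_total) / (max_total - min_total)
    return math.log(p / (1 - p))

# The same 2-point raw-score difference, in the middle and in the upper tail
middle_gap = rough_logit(31) - rough_logit(29)
tail_gap = rough_logit(48) - rough_logit(46)
print(round(middle_gap, 2), round(tail_gap, 2))
```

The same raw-score difference is worth several times as many logits near the top of the score range as in the middle, which is the "stretching" seen in the scatterplot.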


5-category Rasch Measures vs Dichotomized Rasch Measures

What about the relationship of the Person measures from the analysis of 5-category responses to Person measures from the analysis of dichotomized (1,2,3 vs 4,5) data? Here’s that scatterplot.

Remember that the dichotomization process resulted in persons having identical scores on the dichotomized measure even though they had different combinations of 1, 2, 3, 4, and 5 responses.

That’s why there are multiple points arrayed vertically at each dichotomized Rasch value.
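A small made-up example shows how this happens. The two response patterns below are hypothetical, and the function name is just for illustration; the cutoff of 4 follows the (1,2,3 vs 4,5) split described above.

```python
def dichotomize(responses, cutoff=4):
    """Score 4 and 5 as agreement (1), and 1, 2, 3 as non-agreement (0)."""
    return [1 if r >= cutoff else 0 for r in responses]

# Two hypothetical respondents with different 5-category patterns
person_a = [4, 4, 4, 4, 4, 3, 3, 3, 3, 3]
person_b = [5, 5, 5, 5, 5, 1, 1, 1, 1, 1]

print(sum(person_a), sum(dichotomize(person_a)))
print(sum(person_b), sum(dichotomize(person_b)))
```

Both respondents get the same dichotomized count, so they would share one dichotomized measure, even though their 5-category summated scores differ.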

The r value is “only” .885, and there is noticeable curvature on the left and right. This suggests to me that these two analytic strategies are not equivalent. My money would be on the analysis of the 5-category responses.


A 7-point Likert Extraversion scale.

The study was conducted in 2009 or so. Its initial purpose was to develop a questionnaire for the Big Five for which equal numbers of positively-worded and negatively-worded pairs of items could be used for each of the Big Five dimensions. Interestingly, we’ve yet to pursue that aspect of the study, although we’ve presented papers on other analyses that were done with the data.

The moral of that story is “Go with the flow.”

The data are in Bond&FoxCh6C.txt.

&INST ; initial line (can be omitted)

TITLE = "Balanced Scale Extraversion Items"

PERSON = Person ; persons are ...

ITEM = Item ; items are ...

ITEM1 = 5 ; column of response to first item in data record

NI = 10 ; number of items

NAME1 = 1 ; column of first character of person label

NAMELEN = 4 ; length of person label

XWIDE = 1 ; number of columns per item response

TOTAL = Yes ; show total raw scores

CHART = Yes ; produce across-pathway picture

MNSQ = No ; use Standardized fit statistics

STBIAS = Yes ; Adjust for estimation bias

MAXPAGE = 60 ; Maximum lines per page

IREFER= FFFFFFFFFF ; Forward and Reversed items

CODES = 1234567 ; valid codes in data file

IVALUEF = 1234567 ; Forward items

IVALUER = 7654321 ; Reversed items

CLFILE = * ; category label file for category naming

1+1 "Completely Inaccurate" ; Item 1 is a forward item

1+2 "Very Inaccurate"

1+3 "Probably Inaccurate"

1+4 "Sometimes Acc, sometimes Inacc"

1+5 "Probably Accurate"

1+6 "Very Accurate"

1+7 "Completely Accurate"

* ; end of CLFILE=* list

&END

01 Life of party

02R Don't talk a lot

03 Comfortable around people

04R Keep in background

05 Start conversations

06R Have little to say

07 Talk to diff people

08R Don't draw attention

09 Don't mind being center

10R Quiet around strangers

END LABELS

50013336454434

206 cases total.


The SPSS Analysis of the Extraversion scale . . .

Reliability Statistics

Cronbach's Alpha    N of Items
      .885              10

Item-Total Statistics

        Scale Mean if    Scale Variance if    Corrected Item-Total    Cronbach's Alpha if
        Item Deleted     Item Deleted         Correlation             Item Deleted
e1         43.15            92.249                .575                    .877
e14        42.36            84.470                .687                    .869
e3         42.16            89.895                .660                    .872
e4         42.76            86.541                .695                    .868
e5         42.19            88.985                .703                    .869
e6         42.34            91.493                .516                    .881
e7         42.54            87.031                .606                    .875
e8         43.42            91.752                .510                    .881
e9         42.76            87.450                .635                    .873
e10        43.46            83.330                .631                    .874

The distribution of summated scores is quite nicely unimodal and symmetric. None of the items has a markedly deleterious effect on reliability.


The Rasch analysis of the original items.

Both the person reliability value of .86 (expected r between same people with different items) and the item reliability value of .97 (expected r between same items with different people) are acceptable. CAN RELIABILITY BE TOO HIGH?? As in Likert scaling????
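Winsteps prints a separation index (SEP.) next to each reliability (REL.), and the two carry the same information: reliability = separation² / (1 + separation²). The short Python sketch below checks that relation against the values 2.46 and 5.77 that appear in the table heading that follows; the function name is illustrative, and small discrepancies can arise because the printed separations are rounded.

```python
def reliability_from_separation(separation):
    """Convert a separation index to the corresponding reliability."""
    return separation ** 2 / (1 + separation ** 2)

person_rel = reliability_from_separation(2.46)  # person REAL SEP.
item_rel = reliability_from_separation(5.77)    # item REAL SEP.
print(round(person_rel, 2), round(item_rel, 2))
```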

Item statistics ordered by number

TABLE 14.1

Balanced Scale Extraversion Items ZOU007WS.TXT Apr 4 9:58 2012

INPUT: 206 Persons 10 Items MEASURED: 206 Persons 10 Items 7 CATS 1.0.0

--------------------------------------------------------------------------------

Person: REAL SEP.: 2.46 REL.: .86 ... Item: REAL SEP.: 5.77 REL.: .97

Item STATISTICS: ENTRY ORDER

+-----------------------------------------------------------------------------------------------------------+

|ENTRY TOTAL MODEL| INFIT | OUTFIT |PTMEA|EXACT MATCH| |

|NUMBER SCORE COUNT MEASURE S.E. |MNSQ ZSTD|MNSQ ZSTD|CORR.| OBS% EXP%| Item |

|------------------------------------+----------+----------+-----+-----------+------------------------------|

| 1 888 206 .38 .06| .82 -1.9| .85 -1.6| .67| 40.7 35.5| 01 Life of party |

| 2 1050 206 -.30 .07|1.10 1.0|1.04 .4| .70| 37.7 37.6| 02R Don't talk a lot |

| 3 1092 206 -.50 .07| .78 -2.4| .74 -2.7| .68| 45.1 38.0| 03 Comfortable around people|

| 4 968 206 .06 .06| .79 -2.2| .82 -1.9| .72| 47.5 36.7| 04R Keep in background |

| 5 1085 206 -.47 .07| .69 -3.4| .66 -3.7| .72| 47.1 37.9| 05 Start conversations |

| 6 1054 206 -.32 .07|1.23 2.2|1.27 2.4| .59| 40.2 37.7| 06R Have little to say |

| 7 1013 206 -.14 .07|1.20 2.0|1.24 2.2| .65| 34.3 37.8| 07 Talk to diff people |

| 8 833 206 .60 .06|1.11 1.2|1.19 1.9| .60| 30.4 34.1| 08R Don't draw attention |

| 9 969 206 .05 .06| .97 -.3| .95 -.4| .69| 37.3 36.7| 09 Don't mind being center |

| 10 825 206 .63 .06|1.28 2.7|1.26 2.5| .70| 33.3 33.5| 10R Quiet around strangers |

|------------------------------------+----------+----------+-----+-----------+------------------------------|

| MEAN 963.7 204.0 .00 .07|1.00 -.1|1.00 -.1| | 39.4 36.5| |

| S.D. 94.5 .0 .40 .00| .20 2.1| .22 2.2| | 5.6 1.6| |

+-----------------------------------------------------------------------------------------------------------+

Wow – many of the items – most of the items – exhibit unusual fit statistics.

The negative Infit ZSTD values (MNSQ below 1) indicate “very good, perhaps too good” fit by the model.

The positive Infit ZSTD values (MNSQ above 1) indicate poor fit by the model.


Item Statistics ordered by Measure

TABLE 13.1 Balanced Scale Extraversion Items ZOU007WS.TXT Apr 4 9:58 2012

INPUT: 206 Persons 10 Items MEASURED: 206 Persons 10 Items 7 CATS 1.0.0

--------------------------------------------------------------------------------

Person: REAL SEP.: 2.46 REL.: .86 ... Item: REAL SEP.: 5.77 REL.: .97

Item STATISTICS: MEASURE ORDER

+-----------------------------------------------------------------------------------------------------------+

|ENTRY TOTAL MODEL| INFIT | OUTFIT |PTMEA|EXACT MATCH| |

|NUMBER SCORE COUNT MEASURE S.E. |MNSQ ZSTD|MNSQ ZSTD|CORR.| OBS% EXP%| Item |

|------------------------------------+----------+----------+-----+-----------+------------------------------|

| 10 825 206 .63 .06|1.28 2.7|1.26 2.5| .70| 33.3 33.5| 10R Quiet around strangers |

| 8 833 206 .60 .06|1.11 1.2|1.19 1.9| .60| 30.4 34.1| 08R Don't draw attention |

| 1 888 206 .38 .06| .82 -1.9| .85 -1.6| .67| 40.7 35.5| 01 Life of party |

| 4 968 206 .06 .06| .79 -2.2| .82 -1.9| .72| 47.5 36.7| 04R Keep in background |

| 9 969 206 .05 .06| .97 -.3| .95 -.4| .69| 37.3 36.7| 09 Don't mind being center |

| 7 1013 206 -.14 .07|1.20 2.0|1.24 2.2| .65| 34.3 37.8| 07 Talk to diff people |

| 2 1050 206 -.30 .07|1.10 1.0|1.04 .4| .70| 37.7 37.6| 02R Don't talk a lot |

| 6 1054 206 -.32 .07|1.23 2.2|1.27 2.4| .59| 40.2 37.7| 06R Have little to say |

| 5 1085 206 -.47 .07| .69 -3.4| .66 -3.7| .72| 47.1 37.9| 05 Start conversations |

| 3 1092 206 -.50 .07| .78 -2.4| .74 -2.7| .68| 45.1 38.0| 03 Comfortable around people|

|------------------------------------+----------+----------+-----+-----------+------------------------------|

| MEAN 963.7 204.0 .00 .07|1.00 -.1|1.00 -.1| | 39.4 36.5| |

| S.D. 94.5 .0 .40 .00| .20 2.1| .22 2.2| | 5.6 1.6| |

+-----------------------------------------------------------------------------------------------------------+

We’ll see whether the item difficulty measures from this analysis correlate with those from the previous analysis of the same scale using different people.


Average Thresholds ( Output Tables -> 3.2 Rating (partial credit) scale to get both displays)

+--------------------------------------------------------------------------

|CATEGORY STRUCTURE | SCORE-TO-MEASURE | 50% CUM.| COHERENCE|ESTIM|

| LABEL MEASURE S.E. | AT CAT. ----ZONE----|PROBABLTY| M->C C->M|DISCR|

|------------------------+---------------------+---------+----------+-----+

| 1 NONE | ( -2.79) -INF -2.18| | 100% 1%| | 1 Completely Inaccurate

| 2 -1.29 .14 | -1.58 -2.18 -1.19| -1.80 | 39% 24%| .55| 2 Very Inaccurate

| 3 -.93 .09 | -.85 -1.19 -.51| -1.12 | 23% 29%| .87| 3 Probably Inaccurate

| 4 -1.17 .07 | -.13 -.51 .28| -.66 | 46% 42%| .98| 4 Sometimes Acc sometimes Inacc

| 5 .74 .06 | .70 .28 1.13| .39 | 31% 52%| 1.08| 5 Probably Accurate

| 6 .81 .06 | 1.65 1.13 2.43| 1.04 | 42% 43%| 1.03| 6 Very Accurate

| 7 1.85 .08 | ( 3.17) 2.43 +INF | 2.14 | 82% 29%| 1.13| 7 Completely Accurate

+--------------------------------------------------------------------------

M->C = Does Measure imply Category?

C->M = Does Category imply Measure?

Below is a map of the Rasch Response values, with labels to help show how they correspond to the Likert values.

The Likert response values seen by respondents are successive integers, with equal distances between adjacent values.

On the figure below, I’ve attempted to place the labels “CI”, “VI”, etc at locations corresponding to their estimated values.

1 2 3 4 5 6 7

CI VI PI S PA VA CA

-.----.----.----.----.----.----.----.----.----.----.----.----.----.----.----.----.-

-4 -3.0 -2.0 -1.0 0.0 1.0 2.0 3.0 4.0

Note that the ordering of the label estimates:

-2.79, -1.58, -0.85, -0.13, +0.70, +1.65, and 3.17 is what we would expect.

The estimated differences between labels are

1.21 0.73 0.72 0.83 0.95 1.52

These differences are not exactly equal as is assumed by the labels seen by respondents.

But the sample size is only 206 and the differences between values of adjacent labels are “roughly” equal, close enough to being equal to justify the “equal intervals” assumption of Likert scale scoring.
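The differences listed above are simple subtractions of adjacent “AT CAT.” values. A minimal Python sketch, with the seven values copied from the category table above and illustrative variable names:

```python
# "AT CAT." measures for response categories 1 to 7, from the table above
at_category = [-2.79, -1.58, -0.85, -0.13, 0.70, 1.65, 3.17]

# Distance from each category to the next one up
gaps = [round(upper - lower, 2)
        for lower, upper in zip(at_category, at_category[1:])]
print(gaps)

# The four interior gaps, leaving out the two involving an extreme category
interior = gaps[1:-1]
print(min(interior), max(interior))
```

The two largest gaps are the ones that involve the extreme categories, whose values Winsteps prints in parentheses; the interior gaps are closer to one another.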


Plot of Rasch response values vs. Likert “presented to respondent” response values

The linearity of the relationship and the fairly uniform locations of the Rasch response values certainly support the common practice of calling the labels “1”, “2”, “3”, etc.

(See 5130 lecture on Scale construction.)

CATEGORY PROBABILITIES: MODES - Structure measures at intersections

P ++-------+-------+-------+-------+-------+-------+-------++

R 1.0 + +

O | 777|

B .8 +1111 77777 +

A | 111 777 |

B .6 + 11 777 +

I .5 + 11 444 777 +

L .4 + 11 4444 4444 66666**666 +

I | 2222222**244 555****5 77 66666 |

T .2 +222222 33*******33**55 666 4***555 66666 +

Y | 3333**44 **********3**777 44444555555 6666|

.0 +*********************************************************+

O ++-------+-------+-------+-------+-------+-------+-------++

F -3 -2 -1 0 1 2 3 4

Person [MINUS] Item MEASURE

Although it’s difficult to follow the curves for each response category on this map, it appears that the probabilities of using the different response categories make sense. The modal response for a person whose E was -3 was 1. The modal response for a person whose E was 1.5 was 6, and so forth.
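The modal responses can also be checked numerically. Under the rating scale form of the Rasch model, the probability of responding in category x is proportional to the exponential of the sum of (person minus item minus threshold) over the thresholds up to x. The Python sketch below is an illustration, not Winsteps code: it assumes the six STRUCTURE MEASURE values in the category table above are the thresholds, and the function names are made up for the example.

```python
import math

# Structure (threshold) measures for categories 2 to 7, from the table above
THRESHOLDS = [-1.29, -0.93, -1.17, 0.74, 0.81, 1.85]

def category_probabilities(person_minus_item, thresholds=THRESHOLDS):
    """Rating scale model probabilities for categories 1..len(thresholds)+1."""
    # Cumulative sums of (person - item - threshold); category 1 has an empty sum
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (person_minus_item - tau))
    peak = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

def modal_category(person_minus_item):
    """Most probable response category (1-based) at this location."""
    probs = category_probabilities(person_minus_item)
    return probs.index(max(probs)) + 1

for location in (-3.0, 0.0, 1.5):
    print(location, modal_category(location))
```

With these thresholds the sketch agrees with the reading of the map: category 1 is modal at -3 and category 6 is modal at 1.5.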


Item map (Output Tables -> 12. Item map)

TABLE 12.2 Balanced Scale Extraversion Items ZOU007WS.TXT Apr 4 9:58 2012

INPUT: 206 Persons 10 Items MEASURED: 206 Persons 10 Items 7 CATS 1.0.0

--------------------------------------------------------------------------------

Persons MAP OF Items

<more>|<rare>

4 # +

|

|

.# |

|

3 # +

# |

|

# T|

### |

2 # +

## |

### S|

####### |

.##### |

1 .####### +

.######### |T

###### M| 08R Don't draw attention

10R Quiet around strangers

########## |S 01 Life of party

.####### |

0 .######### +M 04R Keep in background

09 Don't mind being center

########## | 07 Talk to diff people

.#### S|S 02R Don't talk a lot

03 Comfortable around people

05 Start conversations

06R Have little to say

#### |

### |T

-1 .# +

# |

.# T|

. |

|

-2 +

<less>|<frequ>

EACH '#' IS 2.

As was the case with the Bias data above, there are persons with less Extraversion than the lowest E item and there are persons with more Extraversion than the item with the largest measure of E.


Person statistics

As above, it’s not possible to display all the person measures.

To save space, I won’t display the person information in misfit order.

The relationship of summated scores to the Rasch measures.

The “stretching” of the extremely high scores for the Rasch measures is apparent here. There is no such “stretching” of the low scores, perhaps because there was no one in this sample who scored lower than a mean of 2 on the 1 to 7 scale.

So the Rasch measures are telling us that there are a few extraordinarily extraverted people in the sample, something the “compressed-tail” Likert measures didn’t tell us about.


Comparison of item measures

Since we have item measures for the same items from two studies, with different response formats (1 to 5 vs 1 to 7), it is of interest to compare them.

Here they are from the two studies (taken from the above tables) . . .

Clearly, as we would hope, the items that represent the most extraversion when a 5-point response scale is used are the same items that represent the most extraversion when a 7-point scale is used. The r-squared is .957, which means the r is .978. Recall that the estimated item reliabilities from the two studies were .98 and .97.


INFIT and inconsistency

Infit

Infit stands for “Inlier-sensitive” or “Information-weighted” fit. The infit mean square is based on the squared differences between observed responses (0 or 1 in right/wrong tests; response value in Likert scales) and the values expected under the Rasch model (the probability of a correct response in right/wrong tests; the expected response value in Likert scales). So infit is based on squared residuals. Through computational magic, the mean of the quantity computed for data for which the Rasch model is an appropriate fit is 1.0. (This is based on chi-square divided by degrees of freedom computations.)

Values of Raw Infit about equal to 1 represent goodness of fit that would be expected if the model were true with the exception of random variability.

Values larger than 1 represent differences between actual responses and predicted probabilities that are greater than would be expected by chance.

Values less than 1 represent differences that are smaller than would be expected by chance, perhaps representing goodness of fit that is “too good.”

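The Z (sometimes it’s called t) statistic that is printed (ZSTD) is a standardized version of the infit mean square for a person: it is rescaled, using the expected value of 1 and the estimated standard deviation of the mean square, so that it behaves approximately like a unit normal deviate when the data fit the model. So

Z ≈ 0 -> Goodness of fit for a person that would be expected.

Z >> 0 -> Goodness of fit that is poor; residuals are larger than expected.

Z << 0 -> Goodness of fit that might be “too good”.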
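To make the mean squares concrete, here is a minimal Python sketch for the right/wrong (dichotomous) case. It assumes the person measure and item difficulties are already known, uses the standard definitions (infit is the sum of squared residuals divided by the sum of the model variances; outfit is the mean of the squared standardized residuals), and leaves out the conversion to ZSTD. The function names and the five item difficulties are hypothetical.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of a correct (agree) response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def person_fit(responses, theta, difficulties):
    """Return (infit mean square, outfit mean square) for one person."""
    squared_residuals = 0.0
    variances = 0.0
    squared_z = 0.0
    for x, d in zip(responses, difficulties):
        p = rasch_probability(theta, d)
        variance = p * (1 - p)           # model variance of the response
        residual_sq = (x - p) ** 2
        squared_residuals += residual_sq
        variances += variance
        squared_z += residual_sq / variance
    infit = squared_residuals / variances
    outfit = squared_z / len(responses)
    return infit, outfit

# Hypothetical items ordered from easy to hard, and a person at 0 logits
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
expected_pattern = [1, 1, 1, 0, 0]   # succeeds on easy items, fails hard ones
reversed_pattern = [0, 0, 1, 1, 1]   # fails easy items, succeeds on hard ones

infit_expected, outfit_expected = person_fit(expected_pattern, 0.0, difficulties)
infit_reversed, outfit_reversed = person_fit(reversed_pattern, 0.0, difficulties)
print(round(infit_expected, 2), round(infit_reversed, 2))
```

The orderly pattern gives mean squares below 1 (fit that is "too good"), while the reversed pattern gives mean squares well above 1 (misfit).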


Inconsistency

In our research program here at UTC, we’ve been investigating a concept we call response inconsistency. This represents differences among the responses a participant makes to items from the same personality dimension.

For a single dimension, our measure of inconsistency is computed as the standard deviation of a person’s responses to the items from that dimension.

In our research we’ve computed the inconsistency for each of the five dimensions of the Big 5 and then averaged them to get a measure that had acceptable reliability (alpha around .7 or so).
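The computation just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used in the research: the responses are made up, only two dimensions are shown, the sample (n - 1) standard deviation is assumed, and reverse-worded items are assumed to have been recoded already.

```python
import statistics

def inconsistency(responses_by_dimension):
    """Mean of the within-dimension standard deviations for one person."""
    sds = [statistics.stdev(responses)
           for responses in responses_by_dimension.values()]
    return statistics.mean(sds)

# Hypothetical respondents, two dimensions shown for brevity
consistent = {"E": [4, 4, 4, 4, 4], "A": [5, 5, 5, 5, 5]}
inconsistent = {"E": [1, 5, 2, 5, 1], "A": [5, 1, 4, 2, 5]}

print(inconsistency(consistent), round(inconsistency(inconsistent), 2))
```

A respondent who gives the same answer to every item in a dimension scores 0; the more the answers within a dimension scatter, the larger the value.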

We’ve found that inconsistency measured as we compute it is related to reliability. If we split a sample into inconsistent responders vs. consistent responders, the reliabilities of scales computed from the consistent responders are significantly larger than those of the inconsistent responders.

We’ve also found that inconsistency measured as we do is related to validity. Validity of conscientiousness as a predictor of UGPA is larger for consistent responders than it is for inconsistent responders.

Reddock, C. R., Biderman, M. D., & Nguyen, N. T. (2011). The relationship of reliability and validity of personality tests to frame-of-reference instructions and within-person inconsistency. International Journal of Selection and Assessment, 19, 119-131.

Biderman, M. D., & Reddock, C. M. (2012). The relationship of scale reliability and validity to respondent inconsistency. Personality and Individual Differences, 52, 647-651.

Khan, R. (2012). An examination of two methods of measuring inconsistency. MS Thesis. University of Tennessee at Chattanooga.

Khan, R., & Biderman, M. D. (2013). Behavioral variability measured using Likert and frequency based response formats. Paper accepted for presentation at the 28th annual conference of The Society for Industrial and Organizational Psychology, Houston, TX, April.

Look at inconsistency across dimensions – especially Agreeableness. Thanks, Aaron.


Is person infit computed from Rasch analyses related to inconsistency computed from Likert analyses?

I performed Rasch analysis of the data of each of the Big Five scales from the Balanced Scale study.

For each analysis, I copied the Z infit measures for all 206 respondents and pasted them into SPSS. I then computed the mean of the five Z infit measures – for E, A, C, S, and O.

Next I computed the SD of raw responses to items within each dimension and averaged those 5 standard deviations – E, A, C, S, and O to get a Likert measure of Inconsistency.

Finally, I correlated the mean of the within-dimension standard deviations (our Likert measure of inconsistency) with the mean of the Z infit measures. Here’s the result . . .

The r-squared is .751, meaning the correlation between the two measures is .867.

N = 206.

So the Rasch analyses and the Likert analyses are connected in this way.


Does Rasch Conscientiousness predict GPA better than Likert Conscientiousness?

A combined sample of N=1143.

We have GPA information on all participants. (No names, just GPAs.)

The correlation of Likert Conscientiousness scales with GPA is .180 (p < .001), about as would be expected.

The question is, if we measured Conscientiousness as a Rasch measure, would it be more (or less or equally) valid than the Likert measure?


Here’s the Bond&Fox Steps data file – the first part of it, at least.

This file is: Bond&FoxChapter6GCombo.txt in the Bond&Fox data folder.

&INST ; initial line (can be omitted)

TITLE = "RosOPQFORBalSc Conscientiousness Items"

PERSON = Person ; persons are ...

ITEM = Item ; items are ...

ITEM1 = 5 ; column of response to first item in data record

NI = 10 ; number of items

NAME1 = 1 ; column of first character of person label

NAMELEN = 4 ; length of person label

XWIDE = 1 ; number of columns per item response

TOTAL = Yes ; show total raw scores

CHART = Yes ; produce across-pathway picture

MNSQ = No ; use Standardized fit statistics

STBIAS = Yes ; Adjust for estimation bias

MAXPAGE = 60 ; Maximum lines per page

IREFER= FFFFFFFFFF ; Forward and Reversed items

CODES = 1234567 ; valid codes in data file

IVALUEF = 1234567 ; Forward items

IVALUER = 7654321 ; Reversed items

CLFILE = * ; category label file for category naming

1+1 "Completely Inaccurate" ; Item 1 is a forward item

1+2 "Very Inaccurate"

1+3 "Probably Inaccurate"

1+4 "Sometimes Acc, sometimes Inacc"

1+5 "Probably Accurate"

1+6 "Very Accurate"

1+7 "Completely Accurate"

* ; end of CLFILE=* list

&END

1 Am always prepared

2R Leave my belongings around

3 Pay attention to details

4R Make a mess of things

5 Get chores done right away

6R Often forget to put things back

7 Like order

8R Shirk my duties

9 Follow a schedule

10 Am exacting in my work

END LABELS

60275544334434

60366746575766


The first thing to do is to do the usual checks of item appropriateness.

So Output Tables -> 13. Item: Measure

TABLE 13.1 RosOPQFORBalSc Conscientiousness Items ZOU507WS.TXT Mar 27 19:31 2014

INPUT: 1143 Persons 10 Items MEASURED: 1143 Persons 10 Items 7 CATS 1.0.0

--------------------------------------------------------------------------------

Person: REAL SEP.: 1.97 REL.: .79 ... Item: REAL SEP.: 9.87 REL.: .99

Item STATISTICS: MEASURE ORDER

+----------------------------------------------------------------------------------------------------------------+

|ENTRY TOTAL MODEL| INFIT | OUTFIT |PTMEA|EXACT MATCH| |

|NUMBER SCORE COUNT MEASURE S.E. |MNSQ ZSTD|MNSQ ZSTD|CORR.| OBS% EXP%| Item |

|------------------------------------+----------+----------+-----+-----------+-----------------------------------|

| 5 4623 1143 .61 .03| .93 -1.7| .95 -1.2| .65| 36.3 33.8| 5 Get chores done right away |

| 6 5108 1143 .28 .03|1.17 4.0|1.17 3.9| .67| 31.7 34.9| 6R Often forget to put things back|

| 2 5373 1143 .10 .03|1.33 7.3|1.32 6.9| .62| 29.2 36.0| 2R Leave my belongings around |

| 8 5412 1142 .07 .03| .96 -1.0|1.09 2.0| .49| 37.8 35.8| 8R Shirk my duties |

| 10 5539 1143 -.02 .03| .87 -3.3| .93 -1.6| .51| 42.8 36.0| 10 Am exacting in my work |

| 9 5608 1143 -.08 .03|1.09 2.1|1.08 2.0| .61| 36.7 35.8| 9 Follow a schedule |

| 1 5663 1143 -.12 .03| .66 -9.3| .68 -8.6| .63| 44.8 36.2| 1 Am always prepared |

| 4 5706 1143 -.15 .03| .96 -.9| .99 -.2| .59| 40.3 35.9| 4R Make a mess of things |

| 7 5879 1143 -.29 .03| .94 -1.4| .93 -1.7| .63| 40.6 36.8| 7 Like order |

| 3 6032 1143 -.41 .03| .97 -.8|1.00 .1| .50| 40.7 36.8| 3 Pay attention to details |

|------------------------------------+----------+----------+-----+-----------+-----------------------------------|

| MEAN 5473.3 1139.9 .00 .03| .99 -.5|1.01 .2| | 38.1 35.8| |

| S.D. 381.5 .3 .28 .00| .17 4.2| .16 3.9| | 4.6 .8| |

+----------------------------------------------------------------------------------------------------------------+

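As we discovered previously, some of the items have poor fit measures.

“Get chores done right away” has the highest measure (.61): it was the hardest item to endorse, so strong agreement came mainly from the most Conscientious respondents.

“Pay attention to details” has the lowest measure (-.41): it was the easiest item to endorse, so even the least Conscientious respondents tended to agree with it.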

Output Tables -> 3.2 Rating (partial credit) scale

The estimates of the 7 response “values” were reasonable . . .

+--------------------------------------------------------------------------

|CATEGORY STRUCTURE | SCORE-TO-MEASURE | 50% CUM.| COHERENCE|ESTIM|

| LABEL MEASURE S.E. | AT CAT. ----ZONE----|PROBABLTY| M->C C->M|DISCR|

|------------------------+---------------------+---------+----------+-----+

| 1 NONE |( -2.91) -INF -2.24| | 100% 1%| | 1 CI: Completely Inaccurate

| 2 -1.52 .08 | -1.58 -2.24 -1.16| -1.90 | 41% 8%| .92| 2 VI: Very Inaccurate

| 3 -.76 .04 | -.81 -1.16 -.47| -1.06 | 28% 21%| .91| 3 PI: Probably Inaccurate

| 4 -1.03 .03 | -.12 -.47 .26| -.59 | 43% 45%| 1.02| 4 S: Sometimes Acc sometimes Inacc

| 5 .57 .02 | .67 .26 1.11| .32 | 31% 56%| 1.02| 5 PA: Probably Accurate

| 6 .80 .02 | 1.65 1.11 2.48| 1.02 | 42% 36%| 1.04| 6 VA: Very Accurate

| 7 1.94 .03 |( 3.24) 2.48 +INF | 2.20 | 75% 17%| 1.00| 7 CA: Completely Accurate

“1” “2” “3” “4” “5” “6” “7”

CI VI PI S PA VA CA

-.----.----.----.----.----.----.----.----.----.----.----.----.----.----.----.----.-

-4 -3.0 -2.0 -1.0 0.0 1.0 2.0 3.0 4.0

Rasch Dimension

Here’s a plot of the Likert response values vs. the Rasch response values.

So, again, there is a little bit of stretching of the response scale at each end.


Comparison of distribution shapes

Clearly the two distribution shapes are different. The Likert scale score distribution is slightly negatively skewed. The Rasch measure distribution is positively skewed.

Which is more appropriate? “truer”? Rasch adherents would say the Rasch distribution is likely closer to the true distribution, though no one has actually seen that distribution.


Correlation of individual Rasch person vs. Likert Person measures.

The “stretching” of the extremes is quite apparent in the graph, in particular the “high conscientiousness” extreme, as would have been expected from the

So, which of the two ways of measuring Conscientiousness is better for prediction of GPA?

Correlations

                                                                  corig    crasch
GPA Ros:EOYGPA; OPQ:EOSGPA; FOR:EOSGPA    Pearson Correlation      .180      .158
                                          Sig. (2-tailed)          .000      .000
                                          N                        1143      1143

The original measure appears to be the winner here.


Note, however, that the distribution of GPAs is negatively skewed.

So the distribution of the Y variable has the same shape as the distribution of corig, but has a very different shape from the distribution of crasch.


Here are the two scatterplots.

Note that the relationship of gpa with crasch is curvilinear.

This may mean that the criterion measure is not representing the actual “performance” of students appropriately.

What would be needed is a Rasch measure of gpa. Hmm.


Revisiting this issue, including a quadratic component in the prediction of GPA

The SPSS syntax to create quadratic polynomial components.

compute crasch2 = crasch**2.
compute corig2 = corig**2.
regression variables = gpa corig corig2 /dep=gpa /enter /save= pred(rawpred).
regression variables = gpa crasch crasch2 /dep=gpa /enter /save= pred(raschpred).
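For readers without SPSS, the same kind of model (a predictor plus its square) can be sketched in plain Python. This is an illustration only, not a replacement for the SPSS run: the function name is hypothetical, the six data points are made up and noise-free, and the fit is obtained by solving the least-squares normal equations directly.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y = b0 + b1*x + b2*x**2; returns [b0, b1, b2]."""
    rows = [[1.0, xi, xi * xi] for xi in x]
    # Normal equations: (X'X) b = X'y, stored as a 3 x 4 augmented matrix
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(3)]
    # Gaussian elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution
    coefs = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        known = sum(aug[i][j] * coefs[j] for j in range(i + 1, 3))
        coefs[i] = (aug[i][3] - known) / aug[i][i]
    return coefs

# Made-up, noise-free data generated from a known curved relationship
x = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [2.9 + 0.23 * xi - 0.045 * xi ** 2 for xi in x]
b0, b1, b2 = fit_quadratic(x, y)
print(round(b0, 3), round(b1, 3), round(b2, 3))
```

A negative coefficient on the squared term, as in this made-up example, corresponds to a relationship that flattens out at high values of the predictor.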

The results . . .

Regression of GPA onto summated C and summated C².

Model Summary(b)

Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .190(a)    .036        .034                 .6456

a. Predictors: (Constant), corig2, corig
b. Dependent Variable: GPA

Coefficients(a)

                 Unstandardized Coefficients    Standardized Coefficients
Model            B         Std. Error           Beta                         t          Sig.
1 (Constant)     1.431     .438                                              3.270      .001
  corig           .516     .183                  .658                        2.814      .005
  corig2         -.039     .019                 -.482                       -2.059      .040

a. Dependent Variable: GPA

Regression of GPA onto Rasch C & Rasch C².

Model Summary(b)

Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .188(a)    .035        .034                 .6458

a. Predictors: (Constant), crasch2, crasch
b. Dependent Variable: GPA

Coefficients(a)

                 Unstandardized Coefficients    Standardized Coefficients
Model            B         Std. Error           Beta                         t          Sig.
1 (Constant)     2.891     .024                                              120.205    .000
  crasch          .230     .038                  .287                        6.126      .000
  crasch2        -.045     .013                 -.165                       -3.520      .000

a. Dependent Variable: GPA

So, when the quadratic nature of the relationship of gpa to C is taken into account, summated C and Rasch C yield virtually identical predictions of gpa.
