Comparison of the NEO-FFI, EPP, FPI-R, TIPI and EPQ-R (English, German, Polish and Spanish versions) Extraversion Scales: A new approach to proofing content validity by comparing the Rasch-scaled variance of person parameters

Dr. J. M. Müller
University of Tübingen, Germany

My thanks to...

The test authors for contributing their data:
Paul Barrett (EPQ-R English)
Willibald Ruch (EPQ-R German)
Piotr Brzozowski (EPQ-R Polish)
Anton Aluja (EPQ-R Spanish)
Peter Borkenau (NEO-FFI)
Jochen Fahrenberg (FPI-R)
Peter Becker (TIPI)
Stephan Bulheller (EPP)
SWETS, SCHUHFRIED, HOGREFE
…and many more…

The students for their help:
Tina Fechter, Magnus Pagendarm, Mildred Girndt, Susana Ruiz, Daniel Kostatinov

Foundation ‚Strukturfond der Universität Tübingen‘, Kap. 1415

ISSID-Graz, 2003; Müller, University of Tuebingen

Three messages…
1. Personality issue: Rasch variances have a psychological meaning
2. Methodological issue: Testing content validity by comparing Rasch variances
3. Practical/technical issues: Factors that influence Rasch variances

Current database of the test meta-analyses project
Number of psychological dimensions: > 90 (8)
Number of IRT analyses: > 559 (140)
Number of persons per test: 1000 < n < 3500
Item response software: BilogMG, Parscale, Winmira
(In parentheses: the NEO-FFI, EPP, FPI-R, TIPI and EPQ-R English, German, Polish and Spanish Extraversion scales)

1. Introduction: A surprising (and for a long time untestable) assumption of personality psychology
Psychometrically based personality psychology assumes that persons vary within several psychological dimensions in a comparable manner.
[Figure: density curves over psychological dimension I and psychological dimension II]

1. The Rasch-scaled person parameter variances differ between tests... (Source: Müller, 2002)
[Figure: Different variances from different tests]

The chess example
[Figure: chess players A, B and C; each pairing is annotated with winning odds of 1:2]

1. Transfer to the Rasch model
Player B becomes task b for player A. The difference between two persons (or players) is now defined by the difference in their probabilities of solving (or winning against) a task b.
Person B: p(B,b) = .50
Person A: p(A,b) = .66
With person parameter \theta_A and task parameter \sigma_b:
$$p(x_{Ab}) = \frac{\exp\left(x_{Ab}(\theta_A - \sigma_b)\right)}{1 + \exp(\theta_A - \sigma_b)}$$

1. Interpreting a Rasch scale unit
[Figure: probability of solving an item plotted against the person parameters for task b with \sigma = 0; persons B and A are marked, together with their difference in solving a reference item]

1. Rasch variances are a standardised measure of variability!
[Figure: probability of solving an item plotted against the person parameters for task b with \sigma = 0 and task a with \sigma = 1; persons B, A and C are marked, with constant differences in probabilities]

2. Methodological issue: The use of Rasch variances
Content validity is the concept of representative item sampling from the universe of a valid item population (Fitzpatrick, 1983; Klauer, 1984).
We expect comparable Rasch variances in all Extraversion scales.

2. Common approaches to testing content validity: Psychometric approaches
Sources of bias: construct, method and item (see van de Vijver & Hambleton, 1996)
1. ... by correlations
2. ... by structural equivalence
3. ... by simple descriptives like
   • variances in Rasch units

2. Differences between the approaches
Testing of relationships is a strategy that leads to weak statistical testing (H1 hypothesis).
Testing of equivalence (deviation) is a strategy that leads to strong statistical testing (H0 hypothesis).

3. Methodological issues: Which factors influence the true Rasch variance?
1. Measurement error (Lord, 1983); number of items
2. IRT model (1PL, 2PL, Partial Credit, Graded, ...)
3. Answering format (dichotomous, rating)
4. Estimation algorithm (WLE, MLE, ...)
5. Link functions (logit, normal ogive)
6. Software (Winmira, Parscale, Bilog)
7. Bottom and ceiling effects
8. …unknown factors…

3. Measurement error increases the observed variances (Lord, 1983)
We assume that...
$$\hat{\theta}_i = \theta_i + \varepsilon_i \qquad (1)$$
Then the empirical variance is...
$$\mathrm{Var}(\hat{\theta}) = \mathrm{Var}(\theta) + \mathrm{Var}(\varepsilon) \qquad (2)$$
We know the proportion...
$$\mathrm{REL}(\hat{\theta}) = \frac{\mathrm{Var}(\theta)}{\mathrm{Var}(\hat{\theta})} \qquad (3)$$
We therefore correct...
$$\mathrm{Var}(\theta) = \mathrm{Var}(\hat{\theta}) \cdot \mathrm{REL}(\hat{\theta}) \qquad (4)$$

3. The reliability varies…
[Figure: distributions of the IRT reliability (mean = 0.72) and of Cronbach's alpha (mean = 0.75), each on an axis from .5 to 1.0; correlation r = 0.72]

3. The reliability varies, also in Extraversion…
[Figure: distributions of the IRT reliability (mean = 0.83) and of Cronbach's alpha (mean = 0.83), each on an axis from .5 to 1.0]

3. The different IRT models do not change a person's position on an Extraversion scale
[Figure panels: EPQ-R German, EPQ-R English, EPQ-R Spanish, EPQ-R Polish, TIPI German, EPP German, FPI-R German, NEO-FFI German]
The overall intercorrelation (Fisher-Z-transformed) of person parameters between the IRT analyses referring to the same test is r = 0.996.

6.7 Bottom and ceiling effects on the standard deviation

6.7 Bottom and ceiling effects on the skewness

6.7 Bottom and ceiling effects on the kurtosis

Results: Comparing the person parameter distributions from different Extraversion measures…
[Figure panels: EPQ-R German, EPQ-R English, EPQ-R Spanish, EPQ-R Polish, TIPI German, EPP German, FPI-R German, NEO-FFI German]

Define a standard...
...if you would like to compare variances in Rasch units!
1. The reference error of measurement is 0 (true variances).
2. The reference IRT model is the Partial Credit Model.
3. The reference link function is the logit!
4. The reference slope parameter is 1!
5. The reference estimation algorithm is the Warm estimator!
6. The reference distribution is normal.

Open questions...
1. The influence of item location and discrimination, caused by different IRT models.
2. We plan to compare different scales from the same person sample.

Item thresholds/locations in NEO-FFI/Extraversion
[Figure: four panels of item threshold plots (Winmira output "Item Parameters in Class 1 with size 1.00000"); x-axis: item 1 to 12, y-axis: threshold; one line each for thresholds 1 to 4]

Item thresholds/locations in NEO-FFI/Extraversion
[Figure: the same four panels, annotated as follows]
• unequal thresholds within item, equal for all items: Rating Scale Model
• unequal thresholds within item, unequal for each item: Partial Credit Model
• equal thresholds within item, equal for all items: no special IRT name
• unequal thresholds within item, unequal for each item: Polytomous Model

Summary of the three messages…
1. Rasch variances have a psychological meaning!
2. The Extraversion scales seem to differ in content!
3. …but the methodology for comparing Rasch variances from different tests is still in progress, because many factors influence the Rasch variances of person parameters.

Thank you for your attention.
More information can be found at: www.psychological-tests.de
Send me your data!

Appendix: Additional information

3. Little influence of changing the answering format from a rating to a dichotomous format for the NEO-FFI (also for TIPI)

UTRVU
                    N    Mean   Std Dev
Rating Scale        31   1.85   2.13
Rating/Dichotomi    12   1.67   0.63

The GLM Procedure; Dependent Variable: UTRVU, F(1,41) = 0.08, p = 0.7765, n.s.
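The F statistic on this slide can be recomputed from the printed group summaries alone. The following is a minimal Python sketch, assuming a one-way ANOVA with two groups (which is what a GLM with a single two-level factor amounts to). The helper names are hypothetical, and the p-value is obtained by numerically integrating the t density, using the identity F(1, df) = t². Because the means and standard deviations on the slide are rounded to two decimals, the result only approximates the reported F(1,41) = 0.08, p = 0.7765.

```python
import math

def one_way_f(groups):
    """F statistic and degrees of freedom from (n, mean, sd) group summaries."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def p_value_f1(f_value, df_within, steps=2000):
    """Upper-tail p for F(1, df_within), via t = sqrt(F) and Simpson's rule."""
    t = math.sqrt(f_value)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df_within) + t_density(t, df_within)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df_within)
    central_half = total * h / 3      # integral of the density from 0 to t
    return 1.0 - 2.0 * central_half   # two-sided tail probability

# (N, mean, standard deviation) as printed on the slide
groups = [(31, 1.85, 2.13), (12, 1.67, 0.63)]
f_value, df1, df2 = one_way_f(groups)
p_value = p_value_f1(f_value, df2)
print(f"F({df1},{df2}) = {f_value:.3f}, p = {p_value:.4f}")
```

The sketch uses only the standard library, so it can be run without a statistics package; with one available, its F distribution routines would replace `p_value_f1`.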
TRV should be independent of the length of a scale
We reduced the scales from 17 to 8.5 items per scale (analysis with Winmira; 47 scales; 139 analyses in total).

Frequency of analyses by ItemReduk (rows) and Testname (columns)
ItemReduk    EP   FP   NE   QD   TI   Total
ALL          16   12    5    4   10      47
Q-Sort-R     16   12    5    4    9      46
Diff-Red     16   12    5    4    9      46
Total        48   36   15   12   28     139

ItemReduk                            ERV mean (Duncan)   TRV mean (Duncan)   Cronbach
ALL                                  1.45 (A)            1.13 (A)            0.77
Q-Index-Reduction (bad items out)    2.07 (B)            1.52 (B)            0.73
Difficulty-Reduction                 1.61 (A)            1.04 (A)            0.64

ITRV, IRTMEAN Reli: 1.01  1.12  0.76  0.72  0.56  0.51

...Cronbach's alpha and the IRT reliability estimates behave differently...

Heavy deviations from the normal distribution: Kurtosis
Analysis Variable: NEO-FFI, Scale: Agreeableness
N      Mean   Std Dev   Skewness   Kurtosis
2078   0.50   0.66      0.11       3.13
[Figure: Winmira plot "Person Parameters in Class 1 with size 1.00000"; WLE and MLE person parameters (axis from -4 to 4) and frequency (axis from 0 to 240) by raw score 0 to 24]

The unreliable kurtosis as a (bad) indicator of non-normality correction
[Figure: histogram of person parameters from -4.0 to 4.0; Std. Dev = 1.26, Mean = 0.2, N = 1006]

Descriptive Statistics (Valid N listwise = 1006)
N      Minimum   Maximum   Mean      Std. Deviation   Skewness (Std. Error)   Kurtosis (Std. Error)
1006   -3.5843   3.7025    0.223009  1.2570870        0.117 (0.077)           0.374 (0.154)

1.7% of all estimated reliabilities are negative!!
Obs   testcode                              Reliability by IRT software   Cronbach
1     PNE2112EX012OrUMa5KDBi1PL##ML####mR   -1.1945                       0.80155
2     PNE2112EX012OrUMa5KDBi1PL##ML####oR   -1.3924                       0.80155
3     PNE2112EX012OrUMa5KDBi1PN##ML####mR   -1.1947                       0.80155
4     PNE2112EX012OrUMa5KDBi1PN##ML####oR   -1.3926                       0.80155
5     PNE2112EX012OrUMa5KDBi2PL##ML####mR   -1.2574                       0.80155
6     PNE2112EX012OrUMa5KDBi2PL##ML####oR   -2.4275                       0.80155
7     PNE2112EX012OrUMa5KDBi2PN##ML####mR   -1.2220                       0.80155
8     PNE2112EX012OrUMa5KDBi2PN##ML####oR   -2.3639                       0.80155
9     OEP1500SM007SPUMq2KDWI1PL10WM######   -0.1940                       0.77434

Item number and reliability
Simple Statistics
             N     Mean   Minimum   Maximum
IRT-Reli     278   0.74   0.40      0.98
Cronbach     277   0.76   0.28      0.94
ItemAnzahl   278   17.3   3         61

Spearman correlation with the number of items (ItemAnzahl): IRT reliability 0.65, Cronbach 0.59

Misleading graphics in Winmira
[Figure: Winmira plot "Person Parameters in Class 1 with size 1.00000"; WLE and MLE person parameters (axis from -4 to 4) and frequency (axis from 0 to 130) by raw score 0 to 12]

Comparing estimation methods: WML vs ML
T-Tests
Variable   Method          Variances   DF    t Value   Pr > |t|
UERVU      Pooled          Equal       231   -0.82     0.4132
UERVU      Satterthwaite   Unequal     230   -0.83     0.4089

Equality of Variances
Variable   Method     Num DF   Den DF   F Value   Pr > F
UERVU      Folded F   126      105      1.23      0.2782

…no systematic influence of the estimation method
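The two degrees-of-freedom values in the t-test table can be reproduced from the other printed numbers. The following is a minimal Python sketch under assumptions that the slide does not state. The first is that the folded F test puts the larger sample variance in the numerator, so the group sizes are 127 and 106 (numerator and denominator DF plus one). The second is that the variance ratio equals the rounded F value of 1.23. The function names are hypothetical.

```python
def pooled_df(n1: int, n2: int) -> int:
    """Degrees of freedom of the equal-variance (pooled) t-test."""
    return n1 + n2 - 2

def satterthwaite_df(n1: int, n2: int, variance_ratio: float) -> float:
    """Welch-Satterthwaite degrees of freedom.

    Only the ratio var1 / var2 matters, so var2 is scaled to 1.
    """
    a = variance_ratio / n1  # var1 / n1
    b = 1.0 / n2             # var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Group sizes inferred from the folded F test (assumption, see above)
n_larger_variance = 126 + 1
n_smaller_variance = 105 + 1

df_pooled = pooled_df(n_larger_variance, n_smaller_variance)
df_welch = satterthwaite_df(n_larger_variance, n_smaller_variance, 1.23)
print(f"pooled DF = {df_pooled}, Satterthwaite DF = {df_welch:.1f}")
```

With nearly equal group sizes and a variance ratio close to 1, the corrected degrees of freedom stay close to the pooled value. This is consistent with the two t-tests on the slide giving almost the same result.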
Strategy of analysing data
Test identification: Dateiart, Testname, VPAnzahl, DimName, ItemAnzahl, Modifiziert, Datstruktur
Method factors: ItemReduk, Antwortformat, Dichotomi, Software, IRTModell, Linkfunction, Opt_Skal, Estim_Meth, Par_Calib, Par_Free, Par_Slope
Data non-normalities: ERV, TRV, KTRV, Skewness, Cronbach, Kurtosis
Programs: Winmira-einlesen.sas, Parscale-einlesen.sas, Bilog-einlesen.sas, Bigsteps-einlesen.sas, CRONBACH.sas, DATEINAMEN_EINLESEN.sas, habil.pp-schiefe

6.7 Influences of bottom and ceiling effects on the TRV
Bottom effect: look for skewness!
Bottom and ceiling effect: look for kurtosis!
You find a skewness of magnitude 1.
You see that this is caused by a cutting point at 0.5.
A cutting point at 0.5 leads to a decreased variability of 70%.

A link to a previous presentation at the European Congress of Personality in Jena, 2002 (a summary of the interpretation of Rasch variances)

Eysenck's Extraversion
[Table: constructs of other theorists aligned with the five dimensions Extraversion/Surgency, Agreeableness, Conscientiousness, Emotional Stability and Intellect/Openness. Rows: Adler, Bakan, Bales, Bartholomew, Block, Buss and Plomin, Cattell, Comrey, Costa and McCrae, Digman, Erikson, Eysenck, Fiske, Freud, Goldberg, Gough, Guilford, Hogan, Horney. Highlighted row: Eysenck, with Extraversion, Psychoticism (r) and Neuroticism (r).]
Source: http://www.personalityresearch.org/bigfive/eysenck.html

Eysenck's Extraversion (continued)
[Table, continued, with the same five columns. Rows: Jackson, Leary, Maslow, McAdams, Myers-Briggs, Peabody, Rank, Rogers, Skinner, Tellegen, Watson, Wiggins, Zuckerman.]
Source: http://www.personalityresearch.org/bigfive/eysenck.html

Comparing Models

Comparing Software
(Software: Bi = BilogMG, PS = Parscale, Wi = Winmira)
Testname   Software   TRV_mean   TRV_Std
QD         Bi         1.53       0.45
QD         PS         0.93       0.01
QD         Wi         2.96       0.76
QE         Bi         1.66       0.52
QE         PS         0.90       0.01
QE         Wi         2.48       .
QP         Bi         1.65       0.74
QP         PS         0.46       0.02
QP         Wi         1.16       .
QS         Bi         1.75       0.73
QS         PS         0.51       0.04
QS         Wi         2.97       .
EP         Bi         0.99       0.43
EP         PS         1.15       0.05
EP         Wi         1.30       0.44
FP         Bi         1.86       0.69
FP         PS         0.84       0.06
FP         Wi         3.20       0.60
NE         Bi         1.79       0.60
NE         PS         1.28       0.41
NE         Wi         3.51       3.83
TI         Bi         1.20       0.40
TI         PS         1.15       0.67
TI         Wi         3.21       2.70

TRV from Parscale
Testname   Software   TRV_mean   TRV_Std
QD         PS         0.93       0.01
QE         PS         0.90       0.01
QP         PS         0.46       0.02
QS         PS         0.51       0.04
EP         PS         1.15       0.05
FP         PS         0.84       0.06
NE         PS         1.28       0.41
TI         PS         1.15       0.67

TRV from Winmira
Testname   Software   TRV_mean   TRV_Std
QD         Wi         2.96       0.76
QE         Wi         2.48       .
QP         Wi         1.16       .
QS         Wi         2.97       .
EP         Wi         1.30       0.44
FP         Wi         3.20       0.60
NE         Wi         3.51       3.83
TI         Wi         3.21       2.70

TRV from BilogMG
Testname   Software   TRV_mean   TRV_Std
QD         Bi         1.53       0.45
QE         Bi         1.66       0.52
QP         Bi         1.65       0.74
QS         Bi         1.75       0.73
EP         Bi         0.99       0.43
FP         Bi         1.86       0.69
NE         Bi         1.79       0.60
TI         Bi         1.20       0.40
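How a TRV value could be obtained from an observed variance, following equation (4) of the measurement-error slide, is sketched below in Python. The sketch assumes that TRV stands for the true (error-corrected) Rasch variance, that ERV stands for the empirical variance of the estimated person parameters, and that Cronbach's alpha serves as the reliability estimate. These readings are assumptions rather than statements from the slides, and the function name is hypothetical. The inputs are the ERV and Cronbach means from the scale-length slide in the appendix. A product of means is not the mean of the products, so only rough agreement with the reported TRV means can be expected.

```python
def true_variance(observed_variance: float, reliability: float) -> float:
    """Equation (4): Var(theta) = Var(theta_hat) * REL(theta_hat)."""
    if not 0.0 <= reliability <= 1.0:
        # Negative or >1 reliability estimates cannot be used for the correction
        raise ValueError("reliability must lie in [0, 1]")
    return observed_variance * reliability

# (ERV mean, Cronbach mean, reported TRV mean) per item-reduction condition
rows = {
    "ALL": (1.45, 0.77, 1.13),
    "Q-Index-Reduction": (2.07, 0.73, 1.52),
    "Difficulty-Reduction": (1.61, 0.64, 1.04),
}

corrected = {
    name: true_variance(erv, reliability)
    for name, (erv, reliability, _) in rows.items()
}

for name, (_, _, reported) in rows.items():
    print(f"{name}: corrected = {corrected[name]:.2f}, reported TRV = {reported:.2f}")
```

The function rejects reliabilities outside [0, 1] instead of correcting with them. That is one possible way to handle the negative reliability estimates that the appendix reports for 1.7% of the analyses.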