European Journal of Psychological Assessment 2008; Vol. 24(3):190–197
DOI 10.1027/1015-5759.24.3.190
© 2008 Hogrefe & Huber Publishers

Dimensionality and Norms of the Rosenberg Self-esteem Scale in a German General Population Sample

Marcus Roth1, Oliver Decker2, Philipp Yorck Herzberg2, and Elmar Brähler2

1 Department of Psychology II, University of Leipzig, Germany
2 Department of Medical Psychology and Medical Sociology, University of Leipzig, Germany

Abstract. This study analyzed the dimensionality of the German version of Rosenberg’s Self-Esteem scale (RSES) in a nationally representative population sample of 4,988 subjects (46.4% males; aged 14–92 years). Using confirmatory factor analysis, one- and two-dimensional models were tested. Results suggest that the RSES is a two-dimensional scale comprising the highly correlated components positive and negative self-evaluation, which constitute a unitary construct of global self-esteem at the second-order level. In order to obtain a more conclusive solution, an item response theory (IRT) analysis (partial credit model) was conducted. Results lend support to a one-dimensional view of the RSES. Furthermore, psychometric properties and norm values based on the representative sample are reported. Analyses revealed extremely high response probabilities for all items, as a consequence of which self-esteem cannot be differentiated at the upper end of the range.

Keywords: dimensionality, self-esteem, Rosenberg scale, psychometrics

Introduction

Despite the wide range of self-esteem measures proposed by researchers over the years, the Rosenberg Self-Esteem scale (RSES; Rosenberg, 1965) remains by far the most widely used (e.g., Blascovich & Tomaka, 1991; Gray-Little, Williams, & Hancock, 1997). In a systematic review, Gray-Little et al. (1997) concluded that the RSES is a reliable and valid measure of global self-worth and “deserves its widespread use and continued popularity” (p. 450).

The RSES comprises five positively and five negatively worded items. The scale was conceptualized as a single-factor scale with scores ranging along a continuum from low self-esteem to high self-esteem. According to Rosenberg (1979), the individual with a high level of self-esteem can be characterized as follows: “he has self-respect, considers himself a person of worth. Appreciating his own merits, he nonetheless recognizes his faults ... The term ‘low self-esteem’ ... means that the individual lacks respect for himself, considers himself unworthy, inadequate, or otherwise seriously deficient as a person” (p. 54).

While psychometric studies have generally supported the one-dimensionality of the scale (e.g., Fleming & Courtney, 1984; Mimura & Griffiths, 2007), a considerable number of studies applying exploratory or confirmatory factor analysis have provided evidence to suggest that the scale reflects a two-dimensional construct. As a result, two different two-facet models have been proposed. The first differentiates between a positive and a negative image of the self (e.g., Goldsmith, 1986; Pullmann & Allik, 2000). Studies supporting this model have found that the five positively worded items (e.g., “I feel that I have a number of good qualities”) load onto one factor, referred to variously as positive self-esteem, positive self-worth, and positive self-image, while the five negatively worded items (e.g., “At times I feel that I am no good at all”) load onto a separate factor, referred to as negative self-esteem, self-derogation, self-deprecation, or negative self-image. Some researchers have argued that the two facets reflect different forms of self-image that are influenced by different experiences (e.g., Owens, 1994; Sheasby, Barlow, Cullen, & Wright, 2000).
Table 1. Demographic characteristics of the representative population sample

                                 Total        Male         Female
                                 (N = 4988)   (N = 2312)   (N = 2676)
Age
  M                              48.32        48.03        48.56
  SD                             17.96        18.02        17.92
Age groups
  14–24 years                    11.1%        12.8%         9.6%
  25–34 years                    13.7%        12.8%        14.4%
  35–44 years                    19.2%        17.1%        21.0%
  45–54 years                    17.2%        17.4%        16.9%
  55–64 years                    16.0%        17.1%        15.1%
  65–74 years                    15.4%        16.8%        14.2%
  ≥ 75 years                      7.4%         6.0%         8.7%
Urbanicity
  Rural area                     13.3%        13.2%        13.3%
  Urban area                     86.7%        86.8%        86.7%
Education (a)
  No qualifications               1.1%         0.8%         1.3%
  High school                    44.2%        45.2%        43.3%
  O-levels/technical college     39.7%        37.1%        34.0%
  A-levels                        7.7%         7.7%         7.6%
  University/polytechnic          6.5%         8.0%         5.2%
  Currently still at school       3.5%         4.2%         2.8%
Net household income
  < 750 €/month                   4.1%         4.3%         3.9%
  750 to 1249 €/month            18.3%        14.8%        21.4%
  1250 to 1999 €/month           36.1%        36.8%        35.5%
  ≥ 2000 €/month                 41.5%        44.1%        39.2%

Note. (a) O-level refers to the German “Realschulabschluss” (10 years); A-level refers to the German “Abitur” (12 or 13 years).

Other researchers have, however, suggested that the two-dimensionality of the RSES should be interpreted as a methodological artefact of item-wording (e.g., Greenberger, Chen, Dmitrieva, & Farraggia, 2003; Tomas & Oliver, 1999). In line with this suggestion, positive and negative self-esteem factors might result from respondents’ tendency to react asymmetrically to positively and negatively keyed items. Marsh (1996) has argued that this artefact might primarily be due to verbal ability, since responding to negatively worded items (in particular negating them) implies a higher level of cognitive complexity than responding to positive items. In support of this view, Marsh (1996) demonstrated that students with lower verbal ability were especially susceptible to providing responses to negative items that were inconsistent with their responses to positively worded items.
Similar results were obtained by Corwyn (2000) and Dunbar, Ford, Hunt, and Der (2000). In their analysis of verbal competence and self-esteem in German students, however, Collani and Herzberg (2003) failed to find support for this theory.

In addition to the positive-negative image model, a second two-factor model of self-esteem as measured by the RSES has been proposed. Tafarodi and colleagues (Tafarodi & Swann, 1995, 2001; Tafarodi & Milne, 2002) differentiate between the facets self-competence and self-liking. While self-competence is defined by the authors as “the valuative experience of oneself as a causal agent, an intentional being with efficacy and power” (Tafarodi & Swann, 2001, p. 654), self-liking is seen to refer to “the valuative experience of oneself as a social object, a good or bad person according to internalized criteria of worth” (p. 655). According to this definition, self-competence is closely related to Bandura’s (1977) concept of general self-efficacy, whereas self-liking corresponds to the actual definition of global self-esteem. The facet self-competence has been shown to comprise Items 3, 4, 5, 7, and 9 of the RSES, and self-liking Items 1, 2, 6, 8, and 10.

Despite the controversy surrounding the number of factors that underlie Rosenberg’s scale, previous studies have predominantly attempted to address the issue using nonrepresentative subsamples. Furthermore, with the exception of the study by Collani and Herzberg (2003), the factorial structure of the German version of the Rosenberg scale has thus far hardly been subject to investigation. The central aim of the present study was therefore to investigate the dimensionality of the German version of the Rosenberg Self-Esteem scale using a representative population sample. In this way, biased results due to selected populations were avoided.
A second aim of the study was to provide data on the psychometric properties and norm values on the basis of the representative sample. This appears necessary given the wide gap between research and practice in applying the Rosenberg scale: Whereas the German version of the RSES is one of the most widely used instruments in several areas of psychological research, it is almost never applied in the practical context of individual diagnostics (see Roth & Herzberg, 2008). It is possible that this lack of application is in part due to missing norms, as a consequence of which individual RSES scores cannot be interpreted.

Figure 1. Measurement models for global self-esteem. RSES = Rosenberg Self-Esteem Scale; SCO = self-competence; SLI = self-liking; POS = positive self-image; NEG = negative self-image.

Method

Subjects and Procedure

This study formed part of a nationally representative face-to-face household survey conducted in Germany. Study participants were surveyed using a structured self-report questionnaire. Participants also completed a self-report questionnaire that included Rosenberg’s Self-Esteem scale (1965). The survey was carried out by professional interviewers from a demographic consultation company (USUMA, Berlin). Within each wave, a representative sample of the German population aged 14 years or older was approached using 258 sample points. Addresses were selected according to the random route procedure. Of the 8,398 addresses selected, 8,106 proved valid. A total of 5,036 persons agreed to participate, provided verbal informed consent, and completed the study questionnaires. The response rate among those individuals who were asked to participate by the interviewers was 72.9%.
In the following analysis, subjects with missing data for the variables measuring self-esteem were excluded (n = 48). Thus, the final sample consists of 4,988 subjects aged 14 to 92 years (Table 1). The characteristics of the sample given in Table 1 closely match those of the total German population (Statistisches Bundesamt, 2005).

Measures

The German adaptation of Rosenberg’s Self-Esteem scale (RSES) by Ferring and Filipp (1996) was administered. The RSES, as described above, is composed of five positively and five negatively worded items. Subjects are asked to indicate the extent to which the items describe them, using a 6-point Likert scale from “1” = strongly disagree to “6” = strongly agree.

Results

Confirmatory Factor Analysis (CFA)

Table 2. Summary of the fit indices for the estimated models of the German adaptation of Rosenberg’s Self-Esteem Scale (value before the slash: random sample 1, n = 2499; value after the slash: random sample 2, n = 2489)

Model (a)  χ²               χ²S–B           df  GFI      NFI      TLI      CFI      RMSEA    RMSEA (90% CI)
1          1540.28/1531.70  1007.74/944.97  35  .86/.86  .86/.85  .82/.81  .86/.85  .13/.13  .13–.14/.13–.14
2          1529.08/1527.67  820.59/869.68   34  .86/.86  .86/.85  .82/.80  .86/.85  .13/.14  .13–.14/.13–.15
3          464.06/460.21    296.19/287.60   34  .96/.96  .96/.95  .95/.94  .96/.96  .07/.07  .07–.08/.07–.08

Note. (a) See Figure 1. GFI = goodness-of-fit index; NFI = normed fit index; TLI = Tucker-Lewis index (nonnormed fit index); CFI = comparative fit index; RMSEA = root mean square error of approximation; 90% CI = limits of the 90% confidence interval for RMSEA; χ²S–B = Satorra-Bentler chi-square. In all cases, p < .001.

Table 3. Summary of the fit indices for the estimated models of Rosenberg’s Self-Esteem Scale in 25- to 44-year-old Germans with the lowest educational degree (low; n = 471) and the highest educational degree (high; n = 326)

Model (a)  Education  χ²      df  GFI  NFI  TLI  CFI  RMSEA  RMSEA (90% CI)
1          Low        385.21  35  .81  .81  .77  .82  .15    .13–.16
1          High       203.53  35  .87  .85  .83  .87  .12    .11–.14
2          Low        383.88  34  .81  .81  .77  .83  .15    .14–.16
2          High       303.21  34  .87  .84  .82  .86  .12    .11–.14
3          Low        101.77  34  .95  .97  .96  .97  .06    .05–.08
3          High       102.08  34  .95  .93  .93  .95  .08    .06–.10

Note. (a) See Figure 1. GFI = goodness-of-fit index; NFI = normed fit index; TLI = Tucker-Lewis index (nonnormed fit index); CFI = comparative fit index; RMSEA = root mean square error of approximation; 90% CI = limits of the 90% confidence interval for RMSEA. In all cases, p < .001.

The three models of global self-esteem (see Figure 1) described in the introduction were tested via confirmatory factor analysis using AMOS 6.0 (Arbuckle, 2005). Model 1 represents the one-dimensional conception of global self-esteem (e.g., Rosenberg, 1965, 1979), with all 10 items defined as indicators of a single factor. Model 2 depicts the two-factor model proposed by Tafarodi and Milne (2002), with two latent variables (self-competence and self-liking) and five measured variables loading onto each. Model 3 represents the claim that the two poles of self-esteem represent substantively distinct dimensions (e.g., Pullmann & Allik, 2000), with the five positively worded items defined as indicators of a positive self-esteem facet (positive self-image) and the five negatively worded items as indicators of a negative self-esteem facet (negative self-image). Error covariances were constrained to zero in all models in order to avoid opportunistic fitting. To cross-validate the goodness-of-fit results, the sample was randomly split into two halves, and analyses were conducted separately for each half using the maximum likelihood (ML) method of estimation. As shown in Table 2, Model 1 and Model 2 both demonstrated unsatisfactory fit indices (Tucker-Lewis Index < .90 and root mean square error of approximation > .08).
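The RMSEA cutoff mentioned above can be checked against the χ² values in Table 2. The following is a minimal sketch, assuming the common point-estimate formula RMSEA = sqrt(max(χ² − df, 0) / (df (N − 1))); the function name `rmsea` is illustrative, and the software used in the study may differ in details such as rounding.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from a model chi-square, its df, and the sample size n.

    Assumes the common formula sqrt(max(chi2 - df, 0) / (df * (n - 1))).
    """
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Chi-square and df taken from Table 2, random sample 1 (n = 2499).
fits = {"Model 1": (1540.28, 35), "Model 3": (464.06, 34)}

for name, (chi2, df) in fits.items():
    value = rmsea(chi2, df, 2499)
    verdict = "above" if value > 0.08 else "at or below"
    print(f"{name}: RMSEA = {value:.3f} ({verdict} the .08 cutoff)")
```

Under this assumption, the rounded values for Models 1 and 3 agree with the RMSEA column of Table 2 for random sample 1.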
In contrast, the fit indices associated with Model 3 indicated a reasonable model fit (correlation between POS and NEG: r = –.49 and r = –.50 in the two random samples, respectively). This was supported by an evaluation of the differences in fit between the models: A χ² difference test comparing Model 1 and Model 3 revealed that Model 3 was superior (Δχ² = 1076.22/1071.49, df = 1, p < .001). A χ² difference test comparing Model 2 and Model 3 was not admissible given that these models were not nested. Because the multivariate normality assumption of the ML estimation method was not met by the data, all analyses were additionally checked by applying a Satorra-Bentler χ² correction for nonnormality using EQS 5.7. The conclusions drawn from this method did not deviate from those obtained using ML estimation (see Table 2).

Finally, we explored the contribution of verbal ability method effects as reported by Marsh (1996). Here, education level was used as a proxy for the verbal ability of the participants. While an optimal analysis would include all six educational levels presented in Table 1, we omitted individuals with no qualification due to the small sample size. To save space, and given that results based on an analysis of all five groups did not significantly deviate, we report results for the two extreme education groups only. These two extreme groups were formed as follows: The first comprised individuals with the lowest possible level of school education (8–9 years of completed school education), and the second those with the highest level of school education obtainable in Germany (“A”-level). Since age and education level are confounded (with lower education levels occurring in higher age groups), we only included subjects between the ages of 25 and 44 years in our analyses.
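The χ² difference test comparing Model 1 and Model 3 can be retraced from the ML χ² values in Table 2. The sketch below uses only the Python standard library; the helper names are illustrative, and the tail probability relies on the identity P(χ²₁ > x) = erfc(sqrt(x / 2)), which holds for one degree of freedom only.

```python
import math


def chi2_difference(chi2_restricted, df_restricted, chi2_general, df_general):
    """Return (delta chi-square, delta df) for two nested models."""
    return chi2_restricted - chi2_general, df_restricted - df_general


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variate with exactly 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


# (chi-square, df) from Table 2: Model 1 (one factor) vs. Model 3 (POS/NEG).
samples = {
    "random sample 1": ((1540.28, 35), (464.06, 34)),
    "random sample 2": ((1531.70, 35), (460.21, 34)),
}

for label, ((chi2_m1, df_m1), (chi2_m3, df_m3)) in samples.items():
    delta, ddf = chi2_difference(chi2_m1, df_m1, chi2_m3, df_m3)
    assert ddf == 1  # chi2_sf_df1 is only valid for one degree of freedom
    print(f"{label}: delta chi2 = {delta:.2f}, df = {ddf}, "
          f"p < .001: {chi2_sf_df1(delta) < 0.001}")
```

The differences obtained this way (1076.22 and 1071.49) are the values reported in the text.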
As shown in Table 3, the two-factor model distinguishing between a positive and a negative facet of self-esteem (Model 3) was once again the only model to provide an adequate fit in both subsamples. The correlation between the latent variables POS and NEG was r = –.50 for participants with the lowest level of education and r = –.54 for subjects with the highest level of education.

Item Response Theory Analysis (IRT)

While factor analyses, whether exploratory or confirmatory in nature, represent the most common technique for evaluating the dimensionality of questionnaires, these approaches do not constitute an exhaustive analysis at the item level. An IRT analysis can provide useful information about a questionnaire, in particular when the results of preceding factor analyses are ambiguous (e.g., Herzberg, 2002). More specifically, the misfit of the one-dimensional model for the RSES may result from the fact that five items are keyed in the positive direction and five in the negative direction. This results in dissimilar item distributions, which in turn can lead to spurious factors based on a similarity of the item distributions rather than of the content. Since CFA approaches make normal curve assumptions and the RSES is affected by a bimodal item distribution, it is prudent to use full information techniques that can handle bimodal item distributions. IRT overcomes this limitation and allows the one-dimensionality of questionnaires to be tested free from item bias (van der Linden & Hambleton, 1997). The 10 items were analyzed using the Partial Credit Model (Masters, 1982) as implemented in the computer program LPCM-Win 1.0 (Fischer & Ponocny-Seliger, 1998).
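To make the Partial Credit Model concrete, the sketch below computes category probabilities for a single item in the usual parameterization, where the probability of category k is proportional to exp of the sum over j ≤ k of (θ − δ_j). It is an illustration only, not the estimation routine of the program used in the study; the step difficulties in `example_thresholds` are hypothetical and are not estimates from the RSES data.

```python
import math


def pcm_category_probs(theta, thresholds):
    """Category probabilities under the Partial Credit Model.

    theta:      person parameter on the latent trait
    thresholds: step difficulties delta_1..delta_m of one item
    Returns m + 1 probabilities for categories 0..m.
    """
    cumulative = [0.0]  # exponent for category 0 is defined as 0
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def expected_category(theta, thresholds):
    """Expected category (0..m) for a person with trait level theta."""
    probs = pcm_category_probs(theta, thresholds)
    return sum(k * p for k, p in enumerate(probs))


# Hypothetical step difficulties for one 6-category item (5 steps).
example_thresholds = [-2.5, -1.8, -1.0, -0.2, 0.9]

for theta in (-1.0, 0.0, 2.0):
    probs = pcm_category_probs(theta, example_thresholds)
    print(theta, [round(p, 3) for p in probs])
```

Because the step difficulties are free to differ within and between items, categories are ordered but need not be equally spaced; constraining the spacing to be equal across items would correspond to a rating scale model.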
This model assumes that the items measure a one-dimensional latent trait and that the response categories are ordered, while item difficulties are allowed to differ between items. We used sex as an external criterion to split the sample for testing statistical fit and thus examined whether parameter estimates for RSES items were invariant across gender groups. LPCM-Win provides several measures of statistical fit. First, an inspection of the graphical goodness-of-fit test (Figure 2) revealed that none of the items conspicuously deviated from the 45° line. Second, the global model test was nonsignificant (χ² = 61.09, df = 49, p > .05), indicating homogeneity of the 10 items across males and females. Third, the Pearson correlation between the parameter estimates of the two groups was .97, also indicating homogeneity of the 10 items across gender groups. Finally, beyond the overall model fit, the one-dimensionality of the items is supported by z statistics, which pertain to the single item. This z value is an approximate, asymptotically normal statistic for testing differential item functioning (DIF) of a single item, based on a partitioning of the total sample into two disjoint subsamples. With the exception of two significant z statistics (z > 2.0) for Item 9 (categories 3 and 5, respectively), the remaining z statistics ranged between –1.91 and 1.87 and were, therefore, not significant at the 5% level. The Rating-Scale Model did not fit the data, indicating that categories were ordered but not equidistant.

Figure 2. Graphical goodness-of-fit test based on the sample split by sex. Note: Each data point represents an item category, while the category itself is not depicted. Of importance is the deviation of the points from the 45° line.

Age and Sex Differences

Age and sex differences in the RSES scale as well as in the subscales (representing the two components positive and negative self-evaluation) were tested using multivariate analyses of variance with sex and age (graded sevenfold in correspondence with the ranges depicted in Table 1) as independent variables. Due to the large sample size used in the present study, we also calculated the effect size η² for the purpose of determining the magnitude of differences between means. In order to avoid overinterpreting significant results, it seems reasonable to rely primarily on this value in large samples. Multivariate tests revealed a main effect of age (F = 6.52, p < .001, η² = .01), but no main effect of sex (F = 0.52, p = .597, η² = .00), and no sex by age interaction effect (F = 1.12, p = .111, η² = .00). Univariate tests indicated main effects of age on the total RSES scale (F = 3.38, p = .002, η² = .01) and on the positive SE-component subscale (F = 8.172, p < .001, η² = .01), but not on the negative SE-component subscale (F = 1.63, p = .135, η² = .00). Duncan’s multiple post hoc analyses of the means indicated that the lowest RSES scores emerged in the age groups 55–64 years and ≥ 75 years, and the highest scores in the age groups 25–34 years and 35–44 years. With respect to the positive SE-component, the lowest scores were again found in the age groups 55–64 years and ≥ 75 years and the highest scores in the 25–34 years age group. Age differences were, however, found to be very small (accounting for a maximum of 1% of the variance in self-esteem).

Psychometric Properties

Item and scale characteristics were evaluated on the basis of the total sample (N = 4988). Based on the results concerning the internal structure of the scale, we evaluated the total score of the RSES as well as scores for the subscales representing the positive and negative evaluation components of self-esteem.

Table 4. Psychometric properties of the German adaptation of Rosenberg’s Self-Esteem Scale (RSES; N = 4,988)

                                              Frequency of scores (%)
Scale/item  M     SD    α    ris  rit  p      1    2    3     4     5     6
RSES-pos    4.82  0.82  .81  –    –    –      –    –    –     –     –     –
Item01      4.75  1.11  –    .56  .60  .79    1.3  3.5  8.0   20.2  40.6  26.5
Item03      4.87  0.95  –    .64  .56  .81    0.4  1.7  5.2   22.8  42.6  27.3
Item04      4.86  1.06  –    .56  .49  .81    0.6  2.2  7.3   23.0  34.5  32.5
Item07      4.89  1.18  –    .61  .57  .82    1.5  2.7  8.4   19.1  29.1  39.2
Item10      4.72  1.15  –    .60  .62  .79    1.3  3.2  9.1   23.2  34.2  28.9
RSES-neg    5.02  1.02  .86  –    –    –      –    –    –     –     –     –
Item02*     5.08  1.26  –    .70  .67  .85    1.2  4.8  8.0   11.1  20.6  54.2
Item05*     4.85  1.32  –    .64  .63  .81    1.7  5.3  11.1  13.0  25.8  43.1
Item06*     4.96  1.34  –    .71  .67  .83    2.0  5.2  10.0  11.4  20.9  50.7
Item08*     4.84  1.32  –    .64  .61  .81    1.4  5.7  11.4  14.4  23.3  43.8
Item09*     5.35  1.09  –    .72  .69  .89    0.6  2.2  6.7   8.3   16.0  66.2
RSES-tot    4.92  0.82  .88  –    –    –      –    –    –     –     –     –

Note. M = mean; SD = standard deviation; RSES-pos = positive subscale of the RSES; RSES-neg = negative subscale of the RSES (items reversed and pooled); RSES-tot = total score of the RSES; α = Cronbach’s α; ris = part-whole corrected item-total correlation (related to the subscale); rit = part-whole corrected item-total correlation (related to the total score); p = response probability. *Negative items were recoded prior to analyses.

Table 5. Normative data from the German general population (N = 4988) for the Rosenberg Self-Esteem Scale (sum score)

Sum   T      Sum   T      Sum   T      Sum   T
<20   <14    30    27     41    40     52    53
20    15     31    28     42    41     53    55
21    16     32    29     43    42     54    56
22    17     33    30     44    44     55    57
23    18     34    32     45    45     56    58
24    19     35    33     46    46     57    60
25    21     36    34     47    47     58    61
26    22     37    35     48    49     59    62
27    23     38    36     49    50     60    63
28    24     39    38     50    51
29    25     40    39     51    52

Note. Sum = raw score; T = T-score (mean = 50; standard deviation = 10).
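The norms in Table 5 lend themselves to a simple lookup. The following sketch scores one questionnaire and converts the sum score to a T-score; the T values are transcribed from Table 5 and the negatively worded items are those marked with an asterisk in Table 4. The function names are illustrative, and recoding a response x on the 6-point scale as 7 − x is an assumption about how the recoding was done.

```python
from typing import Dict, Optional

# Negatively worded items (marked * in Table 4); recoded as 7 - x (assumed).
NEGATIVE_ITEMS = {2, 5, 6, 8, 9}

# Sum score -> T-score, transcribed from Table 5.
T_NORMS: Dict[int, int] = {
    20: 15, 21: 16, 22: 17, 23: 18, 24: 19, 25: 21, 26: 22, 27: 23,
    28: 24, 29: 25, 30: 27, 31: 28, 32: 29, 33: 30, 34: 32, 35: 33,
    36: 34, 37: 35, 38: 36, 39: 38, 40: 39, 41: 40, 42: 41, 43: 42,
    44: 44, 45: 45, 46: 46, 47: 47, 48: 49, 49: 50, 50: 51, 51: 52,
    52: 53, 53: 55, 54: 56, 55: 57, 56: 58, 57: 60, 58: 61, 59: 62,
    60: 63,
}


def rses_sum_score(responses: Dict[int, int]) -> int:
    """Sum score (10..60) from raw responses keyed by item number 1..10."""
    if set(responses) != set(range(1, 11)):
        raise ValueError("responses must cover items 1 to 10")
    total = 0
    for item, value in responses.items():
        if not 1 <= value <= 6:
            raise ValueError(f"item {item}: response must be between 1 and 6")
        total += 7 - value if item in NEGATIVE_ITEMS else value
    return total


def t_score(sum_score: int) -> Optional[int]:
    """T-score from Table 5; None stands for the open category 'T < 14'."""
    if not 10 <= sum_score <= 60:
        raise ValueError("sum score must be between 10 and 60")
    return T_NORMS.get(sum_score)  # sums below 20 are not tabulated


# Hypothetical respondent: 5 on every positive item, 2 on every negative item.
raw = {i: (2 if i in NEGATIVE_ITEMS else 5) for i in range(1, 11)}
print(rses_sum_score(raw), t_score(rses_sum_score(raw)))
```

A lookup table is used rather than a linear formula because the published T values are given only in tabulated form.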
As shown in Table 4, internal consistencies (Cronbach’s α) were satisfactory for the RSES scale as well as for the two subscales (α > .80). Item-total correlations were in the upper range (r > .50). Finally, with the aim of enabling the application of the RSES in the context of individual diagnostics, Table 5 presents normative data for the RSES score. Given negligible sex and age differences, we calculated norms based exclusively on the total sample. Standardized T-scores from this table can be used to compare an individual’s RSES score with scores from the general population reference group.

Discussion

In the present study, Ferring and Filipp’s (1996) German version of the Rosenberg Self-Esteem Scale (RSES; Rosenberg, 1965) was administered to a representative population-based sample of 4,988 subjects aged between 14 and 92 years with the aim of exploring dimensionality and psychometric properties of the scale. In line with previous studies using the German version in selected samples (e.g., Collani & Herzberg, 2003; Ferring & Filipp, 1996), satisfactory reliabilities were found. Together with the norm values reported in our paper, this instrument can thus be employed to measure global self-esteem in the practical context of individual diagnostics.

With respect to the dimensionality of the RSES, the CFA results of our study clearly indicate a two-factor structure. As shown by confirmatory factor analyses, only the model in which positive and negative items load onto separate factors, which in turn constitute global self-esteem on a higher order level (referred to as Model 3 in Figure 1), yielded an adequate model fit. Contrary to the findings of Marsh (1996), but in line with those of Collani and Herzberg (2003), we did not find any evidence to suggest that this dimensionality is an artefact of verbal ability. Indeed, the two-factor structure emerged in participants with a low level of education as well as in those with a high level of education.
This is in line with results reported by Greenberger et al. (2003), who also used only a proxy indicator of verbal ability (grade and generational status, i.e., first-generation immigrants vs. higher generation immigrants).

Nevertheless, even if proxy indicators of verbal ability do not contribute to explaining the structure of the scale, we still assume that the two-dimensionality comprising a positive and a negative self-image factor represents a methodological artefact of the two types of item-wording used in the scale. An inspection of the different answer distributions for positively and negatively worded items supports this assumption. As shown in Table 4, the most frequent response to positively worded items was 5, whereas negatively worded items were predominantly responded to with 1. Participants thus tended to disagree with or negate negatively worded items more strongly than they tended to agree with positively worded items. In order to rule out that the two-factor solution was caused by the bimodal item distribution, we conducted an IRT analysis. IRT analyses are not based on covariance matrices and are, therefore, not biased by item distribution artefacts. Results of the IRT analyses support a one-dimensional structure of the RSES. The assumption of an artefact of item-wording also corresponds with the findings of Greenberger et al. (2003). In their study, a clear one-dimensional structure emerged when the wording of RSES items was altered in such a way that all items were worded in a consistent direction, regardless of whether the wording was in a consistently positive or negative direction.
Furthermore, the authors were able to demonstrate that the mean scores for the differently worded versions of the RSES did not significantly differ, and that the wording of the items did not seem to affect the magnitude of correlations between the RSES and validity-related measures (for two exceptions see Greenberger et al., 2003). In conclusion, it therefore seems reasonable to use the global score in analyzing individuals’ self-esteem rather than measuring the two components separately.

Although previous studies have found both age and sex differences in self-esteem (e.g., Kling, Hyde, Showers, & Buswell, 1999; Robins, Trzesniewski, Tracy, Gosling, & Potter, 2002), age differences in our study were very small, and neither gender differences nor an age by gender interaction occurred.¹ The norms provided are therefore valid for males and females as well as for all age groups.

The distribution of item scores leads us to the most serious problem in applying the RSES to measure global self-esteem. As the psychometric properties of the items show, response probabilities were far too high, in particular when considering that such high probabilities (p = .79–.89) were found for all items. A clear ceiling effect was found with respect to the distribution of self-esteem scores. The RSES thus only enables the measurement of self-esteem in the low to medium range, whereas differentiations at the upper end of the range are not possible. This, unfortunately, means that the RSES is not an adequate instrument when it comes to measuring self-esteem at extreme levels associated with, for example, overestimation and narcissistic tendencies (for a summary see Schütz, 2000). The inclusion of items with lower response probabilities in the scale would therefore seem necessary in order to avoid ceiling effects and enable the identification of subjects with extremely high levels of self-esteem.
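The ceiling effect described above can be read directly off the response frequencies in Table 4. The sketch below recomputes item means from the percentage distributions of three items and reports the share of respondents in the top two categories. Expressing the response probability as the item mean divided by the scale maximum of 6 is an assumption; the text does not define p, although this reading is consistent with the tabulated values. All function names are illustrative.

```python
# Percentage of respondents per category 1..6, transcribed from Table 4
# (negative items already recoded, as in the table).
FREQUENCIES = {
    "Item01": [1.3, 3.5, 8.0, 20.2, 40.6, 26.5],
    "Item02*": [1.2, 4.8, 8.0, 11.1, 20.6, 54.2],
    "Item09*": [0.6, 2.2, 6.7, 8.3, 16.0, 66.2],
}


def mean_from_frequencies(freqs):
    """Item mean from category percentages (categories coded 1..len(freqs))."""
    weighted = sum(cat * f for cat, f in enumerate(freqs, start=1))
    return weighted / sum(freqs)  # dividing by the sum absorbs rounding error


def response_probability(freqs):
    """Item mean as a proportion of the scale maximum (assumed definition of p)."""
    return mean_from_frequencies(freqs) / len(freqs)


def top_two_share(freqs):
    """Proportion of respondents choosing one of the two highest categories."""
    return (freqs[-2] + freqs[-1]) / sum(freqs)


for item, freqs in FREQUENCIES.items():
    print(f"{item}: M = {mean_from_frequencies(freqs):.2f}, "
          f"p = {response_probability(freqs):.2f}, "
          f"top two categories = {top_two_share(freqs):.1%}")
```

For Item 9, roughly four out of five respondents fall into the two highest categories, which illustrates why scores cannot be differentiated at the upper end of the range.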
Furthermore, future efforts should focus on extending the external validity of the Rosenberg Self-Esteem scale by investigating the usefulness of the norms provided. Norms may, for instance, enable researchers to conduct clinical single case studies or to evaluate the effects of treatment programs.

Acknowledgments

The study was funded by the Friedrich-Ebert-Stiftung, Germany.

References

Arbuckle, J.L. (2005). AMOS 6.0 user’s guide. Chicago: SPSS.
Bandura, A. (1977). Self-efficacy: Toward a unifying theory of behavioral change. Psychological Review, 84, 191–215.
Blascovich, J., & Tomaka, J. (1991). Measures of self-esteem. In J.P. Robinson, P.R. Shaver, & L.S. Wrightsman (Eds.), Measures of personality and social psychological attitudes (pp. 115–160). New York: Academic Press.
Collani, G.v., & Herzberg, P.Y. (2003). Zur internen Struktur des globalen Selbstwertgefühls nach Rosenberg [On the internal structure of global self-esteem according to Rosenberg]. Zeitschrift für Differentielle und Diagnostische Psychologie, 24, 9–22.
Corwyn, R.F. (2000). The factor structure of global self-esteem among adolescents and adults. Journal of Research in Personality, 34, 357–379.
Dunbar, M., Ford, G., Hunt, K., & Der, G. (2000). Question wording effects in the assessment of global self-esteem. European Journal of Psychological Assessment, 16, 13–19.
Ferring, D., & Filipp, S.H. (1996). Messung des Selbstwertgefühls: Befunde zu Reliabilität, Validität und Stabilität der Rosenberg-Skala [Measurement of self-esteem: Findings on reliability, validity, and stability of the Rosenberg Scale]. Diagnostica, 42, 284–292.
Fischer, G.H., & Ponocny-Seliger, E. (1998). LPCM-Win 1.0 (Version 1.0). Groningen, The Netherlands: ProGamma.
Fleming, J.S., & Courtney, B.E. (1984). The dimensionality of self-esteem. Journal of Personality and Social Psychology, 46, 404–421.
Goldsmith, R.E. (1986). Dimensionality of the Rosenberg Self-Esteem scale. Journal of Social Behavior and Personality, 1, 253–264.
Gray-Little, B., Williams, V.S.L., & Hancock, T.D. (1997). An item response theory analysis of the Rosenberg Self-Esteem scale. Personality and Social Psychology Bulletin, 23, 443–451.
Greenberger, E., Chen, C., Dmitrieva, J., & Farraggia, S.P. (2003). Item-wording and the dimensionality of the Rosenberg Self-Esteem scale: Do they matter? Personality and Individual Differences, 35, 1241–1254.
Herzberg, P.Y. (2002). Zur psychometrischen Optimierung einer Reaktanzskala mittels klassischer und IRT-basierter Analysemethoden [Psychometric refinement of a reactance scale by means of classical and IRT-based approaches]. Diagnostica, 48, 163–171.
Kling, K.C., Hyde, J.S., Showers, C.J., & Buswell, B.N. (1999). Gender differences in self-esteem: A meta-analysis. Psychological Bulletin, 125, 470–500.
Marsh, H. (1996). Positive and negative global self-esteem: Substantively meaningful distinction or artifactors? Journal of Personality and Social Psychology, 70, 810–819.
Masters, G.N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149–174.
Mimura, C., & Griffiths, P. (2007). A Japanese version of the Rosenberg Self-Esteem Scale: Translation and equivalence assessment. Journal of Psychosomatic Research, 62, 589–594.
Owens, T.J. (1994). Two dimensions of self-esteem: Reciprocal effects of positive self-worth and self-deprecation on adolescent problems. American Sociological Review, 59, 391–407.
Pullmann, H., & Allik, J. (2000). The Rosenberg Self-Esteem scale: Its dimensionality, stability and personality correlates in Estonian. Personality and Individual Differences, 28, 701–715.
Robins, R.W., Trzesniewski, K.H., Tracy, J.L., Gosling, S.D., & Potter, J. (2002). Global self-esteem across the life-span. Psychology and Aging, 17, 423–434.
Rosenberg, M. (1965). Society and the adolescent self-image. Princeton, NJ: Princeton University Press.
Rosenberg, M. (1979). Conceiving the self. New York: Basic Books.
Roth, M., & Herzberg, P.Y. (2008). Psychodiagnostik in der Praxis: State of the art? [Status of psychological assessment in professional practice: State of the art?]. Klinische Diagnostik und Evaluation, 1, 5–18.
Schütz, A. (2000). Psychologie des Selbstwertgefühls [Psychology of self-esteem]. Stuttgart, Germany: Kohlhammer.
Sheasby, J.E., Barlow, J.H., Cullen, L.A., & Wright, C.C. (2000). Psychometric properties of the Rosenberg Self-Esteem scale among people with arthritis. Psychological Reports, 86, 1139–1141.
Statistisches Bundesamt. (2005). Mikrozensus 2005. Wiesbaden, Germany: Statistisches Bundesamt.
Tafarodi, R.W., & Milne, A.B. (2002). Decomposing global self-esteem. Journal of Personality, 70, 443–484.
Tafarodi, R.W., & Swann, W.B. (1995). Self-liking and self-competence as dimensions of global self-esteem. Journal of Personality Assessment, 65, 322–342.
Tafarodi, R.W., & Swann, W.B. (2001). Two-dimensional self-esteem: Theory and measurement. Personality and Individual Differences, 31, 653–673.
Tomas, J.M., & Oliver, A. (1999). Rosenberg’s Self-Esteem scale: Two factors or method effects. Structural Equation Modeling, 6, 84–89.
van der Linden, W.J., & Hambleton, R.K. (1997). Handbook of modern item response theory. New York: Springer.

¹ Although Kling et al.’s (1999) meta-analyses yielded an overall effect size of .21, the authors conclude that “Taken together, the 2 analyses provide evidence that males score higher on standard measures of global self-esteem than females, but the difference is small” (p. 470). That we could not confirm sex differences may be a peculiarity of this German sample and may not generalize to other countries. Indeed, none of the samples included in Kling et al.’s meta-analyses is a German one.

PD Dr. Marcus Roth
University of Leipzig
Department of Psychology II
Seeburgstraße 14–20
D-04103 Leipzig
Germany
Tel. +49 341 973-5918
Fax +49 341 973-5909
E-mail mroth@uni-leipzig.de