ASSESSMENT 10.1177/1073191105280987
Miller et al. / SCORING PERSONALITY DISORDERS WITH FFM

A Simplified Technique for Scoring DSM-IV Personality Disorders With the Five-Factor Model

Joshua D. Miller
University of Pittsburgh Medical Center

R. Michael Bagby
University of Toronto

Paul A. Pilkonis
Sarah K. Reynolds
University of Pittsburgh Medical Center

Donald R. Lynam
University of Kentucky

The current study compares two alternative methodologies for using the Five-Factor Model (FFM) to assess personality disorders (PDs). Across two clinical samples, a technique using the simple sum of selected FFM facets is compared with a previously used prototype-matching technique. The results demonstrate that the more easily calculated counts perform as well as the similarity scores that are generated by the prototype-matching technique. Optimal diagnostic thresholds for the FFM PD counts are computed for identifying patients who meet diagnostic criteria for a specific PD. These threshold scores demonstrate good sensitivity in receiver operating characteristics analyses, suggesting their usefulness for screening purposes. Given the ease of this scoring procedure, the FFM count technique has obvious clinical utility.

Keywords: Five-Factor Model; personality disorders; prototypes

Costa and McCrae’s (1992) Five-Factor Model (FFM) of personality has been a highly generative research tool in the service of exploring the relations between personality disorder (PD) constructs and “normal” or general personality functioning.
Much of this research has been driven by a general dissatisfaction with the categorical approach taken by the official classification manual used throughout psychiatry and psychology, the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994), and a belief that dimensional models of adaptive or maladaptive personality features provide a better representation of these phenomena (Livesley, 2001; Widiger, 1993). In addition to the FFM, several prominent personality theorists have put forth alternative personality frameworks and assessment tools that can be used to examine pathological variants of personality, such as Clark’s (1993) Schedule for Nonadaptive and Adaptive Personality, Livesley’s (1986, 1987; Livesley, Jackson, & Schroeder, 1989) Dimensional Assessment of Personality Pathology, Cloninger’s (Cloninger, Svrakic, Bayon, & Przybeck, 1999; Cloninger, Svrakic, & Przybeck, 1993) seven-factor temperament and character model, and Harkness and McNulty’s (Harkness, McNulty, & Ben-Porath, 1995) Minnesota Multiphasic Personality Inventory-2-based Personality Psychopathology Five Scales.

This research was supported by National Institute of Mental Health Grant T32 MH18269, Clinical Research Training for Psychologists (principal investigator P. A. Pilkonis), which provided postdoctoral fellowship support to Joshua D. Miller. Please note that Joshua D. Miller, Ph.D., is now in the Department of Psychology at the University of Georgia. Correspondence concerning this article should be addressed to Joshua D. Miller, Ph.D., Department of Psychology, University of Georgia, Athens, GA 30602; e-mail: jdmiller@uga.edu.

Assessment, Volume 12, No. 4, December 2005 404-415
DOI: 10.1177/1073191105280987
© 2005 Sage Publications

The dissatisfaction and subsequent proposal of alternative models of PD stem from a variety of reasons, including the inability of the DSM-IV PD categories to account for a range of clinically significant, personality-related problems that either (a) do not fit with any of the currently measured constructs or (b) are not severe enough to meet DSM-IV criteria (Westen & Arkowitz-Westen, 1998). Others have commented on the generally limited reliability (Cacciola, Rutherford, Alterman, McKay, & Mulvaney, 1998; Klonsky, Oltmanns, & Turkheimer, 2002) and validity (Clark, Livesley, & Morey, 1997) of DSM-IV PDs. It is the belief of many personality theorists that PDs are best conceptualized as comprising either extreme variants of general personality traits (Costa & Widiger, 1994, 2002) or alternative psychobiological dimensions, such as anxiety/inhibition, impulsivity/aggression, affective instability, and cognitive/perceptual organization (Siever & Davis, 1991). By deconstructing the PDs into their underlying dimensions, a wider array of maladaptive personality styles can be conceptualized and assessed and issues such as comorbidity become less problematic (Lynam & Widiger, 2001).

Although a number of trait models have been successfully used in the service of understanding PDs, the most frequently used has been the FFM. However, the manner in which the FFM has been used to understand PDs has evolved during the past decade. Widiger, Trull, Clarkin, Sanderson, and Costa (1994) laid the groundwork for much of this research by articulating specific hypotheses regarding how each DSM-IV PD would be conceptualized via the 30 specific personality traits (facets) of the FFM. Numerous studies have since tested the success of the FFM in capturing the PDs in general and the Widiger et al.
(1994) hypotheses specifically (see Saulsman & Page, 2004, for a meta-analysis of FFM domains and PDs; e.g., Axelrod, Widiger, Trull, & Corbitt, 1997; Bagby, Costa, Widiger, Ryder, & Marshall, 2005; Blais, 1997; Dyce & O’Connor, 1998; Huprich, 2003; Reynolds & Clark, 2001; Trull, 1992). The majority of this empirical work has involved an examination of the relations between the FFM domains and facets and PD symptomatology using bivariate correlations and multiple regression.

More recently, Lynam et al. developed a prototype-matching technique in which FFM PD prototypes are generated through the use of expert ratings for both DSM-IV–recognized PDs (Lynam & Widiger, 2001) and non-DSM-IV–recognized forms of personality psychopathology, such as psychopathy (Miller & Lynam, 2003; Miller, Lynam, Widiger, & Leukefeld, 2001). These expert-generated prototypes, which use all 30 FFM facets, can then be matched to individuals’ FFM profiles (as assessed by the Revised NEO Personality Inventory [NEO PI-R]) through the use of an intraclass correlation. This correlation, which takes into account profile agreement with regard to shape and absolute magnitude, can then be used as an index of similarity to the pertinent PD constructs. This technique was first successfully applied by Miller et al. (2001) and Miller and Lynam (2003) to demonstrate that psychopathy, a particularly virulent form of PD characterized by traits such as callousness, manipulativeness, lack of remorse or empathy, egocentricity, and impulsivity, could be captured by the FFM. Following this, Lynam and Widiger (2001) solicited expert ratings to develop FFM PD prototypes for all 10 DSM-IV PDs. Subsequently, these prototypes have been tested in four studies. Trull, Widiger, Lynam, and Costa (2003) demonstrated that the FFM prototype for borderline PD converged with other well-validated measures of this PD as well as important criterion constructs.
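The profile-matching step described above can be illustrated with a short sketch. The text does not give the exact intraclass correlation formula, so the example assumes a double-entry intraclass correlation, one common index that is sensitive to both profile shape and elevation; it also assumes the individual's facet scores have already been placed on the same metric as the prototype ratings. The function names and the toy five-facet profiles are hypothetical and serve only as an illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation; sensitive to profile shape only."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def double_entry_icc(profile, prototype):
    """Double-entry intraclass correlation (an assumed ICC variant).

    Each facet pair is entered twice, once in each order, so that
    differences in elevation between the two profiles lower the index.
    """
    if len(profile) != len(prototype):
        raise ValueError("profiles must cover the same facets")
    doubled_a = list(profile) + list(prototype)
    doubled_b = list(prototype) + list(profile)
    return pearson_r(doubled_a, doubled_b)


# Hypothetical toy data: same shape, but the individual is one point higher.
prototype = [1.0, 2.0, 3.0, 4.0, 5.0]
same_shape_higher = [2.0, 3.0, 4.0, 5.0, 6.0]

shape_only = pearson_r(same_shape_higher, prototype)
shape_and_elevation = double_entry_icc(same_shape_higher, prototype)
```

In this toy case the Pearson correlation is perfect because the shapes match, whereas the double-entry index is lower because the elevations differ, which is the property the text attributes to the intraclass correlation.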
Recently, Miller, Pilkonis, and Morse (2004) and Miller, Reynolds, and Pilkonis (2004) have examined all 10 of the Lynam and Widiger (2001) prototypes across clinical samples and informant methodologies. Miller, Reynolds, et al. (2004) found support for the convergent, discriminant, and predictive validity and temporal stability of the FFM PD prototypes. Two studies have also demonstrated the “resilience” of this technique to information source; Miller, Pilkonis, et al. (2004) demonstrated that FFM information derived from an informant could be used to score the prototypes with equal validity, whereas Miller, Bagby, and Pilkonis (in press) showed that data from a semistructured interview of the FFM could also be successfully used.

Despite the empirical success of the prototype-matching technique across PDs and data source, researchers and clinicians may be reluctant to use this approach. The scoring methodology is complex and requires a statistical program to create the PD similarity scores.1 In addition, the scores are not intuitively meaningful. One possible alternative is to use simple additive counts to score individuals on DSM-IV PDs, which would still use information from the Lynam and Widiger (2001) FFM prototypes.2 To do this, one would first have to identify which facets were considered prototypically low or prototypically high for each PD (i.e., a facet with a score ≥ 4 or ≤ 2 on the Lynam & Widiger prototypes), reverse key the facets with a score of ≤ 2, and sum the scores in the same (high) maladaptive direction (see the appendix for count syntax and coding information). A clinician or researcher would then simply add an individual’s scores across relevant facets.
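The counting procedure just described can be sketched in a few lines. This is an illustration under stated assumptions, not the appendix syntax: the facet list is the histrionic PD list given in the text, the function and variable names are hypothetical, and reverse keying assumes facet raw scores that run from 0 to 32 (eight items each scored 0 to 4), which the text does not state explicitly.

```python
FACET_MAX = 32  # assumption: 8 items per facet, each scored 0-4


def select_facets(prototype_ratings, high=4.0, low=2.0):
    """Map each prototypically high or low facet to a reverse-score flag.

    Facets rated >= `high` are kept as scored (False); facets rated
    <= `low` are reverse scored (True); all others are dropped.
    """
    key = {}
    for facet, rating in prototype_ratings.items():
        if rating >= high:
            key[facet] = False
        elif rating <= low:
            key[facet] = True
    return key


def ffm_pd_count(facet_scores, facet_key, facet_max=FACET_MAX):
    """Sum the relevant facets, reversing those flagged as prototypically low."""
    total = 0
    for facet, reverse in facet_key.items():
        score = facet_scores[facet]
        total += (facet_max - score) if reverse else score
    return total


# Histrionic PD facets as listed in the text (True = reverse scored).
HISTRIONIC_KEY = {
    "self_consciousness": True,
    "impulsivity": False,
    "gregariousness": False,
    "activity": False,
    "excitement_seeking": False,
    "positive_emotions": False,
    "openness_to_fantasy": False,
    "openness_to_feelings": False,
    "openness_to_actions": False,
    "trust": False,
    "self_discipline": True,
    "deliberation": True,
}

# Hypothetical respondent: 20 on keyed-high facets, 10 on keyed-low facets.
respondent = {f: (10 if rev else 20) for f, rev in HISTRIONIC_KEY.items()}
histrionic_count = ffm_pd_count(respondent, HISTRIONIC_KEY)
```

Because every facet ends up scored in the same direction, a higher total simply means a closer resemblance to the prototype on the selected facets.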
For example, according to this strategy, the FFM PD count for histrionic PD would involve adding together the following facets: self-consciousness (a facet of neuroticism [N], which would be reverse scored), impulsivity (N), gregariousness (a facet of extraversion [E]), activity (E), excitement seeking (E), positive emotions (E), openness to fantasy (a facet of openness to experience [O]), openness to feelings (O), openness to actions (O), trust (a facet of agreeableness [A]), self-discipline (a facet of conscientiousness [C], which would be reverse scored), and deliberation (C, which would be reverse scored). These counts, which have not been tested, would have greater clinical utility if they work as well as the overall prototype-matching technique. However, because they do not take into account the full FFM profile (the number of facets used in the counts ranges from 7 to 17), the counts may not perform as well as the similarity scores.

In the current study, we examined the success of these counts in comparison to the FFM PD similarity scores in two samples, both of which have been previously used to demonstrate the success of the FFM similarity scores.3 In particular, we provide descriptive statistics for the FFM counts across both samples. Next, we examine the convergent validity of the FFM counts in relation to PD symptom counts generated by well-known PD measures and compare their performance to the FFM similarity scores. Finally, we present data from receiver operating characteristic (ROC) analyses using FFM counts and similarity scores to identify patients who met criteria for the PD diagnoses.

METHOD

Sample 1

Participants and Procedures

The sample consisted of 115 patients (53 men, 62 women) assessed at the Psychological Assessment Service at a large tertiary care, medical school–affiliated, psychiatric facility located in a large, primarily English-speaking, North American metropolis.
Ethnic status was reported for 94 patients; 90 were of European descent, 2 were of African descent, 1 was of Asian descent, and 1 was of Hispanic descent. Most of these referrals were outpatients (n = 100). Mood (n = 91, 79%) and anxiety (n = 9, 8%) disorders were the most common diagnoses. The mean age of this sample was 41.4 (SD = 11.26). All patients were assessed with the Structured Clinical Interview for DSM-IV (SCID), Axis I Disorders (Version 2.0/Patient Form; First, Spitzer, Gibbon, & Williams, 1995) and completed the Structured Clinical Interview for DSM-IV Personality Disorders–Personality Questionnaire (SCID-II/PQ; First, Gibbon, Spitzer, Williams, & Benjamin, 1997) and NEO PI-R. Advanced clinical psychology interns (n = 5), two M.A.-level clinical psychologists, and a postdoctoral clinical fellow conducted the interviews. Although interrater agreement was not formally determined, all interviewers were trained extensively in the interview procedures and carefully observed and approved by a Ph.D.-level clinical psychologist prior to conducting any interview.

Measures

SCID-II Personality Questionnaire (SCID-II/PQ). PD symptomatology was assessed via a two-tiered approach. First, all participants were assessed using the 119-item self-report questionnaire version of the SCID-II (SCID-II/PQ), on which items are answered using a yes-no response format. Each of the 119 questions corresponds to the diagnostic criteria for the 10 different PDs in the main text of DSM-IV and the two additional PDs listed in Appendix B of DSM-IV. Following this, the SCID-II interview items were asked for those disorders where full DSM-IV criteria were met on the self-report measure. In the current study, we used both dimensionalized sum scores (a sum of each PD’s items) derived from the self-report ratings for each of the PDs and the actual no-yes diagnoses that use self-report and interview data.
Although self-report measures are prone to overestimating PDs, a number of studies have shown that the dimensional self-report scales have reasonable validity (e.g., Carey, 1994; Huprich, 2003). The coefficient alphas for the self-report items ranged from .32 (OCPD) to .84 (borderline PD), with a median alpha of .69.

NEO PI-R. The NEO PI-R (Costa & McCrae, 1992) was specifically designed to measure the FFM of personality and provides domain scores corresponding to N, E, O, A, and C. The NEO PI-R consists of 240 self-report items answered on a 5-point scale, with separate scales for each of the five domains. Each scale consists of six correlated facets or subscales with eight items, for a total of 48 items for each scale. Internal consistency reliabilities for the five domains ranged from .89 (A) to .94 (N), whereas the internal consistency reliabilities of the facet scales ranged from .56 to .89 (median coefficient alpha = .79).

FFM PD similarity scores. We calculated similarity scores for each of the 10 DSM-IV PDs by using intraclass correlations between participants’ obtained NEO PI-R facet scale scores and the expert-generated facet profiles of the PD prototypes as described in Lynam and Widiger (2001). An intraclass Q correlation (in which individuals’ FFM profiles and the 10 FFM PD prototypes are entered as columns) was used because it considers both the shape and elevation of individual scores (in comparison to the expert prototypes) rather than the shape alone, as is the case with a Pearson correlation. As such, it is a more stringent measure of agreement.

FFM PD counts. The FFM PD counts represent an alternative method for scoring the Lynam and Widiger (2001) prototypes.
Rather than using a prototype-matching technique as discussed earlier, a simple count is used in which facets that were rated as being prototypically high (≥ 4) or prototypically low (≤ 2) are summed together (see the appendix for the 10 PD facet counts). However, facets that are considered prototypically low (e.g., straightforwardness in antisocial PD) are reverse scored so that all facets are scored in the direction of maladaptivity for that specific PD.

Sample 2

Participants and Procedures

Participants were either inpatients or outpatients undergoing assessment or treatment at one of several facilities affiliated with the University of Iowa. Outpatients were recruited from either the university medical center psychiatry clinic or the university-based psychology clinic staffed by graduate students and faculty of the psychology department. Inpatients were recruited from the university medical center psychiatric units, which serve a general psychiatric population, with a small minority of participants (10%) recruited from the eating disorder specialty unit. Individuals with personality pathology were not selectively recruited for participation. Rather, the goal of the sampling strategy was to approximate a general clinical sample that included a variety of clinical problems and a wide range of severity of psychopathology. Patients who met the following inclusion criteria were asked to participate: age of 18 years or older, high school diploma or GED, and absence of active psychosis, organic brain syndrome, or mental retardation (per available chart information). The data presented here are from 94 participants: 58 outpatients (62%) and 36 inpatients. The sample included 69 women (73%) and 25 men. Mean age was 34.6 (range = 18 to 76, SD = 10.5). The modal participant was Caucasian (96%), unmarried (71%), and employed (72%).
The mean self-reported age of first psychiatric contact was 24.4 (range = 5 to 59, SD = 10.5), and 55% of the sample had had at least one prior psychiatric hospitalization (M = 3.0, SD = 4.6). Axis I disorders were not formally assessed; however, available Axis I chart diagnoses made as part of routine clinical care were noted. These diagnoses often had been made years prior to the present study and may have limited validity. Nonetheless, the majority of participants received an Axis I diagnosis (88%), with the most frequent diagnosis a mood disorder (53%).

Measures

Structured Interview for DSM-IV Personality (SIDP-IV). The SIDP-IV (Pfohl, Blum, & Zimmerman, 1997) is a semi-structured interview that contains probe questions developed to assess each of the DSM-IV PD criteria. The questions are grouped into 10 areas of functioning (e.g., close relationships, work style, perception of others) rather than by diagnoses. Following the interview, each criterion is rated on a 4-point scale (0 = not present; 1 = subthreshold features; 2 = clearly present, clinically significant; 3 = prominent symptom). Dimensional scores were calculated for each diagnosis by summing the component criterion scores (0 to 3). Diagnoses were scored in a manner consistent with the SIDP manual and DSM-IV. Interviews were conducted by two clinical psychology graduate students who were trained in the administration and scoring of the SIDP-IV by an author of the instrument. As suggested by the SIDP-IV authors, chart information, when available, was used as additional data in rating each criterion. To examine the interjudge agreement of the PD ratings, a second rater reviewed audiotapes of a subset of interviews (18%) and provided independent ratings. Intraclass correlation coefficients (ICC) were computed for the dimensional scores of each PD scale, and the mean ICC for the 10 PDs was .90.
Schizotypal PD was the least reliably rated criteria set (ICC = .77), whereas borderline and avoidant were the most reliably rated (ICCs = .96). In terms of internal consistency, coefficient alphas ranged from .53 (schizoid) to .79 (borderline, avoidant), with a median of .72.

FFM measures. All the FFM measures (e.g., NEO PI-R; FFM PD similarity scores, FFM PD counts) were the same as in Sample 1.

RESULTS

Descriptive Statistics

TABLE 1
Descriptive Characteristics of FFM PD Counts

FFM Count   Sample   Min.   Max.   M        SD      M (by Facets)
PAR         1          64    219   133.81   30.90   13.38
PAR         2          56    201   131.47   26.57   13.15
SZD         1          56    194   131.60   26.97   16.45
SZD         2          60    191   119.46   28.07   14.93
SCT         1          59    177   125.57   23.44   17.94
SCT         2          53    166   120.10   20.72   17.16
APD         1         131    342   226.71   33.97   13.34
APD         2         133    354   233.47   35.02   13.73
BPD         1          57    249   159.69   28.58   17.74
BPD         2          97    225   164.55   26.86   18.28
HST         1         122    248   190.77   25.95   15.90
HST         2         125    266   200.24   31.94   16.69
NAR         1          94    251   163.63   29.04   12.59
NAR         2         102    282   164.67   27.58   12.67
AVD         1          98    247   185.39   30.98   18.54
AVD         2          90    252   175.88   31.24   17.59
DEP         1          84    185   135.94   20.05   19.42
DEP         2          66    183   132.68   21.82   18.95
OC          1          93    287   204.03   29.95   15.69
OC          2         138    271   204.48   30.82   15.73

NOTE: FFM = Five-Factor Model; PD = personality disorder; PAR = paranoid; SZD = schizoid; SCT = schizotypal; APD = antisocial; BPD = borderline; HST = histrionic; NAR = narcissistic; AVD = avoidant; DEP = dependent; OC = obsessive-compulsive. Because the counts have a different number of facets (ranging from 7 to 17), we provide the mean score taking into account the number of facets.

Table 1 presents descriptive statistics for the FFM PD count scores. The mean FFM counts were quite similar across the samples (e.g., M FFM PD counts for paranoid = 133.81 [Sample 1] and 131.47 [Sample 2]), with a mean difference of 5.53.
Because the counts use a different number of facets (ranging from 7 to 17), we also provide mean count scores that take this into account (thus making it possible to compare scores across the FFM counts).

Correlations Between FFM PD Counts and Similarity Scores

We next examined the correlations between the FFM PD similarity scores and the FFM PD counts across the samples. In Sample 1, the correlations ranged from .75 (histrionic) to .97 (avoidant), with a median r of .91. There was one case of a gender difference: The correlation between the similarity score and count was significantly different for dependent PD, with an r of .89 for men and .78 for women. In Sample 2, the correlations between the FFM similarity scores and the FFM counts ranged from .77 (narcissism for women only) to .98 (avoidant), with a median r of .91. In this sample, there were two significant gender differences in the size of the correlations; the correlation for narcissism was .94 for men and .77 for women, whereas the correlation for dependent was .93 for men and .80 for women.

Correlations Between FFM PD Counts and PD Symptom Counts

Next, we examined the convergent validity of the FFM counts with PD symptom counts from well-known measures of PD symptoms (see Table 2). In Sample 1, the correlations between the FFM PD counts and the PD symptom counts ranged from –.02 (OCPD) to .64 (borderline), with a median r of .40. In Sample 2, the correlations between the FFM PD counts and the PD symptom counts ranged from –.15 (histrionic for men only) to .64 (avoidant), with a median r of .45. As noted, there was one case in which the correlation between the FFM count and the PD symptoms was significantly different across gender; in Sample 2, the correlation between the FFM histrionic count and a histrionic PD diagnosis was significantly larger (and positive) for women.
We next tested, in each sample, whether the correlation for each FFM PD count was significantly different from the correlation previously reported (Miller et al., in press; Miller, Reynolds, et al., 2004) for its respective FFM similarity score. Of the 21 comparisons, only 1 was significantly different. The correlation between the FFM dependent count and the dependent symptom count (r = .34) in Sample 1 was significantly larger than the correlation for the FFM dependent similarity score and dependent PD count (r = .24). These findings suggest that the differences with regard to convergent validity are quite minimal between the FFM similarity scores and the counts.

As can be seen in Table 2, we calculated weighted effect sizes using meta-analytic techniques (i.e., Fisher z-transformed rs were combined, taking into account sample size, to obtain a mean effect size and then were transformed back to rs) for the FFM counts. Overall, the effect sizes ranged from .02 to .63, with one small effect size (OCPD), six medium effect sizes (paranoid, schizotypal, antisocial, histrionic, narcissistic, and dependent), and three large effect sizes (schizoid, borderline, and avoidant).

We also examined the divergent validity of the FFM PD counts with the PD symptom counts. In Sample 1, the discriminant validity correlations ranged from –.34 (FFM histrionic and avoidant PD symptoms; FFM OCPD and borderline PD symptoms) to .64 (FFM schizotypal and avoidant PD symptoms), with an absolute median correlation of .21. In Sample 2, the discriminant correlations ranged from –.53 (FFM histrionic and schizoid PD symptoms) to .63 (FFM schizotypal and avoidant PD symptoms), with an absolute median correlation of .21.
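The weighted effect sizes mentioned above can be reproduced with a brief sketch. The text says only that the Fisher z-transformed correlations were combined "taking into account sample size"; weighting each z by n − 3 (the inverse of its sampling variance) is a common convention and is an assumption here, as is the function name. The correlations come from Table 2 and the sample sizes (115 and 94) from the Method section.

```python
from math import atanh, tanh


def weighted_mean_r(rs, ns):
    """Combine correlations via Fisher's z, weighting by n - 3 (assumed)."""
    weights = [n - 3 for n in ns]
    z_bar = sum(w * atanh(r) for r, w in zip(rs, weights)) / sum(weights)
    return tanh(z_bar)  # back-transform the mean z to r


# Sample 1 (n = 115) and Sample 2 (n = 94) correlations from Table 2.
paranoid = weighted_mean_r([0.41, 0.44], [115, 94])
schizoid = weighted_mean_r([0.40, 0.60], [115, 94])
print(round(paranoid, 2), round(schizoid, 2))  # prints: 0.42 0.5
```

Both results agree with the weighted effect sizes reported in Table 2 for the paranoid (.42) and schizoid (.50) counts, which suggests the assumed weighting is at least close to the one the authors used.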
TABLE 2
Correlations Between FFM PD Counts and PD Symptom Counts

FFM PD                  PD, Sample 1   PD, Sample 2   Weighted Effect Size
Paranoid PD count       .41**          .44**          .42
Schizoid PD count       .40**          .60**          .50
Schizotypal PD count    .40**          .28**          .35
Antisocial PD count     .36**          .51**          .43
Borderline PD count     .64**          .56**          .61
Histrionic PD count     .33**          –.15/.41**     .31
Narcissistic PD count   .45**          .45**          .45
Avoidant PD count       .63**          .64**          .63
Dependent PD count      .34**a         .46**          .40
OCPD PD count           –.02           .08            .02

NOTE: FFM = Five-Factor Model; PD = personality disorder; OCPD = obsessive-compulsive PD. / = a significant gender difference in the size of the correlation. The relation for men is presented before the diagonal, women after it.
a. Correlation is significantly different between the FFM count and PD symptoms and the FFM similarity score and PD symptoms from Miller et al. (in press), which was .24.
*p ≤ .05. **p ≤ .01.

Receiver Operating Characteristics (ROCs)

Finally, in the interest of clinical utility and our desire to provide a basis for initial decision making regarding the use of the FFM counts and similarity scores to identify PDs, we conducted a series of ROC analyses. These analyses provide important diagnostic efficiency statistics, such as sensitivity, specificity, and positive and negative predictive power, associated with the raw scores. Because these analyses require that a certain number of individuals receive a PD diagnosis, we limited our analyses in each sample to those PDs that had a sufficient prevalence. This, coupled with the poor performance of the FFM counts and similarity scores to capture OCPD, limited us to testing 8 of the 10 PDs in Sample 1 and 3 of the 10 in Sample 2.
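The diagnostic efficiency statistics named above all derive from a 2 × 2 table of screen result by diagnosis. The sketch below computes them for a given cut score and then searches for a cut score that meets the .80 sensitivity criterion used in Table 3. It reads "the first raw score" reaching that sensitivity as the highest qualifying score, which is an interpretation rather than something the text states; the function names and the ten-person data set are hypothetical.

```python
def diagnostic_stats(scores, has_pd, cut):
    """Sensitivity, specificity, PPP, and NPP when score >= cut flags a case."""
    tp = fp = fn = tn = 0
    for score, diagnosed in zip(scores, has_pd):
        flagged = score >= cut
        if flagged and diagnosed:
            tp += 1
        elif flagged:
            fp += 1
        elif diagnosed:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppp": ratio(tp, tp + fp),  # positive predictive power
        "npp": ratio(tn, tn + fn),  # negative predictive power
    }


def highest_cut_reaching(scores, has_pd, target=0.80):
    """Highest observed score whose sensitivity meets the target, else None."""
    for cut in sorted(set(scores), reverse=True):
        if diagnostic_stats(scores, has_pd, cut)["sensitivity"] >= target:
            return cut
    return None


# Hypothetical toy data: ten FFM PD counts and whether each person has the PD.
toy_scores = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
toy_has_pd = [False, False, False, True, False, True, False, True, True, True]

toy_cut = highest_cut_reaching(toy_scores, toy_has_pd)
toy_stats = diagnostic_stats(toy_scores, toy_has_pd, toy_cut)
```

Lowering the cut score raises sensitivity at the cost of specificity and positive predictive power, which is the trade-off a screening use of these scores accepts.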
Table 3 provides information regarding the PD prevalence in each sample, the area under the curve (AUC) accounted for by the similarity scores and counts, the first raw score that manifested a sensitivity equaling or exceeding .80 for each method, and other diagnostic efficiency statistics.4 The AUC was significant for 10 of 11 similarity scores and for 11 of 11 counts across the two samples. The median AUCs accounted for by the similarity scores and counts, across samples, were .77 and .78, respectively. We also calculated median sensitivities, specificities, positive predictive power (PPP), and negative predictive power (NPP) for these cut scores. For the similarity scores, the medians for these diagnostic statistics were .82, .61, .31, and .94, respectively. For the counts, the medians for these diagnostic statistics were .82, .63, .31, and .94, respectively.

DISCUSSION

The use of measures of general personality to understand and assess PD constructs has been primarily a matter of theoretical interest aimed at demonstrating that PDs are extensions or variants of general personality traits. Recent studies have put forth a new technique by which an individual’s general personality profile, with regard to the FFM, can be matched to the PDs. However, because of the complexity of the scoring methodology, the probability of this technique being used in clinical settings seems low. As we have noted previously (Bagby, Schuller, Marshall, & Ryder, 2004; Miller, Reynolds, et al., 2004), we believe that using the FFM as an assessment tool for both adaptive and maladaptive personality variants has real advantages. So in conjunction with this belief, we sought to develop a manner of scoring PDs with FFM data that also uses the broad expertise collected in the Lynam and Widiger (2001) prototypes.
As noted earlier, these expert-generated prototypes have been quite successful in capturing PD constructs, including those in DSM-IV, such as borderline PD, and those not included, such as psychopathy. Given the general success of these prototypes, it seemed particularly important to develop a scoring methodology that used, in some form, the prototype information but did so in a manner that might have real world applications.

TABLE 3
Receiver Operating Characteristics of FFM PD Similarity Scores and Counts

Sample 1
PDs                       PD Base Rate/Percentage   Area    Raw Score   Sensitivity   Specificity   PPP   NPP
Paranoid similarity       29/25%                    .69**    –.32       .83           .37           .31   .86
Paranoid count            29/25%                    .69**   116.5       .83           .34           .30   .85
Schizoid similarity       21/18%                    .77**    –.11       .81           .54           .28   .93
Schizoid count            21/18%                    .79**   131.0       .91           .54           .31   .96
Schizotypal similarity    9/8%                      .80**     .28       .89           .70           .20   .99
Schizotypal count         9/8%                      .78**   131.5       .89           .64           .17   .99
Antisocial similarity     19/17%                    .67*     –.45       .84           .46           .24   .94
Antisocial count          19/17%                    .69**   226.0       .84           .55           .27   .95
Borderline similarity     52/45%                    .78**    –.09       .81           .67           .67   .81
Borderline count          52/45%                    .80**   156.0       .81           .65           .66   .80
Narcissistic similarity   25/22%                    .83**    –.41       .80           .66           .39   .92
Narcissistic count        25/22%                    .80**   170.5       .80           .72           .44   .93
Avoidant similarity       58/50%                    .75**     .20       .81           .61           .68   .76
Avoidant count            58/50%                    .75**   179.5       .81           .63           .69   .77
Dependent similarity      9/8%                      .59       .13       .89           .30           .10   .97
Dependent count           9/8%                      .72*    129.5       .89           .38           .11   .98

Sample 2
PDs                       PD Base Rate/Percentage   Area    Raw Score   Sensitivity   Specificity   PPP   NPP
Paranoid similarity       5/5%
Paranoid count            5/5%
Schizoid similarity       0/0%
Schizoid count            0/0%
Schizotypal similarity    0/0%
Schizotypal count         0/0%
Antisocial similarity     2/2%
Antisocial count          2/2%
Borderline similarity     22/23%                    .87**     .34       .82           .86           .64   .94
Borderline count          22/23%                    .85**   192.0       .82           .85           .62   .94
Narcissistic similarity   3/3%
Narcissistic count        3/3%
Avoidant similarity       22/23%                    .73*      .23       .80           .58           .19   .96
Avoidant count            22/23%                    .78**   143.5       .80           .75           .28   .97
Dependent similarity      10/11%                    .78**    –.03       .86           .63           .41   .94
Dependent count           10/11%                    .75**   148.5       .82           .57           .37   .91

NOTE: FFM = Five-Factor Model; PD = personality disorder; PPP = positive predictive power; NPP = negative predictive power.
*p ≤ .05. **p ≤ .01.

To test the comparability of the FFM PD counts with the FFM PD similarity scores, we first examined the correlations between these two overlapping scoring methodologies across the two samples. The FFM PD counts were highly correlated with FFM PD similarity scores; the median correlations were .91 and .91 in Samples 1 and 2, respectively. We then compared the size of the correlations between these two FFM measures and PD symptom counts across two similar clinical samples. Finally, using receiver operating characteristics, we identified cut scores for the FFM similarity scores and counts and looked at the generalizability of these cut scores across the samples.

Across the analyses and samples, the FFM similarity scores and counts performed in a nearly identical fashion. The median correlations for the FFM counts with the PD symptom counts were .40 and .45 in Samples 1 and 2, respectively. The median correlations between the similarity scores and the PD symptom counts, across the two samples, were .39 (Miller et al., in press) and .50 (Miller, Reynolds, et al., 2004), respectively. In fact, there was only one case in which a correlation was significantly different between the counts and similarity scores; in Sample 1, the correlation for the dependent count was stronger than the respective correlation using the similarity score. This difference, however, was small (d = .11).

As has been the case with the FFM similarity scores, the FFM counts were not significantly related to obsessive-compulsive PD (OCPD). This is not an uncommon finding; numerous studies have found that OCPD is not well captured by the FFM (Ball, Tennen, Poling, Kranzler, & Rounsaville, 1997; Huprich, 2003; Saulsman & Page, 2004; cf. Dyce & O’Connor, 1998).
The two other PDs that are typically more weakly represented by the FFM, schizotypal and dependent PD, were significantly related to their respective PD diagnoses, albeit less strongly and consistently across the samples. There are several potential explanations for this failure. One explanation put forth by Haigler and Widiger (2001) is that the NEO PI-R does not include an adequate number of items written to assess maladaptivity at both the high and low poles of the domains. So PDs hypothesized to be based, in part, on high scores on domains such as C (OCPD), A (dependent), or openness (schizotypal) may be more poorly assessed by the FFM. Haigler and Widiger (2001) found that manipulating the NEO PI-R items to include more items representing maladaptively high variants of the FFM domains increased the size of the correlations between OCPD, dependent, and schizotypal PDs with C, A, and openness, respectively. Miller, Reynolds, et al. (2004) also suggested that it is possible that the prototypes may be mistaken in their view of certain disorders, such as the relation between dependent PD and A and C. For example, results from the current samples are consistent with those reported in a meta-analysis by Bornstein and Cecero (2000) that suggest that dependent PD is negatively correlated with trust (a facet of A) and certain C facets (e.g., competence, achievement striving, self-discipline) rather than positively, as postulated by the Lynam and Widiger (2001) prototypes. These findings are further complicated by the idea that there may be different forms of dependency, which have different FFM conceptualizations (Pincus, 2002). Further examination will be necessary to tease apart these weaker relations and determine if they are an artifact of the personality measure or a case of misconceptualization with regard to the expert ratings. The findings were also consistent across samples regarding which PDs were best captured by the FFM count. 
In particular, schizoid, borderline, and avoidant PDs were well accounted for by the FFM counts across both clinical samples. Weighted effect sizes for the relations between these three counts and PD symptoms across the samples were large.

One innovative aspect of this study is that it provides cut scores from one to two clinical samples that can be used for the FFM similarity scores and counts. These analyses are important because they provide information that allows these two scoring techniques to be used in clinical settings as screening measures for several of the PDs. The data gleaned from these analyses, although tentative, represent an important step toward making these approaches clinically useful. However, because of the sample sizes and the use of a self-report PD measure (Sample 1), these scores should be tested further to see whether they replicate in other clinical samples.

As with the bivariate relations, borderline and avoidant PDs were well accounted for by the similarity scores and counts, as the diagnostic efficiency statistics worked quite well. Although 21 of 22 cut scores manifested a significant AUC, the similarity scores and counts for paranoid and antisocial PD had scores that would be deemed poor (e.g., .6 to .7). The rest of the PDs tested had, for the most part, cut scores that resulted in either fair (e.g., .7 to .8) or good (e.g., .8 to .9) AUCs. As we have advocated for the use of the FFM as a screening tool and not a stand-alone, comprehensive PD assessment battery, we believe that sensitivity is more important than specificity because the false positives should be ruled out on further assessment. Even though we identified cut scores with sensitivities of .80 or higher, 16 of the 22 cut scores also had specificities higher than .50. In fact, the median specificity score was .61 for the similarity scores and .63 for the counts.
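The diagnostic efficiency statistics discussed here can be illustrated with a short sketch. The Python below is not the analysis code used in the study. It assumes that a person screens positive when a score meets or exceeds the cut score, and it applies one plausible selection rule: the most specific cut score whose sensitivity is at least .80. The function names `efficiency_at` and `pick_cut`, and the selection rule itself, are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Efficiency:
    cut: float
    sensitivity: float  # true positives / all cases with the PD
    specificity: float  # true negatives / all cases without the PD
    ppp: float          # positive predictive power: TP / all screen positives
    npp: float          # negative predictive power: TN / all screen negatives


def _ratio(numerator, denominator):
    # Undefined ratios (empty denominator) are reported as NaN.
    return numerator / denominator if denominator else float("nan")


def efficiency_at(scores, has_pd, cut):
    """Diagnostic efficiency of one cut score (screen positive if score >= cut)."""
    tp = fp = tn = fn = 0
    for score, diagnosed in zip(scores, has_pd):
        positive = score >= cut
        if positive and diagnosed:
            tp += 1
        elif positive:
            fp += 1
        elif diagnosed:
            fn += 1
        else:
            tn += 1
    return Efficiency(
        cut=cut,
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppp=_ratio(tp, tp + fp),
        npp=_ratio(tn, tn + fn),
    )


def pick_cut(scores, has_pd, min_sensitivity=0.80):
    """Assumed rule: most specific cut score with sensitivity >= min_sensitivity."""
    candidates = [efficiency_at(scores, has_pd, c) for c in sorted(set(scores))]
    eligible = [e for e in candidates if e.sensitivity >= min_sensitivity]
    if not eligible:
        return None
    return max(eligible, key=lambda e: (e.specificity, e.cut))


if __name__ == "__main__":
    # Hypothetical toy data, not values from either clinical sample.
    toy_scores = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    toy_has_pd = [False, False, False, False, True, False, True, False, True, True]
    print(pick_cut(toy_scores, toy_has_pd))
```

Because positive predictive power is computed over everyone who screens positive, it falls quickly when a disorder is uncommon in the sample, even if sensitivity is high.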
The FFM similarity scores and counts also demonstrated good negative predictive power (medians = .94 and .94). However, the same was not true for positive predictive power (PPP; medians = .31). Similar to most self-report PD questionnaires, the FFM similarity scores and counts demonstrate a clear tendency toward overdiagnosis. The median PPP for the FFM counts is actually better than those reported for the Personality Diagnostic Questionnaire–4+ (PDQ-4+; Hyler, 1994); median reported PPPs for the PDQ-4+ include .16 (Yang et al., 2000), .18 (Wilberg, Dammen, & Friis, 2000), and .19 (Fossati et al., 1998). Similarly, median PPPs from the self-report Millon Clinical Multiaxial Inventory–III, as reported by Hsu (2002), have ranged from .18 (Millon, 1994) to .72 (Millon, Davis, & Millon, 1997; see Hsu, 2002, for possible explanations for elevated scores in this sample). These results, including those reported here, suggest that self-report PD measures, regardless of how they were created, are prone to generating a high number of false positives.

Limitations

One limitation of this study is that the data used here were primarily limited to self-reports. As such, the size of the relations reported may have been inflated because of common method variance. This concern is mitigated to some degree by previous studies that have found similar patterns of findings using significant other reports and interview-based data (Miller et al., in press; Miller, Pilkonis, et al., 2004). Another limitation is that the cut scores provided are based on one or two moderately sized clinical samples. In addition, PD diagnoses in Sample 1 were, in part, determined by a self-report scale, which affects the reliability of the subsequent diagnoses. As such, these diagnostic scores should be viewed cautiously as it is possible that they will be sample specific and fail to generalize to other samples.
The replication reported here for three of the PDs may make this less likely for these PDs, but it is a concern for the remaining ones. Because of the low prevalence rate of certain PDs and the size of our clinical samples, we were unable to provide a comprehensive test of the diagnostic efficiencies of these methods for all 10 PDs. Finally, given the clinical nature of both samples, the cut scores may only be appropriate for use with individuals of a moderate to high severity and may be less appropriate for use in nonclinical samples.

CONCLUSION

Overall, the current results suggest that both the FFM counts and the full prototype-matching technique (e.g., FFM similarity scores) are equally successful in relating to PD symptoms. With the exception of OCPD, the FFM counts and similarity scores are relatively successful at capturing the various DSM-IV PD constructs. The counts may be easier to use given the simplicity of the scoring methodology; however, it is worth noting that scoring of the FFM PD prototypes is now available using two readily available software programs. The current results should move this line of research forward by allowing clinicians and/or researchers to use the FFM PD prototypes (in either the count or similarity score form) in clinical settings. We believe that it is important to consider that the counts are still a technique that takes into account the dimensional, multitrait model even if it is not as broad or comprehensive as the full prototype-matching technique. In fact, 9 of the 10 FFM PD counts use facets from three or more of the personality domains, thus ensuring relatively broad coverage. Overall, we believe that the development of these simple additive counts will make this approach more applicable in clinical settings. A benefit of this is that it will allow clinicians to gather data on clients’ general personality configurations as well as their more maladaptive personality styles.
More broadly, we believe that the use of basic, dimensional models of personality in understanding DSM Axis II diagnoses holds great promise for providing a model for more empirically valid measures of personality-based psychopathology.

NOTES

1. FFM PD similarity scoring programs via Microsoft Excel worksheet and/or SPSS syntax are available from the first or last author.
2. We would like to thank an anonymous reviewer from a previous manuscript who first suggested the idea of using additive counts based on the FFM to assess the PDs.
3. See Miller et al. (in press) and Miller, Reynolds, et al. (2004) for specific data on the relations between the FFM PD similarity scores and PD symptoms.
4. A number of the cut scores for the FFM similarity scores have a negative value. Because these scores take into account similarity across 30 traits, many individuals who are considered a good match (e.g., meet or exceed the identified cut score) are still going to be quite dissimilar to the overall prototype, which is reflected in these negative values.

APPENDIX
FFM PD Counts

Paranoid PD = n2 + e1r + e2r + o4r + o6r + a1r + a2r + a3r + a4r + a6r.
Schizoid PD = e1r + e2r + e3r + e4r + e5r + e6r + o3r + o4r.
Schizotypal PD = n1 + n4 + e1r + e2r + e6r + o5 + c2r.
Antisocial PD = n1r + n2 + n4r + n5 + e3 + e4 + e5 + o4 + a1r + a2r + a3r + a4r + a5r + a6r + c3r + c5r + c6r.
Borderline PD = n1 + n2 + n3 + n5 + n6 + o3 + o4 + a4r + c6r.
Histrionic PD = n4r + n5 + e2 + e4 + e5 + e6 + o1 + o3 + o4 + a1 + c5r + c6r.
Narcissistic PD = n2 + n4r + e1r + e3 + e5 + o3r + o4 + a1r + a2r + a3r + a4r + a5r + a6r.
Avoidant PD = n1 + n4 + n5r + n6 + e2r + e3r + e5r + e6r + o4r + a5.
Dependent PD = n1 + n4 + n6 + e3r + a1 + a4 + a5.
OCPD = n1 + n5r + e5r + o3r + o4r + o5r + o6r + c1 + c2 + c3 + c4 + c5 + c6.

NOTE: r indicates that this facet should be reverse scored before summing it into the count.
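As a rough illustration of how the counts above might be computed, the following Python sketch sums the listed facets for each PD. It assumes NEO PI-R facet raw scores that range from 0 to 32, so that a reverse-scored facet equals 32 minus the raw score. The dictionary layout and the names `reverse_score` and `ffm_pd_count` are hypothetical and are not part of the authors' scoring programs.

```python
# Facet lists transcribed from the appendix; a trailing "r" marks a facet
# that is reverse scored before it is added to the count.
FFM_PD_FORMULAS = {
    "paranoid": "n2 e1r e2r o4r o6r a1r a2r a3r a4r a6r",
    "schizoid": "e1r e2r e3r e4r e5r e6r o3r o4r",
    "schizotypal": "n1 n4 e1r e2r e6r o5 c2r",
    "antisocial": "n1r n2 n4r n5 e3 e4 e5 o4 a1r a2r a3r a4r a5r a6r c3r c5r c6r",
    "borderline": "n1 n2 n3 n5 n6 o3 o4 a4r c6r",
    "histrionic": "n4r n5 e2 e4 e5 e6 o1 o3 o4 a1 c5r c6r",
    "narcissistic": "n2 n4r e1r e3 e5 o3r o4 a1r a2r a3r a4r a5r a6r",
    "avoidant": "n1 n4 n5r n6 e2r e3r e5r e6r o4r a5",
    "dependent": "n1 n4 n6 e3r a1 a4 a5",
    "ocpd": "n1 n5r e5r o3r o4r o5r o6r c1 c2 c3 c4 c5 c6",
}

MAX_FACET_SCORE = 32  # assumption: facet raw scores range from 0 to 32


def reverse_score(raw):
    """Reverse a facet raw score on the assumed 0 to 32 scale."""
    if not 0 <= raw <= MAX_FACET_SCORE:
        raise ValueError(f"facet score out of range: {raw}")
    return MAX_FACET_SCORE - raw


def ffm_pd_count(facets, disorder):
    """Sum the facets listed for one PD, reversing those marked with 'r'."""
    total = 0
    for term in FFM_PD_FORMULAS[disorder].split():
        if term.endswith("r"):
            total += reverse_score(facets[term[:-1]])
        else:
            total += facets[term]
    return total


if __name__ == "__main__":
    # Hypothetical respondent who scores 16 on every one of the 30 facets.
    facets = {f"{domain}{i}": 16 for domain in "neoac" for i in range(1, 7)}
    for name in FFM_PD_FORMULAS:
        print(name, ffm_pd_count(facets, name))
```

Keeping the formulas as data rather than as ten separate functions makes it easy to check each entry against the appendix line by line.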
For example, a Trust score (a1) of 31 for antisocial PD would be scored a 1 for the count.

 0 = 32    11 = 21    22 = 10
 1 = 31    12 = 20    23 = 9
 2 = 30    13 = 19    24 = 8
 3 = 29    14 = 18    25 = 7
 4 = 28    15 = 17    26 = 6
 5 = 27    16 = 16    27 = 5
 6 = 26    17 = 15    28 = 4
 7 = 25    18 = 14    29 = 3
 8 = 24    19 = 13    30 = 2
 9 = 23    20 = 12    31 = 1
10 = 22    21 = 11    32 = 0

REFERENCES

American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
Axelrod, S. T., Widiger, T., Trull, T., & Corbitt, E. (1997). Relations of Five-Factor Model antagonism facets with personality disorder symptomatology. Journal of Personality Assessment, 69, 297-313.
Bagby, R. M., Costa, P. T., Widiger, T. A., Ryder, A. G., & Marshall, M. (2005). DSM-IV personality disorders and the Five-Factor Model of personality: A multi-method examination of domain and facet-level predictions. European Journal of Personality, 19, 1-18.
Bagby, R. M., Schuller, D. R., Marshall, M. B., & Ryder, A. G. (2004). Depressive personality disorder: Rates of comorbidity with personality disorders and relations to the Five-Factor Model of personality. Journal of Personality Disorders, 18, 542-554.
Ball, S. A., Tennen, H., Poling, J. C., Kranzler, H. R., & Rounsaville, B. J. (1997). Personality, temperament, and character dimensions and the DSM-IV personality disorders in substance abusers. Journal of Abnormal Psychology, 106, 545-553.
Blais, M. A. (1997). Clinician ratings of the Five-Factor Model of personality and the DSM-IV personality disorders. The Journal of Nervous and Mental Disease, 185, 388-394.
Bornstein, R. F., & Cecero, J. J. (2000). Deconstructing dependency in a five-factor world: A meta-analytic review. Journal of Personality Assessment, 74, 324-343.
Cacciola, J. S., Rutherford, M. J., Alterman, A. I., McKay, J. R., & Mulvaney, F. D. (1998). Long-term test-retest reliability of personality disorder diagnoses in opiate dependent patients.
Journal of Personality Disorders, 12, 332-337.
Carey, K. B. (1994). Use of the Structured Clinical Interview for DSM-III-R Personality Questionnaire in the presence of severe Axis I disorders: A cautionary note. Journal of Nervous and Mental Disease, 182, 669-671.
Clark, L. A. (1993). Manual for the Schedule for Nonadaptive and Adaptive Personality (SNAP). Minneapolis: University of Minnesota Press.
Clark, L. A., Livesley, W. J., & Morey, L. (1997). The challenge of construct validity. Journal of Personality Disorders, 11, 205-231.
Cloninger, C. R., Svrakic, D. M., Bayon, C., & Przybeck, T. R. (1999). Measurement of psychopathology as variants of personality. In C. R. Cloninger (Ed.), Personality and psychopathology (pp. 33-65). Washington, DC: American Psychiatric Association.
Cloninger, C. R., Svrakic, D. M., & Przybeck, T. R. (1993). A psychobiological model of temperament and character. Archives of General Psychiatry, 50, 975-990.
Costa, P. T., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO-PI-R) and NEO Five-Factor Inventory (NEO-FFI) professional manual. Odessa, FL: Psychological Assessment Resources.
Costa, P. T., & Widiger, T. A. (1994). Personality disorders and the Five-Factor Model of personality. Washington, DC: American Psychological Association.
Costa, P. T., & Widiger, T. A. (2002). Personality disorders and the Five-Factor Model of personality (2nd ed.). Washington, DC: American Psychological Association.
Dyce, J. A., & O’Connor, B. P. (1998). Personality disorders and the Five-Factor Model: A test of facet-level predictions. Journal of Personality Disorders, 12, 31-45.
First, M. B., Gibbon, M., Spitzer, R. L., Williams, J. B. W., & Benjamin, L. S. (1997). Structured Clinical Interview for DSM-IV Axis II Personality Disorders Self-Report. Washington, DC: American Psychiatric Press.
First, M. B., Spitzer, R. L., Gibbon, M., & Williams, J. B. W. (1995).
Structured Clinical Interview for DSM-IV Axis I disorders–Patient edition, Version 2.0. New York: New York Biometrics Research Department.
Fossati, A., Maffei, C., Bagnato, M., Donati, D., Donini, M., Fiorilli, M., et al. (1998). Criterion validity of the Personality Diagnostic Questionnaire-4+ (PDQ-4+) in a mixed psychiatric sample. Journal of Personality Disorders, 12, 172-178.
Haigler, E. D., & Widiger, T. A. (2001). Experimental manipulation of NEO-PI-R items. Journal of Personality Assessment, 77, 339-358.
Harkness, A. R., McNulty, J. L., & Ben-Porath, Y. S. (1995). The personality psychopathology five (PSY-5): Constructs and MMPI-2 scales. Psychological Assessment, 7, 104-114.
Hsu, L. M. (2002). Diagnostic validity statistics and the MCMI-III. Psychological Assessment, 14, 410-422.
Huprich, S. K. (2003). Evaluating NEO Personality Inventory–Revised profiles in veterans with personality disorders. Journal of Personality Disorders, 17, 33-44.
Hyler, S. E. (1994). PDQ-4+ Personality Questionnaire. New York: Author.
Klonsky, E. D., Oltmanns, T. F., & Turkheimer, E. (2002). Informant-reports of personality disorder: Relation to self-reports and future research directions. Clinical Psychology: Science and Practice, 9, 300-311.
Livesley, W. J. (1986). Traits and behavioral prototypes of personality disorder. American Journal of Psychiatry, 143, 728-732.
Livesley, W. J. (1987). A systematic approach to the delineation of personality disorders. American Journal of Psychiatry, 144, 772-777.
Livesley, W. J. (2001). Conceptual and taxonomic issues. In W. J. Livesley (Ed.), Handbook of personality disorders (pp. 3-38). New York: Guilford.
Livesley, W. J., Jackson, D., & Schroeder, M. L. (1989). A study of the factorial structure of personality pathology. Journal of Personality Disorders, 3, 292-306.
Lynam, D. R., & Widiger, T. A. (2001). Using the Five-Factor Model to represent the DSM-IV personality disorders: An expert consensus approach.
Journal of Abnormal Psychology, 110, 401-412.
Miller, J. D., Bagby, R. M., & Pilkonis, P. A. (in press). A comparison of the validity of the Five-Factor Model (FFM) personality disorder prototypes using FFM self-report and interview measures. Psychological Assessment.
Miller, J. D., & Lynam, D. R. (2003). Psychopathy and the Five-Factor Model of personality: A replication and extension. Journal of Personality Assessment, 81, 168-178.
Miller, J. D., Lynam, D., Widiger, T., & Leukefeld, C. (2001). Personality disorders as an extreme variant of common personality dimensions: Can the Five Factor Model represent psychopathy? Journal of Personality, 69, 253-276.
Miller, J. D., Pilkonis, P. A., & Morse, J. Q. (2004). Five-Factor Model prototypes for personality disorders: The utility of self-reports and observer ratings. Assessment, 11, 127-138.
Miller, J. D., Reynolds, S. K., & Pilkonis, P. A. (2004). The validity of the Five-Factor Model prototypes for personality disorders in two clinical samples. Psychological Assessment, 16, 310-322.
Millon, T. (1994). MCMI-III manual. Minneapolis, MN: National Computer Systems.
Millon, T., Davis, R., & Millon, C. (1997). MCMI-III manual (2nd ed.). Minneapolis, MN: National Computer Systems.
Pfohl, B., Blum, N., & Zimmerman, M. (1997). Structured Interview for DSM-IV Personality. Washington, DC: American Psychiatric Press.
Pincus, A. L. (2002). Constellations of dependency within the Five-Factor Model of personality. In P. T. Costa, Jr., & T. A. Widiger (Eds.), Personality disorders and the Five-Factor Model of personality (pp. 203-214). Washington, DC: American Psychological Association.
Reynolds, S. K., & Clark, L. A. (2001). Predicting dimensions of personality disorder from domains and facets of the Five-Factor Model. Journal of Personality, 69, 199-222.
Saulsman, L. M., & Page, A. C. (2004). The Five-Factor Model and personality disorder empirical literature: A meta-analytic review. Clinical Psychology Review, 23, 1055-1085.
Siever, L. J., & Davis, K. L. (1991). A psychobiological perspective on the personality disorders. American Journal of Psychiatry, 148, 1647-1658.
Trull, T. J. (1992). DSM-III-R personality disorders and the Five-Factor Model of personality: An empirical comparison. Journal of Abnormal Psychology, 101, 553-560.
Trull, T. J., Widiger, T. A., Lynam, D. R., & Costa, P. T. (2003). Borderline personality disorder from the perspective of general personality functioning. Journal of Abnormal Psychology, 112, 193-202.
Westen, D., & Arkowitz-Westen, L. (1998). Limitations of Axis II in diagnosing personality pathology in clinical practice. American Journal of Psychiatry, 155, 1767-1771.
Widiger, T. A. (1993). The DSM-III-R categorical personality disorder diagnoses: A critique and an alternative. Psychological Inquiry, 4, 75-90.
Widiger, T. A., Trull, T. J., Clarkin, J. F., Sanderson, C. J., & Costa, P. T. (1994). A description of the DSM-III-R and DSM-IV personality disorders with the Five-Factor Model of personality. In P. T. Costa, Jr., & T. A. Widiger (Eds.), Personality disorders and the Five-Factor Model of personality (pp. 41-56). Washington, DC: American Psychological Association.
Wilberg, T., Dammen, T., & Friis, S. (2000). Comparing Personality Diagnostic Questionnaire-4+ with Longitudinal, Expert, All Data (LEAD) standard diagnoses in a sample with a high prevalence of Axis I and Axis II disorders. Comprehensive Psychiatry, 41, 295-302.
Yang, J., McCrae, R. R., Costa, P. T., Yao, S., Dai, X., Cai, T., et al. (2000). The cross-cultural generalizability of Axis II constructs: An evaluation of two personality disorder assessment instruments in the People’s Republic of China. Journal of Personality Disorders, 14, 249-263.

Joshua D. Miller, Ph.D., received his degree in clinical psychology from the University of Kentucky and is currently an assistant professor of psychology at the University of Georgia.
His research focuses on the role of general personality traits in understanding personality psychopathology, such as the DSM personality disorders and psychopathy, as well as problematic, externalizing behaviors, such as antisocial behavior, substance use, risky sex, and aggression.

R. Michael Bagby, Ph.D., C. Psych, is a professor in the Department of Psychiatry at the University of Toronto, and is the director of the Clinical Research Department, as well as the codirector of the Psychological Assessment Service at the Centre for Addiction and Mental Health. He has a wide range of clinical and research interests, including an active program of research in the assessment of malingering and socially desirable responding. Other interests include the relation between personality and depression, and the use of the Five Factor Model of personality in the assessment of personality pathology.

Paul A. Pilkonis, Ph.D., is a professor of psychiatry and psychology in the Department of Psychiatry, Western Psychiatric Institute and Clinic, University of Pittsburgh School of Medicine. His primary interest is clinical research, both research on psychopathology, with a focus on the assessment and longitudinal course of personality disorders, and research on psychosocial treatments for personality and affective disorders.

Sarah K. Reynolds, Ph.D., is an assistant professor in the Department of Psychiatry, Western Psychiatric Institute and Clinic, University of Pittsburgh School of Medicine. Her primary research interest is in the assessment and treatment of personality disorder, with a focus on the development of psychosocial interventions for women with personality disorder and co-occurring medical problems.

Donald R. Lynam, Ph.D., received his degree in clinical psychology from the University of Wisconsin–Madison and is currently a professor of psychology at the University of Kentucky.
His primary research interests include developmental models of antisocial behavior, the role of individual differences in deviance, the early identification of chronic offenders, and psychopathy at the juvenile and adult levels.