TAPAS: Tailored Adaptive Personality Assessment System

TAILORED ADAPTIVE PERSONALITY ASSESSMENT SYSTEM (TAPAS)

Fritz Drasgow, Stephen Stark, and Sasha Chernyshenko
Drasgow Consulting Group

Introduction

Over the past 5 years, we have worked on developing a comprehensive assessment system, which we refer to as the Tailored Adaptive Personality Assessment System, or TAPAS. This work is supported by an SBIR grant from the Army (Dr. Len White, Contracting Officer’s Representative). In our approach, we have combined modern psychometric methods, computing technology, and research findings from the personnel selection and personality domains to create a system that is innovative not only in terms of how personality constructs are being measured (i.e., the psychometric underpinnings of TAPAS), but also what aspects of personality should be measured, at what level of generality, and for what purposes (i.e., the content of TAPAS).

In a nutshell, TAPAS is designed to be easily customizable to meet the assessment needs of virtually any civilian or military organization, both in terms of test content and test administration. Unlike previously available instruments, TAPAS allows users to choose:

1) the response format (items can be presented as single stimulus or two-alternative forced-choice);
2) the scale length (the user decides on the number of items per personality dimension);
3) the constructs/traits to be assessed (the user picks which traits to administer); and
4) the item presentation algorithm (static forms for everyone or adaptive item selection tailored to a specific examinee).

Background

TAPAS has its roots in the growing interest in temperament/personality as a predictor of job performance and other outcomes over the last fifteen years.
This increase has been caused by: legal and societal concerns about adverse impact associated with the use of intelligence test scores for selection and promotion; empirical evidence showing that temperament constructs provide incremental validity beyond general cognitive ability in predicting performance across a diverse array of civilian and military occupations (e.g., Barrick & Mount, 1991; Campbell & Knapp, 2001; Schmidt & Hunter, 1998); and the search for a means of predicting contextual performance, adaptability, and retention of employees, because cognitive predictors have little or no ability to predict these outcomes.

To illustrate, in 2005 the U.S. Defense Manpower Data Center convened a four-member Review Panel of experts in the areas of personnel selection and psychometrics to consider changes for the Armed Services Vocational Aptitude Battery (ASVAB). Representatives of the Services told the Review Panel that their leadership criticized the ASVAB because it provided little help with respect to: predicting adaptability to the Service; creating a culture with a strong commitment to the Service; enhancing disciplined initiative; fostering teamwork; improving problem solving skills; and promoting continuous learning. With the exception of problem solving, it is unlikely that any cognitive ability test battery can help.

Despite the clear need, noncognitive selection tools have been little used by the military. For example, in the early 1990s, Navy researchers Tom Trent and John Pass sought to implement a single statement personality measure called the ASAP, and Army researchers Len White and Mark Young had similar intentions for an instrument called the ABLE. An impressive set of studies showed that these measures predicted important behaviors (e.g., White, Nord, Mael, & Young, 1993; White, Young, & Rumsey, 2001).
Nonetheless, the Department of Defense Advisory Committee on Military Personnel Testing recommended against implementation because of concerns that single statement items were easily compromised by faking good (White et al., 1993).

In sum, there is a critical need for a well-designed personality assessment system capable of supporting the aforementioned personnel objectives. Developing such a system presents a formidable psychometric challenge. Not only must it be valid for the purposes listed above, but it must also resist socially desirable responding and, perhaps, be implemented in a way that minimizes applicants’ motivation to fake good.

Existing Personality Assessment Systems

Nearly all personality inventories available today have evolved from measures developed for research purposes. As a consequence, the majority of batteries consist of many scales having 10 or so single stimulus items (i.e., the items consist of statements like “I enjoy meeting new people”) developed and scored using classical test theory methods. Such scales, however, are more useful in research and counseling settings than for making important personnel decisions.

First, in high stakes testing situations, research shows that single statement temperament items can be easily faked; i.e., test takers can discern the correct or socially desirable answers and, thus, increase or decrease their scores to suit their personal needs (White & Young, 1998). This intentional distortion can severely undermine the utility of temperament measures. Second, currently used scales are not constructed to measure accurately across all levels of the trait continuum. Specifically, because classical test theory methods are used to evaluate and choose items during scale development, only those having moderately positive and moderately negative standing on the underlying trait continuum are retained; extreme and neutral items are discarded (Stark, Chernyshenko, & Drasgow, 2003, 2005).
This degrades the validity of the rank-order of high and low scoring individuals, who are often of primary interest in selection contexts. Finally, traditional temperament measures are inefficient and cumbersome to administer and maintain. They have rigid administration prescriptions in the sense that all items must be administered to every individual in a fixed order. This increases testing time and decreases test security through repeated item exposure. In addition, because organizations are often interested in different subsets of scales for different occupations, it would be better to have a flexible way of choosing the constructs assessed on particular occasions, an option not available in most current inventories.

There have been several attempts to develop inventories having items in a forced-choice rather than the single statement format (items in these “forced-choice” measures typically consist of four statements representing different dimensions, and respondents are asked to select “most like me” and “least like me” statements). This alternative format appears to be more resistant to response distortion, and thus may provide solutions to the faking problem (Jackson, Wrobleski, & Ashton, 2000). Yet, adopting such inventories brings a new set of psychometric challenges.

First, traditional scoring of forced-choice items produces ipsative or partially ipsative scores. This means that when a person’s score on a trait is high, it is high relative to that particular person’s score on other traits. Scores on ipsative measures cannot be interpreted normatively, i.e., when a person’s score is high, we do not know if it is high relative to other people. Thus, ipsative measurement raises concerns of between-person score comparability. Second, no formal psychometric model is usually specified, which makes it difficult to evaluate score precision or to anticipate the performance of newly constructed items.
Third, all test items must be administered to examinees, and even small changes in scale length or item composition compromise the comparability of scores across examinees or test administrations. Finally, to obviate ipsativity problems, it is usually recommended that a large number of scales be administered, which is time consuming and often impractical in applied contexts.

Tailored Adaptive Personality Assessment System (TAPAS)

How does TAPAS measure?

The TAPAS measurement approach is rooted in item response theory (IRT) and thus is similar to such well-known tests as the ASVAB or the Graduate Record Exam (GRE). For each TAPAS dimension, there is a pool of items that have been pre-calibrated using large representative samples of military recruits. In computerized settings, to increase test efficiency, items are selected adaptively based on an individual’s previous responses (a.k.a., adaptive item presentation). If computerized testing is unavailable, then items can be pre-assembled into scales and presented to examinees (a.k.a., static item presentation).

Unlike ASVAB and many personality measures, TAPAS is designed to be extremely flexible in its assessment approach. Instead of having a single response format for presenting items and a single psychometric model for item selection and scoring, TAPAS personality items can be administered in 4 response formats, each having its own computer adaptive item selection and scoring algorithms:

1) Single statement dichotomous response format (Agree/Disagree), administered and scored using the three-parameter logistic (3PL) model (Birnbaum, 1968) of item response theory. In this format, statements are presented one at a time and examinees are asked whether they agree or disagree with the statement. Examples of existing personality inventories using this format are the California Psychological Inventory (CPI) and the Hogan Personality Inventory (HPI). (Note that the CPI and HPI are currently scored using classical total score methods and, hence, don’t have adaptive item selection.)

2) Single statement polytomous response format (Strongly Disagree/Disagree/Agree/Strongly Agree), administered and scored using the SGR model (Samejima, 1968). In this format, statements are presented one at a time and examinees are asked to indicate the degree of agreement with the statement using a 4-point Likert scale. Examples of existing personality inventories using this format are the NEO-PI and Goldberg’s AB5C. (Note that the NEO-PI and AB5C are currently scored using classical total score methods and, hence, don’t have adaptive item selection. These inventories are also administered in a 5-point Likert format where the middle option is “Neutral.” TAPAS does not have a neutral option, because research [Hernández, Drasgow, & González-Romá, 2004] shows that middle options can be misinterpreted.)

3) Unidimensional pairwise preference response format (“Which of these two statements is more like you?”), administered and scored using the Zinnes-Griggs (1974) model and algorithms developed by Stark and Drasgow (2002) and Stark, Chernyshenko, and Drasgow (2006). In this format, statements representing the same personality dimension are presented in pairs. Examinees are asked to choose the one statement that is more descriptive of them. An example of an existing personality inventory using this format is NCAPS. (Note that NCAPS runs an earlier adaptive item selection algorithm developed by Stark and Drasgow (2002). The TAPAS algorithm, which was developed by Stark, Chernyshenko, and Drasgow in 2006, is different in terms of how pairs of statements are selected.)

4) Multidimensional pairwise preference response format (“Which of the two statements is more like you?”), administered and scored using the multidimensional pairwise preference (MDPP) model and algorithms developed by Stark (2002) and Stark, Chernyshenko, and Drasgow (2005). In this format, statements representing different personality dimensions are presented in pairs. Examinees are asked to choose the one statement that is more descriptive of them. This format is unique to TAPAS. Other multidimensional forced choice inventories typically are composed of items having 4 statements (a.k.a., tetrads), and respondents are asked to choose the statements that are the most and least like them. Examples of existing personality inventories using the tetrad format are the Assessment of Individual Motivation (AIM) and the Occupational Personality Questionnaire (OPQ). (Note that AIM and OPQ are scored using classical methods and don’t have adaptive item selection.)

Each of the four response formats has its own advantages. The choice of format depends largely on the goal of the assessment; the single statement format should probably be favored for counseling purposes, whereas the forced choice format is more suitable for personnel selection purposes. Our primary interest to date has been in the multidimensional pairwise preference (MDPP) format, because it shows the most promise in operational testing contexts where intentional response distortion is likely. Our SBIR research has shown that resistance to response distortion seems to be a function of how multidimensional pairs are formed. We have conducted a number of experiments that investigated the link between pair fakability and various item parameters (e.g., statement social desirability and location). In TAPAS, we can manipulate constraints on how MDPP items are created depending on the degree of the user’s concern about response distortion.

The adaptive testing format, regardless of the response format or psychometric model used, offers greater test efficiency than the static, fixed length format. In addition, adaptive testing has better test security, because each examinee receives what is essentially a unique parallel test form.
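To make the idea of adaptive item presentation concrete, the following is a minimal sketch of one common selection rule, maximum Fisher information, applied to the 3PL model named for the single statement dichotomous format. It is a generic illustration and not the TAPAS algorithm: the item pool, the item identifiers, the parameter values, and the function names are all hypothetical, and the logistic metric without the 1.7 scaling constant is assumed.

```python
import math


def p_3pl(theta, a, b, c):
    """Probability of an Agree response under the 3PL model.

    a = discrimination, b = location, c = lower asymptote.
    Assumes the plain logistic metric (no 1.7 scaling constant).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def info_3pl(theta, a, b, c):
    """Fisher information contributed by a 3PL item at trait level theta."""
    p = p_3pl(theta, a, b, c)
    q = 1.0 - p
    return a * a * (q / p) * ((p - c) / (1.0 - c)) ** 2


def select_next_item(theta_hat, pool, administered):
    """Pick the unadministered item that is most informative at theta_hat."""
    candidates = [item_id for item_id in pool if item_id not in administered]
    return max(candidates, key=lambda item_id: info_3pl(theta_hat, *pool[item_id]))


# Hypothetical pre-calibrated pool: item id -> (a, b, c).
pool = {
    "item_low": (1.2, -1.5, 0.0),
    "item_mid": (1.2, 0.0, 0.0),
    "item_high": (1.2, 1.5, 0.0),
}

first = select_next_item(0.1, pool, administered=set())
second = select_next_item(0.1, pool, administered={first})
```

In this sketch an examinee whose provisional trait estimate is near the middle of the scale first receives the item located closest to that estimate. An operational system would re-estimate the trait after every response and would typically add exposure control and content constraints, none of which are shown here.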
Nevertheless, using TAPAS and our existing item pool, we can create multiple static forms for any of the four response formats. An example of this is the TAPAS-static95 form that was created in collaboration with ARI and is currently being administered under the EEE Metrics effort of HumRRO and ARI. The TAPAS-static95 form measures 12 personality dimensions that were selected based on prior empirical research to predict attrition and training performance. There are a total of 95 fake-resistant item pairs that were selected based on Drasgow Group and ARI pre-testing research.

The computerized administration of TAPAS is now possible via the Drasgow Group Internet server. Computer adaptive testing algorithms have been implemented for the SGR, ZG, and MDPP models. We are currently working on the 3PL algorithm, which will underlie the single statement dichotomous response format. Our simulation studies involving the ZG and MDPP models for unidimensional and multidimensional pairwise preference formats have shown good recovery of trait scores with tests having 10-15 statements per personality dimension. In other words, to measure 10 personality dimensions accurately, one would need between 100 and 150 items.

What does TAPAS measure?

A comprehensive set of nonredundant narrow facets of fundamental personality traits constitutes the basic building blocks of TAPAS. Rather than adhering to some existing rational or theoretical nomenclature (e.g., NEO-PI or 16PF), our approach to developing the lower-order trait taxonomy was rooted in examining results of large scale empirical factor-analytic studies, conducted using subjects’ responses to a maximally diverse array of temperament indicators (e.g., adjectives, behavioral statements, or scales).

Two studies were utilized. The first study, by Saucier and Ostendorf (1999), examined the structure of 500 adjectives describing human behavior (e.g., assertive, talkative, anxious).
The second empirical study, conducted by the members of our research team together with researchers from several U.S. universities, focused on scales from 7 widely used personality inventories. By factor analyzing responses to scales contained in these seven personality measures, we were able to establish a shared overall hierarchical structure linking broader, general temperament traits and narrower facets.

A total of 22 lower-order facets were initially identified (3-6 facets per Big Five dimension). Within each broad Big Five domain, the lower-order facet structure was organized hierarchically. This is advantageous for applied purposes because the TAPAS system can report trait scores at any level of generality, ranging from 5 to 22 dimensions. Moreover, the availability of pattern loading matrices for each domain allowed us to identify empirical markers for nearly all 22 facets (in the form of adjectives or existing scales). These are important for future construct validity investigations. Finally, specific to military applications, we added the Physical Conditioning facet, which we placed on the Extraversion broad factor due to its high positive correlations with the Dominance and Energy facets.

Table 1 presents a summary of the current 23 facet TAPAS taxonomy. The table is organized into 5 broad clusters representing the Big Five (see column 1). Within these clusters, each row presents the TAPAS facet name (column 2) followed by examples of other scales assessing this facet (column 3) and a brief description of a typical high/low scorer (column 4). A detailed example of Conscientiousness related facets can be found in Roberts, Chernyshenko, Stark, and Goldberg (2005).

TAPAS can administer any of these 23 facet dimensions in any of the 4 response formats. If the goal is to provide counseling or developmental feedback, then a more comprehensive assessment is warranted and we would suggest selecting 15-23 facets.
In personnel selection contexts, the facet selection is determined mainly by the types of criteria being predicted.

Table 1. Lower Order Facet Taxonomy for TAPAS: Trait Names, Markers, and Descriptions

Columns: Broad Factor; TAPAS Facet; Key Existing Scale Markers; Brief Description

Extraversion

Dominance
Markers: AIM Leadership, cpi independence, cpi dominance, hpi leadership
Description: High scoring individuals are domineering, take charge, and are often described by their peers as "natural leaders".

Sociability
Markers: ab5c sociability, neo gregariousness, jpi social
Description: Describes an individual's level of interest in friendly social interactions.

Unrestraint
Markers: neo excitement seeking, hpi exhibitionistic, hpi entertaining
Description: Individuals scoring high on the Unrestraint facet engage in behaviors that attract a lot of social attention; they are loud, loquacious, entertaining, and even boastful.

Energy
Markers: jpi energy, neo activity, ABLE energy
Description: High scoring individuals have a lot of energy, can forego sleep without much detriment to performance, and are interested in physical activity.

Physical Condition
Markers: ABLE Physical Condition, AIM Physical Condition
Description: High scoring individuals routinely participate in vigorous sports or exercise and enjoy hard physical work.

Agreeableness

Warmth
Markers: ab5c warmth, 16pf warmth, neo warmth
Description: Individuals scoring high on this facet are affectionate, compassionate, sensitive, and caring.

Generosity
Markers: ab5c cooperation, neo modesty
Description: Individuals scoring high on this facet are generous with their time and resources, while individuals scoring low are egoistical, greedy, and snobbish.

Cooperation/Trust
Markers: neo trust, hpi no hostility, hpi trusting, ab5c pleasantness, hpi easy to live with
Description: Individuals scoring high on this facet are trusting, cordial, noncritical, and easy to live with, while those scoring low are skeptical, suspicious, and even combative.

Conscientiousness

Industriousness
Markers: AIM Work Orientation, NCAPS achievement, neo competence, neo achievement striving
Description: Individuals with high scores on this factor would be described as hard working, ambitious, confident, and resourceful.

Order
Markers: neo order, 16pf perfectionism, jpi organization, NCAPS Dependability
Description: Emphasizes the ability to organize tasks and activities and the desire to maintain neat and clean surroundings.

Self-control
Markers: ab5c cautiousness, neo deliberation, mpq self-control, NCAPS vigilance
Description: Individuals with high scores on Self-control tend to be cautious, levelheaded, patient, and able to delay gratification.

Responsibility
Markers: cpi responsibility, jpi responsibility, ABLE Nondelinquency
Description: Individuals with high Responsibility scores like to be of service to others, frequently contribute their time and money to community projects, and tend to be cooperative and dependable.

Traditionalism
Markers: mpq traditionalism, 16pf rule consciousness, ABLE Nondelinquency
Description: People with high scores on Traditionalism tend to comply with current rules, customs, norms, and expectations; they dislike changes and do not challenge authority.

Virtue
Markers: cpi good impression, hpi virtuous
Description: Virtue represents a constellation of beliefs and behaviors associated with adherence to standards of honesty, morality, and “good Samaritan” behavior.

Emotional Stability

No Anxiety
Markers: 16pf apprehensive, jpi anxiety, neo anxiety, hpi not anxious, mpq stress reaction
Description: Individuals scoring low on the No Anxiety facet are high strung, self-conscious, and apprehensive regardless of the type of situation they are dealing with.

Even Tempered
Markers: ab5c calmness, neo hostility, hpi even tempered, NCAPS Stress Tolerance
Description: Those scoring low on this facet have a tendency to experience a range of negative emotions including irritability, anger, hostility, or even aggression; those scoring high tend to be calm, even tempered, and stable.

Well-Being
Markers: neo depression, ab5c happiness, cpi well-being
Description: The aim is to assess an individual’s general emotional tone. The continuum here is despair and depression on the one end, and joy and well-being on the other.

Openness To Experience

Intellectual Efficiency
Markers: hpi good memory, hpi reading, cpi intellectual efficiency, ab5c intellect
Description: Individuals with high scores on this factor are able to process information quickly and would be described by others as knowledgeable, astute, and intellectual.

Ingenuity
Markers: ab5c Ingenuity, hpi generates ideas, ab5c competence, jpi innovation
Description: A prototypical individual scoring high on the Ingenuity facet is an inventor, a person who constantly strives to make improvements to the existing information or products.

Curiosity
Markers: 16pf sensitivity, hpi curiosity, hpi science ability, hpi thrill seeking
Description: Individuals with high scores on this facet would be characterized as inquisitive and perceptive; they read popular science/mechanics magazines and are interested in experimenting with objects and substances.

Aesthetics
Markers: neo aesthetics, ab5c reflection, mpq absorption, neo feelings, hpi culture
Description: Individuals scoring high genuinely enjoy acquiring, participating in, or creating various forms of artistic, musical, or architectural outputs.

Tolerance
Markers: cpi flexibility, neo values, cpi psychological mindedness, jpi tolerant
Description: Individuals scoring high on Tolerance like to attend cultural events or meet and befriend people with different views. They also tend to better adapt to novel situations.

Depth
Markers: 16pf abstractness, ab5c depth, ab5c introspection, jpi complexity
Description: High scoring individuals exhibit behaviors targeted toward understanding the meaning of one’s life and/or facilitating self-improvement and self-actualization.
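As an illustration of how the facet taxonomy in Table 1 and the four user choices listed in the Introduction (response format, scale length, facets, and item presentation) might be represented together, here is a minimal sketch. The class name, field names, and format labels are hypothetical and are not taken from the TAPAS software; the grouping of facets under the Big Five follows our reading of Table 1.

```python
from dataclasses import dataclass

# Facet names from Table 1, grouped by Big Five broad factor.
TAXONOMY = {
    "Extraversion": ["Dominance", "Sociability", "Unrestraint", "Energy",
                     "Physical Condition"],
    "Agreeableness": ["Warmth", "Generosity", "Cooperation/Trust"],
    "Conscientiousness": ["Industriousness", "Order", "Self-control",
                          "Responsibility", "Traditionalism", "Virtue"],
    "Emotional Stability": ["No Anxiety", "Even Tempered", "Well-Being"],
    "Openness To Experience": ["Intellectual Efficiency", "Ingenuity", "Curiosity",
                               "Aesthetics", "Tolerance", "Depth"],
}

# Hypothetical labels for the four response formats and two presentation modes.
RESPONSE_FORMATS = {"single_dichotomous", "single_polytomous",
                    "unidim_pairwise", "multidim_pairwise"}
PRESENTATIONS = {"static", "adaptive"}


@dataclass(frozen=True)
class AssessmentConfig:
    response_format: str
    items_per_facet: int
    facets: tuple
    presentation: str

    def validate(self):
        """Raise ValueError if any choice falls outside the documented options."""
        known = {facet for group in TAXONOMY.values() for facet in group}
        unknown = [facet for facet in self.facets if facet not in known]
        if unknown:
            raise ValueError(f"unknown facets: {unknown}")
        if self.response_format not in RESPONSE_FORMATS:
            raise ValueError(f"unknown response format: {self.response_format}")
        if self.presentation not in PRESENTATIONS:
            raise ValueError(f"unknown presentation: {self.presentation}")
        if self.items_per_facet < 1:
            raise ValueError("items_per_facet must be positive")

    def total_items(self):
        """Test length implied by the chosen facets and scale length."""
        return self.items_per_facet * len(self.facets)


# Example: a four-facet assessment with ten items per facet.
sales_config = AssessmentConfig(
    response_format="multidim_pairwise",
    items_per_facet=10,
    facets=("Industriousness", "Energy", "Sociability", "Well-Being"),
    presentation="adaptive",
)
sales_config.validate()
```

Keeping the taxonomy as plain data makes it straightforward to report scores at the facet level or to roll them up to the five broad factors, which is the flexibility the hierarchical structure is meant to provide.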
Validity of TAPAS personality dimensions

The primary purpose of TAPAS is to help organizations improve the utility of their selection systems as well as facilitate personal development. Selection systems, however, can be quite diverse and target a number of criteria, ranging from job performance (task proficiency and citizenship) to identifying leadership potential or decreasing attrition. Adding further to this complexity is the fact that personality/temperament measures used in the past are not easily comparable to each other and differ markedly in the content and breadth of constructs assessed. This leads to a lack of consensus among researchers and applied users about which specific personality facets to use for which criterion.

Our TAPAS validation research aims at overcoming this limitation by conducting a comprehensive meta-analysis of personality-criterion relationships in military contexts. To do that, we first mapped most existing scales and measures to the unified facet structure described above (i.e., the 23 facet TAPAS taxonomy organized under the Big Five). We used results of our factor analyses to empirically identify which scales from seven widely used personality inventories (i.e., 16PF, NEO, HPI, CPI, MPQ, AB5C, and JPI) had high loadings on which TAPAS facets. Once this initial set of markers for TAPAS facets was identified, we then examined the research literature to find other related scales.

The resulting tables are similar to the one presented in Table 2. In this table, we show known scale markers for the Industriousness facet of TAPAS. As can be seen in the table, four NEO-PI and four AB5C scales had high empirical loadings on the Industriousness facet (see columns 1 and 2). A number of other scales were also identified as measures of Industriousness, including the AIM Work Orientation scale, the NCAPS Achievement scale, and the MPQ Hard Work scale (see column 3 of Table 2).

Table 2. Scales Measuring TAPAS Industriousness Facet

Scale (column 1)              Loading (column 2)
neo competence                .88
neo achievement striving      .76
ab5c organization             .75
ab5c purposefulness           .67
neo self-discipline           .65
ab5c efficiency               .63
ab5c rationality              .50
neo dutifulness               .49

Other related scales (column 3):
PRF Achievement
PRF Endurance
ABLE/AIM Work Orientation (Achievement and Self Esteem composite)
Proactive Personality (Seibert et al., 1999)
CPI Achievement via Independence
Self-esteem (Rosenberg, 1965; see Atwater, 1999)
Achievement orientation (CCSQ composite)
TSDI Conscientiousness
PCI achievement
MPQ Hard Work
NCAPS Achievement
OPQ Achieving, Competitive
ABLE Work Orientation, ABLE Achievement

In the second step of the meta-analysis, we identified 42 unique empirical studies published between 1988 and 2006 that utilized a variety of personality/temperament scales to predict performance in military, police, or fire fighter occupations. We then coded a total of 1494 criterion-related validities reported in these studies for the 8 criteria most relevant to military selection contexts: task proficiency, contextual performance, counterproductivity, attrition, leadership, training performance, adaptability, and fitness level.

In Table 3, we present an example of a table summarizing the meta-analytic results for the Industriousness facet. Column 1 shows the 8 criterion variables, Column 2 gives the total sample size used to compute the observed validity coefficient, Columns 3 and 4 indicate the number of studies and the number of unique validity coefficients coded, Column 5 presents the observed meta-analytic estimate of the validity for the Industriousness facet, and Column 6 shows the validity after predictor and criterion measures were corrected for unreliability. (Note that when reliability values were unavailable, we assumed a conservative reliability of .80.)
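The correction for unreliability mentioned above is commonly carried out with the classical correction for attenuation, in which the observed correlation is divided by the square root of the product of the predictor and criterion reliabilities. The text does not state the exact procedure used in the meta-analysis, so the sketch below is an assumption for illustration only; the function name is hypothetical, and the .80 defaults mirror the conservative value assumed when reliabilities were unavailable.

```python
import math


def correct_for_unreliability(r_observed, rel_predictor=0.80, rel_criterion=0.80):
    """Classical correction for attenuation.

    Divides the observed validity by sqrt(rel_predictor * rel_criterion).
    The defaults of .80 are the fallback reliabilities described in the text;
    whether the meta-analysis used exactly this formula is an assumption.
    """
    for rel in (rel_predictor, rel_criterion):
        if not 0.0 < rel <= 1.0:
            raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(rel_predictor * rel_criterion)


# Illustrative value: an observed validity of .21 with both reliabilities at .80.
corrected = correct_for_unreliability(0.21)
```

With both reliabilities set to .80 the divisor is .80, so an observed validity of .21 becomes roughly .26. Coefficients corrected with actual reported reliabilities would differ from what the default values produce.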
As can be seen in Table 3, the Industriousness facet is most predictive of Contextual Performance (a.k.a., personal initiative), followed by Physical Fitness, Adaptability, Leadership, and Counterproductivity. Together with similar validity tables for the other 22 TAPAS facets, this information offers much needed guidance for applied researchers and policymakers in terms of which personality predictors they may wish to consider.

Table 3. Meta-analytic Validity Estimates for TAPAS Industriousness Facet

Criterion                   N      kd   kc   Observed Validity   Corrected Validity
Job/Task Performance        38964  14   36    .05                 .06
Contextual Performance      19423   9   18    .21                 .26
Counterproductivity         17673   8   17   -.14                -.18
Attrition                   17912   5    8   -.09                -.10
Leadership                   9429  12   20    .15                 .18
Training Performance         6156   8   27    .14                 .17
Adaptability                 1291   3    4    .17                 .21
Physical Fitness            18044   5   17    .18                 .23

Note that similar meta-analytic tables are available for civilian occupations. These tables currently are based on 4755 validity coefficients sorted into TAPAS facets and the 8 criteria.

Summary

We believe that TAPAS represents the state-of-the-art in personality assessment because it uses advanced technology, innovative psychometric theory, a comprehensive analysis of personality facets, and the latest meta-analytic findings concerning the validity of personality dimensions. The result is a web-based, adaptive assessment tool, customizable in terms of the facets assessed and the number of items administered per facet, that provides precise measurement of the facets identified in a comprehensive analysis of the latent structure of personality.

Of course, it is impossible to “get something for nothing.” TAPAS requires 10 to 15 items per dimension in order to produce highly accurate trait scores. Because it is adaptive, this number of items is equivalent to roughly 20 to 25 items per trait administered in a conventional static assessment.
In addition to adaptivity, TAPAS uses IRT scoring, which can either increase the precision of an assessment tool for a given test length or reduce test length for a given level of precision. Thus, for a specified level of measurement precision (i.e., a given reliability level or standard error of measurement), TAPAS provides the shortest scale length possible.

We suspect that users will find TAPAS’s customization feature very useful. Instead of requiring every individual to complete every item for a lengthy list of facets, users can select the set they desire. For example, applicants for jobs in sales might complete the Industriousness, Energy, Sociability, and Well-Being scales. With ten items per facet, applicants would be able to complete a single statement assessment in perhaps 5 minutes and a forced choice assessment in perhaps 10 minutes. Thus, with just a few minutes of applicants’ time, an employer would be able to substantially increase revenue.

In sum, users can consult our meta-analytic results to identify the traits that drive success for their jobs. They can then instruct TAPAS to assess just these dimensions, resulting in a highly efficient assessment process. Moreover, because this process is evidence-based, users can be very confident that the scores of job applicants will be strongly related to job performance.

References

Barrick, M. R., & Mount, M. K. (1991). The Big Five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44, 1-26.

Campbell, J. P., & Knapp, D. J. (2001). Exploring the limits in personnel selection and classification. Mahwah, NJ: Lawrence Erlbaum Associates.

Costa, P. T., Jr., McCrae, R. R., & Dye, D. A. (1991). Facet scales for agreeableness and conscientiousness: A revision of the NEO Personality Inventory. Personality and Individual Differences, 12, 887-898.

Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4, 272-299.

Hernández, A., Drasgow, F., & González-Romá, V. (2004). Investigating the functioning of a middle category by means of a mixed-measurement model. Journal of Applied Psychology, 89, 687-699.

Hofstee, W. K., de Raad, B., & Goldberg, L. R. (1992). Integration of the Big Five and circumplex approaches to trait structure. Journal of Personality and Social Psychology, 63, 146-163.

Hough, L. M., & Ones, D. S. (2002). The structure, measurement, validity, and use of personality variables in industrial, work, and organizational psychology. In N. Anderson, D. S. Ones, H. K. Sinangil, & C. Viswesvaran (Eds.), Handbook of industrial, work and organizational psychology, Vol. 1 (pp. 233-277). Sage Publications.

Jackson, D. N., Wrobleski, V. R., & Ashton, M. C. (2000). The impact of faking on employment tests: Does forced-choice offer a solution? Human Performance, 13, 371-388.

Saucier, G., & Ostendorf, F. (1999). Hierarchical subcomponents of the Big Five personality factors: A cross-language replication. Journal of Personality and Social Psychology, 76, 613-627.

Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262-274.

Stark, S., & Chernyshenko, O. S. (in review). An examination of linking procedures for multidimensional pairwise preference items.

Stark, S., Chernyshenko, O. S., & Drasgow, F. (2003, November). A new approach to constructing and scoring fake-resistant personality measures. Paper presented at the 45th annual conference of the International Military Testing Association, Pensacola, FL.

Stark, S., Chernyshenko, O. S., & Drasgow, F. (2005). An IRT approach to constructing and scoring pairwise preference items involving stimuli on different dimensions: An application to the problem of faking in personality assessment. Applied Psychological Measurement, 29, 184-201.

White, L. A., Nord, R. D., Mael, F. A., & Young, M. C. (1993). The Assessment of Background and Life Experiences (ABLE). In T. Trent & J. H. Laurence (Eds.), Adaptability screening for the Armed Forces. Washington, DC: Office of the Assistant Secretary of Defense (Force Management and Personnel).

White, L. A., & Young, M. C. (1998). Development and validation of the Assessment of Individual Motivation (AIM). Paper presented at the Annual Meeting of the American Psychological Association, San Francisco, CA.

White, L. A., Young, M. C., & Rumsey, M. G. (2001). ABLE implementation issues and related research. In J. P. Campbell & D. J. Knapp (Eds.), Exploring the limits in personnel selection and classification (pp. 525-558). Mahwah, NJ: Erlbaum.

Zinnes, J. L., & Griggs, R. A. (1974). Probabilistic, multidimensional unfolding analysis. Psychometrika, 39, 327-350.