1 Chapter 1 OVERVIEW In a company there are many job positions, which typically range from very simple to extremely complex. All of these positions need to be well understood in order to hire employees, evaluate their performance, train them, and set pay rates, among other human resource functions. In order to do this effectively it is necessary for one to know what types of duties and tasks are required for the positions. In order to find out what duties and tasks are related to a job position a job analysis is done. A job analysis is a technique used in order to measure the differing aspects of a job (Kaplan & Saccuzzo, 2005). The data obtained from a job analysis allows one to understand the frequency and importance of the duties and tasks involved in the job, which aides in the process of developing human resource and performance management tools. After job analysis ratings are collected they must be analyzed statistically in order to identify critical tasks and provide a summary of the job. This is typically done with basic descriptive statistics, but recently Item Response Theory (IRT) has been explored as a method for scaling tasks. While polytomous IRT models exist that can be used when analyzing task analysis data, the data are sometimes dichotomized and analyzed with binary IRT models (e.g., Harvey, 2003). This dichotomization is accomplished by recoding the ratings from three or more data points down to two. The question that this study answers is whether transforming the data in this manner causes a loss of information. This study explores this 2 question by comparing dichotomous to polytomous treatments of the job analysis ratings from officers (N = 544) in the corrections profession. 3 Chapter 2 JOB ANALYSIS Why is a Job Analysis Done? In order to understand why a job analysis is done one might benefit from first knowing what a job is. A job position is better understood when it is broken into its component parts. The job can be broken down into “units” such as “duties, tasks, activities, or elements” (Brannick, Levine, & Morgeson, 2007, p. 8). Often in a job analysis the position is broken down into smaller parts so the company and the person in the position know every aspect of the job. The breakdown of the units might be easiest to describe from smallest to largest. First, the smallest unit is the element. An element is something that has “a clear beginning, middle, and end” (Brannick et al., 2007, p. 6). Brannick et al. (2007) give dialing a phone as an example of an element. This is something that has a clear beginning, middle, and end. Second, an activity is considered a cluster “of elements directed at fulfilling a work requirement” (Brannick et al., 2007, p. 7). Third, a task is a group “of activities” (Brannick et al., 2007, p. 7). Brannick et al. (2007) give talking “to conflicting parties to settle disturbances” as an example of a task (p. 7). One is required to do several activities in order to complete this task. Finally, a duty is “a collection of task all directed at general goals of a job” (Brannick et al., 2007, p. 7). One would most likely use activities or tasks when creating a job analysis questionnaire. The job analysis helps a company understand the importance and frequency of each of the tasks that are involved in a job. 4 A job analysis is carried out in order to better understand a job position. Brannick et al. (2007) describe a job analysis as the “discovery of the nature of a job” (p. 7). After a job analysis is done a company is better able to know the exact tasks, duties, and many of the knowledge, skills, abilities, and other personal characteristics (KSAOs) that are involved in a job position. Also, a company gains a broader knowledge of which tasks and duties are most critical. When hunting for job applicants a company is able to use a job analysis in order to tell whether the potential employee will be able to fulfill the duties expected of them. Also, the potential hire is able to know if they meet the minimum qualifications for the job position and they are able to know whether the job position sounds like a position of interest. Beyond giving a solid job description, a job analysis benefits a company in the following areas: recruitment and selection; criterion development and training; and performance appraisals and job evaluations (Prien, Goodstein, Goodstein, & Gamble, 2009). The job analysis also helps a company avoid “exposure to litigation based on allegations of discriminatory hiring” (Prien et al., 2009, p. 19). In the recruitment and selection process the recruiter, as well as the individual seeking employment, needs to have an in depth knowledge about what tasks are related to a job (Prien et al., 2009). By having an extensive knowledge base about the job the recruiter is able to choose the best candidate. Another benefit of a job analysis in the recruitment and selection process is that the recruiter and the company are able to know any changes a job has undergone over time (Prien et al., 2009), which enables a company to focus on selecting candidates who have the most relevant qualifications. 5 Most job analyses are centered on the selection process (Prien et al. 2009). It is highly important that a company chooses proficient candidates for the job positions. Therefore, one can see the value of a job analysis in the selection process, because a company’s success or failure depends on whether the right candidates are selected. By knowing the essential functions of a job one is able to select the best qualified candidates for the job. Many times a job analysis is done on a job which exists and the job has people in the position. As was mentioned above, there are many aspects of the job which are deemed to be crucial. When the tasks and duties are found to be crucial it often becomes apparent that certain individuals may not be sufficiently trained (Prien et al., 2009). Those who do not fit the qualifications may need to be further educated in order to increase their productivity in a job. After identifying the criticality of the tasks a company is better equipped in the training process. There are many companies which understand that candidates will not have certain qualifications when entering a job. A job analysis will help a company understand which qualifications a candidate needs to have when entering the job and which qualifications can be taught on the job. A job analysis is also used to set a pay range for the job and it gives one a reference for job evaluations. In setting a pay range a job analysis identifies the amount of education, training, and experience that are needed to enter a job and the complexity or degree of difficulty of a job. Job evaluations are also aided by the job analysis. The job analysis identifies what are known as critical tasks, which are the most important tasks 6 involved in a job (explained in greater detail below). These tasks can help guide a company in creating a job evaluation (Prien et al., 2009). If individuals do well in the critical tasks areas they are often given raises or promotions. If they are doing poorly they often do not receive a raise or promotion and it may be determined that they require additional training. By doing a job analysis a company has an outline of what is expected of an individual in a specified position. If the company does not use what is defined as being part of the job to determine whether an individual should be hired, promoted, etc., they may face legal problems. Basically, the job analysis helps companies avoid litigation. By using a job analysis when one is hiring, giving pay raises, giving promotions, etc., a company is basing their decisions on qualifications and performance and not the individual’s race, sex, or any other factor that is unrelated to the job. If they do this they are complying with the law, which states that individuals can not be discriminated against and they have protections which allow them to receive equal opportunities for employment and equal opportunities for pay (e.g., Equal Employment Opportunity Commission). By having an outline of a job a company has a way of ensuring that their hiring and pay structure is performance based (Prien et al., 2009). When testing for a job position it is very important that the test coincides with the job analysis. If one is testing for things that are not related to the job they run the risk of violating the law. Bemis, Belenky, and Soder (1983) give an example of a landmark case (U.S. v. State of New York) which helped to define the importance of linking performance and job tasks. The case emphasized that the Job Element approach (the individual having 7 the characteristics/experience to be in a job) was not a sufficient approach on which to base decisions. Decision making (e.g., raises, promotions, etc.) had to be based on the performance of the individual in job related tasks. How is a Job Analysis Done? There are a number of sources used in order to complete a job analysis. Some of the main sources that are used include past job analyses, the Occupational Information Network (O*NET), and subject matter experts (SMEs). Many times a past job analysis has already been carried out by a company. As companies grow or change the job analysis becomes outdated, because the job position requires a different set of KSAs, or because job duties may have changed due to advances in technology. However, the past job analysis can often be used in order to guide an individual in creating questions or task statements that are rated by SMEs. The O*NET is another source which provides good, basic, information on many different job positions. Before the O*NET came into existence the Dictionary of Occupational Titles (DOT; United States Department of Labor, 1939) was used. The DOT was originally put together by the United States Employment Service (USES) (Guion, 1998). The DOT was put together by conducting observations and interviews of individuals in many different job positions. The information obtained from the observations and interviews were transformed into descriptions of the job positions (Guion, 1998). The information in the DOT centered on the tasks that the individual needed to do on the job. This was not enough to describe a job position. Also, the 8 descriptions of jobs in the DOT tended to be outdated in a short amount of time (Brannick et al., 2007). Due to the problems encountered with the DOT it was necessary to create a better source describing job positions. As technology grew, and the internet came into existence, the O*NET was created so that individuals could know what was involved in a job position at a faster rate and could have the most up-to-date information available. The contents of the O*NET can be understood by looking at the O*NET content model. The model contains the following information: 1. Worker requirements: basic skills, cross-functional skills, knowledges, and education. 2. Experience requirements: training, experience, and licensure. 3. Worker characteristics: abilities, occupational values and interests, and work styles. 4. Occupational requirements: generalized work activities, work context, and organizational context. 5. Occupation-specific requirements: occupational knowledges; occupational skills; tasks; and duties; machines, tools and equipment. 6. Occupation characteristics: labor market information; occupational outlook; and wages (Mumford & Peterson, 1999). Basically, the O*NET centers around the KSAOs that are required in order to complete the tasks and duties of a job. If a person is knowledgeable about things involved in a job they are able to navigate the technical aspects of the job (Brannick et al., 2007). If an individual possesses the skills required for the position they are capable of performing the tasks of the job. Ability involves the physical and mental aspects of the 9 individual (Brannick et al., 2007). In other words, a qualified person is physically (e.g., lift 100 pounds) and mentally (e.g., deal with multiple job stressors) able to deal with the tasks required of them when they are in a job position. Past job analyses and the O*NET are often used to gain an early understanding of the job, which is generally followed up with an interview with SMEs to understand the details of a job and to guide an individual when they are creating the tasks statements, which are compiled into a questionnaire. These questionnaires are used to rate the frequency and importance of each task that is involved in a job. Methods of Collecting Data for Creating Job Analysis Questionnaires There are several methods one can use in order to create a job analysis questionnaire. Wei and Salvendy (2004) discuss the following methods for collecting data: “observations and interviews”, “process tracing”, and “conceptual techniques” (p. 276, italics in original). They explain that the observations and interviews approach is the most direct approach. One is directly observing the individuals who are in the job and asking them questions based on information gathered about the job. Wei and Salvendy (2004) warn that this approach is sometimes “unwieldy and difficult to interpret” (p. 276). Ultimately, data are interpreted based on numbers. When interviewing an individual one is asking questions about the job and not assigning numbers (e.g., level of importance) to the questions. Therefore, it is difficult to rate the questions because they are not initially assigned a number. Even though there are certain interpretation problems with interviewing, this method may be preferred over other methods. In fact, Prien et al. (2009) emphasizes the 10 benefit of an interview over a self-report. When an individual is interviewed they are often interviewed by a trained professional. However, Prien et al. (2009) explains that when individuals are told to write a description about their job there is a greater probability that they will inflate the importance of the tasks involved in their job. Similar to interviewing, the process tracing technique is also verbally based. However, with process tracing one is looking at a specified group of tasks. In a job analysis there may be certain tasks which require a more in depth analysis. By using process tracing one is able to get detailed information about those tasks (Wei & Salvendy, 2004). Again, due to the fact that the data are verbal it is hard to interpret. However, it is a useful way to obtain information on tasks and duties, which require a greater amount of detail. With conceptual techniques one is focused on “domain concepts” (Wei & Salvendy, 2004, p. 276). The benefit of conceptual techniques is that it asks about specific tasks and can be rated (e.g., questionnaire; Wei & Salvendy, 2004). A drawback is that it does not provide the detail which is often obtained through an interview. The questionnaire is a highly used conceptual technique. Prien et al. (2009) discusses two types of questionnaires which are used. First, one can use a “custom-designed questionnaire” (Prien et al., 2009, p. 33). The questionnaire is specifically designed for the job being analyzed. One could also use a more general questionnaire, “the commercially available questionnaire”, to analyze the job (Prien et al., 2009, p. 34). Prien et al. (2009) explains that the drawback to the commercially available questionnaire is that it is often created for a broad range of jobs. Therefore, there may be questions which 11 are not directly applicable to the job. Also, it may be missing questions that should be asked. The questions are rated by SMEs, who are individuals that have an in-depth knowledge of the functions and KSAOs involved in a job position. For instance, a police officer or a lieutenant would be a SME for a police officer position. They would be familiar with the position because they are currently in the position or they have been in the position in the past. However, one does not have to be in the job or have been in a job in order to be a SME. One simply needs to have a strong knowledge of a job position. As was mentioned above, past job analyses and the O*NET give individuals who are creating the questionnaire a guide for making the tasks statements. The O*NET gives a general guide and past job analyses gives a more detailed guide. Even though one might gain a lot of insight into the job based on these sources, these sources alone may not be sufficient to create a detailed questionnaire based on the position in its current form. One of the best sources for creating a questionnaire is the current SMEs who occupy the position of interest. One can observe (if circumstances allow) the SMEs in their work environment and ask them questions either on the job or in an interview. The interview questions can be made by using the O*NET or a past job analysis. However, since jobs tend to evolve over time one might come up with some questions while doing the interview. In other words, one might notice that certain questions need to be asked that may not have been thought of based on the past job analysis or based on the information obtained on the O*NET. Brannick et al., (2007) explain that you can use paper and pencil or a video or audio recorder to write out or tape the interview. They 12 further explain that if you are going to tape the interview it would be a good idea to ask the person first, since recording devices tend to make people nervous. After the information is obtained from these sources it is then turned into a job analysis questionnaire. When creating a job analysis questionnaire it is important to collect demographic information (e.g., age, gender, ethnicity, job position, etc.). This information is very useful when analyzing certain aspects of the data. The questionnaire should also be divided into several subcategories in order to better organize the different duties of the job. For instance, a task that is clerical would not go in a security category. Questionnaires are often lengthy and organizing the tasks into groups helps to clarify what is being asked and helps to avoid confusion. As has been mentioned before, the task statements are rated on the level of importance and frequency. For instance, the questionnaire can ask, on a Likert type scale, the level of importance (0 = not applicable to job, 1 = not very important, 2 = somewhat important, 3 = important, 4 = very important) and the frequency (0 = not applicable to job, 1 = infrequent, 2 = somewhat frequent, 3 = frequent, 4 = very frequent). There are several types of rating scales that one can choose from. The rating scales can be made based on the type of job that one is asking about. Also, the number of data points for the scale can vary. For instance, a rating scale used by Baranowski and Anderson (2005) gives the following prompt “How important is this knowledge, skill, or ability for performing this work behavior?” (p. 1044). The prompt is followed by a five point rating scale: “1 = Not important”, “2 = Slightly important”, “3 = Moderately 13 important”, “4 = Very important”, and “5 = Extremely important” (Baranowski & Anderson, 2005, p. 1044). However, the amount of ratings do not have to be limited to five points. Typically, one would see ratings between three and seven points. One could also use a two point rating scale if he or she is simply interested in knowing whether the tasks are relevant to the job (e.g., has some degree of frequency and/or importance). Harvey (1991) gives a couple of common examples of scales that are used in a job analysis. In the first example the SME indicates that the tasks are part of the job prior to filling out the scale. When filling out the scale, the SME gives an estimate of time given to each task. The scale has eight ratings, which range from “0 = I spend no time on this task” to “ 7 = I spend a very large amount of time on this task as compared with most other tasks I perform.” (Harvey, 1991, p. 90). The second example gives a scale which tells the reader to consider importance, frequency, and difficulty when rating the task statements and it explains that not all the statements will apply to the job. The ratings range from 0 = definitely not part of my job; I never do it” to “7 = Of most significance to my job” (Harvey, 1991, p. 91). A significant difference on the ratings is that the only ratings that are defined are 0, 1, 4, and 7. In other words, on the previous scale 1, 2, 3, 4, 5, 6, and 7 are defined, while this scale only has definitions for 0, 1, 4, and 7. On this scale, the SME may decide that the task does not quite fit under the 4 definition, therefore, they may choose to rate the task as an undefined 3. The third example gives a frequency scale with time related ratings. The ratings range from “ 0 = I do not perform this task on my current job” and “1 = about once every year” to “7 = about once each hour or more often.” (Gael, 1983; as cited in Harvey, 1991, 14 p. 92). The fourth example gives a scale that is known as a Behaviorally Anchored Rating Scale (BARS). This scale is rated on a continuum and the data points are defined in sentence form (Campbell, Dunnette, Arvey, & Hellervik, 1973; as cited in Harvey 1991). Finally, the fifth example shows the Job Element Inventory (JEI), which, as the name implies, breaks the job down into elements. The inventory uses a simple five point rating scale and tells the person to fill in the rating that best fits with the element (Cornelius & Hakel, 1978; as cited in Harvey, 1991). As can be seen ratings can take many forms. Harvey (1991) explains that using a limited number of rating scales may be beneficial to one’s analysis. He also explains that if two scales are highly correlated then one of the scales can be dropped from the analysis. Harvey (1991) explains that when the rating scales appear to ask similar questions, and the correlation is high, one is often asking redundant information. In deciding on how many data points to use one might consider a study done by Wilson and Harvey (1990). In their study they compared the difference between relative time spent (RTS) scales and simply asking whether the tasks were or were not part of the job. They found a correlation of .90 between the two types of ratings. In other words, dichotomously and polytomously scored variables gave approximately the same amount of information. Also, a study done by Hernandez, Drasgow, and Gonzalez-Roma (2004) found that when rating personality many individuals do not use the “?, not sure, or undecided” category (p. 687). They explain that there is a commonly held belief that having a middle category is beneficial because it does not force the individual to answer a question that they are actually unsure of (Hernandez et al., 2004). Hernandez et al. (2004) 15 found that those with certain personality characteristics (e.g., reserved) tended to use the middle category more often. Based on their findings they recommended that the middle category should be used with caution, because it may lead to interpretation problems. How are the Task Inventory Data Typically Analyzed? There are several ways in which the task inventory data are analyzed. The descriptive statistics may tell you something about how much the individuals agree on the task statements. Brannick et al. (2007) emphasizes that at a minimum one should report descriptive statistics such as the “mean, standard deviation, and N” (p. 273). The mean of each task statement tells you how the average rater scored the statement. The overall mean of the task statements tells you how the average rater scored all of the items. The standard deviation of the task statements tells you how much the ratings differed in the statements. A large standard deviation would indicate that the individuals differed a lot when it came to rating a certain statement. There may be reasons why the ratings for individual task statements differ substantially among the SMEs (e.g., poorly written task statements). N is simply the number of individuals (e.g., SMEs) who rated the task. There are tests which can help one to learn more about how the items are functioning. Some of these tests include tests of interrater reliability and interrater agreement. Brannick et al. (2007) explain that there is a difference between interrater reliability and interrater agreement. They explain that “[r]eliability ... refers to two sources of variance: variance due to random errors and variance due to systematic differences among the items of interest” and “[a]greement simply refers to a function of those judgments that are identical and different” (p. 276). 16 One can understand reliability a bit more clearly through the following equation given by Meyers (2007): Reliability = VT VO VT is the true variance and it is divided by VO which is the observed variance. The true score would be a score that does not contain any error. The observed score is the score obtained from the individual or subject and it does contain error. When a study, test, questionnaire, etc. controls for error (e.g., testing at the same time of day) the observed score increases thereby increasing the reliability. However, there will always be a certain amount of error that cannot be controlled for (e.g., individuals not getting good sleep the night before the test takes place). Interrater agreement can be understood by its name. It is simply the amount of times the judges/raters agree on how the item is scored. For instance, if three judges rated a task statement and all of the judges rated the statement as a two (highly important) then it would indicate that there is one hundred percent agreement among the judges. Brannick et al. (2007) explain that interrater agreement (“interjudge agreement”) “is simply the number of ratings for which the judges agree divided by the total number of judgments” (p. 275). Meyers (2007) explains that the level of agreement can be found by using Cohen’s Kappa. He explains that the procedure is available in SPSS and it can be interpreted using the chi square statistic. Meyers (2007) also explains that if there are more than two raters then an alternate form of Kappa created by Fleiss (1971) can be used. Kaplan and Saccuzzo (2005) explain the common way to interpret Kappa (e.g., 17 Fleiss, 1981). They explain that if the value is “.75” then one would have ““excellent” agreement, a value between .40 and .75 indicates “fair to good” ... agreement, a value less than .40 indicate poor agreement” (Fleiss, 1981; as cited in Kaplan and Saccuzzo, 2005, p. 118). Meyers (2007) also explains that rater reliability can be found by finding the Pearson correlation. With this statistic one is finding the “correlation between pairs of raters” (Meyers, 2005, p. 2). In other words, one is finding the degree to which there ratings trend in the same direction. Meyers (2005) explains that correlations above .8 are good, but correlations “in the .7s” may also be strong enough in many cases. The correlation coefficient (r) can be used in order to estimate interrater reliability. Brannick et al. (2007) explains that a large correlation indicates that there is a small amount of error in the problem and it indicates that the problem has good reliability. In other words, it indicates that there is a low amount of random error in the items. Besides making estimates of how well the items are working and how accurately the judges are rating the items, one can also estimate how critical each task is to the job. Understanding the Criticality of the Tasks After the task statements are rated by multiple SMEs the means of the frequency and importance scales are multiplied for each of the task in order to find which tasks had the highest criticality. This is used to find the critical tasks, which are then organized into a criticality index. The criticality index shows the tasks from the highest criticality to the lowest criticality. The following equation is often used to calculate the criticality of each of the tasks: 18 Criticality Index = Importance (M) x Frequency (M) For instance, if SMEs are rating tasks on an importance scale (5 = very important, 4 = important, 3 = somewhat important , 2 = somewhat unimportant, 1 = unimportant) and a frequency scale (5 = highly frequent, 4 = frequent, 3 = somewhat frequent , 2 = somewhat infrequent , 1 = not frequent ) one would use the mean rating from each of these scales to calculate the criticality index. If the mean of the SMEs scores is 3 for importance and 3 for frequency then the score on the criticality index would be 9. One would decide how high the score should be in order for the task to be considered critical. This helps one to understand a bit about the combination of importance and frequency. If one of these ratings is low then the task may not be considered to be critical to the job. By finding the most critical aspects of a job a company or organization is better able to understand the type of employee they need to be searching for to fill a position. If the potential employee does not possess the KSAs needed to perform the critical tasks of a job then they may not be a qualified candidate. There are several ways in which the statements and questions for a job analysis questionnaire can be analyzed. Some have been mentioned above (e.g., mean ratings, standard deviation). Another way to look at the items is through IRT. In a job analysis one is looking at the educated opinion of individuals rating the items. The mean and standard deviation are a good starting point in gathering information about each of the items. While the mean gives an indication of the average level at which the raters are rating the tasks, IRT gives an indication of how individuals are rating the item from 19 lowest to highest ratings for each task. IRT potentially allows one to gather more detail about the items and allows one to understand whether the items are functioning properly. 20 Chapter 3 ITEM RESPONSE THEORY In an IRT analysis one would refer to a person’s ability level as theta. Ability does not simply refer to an individual’s ability to answer a question correctly (e.g., multiple choice test). It can also indicate the level at which the frequency of SMEs rating a task is the highest. In other words, if a majority of the SMEs (who filled out the questionnaire) rated a task as having some degree of relevance to the job then the task would have a low theta level. On the other hand, if there were not many SMEs who rated the task as being part of a job then the task statement would have a high theta level. Also, if there were a moderate level of SMEs who rated the task as being part of the job there would be a moderate theta level. The general term theta will be used to refer to what is described above. One might also understand the b parameter (location of the curve) by comparing task ratings to ability test (math, reading, etc.). The b parameter is lower on ability test problems when a lot of the individuals taking the test get the problem correct. Lower b paramenter indicates that the problem does not have a high degree of difficulty. Similarly, the b parameter is lower on task statements that have a large amount of SMEs who rate the task as being a part of the job. However, when rating task statements the b parameter is based on how the SMEs rated the statement and it does not indicate that the tasks is rated correctly or incorrectly. At the individual level a SME’s theta is based on how they rated the task statements. Again, the theta level does not indicate that the SME rated the tasks correctly 21 or incorrectly. It is simply an indication of how they rated the tasks. For example, if the SME rated a majority of the task statements as being part of the job then they would have a high theta level. If the SME rated a majority of the task statements as being irrelevant to the job then they would have a low theta level. After SMEs have rated statements one can see how well the item is functioning through IRT. IRT looks at how well the item functions based on theta (Harvey, 2003). An individual’s theta level can be predicted through the use of different models. The models that will be explained in more detail below are the Rasch Model (“one-parameter logistic model”, “1PL”) and the “two-parameter logistic model (2PL)” (Embretson & Reise, 2000, p. 48, 51). However, before explaining the models in detail it is important to know the assumptions that need to be met for IRT to function properly. Assumptions of IRT There are several assumptions which need to be met for IRT calculations to function properly. The most common are that the items have to be locally independent and they need to be unidimensional (Embretson & Reise, 2000). Local independence indicates a separation of items. As the term implies, the items are independent. In other words, one would not have to rely on one item in order to know the answer to another item. Embretson and Reise (2000) explain that IRT models share similarities with factor analysis. Due to the similarity, when one factor fits the data one can often assume that local independence has been achieved (Embretson & Reise, 2000). In other words, the variables all fall close to (have high loadings on) a single factor. If one of the variables is not loading on a factor then it may not have good item fit, which is essential for items to 22 function efficiently in IRT. In other words, it may not fit with the other items in that specific IRT analysis. One needs to decide the best way to analyze the items if they are loading on different factors. It might be necessary to run multiple analyses if the items are not all loading on one factor. If the items were dependent it would indicate that the items rely on each other for the answer (Embretson & Reise, 2000). For instance, in item matching (matching a term from a list to its appropriate definition in a list of definitions) a person would eliminate the definitions that he or she has used, which would give a better indication of what the answer might be for the remaining unknown items. Therefore, since the items are dependent on one another, items from item matching would not be able to be used in the standard IRT models. Another example of local dependence is the Cloze exam (Taylor, 1953). In this exam an individual fills in the blanks of a paragraph with what they believe to be a missing word. This is done at several points in the paragraph. If the individual knows certain words it may give them clues to the other missing words in the paragraph, which would violate local independence. Yen (1993) explained that the following features of a test may lead to local dependence: (a) “External assistance or interference”; (b) “Speededness” (e.g., not having sufficient time to take the test) (c) “Fatigue”; (d) “Practice”; (e) “Item or response format”; (f) “Passage dependence”; (g) “Item chaining” (h) “Explanation of previous answer”; (i) “Scoring rubrics or raters”; and (j) “Content, knowledge, and abilities (p. 189-190, italics in original). Therefore, if one wants to avoid creating local dependence in a test they should give a sufficient amount of time to take the test, should 23 not expose students to the material prior to the test, and they should avoid tying questions together in a test. They should also be aware of the way they score and format the test. Local dependence is not necessarily a bad thing. However, these types of problems should not be used in IRT. If there is local dependence it is suggested that one create what is known as “testlets”, which is done by combining items that depend on each other (Yen, 1993). A testlet is created by changing multiple items into a single item (Embretson & Reise, 2000). In other words, items that give clues to other items are combined. As was explained above, the Cloze paragraph has a certain amount of items. The items would be estimated as a single item, and one would give the item partial credit if some of the items are correct. Yen (1993) explains that a “testlet score is assumed to be independent of all other testlets and items” (p. 201). By making items with local dependence into one item you are achieving local independence. For instance, if one were to calculate multiple Cloze items as one item the individuals taking the test would only depend on information within the item to answer the question. Unidimensionality has to do with how well the model fits the data and the “trait level estimates” that are needed to explain the data (Embretson & Reise, 2000, p. 189). Basically, if local independence is met then the data meets that requirement for unidimensionality (Embretson & Reise, 2000). Embretson and Reise, (2000) explain that there have been many ways which have been proposed for assessing dimensionality. However, they explain that many people have been turning toward goodness-of-fit indices in order to assess the data (Embretson & Reise, 2000). 24 One way of finding out whether the item fits the data is by dividing chi-square (𝜒 2 ) by the degrees of freedom (df). Embretson and Reise (2000) explain that 𝜒 2 is based on the observed frequencies and the expected frequencies. Basically, the SMEs are divided into groups (theta level groups) based on how they rate the task statements. IRT will give certain expected frequencies based on the model that is used. When the observed frequencies line up with the expected frequencies it indicates that the task statement fits the model. Drasgow, Levine, Tsien, Williams, and Mead (1995) defined the item fit for their set of data. They defined the smallest number as having the best fit. In other words, when computing 𝜒 2 /df the numbers that are closest to 0 (between 0 and 3) fit the model the best. Drasgow et al. (1995) emphasized that less than 1 is considered “very small”, between one and two is considered “small”, between two and three is considered “moderately large” and if the 𝜒 2 ⁄𝑑𝑓 is greater than three it is considered “large” (p. 151). Therefore, one might consider removing items that are above three from a set of items for that particular analysis, or one might consider seeing whether the items fit better with a different model. It is essential that one looks at the item fit in order to know whether the items are giving accurate information. When the items fit one is able to trust the information that is displayed through the item parameters (location, slope, and guessing parameters) and the plots, which are known as the item characteristic curves (ICCs) and item information curves (IICs). ICCs will be covered in more detail below. IICs are simply plots of the amount of information given at each theta level. 25 Item Characteristic Curve If an item is functioning correctly it will have an ICC that is flat at the bottom curves up in the middle and is flat at the top (with dichotomous data). An ICC is a plot of the predicted probability of individuals that endorsed a particular response in combination with the theta level of the individuals (e.g., SMEs ratings). In a job analysis a higher theta level simply means there were a lower amount of individuals who rated tasks as having high relevance (or some degree of relevance if one is using a two point rating scale). Figure 1. Plots for Dichotomous 1PL Item Characteristic Curves (ICCs) Figure 1 (above) shows what the dichotomous positively endorsed category would look like. If an item begins to slant upward at a lower theta level (e.g., Item 1) it simply indicates that a large amount of individuals endorsed the tasks as being part of the job. If the task begins to increase at a higher theta level (e.g., Item 4) it indicates that there were 26 fewer SMEs who endorsed the item as being part of the job. Also, there were a greater amount of individuals who felt that the task was not applicable to the job. In other words, a majority of the SMEs rated the task as having low relevance to the job. An ICC with poor discrimination would have a flat line going across the top of the plot, or, basically, would not have a curve in the middle. One would not want to see a flat line going from the left lower corner to the right upper corner on an ICC. A flat line would mean that once an individual reaches a higher theta level they would automatically be more likely to rate the tasks as being part of the job. In other words, the task would not have good discrimination between differing theta levels. Rasch Model, 1PL, 2PL In order to understand the ICC more fully it is important to understand the Rasch model (1PL model), and the 2PL model. With the Rasch Model one uses one theta parameter for each of the individuals, which allows one to estimate the “difficulty parameter” (location of the curve) for each of the items (Wright, 1977, p. 97). In other words, the individual’s theta level gives a good indication of where the individual would be on the ICC. The theta level is understood based on the way the individual rates the tasks. If the individual gives a lower rating to the task they will be grouped, on the ICC, with those who have the lower theta level. Similarly, those who rate the task high would be grouped with those who have a higher theta level. In a 1PL model “the dependent variable is the dichotomous response” (or polytomous response; e.g., questionnaire data can be both) (Embretson & Reise, 2000, p. 27 48). In a job analysis questionnaire the dependent variable is the way in which the individual rated the item. On the 1PL model, “the independent variables are the person’s trait score, θ𝑠 ” (theta) “and the item’s difficulty level, βi ” (Embretson & Reise, 2000, p. 48). In a job analysis the theta of the individual is expressed by where the individual is located along the theta continuum. The items which are more highly endorsed are on the left and the items not endorsed by very many SMEs are on the right. This is based on dichotomous data plots. Polytomous plots may take many shapes depending on the responses of the SMEs. For instance, a certain amount of SMEs will rate a task in the middle category/categories and others will rate the tasks in the upper category/categories. Therefore, the division of SMEs will cause the polytomous ICCs to look a lot different than the dichotomous ICCs. Harvey (2003) explains that “the ‘difficulty’ of an item [in a job analysis questionnaire] is defined in terms of the amount of the general work activity (GWA) construct (θ) that would be needed to produce a given level of item endorsement” (p. 2). As has been explained before, the theta level is based on how many individuals endorsed the task as being a part or not being a part of the job (e.g., frequency and importance). Therefore, if the SME is familiar with the job he or she will rate the task similar to the way the other SMEs rate the task. As was explained above, an individual can land at different spots on the ICC based on their theta level. By using the Rasch model (combined with the ICC) one can see how the items are functioning at each theta level. 28 The calculation of item difficulty is understood through “log odds or probabilities” (Embretson & Reise, 2000, p. 48). The “ratio” is estimated for each individual (Embretson & Reise, 2000, p. 48). The odds are the degree to which the individual is likely to get a problem correct. For instance, Embretson and Reise (2000) give an example of “the odds that a person passes an item is 4/1”, which indicates that “out of five chances, four successes and one failure are expected” (Embretson & Reise, 2000, p. 49). If an individual were rating task statements, and the odds ratio were 4/1, it would indicate that for every four tasks that the individual rates as being part of the job they would rate one as not being applicable to the job. Also, “odds are the probability of success divided by the probability of failure” (Embretson & Reise, 2000, p. 49). In a job analysis one does not deal with success or failure. One simply deals with the degree to which an individual feels that a statement is part of a job. The following equation shows “the ratio of the probability of success”: ln [ Pis ] = θs − βi (1 − Pis ) where “θs ” is the “trait score”, and “βi ” is the “items difficulty” (Embretson & Reise, 2000, p. 49). Individuals who are higher on the trait would have higher log odds ratios, which would indicate that they are more likely to endorse the item as being part of the job. The 2PL model functions a bit differently than the 1PL model. The 2PL model includes two parameters (Embretson & Reise, 2000). The two parameters that are included in the model are “item difficulty, βi , and item discrimination, 𝛼𝑖 ” (Embretson & 29 Reise, 2000, p. 51). Figure 2 gives an example of how items look in the 2PL model. The location parameter has already been covered with the 1PL. The discrimination parameter is an additional parameter that is shown by the 2PL model. Figure 2. Plots for Dichotomous 2PL Item Characteristic Curves (ICCs) Looking at the difference between the items, an item with a high amount of discrimination would curve upward sharply (e.g., Item 5). This indicates that the item is discriminating at a more specific theta level. An item with low discrimination does not lean as much in the middle (e.g., Item 1). If the ICC of a problem increases smoothly it indicates that the problem does not discriminate as well between people of increasing theta levels. If it increases smoothly it indicates the problem discriminates between individuals of several degrees of theta, but not at a specific range of theta levels. 30 The equation for the 2PL model is expressed as follows: 𝑃(𝑋𝑖𝑠 = 1|𝜃𝑠 , 𝛽𝑖 , 𝛼𝑖 ) = exp(𝛼𝑖 (𝜃𝑠 − 𝛽𝑖 ) 1 + exp(𝛼𝑖 (𝜃𝑠 − 𝛽𝑖 ) (Embretson & Reise, 2000). The model that would be most useful when analyzing a job analysis using rating scales is the graded-response model (GRM). Embretson and Reise (2000) explain that this model “is appropriate to use when item responses can be characterized as ordered categorical responses such as exist in Likert rating scales” (p. 97). In other words, these are responses that do not necessarily have a right or wrong answer. The responses are based on the educated opinion of a set of SMEs. GRM is represented in the following equation: ∗ “𝑃𝑖𝑥 (𝜃) = exp[𝛼𝑖 (𝜃−𝛽𝑖𝑗 ) ” 1+exp[𝛼𝑖 (𝜃−𝛽𝑖𝑗 ) ∗ (p. 98). Embretson and Reise (2000) explain that “𝑃𝑖𝑥 (𝜃)” is the “operating characteristic curves” (p. 98). They continue by explaining that “[i]n the GRM one operating characteristic curve must be estimated for each between category threshold, and hence for a graded response item with five categories, four 𝛽𝑖𝑗 parameters are estimated and one common item slope (𝛼𝑖 ) parameter” (Embretson & Reise, 2000, p. 98). It should be mentioned here that the threshold is the location parameter (b parameter). Referring back to Figures 1 and 2, what is shown is the dichotomous affirmative (e.g., task does apply to job) response to a task statement. What is not shown is the rejecting statement (does not apply). The rejection of the statement is a mirror image of the affirmative statement. In other words, it is a reflection of the statement going in the opposite direction in the same 31 general location. Where the positive and negative responses cross is the threshold. The 𝛽𝑖𝑗 “represents the trait level necessary to respond above threshold j with .50 probability” (Embretson & Reise, 2000, p. 98-99). Basically, 𝛽𝑖𝑗 is based on the way in which the SMEs rated the task statement (e.g., frequency and importance). For polytomous items there are multiple thresholds. The threshold is where the response probability curves cross. For instance, if one were rating a task as either being relevant or irrelevant to a job one would have an ICC for the task being irrelevant (no degree of frequency, no degree of importance) and an ICC for the task being relevant (having some degree of frequency or importance). If one were to look at a dichotomous plot the irrelevant category would be a reflection the relevant category going in the opposite direction. For instance, if the relevant ICC category starts at the lower left corner and curves toward the upper right corner then the irrelevant statement would start at the upper left corner and curve down toward the lower right corner. The point at which these two categories cross is the threshold. With polytomous statements each of the middle categories has an ICC. If one were looking at a three point statement, there is a threshold where category one crosses with category two and where at category two crosses with category three. Therefore, a polytomous statement would have multiple thresholds. In the current study the modified graded response model (M-GRM) is used. The GRM and the M-GRM have similar slope parameters (Embretson & Reise, 2000). In other words, the slopes for the ICC categories on the frequency and importance scales would look similar. However, Embretson and Reise (2000) explain that “[t]he difference between the GRM and the M-GRM is that in the GRM one set of category threshold 32 parameters (𝛽𝑖𝑗 ) is estimated for each scale item, whereas in the M-GRM one set of category threshold parameters (𝑐𝑗 ) is estimated for the entire scale, and one location (𝑏𝑖 ) is estimated for each item” (Embretson & Reise, 2000, p. 103). One can see the difference between GRM and M-GRM by looking at the output. When looking at the output for the GRM on a task statement (with more than two categories) one would see multiple 𝛽 (location) parameters. However, there is only one location parameter on the M-GRM. Dichotomous and Polytomous Models As was explained, the difference between a dichotomous and a polytomous variable is the amount of rating scale categories. A dichotomous variable has only two categories, while a polytomous variable has three or more categories. For instance, the questionnaire used in this study asks the SME to rate whether there would be consequences if an individual does not complete a task. The ratings are polytomous in the sense that they have three categories and the choices are as follows: “0. Task Not Part of Job”; “1. Not Likely”; or “2. Likely”. If one were to change the data to dichotomous the choices would be “0. Task Not Part of Job”; or 1.Task is Part of the Job/May Have Some Degree of Consequence. Essentially, what one is doing when they dichotomize a variable, in a job analysis, is they are changing the degree of importance/frequency to saying that the task is simply part of the job or is not part of a job. Dichotomizing variables has become common for item response theory when one is analyzing job analysis data (e.g., Harvey, 2003). When dichotomizing variables one might ask themselves if they are changing the meaning of the data or losing information 33 through this process. If one changes the shape of the ICC when dichotomizing does it change the meaning of what is portrayed by IRT? With a job analysis questionnaire, where one is trying to figure out the importance of tasks statements, would dichotomizing a variable change the meaning of the SMEs ratings? Bond and Fox (2007) explain that when a category has a low frequency it may make sense to combine it with another category. This is due to the fact that low frequency categories often do not give any additional information to the analysis. Bond and Fox (2007) also explain that when one collapses categories they should try to collapse it multiple ways and see which way gives the highest amount of good fitting items. However, one might ask what might happen when a middle category, with a high frequency of SMEs, is combined with an upper category. It was hypothesized that since the categories are being changed from three (polytomous) to two (dichotomous) categories that there will be less accurate information after the change. In other words, it is likely that the interpretation of the ICCs and the information given in the ICCs will be less precise as a dichotomous statement, both because there is less division among the categories and because one is combining categories which have a relatively high frequency. This, in turn, gives a less detailed picture of what the task is telling us. This study explores what occurs (or whether nothing occurs) when one dichotomizes the task statements. 34 Chapter 4 METHODS Sample This study uses archival data from a questionnaire used by used by a corrections agency in the western United States. There were a total of 544 SMEs, working at the correctional facilities, who were included in this study. The job positions of the individuals included 231 Correctional Officers, 101 Juvenile Officers, 94 Juvenile Counselors, 91 Correctional Sergeants, 10 Juvenile Sergeants, and 13 Senior Juvenile Counselors. There were a total of 434 males,106 females, and 4 who did not indicate gender. Ethnic background included 228 White, 165 Hispanic, 98 Black/AfricanAmerican, 10 Asian, 10 Filipino, 6 Native American, 8 Pacific Islander, 9 other, and 10 did not indicate ethnicity. Instrument Data from a job analysis questionnaire was used in this study. The questionnaire was originally used to clarify job duties and equipment that would be used by incoming Correctional Peace Officers, Juvenile Peace Officers and Juvenile Counselors. The tasks were rated for the jobs from the individuals in the position and those supervising the individuals in the position. The following areas are covered in the questionnaire: 1. Booking, Receiving and Releasing; 2. Casework; 3. Counseling; 4. Court-related Board Hearings; 5. Arrests; 6. Emergencies; 7. Escort, Move, Transportation; 8. General Duties; 9. Health and Medical; 10. Investigation; 11. Oral Communication; 12. Read, Review, and Analyze; 13. 35 Referrals; 14. Search; 15. Security; 16. Supervision of Non-inmates; 17. Supervision of Wards/Inmates; 18. Restraints and Use of Force; and 19. Written Communication. There was also an equipment section. However, this study focused on what the officers do and not what they use. Individuals were given task statements and were asked to rate the frequency and degree of consequence for not performing each of the task statements. Individuals are given a column for the frequency of the task and for the importance of the task. The individuals were asked to rate the frequency and importance of the task related to the entry-level correctional peace officer position. For frequency the prompt was given, “Based on your experience in the facilities in which you’ve worked, how many entry-level correctional peace officers will perform this task in the first three years on the job (even if they do it only a few times)?” The prompt was followed by the following: “0. Task Not Part of Job”; “1. Less than a Majority”; or “2. A Majority”. As can be seen, this data is an approximation to polytomous data. It is polytomous in the sense that it has 3 categories. However, the 0 (Task Not Part of Job) is not actually on the theta continuum. In other words, when one chooses the option of the task not being part of the job they are not giving the task a degree of frequency or importance. If the individual was giving the task a degree of frequency or importance then they would be rating the task on the continuum, because they would be indicating that the task has a degree relevance to the job. Options 1. Less than a Majority and 2. A Majority are on the continuum. 36 For importance the degree of consequence for not performing a task was measured with the following prompt: “How likely is it that there would be serious negative consequences if this task is NOT performed or if it is performed incorrectly?” The following categories are provided after the prompt: “0. Task Not Part of Job”; “1. Not Likely”; or “2. Likely”. Only the frequency ratings were used in this analysis. The questionnaire was extensive and it was decided that this analysis should focus on a specific portion of the questionnaire to answer the question of whether there is a difference between dichotomous and polytomous task ratings. The task statements that were included in the final analysis are listed in Appendix A. The tasks were analyzed with IRT in dichotomous and polytomous form, which was different from the way it was originally analyzed by the corrections agency. The corrections agency analyzed the data in its original form using typical descriptive statistics (mean, standard deviation, percentages, etc.) and not IRT. The current study explores what occurs when the data is dichotomized by comparing polytomous to dichotomous models. Also, using IRT in order to analyze job analysis ratings is a relatively new approach. However, the analysis and comparison does help to clarify what occurs when one changes task statements from polytomous to dichotomous. 37 Chapter 5 RESULTS Initial Data Preparation and Selection of Tasks for Analysis In order to make a direct comparison between dichotomous and polytomous variables it was essential to first dichotomize the task statements and save it as a new file. In other words, all the 2s in the original polytomous data file were changed to 1s, which left the dichotomous data with 0s and 1s. Step 1 and 2 of Appendix B describe how this was carried out. The dichotomous data indicated whether the tasks were or were not a part of the job, while the polytomous data indicated the frequency and degree of consequence of the tasks. However, as was mentioned, only the frequency statements were used. When the ratings were dichotomized there were certain task statements that no longer had variance (e.g., all data became 1s). In other words, all of the SMEs who rated these statements felt that they had at least some degree of relevance to the entry-level correctional peace officer position. These statements are shown in Table 1. 38 Table 1 Deleted Variables Due to Loss of Variance When Dichotomized Category Oral Communication Search Security Supervision of Wards/Inmates Restraints and Use of Force Task Statement Communicate orally. Confiscate contraband. Call for back-up. Intervene in/breakup physical altercations. Confront wards/inmates exhibiting inappropriate behavior. Identify violent wards/inmates. Discharge chemical agents to control resistant inmates or quell disturbances/riots. Restrain an assaultive ward/inmate. Use departmentally approved “use of force” methods. If a variable does not have variability it indicates that everyone who was sampled had 100% agreement (which is not exactly true since the data were dichotomized). Without variance there is no standard deviation from the mean and there is no score difference and error to use in calculations. Therefore, the variable does not add any additional information to the analysis. In other words, when the data were dichotomized some of the most relevant tasks were unable to be analyzed. Because of this, the items that no longer had variance were removed from the analysis. There was variability with these statements in polytomous form. However, in order to make a direct comparison (using the same variables in the dichotomous and polytomous analysis) it was decided that these statements should not be used. The data were also dichotomized by combining the lower two levels. With this manner of dichotomizing none of the task statements were lost. This method of 39 dichotomizing the tasks is consistent with what Bond and Fox (2007) suggested (e.g., combine categories with a low frequency). However, in the current study the data were dichotomized in a way similar to Harvey (2003; e.g., does not apply is 0 and all other rating were 1). By dichotomizing the data in this manner one avoids combining contradictory categories. For instance, when one is combining the task is not part of the job and less than a majority of the individuals will perform the task they are combining categories that are going in negative and positive directions. Therefore, it makes more sense to combine the upper categories. In the original job analysis there were several variables that were considered to have low criticality to the entry-level correctional peace officer job position (variables had low criticality across all SME groups). This was calculated by multiplying the frequency and degree of consequence mean scales for each item. The variables’ criticality indices fell between 0 and 4. The variables that were below .5, across all SME groups, were not considered to be critical to the job. These variables are listed in Table 2. 40 Table 2 Deleted Variables Due to Low Criticality Category Booking, Receiving and Releasing Casework Court-related Board Hearings Emergencies General Duties Health and Medical Investigation Supervision of Non-Inmates Written Communication Task Statement Discuss charges against juvenile with arresting/transporting officer. Place holds on wards/inmates and notify department holding warrant. Process bail. Run warrant checks, holds and search clauses. Assign wards/inmates to program/counselor. Conduct a home study where juveniles are to be released. Coordinate with external resources for ward/inmate employment and rehabilitation services. Process applications for alternative sentencing programs. Conduct closed circuit video arrangements. Handle canines to control crowds. Operates facility canteen. Recruit job applicants and volunteers. Serve as a departmental representative to external groups. Distribute medication. Weigh wards/inmates. Administer a breath analyzer test to wards/inmates. Supervise infants only (no adult visitors present). Request Department of Justice (DOJ) criminal history. Since the tasks listed in Table 2 had low criticality they would not add to the analysis, because the tasks had little to do with the entry-level positions. Therefore, these tasks were deleted from the analysis. 41 The polytomous factor loadings were used to determine which factors would go into the IRT analysis. The group means with factor loadings that were above .50 were used in the IRT analysis. This would allow the analysis to be relatively unidimensional, which is one of the assumptions of IRT. If the items load on a single factor it is a good indication of unidimensionality (Embretson & Reise, 2000). Therefore, in order to test the dimensionality of the data all of the sub categories of the questionnaire were analyzed using factor analysis. First the mean scores of the task statements in each of the sub categories were calculated by combining the variables in each sub category (e.g., Booking, Receiving and Releasing; Casework; Counseling; Court-related Board Hearings; Arrests; Emergencies; Escort, Move, Transportation; General Duties; Health and Medical; Investigation; Oral Communication; Read, Review and Analyze; Referrals; Search; Security; Supervision of Non-Inmates; Supervision of Wards/Inmates; Restraints and Use of Force; Written Communication). These mean scores were then analyzed using exploratory factor analysis. The maximum likelihood extraction method was used and a single factor solution was imposed on the data. The mean scores, descriptive statistics, Chronbach’s Alpha, Inter Item Correlation, and the factor analysis are shown in Table 3 below. The means of the subgroups that had loadings > .50 were used in the final analysis. This is based on the factor loading explanation given by Comrey and Lee (1992). Factor loadings above .45 are considered to be in the fair range and above .55 are considered to be in the good range. Therefore, the groupings are fair or better. 42 Table 3 Polytomous/Dichotomous Descriptive Statistics, Chronbach’s Alpha, Inter Item Correlation, and Factor Loadings Polytomous Dichotomous Categories # variables Mean SD 𝛼 IIC Factor Mean SD 𝛼 IIC Factor Booking, Receiving and Releasing 8 .67 .41 .81 .35 .46 .56 .29 Casework 11 .45 .43 .86 .38 .35 .32 .27 Counseling 4 1.28 .55 .78 .50 .37 .78 .24 Court-related Board Hearings 2 .87 .48 .61 .45 .41 .76 .35 Arrests 1 1.28 .74 .37 .82 .38 Emergencies 13 1.22 .33 .79 .24 .67* .83 .14 Escort, Move, Transportation 16 1.35 .36 .87 .29 .67* .87 .18 General Duties 38 1.20 .30 .90 .20 .80* .77 .15 Health and Medical 14 1.14 .31 .77 .20 .73* .80 .14 Investigation 15 1.17 .42 .90 .39 .75* .81 .20 Oral Communication 15 1.55 .25 .80 .24 .67* .91 .10 Read, Review and Analyze 8 1.17 .45 .78 .31 .62* .76 .23 Referrals 4 1.31 .53 .80 .52 .57* .85 .23 Search 4 1.70 .35 .62 .33 .58* .96 .12 Security 38 1.43 .31 .93 .25 .70* .86 .15 Supervision of Non-Inmates 4 1.05 .49 .72 .40 .57* .77 .29 Supervision of Wards/Inmates 31 1.40 .31 .91 .26 .75* .85 .11 Restraints and Use of Force 8 1.47 .32 .78 .32 .47 .96 .08 Written Communication 31 1.19 .33 .90 .22 .79* .77 .17 Note. 𝛼 and IIC were not estimated for Arrests because it only had one item. *Factor Loadings > .50 .79 .84 .63 .60 .63 .86 .85 .63 .84 .60 .70 .63 .36 .91 .71 .79 .41 .86 .32 .35 .36 .46 .12 .27 .13 .12 .28 .09 .24 .36 .11 .20 .43 .12 .11 .17 .60* .23 .10 .49 .36 .60* .65* .78* .63* .67* .58* .49 .37 .51* .73* .70* .57* .45 .75* 43 Figure 3. Scree Plot for Dichotomous Variables 44 Figure 4. Scree Plot for Polytomous Variables Above, in Figure 3 and 4 are the scree plots for the dichotomous and polytomous variables that loaded above .5 on the first factor. As can be seen from the scree plots above, the grouping of variables is mostly explained by one factor. On the dichotomous analysis the eigenvalue for the first factor was 5.70, which explains 43.82% of the variance, and the second is 1.57, which explains 12.07% of the variance. The ratio of eigenvalues on the dichotomous data is 5.70 to 1.57, which indicates that the first factor explains 3.63 times more data than the second factor. The eigenvalue for the first factor of the polytomous analysis is 6.66, which explains 51.24% of the variance, and the eigenvalue for the second factor is 1.27, which explains 9.75% of the variance. The ratio for the polytomous data is 6.66 to 1.27, which indicates that the first factor explains 5.24 times more of the data than the second factor. As can be seen, the first factor with the 45 data in polytomous form explains more than when it is in dichotomous form, indicating that the polytomous form of the data has greater unidimensionality. A common rule explained by Meyers, Gamst, and Guarino (2006) is that the factors used should account for 50% of the variance (e.g., Tabachnick & Fidell, 2001b). As can be seen, the first factor for the dichotomous data accounts for almost 50% of the variance in the data, while the first factor of the polytomous data accounts for more than 50% of the variance. Another rule explained by Spicer (2005) is that the eigenvalues should be above 1. Even though factors 1 and 2 are both above 1 it can be seen that a large portion of the variance is accounted for by the first factor. Therefore, only one factor was used to decide which variables would be used in the final analysis. Assessment of Model Fit and Model Selection PARSCALE, a program used to run IRT, was used in order to calibrate items and assess model fit. As was mentioned before, PARSCALE uses the M-GRM to estimate rating scales. Also, the logistic metric was used. The main difference between the normal and logistic metric is that the equation for the normal metric is multiplied by 1.7. However, Embretson and Reise (2000) explain that the “normal metric is often used in theoretical studies because the probabilities predicted by the model may be approximated by the cumulative normal distribution” and the “parameters are anchored to a trait distribution” (p. 131). The current study is not measuring the trait of the SMEs. Therefore, the logistic metric was used. The logistic model was estimated with a scaling factor of 1.7 to place it on the normal metric. 46 It should be mentioned that the previous factor analysis was done at the subcategory level and averaged across groups of individual tasks. However, the current analysis is done at the item level. In order to decide which type of model to use (1PL or 2PL) it was imperative to see how the items fit the data. The item fit for item response theory was analyzed and Figure 5, 6, 7, and 8 below show how well 193 of the items fit the data under the 1PL and 2PL models. There were 37 items outside the plot range when dichotomized. These items were not given a fit calculation by PARSCALE. Therefore, the plots only include the 193 items for both dichotomous and polytomous to show a direct comparison. As can be seen in Appendix C and D, there were a total of 230 items in the original analysis. The following grand means were found for the combined 𝜒 2 ⁄𝑑𝑓 for all of the tasks statements: dichotomous 1PL model equals 5.18; dichotomous 2PL model equals 29.34; polytomous 1PL model equals 2.84; and the polytomous 2PL model equals 12.36. A lower 𝜒 2 ⁄𝑑𝑓 indicates that the items fit the data better. Comparing dichotomous 1PL means to dichotomous 2PL means it can be seen that the dichotomous 1PL grand mean (5.18) is much lower than the dichotomous 2PL grand mean (29.34), which indicates that there is better fit with the 1PL model when analyzing the dichotomous items. Also, the polytomous 1PL grand mean (2.84) is lower than the polytomous 2PL grand mean (12.36), which indicates that there is better fit with the 1PL model when analyzing the polytomous items. Therefore, based on the information that was found it was decided that the 1PL model would be used in order to describe the data. As was mentioned before the 47 1PL model has a location parameter for all of the items. It also has a common slope for all of the items. The outliers are items that have extremely bad fit with the different models, and they skew the histograms. Therefore, the plots (Figure 5-8) show the items with and without outliers so that finer comparisons can be made at lower levels of 𝜒 2 ⁄𝑑𝑓 . As can be seen in the histograms below, the 𝜒 2 ⁄𝑑𝑓 for the 1PL model is more centrally grouped. The main grouping of variables (task statements) for the dichotomous 𝜒 2 ⁄𝑑𝑓 data points is between 0 and 12, and the main grouping for the polytomous 1PL is between 1 and 9. This also gives an indication that there is better item fit for the 1PL data. 48 Figure 5: Item Fit Histogram for the Dichotomous 2PL Data With (top) and Without (bottom) Outliers 49 Figure 6. Item Fit Histogram for the Dichotomous 1PL Data With (top) and Without (bottom) Outliers 50 Figure 7. Item Fit Histogram for the Polytomous 2PL Data With (top) and Without (bottom) Outliers 51 Figure 8. Item Fit Histogram for the Polytomous 1PL Data With (top) and Without (bottom) Outliers Due to the fact that certain items did not have good fit it was decided to remove items so that dichotomous versus polytomous comparisons could be made with items that 52 conform to model assumptions. Based on the standard outlined by Drasgow et al. (1995) it was decided that task statements with 𝜒 2 ⁄𝑑𝑓 > 3 would be removed. After the initial 1PL IRT analysis there were 38 items removed, which was based on the polytomous item analysis. The remaining tasks were rerun through the IRT analysis. There were an additional 26 items removed due to poor fit (e.g., 𝜒 2 ⁄𝑑𝑓 > 3). The syntax that was used to create the final data file and the PARSCALE syntax are in Appendix B. Also, the specific items that were removed are shown in Tables 6 and 7 in Appendix E and F. The following number of items remain in each of the subsections: Emergencies 8; General Duties 28; Health and Medical 8; Investigation 9; Oral Communication 13; Read, Review, and Analyze 5; Referrals 4; Search 3; Security 26; Supervision of Non Inmates 4; Supervision of Wards/Inmates 21; Written Communication 22. The remaining items can be seen in Appendix A. The histograms in Figure 9 and 10 show the dispersion of 𝜒 2 ⁄𝑑𝑓 for the final selection of task statements using the 1PL dichotomous and polytomous models: 53 Figure 9. Item Fit Histogram for Dichotomous Items After Item Removal Figure 10. Item Fit Histogram for Polytomous Items After Item Removal The same tasks statements were used for both the dichotomous and polytomous item analysis. As can be seen in Figures 9 and 10, when items have good fit as 54 polytomous items (Figure 10) they may not have as good of fit when they are transformed into dichotomous items (Figure 9). The overall mean of 𝜒 2 ⁄𝑑𝑓 for the polytomous items is 1.82 while the overall mean for the dichotomous items is 4.29. This indicates that there is better fit with the polytomous items. It is noted that the opposite may be true if one removed items based on the item fit for the dichotomous tasks statements. However, this study focuses on what occurs when the items are dichotomized. Therefore, the removal of items is based on the fit of the polytomous tasks statement. The tasks statements that are used in the final analysis are in Appendix B and the statistics for the remaining statements are in Appendix C in Tables 8 and 9. Total Information Plots for Dichotomous and Polytomous Statements Below, in Figures 11 and 12, are the total information plots for the dichotomous and polytomous task statements. As can be seen, the dichotomous total information plot has a higher peak (approximately 47) than the polytomous total information plot (approximately 11.5), which indicates that the dichotomous task statements have higher precision at a narrower range of theta. The polytomous items seem to give information over a broader range of theta values. For instance, when one reaches +3 theta the dichotomous items are barely providing any information. However, the polytomous items still provide information at this level. If one simply wanted to answer whether or not a task was done the dichotomous form of the task statements does give a lot of information. However, if one wanted to know where the categories are centralized and have an idea of how the tasks were rated the polytomous form of the data may be better, because it spreads the information across a wider range of theta levels. 55 Figure 11. 1PL Dichotomous Total Information Plot Figure 12. 1PL Polytomous Total Information Plot 56 Item Characteristic Curves for Dichotomous and Polytomous Models All of the 1PL plots have a common slope. In the PARSCALE analysis, all of the slope parameters for the dichotomous statements were estimated at .788 with a standard error of .018. The slope parameter for the polytomous statements were estimated at .482 with a standard error of .001. With a common slope it may be difficult to discriminate between categories for the polytomous form of the data for those tasks whose slopes are underestimated in the 1PL. For instance, if there is a low b parameter one knows that most of the SMEs were in the upper categories and those categories do not differ greatly. If the slopes for such tasks were underestimated it would be more difficult to differentiate those categories. With a common slope one may encounter this situation whereas the 2PL model would allow better discrimination between the categories. Tables 4 and 5, in Appendix C, give the dichotomous and polytomous items prior to the removal of poorly fitting items. Tables 6 and 7, in Appendix C, give the dichotomous and polytomous items after the poorly fitting items were removed. The tables include the PARSCALE item identifier, the item number (original number on the questionnaire), b (location) parameter, standard error, 𝜒 2 , 𝑑𝑓, the calcuation of 𝜒 2 ⁄𝑑𝑓 , and the significance. The b parameter allows one to know the theta level where the item curves the most (where the item gives the most information). 𝜒 2 is related to the observed and expected frequencies. A high 𝜒 2 indicates that the observed and expected frequencies differ substantially. 𝑑𝑓 is the degrees of freedom. The division of 𝜒 2 ⁄𝑑𝑓 allows one to know whether the item fits the model (e.g., below 3). Finally, the significance is also 57 related to the observed and expected frequencies. If the item is significant it indicates that there is a significant difference between the observed and expected frequencies. In order to give a good indication of what happens when a polytomous item is dichotomized there are three item plots given as both polytomous and dichotomous. As can be seen in Figure 13 the dichotomized item 38 gives ICCs that are far to the left. Figure 13. 1PL Dichotomous ICC and IIC Plots for Item 38 This indicates that a majority of the SMEs have rated the task statement as having relevance to the job. The b parameter (location parameter) for this item is at -3.91. PARSCALE did not estimate 𝜒 2 ⁄𝑑𝑓 for this item due to the fact that the dichotomization caused the curve to be at a very low combined theta level (below where PARSCALE is plotting the data). In fact, there were 24 dichotomized items whose fit statistics were not estimated (PARSCALE did not calculate 𝜒 2 and 𝑑𝑓) because dichotomizing the items pushed the items below the plot range. The IIC for this curve is also very low. In other 58 words, the location where the item is giving the most information is below -3 and is not visible in this plot. Figure 14. 1PL Polytomous ICC and IIC Plots for Item 38 In Figure 14 one can see that for the polytomous item 38 there is a relatively low number of SMEs who indicated the the tasks does not apply (category 1). Also, the ICCs for categories 2 (Less than a Majority) and 3 (A Majority) are going in opposite directions. The b parameter for this item is -2.68 and the 𝜒 2 ⁄𝑑𝑓 is 1.03. The IIC peaks at a little below the -2 theta level. The low b parameter gives an indication that the SMEs rated the item in the upper categories. As one can see, the dichotomization of the item caused the item to have the greatest amount of information at a lower theta level. This is because the combined categories 2 and 3 causes the data to appear to have much stronger agreement. In other words, when categories 2 and 3 are combined one is combining the opinions of multiple 59 SMEs, which causes a lower theta level. This has the effect of giving the appearence that there is a large amount of agreement that the task is relevant. This will be discussed in more detail in the next section. The next plot is for the dichotomized item 58 ( Figure 15). It is a little bit more centralized and, as can be seen, there are several more individuals who felt that the tasks did not apply to the job (category 1). The IIC indicates that the curve is giving the most information at about the -1.3 theta level. The b parameter for the dichotomized item 58 is -1.23 and the 𝜒 2 ⁄𝑑𝑓 is 4.15, indicating that the item does not fit the data well. Figure 15. 1PL Dichotomous ICC and IIC Plots for Item 58 The next plot (Figure 16) is the polytomous plot for item 58. As can be seen, there were a fair amount of individuals who felt that this task statement did not apply to the job (category 1). At the lower levels of categories 2 and 3 one can see that the opinions flow 60 in the same direction. However, once the item reaches the threshold (the point where the items cross) category 2 begins to decrease while category 3 continues to increase. The b Parameter for this item is at -0.44, which gives a good indication of where the opinions cross (threshold). The centralized b parameter indicates that the ratings were spread across categories. Again it may be difficult to discriminate between the categories due to the common slope. The 𝜒 2 ⁄𝑑𝑓 is at 2.58, indicating that this item fits the data well. As can be seen in the IIC plot in Figure 16, the item is giving the most information at about the .5 opinion level. Figure 16. 1PL Polytomous ICC and IIC Plots for Item 58 The dichotomous plot for item 100 (Figure 17) shows curves that are a bit more centralized. One can see that there were a fair amount of individuals who did not think that this task was relevant to the job. There were also a fair amount of individuals who felt that the task had some degree of relevance to the job. This item has a b (location) parameter of -0.65, indicating where the curves slope the most. Also, the 𝜒 2 ⁄𝑑𝑓 for this 61 item is 3.11, indicating that it has good item fit. The IIC indicates that the majority of the information is given between opinion levels -2 and 2, and the IIC peaks at about -1.7. Figure 17. 1PL Dichotomous ICC and IIC Plots for Item 100 The polytomous plot for item 100 (Figure 18) shows that there were a fair number of individuals who felt that the tasks did not apply to the job. Similar to item 58, when opinions 2 and 3 reach threshold they begin to go in different directions. However, opinions 2 and 3 do flow in the same direction until they reach a higher theta level. The IIC for this item peaks at about 0. The majority of the information is given from about -3 to over 3. The b parameter for this item is at 0.13 and the 𝜒 2 ⁄𝑑𝑓 is 1.05, which indicates that the item fits the data. The centralized b parameter indicates that the rating were spread out across categories. 62 Figure 18. 1PL Polytomous ICC and IIC Plots for Item 100 When one compares the b parameters of the dichotomous and polytomous items they can see that the b parameters (where the items are centrally located) change when the items are dichotomized. Figure 19 shows a histogram of the change in b parameters when the tasks are changed from polytomous to dichotomous. The change in b parameters ranged from a minimum of -3.12 (decreasing) to a maximum of 1.69 (increasing). The mean score for the amount of change was -.034. In other words, the b parameters tended to decrease more than they increased. However, the IIC did tend to decrease when it was dichotomized. In other words, the point at which the item is giving the maximum amount of information (the peak of the IIC) was at a lower theta level for the dichotomous items, which fits with what is shown in the total information curves. 63 Figure 19. Change in b Parameters When Tasks are Dichotomized Out of the 166 items there were only 3 items which had an IIC that moved in the upward direction (based on where the item peaks). Item 63 moved from approximately -2 to approximately -1.8. Item 101 moved from approximately -3.4 to approximately -2.8 and item 261 moved from approximately -3.2 to approximately -2.8. However, the dichotomous versions of these tasks had bad fit (e.g., 𝜒 2 ⁄𝑑𝑓 >3). Therefore, based on this information it seems that, unless there is bad fit, the IIC moves to a lower theta level, which means that the dichotomous item would be interpreted differently based on the IIC. Further Exploration of the Rating Scale A final exploration of the categories was done using only the upper categories (“1. Less than a Majority” and “2. A Majority”) and excluding the lowest category (“0. Task Not Part of Job”). This was done to address the question of whether the upper categories alone would be sufficient to describe the task statements. As can be seen in the total 64 information plot, the upper two categories for the task statements peak at a higher theta level than the dichotomous form of the data in Figure 11 above where the upper categories were combined. This is because the combination of the upper categories in the previous dichotomous model caused the task statements to have a relatively low b parameter. Also, the information in this plot has a lot higher peak than the polytomous total information plot (Figure 12). This indicates that the item is giving information at a more specific level of theta. The information peak is not much different when comparing the upper levels combined (Figure 11) to the upper levels alone (Figure 20; with DNA removed). For instance, the information peak is a little higher when the upper levels are combined (Figure 11). Figure 20. 1PL Dichotomous Total Information Plot With Upper Categories Only The different versions of the items (item 38, 58, and 100) that were shown above in Figures 13 to 18 are also shown below in Figures 21 to 23. In this analysis, item 38 65 (Figure 21) has a b parameter of -0.91. This is much higher than the -3.91 in the dichotomous version above (Figure 13) with the combined upper levels. This version also has a 𝜒 2 ⁄𝑑𝑓 of 0.24, which indicates that the item has excellent fit. Also, with the task statement in this form, and with a moderately low b parameter, it is clear that there were a moderate amount of individuals who chose the middle category (e.g., Less than a Majority) and a large amount of individuals who chose the upper category (A Majority). Figure 21. 1PL Dichotomous ICC and IIC Plots for Item 38 With Upper Categories Item 58 (Figure 22) has a b parameter at 0.76. This is also higher than the b parameter of the dichotomous item 58 above (Figure 15). The 𝜒 2 ⁄𝑑𝑓 for this item is 1.43, which also indicates that it has good item fit. The b parameter indicates that there were a large amount of SMEs who rated the task in the middle category (Less than a Majority) and a moderate amount of SMEs who rated the task in the upper category (A Majoirity). 66 Figure 22. 1PL Dichotomous ICC and IIC Plots for Item 58 With Upper Categories Finally, item 100 (Figure 23) has a b parameter of 0.72. This is also higher than the b parameter of the dichotomous version of the item above (Figure 17). Item 100 has a 𝜒 2 ⁄𝑑𝑓 of 1.98, indicating that the item has good fit. The b parameter indicates that there were a larger amount of SMEs endorsing the middle category (Less than a Majority) than there were endorsing the upper category (A Majority). However, it would be important to note that this item also has a lot of SMEs rating it in the DNA category, which is shown in the dichotomization in Figure 18. The meaning of these findings will be discussed in more detail in the next chapter. 67 Figure 23. 1PL Dichotomous ICC and IIC Plots for Item 100 With Upper Categories Looking at the b parameter of these items and comparing it to the old dichotomous items it appears that all of these moved to a higher b parameter. The mean amount of increase was 1.48. There were no items which had a b parameter above 3 or below -3 theta level. This indicates that one would get a better visual description of the item when analyzing the upper categories alone. In other words, the curve of the item would be viewable on the plot, because the plot is between -3 and 3. Also, with the b parameter moving in an upward direction the IIC for each of the items would move in an upward direction, which would indicate that the task statements for the upper levels alone (when compared to the lower and combined upper levels) are giving the maximum amount of information at a higher theta level. When comparing the dichotomous version (upper levels alone with DNA removed) to the polytomous version, one can see that the polytomous version gives an idea of how the item was rated across the full rating scale based on the b parameter, 68 however, the dichotomous version allows one to gain a better understanding of how just the upper levels were rated. 69 Chapter 6 DISCUSSION This study looked at the difference between dichotomous and polytomous task statements when using IRT, and looked at whether one needs to be cautious when dichotomizing task statements. One thing that stood out immediately was that there were several task statements which completely lost variance when they were dichotomized (e.g., all items data points became a 1). This indicates that everyone in the sample thought that these specific tasks statements had at least some relevance to the job position (e.g., no one chose “0. Task Not Part of Job”). Therefore, immediately after dichotomizing, some of the most relevant variables to the job position were not usable, because they no longer had variance. There appear to be certain benefits and disadvantages to the dichotomization of questionnaire data. When one looks at the full data plots they can see that the polytomous items give information across all theta levels, while the dichotomous items give information at the lower theta levels. However, even though the dichotomous items do not cover the entire spectrum of theta levels they do have a higher peak, which indicates a greater amount of information at a lower, more specific, theta level. Often in occupational testing, when one is establishing a specific pass/fail point for an examination, one would want the problems to be at a specific theta level, because it allows one to know that they are targeting the right ability range. However, one might question whether the “target” theta should be more specific with task statements. 70 As can be seen in the plots from the Results section one loses a certain amount of relevant data when one dichotomizes (e.g., combining the upper categories). With Item 38 (Figure 13), when there is a high amount of individuals who feel the tasks are relevant to the job, the combination of categories 2 (Less than a Majority) and 3 (A Majority) pushes the ICC to a much lower theta level. It also gives the impression that there was almost 100% agreement. In other word, the b parameter (location parameter) is really low (-4.69) indicating that the majority of the SMEs marked the item as having at least some level of relevance to the job. While it is true that the majority of the individuals chose categories 2 and 3 and the combination of the two categories would give a curve that starts at a low b parameter, it does not account for the fact that there was a split between categories 2 and 3. The polytomous form of the data (using the 1PL model) does not allow one to discriminate between categories. One would need to look at the upper levels with DNA removed to get a clearer picture of the upper categories. One can also see this in item 58 (Figure 16). In this item, categories 2 and 3 are flowing together at certain points. However, the dichotomization of the variable does not account for the fact that categories 2 and 3 begin to go in different directions (e.g., after they cross the threshold). As was mentioned in the Results section, if one simply wanted to answer whether the task is part of the job then the dichotomous form of the data (with the upper levels combined) would be sufficient to answer this question. Item 100 is a bit different. With the polytomous form of item 100, categories 2 and 3 are going in the same direction at a higher theta level. This indicates that there are a fair amount of SMEs rating this item in the DNA category. One should note how many 71 SMEs are in each category. If there are a few SMEs in one of the categories then combining the levels may not have an effect on the curve (Bond & Fox, 2007). Another difference between the dichotomous and polytomous task statements is in the IIC. The IIC indicates where the curves are giving the most information. As can be seen, the dichotomous data has a sharper peak on all of the dichotomous plots, which indicates that the data are giving information at a more specific theta range. Whereas, the IIC for the polytomous data are broader, indicating that the items cover a larger theta range. In testing, when one wants to have a specific pass/fail point, it is preferable to have an IIC which is a bit more narrow, because it allows one to know at what ability level the data are functioning the best. However, this study is not based on right and wrong answers it is based on educated opinion. One could argue that there are opinions that are more correct than others (e.g., items which require extensive job experience), but all of the SMEs have their own unique view point based on their experience or previous experience in the position. In other words, certain individuals may have had experiences that have led them to believe that certain tasks are of higher or lower importance. Therefore, it is hard to say whether a narrow or broad IIC is preferable for tasks related data. Finally, if one looks at the dichotomous (with combined upper levels) and polytomous IICs for the same problem one can see that the dichotomous IICs consistently peak at a lower level than the polytomous IICs. This occurs because when one combines categories one is creating what appears to be a higher level of agreement for the newly 72 dichotomized category. Therefore, when the data are dichotomized it gives the impression that there is a higher level of agreement than there actually is. Implications When one dichotomizes task statements he or she may lose some of the statements, depending on the sample size, due to a loss of variance. All of the individuals in a sample may feel that certain task statements are relevant to a job position. If this is the case, then the variable would be 100% the same (e.g., all 1s) when it is dichotomized. By losing this data one is losing variables which are highly critical to the job position. For instance, all of the variables that were not used in this study, due to lack of variability, after dichotomizing, were high on the criticality index. When one is losing highly critical data they are losing the data that is most relevant to the job. However, these task statements could be analyzed by looking at the upper levels alone (e.g., with DNA removed). As was mentioned before, one should be cautious when dichotomizing opinion related statements. When one dichotomizes the task statements they are changing the look and meaning of the statements. Often, the polytomous choices are going in different directions. For instance, one could interpret the polytomous version of Item 38 (Figure 14) as saying that there are a low level of SMEs who consider the task to be irrelevant to the job; there is a moderate amount of SMEs who consider the task to be performed by less than a majority of the individuals entering the job; and there is a high number of SMEs who feel that the task would be performed by a majority of the people in this job. It does seem that one should also be cautious when using the polytomous form of the data with a 1PL model. The b parameter gives a good indication of how the item was 73 rated. For instance, a low b parameter would indicate that a majority of the SMEs rated the upper categories. However, with a common slope one is not able to discriminate between the categories as much as one might with the 2PL model. For example, the low b parameter alone does not allow one to understand the degree of endorsement of the upper categories. The 1PL model allows one to know where the categories are centrally based on the continuum, which gives a good indication of how the tasks were rated, but it may not allow one to clearly understand the degree of discrimination between the categories. One might also question the appropriateness of the M-GRM with a DNA category on a 3-point scale. One might question whether the DNA category should be considered a zero point along a continuum or whether DNA should be off the continuum. If the DNA category is endorsed the SME is essentially saying the task does not have a degree of relevance to the job. If the DNA category is considered to be off the continuum then the M-GRM may not be appropriate for the polytomous form of the data, (e.g., since it is not “ordered categorical”; Embretson & Reise, 2000, p. 97). When one dichotomizes the item it can be interpreted in the following way: there is a low number of individuals who consider this tasks to be irrelevant to the job; there is a high level of agreement that this task has some form of relevance to the job. Essentially, when one is dichotomizing by combining the upper levels they are losing the specifics of how the data is flowing. However, they are answering the question of whether the task does or does not have some form of frequency. With the final analysis the items were analyzed with the upper levels (“2. Less Than a Majority” and “3. A Majority”) in isolation. Item 38 (Figure 21) could be 74 interpreted as follows: a majority of the SMEs felt that this task would be performed by a majority of the people in the job. The upper categories alone can be interpreted based on where the b parameter is located. If it has a low b parameter it indicates that most of the SMEs chose 3. A Majority. If the task statement has a high b parameter it indicates that most of the SMEs chose 2. Less Than a Majority. One should use caution when analyzing the upper levels alone, because if there are a lot of SMEs who chose “Task Not Part of Job” then one would not get the full picture of how the task statement was rated. If there are a small amount of SMEs who chose the lower category then the data could be analyzed in this manner. However, it may be useful to dichotomize the data with does not apply as 0 and all other categories as 1 and by using the upper categories alone. This would allow one to see whether the SMEs have rated the task statement as being part of the job and it would allow one to see the degree to which it is part of the job. It does seem that using the polytomous form of the data with a 1PL model would not give as clear of a picture as the two dichotomous forms of the data. The b parameter, with the data in polytomous form, gives an idea of how the task was rated, but it does not give a clear description of how the upper categories were rated. The dichotomous form of the data with the upper levels combined answers whether or not the task is part of the job. This is answered by the polytomous form of the data. However, the dichotomous form of the data with the upper categories alone (e.g., DNA removed) allows one to gain a greater understanding of how the upper categories were endorsed (e.g., did more SMEs select A 75 Majority or Less than a Majority). This question is not answered by the polytomous form of the data. Limitations One of the limitations of this study was the moderate sample size (e.g., 544 individuals). It stands to reason that if there were a larger sample size (e.g. 1000) then there would be a greater chance that the individuals in the sample would have collectively chosen the full range of categories (e.g., “1. Task Not Part of Job”; “2. Less Than a Majority”; or “3. A Majority”) for each item. If this were the case, variables would not be lost when dichotomizing. However, if 544 individuals felt that certain tasks are part of the job then increasing the sample may not increase the variability by much (e.g., a large majority would be a 1 when the variable is dichotomized). Chuah, Drasgow, and Luecht (2006) “simulated responses of 300, 500, and 1,000 respondents” (p. 241). They found that 300 respondents were sufficient to estimate “ability” (Chauh et al., 2006, p. 241). Based on this, it seems that the data in this study may be considered as at least a moderate sample size for an IRT analysis. This study may have had limited generalizability. There were approximately 80 percent male SMEs in this study. If there is not a good mix of male and female SMEs it may reduce the external validity of the study. However, this may reflect a common population in a correctional facility. In other words, there may be more males who are in and who apply for the Correctional Officer position. Another factor which may have contributed to a reduction of accuracy is that there were three different entry-level job positions (Correctional Officers, Juvenile Officers, 76 and Juvenile Counselors). These were tasks that were common to all of the positions. However, it stands to reason that certain tasks are performed more by people in one job title than those in another. If the data were specifically from one job position the accuracy of the IRT analysis would likely increase. The corrections agency originally looked at the positions separately. In their analysis they were able to identify the tasks that were rated similarly and those that were rated differently. The current study would benefit from being able to analyze the positions separately. However, the sample size would end up being too small for an IRT analysis. A final limitation is that with the 1PL model is not able discriminate between categories with the polytomous data. For instance, it is not as clear how the upper categories were endorsed with the polytomous form of the data. Therefore, the data is not as clear with the 1PL model. Future Research There are several directions one could go in doing future research. One could use a larger sample size to see if variables are lost when dichotomizing. If one has a larger sample size it is quite possible that they would not lose any variables when dichotomizing. One could also use different job classifications to see if these results are generalizable to other jobs. One could also see how different classifications of SMEs (e.g. Correctional Officers vs. Juvenile Officers) rate the tasks and how the results differ under dichotomous and polytomous variables. For instance, do the supervisors rate the tasks differently than the individuals who are in the job? If so, why do they rate the tasks differently? 77 One could ask what theta means for the individual. At times theta may be a bit difficult to define, especially if there are a lot of variables included in the factor. How does one define theta for a job analysis? As was explained above “the ‘difficulty’ of an item is defined in terms of the amount of the general work activity (GWA) construct (𝜃) that would be needed to produce a given level of item endorsement” (Harvey, 2003, p. 2). Therefore, one might suggest that theta, for a job analysis, is based on the individuals knowledge of the job position. So, the characteristic that one understands through IRT is the amount of knowledge that an individual has about the tasks carried out on the job. Also, in relation to the last question (e.g., differences in classifications), if there are differences in classifications then which classification should be given more weight? Future research could also look at a comparison of the 1PL and 2PL models. One might ask whether having both a location and a discrimination parameter would change the ICC plots for the poltytomous data. Also, would this allow to one to understand the task statements without using a dichotomization? One could also ask what type of information they are receiving from the SMEs and if certain SMEs are giving less reliable information. For instance, is the information of an individual who is outside of the job, who is supposed to be knowledgeable of the job, similar to someone who is actually in the job position? Also, which information would be considered to be the most accurate? These and other questions could be looked at with future research. 78 APPENDICES 79 APPENDIX A Task Statements in Final Analysis Emergencies 36. Clean up contaminated or hazardous material. 37. Conduct emergency and disaster drills. 38. Control hostile groups, disturbances, and riots. 39. Dispatch help in emergencies or disturbances within or outside the facility. 43. Implement emergency procedures/disaster plan. 46. Report emergencies. 47. Respond to disturbances or emergencies within or outside the facility. 49. Negotiate hostage release. Escort, Move, Transportation 50. Escort medical professionals who are providing medical services to wards/inmates. 51. Escort vehicle(s) during emergency and/or high security transport. 52. Escort ward/inmates within and outside the facility. 53. Evaluate wards’s/inmate’s potential security risk prior to transport. 54. Inform central control of ward/inmate movement. 55. Monitor all individuals and vehicle movement inside, outside and in the immediate area of the facility. 56. Process vehicles entering, leaving or within the facility. 58. Prepare wards/inmates for transportation to court, hospital, etc. 59. Transport equipment, supplies and evidence. 60. Transport injured wards/inmates. 61. Transport wards/inmates individually and in groups outside the facility. 80 62. Use ward/inmate daily movement sheet. 63. Issue passes/ducats to wards/inmates. 65. Move wards/inmates in and out of cells. General Duties 66. Assign jobs to wards/inmates. 67. Attend staff meetings. 68. Attend training. 69. Clean areas of the facility when wards/inmates are not available. 70. Consult with supervisors. 71. Develop proposals for program, facility or policy improvements. 72. Exchange ward/inmate linens and clothing. 75. Distribute mail, supplies, meals, commissary items, equipment, etc. 77. Approve or disapprove special purchases for wards/inmates. 78. Confiscate and replace damaged ward/inmate linens and clothing. 80. Operate a vehicle or bicycle. 83. Order supplies. 85. Prepare meals. 86. Process law library requests and library books. 89. Raise/lower flag. 91. Report food shortages. 93. Serve meals. 94. Tour other facilities. 95. Give instructions to staff. 96. Observe the work of facility staff through peer review. 81 98. Process ward/inmate grievances and complaints. 99. Respond to ward/inmate questions or requests. 100. Train correctional staff. 101. Instruct wards/inmates. 102. Observe blind spots using a curved mirror. 103. Monitor wards/inmates and facility using closed circuit television systems. 104. Operate communication equipment. 105. Operate safety equipment. Health and Medical 110. Identify the immediate need for medical treatment. 111. Prepare injured individuals for transport. 112. Report changes in ward/inmate physical, mental and emotional condition. 113. Screen wards/inmates to determine if medical/mental health attention is needed before intake/booking. 116. Verify that wards/inmates receive food for special diets. 118. Decontaminate wards/inmates after use of chemical agent. 119. Comply with Prison Rape Elimination Act guidelines. 120. Implement safety/heat precautions for wards/inmates on psychotropic medications. Investigation 123. Assist police in their investigation of crimes. 124. Develop ward/inmates informats. 125. Gather information for disciplinary proceedings. 129. Implement ward/inmate due process procedures. 130. Interview wards/inmates as part of an investigation. 82 131. Investigate accidents or crimes that occur within the facility. 132. Investigate disciplinary reports. 133. Investigate ward/inmate injuries. 137. Process evidence. Oral Communication 138. Alert staff members of ward/inmate behavior changes. 139. Answer phone calls. 143. Communicate with external departments. 144. Confer with staff, specialists and others regarding wards/inmates. 145. Explain institutional policies, procedures and services to wards/inmates. 146. Follow oral instructions. 147. Give oral instructions and reports. 148. Inform relief staff of facility events during shift change. 149. Inform visitors and staff of facility facts, policies and procedures individually or in groups. 150. Notify supervisors of potential emergencies/hazards. 151. Notify wards/inmates of visitors. 152. Translate foreign languages into English. 153. Use radio codes to communicate with staff. Read, Review, and Analyze 156. Interpret common street terminology. 157. Interpret Department of Justice (DOJ) criminal history reports. 83 158. Read written and/or electronic documents. 159. Review forms and documents for accuracy and completeness. 161. Review ward/inmate case files. Referrals 162. Advocate for urgent services for wards/inmates. 163. Identify wards/inmates in need of medical or psychiatric care. 164. Make appropriate referrals. 165. Obtain assistance for wards/inmates in need of medical, dental or psychiatric care. Search 167. Dispose of contraband. 169. Perform a contraband watch. 170. Search individuals, property, supplies, areas and vehicles. Security 171. Account for facility keys. 172. Activate personal and/or control center alarm. 178. Compare fingerprints/palmprints to verify identification of wards/inmates. 180. Conduct metal detection screening of visitors. 182. Inspect food for contamination and/or tampering. 183. Inspect and document vehicle safety and operating condition. 185. Log weapons/guns in and out. 186. Monitor all persons entering, leaving and within the facility. 188. Monitor the zone control panel. 84 189. Operate/secure gates, doors, locks and sallyports. 191. Process wards/inmates leaving a security area. 192. Protect the security of courtrooms, hospitals and other external locations when wards/inmates are present. 193. Report ward/inmate count discrepancies. 195. Sign in and out of the facility. 196. Test all equipment to ensure proper functioning. 197. Update count of visitors entering and leaving the facility. 198. Account for location and status of wards/inmates. 199. Verify identification badges and passes. 201. Verify ward/inmate count. 202. Verify ward/inmate identity. 204. Admit and release visitors. 205. Issue identification badges and passes. 206. Screen visitors against approved visitor list and enforce visiting dress code. 207. Use stamp and black light to identify visitors. 208. Maintain confidentiality of information. 209. Account for location and status of staff within and outside the facility. Supervision of Non-inmates 210. Conduct facility towards. 211. Escort contract workers, non-custody staff and visitors within the facility. 213. Supervises visitors in contact and non-contact visits. 85 214. Account for location and status of visitors. Supervision of Wards/Inmates 216. Arrange daily schedules of wards/inmates. 217. Identify wards/inmates with disabilities and assist them. 218. Assist wards/inmates with paperwork/schoolwork. 221. Encourage wards/inmates through positive feedback. 222. Evaluate wards/inmates. 223. Higher wards/inmates for work detail. 224. Identify gang affiliation and implement processing procedures. 225. Identify homosexual behavior. 229. Intervene in ward/inmate disputes to deescalate a potentially violent conflict. 231. Maintain ward/inmate discipline. 234. Monitor ward/inmate activity. 235. Monitor ward/inmate phone calls. 236. Monitor wards/inmates for signs of alcohol or drug use/abuse and document any issues. 237. Monitor wards/inmates in safety cell, sobering cells, crisis rooms/center or restraints. 239. Obtain and process urine samples. 240. Obtain wards/inmates signature on forms. 242. Plan on and off-grounds activities for wards/inmates. 243. Prevent unauthorized ward/inmate communication. 86 244. Recommend ward/inmate work assignments. 245. Supervise ward/inmate cell and area moves. 248. Implement suicide watch procedures. Written Communication 261. Complete paperwork and forms. 263. File and retrieve documents and record system. 264. Notify housing units of wards/inmates scheduled for release or transfer. 265. Notify sender and receiver of the confiscation of contraband. 266. Prepare a list of wards/inmates going to court. 268. Process deceased inmates. 269. Process ward/inmate money. 270. Record all phone calls placed to or by wards/inmates in a log. 271. Document changes in wards'/inmates' mental and physical condition. 273. Document inspections and security checks. 274. Document persons and vehicles entering and leaving the facility. 275. Document the condition or security of printer structures, weapons or equipment. 276. Document Ward/inmate injuries. 278. Document Ward/inmate movement and activities. 279. Document ward/inmate rule violations. 280. Document ward/inmate trust account information. 281. Document ward/inmate visits. 282. Document whether ward/inmate takes or refuses medication or food. 87 283. Release property or money to transferred, released or paroled ward/inmate. 285. Requests repairs to facility and/or equipment. 289. Update list of approved ward/inmate visitors. 290. Update logs, documents, records and files. 88 APPENDIX B How Data were Converted and Run in PARSCALE This guide indicates how the data were converted and ran in PARSCALE. It is not meant to be a comprehensive guide. It is simply the step-by-step process that was done in order to run the data in this project. Some of these steps would not be the same if one were looking at how tests questions are functioning (e.g., recoding would be different). The data were run in SPSS Version 16.0. Step 1: Recode the variables into 1s and 0s. It should be mentioned that this step would not apply to the polytomous data. The following steps are taken in SPSS to recode: Transform > Recode into Same Variables > Select Variables to be Recoded > Click Old and New Values > Enter Old Value as 2 and the New Value as 1 > Add > Continue > OK. This will change all of the 2s to 1, which makes the data dichotomous. Step 2 : Recode the blanks into 9s. The following steps are taken in SPSS to change the blanks: Transform > Recode into Same Variables > Select Variables to be Recoded > Click Old and New Values > Click System Missing > Enter New Value 9 >Add > Continue > OK. Step 3: The following syntax was used to create the data file: 89 write outfile='f:practice4_dichotomous4.dat' /idcode (a12,1x) EM_A036 EM_A037 EM_A038 EM_A039 EM_A043 EM_A046 EM_A047 EM_A049 ES_A050 ES_A051 ES_A052 ES_A053 ES_A054 ES_A055 ES_A056 ES_A057 ES_A058 ES_A059 ES_A060 ES_A061 ES_A062 ES_A063 ES_A065 GD_A066 GD_A067 GD_A068 GD_A069 GD_A070 GD_A071 GD_A072 GD_A075 GD_A077 GD_A078 GD_A080 GD_A083 GD_A085 GD_A086 GD_A089 GD_A091 GD_A093 GD_A094 GD_A095 GD_A096 GD_A098 GD_A099 GD_A100 GD_A101 GD_A102 GD_A103 GD_A104 GD_A105 HM_A110 HM_A111 HM_A112 HM_A113 HM_A116 HM_A118 HM_A119 HM_A120 IN_A123 IN_A124 IN_A125 IN_A129 IN_A130 IN_A131 IN_A132 IN_A133 IN_A137 OC_A138 OC_A139 OC_A143 OC_A144 OC_A145 OC_A146 OC_A147 OC_A148 OC_A149 OC_A150 OC_A151 OC_A152 OC_A153 RR_A156 RR_A157 RR_A158 RR_A159 RR_A161 RF_A162 RF_A163 RF_A164 RF_A165 SR_A167 SR_A169 SR_A170 SC_A171 SC_A172 SC_A178 SC_A180 SC_A182 SC_A183 SC_A185 SC_A186 SC_A188 SC_A189 SC_A191 SC_A192 SC_A193 SC_A195 SC_A196 SC_A197 SC_A198 SC_A199 SC_A201 SC_A202 SC_A204 SC_A205 SC_A206 SC_A207 SC_A208 SC_A209 SN_A210 SN_A211 SN_A213 SN_A214 SI_A216 SI_A217 SI_A218 SI_A221 SI_A222 SI_A223 SI_A224 SI_A225 SI_A229 SI_A231 SI_A234 SI_A235 SI_A236 SI_A237 SI_A239 SI_A240 SI_A242 SI_A243 SI_A244 SI_A245 SI_A248 WC_A261 WC_A263 WC_A264 WC_A265 WC_A266 WC_A268 WC_A269 WC_A270 WC_A271 WC_A273 WC_A274 WC_A275 WC_A276 WC_A278 WC_A279 WC_A280 WC_A281 WC_A282 WC_A283 WC_A285 WC_A289 WC_A290 (166(n1)). exe. The same syntax was used for the dichotomous and polytomous data. The only difference was that the filename was changed. The syntax starts with the filename followed by the ID code. The a12 indicates that the ID code is alphanumeric and contains 12 spaces. This is followed by the variable names that were assigned. 230(n1) indicates that there are 230 variables. Finally, the syntax ends with an execute command (exe). Step 4: Create a default file for the non-answered portions of the test. A default file can be created in Notepad. This file should have the same format as the original file. In other words, there should be the same amount of spaces for your ID code and variables that there are in the original data file. The default file contains only one 90 line. Start the default file with an ID code space (e.g., 000000000001). Notice there are 12 numbers in the ID code. Follow this by a space and input as many 9s as there are variables (e.g., 230 variables equals 230 9s). Step 5: Create the syntax to run the data in PARSCALE. The syntax below was used to analyze the data in PARSCALE. This was after finding good fit and rerunning the data with only 136 task statements: Polytomous analysis >COMMENTS Final analysis of the questionnaire using polytomous data. >FILE OFNAME='MissDichPol.txt',DFNAME='practice4_polytomous4.dat', SAVE; >SAVE PARM='polytomous.par', SCORE='polytomous.SCO', FIT='polytomous.fit'; >INPUT NIDCH=12, NTOTAL=166, NTEST=1, LENGTH=166; (12A1,1x,166A1) >TEST1 TNAME=POLYTOTAL, INAME=( a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,a17,a18,a19, a20,a21,a22,a23,a24,a25,a26,a27,a28,a29,a30,a31,a32,a33,a34,a35,a36,a37,a38, a39,a40,a41,a42,a43,a44,a45,a46,a47,a48,a49,a50,a51,a52,a53,a54,a55,a56,a57, a58,a59,a60,a61,a62,a63,a64,a65,a66,a67,a68,a69,a70,a71,a72,a73,a74,a75,a76, a77,a78,a79,a80,a81,a82,a83,a84,a85,a86,a87,a88,a89,a90,a91,a92,a93,a94,a95, a96,a97,a98,a99,a100,a101,a102,a103,a104,a105,a106,a107,a108,a109,a110,a111, a112,a113,a114,a115,a116,a117,a118,a119,a120,a121,a122,a123,a124,a125,a126, a127,a128,a129,a130,a131,a132,a133,a134,a135,a136,a137,a138,a139,a140,a141, a142,a143,a144,a145,a146,a147,a148,a149,a150,a151,a152,a153,a154,a155,a156, a157,a158,a159,a160,a161,a162,a163,a164,a165,a166), NBLOCK=1; >BLOCK1 BNAME=POLY, NITEMS=166, NCAT=3, ORIGINAL=(0,1,2), MODIFIED=(1,2,3), CADJUST=0.0, CNAME=(none,perform,mjperform); >CALIB GRADED, LOGISTIC, SCALE=1.7, NQPTS=15, CYCLES=(50,1,1,1,1), NEWTON=2, CRIT=0.01, ITEMFIT=10, TPRIOR,SPRIOR,CSLOPE; >SCORE MLE, SMEAN=0.0, SSD=1.0, NAME=MLE, PFQ=5; Dichotomous analysis >COMMENTS Final analysis of the questionnaire using dichotomous data. >FILE DFNAME='practice4_dichotomous4.dat',OFNAME='MissDichPol.dat',SAVE; >SAVE PARM='dichotomous.par', SCORE='dichotomous.SCO', FIT='dichotomous.fit'; >INPUT NIDCH=12, NTOTAL=166, NTEST=1, LENGTH=166; (12A1,1x,166A1) >TEST1 TNAME=Form291, INAME=( a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16, a17,a18,a19,a20,a21,a22,a23,a24,a25,a26,a27,a28,a29,a30,a31,a32, a33,a34,a35,a36,a37,a38,a39,a40,a41,a42,a43,a44,a45,a46,a47,a48,a49, a50,a51,a52,a53,a54,a55,a56,a57,a58,a59,a60,a61,a62,a63, 91 a64,a65,a66,a67,a68,a69,a70,a71,a72,a73,a74,a75,a76,a77, a78,a79,a80,a81,a82,a83,a84,a85,a86,a87,a88,a89,a90,a91, a92,a93,a94,a95,a96,a97,a98,a99,a100,a101,a102,a103,a104, a105,a106,a107,a108,a109,a110,a111,a112,a113,a114,a115,a116, a117,a118,a119,a120,a121,a122,a123,a124,a125,a126,a127,a128, a129,a130,a131,a132,a133,a134,a135,a136,a137,a138,a139,a140, a141,a142,a143,a144,a145,a146,a147,a148,a149,a150,a151,a152, a153,a154,a155,a156,a157,a158,a159,a160,a161,a162,a163,a164, a165,a166), NBLOCK=1; >BLOCK BNAME=DICH, NITEMS=166, NCAT=2, ORIGINAL=(0,1), MODIFIED=(1,2), CADJUST=0.0, CNAME=(none,perform); >CALIB GRADED, LOGISTIC, SCALE=1.7, NQPTS=15, CYCLES=(50,1,1,1,1), NEWTON=2, CRIT=0.01, ITEMFIT=10, TPRIOR, SPRIOR,CSLOPE; >SCORE MLE, SMEAN=0.0, SSD=1.0, NAME=MLE, PFQ=5; One thing worth mentioning is that the difference between the 1PL and the 2PL model is that the 1PL model has the CSLOPE (common slope) command and the 2PL model does not have the CSLOPE. Step 6: Run the data through all of the phases to get your results. 92 APPENDIX C Polytomous Data for 1PL Model Before Removal of Poorly Fitting Items Table A1 Item Number 36 37 38 39 40* 41* 43 44* 45* 46 47 48* 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64** 65 66 67 68 69 70 71 72 b Parameter (Location) -0.36 -1.05 -2.81 -0.74 -0.85 -0.12 0.90 0.77 -1.13 -3.52 -3.19 -1.37 3.07 -0.96 -0.25 -2.50 -0.63 -3.26 -1.40 -0.50 1.33 -0.35 -0.32 -1.02 -0.22 -2.92 -2.85 -3.61 -3.72 -0.12 -2.02 -4.41 -1.23 -4.08 1.14 -1.57 SE 0.16 0.15 0.21 0.14 0.20 0.22 0.17 0.15 0.21 0.22 0.21 0.19 0.18 0.17 0.21 0.18 0.17 0.22 0.19 0.22 0.18 0.18 0.18 0.21 0.19 0.19 0.22 0.28 0.27 0.15 0.18 0.27 0.18 0.28 0.15 0.17 𝜒2 41.23 19.52 9.46 34.69 79.13 181.52 30.90 130.54 100.01 15.31 15.98 114.90 26.03 29.81 28.43 5.48 20.95 8.45 24.20 54.09 36.98 41.71 31.99 32.00 32.35 3.34 17.81 24.31 16.30 32.20 24.01 9.21 37.55 20.16 25.11 31.06 𝜒 2 ⁄𝑑𝑓 df 17 16 7 16 16 17 18 18 15 7 7 15 18 16 17 9 17 7 15 17 18 17 17 16 17 7 7 7 7 17 12 6 15 6 18 15 2.43 1.22 1.35 2.17 4.95 10.68 1.72 7.25 6.67 2.19 2.28 7.66 1.45 1.86 1.67 0.61 1.23 1.21 1.61 3.18 2.05 2.45 1.88 2.00 1.90 0.48 2.54 3.47 2.33 1.89 2.00 1.53 2.50 3.36 1.40 2.07 Sig 0.001 0.242 0.221 0.004 0.000 0.000 0.030 0.000 0.000 0.032 0.025 0.000 0.099 0.019 0.040 0.792 0.228 0.294 0.062 0.000 0.005 0.001 0.015 0.010 0.014 0.853 0.013 0.001 0.022 0.014 0.020 0.161 0.001 0.003 0.122 0.009 93 73* 74* 75 76* 77 78 79** 80 82* 83 84** 85 86 87* 88** 89 91 93 94 95 96 97* 98 99 100 101 102 103 104 105 106* 107* 108** 110 111 112 113 114* 115* 116 118 119 120 121** -0.87 -2.53 -3.43 2.65 1.91 -1.71 -2.86 -1.71 1.70 -0.76 0.70 2.20 1.74 -1.78 -1.70 1.48 0.39 -0.90 1.60 -0.63 1.69 0.35 0.49 -2.84 0.25 -4.23 -1.65 -0.73 -4.08 -2.58 0.19 -0.53 -0.61 -1.58 0.24 -2.02 2.48 -0.96 1.13 -0.43 -2.84 -2.01 -1.95 2.94 0.14 0.23 0.24 0.16 0.16 0.18 0.21 0.16 0.13 0.17 0.17 0.16 0.16 0.18 0.19 0.17 0.15 0.16 0.17 0.17 0.16 0.14 0.15 0.22 0.16 0.26 0.16 0.16 0.26 0.16 0.31 0.24 0.18 0.17 0.15 0.19 0.16 0.20 0.13 0.17 0.20 0.16 0.17 0.15 88.31 36.31 23.31 143.44 33.40 21.94 22.09 19.47 146.49 35.00 63.57 37.65 22.78 54.59 44.39 42.00 31.76 36.48 20.25 26.67 26.69 71.15 45.58 10.41 23.61 9.62 16.01 14.66 5.96 8.42 370.70 177.82 54.69 26.14 32.53 35.85 46.96 131.05 101.03 36.17 7.71 17.42 37.71 59.04 16 8 7 18 18 13 7 13 18 16 18 18 18 13 13 18 17 16 17 17 18 17 17 7 17 6 13 16 6 7 17 17 17 15 17 12 18 16 18 17 7 12 12 18 5.52 4.54 3.33 7.97 1.86 1.69 3.16 1.50 8.14 2.19 3.53 2.09 1.27 4.20 3.41 2.33 1.87 2.28 1.19 1.57 1.48 4.19 2.68 1.49 1.39 1.60 1.23 0.92 0.99 1.20 21.81 10.46 3.22 1.74 1.91 2.99 2.61 8.19 5.61 2.13 1.10 1.45 3.14 3.28 0.000 0.000 0.002 0.000 0.015 0.056 0.003 0.109 0.000 0.004 0.000 0.004 0.199 0.000 0.000 0.001 0.016 0.003 0.261 0.063 0.085 0.000 0.000 0.166 0.130 0.140 0.248 0.550 0.428 0.296 0.000 0.000 0.000 0.036 0.013 0.000 0.000 0.000 0.000 0.004 0.358 0.134 0.000 0.000 94 123 124 125 126** 127* 128* 129 130 131 132 133 134* 135* 136* 137 138 139 141** 142** 143 144 145 146 147 148 149 150 151 152 153 154* 155* 156 157 158 159 160* 161 162 163 164 165 167 168** 2.42 0.64 -0.77 -1.27 -1.49 -2.00 0.02 -0.31 0.36 1.00 -0.44 -2.04 -1.71 -2.10 -1.07 -3.63 -4.40 0.98 -3.93 0.73 -2.33 -3.29 -4.65 -4.04 -5.05 -1.17 -3.60 -2.39 1.47 -4.75 -1.76 -1.46 -1.58 1.93 -1.60 -1.47 1.34 -0.39 0.55 -1.95 -1.25 -2.02 -3.21 -4.36 0.17 0.17 0.17 0.23 0.26 0.23 0.15 0.19 0.18 0.17 0.18 0.23 0.26 0.26 0.19 0.24 0.26 0.20 0.25 0.16 0.19 0.22 0.31 0.27 0.36 0.16 0.23 0.19 0.20 0.31 0.15 0.16 0.17 0.16 0.16 0.16 0.14 0.17 0.14 0.22 0.22 0.21 0.21 0.29 41.94 30.34 34.37 57.43 115.12 60.23 48.89 38.11 41.41 40.67 59.26 74.79 69.20 73.03 41.50 12.36 9.65 64.55 12.04 35.65 12.12 7.79 5.01 12.72 7.66 35.01 14.36 17.20 56.72 12.67 74.93 79.22 19.05 18.53 37.18 29.67 98.21 28.33 39.83 36.26 28.58 24.14 14.64 22.61 18 17 16 15 15 12 17 17 17 18 17 12 13 12 16 7 6 18 6 18 10 7 6 6 5 15 7 10 18 6 13 15 15 18 15 15 18 17 17 12 15 12 7 6 2.33 1.78 2.15 3.83 7.67 5.02 2.88 2.24 2.44 2.26 3.49 6.23 5.32 6.09 2.59 1.77 1.61 3.59 2.01 1.98 1.21 1.11 0.83 2.12 1.53 2.33 2.05 1.72 3.15 2.11 5.76 5.28 1.27 1.03 2.48 1.98 5.46 1.67 2.34 3.02 1.91 2.01 2.09 3.77 0.001 0.024 0.005 0.000 0.000 0.000 0.000 0.002 0.001 0.002 0.000 0.000 0.000 0.000 0.000 0.089 0.139 0.000 0.061 0.008 0.277 0.351 0.544 0.047 0.175 0.003 0.045 0.070 0.000 0.048 0.000 0.000 0.211 0.421 0.001 0.013 0.000 0.041 0.001 0.000 0.018 0.020 0.041 0.001 95 169 170 171 172 173** 175** 176** 177** 178 179* 180 181* 182 183 184* 185 186 187* 188 189 190** 191 192 193 194** 195 196 197 198 199 200** 201 202 203* 204 205 206 207 208 209 210 211 213 214 -1.42 -3.94 -3.45 -3.91 -0.88 -4.52 -4.82 -3.48 3.18 -1.10 -0.68 -5.40 0.70 -0.60 -1.18 -0.16 -1.40 1.93 0.68 -2.62 -3.33 -2.43 -0.36 -2.97 -3.55 -5.43 -3.43 -0.61 -4.46 -2.40 -4.81 -4.29 -4.75 -0.02 -0.37 -0.25 -0.51 0.13 -2.98 -1.56 1.55 -0.10 -0.78 -1.05 0.17 0.26 0.21 0.25 0.16 0.32 0.37 0.22 0.18 0.15 0.20 0.42 0.15 0.17 0.20 0.19 0.20 0.15 0.17 0.20 0.23 0.21 0.18 0.19 0.24 0.40 0.21 0.21 0.30 0.18 0.32 0.28 0.35 0.20 0.22 0.16 0.25 0.19 0.19 0.15 0.16 0.19 0.22 0.22 27.74 12.86 16.88 10.72 38.09 21.13 22.14 21.73 35.17 65.14 30.09 16.40 45.90 24.31 79.77 41.83 13.69 74.38 20.82 7.46 27.41 15.99 24.18 11.84 24.05 7.39 12.46 34.98 13.32 27.10 17.25 14.51 19.12 146.08 45.24 40.07 41.61 39.29 6.32 31.63 18.01 45.62 29.66 24.51 15 6 7 6 16 6 6 7 18 16 16 4 18 17 15 17 15 18 18 7 7 10 17 7 7 4 7 17 6 10 6 6 6 17 17 17 17 17 7 14 17 17 16 16 1.85 2.14 2.41 1.79 2.38 3.52 3.69 3.10 1.95 4.07 1.88 4.10 2.55 1.43 5.32 2.46 0.91 4.13 1.16 1.07 3.92 1.60 1.42 1.69 3.44 1.85 1.78 2.06 2.22 2.71 2.87 2.42 3.19 8.59 2.66 2.36 2.45 2.31 0.90 2.26 1.06 2.68 1.85 1.53 0.023 0.045 0.018 0.097 0.002 0.002 0.001 0.003 0.009 0.000 0.018 0.003 0.000 0.111 0.000 0.001 0.550 0.000 0.288 0.383 0.000 0.099 0.114 0.105 0.001 0.115 0.086 0.006 0.038 0.003 0.009 0.024 0.004 0.000 0.000 0.001 0.001 0.002 0.503 0.005 0.388 0.000 0.020 0.079 96 215* 216 217 218 221 222 223 224 225 226** 228** 229 230** 231 232** 233** 234 235 236 237 238* 239 240 241** 242 243 244 245 246* 247* 248 260** 261 262* 263 264 265 266 267* 268 269 270 271 272* -1.74 -0.29 -1.07 0.88 -2.46 -1.90 0.07 -1.16 -1.32 -3.18 -5.11 -3.71 2.92 -3.86 -3.74 -3.27 -4.64 -2.11 -3.40 -2.11 -0.87 -0.22 -1.71 1.71 2.98 -2.41 -0.35 -3.57 -3.01 -6.09 -0.62 -2.03 -4.15 0.53 -0.71 -0.30 -0.84 2.17 -0.65 2.24 1.71 -0.74 -1.55 -1.59 0.14 0.16 0.16 0.16 0.20 0.17 0.16 0.17 0.17 0.25 0.38 0.24 0.17 0.24 0.31 0.24 0.34 0.19 0.25 0.18 0.15 0.17 0.19 0.14 0.17 0.20 0.18 0.25 0.23 0.47 0.15 0.17 0.27 0.18 0.15 0.17 0.16 0.17 0.14 0.19 0.16 0.15 0.20 0.20 83.83 46.18 17.75 25.96 11.01 23.25 31.45 42.65 37.34 24.66 10.66 13.27 59.03 12.42 23.92 23.66 6.85 17.14 17.55 32.42 74.33 39.80 32.07 62.63 44.35 8.68 52.60 25.89 36.88 0.00 47.46 33.65 10.14 78.35 27.36 32.78 24.41 29.27 97.31 43.52 20.72 34.08 39.44 66.59 13 17 16 18 10 12 17 15 15 7 4 7 18 7 7 7 6 12 7 12 16 17 13 18 18 10 17 7 7 0 17 12 6 17 16 17 16 18 17 18 18 16 15 15 6.45 2.72 1.11 1.44 1.10 1.94 1.85 2.84 2.49 3.52 2.67 1.90 3.28 1.77 3.42 3.38 1.14 1.43 2.51 2.70 4.65 2.34 2.47 3.48 2.46 0.87 3.09 3.70 5.27 0.000 0.000 0.339 0.100 0.357 0.026 0.018 0.000 0.001 0.001 0.030 0.065 0.000 0.087 0.001 0.001 0.335 0.144 0.014 0.001 0.000 0.001 0.002 0.000 0.001 0.563 0.000 0.001 0.000 2.79 2.80 1.69 4.61 1.71 1.93 1.53 1.63 5.72 2.42 1.15 2.13 2.63 4.44 0.000 0.001 0.118 0.000 0.038 0.012 0.081 0.045 0.000 0.001 0.294 0.005 0.001 0.000 97 273 -4.13 0.24 21.42 6 3.57 0.002 274 -0.43 0.19 18.92 17 1.11 0.333 275 -2.17 0.18 22.71 12 1.89 0.030 276 -2.48 0.22 20.33 9 2.26 0.016 277* -0.09 0.15 74.23 17 4.37 0.000 278 -3.03 0.22 14.90 7 2.13 0.037 279 -4.32 0.28 21.88 6 3.65 0.001 280 2.03 0.17 29.87 18 1.66 0.039 281 -0.86 0.17 26.75 16 1.67 0.044 282 -1.50 0.18 18.88 15 1.26 0.219 283 1.13 0.17 39.42 18 2.19 0.003 285 -2.30 0.19 25.89 10 2.59 0.004 286* 1.40 0.15 88.97 18 4.94 0.000 287* -1.45 0.15 72.56 15 4.84 0.000 288** 1.30 0.17 63.61 18 3.53 0.000 289 1.46 0.18 50.21 18 2.79 0.000 290 -2.33 0.18 22.82 10 2.28 0.012 291** -0.44 0.14 54.68 17 3.22 0.000 2⁄ Note. *Removed after first IRT attempt with polytomous χ df over 3. ** Removed after multiple IRT attempts with polytomous χ2 ⁄df over 3. Blanks on Sig and 𝜒 2 ⁄𝑑𝑓 are items not calculated by PARSCALE. 98 APPENDIX D Dichotomous Data for 1PL Model Before Removal of Poorly Fitting Items Table A2 Item Number 36 37 38 39 40* 41* 43 44* 45* 46 47 48* 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64** 65 66 67 68 69 70 71 b Parameter (Location) -0.81 -1.38 -5.17 -1.00 -2.08 -1.70 -0.24 0.11 -2.81 -5.17 -3.07 -2.92 1.02 -1.65 -1.18 -2.22 -1.05 -2.89 -1.75 -1.21 -0.05 -1.34 -0.89 -1.36 -1.02 -1.99 -2.09 -2.45 -2.12 -0.61 -2.16 -4.26 -1.91 -3.55 -0.04 SE 0.14 0.14 0.73 0.12 0.22 0.17 0.11 0.10 0.31 1.00 0.48 0.54 0.10 0.18 0.19 0.30 0.13 0.32 0.21 0.22 0.11 0.17 0.16 0.23 0.17 0.26 0.24 0.49 0.32 0.11 0.24 0.70 0.19 0.96 0.10 𝜒2 60.49 18.55 0.00 26.90 14.81 17.70 44.20 68.22 5.67 0.00 0.00 5.23 19.54 11.79 15.60 17.08 46.73 1.65 5.23 29.25 29.65 15.18 37.84 39.33 31.34 33.07 11.38 35.88 29.46 84.44 25.12 0.00 10.20 0.00 39.67 𝜒 2 ⁄𝑑𝑓 df 6 5 0 6 4 5 8 8 3 0 0 1 10 5 6 4 6 2 5 6 8 6 6 5 6 5 4 4 4 8 4 0 5 0 8 Sig 10.08 3.71 0.000 0.002 4.48 3.70 3.54 5.53 8.53 1.89 0.000 0.005 0.004 0.000 0.000 0.127 5.23 1.95 2.36 2.60 4.27 7.79 0.83 1.05 4.88 3.71 2.53 6.31 7.87 5.22 6.61 2.85 8.97 7.36 10.55 6.28 0.021 0.034 0.037 0.016 0.002 0.000 0.441 0.388 0.000 0.000 0.019 0.000 0.000 0.000 0.000 0.022 0.000 0.000 0.000 0.000 2.04 0.069 4.96 0.000 99 72 73* 74* 75 76* 77 78 79** 80 82* 83 84** 85 86 87* 88** 89 91 93 94 95 96 97* 98 99 100 101 102 103 104 105 106* 107* 108** 110 111 112 113 114* 115* 116 118 119 120 -1.42 -0.86 -2.19 -2.92 1.11 0.47 -1.76 -2.58 -1.62 0.58 -1.22 -0.68 0.74 0.32 -1.21 -1.76 0.14 -0.40 -0.83 0.13 -1.30 0.42 -0.33 -0.55 -4.16 -0.69 -3.01 -1.98 -1.40 -2.98 -1.87 -2.60 -2.17 -1.34 -1.57 -0.60 -2.40 0.88 -2.72 0.25 -1.00 -2.79 -2.26 -1.45 0.15 0.12 0.32 0.32 0.11 0.10 0.21 0.32 0.19 0.09 0.16 0.11 0.10 0.10 0.17 0.24 0.10 0.11 0.14 0.10 0.14 0.10 0.11 0.11 0.44 0.12 0.32 0.23 0.14 0.47 0.20 0.52 0.21 0.15 0.16 0.11 0.25 0.10 0.27 0.10 0.13 0.32 0.30 0.16 32.58 20.29 19.32 0.00 55.38 57.27 24.94 6.40 16.09 50.11 35.47 20.88 26.16 36.07 28.28 23.70 73.17 39.02 48.63 26.73 23.38 46.54 18.10 10.67 0.00 17.66 0.00 22.61 13.20 0.00 24.24 20.25 19.16 39.51 38.79 23.57 11.29 33.94 2.76 46.98 29.72 9.69 27.27 20.71 5 6 4 1 10 8 5 4 5 8 6 7 8 8 6 5 8 8 6 8 6 8 8 8 0 7 0 5 5 0 5 4 4 6 5 8 4 9 3 8 6 3 4 5 6.52 3.38 4.83 0.00 5.54 7.16 4.99 1.60 3.22 6.26 5.91 2.98 3.27 4.51 4.71 4.74 9.15 4.88 8.11 3.34 3.90 5.82 2.26 1.33 0.000 0.003 0.001 0.930 0.000 0.000 0.000 0.169 0.007 0.000 0.000 0.004 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.001 0.000 0.021 0.220 2.52 0.014 4.52 2.64 0.000 0.022 4.85 5.06 4.79 6.58 7.76 2.95 2.82 3.77 0.92 5.87 4.95 3.23 6.82 4.14 0.000 0.001 0.001 0.000 0.000 0.003 0.023 0.000 0.432 0.000 0.000 0.021 0.000 0.001 100 121** 123 124 125 126** 127* 128* 129 130 131 132 133 134* 135* 136* 137 138 139 141** 142** 143 144 145 146 147 148 149 150 151 152 153 154* 155* 156 157 158 159 160* 161 162 163 164 165 167 1.11 0.67 -0.54 -1.01 -1.83 -2.40 -2.55 -0.60 -0.86 -0.72 -0.06 -0.92 -2.29 -2.10 -3.35 -1.47 -3.67 -4.29 -0.40 -2.36 -0.61 -2.66 -2.78 -4.56 -2.77 -3.98 -1.51 -4.42 -2.29 0.00 -3.63 -1.04 -0.94 -1.73 0.49 -1.10 -1.56 0.20 -0.85 -0.23 -2.01 -1.58 -2.25 -2.50 0.10 0.10 0.12 0.17 0.25 0.41 0.31 0.12 0.17 0.14 0.12 0.15 0.30 0.33 0.39 0.17 1.29 0.58 0.13 0.37 0.11 0.28 0.32 0.85 0.67 0.80 0.21 0.95 0.27 0.12 0.70 0.13 0.14 0.19 0.10 0.15 0.15 0.10 0.13 0.10 0.23 0.18 0.21 0.22 52.66 43.37 29.36 43.06 19.35 16.05 19.03 20.65 48.86 5.55 20.60 15.58 17.71 18.75 0.00 11.64 0.00 0.00 39.04 26.02 18.28 11.93 4.66 0.00 13.81 0.00 23.58 0.00 8.72 52.77 0.00 62.32 60.87 24.63 67.49 67.00 8.17 35.83 45.49 96.48 15.50 28.88 4.86 2.73 10 8 8 6 5 4 4 8 6 7 8 6 4 4 0 5 0 0 8 4 8 4 3 0 3 0 5 0 4 8 0 6 6 5 8 6 5 8 6 8 5 5 4 4 5.27 5.42 3.67 7.18 3.87 4.01 4.76 2.58 8.14 0.79 2.57 2.60 4.43 4.69 0.000 0.000 0.000 0.000 0.002 0.003 0.001 0.008 0.000 0.595 0.008 0.016 0.002 0.001 2.33 0.040 4.88 6.50 2.29 2.98 1.55 0.000 0.000 0.019 0.018 0.197 4.60 0.003 4.72 0.000 2.18 6.60 0.068 0.000 10.39 10.14 4.93 8.44 11.17 1.63 4.48 7.58 12.06 3.10 5.78 1.21 0.68 0.000 0.000 0.000 0.000 0.000 0.146 0.000 0.000 0.000 0.009 0.000 0.302 0.607 101 168** 169 170 171 172 173** 175** 176** 177** 178 179* 180 181* 182 183 184* 185 186 187* 188 189 190** 191 192 193 194** 195 196 197 198 199 200** 201 202 203* 204 205 206 207 208 209 210 211 213 -2.71 -1.68 -2.80 -2.28 -3.88 -0.76 -6.34 -3.16 -3.27 1.29 -0.90 -1.53 -6.51 -0.21 -0.95 -1.16 -0.46 -1.50 0.53 -0.23 -1.79 -3.22 -1.67 -1.09 -2.26 -2.22 -3.29 -2.78 -1.34 -1.75 -2.21 -3.87 -3.59 -5.22 -1.69 -1.37 -0.73 -1.09 -0.41 -3.00 -1.45 0.16 -0.99 -1.92 0.88 0.20 0.40 0.29 0.48 0.15 1.20 0.83 0.35 0.11 0.13 0.20 1.06 0.10 0.16 0.16 0.14 0.23 0.10 0.12 0.29 0.34 0.30 0.18 0.26 0.37 0.44 0.29 0.20 0.85 0.26 0.64 0.52 0.94 0.18 0.25 0.13 0.27 0.14 0.34 0.17 0.10 0.17 0.24 28.42 26.27 7.02 5.53 0.00 59.54 0.00 0.00 0.00 30.22 36.92 7.83 0.00 20.63 35.46 3.91 45.51 22.81 42.34 42.07 34.25 0.00 39.51 20.53 14.33 30.38 0.00 6.16 10.17 134.91 5.70 0.00 0.00 0.00 18.98 11.25 39.88 39.43 46.09 0.00 12.87 21.74 44.68 11.33 3 5 3 4 0 7 0 0 0 10 6 5 0 8 6 6 8 5 8 8 5 0 5 6 4 4 0 3 6 5 4 0 0 0 5 6 7 6 8 0 5 8 6 5 9.47 5.25 2.34 1.38 0.000 0.000 0.070 0.236 8.51 0.000 3.02 6.15 1.57 0.001 0.000 0.165 2.58 5.91 0.65 5.69 4.56 5.29 5.26 6.85 0.008 0.000 0.691 0.000 0.000 0.000 0.000 0.000 7.90 3.42 3.58 7.60 0.000 0.002 0.006 0.000 2.05 1.70 26.98 1.43 0.102 0.117 0.000 0.221 3.80 1.88 5.70 6.57 5.76 0.002 0.080 0.000 0.000 0.000 2.57 2.72 7.45 2.27 0.024 0.006 0.000 0.045 102 214 215* 216 217 218 221 222 223 224 225 226** 228** 229 230** 231 232** 233** 234 235 236 237 238* 239 240 241** 242 243 244 245 246* 247* 248 260** 261 262* 263 264 265 266 267* 268 269 270 271 -0.72 -1.03 -0.61 -1.34 -0.10 -2.39 -1.51 -0.50 -1.53 -1.76 -7.38 -3.85 -1.12 1.11 -3.05 -6.44 -2.89 -6.63 -2.27 -3.02 -1.81 -0.80 -1.25 -2.09 0.44 1.19 -2.46 -0.99 -3.48 -3.34 -4.59 -0.72 -1.39 -2.87 -0.79 -1.15 -0.81 -1.31 0.46 -0.65 0.55 0.29 -0.93 -1.76 0.25 0.13 0.12 0.16 0.11 0.23 0.17 0.11 0.19 0.20 1.17 0.82 1.27 0.11 0.41 0.89 0.32 1.50 0.27 0.50 0.26 0.11 0.15 0.22 0.09 0.11 0.26 0.13 0.58 0.48 1.01 0.12 0.17 0.41 0.12 0.14 0.14 0.14 0.11 0.12 0.11 0.10 0.13 0.19 157.43 54.69 44.70 30.58 72.01 19.27 33.07 57.19 19.99 21.48 0.00 0.00 325.05 40.79 0.00 0.00 2.91 0.00 5.80 0.00 39.62 45.76 18.41 18.41 53.21 41.83 12.61 24.61 0.00 0.00 0.00 80.64 24.96 11.59 26.64 23.54 26.84 15.62 12.79 51.12 18.99 17.01 30.90 10.81 7 6 8 6 8 4 5 8 5 5 0 0 6 10 0 0 1 0 4 0 5 6 6 4 8 10 4 6 0 0 0 7 5 2 6 6 6 6 8 8 8 8 6 5 22.49 9.12 5.59 5.10 9.00 4.82 6.61 7.15 4.00 4.30 0.000 0.000 0.000 0.000 0.000 0.001 0.000 0.000 0.001 0.001 54.18 4.08 0.000 0.000 2.91 0.084 1.45 0.213 7.92 7.63 3.07 4.60 6.65 4.18 3.15 4.10 0.000 0.000 0.005 0.001 0.000 0.000 0.013 0.000 11.52 4.99 5.80 4.44 3.92 4.47 2.60 1.60 6.39 2.37 2.13 5.15 2.16 0.000 0.000 0.003 0.000 0.001 0.000 0.016 0.118 0.000 0.015 0.030 0.000 0.055 103 -2.49 0.38 12.29 4 3.07 0.015 272* -3.12 0.50 0.00 0 273 -1.02 0.19 29.98 6 5.00 0.000 274 -1.92 0.23 6.22 5 1.24 0.285 275 -2.51 0.30 21.34 4 5.33 0.000 276 -0.45 0.12 36.82 8 4.60 0.000 277* -2.43 0.35 23.18 4 5.79 0.000 278 -4.04 0.61 0.00 0 279 0.40 0.11 21.61 8 2.70 0.006 280 -1.36 0.16 22.10 6 3.68 0.001 281 -1.79 0.17 11.26 5 2.25 0.046 282 -0.12 0.11 23.76 8 2.97 0.003 283 -2.40 0.28 28.54 4 7.14 0.000 285 0.39 0.11 46.50 8 5.81 0.000 286* -0.94 0.13 82.77 6 13.79 0.000 287* 0.07 0.12 19.55 8 2.44 0.012 288** 0.26 0.11 31.60 8 3.95 0.000 289 -1.49 0.16 36.31 5 7.26 0.000 290 -0.66 0.11 36.33 8 4.54 0.000 291** 2⁄ Note. *Removed after first IRT attempt with polytomous χ df over 3. ** Removed after multiple IRT attempts when polytomous χ2 ⁄df was over 3. Blanks on Sig and 𝜒 2 ⁄𝑑𝑓 are items not calculated by PARSCALE. 104 APPENDIX E Polytomous Data for 1PL Model After Removal of Poorly Fitting Items Table A3 PARSCALE ID A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 A14 A15 A16 A17 A18 A19 A20 A21 A22 A23 A24 A25 A26 A27 A28 A29 A30 A31 A32 A33 A34 A35 A36 A37 Item Number 36 37 38 39 43 46 47 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 65 66 67 68 69 70 71 72 75 77 78 80 83 85 86 b Parameter (Location) -0.46 -1.09 -2.68 -0.73 0.87 -3.46 -3.18 2.89 -1.00 -0.30 -2.47 -0.64 -3.21 -1.34 -0.51 1.24 -0.44 -0.37 -1.05 -0.16 -2.84 -2.74 -3.48 -0.11 -2.04 -4.36 -1.23 -3.85 1.13 -1.60 -3.33 1.80 -1.68 -1.70 -0.76 2.08 1.62 SE 0.13 0.13 0.18 0.12 0.13 0.20 0.18 0.16 0.15 0.18 0.16 0.14 0.19 0.16 0.19 0.16 0.16 0.16 0.18 0.17 0.17 0.18 0.21 0.13 0.16 0.24 0.16 0.24 0.14 0.15 0.19 0.13 0.15 0.15 0.15 0.14 0.14 𝜒2 33.07 22.03 8.22 29.22 25.45 17.81 12.77 18.82 22.50 30.68 1.75 14.43 7.44 32.69 47.03 45.77 43.88 39.39 39.09 37.35 5.70 14.27 9.02 47.02 17.92 11.93 37.77 15.93 24.57 30.11 18.88 32.26 22.82 40.53 29.89 41.64 28.43 𝜒 2 ⁄𝑑𝑓 df 17 17 8 17 19 7 8 18 17 17 8 17 8 17 17 19 17 17 17 18 8 8 7 18 12 7 17 7 19 14 8 18 14 14 17 18 19 1.95 1.30 1.03 1.72 1.34 2.54 1.60 1.05 1.32 1.80 0.22 0.85 0.93 1.92 2.77 2.41 2.58 2.32 2.30 2.07 0.71 1.78 1.29 2.61 1.49 1.70 2.22 2.28 1.29 2.15 2.36 1.79 1.63 2.90 1.76 2.31 1.50 Sig 0.011 0.183 0.412 0.033 0.146 0.013 0.119 0.403 0.166 0.022 0.987 0.637 0.491 0.012 0.000 0.001 0.000 0.002 0.002 0.005 0.683 0.074 0.250 0.000 0.118 0.102 0.003 0.026 0.175 0.007 0.016 0.021 0.063 0.000 0.027 0.001 0.075 105 A38 A39 A40 A41 A42 A43 A44 A45 A46 A47 A48 A49 A50 A51 A52 A53 A54 A55 A56 A57 A58 A59 A60 A61 A62 A63 A64 A65 A66 A67 A68 A69 A70 A71 A72 A73 A74 A75 A76 A77 A78 A79 A80 A81 A82 89 91 93 94 95 96 98 99 100 101 102 103 104 105 110 111 112 113 116 118 119 120 123 124 125 129 130 131 132 133 137 138 139 143 144 145 146 147 148 149 150 151 152 153 156 1.34 0.36 -0.93 1.51 -0.63 1.57 0.42 -2.88 0.13 -4.12 -1.61 -0.71 -4.12 -2.49 -1.53 0.16 -1.96 2.34 -0.44 -2.78 -2.03 -1.97 2.34 0.53 -0.72 -0.02 -0.35 0.28 1.00 -0.50 -1.12 -3.56 -4.31 0.70 -2.32 -3.19 -4.56 -3.98 -4.84 -1.22 -3.48 -2.33 1.36 -4.67 -1.56 0.15 0.13 0.14 0.15 0.15 0.13 0.13 0.18 0.14 0.23 0.15 0.14 0.23 0.15 0.14 0.14 0.17 0.14 0.14 0.17 0.14 0.14 0.15 0.15 0.15 0.13 0.17 0.16 0.15 0.15 0.14 0.20 0.23 0.15 0.16 0.19 0.28 0.24 0.31 0.14 0.20 0.16 0.15 0.27 0.15 49.88 33.57 34.88 18.45 11.55 35.59 49.36 11.59 18.84 12.27 20.45 20.88 9.24 12.57 13.74 29.18 28.11 39.50 30.04 7.28 14.82 29.45 43.10 22.69 33.74 50.49 23.70 34.66 44.77 51.03 31.93 12.42 12.07 34.48 11.89 12.79 5.79 15.63 4.37 41.18 9.97 13.58 51.89 10.32 11.54 19 18 17 19 17 19 18 8 18 7 15 17 7 8 15 18 12 18 17 8 12 12 18 18 17 18 17 18 19 17 17 7 7 18 11 8 5 7 4 17 7 10 19 5 15 2.63 1.87 2.05 0.97 0.68 1.87 2.74 1.45 1.05 1.75 1.36 1.23 1.32 1.57 0.92 1.62 2.34 2.19 1.77 0.91 1.24 2.45 2.39 1.26 1.98 2.80 1.39 1.93 2.36 3.00 1.88 1.77 1.72 1.92 1.08 1.60 1.16 2.23 1.09 2.42 1.42 1.36 2.73 2.06 0.77 0.000 0.014 0.007 0.493 0.827 0.012 0.000 0.170 0.402 0.091 0.155 0.231 0.235 0.127 0.546 0.046 0.005 0.003 0.026 0.507 0.251 0.003 0.001 0.202 0.009 0.000 0.128 0.010 0.001 0.000 0.015 0.087 0.098 0.011 0.372 0.119 0.327 0.029 0.358 0.001 0.190 0.192 0.000 0.066 0.714 106 A83 A84 A85 A86 A87 A88 A89 A90 A91 A92 A93 A94 A95 A96 A97 A98 A99 A100 A101 A102 A103 A104 A105 A106 A107 A108 A109 A110 A111 A112 A113 A114 A115 A116 A117 A118 A119 A120 A121 A122 A123 A124 A125 A126 A127 157 158 159 161 162 163 164 165 167 169 170 171 172 178 180 182 183 185 186 188 189 191 192 193 195 196 197 198 199 201 202 204 205 206 207 208 209 210 211 213 214 216 217 218 221 1.79 -1.54 -1.47 -0.46 0.50 -1.93 -1.32 -1.97 -3.09 -1.47 -3.85 -3.35 -3.84 3.06 -0.67 0.64 -0.65 -0.21 -1.36 0.64 -2.62 -2.44 -0.41 -2.88 -5.34 -3.25 -0.58 -4.46 -2.33 -4.15 -4.70 -0.55 -0.27 -0.49 0.15 -2.96 -1.59 1.46 -0.15 -0.85 -1.00 -0.34 -1.07 0.80 -2.40 0.14 0.14 0.14 0.14 0.12 0.19 0.19 0.18 0.17 0.14 0.22 0.18 0.21 0.16 0.18 0.13 0.15 0.15 0.17 0.15 0.17 0.17 0.16 0.16 0.33 0.18 0.18 0.25 0.16 0.24 0.29 0.20 0.14 0.22 0.17 0.16 0.13 0.15 0.17 0.19 0.19 0.14 0.14 0.14 0.17 29.25 34.29 30.81 39.84 37.01 31.75 26.44 19.63 18.89 21.07 14.90 11.08 14.26 34.33 27.14 42.34 24.65 43.13 21.99 14.34 4.32 7.40 30.20 7.86 4.60 10.70 38.72 13.02 33.26 10.87 12.52 37.57 38.16 42.68 40.31 8.70 26.55 23.58 40.97 37.95 41.45 50.78 16.88 20.08 5.10 19 15 15 17 18 12 17 12 8 15 7 8 7 18 17 18 17 18 16 18 8 8 17 8 3 8 17 7 11 7 5 17 18 17 18 8 15 19 18 17 17 17 17 19 8 1.54 2.29 2.05 2.34 2.06 2.65 1.56 1.64 2.36 1.40 2.13 1.39 2.04 1.91 1.60 2.35 1.45 2.40 1.37 0.80 0.54 0.93 1.78 0.98 1.53 1.34 2.28 1.86 3.02 1.55 2.50 2.21 2.12 2.51 2.24 1.09 1.77 1.24 2.28 2.23 2.44 2.99 0.99 1.06 0.64 0.062 0.003 0.009 0.001 0.005 0.002 0.067 0.074 0.016 0.134 0.037 0.196 0.046 0.012 0.056 0.001 0.102 0.001 0.143 0.707 0.828 0.495 0.025 0.447 0.202 0.219 0.002 0.071 0.001 0.143 0.028 0.003 0.004 0.001 0.002 0.368 0.033 0.212 0.002 0.003 0.001 0.000 0.463 0.390 0.749 107 A128 A129 A130 A131 A132 A133 A134 A135 A136 A137 A138 A139 A140 A141 A142 A143 A144 A145 A146 A147 A148 A149 A150 A151 A152 A153 A154 A155 A156 A157 A158 A159 A160 A161 A162 A163 A164 A165 A166 222 223 224 225 229 231 234 235 236 237 239 240 242 243 244 245 248 261 263 264 265 266 268 269 270 271 273 274 275 276 278 279 280 281 282 283 285 289 290 -1.84 0.03 -1.16 -1.31 -3.57 -3.75 -4.65 -2.13 -3.39 -2.10 -0.25 -1.65 2.86 -2.34 -0.36 -3.52 -0.61 -4.07 -0.79 -0.30 -0.86 2.00 2.20 1.65 -0.74 -1.56 -4.00 -0.49 -2.12 -2.49 -3.01 -4.16 1.88 -0.82 -1.53 1.04 -2.27 1.39 -2.31 0.15 0.14 0.15 0.15 0.20 0.21 0.28 0.16 0.20 0.16 0.15 0.17 0.14 0.17 0.16 0.21 0.13 0.23 0.13 0.15 0.14 0.15 0.17 0.14 0.13 0.17 0.21 0.16 0.15 0.18 0.19 0.24 0.15 0.15 0.15 0.15 0.16 0.14 0.15 32.64 38.01 34.54 32.62 9.17 9.73 6.26 17.89 16.88 21.65 38.41 28.89 47.88 11.39 49.17 18.88 51.43 9.56 29.43 27.80 19.76 24.92 47.22 18.85 33.72 35.81 21.57 23.84 15.91 15.04 16.56 20.98 45.11 33.97 31.54 25.87 26.47 34.80 19.46 12 18 17 17 7 7 5 12 8 12 18 14 18 11 17 7 17 7 17 17 17 18 18 19 17 15 7 17 12 8 8 7 18 17 15 19 11 18 11 2.72 2.11 2.03 1.92 1.31 1.39 1.25 1.49 2.11 1.80 2.13 2.06 2.66 1.04 2.89 2.70 3.03 1.37 1.73 1.64 1.16 1.38 2.62 0.99 1.98 2.39 3.08 1.40 1.33 1.88 2.07 3.00 2.51 2.00 2.10 1.36 2.41 1.93 1.77 0.001 0.004 0.007 0.013 0.240 0.203 0.281 0.119 0.031 0.042 0.003 0.011 0.000 0.411 0.000 0.009 0.000 0.214 0.031 0.047 0.286 0.127 0.000 0.467 0.009 0.002 0.003 0.124 0.195 0.058 0.035 0.004 0.000 0.009 0.008 0.133 0.006 0.010 0.053 108 APPENDIX F Dichotomous Data for 1PL Model After Removal of Poorly Fitting Items Table A4 PARSCALE ID A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 A14 A15 A16 A17 A18 A19 A20 A21 A22 A23 A24 A25 A26 A27 A28 A29 A30 A31 A32 A33 A34 A35 A36 A37 Item Number 36 37 38 39 43 46 47 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 65 66 67 68 69 70 71 72 75 77 78 80 83 85 86 b Parameter (Location) -0.96 -1.38 -3.91 -0.94 -0.11 -4.50 -3.18 1.14 -1.69 -1.14 -2.29 -1.05 -2.75 -1.50 -1.29 -0.04 -1.23 -1.06 -1.43 -1.21 -2.25 -1.94 -2.48 -0.71 -2.20 -4.38 -1.78 -4.36 -0.06 -1.48 -2.50 0.52 -1.88 -1.64 -1.33 0.77 0.36 SE 0.11 0.12 0.56 0.11 0.09 0.89 0.40 0.10 0.15 0.17 0.26 0.12 0.28 0.18 0.19 0.10 0.15 0.14 0.20 0.14 0.21 0.19 0.26 0.10 0.21 0.62 0.16 0.82 0.09 0.14 0.25 0.09 0.18 0.16 0.13 0.10 0.09 𝜒2 37.86 17.41 0.00 32.20 61.37 0.00 0.00 22.41 9.64 20.76 24.68 45.83 4.73 16.26 23.60 26.91 24.87 22.03 36.48 15.48 13.11 16.69 9.11 88.19 23.43 0.00 19.28 0.00 38.74 27.24 2.60 73.46 14.46 27.58 23.41 31.94 43.25 df 6 6 0 7 8 0 0 10 6 6 4 6 2 6 6 8 6 6 6 6 4 5 4 8 4 0 5 0 8 6 3 9 5 6 6 9 9 𝜒 2 ⁄𝑑𝑓 Sig 6.31 2.90 0.000 0.008 4.60 7.67 0.000 0.000 2.24 1.61 3.46 6.17 7.64 2.37 2.71 3.93 3.36 4.15 3.67 6.08 2.58 3.28 3.34 2.28 11.02 5.86 0.013 0.140 0.002 0.000 0.000 0.092 0.013 0.001 0.001 0.000 0.001 0.000 0.017 0.011 0.005 0.058 0.000 0.000 3.86 0.002 4.84 4.54 0.87 8.16 2.89 4.60 3.90 3.55 4.81 0.000 0.000 0.460 0.000 0.013 0.000 0.001 0.000 0.000 109 A38 A39 A40 A41 A42 A43 A44 A45 A46 A47 A48 A49 A50 A51 A52 A53 A54 A55 A56 A57 A58 A59 A60 A61 A62 A63 A64 A65 A66 A67 A68 A69 A70 A71 A72 A73 A74 A75 A76 A77 A78 A79 A80 A81 A82 89 91 93 94 95 96 98 99 100 101 102 103 104 105 110 111 112 113 116 118 119 120 123 124 125 129 130 131 132 133 137 138 139 143 144 145 146 147 148 149 150 151 152 153 156 0.06 -0.42 -0.99 0.19 -1.18 0.40 -0.34 -3.50 -0.65 -2.78 -2.11 -1.17 -3.45 -1.98 -1.57 -0.66 -2.45 0.93 -1.01 -2.74 -2.64 -1.36 0.70 -0.48 -1.24 -0.53 -0.90 -0.55 0.02 -0.93 -1.43 -4.50 -4.00 -0.48 -2.52 -2.93 -5.30 -2.94 -3.68 -1.42 -4.68 -2.25 -0.03 -4.13 -1.75 0.09 0.10 0.12 0.09 0.12 0.09 0.10 0.36 0.11 0.29 0.19 0.13 0.40 0.18 0.14 0.10 0.21 0.09 0.11 0.27 0.26 0.14 0.09 0.10 0.14 0.11 0.15 0.12 0.10 0.13 0.14 1.17 0.50 0.10 0.24 0.28 0.78 0.56 0.64 0.17 0.76 0.23 0.09 0.60 0.16 51.30 35.48 26.18 31.53 32.04 52.87 28.43 0.00 24.89 13.46 11.29 30.71 0.00 17.46 44.95 17.66 8.08 41.95 31.96 10.70 9.18 30.70 47.08 33.12 17.21 27.02 45.83 19.57 28.14 21.12 19.98 0.00 0.00 32.48 8.69 3.87 0.00 6.73 0.00 40.11 0.00 12.08 44.26 0.00 19.29 8 8 6 8 6 9 8 0 8 2 4 6 0 5 6 8 4 10 6 2 3 6 9 8 6 8 7 8 8 7 6 0 0 8 3 2 0 2 0 6 0 4 8 0 6 6.41 4.43 4.36 3.94 5.34 5.87 3.55 0.000 0.000 0.000 0.000 0.000 0.000 0.000 3.11 6.73 2.82 5.12 0.002 0.001 0.023 0.000 3.49 7.49 2.21 2.02 4.20 5.33 5.35 3.06 5.12 5.23 4.14 2.87 3.38 6.55 2.45 3.52 3.02 3.33 0.004 0.000 0.024 0.087 0.000 0.000 0.005 0.027 0.000 0.000 0.000 0.009 0.001 0.000 0.012 0.000 0.004 0.003 4.06 2.90 1.93 0.000 0.033 0.142 3.36 0.034 6.68 0.000 3.02 5.53 0.017 0.000 3.22 0.004 110 A83 A84 A85 A86 A87 A88 A89 A90 A91 A92 A93 A94 A95 A96 A97 A98 A99 A100 A101 A102 A103 A104 A105 A106 A107 A108 A109 A110 A111 A112 A113 A114 A115 A116 A117 A118 A119 A120 A121 A122 A123 A124 A125 A126 A127 157 158 159 161 162 163 164 165 167 169 170 171 172 178 180 182 183 185 186 188 189 191 192 193 195 196 197 198 199 201 202 204 205 206 207 208 209 210 211 213 214 216 217 218 221 0.52 -1.32 -1.44 -0.93 -0.22 -2.19 -1.57 -1.93 -2.19 -1.70 -3.18 -2.34 -3.47 1.30 -1.24 -0.13 -1.13 -0.51 -1.45 -0.31 -2.13 -1.83 -1.14 -2.28 -3.65 -2.70 -1.21 -3.56 -2.06 -3.01 -4.37 -1.26 -0.84 -1.15 -0.61 -2.46 -1.40 0.18 -1.11 -1.22 -1.68 -0.65 -1.37 -0.10 -2.49 0.09 0.13 0.14 0.11 0.09 0.21 0.16 0.19 0.18 0.17 0.35 0.23 0.40 0.10 0.17 0.09 0.14 0.12 0.20 0.11 0.23 0.25 0.15 0.21 0.39 0.25 0.18 0.55 0.22 0.42 0.81 0.21 0.11 0.22 0.13 0.27 0.14 0.10 0.15 0.21 0.22 0.11 0.14 0.10 0.21 78.86 39.20 20.49 39.02 100.51 6.86 32.91 17.08 11.02 31.79 0.00 4.84 0.00 22.80 30.10 29.43 16.53 40.16 38.47 33.68 8.30 30.00 21.65 13.35 0.00 5.98 21.47 0.00 14.14 5.03 0.00 22.47 22.12 35.11 21.06 8.35 19.10 20.85 37.22 32.71 20.84 42.70 30.31 81.38 9.37 9 6 6 7 8 4 6 5 4 6 0 4 0 10 6 8 6 8 6 8 4 5 6 4 0 3 6 0 4 2 0 6 7 6 8 4 6 8 6 6 6 8 6 8 3 8.76 6.53 3.42 5.57 12.56 1.72 5.48 3.42 2.76 5.30 0.000 0.000 0.002 0.000 0.000 0.142 0.000 0.005 0.026 0.000 1.21 0.303 2.28 5.02 3.68 2.76 5.02 6.41 4.21 2.08 6.00 3.61 3.34 0.012 0.000 0.000 0.011 0.000 0.000 0.000 0.080 0.000 0.002 0.010 1.99 3.58 0.111 0.002 3.54 2.52 0.007 0.079 3.75 3.16 5.85 2.63 2.09 3.18 2.61 6.20 5.45 3.47 5.34 5.05 10.17 3.12 0.001 0.003 0.000 0.007 0.078 0.004 0.008 0.000 0.000 0.002 0.000 0.000 0.000 0.025 111 A128 A129 A130 A131 A132 A133 A134 A135 A136 A137 A138 A139 A140 A141 A142 A143 A144 A145 A146 A147 A148 A149 A150 A151 A152 A153 A154 A155 A156 A157 A158 A159 A160 A161 A162 A163 A164 A165 A166 222 223 224 225 229 231 234 235 236 237 239 240 242 243 244 245 248 261 263 264 265 266 268 269 270 271 273 274 275 276 278 279 280 281 282 283 285 289 290 -1.55 -0.54 -1.72 -1.74 -3.31 -2.99 -7.76 -2.25 -3.33 -2.03 -1.13 -2.14 1.24 -2.44 -0.96 -3.05 -0.84 -2.79 -1.08 -0.91 -1.26 0.59 0.58 0.33 -0.90 -1.60 -3.30 -1.08 -1.76 -2.56 -2.85 -3.78 0.51 -1.38 -1.72 -0.09 -2.64 0.18 -1.66 0.15 0.10 0.16 0.17 0.96 0.37 1.27 0.23 0.43 0.21 0.13 0.20 0.10 0.22 0.12 0.43 0.10 0.33 0.12 0.13 0.12 0.10 0.10 0.09 0.11 0.16 0.40 0.16 0.19 0.26 0.31 0.54 0.09 0.14 0.15 0.10 0.25 0.10 0.15 34.76 57.23 19.57 31.42 0.00 4.35 0.00 6.76 0.00 21.27 33.82 13.28 60.42 13.46 30.71 0.00 69.54 8.21 30.17 15.10 21.58 27.92 22.92 18.44 34.30 23.11 0.00 27.40 15.64 21.48 8.90 0.00 38.43 20.94 18.30 26.41 11.05 22.57 30.64 6 8 6 6 0 2 0 4 0 4 6 4 10 4 6 0 7 2 6 7 6 9 9 9 7 6 0 6 6 3 2 0 9 6 6 8 3 8 6 5.79 7.15 3.26 5.24 0.000 0.000 0.003 0.000 2.18 0.111 1.69 0.148 5.32 5.64 3.32 6.04 3.36 5.12 0.000 0.000 0.010 0.000 0.009 0.000 9.93 4.11 5.03 2.16 3.60 3.10 2.55 2.05 4.90 3.85 0.000 0.016 0.000 0.035 0.002 0.001 0.007 0.030 0.000 0.001 4.57 2.61 7.16 4.45 0.000 0.016 0.000 0.012 4.27 3.49 3.05 3.30 3.68 2.82 5.11 0.000 0.002 0.006 0.001 0.012 0.004 0.000 Note. Blanks on Sig and 𝜒 2 ⁄𝑑𝑓 are items not calculated by PARSCALE. 112 APPENDIX G Dichotomous (Upper Categories Combined) and Polytomous Matrix Plots Figure G1. PARSCALE Identifier A1-A100 for Dichotomous Data Plots. 113 Figure G2. PARSCALE Identifier A101-A166 Dichotomous Data Plots 114 Figure G3. PARSCALE Identifier A1-A100 Polytomous Data Plots 115 Figure G4. PARSCALE Identifier A101-A166 Polytomous Data Plots 116 REFERENCES Baranowski, L. E. & Anderson, L. E. (2005). Examining rating source variation in work behavior to KSA linkages. Personnel Psychology, 58, 1041-1054. Bemis, S.E., Belenky, A. H. & Soder, D.A. (1983). Job analysis: An effective management tool. Washington, D.C.: The Bureau of National Affairs, Inc. Bond, T.G. & Fox, C.M. (2007). Applying the Rasch Model: Fundamental measurement in the human sciences (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates. Brannick, M.T., Levine, E. L., & Morgeson, F. P. (2007). Job and work analysis: Methods, research, and applications for human resource management (2nd ed.). Thousand Oaks, CA: Sage Publication. Chau, S. C., Drasgow, F., & Luecht, R. (2006). How big is big enough? Sample size requirements for CAST item parameter estimation, Applied Measurment in Education, 19(3), 241-255. Comrey, A.L., & Lee, H.B. (1992). A first course in factor analysis (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates Drasgow, F., Levine, M.V., Sherman, T., Williams, B. & Mead, A. D. (1995). Fitting polytomous item response theory models to multiple-choice tests. Applied Psychological Measurement, 19(2), 143-165. Embretson, S. E., & Reise, S. P. (2000). Item Response Theory for psychologist. Mahwah, NJ: Lawrence Erlbaum Associates. Guion, R. M. (1998). Assessment, measurement, and prediction for personnel decisions. Mahwah, NJ: Lawrence Erlbaum Associates. 117 Harvey, R. J. (1991). Job analysis. In M. D. Dunnette (Ed.), Hanbook of industrial and organizational psychology (pp. 71-163). Palo Alto, CA: Consulting Psychologist Press. Harvey, R. J. (2003, April). Applicability of binary IRT models to job analysis data. Paper presented at the 2003 Symposium at the Annual Conference of the society for Industrial and Organizational Psychology, Orlando. Hernandez, A., Drasgow, F., & Gonzalez-Roma, V. (2004). Investigating the functioning of a middle category by means of a mixed-measurement model. Journal of Applied Psychology, 89(4), 687-699. Kaplan, R. M. & Saccuzzo, D. P. (2005). Psychological testing: Principles, applications, and issues (6th ed.). Belmont, CA: Thomson Wadsworth. Meyers, L. (2005). Rater Reliability. Unpublished Manuscript. Meyers, L. (2007). Reliability, error, and attenuation. Unpublished Manuscript. Meyers, L. S., Gamst, G., & Guarino, A.J. (2006). Applied multivariate research: Design and interpretation. Thousand Oaks, CA: Sage Publication. Mumford, M. D. & Peterson, N. G. (1999) The O*NET content model: structural considerations in describing jobs. In N. G. Peterson, M. D. Mumford, W. C. Borman, P. R. Jeanneret, & E. A. Fleishman (Eds.), An occupational information system for the 21st century: The development of O*NET (pp. 21-30). Washington D. C.: American Psychological Association. Prien, E. P., Goodstein, L. D., Goodstein, J. & Gamble, L. G. Jr. (2009). A practical guide to job analysis. San Francisco, CA: John Wiley & Sons, Inc. 118 Spicer, J. (2005). Making sense of multivariate data analysis. Thousand Oaks, CA: Sage Publications. Taylor, W. L. (1953). 'Cloze procedure': a new tool for measuring readability. Journalism Quarterly, 30, 415-433. U.S. Department of Labor. (1939). Dictionary of occupational titles. Washington, DC: U.S. Government Printing Office. Wei, J. & Salvendy, G. (2004). The cognitive tasks analysis methods for job and task design: Review and reappraisal, Behavior and Information Technology, 23(4), 273-299. Wilson, M.A., & Harvey, R.J. (1990). The value of relative-time-spent ratings in taskoriented job analysis. Journal of Business and Psychlogy 4(4), 453-461. Wright, B. D. (1977). Solving measurement problems with the Rasch Model, Journal of Educational Measurement, 14(2), 97-116. Yen, W. M. (1993). Scaling performance assessments: Strategies for managing local item dependence. Journal of Educational Measurement, 30, 187-213.