Chapter 1
OVERVIEW
In a company there are many job positions, which typically range from very
simple to extremely complex. All of these positions need to be well understood in order
to hire employees, evaluate their performance, train them, and set pay rates, among other
human resource functions. To do this effectively, one must know what types of duties
and tasks the positions require. These duties and tasks are identified through a job
analysis, a technique used to measure the differing aspects of a job (Kaplan & Saccuzzo,
2005). The data obtained from a job analysis allow one to understand the frequency and
importance of the duties and tasks involved in the job, which aids in developing human
resource and performance management tools.
After job analysis ratings are collected they must be analyzed statistically in order
to identify critical tasks and provide a summary of the job. This is typically done with
basic descriptive statistics, but recently Item Response Theory (IRT) has been explored
as a method for scaling tasks.
While polytomous IRT models exist that can be used when analyzing task
analysis data, the data are sometimes dichotomized and analyzed with binary IRT models
(e.g., Harvey, 2003). This dichotomization is accomplished by recoding the ratings from
three or more data points down to two. The question that this study answers is whether
transforming the data in this manner causes a loss of information. This study explores this
question by comparing dichotomous to polytomous treatments of the job analysis ratings
from officers (N = 544) in the corrections profession.
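The recoding at issue can be sketched in a few lines. This is a minimal illustration with hypothetical ratings; the cut point (any nonzero rating counts as part of the job) is an assumption for the example, not a rule taken from the studies cited above.

```python
# Hypothetical sketch: recode polytomous job analysis ratings (0-4,
# where 0 = not applicable to the job) into binary part-of-job data.
def dichotomize(ratings, cut=1):
    """Return 1 for each rating at or above `cut`, else 0."""
    return [1 if r >= cut else 0 for r in ratings]

polytomous = [0, 1, 4, 2, 0, 3]   # one SME's ratings on six tasks
print(dichotomize(polytomous))    # [0, 1, 1, 1, 0, 1]
```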
Chapter 2
JOB ANALYSIS
Why is a Job Analysis Done?
To understand why a job analysis is done, it helps first to know what a job is. A
job position is better understood when it is broken into its
component parts. The job can be broken down into “units” such as “duties, tasks,
activities, or elements” (Brannick, Levine, & Morgeson, 2007, p. 8). Often in a job
analysis the position is broken down into smaller parts so the company and the person in
the position know every aspect of the job.
The breakdown of the units might be easiest to describe from smallest to largest.
First, the smallest unit is the element. An element is something that has “a clear
beginning, middle, and end” (Brannick et al., 2007, p. 6). Brannick et al. (2007) give
dialing a phone as an example of an element. This is something that has a clear
beginning, middle, and end. Second, an activity is considered a cluster “of elements
directed at fulfilling a work requirement” (Brannick et al., 2007, p. 7). Third, a task is a
group “of activities” (Brannick et al., 2007, p. 7). Brannick et al. (2007) give talking “to
conflicting parties to settle disturbances” as an example of a task (p. 7). One is required to
do several activities in order to complete this task. Finally, a duty is “a collection of tasks
all directed at general goals of a job” (Brannick et al., 2007, p. 7). One would most likely
use activities or tasks when creating a job analysis questionnaire. The job analysis helps a
company understand the importance and frequency of each of the tasks that are involved
in a job.
A job analysis is carried out in order to better understand a job position. Brannick
et al. (2007) describe a job analysis as the “discovery of the nature of a job” (p. 7). After
a job analysis is done a company is better able to know the exact tasks, duties, and many
of the knowledge, skills, abilities, and other personal characteristics (KSAOs) that are
involved in a job position. Also, a company gains a broader knowledge of which tasks
and duties are most critical. When recruiting job applicants, a company is able to use a
job analysis to tell whether a potential employee will be able to fulfill the duties expected
of them. The potential hire, in turn, is able to know whether they meet the minimum
qualifications for the job position and whether it sounds like a position of interest.
Beyond giving a solid job description, a job analysis benefits a company in the
following areas: recruitment and selection; criterion development and training; and
performance appraisals and job evaluations (Prien, Goodstein, Goodstein, & Gamble,
2009). The job analysis also helps a company avoid “exposure to litigation based on
allegations of discriminatory hiring” (Prien et al., 2009, p. 19).
In the recruitment and selection process the recruiter, as well as the individual
seeking employment, needs to have in-depth knowledge about what tasks are related to
a job (Prien et al., 2009). By having an extensive knowledge base about the job the
recruiter is able to choose the best candidate. Another benefit of a job analysis in the
recruitment and selection process is that the recruiter and the company are able to know
any changes a job has undergone over time (Prien et al., 2009), which enables a company
to focus on selecting candidates who have the most relevant qualifications.
Most job analyses are centered on the selection process (Prien et al., 2009). It is
highly important that a company chooses proficient candidates for the job positions.
Therefore, one can see the value of a job analysis in the selection process, because a
company’s success or failure depends on whether the right candidates are selected. By
knowing the essential functions of a job one is able to select the best qualified candidates
for the job.
Many times a job analysis is done on an existing job that already has people in
the position. As was mentioned above, there are many aspects of the job which are
deemed to be crucial. When the tasks and duties are found to be crucial it often becomes
apparent that certain individuals may not be sufficiently trained (Prien et al., 2009).
Those who do not fit the qualifications may need to be further educated in order to
increase their productivity in a job. After identifying the criticality of the tasks a company
is better equipped in the training process.
There are many companies which understand that candidates will not have certain
qualifications when entering a job. A job analysis will help a company understand which
qualifications a candidate needs to have when entering the job and which qualifications
can be taught on the job.
A job analysis is also used to set a pay range for the job and it gives one a
reference for job evaluations. In setting a pay range a job analysis identifies the amount
of education, training, and experience that are needed to enter a job and the complexity or
degree of difficulty of a job. Job evaluations are also aided by the job analysis. The job
analysis identifies what are known as critical tasks, which are the most important tasks
involved in a job (explained in greater detail below). These tasks can help guide a
company in creating a job evaluation (Prien et al., 2009). If individuals do well in the
critical tasks areas they are often given raises or promotions. If they are doing poorly they
often do not receive a raise or promotion and it may be determined that they require
additional training.
By doing a job analysis a company has an outline of what is expected of an
individual in a specified position. If the company does not use what is defined as being
part of the job to determine whether an individual should be hired, promoted, etc., they
may face legal problems. Basically, the job analysis helps companies avoid litigation. By
using a job analysis when one is hiring, giving pay raises, giving promotions, etc., a
company is basing their decisions on qualifications and performance and not the
individual’s race, sex, or any other factor that is unrelated to the job. If they do this they
are complying with the law, which states that individuals cannot be discriminated against
and they have protections which allow them to receive equal opportunities for
employment and equal opportunities for pay (e.g., Equal Employment Opportunity
Commission). By having an outline of a job a company has a way of ensuring that their
hiring and pay structure is performance based (Prien et al., 2009).
When testing for a job position it is very important that the test coincides with the
job analysis. If one is testing for things that are not related to the job they run the risk of
violating the law. Bemis, Belenky, and Soder (1983) give an example of a landmark case
(U.S. v. State of New York) which helped to define the importance of linking performance
and job tasks. The case emphasized that the Job Element approach (the individual having
the characteristics/experience to be in a job) was not a sufficient approach on which to
base decisions. Decision making (e.g., raises, promotions, etc.) had to be based on the
performance of the individual in job related tasks.
How is a Job Analysis Done?
There are a number of sources used in order to complete a job analysis. Some of
the main sources that are used include past job analyses, the Occupational Information
Network (O*NET), and subject matter experts (SMEs). Many times a past job analysis
has already been carried out by a company. As companies grow or change the job
analysis becomes outdated, because the job position requires a different set of KSAOs, or
because job duties may have changed due to advances in technology. However, the past
job analysis can often be used in order to guide an individual in creating questions or task
statements that are rated by SMEs.
The O*NET is another source that provides good basic information on many
different job positions. Before the O*NET came into existence the Dictionary of
Occupational Titles (DOT; United States Department of Labor, 1939) was used. The
DOT was originally put together by the United States Employment Service (USES)
(Guion, 1998). The DOT was put together by conducting observations and interviews of
individuals in many different job positions. The information obtained from the
observations and interviews was transformed into descriptions of the job positions
(Guion, 1998). The information in the DOT centered on the tasks that the individual
needed to do on the job. This was not enough to describe a job position. Also, the
descriptions of jobs in the DOT tended to be outdated in a short amount of time
(Brannick et al., 2007).
Due to the problems encountered with the DOT it was necessary to create a better
source describing job positions. As technology grew, and the internet came into
existence, the O*NET was created so that individuals could know what was involved in a
job position at a faster rate and could have the most up-to-date information available.
The contents of the O*NET can be understood by looking at the O*NET content
model. The model contains the following information:
1. Worker requirements: basic skills, cross-functional skills, knowledges, and education.
2. Experience requirements: training, experience, and licensure.
3. Worker characteristics: abilities, occupational values and interests, and work styles.
4. Occupational requirements: generalized work activities, work context, and
organizational context.
5. Occupation-specific requirements: occupational knowledges; occupational skills; tasks;
and duties; machines, tools and equipment.
6. Occupation characteristics: labor market information; occupational outlook; and wages
(Mumford & Peterson, 1999).
Basically, the O*NET centers around the KSAOs that are required in order to
complete the tasks and duties of a job. If a person is knowledgeable about things involved
in a job they are able to navigate the technical aspects of the job (Brannick et al., 2007).
If an individual possesses the skills required for the position they are capable of
performing the tasks of the job. Ability involves the physical and mental aspects of the
individual (Brannick et al., 2007). In other words, a qualified person is physically (e.g.,
lift 100 pounds) and mentally (e.g., deal with multiple job stressors) able to deal with the
tasks required of them when they are in a job position.
Past job analyses and the O*NET are often used to gain an early understanding of
the job, which is generally followed up with an interview with SMEs to understand the
details of a job and to guide an individual when creating the task statements,
which are compiled into a questionnaire. These questionnaires are used to rate the
frequency and importance of each task that is involved in a job.
Methods of Collecting Data for Creating Job Analysis Questionnaires
There are several methods one can use in order to create a job analysis
questionnaire. Wei and Salvendy (2004) discuss the following methods for collecting
data: “observations and interviews”, “process tracing”, and “conceptual techniques” (p.
276, italics in original). They explain that the observations and interviews approach is the
most direct approach. One is directly observing the individuals who are in the job and
asking them questions based on information gathered about the job. Wei and Salvendy
(2004) warn that this approach is sometimes “unwieldy and difficult to interpret” (p.
276). Ultimately, data are interpreted numerically. When interviewing an individual, one
asks questions about the job without assigning numbers (e.g., a level of importance) to
the answers. The responses are therefore difficult to scale because they are not initially
assigned numbers.
Even though there are certain interpretation problems with interviewing, this
method may be preferred over other methods. In fact, Prien et al. (2009) emphasize the
benefit of an interview over a self-report. When an individual is interviewed they are
often interviewed by a trained professional. However, Prien et al. (2009) explain that
when individuals are told to write a description about their job there is a greater
probability that they will inflate the importance of the tasks involved in their job.
Similar to interviewing, the process tracing technique is also verbally based.
However, with process tracing one is looking at a specified group of tasks. In a job
analysis there may be certain tasks which require a more in depth analysis. By using
process tracing one is able to get detailed information about those tasks (Wei &
Salvendy, 2004). Again, because the data are verbal they are hard to interpret. Process
tracing is nevertheless a useful way to obtain information on tasks and duties that require
a greater amount of detail.
With conceptual techniques one is focused on “domain concepts” (Wei &
Salvendy, 2004, p. 276). The benefit of conceptual techniques is that they ask about
specific tasks and can be rated (e.g., with a questionnaire; Wei & Salvendy, 2004). A
drawback is that they do not provide the detail that is often obtained through an interview.
The questionnaire is a highly used conceptual technique. Prien et al. (2009) discuss two
types of questionnaires which are used. First, one can use a “custom-designed
questionnaire” (Prien et al., 2009, p. 33). The questionnaire is specifically designed for
the job being analyzed. One could also use a more general questionnaire, “the
commercially available questionnaire”, to analyze the job (Prien et al., 2009, p. 34). Prien
et al. (2009) explain that the drawback to the commercially available questionnaire is
that it is often created for a broad range of jobs. Therefore, there may be questions which
are not directly applicable to the job. Also, it may be missing questions that should be
asked.
The questions are rated by SMEs, individuals who have in-depth knowledge of
the functions and KSAOs involved in a job position. For instance, a police officer or a
lieutenant would be an SME for a police officer position. They would be familiar with
the position because they are currently in it or have been in it in the past. However, one
does not have to hold, or have held, the job in order to be an SME; one simply needs a
strong knowledge of the job position.
As was mentioned above, past job analyses and the O*NET give individuals who
are creating the questionnaire a guide for making the task statements. The O*NET gives
a general guide and past job analyses give a more detailed guide. Even though one might
gain a lot of insight into the job based on these sources, these sources alone may not be
sufficient to create a detailed questionnaire based on the position in its current form. One
of the best sources for creating a questionnaire is the current SMEs who occupy the
position of interest. One can observe (if circumstances allow) the SMEs in their work
environment and ask them questions either on the job or in an interview.
The interview questions can be made by using the O*NET or a past job analysis.
However, since jobs tend to evolve over time one might come up with some questions
while doing the interview. In other words, one might notice that certain questions need to
be asked that may not have been thought of based on the past job analysis or based on the
information obtained on the O*NET. Brannick et al. (2007) explain that you can use
paper and pencil or a video or audio recorder to write out or tape the interview. They
further explain that if you are going to tape the interview it would be a good idea to ask
the person first, since recording devices tend to make people nervous. After the
information is obtained from these sources it is then turned into a job analysis
questionnaire.
When creating a job analysis questionnaire it is important to collect demographic
information (e.g., age, gender, ethnicity, job position, etc.). This information is very
useful when analyzing certain aspects of the data. The questionnaire should also be
divided into several subcategories in order to better organize the different duties of the
job. For instance, a task that is clerical would not go in a security category.
Questionnaires are often lengthy and organizing the tasks into groups helps to clarify
what is being asked and helps to avoid confusion.
As has been mentioned before, the task statements are rated on the level of
importance and frequency. For instance, the questionnaire can ask, on a Likert type scale,
the level of importance (0 = not applicable to job, 1 = not very important, 2 = somewhat
important, 3 = important, 4 = very important) and the frequency (0 = not applicable to
job, 1 = infrequent, 2 = somewhat frequent, 3 = frequent, 4 = very frequent).
There are several types of rating scales that one can choose from. The rating
scales can be made based on the type of job that one is asking about. Also, the number of
data points for the scale can vary. For instance, a rating scale used by Baranowski and
Anderson (2005) gives the following prompt “How important is this knowledge, skill, or
ability for performing this work behavior?” (p. 1044). The prompt is followed by a five
point rating scale: “1 = Not important”, “2 = Slightly important”, “3 = Moderately
important”, “4 = Very important”, and “5 = Extremely important” (Baranowski &
Anderson, 2005, p. 1044). However, the number of rating points does not have to be limited to
five points. Typically, one would see ratings between three and seven points. One could
also use a two point rating scale if he or she is simply interested in knowing whether the
tasks are relevant to the job (e.g., has some degree of frequency and/or importance).
Harvey (1991) gives a couple of common examples of scales that are used in a job
analysis. In the first example the SME indicates that the tasks are part of the job prior to
filling out the scale. When filling out the scale, the SME gives an estimate of time given
to each task. The scale has eight ratings, which range from “0 = I spend no time on this
task” to “7 = I spend a very large amount of time on this task as compared with most
other tasks I perform.” (Harvey, 1991, p. 90). The second example gives a scale which
tells the reader to consider importance, frequency, and difficulty when rating the task
statements and it explains that not all the statements will apply to the job. The ratings
range from “0 = definitely not part of my job; I never do it” to “7 = Of most significance
to my job” (Harvey, 1991, p. 91). A notable difference is that this scale defines only the
0, 1, 4, and 7 ratings, whereas the previous scale defines every point from 0 through 7.
On this scale, an SME may decide that a task does not quite fit the 4 definition and
therefore choose to rate it as an undefined 3.
The third example gives a frequency scale with time related ratings. The ratings
range from “0 = I do not perform this task on my current job” and “1 = about once every
year” to “7 = about once each hour or more often.” (Gael, 1983; as cited in Harvey, 1991,
p. 92). The fourth example gives a scale that is known as a Behaviorally Anchored Rating
Scale (BARS). This scale is rated on a continuum and the data points are defined in
sentence form (Campbell, Dunnette, Arvey, & Hellervik, 1973; as cited in Harvey 1991).
Finally, the fifth example shows the Job Element Inventory (JEI), which, as the name
implies, breaks the job down into elements. The inventory uses a simple five point rating
scale and tells the person to fill in the rating that best fits with the element (Cornelius &
Hakel, 1978; as cited in Harvey, 1991). As can be seen, rating scales can take many forms.
Harvey (1991) explains that using a limited number of rating scales may be
beneficial to one’s analysis: when two scales appear to ask similar questions and are
highly correlated, they are collecting largely redundant information, and one of the scales
can be dropped from the analysis.
In deciding on how many data points to use one might consider a study done by
Wilson and Harvey (1990). In their study they compared the difference between relative
time spent (RTS) scales and simply asking whether the tasks were or were not part of the
job. They found a correlation of .90 between the two types of ratings. In other words,
dichotomously and polytomously scored variables gave approximately the same amount
of information. Also, a study done by Hernandez, Drasgow, and Gonzalez-Roma (2004)
found that when rating personality many individuals do not use the “?, not sure, or
undecided” category (p. 687). They explain that there is a commonly held belief that
having a middle category is beneficial because it does not force the individual to answer a
question that they are actually unsure of (Hernandez et al., 2004). Hernandez et al. (2004)
found that those with certain personality characteristics (e.g., reserved) tended to use the
middle category more often. Based on their findings they recommended that the middle
category should be used with caution, because it may lead to interpretation problems.
How are the Task Inventory Data Typically Analyzed?
There are several ways in which the task inventory data are analyzed. The
descriptive statistics may tell you something about how much the individuals agree on the
task statements. Brannick et al. (2007) emphasize that at a minimum one should report
descriptive statistics such as the “mean, standard deviation, and N” (p. 273). The mean of
each task statement tells you how the average rater scored the statement. The overall
mean of the task statements tells you how the average rater scored all of the items. The
standard deviation of the task statements tells you how much the ratings differed in the
statements. A large standard deviation would indicate that the individuals differed a lot
when it came to rating a certain statement. There may be reasons why the ratings for
individual task statements differ substantially among the SMEs (e.g., poorly written task
statements). N is simply the number of individuals (e.g., SMEs) who rated the task. There
are tests which can help one to learn more about how the items are functioning. Some of
these tests include tests of interrater reliability and interrater agreement.
Brannick et al. (2007) explain that there is a difference between interrater
reliability and interrater agreement. They explain that “[r]eliability ... refers to two
sources of variance: variance due to random errors and variance due to systematic
differences among the items of interest” and “[a]greement simply refers to a function of
those judgments that are identical and different” (p. 276).
One can understand reliability a bit more clearly through the following equation
given by Meyers (2007):
Reliability = VT / VO
VT is the true variance and it is divided by VO, the observed variance. A true score is a
score that contains no error; an observed score, obtained from the individual or subject,
does contain error. When a study, test, or questionnaire controls for sources of error (e.g.,
testing at the same time of day), the error portion of the observed variance shrinks,
bringing the observed variance closer to the true variance and thereby increasing the
reliability. However, there will always be a certain amount of error that cannot be
controlled for (e.g., individuals not getting good sleep the night before the test takes place).
Interrater agreement can be understood by its name. It is simply the amount of
times the judges/raters agree on how the item is scored. For instance, if three judges rated
a task statement and all of the judges rated the statement as a two (highly important) then
it would indicate that there is one hundred percent agreement among the judges.
Brannick et al. (2007) explain that interrater agreement (“interjudge agreement”) “is
simply the number of ratings for which the judges agree divided by the total number of
judgments” (p. 275).
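That definition translates directly into code. A minimal sketch, with hypothetical ratings from two judges:

```python
# Interjudge agreement = identical judgments / total judgments.
def percent_agreement(ratings_a, ratings_b):
    matches = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return matches / len(ratings_a)

judge_a = [2, 1, 0, 2, 2]
judge_b = [2, 1, 1, 2, 0]
print(percent_agreement(judge_a, judge_b))  # 0.6
```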
Meyers (2007) explains that the level of agreement can be found by using
Cohen’s Kappa. He explains that the procedure is available in SPSS and it can be
interpreted using the chi square statistic. Meyers (2007) also explains that if there are
more than two raters then an alternate form of Kappa created by Fleiss (1971) can be
used. Kaplan and Saccuzzo (2005) explain the common way to interpret kappa: a value
above .75 indicates “excellent” agreement, a value between .40 and .75 indicates “fair to
good” agreement, and a value below .40 indicates “poor” agreement (Fleiss, 1981; as
cited in Kaplan & Saccuzzo, 2005, p. 118).
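For readers who want the mechanics, here is a minimal sketch of Cohen's kappa for two raters with hypothetical data (in practice one would use SPSS or, for more than two raters, Fleiss's extension, as noted above). Kappa corrects the observed agreement for the agreement expected by chance:

```python
from collections import Counter

def cohen_kappa(r1, r2):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    n = len(r1)
    p_o = sum(a == b for a, b in zip(r1, r2)) / n   # observed agreement
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from the raters' marginal rating frequencies.
    p_e = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / n ** 2
    return (p_o - p_e) / (1 - p_e)

rater_1 = [0, 0, 1, 1, 2, 2]   # hypothetical ratings on six tasks
rater_2 = [0, 0, 1, 1, 2, 0]
print(round(cohen_kappa(rater_1, rater_2), 2))  # 0.75
```

A value of .75 sits exactly at the Fleiss cutoff for “excellent” agreement described above.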
Meyers (2007) also explains that rater reliability can be estimated with the
Pearson correlation. With this statistic one is finding the “correlation between pairs of
raters” (Meyers, 2005, p. 2); in other words, the degree to which their ratings trend in the
same direction. Meyers (2005) explains that correlations above .8 are good, but
correlations “in the .7s” may also be strong enough in many cases.
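The pairwise Pearson correlation underlying this reliability estimate can be sketched the same way (the two raters' scores below are hypothetical):

```python
def pearson_r(x, y):
    """Pearson correlation between two raters' ratings."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

rater_1 = [1, 2, 3, 4]
rater_2 = [2, 3, 5, 6]
print(round(pearson_r(rater_1, rater_2), 2))  # 0.99: ratings trend together
```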
The correlation coefficient (r) can be used in order to estimate interrater
reliability. Brannick et al. (2007) explain that a large correlation indicates a small
amount of random error in the items and, therefore, good reliability. Besides estimating
how well the items are working and how consistently the judges are rating them, one can
also estimate how critical each task is to the job.
Understanding the Criticality of the Tasks
After the task statements are rated by multiple SMEs, the means of the frequency
and importance scales are multiplied for each task to find which tasks have the highest
criticality. These critical tasks are then organized into a criticality index, which lists the
tasks from the highest criticality to the lowest. The following equation is often used to
calculate the criticality of each
of the tasks:
Criticality Index = Importance (M) x Frequency (M)
For instance, if SMEs rate tasks on an importance scale (5 = very important, 4 =
important, 3 = somewhat important, 2 = somewhat unimportant, 1 = unimportant) and a
frequency scale (5 = highly frequent, 4 = frequent, 3 = somewhat frequent, 2 = somewhat
infrequent, 1 = not frequent), one would use the mean rating from each of these scales to
calculate the criticality index. If the mean of the SMEs’ ratings is 3 for importance and 3
for frequency, then the score on the criticality index would be 9. One would decide how
high the score should be in order for the task to be considered critical. This helps one to
understand a bit about the combination of importance and frequency. If one of these
ratings is low then the task may not be considered to be critical to the job. By finding the
most critical aspects of a job a company or organization is better able to understand the
type of employee they need to be searching for to fill a position. If the potential employee
does not possess the KSAOs needed to perform the critical tasks of a job, then they may not
be a qualified candidate.
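The calculation above can be sketched as follows, with hypothetical tasks and ratings on the 1-5 scales just described:

```python
from statistics import mean

# Hypothetical SME ratings: task -> (importance ratings, frequency ratings).
ratings = {
    "Supervise inmate movement": ([5, 4, 5], [5, 5, 4]),
    "Order office supplies":     ([2, 1, 2], [2, 2, 1]),
}

# Criticality Index = mean importance x mean frequency, sorted high to low.
criticality = {t: mean(i) * mean(f) for t, (i, f) in ratings.items()}
for task, c in sorted(criticality.items(), key=lambda kv: -kv[1]):
    print(f"{task}: {c:.2f}")
```

A task low on either scale ends up with a low index, mirroring the point above that a task weak on importance or frequency may not be considered critical.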
There are several ways in which the statements and questions for a job analysis
questionnaire can be analyzed. Some have been mentioned above (e.g., mean ratings,
standard deviation). Another way to look at the items is through IRT. In a job analysis
one is looking at the educated opinion of individuals rating the items. The mean and
standard deviation are a good starting point in gathering information about each of the
items. While the mean gives an indication of the average level at which the raters are
rating the tasks, IRT gives an indication of how individuals are rating the item from
lowest to highest ratings for each task. IRT potentially allows one to gather more detail
about the items and allows one to understand whether the items are functioning properly.
Chapter 3
ITEM RESPONSE THEORY
In an IRT analysis one would refer to a person’s ability level as theta. Ability does
not simply refer to an individual’s ability to answer a question correctly (e.g., multiple
choice test). It can also indicate the level at which the frequency of SMEs rating a task is
the highest. In other words, if a majority of the SMEs (who filled out the questionnaire)
rated a task as having some degree of relevance to the job then the task would have a low
theta level. On the other hand, if there were not many SMEs who rated the task as being
part of a job then the task statement would have a high theta level. Also, if there were a
moderate level of SMEs who rated the task as being part of the job there would be a
moderate theta level. The general term theta will be used to refer to what is described
above.
One might also understand the b parameter (the location of the curve) by
comparing task ratings to ability tests (math, reading, etc.). The b parameter is lower on
ability test problems that a large proportion of test takers answer correctly; a lower b
parameter indicates that the problem does not have a high degree of difficulty. Similarly,
the b parameter is lower on task statements that a large number of SMEs rate as being
part of the job. However, when rating task statements the b parameter is based on how
the SMEs rated the statement; it does not indicate that the task was rated correctly or
incorrectly.
At the individual level, an SME’s theta is based on how they rated the task
statements. Again, the theta level does not indicate that the SME rated the tasks correctly
or incorrectly. It is simply an indication of how they rated the tasks. For example, if the
SME rated a majority of the task statements as being part of the job then they would have
a high theta level. If the SME rated a majority of the task statements as being irrelevant to
the job then they would have a low theta level.
After SMEs have rated the statements one can see how well each item is
functioning through IRT, which evaluates item functioning as a function of theta
(Harvey, 2003). An individual's theta level can be estimated through the use of different
models. The models that will be explained in more detail below are the Rasch model (the
"one-parameter logistic model," or "1PL") and the "two-parameter logistic model (2PL)"
(Embretson & Reise, 2000, pp. 48, 51). However, before explaining the models in detail
it is important to know the assumptions that must be met for IRT to function properly.
Assumptions of IRT
There are several assumptions which need to be met for IRT calculations to
function properly. The most common are that the items have to be locally independent
and they need to be unidimensional (Embretson & Reise, 2000). Local independence
means that the items do not depend on one another: one would not need to rely on one
item in order to know the answer to another
item. Embretson and Reise (2000) explain that IRT models share similarities with factor
analysis. Due to the similarity, when one factor fits the data one can often assume that
local independence has been achieved (Embretson & Reise, 2000). In other words, the
variables all fall close to (have high loadings on) a single factor. If one of the variables is
not loading on a factor then it may not have good item fit, which is essential for items to
function efficiently in IRT. In other words, it may not fit with the other items in that
specific IRT analysis. One needs to decide the best way to analyze the items if they are
loading on different factors. It might be necessary to run multiple analyses if the items are
not all loading on one factor.
If the items were dependent it would indicate that the items rely on each other for
the answer (Embretson & Reise, 2000). For instance, in item matching (matching a term
from a list to its appropriate definition in a list of definitions) a person would eliminate
the definitions that he or she has used, which would give a better indication of what the
answer might be for the remaining unknown items. Therefore, since the items are
dependent on one another, items from item matching would not be able to be used in the
standard IRT models. Another example of local dependence is the Cloze exam (Taylor,
1953). In this exam an individual fills in the blanks of a paragraph with what they believe
to be a missing word. This is done at several points in the paragraph. If the individual
knows certain words it may give them clues to the other missing words in the paragraph,
which would violate local independence.
Yen (1993) explained that the following features of a test may lead to local
dependence: (a) "External assistance or interference"; (b) "Speededness" (e.g., not
having sufficient time to take the test); (c) "Fatigue"; (d) "Practice"; (e) "Item or
response format"; (f) "Passage dependence"; (g) "Item chaining"; (h) "Explanation of
previous answer"; (i) "Scoring rubrics or raters"; and (j) "Content, knowledge, and
abilities" (pp. 189-190, italics in original). Therefore, if one wants to avoid creating local
dependence in a test they should give a sufficient amount of time to take the test, should
not expose students to the material prior to the test, and they should avoid tying questions
together in a test. They should also be aware of the way they score and format the test.
Local dependence is not necessarily a bad thing. However, these types of problems
should not be used in IRT.
If there is local dependence it is suggested that one create what is known as
“testlets”, which is done by combining items that depend on each other (Yen, 1993). A
testlet is created by changing multiple items into a single item (Embretson & Reise,
2000). In other words, items that give clues to other items are combined. As was
explained above, a Cloze paragraph contains a certain number of items. Those items
would be estimated as a single item, and one would give the combined item partial credit
if some of its components are correct. Yen (1993) explains that a "testlet score is assumed to be independent of all
other testlets and items” (p. 201). By making items with local dependence into one item
you are achieving local independence. For instance, if one were to calculate multiple
Cloze items as one item the individuals taking the test would only depend on information
within the item to answer the question.
Unidimensionality has to do with how well the model fits the data and the “trait
level estimates” that are needed to explain the data (Embretson & Reise, 2000, p. 189).
Basically, if local independence is met then the data meets that requirement for
unidimensionality (Embretson & Reise, 2000). Embretson and Reise (2000) explain that
many methods have been proposed for assessing dimensionality. However, they note that
many researchers have been turning toward goodness-of-fit
indices in order to assess the data (Embretson & Reise, 2000).
One way of finding out whether the item fits the data is by dividing chi-square
(πœ’ 2 ) by the degrees of freedom (df). Embretson and Reise (2000) explain that πœ’ 2 is based
on the observed frequencies and the expected frequencies. Basically, the SMEs are
divided into groups (theta level groups) based on how they rate the task statements. IRT
will give certain expected frequencies based on the model that is used. When the
observed frequencies line up with the expected frequencies it indicates that the task
statement fits the model.
Drasgow, Levine, Tsien, Williams, and Mead (1995) defined item fit for their
set of data, with the smallest values indicating the best fit. In other words, when
computing χ²/df, values closest to 0 (between 0 and 3) fit the model the
best. Drasgow et al. (1995) specified that a value less than one is considered "very small,"
between one and two is considered "small," between two and three is considered
"moderately large," and a χ²/df greater than three is considered "large" (p. 151).
Therefore, one might consider removing items that are above three from a set of items for
that particular analysis, or one might consider seeing whether the items fit better with a
different model. It is essential that one looks at the item fit in order to know whether the
items are giving accurate information. When the items fit one is able to trust the
information that is displayed through the item parameters (location, slope, and guessing
parameters) and the plots, which are known as the item characteristic curves (ICCs) and
item information curves (IICs). ICCs will be covered in more detail below. IICs are
simply plots of the amount of information given at each theta level.
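A minimal sketch of this fit check, using the Drasgow et al. (1995) cutoffs quoted above (the function name and example values are illustrative, not from the original analysis):

```python
def classify_fit(chi_square, df):
    """Classify item fit from the chi-square/df ratio using the
    Drasgow et al. (1995) labels: <1 very small, 1-2 small,
    2-3 moderately large, >3 large (a candidate for removal)."""
    ratio = chi_square / df
    if ratio < 1:
        return "very small"
    elif ratio < 2:
        return "small"
    elif ratio < 3:
        return "moderately large"
    return "large"

# A hypothetical item with chi-square = 12.4 on 8 degrees of freedom
print(classify_fit(12.4, 8))  # 12.4 / 8 = 1.55 -> "small"
```

Items falling in the "large" range would be removed from that analysis or examined under a different model, as described above.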
Item Characteristic Curve
If an item is functioning correctly (with dichotomous data) it will have an ICC
that is flat at the bottom, curves up in the middle, and flattens out at the top. An ICC is a
plot of the predicted probability of endorsing a particular response against the theta level
of the individuals (e.g., the SMEs' ratings). In a job analysis a higher theta level simply
means that fewer individuals rated the task as having high relevance (or some degree of
relevance if one is using a two point rating scale).
Figure 1. Plots for Dichotomous 1PL Item Characteristic Curves (ICCs)
Figure 1 (above) shows what the dichotomous positively endorsed category would
look like. If an item begins to slant upward at a lower theta level (e.g., Item 1) it simply
indicates that a large number of individuals endorsed the task as being part of the job. If
the task begins to increase at a higher theta level (e.g., Item 4) it indicates that fewer
SMEs endorsed the item as being part of the job, and that a greater number of individuals
felt the task was not applicable to the job. In other words, a majority of the SMEs rated
the task as having low relevance to the job.
An ICC with poor discrimination would not have a sharp curve in the middle;
instead it would rise as a nearly straight line from the lower left corner toward the upper
right corner of the plot. Such a line would mean that as an individual reaches a higher
theta level he or she is only gradually more likely to rate the tasks as being part of the
job. In other words, the task would not discriminate well between differing theta levels.
Rasch Model, 1PL, 2PL
In order to understand the ICC more fully it is important to understand the Rasch
model (1PL model), and the 2PL model. With the Rasch Model one uses one theta
parameter for each of the individuals, which allows one to estimate the “difficulty
parameter” (location of the curve) for each of the items (Wright, 1977, p. 97). In other
words, the individual’s theta level gives a good indication of where the individual would
be on the ICC. The theta level is understood based on the way the individual rates the
tasks. If the individual gives a lower rating to the task they will be grouped, on the ICC,
with those who have the lower theta level. Similarly, those who rate the task high would
be grouped with those who have a higher theta level.
In a 1PL model “the dependent variable is the dichotomous response” (or
polytomous response; e.g., questionnaire data can be both) (Embretson & Reise, 2000, p.
48). In a job analysis questionnaire the dependent variable is the way in which the
individual rated the item.
On the 1PL model, "the independent variables are the person's trait score, θs"
(theta) "and the item's difficulty level, βi" (Embretson & Reise, 2000, p. 48). In a job
analysis the theta of the individual is expressed by where the individual is located along
the theta continuum. The items which are more highly endorsed are on the left and the
items not endorsed by very many SMEs are on the right. This is based on dichotomous
data plots. Polytomous plots may take many shapes depending on the responses of the
SMEs. For instance, a certain number of SMEs will rate a task in the middle
category/categories and others will rate the tasks in the upper category/categories.
Therefore, the division of SMEs will cause the polytomous ICCs to look a lot different
than the dichotomous ICCs. Harvey (2003) explains that “the ‘difficulty’ of an item [in a
job analysis questionnaire] is defined in terms of the amount of the general work activity
(GWA) construct (θ) that would be needed to produce a given level of item endorsement”
(p. 2). As has been explained before, the theta level is based on how many individuals
endorsed the task as being a part or not being a part of the job (e.g., frequency and
importance). Therefore, if the SME is familiar with the job he or she will rate the task
similarly to the way the other SMEs rate the task.
As was explained above, an individual can land at different spots on the ICC
based on their theta level. By using the Rasch model (combined with the ICC) one can
see how the items are functioning at each theta level.
The calculation of item difficulty is understood through “log odds or
probabilities” (Embretson & Reise, 2000, p. 48). The “ratio” is estimated for each
individual (Embretson & Reise, 2000, p. 48). The odds are the degree to which the
individual is likely to get a problem correct. For instance, Embretson and Reise (2000)
give an example of “the odds that a person passes an item is 4/1”, which indicates that
“out of five chances, four successes and one failure are expected” (Embretson & Reise,
2000, p. 49). If an individual were rating task statements, and the odds ratio were 4/1, it
would indicate that for every four tasks that the individual rates as being part of the job
they would rate one as not being applicable to the job. Also, “odds are the probability of
success divided by the probability of failure” (Embretson & Reise, 2000, p. 49). In a job
analysis one does not deal with success or failure. One simply deals with the degree to
which an individual feels that a statement is part of a job. The following equation shows
"the ratio of the probability of success":

ln[Pis / (1 − Pis)] = θs − βi

where θs is the "trait score" and βi is the "items difficulty" (Embretson & Reise,
2000, p. 49). Individuals who are higher on the trait would have higher log odds ratios,
which would indicate that they are more likely to endorse the item as being part of the
job.
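The 1PL relationship above can be sketched numerically (a hypothetical illustration; the function names and parameter values are my own, not from the study):

```python
import math

def rasch_probability(theta, b):
    """1PL (Rasch) probability that a person at trait level theta
    endorses an item with difficulty b:
    P = exp(theta - b) / (1 + exp(theta - b))."""
    return math.exp(theta - b) / (1 + math.exp(theta - b))

def log_odds(theta, b):
    """ln[P / (1 - P)], which reduces to theta - b under the 1PL."""
    p = rasch_probability(theta, b)
    return math.log(p / (1 - p))

# An SME at theta = 1.0 rating a task with difficulty b = -0.5
p = rasch_probability(1.0, -0.5)          # about .82: likely to endorse
assert abs(log_odds(1.0, -0.5) - 1.5) < 1e-9  # log odds equal theta - b
```

The assertion mirrors the equation above: the log odds of endorsement are simply the distance between the person's theta and the item's difficulty.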
The 2PL model functions a bit differently than the 1PL model. The 2PL model
includes two parameters (Embretson & Reise, 2000). The two parameters that are
included in the model are "item difficulty, βi, and item discrimination, αi" (Embretson &
Reise, 2000, p. 51). Figure 2 gives an example of how items look in the 2PL model. The
location parameter has already been covered with the 1PL. The discrimination parameter
is an additional parameter that is shown by the 2PL model.
Figure 2. Plots for Dichotomous 2PL Item Characteristic Curves (ICCs)
Looking at the difference between the items, an item with a high degree of
discrimination curves upward sharply (e.g., Item 5), indicating that the item
discriminates within a narrow range of theta levels. An item with low discrimination
does not curve as sharply in the middle (e.g., Item 1). If the ICC of an item increases
gradually it indicates that the item discriminates between individuals across a wide range
of theta levels, but not sharply at any specific range.
The equation for the 2PL model is expressed as follows:

P(Xis = 1 | θs, βi, αi) = exp[αi(θs − βi)] / (1 + exp[αi(θs − βi)])

(Embretson & Reise, 2000).
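A sketch of the 2PL curve, showing how the discrimination parameter sharpens the curve (hypothetical parameter values, not items from this study):

```python
import math

def two_pl_probability(theta, b, a):
    """2PL probability of endorsement: the discrimination parameter a
    scales how sharply probability changes around the location b."""
    return math.exp(a * (theta - b)) / (1 + math.exp(a * (theta - b)))

# Two items with the same location but different discrimination
low_a = two_pl_probability(0.5, 0.0, 0.5)   # shallow curve near b
high_a = two_pl_probability(0.5, 0.0, 2.5)  # steep curve near b

# The high-discrimination item separates theta levels near b more sharply
assert high_a > low_a
```

Setting a = 1 for every item recovers the 1PL, which is why the 2PL is described as adding one parameter to the Rasch model.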
The model that would be most useful when analyzing a job analysis using rating
scales is the graded-response model (GRM). Embretson and Reise (2000) explain that
this model “is appropriate to use when item responses can be characterized as ordered
categorical responses such as exist in Likert rating scales” (p. 97). In other words, these
are responses that do not necessarily have a right or wrong answer. The responses are
based on the educated opinion of a set of SMEs.
GRM is represented in the following equation:

P*ix(θ) = exp[αi(θ − βij)] / (1 + exp[αi(θ − βij)])

(Embretson & Reise, 2000, p. 98). Embretson and Reise (2000) explain that P*ix(θ) gives
the "operating characteristic curves" (p. 98). They continue by explaining that "[i]n the
GRM one operating characteristic curve must be estimated for each between category
threshold, and hence for a graded response item with five categories, four βij parameters
are estimated and one common item slope (αi) parameter" (Embretson & Reise, 2000,
p. 98). It should be
mentioned here that the threshold is the location parameter (b parameter). Referring back
to Figures 1 and 2, what is shown is the dichotomous affirmative (e.g., task does apply to
job) response to a task statement. What is not shown is the rejecting statement (does not
apply). The rejection of the statement is a mirror image of the affirmative statement. In
other words, it is a reflection of the statement going in the opposite direction in the same
general location. Where the positive and negative responses cross is the threshold. The
βij "represents the trait level necessary to respond above threshold j with .50 probability"
(Embretson & Reise, 2000, pp. 98-99). Basically, βij is based on the way in which the
SMEs rated the task statement (e.g., frequency and importance).
For polytomous items there are multiple thresholds. The threshold is where the
response probability curves cross. For instance, if one were rating a task as either being
relevant or irrelevant to a job one would have an ICC for the task being irrelevant (no
degree of frequency, no degree of importance) and an ICC for the task being relevant
(having some degree of frequency or importance). If one were to look at a dichotomous
plot, the irrelevant category would be a reflection of the relevant category going in the
opposite direction. For instance, if the relevant ICC category starts at the lower left corner
and curves toward the upper right corner then the irrelevant statement would start at the
upper left corner and curve down toward the lower right corner. The point at which these
two categories cross is the threshold. With polytomous statements each of the middle
categories has an ICC. If one were looking at a three point statement, there is a threshold
where category one crosses category two and where category two crosses category
three. Therefore, a polytomous statement has multiple thresholds.
In the current study the modified graded response model (M-GRM) is used. The
GRM and the M-GRM have similar slope parameters (Embretson & Reise, 2000). In
other words, the slopes for the ICC categories on the frequency and importance scales
would look similar. However, Embretson and Reise (2000) explain that “[t]he difference
between the GRM and the M-GRM is that in the GRM one set of category threshold
parameters (βij) is estimated for each scale item, whereas in the M-GRM one set of
category threshold parameters (cj) is estimated for the entire scale, and one location (bi)
is estimated for each item" (Embretson & Reise, 2000, p. 103). One can see the
difference between the GRM and the M-GRM by looking at the output. When looking at
the output for the GRM on a task statement (with more than two categories) one would
see multiple β (location) parameters. However, there is only one location parameter in
the M-GRM.
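Under the GRM described above, each category's response probability is the difference between adjacent operating characteristic curves. A sketch with hypothetical parameter values for a three category item (one common slope, two thresholds; not estimates from this study):

```python
import math

def operating_curve(theta, a, b):
    """GRM operating characteristic curve: probability of responding
    in or above the category bounded by threshold b."""
    return math.exp(a * (theta - b)) / (1 + math.exp(a * (theta - b)))

def grm_category_probabilities(theta, a, thresholds):
    """Category probabilities for an ordered item: differences between
    adjacent operating curves (boundary curves fixed at 1 and 0)."""
    curves = [1.0] + [operating_curve(theta, a, b) for b in thresholds] + [0.0]
    return [curves[k] - curves[k + 1] for k in range(len(curves) - 1)]

# Three ordered categories (0, 1, 2) with slope 1.5 and thresholds -1.0, 0.5
probs = grm_category_probabilities(0.0, 1.5, [-1.0, 0.5])
assert abs(sum(probs) - 1.0) < 1e-9  # category probabilities sum to one
```

With two thresholds and one slope per item, this matches the GRM parameter count quoted above; the M-GRM would instead share one set of thresholds across the scale and estimate a single location per item.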
Dichotomous and Polytomous Models
As was explained, the difference between a dichotomous and a polytomous
variable is the number of rating scale categories. A dichotomous variable has only two
categories, while a polytomous variable has three or more categories. For instance, the
questionnaire used in this study asks the SME to rate whether there would be
consequences if an individual does not complete a task. The ratings are polytomous in
the sense that they have three categories, which are as follows: "0. Task Not Part of
Job"; "1. Not Likely"; or "2. Likely". If one were to dichotomize the data the choices
would be "0. Task Not Part of Job" or "1. Task is Part of the Job/May Have Some
Degree of Consequence". Essentially, when one dichotomizes a variable in a job
analysis, one changes the degree of importance/frequency into a statement that the task
simply is or is not part of the job.
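The recoding just described can be sketched as follows (hypothetical SME ratings; the mapping collapses the two upper categories into 1):

```python
def dichotomize(ratings):
    """Collapse three-category ratings (0 = not part of job, 1, 2)
    into two categories: 0 stays 0; any endorsement (1 or 2) becomes 1."""
    return [0 if r == 0 else 1 for r in ratings]

# Hypothetical SME ratings for one task statement
polytomous = [0, 1, 2, 2, 1, 0, 2]
print(dichotomize(polytomous))  # [0, 1, 1, 1, 1, 0, 1]
```

The recode keeps the "not part of job" response distinct while discarding the distinction between the two levels of endorsement, which is exactly the potential loss of information this study examines.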
Dichotomizing variables has become common for item response theory when one
is analyzing job analysis data (e.g., Harvey, 2003). When dichotomizing variables one
might ask whether one is changing the meaning of the data or losing information
through this process. If dichotomizing changes the shape of the ICC, does it
change the meaning of what is portrayed by IRT? With a job analysis questionnaire,
where one is trying to figure out the importance of tasks statements, would dichotomizing
a variable change the meaning of the SMEs ratings?
Bond and Fox (2007) explain that when a category has a low frequency it may
make sense to combine it with another category. This is due to the fact that low frequency
categories often do not give any additional information to the analysis. Bond and Fox
(2007) also explain that when one collapses categories one should try collapsing them in
multiple ways and see which way yields the greatest number of well-fitting items.
However, one might ask what might happen when a middle category, with a high
frequency of SMEs, is combined with an upper category.
It was hypothesized that because the categories are being changed from three
(polytomous) to two (dichotomous), there would be less accurate information
after the change. In other words, it is likely that the interpretation of the ICCs and the
information given in the ICCs will be less precise as a dichotomous statement, both
because there is less division among the categories and because one is combining
categories which have a relatively high frequency. This, in turn, gives a less detailed
picture of what the task is telling us. This study explores what occurs (or whether nothing
occurs) when one dichotomizes the task statements.
Chapter 4
METHODS
Sample
This study uses archival data from a questionnaire used by a corrections
agency in the western United States. There were a total of 544 SMEs, working at the
correctional facilities, who were included in this study. The job positions of the
individuals included 231 Correctional Officers, 101 Juvenile Officers, 94 Juvenile
Counselors, 91 Correctional Sergeants, 10 Juvenile Sergeants, and 13 Senior Juvenile
Counselors. There were a total of 434 males,106 females, and 4 who did not indicate
gender. Ethnic background included 228 White, 165 Hispanic, 98 Black/AfricanAmerican, 10 Asian, 10 Filipino, 6 Native American, 8 Pacific Islander, 9 other, and 10
did not indicate ethnicity.
Instrument
Data from a job analysis questionnaire were used in this study. The questionnaire
was originally used to clarify the job duties and equipment of incoming
Correctional Peace Officers, Juvenile Peace Officers, and Juvenile Counselors. The tasks
were rated by the individuals in each position and by those supervising them.
The following areas are covered in the questionnaire: 1. Booking, Receiving and
Releasing; 2. Casework; 3. Counseling; 4. Court-related Board Hearings; 5. Arrests; 6.
Emergencies; 7. Escort, Move, Transportation; 8. General Duties; 9. Health and Medical;
10. Investigation; 11. Oral Communication; 12. Read, Review, and Analyze; 13.
Referrals; 14. Search; 15. Security; 16. Supervision of Non-inmates; 17. Supervision of
Wards/Inmates; 18. Restraints and Use of Force; and 19. Written Communication. There
was also an equipment section. However, this study focused on what the officers do and
not what they use.
Individuals were given task statements and were asked to rate the frequency of
each task and the degree of consequence for not performing it. A separate column was
provided for the frequency of the task and for the importance of the task. The
individuals were asked to rate frequency and importance relative to the entry-level
correctional peace officer position.
For frequency the prompt was given, “Based on your experience in the facilities
in which you’ve worked, how many entry-level correctional peace officers will perform
this task in the first three years on the job (even if they do it only a few times)?” The
prompt was followed by the following choices: "0. Task Not Part of Job"; "1. Less than
a Majority"; or "2. A Majority". As can be seen, these data are an approximation of
polytomous data. They are polytomous in the sense that there are three categories.
However, the 0 (Task Not Part of Job) is not actually on the theta continuum. In other
words, when one chooses the option of the task not being part of the job one is not
giving the task a degree of frequency or importance. If the individual were giving the
task a degree of frequency or importance then he or she would be rating the task on the
continuum, because the rating would indicate that the task has a degree of relevance to
the job. Options 1 (Less than a Majority) and 2 (A Majority) are on the continuum.
For importance the degree of consequence for not performing a task was
measured with the following prompt: “How likely is it that there would be serious
negative consequences if this task is NOT performed or if it is performed incorrectly?”
The following categories are provided after the prompt: “0. Task Not Part of Job”; “1.
Not Likely”; or “2. Likely”.
Only the frequency ratings were used in this analysis. The questionnaire was
extensive and it was decided that this analysis should focus on a specific portion of the
questionnaire to answer the question of whether there is a difference between
dichotomous and polytomous task ratings. The task statements that were included in the
final analysis are listed in Appendix A.
The tasks were analyzed with IRT in dichotomous and polytomous form, which
was different from the way it was originally analyzed by the corrections agency. The
corrections agency analyzed the data in its original form using typical descriptive
statistics (mean, standard deviation, percentages, etc.) and not IRT. The current study
explores what occurs when the data are dichotomized by comparing polytomous to
dichotomous models. Also, using IRT in order to analyze job analysis ratings is a
relatively new approach. However, the analysis and comparison does help to clarify what
occurs when one changes task statements from polytomous to dichotomous.
Chapter 5
RESULTS
Initial Data Preparation and Selection of Tasks for Analysis
In order to make a direct comparison between dichotomous and polytomous
variables it was essential to first dichotomize the task statements and save them as a new
file. In other words, all the 2s in the original polytomous data file were changed to 1s,
which left the dichotomous data with 0s and 1s. Steps 1 and 2 of Appendix B describe how this
was carried out. The dichotomous data indicated whether the tasks were or were not a
part of the job, while the polytomous data indicated the frequency and degree of
consequence of the tasks. However, as was mentioned, only the frequency statements
were used.
When the ratings were dichotomized there were certain task statements that no
longer had variance (e.g., all data became 1s). In other words, all of the SMEs who rated
these statements felt that they had at least some degree of relevance to the entry-level
correctional peace officer position. These statements are shown in Table 1.
Table 1
Deleted Variables Due to Loss of Variance When Dichotomized

Categories: Oral Communication; Search; Security; Supervision of Wards/Inmates;
Restraints and Use of Force

Task statements:
Communicate orally.
Confiscate contraband.
Call for back-up.
Intervene in/breakup physical altercations.
Confront wards/inmates exhibiting inappropriate behavior.
Identify violent wards/inmates.
Discharge chemical agents to control resistant inmates or quell disturbances/riots.
Restrain an assaultive ward/inmate.
Use departmentally approved "use of force" methods.
If a variable does not have variability it indicates that everyone who was sampled
had 100% agreement (which is not exactly true since the data were dichotomized).
Without variance there is no standard deviation from the mean and there is no score
difference and error to use in calculations. Therefore, the variable does not add any
additional information to the analysis. In other words, when the data were dichotomized
some of the most relevant tasks were unable to be analyzed. Because of this, the items
that no longer had variance were removed from the analysis. There was variability with
these statements in polytomous form. However, in order to make a direct comparison
(using the same variables in the dichotomous and polytomous analysis) it was decided
that these statements should not be used.
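The zero-variance problem described above can be sketched as follows (hypothetical ratings; a task every SME endorsed to some degree becomes all 1s once dichotomized):

```python
def dichotomize(ratings):
    """Recode 2s to 1s, leaving 0s and 1s (the recode described above)."""
    return [0 if r == 0 else 1 for r in ratings]

def has_variance(values):
    """A variable with only one distinct value carries no information."""
    return len(set(values)) > 1

task = [1, 2, 2, 1, 2, 1]   # every SME gave the task some relevance
dich = dichotomize(task)    # becomes all 1s

print(has_variance(task), has_variance(dich))  # True False
```

Items failing this check after dichotomization were the ones removed from the analysis, even though they had variance in polytomous form.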
The data were also dichotomized by combining the lower two levels. With this
manner of dichotomizing none of the task statements were lost. This method of
dichotomizing the tasks is consistent with what Bond and Fox (2007) suggested (e.g.,
combine categories with a low frequency). However, in the current study the data were
dichotomized in a way similar to Harvey (2003; e.g., does not apply is 0 and all other
ratings were 1). Dichotomizing the data in this manner avoids combining contradictory
categories. For instance, combining "the task is not part of the job" with "less than a
majority of the individuals will perform the task" combines categories that point in
negative and positive directions. Therefore, it makes more
sense to combine the upper categories.
In the original job analysis there were several variables that were considered to
have low criticality to the entry-level correctional peace officer job position (variables
had low criticality across all SME groups). This was calculated by multiplying the
frequency and degree of consequence mean scales for each item. The variables’ criticality
indices fell between 0 and 4. The variables that were below .5, across all SME groups,
were not considered to be critical to the job. These variables are listed in Table 2.
Table 2
Deleted Variables Due to Low Criticality

Categories: Booking, Receiving and Releasing; Casework; Court-related Board
Hearings; Emergencies; General Duties; Health and Medical; Investigation; Supervision
of Non-Inmates; Written Communication

Task statements:
Discuss charges against juvenile with arresting/transporting officer.
Place holds on wards/inmates and notify department holding warrant.
Process bail.
Run warrant checks, holds and search clauses.
Assign wards/inmates to program/counselor.
Conduct a home study where juveniles are to be released.
Coordinate with external resources for ward/inmate employment and rehabilitation services.
Process applications for alternative sentencing programs.
Conduct closed circuit video arrangements.
Handle canines to control crowds.
Operates facility canteen.
Recruit job applicants and volunteers.
Serve as a departmental representative to external groups.
Distribute medication.
Weigh wards/inmates.
Administer a breath analyzer test to wards/inmates.
Supervise infants only (no adult visitors present).
Request Department of Justice (DOJ) criminal history.
Since the tasks listed in Table 2 had low criticality they would not add to the
analysis, because the tasks had little to do with the entry-level positions. Therefore, these
tasks were deleted from the analysis.
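The criticality screen described above can be sketched as follows (hypothetical scale means; the index multiplies each task's frequency mean by its degree-of-consequence mean, and tasks below .5 across all SME groups were dropped):

```python
def criticality(frequency_mean, consequence_mean):
    """Criticality index: product of the frequency and degree-of-
    consequence scale means, ranging 0-4 on the 0-2 rating scales."""
    return frequency_mean * consequence_mean

# Hypothetical tasks: (frequency mean, consequence mean)
tasks = {"Process bail": (0.30, 0.90), "Call for back-up": (1.80, 1.90)}

critical = {name: criticality(f, c) for name, (f, c) in tasks.items()}
retained = [name for name, idx in critical.items() if idx >= 0.5]
print(retained)  # ['Call for back-up']  (0.27 falls below the .5 cutoff)
```

In the actual study this screen was applied within every SME group, so a task was dropped only if its index fell below .5 for all groups.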
The polytomous factor loadings were used to determine which factors would go
into the IRT analysis. The group means with factor loadings that were above .50 were
used in the IRT analysis. This would allow the analysis to be relatively unidimensional,
which is one of the assumptions of IRT.
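A sketch of the selection rule just described (the loadings below are hypothetical placeholders, not the estimates in Table 3; category means with loadings above .50 entered the IRT analysis):

```python
# Hypothetical single-factor loadings for a few category mean scores
loadings = {
    "Emergencies": 0.67,
    "Casework": 0.35,
    "Security": 0.70,
    "Counseling": 0.37,
}

# Keep only groupings loading above .50
# (fair-or-better by the Comrey & Lee, 1992, guidelines cited below)
selected = sorted(name for name, load in loadings.items() if load > 0.50)
print(selected)  # ['Emergencies', 'Security']
```

Restricting the analysis to categories passing this cutoff keeps the item set close to unidimensional, as the IRT assumptions require.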
If the items load on a single factor it is a good indication of unidimensionality
(Embretson & Reise, 2000). Therefore, in order to test the dimensionality of the data all
of the sub categories of the questionnaire were analyzed using factor analysis. First the
mean scores of the task statements in each of the sub categories were calculated by
combining the variables in each sub category (e.g., Booking, Receiving and Releasing;
Casework; Counseling; Court-related Board Hearings; Arrests; Emergencies; Escort,
Move, Transportation; General Duties; Health and Medical; Investigation; Oral
Communication; Read, Review and Analyze; Referrals; Search; Security; Supervision of
Non-Inmates; Supervision of Wards/Inmates; Restraints and Use of Force; Written
Communication). These mean scores were then analyzed using exploratory factor
analysis. The maximum likelihood extraction method was used and a single factor
solution was imposed on the data. The mean scores, descriptive statistics, Cronbach's
alpha, inter-item correlations, and the factor loadings are shown in Table 3 below. The
means of the subgroups that had loadings > .50 were used in the final analysis. This is
based on the factor loading explanation given by Comrey and Lee (1992). Factor loadings
above .45 are considered to be in the fair range and above .55 are considered to be in the
good range. Therefore, the groupings are fair or better.
Table 3
Polytomous/Dichotomous Descriptive Statistics, Cronbach's Alpha, Inter-Item
Correlation (IIC), and Factor Loadings

Polytomous
Category                          # variables  Mean   SD    α     IIC   Factor
Booking, Receiving and Releasing       8        .67   .41   .81   .35   .46
Casework                              11        .45   .43   .86   .38   .35
Counseling                             4       1.28   .55   .78   .50   .37
Court-related Board Hearings           2        .87   .48   .61   .45   .41
Arrests                                1       1.28   .74    —     —    .37
Emergencies                           13       1.22   .33   .79   .24   .67*
Escort, Move, Transportation          16       1.35   .36   .87   .29   .67*
General Duties                        38       1.20   .30   .90   .20   .80*
Health and Medical                    14       1.14   .31   .77   .20   .73*
Investigation                         15       1.17   .42   .90   .39   .75*
Oral Communication                    15       1.55   .25   .80   .24   .67*
Read, Review and Analyze               8       1.17   .45   .78   .31   .62*
Referrals                              4       1.31   .53   .80   .52   .57*
Search                                 4       1.70   .35   .62   .33   .58*
Security                              38       1.43   .31   .93   .25   .70*
Supervision of Non-Inmates             4       1.05   .49   .72   .40   .57*
Supervision of Wards/Inmates          31       1.40   .31   .91   .26   .75*
Restraints and Use of Force            8       1.47   .32   .78   .32   .47
Written Communication                 31       1.19   .33   .90   .22   .79*

Dichotomous
Category                          Mean   SD    α     IIC   Factor
Booking, Receiving and Releasing   .56   .29   .79   .32   .60*
Casework                           .32   .27   .84   .35   .23
Counseling                         .78   .24   .63   .36   .10
Court-related Board Hearings       .76   .35   .60   .46   .49
Arrests                            .82   .38    —     —    .36
Emergencies                        .83   .14   .63   .12   .60*
Escort, Move, Transportation       .87   .18   .86   .27   .65*
General Duties                     .77   .15   .85   .13   .78*
Health and Medical                 .80   .14   .63   .12   .63*
Investigation                      .81   .20   .84   .28   .67*
Oral Communication                 .91   .10   .60   .09   .58*
Read, Review and Analyze           .76   .23   .70   .24   .49
Referrals                          .85   .23   .63   .36   .37
Search                             .96   .12   .36   .11   .51*
Security                           .86   .15   .91   .20   .73*
Supervision of Non-Inmates         .77   .29   .71   .43   .70*
Supervision of Wards/Inmates       .85   .11   .79   .12   .57*
Restraints and Use of Force        .96   .08   .41   .11   .45
Written Communication              .77   .17   .86   .17   .75*

Note. α and IIC were not estimated for Arrests because it only had one item.
*Factor loadings > .50
Figure 3. Scree Plot for Dichotomous Variables
Figure 4. Scree Plot for Polytomous Variables
Figures 3 and 4 above show the scree plots for the dichotomous and polytomous
variables that loaded above .50 on the first factor. As the scree plots show, this
grouping of variables is mostly explained by one factor. In the dichotomous analysis
the eigenvalue for the first factor was 5.70, explaining 43.82% of the variance, and
the eigenvalue for the second factor was 1.57, explaining 12.07% of the variance. The
ratio of the first to the second eigenvalue, 5.70 to 1.57, indicates that the first factor
explains 3.63 times more variance than the second. In the polytomous analysis the
eigenvalue for the first factor was 6.66, explaining 51.24% of the variance, and the
eigenvalue for the second factor was 1.27, explaining 9.75% of the variance. The ratio
for the polytomous data, 6.66 to 1.27, indicates that the first factor explains 5.24
times more variance than the second. The first factor thus explains more variance when
the data are in polytomous form than in dichotomous form, indicating that the
polytomous form of the data is closer to unidimensional.
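The percentages and ratios above follow directly from the reported eigenvalues. In the short check below, the assumption that 13 variables entered each analysis (so total variance equals 13) is inferred from the reported percentages, not stated explicitly in the text:

```python
# Reported first and second eigenvalues from the two factor analyses.
dich = [5.70, 1.57]
poly = [6.66, 1.27]
n_vars = 13  # inferred: total variance equals the number of variables analyzed

# Percentage of variance explained by each factor.
dich_pct = [round(100 * e / n_vars, 2) for e in dich]
poly_pct = [round(100 * e / n_vars, 2) for e in poly]

# Ratio of first to second eigenvalue.
dich_ratio = round(dich[0] / dich[1], 2)
poly_ratio = round(poly[0] / poly[1], 2)
print(dich_pct, poly_pct, dich_ratio, poly_ratio)
```

The computed percentages agree with the reported 43.82%, 12.07%, 51.24%, and 9.75% to within rounding, and the ratios reproduce the reported 3.63 and 5.24.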
A common rule explained by Meyers, Gamst, and Guarino (2006) is that the
factors retained should account for 50% of the variance (e.g., Tabachnick & Fidell,
2001b). The first factor for the dichotomous data accounts for almost 50% of the
variance, while the first factor for the polytomous data accounts for more than 50%.
Another rule, explained by Spicer (2005), is that retained factors should have
eigenvalues above 1. Although factors 1 and 2 both have eigenvalues above 1, a large
portion of the variance is accounted for by the first factor alone. Therefore, only one
factor was used to decide which variables would be used in the final analysis.
Assessment of Model Fit and Model Selection
PARSCALE, a program for IRT item calibration, was used to calibrate items and
assess model fit. As mentioned before, PARSCALE uses the M-GRM to estimate
rating scales. The logistic metric was used. The main difference between the normal
and logistic metrics is that the exponent in the logistic equation is multiplied by the
scaling constant 1.7 to approximate the normal metric. Embretson and Reise (2000)
explain that the "normal metric is often used in theoretical studies because the
probabilities predicted by the model may be approximated by the cumulative normal
distribution" and the "parameters are anchored to a trait distribution" (p. 131).
Because the current study is not measuring a trait of the SMEs, the logistic metric was
used, with the scaling factor of 1.7 applied to place the parameters on the normal metric.
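The role of the 1.7 scaling constant can be seen in the logistic item response function itself. The sketch below uses the two-parameter logistic form; the slope value .788 is taken from the common dichotomous slope reported later in this chapter and is used here purely for illustration:

```python
import math

def p_endorse(theta: float, a: float, b: float, D: float = 1.7) -> float:
    """Logistic IRT response probability. With D = 1.7 the logistic curve
    closely approximates the normal-ogive (normal metric) curve."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

# At theta equal to the location parameter b, the probability is .50
# regardless of the metric; away from b, D rescales the slope.
print(p_endorse(0.0, a=0.788, b=0.0))
print(round(p_endorse(1.0, a=0.788, b=0.0), 3))
```

Setting D = 1.0 instead would leave the curve in the pure logistic metric; the two parameterizations differ only by this rescaling of the slope.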
It should be noted that the previous factor analysis was done at the
subcategory level, averaging across groups of individual tasks, whereas the current
analysis is done at the item level. To decide which model to use (1PL or
2PL), it was necessary to see how well the items fit the data. Item fit was therefore
analyzed, and Figures 5, 6, 7, and 8 below show how well 193 of the items fit
the data under the 1PL and 2PL models. There were 37 items outside the plot range when
dichotomized; these items were not given a fit calculation by PARSCALE. Therefore,
the plots include only the 193 items available for both the dichotomous and polytomous
data, to allow a direct comparison. As can be seen in Appendices C and D, there were a
total of 230 items in the original analysis.
The following grand means were found for χ²/df across all of the task
statements: dichotomous 1PL model, 5.18; dichotomous 2PL model, 29.34; polytomous
1PL model, 2.84; and polytomous 2PL model, 12.36. A lower χ²/df indicates that the
items fit the model better. Comparing the dichotomous models, the 1PL grand mean
(5.18) is much lower than the 2PL grand mean (29.34), indicating better fit with the
1PL model for the dichotomous items. Likewise, the polytomous 1PL grand mean (2.84)
is lower than the polytomous 2PL grand mean (12.36), indicating better fit with the
1PL model for the polytomous items. Based on this information, the 1PL model was
used to describe the data. As mentioned before, the 1PL model has a location parameter
for each item and a common slope across all items.
Outliers are items with extremely poor fit under the different models, and they
skew the histograms. Therefore, the plots (Figures 5-8) show the items with and without
outliers so that finer comparisons can be made at lower levels of χ²/df. As the
histograms below show, the χ²/df values for the 1PL model are more centrally grouped.
The main grouping of task statements for the dichotomous χ²/df data points falls
between 0 and 12, and the main grouping for the polytomous 1PL falls between 1 and 9.
This also indicates better item fit for the 1PL model.
Figure 5. Item Fit Histogram for the Dichotomous 2PL Data With (top) and Without
(bottom) Outliers
Figure 6. Item Fit Histogram for the Dichotomous 1PL Data With (top) and Without
(bottom) Outliers
Figure 7. Item Fit Histogram for the Polytomous 2PL Data With (top) and Without
(bottom) Outliers
Figure 8. Item Fit Histogram for the Polytomous 1PL Data With (top) and Without
(bottom) Outliers
Because certain items did not fit well, items were removed so that dichotomous
versus polytomous comparisons could be made with items that conform to model
assumptions. Based on the standard outlined by Drasgow et al. (1995), task statements
with χ²/df > 3 were removed. After the initial 1PL IRT analysis, 38 items were removed
based on the polytomous item fit. The remaining tasks were rerun through the IRT
analysis, and an additional 26 items were removed due to poor fit (i.e., χ²/df > 3). The
syntax used to create the final data file and the PARSCALE syntax are in Appendix B,
and the specific items that were removed are shown in Tables 6 and 7 in Appendices E
and F. The following numbers of items remain in each subsection: Emergencies, 8;
General Duties, 28; Health and Medical, 8; Investigation, 9; Oral Communication, 13;
Read, Review, and Analyze, 5; Referrals, 4; Search, 3; Security, 26; Supervision of
Non-Inmates, 4; Supervision of Wards/Inmates, 21; Written Communication, 22. The
remaining items can be seen in Appendix A. The histograms in Figures 9 and 10 show
the dispersion of χ²/df for the final selection of task statements under the 1PL
dichotomous and polytomous models:
Figure 9. Item Fit Histogram for Dichotomous Items After Item Removal
Figure 10. Item Fit Histogram for Polytomous Items After Item Removal
The same task statements were used for both the dichotomous and polytomous
item analyses. As Figures 9 and 10 show, items that fit well in polytomous form
(Figure 10) may not fit as well when transformed into dichotomous form (Figure 9).
The overall mean χ²/df for the polytomous items is 1.82, while the overall mean for
the dichotomous items is 4.29, indicating better fit for the polytomous items. The
opposite might be true if items had been removed based on the fit of the dichotomous
task statements; however, because this study focuses on what occurs when items are
dichotomized, item removal was based on the fit of the polytomous task statements.
The task statements used in the final analysis are in Appendix B, and the statistics for
the remaining statements are in Tables 8 and 9 in Appendix C.
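The Drasgow et al. cutoff applied above amounts to a simple filter over the fit ratios. The sketch below uses hypothetical item identifiers and fit values (only items 38, 58, and 100 echo ratios reported later in this chapter; the rest are invented for illustration):

```python
def flag_misfit(chisq_over_df: dict[str, float], cutoff: float = 3.0) -> list[str]:
    """Return the items whose chi-square/df ratio exceeds the cutoff
    (the Drasgow et al., 1995, standard used in the thesis)."""
    return [item for item, ratio in chisq_over_df.items() if ratio > cutoff]

# Hypothetical polytomous fit ratios for a handful of task statements.
fit = {"T38": 1.03, "T58": 2.58, "T100": 1.05, "T63": 4.8, "T261": 3.4}

removed = flag_misfit(fit)
kept = {item: ratio for item, ratio in fit.items() if item not in removed}
grand_mean = sum(kept.values()) / len(kept)
print(removed, round(grand_mean, 2))
```

In the thesis this filter was applied twice, since refitting the model after the first removal changed the remaining items' fit statistics.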
Total Information Plots for Dichotomous and Polytomous Statements
Figures 11 and 12 below show the total information plots for the dichotomous
and polytomous task statements. The dichotomous total information plot has a much
higher peak (approximately 47) than the polytomous total information plot
(approximately 11.5), which indicates that the dichotomous task statements provide
greater precision over a narrower range of theta. The polytomous items provide
information over a broader range of theta values: at +3 theta, for instance, the
dichotomous items provide barely any information, while the polytomous items still
provide some. If one simply wanted to know whether or not a task was performed, the
dichotomous form of the task statements gives a great deal of information. However, if
one wanted to know where the categories are centered and how the tasks were rated,
the polytomous form of the data may be better, because it spreads the information
across a wider range of theta levels.
Figure 11. 1PL Dichotomous Total Information Plot
Figure 12. 1PL Polytomous Total Information Plot
Item Characteristic Curves for Dichotomous and Polytomous Models
All of the 1PL plots share a common slope. In the PARSCALE analysis, the
slope parameter for the dichotomous statements was estimated at .788 with a standard
error of .018, and the slope parameter for the polytomous statements was estimated at
.482 with a standard error of .001. With a common slope it may be difficult to
discriminate between categories in the polytomous data for those tasks whose slopes
are underestimated by the 1PL. For instance, a low b parameter indicates that most of
the SMEs responded in the upper categories and that those categories do not differ
greatly; if the slope for such a task is underestimated, it becomes harder to
differentiate those categories. The 2PL model, by estimating a separate slope for each
item, would allow better discrimination between the categories.
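The category curves discussed here come from the graded response model, where category probabilities are differences of adjacent cumulative curves. The sketch below is illustrative: the thresholds are invented, and the common slope .482 echoes the polytomous estimate above.

```python
import math

def grm_category_probs(theta: float, a: float, bs: list[float], D: float = 1.7) -> list[float]:
    """Category response probabilities for a graded response model item.
    bs are the ordered between-category thresholds; using one common `a`
    across items gives the constrained (1PL-like) model discussed above."""
    # Cumulative probabilities P(X >= k), bounded by 1 above and 0 below.
    star = [1.0] + [1.0 / (1.0 + math.exp(-D * a * (theta - b))) for b in bs] + [0.0]
    # Each category probability is the difference of adjacent cumulatives.
    return [star[k] - star[k + 1] for k in range(len(bs) + 1)]

# Hypothetical thresholds for a 3-category task statement
# (category 1 = DNA, 2 = Less than a Majority, 3 = A Majority).
probs = grm_category_probs(theta=0.0, a=0.482, bs=[-2.0, 0.5])
print([round(p, 3) for p in probs])
```

Because the category probabilities are differences of the same-slope cumulative curves, a misestimated common slope flattens or sharpens all category boundaries at once, which is exactly why discrimination between categories suffers under the constrained model.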
Tables 4 and 5, in Appendix C, give the dichotomous and polytomous items prior to
the removal of poorly fitting items, and Tables 6 and 7, in Appendix C, give the
dichotomous and polytomous items after the poorly fitting items were removed. The
tables include the PARSCALE item identifier, the item number (the original number on
the questionnaire), the b (location) parameter, its standard error, χ², df, the calculated
χ²/df, and the significance. The b parameter identifies the theta level where the item
curves the most, that is, where the item gives the most information. χ² compares the
observed and expected frequencies; a high χ² indicates that they differ substantially.
df is the degrees of freedom, and the ratio χ²/df indicates whether the item fits the
model (e.g., below 3). Finally, the significance test also compares the observed and
expected frequencies: a significant result indicates a significant difference between
them.
To give a good indication of what happens when a polytomous item is
dichotomized, three items are plotted in both polytomous and dichotomous form. As
can be seen in Figure 13, the dichotomized item 38 produces ICCs that sit far to the left.
Figure 13. 1PL Dichotomous ICC and IIC Plots for Item 38
This indicates that a majority of the SMEs rated the task statement as having
relevance to the job. The b (location) parameter for this item is -3.91.
PARSCALE did not estimate χ²/df for this item because the dichotomization
pushed the curve to a very low theta level, below the range PARSCALE plots. In fact,
there were 24 dichotomized items whose fit statistics were not estimated (PARSCALE
did not calculate χ² and df) because dichotomizing pushed the items below the plot
range. The IIC for this curve is also very low: the location where the item gives the
most information is below -3 and is not visible in this plot.
Figure 14. 1PL Polytomous ICC and IIC Plots for Item 38
In Figure 14 one can see that for the polytomous item 38 a relatively small
number of SMEs indicated that the task does not apply (category 1). Also, the ICCs
for categories 2 (Less than a Majority) and 3 (A Majority) move in opposite
directions. The b parameter for this item is -2.68 and the χ²/df is 1.03. The IIC peaks
a little below the -2 theta level. The low b parameter indicates that the SMEs rated
the item in the upper categories.
As one can see, dichotomization caused the item to give its greatest amount of
information at a lower theta level. This is because combining categories 2 and 3 makes
the data appear to show much stronger agreement: when categories 2 and 3 are merged,
the opinions of many SMEs are pooled, which lowers the theta level and gives the
appearance of widespread agreement that the task is relevant. This will be discussed in
more detail in the next section.
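The downward shift has a simple structural explanation under the graded response model: merging categories 2 and 3 leaves only the cumulative curve P(X >= 1), which is anchored at the lowest threshold. The numbers below are hypothetical thresholds for an item like item 38; the slope .482 echoes the common polytomous estimate reported above.

```python
import math

def cum(theta: float, b: float, a: float = 0.482, D: float = 1.7) -> float:
    """GRM cumulative probability P(X >= k) for a single threshold b."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

# Hypothetical ordered thresholds for a 3-category item with heavy
# upper-category endorsement (illustrative values only).
thresholds = [-3.4, -1.9]

# After merging categories 2 and 3, the dichotomous "endorse" curve is
# exactly the lowest cumulative curve, so its location is the LOWEST
# threshold rather than the item's average threshold.
dich_b = thresholds[0]
avg_b = sum(thresholds) / len(thresholds)
print(dich_b, avg_b, round(cum(avg_b, dich_b), 2))
```

At the item's average location the merged curve is already well above .50, which is why the dichotomized version appears to show near-unanimous endorsement at theta levels where the polytomous categories were still split.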
The next plot is for the dichotomized item 58 (Figure 15). It is somewhat more
centralized and, as can be seen, several more individuals felt that the task did not
apply to the job (category 1). The IIC indicates that the curve gives the most
information at about the -1.3 theta level. The b parameter for the dichotomized item 58
is -1.23 and the χ²/df is 4.15, indicating that the item does not fit the data well.
Figure 15. 1PL Dichotomous ICC and IIC Plots for Item 58
The next plot (Figure 16) is the polytomous plot for item 58. As can be seen, a
fair number of individuals felt that this task statement did not apply to the job
(category 1). At lower theta levels categories 2 and 3 move in the same direction;
however, once the item reaches the threshold (the point where the curves cross),
category 2 begins to decrease while category 3 continues to increase.
The b parameter for this item is -0.44, which gives a good indication of where
the curves cross (the threshold). The centralized b parameter indicates that the ratings
were spread across categories. Again, it may be difficult to discriminate between the
categories because of the common slope. The χ²/df is 2.58, indicating that this item fits
the data well. As the IIC plot in Figure 16 shows, the item gives the most information
at about the .5 theta level.
Figure 16. 1PL Polytomous ICC and IIC Plots for Item 58
The dichotomous plot for item 100 (Figure 17) shows curves that are somewhat
more centralized. A fair number of individuals did not think this task was relevant to
the job, and a fair number felt that the task had some degree of relevance. This item
has a b (location) parameter of -0.65, indicating where the curves slope the most. The
χ²/df for this item is 3.11, indicating reasonable item fit. The IIC indicates that the
majority of the information is given between theta levels -2 and 2, with a peak at
about -1.7.
Figure 17. 1PL Dichotomous ICC and IIC Plots for Item 100
The polytomous plot for item 100 (Figure 18) shows that a fair number of
individuals felt that the task did not apply to the job. Similar to item 58, when
categories 2 and 3 reach the threshold they begin to move in different directions;
however, they move in the same direction until a higher theta level is reached. The
IIC for this item peaks at about 0, and the majority of the information is given from
about -3 to beyond 3. The b parameter for this item is 0.13 and the χ²/df is 1.05,
which indicates that the item fits the data. The centralized b parameter indicates that
the ratings were spread across categories.
Figure 18. 1PL Polytomous ICC and IIC Plots for Item 100
Comparing the b parameters of the dichotomous and polytomous items shows
that the b parameters (where the items are centrally located) change when the items
are dichotomized. Figure 19 shows a histogram of the change in b parameters when
the tasks are changed from polytomous to dichotomous. The change ranged from a
minimum of -3.12 (decreasing) to a maximum of 1.69 (increasing), with a mean change
of -.034; in other words, the b parameters tended to decrease more than they increased.
The peak of the IIC also tended to decrease when an item was dichotomized: the point
at which the item gives the maximum amount of information was at a lower theta level
for the dichotomous items, which fits with what is shown in the total information
curves.
Figure 19. Change in b Parameters When Tasks are Dichotomized
Out of the 166 items, only 3 had an IIC that moved upward (based on where the
item peaks): item 63 moved from approximately -2 to approximately -1.8, item 101
moved from approximately -3.4 to approximately -2.8, and item 261 moved from
approximately -3.2 to approximately -2.8. However, the dichotomous versions of these
tasks had poor fit (i.e., χ²/df > 3). Based on this information it seems that, unless there
is poor fit, the IIC moves to a lower theta level, which means that the dichotomous
item would be interpreted differently based on the IIC.
Further Exploration of the Rating Scale
A final exploration of the categories was done using only the upper categories ("1.
Less than a Majority" and "2. A Majority") and excluding the lowest category ("0. Task
Not Part of Job"). This was done to address whether the upper categories alone would
be sufficient to describe the task statements. As can be seen in the total information
plot, the upper two categories peak at a higher theta level than the dichotomous form
of the data in Figure 11 above, where the upper categories were combined. This is
because combining the upper categories in the previous dichotomous model gave the
task statements relatively low b parameters. The information in this plot also has a
much higher peak than the polytomous total information plot (Figure 12), indicating
that the items give information at a more specific level of theta. The height of the
information peak is not much different between the combined upper levels (Figure 11)
and the upper levels alone (Figure 20, with DNA removed); the peak is slightly higher
when the upper levels are combined (Figure 11).
Figure 20. 1PL Dichotomous Total Information Plot With Upper Categories Only
The items shown above in Figures 13 to 18 (items 38, 58, and 100) are also
shown below in Figures 21 to 23. In this analysis, item 38 (Figure 21) has a b
parameter of -0.91, much higher than the -3.91 of the dichotomous version above
(Figure 13) with the combined upper levels. This version also has a χ²/df of 0.24,
which indicates excellent fit. With the task statement in this form, the moderately low
b parameter makes clear that a moderate number of individuals chose the middle
category (Less than a Majority) and a large number chose the upper category (A
Majority).
Figure 21. 1PL Dichotomous ICC and IIC Plots for Item 38 With Upper Categories
Item 58 (Figure 22) has a b parameter of 0.76, which is also higher than the b
parameter of the dichotomous item 58 above (Figure 15). The χ²/df for this item is
1.43, which also indicates good item fit. The b parameter indicates that a large number
of SMEs rated the task in the middle category (Less than a Majority) and a moderate
number rated it in the upper category (A Majority).
Figure 22. 1PL Dichotomous ICC and IIC Plots for Item 58 With Upper Categories
Finally, item 100 (Figure 23) has a b parameter of 0.72, again higher than the b
parameter of the dichotomous version above (Figure 17). Item 100 has a χ²/df of 1.98,
indicating that the item has good fit. The b parameter indicates that more SMEs
endorsed the middle category (Less than a Majority) than the upper category (A
Majority). It is important to note, however, that many SMEs also rated this item in the
DNA category, which can be seen in the polytomous plot in Figure 18. The meaning of
these findings will be discussed in more detail in the next chapter.
Figure 23. 1PL Dichotomous ICC and IIC Plots for Item 100 With Upper Categories
Comparing the b parameters of these items to those of the earlier dichotomous
versions, all of them moved to a higher b parameter, with a mean increase of 1.48. No
item had a b parameter above 3 or below -3. This means one gets a better visual
description of each item when analyzing the upper categories alone: the curve of the
item is viewable on the plot, because the plot spans -3 to 3. Also, with the b parameter
moving upward, the IIC for each item moves upward as well, indicating that the task
statements for the upper levels alone (compared to the combined upper levels) give
their maximum information at a higher theta level.
When comparing the dichotomous version (upper levels alone, with DNA
removed) to the polytomous version, the polytomous version gives an idea of how the
item was rated across the full rating scale, based on the b parameter, while the
dichotomous version allows one to gain a better understanding of how just the upper
levels were rated.
Chapter 6
DISCUSSION
This study examined the difference between dichotomous and polytomous task
statements when using IRT, and asked whether one needs to be cautious when
dichotomizing task statements. One thing that stood out immediately was that several
task statements completely lost their variance when they were dichotomized (i.e.,
every data point became a 1). This indicates that everyone in the sample thought that
these specific task statements had at least some relevance to the job position (i.e., no
one chose "0. Task Not Part of Job"). Therefore, immediately after dichotomizing,
some of the variables most relevant to the job position were unusable, because they
no longer had variance.
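The loss of variance described above is easy to detect programmatically. This sketch assumes ratings coded 0 = Task Not Part of Job, 1 = Less than a Majority, 2 = A Majority (the rating data themselves are hypothetical):

```python
def dichotomize(ratings: list[int]) -> list[int]:
    """Recode 0 (Task Not Part of Job) -> 0 and the two upper categories -> 1."""
    return [0 if r == 0 else 1 for r in ratings]

def has_variance(values: list[int]) -> bool:
    """A variable with a single distinct value carries no variance."""
    return len(set(values)) > 1

# A task every SME rated in an upper category loses all variance once
# dichotomized, even though the raters were split between categories.
critical_task = [1, 2, 2, 1, 2]
other_task = [0, 2, 1, 0, 2]
print(has_variance(dichotomize(critical_task)), has_variance(dichotomize(other_task)))
```

Screening dichotomized variables this way before calibration would flag exactly the highly critical tasks that had to be dropped in this study.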
There appear to be both benefits and disadvantages to dichotomizing
questionnaire data. The full data plots show that the polytomous items give
information across all theta levels, while the dichotomous items give information at
the lower theta levels. However, even though the dichotomous items do not cover the
entire spectrum of theta levels, they have a higher peak, which indicates a greater
amount of information at a lower, more specific, theta level. In occupational testing,
when one is establishing a specific pass/fail point for an examination, one often wants
the problems to sit at a specific theta level, because this targets the right ability range.
However, one might question whether the "target" theta should be so specific for task
statements.
As can be seen in the plots from the Results section, one loses a certain amount of
relevant information when one dichotomizes (e.g., by combining the upper categories).
For item 38 (Figure 13), where a large number of individuals felt the task was relevant
to the job, the combination of categories 2 (Less than a Majority) and 3 (A Majority)
pushes the ICC to a much lower theta level. It also gives the impression that there was
almost 100% agreement. In other words, the b (location) parameter is very low
(-3.91), indicating that the majority of the SMEs marked the item as having at least
some relevance to the job. While it is true that the majority of individuals chose
categories 2 and 3, and that combining the two categories yields a curve with a low b
parameter, this does not account for the fact that responses were split between
categories 2 and 3. The polytomous form of the data (using the 1PL model) does not
allow one to discriminate between categories; one would need to look at the upper
levels with DNA removed to get a clearer picture of the upper categories.
One can also see this in item 58 (Figure 16). For this item, categories 2 and 3
move together at certain points. However, the dichotomization of the variable does not
account for the fact that categories 2 and 3 diverge after they cross the threshold. As
mentioned in the Results section, if one simply wanted to know whether the task is
part of the job, then the dichotomous form of the data (with the upper levels
combined) would be sufficient to answer this question.
Item 100 is somewhat different. In the polytomous form of item 100, categories 2
and 3 move in the same direction up to a higher theta level, and a fair number of SMEs
rated this item in the DNA category. One should note how many SMEs fall in each
category: if only a few SMEs fall in one of the categories, then combining the levels
may not have an effect on the curve (Bond & Fox, 2007).
Another difference between the dichotomous and polytomous task statements lies
in the IIC, which indicates where the curves give the most information. The
dichotomous data have a sharper peak on all of the dichotomous plots, indicating that
the data give information over a more specific theta range, whereas the IICs for the
polytomous data are broader, indicating that the items cover a larger theta range.
In testing, when one wants a specific pass/fail point, it is preferable to have a
narrower IIC, because it shows the ability level at which the items function best.
However, this study is not based on right and wrong answers; it is based on educated
opinion. One could argue that some opinions are more correct than others (e.g., on
items that require extensive job experience), but each SME has a unique viewpoint
based on his or her current or previous experience in the position. In other words,
certain individuals may have had experiences that lead them to believe that certain
tasks are of higher or lower importance. It is therefore hard to say whether a narrow or
broad IIC is preferable for task-related data.
Finally, comparing the dichotomous (combined upper levels) and polytomous
IICs for the same item shows that the dichotomous IICs consistently peak at a lower
level than the polytomous IICs. This occurs because combining categories creates
what appears to be a higher level of agreement within the newly dichotomized
category. Therefore, when the data are dichotomized, they give the impression of more
agreement than actually exists.
Implications
When one dichotomizes task statements, some statements may be lost, depending
on the sample size, due to a loss of variance. All of the individuals in a sample may
feel that certain task statements are relevant to a job position; if so, the dichotomized
variable is 100% uniform (i.e., all 1s). By losing these data one loses variables that are
highly critical to the job position: all of the variables excluded from this study for lack
of variability after dichotomizing were high on the criticality index. In losing highly
critical data, one loses the data most relevant to the job. However, these task
statements could still be analyzed by looking at the upper levels alone (i.e., with DNA
removed).
As mentioned before, one should be cautious when dichotomizing opinion-related
statements, because dichotomizing changes the look and meaning of the statements.
Often the polytomous category curves move in different directions. For instance, one
could interpret the polytomous version of item 38 (Figure 14) as saying that a small
number of SMEs consider the task irrelevant to the job; a moderate number of SMEs
consider the task to be performed by less than a majority of the individuals entering
the job; and a large number of SMEs feel that the task would be performed by a
majority of the people in this job.
One should also be cautious when using the polytomous form of the data with a
1PL model. The b parameter gives a good indication of how the item was rated; for
instance, a low b parameter indicates that a majority of the SMEs chose the upper
categories. However, with a common slope one cannot discriminate between the
categories as well as one might with the 2PL model. For example, a low b parameter
alone does not reveal the degree of endorsement of the upper categories. The 1PL
model shows where the categories are centered on the continuum, which gives a good
indication of how the tasks were rated, but it may not allow one to clearly understand
the degree of discrimination between the categories.
One might also question the appropriateness of the M-GRM with a DNA category
on a 3-point scale: should the DNA category be treated as the zero point of a
continuum, or as lying off the continuum entirely? An SME who endorses the DNA
category is essentially saying the task has no degree of relevance to the job. If the
DNA category is considered to be off the continuum, then the M-GRM may not be
appropriate for the polytomous form of the data, since the scale would not be
"ordered categorical" (Embretson & Reise, 2000, p. 97).
When the item is dichotomized, it can be interpreted in the following way: a
small number of individuals consider the task irrelevant to the job, and there is a high
level of agreement that the task has some form of relevance to the job. Essentially,
when one dichotomizes by combining the upper levels, one loses the specifics of how
the responses are distributed; however, one does answer the question of whether or
not the task is performed with some frequency.
In the final analysis the items were analyzed with the upper levels ("2. Less
Than a Majority" and "3. A Majority") in isolation. Item 38 (Figure 21) could then be
interpreted as follows: a majority of the SMEs felt that this task would be performed
by a majority of the people in the job. The upper categories alone can be interpreted
from the location of the b parameter: a low b parameter indicates that most of the
SMEs chose "3. A Majority," while a high b parameter indicates that most of the
SMEs chose "2. Less Than a Majority."
One should use caution when analyzing the upper levels alone, because if many SMEs
chose “Task Not Part of Job” then one would not get the full picture of how the task
statement was rated. If only a small number of SMEs chose the lower category, then the
data can be analyzed in this manner. It may also be useful to combine both approaches:
dichotomize the data with does not apply (DNA) coded as 0 and all other categories as
1, and separately analyze the upper categories alone. This would allow one to see
whether the SMEs rated the task statement as being part of the job, as well as the
degree to which it is part of the job.
It does seem that using the polytomous form of the data with a 1PL model would not
give as clear a picture as the two dichotomous forms of the data. With the data in
polytomous form, the b parameter gives an idea of how the task was rated, but it does
not give a clear description of how the upper categories were rated. The dichotomous
form of the data with the upper levels combined answers whether or not the task is
part of the job; this question is also answered by the polytomous form of the data.
However, the dichotomous form of the data with the upper categories alone (i.e., DNA
removed) allows one to gain a greater understanding of how the upper categories were
endorsed (e.g., did more SMEs select A Majority or Less Than a Majority?). This
question is not answered by the polytomous form of the data.
Limitations
One of the limitations of this study was the moderate sample size (N = 544). It
stands to reason that with a larger sample (e.g., 1,000) there would be a greater
chance that the individuals in the sample would have collectively chosen the full
range of categories (“1. Task Not Part of Job”; “2. Less Than a Majority”; “3. A
Majority”) for each item. If this were the case, variables would not be lost when
dichotomizing. However, if 544 individuals felt that certain tasks are part of the
job, then increasing the sample may not increase the variability by much (e.g., a
large majority would still be a 1 when the variable is dichotomized). Chuah, Drasgow,
and Luecht (2006) “simulated responses of 300, 500, and 1,000 respondents” (p. 241)
and found that 300 respondents were sufficient to estimate “ability” (Chuah et al.,
2006, p. 241). Based on this, the data in this study may be considered at least a
moderate sample size for an IRT analysis.
This study may also have limited generalizability. Approximately 80 percent of the
SMEs in this study were male. If there is not a good mix of male and female SMEs, the
external validity of the study may be reduced. However, this may reflect the actual
population in a correctional facility; in other words, more males may hold, and apply
for, the Correctional Officer position.
Another factor that may have reduced accuracy is that the sample spanned three
different entry-level job positions (Correctional Officers, Juvenile Officers, and
Juvenile Counselors). The tasks analyzed were common to all of the positions.
However, it stands to reason that certain tasks are performed more by people in one
job title than by those in another. If the data came from a single job position, the
accuracy of the IRT analysis would likely increase.
The corrections agency originally looked at the positions separately. In their
analysis they were able to identify the tasks that were rated similarly and those
that were rated differently. The current study would benefit from analyzing the
positions separately; however, the resulting sample sizes would be too small for an
IRT analysis.
A final limitation is that the 1PL model is not able to discriminate between
categories with the polytomous data. For instance, it is not clear how the upper
categories were endorsed with the polytomous form of the data; therefore, the results
are not as clear with the 1PL model.
Future Research
Future research could proceed in several directions. One could use a larger sample
size to see whether variables are still lost when dichotomizing; with a larger sample
it is quite possible that no variables would be lost. One could also use different
job classifications to see whether these results generalize to other jobs. One could
further examine how different classifications of SMEs (e.g., Correctional Officers
vs. Juvenile Officers) rate the tasks and how the results differ under dichotomous
and polytomous treatments. For instance, do supervisors rate the tasks differently
than the individuals who are in the job? If so, why do they rate the tasks
differently?
One could ask what theta means for the individual. Theta may at times be difficult
to define, especially if many variables load on the factor. How does one define theta
for a job analysis? As was explained above, “the ‘difficulty’ of an item is defined
in terms of the amount of the general work activity (GWA) construct (θ) that would be
needed to produce a given level of item endorsement” (Harvey, 2003, p. 2). Therefore,
one might suggest that theta, for a job analysis, is based on the individual's
knowledge of the job position. In other words, the characteristic that one measures
through IRT is the amount of knowledge that an individual has about the tasks carried
out on the job. Also, in relation to the last question (differences between
classifications), if there are differences between classifications, then which
classification should be given more weight?
Future research could also compare the 1PL and 2PL models. One might ask whether
having both a location and a discrimination parameter would change the ICC plots for
the polytomous data, and whether this would allow one to understand the task
statements without dichotomization.
One could also ask what type of information is being received from the SMEs and
whether certain SMEs give less reliable information. For instance, is the information
from an individual who is outside of the job, but who is supposed to be knowledgeable
of it, similar to that from someone who actually holds the position? Which
information should be considered the most accurate? These and other questions could
be examined in future research.
APPENDICES
APPENDIX A
Task Statements in Final Analysis
Emergencies
36. Clean up contaminated or hazardous material.
37. Conduct emergency and disaster drills.
38. Control hostile groups, disturbances, and riots.
39. Dispatch help in emergencies or disturbances within or outside the facility.
43. Implement emergency procedures/disaster plan.
46. Report emergencies.
47. Respond to disturbances or emergencies within or outside the facility.
49. Negotiate hostage release.
Escort, Move, Transportation
50. Escort medical professionals who are providing medical services to wards/inmates.
51. Escort vehicle(s) during emergency and/or high security transport.
52. Escort ward/inmates within and outside the facility.
53. Evaluate ward's/inmate's potential security risk prior to transport.
54. Inform central control of ward/inmate movement.
55. Monitor all individuals and vehicle movement inside, outside and in the immediate
area of the facility.
56. Process vehicles entering, leaving or within the facility.
58. Prepare wards/inmates for transportation to court, hospital, etc.
59. Transport equipment, supplies and evidence.
60. Transport injured wards/inmates.
61. Transport wards/inmates individually and in groups outside the facility.
62. Use ward/inmate daily movement sheet.
63. Issue passes/ducats to wards/inmates.
65. Move wards/inmates in and out of cells.
General Duties
66. Assign jobs to wards/inmates.
67. Attend staff meetings.
68. Attend training.
69. Clean areas of the facility when wards/inmates are not available.
70. Consult with supervisors.
71. Develop proposals for program, facility or policy improvements.
72. Exchange ward/inmate linens and clothing.
75. Distribute mail, supplies, meals, commissary items, equipment, etc.
77. Approve or disapprove special purchases for wards/inmates.
78. Confiscate and replace damaged ward/inmate linens and clothing.
80. Operate a vehicle or bicycle.
83. Order supplies.
85. Prepare meals.
86. Process law library requests and library books.
89. Raise/lower flag.
91. Report food shortages.
93. Serve meals.
94. Tour other facilities.
95. Give instructions to staff.
96. Observe the work of facility staff through peer review.
98. Process ward/inmate grievances and complaints.
99. Respond to ward/inmate questions or requests.
100. Train correctional staff.
101. Instruct wards/inmates.
102. Observe blind spots using a curved mirror.
103. Monitor wards/inmates and facility using closed circuit television systems.
104. Operate communication equipment.
105. Operate safety equipment.
Health and Medical
110. Identify the immediate need for medical treatment.
111. Prepare injured individuals for transport.
112. Report changes in ward/inmate physical, mental and emotional condition.
113. Screen wards/inmates to determine if medical/mental health attention is needed
before intake/booking.
116. Verify that wards/inmates receive food for special diets.
118. Decontaminate wards/inmates after use of chemical agent.
119. Comply with Prison Rape Elimination Act guidelines.
120. Implement safety/heat precautions for wards/inmates on psychotropic medications.
Investigation
123. Assist police in their investigation of crimes.
124. Develop ward/inmate informants.
125. Gather information for disciplinary proceedings.
129. Implement ward/inmate due process procedures.
130. Interview wards/inmates as part of an investigation.
131. Investigate accidents or crimes that occur within the facility.
132. Investigate disciplinary reports.
133. Investigate ward/inmate injuries.
137. Process evidence.
Oral Communication
138. Alert staff members of ward/inmate behavior changes.
139. Answer phone calls.
143. Communicate with external departments.
144. Confer with staff, specialists and others regarding wards/inmates.
145. Explain institutional policies, procedures and services to wards/inmates.
146. Follow oral instructions.
147. Give oral instructions and reports.
148. Inform relief staff of facility events during shift change.
149. Inform visitors and staff of facility facts, policies and procedures individually or in
groups.
150. Notify supervisors of potential emergencies/hazards.
151. Notify wards/inmates of visitors.
152. Translate foreign languages into English.
153. Use radio codes to communicate with staff.
Read, Review, and Analyze
156. Interpret common street terminology.
157. Interpret Department of Justice (DOJ) criminal history reports.
158. Read written and/or electronic documents.
159. Review forms and documents for accuracy and completeness.
161. Review ward/inmate case files.
Referrals
162. Advocate for urgent services for wards/inmates.
163. Identify wards/inmates in need of medical or psychiatric care.
164. Make appropriate referrals.
165. Obtain assistance for wards/inmates in need of medical, dental or psychiatric care.
Search
167. Dispose of contraband.
169. Perform a contraband watch.
170. Search individuals, property, supplies, areas and vehicles.
Security
171. Account for facility keys.
172. Activate personal and/or control center alarm.
178. Compare fingerprints/palmprints to verify identification of wards/inmates.
180. Conduct metal detection screening of visitors.
182. Inspect food for contamination and/or tampering.
183. Inspect and document vehicle safety and operating condition.
185. Log weapons/guns in and out.
186. Monitor all persons entering, leaving and within the facility.
188. Monitor the zone control panel.
189. Operate/secure gates, doors, locks and sallyports.
191. Process wards/inmates leaving a security area.
192. Protect the security of courtrooms, hospitals and other external locations when
wards/inmates are present.
193. Report ward/inmate count discrepancies.
195. Sign in and out of the facility.
196. Test all equipment to ensure proper functioning.
197. Update count of visitors entering and leaving the facility.
198. Account for location and status of wards/inmates.
199. Verify identification badges and passes.
201. Verify ward/inmate count.
202. Verify ward/inmate identity.
204. Admit and release visitors.
205. Issue identification badges and passes.
206. Screen visitors against approved visitor list and enforce visiting dress code.
207. Use stamp and black light to identify visitors.
208. Maintain confidentiality of information.
209. Account for location and status of staff within and outside the facility.
Supervision of Non-inmates
210. Conduct facility tours.
211. Escort contract workers, non-custody staff and visitors within the facility.
213. Supervise visitors in contact and non-contact visits.
214. Account for location and status of visitors.
Supervision of Wards/Inmates
216. Arrange daily schedules of wards/inmates.
217. Identify wards/inmates with disabilities and assist them.
218. Assist wards/inmates with paperwork/schoolwork.
221. Encourage wards/inmates through positive feedback.
222. Evaluate wards/inmates.
223. Hire wards/inmates for work detail.
224. Identify gang affiliation and implement processing procedures.
225. Identify homosexual behavior.
229. Intervene in ward/inmate disputes to deescalate a potentially violent conflict.
231. Maintain ward/inmate discipline.
234. Monitor ward/inmate activity.
235. Monitor ward/inmate phone calls.
236. Monitor wards/inmates for signs of alcohol or drug use/abuse and document any
issues.
237. Monitor wards/inmates in safety cell, sobering cells, crisis rooms/center or
restraints.
239. Obtain and process urine samples.
240. Obtain wards/inmates signature on forms.
242. Plan on and off-grounds activities for wards/inmates.
243. Prevent unauthorized ward/inmate communication.
244. Recommend ward/inmate work assignments.
245. Supervise ward/inmate cell and area moves.
248. Implement suicide watch procedures.
Written Communication
261. Complete paperwork and forms.
263. File and retrieve documents and record system.
264. Notify housing units of wards/inmates scheduled for release or transfer.
265. Notify sender and receiver of the confiscation of contraband.
266. Prepare a list of wards/inmates going to court.
268. Process deceased inmates.
269. Process ward/inmate money.
270. Record all phone calls placed to or by wards/inmates in a log.
271. Document changes in wards'/inmates' mental and physical condition.
273. Document inspections and security checks.
274. Document persons and vehicles entering and leaving the facility.
275. Document the condition or security of perimeter structures, weapons or equipment.
276. Document ward/inmate injuries.
278. Document ward/inmate movement and activities.
279. Document ward/inmate rule violations.
280. Document ward/inmate trust account information.
281. Document ward/inmate visits.
282. Document whether ward/inmate takes or refuses medication or food.
283. Release property or money to transferred, released or paroled ward/inmate.
285. Request repairs to facility and/or equipment.
289. Update list of approved ward/inmate visitors.
290. Update logs, documents, records and files.
APPENDIX B
How Data were Converted and Run in PARSCALE
This guide indicates how the data were converted and run in PARSCALE. It is not
meant to be a comprehensive guide; it is simply the step-by-step process that was
followed in order to run the data in this project. Some of these steps would not be
the same if one were looking at how test questions are functioning (e.g., the
recoding would be different). The data were recoded in SPSS Version 16.0.
Step 1: Recode the variables into 1s and 0s.
Note that this step does not apply to the polytomous data. The following steps are
taken in SPSS to recode: Transform > Recode into Same Variables > Select Variables to
be Recoded > Click Old and New Values > Enter Old Value as 2 and the New Value as 1 >
Add > Continue > OK. This changes all of the 2s to 1s, which makes the data
dichotomous.
Step 2: Recode the blanks into 9s. The following steps are taken in SPSS to change
the blanks: Transform > Recode into Same Variables > Select Variables to be Recoded >
Click Old and New Values > Click System Missing > Enter New Value 9 > Add >
Continue > OK.
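As a minimal scripted sketch of the two SPSS recodes in Steps 1 and 2 (assumed for illustration, not part of the original procedure), the same transformation can be expressed as a small function, with `None` standing in for a system-missing blank:

```python
# Recode a single rating the way Steps 1 and 2 describe:
# blanks (system missing) -> 9, and 2 -> 1 to make the data dichotomous.
def recode(value):
    if value is None:        # system-missing ("blank")
        return 9
    return 1 if value == 2 else value

# One hypothetical respondent's row of ratings.
row = [0, 1, 2, None, 2]
print([recode(v) for v in row])  # [0, 1, 1, 9, 1]
```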
Step 3: The following syntax was used to create the data file:
write outfile='f:practice4_dichotomous4.dat'
/idcode (a12,1x) EM_A036 EM_A037 EM_A038 EM_A039 EM_A043
EM_A046 EM_A047 EM_A049 ES_A050 ES_A051 ES_A052 ES_A053
ES_A054 ES_A055 ES_A056 ES_A057 ES_A058 ES_A059 ES_A060 ES_A061
ES_A062 ES_A063 ES_A065 GD_A066 GD_A067 GD_A068 GD_A069
GD_A070 GD_A071 GD_A072 GD_A075 GD_A077 GD_A078 GD_A080
GD_A083 GD_A085 GD_A086 GD_A089 GD_A091 GD_A093 GD_A094
GD_A095 GD_A096 GD_A098 GD_A099 GD_A100 GD_A101 GD_A102
GD_A103 GD_A104 GD_A105 HM_A110 HM_A111 HM_A112 HM_A113
HM_A116 HM_A118 HM_A119 HM_A120 IN_A123 IN_A124 IN_A125
IN_A129 IN_A130 IN_A131 IN_A132 IN_A133 IN_A137 OC_A138 OC_A139
OC_A143 OC_A144 OC_A145 OC_A146 OC_A147 OC_A148 OC_A149
OC_A150 OC_A151 OC_A152 OC_A153 RR_A156 RR_A157 RR_A158
RR_A159 RR_A161 RF_A162 RF_A163 RF_A164 RF_A165 SR_A167
SR_A169 SR_A170 SC_A171 SC_A172 SC_A178 SC_A180 SC_A182
SC_A183 SC_A185 SC_A186 SC_A188 SC_A189 SC_A191 SC_A192
SC_A193 SC_A195 SC_A196 SC_A197 SC_A198 SC_A199 SC_A201
SC_A202 SC_A204 SC_A205 SC_A206 SC_A207 SC_A208 SC_A209
SN_A210 SN_A211 SN_A213 SN_A214 SI_A216 SI_A217 SI_A218 SI_A221
SI_A222 SI_A223 SI_A224 SI_A225 SI_A229 SI_A231 SI_A234 SI_A235
SI_A236 SI_A237 SI_A239 SI_A240 SI_A242 SI_A243 SI_A244 SI_A245
SI_A248 WC_A261 WC_A263 WC_A264 WC_A265 WC_A266 WC_A268
WC_A269 WC_A270 WC_A271 WC_A273 WC_A274 WC_A275 WC_A276
WC_A278 WC_A279 WC_A280 WC_A281 WC_A282 WC_A283 WC_A285
WC_A289 WC_A290
(166(n1)).
exe.
The same syntax was used for the dichotomous and polytomous data; the only
difference was the filename. The syntax starts with the filename followed by the ID
code. The a12 indicates that the ID code is alphanumeric and contains 12 characters.
This is followed by the variable names that were assigned. 166(n1) indicates that
there are 166 single-digit variables. Finally, the syntax ends with an execute
command (exe).
Step 4: Create a default file for the non-answered portions of the test.
A default file can be created in Notepad. This file should have the same format as
the original file; in other words, there should be the same number of spaces for the
ID code and variables as in the original data file. The default file contains only
one line. Start the line with an ID code (e.g., 000000000001; notice there are 12
digits in the ID code). Follow this with a space and as many 9s as there are
variables (e.g., 166 variables equals 166 9s).
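The default line described in Step 4 can also be generated by a short script rather than typed by hand. This is a sketch under the assumptions used in this project (12-character ID, one column per variable); the output filename `default.dat` is illustrative:

```python
# Build the one-line default file: a 12-digit ID, a space,
# and one 9 for every variable in the data file.
n_variables = 166
line = "000000000001" + " " + "9" * n_variables

with open("default.dat", "w") as f:
    f.write(line + "\n")

print(len(line))  # 12 + 1 + 166 = 179 characters
```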
Step 5: Create the syntax to run the data in PARSCALE.
The syntax below was used to analyze the data in PARSCALE. This was after finding
good fit and rerunning the data with only 136 task statements:
Polytomous analysis
>COMMENTS
Final analysis of the questionnaire using polytomous data.
>FILE
OFNAME='MissDichPol.txt',DFNAME='practice4_polytomous4.dat', SAVE;
>SAVE
PARM='polytomous.par', SCORE='polytomous.SCO', FIT='polytomous.fit';
>INPUT NIDCH=12, NTOTAL=166, NTEST=1, LENGTH=166;
(12A1,1x,166A1)
>TEST1 TNAME=POLYTOTAL, INAME=(
a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,a17,a18,a19,
a20,a21,a22,a23,a24,a25,a26,a27,a28,a29,a30,a31,a32,a33,a34,a35,a36,a37,a38,
a39,a40,a41,a42,a43,a44,a45,a46,a47,a48,a49,a50,a51,a52,a53,a54,a55,a56,a57,
a58,a59,a60,a61,a62,a63,a64,a65,a66,a67,a68,a69,a70,a71,a72,a73,a74,a75,a76,
a77,a78,a79,a80,a81,a82,a83,a84,a85,a86,a87,a88,a89,a90,a91,a92,a93,a94,a95,
a96,a97,a98,a99,a100,a101,a102,a103,a104,a105,a106,a107,a108,a109,a110,a111,
a112,a113,a114,a115,a116,a117,a118,a119,a120,a121,a122,a123,a124,a125,a126,
a127,a128,a129,a130,a131,a132,a133,a134,a135,a136,a137,a138,a139,a140,a141,
a142,a143,a144,a145,a146,a147,a148,a149,a150,a151,a152,a153,a154,a155,a156,
a157,a158,a159,a160,a161,a162,a163,a164,a165,a166),
NBLOCK=1;
>BLOCK1 BNAME=POLY, NITEMS=166, NCAT=3, ORIGINAL=(0,1,2), MODIFIED=(1,2,3),
CADJUST=0.0, CNAME=(none,perform,mjperform);
>CALIB GRADED, LOGISTIC, SCALE=1.7, NQPTS=15, CYCLES=(50,1,1,1,1),
NEWTON=2, CRIT=0.01, ITEMFIT=10, TPRIOR,SPRIOR,CSLOPE;
>SCORE MLE, SMEAN=0.0, SSD=1.0, NAME=MLE, PFQ=5;
Dichotomous analysis
>COMMENTS
Final analysis of the questionnaire using dichotomous data.
>FILE
DFNAME='practice4_dichotomous4.dat',OFNAME='MissDichPol.dat',SAVE;
>SAVE
PARM='dichotomous.par', SCORE='dichotomous.SCO', FIT='dichotomous.fit';
>INPUT NIDCH=12, NTOTAL=166, NTEST=1, LENGTH=166;
(12A1,1x,166A1)
>TEST1 TNAME=Form291, INAME=(
a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,
a17,a18,a19,a20,a21,a22,a23,a24,a25,a26,a27,a28,a29,a30,a31,a32,
a33,a34,a35,a36,a37,a38,a39,a40,a41,a42,a43,a44,a45,a46,a47,a48,a49,
a50,a51,a52,a53,a54,a55,a56,a57,a58,a59,a60,a61,a62,a63,
a64,a65,a66,a67,a68,a69,a70,a71,a72,a73,a74,a75,a76,a77,
a78,a79,a80,a81,a82,a83,a84,a85,a86,a87,a88,a89,a90,a91,
a92,a93,a94,a95,a96,a97,a98,a99,a100,a101,a102,a103,a104,
a105,a106,a107,a108,a109,a110,a111,a112,a113,a114,a115,a116,
a117,a118,a119,a120,a121,a122,a123,a124,a125,a126,a127,a128,
a129,a130,a131,a132,a133,a134,a135,a136,a137,a138,a139,a140,
a141,a142,a143,a144,a145,a146,a147,a148,a149,a150,a151,a152,
a153,a154,a155,a156,a157,a158,a159,a160,a161,a162,a163,a164,
a165,a166), NBLOCK=1;
>BLOCK BNAME=DICH, NITEMS=166, NCAT=2, ORIGINAL=(0,1), MODIFIED=(1,2),
CADJUST=0.0, CNAME=(none,perform);
>CALIB GRADED, LOGISTIC, SCALE=1.7, NQPTS=15, CYCLES=(50,1,1,1,1),
NEWTON=2, CRIT=0.01, ITEMFIT=10, TPRIOR, SPRIOR,CSLOPE;
>SCORE MLE, SMEAN=0.0, SSD=1.0, NAME=MLE, PFQ=5;
One thing worth mentioning is that the difference between the 1PL and the 2PL runs
is that the 1PL model includes the CSLOPE (common slope) command and the 2PL model
does not.
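To illustrate what the CSLOPE constraint does (an explanatory sketch, not PARSCALE's implementation), the graded response model's boundary probability of responding in category k or higher is P*(θ) = 1 / (1 + exp(−D·a·(θ − b_k))). A 1PL run holds the discrimination a equal across items; a 2PL run estimates a per item, so the curves are no longer parallel:

```python
import math

def p_star(theta, a, b, D=1.7):
    """GRM boundary probability of scoring in category k or higher."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

# Same boundary location b_k, two different discriminations:
# with a common slope (1PL) every item would use the same a,
# so all boundary curves would be parallel; under the 2PL they differ.
b_k = 0.5
common = p_star(0.0, 1.0, b_k)   # slope shared across items
steep = p_star(0.0, 2.5, b_k)    # item-specific, steeper slope
print(round(common, 3), round(steep, 3))  # 0.299 0.107
```

The values b_k, a = 1.0, and a = 2.5 are arbitrary illustrations; the point is that at the same location a steeper slope changes how sharply the categories are separated, which is exactly the distinction the 1PL model cannot express.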
Step 6: Run the data through all of the phases to get your results.
APPENDIX C
Polytomous Data for 1PL Model Before Removal of Poorly Fitting Items
Table A1

Item     b Parameter   SE     χ²       df   χ²/df   Sig
Number   (Location)
36       -0.36         0.16   41.23    17   2.43    0.001
37       -1.05         0.15   19.52    16   1.22    0.242
38       -2.81         0.21   9.46     7    1.35    0.221
39       -0.74         0.14   34.69    16   2.17    0.004
40*      -0.85         0.20   79.13    16   4.95    0.000
41*      -0.12         0.22   181.52   17   10.68   0.000
43       0.90          0.17   30.90    18   1.72    0.030
44*      0.77          0.15   130.54   18   7.25    0.000
45*      -1.13         0.21   100.01   15   6.67    0.000
46       -3.52         0.22   15.31    7    2.19    0.032
47       -3.19         0.21   15.98    7    2.28    0.025
48*      -1.37         0.19   114.90   15   7.66    0.000
49       3.07          0.18   26.03    18   1.45    0.099
50       -0.96         0.17   29.81    16   1.86    0.019
51       -0.25         0.21   28.43    17   1.67    0.040
52       -2.50         0.18   5.48     9    0.61    0.792
53       -0.63         0.17   20.95    17   1.23    0.228
54       -3.26         0.22   8.45     7    1.21    0.294
55       -1.40         0.19   24.20    15   1.61    0.062
56       -0.50         0.22   54.09    17   3.18    0.000
57       1.33          0.18   36.98    18   2.05    0.005
58       -0.35         0.18   41.71    17   2.45    0.001
59       -0.32         0.18   31.99    17   1.88    0.015
60       -1.02         0.21   32.00    16   2.00    0.010
61       -0.22         0.19   32.35    17   1.90    0.014
62       -2.92         0.19   3.34     7    0.48    0.853
63       -2.85         0.22   17.81    7    2.54    0.013
64**     -3.61         0.28   24.31    7    3.47    0.001
65       -3.72         0.27   16.30    7    2.33    0.022
66       -0.12         0.15   32.20    17   1.89    0.014
67       -2.02         0.18   24.01    12   2.00    0.020
68       -4.41         0.27   9.21     6    1.53    0.161
69       -1.23         0.18   37.55    15   2.50    0.001
70       -4.08         0.28   20.16    6    3.36    0.003
71       1.14          0.15   25.11    18   1.40    0.122
72       -1.57         0.17   31.06    15   2.07    0.009
Item     b Parameter   SE     χ²       df   χ²/df   Sig
Number   (Location)
73*      -0.87         0.14   88.31    16   5.52    0.000
74*      -2.53         0.23   36.31    8    4.54    0.000
75       -3.43         0.24   23.31    7    3.33    0.002
76*      2.65          0.16   143.44   18   7.97    0.000
77       1.91          0.16   33.40    18   1.86    0.015
78       -1.71         0.18   21.94    13   1.69    0.056
79**     -2.86         0.21   22.09    7    3.16    0.003
80       -1.71         0.16   19.47    13   1.50    0.109
82*      1.70          0.13   146.49   18   8.14    0.000
83       -0.76         0.17   35.00    16   2.19    0.004
84**     0.70          0.17   63.57    18   3.53    0.000
85       2.20          0.16   37.65    18   2.09    0.004
86       1.74          0.16   22.78    18   1.27    0.199
87*      -1.78         0.18   54.59    13   4.20    0.000
88**     -1.70         0.19   44.39    13   3.41    0.000
89       1.48          0.17   42.00    18   2.33    0.001
91       0.39          0.15   31.76    17   1.87    0.016
93       -0.90         0.16   36.48    16   2.28    0.003
94       1.60          0.17   20.25    17   1.19    0.261
95       -0.63         0.17   26.67    17   1.57    0.063
96       1.69          0.16   26.69    18   1.48    0.085
97*      0.35          0.14   71.15    17   4.19    0.000
98       0.49          0.15   45.58    17   2.68    0.000
99       -2.84         0.22   10.41    7    1.49    0.166
100      0.25          0.16   23.61    17   1.39    0.130
101      -4.23         0.26   9.62     6    1.60    0.140
102      -1.65         0.16   16.01    13   1.23    0.248
103      -0.73         0.16   14.66    16   0.92    0.550
104      -4.08         0.26   5.96     6    0.99    0.428
105      -2.58         0.16   8.42     7    1.20    0.296
106*     0.19          0.31   370.70   17   21.81   0.000
107*     -0.53         0.24   177.82   17   10.46   0.000
108**    -0.61         0.18   54.69    17   3.22    0.000
110      -1.58         0.17   26.14    15   1.74    0.036
111      0.24          0.15   32.53    17   1.91    0.013
112      -2.02         0.19   35.85    12   2.99    0.000
113      2.48          0.16   46.96    18   2.61    0.000
114*     -0.96         0.20   131.05   16   8.19    0.000
115*     1.13          0.13   101.03   18   5.61    0.000
116      -0.43         0.17   36.17    17   2.13    0.004
118      -2.84         0.20   7.71     7    1.10    0.358
119      -2.01         0.16   17.42    12   1.45    0.134
120      -1.95         0.17   37.71    12   3.14    0.000
121**    2.94          0.15   59.04    18   3.28    0.000
Item     b Parameter   SE     χ²       df   χ²/df   Sig
Number   (Location)
123      2.42          0.17   41.94    18   2.33    0.001
124      0.64          0.17   30.34    17   1.78    0.024
125      -0.77         0.17   34.37    16   2.15    0.005
126**    -1.27         0.23   57.43    15   3.83    0.000
127*     -1.49         0.26   115.12   15   7.67    0.000
128*     -2.00         0.23   60.23    12   5.02    0.000
129      0.02          0.15   48.89    17   2.88    0.000
130      -0.31         0.19   38.11    17   2.24    0.002
131      0.36          0.18   41.41    17   2.44    0.001
132      1.00          0.17   40.67    18   2.26    0.002
133      -0.44         0.18   59.26    17   3.49    0.000
134*     -2.04         0.23   74.79    12   6.23    0.000
135*     -1.71         0.26   69.20    13   5.32    0.000
136*     -2.10         0.26   73.03    12   6.09    0.000
137      -1.07         0.19   41.50    16   2.59    0.000
138      -3.63         0.24   12.36    7    1.77    0.089
139      -4.40         0.26   9.65     6    1.61    0.139
141**    0.98          0.20   64.55    18   3.59    0.000
142**    -3.93         0.25   12.04    6    2.01    0.061
143      0.73          0.16   35.65    18   1.98    0.008
144      -2.33         0.19   12.12    10   1.21    0.277
145      -3.29         0.22   7.79     7    1.11    0.351
146      -4.65         0.31   5.01     6    0.83    0.544
147      -4.04         0.27   12.72    6    2.12    0.047
148      -5.05         0.36   7.66     5    1.53    0.175
149      -1.17         0.16   35.01    15   2.33    0.003
150      -3.60         0.23   14.36    7    2.05    0.045
151      -2.39         0.19   17.20    10   1.72    0.070
152      1.47          0.20   56.72    18   3.15    0.000
153      -4.75         0.31   12.67    6    2.11    0.048
154*     -1.76         0.15   74.93    13   5.76    0.000
155*     -1.46         0.16   79.22    15   5.28    0.000
156      -1.58         0.17   19.05    15   1.27    0.211
157      1.93          0.16   18.53    18   1.03    0.421
158      -1.60         0.16   37.18    15   2.48    0.001
159      -1.47         0.16   29.67    15   1.98    0.013
160*     1.34          0.14   98.21    18   5.46    0.000
161      -0.39         0.17   28.33    17   1.67    0.041
162      0.55          0.14   39.83    17   2.34    0.001
163      -1.95         0.22   36.26    12   3.02    0.000
164      -1.25         0.22   28.58    15   1.91    0.018
165      -2.02         0.21   24.14    12   2.01    0.020
167      -3.21         0.21   14.64    7    2.09    0.041
168**    -4.36         0.29   22.61    6    3.77    0.001
Item     b Parameter   SE     χ²       df   χ²/df   Sig
Number   (Location)
169      -1.42         0.17   27.74    15   1.85    0.023
170      -3.94         0.26   12.86    6    2.14    0.045
171      -3.45         0.21   16.88    7    2.41    0.018
172      -3.91         0.25   10.72    6    1.79    0.097
173**    -0.88         0.16   38.09    16   2.38    0.002
175**    -4.52         0.32   21.13    6    3.52    0.002
176**    -4.82         0.37   22.14    6    3.69    0.001
177**    -3.48         0.22   21.73    7    3.10    0.003
178      3.18          0.18   35.17    18   1.95    0.009
179*     -1.10         0.15   65.14    16   4.07    0.000
180      -0.68         0.20   30.09    16   1.88    0.018
181*     -5.40         0.42   16.40    4    4.10    0.003
182      0.70          0.15   45.90    18   2.55    0.000
183      -0.60         0.17   24.31    17   1.43    0.111
184*     -1.18         0.20   79.77    15   5.32    0.000
185      -0.16         0.19   41.83    17   2.46    0.001
186      -1.40         0.20   13.69    15   0.91    0.550
187*     1.93          0.15   74.38    18   4.13    0.000
188      0.68          0.17   20.82    18   1.16    0.288
189      -2.62         0.20   7.46     7    1.07    0.383
190**    -3.33         0.23   27.41    7    3.92    0.000
191      -2.43         0.21   15.99    10   1.60    0.099
192      -0.36         0.18   24.18    17   1.42    0.114
193      -2.97         0.19   11.84    7    1.69    0.105
194**    -3.55         0.24   24.05    7    3.44    0.001
195      -5.43         0.40   7.39     4    1.85    0.115
196      -3.43         0.21   12.46    7    1.78    0.086
197      -0.61         0.21   34.98    17   2.06    0.006
198      -4.46         0.30   13.32    6    2.22    0.038
199      -2.40         0.18   27.10    10   2.71    0.003
200**    -4.81         0.32   17.25    6    2.87    0.009
201      -4.29         0.28   14.51    6    2.42    0.024
202      -4.75         0.35   19.12    6    3.19    0.004
203*     -0.02         0.20   146.08   17   8.59    0.000
204      -0.37         0.22   45.24    17   2.66    0.000
205      -0.25         0.16   40.07    17   2.36    0.001
206      -0.51         0.25   41.61    17   2.45    0.001
207      0.13          0.19   39.29    17   2.31    0.002
208      -2.98         0.19   6.32     7    0.90    0.503
209      -1.56         0.15   31.63    14   2.26    0.005
210      1.55          0.16   18.01    17   1.06    0.388
211      -0.10         0.19   45.62    17   2.68    0.000
213      -0.78         0.22   29.66    16   1.85    0.020
214      -1.05         0.22   24.51    16   1.53    0.079
Item     b Parameter   SE     χ²       df   χ²/df   Sig
Number   (Location)
215*     -1.74         0.14   83.83    13   6.45    0.000
216      -0.29         0.16   46.18    17   2.72    0.000
217      -1.07         0.16   17.75    16   1.11    0.339
218      0.88          0.16   25.96    18   1.44    0.100
221      -2.46         0.20   11.01    10   1.10    0.357
222      -1.90         0.17   23.25    12   1.94    0.026
223      0.07          0.16   31.45    17   1.85    0.018
224      -1.16         0.17   42.65    15   2.84    0.000
225      -1.32         0.17   37.34    15   2.49    0.001
226**    -3.18         0.25   24.66    7    3.52    0.001
228**    -5.11         0.38   10.66    4    2.67    0.030
229      -3.71         0.24   13.27    7    1.90    0.065
230**    2.92          0.17   59.03    18   3.28    0.000
231      -3.86         0.24   12.42    7    1.77    0.087
232**    -3.74         0.31   23.92    7    3.42    0.001
233**    -3.27         0.24   23.66    7    3.38    0.001
234      -4.64         0.34   6.85     6    1.14    0.335
235      -2.11         0.19   17.14    12   1.43    0.144
236      -3.40         0.25   17.55    7    2.51    0.014
237      -2.11         0.18   32.42    12   2.70    0.001
238*     -0.87         0.15   74.33    16   4.65    0.000
239      -0.22         0.17   39.80    17   2.34    0.001
240      -1.71         0.19   32.07    13   2.47    0.002
241**    1.71          0.14   62.63    18   3.48    0.000
242      2.98          0.17   44.35    18   2.46    0.001
243      -2.41         0.20   8.68     10   0.87    0.563
244      -0.35         0.18   52.60    17   3.09    0.000
245      -3.57         0.25   25.89    7    3.70    0.001
246*     -3.01         0.23   36.88    7    5.27    0.000
247*     -6.09         0.47   0.00     0
248      -0.62         0.15   47.46    17   2.79    0.000
260**    -2.03         0.17   33.65    12   2.80    0.001
261      -4.15         0.27   10.14    6    1.69    0.118
262*     0.53          0.18   78.35    17   4.61    0.000
263      -0.71         0.15   27.36    16   1.71    0.038
264      -0.30         0.17   32.78    17   1.93    0.012
265      -0.84         0.16   24.41    16   1.53    0.081
266      2.17          0.17   29.27    18   1.63    0.045
267*     -0.65         0.14   97.31    17   5.72    0.000
268      2.24          0.19   43.52    18   2.42    0.001
269      1.71          0.16   20.72    18   1.15    0.294
270      -0.74         0.15   34.08    16   2.13    0.005
271      -1.55         0.20   39.44    15   2.63    0.001
272*     -1.59         0.20   66.59    15   4.44    0.000
Item     b Parameter   SE     χ²       df   χ²/df   Sig
Number   (Location)
273      -4.13         0.24   21.42    6    3.57    0.002
274      -0.43         0.19   18.92    17   1.11    0.333
275      -2.17         0.18   22.71    12   1.89    0.030
276      -2.48         0.22   20.33    9    2.26    0.016
277*     -0.09         0.15   74.23    17   4.37    0.000
278      -3.03         0.22   14.90    7    2.13    0.037
279      -4.32         0.28   21.88    6    3.65    0.001
280      2.03          0.17   29.87    18   1.66    0.039
281      -0.86         0.17   26.75    16   1.67    0.044
282      -1.50         0.18   18.88    15   1.26    0.219
283      1.13          0.17   39.42    18   2.19    0.003
285      -2.30         0.19   25.89    10   2.59    0.004
286*     1.40          0.15   88.97    18   4.94    0.000
287*     -1.45         0.15   72.56    15   4.84    0.000
288**    1.30          0.17   63.61    18   3.53    0.000
289      1.46          0.18   50.21    18   2.79    0.000
290      -2.33         0.18   22.82    10   2.28    0.012
291**    -0.44         0.14   54.68    17   3.22    0.000
Note. *Removed after first IRT attempt with polytomous χ²/df over 3. **Removed after
multiple IRT attempts with polytomous χ²/df over 3. Blanks on Sig and χ²/df are items
not calculated by PARSCALE.
APPENDIX D
Dichotomous Data for 1PL Model Before Removal of Poorly Fitting Items
Table A2

Item     b Parameter   SE     χ²       df   χ²/df   Sig
Number   (Location)
36       -0.81         0.14   60.49    6    10.08   0.000
37       -1.38         0.14   18.55    5    3.71    0.002
38       -5.17         0.73   0.00     0
39       -1.00         0.12   26.90    6    4.48    0.000
40*      -2.08         0.22   14.81    4    3.70    0.005
41*      -1.70         0.17   17.70    5    3.54    0.004
43       -0.24         0.11   44.20    8    5.53    0.000
44*      0.11          0.10   68.22    8    8.53    0.000
45*      -2.81         0.31   5.67     3    1.89    0.127
46       -5.17         1.00   0.00     0
47       -3.07         0.48   0.00     0
48*      -2.92         0.54   5.23     1    5.23    0.021
49       1.02          0.10   19.54    10   1.95    0.034
50       -1.65         0.18   11.79    5    2.36    0.037
51       -1.18         0.19   15.60    6    2.60    0.016
52       -2.22         0.30   17.08    4    4.27    0.002
53       -1.05         0.13   46.73    6    7.79    0.000
54       -2.89         0.32   1.65     2    0.83    0.441
55       -1.75         0.21   5.23     5    1.05    0.388
56       -1.21         0.22   29.25    6    4.88    0.000
57       -0.05         0.11   29.65    8    3.71    0.000
58       -1.34         0.17   15.18    6    2.53    0.019
59       -0.89         0.16   37.84    6    6.31    0.000
60       -1.36         0.23   39.33    5    7.87    0.000
61       -1.02         0.17   31.34    6    5.22    0.000
62       -1.99         0.26   33.07    5    6.61    0.000
63       -2.09         0.24   11.38    4    2.85    0.022
64**     -2.45         0.49   35.88    4    8.97    0.000
65       -2.12         0.32   29.46    4    7.36    0.000
66       -0.61         0.11   84.44    8    10.55   0.000
67       -2.16         0.24   25.12    4    6.28    0.000
68       -4.26         0.70   0.00     0
69       -1.91         0.19   10.20    5    2.04    0.069
70       -3.55         0.96   0.00     0
71       -0.04         0.10   39.67    8    4.96    0.000
99
72
73*
74*
75
76*
77
78
79**
80
82*
83
84**
85
86
87*
88**
89
91
93
94
95
96
97*
98
99
100
101
102
103
104
105
106*
107*
108**
110
111
112
113
114*
115*
116
118
119
120
-1.42
-0.86
-2.19
-2.92
1.11
0.47
-1.76
-2.58
-1.62
0.58
-1.22
-0.68
0.74
0.32
-1.21
-1.76
0.14
-0.40
-0.83
0.13
-1.30
0.42
-0.33
-0.55
-4.16
-0.69
-3.01
-1.98
-1.40
-2.98
-1.87
-2.60
-2.17
-1.34
-1.57
-0.60
-2.40
0.88
-2.72
0.25
-1.00
-2.79
-2.26
-1.45
0.15
0.12
0.32
0.32
0.11
0.10
0.21
0.32
0.19
0.09
0.16
0.11
0.10
0.10
0.17
0.24
0.10
0.11
0.14
0.10
0.14
0.10
0.11
0.11
0.44
0.12
0.32
0.23
0.14
0.47
0.20
0.52
0.21
0.15
0.16
0.11
0.25
0.10
0.27
0.10
0.13
0.32
0.30
0.16
32.58
20.29
19.32
0.00
55.38
57.27
24.94
6.40
16.09
50.11
35.47
20.88
26.16
36.07
28.28
23.70
73.17
39.02
48.63
26.73
23.38
46.54
18.10
10.67
0.00
17.66
0.00
22.61
13.20
0.00
24.24
20.25
19.16
39.51
38.79
23.57
11.29
33.94
2.76
46.98
29.72
9.69
27.27
20.71
5
6
4
1
10
8
5
4
5
8
6
7
8
8
6
5
8
8
6
8
6
8
8
8
0
7
0
5
5
0
5
4
4
6
5
8
4
9
3
8
6
3
4
5
6.52
3.38
4.83
0.00
5.54
7.16
4.99
1.60
3.22
6.26
5.91
2.98
3.27
4.51
4.71
4.74
9.15
4.88
8.11
3.34
3.90
5.82
2.26
1.33
0.000
0.003
0.001
0.930
0.000
0.000
0.000
0.169
0.007
0.000
0.000
0.004
0.001
0.000
0.000
0.000
0.000
0.000
0.000
0.001
0.001
0.000
0.021
0.220
2.52
0.014
4.52
2.64
0.000
0.022
4.85
5.06
4.79
6.58
7.76
2.95
2.82
3.77
0.92
5.87
4.95
3.23
6.82
4.14
0.000
0.001
0.001
0.000
0.000
0.003
0.023
0.000
0.432
0.000
0.000
0.021
0.000
0.001
100
121**
123
124
125
126**
127*
128*
129
130
131
132
133
134*
135*
136*
137
138
139
141**
142**
143
144
145
146
147
148
149
150
151
152
153
154*
155*
156
157
158
159
160*
161
162
163
164
165
167
1.11
0.67
-0.54
-1.01
-1.83
-2.40
-2.55
-0.60
-0.86
-0.72
-0.06
-0.92
-2.29
-2.10
-3.35
-1.47
-3.67
-4.29
-0.40
-2.36
-0.61
-2.66
-2.78
-4.56
-2.77
-3.98
-1.51
-4.42
-2.29
0.00
-3.63
-1.04
-0.94
-1.73
0.49
-1.10
-1.56
0.20
-0.85
-0.23
-2.01
-1.58
-2.25
-2.50
0.10
0.10
0.12
0.17
0.25
0.41
0.31
0.12
0.17
0.14
0.12
0.15
0.30
0.33
0.39
0.17
1.29
0.58
0.13
0.37
0.11
0.28
0.32
0.85
0.67
0.80
0.21
0.95
0.27
0.12
0.70
0.13
0.14
0.19
0.10
0.15
0.15
0.10
0.13
0.10
0.23
0.18
0.21
0.22
52.66
43.37
29.36
43.06
19.35
16.05
19.03
20.65
48.86
5.55
20.60
15.58
17.71
18.75
0.00
11.64
0.00
0.00
39.04
26.02
18.28
11.93
4.66
0.00
13.81
0.00
23.58
0.00
8.72
52.77
0.00
62.32
60.87
24.63
67.49
67.00
8.17
35.83
45.49
96.48
15.50
28.88
4.86
2.73
10
8
8
6
5
4
4
8
6
7
8
6
4
4
0
5
0
0
8
4
8
4
3
0
3
0
5
0
4
8
0
6
6
5
8
6
5
8
6
8
5
5
4
4
5.27
5.42
3.67
7.18
3.87
4.01
4.76
2.58
8.14
0.79
2.57
2.60
4.43
4.69
0.000
0.000
0.000
0.000
0.002
0.003
0.001
0.008
0.000
0.595
0.008
0.016
0.002
0.001
2.33
0.040
4.88
6.50
2.29
2.98
1.55
0.000
0.000
0.019
0.018
0.197
4.60
0.003
4.72
0.000
2.18
6.60
0.068
0.000
10.39
10.14
4.93
8.44
11.17
1.63
4.48
7.58
12.06
3.10
5.78
1.21
0.68
0.000
0.000
0.000
0.000
0.000
0.146
0.000
0.000
0.000
0.009
0.000
0.302
0.607
101
168**
169
170
171
172
173**
175**
176**
177**
178
179*
180
181*
182
183
184*
185
186
187*
188
189
190**
191
192
193
194**
195
196
197
198
199
200**
201
202
203*
204
205
206
207
208
209
210
211
213
-2.71
-1.68
-2.80
-2.28
-3.88
-0.76
-6.34
-3.16
-3.27
1.29
-0.90
-1.53
-6.51
-0.21
-0.95
-1.16
-0.46
-1.50
0.53
-0.23
-1.79
-3.22
-1.67
-1.09
-2.26
-2.22
-3.29
-2.78
-1.34
-1.75
-2.21
-3.87
-3.59
-5.22
-1.69
-1.37
-0.73
-1.09
-0.41
-3.00
-1.45
0.16
-0.99
-1.92
0.88
0.20
0.40
0.29
0.48
0.15
1.20
0.83
0.35
0.11
0.13
0.20
1.06
0.10
0.16
0.16
0.14
0.23
0.10
0.12
0.29
0.34
0.30
0.18
0.26
0.37
0.44
0.29
0.20
0.85
0.26
0.64
0.52
0.94
0.18
0.25
0.13
0.27
0.14
0.34
0.17
0.10
0.17
0.24
28.42
26.27
7.02
5.53
0.00
59.54
0.00
0.00
0.00
30.22
36.92
7.83
0.00
20.63
35.46
3.91
45.51
22.81
42.34
42.07
34.25
0.00
39.51
20.53
14.33
30.38
0.00
6.16
10.17
134.91
5.70
0.00
0.00
0.00
18.98
11.25
39.88
39.43
46.09
0.00
12.87
21.74
44.68
11.33
3
5
3
4
0
7
0
0
0
10
6
5
0
8
6
6
8
5
8
8
5
0
5
6
4
4
0
3
6
5
4
0
0
0
5
6
7
6
8
0
5
8
6
5
9.47
5.25
2.34
1.38
0.000
0.000
0.070
0.236
8.51
0.000
3.02
6.15
1.57
0.001
0.000
0.165
2.58
5.91
0.65
5.69
4.56
5.29
5.26
6.85
0.008
0.000
0.691
0.000
0.000
0.000
0.000
0.000
7.90
3.42
3.58
7.60
0.000
0.002
0.006
0.000
2.05
1.70
26.98
1.43
0.102
0.117
0.000
0.221
3.80
1.88
5.70
6.57
5.76
0.002
0.080
0.000
0.000
0.000
2.57
2.72
7.45
2.27
0.024
0.006
0.000
0.045
102
214
215*
216
217
218
221
222
223
224
225
226**
228**
229
230**
231
232**
233**
234
235
236
237
238*
239
240
241**
242
243
244
245
246*
247*
248
260**
261
262*
263
264
265
266
267*
268
269
270
271
-0.72
-1.03
-0.61
-1.34
-0.10
-2.39
-1.51
-0.50
-1.53
-1.76
-7.38
-3.85
-1.12
1.11
-3.05
-6.44
-2.89
-6.63
-2.27
-3.02
-1.81
-0.80
-1.25
-2.09
0.44
1.19
-2.46
-0.99
-3.48
-3.34
-4.59
-0.72
-1.39
-2.87
-0.79
-1.15
-0.81
-1.31
0.46
-0.65
0.55
0.29
-0.93
-1.76
0.25
0.13
0.12
0.16
0.11
0.23
0.17
0.11
0.19
0.20
1.17
0.82
1.27
0.11
0.41
0.89
0.32
1.50
0.27
0.50
0.26
0.11
0.15
0.22
0.09
0.11
0.26
0.13
0.58
0.48
1.01
0.12
0.17
0.41
0.12
0.14
0.14
0.14
0.11
0.12
0.11
0.10
0.13
0.19
157.43
54.69
44.70
30.58
72.01
19.27
33.07
57.19
19.99
21.48
0.00
0.00
325.05
40.79
0.00
0.00
2.91
0.00
5.80
0.00
39.62
45.76
18.41
18.41
53.21
41.83
12.61
24.61
0.00
0.00
0.00
80.64
24.96
11.59
26.64
23.54
26.84
15.62
12.79
51.12
18.99
17.01
30.90
10.81
7
6
8
6
8
4
5
8
5
5
0
0
6
10
0
0
1
0
4
0
5
6
6
4
8
10
4
6
0
0
0
7
5
2
6
6
6
6
8
8
8
8
6
5
22.49
9.12
5.59
5.10
9.00
4.82
6.61
7.15
4.00
4.30
0.000
0.000
0.000
0.000
0.000
0.001
0.000
0.000
0.001
0.001
54.18
4.08
0.000
0.000
2.91
0.084
1.45
0.213
7.92
7.63
3.07
4.60
6.65
4.18
3.15
4.10
0.000
0.000
0.005
0.001
0.000
0.000
0.013
0.000
11.52
4.99
5.80
4.44
3.92
4.47
2.60
1.60
6.39
2.37
2.13
5.15
2.16
0.000
0.000
0.003
0.000
0.001
0.000
0.016
0.118
0.000
0.015
0.030
0.000
0.055
103
-2.49
0.38
12.29
4
3.07
0.015
272*   -3.12                    0.50    0.00    0
273    -1.02                    0.19   29.98    6   5.00    0.000
274    -1.92                    0.23    6.22    5   1.24    0.285
275    -2.51                    0.30   21.34    4   5.33    0.000
276    -0.45                    0.12   36.82    8   4.60    0.000
277*   -2.43                    0.35   23.18    4   5.79    0.000
278    -4.04                    0.61    0.00    0
279     0.40                    0.11   21.61    8   2.70    0.006
280    -1.36                    0.16   22.10    6   3.68    0.001
281    -1.79                    0.17   11.26    5   2.25    0.046
282    -0.12                    0.11   23.76    8   2.97    0.003
283    -2.40                    0.28   28.54    4   7.14    0.000
285     0.39                    0.11   46.50    8   5.81    0.000
286*   -0.94                    0.13   82.77    6   13.79   0.000
287*    0.07                    0.12   19.55    8   2.44    0.012
288**   0.26                    0.11   31.60    8   3.95    0.000
289    -1.49                    0.16   36.31    5   7.26    0.000
290    -0.66                    0.11   36.33    8   4.54    0.000
291**
Note. *Removed after first IRT attempt with polytomous χ²/df over 3. **Removed after multiple IRT attempts when polytomous χ²/df was over 3. Blanks on Sig and χ²/df are items not calculated by PARSCALE.
APPENDIX E
Polytomous Data for 1PL Model After Removal of Poorly Fitting Items
Table A3
PARSCALE ID   Item   b Parameter (Location)   SE   χ²   df   χ²/df   Sig
[Table A3 rows A1-A166 (items 36-290): b parameter (location), SE, χ², df, χ²/df, and Sig for each retained polytomously scored item.]
APPENDIX F
Dichotomous Data for 1PL Model After Removal of Poorly Fitting Items
Table A4
PARSCALE ID   Item   b Parameter (Location)   SE   χ²   df   χ²/df   Sig
[Table A4 rows A1-A166 (items 36-290): b parameter (location), SE, χ², df, χ²/df, and Sig for each retained dichotomously scored item.]
Note. Blanks on Sig and χ²/df are items not calculated by PARSCALE.
APPENDIX G
Dichotomous (Upper Categories Combined) and Polytomous Matrix Plots
Figure G1. PARSCALE Identifiers A1-A100, Dichotomous Data Plots.
Figure G2. PARSCALE Identifiers A101-A166, Dichotomous Data Plots.
Figure G3. PARSCALE Identifiers A1-A100, Polytomous Data Plots.
Figure G4. PARSCALE Identifiers A101-A166, Polytomous Data Plots.
REFERENCES
Baranowski, L. E. & Anderson, L. E. (2005). Examining rating source variation in work
behavior to KSA linkages. Personnel Psychology, 58, 1041-1054.
Bemis, S.E., Belenky, A. H. & Soder, D.A. (1983). Job analysis: An effective
management tool. Washington, D.C.: The Bureau of National Affairs, Inc.
Bond, T.G. & Fox, C.M. (2007). Applying the Rasch Model: Fundamental measurement
in the human sciences (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Brannick, M. T., Levine, E. L., & Morgeson, F. P. (2007). Job and work analysis: Methods, research, and applications for human resource management (2nd ed.). Thousand Oaks, CA: Sage Publications.
Chuah, S. C., Drasgow, F., & Luecht, R. (2006). How big is big enough? Sample size requirements for CAST item parameter estimation. Applied Measurement in Education, 19(3), 241-255.
Comrey, A. L., & Lee, H. B. (1992). A first course in factor analysis (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
Drasgow, F., Levine, M.V., Sherman, T., Williams, B. & Mead, A. D. (1995). Fitting
polytomous item response theory models to multiple-choice tests. Applied
Psychological Measurement, 19(2), 143-165.
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.
Guion, R. M. (1998). Assessment, measurement, and prediction for personnel decisions.
Mahwah, NJ: Lawrence Erlbaum Associates.
117
Harvey, R. J. (1991). Job analysis. In M. D. Dunnette (Ed.), Handbook of industrial and organizational psychology (pp. 71-163). Palo Alto, CA: Consulting Psychologists Press.
Harvey, R. J. (2003, April). Applicability of binary IRT models to job analysis data. Paper presented at the 2003 Symposium at the Annual Conference of the Society for Industrial and Organizational Psychology, Orlando, FL.
Hernandez, A., Drasgow, F., & Gonzalez-Roma, V. (2004). Investigating the functioning
of a middle category by means of a mixed-measurement model. Journal of
Applied Psychology, 89(4), 687-699.
Kaplan, R. M. & Saccuzzo, D. P. (2005). Psychological testing: Principles, applications,
and issues (6th ed.). Belmont, CA: Thomson Wadsworth.
Meyers, L. (2005). Rater reliability. Unpublished manuscript.
Meyers, L. (2007). Reliability, error, and attenuation. Unpublished manuscript.
Meyers, L. S., Gamst, G., & Guarino, A. J. (2006). Applied multivariate research: Design and interpretation. Thousand Oaks, CA: Sage Publications.
Mumford, M. D., & Peterson, N. G. (1999). The O*NET content model: Structural considerations in describing jobs. In N. G. Peterson, M. D. Mumford, W. C. Borman, P. R. Jeanneret, & E. A. Fleishman (Eds.), An occupational information system for the 21st century: The development of O*NET (pp. 21-30). Washington, DC: American Psychological Association.
Prien, E. P., Goodstein, L. D., Goodstein, J. & Gamble, L. G. Jr. (2009). A practical
guide to job analysis. San Francisco, CA: John Wiley & Sons, Inc.
118
Spicer, J. (2005). Making sense of multivariate data analysis. Thousand Oaks, CA: Sage
Publications.
Taylor, W. L. (1953). 'Cloze procedure': A new tool for measuring readability. Journalism Quarterly, 30, 415-433.
U.S. Department of Labor. (1939). Dictionary of occupational titles. Washington, DC:
U.S. Government Printing Office.
Wei, J., & Salvendy, G. (2004). The cognitive task analysis methods for job and task design: Review and reappraisal. Behaviour & Information Technology, 23(4), 273-299.
Wilson, M. A., & Harvey, R. J. (1990). The value of relative-time-spent ratings in task-oriented job analysis. Journal of Business and Psychology, 4(4), 453-461.
Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14(2), 97-116.
Yen, W. M. (1993). Scaling performance assessments: Strategies for managing local item
dependence. Journal of Educational Measurement, 30, 187-213.