Journal of Applied Psychology
1988, Vol. 73, No. 4, 647-656
Copyright 1988 by the American Psychological Association, Inc. 0021-9010/88/$00.75

A Test of the Measurement Equivalence of the Revised Job Diagnostic Survey: Past Problems and Current Solutions

Jacqueline R. Idaszak, William P. Bottom, and Fritz Drasgow
University of Illinois, Urbana-Champaign

The measurement equivalence of the revised Job Diagnostic Survey (JDS) was studied across samples from five worker populations. Samples included workers at a printing plant, engineers, nurses and nurses' aides, dairy employees, and part-time workers. Data were analyzed according to Jöreskog's model for simultaneous factor analysis in several populations (SIFASP), revealing the five factors contained in Hackman and Oldham's theory of job characteristics. A sixth factor also appeared that apparently resulted from the two different formats used on the instrument. When the data from each group were analyzed separately by principal axes factor analysis, three-, four-, and five-factor solutions appeared. To explain these inconsistencies, a Monte Carlo simulation was conducted. Matrices representing the a priori JDS factor loadings and a hypothetical, lengthened JDS with twice the number of items per factor were used in the simulation with three sample sizes (Ns = 75, 150, and 900). Results suggested that for scales like the JDS, which has only a few items per factor, sample sizes larger than those typically recommended are needed to consistently recover the true underlying structure. The simulation results support our conclusions that the SIFASP solution is preferable to the principal axes solution and that the JDS provides measurement equivalence across worker populations.

We thank Carol Kulik, Greg Oldham, and Paul Langner for allowing us to use their dairy-worker data. In addition, we would like to thank Greg Oldham for his helpful comments on an earlier version of this article and Ledyard Tucker for providing us with access to his library of statistical subroutines. Correspondence concerning this article should be addressed to Jacqueline R. Idaszak, Department of Psychology, University of Illinois, Psychology Building, 603 East Daniel Street, Champaign, Illinois 61820.

The multidimensional nature of the job characteristic approach to job design is appealing to both researchers and practitioners because of its potential for diagnosing similarities and differences among jobs. Essentially, the goal of this approach is to provide a job-descriptive language that makes it possible to conceptualize jobs independently of worker skills, abilities, and activities (Fleishman & Quaintance, 1984). Most instruments developed to assess job characteristics are in survey form and are completed by job incumbents (Hackman & Oldham, 1975, 1980; Sims, Szilagyi, & Keller, 1976). For such instruments to be useful and for scale scores to be comparable across workers from different occupations and organizations, the instrument must provide equivalent measurement across diverse worker subpopulations. Equivalence of measurement is obtained when workers performing tasks with the same standing on the latent job characteristic construct give the same average rating (i.e., have the same expected observed score). Currently, no measure of job characteristics has been shown to provide measurement equivalence. For example, one of the most popular measures of job characteristics in the psychology literature is Hackman and Oldham's (1975, 1980) Job Diagnostic Survey (JDS), which was developed to assess five job characteristics across organizations and organizational levels. Despite extensive study by organizational researchers (Dunham, 1976; Dunham, Aldag, & Brief, 1977; Fried & Ferris, 1986; Green, Armenakis, Marbert, & Bedeian, 1979; Pokorney, Gilmore, & Beehr, 1980), little evidence for the hypothesized five factors has emerged. Rather, these studies, which have used a wide variety of worker samples, have yielded one-, two-, three-, and four-factor solutions, in addition to an occasional five-factor solution.

Results from three recent studies (Harvey, Billings, & Nilan, 1985; Idaszak & Drasgow, 1987; Kulik, Oldham, & Langner, 1988) using a quantitative measure of fit to determine dimensionality have differed sharply from the earlier results. Instead of fewer than five factors, the chi-square goodness-of-fit statistic from maximum likelihood factor analysis indicated that more than five factors were needed to account for the correlations of the 15 JDS items. Idaszak and Drasgow found strong support for the hypothesized five factors and a sixth factor, which corresponded to the five items that were written in such a way that they required reverse scoring. Similar results were obtained for a sample of workers by Kulik et al. In addition to finding the hypothesized five factors and the reverse-scored factor, Harvey et al. found a factor associated with the five items that use 7-point rating scales with anchors under the first, middle, and last scale values.

To eliminate the measurement artifact that they found, Idaszak and Drasgow (1987) revised the items requiring reverse scoring so that all of the items on the survey could be scored in the same direction. Using a maximum likelihood factor analysis of revised JDS data, Idaszak and Drasgow (1987) and Kulik et al. (1988) obtained results that strongly supported the expected number of factors and the expected loadings of items on these factors. The three-anchor factor identified by Harvey et al. (1985) was not found. Identification of the hypothesized JDS factor pattern for two homogeneous worker populations does not definitively establish the dimensionality of the revised JDS nor does it provide information about the measurement equivalence of the revised JDS across other populations.
Given the inconsistent results from past studies of JDS data, the revised JDS needs to be assessed with samples from several groups before its measurement properties are fully understood. The research described in this article addresses the measurement equivalence issue and some methodological problems with factor analysis of the original and revised JDS.

In Study 1, measurement equivalence of the revised JDS was tested using Jöreskog's (1971) model for simultaneous factor analysis in several populations (SIFASP). Five samples of workers were used in this study: workers at a printing plant, engineers, student part-time workers, nurses and nurses' aides, and dairy workers. These groups were chosen because each varies greatly from the others in tasks performed on the job and qualifications needed to perform the job. Each group responded to the 15 items on the revised JDS. Responses for each group were factor analyzed individually using principal axes factor analysis and together using SIFASP.

Before beginning Study 1, we had expected factor analysis of the revised JDS to consistently yield the five task characteristic factors; however, factor analyses of the separate data sets did not confirm this expectation. In contemplating possible explanations for the poor results, we considered the possibility that our results might be analogous to the results from validity studies that found large differences in validity coefficients across different samples. Following Schmidt and Hunter's (1977) line of reasoning, we hypothesized that the apparent situational specificity of the JDS was due to sampling fluctuations. This hypothesis was tested in Study 2. Finally, our results and a reviewer's thoughtful comments on earlier drafts of this article led us to reconsider ways of factor analyzing data from narrow occupational categories and, more generally, from homogeneous subpopulations of a more heterogeneous population.
In a homogeneous job class, there may be little variation in some task characteristic across workers. This restriction of range can reduce the intercorrelations of items measuring the task characteristic and, consequently, the task characteristic factor would not be expected to emerge in a factor analysis. A heterogeneous sample or samples from several diverse worker populations are needed to extract the task characteristic factor. Thus, the difficulties that have been encountered when data from separate samples are analyzed individually, both in our Study 1 and in earlier studies such as Dunham et al. (1977), are not too surprising. In Study 1 and in the Discussion section, we show how Jöreskog's (1971) SIFASP can be used to overcome problems associated with separate analyses of homogeneous samples.

Study 1

The SIFASP Model

To use the SIFASP analysis as a test of the measurement equivalence for the JDS, we begin with the assumption that the factor analytic model holds for each subpopulation g. Let xg = (x1, x2, ..., x15)' be the vector of observed variables (items) with mean vector μg and variance-covariance matrix Σg for group g. Then xg can be written as a linear function of kg common factors fg and 15 unique factors zg,

    xg = μg + Λg fg + zg,   (1)

where the factor loading matrix of group g is Λg and has order 15 × kg. With the usual assumptions that the unique factors are mutually uncorrelated and uncorrelated with the common factors, it immediately follows that the variance-covariance matrix Σg of the gth group is

    Σg = Λg Φg Λg' + Θg,   (2)

where Φg is the variance-covariance matrix of the common factors and Θg is a diagonal variance-covariance matrix of unique variable variances (i.e., uniquenesses). The hypothesis that we wish to test is

    HΛ: Λ1 = Λ2 = ··· = Λ5,

namely, that the factor loading matrices are invariant over the five groups. When this hypothesis is true, we have the same factor loadings across groups.
The alternative hypothesis is that HΛ is false; at least one factor loading differs across groups. If the hypothesis HΛ cannot be rejected, we can then test the additional hypothesis that the uniquenesses are invariant:

    HΘ: Θ1 = Θ2 = ··· = Θ5.

The hypotheses given in HΛ and HΘ are tested by fixing or constraining certain parameters in the Λg and Θg matrices. If the SIFASP model with constrained parameters provides a satisfactory goodness-of-fit statistic, we have evidence for measurement equivalence of the revised JDS.

Why should we expect factor loading matrices to be invariant across groups? Meredith (1964) proved that if the factor analysis model holds in some population, then it holds in subpopulations derived from the overall population by a very wide variety of types of selection. In particular, Meredith showed that the factor loading matrices Λg are invariant across subpopulations when (a) the observed variables are scaled appropriately, (b) the regressions of the common factors on the selection variables are linear, and (c) the selection variables used to form subpopulations are independent of the unique variables. The factor variance-covariance matrices Φg can differ, however, because selection (i.e., restriction of range) affects variances and covariances of factor scores. Meredith's restriction on the scaling of the observed variables is needed because the Φg are not factor correlation matrices. In the SIFASP analysis, they are factor variance-covariance matrices. Conceptually, the reason for not rescaling the Φg to be correlation matrices is that some groups may be more or less heterogeneous with respect to, say, skill variety or task significance than are other groups. In fact, a near-zero factor variance would mean that the factor has very little influence on the responses of a particular group. Scaling the Φg to be correlation matrices would hide such between-group differences.
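The covariance structure in Equations 1 and 2, and the way group-specific factor variances alter observed covariances while the loadings stay invariant, can be sketched numerically. This is a minimal NumPy illustration; the loadings, uniquenesses, and factor variances below are invented for the sketch and are not the JDS estimates.

```python
import numpy as np

# Hypothetical invariant loading matrix Lambda (6 items, 2 factors).
L = np.array([[.8, .0],
              [.7, .0],
              [.6, .0],
              [.0, .8],
              [.0, .7],
              [.0, .6]])

# Diagonal uniquenesses Theta_g (also illustrative values).
Theta = np.diag([.36, .51, .64, .36, .51, .64])

def implied_cov(Phi):
    """Equation 2: Sigma_g = Lambda Phi_g Lambda' + Theta_g."""
    return L @ Phi @ L.T + Theta

# Heterogeneous group: unit factor variances.
Phi_het = np.array([[1.0, .3], [.3, 1.0]])
# Range-restricted group: selection has shrunk factor 1's variance.
Phi_res = np.array([[0.2, .1], [.1, 1.0]])

S_het = implied_cov(Phi_het)
S_res = implied_cov(Phi_res)

# Same Lambda, but restriction of range attenuates the covariance
# of items 1 and 2, which both measure factor 1:
print(S_het[0, 1])  # .8 * 1.0 * .7 = 0.56
print(S_res[0, 1])  # .8 * 0.2 * .7 = 0.112
```

The shrunken covariance in the restricted group is why a task characteristic factor can fail to emerge when a homogeneous sample is factored on its own, even though the loadings are unchanged.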
Jöreskog (1971) and Meredith (1964) described this situation in technical terms. Meredith's (1964) first condition can be satisfied by analyzing covariance matrices or by following Jöreskog's (1971) rescaling procedure. The second condition seems unlikely to be grossly violated because linear regressions provide very good approximations to most nonlinear regressions found in organizational psychology. The third condition can be checked by viewing each unique variable as the sum of a nonrandom specific variable and a random measurement error variable. It appears quite reasonable to assume that any selection variable is independent of random measurement error; hence, Meredith's third condition could only be violated by an association of a selection variable and a specific variable. The difference between an item's reliability (perhaps obtained by a test-retest design) and its communality provides an indication of the variance of the specific variable. A variance near zero would imply that Meredith's third condition was satisfied.

Method

Samples. The first sample consisted of 134 workers employed at an urban printing plant in central Illinois. This is the sample used by Idaszak and Drasgow (1987) to study the revised JDS. Of the employees, 94 were women and 40 were men. Of the employees who responded, 87% were White, 47% were married, 56% had up to a high school education, and 31% had 1 to 3 years of college education. The average age was 31 years, the average tenure with the organization was between 1 and 3 years, and the respondents worked an average of 38 hr per week.

The second data set consisted of 95 engineers employed at a processing plant in northern Illinois. Of the employees who responded, 76% were men and 24% were women, 82% were White, 70% were married, and 68% had 4 to 6 years of college education.
The average age of the respondents was 39 years, the average tenure at the organization was between 6 and 10 years, and the average number of hours worked per week was 44.

The third data set consisted of 140 nurses and nurses' aides employed at a hospital in central Illinois. The average age of the nurses was between 33 and 38 years, 95% were women and 5% were men, 89% were White, 72% were married, 51% had 0 to 1 years of technical training, and 36% had 3 years of technical training. The respondents averaged 5 to 7 years tenure on the job.

The fourth data set, collected by Kulik et al. (1988), consisted of 224 dairy workers employed at farms in central Illinois. Their average age was 35 years, and 79% were men and 21% were women. Although 110 respondents had up to a high school education, the average level of education was some college or technical school. Their average job tenure was 13 years and 10 months.

The final data set consisted of 269 part-time workers. Of the respondents, 57% were women and 43% were men, more than 90% were single, and their average age was 21. Most of the respondents (more than 90%) were undergraduate students at a university. They had worked an average of 11 months on the job when the survey was distributed.

Administration procedures. A written survey was administered to both the engineers and the printing company employees by members of their respective organizational staffs. Employees were asked to complete the surveys on their own time and to return the completed forms, in a sealed envelope, to a specific mailbox within one week. Participation was voluntary and confidentiality of responses was assured.

A written survey was distributed to the nurses and nurses' aides in groups of 15 to 25 by researchers, on company time. Participation was voluntary and confidentiality of responses was assured. Surveys were then returned directly to researchers.
A written survey was distributed to dairy workers according to the procedure described by Kulik et al. (1988).

A written survey was distributed by college students as part of a class project in a field research methods course. The surveys were distributed to groups of one to four individuals who held part-time jobs. Participation was voluntary and confidentiality was assured.

Measures. The surveys distributed to each of the five samples included the autonomy, task identity, skill variety, task significance, and feedback questions from the revised version (Idaszak & Drasgow, 1987) of Hackman and Oldham's (1975, 1980) JDS.

Analyses. Our strategy for fitting factor analysis models to the data generally followed the approach suggested by Jöreskog (1971). We began by testing the equality of the covariance matrices of the 15 JDS items across the five groups. Box's (1949) test for the equality of covariance matrices was used. After performing the Box test, we rescaled the covariance matrices as suggested by Jöreskog and used the rescaled covariance matrices in subsequent analyses. (The rescaled covariance matrices are available in Idaszak & Drasgow, in press.)

The second step of Jöreskog's (1971) SIFASP analysis involves separate analyses of the data for each group. Because of recent research on the revised JDS (Idaszak & Drasgow, 1987; Kulik et al., 1988), we decided to perform these analyses using confirmatory factor analysis. In these analyses, elements of the factor pattern matrix were specified to be fixed (at zero) or free according to Table 5 from Idaszak and Drasgow (1987, p. 73). Free parameters were estimated by both generalized least squares (GLS; Jöreskog & Goldberger, 1972) and maximum likelihood using the LISREL VI computer program (Jöreskog & Sörbom, 1984).
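The first analysis step, Box's (1949) test of equal covariance matrices, can be sketched generically. This is not the authors' implementation: the article reports an F approximation, whereas the sketch below uses Box's simpler chi-square approximation.

```python
import numpy as np

def box_m(samples):
    """Box's M test for equality of covariance matrices.

    `samples` is a list of (n_i x p) data arrays, one per group.
    Returns M, its chi-square approximation, and the degrees of
    freedom p(p + 1)(g - 1)/2.
    """
    g = len(samples)
    p = samples[0].shape[1]
    ns = np.array([s.shape[0] for s in samples])
    covs = [np.cov(s, rowvar=False) for s in samples]  # unbiased S_i
    N = ns.sum()
    # Pooled covariance matrix, weighted by degrees of freedom.
    S_pool = sum((n - 1) * S for n, S in zip(ns, covs)) / (N - g)
    M = (N - g) * np.linalg.slogdet(S_pool)[1] - sum(
        (n - 1) * np.linalg.slogdet(S)[1] for n, S in zip(ns, covs))
    # Box's scaling factor for the chi-square approximation.
    c = ((2 * p**2 + 3 * p - 1) / (6 * (p + 1) * (g - 1))) * (
        np.sum(1.0 / (ns - 1)) - 1.0 / (N - g))
    chi2 = M * (1 - c)
    df = p * (p + 1) * (g - 1) / 2
    return M, chi2, df
```

With five groups and 15 items, the degrees of freedom work out to 15 × 16 × 4 / 2 = 480, which matches the numerator degrees of freedom of the F statistic reported in the Results.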
Very large standardized residuals were obtained in the confirmatory factor analyses; consequently, we conducted some exploratory analyses and then proceeded to a confirmatory analysis with six factors. The exploratory analyses included Humphreys and Montanelli's (1975) parallel analysis and principal axes factor analyses with varimax and DAPPFR (Tucker & Finkbeiner, 1981) rotations.

After obtaining reasonably satisfactory six-factor solutions and their associated chi-square goodness of fit for each group, we proceeded to assess the invariant factor pattern hypothesis HΛ using the LISREL program. In this analysis, the factor pattern matrices Λg were constrained to be equal across groups; the factor variance-covariance matrices Φg and uniquenesses Θg were free to vary across groups. To set a scale (see Jöreskog, 1971, p. 423), we fixed a large loading in each column of the factor loading matrix Λ. A chi-square goodness-of-fit statistic was obtained for this analysis, which is termed the SIFASP analysis.

The difference between the sum of the chi-squares from the individual analyses and the chi-square from the SIFASP analysis was computed. When the null hypothesis is true, the difference follows a chi-square distribution with degrees of freedom equal to the difference in degrees of freedom between the sum of individual runs and the SIFASP run. The ratio of the chi-square difference to the degrees-of-freedom difference provides a measure of fit. As a rough rule of thumb (Carmines & McIver, 1981), this ratio should be less than 2 in order to retain HΛ. Our final test was a test of the invariance of unique variable variances. Here, both Λg and Θg were constrained to be equal across groups.

Results

Box test. The Box (1949) test of the equivalence of the observed variable variance-covariance matrices yielded F(480, 627298) = 3.63, p < .0001.
It is clear that we should reject the equality of covariance matrices hypothesis and proceed to the SIFASP analysis.

Confirmatory analysis with five factors. Idaszak and Drasgow's (1987) pattern of fixed and free parameters in the factor pattern matrix was then used as the basis for a five-factor confirmatory factor analysis for each group. Both maximum likelihood and GLS were used to estimate parameters. The resulting five-factor solutions were unsatisfactory. For example, the chi-square statistics from the five-factor GLS analyses were quite large, and many standardized residuals were very large, with some even exceeding 10. The residuals tended to be so large for some samples that we were unable to find any interpretable pattern to explain the poor fit. There is a known error in the LISREL VI computer program that can occur when GLS estimation is used (Brown, 1986). Our results were unreasonable even when we used the procedure suggested by Brown to circumvent the error, and so we suspect that the error may not be fully corrected.

Table 1
Goodness-of-Fit Statistics From Confirmatory Maximum Likelihood Factor Analyses for Each Sample

                            Five-factor analysis          Six-factor analysis
Data set             n      χ²      p     GOF   AGOF      χ²      p     GOF   AGOF
Printers            134     92.48   .07   .92   .87        69.28  .30   .94   .88
Engineers            95    146.54   .00   .84   .73       114.18  .00   .87   .75
Part-time workers   269    192.79   .00   .91   .86        98.70  .01   .96   .92
Nurses              140    202.59   .00   .84   .74       108.67  .00   .91   .83
Dairy workers       224    185.87   .00   .90   .83        99.52  .03   .95   .90
Total               862    820.27 (370 df)                489.84 (320 df)

Note. The dfs for the individual five- and six-factor analyses are 74 and 64, respectively. GOF = goodness of fit; AGOF = adjusted goodness of fit.
Table 1 presents the chi-square goodness-of-fit statistics from the maximum likelihood confirmatory analyses for five factors. Because each chi-square had 74 df, it is evident that the chi-squares for the part-time workers, nurses, and dairy workers are excessively large. We then examined the modification indices computed by LISREL VI. These statistics give a rough measure of the decrease in the overall chi-square that would occur if a fixed parameter were set free. The modification indices were generally small, and, therefore, we could not obtain satisfactory five-factor solutions by freeing some factor loadings that had been fixed at zero (e.g., the three autonomy items were specified to have zero loadings on the Task Identity factor; the modification indices showed that allowing these loadings to be nonzero would not substantially improve the fit of the five-factor solution).

We then examined the standardized residuals from the maximum likelihood five-factor solutions. We noticed that the modeled correlations (obtained by substituting parameter estimates into Equation 2) consistently underestimated the observed correlations of the items that appeared on the first page of the JDS. This suggested that a sixth factor, specifically a format factor, was needed to satisfactorily model the data.

Exploratory analyses. As a check on our conclusion that six factors were needed for the JDS, we performed several exploratory analyses. The results of parallel analyses (Humphreys & Montanelli, 1975) to determine the number of common factors are shown in Table 2. This technique requires simultaneous factoring of the N person by n variable data matrix and an N × n matrix of random normal numbers. For both the real and random data, the n × n correlation matrix is computed, communalities are estimated by squared multiple correlations, and eigenvalues of the reduced correlation matrix are computed.
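The parallel-analysis procedure just described can be sketched as follows. This is a generic implementation of the Humphreys-Montanelli approach, not the authors' program: eigenvalues of the reduced correlation matrix (with squared multiple correlations on the diagonal) are compared for the real and the random data.

```python
import numpy as np

def reduced_eigenvalues(X):
    """Eigenvalues of the reduced correlation matrix: squared
    multiple correlations (SMCs) replace the unit diagonal."""
    R = np.corrcoef(X, rowvar=False)
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    R_reduced = R.copy()
    np.fill_diagonal(R_reduced, smc)
    return np.sort(np.linalg.eigvalsh(R_reduced))[::-1]

def parallel_analysis(X, seed=0):
    """Retain factors while the real-data eigenvalue exceeds the
    eigenvalue from an equally sized matrix of random normals."""
    rng = np.random.default_rng(seed)
    real = reduced_eigenvalues(X)
    rand = reduced_eigenvalues(rng.standard_normal(X.shape))
    keep = 0
    for r, q in zip(real, rand):
        if r > q:
            keep += 1
        else:
            break
    return keep
```

With a large sample generated from a clean two-factor model, `parallel_analysis` recovers at least the two strong factors; the sampling behavior at small N is exactly what Study 2 examines.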
A common factor is assumed to be real only if its associated eigenvalue is larger than the corresponding eigenvalue from the random data. The results summarized in Table 2 indicated that at least six factors were needed for the engineers, part-time workers, and nurses.

Table 2
Parallel Analyses for the Revised Job Diagnostic Survey Data

Sample/factor    Real    Random   Difference
Printers
  1              3.72     0.76      2.96
  2              1.17     0.59      1.12
  3              0.83     0.48      0.35
  4              0.72     0.39      0.33
  5              0.52     0.31      0.21
  6              0.15     0.24     -0.09
  7              0.07     0.14     -0.07
Engineers
  1              5.36     0.93      4.24
  2              1.77     0.73      1.04
  3              1.36     0.59      0.77
  4              0.83     0.48      0.35
  5              0.73     0.39      0.34
  6              0.37     0.29      0.07
  7              0.13     0.19     -0.06
Part-time workers
  1              5.41     0.50      4.91
  2              1.82     0.39      1.43
  3              1.06     0.31      0.75
  4              0.76     0.25      0.51
  5              0.66     0.20      0.46
  6              0.19     0.15      0.04
  7              0.05     0.09     -0.04
Nurses
  1              4.82     0.74      4.08
  2              1.22     0.57      0.64
  3              0.94     0.47      0.48
  4              0.51     0.38      0.13
  5              0.38     0.30      0.08
  6              0.27     0.23      0.04
  7              0.20     0.14      0.06
Dairy workers
  1              4.97     0.55      4.42
  2              0.77     0.43      0.35
  3              0.64     0.35      0.29
  4              0.52     0.28      0.24
  5              0.19     0.22     -0.03
  6              0.17     0.17     -0.00
  7              0.10     0.10     -0.00

In another set of exploratory analyses, we performed principal axes factor analyses with varimax and DAPPFR rotations for two through seven factors. Results of these analyses were similar to earlier factor analytic studies of the JDS in that they provided mixed evidence about dimensionality. After examining the factor patterns for each sample, we found that the five factors corresponding to the a priori JDS constructs were evident for the printers, engineers, and part-time workers. The most interpretable principal axes solution for the nurses had autonomy, task identity, and feedback items loading on one factor, skill variety and task significance items loading on a second factor, and the items with three anchors loading on a third factor (the format factor). Finally, the factor analytic results from the dairy-worker data suggested four factors, with skill variety and task significance items loading on one factor, autonomy and task identity items loading on a second factor, autonomy and skill variety items loading on a third factor, and a format factor as the fourth factor.

Table 3
Invariant Factor Pattern Obtained From the Simultaneous Factor Analysis of the Five Samples
[Estimated loadings, with standard errors in parentheses, of the 15 items (autonomy, 1-3; task identity, 4-6; skill variety, 7-9; task significance, 10-12; feedback, 13-15) on the six factors.]
Note. All factor loadings fixed at zero are omitted. Decimal points are omitted. Standard errors of parameter estimates are in parentheses. a Parameter was fixed during the SIFASP analysis.

Confirmatory analysis with six factors. Because of the results of the five-factor confirmatory analyses and the parallel analyses, a confirmatory factor analysis with six factors was obtained for each group separately using maximum likelihood estimation. Five of the factors were based on Idaszak and Drasgow's (1987) pattern of fixed and free parameters. The sixth factor was specified to be orthogonal to the five JDS factors, and only items from the first page of the JDS were allowed to have nonzero loadings. We felt that these separate factor analyses, in addition to the SIFASP analyses to follow, would provide clear evidence of overfactoring if fewer than six factors were needed for any group. So, for example, if the Skill Variety factor did not exist for the nurses, then the estimated factor loadings of the skill variety items on the Skill Variety factor should be zero in the confirmatory factor analysis. Furthermore, in the SIFASP analysis, the estimate of the variance on the Skill Variety factor for nurses should be near zero.

Maximum likelihood factor analyses of two of the samples, the nurses and the part-time workers, resulted in Heywood cases (negative estimates of unique variances).
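The fixed/free specification for the six-factor confirmatory model described above can be sketched as a Boolean pattern matrix. The content-factor layout (three items per factor, in order) follows the text; the particular set of first-page items below is hypothetical, since the item-to-page mapping comes from the JDS form itself and is not given in this excerpt.

```python
import numpy as np

ITEMS = 15
# Content factors: autonomy, task identity, skill variety,
# task significance, feedback; three items apiece, in order.
content = {f: range(3 * f, 3 * f + 3) for f in range(5)}
# Hypothetical indices of the items printed on the JDS's first
# page, which alone may load on the sixth (format) factor.
FIRST_PAGE = [0, 1, 3, 6, 9, 12, 13]  # illustrative only

free = np.zeros((ITEMS, 6), dtype=bool)
for f, items in content.items():
    for i in items:
        free[i, f] = True   # each item free on its own factor
for i in FIRST_PAGE:
    free[i, 5] = True       # first-page items free on format factor

# Everything not marked free is fixed at zero in the LISREL run;
# the format factor is further constrained to be orthogonal to
# the five content factors.
print(free.sum(axis=0))  # three free loadings per content factor
```

A pattern like this is what "fixed (at zero) or free" means operationally: estimation touches only the True cells, which is also why a factor needs at least three free loadings to be identified.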
For the nurses sample, the third skill variety item had a uniqueness estimate of -.11, and for the part-time workers, the first skill variety item had an estimated uniqueness of -.02. To eliminate this problem, we fixed these uniquenesses at .10. This value was chosen because it was more realistic than the value (zero) suggested by Jöreskog (1967). The Heywood cases were probably a result of the relatively low loading of the second skill variety item on the Skill Variety factor (.20 and .40 for the nurses and part-time workers). Thus, only two items had large loadings on the Skill Variety factor; at least three nonzero loadings are required for loadings and uniquenesses to be statistically identified (Lawley & Maxwell, 1971).

Table 1 presents several fit statistics obtained for the six-factor maximum likelihood factor analyses after a few minor modifications. Notice that the goodness-of-fit and adjusted goodness-of-fit statistics computed by the LISREL VI program show noticeable improvements for three of the samples. The chi-square statistics are also much improved for these samples. An overall goodness-of-fit statistic can be obtained by adding the individual chi-square statistics across the five samples and then looking at the ratio of the aggregate chi-square to the summed degrees of freedom. As shown in Table 1, the aggregate of the chi-square statistics is 489.84 with 320 df. The ratio of chi-square to degrees of freedom is 1.5, which indicates a good fit.

SIFASP analyses. In the next analysis, simultaneous factor analysis was used to test the invariant factor pattern hypothesis. A single Λ matrix was fitted to the data from the five groups, whereas the Φg and Θg matrices were free to vary across groups. The estimated Λ matrix is shown in Table 3, and the estimated variances and covariances of the factors are presented for each group in Table 4. The values in Table 4 show that the item responses from workers in the more heterogeneous groups (i.e.,
the part-time workers and printers) varied more than the responses made by members of the more homogeneous worker populations. In fact, the most homogeneous group, the dairy workers, were so consistent in their responses to the feedback items that the variance of the feedback factor was only .25.

Table 4
Factor Variance-Covariance Matrices for Each Sample Obtained From the SIFASP Analysis

                    Factor
Sample/factor     1     2     3     4     5     6
Printers
  1              83
  2              46   144
  3              57    34   148
  4              17   -11    15   129
  5              51    31    57    54   139
  6              00    00    00    00    00   113
Engineers
  1             116
  2              20    72
  3              32    18    66
  4              39    13    24    80
  5              31    23    29    43    73
  6              00    00    00    00    00    43
Part-time workers
  1             173
  2              51   137
  3              71    35   166
  4              72    35    76   166
  5              41    62    53    56   148
  6              00    00    00    00    00    70
Nurses
  1              62
  2              45   100
  3              32    28    65
  4              29    23    31    45
  5              43    59    36    34    64
  6              00    00    00    00    00   199
Dairy workers
  1              68
  2              45    52
  3              45    22    60
  4              30    27    26    46
  5              24    25    21    22    25
  6              00    00    00    00    00    86

Note. Decimal points are omitted.

The chi-square statistic for the SIFASP analysis was 671.87 with 420 df. The difference between this chi-square statistic and the aggregate of the individual chi-square statistics was 182.03 with 100 df, which results in a chi-square to degrees-of-freedom ratio of 1.82. This ratio is comfortably small. Therefore, we have evidence of invariant factor loadings across populations.

Because we failed to reject the invariant Λ hypothesis, we went on to test the invariance of the uniquenesses, as suggested by Jöreskog (1971). Table 5 presents the SIFASP results for the constrained and unconstrained uniquenesses. A chi-square statistic of 1314.05 with 480 df was obtained from the simultaneous factor analysis in which both Λ and Θ were constrained to be equal across groups. This is an increase in chi-square of 642.17 for 60 df, which is clearly unacceptable. Thus, we must conclude that the factor loadings of the JDS items on six factors are invariant across groups, but the unique variances vary considerably across groups. Hence, it appears that there is more error variance (i.e., more confusion about what the items ask the subjects) in some groups. Table 5 reveals that the most heterogeneous of the groups, the printers (workers in a printing company), the nurses (nurses and nurses' aides), and the part-time workers had the largest estimated unique variances. The printers and the nurses had the lowest levels of education, which may have also contributed to increased unique variances.

Study 2

Purpose

In Study 1 there was a discrepancy between the results of the SIFASP analyses and the principal axes factor analyses. Results from the SIFASP analysis indicate that six factors and an invariant factor loading matrix provide a satisfactory model for the data. On the other hand, the results of the principal axes factor analyses were consistent with earlier research (e.g., Dunham et al., 1977): Different numbers of factors were needed for the different groups. Furthermore, the substantive interpretations of the principal axes factors varied across samples.

We speculate that both our results with principal axes factor analysis and the results of earlier researchers with similar methods are due to at least three statistical artifacts. The first problem consists of inadequate sample sizes. One effect of an inadequate sample size may be to obscure the true number of factors; the number of factors suggested by sample correlation matrices may follow a sampling distribution. This hypothesis follows directly from Schmidt and Hunter's (1977) work on validity generalization. Prior to Schmidt and Hunter's work, applied psychologists had concluded that validity was situation specific because validity coefficients varied enormously across situations. Schmidt and Hunter demonstrated that a substantial proportion of this variability was due simply to sampling fluctuations.
Once corrected, validity coefficients showed surprising consistency across situations. The apparent situational specificity of the JDS may similarly result from sampling fluctuations. A second statistical artifact may have exacerbated the problem of determining the number of JDS factors. Specifically, there are only three items measuring each hypothesized dimension. For any fixed sample size, it is usually better to have fewer items so that fewer parameters must be estimated. However, three items is the minimum number required to statistically identify loadings of common factors (Lawley & Maxwell, 1971). When combined with the small sample sizes used in practice, this number of items per factor may be insufficient to unequivocally determine the dimensionality of the JDS. A third statistical artifact is the restriction of range of the factors that occurs when data from homogeneous samples are analyzed separately. In Study 1, we showed how such differences can be modeled with the SIFASP analysis; the variances in Table 4 reflect the degree of task heterogeneity for each sample. We decided to conduct a simulation study to test our hypotheses about the first two artifacts. Such a simulation study can determine whether small sample sizes and minimal identification cause the variability in dimensionality of the JDS that has been observed by researchers such as Dunham et al. (1977). In the simulation study, correlation matrices, based on varying sample sizes, were sampled from a population matrix constructed from the JDS factor loadings and were then factor analyzed using exploratory methods. The number of common factors was determined by parallel analysis and by orthogonal and oblique rotations.
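The sampling scheme just described can be sketched as follows. The loading and correlation values below are illustrative stand-ins (not the JDS values from Table 6), and drawing multivariate normal observations and correlating them is distributionally equivalent to the Wishart-based sampling used in the study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-in values (NOT the JDS loadings): 6 items, 2 correlated factors.
L = np.array([[.7, .0], [.6, .0], [.8, .0],
              [.0, .7], [.0, .6], [.0, .8]])
Phi = np.array([[1.0, 0.3],
                [0.3, 1.0]])  # factor correlation matrix

common = L @ Phi @ L.T                            # common-factor part of Sigma
Sigma = common + np.diag(1.0 - np.diag(common))   # add unique variances -> unit diagonal

def sample_correlation_matrix(Sigma, n, rng):
    """Draw n multivariate normal observations from the population
    matrix Sigma and return their sample correlation matrix."""
    X = rng.multivariate_normal(np.zeros(len(Sigma)), Sigma, size=n)
    return np.corrcoef(X, rowvar=False)

# Small samples wander farther from the population matrix than large ones.
R_75 = sample_correlation_matrix(Sigma, 75, rng)
R_900 = sample_correlation_matrix(Sigma, 900, rng)
```

Repeating the draw 100 times per sample size and factor analyzing each resulting matrix, as in the study, turns this sketch into the full Monte Carlo design.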
The results of the study should indicate what sample sizes are needed to estimate accurately the factor loadings of JDS items. It should also indicate whether adding items to the JDS would be beneficial from a statistical point of view.

Table 5
Estimated Unique Variances of Revised Job Diagnostic Survey Items From Constrained and Unconstrained Simultaneous Factor Analyses of Five Worker Populations

                            Unconstrained
          Constrained  Printer   Engineer  Part-time  Nurse    Dairy
Item no.  Est  SE      Est  SE   Est  SE   Est  SE    Est  SE  Est  SE
 1        41   03      83   13   18   04   32   05    63   09  26   04
 2        43   03      35   06   24   04   53   06    33   05  47   05
 3        42   03      54   09   06   03   38   06    37   06  54   06
 4        40   03      75   12   25   05   47   05    58   09  19   03
 5        30   03      45   11   17   04   19   04    34   06  24   03
 6        33   02      70   11   15   04   32   04    29   05  30   04
 7        34   03      33   11   20   06   11   07    36   07  30   05
 8        61   03      99   13   26   04   66   06    65   08  45   05
 9        29   03      65   11   16   04   54   07    19   04  25   04
10        24   03      78   14   13   04   38   06    10   05  12   03
11        60   03      50   08   37   06   66   07    21   03  80   08
12        39   03      56   10   29   06   50   07    22   04  22   03
13        40   03      65   12   28   06   27   05    38   07  38   05
14        40   03      57   11   15   04   32   05    50   07  43   05
15        41   03      95   14   28   05   41   05    48   07  34   04
Note. Decimal points are omitted. Est = estimated.

Method

Design and procedure. We initially constructed two population correlation matrices designed to reflect the 15 JDS items and a hypothetical modification of the JDS that contained twice as many items per factor. The factor loadings and correlations from Idaszak and Drasgow (1987) were used to generate the matrices and are shown in Table 6. The sixth (format) factor, obtained in Study 1, was not included in the study because it is a measurement artifact and not theoretically meaningful. The first population correlation matrix was constructed by substituting the values given in Table 6 into Equation 2. This matrix was used to simulate the case of 15 items. A second population correlation matrix for the case of 30 items was constructed with a Λ that contained two copies of each row of the Λ in Table 6.
This simulates a revision of the JDS with six items per factor, where each new item has the same unique variance as a corresponding old item. The unique factors for the new items were simulated to be mutually uncorrelated and uncorrelated with all unique factors of the old items. To simplify our analysis, we assumed that the observed variables followed a multivariate normal distribution. Then the Wishart distribution (Anderson, 1958; Harman, 1976) was used to obtain sample covariance matrices, which were rescaled to be correlation matrices. One hundred sample matrices were generated from each population matrix for each of three sample sizes: N = 75, 150, and 900. Common factors were extracted from each sample matrix by the principal factors method with squared multiple correlations used to estimate communalities. Parallel analysis was used to determine the number of factors for each sample correlation matrix. Two decision rules were used. First, the observed number of factors was defined as the number of eigenvalues of the real data that were greater than or equal to the corresponding eigenvalues of the random data. We were somewhat dissatisfied with this approach because it assumes that a factor is a true common factor when its eigenvalue is exactly equal to a corresponding eigenvalue of the random data; this criterion seemed too lenient to us. Consequently, we decided to examine a second criterion, which concludes that a factor is a true common factor only when its associated eigenvalue is slightly larger (.05) than the eigenvalue from the random data. Rather than actually sampling random numbers and computing their eigenvalues (which would then be random quantities with their own sampling distribution), we used Tucker's (1977) curve-fitting procedure to approximate the expected values of the eigenvalues of the random data.
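The two parallel-analysis decision rules can be sketched as below. This sketch estimates the random-data eigenvalues by direct simulation rather than by Tucker's (1977) curve-fitting approximation, and the two-factor example data at the end are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

def reduced_eigenvalues(R):
    """Eigenvalues of R with squared multiple correlations (SMCs)
    substituted on the diagonal, as in principal factors extraction."""
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    R_reduced = R.copy()
    np.fill_diagonal(R_reduced, smc)
    return np.sort(np.linalg.eigvalsh(R_reduced))[::-1]

def parallel_analysis(X, threshold=0.0, n_random=50):
    """Count leading factors whose real-data eigenvalue exceeds the mean
    random-data eigenvalue by at least `threshold` (.00 or .05 above)."""
    n, p = X.shape
    real = reduced_eigenvalues(np.corrcoef(X, rowvar=False))
    random_eigs = np.mean(
        [reduced_eigenvalues(np.corrcoef(rng.standard_normal((n, p)), rowvar=False))
         for _ in range(n_random)], axis=0)
    k = 0
    while k < p and real[k] >= random_eigs[k] + threshold:
        k += 1
    return k

# Hypothetical check: data generated from two strong, uncorrelated factors.
L = np.array([[.8, .0]] * 3 + [[.0, .8]] * 3)
Sigma = L @ L.T
np.fill_diagonal(Sigma, 1.0)
X = rng.multivariate_normal(np.zeros(6), Sigma, size=2000)
```

`parallel_analysis(X)` applies the lenient .00 rule, and `parallel_analysis(X, threshold=0.05)` the stricter rule examined above.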
As a check on both the validity of the parallel analysis criteria and the interpretability of the solutions, 20 of the sample factor solutions from each condition were rotated to simple structure using both the orthogonal varimax rotation (Kaiser, 1958) and the oblique DAPPFR rotation (Tucker & Finkbeiner, 1981).

Table 6
Factor Loadings and Correlations Used to Generate the Population Correlation Matrices
[The factor loading entries are not legible in the scanned original; the legible unique variances and factor correlations are reproduced below.]

Item                     Θ
Autonomy
  1                     .48
  2                     .81
  3                     .48
Task identity
  4                     .45
  5                     .55
  6                     .71
Skill variety
  7                     .62
  8                     .56
  9                     .51
Task significance
  10                    .59
  11                    .56
  12                    .84
Feedback
  13                    .41
  14                    .47
  15                    .83

Factor correlation matrix
        1      2      3      4      5
1      --
2      .22    --
3      .49   -.19    --
4      .55    .50   -.31    --
5      .49    .41    .53   -.50    --
Note. All factor loadings equal to zero are omitted in the original table.

Results and Discussion

The numbers of common factors, as determined by both of the parallel analysis criteria, are summarized in Table 7. The upper half of the table contains the results from factoring the 15-item correlation matrices, and the lower half contains the results for the 30-item matrices. The most important result contained in Table 7 is that there was considerable variability in the number of factors identified by parallel analysis when N was 75 or 150. This leads directly to the question of whether the parallel analysis criteria accurately reflect the apparent number of factors in a sample correlation matrix. Table 8 contains two factor loading matrices from the 15-item, N = 75 condition after rotation by varimax. In both cases the oblique DAPPFR solution did not differ appreciably from the orthogonal varimax solution. The parallel analysis criteria indicated four factors for the solution shown in the top half of the table and five factors for the solution shown in the bottom half. In both cases, it is evident that parallel analysis accurately reflects the apparent number of factors for the two solutions. Moreover, it is clear that a factor has disappeared from the solution shown in the top half of Table 8; this result is due entirely to sampling fluctuations.

A second finding contained in Table 7 concerns the requirement that an eigenvalue from the real data must be at least .05 larger than a random eigenvalue. This requirement did not work as well as the alternative criterion, which simply required the real eigenvalue to equal or exceed the corresponding random eigenvalue. The .05 difference threshold resulted in consistent underfactoring, whereas the .00 requirement was usually closer to the true number of factors. For this reason, the results to be described are based on the .00 threshold. Table 7 shows that the apparent dimensionality of the solutions ranged from 2 to 7 factors in the 15-item case. When the sample size was 75, the modal number of common factors was four. When the sample size was increased to 150, the modal category was the correct number, five, but there were very large sampling fluctuations. Even with a sample size as large as 900, a small number of cases resulted in underfactoring. When the number of items was doubled, the sampling fluctuations diminished substantially. There was still a strong tendency to underfactor when N = 75. But with N = 150, the true number of common factors was found in more than 75% of the samples. With N = 900, the true number of factors was found in all 100 samples. The fluctuations in the number of factors found in these simulations were very similar to those observed in empirical studies of the JDS (see, e.g., Table 1 of Dunham et al., 1977). The authors of these studies attributed their findings to real variation in the factor loadings of the JDS in different subpopulations.
Because the true factor loadings were invariant in the simulations conducted here, it is clear that substantial fluctuations should be expected to occur by chance alone and, therefore, the substantive conclusions of earlier researchers are not justified. The source of the problem is clearly attributable to sample size and the number of items in the JDS. The samples found in most studies of the JDS are very similar to our two smallest sample sizes. Table 7 shows that much larger samples are needed to recover the underlying structure reliably. The small number of items in the JDS exacerbates the problem. As Table 7 demonstrates, doubling the number of items in the scale reduces the sample size requirements, even though more parameters must be estimated.

Table 7
Frequencies of the Number of Common Factors Obtained by Parallel Analysis in Study 2

                        Number of factors
N      Criterion     2     3     4     5     6     7
15-item matrix
75     .00           1    21    48    23     3     0
75     .05           2    38    44    13     4     3
150    .00           0     1    30    59     0     4
150    .05           0     7    60    30     6     3
900    .00           0     0     1    99     0     0
900    .05           0     0     3    97     0     0
30-item matrix
75     .00           0     6    44    45     5     0
75     .05           0    11    54    33     2     0
150    .00           0     0    12    82     6     0
150    .05           0     0    23    77     0     0
900    .00           0     0     0   100     0     0
900    .05           0     0     0   100     0     0
Note. Criterion refers to the size of the difference in eigenvalues between real and random data that was used as a threshold for determining the number of factors.

Table 8
Five-Factor Varimax Solutions Obtained When Parallel Analysis Indicated Four and Five Factors, Respectively
[The factor loading values are not legible in the scanned original.]
Note. Decimal points and all factor loadings less than .20 in absolute value are omitted.

Discussion

The results of Study 1 provide support for the measurement equivalence of the revised JDS across worker populations: We found the revised JDS to have invariant factor loadings on the five job characteristics proposed by Hackman and Oldham (1975, 1980). However, we also found a relatively weak sixth factor, which seems to be a result of the two different formats used on the instrument. In addition, we found that the JDS items had smaller uniquenesses in some groups, which implies that the items measured their corresponding factors more accurately in these groups. Nonetheless, the format factor was considerably weaker than the five a priori factors, and the invariant Λ matrix allows us to make valid comparisons of JDS scale scores across groups.

Our conclusions are based on the results obtained from the SIFASP analysis with an invariant Λ matrix (shown in Table 3), but with free Φ and Θ matrices across groups. We prefer this representation of the data for several reasons. First, when its assumptions are satisfied, no other estimation method can be more accurate in large samples than Jöreskog's (1971) estimation method (maximum likelihood estimates are asymptotically efficient). Second, we were able to use the data from all 862 subjects in a single analysis to estimate Λ (because Λ was found to be invariant). Finally, the chi-square goodness-of-fit statistic and matrices of standardized residuals show that we obtained a reasonably good fit to the data.

When data from each sample were analyzed separately, principal axes factor analyses suggested three-, four-, and five-factor solutions. Our failure to extract consistently the hypothesized five factors is typical of many analyses of the JDS that relied upon similar methods. The results of Studies 1 and 2 indicate that the problem is not that the five job-characteristic dimensions do not underlie the data.
Rather, for some job classes (e.g., dairy workers) the factor variances are small and, with small sample sizes and only three items per scale, exploratory procedures are not powerful enough to consistently identify these dimensions. The SIFASP analysis, on the other hand, is a more powerful procedure that clearly identified Hackman and Oldham's (1975) five job characteristics. Three statistical artifacts cause factor analyses of data from homogeneous groups to be inconsistent. First, in Study 2 we found that samples of close to 1,000 subjects were needed to obtain truly clear-cut results when factor analyzing instruments like the JDS. A second difficulty in factor analyzing JDS data occurs because the JDS has only three items per factor. At least three observed variables must have nonzero factor loadings for the factor loadings and uniquenesses to be estimable (Lawley & Maxwell, 1971) and, in practice, loadings greater than .40 seem to be needed. Evidently, minor sampling fluctuations that are common in samples of a few hundred subjects cause serious problems when only three items assess a factor. In Study 2, we obtained much more stable results from factor analyses of samples of a few hundred in the simulation of a revised JDS with six items per scale despite a lower ratio of sample size to number of estimated parameters. Our samples in Study 1, which satisfied or almost satisfied the 10-subjects-per-observed-variable rule, were not large enough. A rule such as 50-subjects-per-observed-variable appears to be needed for scales with only three items per factor. The small number of items per dimension may also increase the importance of method variance and coincidental associations. This can lead to the formation of method factors such as the reverse-scoring factor found by Idaszak and Drasgow (1987) and the format factor found in Study 1. 
Therefore, procedures such as reverse-scoring items that are commonly used to limit response sets may lead to other problems when used with questionnaires that have only a few items per factor. Writing a larger number of items that assess the full breadth of the psychological construct would seem to be the best way to circumvent this problem. The third statistical artifact results when factor analyzing data from workers within a homogeneous job class. Sampling all workers from a single job class can greatly restrict the range of a task characteristic and, therefore, reduce the correlations of items assessing this task characteristic to the point that it may be very difficult to extract a factor corresponding to the task characteristic. This problem can be eliminated by sampling workers across representative classes of jobs. Then the range of a task characteristic would not be restricted because of sampling, and a factor corresponding to the task characteristic would be evident in a factor analysis. Alternatively, the SIFASP method can be used to analyze data from several groups simultaneously. The factor variance-covariance matrices (Φg) provide quantitative measures of the range restrictions within each job class. Moreover, by analyzing the data from several groups simultaneously, the SIFASP analysis provides a statistically powerful means for estimating Λ. The length of the questionnaire and the physical location of the JDS items in the survey may have also contributed to the problems we encountered in the separate principal axes factor analyses of each group's data. In the present study, the revised JDS was placed in a very short questionnaire administered to the engineer, printer, and part-time samples. Principal axes factor analysis and DAPPFR rotation yielded clear five-factor solutions.
The JDS was placed near the beginning in a longer questionnaire for the dairy workers; the DAPPFR rotation of their data did not clearly show the five a priori JDS factors. Finally, the JDS was placed near the end of an instrument that required nearly 45 min for the nurses to complete. The DAPPFR rotation gave very poor results for this sample. Note, however, that we obtained an acceptable chi-square when we fit the a priori JDS factor structure and a sixth measurement factor to the nurses' data. The importance of the format factor for the nurses is underscored in Table 4, where its variance is shown to be twice as large as the variance of any of the other factors.

In summary, the difficulties that researchers have encountered in factor analyzing the JDS emphasize several methodological points. First, data from workers sampled across a representative set of jobs should be analyzed simultaneously, preferably by the SIFASP method. Second, the instrument should have at least four to six items per scale. This would increase the likelihood that a factor analysis will accurately reflect the true underlying structure of the item pool. Sampling fluctuations seem to play an unacceptably important role in samples of several hundred individuals when only three items are used to assess each scale. Finally, these methodological problems seem to be the primary cause of the inconsistent results obtained in the large number of factor analyses of the JDS. Sampling workers from multiple job classes and increasing the number of items per scale might reduce the sample size requirements and thereby yield more accurate parameter estimates.

References

Anderson, T. W. (1958). An introduction to multivariate statistical analysis. New York: Wiley.
Box, G. E. P. (1949). A general distribution theory for a class of likelihood criteria. Biometrika, 36, 317-346.
Brown, R. L. (1986). A cautionary note on the use of LISREL's automatic start values in confirmatory factor analysis studies. Applied Psychological Measurement, 10, 239-245.
Carmines, E. G., & McIver, J. P. (1981). Analyzing models with unobserved variables: Analysis of covariance structures. In G. W. Bohrnstedt & E. F. Borgatta (Eds.), Social measurement: Current issues (pp. 65-115). Beverly Hills, CA: Sage.
Dunham, R. B. (1976). The measurement of dimensionality of job characteristics. Journal of Applied Psychology, 61, 404-409.
Dunham, R. B., Aldag, R. J., & Brief, A. P. (1977). Dimensionality of task design as measured by the Job Diagnostic Survey. Academy of Management Journal, 20, 209-223.
Fleishman, E. A., & Quaintance, M. K. (1984). Taxonomies of human performance. Orlando, FL: Academic Press.
Fried, Y., & Ferris, G. R. (1986). The dimensionality of job characteristics: Some neglected issues. Journal of Applied Psychology, 71, 419-426.
Green, S. B., Armenakis, A. A., Marbert, L. D., & Bedeian, A. G. (1979). An evaluation of the response format and scale structure of the Job Diagnostic Survey. Human Relations, 32, 181-188.
Hackman, J. R., & Oldham, G. R. (1975). Development of the Job Diagnostic Survey. Journal of Applied Psychology, 60, 159-170.
Hackman, J. R., & Oldham, G. R. (1980). Work redesign. Reading, MA: Addison-Wesley.
Harman, H. H. (1976). Modern factor analysis. Chicago: University of Chicago Press.
Harvey, R., Billings, R., & Nilan, K. (1985). Confirmatory factor analysis of the Job Diagnostic Survey: Good news and bad news. Journal of Applied Psychology, 70, 461-468.
Humphreys, L. G., & Montanelli, R. G., Jr. (1975). An investigation of the parallel analysis criterion for determining the number of common factors. Multivariate Behavioral Research, 10, 193-205.
Idaszak, J. R., & Drasgow, F. (1987). A revision of the Job Diagnostic Survey: Elimination of a measurement artifact. Journal of Applied Psychology, 72, 69-74.
Idaszak, J. R., & Drasgow, F. (in press). Simultaneous factor analysis of the revised Job Diagnostic Survey: Abstract and data. Social and Behavioral Science Documents.
Jöreskog, K. G. (1967). Some contributions to maximum likelihood factor analysis. Psychometrika, 32, 443-482.
Jöreskog, K. G. (1971). Simultaneous factor analysis in several populations. Psychometrika, 36, 409-426.
Jöreskog, K. G., & Goldberger, A. S. (1972). Factor analysis by generalized least squares. Psychometrika, 37, 243-260.
Jöreskog, K. G., & Sörbom, D. (1984). LISREL VI: Analysis of linear structural relations by the method of maximum likelihood. Uppsala, Sweden: University of Uppsala.
Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23, 187-200.
Kulik, C. T., Oldham, G. R., & Langner, P. H. (1988). The measurement of job characteristics: A comparison of the original and the revised Job Diagnostic Survey. Journal of Applied Psychology, 73, 462-466.
Lawley, D., & Maxwell, A. E. (1971). Factor analysis as a statistical method. New York: Elsevier.
Meredith, W. (1964). Notes on factorial invariance. Psychometrika, 29, 177-185.
Pokorney, J. J., Gilmore, D. C., & Beehr, T. A. (1980). Job Diagnostic Survey dimensions: Moderating effect of growth needs and correspondence with dimensions of Job Rating Form. Organizational Behavior and Human Performance, 26, 222-237.
Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62, 529-540.
Sims, H. P., Jr., Szilagyi, A. D., & Keller, R. T. (1976). The measurement of job characteristics. Academy of Management Journal, 19, 195-212.
Tucker, L. R. (1977). Functional representation of Montanelli-Humphreys weights for judging number of factors by the parallel analysis technique. Unpublished manuscript.
Tucker, L. R., & Finkbeiner, C. T. (1981). Transformation of factors by artificial personal probability functions (Research Rep. No. RR-8158). Princeton, NJ: Educational Testing Service.

Received April 2, 1987
Revision received March 8, 1988
Accepted March 15, 1988