Mokken, R.J. (1970). A theory and procedures of scale analysis: with applications in political research. The Hague: Mouton. Andrich, D. and G. Luo (1993). A hyperbolic cosine latent trait model for unfolding dichotomous single-stimulus responses. Applied Psychological Measurement, 17, 253-276. Cliff, N., Collins, L.M., Zatkin, J., Gallipeau, D. and McCormick, D.J. (1988). An ordinal scaling method for questionnaires and other ordinal data. Applied Psychological Measurement, 12, 83-97. Davison, M.L. (1977). On a metric, unidimensional unfolding model for attitudinal and developmental data. Psychometrika, 42, 523-548. Hoijtink, H. (1990). A latent trait model for dichotomous choice items. Psychometrika, 55, 641-656. Mokken, R.J. (1971). A theory and procedure of scale analysis. New York/Berlin: De Gruyter (Mouton). Molenaar, I.W. (1982). Mokken scaling revisited. Kwantitatieve Methoden, 3, 145-164. Post, W.J. (1992). Nonparametric unfolding models: a latent structure approach. Leiden: DSWO Press. Post, W.J. and Snijders, T.A.B. (1993). Nonparametric unfolding models for dichotomous data. Methodika, 7, 130-156. Ross, J. and N. Cliff (1964). A generalization of the interpoint distance model. Psychometrika, 29, 167-176. Sijtsma, K., P. Debets and I.W. Molenaar (1990). Mokken scale analysis for polychotomous items: theory, a computer program, and an empirical application. Quality and Quantity, 24, 173-188. Van Schuur, W.H. (1989). Unfolding the German political parties: a description and application of multiple unidimensional unfolding. In: G. de Soete, H. Feger and K.C. Klauer (eds.). New Developments in Psychological Choice Modeling. New York: Elsevier, 259-290. Van Schuur, W.H. (1993). Nonparametric unidimensional unfolding for multicategory data. Political Analysis, 4, 41-74. Van Schuur, W.H. and W. Post (1991). User's manual MUDFOLD, a program for multiple unidimensional unfolding. Groningen, i.e.c. ProGamma, Grote Rozenstraat 15, 9712 GT Groningen, The Netherlands. 
Van Schuur, W.H. and Kiers, H.A.L. (1994). Why factor analysis is often the wrong model for analyzing bipolar concepts and what model to use instead. Applied Psychological Measurement, 18, 97-110. Van Schuur, W.H. and Kruijtbosch, M. (1995). Measuring subjective well-being: unfolding the Bradburn affect balance scale. Social Indicators Research, 36, 49-74. Verweij Anton C., Sijtsma Klaas, Koops Willem, A Mokken Scale for Transitive Reasoning Suited for Longitudinal Research, International Journal of Behavioral Development, 1996, 19 (1), 219–238. Agresti, A. (1990). Categorical data analysis. New York: Wiley. Andersen, E.B. (1977). Sufficient statistics and latent trait models. Psychometrika, 42, 69-81. Andersen, E.B. (1997). The rating scale model. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 67-84). New York: Springer. Andrich, D. (1978). A rating scale formulation for ordered response categories. Psychometrika, 43, 561-573. Andrich, D. (1995). Distinctive and incompatible properties of two common classes of IRT models for graded responses. Applied Psychological Measurement, 19, 101-119. Chang, H., & Mazzeo, J. (1994). The unique correspondence of the item category response functions in polytomously scored item response models. Psychometrika, 59, 391-404. Akkermans, L.M.W. (1998). Studies on statistical models for polytomously scored test items. Unpublished doctoral dissertation, University of Twente, Enschede, The Netherlands. Glas, C.A.W. (1989). Contributions to estimating and testing Rasch models. Unpublished doctoral dissertation, University of Twente, Enschede, The Netherlands. Grayson, D. A. (1988). Two-group classification in latent trait theory: Scores with monotone likelihood ratio. Psychometrika, 53, 383-392. Gumpel, T., Wilson, M., & Shalev, R. (1998). An item response theory analysis of the Conner's Teachers Rating-Scale. Journal of Learning Disabilities, 31, 525-532. Hemker, B.T. (1996). 
Unidimensional IRT models for polytomous items, with results for Mokken scale analysis. Unpublished doctoral dissertation, Utrecht University, The Netherlands. Hemker, B.T. (2001). Reversibility revisited and other comparisons of three types of polytomous IRT models. In A. Boomsma, M.A.J. van Duijn, & T.A.B. Snijders (Eds.), Essays on item response theory (pp. 277-296). New York: Springer. Hemker, B.T., Sijtsma, K., Molenaar, I.W., & Junker, B.W. (1996). Polytomous IRT models and monotone likelihood ratio of the total score. Psychometrika, 61, 679-693. Hemker, B.T., Sijtsma, K., Molenaar, I.W., & Junker, B.W. (1997). Stochastic ordering using the latent trait and the sum score in polytomous IRT models. Psychometrika, 62, 331-347. Huynh, H. (1994). A new proof for monotone likelihood ratio for the sum of independent Bernoulli random variables. Psychometrika, 59, 77-79. Junker, B. W. (1991). Essential independence and likelihood-based ability estimation for polytomous items. Psychometrika, 56, 255-278. Junker, B. W. (1993). Conditional association, essential independence and monotone unidimensional item response models. The Annals of Statistics, 21, 1359-1378. Kelderman, H., & Rijkes, C.P.M. (1994). Loglinear multidimensional IRT models for polytomously scored items. Psychometrika, 59, 437-450. Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174. Masters, G. N. (1988). Measurement models for ordered response categories. In R. Langeheine & J. Rost (Eds.), Latent trait and latent class models (pp. 11-29). New York: Plenum Press. Masters, G.N., & Wright, B.D. (1997). The partial credit model. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 101-121). New York: Springer. Maurer, T.J., Raju, N.S., & Collins, W.C. (1998). Peer and subordinate performance appraisal measurement equivalence. Journal of Applied Psychology, 5, 693-702. Mellenbergh, G. J. (1995). 
Conceptual notes on models for discrete polytomous item responses. Applied Psychological Measurement, 19, 91-100. Mokken, R.J., & Lewis, C. (1982). A nonparametric approach to the analysis of dichotomous item responses. Applied Psychological Measurement, 6, 417-430. Molenaar, I.W. (1983). Item steps (Heymans Bulletin 83-630-EX). Groningen, The Netherlands: University of Groningen, Department of Statistics and Measurement Theory. Muraki, E. (1990). Fitting a polytomous item response model to Likert-type data. Applied Psychological Measurement, 14, 59-71. Muraki, E. (1992). A generalized partial credit model: application of an EM algorithm. Applied Psychological Measurement, 16, 159-176. Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research. Rosenbaum, P. R. (1985). Comparing distributions of item responses for two groups. British Journal of Mathematical and Statistical Psychology, 38, 206-215. Samejima, F. (1969). Estimation of latent trait ability using a response pattern of graded scores. Psychometrika, Monograph Supplement No. 17. Samejima, F. (1972). A general model for free-response data. Psychometrika, Monograph Supplement No. 18. Samejima, F. (1995). Acceleration model in the heterogeneous case of the general graded response model. Psychometrika, 60, 549-572. Samejima, F. (1996, April). Polychotomous responses and the test score. Paper presented at National Council on Measurement in Education Meeting, New York. Samejima, F. (1997, March). An expansion of the logistic positive exponent family of models to a family of graded response models. Paper presented at National Council on Measurement in Education Meeting, Chicago. Scheiblechner, H. (1995). Isotonic ordinal probabilistic models (ISOP). Psychometrika, 60, 281-304. Sijtsma, K., & Hemker, B.T. (1998). Nonparametric polytomous IRT models for invariant item ordering, with results for parametric models. 
Psychometrika, 63, 183-200. Sijtsma, K., & Hemker, B.T. (2000). A taxonomy for ordering persons and items using simple sum scores. Journal of Educational and Behavioral Statistics, 25, 391-415. Sijtsma, K., & Junker, B.W. (1996). A survey of theory and methods of invariant item ordering. British Journal of Mathematical and Statistical Psychology, 49, 79-105. Sijtsma, K., & Van der Ark, L.A. (2001). Progress in NIRT analysis of polytomous item scores: Dilemmas and practical solutions. In A. Boomsma, M.A.J. van Duijn, & T.A.B. Snijders (Eds.), Essays on item response theory (pp. 297 - 318). New York: Springer. Sijtsma, K., & Verweij, A.C. (1999). Knowledge of solution strategies and IRT modeling of items for transitive reasoning. Applied Psychological Measurement, 23, 55-68. Thissen, D., & Steinberg, L. (1986). A taxonomy of item response models. Psychometrika, 51, 567-577. Tutz, G. (1990). Sequential item response models with an ordered response. British Journal of Mathematical and Statistical Psychology, 43, 39-55. Tutz, G. (1997). Sequential models for ordered responses. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 139 - 152). New York: Springer. Van der Ark, L.A. (2000). Practical consequences of stochastic ordering of the latent trait under various polytomous IRT models. Manuscript submitted for publication. Van Engelenburg, G. (1997). On psychometric models for polytomous items with ordered categories within the framework of item response theory. Unpublished doctoral dissertation, University of Amsterdam. Verhelst, N. D., Glas, C. A. W., & De Vries, H. H. (1997). A steps model to analyze partial credit. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 123 - 138). New York: Springer. Verhelst, N. D., Glas, C. A. W., & Verstralen, H.H.F.M. (1995). OPLM: One Parameter Logistic Model. Computer program and manual. Arnhem, The Netherlands: CITO. Hemker B.T., Ark L.A. 
van der, Sijtsma K., On Measurement Properties of Continuation Ratio Models, Measurement and Research Department Reports 20006, Citogroep Arnhem, December 2000. Jacoby William G., “Issue Framing and Public Opinion on Government Spending” (American Journal of Political Science 44 (4): 750-767). Sijtsma K. and Molenaar I.W. (1987) Reliability of test scores in nonparametric item response theory. Psychometrika, 52, 79-97. Rasch G. (1960) Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research. Niemoller B. and van Schuur W.H. (1983) Stochastic Models for unidimensional scaling: Mokken and Rasch. In D. McKay, N. Schofield and P. Whiteley (Eds.) Data Analysis and the Social Sciences. London: Francis Pinter. Mokken R.J. and Lewis C. (1982) A nonparametric approach to the analysis of dichotomous item responses. Applied Psychological Measurement, 6, 417-430. Meijer R.R., Sijtsma K. and Molenaar I.W. (1996) Reliability estimation for single dichotomous items based on Mokken's IRT model. Applied Psychological Measurement, 20, 213-233. De Jong A., and Molenaar I.W. (1987) An application of Mokken's model for stochastic, cumulative scaling in psychiatric research. Journal of Psychiatry Research, 12, 137-149. Fischer G. H. (1974) The linear logistic test model as an instrument in educational research. Acta Psychologica, 37, 359-374. Fischer G. H. (1989) An IRT-based model for dichotomous longitudinal data. Psychometrika, 54, 599-624. Kelderman (1984) Loglinear Rasch Model tests. Psychometrika, 49, 223-245. Mokken, R.J. (1997). Nonparametric models for dichotomous responses. In: Van der Linden, W.J. & Hambleton, R.K. Handbook of modern item response theory. New York: Springer-Verlag. Sijtsma, K., P. Debets, and I.W. Molenaar (1990). Mokken scale analysis for polytomous items: theory, a computer program, and an empirical application. Quality and Quantity, 24, 173-188. Saris W. E., Gallhofer I. N. 
[eds], Sociometric Research, Vol. I: Data Collection and Scaling (London: Macmillan, 1988) Van Schuur W.H., Structure in Political Beliefs. A new unfolding model with application to European party activists (Amsterdam: CT Press, 1984); Van Schuur W.H., ‘From Mokken to MUDFOLD and back’, in M. Fennema, C. van der Eijk and H. Schijf eds, In search of structure. Essays in social science and methodology (Amsterdam: Het Spinhuis, 1993), pp. 45-62; Clogg CC. Latent class models. In Arminger G, Clogg CC, Sobel ME (eds), Handbook of statistical modeling for the social and behavioral sciences (Ch. 6; pp. 311-359). New York: Plenum, 1995. Clogg CC. Unrestricted and restricted maximum likelihood latent structure analysis: a manual for users. Working paper 1977-09, Pennsylvania State University, Population Issues Research Center, 1977. Garrett ES, Zeger SL. Latent class model diagnosis. Biometrics, in press. Goodman LA. Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika, 1974, 61, 215-231. Haberman SJ. Qualitative data analysis: Vol. 2. New developments. New York: Academic Press, 1979. Hagenaars JA. Categorical longitudinal data. Newbury Park, California: Sage, 1990. Heinen T. Latent class and discrete latent trait models: Similarities and differences. Thousand Oaks, California: Sage, 1996. Langeheine R, Pannekoek J, van de Pol, F. Bootstrapping goodness-of-fit measures in categorical data analysis. Sociological Methods and Research, 1996, 24, 492-516. Lazarsfeld PF, Henry NW. Latent structure analysis. Boston: Houghton Mifflin, 1968. Lindsay B, Clogg CC, Grego J. Semiparametric estimation in the Rasch model and related exponential response models, including a simple latent class model for item analysis. Journal of the American Statistical Association, 1991, 86, 96-107. Uebersax JS. Statistical modeling of expert ratings on medical treatment appropriateness. Journal of the American Statistical Association, 1993a, 88, 421-427. 
Uebersax JS. LLCA: Located latent class analysis. Computer program documentation, 1993b. Uebersax JS. Analysis of student problem behaviors with latent trait, latent class, and related probit mixture models. In Rost J, Langeheine R (eds), Applications of latent trait and latent class models in the social sciences (pp. 188-195). New York: Waxmann, 1997; van de Pol F, Langeheine R, de Jong W. PANMARK user manual, version 3. Netherlands Central Bureau of Statistics, Voorburg, The Netherlands, 1998. van der Heijden P, 't Hart H, Dessens J. A parametric bootstrap procedure to perform statistical tests in a LCA of anti-social behaviour. In Rost J, Langeheine R (eds), Applications of latent trait and latent class models in the social sciences (pp. 196-208). New York: Waxmann, 1997. Vermunt JK. LEM: A general program for the analysis of categorical data. Tilburg University, Department of Methodology, 1997. Vermunt JK, Magidson J. Latent GOLD User's Guide. Belmont, Mass.: Statistical Innovations, Inc., 2000. von Davier M. Bootstrapping goodness-of-fit statistics for sparse categorical data: results of a Monte Carlo study. Methods of Psychological Research, 1997, 2(2). ( http://www.pabst-publishers.de/mpr/issue3/art5/article.html ) Holland Rosenbaum (1986) Junker (1993). Mokken (1971), Rosenbaum (1987a, 1987b), Sijtsma Meijer (1992), Goetghebeur E, Liinev J, Boelaert M, Van der Stuyft P., Diagnostic test analyses in search of their gold standard: latent class analyses with random effects. Jacoby William G., "The Structure of Ideological Thinking in the American Electorate", American Journal of Political Science, Vol. 39, No. 2, May 1995, Pp. 314-35 New Horizons in Testing: Latent Trait Test Theory and Computerized Adaptive Testing Edited by David J. Weiss. 345 pages. 1983. Item Response Theory Books Applications of Item Response Theory to Practical Testing Problems Frederick M. Lord. 274 pages. 1980. Applying The Rasch Model Trevor G. Bond and Christine M. Fox 255 pages. 
2001 Fundamentals of Item Response Theory Ronald K. Hambleton, H. Swaminathan, and H. Jane Rogers. 184 pages. 1991. Handbook of Modern Item Response Theory Edited by Wim J. van der Linden and Ronald K. Hambleton. 510 pages. 1997. Item Response Theory for Psychologists Susan E. Embretson and Steven P. Reise. 376 pages. 2000. Item Response Theory: Parameter Estimation Techniques Frank Baker. 440 pages. 1992. Item Response Theory: Principles and Applications Ronald K. Hambleton and Hariharan Swaminathan. 332 pages. 1984. Latent Variable Models and Factor Analysis, Second Edition David Bartholomew and Martin Knott. 214 pages. 1999. Probabilistic Models for Some Intelligence and Attainment Tests Georg Rasch, with foreword and afterword by Benjamin D. Wright. 199 pages. 1992. Rasch Models for Measurement David Andrich. 96 pages. 1988. Rasch Models: Foundations, Recent Developments, and Applications Edited by Gerhard H. Fischer and Ivo W. Molenaar. 436 pages. 1995. Measurement Instruments Books Questionnaires and Inventories: Surveying Opinions and Assessing Personality Lewis R. Aiken. 319 pages. 1997. Rating Scales and Checklists: Evaluating Behavior, Personality, and Attitudes Lewis R. Aiken. 312 pages. 1996. Setting Performance Standards: Concepts, Methods and Perspectives Edited by Gregory J. Cizek. 520 pages. 2001. Tests & Examinations: Measuring Abilities and Performance Lewis R. Aiken. 293 pages. 1998. Foundations of Measurement: Geometrical, Threshold, and Probabilistic Representations (Volume 2) Patrick Suppes, David Krantz, R. Duncan Luce, and Amos Tversky. 493 pages. 1989. Foundations of Measurement: Representation, Axiomatization, and Invariance (Volume 3) R. Duncan Luce, David Krantz, Patrick Suppes, and Amos Tversky. 356 pages. 1990. Measurement, Judgement, and Decision Making Edited by Michael H. Birnbaum. 386 pages. 1997. Statistical Approach to Social Measurement David J. Bartholomew. 239 pages. 1996 Multivariate Analysis Books Brian S. 
Everitt and Applied Multivariate Data Analysis - Second Edition Graham Dunn Applied Multivariate Statistics for the Social Sciences (Third Edition) James P. Stevens. 672 pages. 1996. Applied Multivariate Statistics for the Social Sciences (Fourth Edition) James P. Stevens. 699 pages. 2001. Applying Generalized Linear Models James Lindsey. 280 pages. 1997. Factor Analysis Richard L. Gorsuch. 452 pages. 1983. A First Course in Factor Analysis Andrew L. Comrey, Howard B. Lee. 448 pages. 1992. The Geometry of Multivariate Statistics Thomas D. Wickens. 176 pages. 1994. Handbook of Applied Multivariate Statistics and Mathematical Modeling Edited by Howard E. A. Tinsley and Steven D. Brown. 760 pages. 2000. Latent Variable Models and Factor Analysis, Second Edition David Bartholomew and Martin Knott. 214 pages. 1999. Mathematical Tools for Applied Multivariate Analysis J. Douglas Carroll, Paul E. Green, with contributions by Anil Chaturvedi. 382 pages. 1997. Modern Multidimensional Scaling: Theory and Applications Ingwer Borg and Patrick Groenen. 471 pages. 1996. Multivariate Analysis Techniques in Social Science Research: From Problem to Analysis Jacques Tacq. 432 pages. 1997. Multivariate Statistical Methods: A First Course George A. Marcoulides, Scott L. Hershberger. 344 pages. 1997. Multivariate Taxometric Procedures: Distinguishing Types from Continua Niels G. Waller, Paul E. Meehl. 149 pages. 1997. Neural Networks: An Introductory Guide for Social Scientists G. David Garson. 208 pages. 1998. A Primer of Multivariate Statistics (Third Edition) Richard J. Harris 609 pages. 2001 Reading and Understanding Multivariate Statistics Edited by Lawrence G. Grimm, PhD and Paul R. Yarnold, PhD. 384 pages. 1995. Reading and Understanding More Multivariate Statistics Edited by Laurence G. Grimm, PhD and Paul R. Yarnold, PhD. 430 pages. 2000. Statistical Analysis of Longitudinal Categorical Data in the Social and Behavioral Sciences Alexander von Eye, Keith E. Niedermeier. 
272 pages. 1999. Regression Analysis Books Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences, Second Edition Jacob Cohen and Patricia Cohen. 545 pages. 1983. Applied Regression Analysis, Linear Models, and Related Methods John Fox. 462 pages. 1997. Log-Linear Models and Logistic Regression, Second Edition Ronald Christensen. 483 pages. 1997. Regression Analysis: Statistical Modeling of a Response Variable Rudolf J. Freund and William J. Wilson. 688 pages. 1998. Regression Analysis for Social Sciences Alexander von Eye, Christof Schuster. 386 pages. 1998. Regression Models for Categorical and Limited Dependent Variables: Analysis and Interpretation J. Scott Long. 416 pages. 1997. Understanding Regression Analysis Michael Patrick Allen. 228 pages. 1997 Scale and Survey Analysis Books Handbook of Survey Research P. Rossi. 337 pages. 1985. How to Analyze Survey Data Arlene Fink. 112 pages. 1995. How to Measure Survey Reliability and Validity Mark S. Litwin. 96 pages. 1995. Improving Survey Questions: Design and Evaluation Floyd J. Fowler, Jr. 200 pages. 1995. An Introduction to Survey Research, Polling, and Data Analysis Herbert F. Weisberg, Jon A. Krosnick, Bruce D. Bowen. 404 pages. 1996. Modern Multidimensional Scaling: Theory and Applications Ingwer Borg and Patrick Groenen. 471 pages. 1996. Questionnaires and Inventories: Surveying Opinions and Assessing Personality Lewis R. Aiken. 319 pages. 1997. Questions and Answers in Attitude Surveys: Experiments on Question Form, Wording, and Context Howard Schuman and Stanley Presser. 392 pages. 1996. Summated Rating Scale Construction: An Introduction Paul E. Spector. 80 pages. 1992. Survey Research Methods Floyd J. Fowler, Jr. 168 pages. 1993. Unidimensional Scaling John P. McIver, Edward G. Carmines. 96 pages. 1981. Statistics and Research Design Books The Analysis of Change Edited by John Mordechai Gottman. 544 pages. 1995. Analyzing Within-Subjects Experiments John W. Cotton. 352 pages. 
1997. Applied Categorical Data Analysis Chap T. Le. 287 pages. 1998. Categorical Variables in Developmental Research: Methods of Analysis Edited by Alexander von Eye and Clifford C. Clogg. 286 pages. 1996. Graphing Statistics & Data: Creating Better Charts Anders Wallgren, Britt Wallgren, Rolf Persson, Ulf Jorner, Jan-Aage Haaland. 112 pages. 1996. Longitudinal Data Analysis: Designs, Models and Methods Leo van der Kamp, Willem A. van der Kloot, Catrien C. J. H. Bijleveld, Rien van der Leeden, Ab Mooijaart, Eeke Van Der Burg. 448 pages. 1998. Measurement, Design, and Analysis: An Integrated Approach Elazer J. Pedhauzer and Liora Pedhauzer Schmelkin. 849 pages. 1991. Modeling Longitudinal and Multilevel Data: Practical Issues, Applied Approaches, and Specific Examples Edited by Todd D. Little, Kai U. Schnabel, and Jürgen Baumert. 308 pages. 2000. New Methods for the Analysis of Change Edited by Linda M. Collins, PhD and Aline G. Sayer, EdD. 442 pages. 2001. Nonparametric Statistical Methods Myles Hollander, Douglas A. Wolfe. 787 pages. 1999. Odds Ratios in the Analysis of Contingency Tables Tamás Rudas. 88 pages. 1997. Ordinal Methods for Behavioral Data Analysis Norman Cliff. 208 pages. 1996. Random Number Generation and Monte Carlo Methods James E. Gentle. 261 pages. 1998. Research Design and Statistical Analysis Jerome L. Myers and Arnold D. Well. 728 pages. 1995. Statistical Graphics for Univariate and Bivariate Data William G. Jacoby. 96 pages. 1997. Statistical Methods for Categorical Data Analysis Daniel A. Powers and Yu Xie. 305 pages. 1999. Statistical Methods in Longitudinal Research: Principles and Structuring Change (Volume 1) Edited by Alexander von Eye. 288 pages. 1990. Statistical Methods in Longitudinal Research: Time Series and Categorical Longitudinal Data (Volume 2) Edited by Alexander von Eye. 352 pages. 1990. Test Development Books Constructing Test Items: Multiple-Choice, Constructed-Response, Performance and Other Formats Steven J. Osterlind. 
352 pages. 1997. Developing and Validating Multiple-Choice Test Items Thomas M. Haladyna. 228 pages. 1994. Differential Item Functioning Edited by Paul W. Holland and Howard Wainer. 456 pages. 1993. Essays on Item Response Theory Edited by Anne Boomsma, Marijtje A.J. van Duijn, Tom A.B. Snijders. 438 pages. 2001. Methods for Identifying Biased Test Items Gregory Camilli and Lorrie A. Shepard. 174 pages. 1994. Modern Theories of Measurement: Problems and Issues Edited by Dany Laveault, Bruno D. Zumbo, Marc E. Gessaroli, and Marvin W. Boss. 408 pages. 1994. Principles of Test Theories Hoi K. Suen. 236 pages. 1990. Psychological Testing Anne Anastasi and Susana Urbina. 832 pages. 1997. Reliability and Validity Assessment Edward G. Carmines and Richard A. Zeller. 70 pages. 1979. Reliability for the Social Sciences: Theory and Applications Ross E. Traub. 174 pages. 1994. A Technology for Test-Item Writing G. Roid and T. Haladyna. 247 pages. 1981. Test Equating: Methods and Practices Michael J. Kolen and Robert L. Brennan. 333 pages. 1995. Test Scoring Edited by David Thissen and Howard Wainer. 422 pages. 2001. Test Theory: A Unified Treatment Roderick P. McDonald. 504 pages. 1999. Test Theory for a New Generation of Tests Norman Frederiksen, Robert J. Mislevy, and Isaac I. Bejar. 416 pages. 1993. Test Validity Edited by Howard Wainer and Henry I. Braun. 272 pages. 1988. Tests and Assessment W. Bruce Walsh and Nancy E. Betz. 474 pages. 1995. Holland & Rosenbaum, 1986 Mokken & Lewis, 1982 Meredith, 1965 Junker (2000) Meijer, 1996; Ramsay, 1991, 1995, 1996) Meijer, 1996; Cliff & Donoghue, 1992; Drasgow, Levine, Tsien, Williams, & Mead, 1995; Samejima, 1997) Molenaar, 1991; 1997 (e.g. 
Ellis & Van den Wollenberg, 1993; Hemker, Sijtsma, & Molenaar, 1995; Mokken, 1971; Molenaar, 1997) Holland & Rosenbaum, 1986; Junker, 1993; Grayson, 1988; Huynh, 1994 A Survey of Theory and Methods of Invariant Item Ordering - Klaas Sijtsma (1996) Sijtsma & Meijer (1992) Junker (1993) Latent and manifest monotonicity in item response models - Brian Junker (1996) Exploring monotonicity in polytomous item response data - Brian Junker Lord and Novick (1968) Lazarsfeld and Henry (1968) Heinen (1996) Hulin, Drasgow and Parsons (1983) Vermunt, 1988 Uebersax, 2000 [Unidimensional versions can be estimated with LEM (Vermunt, 1988), or LTM (Uebersax, 2000), both free programs.] Clogg, C. C. (1995). Latent class models. In G. Arminger, C. C. Clogg, & M. E. Sobel (Eds.), Handbook of statistical modeling for the social and behavioral sciences. New York: Plenum. Clogg, C. C. (1988). Latent class models for measuring. In R. Langeheine & J. Rost (Eds.), Latent trait and latent class models. New York: Plenum. Hagenaars, J. A. (1993). Loglinear models with latent variables. Sage Publications. Heinen, T. (1996). Latent class and discrete latent trait models: Similarities and differences. Thousand Oaks: Sage Publications. Langeheine, R. & Rost, J. (Eds.) (1988). Latent trait and latent class models. New York: Plenum. Lazarsfeld, P. F. & Henry, N. W. (1968). Latent structure analysis. Boston: Houghton Mifflin. McCutcheon, A. L. (1987). Latent class analysis. Sage University Paper series on Quantitative Applications in the Social Sciences, series no. 07-064. Beverly Hills: Sage Publications. Rost, J., & Langeheine, R. (Eds.) (1997). Applications of latent trait and latent class models in the social sciences. Muenster, Germany: Waxmann. Rost J, Langeheine R, eds. Applications of Latent Trait and Latent Class Models in the Social Sciences. New York, NY: Waxmann; 1997:188-195. Qu Y, Tan M, Kutner MH. Random effects models in latent class analysis for evaluating accuracy of diagnostic tests. Biometrics. 1996;52:797-810. 
Clogg, C. C. (1995). Latent class models. In G. Arminger, C. C. Clogg, & M. E. Sobel (Eds.), Handbook of statistical modeling for the social and behavioral sciences (Ch. 6; pp. 311-359). New York: Plenum. Goodman, L. A. (1974), "Exploratory Latent Structure Analysis Using Both Identifiable and Unidentifiable Models," Biometrika, 61, 215-231. Haberman, S. J., Qualitative Data Analysis (Vols. 1 & 2), New York, Academic Press, 1979. Hagenaars, J. A. (1993). Loglinear models with latent variables. Sage Publications. Hagenaars J, McCutcheon A (Eds) (due Feb. 2001). Applied Latent Class Analysis. Cambridge University Press. See online description Lazarsfeld, P. F., and Henry, N. W. (1968), Latent Structure Analysis, Boston: Houghton Mifflin. Langeheine, R. & Rost, J. (Eds.) (1988). Latent trait and latent class models. New York: Plenum. Hagenaars J.A. (1988). LCAG -- Loglinear modeling with latent variables: A modified LISREL approach. In W. E. Saris & I. N. Gallhofer (Eds.), Sociometric research: Volume 2. Data analysis (pp. 111-130). London, England: Macmillan. Pol, F. van de, R. Langeheine, W. de Jong, PANMARK User Manual, Netherlands Central Bureau of Statistics, Voorburg, The Netherlands, 1989. Grego, J. M. (1993). PRASCH: A Fortran program for latent class polytomous response Rasch models. Applied Psychological Measurement, 17, 238. Rost, J., & von Davier, M. (1992). MIRA: A PC program for the mixed Rasch model. Kiel, Germany: IPN--Institute for Science Education. Uebersax, J. S. (1993b). LLCA: Located latent class analysis. Program and user's manual. StatLib statistics archive (http://www.stats.cmu.edu). Espeland, M. A., & Handelman, S. L. (1989). Using latent class models to characterize and assess relative error in discrete measurements. Biometrics, 45, 587-99. Hagenaars, J. A. (1988). Latent structure models with direct effects between indicators: Local dependence models. Sociological Methods and Research, 16, 379-405. Mislevy, R. J. (1984). 
Estimating latent distributions. Psychometrika 49, 359-381. Qu Y., Tan M., & Kutner M. H. (1996). Random effects models in latent class analysis for evaluating accuracy of diagnostic tests. Biometrics, 52, 797-810. Rost, J. (1990). Rasch models in latent classes: An integration of two approaches to item analysis. Applied Psychological Measurement, 14, 271-282. Rost, J. (1991). A logistic mixture distribution model for polychotomous item responses. British Journal of Mathematical and Statistical Psychology, 44, 75-92. Uebersax, J. S. (1988). Validity inferences from interobserver agreement. Psychological Bulletin, 104, 405-416. Uebersax JS. Probit latent class analysis: conditional independence and conditional dependence models. Appl Psychol Measmt, in press. Uebersax, J. S., & Grove, W. M. (1993). A latent trait finite mixture model for the analysis of rating agreement. Biometrics, 49, 823-835. Clogg, C. C. (1988), "Latent Class Models for Measuring," in Latent Trait and Latent Class Models, eds. R. Langeheine and J. Rost, New York: Plenum, pp. 173-205. Dayton CM. Latent Class Scaling Analysis. Quantitative Applications in the Social Sciences, Vol. 126. Sage Publications, May 1999. Dayton, C. M., and Macready, G. B. (1980), "A Scaling Model With Response Errors and Intrinsically Unscalable Respondents," Psychometrika, 45, 343-356. Goodman, L. A. (1975), "A New Model for Scaling Response Patterns: An Application of the Quasi-Independence Concept," Journal of the American Statistical Association, 70, 755-768. Heinen, T. (1996). Latent class and discrete latent trait models: Similarities and differences. Thousand Oaks, California: Sage. Lindsay, B., Clogg, C. C., & Grego, J. (1991). Semiparametric estimation in the Rasch model and related exponential response models, including a simple latent class model for item analysis. Journal of the American Statistical Association, 86, 96-107. Uebersax, J. S. (1993). 
Statistical modeling of expert ratings on medical treatment appropriateness. Journal of the American Statistical Association, 88, 421-427. Clogg, C. C. (1979). Some latent structure models for the analysis of Likert-type data. Social Science Research, 8, 287-301. Croon, M. Latent class analysis with ordered latent classes. The British Journal of Mathematical and Statistical Psychology, 1990, 43, 171-192. Croon, M. A. Investigating Mokken scalability of dichotomous items by means of ordinal latent class analysis. British Journal of Mathematical & Statistical Psychology, 1991, 44, 315-331. Dayton, C. M., and Macready, G. B. (1988), "Concomitant-Variable Latent Class Models," Journal of the American Statistical Association, 83, 173-178. Dayton CM, Macready GB. Use of Categorical and Continuous Covariates in Latent Class Analysis. In: Advances in Latent Class Modeling, McCutcheon A, Hagenaars J (eds.), Cambridge University Press, in press. Formann, A. K. (1978). The latent class analysis of polychotomous data. Biometrical Journal, 20, 755-771. Formann, A. K. (1985), "Constrained Latent Class Models: Theory and Applications," British Journal of Mathematical and Statistical Psychology, 38, 87-111. Formann, A. K. (1992). Linear logistic latent class analysis for polytomous data. Journal of the American Statistical Association, 87, 476-486. Rost, J., "A Latent Class Model for Rating Data," Psychometrika, Vol. 50, No. 1, 37-49, 1985. Rost, J. (1988). Rating scale analysis with latent class models. Psychometrika, 53, 327-348. Uebersax, J. S. (1993). Statistical modeling of expert ratings on medical treatment appropriateness. Journal of the American Statistical Association, 88, 421-427. Anderson, T. W. (1959). Some scaling methods and estimation procedures in the latent class model. In Probability and Statistics, U. Grenander, ed. New York: Wiley, pp. 9-38. 
Gibson, W. A. (1959). Three multivariate models: Factor analysis, latent structure analysis, and latent profile analysis. Psychometrika, 24, 229-252. Bartholomew, David J. 1987. Latent Variable Models and Factor Analysis. New York: Oxford University Press. Day, N. E. Estimating the components of a mixture of normal distributions. Biometrika, 1969, 56, 463-474. Everitt, B. S. (1988). A finite mixture model for the clustering of mixed-mode data. Statistics and Probability Letters, 6, 305-309. Everitt, B. S., & Merette, C. (1990). The clustering of mixed-mode data: A comparison of possible approaches. Journal of Applied Statistics, 17, 283-297. Uebersax, J. S. (1996). On the dimensionality of a latent class analysis solution. (Unpublished paper; based on 'Dimension reduction and latent class analysis,' paper presented at the annual meeting of the Classification Society of North America, Pittsburgh, June 1993). Wolfe, J. H. Pattern clustering by multivariate mixture analysis. Multivariate Behavioral Research, 1970, 5, 329-350. Day, 1969; Titterington, Smith & Makov, 1985; Wolfe, 1970. McCutcheon AC. Latent class analysis. Beverly Hills: Sage Publications, 1987. Clogg, C. C. (1995). Latent class models. In G. Arminger, C. C. Clogg, & M. E. Sobel (Eds.), Handbook of statistical modeling for the social and behavioral sciences (Ch. 6; pp. 311-359). New York: Plenum. Rost J, Langeheine R. A guide through latent structure models for categorical data. In J. Rost & R. Langeheine (Eds.), Applications of latent trait and latent class models in the social sciences. New York: Waxmann, 1997. J. A. 
Hagenaars and A. L. McCutcheon (eds.), Applied latent class analysis. Cambridge: Cambridge University Press, 2000. Magidson Jay, Vermunt Jeroen, "Latent class factor and cluster models, biplots and related graphical displays." Forthcoming (2001) in Sociological Methodology Volume 31. Boston: Blackwell Publishers. Magidson Jay, Vermunt Jeroen, "Latent Class Models." Forthcoming (2002) in the Market Research Book, to be published by the Direct Marketing Association. Magidson Jay, Vermunt Jeroen, "Graphical displays for latent class cluster and latent class factor models." Proceedings COMPSTAT 2000 conference. Utrecht: The Netherlands. Magidson Jay, "Multivariate Statistical Models for Categorical Data," chapter 3 in Bagozzi, Richard, Advanced Methods of Marketing Research, Blackwell, 1994. Magidson Jay, "The CHAID Approach to Segmentation Modeling: CHi-squared Automatic Interaction Detection," chapter 4 in Bagozzi, Richard, Advanced Methods of Marketing Research, Blackwell, 1994. Magidson Jay, "The Use of the New Ordinal Algorithm in CHAID to Target Profitable Segments." Journal of Database Marketing, London: Henry Stewart Publication, July 1993. Magidson Jay, "Improved Statistical Techniques for Response Modeling: Progression Beyond Regression," cover article in Journal of Direct Marketing, Vol. 2, No. 4, 1988. Magidson Jay, "Some Common Pitfalls in the Causal Analysis of Categorical Data." Journal of Marketing Research, November 1982. Lazarsfeld PF, Henry NW. Latent structure analysis. Boston: Houghton Mifflin, 1968. Lord FM, Novick MR. Statistical theories of mental test scores. Reading, Massachusetts: Addison-Wesley, 1968. Heinen T. Latent class and discrete latent trait models: Similarities and differences. Thousand Oaks, California: Sage, 1996. Dayton CM. Latent class scaling analysis. (Quantitative Applications in the Social Sciences, Vol. 126.) Newbury Park, California: Sage Publications, May 1999. Safrit MJ, Cohen AS, Costa MG. 
Item response theory and the measurement of motor behavior. Research Quarterly For Exercise and Sport, 1989, 60, 325-335. Bock RD, Aitkin M. Marginal maximum likelihood estimation of item parameters: application of an EM algorithm. Psychometrika, 1981, 46, 443-459. Hulin CL, Drasgow F, Parsons CK. Item response theory. Homewood, Illinois: Dow Jones-Irwin, 1983. Hambleton RK, Swaminathan H, Rogers HJ. Fundamentals of item response theory. Newbury Park: Sage, 1991. Dayton, C. Mitchell (1998). Latent class scaling analysis. Quantitative Applications in the Social Sciences Series No. 126. Thousand Oaks, CA: Sage Publications. McCutcheon, A. L. (1987). Latent class analysis. Quantitative Applications in the Social Sciences Series No. 64. Thousand Oaks, CA: Sage Publications. McCutcheon A, Hagenaars J., eds. (1999). Advances in Latent Class Modeling. Cambridge, UK and NY: Cambridge University Press. Ellis, J.L. and Wollenberg, A.L. van den (1993). Local Homogeneity in latent trait models. A characterization of the homogeneous monotone IRT model. Psychometrika, 58, 417-429. Gifi, A. (1990). Nonlinear Multivariate Analysis. New York: Wiley. Grayson, D.A. (1988). Two-group classification in latent trait theory: Scores with monotone likelihood ratio. Psychometrika, 53, 383-392. Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale, New Jersey: Erlbaum. Meijer, R.R., Sijtsma, K. and Smid, N.G. (1990). Theoretical and empirical comparison of the Mokken and the Rasch approach to IRT. Applied Psychological Measurement, 14, 283-298. Mokken, R.J. (1971). A Theory and procedure of scale analysis. The Hague: Mouton. Mokken, R.J. (1997). Nonparametric models for dichotomous responses. In W.J. van der Linden and R.K. Hambleton (Eds.). Handbook of Modern Item Response Theory. New York: Springer. Mokken, R.J. and Lewis, C. (1982). A nonparametric approach to the analysis of dichotomous item responses. Applied Psychological Measurement, 6, 417-430. 
Mokken, R.J., Lewis, C. and Sijtsma, K. (1986). Rejoinder to "The Mokken scale: A critical discussion". Applied Psychological Measurement, 10, 279-285. Rivas, T. (1998). Mokken scale analysis: An application to items of Numerical Inductive Reasoning. 11th European Meeting of the Psychometric Society, Lueneburg, Germany. Rosenbaum, P.R. (1984). Testing the conditional independence and monotonicity assumptions of item response theory. Psychometrika, 49, 425-435. Rosenbaum, P.R. (1987). Comparing item characteristic curves. Psychometrika, 52, 217-233. Sijtsma, K. (1998). Methodology Review. Nonparametric IRT approaches to the analysis of dichotomous item scores. Applied Psychological Measurement, 22, 3-31. Sijtsma, K. and Meijer, R.R. (1992). A method for investigating the intersection of item response functions in Mokken's nonparametric IRT model. Applied Psychological Measurement, 16, 149-157. Sijtsma, K. and Molenaar, I.W. (1987). Reliability of test scores in nonparametric item response theory. Psychometrika, 52, 79-97. Torgerson, W.S. (1958). Theory and methods of Scaling. New York: Wiley. Sijtsma K. Molenaar I.W., Introduction to Mokken's nonparametric Item Response Theory, (Measurement Methods for the Social Sciences Series), Sage Publications, 2002. Molenaar, W. (1998). Data, model, conclusion, doing it again. Psychometrika, 63, 315-340. Hoijtink, H.J.A. & Molenaar, W. (1997). A multidimensional item response model: Constrained latent class analysis using the Gibbs sampler and posterior predictive checks. Psychometrika, 62, 171-189. Molenaar, W. (1997). Nonparametric models for polytomous responses. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of Modern Item Response Theory (pp. 369-380). New York: Springer. Molenaar, W. & Hoijtink, H.J.A. (1996). Person fit and the Rasch model, with an application to knowledge of logical quantors. Applied Measurement in Education, 9, 27-45. Molenaar, W. & Lewis, C. (1996). Bayes-Statistik. In: E. Erdfelder, R. 
Mausfeld, T. Meiser, & G. Rudinger (Eds.), Handbuch Quantitative Methoden (pp. 145-156). Weinheim: Psychologie Verlags Union. Hemker, B.T., Sijtsma, K., Molenaar, W., & Junker, B.W. (1996). Polytomous IRT models and monotone likelihood ratio in the total score. Psychometrika, 61, 679-693. Fischer, G.H. & Molenaar, I.W. (Eds.) (1995). Rasch models: Foundations, recent developments and applications. New York: Springer. Boomsma, A., Van Duijn, M.A.J., & Snijders, T.A.B. (Eds). (2001). Essays on item response theory. New York: Springer Verlag. Hoijtink, H.J.A. & Boomsma, A. (1996). Statistical inference based on latent ability estimates. Psychometrika, 61, 100-200. Post, W.J. (1992). Nonparametric unfolding models, a latent trait approach. (M&T Series 21.) Leiden: DSWO Press. Hoijtink, H.J.A. (1991). PARELLA, measurement of latent traits by proximity items. (M&T Series 20.) Leiden: DSWO Press. Birnbaum, Verhelst, Masters, Samejima, Van Schuur, Hoijtink. 1. De Ayala, RJ. Item parameter recovery for the nominal response model. APPLIED PSYCHOLOGICAL MEASUREMENT, 1999 MAR, V23 N1:3-19. 2. Eid, M; Hoffmann, L. 
Measuring variability and change with an item response model for polytomous variables. JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS, 1998 FALL, V23 N3:193-215. 3. Wang, TY; Zeng, LJ. Item parameter estimation for a continuous response model using an EM algorithm. APPLIED PSYCHOLOGICAL MEASUREMENT, 1998 DEC, V22 N4:333-344. 4. Kim, SH; Cohen, AS. Detection of differential item functioning under the graded response model with the likelihood ratio test. APPLIED PSYCHOLOGICAL MEASUREMENT, 1998 DEC, V22 N4:345-355. 5. Butter, R; DeBoeck, P; Verhelst, N. An item response model with internal restrictions on item difficulty. PSYCHOMETRIKA, 1998 MAR, V63 N1:47-63. 6. Schnipke, DL; Scrams, DJ. Modeling item response times with a two-state mixture model: A new method of measuring speededness. JOURNAL OF EDUCATIONAL MEASUREMENT, 1997 FALL, V34 N3:213-232. 7. Hoijtink, H; Molenaar, IW. A multidimensional item response model: Constrained latent class analysis using the Gibbs sampler and posterior predictive checks. PSYCHOMETRIKA, 1997 JUN, V62 N2:171-189. 8. Roberts, JS; Laughlin, JE. A unidimensional item response model for unfolding responses from a graded disagree-agree response scale. APPLIED PSYCHOLOGICAL MEASUREMENT, 1996 SEP, V20 N3:231-255. 9. Ferrando, PJ. Calibration of invariant item parameters in a continuous item response model using the extended LISREL measurement submodel. MULTIVARIATE BEHAVIORAL RESEARCH, 1996, V31 N4:419-439. 10. ROBLES J. PRS - POLYTOMOUS RESPONSE SIMULATOR - POLYTOMOUS ITEM GENERATION ACCORDING TO THE COMMON FACTOR MODEL. APPLIED PSYCHOLOGICAL MEASUREMENT, 1996 JUN, V20 N2:140-140. Pub type:Software Review. 11. REISER M. ANALYSIS OF RESIDUALS FOR THE MULTINOMIAL ITEM RESPONSE MODEL. PSYCHOMETRIKA, 1996 SEP, V61 N3:509-528. 12. KIRISCI L; MOSS HB; TARTER RE. PSYCHOMETRIC EVALUATION OF THE SITUATIONAL CONFIDENCE QUESTIONNAIRE IN ADOLESCENTS - FITTING A GRADED ITEM RESPONSE MODEL. ADDICTIVE BEHAVIORS, 1996 MAY-JUN, V21 N3:303-317. 13. 
LANE S; STONE CA; ANKENMANN RD; LIU M. EXAMINATION OF THE ASSUMPTIONS AND PROPERTIES OF THE GRADED ITEM RESPONSE MODEL - AN EXAMPLE USING A MATHEMATICS PERFORMANCE ASSESSMENT. APPLIED MEASUREMENT IN EDUCATION, 1995, V8 N4:313-340. 14. KIRISCI L; TARTER RE; HSU TC. FITTING A TWO-PARAMETER LOGISTIC ITEM RESPONSE MODEL TO CLARIFY THE PSYCHOMETRIC PROPERTIES OF THE DRUG USE SCREENING INVENTORY FOR ADOLESCENT ALCOHOL AND DRUG ABUSERS. ALCOHOLISM-CLINICAL AND EXPERIMENTAL RESEARCH, 1994 DEC, V18 N6:1335-1341. 15. COHEN AS; KIM SH; BAKER FB. DETECTION OF DIFFERENTIAL ITEM FUNCTIONING IN THE GRADED RESPONSE MODEL. APPLIED PSYCHOLOGICAL MEASUREMENT, 1993 DEC, V17 N4:335-350. 16. HOIJTINK H; MOLENAAR I. AN ITEM RESPONSE MODEL WITH SINGLE PEAKED ITEM CHARACTERISTIC CURVES THE PARELLA MODEL. QUALITY & QUANTITY, 1994 FEB, V28 N1:99-116. 17. BATLEY RM; BOSS MW. THE EFFECTS ON PARAMETER ESTIMATION OF CORRELATED DIMENSIONS AND A DISTRIBUTION-RESTRICTED TRAIT IN A MULTIDIMENSIONAL ITEM RESPONSE MODEL. APPLIED PSYCHOLOGICAL MEASUREMENT, 1993 JUN, V17 N2:131-141. 18. BERGER MPF. SEQUENTIAL SAMPLING DESIGNS FOR THE 2-PARAMETER ITEM RESPONSE THEORY MODEL. PSYCHOMETRIKA, 1992 DEC, V57 N4:521-538. 19. CAMILLI G. A CONCEPTUAL ANALYSIS OF DIFFERENTIAL ITEM FUNCTIONING IN TERMS OF A MULTIDIMENSIONAL ITEM RESPONSE MODEL. APPLIED PSYCHOLOGICAL MEASUREMENT, 1992 JUN, V16 N2:129-147. 20. SIJTSMA K; MEIJER RR. A METHOD FOR INVESTIGATING THE INTERSECTION OF ITEM RESPONSE FUNCTIONS IN MOKKEN NONPARAMETRIC IRT MODEL. APPLIED PSYCHOLOGICAL MEASUREMENT, 1992 JUN, V16 N2:149-157. 21. HOLDEN RR; KRONER DG; FEKKEN GC; POPHAM SM. A MODEL OF PERSONALITY TEST ITEM RESPONSE DISSIMULATION. JOURNAL OF PERSONALITY AND SOCIAL PSYCHOLOGY, 1992 AUG, V63 N2:272-279. 22. ABRAHAMOWICZ M; RAMSAY JO. MULTICATEGORICAL SPLINE MODEL FOR ITEM RESPONSE THEORY. PSYCHOMETRIKA, 1992 MAR, V57 N1:5-27. 23. NUGENT WR; HANKINS JA. ITEM-RESPONSE THEORY - THE 2-PARAMETER MODEL, RASCH MODEL, AND INVARIANCE CRITERIA. 
SOCIAL SERVICE REVIEW, 1991 JUN, V65 N2:322-328. Pub type:Discussion. 24. LINDSAY B; CLOGG CC; GREGO J. SEMIPARAMETRIC ESTIMATION IN THE RASCH MODEL AND RELATED EXPONENTIAL RESPONSE MODELS, INCLUDING A SIMPLE LATENT CLASS MODEL FOR ITEM ANALYSIS. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1991 MAR, V86 N413:96-107. 25. MURAKI E. FITTING A POLYTOMOUS ITEM RESPONSE MODEL TO LIKERT-TYPE DATA. APPLIED PSYCHOLOGICAL MEASUREMENT, 1990 MAR, V14 N1:59-71. 26. WILSON M. EMPIRICAL EXAMINATION OF A LEARNING HIERARCHY USING AN ITEM RESPONSE THEORY MODEL. JOURNAL OF EXPERIMENTAL EDUCATION, 1989 SUMMER, V57 N4:357-371. 27. REISER M. AN APPLICATION OF THE ITEM-RESPONSE MODEL TO PSYCHIATRIC EPIDEMIOLOGY. SOCIOLOGICAL METHODS & RESEARCH, 1989 AUG, V18 N1:66-103. 28. TENVERGERT E; KINGMA J; TAERUM T. PSYCHOLOGY OF COMPUTER USE .8. UTILIZING A NONPARAMETRIC ITEM RESPONSE MODEL TO DEVELOP UNIDIMENSIONAL SCALES - MOKSCAL. PERCEPTUAL AND MOTOR SKILLS, 1989 JUN, V68 N3:987-1000. Ayabe Harold, Delong David, Gee Travis, Hox Joop, Junker Brian, Dennis Roberts, Stenbeck Magnus, Terry Robert, Hambleton, Swaminathan and Rogers (1985), Andrich D. (1988). Rasch models for measurement. Newbury Park, CA: Sage. Bock RD, Lieberman M. (1970). Fitting a response curve model for dichotomously scored items. Psychometrika, 35, 179-198. Hambleton RK, Swaminathan H. (1985). Item response theory. Boston: Kluwer-Nijhoff (pp. 144-147, 7.8 Approximate Estimation Procedures). Hambleton RK, Swaminathan H, Rogers HJ. (1991). Fundamentals of Item Response Theory. Newbury Park: Sage. Kelderman, H. (1984). Loglinear Rasch model tests. Psychometrika, 49, 223-245. Lindsay B, Clogg CC, Grego J. (1991). Semiparametric estimation in the Rasch model and related exponential response models, including a simple latent class model for item analysis. JASA, 86, 96-107. Lord FM, Novick MR. (1968). Statistical theories of mental test scores. Reading, Massachusetts: Addison-Wesley. Rasch G. (1960; reprinted 1980). 
Probabilistic models for some intelligence and attainment tests. Chicago: University of Chicago Press. Safrit MJ, Cohen AS, Costa MG. (1989). Item response theory and the measurement of motor behavior. Research Quarterly For Exercise and Sport, 60, 325-335. Wright BD, Stone MH. (1979). Best test design: Rasch measurement. Chicago: MESA Press. Adema, J.J., & van der Linden, W.J. (1989). Algorithms for computerized test construction of parallel tests using classical item parameters. Journal of Educational Statistics, 15, 129-145. Aitchison, J., & Silvey, S.D. (1958). Maximum likelihood estimation of parameters subject to restraints. Annals of Mathematical Statistics, 29, 813-828. American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1985). Standards for educational and psychological testing. Washington, DC: American Psychological Association. Andersen, E.B. (1970). Asymptotic properties of conditional maximum likelihood estimation. Journal of the Royal Statistical Society, Series B, 32, 283-301. Andersen, E.B. (1973a). A goodness of fit test for the Rasch model. Psychometrika, 38, 123-140. Andersen, E.B. (1973b). Conditional inference and models for measuring. (Unpublished Ph.D. Thesis). Copenhagen: Mentalhygiejnisk Forlag. Andersen, E.B. (1973c). Conditional inference for multiple-choice questionnaires. British Journal of Mathematical and Statistical Psychology, 26, 31-44. Andersen, E.B., & Madsen, M. (1977). Estimating the parameters of the latent population distribution. Psychometrika, 42, 357-374. Andersen, E.B. (1977). Sufficient statistics and latent trait models. Psychometrika, 42, 69-81. Andersen, E.B. (1980). Discrete statistical models with social science applications. Amsterdam: North Holland. Andersen, E.B. (1985). Estimating latent correlations between repeated testings. Psychometrika, 46, 443-459. Andrich, D. (1978a). A rating formulation for ordered response categories. 
Psychometrika, 43, 561-573. Andrich, D. (1978b). Scaling attitude items constructed and scored in the Likert tradition. Educational and Psychological Measurement, 38, 665-680. Angoff, W.H. (1971). Scales, norms, and equivalent scores. In: R.L. Thorndike (Ed.). Educational measurement (2nd ed., pp. 508-600). Washington, DC: American Council on Education. Armstrong, R.D., Jones, D.H., & Wu, I. (1992). An automated test development of parallel tests from a seed test. Psychometrika, 57, 271-288. Bartko, J.J. (1966). The intraclass correlation coefficient as a measure of reliability. Psychological Reports, 19, 3-11. Bartko, J.J., & Carpenter, W.T. (1976). On the methods and theory of reliability. The Journal of Nervous and Mental Disease, 163, 307-317. Bejar, I.I. (1983). Subject matter experts’ assessment of item statistics. Applied Psychological Measurement, 7, 303-310. Bentler, P. M. (1985). Theory and implementation of EQS: A structural equations program. Los Angeles: BMDP Statistical Software. Berger, J.O. (1980). Statistical decision theory: Foundations, concepts and methods. New York: Springer. Berk, R.A. (1986). A consumer’s guide to setting performance standards on criterion-referenced tests. Review of Educational Research, 56, 137-172. Beuk, C.H. (1984). A method for reaching a compromise between absolute and relative standards in examinations. Journal of Educational Measurement, 21, 147-152. Bezembinder, Thom. G. G. (1970). Van rangorde naar continuum. Deventer: Van Loghum Slaterus. Birnbaum, A. (1968). Some latent trait models. In: F.M. Lord, & M.R. Novick. Statistical theories of mental test scores (pp. 397-424). Reading: Addison-Wesley. Bishop, Y.M.M., Fienberg, S.E., & Holland, P.W. (1975). Discrete multivariate analysis: Theory and practice. Cambridge: The MIT Press. Bock, R.D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29-51. Bock, R.D. (1976). 
Basic issues in the measurement of change. In: D.N.M. de Gruijter, & L.J.Th. van der Kamp (Eds.). Advances in psychological and educational measurement (pp. 75-96). London: Wiley. Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: An application of an EM-algorithm. Psychometrika, 46, 443-459. Bock, R.D., Gibbons, R.D., & Muraki, E. (1988). Full-information factor analysis. Psychological Measurement, 13, 261-280. Boekkooi-Timminga, E. (1990). The construction of parallel tests from IRT-based item banks. Journal of Educational Statistics, 15, 129-145. Bol, E., & Verhelst, N.D. (1985). Inhoudelijke en statistische analyse van een leertoets. Tijdschrift voor Onderwijsresearch, 10, 49-68. Bollen, K.A. (1989). Structural equations with latent variables. New York: Wiley. Bosch, L. van den, Gillijns, P., Krom, R., & Moelands, F. (1991). Handleiding schaal vorderingen in spellingvaardigheid 1. Arnhem: Cito. Bradley, T.B. (1983). Remediation of cognitive deficits: A critical appraisal of the Feuerstein model. Journal of Mental Deficiency Research, 27, 79-92. Braun, W.I., & Holland, P.W. (1982). Observed-score test equating: A mathematical analysis of some ETS equating procedures. In: P.W. Holland, & D.B. Rubin (Eds.). Test equating (pp. 9-49). New York: Academic Press. Brennan, R.L. (1992). Elements of generalizability theory. Iowa City: ACT. Brown, W. (1910). Some experimental results in the correlation of mental abilities. British Journal of Psychology, 3, 296-322. Bügel, K. (1991). Sexeverschillen in onderwijsprestaties in Nederland: Een overzicht van de literatuur en enkele nieuwe gegevens. Pedagogische Studiën, 68, 350-370. Bügel, K. (1993). Tekstbegrip moderne vreemde talen: De invloed van sekse en tekstonderwerp op de scores van centrale examens. Tijdschrift voor Onderwijswetenschappen, 23, 162-176. Bügel, K., & Glas, C.A.W. (1991). 
Item specifieke verschillen in prestaties tussen jongens en meisjes bij tekstbegrip examens moderne vreemde talen. Tijdschrift voor Onderwijsresearch, 16, 337-351. Campbell, D.T., & Fiske, D.W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105. Campbell, D.T., & Stanley, J.C. (1966). Experimental and quasi-experimental designs for research. Chicago: Rand McNally. Coombs, C.H. (1964). A theory of data. New York: Wiley. Cardinet, J., Tourneur, Y., & Allal, L. (1981). Extension of generalizability theory and its applications in educational measurement. Journal of Educational Measurement, 18, 183-204; 19, 331-332. Cicchetti, D.V. (1972). A new measure of agreement between rank ordered variables. In Proceedings of the 80th Annual Convention of the American Psychological Association 7, 17-18. Cicchetti, D.V. (1976). Assessing inter-rater reliability for rating scales: Resolving some basic issues. British Journal of Psychiatry, 129, 452-456. Cochran, W. G. (1977). Sampling techniques. New York: Wiley. Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46. Cohen, J. (1968). Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70, 213-220. Cornfield, J., & Tukey, J.W. (1956). Average values of mean squares in factorials. Annals of Mathematical Statistics, 27, 907-949. Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. New York: Holt, Rinehart and Winston. Cronbach, L.J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334. Cronbach, L.J. (1971). Test validation. In: R.L. Thorndike (Ed.). Educational Measurement (2nd ed., pp. 443-507). Washington, DC: American Council on Education. Cronbach, L.J., & Meehl, P.E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281-302. 
Cronbach, L.J., & Furby, L. (1970). How we should measure "change" - or should we? Psychological Bulletin, 74, 68-80. Cronbach, L.J., Gleser, G.C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: Wiley. Dirickx, Y.M.I., Baas, S.M., & Dorhout, B. (1987). Operationele research. Schoonhoven: Academic Service. Divgi, D.R. (1981). Two direct procedures for scaling and equating tests with item response theory. Paper presented at the annual meeting of the National Council on Measurement in Education. Dixon, W.J. (Ed.) (1992). BMDP statistical software manual: Vol. 1 and 2. Berkeley: University of California Press. Dousma, T., & Horsten, A. (1989). Tentamineren. Groningen: Wolters-Noordhoff. Drenth, P.J.D., & Sijtsma, K. (1990). Testtheorie: Inleiding in de theorie van de psychologische test en zijn toepassingen. Houten: Bohn Stafleu Van Loghum. Dunn, G. (1989). Design and analysis of reliability studies: The statistical evaluation of measurement errors. New York: Oxford University Press. Ebel, R.L. (1967). The relation of item discrimination to test reliability. Journal of Educational Measurement, 4, 125-128. Ebel, R.L. (1972). Essentials of educational measurement. Englewood Cliffs: Prentice-Hall. Ebel, R.L. (1983). The practical validation of tests of ability. Educational Measurement: Issues and Practice, 2, 7-10. Ebel, R.L., & Frisbie, D.A. (1986). Essentials of educational measurement. Englewood Cliffs: Prentice-Hall. Eggen, T.J.H.M. (1990). Innovative procedures in the calibration of measurement scales. In: W.H. Schreiber, & K. Ingenkamp (Eds.). International developments in large scale assessment (pp. 199-212). Windsor, Berkshire: NFER-NELSON. Eggen, T.J.H.M., & Verhelst, N.D. (1992). Item calibration in incomplete testing designs. (Measurement and Research Department Reports 92-3). Arnhem: Cito. Elliott, C.D., Murray, D.J., & Saunders, R. (1977). 
Goodness of fit to the Rasch model as a criterion of test unidimensionality. Manchester: University of Manchester. Evers, A., Vliet-Mulder, J.C. van, & Laak, J. ter. (1992). Documentatie van tests en testresearch in Nederland. Amsterdam: Nederlands Instituut van Psychologen. Fagot, R.F. (1991). Reliability of ratings for multiple judges: Intraclass correlation and metric scales. Applied Psychological Measurement, 15, 1-11. Fagot, R.F. (1993). A generalized family of coefficients of relational agreement for numerical scales. Psychometrika, 58, 357-370. Feldt, L.S. (1965). The approximate sampling distribution of Kuder-Richardson reliability coefficient twenty. Psychometrika, 30, 357-370. Feldt, L.S. (1993). The relationship between the distribution of item difficulties and test reliability. Applied Measurement in Education, 6, 37-49. Feldt, L.S., Steffen, M., & Gupta, N.C. (1985). A comparison of five methods for estimating the standard error of measurement at specific score levels. Applied Psychological Measurement, 9, 351-361. Feldt, L.S., & Brennan, R.L. (1989). Reliability. In: R.L. Linn (Ed.). Educational Measurement (3rd ed., pp. 105-146). Washington, DC: American Council on Education. Ferguson, G.A., & Takane, Y. (1989). Statistical analysis in psychology and education. New York: McGraw-Hill. Feuerstein, R. (1980). Instrumental enrichment: An intervention program for cognitive modifiability. Baltimore: University Park Press. Fischer, G.H. (1972). A step towards dynamic test-theory. (Research Bulletin Nr. 10/72). Universität Wien: Psychologisches Institut. Fischer, G.H. (1973). The linear logistic test model as an instrument in educational research. Acta Psychologica, 37, 359-373. Fischer, G.H. (1974). Einführung in die theorie psychologischer tests. Bern: Huber. Fischer, G.H. (1981). On the existence and uniqueness of maximum likelihood estimates in the Rasch model. Psychometrika, 46, 59-77. Fischer, G.H. (1983). 
Logistic latent trait models with linear constraints. Psychometrika, 48, 3-26. Fischer, G.H. (in preparation). Derivations of the Rasch model. In: G.H. Fischer, & I.W. Molenaar (Eds.). Rasch models: Their foundations, recent developments and applications. Fischer, G.H., & Scheiblechner, H. (1970). Algorithmen und programme für das probabilistische testmodell von Rasch. Psychologische Beiträge, 12, 23-51. Flanagan, J.C. (1951). Units, scores and norms. In: E.F. Lindquist (Ed.). Educational measurement (pp. 695-763). Washington, DC: American Council on Education. Fleiss, J.L. (1986). The design and analysis of clinical experiments. New York: Wiley. Fleiss, J.L., Cohen, J., & Everitt, B.S. (1969). Large sample standard errors of kappa and weighted kappa. Psychological Bulletin, 72, 5, 323-327. Fleiss, J.L., & Shrout, P.E. (1978). Approximate interval estimation for a certain intraclass correlation coefficient. Psychometrika, 43, 259-262. Follman, D. (1988). Consistent estimation in the Rasch model based on nonparametric margins. Psychometrika, 53, 553-562. Freeman, M.F., & Tukey, J.W. (1950). Transformations related to the angular and square root. The Annals of Mathematical Statistics, 21, 607-611. Frisbie, D.A. (1988). Reliability of scores from teacher-made tests. Educational Measurement: Issues and practice, 7, 53-63. Glas, C.A.W. (1981). Het Raschmodel bij data in een onvolledig design. (PSM-Progress reports, 81-1). Utrecht: Vakgroep PSM van de subfaculteit Psychologie. Glas, C.A.W. (1989). Contributions to estimating and testing Rasch models. Arnhem: Cito. Glas, C.A.W. (1992). A Rasch model with a multivariate distribution of ability. In: M. Wilson (Ed.). Objective measurement: Theory into practice: Vol. 1 (pp. 236-258). Norwood: Ablex. Glas, C.A.W., & Verhelst, N.D. (1989). Extensions of the partial credit model. Psychometrika, 54, 635-659. Glas, C.A.W., & Verhelst, N.D. (in preparation). Testing the Rasch model. 
In: G.H. Fischer, & I.W. Molenaar (Eds.). Rasch models: Their foundations, recent developments and applications. Green, S.B., & Lissitz, R.W. (1977). Limitations of coefficient alpha as an index of test unidimensionality. Educational and Psychological Measurement, 37, 827-838. Groot, A.D. de (1966). Vijven en zessen. Groningen: Wolters. Groot, A.D. de, & Naerssen, R.F. (1973). Studietoetsen, construeren, afnemen, analyseren: Deel I en II. Den Haag: Mouton. Gruijter, D.N.M. de (1985). Compromise models for establishing examination standards. Journal of Educational Measurement, 22, 263-269. Guilford, J.P., & Fruchter, B. (1978). Fundamental statistics in psychology and education. Tokyo: McGraw-Hill. Gulliksen, H. (1950). Theory of mental tests. New York: Wiley. Gustafsson, J.E. (1979). PML: A computer program for conditional estimation and testing in the Rasch model for dichotomous items. (Reports from the Institute of Education, nr. 63). Göteborg: University of Göteborg. Guttman, L. A. (1950). The Basis of Scalogram Analysis. In: S.A. Stouffer, L.A. Guttman, E.A. Suchman, P.F. Lazarsfeld, S.A. Star, & J.A. Clausen (Eds.). Measurement and prediction: Studies in social psychology in World War II: Vol. 4. Princeton: Princeton University Press. Guttman, L. A. (1954). A new approach to factor analysis: The radex. In: P.F. Lazarsfeld (Ed.). Mathematical thinking in the social sciences (pp. 258-348). New York: Columbia University Press. Haggard, E.A. (1958). Intraclass correlation and the analysis of variance. New York: The Dryden Press. Hambleton, R.K., & Novick, M.R. (1973). Toward an integration of theory and method for criterion-referenced tests. Journal of Educational Measurement, 10, 159-170. Hambleton, R.K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston: Kluwer Academic Publishers. Hambleton, R.K., & Rogers, H.J. (1989). Detecting potentially biased test items: Comparison of IRT area and Mantel-Haenszel methods. 
Applied Psychological Measurement, 2, 313-334. Harris, D.H., & Crouse, J.D. (1992). A study of criteria used in equating. Paper presented at the annual meeting of the National Council on Measurement in Education. Heinen, T. (1993). Discrete latent variable models. Doctoral dissertation, Katholieke Universiteit Brabant. Henrysson, S. (1963). Correction of item-total correlations in item analysis. Psychometrika, 28, 211-218. Hofstee, W.K.B. (1977). Cesuurprobleem opgelost. Onderzoek van Onderwijs, 6/2, 6-7. Hofstee, W.K.B. (1981). Psychologische uitspraken over personen. Deventer: Van Loghum Slaterus. Hofstee, W.K.B. (1983). The case for compromise in educational selection and grading. In Anderson, S.B., & Helmick, J.S. (Eds.). On educational testing. San Francisco: Jossey-Bass. Hoijtink, H., & Boomsma, A. (1991). Statistical inference with latent ability estimates. (Prepublication Department of Statistics and Measurement Theory). Groningen: University of Groningen. Hoijtink, H. (Ed.). (1993). Kwantitatieve Methoden nr. 42. Holland, P.W., & Rubin, D.B. (1982). Test equating. New York: Academic Press. Holland, P.W., & Thayer, D.T. (1988). Differential item functioning and the Mantel-Haenszel procedure. In: H. Wainer, & H.I. Braun (Eds.). Test validity (pp. 129-145). Hillsdale: Lawrence Erlbaum. Hommel, G. (1983). Tests of the overall hypothesis for arbitrary dependence structures. Biometrical Journal, 25, 423-430. Houston, W.M., Raymond, M.R., & Svec, J.C. (1991). Adjustments for rater effects in performance assessment. Applied Psychological Measurement, 15, 409-421. Hulin, C.L., Drasgow, F., & Parsons, C.K. (1983). Item response theory: Applications to psychological measurement. Homewood: Dow-Jones Irwin. Iker, H.P., & Perry, N.C.A. (1960). A further note concerning the reliability of the point-biserial correlation. Educational and Psychological Measurement, 20, 505-507. Imbos, Tj. (1989). Het gebruik van einddoel toetsen bij aanvang van de studie. 
Doctoral dissertation, Rijksuniversiteit Limburg. Inspectierapport. (1992). Examens op punten getoetst: Onderzoek naar de ontwikkeling van de normen bij de centrale examens in het voortgezet onderwijs. James, L.R., Demaree, R.G., & Wolf, G. (1984). Estimating within-group interrater reliability with and without response bias. Journal of Applied Psychology, 69, 85-98. Jannarone, R.J. (1986). Conjunctive item response theory kernels. Psychometrika, 51, 357-373. Jansen, G.G.H. (1979). Het meten van veranderingen in de klassieke testtheorie. (Bulletinreeks nr. 2). Arnhem: Cito. Jarjoura, D. (1983). Best linear prediction of composite universe scores. Psychometrika, 48, 525-539. Jazwinski, A.H. (1970). Stochastic processes and filtering theory. New York: Academic Press. Johnson, H.M. (1935). Some neglected principles in aptitude testing. American Journal of Psychology, 47, 159-165. Jonge, H. de (1963). Inleiding tot de medische statistiek: Deel I. Groningen: Wolters-Noordhoff. Jöreskog, K.G. (1970). Estimation and testing of simplex models. The British Journal of Mathematical and Statistical Psychology, 23, 121-145. Jöreskog, K.G., & Sörbom, D. (1989). LISREL 7, user’s reference guide. Mooresville: Scientific Software. Kamphuis, F.H., & Engelen, R.J.H. (in preparation). Estimation and testing of structured latent ability covariance matrices in IRT models. Kane, M.T. (1992). An argument-based approach to validation. Psychological Bulletin, 112, 527-535. Kelderman, H. (1984). Loglinear Rasch model tests. Psychometrika, 49, 223-245. Kelderman, H. (1988). Loglinear multidimensional IRT model for polytomously scored items. (Research Report 88-17). Enschede: Universiteit Twente. Kelderman, H. (1989). Item bias detection using loglinear IRT. Psychometrika, 54, 681-697. Kelderman, H., & Steen, R. (1988). LOGIMO I: Loglinear item response theory modeling. (Computer Program). Enschede: University of Twente, Department of Educational Technology. Kelderman, H., & Macready, G.B. (1990).
The use of loglinear models for assessing differential item functioning across manifest and latent examinee groups. Journal of Educational Measurement, 27, 307-327. Kelley, T.L. (1947). Fundamentals of statistics. Cambridge: Harvard University Press. Kendall, M., & Stuart, A. (1973). The advanced theory of statistics: Vol. 2. London: Griffin. Kiefer, J., & Wolfowitz, J. (1956). Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters. Annals of Mathematical Statistics, 27, 887-903. Klauer, K.C. (1991). An exact and optimal standardized person test for assessing consistency with the Rasch model. Psychometrika, 56, 213-228. Kolen, M.J. (1988). Defining score scales in relation to measurement error. Journal of Educational Measurement, 25, 97-110. Koppen, M.G.M. (1987). On finding the bidimension of a relation. Journal of Mathematical Psychology, 31, 155-178. Knol, D.L. (1986). Een overzicht van meerdimensionale itemresponsmodellen. (Rapport R-86-5). Enschede: Universiteit Twente, Faculteit TO, vakgroep OMD. Krippendorff, K. (1970). Estimating the reliability, systematic error and random error of interval data. Educational and Psychological Measurement, 30, 61-70. Krippendorff, K. (1980). Content analysis: An introduction to its methodology. Beverly Hills: Sage Publications. Kuder, G.F., & Richardson, M.W. (1937). The theory of the estimation of test reliability. Psychometrika, 2, 151-160. Lahey, M.A., Downey, R.G., & Saal, F.E. (1983). Intraclass correlations: There’s more than meets the eye. Psychological Bulletin, 93, 586-595. Landis, J.R., & Koch, G.G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159-174. Laros, J.A., & Tellegen, P.J. (1991). Construction and validation of the SON-R 5½-17, the Snijders-Oomen non-verbal intelligence test. Groningen: Wolters-Noordhoff. Lazarsfeld, P.F. (1950). Logical and mathematical foundations of latent structure analysis. In: S.A. Stouffer.
Studies in social psychology in World War II, IV. Princeton, NJ: Princeton University Press. LBR (1988). Psychologische tests en allochtonen. Symposiumverslag 1987, LBR-Reeks nr. 6. LBR (1990). Toepasbaarheid van psychologische tests bij allochtonen. Rapport van de testscreeningscommissie ingesteld door het LBR in overleg met het NIP, LBR-Reeks nr. 11. Leeuw, J. de, & Verhelst, N.D. (1986). Maximum likelihood estimation in generalized Rasch models. Journal of Educational Statistics, 11, 183-196. Leeuwe, J.F.J. van (1990). Probabilistic conjunctive models. Doctoral dissertation. Nijmegen: NICI. Linden, W.J. van der (Ed.). (1982). Aspects of criterion-referenced measurement. Evaluation in Education: An International Review Series, 5. Linden, W.J. van der (1983). Van standaardtest naar itembank. Universiteit Twente (inaugural lecture). Linden, W.J. van der (1984). Some thoughts on the use of decision theory to set cutoff scores: Comment on De Gruijter and Hambleton. Applied Psychological Measurement, 8, 9-17. Linden, W.J. van der (1985). Decision theory in educational research and testing. In: T. Husén, & T.N. Postlethwaite (Eds.). International encyclopedia of education: Research and studies. Oxford: Pergamon Press. Linden, W.J. van der, & Boekkooi-Timminga, E. (1988). A zero-one programming approach to Gulliksen’s matched random subtests method. Applied Psychological Measurement, 12, 201-209. Linden, W.J. van der, & Boekkooi-Timminga, E. (1989). A maximin model for test design with practical constraints. Psychometrika, 54, 237-247. Lindsay, B., Clogg, C.C., & Grego, J. (1991). Semiparametric estimation in the Rasch model and related exponential response models, including a simple latent class model for item analysis. Journal of the American Statistical Association, 86, 96-107. Linn, R.L. (Ed.). (1989). Intelligence: Measurement, theory, and public policy. Chicago: University of Illinois Press. Little, R.J.A., & Rubin, D.B. (1987). Statistical analysis with missing data.
New York: Wiley. Livingston, S.A., & Zieky, M.J. (1982). Passing scores: A manual for setting standards of performance on educational and performance tests. Princeton, NJ: Educational Testing Service. Lord, F.M. (1950). Notes on comparable scales for test scores (Research Bulletin 50-48). Princeton, NJ: Educational Testing Service. Lord, F.M. (1952). The relation of the reliability of multiple-choice tests to the distribution of item difficulties. Psychometrika, 17, 181-194. Lord, F.M. (1953). On the statistical treatment of football numbers. The American Psychologist, 8, 750-751. Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale: Lawrence Erlbaum. Lord, F.M. (1983a). Unbiased estimators of ability parameters, their variance and of their parallel-forms reliability. Psychometrika, 48, 233-245. Lord, F.M. (1983b). Estimating the imputed social cost of errors of measurement. (Report RR-83-33-ONR). Princeton, NJ: Educational Testing Service. Lord, F.M., & Novick, M.R. (1968). Statistical theories of mental test scores. Reading: Addison-Wesley. Lord, F.M., & Wingersky, M.S. (1983). Comparison of IRT true-score and equipercentile observed-score ‘equatings’. Applied Psychological Measurement, 8, 453-461. MacCann, R.G. (1990). Derivations of observed score equating methods that cater to populations differing in ability. Journal of Educational Statistics, 15, 146-170. Maris, E. (1992). Psychometric models for psychological processes and structures. Doctoral dissertation, Universiteit Leuven. Martin-Löf, P. (1973). Statistiska modeller: Anteckningar från seminarier läsåret 1969-1970, utarbetade av Rolf Sundberg. Obetydligt ändrat nytryck, oktober 1973. Stockholm: Institutet för Försäkringsmatematik och Matematisk Statistik vid Stockholms Universitet. Martin-Löf, P. (1974). The notion of redundancy and its use as a quantitative measure of the discrepancy between a statistical hypothesis and a set of observational data.
Scandinavian Journal of Statistics, 1, 3-18. Masters, G.N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174. Masters, G.N., & Wright, B.D. (1984). The essential process in a family of measurement models. Psychometrika, 49, 529-544. Maxwell, A.E., & Pilliner, A.E.G. (1968). Deriving coefficients of reliability and agreement. The British Journal of Mathematical and Statistical Psychology, 21, 105-116. McKinley, R.L., & Reckase, M.D. (1983). MAXLOG: A computer program for the estimation of the parameters of a multidimensional logistic model. Behavior Research Methods and Instrumentation, 15, 389-390. Meerling (1981). Methoden en technieken van psychologisch onderzoek: Deel 1. Meppel: Boom. Mellenbergh, G.J. (1977). The replicability of measures. Psychological Bulletin, 84, 378-384. Mellenbergh, G.J. (1982). Contingency table models for assessing item bias. Journal of Educational Statistics, 7, 105-118. Mellenbergh, G.J. (1983). Conditional item bias methods. In: S.H. Irvine, & W.J. Berry (Eds.). Human assessment and cultural factors (pp. 293-302). New York: Plenum Press. Mellenbergh, G.J. (1985). Vraag-onzuiverheid: definitie, detectie en onderzoek. Nederlands Tijdschrift voor Psychologie, 40, 425-435. Messick, S. (1988). The once and future issues of validity: Assessing the meaning and consequences of measurement. In: H. Wainer, & H.I. Braun (Eds.). Test validity (pp. 33-45). Hillsdale: Lawrence Erlbaum. Messick, S. (1989). Validity. In: R.L. Linn (Ed.). Educational Measurement (3rd ed., pp. 13-103). Washington, DC: American Council on Education. Millman, J., & Greene, J. (1989). The specification and development of tests of achievement and ability. In: R.L. Linn (Ed.). Educational Measurement (3rd ed., pp. 335-366). Washington, DC: American Council on Education. Mills, C.N., & Melican, G.J. (1987). A preliminary investigation of three compromise methods for establishing cut-off scores. (Report RR-87-14).
Princeton, NJ: Educational Testing Service. Mislevy, R.J. (1984). Estimating latent distributions. Psychometrika, 49, 359-381. Mislevy, R.J. (1986). Bayes modal estimation in item response models. Psychometrika, 51, 177-195. Mislevy, R.J., & Bock, R.D. (1986). PC-BILOG: Maximum likelihood item analysis and test scoring with logistic models for binary items. Mooresville: Scientific Software. Mislevy, R.J., & Wu, P.K. (1988). Inferring examinee ability when some item responses are missing. (Research Report RR-88-48-ONR). Princeton, NJ: Educational Testing Service. Mislevy, R.J., & Sheehan, K.M. (1989). The role of collateral information about examinees in item parameter estimation. Psychometrika, 54, 661-680. Moelands, A.H.J. (1988). Entreetoets: Basisvaardigheden taal, rekenen en informatieverwerking (Verantwoording). Arnhem: Cito. Mokken, R.J. (1971). A theory and procedure of scale analysis. The Hague: Mouton. Molenaar, I.W. (1981). Programmabeschrijving van PML (versie 3.1) voor het Raschmodel. (Heymans Bulletins Psychologische Instituten R.U. Groningen, nr. HB-81-538-RP). Groningen: Rijksuniversiteit Groningen. Molenaar, I.W. (1983). Item steps. (Heymans Bulletins Psychologische Instituten R.U. Groningen, nr. HB-83-630-EX). Groningen: Rijksuniversiteit Groningen. Molenaar, I.W., & Hoijtink, H. (1990). The many null-distributions of person fit indices. Psychometrika, 55, 75-106. Muskens, G.J. (1980). Frames of meaning - are they measurable? Doctoral dissertation, Katholieke Universiteit Nijmegen. Muthén, B. (1984). A general structural equation model with dichotomous, ordered categorical and continuous latent variable indicators. Psychometrika, 49, 115-132. Muthén, B. (1989). LISCOMP: Analysis of linear structural equations with a comprehensive measurement model. Mooresville: Scientific Software. Nedelsky, L. (1954). Absolute grading standards for objective tests. Educational and Psychological Measurement, 14, 3-19. Nederlands Instituut van Psychologen. (1988).
Richtlijnen voor ontwikkeling en gebruik van psychologische tests en studietoetsen. Amsterdam: Nederlands Instituut van Psychologen. Novick, M.R. (1966). The axioms and principal results of classical test theory. Journal of Mathematical Psychology, 3, 1-18. Oud, J.H.L., & Mommers (1988). Longitudinale computerondersteunende ondersteuning van lees- en spellingsmoeilijkheden: Een toepassing van het Kalmanfilter in de onderwijspraktijk. Tijdschrift voor Onderwijsresearch, 13, 31-50. Pennings, A.H. (1988). The development of strategies in embedded figure tasks. International Journal of Psychology, 23, 65-78. Pennings, A.H. (1991). Individual differences in the development of the restructuring ability in children. Doctoral dissertation, Rijksuniversiteit Utrecht. Petersen, N.S., Kolen, M.J., & Hoover, H.D. (1989). Scaling, norming, and equating. In: R.L. Linn (Ed.). Educational Measurement (3rd ed., pp. 221-262). Washington, DC: American Council on Education. Popping, R. (1983). Overeenstemmingsmaten voor nominale data. Doctoral dissertation, Rijksuniversiteit Groningen. Popping, R. (1989). AGREE: Computing agreement on nominal data, version 5. (User’s manual). Groningen: IEC ProGamma. Popping, R. (1992). Taxonomy on nominal scale agreement 1945 - 1990. Groningen: IEC ProGamma. Rao, C.R. (1948). Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Proceedings of the Cambridge Philosophical Society, 44, 50-57. Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research. Rasch, G. (1961). On the general laws and the meaning of measurement in psychology. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 321-333. Berkeley: University of California Press. Rasch, G. (1977). On specific objectivity: An attempt at formalizing the request for generality and validity of scientific statements.
Berkeley: University of California Press. Read, T.R.C., & Cressie, N.A.C. (1988). Goodness-of-fit statistics for discrete multivariate data. New York: Springer. Reckase, M.D., & McKinley, R.L. (1985). Some latent trait theory in a multidimensional latent space. In: D.J. Weiss (Ed.). Proceedings of the 1982 computerized adaptive testing conference (pp. 151-177). Minneapolis: University of Minnesota. Rigdon, S.E., & Tsutakawa, R.K. (1983). Parameter estimation in latent trait models. Psychometrika, 48, 567-574. Rigdon, S.E., & Tsutakawa, R.K. (1986). Estimation for the Rasch model when both ability and difficulty parameters are random. Journal of Educational Statistics, 12, 76-86. Roskam, E.E. (1982). Hypotheses non fingo, een methodologische gevalstudie over onderzoek van intelligentietests. Nederlands Tijdschrift voor de Psychologie, 37, 331-359. Rubin, D.B. (1976). Inference and missing data. Biometrika, 63, 581-592. Rubin, D.B. (1980). Using empirical Bayes techniques in law school validity studies. Journal of the American Statistical Association, 75, 801-816. Saal, F.E., Downey, R.G., & Lahey, M. (1980). Rating the ratings: Assessing the psychometric quality of rating data. Psychological Bulletin, 88, 413-428. Samejima, F. (1969). Estimation of latent ability using a pattern of graded scores. (Psychometric Monograph No. 17). Psychometric Society. Samejima, F. (1972). A general model for free response data. (Psychometric Monograph No. 18). Psychometric Society. Samejima, F. (1973). Homogeneous case of the continuous response model. Psychometrika, 38, 203-219. Samejima, F. (1977). Weakly parallel tests in latent trait theory with some criticisms of classical test theory. Psychometrika, 42, 193-198. Sanders, P.F., Hendrix, A.C., & Luijten, A.J.M. (1984). De beoordeling van de samenvatting Nederlands. Tijdschrift voor Taalbeheersing, 6, 241-251. Sanders, P.F., Theunissen, T.J.J.M., & Baas, S.M. (1989).
Minimizing the number of observations: A generalization of the Spearman-Brown formula. Psychometrika, 54, 587-598. Schouten, H.J.A. (1985). Statistical measurement of interobserver agreement: Analysis of agreement and disagreement between observers. Doctoral dissertation, Rijksuniversiteit Utrecht.