Validity from the Perspective of Model-Based Reasoning
Robert J. Mislevy
Measurement, Statistics and Evaluation
University of Maryland, College Park
Presented at the conference “The Concept of Validity: Revisions, New Directions and Applications,” University of Maryland, College Park, MD, October 9-10, 2008.
Supported by a grant from the Spencer Foundation.
October 10, 2008 Maryland Validity Conference Slide 1

Overview of the Talk
Sources of unease
Cognition in terms of patterns
Model-based reasoning
Measurement models as model-based reasoning
Implications for validity
Feeling better now

Sources of Unease (1)
Different models fit the same data
Tatsuoka (1983) mixed number subtraction
[Figure: example mixed-number subtraction items]

Sources of Unease (1)
Cognitive diagnosis model for instruction
» Student characterized by vector η of 0/1 variables, say, for which operations she had mastered
» Task characterized by which ones the task needed
» Probability of correct response via latent class model
[Figure: container metaphor, with Person B and Person D]
2PL IRT model for overall proficiency
» Student characterized by univariate, continuous θ, for proficiency in the domain
» Tasks modeled by difficulty & discrimination
» Probability of correct response via IRT model
[Figure: measurement metaphor, with Person A, Person B, Person D and Items 1 through 6]

Sources of Unease (2)
Summary test scores, and factors based on them, have often been thought of as “signs” indicating the presence of underlying, latent traits. … An alternative interpretation of test scores as samples of cognitive processes and contents … is equally justifiable and could be theoretically more useful.
Snow & Lohman, 1989, p. 317

Sources of Unease (2)
The evidence from cognitive psychology suggests that test performances are comprised of complex assemblies of component information-processing actions that are adapted to task requirements during performance.
Snow & Lohman, 1989, p. 317

Sources of Unease (2)
The implication is that sign-trait interpretations of test scores and their intercorrelations are superficial summaries at best. At worst, they have misled scientists, and the public, into thinking of fundamental, fixed entities, measured in amounts.
Snow & Lohman, 1989, p. 317

Sources of Unease (2)
Whatever their practical value as summaries, for selection, classification, certification, or program evaluation, the cognitive psychological view is that such interpretations no longer suffice as scientific explanations of aptitude and achievement constructs.
Snow & Lohman, 1989, p. 317

Sources of Unease (3)
What is the nature of parameters like θ and η? Where are they?
What is the interpretation of the probabilities that arise from IRT, latent class / cognitive diagnosis models, and the like?
What does this mean about validity of the data / the models / the uses of them?

Cognition in Terms of Patterns
The sociocognitive paradigm
Metaphors as foundation
Formal model-based reasoning

The sociocognitive paradigm
Converging ideas from cognitive psychology, neurology, anthropology, linguistics, science education, etc.
Knowledge as patterns, at many levels…
Assembled to understand, to interact with, and to create particular situations in the world
Developed, strengthened, modified by use
Associations of all kinds, including applicability, affordances, procedures, strategies, affect

Walter Kintsch’s CI Theory of Reading Comprehension
[Figure: Text, Text base, Context, LTM, Situation Model; Context1. Text shown: “More focused research areas within cognitive psychology today differ as to their foci, methods, and levels of explanation. They include perception and attention, language and communication, development of expertise, situated and sociocultural psychology, and neurological bases of cognition.”]
Kintsch is focusing here on “experiential” cognition: not conscious, occurring at the scale of milliseconds. We’ll talk about reflective cognition in a couple of minutes.

Walter Kintsch’s CI Theory of Reading Comprehension
[Figure: Text, Text base, Context, LTM, Situation Model, Action; Context1, Context2; same text shown]

Walter Kintsch’s CI Theory of Reading Comprehension
[Figure: Text, Text base, Context, LTM, Situation Model, Action; Context1, Context2; same text shown]

Walter Kintsch’s CI Theory of Reading Comprehension
[Figure: Text, Text base, Context, LTM, Situation Model, Action; Context2, Context3; same text shown as on the preceding slides]

Metaphors as foundation
Lakoff & Johnson
» Metaphors We Live By (1980); Philosophy in the Flesh (1999)
Key idea:
» Cognitive machinery builds from capabilities for interacting with the real physical and social world.
» We extend and creatively recombine basic patterns and relationships to think about everything from … everyday things to extremely complicated and abstract social, conceptual, philosophical realms
True of both experiential and reflective cognition.

Metaphors as foundation
Example: Containers
[Figure: clip-art illustration]

Metaphors as foundation
Example: Containers
Everyday experience
Set theory
» Very good, mostly.
Knowledge as collection of discrete things inside our heads
» Usually good and useful, in communication
» Sometimes inapt, as sole basis of instructional practice and assessment design (the Jeopardy model of cognition: Rosie Perez in White Men Can’t Jump)

Metaphors as foundation
Example: Cause & Effect

Metaphors as foundation
Example: Cause & Effect
Newton’s laws; kinematics; quantitative models of force and motion, esp. F=MA

Metaphors as foundation
Example: Cause & Effect
[Figure: θ, x_j]
IRT & SEM models; quantitative models for response probabilities, esp. Rasch’s P=θδ.

Metaphors as foundation
Example: Cause & Effect
Everyday experience
F=MA
» Very good, mostly.
Teleological theories of history, à la Hegel
» Not so good, mostly.

Model-Based Reasoning
[Figure: Representational Form A, y=ax+b; Representational Form B, (y-b)/a=x; “Mainly syntactic”; mappings among representational systems; entities and relationships; “Mainly semantic”; Real-World Situation; Reconceived Real-World Situation]

Properties of Models (1)
Human way to think about complex unique situations
Abstract structure of entities, relationships, processes
What’s included, what’s omitted
Levels of analysis and grainsize
» Newtonian and quantum mechanics
» Transmission genetics at level of species, individuals, cells, or molecules

Properties of Models (2)
Can apply different models to same situation
» Can view selling car to brother-in-law in terms of economic transaction model vs family relationships model
Models tuned to uses / problems / purposes
» Mixed number subtraction

Properties of Models (2)
The modeling cycle:
[Figure: modeling cycle with stages Observe, Model, Predict/Use, Evaluate, Revise]
» Fit?
» Does it work?
» What’s left out?
» Adequacy of rationale?
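
As a rough illustration of the modeling cycle, the sketch below walks once through observe, model, evaluate, and predict/use, borrowing the linear form y=ax+b from the Model-Based Reasoning slide as the model. The data values and the function names (`fit_line`, `predict_y`, `solve_x`, `residuals`) are hypothetical, chosen only for this example; nothing here comes from the talk itself.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares estimates of a and b in y = a*x + b."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = y_bar - a * x_bar
    return a, b

def predict_y(a, b, x):
    """Representational form A: y = a*x + b."""
    return a * x + b

def solve_x(a, b, y):
    """Representational form B: x = (y - b) / a."""
    return (y - b) / a

def residuals(a, b, xs, ys):
    """Observed minus predicted values, used for model criticism."""
    return [y - predict_y(a, b, x) for x, y in zip(xs, ys)]

# Observe: made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]

# Model: characterize the pattern with parameter values.
a, b = fit_line(xs, ys)

# Evaluate: does the model fit? Large residuals would suggest revising it.
res = residuals(a, b, xs, ys)
max_abs_residual = max(abs(r) for r in res)

# Predict/Use: reason through the model in either representational form.
y_new = predict_y(a, b, 5.0)
x_back = solve_x(a, b, y_new)
```

The two functions `predict_y` and `solve_x` express the same fitted relationship in two forms; which one is convenient depends on the question being asked of the model.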

Models with probabilistic layers
Probability from analogy with physical games of chance (Shafer)
Probability connects to model representation
» Key in model criticism
Model posits space for patterns; parameter values characterize them; probability models can characterize …
» Variation in patterns
» Modeler’s uncertainty about patterns & parameters

Psychometric / Measurement Models
E.g., IRT, CTT, FA, SEM, CDM
Model posits space for patterns, parameter values characterize them
Semantic layer is cause & effect metaphor
» Q: In what sense does θ “cause” X?
» A: The C&E metaphor grounds productive connection between observations and inferences
Modeling patterns across people, not explaining item responses (Snow & Lohman)
» Could model within-person processes at finer grainsize

Some answers
What is the nature of parameters like θ and η? Where are they?
» These are characterizations of patterns we observe in real-world situations (ones we in part construct for target uses) through the lens of a simplified model we are (provisionally) using to think about those situations and the use situations in which the patterns are apt to be relevant.
» So they are in our heads, but they aren’t worth much unless they reflect patterns in examinees’ actions in the world.

Some answers
What is the interpretation of the probabilities that arise from IRT, latent class / cognitive diagnosis models, and the like?
» These are characterizations of patterns we observe in situations and our degree of knowledge about them, again through the lens of a simplified model we are (provisionally) using to think about those situations.
» In addition to guiding inference through the model, they provide tools for seeing where the model may be misleading, inadequate.
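
To make the two kinds of response probabilities concrete, here is a minimal sketch of how each model family might produce a probability of a correct response. The 2PL function uses the standard logistic form. The latent class function is only one possible choice: a conjunctive rule with slip and guess parameters (similar in spirit to a DINA model), which is an assumption of this example and not something the talk specifies. All function names, parameter names, and numeric values are hypothetical.

```python
import math

def p_correct_2pl(theta, discrimination, difficulty):
    """2PL IRT: P(correct) = 1 / (1 + exp(-a * (theta - b))),
    with a = discrimination and b = difficulty."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

def p_correct_latent_class(eta, required, slip=0.1, guess=0.2):
    """Conjunctive latent class sketch (assumed form, not from the talk).

    eta      -- list of 0/1 mastery indicators, one per operation
    required -- indices of the operations the task needs
    A student who has mastered every required operation answers correctly
    with probability 1 - slip; any other student with probability guess.
    """
    has_all = all(eta[k] == 1 for k in required)
    return 1.0 - slip if has_all else guess

# One student viewed through each model (illustrative values only).
p_irt = p_correct_2pl(theta=1.0, discrimination=1.0, difficulty=0.0)
eta = [1, 1, 0]                                   # has operations 0 and 1
p_lc_has = p_correct_latent_class(eta, [0, 1])    # task needs 0 and 1
p_lc_lacks = p_correct_latent_class(eta, [0, 2])  # task needs 0 and 2
```

In both cases the parameter values characterize a pattern through the lens of the chosen model; the same response data could be summarized by θ under one model or by η under the other.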

Some answers
What does this mean about validity of the data / the models / the uses of them?

Validity Evidence
[Figure: the model-based reasoning diagram (Representational Form A, y=ax+b; Representational Form B, (y-b)/a=x; mappings among representational systems; entities and relationships; Real-World Situation; Reconceived Real-World Situation), annotated with:]
» Theory and experience supporting the narrative/scientific frame
» Theoretical and empirical grounding of task-scoring procedures
» Theoretical and empirical grounding of task design
» Empirical evaluation of predictions / outcomes

Validity Implications, Sense 1
The currently dominant view:
Validity is an integrated evaluative judgement of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment. (Messick, 1989)
Focus on situated use of data from test
Consistent with MBR perspective; i.e., reasoning through psychometric model in particular situations & inferences.

Validity Implications, Sense 2
Alternative (e.g., Wiley, Borsboom, Lissitz):
[A] test is valid for measuring an attribute if and only if (a) the attribute exists and (b) variations in the attribute causally produce variations in the outcomes of the measurement procedure. (Borsboom et al., 2004)
MBR view can omit specific uses, but
» must consider range of situations and uses that are apt to be thought about effectively via the model.
» Broader range consistent with scientific program, in opposition to Snow & Lohman quote.
» Is realist but strong correspondence to existence of traits qua traits in individuals is not required.

I am Feeling Better Now
Model-based reasoning provides a way of thinking about validity that …
» is consistent with the practical methods that have developed to assure quality of inferences from assessments
» is realist, in constructive-realism and L&J’s “embodied realism” sense
» is consistent with developments in cognitive psychology, including the nature of scientific reasoning, and the meaning of probability.