Validity from the Perspective of Model-Based Reasoning Robert J. Mislevy

Validity from the Perspective of
Model-Based Reasoning
Robert J. Mislevy
Measurement, Statistics and Evaluation
University of Maryland, College Park
Presented at the conference “The Concept of Validity: Revisions, New
Directions and Applications,” University of Maryland, College Park, MD
October 9-10, 2008.
Supported by a grant from the Spencer Foundation.
October 10, 2008
Maryland Validity Conference
Slide 1
Overview of the Talk

Sources of unease

Cognition in terms of patterns

Model-based reasoning

Measurement models as model-based
reasoning

Implications for validity

Feeling better now
Slide 2
Sources of Unease (1)
Different models fit the same data
Tatsuoka (1983) mixed number subtraction
[Figure: example mixed number subtraction items]
Slide 3
Sources of Unease (1)
Cognitive diagnosis model for instruction
 Student characterized by vector of 0/1 variables, say η, for which operations she had mastered
 Task characterized by which ones the task needed
 Probability of correct response via latent class model
[Figure: container metaphor, with Person B and Person D]
2PL IRT model for overall proficiency
 Student characterized by univariate, continuous θ, for proficiency in the domain
 Tasks modeled by difficulty & discrimination
 Probability of correct response via IRT model
[Figure: measurement metaphor, with Person A, Person B, Person D and Items 1 to 6]
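The two models on this slide can be sketched side by side. This is a minimal illustration, not the specification used in the talk. The 2PL response function is the standard logistic form. The latent class model is written here as a DINA-style rule with slip and guess parameters, which is an assumption: the slide says only that the probability comes from a latent class model. Function names and parameter values are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """2PL IRT: probability of a correct response.

    theta: continuous proficiency; a: discrimination; b: difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def p_latent_class(mastered, required, slip=0.1, guess=0.2):
    """DINA-style latent class rule (assumed form, for illustration only).

    mastered: 0/1 vector of operations the student has mastered.
    required: 0/1 vector of operations the task needs.
    A student with every required operation succeeds unless she slips;
    any other student succeeds only by guessing.
    """
    has_all = all(m >= r for m, r in zip(mastered, required))
    return 1.0 - slip if has_all else guess


# Same student, same task, described through two different lenses.
p_irt = p_2pl(theta=0.5, a=1.2, b=0.0)
p_cdm = p_latent_class(mastered=[1, 1, 0], required=[1, 1, 0])
```

Both functions return a probability of a correct response, but they characterize the student differently: one by a point on a continuum, the other by a pattern of mastered operations.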
Slide 4
Sources of Unease (2)
Summary test scores, and factors based
on them, have often been thought of as
“signs” indicating the presence of
underlying, latent traits. …
An alternative interpretation of test
scores as samples of cognitive
processes and contents … is equally
justifiable and could be theoretically
more useful.
Snow & Lohman, 1989, p. 317
Slide 5
Sources of Unease (2)
The evidence from cognitive psychology
suggests that test performances are
comprised of complex assemblies of
component information-processing
actions that are adapted to task
requirements during performance.
Snow & Lohman, 1989, p. 317
Slide 6
Sources of Unease (2)
The implication is that sign-trait
interpretations of test scores and their
intercorrelations are superficial
summaries at best. At worst, they
have misled scientists, and the public,
into thinking of fundamental, fixed
entities, measured in amounts.
Snow & Lohman, 1989, p. 317
Slide 7
Sources of Unease (2)
Whatever their practical value as
summaries, for selection,
classification, certification, or program
evaluation, the cognitive
psychological view is that such
interpretations no longer suffice as
scientific explanations of aptitude and
achievement constructs.
Snow & Lohman, 1989, p. 317
Slide 8
Sources of Unease (3)

What is the nature of parameters like θ and
η? Where are they?

What is the interpretation of the probabilities
that arise from IRT, latent class / cognitive
diagnosis models, and the like?

What does this mean about validity of the
data / the models / the uses of them?
Slide 9
Cognition in Terms of Patterns

The sociocognitive paradigm

Metaphors as foundation

Formal model-based reasoning
Slide 10
The sociocognitive paradigm


Converging ideas from cog psych, neurology,
anthropology, linguistics, science ed, etc.
Knowledge as patterns, at many levels…

Assembled to understand, to interact with, and to
create particular situations in the world

Developed, strengthened, modified by use

Associations of all kinds, including applicability,
affordances, procedures, strategies, affect
Slide 11
Walter Kintsch’s CI Theory of
Reading Comprehension
[Diagram: Text, Text base, Context, LTM, Situation Model, Context1]
More focused research areas within cognitive psychology today differ as to their foci, methods, and levels of explanation. They include perception and attention, language and communication, development of expertise, situated and sociocultural psychology, and neurological bases of cognition.
Kintsch is focusing here on “experiential” cognition: not conscious, occurring at the scale of milliseconds.
We’ll talk about reflective cognition in a couple of minutes.
Slide 12
Walter Kintsch’s CI Theory of
Reading Comprehension
[Diagram: Text, Text base, Context, LTM, Situation Model, Action, Context1, Context2]
Slide 13
Walter Kintsch’s CI Theory of
Reading Comprehension
[Diagram: Text, Text base, Context, LTM, Situation Model, Action, Context1, Context2]
Slide 14
Walter Kintsch’s CI Theory of
Reading Comprehension
[Diagram: Text, Text base, Context, LTM, Situation Model, Action, Context2, Context3]
Slide 15
Metaphors as foundation
Lakoff & Johnson
» Metaphors we live by (1980); Philosophy in the flesh (1999)
Key idea:
» Cognitive machinery builds from capabilities for interacting
with the real physical and social world.
» We extend and creatively recombine basic patterns and
relationships to think about everything from …
everyday things
to
extremely complicated and abstract social, conceptual,
philosophical realms
True of both experiential and reflective cognition.
Slide 16
Metaphors as foundation
Example: Containers
Slide 17
Metaphors as foundation
Example: Containers

Everyday experience → Set theory
» Very good, mostly.

Knowledge as collection of discrete things
inside our heads
» Usually good and useful, in communication
» Sometimes inapt as the sole basis of instructional
practice and assessment design (the Jeopardy
model of cognition: Rosie Perez in White Men
Can’t Jump)
Slide 18
Metaphors as foundation
Example: Cause & Effect
Slide 19
Metaphors as foundation
Example: Cause & Effect
Newton’s laws; kinematics; quantitative models of
force and motion, esp. F=MA
Slide 20
Metaphors as foundation
Example: Cause & Effect
[Diagram: θ → xj]
IRT & SEM models; quantitative models for
response probabilities, esp. Rasch’s P=θδ.
Slide 21
Metaphors as foundation
Example: Cause & Effect

Everyday experience → F=MA
» Very good, mostly.

Teleological theories of history,
a la Hegel
» Not so good, mostly.
Slide 22
Model-Based Reasoning
[Diagram labels]
Representational Form A: y=ax+b
Representational Form B: (y-b)/a=x
Mappings among representational systems
Mainly syntactic
Entities and relationships
Mainly semantic
Real-World Situation
Reconceived Real-World Situation
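The move between Form A and Form B is a syntactic one: both forms express the same relationship among the same entities, and each can be derived from the other. A minimal sketch, with hypothetical function names and arbitrary illustrative values for a, b, and x:

```python
def form_a(x, a, b):
    """Representational Form A: y = a*x + b."""
    return a * x + b


def form_b(y, a, b):
    """Representational Form B: x = (y - b) / a, defined only when a != 0."""
    if a == 0:
        raise ValueError("a must be nonzero to solve for x")
    return (y - b) / a


# Arbitrary values chosen only to show the round trip.
a, b = 2.0, 1.0
x = 3.0
y = form_a(x, a, b)        # reason forward from x to y
x_back = form_b(y, a, b)   # reason backward from y to x
```

Neither function says anything about what x and y stand for in the world. That semantic mapping, from the real-world situation to entities and relationships and back, is where the model has to earn its keep.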
Slide 23
Properties of Models (1)




Human way to think about complex unique
situations
Abstract structure of entities, relationships,
processes
What’s included, what’s omitted
Levels of analysis and grainsize
» Newtonian and quantum mechanics
» Transmission genetics at level of species,
individuals, cells, or molecules
Slide 24
Properties of Models (2)

Can apply different models to same situation
» Can view selling car to brother-in-law in terms of
economic transaction model vs family
relationships model

Models tuned to uses / problems / purposes
» Mixed number subtraction
Slide 25
Properties of Models (2)
The modeling cycle:
[Diagram of the modeling cycle: Observe, Model, Fit?, Predict/Use, Evaluate, Revise]
Does it work?
What’s left out?
Adequacy of rationale?
Slide 26
Models with probabilistic layers

Probability from analogy with physical games
of chance (Shafer)

Probability connects to model representation
» Key in model criticism

Model posits space for patterns; parameter
values characterize them; probability models
can characterize …
» Variation in patterns
» Modeler’s uncertainty about patterns &
parameters
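The second use of probability, the modeler's uncertainty about parameters, can be illustrated with a small grid calculation. This is a sketch under stated assumptions rather than a method taken from the talk: it uses a 2PL response function, a flat prior over a coarse grid of θ values, and made-up item parameters and responses. All names are hypothetical.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed response model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def posterior_theta(responses, items, grid):
    """Posterior weights for theta on a grid, assuming a flat prior.

    responses: list of 0/1 item scores.
    items: list of (a, b) pairs, discrimination and difficulty.
    grid: candidate theta values.
    """
    weights = []
    for theta in grid:
        likelihood = 1.0
        for x, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            likelihood *= p if x == 1 else 1.0 - p
        weights.append(likelihood)
    total = sum(weights)
    return [w / total for w in weights]


# Made-up values for illustration only.
grid = [g / 2.0 for g in range(-6, 7)]           # -3.0, -2.5, ..., 3.0
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0)]    # (a, b) per item
post = posterior_theta([1, 1, 0], items, grid)
```

The list `post` describes what the modeler knows about θ after three responses, viewed through this particular model. It is a statement about the modeler's knowledge, not a direct description of something located inside the examinee.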
Slide 27
Psychometric / Measurement Models



E.g., IRT, CTT, FA, SEM, CDM
Model posits space for patterns, parameter
values characterize them
Semantic layer is cause & effect metaphor
» Q: In what sense does θ “cause” X?
» A: The C&E metaphor grounds productive
connection between observations and inferences

Modeling patterns across people, not
explaining item responses (Snow & Lohman)
» Could model within-person processes at finer
grainsize
Slide 28
Some answers

What is the nature of parameters like θ and
η? Where are they?
» These are characterizations of patterns we
observe in real-world situations (ones we in part
construct for target uses) through the lens of a
simplified model we are (provisionally) using to
think about those situations and the use situations
in which the patterns are apt to be relevant.
» So they are in our heads, but they aren’t worth
much unless they reflect patterns in examinees’
actions in the world.
Slide 29
Some answers

What is the interpretation of the probabilities
that arise from IRT, latent class / cognitive
diagnosis models, and the like?
» These are characterizations of patterns we
observe in situations and our degree of knowledge
about them, again through the lens of a simplified
model we are (provisionally) using to think about
those situations.
» In addition to guiding inference through the model,
they provide tools for seeing where the model may
be misleading, inadequate.
Slide 30
Some answers

What does this mean about validity of
the data / the models / the uses of
them?
Slide 31
Validity Evidence
[Diagram labels]
Representational Form A: y=ax+b
Representational Form B: (y-b)/a=x
Mappings among representational systems
Entities and relationships
Real-World Situation
Reconceived Real-World Situation
Validity evidence:
» Theory and experience supporting the narrative/scientific frame
» Theoretical and empirical grounding of task-scoring procedures
» Theoretical and empirical grounding of task design
» Empirical evaluation of predictions / outcomes
Slide 32
Validity Implications, Sense 1

The currently dominant view:
Validity is an integrated evaluative judgement of the
degree to which empirical evidence and theoretical
rationales support the adequacy and appropriateness
of inferences and actions based on test scores or
other modes of assessment. (Messick, 1989)


Focus on situated use of data from test
Consistent with MBR perspective; i.e.,
reasoning through psychometric model in
particular situations & inferences.
Slide 33
Validity Implications, Sense 2

Alternative (e.g., Wiley, Borsboom, Lissitz):
[A] test is valid for measuring an attribute if and only if
(a) the attribute exists and (b) variations in the attribute
causally produce variations in the outcomes of the
measurement procedure. (Borsboom et al., 2004)

MBR view can omit specific uses, but
» must consider range of situations and uses that are
apt to be thought about effectively via the model.
» Broader range consistent with scientific program, in
opposition to Snow & Lohman quote.
» Is realist but strong correspondence to existence of
traits qua traits in individuals is not required.
Slide 34
I am Feeling Better Now
Model-based reasoning provides a way of
thinking about validity that …

is consistent with the practical methods that
have developed to assure quality of inferences
from assessments

is realist, in constructive-realism and L&J’s
“embodied realism” sense

is consistent with developments in cognitive
psychology, including the nature of scientific
reasoning, and the meaning of probability.
Slide 35