If Language is a Complex Adaptive System, What is Language Assessment?
Robert J. Mislevy, University of Maryland
Chengbin Yin, Center for Applied Linguistics
Presented at “Language as a Complex Adaptive System”, an invited conference celebrating the 60th Anniversary of Language Learning, at the University of Michigan, Ann Arbor, MI, November 7-9, 2008.
The first author was supported by a grant from the Spencer Foundation.

(1) Key Ideas
Assessment as evidentiary argument, not simply as measurement.
Arguments constructed around …
  » View of the nature of proficiency.
  » Situations and ways people acquire it and use it.
Relevant work taking place in language testing from an interactionist perspective.
Reconceiving measurement models

(2) The Assessment Argument (Messick, 1994)
What complex of knowledge, skills, or other attributes should be assessed?
What behaviors or performances should reveal those constructs?
What tasks or situations should elicit those behaviors?
We’ll look at a more technical representation in a little while.

(3) Snow & Lohman, 1989
Summary test scores, and factors based on them, have often been thought of as “signs” indicating the presence of underlying, latent traits. [θ] … An alternative interpretation of test scores as samples of cognitive processes and contents … is equally justifiable and could be theoretically more useful.

(4) LaCAS and Assessment Arguments
Interactionalist perspective in language testing:
  » Communicative competence
  » Contextual features of tasks
  » Language tests for specific purposes

(5) An Interactionalist Perspective (Young, 2000, 2008)
… language used in specific discursive practices rather than … language ability independent of context.
Focus on the co-construction of discursive practices by all participants ...
A set of general interactional resources that participants draw upon in specific ways in order to co-construct a discursive practice.

(6) An Interactionalist Perspective (Young, 2000, 2008)
Relationship between participants’ employment of interactional resources and the context in which they are employed.
Varying with the practice and the participants…

(7) Challenges for Assessment (Chalhoub-Deville, 2003)
Amending the construct of individual ability to accommodate [how] language use in a communicative event reflects dynamic discourse, which is co-constructed among participants; and …
reconciling [the notion that language ability is local] with the need for assessments to yield scores to generalize across contextual boundaries.

(8) Sociocognitive Foundations
Themes from, e.g., cognitive psychology, literacy, neuroscience, anthropology:
  » Connectionist metaphor, associative memory
  Situated cognition & information processing
  » Construction-Integration (CI) theory of comprehension (Kintsch and others)
  Individual Sociocultural perspectives
  » A cognitive theory of cultural meaning (Strauss & Quinn, 1997)

(9) A Cognitive Theory of Cultural Meaning
“Interactional Resources”
Extrapersonal:
  » Cultural models: What ‘being sick’ means, restaurant script, Newton’s laws, complaints
  » Linguistic: Grammar, conventions, constructions
Intrapersonal:
  » Patterns from experience at many levels
  » Schemas / frames / understandings / assumptions
Interplay
  Situated understandings
  » Access to, and ways of interacting with, shared structures in order to accomplish goals
The user’s knowledge of the language rules is interlocked with his knowledge of when, where, and with whom to use them. (R. Ellis, 1985)

(10)
[Diagram: two persons, A and B. Inside A: not observable. A and B: observable. Inside B: not observable.]

(11) and internal and external aspects of context …
[Diagram: Inside A, A, Inside B, B, Context]
A la Kintsch: Propositional content of text / speech…

(12)
[Diagram: Inside A, A, Inside B, B, Context]
The C in CI theory, Construction:
Activation of both relevant and irrelevant bits of cultural models, experiences, e.g.,
  • Restaurant script, Human motivation
Guided in part by linguistic models, e.g.
  • Conventions, constructions, rhetorical frames
E.g., tasks in Occupational English Test (OET; McNamara, 1996) call upon patterns re language, but also genre, medical knowledge, use of information in clinical settings.
  • If a pattern hasn’t been developed in past experience, it can’t be activated (although it may get constructed in the interaction).
  • A relevant pattern from LTM may be activated in some contexts but not others (e.g., physics models; question formation (Tarone)).
Context: Content of utterance; History with interlocutor; Conversation thus far

(13)
[Diagram: Inside A, A, Inside B, B, Context]
The I in CI theory, Integration:
  • Situation model: synthesis of coherent / reinforced activated cultural / linguistic / situational patterns
  • Situation model is basis of understanding

(14)
[Diagram: Inside A, A, Inside B, B, Context]
Situation model is also the basis of planning and action.

(15)
[Diagram: Inside A, A, Inside B, B, and a sequence of Contexts]
Previous situation models are input to subsequent situation models.

(16)
[Diagram as in (15)]
Ideally, activation of relevant and compatible cultural & linguistic models…

(17)
[Diagram as in (15)]
to lead to (sufficiently) shared understanding; i.e., co-constructed meaning.
Kramsch’s "shared internal context"

(18)
[Diagram as in (15)]
Comments about context…
External / public aspects of context, e.g.,
  • Setting
  • Physical attributes
Can distinguish external and internal aspects of context (e.g., Douglas, 2000)
Re assessment,
  Target language use (TLU) features
  Task features

(19)
[Diagram as in (15)]
Aspects of cultural/linguistic/interaction context as interpreted by an external observer.
Used to determine what actions signal recognition, comprehension, action through targeted cultural / linguistic models.
As such, in assessment, plays role in
  • Evaluation, hence
  • Observable variables

(20) What can we say about individuals?
Use of resources in appropriate contexts in appropriate ways; i.e.,
Attunement to cultural/linguistic patterns:
  Recognize markers of externally-viewed patterns?
  Construct internal meanings in their light?
  Act in ways appropriate to targeted cultural/linguistic models?
What is the range and circumstances of activation? (variation of performance across contexts)

(21) The Assessment Argument (Messick, 1994)
What complex of knowledge, skills, or other attributes should be assessed?
What behaviors or performances should reveal those constructs?
What tasks or situations should elicit those behaviors?

(22) Toulmin’s Argument Structure
[Diagram: Data, so Claim, since Warrant, which rests on Backing, unless Alternative explanation.]

(23)
[Diagram: the assessment argument in Toulmin’s form.
  Claim about student
    unless Alternative explanations
    since Warrant concerning assessment, on account of Backing concerning assessment situation
    so, from:
      Data concerning student performance (since Warrant concerning evaluation)
      Data concerning task situation (since Warrant concerning task design)
      Other information concerning student vis a vis assessment situation
  Student acting in assessment situation (source of the data on performance and task situation)]

(24)
[Diagram as in (23), with added note:]
In interactive task,
  • performance flows in time,
  • performance changes situation,
  • may or may not be series of task and observable variables

(25)
[Diagram as in (23), with added note:]
Concerns features of (possibly evolving) context as seen from the view of the assessor; in particular, those seen as relevant to targets of inference.

(26)
[Diagram as in (23), with added note:]
Evaluation of performance concerns context features indirectly: clues that suggest attunement to features of cultural / linguistic models of interest. (did examinee recognize, comprehend, act accordingly?)

(27)
[Diagram as in (23), with added notes:]
“Hidden” aspects of context, not in test theory model but essential to argument: What attunements to cultural / linguistic models can be presumed among examinees, to condition inference re targeted l/c models?
Fundamental to situated meaning of student variables in measurement models; Both critical and implicit.

(28)
[Diagram as in (23), with added notes:]
Assessment context always has its own features that activate some cultural / linguistic models and suppress others in different ways for different examinees. (i.e., “method effects”)
Important for …
  • Alternative explanations
  • Variable performance

(29)
[Diagram as in (23), now labeled Design Argument.]

(30)
[Diagram: a Use Argument (Bachman) placed above the Design Argument.
  Claim about student in use situation
    unless Alternative explanations
    since Warrant concerning use situation, on account of Backing concerning use situation
    so, from:
      Claim about student (the claim of the Design Argument)
      Data concerning use situation
      Other information concerning student vis a vis use situation
  Design Argument: as in (23).]

(31)
[Diagram as in (30).]

(32)
[Diagram as in (30).]

(33)
[Diagram as in (30).]

(34)
[Diagram as in (30), with added notes:]
This is the essence of warrant for claim in use argument.
Shared backing for test and use arguments grounds warrant for presumed appropriate activation in TLU context.
What features do tasks and TLUs share?
  • Implicit in trait arguments
  • Explicit in interactionalist arguments

(35)
[Diagram as in (30), with added notes:]
Knowing about target examinees and TLUs is key to strong inferences (Douglas, 1998)
Questions of validity / generalizability:
  • TLU features that call for other cultural / linguistic models that weren’t in task and may or may not be in examinee’s resources.
  • Target models not activated in LTM in TLU context.
What features do tasks and TLUs not have in common?

(36) Implications for measurement models
Basic form: $\mathrm{Prob}(X_{ij} \mid \theta_i, \beta_j)$
Probability of aspects of performance $X_{ij}$ given parameters for person $i$ and situation $j$ (all could be vector-valued)
  • Way too simple
  • No explicit connection with CI comprehension model, interaction processes, etc.
  • Apparent separation of person and situation characteristics
These are indeed properties of the conventional meaning of the measurement model and parameters.

(37) An Interactionalist Perspective: Instantiation in a Context
Xs result from particular persons calling upon resources in particular contexts (or not, or how)
Mechanically, θs simply accumulate info across situations
Our choosing situations and what to observe drives the situated meaning of θs.
Situated meanings of θs are tendencies toward these actions in these situations that call for certain interactional resources, via l/c models.

(38) How to model inconsistent performance?
Traditional: Model as “noise” / unreliability
Promising direction: Model individual’s degree or pattern of variation in terms of context features
If “motivated”: Model in terms of θs
  » Divide & Conquer: Multiple unidimensional tests (OET)
  » Exploratory multidimensional models
  » Controlled: Structured multidimensional models
  » Critical importance of what else you know

(39) Conclusion
How much can testing gain from modern cognitive psychology? So long as testing is viewed as something that takes place in a few hours, out of the context of instruction, and for the purpose of predicting a vaguely stated criterion, then the gains to be made are minimal.
Buzz Hunt (1986)

(40) Conclusion
I have argued that we need to capitalize on [method effects] by designing tests for specific populations -- tests that contain instructions, content, genre, and language directed toward that population. The goal is to produce tests … that would provide information interpretable as evidence of communicative competence in context.
Douglas (1998)

(41) Conclusion
Interactionalist view of test theory… for arguments in interactionalist view of language to assemble, analyze, & interpret assessments in light of context and purpose.
Methods and exemplars needed, but pressing need is interpretive frame …
  » To connect view of language proficiency with the machinery of test theory,
  » Toward modeling students’ (inter)actions in purposeful variations in task contexts.

(42) Thank you!

(43)
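
The basic form on the measurement-models slide, the probability of a performance given a person parameter and a situation parameter, can be made concrete with a small sketch. The slides do not name a specific model; the Rasch-type logistic form below is one common choice and is used here only as an assumption. The function names, the feature names, the weights, and all numbers are hypothetical, and building the situation parameter from coded context features is likewise an illustrative assumption, in the spirit of the "structured" models mentioned under inconsistent performance.

```python
import math
from typing import Mapping


def prob_correct(theta: float, beta: float) -> float:
    """Rasch-type instance of Prob(X_ij = 1 | theta_i, beta_j) (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def situation_difficulty(features: Mapping[str, int],
                         weights: Mapping[str, float]) -> float:
    """Hypothetical: situation parameter as a weighted sum of context features."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical feature weights; the values are illustrative only.
weights = {"clinical_genre": 0.8, "unfamiliar_interlocutor": 0.5}

# Two task situations coded by which context features are present (1) or absent (0).
task_a = {"clinical_genre": 1, "unfamiliar_interlocutor": 0}
task_b = {"clinical_genre": 1, "unfamiliar_interlocutor": 1}

theta = 1.0  # hypothetical person parameter
p_a = prob_correct(theta, situation_difficulty(task_a, weights))
p_b = prob_correct(theta, situation_difficulty(task_b, weights))

print(f"task A: {p_a:.3f}")
print(f"task B: {p_b:.3f}")
```

In this sketch the same person parameter yields different probabilities in the two situations only because the coded context features differ. It still keeps person and situation parameters apart, which is the "apparent separation" the slide flags as too simple.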