Running Head: ACCURACY OF EXPERT SELF-REPORT

The Role of Automaticity in Experimental Design and Analysis: A Study of Novice-Expert Differences in the Accuracy of Self-Report

David F. Feldon
Rossier School of Education
University of Southern California

Dissertation Proposal

The Role of Automaticity in Experimental Design and Analysis: A Study of Novice-Expert Differences in the Accuracy of Self-Report

"There are a number of assumptions that a skilled researcher uses when doing research. Often, they can't even articulate what they are, but they practice them. The [expert researcher] model requires a long process of acculturation, an in-depth knowledge of the discipline, awareness of important scholars working in particular areas, participation in a system of informal scholarly communication, and a view of research as a non-sequential, nonlinear process with a large degree of ambiguity and serendipity. The expert researcher is relatively independent, and has developed his or her own personal [research] strategies" (Leckie, 1996, p. 202).

This view of experimental research reflects several critical concerns in the preparation of Ph.D. students. One of the most challenging aspects of graduate education in the social sciences is the teaching of research skills (Labaree, 2003; Schoenfeld, 1999). While there are many instructional texts on the process of experimental research (e.g., Gall, Borg, & Gall, 1996; McBurney, 1998; Pedhazur & Schmelkin, 1991) and an emphasis on personal advisement and cognitive apprenticeship in the advanced stages of graduate study (Golde & Dore, 2001), concern about the quality of the research skills that students develop through their doctoral programs has been increasing (Adams & White, 1994; Holbrook, 2002).

The development of social science research skills at the graduate level often begins with course-based instruction, which has been found to yield highly variable levels of skill mastery and self-efficacy (Onwuegbuzie, Slate, Paterson, Watson, & Schwartz, 2000). The content of these courses is usually presented through assigned readings and lectures by an instructor. In each case, the strategies for thinking about the research process are ultimately dependent on the reflections of a researcher describing his own practice—from the instructor directly, through the assigned readings, or a combination thereof. Likewise, in the case of cognitive apprenticeships, the mentor's role is to model and explain his or her own approach to problems considered together with the student. In their studies of cognitive apprenticeships, Radziszewska and Rogoff (1988, 1991) report that successful learning by the student depends on accurate, comprehensible explanations of strategies by the mentor and on the opportunity to participate in decisions during authentic tasks. As the decision-making and strategic components of the research process are entirely cognitive, direct observation by the student is severely limited. Thus, at every stage of training in experimental research, the student is dependent on the self-report of (ideally) an expert in the field. However, the accuracy of self-report under many conditions is considered highly suspect (e.g., Minsky, 1981; Nisbett & Wilson, 1977; Schneider & Shiffrin, 1977; Wilson & Nisbett, 1978).
Consequently, a deeper, more objective understanding of cognitive research skills must be developed to assist students in their transition to skilled researchers. Ultimately, to better scaffold the skill acquisition of developing researchers, an accurate understanding and representation of expert strategies must emerge—specifically, the conceptualization of problems for inquiry, the formulation of experimental designs, and the analytical skills utilized in the interpretation of results (Hmelo-Silver, Nagarajan, & Day, 2002; McGuire, 1997).

Purpose of the Study

The purpose of this study is threefold:

1. To accurately identify cognitive skills and strategies that research experts use in experimental research and differentiate them from novice techniques.
2. To evaluate experts' abilities to accurately report their own problem-solving processes.
3. To demonstrate that automaticity is a fundamental characteristic of expert performance in this domain.

Review of the Literature

In order to meaningfully explore the problems posed above, it is first necessary to ground several implicit assumptions in the foundation of prior empirical research. Specifically, these include the assertions that (a) scientific research skills are acquirable and represent predominantly crystallized intelligence, (b) expertise in scientific research is distinguishable from lower levels of skill, both in an individual's ability to successfully solve a problem and in a qualitative distinction between expert and novice strategies, and (c) skills that are practiced extensively automate, such that the mental effort necessary to perform a procedure is minimal.

The support for each assumption will emerge from reviews of the empirical evidence in the following categories: scientific problem solving, expertise, automaticity of procedural knowledge, and accuracy in self-report. Because the arguments draw upon findings from a variety of overlapping research agendas with different theoretical frameworks, multiple terms will sometimes be used to describe a single construct, and the use of identical terms will sometimes be differentiated to clarify important distinctions that exist between researchers. Most notably, the term "domain-general skills" has been used to describe both heuristic procedures applicable in multiple domains that have been acquired through experience (see Anderson, 1987) and the application of fluid intelligence to novel situations (see Perkins & Grotzer, 1997). To avoid confusion, domain-general skills will be considered heuristic procedures that can be used independent of knowledge of a particular subject to solve a problem. In contrast, domain-specific skills are those that actively utilize acquired knowledge of a particular domain to generate solutions specific to a problem.

Scientific Problem Solving

In the study of problem solving, the search for solutions is often referred to as the navigation of a problem space, in which the initial state and the goal state are the starting and ending points, and the space is composed of all possible routes to move from the former to the latter (Newell & Simon, 1972). However, because scientific problem solving requires attention to both a hypothesis or research question and an experimental design that controls sources of variance, Klahr and Dunbar (1988) argue that there are in fact two adjacent, mutually influential problem spaces that must be searched.[1] After the selection of a particular experimental design and the generation of data, the scientist must evaluate the progress that has been made in the hypothesis space toward the goal of null hypothesis rejection. Thus, each successful navigation of the experiment space results in incremental progress within the hypothesis space.

[1] Thagard (1998) argued for the existence of a third problem search space for the selection of instruments used within experiments, given the key role that advances in instrumentation played in the understanding of bacteria's role in ulcer formation. However, it can be argued that instrument selection is encompassed in the attainment of sub-goals within the experimental design problem space. Further, Schunn and Klahr (1996) present criteria for instances when it might be appropriate to go beyond a two-space model of scientific problem solving: (a) additional spaces should involve search of different goals and entities; (b) the spaces should differ empirically from one another; and (c) spaces should be representable in a computational model that considers each space as distinct. In the current study, not only do none of these criteria apply, but the ability to modify instrumentation lies beyond the capabilities of its apparatus.
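To make the two-space account concrete, the following minimal sketch (in Python) treats discovery as coupled searches: candidate hypotheses constitute the hypothesis space, candidate experimental designs the experiment space, and each observed outcome prunes the hypotheses it contradicts. All names and the toy domain are illustrative assumptions of this sketch, not elements of Klahr and Dunbar's (1988) model.

    # Illustrative sketch of dual-space search; names and domain are assumed.
    def dual_space_search(hypotheses, candidate_experiments, run_experiment):
        """Search the experiment space to make progress in the hypothesis space."""
        viable = dict(hypotheses)  # name -> predictive rule (the hypothesis space)
        for design in candidate_experiments:  # search of the experiment space
            if len(viable) <= 1:
                break  # the hypothesis space has converged
            outcome = run_experiment(design)  # generate data
            # Incremental progress in the hypothesis space: keep only the
            # hypotheses whose predictions match the observed outcome.
            viable = {name: rule for name, rule in viable.items()
                      if rule(design) == outcome}
        return viable

    # Toy usage: which rule governs a device's unknown function?
    hypotheses = {"doubles input": lambda n: 2 * n, "adds two": lambda n: n + 2}
    print(dual_space_search(hypotheses, [1, 2, 3], lambda n: 2 * n))

Each pass through the loop corresponds to one traversal of the experiment space; the shrinking dictionary of viable hypotheses records the incremental progress made in the hypothesis space.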
Given the vast number of possible steps within each problem space and the exponential increase in the size of the search space for each additional step, Klahr and Simon (2001) note that "much of the training in scientists is aimed at increasing the degree of well-definedness of problems in their domain" (p. 76). In their study, for example, subjects were provided with programmable "rocket ships" that included a button that produced an unknown function. Their task was to determine the function of this button through experimentation. As the participants engaged in the activity, they simultaneously worked to design experiments that would isolate the button's role and to generate hypotheses that could describe its rule-governed function and be tested for confirmation.

In the study of high-level scientific problem solving, several significant qualitative differences in performance between experts and novices have been observed consistently across studies and specific domains. Specifically, novices in a given task will rely on heuristic strategies, general skills that can be adaptively applied across domains, such as means-ends analyses and backward reasoning. In contrast, experts utilize domain-specific skills grounded in knowledge of concepts and procedures acquired through practice within a domain (Anderson, 1993; Newell & Simon, 1972). Even when an expert encounters novel problems within his domain of expertise that limit the extent to which strong methods can be used, the heuristic approaches that are employed are fundamentally more adaptive than novices' weak methods, because they draw upon an elaborate knowledge base (Schraagen, 1990). For example, McGuire (1997) provides an extensive list of heuristics that are specific to the navigation of the hypothesis problem space but require training in experimental design before they can be used successfully.

In contrast, true domain-general skills are applied when a structured knowledge base is not available. Means-ends analysis involves a general attempt to reduce observed differences between the problem state and the goal state.
This strategy manifests itself either as an attempt to move metaphorically in as straight a line as possible toward the goal state (i.e., hill climbing; Lovett & Anderson, 1996) or as the identification and pursuit of nested sub-goals that must be met prior to the attainment of the end goal. For example, studies of novice performance on the Tower of Hanoi task have reported that subjects do not generate a rule-based strategy for solving the problem efficiently. Instead, they work backwards from the goal state, identifying the move necessary to achieve the main goal, then identifying the move necessary to allow the first (i.e., identifying a sub-goal), and so on. Neves (1977; as cited in Anderson, 1993, p. 37) provides a clear illustration through the verbal reasoning of one subject: "The 4 has to go to the 3, but the 3 is in the way. So you have to move the 3 to the 2 post. The 1 is in the way there, so you move the 1 to the 3." More recent work by Phillips, Wynn, McPherson, and Gilhooly (2001) further indicates that even when novice subjects are instructed to preplan a strategy for solving the problem (in this case, a slight variation of the Tower of Hanoi, dubbed the Tower of London) before attempting it, there are no significant differences in speed and accuracy of performance between them and subjects not given the opportunity to develop a plan before beginning. Further, when asked to report their intermediary sub-goals, they were able to report accurately only up to two moves ahead, in a manner similar to the Neves (1977) subject. From this, the authors conclude that the problem-solving method was indicative of a means-ends approach.
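The backward sub-goaling verbalized by Neves' subject can be expressed algorithmically. The short sketch below is an illustration constructed here, not Neves' or Anderson's model: each blocked goal pushes its obstruction onto a goal stack, and the chain of sub-goals for the quoted situation is printed.

    # Sketch of novice backward sub-goaling on the Tower of Hanoi.
    # The function and the starting state are illustrative assumptions.
    def subgoal_chain(disk, target, state, depth=0):
        """Print the sub-goals needed to move `disk` to peg `target`.
        `state` maps each disk to its peg; smaller disks block larger ones."""
        print("  " * depth + f"goal: move disk {disk} to peg {target}")
        blockers = [d for d in state
                    if d < disk and state[d] in (state[disk], target)]
        if blockers:
            blocker = max(blockers)  # the nearest obstruction
            spare = ({1, 2, 3} - {state[disk], target}).pop()
            print("  " * depth + f"  ...but disk {blocker} is in the way")
            subgoal_chain(blocker, spare, state, depth + 1)  # nested sub-goal

    # Neves' example: disks 1, 3, and 4 all start on peg 1; 4 must reach peg 3.
    subgoal_chain(4, 3, {4: 1, 3: 1, 1: 1})

Run on the quoted configuration, the sketch reproduces the subject's chain: disk 4 to peg 3, therefore disk 3 to peg 2, therefore disk 1 to peg 3. Nothing in the procedure encodes a global solution rule; it only reduces the currently visible difference.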
Similarly, when novices attempt problem-solving tasks within a scientific domain, they also exhibit difference-reduction and sub-goaling behavior. In their seminal study of physics problem solving, Larkin, McDermott, Simon, and Simon (1980a, 1980b) described novice physics students reasoning backwards from the required solution by determining which equation would yield an appropriate answer and then attempting to utilize the givens provided, whereas physics experts initiated the problem-solving process by formulating a conception of the situation on the basis of physics principles and available specifics, generating the solution by manipulating the mental model to yield the appropriate answer. The distinction between novice and expert approaches to the conceptualization of physics problems was also observed in problem-sorting tasks, in which expert participants consistently categorized problems according to underlying principles represented in the prompts, in contrast to novices, who paid greater heed to surface features such as apparatus and available facts (Chi, Feltovich, & Glaser, 1981). More recently, an extensive study of individual differences in physics problem solving replicated and extended these findings, identifying the prominence of principle identification as a factor in the strategy selection of experts (Dhillon, 1998). Expert utilization of theoretical conceptualizations had the benefit of activating pre-existing mental models that represented both the relevant pieces of information that were presented and the abstract relationships between elements in the mental model (Glaser & Chi, 1988; Larkin, 1985). This abstract representation also scaffolded the search for missing information that the model would otherwise incorporate. In contrast, novices who lacked an adaptive model relied on surface-level details and iterative hypothesis testing that generated a mental model more firmly bound to the concrete representation of the situation as it was presented (Lamberti & Newsome, 1989).

As these differences are reliable and do not emerge until after high-level attainment within the problem-solving domain, it follows that they represent a distinct set of skills that are not intuitive or available to the novice problem solver. Singley and Anderson (1989) demonstrated that not only do experts characteristically use strong (i.e., domain-specific) methods in their problem solving, but less experienced problem solvers can also learn to use them successfully. Further, in another of Anderson's studies, it was determined that differences in scientific performance were not significantly attributable to individual differences in fluid intelligence (Schunn & Anderson, 1998). This finding replicated evidence from other skilled domains that after five years of professional experience, intelligence and performance are not reliably correlated (Ceci & Liker, 1986; Doll & Mayr, 1987; Ericsson & Lehmann, 1996; Hulin, Henry, & Noon, 1990; Masunaga & Horn, 2001).

Chen and Klahr (1999) also describe the high level of impact that training in the strategic control of experimental variables had on science students' ability to generate and execute valid scientific inquiries. However, they describe the control of variables strategy (CVS) as a domain-general skill, because it can apply with equal success to specific studies in physics or any other science content domain. It is suggested here, however, that experimental design can be recognized as an independent domain that is specific to endeavors within the experiment problem space. While domain-general skills will certainly affect performance in this area to some extent, as they would in any situation where relevant factors are not known, designing informative experiments "requires domain-general knowledge about one's own information-processing limitations, as well as domain-specific knowledge about the pragmatic constraints of the particular discovery context" (Klahr, Fay, & Dunbar, 1993, p. 114).
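The logic of CVS itself is simple enough to state in a few lines; what distinguishes expert experimental design is knowing which factor to isolate and which levels are informative. The sketch below is a minimal illustration with placeholder factors and levels, not an account of Chen and Klahr's (1999) training materials.

    # Control-of-variables strategy: an unconfounded contrast changes
    # exactly one factor while all others are held constant.
    def cvs_contrast(base, factor, levels):
        """Return designs identical to `base` except in `factor`."""
        return [{**base, factor: level}
                for level in levels if level != base[factor]]

    base = {"spacing": "1 day", "repetitions": 2, "delay": "1 minute"}
    for design in cvs_contrast(base, "spacing", ["1 minute", "1 day", "20 days"]):
        print(design)  # each differs from `base` only in spacing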
Consistent with this distinction, Schunn and Anderson (1999) differentiate between domain experts and "task" experts in their study of scientific reasoning in a memory experiment design task. Domain experts were university faculty in psychology whose research agendas were directly related to the problem presented, whereas task experts were also psychology research faculty but specialized in topics unrelated to memory. Although there were some significant differences in experimentation strategies between the two groups of experts (to be expected in light of the inherent confluence of the hypothesis and experiment problem spaces), the performance of both was consistently superior to that of undergraduates who had completed a course in experimental design in psychology. Given these findings and the relationship between expertise and practice discussed in the following section, it is reasonable to conclude that skills and strategies specific to scientific problem solving are fundamentally acquirable. As such, differences in performance are expected to be more directly linked to deliberate practice in skill acquisition than to measures of fluid intelligence.

Expertise

The past several decades have yielded a burgeoning body of work on the subject of expertise (Patel, Kaufman, & Magder, 1996). While there is more than a century of research on skill acquisition (see Proctor & Dutta, 1995, for an extensive review), relatively recent emphasis has emerged on the characteristics of experts that are common across domains. Such work has predominantly emphasized two general issues in high-level performance: the identification of cognitive processes generalizable to all expert performances and the factors contributing to the acquisition of expert-level skill. This review will examine the development and current state of expertise research and make several arguments regarding the cogency of its theoretical construct and future directions for research.

Despite several ongoing debates in the theory of expertise, a number of reliable representative characteristics have emerged. Glaser (1988) elucidated seven oft-cited attributes that characterize the performance of most experts. These observations, which are drawn from generalizations of a number of studies in the late 1970s and 1980s, have helped to shape the development of the field, despite a lack of definitive analysis regarding the extent to which each may be necessary or sufficient for the expertise construct:

1. Experts excel mainly in their own domains.
2. Experts perceive large meaningful patterns in their domain.
3. Experts are fast; they are faster than novices at performing the skills of their domain, and they quickly solve problems with little error.
4. Experts have superior short-term and long-term memory.
5. Experts see and represent a problem in their domain at a deeper (more principled) level than novices; novices tend to represent a problem at a superficial level.
6. Experts spend a great deal of time analyzing a problem qualitatively.
7. Experts have strong self-monitoring skills.

In more recent research, Ericsson (1996; Ericsson & Lehmann, 1996) has advanced two additional characteristics of expertise. First, he has noted that expertise in a given domain typically requires a minimum of ten years of deliberate practice to develop. Extending the original findings of Simon and Chase (1973), who suggested that a decade was the minimum amount of experience necessary to gain chess mastery, this idea has been further elaborated and supported by the findings of several important investigations in various domains (Charness, Krampe, & Mayr, 1996; Simonton, 1999). Deliberate practice is considered to be highly effortful, intended to improve performance, not inherently motivating, and not intended to attain any goal beyond continued skill development (Ericsson & Charness, 1994; Starkes, Deakin, Allard, Hodges, & Hayes, 1996). Second, he has described an expert process as one exhibiting a "maximal adaptation to task constraints." Such constraints include the physical limitations of the human body and the demands of the laws of physics, the functional rules associated with the task (e.g., the rules of chess, established flight paths, etc.), and the limitations of short-term memory and other cognitive functions (Casner, 1994; Vicente, 2000).[2] It is the asymptotic approach to these constraints, Ericsson argues, that allows experts to succeed where others fail.
Due to their extensive practice and skill refinement, experts have to a great extent shaped the development of the physiological (e.g., density of blood vessels in top athletes; Ericsson, Krampe, & Tesch-Romer, 1993) and cognitive (e.g., working memory limitations; Ericsson & Kintsch, 1995) mechanisms required for performance in their domain of expertise.

[2] It is notable that task constraints do not include an individual's intelligence. Data from a number of studies indicate that expert performance is not significantly related to measures of general or fluid ability (Ceci & Liker, 1986; Doll & Mayr, 1987; Ericsson & Lehmann, 1996; Hulin, Henry, & Noon, 1990; Masunaga & Horn, 2001).

The Role of Knowledge in Expertise. Historically, several broad aspects of expert performance have been examined, with each emphasizing a distinct aspect of cognition. One major approach focuses on experts' extensive knowledge of domain-relevant information and the ability to recall it in appropriate situations. Chase and Simon's (1973) classic work on the memory performance of chess masters suggested that quantity of knowledge was the foundational component of expertise. Their findings indicated that experts had vastly superior memory for the locations of realistically placed chess pieces in briefly presented stimuli relative to novices, but equivalent recall for randomly placed pieces and chess-unrelated stimuli under equivalent conditions. From this, they concluded that expert performance on selected tasks depended on those tasks falling within the experts' domain of mastery and being representative of the tasks performed during normal participation in the activity. Further, the increased speed and capacity that experts seemed to demonstrate was attributed to the recognition of previously encountered situations within the domain that were equivalent to the tasks presented. This suggested that expertise was in large part a benefit of extensive experience within a domain, from which subjects could recall previously successful solutions and deploy them quickly and consistently. Such findings have been consistently replicated in a wide array of domains, including tennis (Beilock, Wierenga, & Carr, 2002) and botany (Alberdi, Sleeman, & Korpi, 2000).

Later work has also analyzed the organization of expert knowledge structures and differentiated them from novice representations on the basis of level of detail, differentiation, and degree of principled abstraction. For example, Chi, Feltovich, and Glaser (1981) examined expert and novice performance in physics problem-sorting tasks and observed that the categories identified by experts were based on the fundamental principles on which the problem solutions relied. In contrast, novices conceptualized the problems based on their surface-level details, such as the presence of pulleys or inclined planes. Similarly, Adelson (1981) found that novice programmers categorized lines of code according to syntax, whereas experts utilized functional or semantic aspects. High recall performance in experts has also been linked to principled conceptual organization.
In additional chess studies, it has been found that providing conceptually descriptive information about the locations of chess pieces in a game, before or after the visual presentation of the board, generates even higher levels of expert recall than a visual-presentation-only condition, suggesting that memory performance is linked to more abstract cognitive representations (Cooke, Atlas, Lane, & Berger, 1993). The level of conceptual abstraction in expert reasoning has also been explained as a "comfortable, efficient compromise…that is optimal" for expert-level performance in a specific domain (Zeitz, 1997, p. 44). This "compromise" represents a suitable chunking size and schematic framework for establishing appropriate links between the concrete elements of a particular problem and the more general concepts and principles that the expert has acquired through experience in the domain. This framework facilitates a number of knowledge-related traits associated with expertise, specifically an expert's ability to recognize sophisticated patterns and enhanced recall of salient details in given situations.

The Role of Working Memory in Expertise. A second account focuses primarily on the superior working memory performance of experts when working in their domain. Extensive evidence indicates that experts are able to process much more information in working memory than is possible under normal circumstances (cf. Baddeley, 1986). Evolving from the initial theory of chunking provided by Chase and Simon (1973), in which experts were believed to represent large, familiar perceptual patterns held in long-term memory as single chunks that could be elaborated rapidly in short-term memory, several newer theories have developed that are better able to account for findings that experts' extraordinary recall of information from domain-specific problems is not impaired by disruptive short-term memory tasks, despite the original theory's expectation that chunks are held in short-term memory (Gobet, 1998; Vicente & Wang, 1998). Long-term working memory theory (LTWM; Ericsson & Kintsch, 1995), template theory (Gobet & Simon, 1996), and the constraint attunement hypothesis (CAH; Vicente & Wang, 1998) have suggested that, as a result of continued practice within a domain, schematic structures within long-term memory can be used not only to facilitate access to existing declarative knowledge, as discussed previously, but also to functionally augment the limited capacity of short-term memory when considering domain-relevant problems.

LTWM suggests that experts develop domain-specific representation mechanisms in long-term memory that reflect the structure of the primary domain tasks themselves, allowing for the rapid encoding and retrieval of stimuli from relevant tasks. Such a model can account not only for experts' exceptional recall of domain-relevant situations in general, but also for expanded working memory capacity during expert performance (Ericsson & Kintsch, 1995). Gobet (1998) extrapolates two possible manifestations of LTWM theory in an attempt to account for a number of empirical findings. The first representation, referred to as the "square version" (p. 125), suggests that the LTWM structure for chess experts manifests itself directly in the form of a 64-square schematic chess board.
In this conception, encoding is therefore contingent on stimuli appropriately compatible with that format. The second possible representation, dubbed the "hierarchy interpretation" (p. 125), construes the original theory differently to allow for encoding that is not contingent on format, establishing that "in preference to storing pieces in squares, experts store schemas and patterns in the various levels of the retrieval structure" (p. 125).

Contrasting with LTWM theory, template theory (Gobet & Simon, 1996) does not completely reject the chunking component originally established by Chase and Simon (1973). Instead, in cases of extensive practice, associated templates augment a single large chunk with slots that can represent related but variable items, retrievable through a short-term memory trace mechanism. The creation of these slots occurs when a minimum number of semantically related elements occur in similar relationships below the node representing the chunk in short-term memory. Thus, slots can be occupied by common component categories that vary depending on the particular situation, such as strategy features or, in the case of chess, players associated with the particular approach.

The constraint attunement hypothesis critiques LTWM theory by arguing that it accounts primarily for data generated by experts in domains for which memorization is an intrinsic element. In an attempt to provide a more generalizable theory, Vicente and Wang (1998) suggest that the appropriate structures in long-term memory for facilitating enhanced working memory performance are representations of the task constraints that govern performance within the domain. An abstraction of the task in this format, they argue, allows the goal structure to serve as the encoding representation for rapid retrieval. In essence, the hierarchy interpretation of LTWM theory elaborated by Gobet (1998) plays a comparable role, except that the hierarchical representation is structured according to the goals of the task rather than its structural features. This allows the authors to predict the magnitude of expertise-enhanced memory performance on the basis of the number of available constraints: the higher the number of authentic task constraints, the more optimally a constraint-attuned framework in long-term memory can be utilized to expand working memory capacity.

The Role of Strategy in Expertise. The third framework for expertise grounds performance in qualitative differences in problem-solving strategies between experts and novices. Consistent early findings in the study of physics problem-solving skills indicate that experts call on domain knowledge to approach problems through forward reasoning processes, in which problems are represented conceptually and approached strategically on the basis of the given factors (Chi et al., 1981; Chi, Glaser, & Rees, 1982; Larkin, McDermott, Simon, & Simon, 1980a, 1980b). Such "strong methods" involve developing a highly principled representation of the problem that, through manipulation, yields an appropriate solution (Singley & Anderson, 1989). Novices, on the other hand, utilize "weak method" heuristics that begin with identification of the goal state and reason backwards to identify relevant given information and approaches that will generate the necessary outcome (Lovett & Anderson, 1996). Further, the development of expertise entails a progression from general, "weak-method" heuristics to feedback-refined procedures that have integrated domain-specific knowledge (Anderson, 1987).
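The distinction between forward and backward reasoning maps directly onto forward and backward chaining in a production system. The sketch below contrasts the two over a deliberately tiny, invented rule base; it illustrates the direction of search only, not the content of any expert's knowledge.

    # Rules are (premises, conclusion) pairs in a toy physics-like domain.
    RULES = [({"mass", "acceleration"}, "force"),
             ({"force", "distance"}, "work")]

    def forward_chain(givens, goal):
        """Strong-method direction: derive new facts from the givens."""
        known = set(givens)
        while goal not in known:
            new = [c for premises, c in RULES
                   if premises <= known and c not in known]
            if not new:
                return None  # no rule applies; the search fails
            known.update(new)
        return known

    def backward_chain(givens, goal):
        """Weak-method direction: start from the goal, recurse on premises."""
        if goal in givens:
            return True
        return any(all(backward_chain(givens, p) for p in premises)
                   for premises, conclusion in RULES if conclusion == goal)

    print(forward_chain({"mass", "acceleration", "distance"}, "work"))
    print(backward_chain({"mass", "acceleration", "distance"}, "work"))

Forward chaining accumulates a principled representation of everything the givens entail, whereas backward chaining touches only what the goal demands, one reason expert protocols exhibit richer situation models.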
Such differences between expert and novice performance are robust, even when novices are instructed to develop a strategy before attempting a solution. As noted above, Phillips, Wynn, McPherson, and Gilhooly (2001) found that despite extensive preplanning, novices exhibited no significant differences in speed or accuracy of performance compared with subjects given no opportunity to plan before beginning, and their inability to report intermediary sub-goals more than two moves ahead demonstrated a means-ends approach rather than a forward strategy. The same contrast appears in the physics studies reviewed earlier: novices reason backwards from the required solution to the givens, whereas experts reason forward from principled, pre-existing mental models of the situation, models that also scaffold the search for missing information (Dhillon, 1998; Glaser & Chi, 1988; Lamberti & Newsome, 1989; Larkin, 1985; Larkin et al., 1980a, 1980b). Even when it has been determined that novices perceive the deeper principles underlying a problem, their solutions rely nearly exclusively on surface features (Sloutsky & Yarlas, 2000; Yarlas & Sloutsky, 2000).

Investigations of expert problem-solving processes in scientific experimentation and design have also provided clear illustrations of this phenomenon (Hmelo-Silver, Nagarajan, & Day, 2002). In Hmelo-Silver et al.'s (2002) study, experts and novices in the domain of clinical trial design used a simulation to demonstrate a hypothetical drug's suitability for medical use. Throughout a number of repeated trials, those subjects with extensive experience in the domain consistently used analogies to past experiences in their verbal protocols to reason abstractly about the process and outcomes. Additionally, they were highly reflective about the effectiveness of particular strategies in relation to their progress toward the goals of the task. In contrast, novices rarely used analogies and did not typically have the cognitive resources available for reflection while engaged in the task.

The Relevance of Automaticity.
The three frameworks discussed above each yield a common result: each aspect of expert performance ultimately improves the cognitive efficiency of the problem-solving process. This efficiency not only emerges as a result of acquired expertise, but also further improves performance by freeing cognitive resources to accommodate atypical features or other added cognitive demands that may arise within a task (Bereiter & Scardamalia, 1993; Sternberg & Horvath, 1998). In investigations of skill acquisition, it has been found that individuals with a high level of practice in a procedure can perform it at increasingly high speeds and with minimal mental effort (Anderson, 1982; Logan, 1988). Thus, highly principled representations of domain-specific problems can support fast, effortless performance by a subject with a large, well-structured knowledge base and extensive practice of the component skills. However, the procedure itself becomes more ingrained and extremely difficult to change, to the extent that both goals and processes can manifest without conscious activation (Bargh & Ferguson, 2000). As such, once a skill has been automated, it is no longer available to conscious monitoring, and it tends to run to completion without interruption, further limiting the ability to modify performance (Wheatley & Wegner, 2001).

In contrast, adaptive experts are highly successful even under novel conditions. Bereiter and Scardamalia (1993) observed that when experts have automated procedures within their domain, their skills are often highly adaptable to complex, ill-structured, and novel situations, because minimal space in working memory is occupied by the process, thereby allowing mental effort to be reinvested in attending to relevant new details. In one example, Gott, Hall, Pokorny, Dibble, and Glaser (1993) reported that highly successful air force technicians were able to adapt knowledge to novel situations despite high levels of consistent, effortless performance. This description is reminiscent of the differences described by supervisors between the experts and super-experts in the Koubek and Salvendy (1991) study. Although their analysis of the data suggested that there was no difference in the levels of automaticity between the two groups, it is possible that Bereiter and Scardamalia's (1993) arguments could have been supported if different forms of data, more sensitive to fluctuations in the level of cognitive load and representative of cognitive processes in greater detail, had been collected.

Automaticity and Procedural Knowledge Acquisition

Human attention is limited by the finite capacity of working memory to retain and manipulate information (Baddeley, 1986). When mental operations such as perception and reasoning occur, they occupy some portion of available capacity and limit the attention that can be dedicated to other concurrent operations. Referred to as cognitive load, the burden placed on working memory has been found to play a major role in both learning and the governance of behavior (Goldinger, Kleider, Azuma, & Beike, 2003; Sweller, 1988; Sweller, Chandler, Tierney, & Cooper, 1990).
Given the extreme limitations on the amount of information that can be consciously and simultaneously processed (i.e., as few as four chunks; Cowan, 2000), as well as the persistently high levels of information and sensory input available in most natural settings, it is necessary that many cognitive functions take place outside of conscious awareness and control. Wegner (2002) suggests that as much as 95% of the common actions that we experience as being under conscious control are in fact automated. Further, he argues, the cognitive mechanisms that generate these false impressions of intention are themselves generated by nonconscious processes. Thus, people tend to claim full knowledge and control of their actions, to the point of creating false memories that provide plausible explanations for those actions.

During the last century, both behaviorists and cognitive scientists have argued that many mental and behavioral processes take place without any conscious deliberation (Bargh, 2000). From these investigations, a dual-process model of cognition requiring the parallel execution of controlled and automatic processes has emerged, in which conscious functions are relatively slow, effortful, and controllable, whereas automatic processes are rapid and effortless (Bargh, 1999a; Devine & Monteith, 1999). Automated procedures can occur without intention, tend to run to completion once activated, utilize few, if any, attentional resources, and are not available to conscious monitoring (Wegner & Wheatley, 2001). It is important to note, however, that the dual-process description can prove misleading, as many procedures rely on the integration of both conscious and nonconscious thought during performance (Bargh & Chartrand, 1999; Hermans, Crombez, & Eelen, 2000). Specific sub-components can be automated or conscious, but their integration into the larger production yields a mixed composition.

In the seminal work of Shiffrin and Schneider (1977), the acquisition of automaticity is said to be achieved through the consistent, repeated mapping of stimuli to responses. Most commonly associated with skill acquisition, automated procedures can be consciously initiated for the satisfaction of a specific goal. Thus, primary attention is paid to the ballisticity of the procedure (Logan & Cowan, 1984), which is the ability of the production to execute to its conclusion without conscious monitoring, and to the ability to maintain performance levels in dual-task paradigms (Brown & Bennett, 2002). Acquired through extensive practice, these goal-dependent procedures become fluid and require less concentration to perform over time (Anderson, 1995; Fitts & Posner, 1967; Logan, 1988a). Additionally, these procedures evolve to greater levels of efficiency as automaticity develops, eliminating the need for conscious intermediate decision points (Blessing & Anderson, 1996). Evidence suggests that habitual approaches to problems are goal-activated, such that the solution search is significantly limited by the activation of established patterns of behavior (Aarts & Dijksterhuis, 2000). Dubois and Shalin (2000) further report that goal choice, conditions/constraints, method choice, method execution, goal standards, and pattern recognition are each elements of procedural knowledge that can become automated.
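A simplified simulation in the spirit of Logan's (1988) instance theory illustrates why consistent practice yields fast, low-effort performance: each encounter with a stimulus stores a retrievable trace, and responding is a race between the slow algorithm and the fastest stored trace. The distributions and parameter values below are arbitrary assumptions made for illustration, not fitted to any data.

    import random

    def trial(n_traces, algorithm_time=2.0):
        """Response time is the fastest of the algorithm and all stored traces."""
        retrievals = [random.expovariate(1.0) + 0.3 for _ in range(n_traces)]
        return min([algorithm_time] + retrievals)  # the fastest process wins

    random.seed(1)
    for practice in (0, 1, 10, 100):
        mean_rt = sum(trial(practice) for _ in range(2000)) / 2000
        print(f"{practice:>3} prior encounters: mean simulated RT {mean_rt:.2f}s")

With no stored instances the algorithm always wins; as instances accumulate, the expected minimum falls steeply at first and then flattens, reproducing the familiar speed-up of practice with diminishing returns.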
Accuracy of Self-Report

The challenge in collecting valid verbal self-report data lies in the structure of the human memory system itself. In the traditional cognitive model, short-term (working) memory acts as a gateway through which all information must pass as it is encoded and incorporated into schemas in long-term memory or retrieved for manipulation or use in the production of behavior. According to Baddeley (1986, p. 34), it is "a system for the temporary holding and manipulation of information during the performance of a range of cognitive tasks." For new information to be stored retrievably in long-term memory, a trace or pathway must be created to allow the needed information to be activated at an appropriate time. Such links are more easily and successfully generated when the new information maps well onto an existing schema. Thus, schemas that have been utilized and refined adaptively through experience with particular concepts and events serve as stable mental models for the more efficient encoding and evaluation of specific types of events (Anzai & Yokoyama, 1984; Bainbridge, 1981; Clement, 1988; Larkin, 1983).

While these refined mental models are highly adaptive for problem solving, they may interfere with the accurate recall of problem-solving situations after the fact. Because mental models are utilized in the search of a problem space (Larkin et al., 1980a), details that are not directly mappable to the representation can fail to be encoded into long-term memory. As a result, a retrospective account of the event may fall victim to the errors of generalizability and rationalization that Nisbett and Wilson (1977) describe in their critique of self-report accuracy: "Such reports may well be based more on people's a priori causal theories about stimulus effects than on direct examination of their cognitive processes, and will be inaccurate whenever these theories are inaccurate" (Wilson & Nisbett, 1978, p. 130).

Moray and Reeves (1987) provide direct empirical evidence for the potentially maladaptive nature of mental models. They presented subjects with a set of eight bar graphs that changed lengths over time. The lengths of some pairs of graphs were designed to covary, providing observable subsystems from which it was expected that a mental model with separate identifiable components would be derived. Participants in the study were given the task of preventing the graphs from exceeding specified parameters by changing the location and color of the bars within each graph. Once the subjects had successfully learned to manage the graphs in accordance with the defined relationships, three "faults" were introduced to the system that prevented certain relationships from functioning as they had. The authors hypothesized that once the original model had been developed, the subjects would not recognize the appearance of the faults. As expected, the changed relationships among and across subsystem components took significantly longer to recognize than it had taken subjects to discover all of the rules of the original system, thereby demonstrating the durability of mental models once established.

More recently, Logan, Taylor, and Etherton (1996) reported that "the representations expressed during automatic performance do not represent all stimulus attributes uniformly" (p. 636). Instead, only those elements that were attended to during the task production are specifically encoded in episodic memory.
When subjects were asked to recall the font color in which target words had been presented, they were unable to do so, despite above-chance performance on a recognition task; superficial item features were thus encoded successfully without being retrievable during recall. The authors concluded that retrieval of an episode may stimulate differential partial representations of the specific instance.

The challenge to the validity of self-report posed by mental models mirrors the difficulties inherent in capturing self-reported procedural knowledge. Cooke (1992) notes that after knowledge of a process becomes proceduralized, subjects may have difficulty decomposing the mental representation into a declarative form. Further, subjects with extensive practice solving problems in a domain will have automated significant portions of their procedures, suggesting that these representations—presumably those of greatest interest within a domain of expertise—will be even harder to articulate. Williams (2000, p. 165) explains that "production units are not interpreted but are fired off automatically in sequences, which produce skilled performance. They are automatic to the extent that experts at a specific skill may not be able to recall why they perform the skill as they do." This phenomenon is particularly true of strong methods that rely on domain knowledge. In a study of radar system troubleshooters, Schaafstal and Schraagen (2000, p. 59) note that "only a small correlation was found between the knowledge test [of radar systems] and fault-finding performance (Pearson r = .27). This confirms the…gap between theory and practice."

This view has also been supported in the study of metacognitive monitoring and strategy selection. Reder and Schunn (1996; Schunn, Reder, Nhouyvanisvong, Richards, & Stroffolino, 1997) have demonstrated that subjects' metacognitive selection of strategies during problem solving occurs implicitly: it is better predicted by previous exposure to similar problems, regardless of whether a solution was obtained, than by active strategy selection, even though subjects are unaware that (a) learning occurred during the initial exposure to previous problems or that (b) no new strategy development occurred. Thus, information regarding critical elements of any problem-solving procedure can fail to be directly accessible for verbal report.

Despite these impediments, two general methods of verbal knowledge elicitation have been utilized in research on expertise and problem solving: protocol analysis (Ericsson & Simon, 1993) and cognitive task analysis (Schraagen, Chipman, & Shute, 2000). Both approaches have been characterized as valid within specific constraints (discussed below). However, these considerations have not been consistently applied in the study of experts' cognitive processes (Bainbridge, 1999).

During protocol analysis, also referred to as the "think aloud" technique, the capacity limit of short-term memory can prevent sufficient attention from being given to both the task at hand and the translation of mental symbols into verbal form. While this does not pose a particular problem for tasks that are verbal in nature or do not require full attentional capacity, those elements which are not easily articulable (like images) require extra attentional resources to translate (Chipman, Schraagen, & Shalin, 2000).
If those resources are not available, subjects will fall silent during periods of high load. Likewise, processes that have become mostly or entirely automated will not be articulated (Ericsson & Simon, 1993). This phenomenon is particularly problematic for cognitive research, as automaticity often manifests at the moments of greatest interest to those studying the cognitive processes of a task, because those moments represent the most refined skills in the procedure. Further, subjects who are required to think aloud during insight problem-solving tasks show performance deficits in the time necessary for task completion and in the frequency of correct solutions (Schooler, Ohlsson, & Brooks, 1993). Similarly, in Chung, de Vries, Cheak, Stevens, and Bewley (2002), subjects required to think aloud while engaging in scientific problem-solving tasks required a significantly higher number of attempts to solve problems successfully using a computer interface. This finding of verbal overshadowing has been replicated across a number of experimental and authentic tasks (Meissner & Memon, 2002). "A common observation is that verbal rehearsal or mediation declines with practice on perceptual-motor tasks (e.g. Fitts, 1964), which indicates that at least the form and possibly the amount of information held in working memory changes" (Carlson, Khoo, Yaure, & Schneider, 1990, p. 195). Thus, while protocol analysis is appropriate for gathering process data on certain types of tasks performed by individuals who have not yet developed automaticity in their skills, capturing an accurate representation of the highly specialized performances of experts in complex cognitive tasks remains problematic.

In contrast, cognitive task analysis techniques represent knowledge of events and strategies that have been retained by the subject until after the event in question. While these techniques span a wide range of tools for knowledge elicitation, the data generated do not represent the contents of working memory in situ. These approaches have, however, been found to yield highly accurate information about the processes executed in a wide variety of authentic tasks (Schraagen et al., 2000; Velmahos, Toutouzas, Sillin, Chan, Clark, Theodorou, Maupin, Murray, Sullivan, Demetriades, & DeMeester, 2002). Although procedural knowledge is considered more challenging to capture than declarative knowledge within a domain (Hoffman, 1992), a skill-based cognitive task analysis framework has been established by Seamster, Redding, and Kaempf (2000) that focuses specifically on the elicitation of five cognitive skill types: (a) strategies, (b) decision-making skills, (c) representational skills, (d) procedural skills, and (e) automated skills. While several specific techniques have been developed within this framework, one of the most successful has been the Critical Decision Method, in which each of the skill types is elicited through a variety of probes in semi-structured interviews. Hoffman, Crandall, and Shadbolt (1998) reviewed reliability studies of the method and reported high test-retest reliability over time (3 days, 3 months, and 5 months after the incident reported) and intercoder reliability of .89. With regard to the validity of content, they argue that the memory-prompting cues incorporated into the method can overcome the memory errors discussed above.
Further, studies conducted by Crandall and his colleagues (e.g., Crandall & Calderwood, 1989; Crandall & Gamblian, 1991) have demonstrated that the behaviors captured with cognitive task analysis differ significantly from the theoretical knowledge generally portrayed in textbooks and are more closely aligned with the experiences of experts in the field.

Summary

As indicated in the literature reviewed above, the assumptions on which this study rests are justified by extant research. Specifically, (a) scientific research skills are acquirable and represent predominantly crystallized intelligence, (b) expertise in scientific research is distinguishable from lower levels of skill, both in an individual's ability to successfully solve a problem and in a qualitative distinction between expert and novice strategies, and (c) skills that are practiced extensively automate, such that the mental effort necessary to perform a procedure is minimal. Further, although there are significant challenges to the reportability of automated procedural knowledge, there are certain circumstances under which some level of knowledge can be elicited.

Research Questions

The research questions for this study are:

1. What are the cognitive skills and strategies that research experts use in experimental research, and how do they differ from novice techniques?
2. Do experts automate problem-solving procedures in the domain of research design?
3. Does the degree of automaticity in task performance differ between experts and novices?
4. To what extent can experts and novices accurately report their own problem-solving processes?
5. What is the relationship between the accuracy of self-report and the degree of automaticity during problem solving?

Methodology

Subjects

Participants in this study will be recruited from a major university in southern California. Six expert subjects will be identified and recruited on the basis of the following criteria:

1. Subjects will have attained tenure in the field of psychology or educational psychology at a Tier I research university and will have conducted research for at least 10 years. These elements are associated with the attainment of expertise as recognized by a professional peerage and with the typically necessary duration of practice (Ericsson & Charness, 1994).
2. Each subject will have published at least 20 peer-reviewed empirical studies within their domain of expertise utilizing factorial designs equivalent to those available in the simulation.
3. Subjects will not have major lines of research in the area of memory, to prevent biasing of the experimental design and analysis tasks based on recall.

Six novice subjects will also be identified and recruited for the study. Each will have completed at least one course in psychological methodology and research design.

Design

This study will utilize a Single-Subject Multivariate Repeated Measures (SSMRM) design (Nesselroade & Featherman, 1991; Wood & Brown, 1994).
The goal of this approach is to capture intraindividual concurrent changes in multiple variables over time. In this case, EEG levels across multiple bands, scientific problem-solving behaviors, and self-report accuracy will be recorded through time-locked records of each. This approach is necessary because the aggregation of data that occurs in cross-sectional designs would prevent the pairing of specific procedural elements with the decreased levels of cognitive load associated with automaticity and with particular segments of self-report accuracy scores, as these variables are affected by both intraindividual change and interindividual differences (Jones & Nesselroade, 1990). Further, the ability to differentiate among larger patterns of behavior over time within individuals permits a deeper examination of the role that automaticity plays in expert and novice problem-solving strategies within the domain being examined (e.g., Chen & Siegler, 2000).

Apparatus

Simulation. Subjects will use a computer simulation, the Simulated Psychology Lab (Schunn & Anderson, 1999), to design and interpret the results of a series of factorial-design experiments with the goal of determining which, if either, of two competing theories accounts for the memory spacing effect described in the introduction to the program. The interface allows subjects to select values for six independent variables (see Table 1), of which up to four can be manipulated in a particular experiment, involving the learning and recall of word lists by hypothetical subjects. When the variable settings for an experiment have been determined by the user (though they remain editable until the command to run the experiment is given), the user is then required to predict the mean percentages of correct responses that the hypothetical subjects will produce in each condition.[3] Once the predictions are entered and the command is given to run the experiment, the computer generates data sets that are compatible with real-world results. After each iteration of the design-hypothesize-and-execute cycle, the results generated by the computer are available to the user in order to modify hypotheses and inform experimental designs in subsequent iterations. To facilitate this process, the designs and results of all previous experiments are available to the user through the interface. As each action is taken, the simulation captures and records all user actions, including the viewing of data, with a time stamp to provide an accurate procedural history for analysis.
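To illustrate the structure of the design task, the following sketch crosses a subset of the Table 1 variables into a factorial design and enumerates the per-cell predictions the interface demands. The variable names and levels paraphrase Table 1, but the data structures and function names are assumptions of this sketch, not the Simulated Psychology Lab's actual interface.

    from itertools import product

    VARIABLES = {  # abridged from Table 1
        "repetitions": [2, 3, 4, 5],
        "spacing": ["1 minute", "1 day", "20 days"],
        "test": ["free recall", "recognition", "stem completion"],
        "delay": ["1 minute", "1 day", "20 days"],
    }

    def design_experiment(chosen):
        """Cross the chosen variables (at most four) into factorial cells."""
        assert len(chosen) <= 4, "the interface permits at most four variables"
        return [dict(zip(chosen, levels))
                for levels in product(*(VARIABLES[name] for name in chosen))]

    cells = design_experiment(["repetitions", "spacing"])  # a 4 x 3 design
    predictions = {i: None for i in range(len(cells))}     # one entry per cell
    print(len(cells), "conditions, each requiring a predicted mean percent correct")

Note how the prediction cost scales multiplicatively with each added variable, which is precisely the deterrent to overly large designs that Schunn and Anderson describe in the footnote above.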
347-348) note that “although this prediction task is more stringent than the prediction task psychologists typically give themselves (i.e., directional predictions at best, and rarely for all dimensions and interactions), we used this particular form of a prediction task because 1) assessing directional predictions proved a difficult task to automate; 2) numerical predictions could be made without explicit thought about the influence of each variable and possible interactions, and thus we thought it was less intrusive; 3) it provided further data about the participants’ theories and beliefs about each of the variables; and 4) it provided some cost to large experimental designs (i.e., many more predictions to make) to simulate the increasing real-world cost of larger experimental designs.” Accuracy of Expert Self-Report 33 theory and operational variable is typical of most psychological theories and experiments.” Further, the Simulated Psychology Lab is specifically cited in other literature as an apparatus that “model[s] essential aspects of specific scientific discoveries” (Klahr & Simon, 2001, p. 75). Phase of Experiment Learning Learning Learning Recall Recall Recall Manipulable Variable Repetitions—the number of times that the list of words was studied. Spacing—the amount of time spent between repetitions Learning context—whether subjects were in the same context for each repetition or changed contexts for each repetition Test—memory performance Delay—the amount of time from the last learning repetition until the recall test was given Recall context—whether subjects were in the same context for each repetition or changed contexts for each repetition Possible Values 2, 3, 4, 5 1 minute to 20 days Mood, Location (yes/no) Free recall, Recognition, or Stem completion 1 minute to 20 days Mood, Location (yes/no) Electroencephalogram monitor. Subjects’ cognitive load during experimental tasks will be measured using an electroencephalogram (EEG) that dynamically records and analyzes changes in event-related potential. At minimum, alpha and theta frequency bands, demonstrated to be indicative of cognitive load (Brookings, Wilson, & Swain, 1996; Fournier, Wilson, & Swain, 1999) and a muscle-activity detection probe (e.g., eye blinks) for detecting artifacts in the collected data will be available. EEG is currently the Accuracy of Expert Self-Report 34 physiological instrument of choice for the measure of cognitive workload, due to its ability to provide relatively unobtrusive continuous monitoring of brain function (Gevins & Smith, 2003). Specifically, this approach has been found to reliably measure cognitive load during task analyses of naturalistic human-computer interactions (Raskin, 2000) and, when analyzed in multivariate combinations, can accurately indicate changes in cognitive tasks (Wilson & Fisher, 1995). Critical Decision Method protocol. Cognitive task analyses conducted with subjects during the study will follow the Critical Decision Method protocol (Klein & Calderwood, 1996; Klein, Calderwood, & MacGregor, 1989). The protocol is a semistructured interview that emphasizes the elicitation of procedural and conceptual knowledge from subjects through the recall of particular incidents using specific probes (Table 2; adapted from Klein & Calderwood, 1996, p. 31) following the subject’s report of a general timeline highlighting decision points occurring during the focal time. 
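For concreteness, the sketch below shows one conventional way alpha- and theta-band power could be estimated offline from a single EEG channel using Welch's method. The sampling rate, band boundaries, and synthetic signal are assumptions for illustration, not the study's recording parameters.

import numpy as np
from scipy.signal import welch

fs = 256  # assumed sampling rate (Hz); a stand-in, not the study's setting
rng = np.random.default_rng(0)
channel = rng.normal(size=fs * 10)  # synthetic stand-in for 10 s of one channel

def band_power(x, fs, lo, hi):
    # Integrate the Welch power spectral density estimate over [lo, hi] Hz.
    freqs, psd = welch(x, fs=fs, nperseg=fs * 2)
    band = (freqs >= lo) & (freqs <= hi)
    return np.trapz(psd[band], freqs[band])

theta = band_power(channel, fs, 4, 8)   # theta band (approx. 4-8 Hz)
alpha = band_power(channel, fs, 8, 13)  # alpha band (approx. 8-13 Hz)
print(f"theta power = {theta:.4f}, alpha power = {alpha:.4f}")

In practice, indices of this kind would be computed over successive short windows so that changes in load can be attached to the time-stamped behavioral record described above.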
Critical Decision Method protocol. Cognitive task analyses conducted with subjects during the study will follow the Critical Decision Method protocol (Klein & Calderwood, 1996; Klein, Calderwood, & MacGregor, 1989). The protocol is a semi-structured interview that emphasizes the elicitation of procedural and conceptual knowledge from subjects through the recall of particular incidents using specific probes (Table 2; adapted from Klein & Calderwood, 1996, p. 31), following the subject's report of a general timeline highlighting decision points occurring during the focal time.

Table 2

Cues: What were you seeing, thinking…?
Knowledge: What information did you use in making this decision, and how was it obtained?
Analogues: Were you reminded of any previous experience?
Goals: What were your specific goals at this time?
Options: What other courses of action were considered or were available to you?
Basis: How was this option selected/other options rejected? What rule was being followed?
Experience: What specific training or experience was necessary or helpful in making this decision?
Aiding: If the decision was not the best, what training, knowledge, or information could have helped?
Situation Assessment: Imagine that you were asked to describe the situation to a partner who would take over the process at this point; how would you summarize the situation?
Hypotheticals: If a key feature of the situation had been different, what difference would it have made in your decision?

Procedure

The procedure consists of two phases—preliminary data collection and primary task data collection. During the preliminary data collection phase, subjects' relevant abilities will be assessed and the instruments will be calibrated. Subsequently, each subject will begin the simulation task, during which process, procedural knowledge, and cognitive load will be assessed.

Preliminary data collection. Two forms of preliminary data will be collected prior to the beginning of the experimental task. Subjects' levels of fluid intelligence (Gf) will be evaluated using the Raven Progressive Matrices test (Raven, 1969; Stankov & Crawford, 1993; Stankov & Raykov, 1993; Stankov & Raykov, 1995) to provide comparable baseline data on subjects' reasoning abilities in novel situations. Masunaga and Horn (2001) differentiate between fluid reasoning ability and the reasoning abilities of experts within their domain of expertise, despite apparent similarities. They argue that "reasoning depends on perceiving relationships in complex sets of stimuli and drawing inferences from these perceptions in order to estimate relationships under conditions of lawful change. In this sense, it [expert reasoning] is similar to the reasoning that characterizes Gf; indeed, the reasoning of expertise and the reasoning of Gf are probably along a continuum of similarity" (p. 294). This similarity is limited, however, by the inductive nature of fluid reasoning in contrast to the deductive, forward reasoning evidenced in expert performance.

Subjects' acquired scientific reasoning abilities will also be assessed using Lawson's Test of Scientific Reasoning (Lawson, 1978; 2000). The 24-item multiple-choice test assesses subjects' abilities to separate variables and to use proportional logic, combinatorial reasoning, and correlations. Although the measure may evidence ceiling effects for expert subjects, it has been found to adequately represent the abilities of college students for the assessment of scientific reasoning skills (Lawson, Clark, Cramer-Meldrum, Falconer, Kwon, & Sequist, 2000).

EEG baseline readings will also be obtained for subjects' high, medium, and low cognitive load levels prior to beginning the simulation task. Because EEG readings detect simple visual perception (Kosslyn, Thompson, Kim, & Alpert, 1995) and the orientation of graphic patterns (Nagai, Kazai, & Yagi, 2001), baseline readings will take place with the subjects sitting in a chair, facing the computer monitor while it displays the interface used during the simulation. Subjects will then perform a series of "n-back" tasks (Gevins & Cutillo, 1993), in which they are asked to identify sequentially presented stimuli as matching or not matching previously presented stimuli. In the low load condition, subjects need only identify a match between a stimulus and the stimulus presented immediately beforehand (0-back task). In the higher load conditions, subjects must match stimuli that are temporally separated by one (1-back), two (2-back), or three (3-back) trials. As n increases, higher levels of cognitive load are imposed by the task requirement of retaining stimuli in working memory across continued presentations while simultaneously retaining new stimuli for subsequent match decisions.
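The matching rule of the n-back task can be stated compactly. The sketch below encodes the convention described above, in which the compared stimuli are separated by n intervening trials (so the 0-back condition compares adjacent stimuli); the letter stream is hypothetical.

def nback_targets(stimuli, n):
    # Flag each position whose stimulus matches one presented earlier,
    # with n intervening trials between the compared presentations.
    lag = n + 1
    return [i >= lag and stimuli[i] == stimuli[i - lag]
            for i in range(len(stimuli))]

stream = ["A", "B", "B", "A", "B", "A", "A"]
print(nback_targets(stream, 0))  # 0-back: match to the immediately prior stimulus
print(nback_targets(stream, 2))  # 2-back: two intervening trials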
Primary task procedure. Subjects will be informed that they are participating in a study to analyze their mental effort and strategies during a research task that will consist of multiple iterations of experimental design and data analysis. They will be instructed not to concern themselves with the amount of time necessary to complete the task. Instead, they will be advised to remain focused on the task at hand and not to perform extraneous actions within the interface. Following these brief instructions, they will be presented with the instructions and task description via the computer monitor. These instructions include an orientation to the Simulated Psychology Lab interface (Schunn & Anderson, 1999). Once the task begins, the subject's cognitive load levels will be recorded via the EEG while the subject's actions are recorded by the software. After each experiment is designed and executed, subjects will be interviewed using the Critical Decision Method cognitive task analysis to understand the cognitive processes underlying the keystroke-level behavior data captured by the computer during simulation use. This process will repeat until the subject solves the problem presented by the simulation or reaches a maximum of forty minutes engaged in the task.

Analysis

The first phase of analysis will entail the time synchronization of the EEG data, the computer interaction record captured by the Simulated Psychology Lab, and the events reported in the cognitive task analysis. The EEG and simulation data are both time-stamped and should not pose great difficulty in matching; however, the timeline generated during each CTA interview will need to be compared with the keystroke-level simulation data and time-locked. An adaptation of Ericsson and Simon's (1993) protocol analysis coding will be utilized, wherein the transcribed data will be segmented to encompass the processes that occur between one identified decision point and the next. The steps described in each segment will be matched to the computer-recorded actions within each iteration of the design-hypothesize-and-execute cycle.
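As an illustration of this matching step, the sketch below assigns each time-stamped simulation action to the CTA segment (the interval between two reported decision points) in which it falls. All timestamps and action labels are hypothetical.

from bisect import bisect_right

# Decision-point times (seconds) recovered from one CTA timeline.
decision_points = [0.0, 42.5, 97.0, 160.2]

# Time-stamped, keystroke-level actions from the simulation log.
actions = [(12.3, "set_variable:repetitions"),
           (55.0, "run_experiment"),
           (130.8, "view_results")]

def segment_index(t, boundaries):
    # Return the index of the segment whose interval contains time t.
    return bisect_right(boundaries, t) - 1

for t, action in actions:
    print(f"{action} at {t:6.1f} s -> segment {segment_index(t, decision_points)}")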
One point of departure from the Ericsson and Simon (1993) technique will be to maintain the contextualization of each segment, in accordance with the recommendation of Yang (in press), who argues that in ill-structured, complex, and knowledge-rich tasks, discrete cognitive steps are inherently situated in the broader context of ongoing high-level cognitive and metacognitive processes of reasoning and interpretation. "Given the systemic, interrelated, multidimensional nature of the learners' cognitive processes in this…task, the complexity of the functional model and taxonomy [can cause subjects] to become immersed in the…interwoven processes of searching, interpreting, defining, reasoning, and structuring" (Yang, in press, p. 8), resulting in an oversimplification of subsequent analyses that loses relevant meaning within the process.

The second phase of analysis will consist of the coding of self-report data for degree of accuracy in relation to the matched actions in the simulation. While some categories, such as "accurate," "error of omission," and "error of commission," may be formulated a priori on the basis of the errors in self-report described in previous studies (e.g., Nisbett & Wilson, 1977; Wilson & Nisbett, 1978), additional coding categories may emerge from the data upon examination after collection (Dey, 1993). Once coded, these categorical variables can be represented numerically for use in statistical analyses.

In the final phase of analysis, P-technique factor analysis will be used to identify how the observed variables in the study (accuracy, action, and cognitive load) change together over time within individual subjects (Jones & Nesselroade, 1990). In accordance with the cautions of several studies (Gorman & Allison, 1997; Wood & Brown, 1994), the data will also be checked for autocorrelative effects, which can generate standard errors that are too large and standardized factor loadings that are too small. Secondary analysis of the temporal overlap between automaticity and inaccuracy will be conducted by computing the Yule's Q statistic, using sampled permutations to adjust for small sample size (Bakeman, McArthur, & Quera, 1996); an illustrative computation appears following the hypotheses below. Follow-up analyses of between-subjects similarities and differences will be conducted via the congruence assessment technique (Harman, 1967, cited in Lavelli, Pantoja, Hsu, Messinger, & Fogel, in press).

Hypotheses

1. Experts will evidence highly developed procedures that yield solutions which are more systematic, more effective, and qualitatively distinct from novice performances (Hmelo-Silver et al., 2002; Schraagen, 1993; Schunn & Anderson, 1999; Voss, Tyler, & Yengo, 1983).

2. Differences in novice-expert performance will not be significantly related to individual differences in fluid intelligence (cf. Schunn & Anderson, 1998).

3. Experts will be largely inaccurate in reporting their problem-solving procedures as compared to the computer-based record of their actions, while novice subjects' reports will have significantly higher accuracy.

4. Expert self-report accuracy with regard to a specific element of the problem-solving process will be directly related to assessments of intrinsic and observed cognitive load (Sweller, 1994) collected at the point in the process reported.
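As noted above, an illustrative computation of Yule's Q is given here. The cell counts are hypothetical, and the sampled-permutation adjustment described by Bakeman, McArthur, and Quera (1996) is omitted for brevity.

def yules_q(a, b, c, d):
    # Yule's Q for a 2x2 table: Q = (ad - bc) / (ad + bc), where
    #   a: segments coded as automated and inaccurately reported
    #   b: automated, accurately reported
    #   c: not automated, inaccurately reported
    #   d: not automated, accurately reported
    return (a * d - b * c) / (a * d + b * c)

# Hypothetical counts; Q near +1 would indicate that automaticity and
# self-report inaccuracy tend to co-occur within segments.
print(yules_q(a=18, b=4, c=5, d=12))  # approximately 0.83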
References

Aarts, H., & Dijksterhuis, A. (2000). Habits as knowledge structures: Automaticity in goal-directed behavior. Journal of Personality and Social Psychology, 78(1), 53-63.

Adams, G. B., & White, J. D. (1994). Dissertation research in public administration and cognate fields: An assessment of methods and quality. Public Administration Review, 54(6), 565-576.

Adelson, B. (1981). Problem solving and the development of abstract categories in programming languages. Memory and Cognition, 9, 422-433.

Alberdi, E., Sleeman, D. H., & Korpi, M. (2000). Accommodating surprise in taxonomic tasks: The role of expertise. Cognitive Science, 24(1), 53-91.

Anderson, J. R. (1982). Acquisition of cognitive skill. Psychological Review, 89(4), 369-406.

Anderson, J. R. (1987). Skill acquisition: Compilation of weak-method problem solutions. Psychological Review, 94(2), 192-210.

Anderson, J. R. (1993). Problem solving and learning. American Psychologist, 48(1), 35-44.

Anzai, Y., & Yokoyama, T. (1984). Internal models in physics problem solving. Cognition and Instruction, 1, 397-450.

Baddeley, A. (1986). Working memory. Oxford, England: Clarendon Press.

Bainbridge, L. (1981). Mathematical equations or processing routines? In J. Rasmussen & W. B. Rouse (Eds.), Human detection and diagnosis of system failures (NATO Conference Series III: Human Factors, Vol. 15). New York: Plenum Press.

Bainbridge, L. (1999). Verbal reports as evidence of the process operator's knowledge. International Journal of Human-Computer Studies, 51, 213-238.

Bakeman, R., McArthur, D., & Quera, V. (1996). Detecting group differences in sequential association using sampled permutations: Log odds, kappa, and phi compared. Behavior Research Methods, Instruments and Computers, 28(3), 446-457.

Bargh, J. A. (1990). Auto-motives: Preconscious determinants of social interaction. In R. M. Sorrentino & E. T. Higgins (Eds.), Handbook of motivation and cognition (pp. 93-130). New York: Guilford Press.

Bargh, J. A. (1999). The unbearable automaticity of being. American Psychologist, 54(7), 462-479.

Bargh, J. A., & Ferguson, M. J. (2000). Beyond behaviorism: On the automaticity of higher mental processes. Psychological Bulletin, 126(6), 925-945.

Baumeister, R. F. (1984). Choking under pressure: Self-consciousness and paradoxical effects of incentives on skillful performance. Journal of Personality and Social Psychology, 46, 610-620.

Beilock, S. L., Wierenga, S. A., & Carr, T. H. (2002). Expertise, attention, and memory in sensorimotor skill execution: Impact of novel task constraints on dual-task performance and episodic memory. The Quarterly Journal of Experimental Psychology, 55A(4), 1211-1240.

Bereiter, C., & Scardamalia, M. (1993). Surpassing ourselves: An inquiry into the nature and implications of expertise. Chicago, IL: Open Court.

Besnard, D. (2000). Expert error: The case of trouble-shooting in electronics. Proceedings of the 19th International Conference SafeComp2000 (pp. 74-85). Rotterdam, Netherlands.

Besnard, D., & Bastien-Toniazzo, M. (1999). Expert error in trouble-shooting: An exploratory study in electronics. International Journal of Human-Computer Studies, 50, 391-405.

Besnard, D., & Cacitti, L. (2001). Troubleshooting in mechanics: A heuristic matching process. Cognition, Technology & Work, 3, 150-160.

Blessing, S. B., & Anderson, J. R. (1996). How people learn to skip steps. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(3), 576-598.

Bransford, J. D., Brown, A. L., & Cocking, R. R. (1999). How people learn: Brain, mind, experience, and school. Washington, DC: National Academy Press.
Brookings, J. B., Wilson, G. F., & Swain, C. R. (1996). Psychophysiological responses to changes in workload during simulated air traffic control. Biological Psychology, 42, 361-377.

Brown, S. W., & Bennett, E. D. (2002). The role of practice and automaticity in temporal and nontemporal dual-task performance. Psychological Research, 66, 80-89.

Bruenken, R., Plass, J. L., & Leutner, D. (2003). Direct measurement of cognitive load in multimedia learning. Educational Psychologist, 38(1), 53-61.

Camp, G., Paas, F., Rikers, R., & van Merrienboer, J. (2001). Dynamic problem selection in air traffic control training: A comparison between performance, mental effort and mental efficiency. Computers in Human Behavior, 17, 575-595.

Carlson, R. A., Khoo, B. H., Yaure, R. G., & Schneider, W. (1990). Acquisition of a problem-solving skill: Levels of organization and use of working memory. Journal of Experimental Psychology: General, 119(2), 193-214.

Casner, S. M. (1994). Understanding the determinants of problem-solving behavior in a complex environment. Human Factors, 36(4), 580-596.

Ceci, S. J., & Liker, J. K. (1986). A day at the races: A study of IQ, expertise, and cognitive complexity. Journal of Experimental Psychology, 115, 255-266.

Chandler, P., & Sweller, J. (1991). Cognitive load theory and the format of instruction. Cognition and Instruction, 8, 293-332.

Charness, N., Krampe, R., & Mayr, U. (1996). The role of practice and coaching in entrepreneurial skill domains: An international comparison of life-span chess skill acquisition. In K. A. Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and sciences, sports, and games (pp. 51-80). Mahwah, NJ: Lawrence Erlbaum Associates.

Chase, W. G., & Simon, H. A. (1973). Perception in chess. Cognitive Psychology, 4, 55-81.

Chen, Z., & Siegler, R. S. (2000). Across the great divide: Bridging the gap between understanding of toddlers' and older children's thinking. Monographs of the Society for Research in Child Development, 65(2), Serial No. 261.

Chi, M. T., Feltovich, P. J., & Glaser, R. (1981). Categorization and representation of physics problems by experts and novices. Cognitive Science, 5, 121-152.

Chi, M. T. H., Glaser, R., & Rees, E. (1982). Expertise in problem solving. In R. J. Sternberg (Ed.), Advances in the psychology of human intelligence (Vol. 1, pp. 7-75). Hillsdale, NJ: Erlbaum.

Chipman, S. F., Schraagen, J. M., & Shalin, V. L. (2000). Introduction to cognitive task analysis. In J. M. Schraagen, S. F. Chipman, & V. L. Shalin (Eds.), Cognitive task analysis (pp. 3-23). Mahwah, NJ: Lawrence Erlbaum Associates.

Chung, G. K. W. K., de Vries, F. L., Cheak, A. M., Stevens, R. H., & Bewley, W. L. (2002). Cognitive process validation of an online problem solving assessment. Computers in Human Behavior, 18, 669-684.

Clement, J. (1988). Observed methods for generating analogies in scientific problem solving. Cognitive Science, 12(4), 563-586.

Cook, T. D., & Campbell, D. T. (1976). The design and conduct of quasi-experimental and true experiments in field settings. In M. D. Dunnette (Ed.), Handbook of industrial and organizational psychology (pp. 223-326). Chicago: Rand McNally.

Cooke, N. J. (1992). Modeling human expertise in expert systems. In R. R. Hoffman (Ed.), The psychology of expertise: Cognitive research and empirical AI (pp. 29-60). Mahwah, NJ: Lawrence Erlbaum Associates.
Cowan, N. (2000). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24, 87-185.

Crandall, B., & Calderwood, R. (1989). Clinical assessment skills of neo-natal intensive care nurses (Report Contract 1-R43-NR01911-01, National Center for Nursing, National Institutes of Health, Bethesda, MD). Fairborn, OH: Klein Associates, Inc.

Crandall, B., & Gamblian, V. (1991). Guide to early sepsis assessment in the NICU. Fairborn, OH: Klein Associates, Inc.

Dey, I. (1993). Qualitative data analysis: A user-friendly guide for social scientists. New York: Routledge.

Dhillon, A. S. (1998). Individual differences within problem-solving strategies used in physics. Science Education, 82, 379-405.

Doane, S. M., Pellegrino, J. W., & Klatzky, R. L. (1990). Expertise in a computer operating system: Conceptualization and performance. Human-Computer Interaction, 5, 267-304.

Doll, J., & Mayr, U. (1987). Intelligenz und Schachleistung—eine Untersuchung an Schachexperten [Intelligence and achievement in chess—a study of chess masters]. Psychologische Beiträge, 29, 270-289.

Dubois, D., & Shalin, V. L. (2000). Describing job expertise using cognitively oriented task analyses (COTA). In J. M. Schraagen, S. F. Chipman, & V. L. Shalin (Eds.), Cognitive task analysis (pp. 41-55). Mahwah, NJ: Lawrence Erlbaum Associates.

Ericsson, K. A. (Ed.). (1996). The road to excellence: The acquisition of expert performance in the arts and sciences, sports and games. Mahwah, NJ: Lawrence Erlbaum Associates.

Ericsson, K. A. (2000). How experts attain and maintain superior performance: Implications for the enhancement of skilled performance in older individuals. Journal of Aging & Physical Activity, 8(4), 366-372.

Ericsson, K. A., & Charness, N. (1994). Expert performance: Its structure and acquisition. American Psychologist, 49(8), 725-747.

Ericsson, K. A., Krampe, R. T., & Tesch-Romer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100, 363-406.

Ericsson, K. A., & Lehmann, A. C. (1996). Expert and exceptional performance: Maximal adaptation to task constraints. Annual Review of Psychology, 47, 273-305.

Ericsson, K. A., Patel, V., & Kintsch, W. (2000). How experts' adaptations to representative task demands account for the expertise effect in memory recall: Comment on Vicente and Wang (1998). Psychological Review, 107(3), 578-592.

Ericsson, K. A., & Smith, J. (Eds.). (1991). Toward a general theory of expertise: Prospects and limits. New York: Cambridge University Press.

Fitts, P. M., & Posner, M. I. (1967). Human performance. Belmont, CA: Brooks-Cole.

Fournier, L. R., Wilson, G. F., & Swain, C. R. (1999). Electrophysiological, behavioral, and subjective indexes of workload when performing multiple tasks: Manipulations of task difficulty and training. International Journal of Psychophysiology, 31, 129-145.

Frensch, P. A., & Sternberg, R. J. (1989). Expertise and intelligent thinking: When is it worse to know better? In R. J. Sternberg (Ed.), Advances in the psychology of human intelligence (Vol. 5, pp. 157-188). Hillsdale, NJ: Lawrence Erlbaum Associates.

Gall, M. D., Borg, W. R., & Gall, J. P. (1996). Educational research: An introduction (6th ed.). White Plains, NY: Longman.

Gevins, A., & Smith, M. E. (2003). Neurophysiological measures of cognitive workload during human-computer interaction. Theoretical Issues in Ergonomics Science, 4(1-2), 113-131.
Giere, R. N. (1993). Cognitive models of science. Psycoloquy, 4(56), Scientific Cognition (1). Accessed online at http://psycprints.ecs.soton.ac.uk/archive/00000350/.

Gobet, F. (1998). Expert memory: A comparison of four theories. Cognition, 66, 115-152.

Gobet, F., & Simon, H. A. (1996). Templates in chess memory: A mechanism for recalling several boards. Cognitive Psychology, 31, 1-40.

Golde, C. M., & Dore, T. M. (2001). At cross purposes: What the experiences of doctoral students reveal about doctoral education. Philadelphia, PA: A report prepared for The Pew Charitable Trusts.

Gordon, S. E. (1992). Implications of cognitive theory for knowledge acquisition. In R. R. Hoffman (Ed.), The psychology of expertise: Cognitive research and empirical AI (pp. 99-120). Mahwah, NJ: Lawrence Erlbaum Associates.

Gorman, B. S., & Allison, D. B. (1997). Statistical alternatives for single-case designs. In R. D. Franklin, D. B. Allison, & B. S. Gorman (Eds.), Design and analysis of single-case research (pp. 159-214). Mahwah, NJ: Lawrence Erlbaum Associates.

Gott, S. P., Hall, E. P., Pokorny, R. A., Dibble, E., & Glaser, R. (1993). A naturalistic study of transfer: Adaptive expertise in technical domains. In D. K. Detterman & R. J. Sternberg (Eds.), Transfer on trial: Intelligence, cognition, and instruction (pp. 258-288). Norwood, NJ: Ablex.

Hankins, T. C., & Wilson, G. F. (1998). A comparison of heart rate, eye activity, EEG and subjective measures of pilot mental workload during flight. Aviation, Space, and Environmental Medicine, 69(4), 360-367.

Harman, H. H. (1967). Modern factor analysis. Chicago: University of Chicago Press.

Hatano, G. (1982). Cognitive consequences of practice in culture specific procedural skills. Quarterly Newsletter of the Laboratory of Comparative Human Cognition, 4, 15-18.

Hatano, G., & Inagaki, K. (1986). Two courses of expertise. In H. Stevenson, H. Azuma, & K. Hakuta (Eds.), Child development and education in Japan (pp. 262-272). San Francisco, CA: Freeman.

Hatano, G., & Inagaki, K. (2000, April). Practice makes a difference: Design principles for adaptive expertise. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA.

Hermans, D., Crombez, G., & Eelen, P. (2000). Automatic attitude activation and efficiency: The fourth horseman of automaticity. Psychologica Belgica, 40(1), 3-22.

Hmelo-Silver, C. E., Nagarajan, A., & Day, R. S. (2002). "It's harder than we thought it would be": A comparative case study of expert-novice experimentation strategies. Science Education, 86, 219-243.

Holbrook, A. (2002, April). Examining the quality of doctoral research. Symposium presented at the annual meeting of the American Educational Research Association, New Orleans, LA.

Holyoak, K. J. (1991). Symbolic connectionism: Toward third generation theories of expertise. In K. A. Ericsson & J. Smith (Eds.), Toward a general theory of expertise: Prospects and limits (pp. 301-335). New York: Cambridge University Press.

Hong, J. C., & Liu, M. C. (2003). A study on thinking strategy between experts and novices of computer games. Computers in Human Behavior, 19, 245-258.

Hooker, K., Nesselroade, D. W., Nesselroade, J. R., & Lerner, R. M. (1987). The structure of intraindividual temperament in the context of mother-child dyads: P-technique factor analyses of short-term change. Developmental Psychology, 23(3), 332-346.
Hulin, C. L., Henry, R. A., & Noon, S. L. (1990). Adding a dimension: Time as a factor in the generalizability of predictive relationships. Psychological Bulletin, 107, 328-340.

Hyoenae, J., Tommola, J., & Alaja, A. (1995). Pupil dilation as a measure of processing load in simultaneous interpretation and other language tasks. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 48A(3), 598-612.

Jonassen, D. H. (2000). Toward a meta-theory of problem solving. Educational Technology: Research and Development, 48(4), 63-85.

Jones, C. J., & Nesselroade, J. R. (1990). Multivariate, replicated, single-subject, repeated measures designs and P-technique factor analysis: A review of intraindividual change studies. Experimental Aging Research, 16, 171-183.

Kahane, H. (1973). Logic and philosophy: A modern introduction (2nd ed.). Belmont, CA: Wadsworth.

Klahr, D. (2000). Exploring science: The cognition and development of discovery processes. Cambridge, MA: The MIT Press.

Klahr, D., & Dunbar, K. (1988). Dual space search during scientific reasoning. Cognitive Science, 12(1), 1-55.

Klahr, D., Fay, A. L., & Dunbar, K. (1993). Heuristics for scientific experimentation: A developmental study. Cognitive Psychology, 24(1), 111-146.

Klahr, D., & Simon, H. A. (1999). Studies of scientific discovery: Complementary approaches and convergent findings. Psychological Bulletin, 125(5), 524-543.

Klahr, D., & Simon, H. A. (2001). What have psychologists (and others) discovered about the process of scientific discovery? Current Directions in Psychological Science, 10(3), 75-79.

Klein, G. A., & Calderwood, R. (1996). Investigations of naturalistic decision making and the recognition-primed decision model (Research Note 96-43). Yellow Springs, OH: Klein Associates, Inc. Prepared under contract MDA903-85-C-0327 for U.S. Army Research Institute for the Behavioral and Social Sciences, Alexandria, VA.

Klein, G. A., Calderwood, R., & MacGregor, D. (1989). Critical decision method for eliciting knowledge. IEEE Transactions on Systems, Man, and Cybernetics, 19, 462-472.

Kosslyn, S. M., Thompson, W. L., Kim, I. J., & Alpert, N. M. (1995). Topographical representations of mental images in primary visual cortex. Nature, 378, 496-498.

Koubek, R. J., & Salvendy, G. (1991). Cognitive performance of super-experts on computer program modification tasks. Ergonomics, 34, 1095-1112.

Labaree, D. F. (2003). The peculiar problems of preparing educational researchers. Educational Researcher, 32(4), 13-22.

Lamberti, D. M., & Newsome, S. L. (1989). Presenting abstract versus concrete information in expert systems: What is the impact on user performance? International Journal of Man-Machine Studies, 31, 27-45.

Larkin, J. H. (1983). The role of problem representation in physics. In D. Gentner & A. L. Stevens (Eds.), Mental models (pp. 75-98). Hillsdale, NJ: Lawrence Erlbaum Associates.

Larkin, J. H. (1985). Understanding, problem representation, and skill in physics. In S. F. Chipman, J. W. Segal, & R. Glaser (Eds.), Thinking and learning skills (Vol. 2): Research and open questions (pp. 141-160). Hillsdale, NJ: Erlbaum.

Larkin, J., McDermott, J., Simon, D. P., & Simon, H. A. (1980a). Expert and novice performance in solving physics problems. Science, 208, 1335-1342.

Larkin, J. H., McDermott, J., Simon, D. P., & Simon, H. A. (1980b). Models of competence in solving physics problems. Cognitive Science, 4(4), 317-345.
Lavelli, M., Pantoja, A. P. F., Hsu, H., Messinger, D., & Fogel, A. (in press). Using microgenetic designs to study change processes. In D. G. Teti (Ed.), Handbook of research methods in developmental psychology. Oxford, UK: Blackwell Publishers.

Lawson, A. E. (1978). Development and validation of the classroom test of formal reasoning. Journal of Research in Science Teaching, 15(1), 11-24.

Lawson, A. E. (2000). Classroom test of scientific reasoning: Multiple choice version (Rev. ed.). Tempe, AZ: Arizona State University.

Lawson, A. E., Clark, B., Cramer-Meldrum, E., Falconer, K. A., Kwon, Y. J., & Sequist, J. M. (2000). The development of reasoning skills in college biology: Do two levels of general hypothesis-testing skills exist? Journal of Research in Science Teaching, 37(1), 81-101.

Leckie, G. J. (1996). Desperately seeking citations: Uncovering faculty assumptions about the undergraduate research process. The Journal of Academic Librarianship, 22, 201-208.

Logan, G. (1988a). Toward an instance theory of automatization. Psychological Review, 95, 492-527.

Logan, G. D. (1988b). Automaticity, resources, and memory: Theoretical controversies and practical implications. Human Factors, 30(5), 583-598.

Logan, G. D., & Cowan, W. (1984). On the ability to inhibit thought and action: A theory of an act of control. Psychological Review, 91, 295-327.

Logan, G. D., Taylor, S. E., & Etherton, J. L. (1996). Attention in the acquisition and expression of automaticity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(3), 620-638.

Lovett, M. C., & Anderson, J. R. (1996). History of success and current context in problem solving: Combined influences on operator selection. Cognitive Psychology, 31, 168-217.

Masunaga, H., & Horn, J. (2001). Expertise and age-related changes in components of intelligence. Psychology and Aging, 16(2), 293-311.

McBurney, D. H. (1998). Research methods (4th ed.). Pacific Grove, CA: Brooks/Cole.

McGuire, W. J. (1997). Creative hypothesis generating in psychology: Some useful heuristics. Annual Review of Psychology, 48, 1-30.

Meissner, C. A., & Memon, A. (2002). Verbal overshadowing: A special issue exploring theoretical and applied issues. Applied Cognitive Psychology, 16, 869-872.

Moray, N., & Reeves, T. (1987). Hunting the homomorph: A theory of mental models and a method by which they may be identified. Proceedings of the International Conference on Systems, Man, and Cybernetics (pp. 594-597).

Nagai, M., Kazai, K., & Yagi, A. (2001). Lambda response by orientation of striped patterns. Perceptual and Motor Skills, 93, 672-676.

Nesselroade, J. R., & Featherman, D. L. (1991). Intraindividual variability in older adults' depression scores: Some implications for development theory and longitudinal research. In D. Magnusson, L. Bergman, G. Rudinger, & Y. B. Torestad (Eds.), Problems and methods in longitudinal research: Stability and change (pp. 7-66). Cambridge: Cambridge University Press.

Neves, D. (1977). An experimental analysis of strategies of the Tower of Hanoi (C.I.P. Working Paper No. 362). Unpublished manuscript, Carnegie Mellon University.

Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.

Nisbett, R. E., & Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84, 231-259.
Onwuegbuzie, A. J., Slate, J. R., Paterson, F. R. A., Watson, M. H., & Schwartz, R. A. (2000). Factors associated with achievement in educational research courses. Research in the Schools, 7(1), 53-65.

Patel, V. L., Kaufman, D. R., & Magder, S. A. (1996). The acquisition of medical expertise in complex dynamic environments. In K. A. Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and sciences, sports and games (pp. 127-165). Mahwah, NJ: Lawrence Erlbaum Associates.

Pedhazur, E. J., & Schmelkin, L. P. (1991). Measurement, design, and analysis: An integrated approach. Hillsdale, NJ: Lawrence Erlbaum Associates.

Perkins, D. N., & Grotzer, T. A. (1997). Teaching intelligence. American Psychologist, 52(10), 1125-1133.

Phillips, L. H., Wynn, V. E., McPherson, S., & Gilhooly, K. J. (2001). Mental planning and the Tower of London task. The Quarterly Journal of Experimental Psychology, 54A(2), 579-597.

Proctor, R. W., & Dutta, A. (1995). Skill acquisition and human performance. Thousand Oaks, CA: Sage Publications.

Radziszewska, B., & Rogoff, B. (1988). Influence of adult and peer collaborators on children's planning skills. Developmental Psychology, 24(6), 840-848.

Radziszewska, B., & Rogoff, B. (1991). Children's guided participation in planning imaginary errands with skilled adult or peer partners. Developmental Psychology, 27(3), 381-389.

Raskin, J. (2000). The humane interface: New directions for designing interactive systems. Boston, MA: Addison-Wesley.

Raven, J. C. (1969). Standard progressive matrices. London: Lewis.

Reder, L. M., & Schunn, C. D. (1996). Metacognition does not imply awareness: Strategy choice is governed by implicit learning and memory. In L. M. Reder (Ed.), Implicit memory and metacognition (pp. 45-77). Mahwah, NJ: Lawrence Erlbaum Associates.

Reingold, E. M., Charness, N., Schultetus, R. S., & Stampe, D. M. (2001). Perceptual automaticity in expert chess players: Parallel encoding of chess relations. Psychonomic Bulletin & Review, 8(3), 504-510.

Rowe, R. M., & McKenna, F. P. (2001). Skilled anticipation in real-world tasks: Measurement of attentional demands in the domain of tennis. Journal of Experimental Psychology: Applied, 7(1), 60-67.

Schaafstal, A., & Schraagen, J. M. C. (2000). Training of troubleshooting: A structured, task analytical approach. In J. M. C. Schraagen, S. F. Chipman, & V. L. Shalin (Eds.), Cognitive task analysis (pp. 57-70). Mahwah, NJ: Lawrence Erlbaum Associates.

Schneider, W., & Fisk, A. D. (1982). Concurrent automatic and controlled visual search: Can processing occur without resource cost? Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 261-278.

Schneider, W., & Shiffrin, R. M. (1977). Controlled and automatic human information processing: I. Detection, search, and attention. Psychological Review, 84(1), 1-66.

Schoenfeld, A. H. (1999). The core, the canon, and the development of research skills: Issues in the preparation of education researchers. In E. C. Lagemann & L. S. Shulman (Eds.), Issues in education research: Problems and possibilities (pp. 166-202). San Francisco, CA: Jossey-Bass.

Schooler, J. W., Ohlsson, S., & Brooks, K. (1993). Thoughts beyond words: When language overshadows insight. Journal of Experimental Psychology: General, 122(2), 166-183.
Schraagen, J. M. C. (1990). How experts solve a novel problem within their domain of expertise (IZF 1990 B-14). Soesterberg, The Netherlands: Netherlands Organization for Applied Scientific Research. Prepared under HDO assignment B89-35 for TNO Institute for Perception, TNO Division of National Defense Research, Soesterberg, The Netherlands.

Schraagen, J. M. C. (1993). How experts solve a novel problem in experimental design. Cognitive Science, 17(2), 285-309.

Schraagen, J. M. C., Chipman, S. F., & Shute, V. J. (2000). State-of-the-art review of cognitive task analysis techniques. In J. M. C. Schraagen, S. F. Chipman, & V. L. Shalin (Eds.), Cognitive task analysis (pp. 467-487). Mahwah, NJ: Lawrence Erlbaum Associates.

Schunn, C. D., & Anderson, J. R. (1998). Scientific discovery. In J. R. Anderson & C. Lebiere (Eds.), The atomic components of thought (pp. 385-427). Mahwah, NJ: Lawrence Erlbaum Associates.

Schunn, C. D., & Anderson, J. R. (1999). The generality/specificity of expertise in scientific reasoning. Cognitive Science, 23(3), 337-370.

Schunn, C. D., & Klahr, D. (1996). The problem of problem spaces: When and how to go beyond a 2-space model of scientific discovery. In G. W. Cottrell (Ed.), Proceedings of the 18th Annual Conference of the Cognitive Science Society (pp. 25-26). Hillsdale, NJ: Erlbaum.

Schunn, C. D., Reder, L. M., Nhouyvanisvong, A., Richards, D. R., & Stroffolino, P. J. (1997). To calculate or not to calculate: A source activation confusion model of problem-familiarity's role in strategy selection. Journal of Experimental Psychology: Learning, Memory, & Cognition, 23(1), 3-29.

Seamster, T. L., Redding, R. E., & Kaempf, G. L. (2000). A skill-based cognitive task analysis framework. In J. M. C. Schraagen, S. F. Chipman, & V. L. Shalin (Eds.), Cognitive task analysis (pp. 135-146). Mahwah, NJ: Lawrence Erlbaum Associates.

Shafto, P., & Coley, J. D. (2003). Development of categorization and reasoning in the natural world: Novices to experts, naïve similarity to ecological knowledge. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29(4), 641-649.

Shalin, V. L., Geddes, N. D., Bertram, D., Szczepkowski, M. A., & DuBois, D. (1997). Expertise in dynamic, physical task domains. In P. J. Feltovich, K. M. Ford, & R. R. Hoffman (Eds.), Expertise in context (pp. 195-217). Menlo Park, CA: American Association for Artificial Intelligence Press.

Shiffrin, R. M. (1988). Attention. In R. C. Atkinson, R. J. Herrnstein, G. Lindzey, & R. D. Luce (Eds.), Stevens' handbook of experimental psychology (2nd ed., pp. 739-811). New York: Wiley.

Shiffrin, R. M., & Dumais, S. T. (1981). The development of automatism. In J. R. Anderson (Ed.), Cognitive skills and their acquisition (pp. 111-140). Hillsdale, NJ: Erlbaum.

Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: II. Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127-190.

Simonton, D. K. (1999). Talent and its development: An emergenic and epigenetic model. Psychological Review, 106, 435-457.

Singley, M. K., & Anderson, J. R. (1989). Transfer of cognitive skill. Cambridge, MA: Harvard University Press.

Sloutsky, V. M., & Yarlas, A. (2000). Problem representation in experts and novices: Part 2. Underlying processing mechanisms. Proceedings of the XXII Annual Conference of the Cognitive Science Society (pp. 475-480). Mahwah, NJ: Erlbaum.

Stankov, L., & Crawford, J. D. (1993). Ingredients of complexity in fluid intelligence. Learning and Individual Differences, 5, 73-111.
Stankov, L., & Raykov, T. (1993). On task complexity and "simplex" correlation matrices. Australian Journal of Psychology, 45, 125-145.

Stankov, L., & Raykov, T. (1995). Modeling complexity and difficulty in measures of fluid intelligence. Structural Equation Modeling, 2, 335-366.

Starkes, J. L., Deakin, J. M., Allard, F., Hodges, N. J., & Hayes, A. (1996). Deliberate practice in sports: What is it anyway? In K. A. Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and sciences, sports, and games (pp. 81-106). Mahwah, NJ: Lawrence Erlbaum Associates.

Sternberg, R. J. (1997). Cognitive conceptions of expertise. In P. J. Feltovich, K. M. Ford, & R. R. Hoffman (Eds.), Expertise in context (pp. 149-162). Menlo Park, CA: American Association for Artificial Intelligence Press.

Sternberg, R. J., Grigorenko, E. L., & Ferrari, M. (2002). Fostering intellectual excellence through developing expertise. In M. Ferrari (Ed.), The pursuit of excellence through education (pp. 57-83). Mahwah, NJ: Lawrence Erlbaum Associates.

Sternberg, R. J., & Horvath, J. A. (1998). Cognitive conceptions of expertise and their relations to giftedness. In R. C. Friedman & K. B. Rogers (Eds.), Talent in context (pp. 177-191). Washington, DC: American Psychological Association.

Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12, 257-285.

Sweller, J. (1989). Cognitive technology: Some procedures for facilitating learning and problem solving in mathematics and science. Journal of Educational Psychology, 81(4), 457-466.

Sweller, J. (1994). Cognitive load theory, learning difficulty, and instructional design. Learning & Instruction, 4(4), 295-312.

Sweller, J., Chandler, P., Tierney, P., & Cooper, M. (1990). Cognitive load as a factor in the structuring of technical material. Journal of Experimental Psychology: General, 119(2), 176-192.

Thagard, P. (1998). Ulcers and bacteria: I. Discovery and acceptance. Studies in the History and Philosophy of Biology and Biomedical Sciences, 9, 107-136.

Torff, B. (2003). Developmental changes in teachers' use of higher order thinking and content knowledge. Journal of Educational Psychology, 95(3), 563-569.

VanLehn, K. (1996). Cognitive skill acquisition. Annual Review of Psychology, 47, 513-539.

VanLehn, K., Siler, S., Murray, C., Yamauchi, T., & Baggett, W. B. (2003). Why do only some events cause learning during human tutoring? Cognition & Instruction, 21(3), 209-249.

Velmahos, G., Toutouzas, K., Sillin, L., Chan, L., Clark, R. E., Theodorou, D., Maupin, F., Murray, J., Sullivan, M., Demetriades, D., & DeMeester, T. M. (2002, April). Cognitive task analysis for teaching technical skills in an animate surgical skills laboratory: A randomized controlled trial with pertinent clinical outcomes. Presented at the annual meeting of the Association for Surgical Education, Baltimore, MD.

Vicente, K. J. (2000). Revisiting the constraint attunement hypothesis: Reply to Ericsson, Patel, and Kintsch (2000) and Simon and Gobet (2000). Psychological Review, 107(3), 601-608.

Voss, J. F., Tyler, S. W., & Yengo, L. A. (1983). Individual differences in the solving of social science problems. In R. F. Dillon & R. R. Schmeck (Eds.), Individual differences in cognition (Vol. 1, pp. 205-232). New York: Academic Press.

Wegner, D. M. (2002). The illusion of conscious will. Cambridge, MA: MIT Press.
Wheatley, T., & Wegner, D. M. (2001). Automaticity of action, psychology of. In N. J. Smelser & P. B. Baltes (Eds.), International encyclopedia of the social and behavioral sciences (pp. 991-993). Oxford, UK: Elsevier Science Limited.

Williams, K. E. (2000). An automated aid for modeling human-computer interaction. In J. M. C. Schraagen, S. F. Chipman, & V. L. Shalin (Eds.), Cognitive task analysis (pp. 165-180). Mahwah, NJ: Lawrence Erlbaum Associates.

Wilson, G. F., & Fisher, F. (1995). Cognitive task classification based upon topographic EEG data. Biological Psychology, 40, 239-250.

Wilson, T. D., & Nisbett, R. E. (1978). The accuracy of verbal reports about the effects of stimuli on evaluations and behavior. Social Psychology, 41(2), 118-131.

Wood, P., & Brown, D. (1994). The study of intraindividual differences by means of dynamic factor models: Rationale, implementation, and interpretation. Psychological Bulletin, 116(1), 166-186.

Yang, S. C. (in press). Reconceptualizing think-aloud methodology: Refining the encoding and categorizing techniques via contextualized perspectives. Computers in Human Behavior, 1-21.

Yarlas, A., & Sloutsky, V. M. (2000). Problem representation in experts and novices: Part 1. Differences in the content of representation. Proceedings of the XXII Annual Conference of the Cognitive Science Society (pp. 1006-1011). Mahwah, NJ: Erlbaum.

Zeitz, C. M. (1997). Some concrete advantages of abstraction: How experts' representations facilitate reasoning. In P. J. Feltovich, K. M. Ford, & R. R. Hoffman (Eds.), Expertise in context (pp. 43-65). Menlo Park, CA: American Association for Artificial Intelligence Press.