Running Head: ACCURACY OF EXPERT SELF-REPORT
The Role of Automaticity in Experimental Design and Analysis: A Study of Novice-Expert Differences in the Accuracy of Self-Report
David F. Feldon
Rossier School of Education
University of Southern California
Dissertation Proposal
The Role of Automaticity in Experimental Design and Analysis: A Study of Novice-Expert Differences in the Accuracy of Self-Report
“There are a number of assumptions that a skilled researcher uses when doing
research. Often, they can't even articulate what they are, but they practice them.
The [expert researcher] model requires a long process of acculturation, an in-depth knowledge of the discipline, awareness of important scholars working in
particular areas, participation in a system of informal scholarly communication,
and a view of research as a non-sequential, nonlinear process with a large degree
of ambiguity and serendipity. The expert researcher is relatively independent, and
has developed his or her own personal [research] strategies” (Leckie, 1996, p.
202).
This view of experimental research reflects several critical concerns in the
preparation of Ph.D. students. One of the most challenging aspects of graduate education
in the social sciences is the teaching of research skills (Labaree, 2003; Schoenfeld, 1999).
While there are many instructional texts on the process of experimental research (e.g.
Gall, Borg, & Gall, 1996; McBurney, 1998; Pedhazur & Schmelkin, 1991) and an
emphasis on personal advisement and cognitive apprenticeship in the advanced stages of
graduate study (Golde & Dore, 2001), there have been increasing levels of concern about
the quality of research skills that students develop through their doctoral programs
(Adams & White, 1994; Holbrook, 2002).
The development of social science research skills at the graduate level often
begins with course-based instruction, which has been found to yield highly variable
levels of skill mastery and self-efficacy (Onwuegbuzie, Slate, Paterson, Watson, &
Schwartz, 2000). The content of these courses is usually presented through assigned
readings and lectures by an instructor. In each case, the strategies for thinking about the
research process are ultimately dependent on the reflections of a researcher describing his
own practice—from the instructor directly, through the assigned readings, or a
combination thereof.
Likewise, in the case of cognitive apprenticeships, the mentor’s role is to model
and explain his or her own approach to problems considered together with the student. In
their studies of cognitive apprenticeships, Radziszewska and Rogoff (1988; 1991) report
that successful learning by the student is dependent on accurate, comprehensible
explanations of strategies by the mentor and the opportunity to participate in decisions
during authentic tasks. As the decision-making and strategic components of the research
process are entirely cognitive, direct observation by the student is severely limited. Thus,
at every stage of training in experimental research, the student is dependent on the self-report of (ideally) an expert in the field.
However, the accuracy of self-report under many conditions is considered highly
suspect (e.g. Minsky, 1981; Nisbett & Wilson, 1977; Schneider & Shiffrin, 1977; Wilson
& Nisbett, 1978). Consequently, a deeper, more objective understanding of cognitive
research skills must be developed to assist students in their transition to skilled
researchers. Ultimately, to better scaffold the skill acquisition of developing researchers,
an accurate understanding and representation of expert strategies must emerge—
specifically, the conceptualization of problems for inquiry, the formulation of
experimental designs, and the analytical skills utilized in the interpretation of results
(Hmelo-Silver, Nagarajan, & Day, 2002; McGuire, 1997).
Purpose of the Study
The purpose of this study is threefold:
1. To accurately identify cognitive skills and strategies that research experts use in
experimental research and differentiate them from novice techniques.
2. To evaluate experts’ abilities to accurately report their own problem-solving
processes.
3. To demonstrate that automaticity is a fundamental characteristic of expert
performance in this domain.
Review of the Literature
In order to meaningfully explore the problems posed above, it is first necessary to
ground several implicit assumptions in the foundation of prior empirical research.
Specifically, these include the assertions that (a) scientific research skills are acquirable
and represent predominantly crystallized intelligence, (b) expertise in scientific research
is distinguishable from lower levels of skill, both in an individual’s ability to successfully
solve a problem and in a qualitative distinction between expert and novice strategies, and
(c) skills that are practiced extensively automate, such that the mental effort necessary to
perform a procedure is minimal.
The support for each assumption will emerge from reviews of the empirical
evidence in the following categories: scientific problem solving, expertise, automaticity
of procedural knowledge, and accuracy in self-report. Because the arguments draw upon
findings from a variety of overlapping research agendas with different theoretical
frameworks, multiple terms will sometimes be used to describe a single construct and the
use of identical terms will sometimes be differentiated to clarify important distinctions
that exist between researchers. Most notably, the term “domain-general skills” has been
used to describe both heuristic procedures applicable in multiple domains that have been
acquired through experience (see Anderson, 1987) and the application of fluid
intelligence to novel situations (see Perkins & Grotzer, 1997). To avoid confusion,
domain-general skills will be considered heuristic procedures that can be used
independent of knowledge of a particular subject to solve a problem. In contrast,
domain-specific skills indicate those that actively utilize acquired knowledge of a
particular domain to generate solutions specific to a problem.
Scientific Problem Solving
In the study of problem solving, the search for solutions is often referred to as the
navigation of a problem space, in which the initial state and the goal state are the starting
and ending points, and the space is composed of all possible routes to move from the
former to the latter (Newell & Simon, 1972). However, because scientific problem
solving requires attention to both a hypothesis or research question and an experimental
design that controls sources of variance, Klahr and Dunbar (1988) argue that there are in
fact two adjacent, mutually influential problem spaces which must be searched.¹ After
the selection of a particular experimental design and the generation of data, the scientist
must evaluate the progress that has been made in the hypothesis space towards the goal of
null hypothesis rejection. Thus, each successful navigation of the experiment space
results in incremental progress within the hypothesis space. Given the vast number of
possible steps within each problem space and the exponential increase in the size of the
search space for each additional step, Klahr and Simon (2001) note that “much of the
training in scientists is aimed at increasing the degree of well-definedness of problems in
their domain” (p. 76). In their study, for example, subjects were provided with
programmable “rocket ships” that included a button that produced an
unknown function. Their task was to determine the function of this button through
experimentation. As the participants engaged in the activity, they simultaneously worked
to design experiments that would isolate its role as well as generate hypotheses that could
be used to describe the rule-governed function and be tested for confirmation.
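To make the dual-space account concrete, the following sketch simulates a coordinated search in which each choice from the experiment space prunes the hypothesis space. The device rule, candidate hypotheses, and programs are hypothetical illustrations (written in Python), not materials from the studies cited.

```python
# A minimal sketch of dual-space search in the spirit of Klahr and Dunbar
# (1988). The hidden device rule, hypothesis set, and experiment set are
# hypothetical illustrations, not materials from the cited studies.

def device(program):
    """Hidden rule: the mystery button repeats the last step of the program."""
    return program + [program[-1]]

# Hypothesis space: candidate rules for what the button does.
hypotheses = {
    "repeats the last step": lambda p: p + [p[-1]],
    "repeats the whole program": lambda p: p + p,
    "clears the program": lambda p: [],
}

# Experiment space: candidate programs to run on the device.
experiments = [["forward"], ["forward", "turn"], ["turn", "turn", "fire"]]

# Coordinated search: each experiment prunes the set of viable hypotheses.
viable = dict(hypotheses)
for program in experiments:
    observed = device(program)
    viable = {name: rule for name, rule in viable.items()
              if rule(program) == observed}
    print(f"After running {program}: viable hypotheses = {list(viable)}")
# The one-step program cannot discriminate the first two hypotheses; the
# two-step program can, which is why the search of the experiment space
# matters as much as the search of the hypothesis space.
```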
In the study of high-level scientific problem solving, several significant
qualitative differences have been observed in performance between experts and novices
that have been consistent across studies and specific domains. Specifically, novices in a
given task will rely on heuristic strategies, general skills that can be adaptively applied
across domains, such as means-ends analyses and backward reasoning. In contrast,
experts utilize domain-specific skills grounded in knowledge of concepts and procedures
acquired through practice within a domain (Anderson, 1993; Newell & Simon, 1972).
¹ Thagard (1998) argued for the existence of a third problem search space for the selection of instruments used within experiments, given the key role that advances in instrumentation played in the understanding of bacteria’s role in ulcer formation. However, it can be argued that instrument selection is encompassed in the attainment of sub-goals within the experimental design problem space. Further, Schunn and Klahr (1996) present criteria for instances when it might be appropriate to go beyond a two-space model of scientific problem solving: (a) additional spaces should involve search of different goals and entities; (b) the spaces should differ empirically from one another; and (c) spaces should be representable in a computational model that considers each space as distinct. In the current study, not only do none of these criteria apply, but the ability to modify instrumentation lies beyond the capabilities of its apparatus.
Even when an expert encounters novel problems within his domain of expertise that limit
the extent to which strong methods can be used, the heuristic approaches that are
employed are fundamentally more adaptive than novices’ weak methods, as they utilize
an elaborate knowledge base (Schraagen, 1990). For example, McGuire (1997) provides
an extensive list of heuristics that are specific to the navigation of the hypothesis problem
space but require training in experimental design prior to successful use.
In contrast, true domain-general skills are applied when a structured knowledge
base is not available. Means-ends analysis involves a general attempt to reduce observed
differences between the problem state and the goal state. This strategy manifests itself as
either an attempt to metaphorically move in as straight a line as possible toward the goal
state (i.e., hill climbing; Lovett & Anderson, 1996) or the identification and pursuit of
nested sub-goals that must be met prior to the attainment of the end goal. For example,
studies of novice performance on the Tower of Hanoi task have reported that subjects
will not generate a rule-based strategy for solving the problem efficiently. Instead, they
work backwards from the goal state, identifying the move necessary to achieve the main
goal, then identifying the move necessary to enable the first (i.e., a sub-goal), and
so on. Neves (1977; as cited in Anderson, 1993, p. 37) provides a clear illustration
through the verbal reasoning of one subject: “The 4 has to go to the 3, but the 3 is in the
way. So you have to move the 3 to the 2 post. The 1 is in the way there, so you move the
1 to the 3.” More recent work by Phillips, Wynn, McPherson, and Gilhooly (2001)
further indicates that even when novice subjects are instructed to preplan a strategy to
solve the problem (in this case, a slight variation of the Tower of Hanoi, dubbed the
Tower of London) before attempting it, there were no significant differences between
their speed and accuracy of performance and those of subjects not provided with the
opportunity to develop a plan before beginning. Further, when asked to report their
intermediary sub-goals, they were able to accurately report only up to two moves ahead
in a manner similar to the Neves (1977) subject. From this, the authors conclude that the
problem-solving method was indicative of a means-ends approach.
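The subgoaling pattern in the Neves (1977) protocol can be made concrete with a goal-stack solver in which every blocked move pushes the subgoals of clearing its blockers, mirroring reasoning of the form "the 3 is in the way, so move the 3 to the 2 post." The sketch below (Python) is a hypothetical illustration of the strategy, not code from the studies cited.

```python
# A goal-stack sketch of the means-ends (subgoaling) strategy described
# above. Hypothetical illustration, not code from the cited studies.
def solve_by_subgoals(n_disks):
    pegs = {"A": list(range(n_disks, 0, -1)), "B": [], "C": []}
    goals = [(d, "C") for d in range(1, n_disks + 1)]  # disk n is handled first
    moves = []
    while goals:
        disk, target = goals.pop()
        src = next(p for p, stack in pegs.items() if disk in stack)
        if src == target:
            continue
        spare = ({"A", "B", "C"} - {src, target}).pop()
        # "X is in the way": smaller disks on the source or target peg block the move.
        blockers = [d for d in range(1, disk) if d in pegs[src] + pegs[target]]
        if blockers:
            goals.append((disk, target))                # retry after clearing
            goals.extend((d, spare) for d in blockers)  # largest blocker is popped first
        else:
            pegs[src].remove(disk)
            pegs[target].append(disk)
            moves.append((disk, src, target))
    return moves

print(solve_by_subgoals(3))  # the seven optimal moves, reached via nested subgoals
```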
Similarly, when novices attempt problem solving tasks within a scientific domain,
they also exhibit difference reduction and subgoaling behavior. In their seminal study of
physics problem solving, Larkin, McDermott, Simon, and Simon (1980a, 1980b)
described novice physics students reasoning backwards from the required solution by
determining which equation would yield an appropriate answer and then attempting to
utilize the givens provided, whereas physics experts initiated the problem solving process
by formulating a conception of the situation on the basis of physics principles and
available specifics, generating the solution by manipulating the mental model to yield the
appropriate answer.
The distinction between novice and expert approaches to the conceptualization of
physics problems was also observed in problem sorting tasks, in which expert
participants consistently categorized problems according to underlying principles
represented in the prompts in contrast to novices, who paid greater heed to surface
features such as apparatus and available facts (Chi, Feltovich, & Glaser, 1981). More
recently, an extensive study of individual differences in physics problem solving
replicated and extended these findings, identifying the prominence of principle
identification as a factor in the strategy selection of experts (Dhillon, 1998). Expert
utilization of theoretical conceptualizations had the benefit of activating pre-existing
mental models that represented both the relevant pieces of information that were
presented and the abstract relationships between elements in the mental model (Glaser &
Chi, 1988; Larkin, 1985). This abstract representation also scaffolded the search for
missing information that the model would otherwise incorporate. In contrast, novices
who lacked an adaptive model relied on surface-level details and iterative hypothesis
testing that generated a mental model that was more firmly bound to the concrete
representation of the situation as it was presented (Lamberti & Newsome, 1989).
As these differences are reliable and do not emerge until after high level
attainment within the problem solving domain, it follows that they represent a distinct set
of skills that are not intuitive or available to the novice problem solver. Singley and
Anderson (1989) demonstrated that not only do experts characteristically use strong (i.e.
domain-specific) methods in their problem solving, but less experienced problem solvers
can also learn to use them successfully. Further, in another of Anderson’s studies, it was
determined that differences in scientific performance were not significantly attributable to
individual differences in fluid intelligence (Schunn & Anderson, 1998). This finding
replicated evidence from other skilled domains indicating that, after five years of professional experience, intelligence and performance are not reliably correlated (Ceci &
Liker, 1986; Doll & Mayr, 1987; Ericsson & Lehmann, 1996; Hulin, Henry, & Noon,
1990; Masunaga & Horn, 2001).
Chen and Klahr (1999) also describe the high level of impact that training in the
strategic control of experimental variables had on science students’ ability to generate
and execute valid scientific inquiries. However, they describe the control of variables
strategy (CVS) as a domain-general skill, because it can apply with equal success to
specific studies in physics or any other science content domain. It is suggested here,
however, that experimental design can be recognized as an independent domain that is
specific to endeavors within the experiment problem space. While domain general skills
will certainly impact performance in this area to some extent as they would in any
situation where relevant factors are not known, designing informative experiments
“requires domain-general knowledge about one's own information-processing limitations,
as well as domain-specific knowledge about the pragmatic constraints of the particular
discovery context” (Klahr, Fay, & Dunbar, 1993, p. 114). For example, Schunn and
Anderson (1999) differentiate between domain experts and “task” experts for their study
of scientific reasoning in a memory experiment design task. Domain experts were
university faculty in psychology whose research agendas were directly related to the
problem presented, whereas task experts were also psychology research faculty but
specialized in topics unrelated to memory. Although there were some significant
differences in experimentation strategies between the two groups of experts (to be
expected in light of the inherent confluence of hypothesis and experiment problem
spaces), the performance of both groups was consistently superior to that of undergraduates
who had completed a course in experimental design in psychology. Given these findings
and the relationship between expertise and practice discussed in the section following, it
is reasonable to conclude that skills and strategies specific to scientific problem solving
are fundamentally acquirable. As such, differences in performance are expected to be
more directly linked to deliberate practice of skill acquisition than to measures of fluid
intelligence.
Expertise
The past several decades have yielded a burgeoning body of work on the subject
of expertise (Patel, Kaufman, & Magder, 1996). While there is more than a century of
research on skill acquisition (see Proctor & Dutta, 1995 for an extensive review),
relatively recent emphasis has emerged on the characteristics of experts that are common
across domains. Such work has predominantly emphasized two general issues in high
level performance: the identification of cognitive processes generalizable to all expert
performances and the factors contributing to the acquisition of expert-level skill. This
review will examine the development and current state of expertise research and make
several arguments regarding the cogency of its theoretical construct and future directions
for research.
Despite several ongoing debates in the theory of expertise, a number of reliable
representative characteristics have emerged. Glaser (1988) elucidated seven oft-cited
attributes that characterize the performance of most experts. These observations, which
are drawn from generalizations of a number of studies in the late 1970s and 1980s, have
helped to shape the development of the field, despite a lack of definitive analysis
regarding the extent to which each may be necessary or sufficient for the expertise
construct:
1. Experts excel mainly in their own domains.
2. Experts perceive large meaningful patterns in their domain.
3. Experts are fast; they are faster than novices at performing the skills of their
domain, and they quickly solve problems with little error.
4. Experts have superior short-term and long-term memory.
5. Experts see and represent a problem in their domain at a deeper (more
principled) level than novices; novices tend to represent a problem at a
superficial level.
6. Experts spend a great deal of time analyzing a problem qualitatively.
7. Experts have strong self-monitoring skills.
In more recent research, Ericsson (1996; Ericsson & Lehmann, 1996) has
advanced two additional characteristics of expertise. First, he has noted that expertise in a
given domain typically requires a minimum of ten years of deliberate practice to develop.
Extending the original findings of Simon and Chase (1973) suggesting that a decade was
the minimum amount of experience necessary to gain chess mastery, this idea has been
further elaborated and supported by the findings of several important investigations in
various domains (Charness, Krampe, & Mayr, 1996; Simonton, 1999). Deliberate
practice is considered to be highly effortful, intended to improve performance, not
inherently motivating, and not intended to attain any goal beyond continued skill
development (Ericsson & Charness, 1994; Starkes, Deakin, Allard, Hodges, & Hayes,
1996). Second, he has described an expert process as one exhibiting a “maximal
adaptation to task constraints.” Such constraints include the physical limitations of the
human body and the demands of the laws of physics, as well as the functional rules that
are associated with the task (e.g., the rules of chess, established flight paths, etc.), and the
limitations of short-term memory and other cognitive functions (Casner, 1994; Vicente,
2000).² It is the asymptotic approach to these constraints, Ericsson argues, that allows experts to succeed where others fail. Due to their extensive practice and skill refinement, they have to a great extent shaped the development of their physiological (e.g. density of blood vessels in top athletes; Ericsson, Krampe, & Tesch-Romer, 1993) and cognitive (e.g. working memory limitations; Ericsson & Kintsch, 1995) mechanisms required for performance in the domain of expertise.
² It is notable that task constraints do not include an individual’s intelligence. Data from a number of studies have indicated that expert performance is not significantly related to measures of general or fluid ability (Ceci & Liker, 1986; Doll & Mayr, 1987; Ericsson & Lehmann, 1996; Hulin, Henry, & Noon, 1990; Masunaga & Horn, 2001).
The Role of Knowledge in Expertise. Historically, several broad aspects of expert
performance have been examined, with each emphasizing a distinct aspect of cognition.
One major approach focuses on experts’ extensive knowledge of domain-relevant
information and the ability to recall it in appropriate situations. Chase and Simon’s
(1973) classic work in the memory performance of chess masters suggested that quantity
of knowledge was considered the foundational component of expertise. Their findings
indicated that experts had vastly superior memory for the locations of realistically-placed
chess pieces in briefly presented stimuli relative to novices, but equivalent recall ability
for randomly-placed pieces and chess-unrelated stimuli under equivalent conditions.
From this, they concluded that expert performance on selected tasks depended on those
tasks falling within their domain of mastery and being representative of the tasks
performed during normal participation in the activity. Further, the increased speed and
capacity that they seemed to demonstrate was attributed to the recognition of previous
situations encountered within the domain that were equivalent to the tasks presented.
This suggested that expertise was in large part a benefit of extensive experience within a
domain from which subjects could recall previously successful solutions and deploy them
quickly and consistently. Such findings have been consistently replicated in a wide array
of domains, including tennis (Beilock, Wierenga, & Carr, 2002) and botany (Alberdi,
Sleeman, & Korpi, 2000).
Later work has also analyzed the organization of expert knowledge structures and
differentiated them from novice representations on the basis of levels of detail,
differentiation, and level of principled abstraction. For example, Chi, Feltovich, and
Glaser (1981) examined expert and novice performance in physics problem-sorting tasks
and observed that the categories identified by experts were based on fundamental
principles on which the problem solutions relied. In contrast, novices conceptualized the
problems based on their surface-level details, such as the presence of pulleys or inclined
planes. Similarly, Adelson (1981) found that novice programmers categorized lines of
code according to syntax, whereas experts utilized functional or semantic aspects.
High recall performance in experts has also been linked to principled conceptual
organization. In additional chess studies, it has been found that providing conceptually descriptive information about the locations of chess pieces in a game before or after the visual presentation of the board generates even higher levels of expert recall than a visual presentation-only condition, suggesting that memory performance is
linked to more abstract cognitive representations (Cooke, Atlas, Lane, & Berger, 1993).
The level of conceptual abstraction in expert reasoning has also been explained as a
“comfortable, efficient compromise…that is optimal” for expert-level performance in a
specific domain (Zeitz, 1997, p. 44). This “compromise” represents a suitable chunking
size and schematic framework to facilitate the establishment of appropriate links between
the concrete elements of a particular problem and the more general concepts and
principles that the expert has acquired through experience in the domain. This
framework facilitates a number of knowledge-related traits associated with expertise,
specifically an expert’s ability to recognize sophisticated patterns and enhanced
performance for recall of salient details in given situations.
The Role of Working Memory in Expertise. A second account focuses primarily
on the superior working memory performance of experts when working in their domain.
Extensive evidence indicates that experts are able to process much more information in
working memory than is possible under normal circumstances (cf. Baddeley, 1986).
Evolving from the initial theory of chunking provided by Chase and Simon (1973), in
which experts were believed to represent large, familiar perceptual patterns held in long
term memory as single chunks that could be elaborated rapidly in short term memory,
several newer theories have developed that are better able to account for findings
suggesting that experts’ extraordinary recall of information from domain-specific
problems is not impaired by disruptive short term memory tasks, despite the theory’s
expectation that chunks are held in short term memory (Gobet, 1998; Vicente & Wang,
1998). Long term working memory theory (LTWM; Ericsson & Kintsch, 1995),
template theory (Gobet & Simon, 1996), and the constraint attunement hypothesis (CAH;
Vicente & Wang, 1998) have suggested that as a result of continued practice within a
domain, schematic structures within long term memory can be used to not only facilitate
access to existing declarative knowledge as discussed previously, but also to functionally
augment the limited capacity of short term memory when considering domain-relevant
problems.
LTWM suggests that experts develop domain-specific representation mechanisms
in long term memory that reflect the structure of the primary domain tasks themselves,
allowing for the rapid encoding and retrieval of stimuli from relevant tasks. Such a
model can account not only for experts’ exceptional recall abilities of domain-relevant
situations in general, but also for expanded working memory capacity during expert
performance (Ericsson & Kintsch, 1995). Gobet (1998) extrapolates two possible
manifestations of the LTWM theory in an attempt to account for a number of empirical
findings. The first representation, referred to as the “square version” (p. 125), suggests
that the LTWM structure for chess experts manifests itself directly in the form of a 64-square schematic chess board. In this conception, encoding is therefore contingent on
appropriately compatible stimuli for the format. The second possible representation,
dubbed the “hierarchy interpretation” (p. 125), constructs a different conceptualization of
the original theory to allow for encoding that is not contingent on format and establishes
that “in preference to storing pieces in squares, experts store schemas and patterns in the
various levels of the retrieval structure” (p. 125).
Contrasting with LTWM theory, template theory (Gobet & Simon, 1996) does not
completely reject the chunking component originally established by Chase and Simon
(1973). Instead, in cases of extensive practice, associated templates augment a single
large chunk with slots that could represent related but variable items, retrievable through
a short term memory trace mechanism. The creation of these slots occurs when a
minimum number of semantically related elements occur in similar relationships below
the node representing the chunk in short term memory. Thus, slots could be occupied by
common component categories that vary depending on the particular situation, such as
strategy features or, in the case of chess, players associated with the particular approach.
The constraint attunement hypothesis critiques LTWM theory by arguing that it
accounts primarily for data that was generated by experts in domains for which
memorization is an intrinsic element. In an attempt to provide a more generalizable
theory, Vicente and Wang (1998) suggest that the appropriate structures in long term
memory to facilitate enhanced working memory performance are representations of the
task constraints that govern performance within the domain. An abstraction of the task in
this format, they argue, allows the goal structure to serve as the encoding representation
for rapid retrieval. In essence, the hierarchy interpretation of LTWM theory elaborated
by Gobet (1998) provides a comparable role, except that the hierarchical representation is
structured according to the goals of the task, rather than the structural features. This
allows the authors to predict the magnitude of expertise-enhanced memory performance
on the basis of the number of available constraints, in that the higher the number of
authentic task constraints, the more optimally a constraint-attuned framework in long
term memory can be utilized to expand working memory capacity.
The Role of Strategy in Expertise. The third framework for expertise grounds
performance in qualitative differences in problem solving strategies between experts and
novices. Consistent early findings in the study of physics problem-solving skills indicate
that experts call on domain knowledge to approach problems through forward reasoning
processes in which they are represented conceptually and approached strategically on the
basis of the given factors (Chi, et al., 1981; Chi, Glaser, & Rees, 1982; Larkin,
McDermott, Simon, & Simon, 1980a, 1980b). Such “strong methods” involve
developing a highly principled representation of the problem that through manipulation
yields an appropriate solution (Singley & Anderson, 1989). Novices on the other hand,
utilize “weak method” heuristics that begin with identification of the goal state and
reason backwards to identify relevant given information and approaches that will
generate the necessary outcome (Lovett & Anderson, 1996). Further, the development of
expertise entails a progression from general, “weak-method” heuristics to feedback-refined procedures that have integrated domain-specific knowledge (Anderson, 1987).
Such differences between expert and novice performance are robust, even when
novices are instructed to develop a strategy before attempting a solution. Phillips, Wynn,
McPherson, and Gilhooly (2001) found that despite extensive preplanning, novices
exhibited no significant differences in their speed or accuracy of performance when
compared with those not provided with the opportunity to develop a plan before
beginning. Further, when asked to report their intermediary sub-goals, they were able to
accurately report only up to two moves ahead, thereby demonstrating a means-ends
approach rather than a forward strategy. Similarly, Larkin, et al. (1980a, 1980b)
described novice physics students reasoning backwards from the required solution by
determining which equation would yield an appropriate answer and then attempting to
utilize the givens provided, whereas physics experts initiated the problem solving process
by formulating a conception of the situation on the basis of physics principles and
available specifics, generating the solution by manipulating the mental model to yield the
appropriate answer.
More recently, an extensive study of individual differences in physics problem
solving replicated and extended these findings, identifying the prominence of principle
identification as a factor in the strategy selection of experts (Dhillon, 1998). Expert
utilization of theoretical conceptualizations had the benefit of activating pre-existing
mental models that represented both the relevant pieces of information that were
presented and the abstract relationships between elements in the mental model (Glaser &
Chi, 1988; Larkin, 1985). This abstract representation also supported the search for
missing information that the model would otherwise incorporate. In contrast, novices
who lacked an adaptive model relied on surface-level details and iterative hypothesis
testing that generated a mental model that was more firmly bound to the concrete
representation of the situation as it was presented (Lamberti & Newsome, 1989). Even
when it has been determined that novices perceive the deeper principles underlying a
problem, their solutions rely nearly exclusively on surface features (Sloutsky & Yarlas,
2000; Yarlas & Sloutsky, 2000).
Investigations of expert problem-solving processes in scientific experimentation
and design have also provided clear illustrations of this phenomenon (Hmelo-Silver,
Nagarajan, & Day, 2002). In Hmelo-Silver, et al.’s (2002) study, experts and novices in
the domain of clinical trial design used a simulation to demonstrate a hypothetical drug’s
suitability for medical use. Throughout a number of repeated trials, those subjects with
extensive experience in the domain consistently used analogies to past experiences in
their verbal protocols to reason abstractly about the process and outcomes. Additionally,
they were highly reflective about the effectiveness of particular strategies in relation to
their progress toward the goals of the task. In contrast, novices rarely used analogies and
did not typically have the cognitive resources available for reflection while engaged in
the task.
The Relevance of Automaticity. The three frameworks discussed above each yield
a common result: Ultimately, each aspect of expert performance improves the cognitive
efficiency of the problem solving process. This phenomenon not only emerges as a result
of acquired expertise, but also further improves performance by freeing up cognitive
resources to accommodate atypical features or other added cognitive demands that may
arise within a task (Bereiter & Scardamalia, 1993; Sternberg & Horvath, 1998).
In investigations of skill acquisition, it has been found that individuals with a high
level of practice in a procedure can perform it at increasingly high speeds and with
minimal mental effort (Anderson, 1982; Logan, 1988). Thus, highly principled
representations of domain-specific problems can be used in fast, effortless performance
by a subject with a large and well-structured knowledge base and extensive practice
component skills. However, the procedure itself becomes more ingrained and extremely
difficult to change to the extent that both goals and processes can manifest without
conscious activation (Bargh & Ferguson, 2000). As such, once a skill has been
automated, it no longer operates in such a way that it is available to conscious
monitoring, and it tends to run to completion without interruption, further limiting the
ability to modify performance (Wheatley & Wegner, 2001).
In contrast, adaptive experts are highly successful even under novel conditions.
Bereiter and Scardamalia (1993) observed that often when experts have automated
procedures within their domain, their skills are highly adaptable to complex, ill-structured, and novel situations, because minimal space in working memory is occupied by the process, thereby allowing mental effort to be reinvested in attending to relevant new
details. In one example, Gott, Hall, Pokorny, Dibble, and Glaser (1993) reported that
highly successful air force technicians were able to adapt knowledge to novel situations
despite high levels of consistent, effortless performance. This description is reminiscent
of the differences described by supervisors between the experts and super-experts in the
Koubek and Salvendy (1991) study. Although their analysis of the data suggested that
there was no difference in the levels of automaticity between the two groups, it is
possible that Bereiter and Scardamalia’s (1993) arguments could have been supported if
different forms of data, more sensitive to fluctuations in the level of cognitive load and
representative of cognitive processes in greater detail, had been collected.
Automaticity and Procedural Knowledge Acquisition
Human attention is limited by the finite capacity of working memory to retain and
manipulate information (Baddeley, 1986). When mental operations such as perception
and reasoning occur, they occupy some portion of available capacity and limit the
attention that can be dedicated to other concurrent operations. Referred to as cognitive
load, the burden placed on working memory has been found to play a major role in both
learning and the governance of behavior (Goldinger, Kleider, Azuma, & Beike, 2003;
Sweller, 1988; Sweller, Chandler, Tierney, & Cooper, 1990).
Given the extreme limitations on the amount of information that can be
consciously and simultaneously processed (i.e., as few as four chunks; Cowan, 2000) as
well as the persistently high levels of information and sensory input available in most
natural settings, it is necessary that many cognitive functions also take place outside of
conscious awareness and control. Wegner (2002) suggests that as much as 95% of the
common actions that we experience to be under conscious control are in fact automated.
Further, he argues, the cognitive mechanisms that generate these false impressions of
intention are themselves generated by nonconscious processes. Thus, people tend to
claim full knowledge and control of their actions to the point of creating false memories
that provide plausible explanations for their actions.
During the last century, both behaviorists and cognitive scientists have argued that
many mental and behavioral processes take place without any conscious deliberation
(Bargh, 2000). From these investigations, a dual-process model of cognition has emerged, involving the parallel execution of controlled and automatic processes, in which conscious
functions are relatively slow, effortful, and controllable, whereas automatic processes are
rapid and effortless (Bargh, 1999a; Devine & Monteith, 1999). Automated procedures
can occur without intention, tend to run to completion once activated, utilize few, if any,
attentional resources, and are not available to conscious monitoring (Wegner &
Wheatley, 2001). It is important to note, however, that the dual-process description can
prove misleading, as many procedures rely on the integration of both conscious and
nonconscious thought during performance (Bargh & Chartrand, 1999; Hermans,
Crombez, & Eelen, 2000). Specific sub-components can be automated or conscious, but
their integration into the larger production yields a mixed composition.
In the seminal work of Shiffrin and Schneider (1977), the acquisition of automaticity is
said to be achieved through the consistent, repeated mapping of stimuli to responses.
Most commonly associated with skill acquisition, automated procedures can be
consciously initiated for the satisfaction of a specific goal. Thus, primary attention is
paid to the level of ballisticity of the procedure (Logan & Cowan, 1984), which is the
ability of the production to execute to its conclusion without conscious monitoring, and
the ability to maintain performance levels in dual task paradigms (Brown & Bennett,
2002). Acquired through extensive practice, these goal-dependent procedures become
fluid and require less concentration to perform over time (Anderson, 1995; Fitts &
Posner, 1967; Logan, 1988a). Additionally, these procedures evolve to greater levels of
efficiency as automaticity develops by eliminating the need for conscious intermediate
decision points (Blessing & Anderson, 1996). Evidence suggests that habitual approaches
to problems are goal-activated, such that the solution search is significantly limited by the
activation of established patterns of behavior (Aarts & Dijksterhuis, 2000). Dubois and
Shalin (2000) further report that goal choice, conditions/constraints, method choice,
method execution, goal standards, and pattern recognition are each elements of
procedural knowledge that can become automated.
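The speedup that accompanies this development is commonly summarized by the power law of practice (cf. Anderson, 1982; Logan, 1988), under which performance time declines as a power function of the number of practice trials. A minimal sketch with illustrative parameter values, not empirical estimates:

```python
# Power law of practice: RT(N) = a + b * N**(-c), where N is the number of
# consistent practice trials. Parameter values here are illustrative only.
a, b, c = 0.4, 2.0, 0.5   # asymptote (s), initial gain (s), learning rate

def predicted_rt(n_trials):
    """Predicted response time after n_trials of consistent practice."""
    return a + b * n_trials ** (-c)

for n in (1, 10, 100, 1000):
    print(f"trial {n:>4}: predicted RT = {predicted_rt(n):.2f} s")
# Gains are large early and shrink with continued practice, consistent with
# skills that become fast and nearly effortless (i.e., automated) over time.
```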
Accuracy of Self-Report
The challenge in collecting valid verbal self-report data lies in the structure of the
human memory system itself. In the traditional cognitive model, the short term
(working) memory acts as a gateway through which all information must pass as it is
encoded and incorporated into schemas in long term memory or retrieved for
manipulation or use in the production of behavior. According to Baddeley (1986, p. 34),
it is “a system for the temporary holding and manipulation of information during the
performance of a range of cognitive tasks.” For new information to be stored retrievably
in long term memory, a trace or pathway must be created to allow the needed information
to be activated at an appropriate time. Such links are more easily and successfully
generated when the new information maps well onto an existing schema. Thus, schemas
that have been utilized and refined adaptively through experience with particular
concepts and events serve as stable mental models for the more efficient encoding and
evaluation of specific types of events (Anzai & Yokoyama, 1984; Bainbridge, 1981;
Clement, 1988; Larkin, 1983).
While these refined mental models are highly adaptive for problem solving, they
may interfere with the accurate recall of problem solving situations after the fact.
Because mental models are utilized in the search of a problem space (Larkin, et al.,
1980a), details that are not directly mappable to the representation can fail to be encoded
into long term memory. As a result, a retrospective account of the event may fall victim
to the errors of generalizability and rationalization that Nisbett and Wilson (1977)
describe in their critique of self-report accuracy. “Such reports may well be based more
on people’s a priori causal theories about stimulus effects than on direct examination of
their cognitive processes, and will be inaccurate whenever these theories are inaccurate”
(Wilson & Nisbett, 1978, p. 130).
Moray and Reeves (1987) provide direct empirical evidence for the potentially
maladaptive nature of mental models. They presented subjects with a set of eight bar
graphs that changed lengths over time. The lengths of some pairs of graphs were
designed to covary to provide observable subsystems from which it was expected that a
mental model with separate identifiable components would be derived. Participants in
the study were given the task of preventing the graphs from exceeding specified
parameters by changing the location and color of the bars within each graph. Once the
subject had successfully learned to manage the graphs in accordance with the defined
relationships, three “faults” were introduced to the system that prevented certain
relationships from functioning as they had. The authors hypothesized that once the
original model had been developed, the subjects would not recognize the appearance of
the faults. As expected, the fact that the relationships among and across subsystem
components had changed took significantly longer to recognize than the time it took
subjects to discover all of the rules for the original system, thereby demonstrating the
durability of mental models once established.
More recently, Logan, Taylor, and Etherton (1996) reported that “the
representations expressed during automatic performance do not represent all stimulus
attributes uniformly” (p. 636). Instead, only those elements that were attended to during
the task production are specifically encoded in episodic memory. When subjects were
asked to recall the font color in which target words were presented, they were unable to
do so, despite above-chance performance on a recognition task, indicating that superficial item features can be successfully encoded without being successfully retrieved during recall. Thus, they concluded that retrieval of an episode may stimulate
differential partial representations of the specific instance.
The challenge to validity of self-report posed by mental models mirrors the
difficulties inherent in capturing self-reported procedural knowledge. Cooke (1992)
notes that after knowledge of a process becomes proceduralized, subjects may have
difficulty decomposing the mental representation into a declarative form. Further,
subjects with extensive practice solving problems in a domain will have automated
significant portions of their procedures, suggesting that the representations—presumably
those of greatest interest within a domain of expertise—will be even harder to articulate.
Williams (2000, p. 165) explains that “production units are not interpreted but are fired
off automatically in sequences, which produce skilled performance. They are automatic
to the extent that experts at a specific skill may not be able to recall why they perform the
skill as they do.” This phenomenon is particularly true of strong methods that rely on
domain knowledge. In a study of radar system troubleshooters, Schaafstal and Schraagen
(2000, p. 59) note that “only a small correlation was found between the knowledge test
[of radar systems] and fault-finding performance (Pearson r = .27). This confirms
the…gap between theory and practice.”
This view has also been supported in the study of metacognitive monitoring and
strategy selection. Reder and Schunn (1996; Schunn, Reder, Nhouyvanisvong, Richards,
& Stroffolino, 1997) have demonstrated that subjects’ metacognitive selection of
strategies during problem solving occur implicitly and are better predicted by previous
exposure to similar problems regardless of whether or not a solution was obtained than by
active strategy selection, despite subjects’ lack of awareness that (a) learning occurred
during the initial exposure to previous problems or (b) new strategy development did not
occur. Thus, information regarding critical elements of any problem solving procedure
can fail to be directly accessible for verbal report.
Despite these impediments, two general methods of verbal knowledge elicitation
have been utilized in research of expertise and problem-solving: protocol analysis
(Ericsson & Simon, 1993) and cognitive task analysis (Schraagen, Chipman, & Shute,
2000). Both approaches have been characterized as valid within specific constraints
(discussed below). However, these considerations have not been consistently applied in
the study of experts’ cognitive processes (Bainbridge, 1999).
During protocol analysis, also referred to as the “think aloud” technique, the
capacity limit of short term memory can prevent sufficient attention from being given to
both the task at hand and the translation of mental symbols to verbal form. While this
does not pose a particular problem for tasks that are verbal in nature or do not require full
attentional capacity to achieve, those elements which are not easily articulable (like
images) require extra attentional resources to translate (Chipman, Schraagen, & Shalin,
2000). If those resources are not available, subjects will fall silent during the periods of
high load. Likewise, processes which have become mostly or entirely automated will not
be articulated (Ericsson & Simon, 1993).
This phenomenon is particularly problematic for cognitive research, as
automaticity often manifests at moments of particular interest for those studying the
cognitive processes of a task, because they represent the points of most refined skill in the
procedure. Further, subjects who are required to think aloud during insight problem
solving tasks reveal performance deficits in the time necessary for task completion and
frequency of correct solutions (Schooler, Ohlsson, & Brooks, 1993). Similarly, in
Chung, de Vries, Cheak, Stevens, and Bewley (2002), subjects required to think aloud
while engaging in scientific problem-solving tasks required a significantly higher number
of attempts to successfully solve problems using a computer interface.
This finding of verbal overshadowing has been replicated across a number of
experimental and authentic tasks (Meissner & Memon, 2002). “A common observation
is that verbal rehearsal or mediation declines with practice on perceptual-motor tasks (e.g.
Fitts, 1964), which indicates that at least the form and possibly the amount of information
held in working memory changes” (Carlson, Khoo, Yaure, & Schneider, 1990, p.
195). Thus, while appropriate for gathering process data in certain types of tasks
performed by individuals who have not yet developed automaticity in their skills,
capturing an accurate representation of the highly specialized performances of experts in
complex cognitive tasks remains problematic.
In contrast, cognitive task analysis techniques represent knowledge of events and
strategies that have been retained by the subject until after the event in question. While
these techniques span a wide range of tools for knowledge elicitation, the data generated
does not represent the contents of working memory in situ. These approaches have,
however, been found to yield highly accurate information about the processes executed in
a wide variety of authentic tasks (Schraagen et al., 2000; Velmahos, Toutouzas, Sillin,
Chan, Clark, Theodorou, Maupin, Murray, Sullivan, Demetriades, & DeMeester, 2002).
Although capturing procedural knowledge is considered to be more challenging than capturing declarative knowledge within a domain (Hoffman, 1992), a skill-based cognitive
task analysis framework has been established by Seamster, Redding, and Kaempf (2000)
that focuses specifically on the elicitation of five cognitive skill types: (a) strategies, (b)
decision-making skills, (c) representational skills, (d) procedural skills, and (e) automated
skills. While several specific techniques have been developed within this framework, one
of the most successful has been the Critical Decision Method, in which each of the skill
types is elicited through a variety of probes in semi-structured interviews. Hoffman,
Crandall, and Shadbolt (1998) reviewed reliability studies of the method and reported
that there was high test-retest reliability over time (3 days, 3 months, and 5 months after
the incident reported) and intercoder reliability of .89. With regard to validity of content,
they argue that the memory prompting cues that are incorporated into the method can
overcome the memory errors that are discussed above. Further, studies conducted by
Crandall and his colleagues (e.g. Crandall & Calderwood, 1989; Crandall & Gamblian,
1991) have demonstrated that the behaviors captured with cognitive task analysis differ significantly from the theoretical knowledge generally portrayed in textbooks and align more closely with the experiences of experts in the field.
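As an illustration of what an intercoder reliability figure summarizes, the sketch below computes Cohen's kappa for two hypothetical coders assigning skill-type categories to interview segments. Hoffman et al. (1998) do not specify the statistic behind the .89 value, so this is an assumed example rather than their procedure.

```python
# Cohen's kappa: agreement between two coders, corrected for the agreement
# expected by chance. Codes and data below are hypothetical.
from collections import Counter

coder_a = ["strategy", "decision", "strategy", "automated", "procedural", "strategy"]
coder_b = ["strategy", "decision", "representational", "automated", "procedural", "strategy"]

n = len(coder_a)
observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
# Chance agreement: probability both coders independently pick the same category.
counts_a, counts_b = Counter(coder_a), Counter(coder_b)
expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
kappa = (observed - expected) / (1 - expected)
print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")
```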
Summary
As indicated in the literature above, the assumptions on which this study rests are
justified by extant research. Specifically, (a) scientific research skills are acquirable and
represent predominantly crystallized intelligence, (b) expertise in scientific research is
distinguishable from lower levels of skill, both in an individual’s ability to successfully
solve a problem and in a qualitative distinction between expert and novice strategies, and
(c) skills that are practiced extensively automate, such that the mental effort necessary to
perform a procedure is minimal. Further, although there are significant challenges to the
reportability of automated procedural knowledge, there are certain circumstances under
which some level of knowledge can be elicited.
Research Questions
The research questions for this study are:
1. What are the cognitive skills and strategies that research experts use in
experimental research and how do they differ from novice techniques?
2. Do experts automate problem-solving procedures in the domain of research
design?
3. Does the degree of automaticity in task performance differ between experts
and novices?
4. To what extent can experts and novices accurately report their own problem-solving processes?
5. What is the relationship between the accuracy of self-report and the degree of
automaticity during problem-solving?
Methodology
Subjects
Participants in this study will be recruited from a major university in southern
California. Six expert subjects will be identified and recruited on the basis of the
following criteria:
1. Subjects will have attained tenure in the field of psychology or educational
psychology at a Tier I research university and have conducted research for at least
10 years. These elements are associated with the attainment of expertise as
recognized by a professional peerage and a typically necessary duration (Ericsson
& Charness, 1994).
2. Each subject will have published at least 20 peer-reviewed empirical studies
within their domain of expertise utilizing factorial designs equivalent to those
available in the simulation.
3. Subjects will not have major lines of research in the area of memory to prevent
biasing of the experimental design and analysis tasks based on recall.
Six novice subjects will also be identified and recruited for the study. Each will have
completed at least one course in psychological methodology and research design.
Design
This study will utilize a Single-Subject Multivariate Repeated Measures
(SSMRM) design (Nesselroade & Featherman, 1991; Wood & Brown, 1994). The goal
of this approach is to capture intraindividual concurrent changes in multiple variables
over time. In this case, EEG levels across multiple bands, scientific problem-solving
behaviors, and self-report accuracy will be recorded through time-locked records of each.
This approach is necessary because the aggregation of data that occurs in cross-sectional designs would prevent the pairing of specific procedural elements with
decreased levels of cognitive load associated with automaticity and particular segments
of self-report accuracy scores, as these variables are impacted by both intraindividual
change and interindividual differences (Jones & Nesselroade, 1990). Further, the ability
to differentiate among larger patterns of behavior over time within individuals permits a
deeper examination of the role that automaticity plays in expert and novice problem-solving strategies within the domain being examined (e.g. Chen & Siegler, 2000).
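To illustrate the time-locking that this design assumes, the sketch below aligns logged simulation actions with concurrently recorded EEG samples so that each procedural step receives its own cognitive load estimate. The field names, the theta/alpha load index, and the numbers are hypothetical placeholders.

```python
# A minimal sketch of time-locked alignment: average the EEG-derived load
# index over the interval spanned by each logged action. All values are
# hypothetical placeholders, not pilot data.
from statistics import mean

eeg = [  # (seconds from session start, theta/alpha band-power ratio)
    (0.0, 1.10), (0.5, 1.15), (1.0, 0.95), (1.5, 0.90), (2.0, 1.30),
]
actions = [  # (start, end, action logged by the simulation)
    (0.0, 1.0, "set variable: repetitions = 3"),
    (1.0, 2.0, "enter prediction: 64% correct"),
]

for start, end, label in actions:
    window = [power for t, power in eeg if start <= t < end]
    print(f"{label}: mean load index = {mean(window):.2f}")
```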
Apparatus
Simulation. Subjects will use a computer simulation, the Simulated Psychology
Lab (Schunn & Anderson, 1999), to design and interpret the results of a series of factorial-design experiments with the goal of determining which, if either, of two competing
theories account for the memory spacing effect described in the introduction to the
program. The interface allows subjects to select values for six independent variables (see
Table 1), of which up to four can be manipulated in a particular experiment, involving the
learning and recall of word lists by hypothetical subjects. When the variable settings for
an experiment are determined by the user (though they remain editable until the
command to run the experiment is given), the user is then required to predict the mean
percentages of correct responses that the hypothetical subjects will produce in each
condition.³ Once the predictions are entered and the command is given to run the
experiment, the computer generates data sets that are compatible with real-world results.
After each iteration of the design-hypothesize-and-execute cycle, the results generated by
the computer are available to the user in order to modify hypotheses and inform
experimental designs in subsequent iterations. To facilitate this process, the results of all
previous experimental designs and results are available to the user through the interface.
As each action is taken, the simulation captures and records all user actions, including the
viewing of data, with a time stamp to provide an accurate procedural history for analysis.
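As an illustration of this kind of time-stamped capture, the sketch below implements a minimal action logger; the event vocabulary and structure are assumptions made for exposition, not the Simulated Psychology Lab's actual logging format.

```python
# Minimal time-stamped action logger of the kind described above. The event
# names are invented for exposition, not the simulation's actual log format.
import time
from dataclasses import dataclass, field

@dataclass
class ActionLog:
    events: list = field(default_factory=list)

    def record(self, action: str, detail: str = "") -> None:
        # Every user action, including data viewing, is stored with a stamp.
        self.events.append({"t": time.time(), "action": action,
                            "detail": detail})

log = ActionLog()
log.record("set_variable", "repetitions=3")
log.record("view_data", "experiment_1_results")
```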
Although the ecological validity of laboratory experiments in general, and simulation apparatuses in particular, has been called into question for the study of cognitive
scientific processes (e.g. Giere, 1993; Klahr & Simon, 1999), simulations that maintain a
high level of fidelity to the complexity of experimental tasks and a small observational
“grain size” can be considered to capture to a great extent the actual problem-solving
processes employed by scientists during the course of their work (Klahr, 2000). Of
primary concern is the representativeness of the task, analogous to the methodological
considerations of Ericsson and Smith (1991) in the laboratory study of expert
performance. With regard to the task in the simulation apparatus used in this study,
Schunn and Anderson (1999, p. 346) note that “although the mapping of the two theories
for the spacing effect onto these six variables is not simple, this relationship between
theory and operational variable is typical of most psychological theories and experiments." Further, the Simulated Psychology Lab is specifically cited in other literature as an apparatus that "model[s] essential aspects of specific scientific discoveries" (Klahr & Simon, 2001, p. 75).

³ Schunn and Anderson (1999, pp. 347-348) note that "although this prediction task is more stringent than the prediction task psychologists typically give themselves (i.e., directional predictions at best, and rarely for all dimensions and interactions), we used this particular form of a prediction task because 1) assessing directional predictions proved a difficult task to automate; 2) numerical predictions could be made without explicit thought about the influence of each variable and possible interactions, and thus we thought it was less intrusive; 3) it provided further data about the participants' theories and beliefs about each of the variables; and 4) it provided some cost to large experimental designs (i.e., many more predictions to make) to simulate the increasing real-world cost of larger experimental designs."
Table 1. Manipulable variables in the Simulated Psychology Lab, by phase of experiment.

Learning phase:
- Repetitions—the number of times that the list of words was studied. Possible values: 2, 3, 4, 5.
- Spacing—the amount of time spent between repetitions. Possible values: 1 minute to 20 days.
- Learning context—whether subjects were in the same context for each repetition or changed contexts for each repetition. Possible values: Mood, Location (yes/no).

Recall phase:
- Test—the measure of memory performance. Possible values: Free recall, Recognition, or Stem completion.
- Delay—the amount of time from the last learning repetition until the recall test was given. Possible values: 1 minute to 20 days.
- Recall context—whether subjects were in the same context for each repetition or changed contexts for each repetition. Possible values: Mood, Location (yes/no).
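A minimal sketch of this design space follows, assuming a simple dictionary encoding of Table 1 and enforcing the four-variable limit stated above; the representation is illustrative only, not the simulation's internal format.

```python
# Illustrative encoding of the Table 1 design space with a check on the
# four-variable limit; the dictionary representation is an assumption, not
# the simulation's internal format.
VARIABLES = {
    "repetitions": "2, 3, 4, 5",
    "spacing": "1 minute to 20 days",
    "learning_context": "mood, location (yes/no)",
    "test": "free recall, recognition, stem completion",
    "delay": "1 minute to 20 days",
    "recall_context": "mood, location (yes/no)",
}

def validate_design(manipulated: list) -> None:
    unknown = set(manipulated) - set(VARIABLES)
    if unknown:
        raise ValueError(f"Unknown variables: {unknown}")
    if len(manipulated) > 4:
        raise ValueError("At most four variables per experiment.")

validate_design(["repetitions", "spacing", "delay"])  # a legal design
```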
Electroencephalogram monitor. Subjects’ cognitive load during experimental
tasks will be measured using an electroencephalogram (EEG) that dynamically records
and analyzes changes in event-related potential. At minimum, the recording will include the alpha and theta frequency bands, which have been demonstrated to be indicative of cognitive load (Brookings, Wilson, & Swain, 1996; Fournier, Wilson, & Swain, 1999), and a muscle-activity detection probe (e.g., for eye blinks) for detecting artifacts in the collected data. EEG is currently the
physiological instrument of choice for the measure of cognitive workload, due to its
ability to provide relatively unobtrusive continuous monitoring of brain function (Gevins
& Smith, 2003). Specifically, this approach has been found to reliably measure cognitive
load during task analyses of naturalistic human-computer interactions (Raskin, 2000) and,
when analyzed in multivariate combinations, can accurately indicate changes in cognitive
tasks (Wilson & Fisher, 1995).
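As a sketch of how band-specific power might be derived from a raw EEG channel, the following uses Welch's method from SciPy; the sampling rate, epoch length, and band edges are assumptions for illustration, not values specified in this proposal or the cited instrumentation.

```python
# A sketch of band-power extraction from one EEG channel via Welch's method.
# Sampling rate, epoch length, and band edges are assumptions.
import numpy as np
from scipy.signal import welch

def band_power(signal: np.ndarray, fs: float, lo: float, hi: float) -> float:
    freqs, psd = welch(signal, fs=fs, nperseg=int(fs * 2))
    mask = (freqs >= lo) & (freqs <= hi)
    return float(psd[mask].sum() * (freqs[1] - freqs[0]))  # integrate PSD

fs = 256.0                               # assumed sampling rate (Hz)
eeg = np.random.randn(int(fs * 10))      # stand-in for 10 s of one channel
theta = band_power(eeg, fs, 4.0, 7.0)    # theta band (4-7 Hz)
alpha = band_power(eeg, fs, 8.0, 12.0)   # alpha band (8-12 Hz)
```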
Critical Decision Method protocol. Cognitive task analyses conducted with
subjects during the study will follow the Critical Decision Method protocol (Klein &
Calderwood, 1996; Klein, Calderwood, & MacGregor, 1989). The protocol is a semi-structured interview that emphasizes the elicitation of procedural and conceptual
knowledge from subjects through the recall of particular incidents using specific probes
(Table 2; adapted from Klein & Calderwood, 1996, p. 31) following the subject’s report
of a general timeline highlighting decision points occurring during the focal time.
Table 2. Critical Decision Method probe types and content (adapted from Klein & Calderwood, 1996, p. 31).

- Cues: What were you seeing, thinking…?
- Knowledge: What information did you use in making this decision, and how was it obtained?
- Analogues: Were you reminded of any previous experience?
- Goals: What were your specific goals at this time?
- Options: What other courses of action were considered or were available to you?
- Basis: How was this option selected/other options rejected? What rule was being followed?
- Experience: What specific training or experience was necessary or helpful in making this decision?
- Aiding: If the decision was not the best, what training, knowledge, or information could have helped?
- Situation Assessment: Imagine that you were asked to describe the situation to a partner who would take over the process at this point; how would you summarize the situation?
- Hypotheticals: If a key feature of the situation had been different, what difference would it have made in your decision?
Procedure
The procedure consists of two phases—preliminary data collection and primary task data
collection. During the preliminary data collection phase, subjects' relevant traits pertaining to ability level will be measured and the instruments will be calibrated. Subsequently, each subject will begin the simulation task, during which problem-solving processes, procedural knowledge, and cognitive load will be assessed.
Preliminary data collection. Two forms of preliminary data will be collected
prior to the beginning of the experimental task. Subjects’ levels of fluid intelligence (Gf)
will be evaluated using the Raven Progressive Matrices test (Raven, 1969; Stankov &
Crawford, 1993; Stankov & Raykov, 1993; Stankov & Raykov, 1995) to provide
comparable baseline data on subjects' reasoning abilities in novel situations. Masunaga and Horn (2001) differentiate between fluid reasoning ability and the reasoning abilities
of experts within their domain of expertise, despite apparent similarities. They argue that
“reasoning depends on perceiving relationships in complex sets of stimuli and drawing
inferences from these perceptions in order to estimate relationships under conditions of
lawful change. In this sense, it [expert reasoning] is similar to the reasoning that
characterizes Gf; indeed, the reasoning of expertise and the reasoning of Gf are probably
along a continuum of similarity” (p. 294). This similarity is limited, however, by the
inductive nature of fluid reasoning in contrast to the deductive, forward reasoning
evidenced in expert performance.
Subjects’ acquired scientific reasoning abilities will also be assessed using
Lawson’s Test of Scientific Reasoning (Lawson, 1978, 2000). The test’s 24 multiple-choice items assess subjects’ abilities to separate variables and to use proportional, combinatorial, and correlational reasoning. Although the measure may evidence
ceiling effects for expert subjects, it has been found to adequately represent the abilities
of college students for assessment of scientific reasoning skills (Lawson, Clark, Cramer-Meldrum, Falconer, Kwon, & Sequist, 2000).
EEG baseline readings will also be obtained for subjects’ high, medium, and low
cognitive load levels prior to beginning the simulation task. Because EEG readings
are sensitive to simple visual perception (Kosslyn, Thompson, Kim, & Alpert, 1995) and the
orientation of graphic patterns (Nagai, Kazai, & Yagi, 2001), baseline readings will take
place with the subjects sitting in a chair, facing the computer monitor while it displays the
interface used during the simulation. Subjects will then perform a series of “n-back”
tasks (Gevins & Cutillo, 1993), in which they are asked to identify sequentially presented
stimuli as matching or not matching previously presented stimuli. In the lowest load condition, subjects need only identify whether each stimulus matches a fixed target stimulus (0-back task). In the higher load conditions, subjects must identify whether each stimulus matches the one presented one (1-back), two (2-back), or three (3-back) trials earlier. As n increases, higher levels of cognitive load are imposed by the task
requirement of retaining stimuli in working memory across continued presentations while
simultaneously retaining new stimuli for subsequent match decisions.
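The matching logic behind the n-back task can be stated compactly. In the sketch below, the 0-back condition is treated as comparison against a fixed target, the usual convention (an assumption here), while higher loads compare each stimulus with the one presented n trials earlier.

```python
# Match detection for the n-back task. By convention (an assumption here),
# 0-back compares each stimulus against a fixed target; n >= 1 compares each
# stimulus with the one shown n trials earlier.
def nback_matches(stimuli: list, n: int, target: str = "X") -> list:
    matches = []
    for i, s in enumerate(stimuli):
        if n == 0:
            matches.append(s == target)
        else:
            matches.append(i >= n and s == stimuli[i - n])
    return matches

print(nback_matches(list("ABAXBA"), 2))
# [False, False, True, False, False, False]
```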
Primary task procedure. Subjects will be informed that they are participating in a
study to analyze their mental effort and strategies during a research task that will consist
of multiple iterations of experimental design and data analysis. They will be instructed
not to concern themselves with the amount of time necessary to complete the task.
Instead, they will be advised to remain focused on the task at hand and not perform
extraneous actions within the interface. Following these brief instructions, they will be
presented with the instructions and task description via the computer monitor. These
instructions include an orientation to the Simulated Psychology Lab interface (Schunn &
Anderson, 1999). Once the task begins, the subject’s cognitive load levels will be
recorded via the EEG while the subject’s actions are recorded by the software. After
each experiment is designed and executed, subjects will be interviewed using the Critical
Decision Method cognitive task analysis to understand the cognitive processes underlying
the keystroke-level behavior data captured by the computer during simulation use. This
process will repeat until the subject solves the problem presented by the simulation or
reaches a maximum of forty minutes engaged in the task.
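Schematically, the primary task cycle can be summarized as follows; every function in this sketch is a placeholder stub standing in for a study procedure, not actual experiment software.

```python
# Schematic of the primary task cycle: run one design-hypothesize-and-execute
# iteration, follow it with a CDM interview, and stop at solution or forty
# minutes. All functions are placeholder stubs, not study software.
import time

MAX_SECONDS = 40 * 60

def problem_solved(state: dict) -> bool:
    return state.get("solved", False)

def run_design_cycle(state: dict) -> None:
    state["iterations"] = state.get("iterations", 0) + 1
    state["solved"] = state["iterations"] >= 3   # arbitrary stub outcome

def cdm_interview(state: dict) -> None:
    pass  # Critical Decision Method probes administered here

def run_session(state: dict) -> None:
    start = time.time()
    while not problem_solved(state) and time.time() - start < MAX_SECONDS:
        run_design_cycle(state)
        cdm_interview(state)

run_session({})
```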
Analysis
The first phase of analysis will entail the time synchronization of the EEG data,
the computer interaction record captured by the Simulated Psychology Lab, and the
events reported in the cognitive task analysis. The EEG and simulation data are both
time-stamped and should not pose great difficulty in matching; however, the timeline
generated during each CTA interview will need to be compared with the keystroke-level
simulation data and time-locked. An adaptation of Ericsson and Simon’s (1993) protocol
analysis coding will be utilized, wherein the transcribed data will be segmented to
encompass the processes that occur between one identified decision point and the next.
The steps described in each segment will be matched to the computer-recorded actions
within each iteration of the design-hypothesize-and-execute cycle. One point of
departure from the Ericsson and Simon (1993) technique will be to maintain the
contextualization of each segment in accordance with the recommendation of Yang (in
press), who argues that in ill-structured, complex, and knowledge-rich tasks, discrete
cognitive steps are inherently situated in the broader context of ongoing high-level
cognitive and metacognitive processes of reasoning and interpretation. “Given the
systemic, interrelated, multidimensional nature of the learners’ cognitive processes in
this…task, the complexity of the functional model and taxonomy [can cause subjects] to
become immersed in the…interwoven processes of searching, interpreting, defining,
reasoning, and structuring” (Yang, in press, p. 8), resulting in an oversimplification of
subsequent analyses that loses relevant meaning within the process.
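A minimal sketch of the time-locking step, assuming both streams carry numeric timestamps: pandas.merge_asof pairs each simulation event with the nearest preceding EEG window. The column names are hypothetical.

```python
# Aligning the simulation's event log with the EEG stream on timestamps;
# pandas.merge_asof pairs each event with the nearest preceding EEG window.
# Column names are hypothetical.
import pandas as pd

eeg = pd.DataFrame({"t": [0.0, 1.0, 2.0, 3.0],
                    "alpha": [1.2, 1.1, 0.9, 1.0],
                    "theta": [0.8, 1.0, 1.3, 1.1]})
events = pd.DataFrame({"t": [0.4, 2.6],
                       "action": ["set_variable", "run_experiment"]})

aligned = pd.merge_asof(events.sort_values("t"), eeg.sort_values("t"),
                        on="t", direction="backward")
print(aligned)
```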
The second phase of analysis will consist of the coding of self-report data for
degree of accuracy in relation to the matched actions in the simulation. While some
categories, such as “accurate,” “error of omission,” and “error of commission” may be
formulated a priori on the basis of the errors in self-report described in previous studies
(e.g., Nisbett & Wilson, 1977; Wilson & Nisbett, 1978), additional coding categories may
emerge from the data upon examination after it has been collected (Dey, 1993). Once
coded, these categorical variables can be represented numerically for use in statistical
analyses.
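For example, under the assumption that the codes are stored as a categorical variable, the conversion to numeric form might look like the following sketch; the integer and indicator encodings shown are arbitrary illustrations.

```python
# Converting the self-report accuracy codes to numeric form; the integer and
# indicator encodings are arbitrary illustrations.
import pandas as pd

codes = pd.Series(["accurate", "error of omission", "accurate",
                   "error of commission"], dtype="category")
as_integers = codes.cat.codes          # one integer per category
as_indicators = pd.get_dummies(codes)  # one 0/1 column per category
```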
In the final phase of analysis, P-technique factor analysis will be used to identify
how the observed variables in the study (accuracy, action, and cognitive load) change
together over time within individual subjects (Jones & Nesselroade, 1990). In
accordance with the cautions of several studies (Gorman & Allison, 1997; Wood &
Brown, 1994), the data will also be checked for autocorrelative effects, which can generate standard errors that are too large and standardized factor loadings that are too small.
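A minimal sketch of this final phase, assuming simulated data: P-technique factor analysis is ordinary factor analysis applied to one subject's occasions-by-variables matrix, and a lag-1 autocorrelation check flags the dependency that the cited cautions concern.

```python
# P-technique factor analysis: ordinary factor analysis applied to a single
# subject's occasions-by-variables matrix, plus a lag-1 autocorrelation check
# reflecting the cautions cited above. Data here are simulated.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))   # 120 occasions x 3 variables, one subject

fa = FactorAnalysis(n_components=1).fit(X)
loadings = fa.components_       # loading of each observed variable

def lag1_autocorr(x: np.ndarray) -> float:
    return float(np.corrcoef(x[:-1], x[1:])[0, 1])

for j in range(X.shape[1]):
    print(f"variable {j}: lag-1 r = {lag1_autocorr(X[:, j]):.2f}")
```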
Secondary analysis of temporal overlap for automaticity and inaccuracy will be
conducted by computing the Yule’s Q statistic using sampled permutations to adjust for
small sample size (Bakeman, McArthur, & Quera, 1996). Follow-up assessments to
assess between-subjects similarities and differences will be conducted via the congruence
assessment technique (Harman, 1967, cited in Lavelli, Pantoja, Hsu, Messinger, & Fogel,
in press).
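The statistic itself is simple: for two time-locked binary series, Yule's Q is (ad - bc)/(ad + bc) over the 2 x 2 co-occurrence table. The sketch below adds a simplified sampled-permutation reference distribution; it illustrates the idea rather than reproducing Bakeman, McArthur, and Quera's (1996) exact procedure.

```python
# Yule's Q for two time-locked binary series, with a simplified
# sampled-permutation reference distribution (an illustration, not the
# cited procedure).
import numpy as np

def yules_q(x: np.ndarray, y: np.ndarray) -> float:
    a = np.sum(x & y)      # both present
    b = np.sum(x & ~y)
    c = np.sum(~x & y)
    d = np.sum(~x & ~y)    # both absent
    return float(a * d - b * c) / float(a * d + b * c)

rng = np.random.default_rng(0)
automatic = rng.random(60) < 0.5    # simulated automaticity indicator
inaccurate = rng.random(60) < 0.4   # simulated inaccuracy indicator

observed = yules_q(automatic, inaccurate)
perms = [yules_q(rng.permutation(automatic), inaccurate)
         for _ in range(1000)]
p = float(np.mean(np.abs(perms) >= abs(observed)))
print(f"Q = {observed:.2f}, permutation p = {p:.3f}")
```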
Hypotheses
1. Experts will evidence highly developed procedures that yield solutions which are
more systematic, more effective, and qualitatively distinct from novice
performances (Hmelo-Silver, Nagarajan, & Day, 2002; Schraagen, 1993; Schunn & Anderson, 1999; Voss, Tyler, & Yengo, 1983).
2. Differences in novice-expert performance will not be significantly related to
individual differences in fluid intelligence (cf. Schunn & Anderson, 1998).
3. Experts will be largely inaccurate in reporting their problem-solving procedures
as compared to the computer-based record of their actions, while novice subjects’
reports will have significantly higher accuracy.
4. Expert self-report accuracy with regard to a specific element of the problem-solving process will be directly related to assessments of intrinsic and observed
cognitive load (Sweller, 1994) collected at the point in the process reported.
References
Aarts, H., & Dijksterhuis, A. (2000). Habits as knowledge structures: Automaticity in
goal-directed behavior. Journal of Personality and Social Psychology, 78(1), 53-63.
Adams, G. B., & White, J.D. (1994). Dissertation research in public administration and
cognate fields: An assessment of methods and quality. Public Administration
Review, 54(6), 565-576.
Adelson, B. (1981). Problem solving and the development of abstract categories in
programming languages. Memory and Cognition, 9, 422-433.
Alberdi, E., Sleeman, D. H., & Korpi, M. (2000). Accommodating surprise in
taxonomic tasks: The role of expertise. Cognitive Science, 24(1), 53-91.
Anderson, J. R. (1982). Acquisition of cognitive skill. Psychological Review, 89(4),
369-406.
Anderson, J. R. (1987). Skill acquisition: Compilation of weak-method problem
situations. Psychological Review, 94(2), 192-210.
Anderson, J. R. (1993). Problem solving and learning. American Psychologist, 48(1),
35-44.
Anzai, Y., & Yokoyama, T. (1984). Internal models in physics problem solving.
Cognition and Instruction, 1, 397-450.
Baddeley, A. (1986). Working memory. Oxford, England: Clarendon Press.
Bainbridge, L. (1981). Mathematical equations or processing routines? In J. Rasmussen and W. B. Rouse (Eds.), Human Detection and Diagnosis of System Failures. NATO Conference Series III: Human Factors, Vol. 15. New York: Plenum Press.
Bainbridge, L. (1999). Verbal reports as evidence of the process operator’s knowledge.
International Journal of Human-Computer Studies, 51, 213-238.
Bakeman, R., McArthur, D., & Quera, V. (1996). Detecting group differences in
sequential association using sampled permutations: Log odds, kappa, and phi
compared. Behavior Research Methods, Instruments, & Computers, 28(3), 446-457.
Bargh, J. A. (1990). Auto-motives: Preconscious determinants of social interaction. In
R. M. Sorrentino & E. T. Higgins (Eds.), Handbook of Motivation and Cognition
(pp. 93-130). New York: Guilford Press.
Bargh, J. A. (1999). The unbearable automaticity of being. American Psychologist, 54(7), 462-479.
Bargh, J.A. & Ferguson, M. J. (2000). Beyond behaviorism: On the automaticity of
higher mental processes. Psychological Bulletin, 126(6), 925-945.
Baumeister, R. F. (1984). Choking under pressure: Self-consciousness and paradoxical
effects of incentives on skillful performance. Journal of Personality and Social
Psychology, 46, 610-620.
Beilock, S. L., Wierenga, S. A., & Carr, T. H. (2002). Expertise, attention, and memory
in sensorimotor skill execution: Impact of novel task constraints on dual-task
performance and episodic memory. The Quarterly Journal of Experimental
Psychology, 55A(4), 1211–1240.
Bereiter, C., & Scardamalia, M. (1993). Surpassing ourselves: An inquiry into the
nature and implications of expertise. Chicago, IL: Open Court.
Besnard, D. (2000). Expert error. The case of trouble-shooting in electronics.
Proceedings of the 19th International Conference SafeComp2000 (pp. 74-85).
Rotterdam, Netherlands.
Besnard, D., & Bastien-Toniazzo, M. (1999). Expert error in trouble-shooting: An
exploratory study in electronics. International Journal of Human-Computer
Studies, 50, 391-405.
Besnard, D., & Cacitti, L. (2001). Troubleshooting in mechanics: A heuristic matching
process. Cognition, Technology & Work, 3, 150-160.
Blessing, S. B., & Anderson, J. R. (1996). How people learn to skip steps. Journal of
Experimental Psychology: Learning, Memory, and Cognition, 22(3), 576-598.
Bransford, J.D., Brown, A.L. & Cocking, R.R. (1999). How people learn: Brain, mind,
experience, and school. Washington, DC: National Academy Press.
Brookings, J. B., Wilson, G. F., & Swain, C. R. (1996). Psychophysiological responses
to changes in workload during simulated air traffic control. Biological
Psychology, 42, 361-377.
Brown, S. W., & Bennett, E. D. (2002). The role of practice and automaticity in
temporal and nontemporal dual-task performance. Psychological Research, 66,
80-89.
Bruenken, R., Plass, J. L., & Leutner, D. (2003). Direct measurement of cognitive load
in multimedia learning. Educational Psychologist, 38(1), 53-61.
Camp, G., Paas, F., Rikers, R., & van Merrienboer, J. (2001). Dynamic problem
selection in air traffic control training: a comparison between performance,
mental effort and mental efficiency. Computers in Human Behavior, 17, 575-595.
Carlson, R. A., Khoo, B. H., Yaure, R. G., & Schneider, W. (1990). Acquisition of a
problem-solving skill: Levels of organization and use of working memory.
Journal of Experimental Psychology: General, 119(2), 193-214.
Casner, S. M. (1994). Understanding the determinants of problem-solving behavior in a
complex environment. Human Factors, 36(4), 580-596.
Ceci, S. J., & Liker, J. K. (1986). A day at the races: A study of IQ, expertise, and
cognitive complexity. Journal of Experimental Psychology, 115, 255-266.
Chandler, P. & Sweller, J. (1991). Cognitive load theory and the format of instruction.
Cognition and Instruction, 8, 293-332.
Charness, N., Krampe, R., & Mayr, U. (1996). The role of practice and coaching in
entrepreneurial skill domains: An international comparison of life-span chess skill
acquisition. In K. A. Ericsson (Ed.), The road to excellence: The acquisition of
expert performance in the arts and sciences, sports, and games (pp. 51-80).
Mahwah, NJ: Lawrence Erlbaum Associates.
Chase, W. G., & Simon, H. A. (1973). Perception in chess. Cognitive Psychology, 4, 55-81.
Chen, Z., & Siegler, R. S. (2000). Across the great divide: Bridging the gap between
understanding of toddlers’ and older children’s thinking. Monographs of the
Society for Research in Child Development, Serial No. 261, 65(2).
Chi, M. T. H., Feltovich, P. J., & Glaser, R. (1981). Categorization and representation of
physics problems by experts and novices. Cognitive Science, 5, 121-152.
Chi, M. T. H., Glaser, R., & Rees, E. (1982). Expertise in problem solving. In R. J.
Sternberg (Ed.), Advances in psychology of human intelligence (Vol. 1, pp. 7-75).
Hillsdale, NJ: Erlbaum.
Chipman, S. F., Schraagen, J. M., & Shalin, V. L. (2000). Introduction to cognitive task
analysis. In J. M. Schraagen, S. F. Chipman, & V. L. Shalin (Eds.), Cognitive
Task Analysis (pp. 3-23). Mahwah, NJ: Lawrence Erlbaum Associates.
Chung, G. K. W. K., de Vries, F. L., Cheak, A. M., Stevens, R. H., & Bewley, W. L.
(2002). Cognitive process validation of an online problem solving assessment.
Computers in Human Behavior, 18, 669-684.
Clement, J. (1988). Observed methods for generating analogies in scientific problem
solving. Cognitive Science, 12(4), 563-586.
Cook, T. D., & Campbell, D. T. (1976). The design and conduct of quasi-experimental
and true experiments in field settings. In M. D. Dunnette (Ed.), Handbook of
Industrial and Organizational Psychology (pp. 223-326). Rand McNally
Publishing Company.
Cooke, N. J. (1992). Modeling human expertise in expert systems. In R. R. Hoffman
(Ed.), The psychology of expertise: Cognitive research and empirical AI (pp. 29-60). Mahwah, NJ: Lawrence Erlbaum Associates.
Cowan, N. (2000). The magical number 4 in short-term memory: A reconsideration of
mental storage capacity. Behavioral and Brain Sciences, 24, 87-185.
Crandall, B., & Calderwood, R. (1989). Clinical assessment skills of neo-natal intensive care nurses (Report Contract 1-R43-NR01911-01, National Center for Nursing, National Institutes of Health, Bethesda, MD). Fairborn, OH: Klein Associates, Inc.
Crandall, B., & Gamblian, V. (1991). Guide to early sepsis assessment in the NICU. Fairborn, OH: Klein Associates, Inc.
Dey, I. (1993). Qualitative data analysis: A user-friendly guide for social scientists.
New York: Routledge.
Doane, S. M., Pellegrino, J. W., & Klatzky, R. L. (1990). Expertise in a computer
operating system: Conceptualization and performance. Human-Computer
Interaction, 5, 267-304.
Dhillon, A. S. (1998). Individual differences within problem-solving strategies used in physics. Science Education, 82, 379-405.
Doll, J., & Mayr, U. (1987). Intelligenz und schachleistung—eine untersuchung an schachexperten. [Intelligence and achievement in chess—a study of chess masters.] Psychologische Beiträge, 29, 270-289.
Dubois, D., & Shalin, V. L. (2000). Describing job expertise using cognitively oriented
task analyses (COTA). In J. M. Schraagen, S. F. Chipman, & V. L. Shalin (Eds.),
Cognitive Task Analysis (pp. 41-55). Mahwah, NJ: Lawrence Erlbaum
Associates.
Ericsson, K. A. (Ed.). (1996). The road to excellence: The acquisition of expert
performance in the arts and sciences, sports and games. Mahwah, New Jersey:
Lawrence Erlbaum Associates.
Ericsson, K. A. (2000). How experts attain and maintain superior performance:
Implications for the enhancement of skilled performance in older individuals.
Journal of Aging & Physical Activity, 8(4), 366-372.
Ericsson, K. A., & Charness, N. (1994). Expert performance: Its structure and
acquisition. American Psychologist, 49(8), 725-747.
Ericsson, K. A., Krampe, R. T., & Tesch-Romer, C. (1993). The role of deliberate
practice in the acquisition of expert performance. Psychological Review, 100,
363-406.
Ericsson, K. A., & Lehmann, A. C. (1996). Expert and exceptional performance:
Maximal adaptation to task constraints. Annual Review of Psychology, 47, 273-305.
Ericsson, K. A., Patel, V., & Kintsch, W. (2000). How experts’ adaptations to
representative task demands account for the expertise effect in memory recall:
Comment on Vicente and Wang (1998). Psychological Review, 107(3), 578-592.
Ericsson, K. A., & Smith, J. (1991). Toward a general theory of expertise: Prospects
and limits. New York: Cambridge University Press.
Fitts, P. M., & Posner, M. I. (1967). Human Performance. Belmont, CA: Brooks-Cole.
Fournier, L. R., Wilson, G. F., & Swain, C. R. (1999). Electrophysiological, behavioral,
and subjective indexes of workload when performing multiple tasks:
manipulations of task difficulty and training. International Journal of
Psychophysiology, 31, 129-145.
Frensch, P. A., & Sternberg, R. J. (1989). Expertise and intelligent issues: When is it
worse to know better? In R. J. Sternberg (Ed.), Advances in the psychology of
human intelligence (Vol. 5, pp. 157-188). Hillsdale, NJ: Lawrence Erlbaum
Associates.
Gall, M. D., Borg, W. R., & Gall, J. P. (1996). Educational Research: An introduction
(6th edition). White Plains, NY: Longman Publishers, USA
Gevins, A., & Smith, M. E. (2003). Neurophysiological measures of cognitive workload
during human-computer interaction. Theoretical Issues in Ergonomic Science,
4(1-2), 113-131.
Giere, R. N. (1993). Cognitive Models of Science. Psycoloquy, 4(56), Scientific
Cognition (1). Accessed online at http://psycprints.ecs.soton.ac.uk/archive/
00000350/.
Gobet, F. (1998). Expert memory: A comparison of four theories. Cognition, 66, 115-152.
Gobet, F., & Simon, H. A. (1996). Templates in chess memory: A mechanism for
recalling several boards. Cognitive Psychology, 31, 1-40.
Golde, C. M. and Dore, T. M. (2001). At cross purposes: What the experiences of
doctoral students reveal about doctoral education. Philadelphia, PA, A report for
The Pew Charitable Trusts.
Gordon, S. E. (1992). Implications of cognitive theory for knowledge acquisition. In
R. R. Hoffman (Ed.), The psychology of expertise: Cognitive research and
empirical AI (pp. 99-120). Mahwah, NJ: Lawrence Erlbaum Associates.
Gorman, B. S., & Allison, D. B. (1997). Statistical alternatives for single-case designs.
In R. D. Franklin, D. B. Allison, & B. S. Gorman (Eds.), Design and Analysis of
Single-Case Research (pp. 159-214). Mahwah, NJ: Lawrence Erlbaum
Associates.
Gott, S. P., Hall, E. P., Pokorny, R. A., Dibble, E., & Glaser, R. (1993). A naturalistic
study of transfer: Adaptive expertise in technical domains. In D. K. Detterman &
R. J. Sternberg (Eds.), Transfer on trial: Intelligence, cognition, and instruction
(pp. 258-288). Norwood, NJ: Ablex.
Hankins, T. C., & Wilson, G. F. (1998). A comparison of heart rate, eye activity, EEG
and subjective measures of pilot mental workload during flight. Aviation, Space,
and Environmental Medicine, 69(4), 360-367.
Harman, H. H. (1967). Modern factor analysis. Chicago: University of Chicago Press.
Hatano, G. (1982). Cognitive consequences of practice in culture specific procedural skills. Quarterly Newsletter of the Laboratory of Comparative Human Cognition, 4, 15-18.
Hatano, G., & Inagaki, K. (1986). Two courses of expertise. In H. Stevenson, H. Azuma, & K. Hakuta (Eds.), Child Development and Education in Japan (pp. 262-272). San Francisco, CA: Freeman.
Hatano, G., & Inagaki, K. (2000). Practice makes a difference: Design principles for adaptive expertise. Presented at the Annual Meeting of the American Education Research Association, New Orleans, LA, April 2000.
Hermans, D., Crombez, G., & Eelen, P. (2000). Automatic attitude activation and efficiency: The fourth horseman of automaticity. Psychologica Belgica, 40(1), 3-22.
Hmelo-Silver, C. E., Nagarajan, A., & Day, R. S. (2002). ‘‘It’s harder than we thought it
would be”: A comparative case study of expert-novice experimentation strategies.
Science Education, 86, 219-243.
Holbrook, A. (2002). Examining the quality of doctoral research. A Symposium
presented at the American Education Research Association Conference, New
Orleans, LA, April 1-5, 2002.
Holyoak, K. J. (1991). Symbolic connectionism: Toward third generation theories of
expertise. In K. A. Ericsson & J. Smith (Eds.) Toward a general theory of
expertise: Prospects and limits (pp. 301-335). New York: Cambridge University
Press.
Hong, J. C., & Liu, M. C. (2003). A study on thinking strategy between experts and
novices of computer games. Computers in Human Behavior, 19, 245-258.
Hooker, K., Nesselroade, D. W., Nesselroade, J. R., & Lerner, R. M. (1987). The structure of intraindividual temperament in the context of mother-child dyads: P-technique factor analyses of short-term change. Developmental Psychology, 23(3), 332-346.
Hulin, C. L., Henry, R. A., & Noon, S. L. (1990). Adding a dimension: Time as a factor
in the generalizability of predictive relationships. Psychological Bulletin, 107,
328-340.
Hyoenae, J., Tommola, J., & Alaja, A. (1995). Pupil dilation as a measure of processing
load in simultaneous interpretation and other language tasks. Quarterly Journal
of Experimental Psychology: Human Experimental Psychology, 48A(3), 598-612.
Jonassen, D. H. (2000). Toward a meta-theory of problem solving. Educational
Technology: Research and Development, 48(4), 63-85.
Jones, C. J., & Nesselroade, J. R. (1990). Multivariate, replicated, single-subject,
repeated measures designs and P-technique factor analysis: A review of
intraindividual change studies. Experimental Aging Research, 16, 171-183.
Kahane, H. (1973). Logic and philosophy: A modern introduction (2nd Ed.). Belmont,
CA: Wadsworth.
Klahr, D. (2000). Exploring science: The cognition and development of discovery
processes. Cambridge, MA: The MIT Press.
Klahr, D., & Dunbar, K. (1988). Dual space search during scientific reasoning. Cognitive
Science , 12(1), 1-55.
Klahr, D., Fay, A. L., & Dunbar, K. (1993). Heuristics for scientific experimentation: A
developmental study. Cognitive Psychology, 24(1), 111-146.
Klahr, D., & Simon, H. A. (1999). Studies of scientific discovery: Complementary
approaches and convergent findings. Psychological Bulletin, 125(5), 524-543.
Klahr, D., & Simon, H. A. (2001). What have psychologists (and others) discovered
about the process of scientific discovery? Current Directions in Psychological
Science, 10(3), 75-79.
Klein, G. A., & Calderwood, R. (1996). Investigations of naturalistic decision making
and the recognition-primed decision model (Research Note 96-43). Yellow
Springs, OH: Klein Associates, Inc. Prepared under contract MDA903-85-C-0327
for U. S. Army Research Institute for the Behavioral and Social Sciences,
Alexandria, VA.
Klein, G. A., Calderwood, R., & MacGregor, D. (1989). Critical decision method for
eliciting knowledge. IEEE Transactions on Systems, Man, and Cybernetics, 19,
462-472.
Kosslyn, S. M., Thompson, W. L., Kim, I. J., & Alpert, N. M. (1995). Topographical
representations of mental images in primary visual cortex. Nature, 378, 496-498.
Koubek, R. J., & Salvendy, G. (1991). Cognitive performance of super-experts on
computer program modification tasks. Ergonomics, 34, 1095-1112.
Labaree, D. F. (2003). The peculiar problems of preparing educational researchers.
Educational Researcher, 32(4), 13-22.
Lamberti, D.M., & Newsome, S.L. (1989). Presenting abstract versus concrete
information in expert systems: What is the impact on user performance.
International Journal of Man-Machine Studies, 31, 27-45.
Larkin, J.H. (1983). The role of problem representation in physics. In D. Gentner & A.L.
Stevens (Eds.). Mental models (pp. 75-98). Hillsdale, NJ: Lawrence Erlbaum
Associates.
Larkin, J.H. (1985). Understanding, problem representation, and skill in physics. In S.F.
Chipman, J.W. Segal, & R. Glaser (Eds.), Thinking and learning skills (Vol. 2):
Research and open questions (pp. 141-160). Hillsdale, NJ: Erlbaum.
Larkin, J., McDermott, J., Simon, D.P. and Simon, H.A. (1980a). Expert and novice
performance in solving physics problems. Science, 208, 1335-1342.
Larkin, J. H., McDermott, J., Simon, D. P., & Simon, H. A. (1980b). Models of
competence in solving physics problems. Cognitive Science, 4(4), 317-345.
Lavelli, M., Pantoja, A. P. F., Hsu, H., Messinger, D., & Fogel, A. (in press). Using
microgenetic designs to study change processes. In D. G. Teti (Ed.), Handbook of
Research Methods in Developmental Psychology. Oxford, UK: Blackwell
Publishers.
Lawson, A. E. (1978). Development and validation of the classroom test of formal
reasoning. Journal of Research in Science Teaching, 15(1), 11-24.
Lawson, A. E. (2000). Classroom test of scientific reasoning: Multiple choice version
(Revised Edition). Tempe, AZ: Arizona State University.
Lawson, A.E., Clark, B., Cramer-Meldrum, E., Falconer, K.A., Kwon, Y.J., & Sequist,
J.M. (2000). The development of reasoning skills in college biology: Do two
levels of general hypothesis-testing skills exist? Journal of Research in Science
Teaching, 37(1), 81-101.
Leckie, G. J. (1996). Desperately seeking citations: Uncovering faculty assumptions
about the undergraduate research process. The Journal of Academic
Librarianship, 22, 201-208.
Logan, G. (1988a). Toward an instance theory of automatization. Psychological Review,
95, 583-598.
Logan, G. D. (1988b). Automaticity, resources, and memory: Theoretical controversies
and practical implications. Human Factors, 30(5), 583-598.
Logan, G. D., & Cowan, W. (1984). On the ability to inhibit thought and action: A
theory of an act of control. Psychological Review, 91, 295-327.
Logan, G. D., Taylor, S. E., & Etherton, J. L. (1996). Attention in the acquisition and
expression of automaticity. Journal of Experimental Psychology: Learning,
Memory, and Cognition, 22(3), 620-638.
Lovett, M. C., & Anderson, J. R. (1996). History of success and current context in
problem solving: Combined influences on operator selection. Cognitive
Psychology, 31, 168-217.
Masunaga, H., & Horn, J. (2001). Expertise and age-related changes in components of
intelligence. Psychology and Aging, 16(2), 293-311.
McBurney, D. H. (1998). Research Methods. (4th edition). Pacific Grove, CA: Brooks/
Cole.
McGuire, W. J. (1997). Creative hypothesis generating in psychology: Some useful
heuristics. Annual Review of Psychology, 48, 1-30.
Meissner, C. A., & Memon, A. (2002). Verbal overshadowing: A special issue
exploring theoretical and applied issues. Applied Cognitive Psychology, 16, 869-872.
Moray, N., & Reeves, T. (1987). Hunting the homomorph: A theory of mental models
and a method by which they may be identified. Proceedings of the International
Conference on Systems, Man, and Cybernetics (pp. 594-597).
Nagai, M., Kazai, K., & Yagi, A. (2001). Lambda response by orientation of striped
patterns. Perceptual and Motor Skills, 93, 672-676.
Nesselroade, J.R., & Featherman, D.L. (1991). Intraindividual variability in older
adults’ depression scores: Some implications for development theory and
longitudinal research. In D. Magnusson, L. Bergman, G. Rudinger, & Y. B. Torestad (Eds.), Problems and methods in longitudinal research: Stability and change (pp. 7-66). Cambridge: Cambridge University Press.
Neves, D. (1977). An experimental analysis of strategies of the Tower of Hanoi (C.I.P.
Working Paper No. 362). Unpublished manuscript, Carnegie Mellon University.
Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ:
Prentice-Hall.
Nisbett, R. E., & Wilson, T. D. (1977). Telling more than we can know: Verbal reports
on mental processes. Psychological Review, 84, 231-259.
Onwuegbuzie, A. J., Slate, J. R., Paterson, F. R. A., Watson, M. H., & Schwartz, R. A.
(2000). Factors associated with achievement in educational research courses.
Research in the Schools, 7(1), 53-65.
Patel, V. L., Kaufman, D. R., & Magder, S. A. (1996). The acquisition of medical
expertise in complex dynamic environments. In K. A. Ericsson (Ed.), The road to
excellence: The acquisition of expert performance in the arts and sciences, sports and games
(pp. 127-165). Mahwah, NJ: Lawrence Erlbaum Associates.
Pedhazur, E. J., & Schmelkin, L. P. (1991). Measurement, design, and analysis: An
integrated approach. Hillsdale, NJ: Lawrence Erlbaum Associates.
Perkins, D. N., & Grotzer, T. A. (1997). Teaching intelligence. American Psychologist,
52(10), 1125-1133.
Phillips, L. H., Wynn, V. E., McPherson, S., & Gilhooly, K. J. (2001). Mental planning
and the Tower of London task. The Quarterly Journal of Experimental
Psychology, 54A(2), 579-597.
Proctor, R. W., & Dutta, A. (1995). Skill acquisition and human performance.
Thousand Oaks, CA: Sage Publications.
Radziszewska, B., & Rogoff, B. (1988). Influence of adult and peer collaborators on
children’s planning skills. Developmental Psychology, 24(6), 840-848.
Radziszewska, B., & Rogoff, B. (1991). Children’s guided participation in planning
imaginary errands with skilled adult or peer partners. Developmental Psychology,
27(3), 381-389.
Raskin, J. (2000). Humane interface: New directions for designing interactive systems.
Boston, MA: Addison-Wesley.
Raven, J.C. (1969). Standard progressive matrices. London: Lewis.
Reder, L. M., & Schunn, C. D. (1996). Metacognition does not imply awareness:
Strategy choice is governed by implicit learning and memory. In L. M. Reder
(Ed.), Implicit memory and metacognition (pp. 45-77). Mahwah, NJ: Lawrence
Erlbaum Associates.
Reingold, E. M., Charness, N., Schultetus, R. S., & Stampe, D. M. (2001). Perceptual
automaticity in expert chess players: Parallel encoding of chess relations.
Psychonomic Bulletin & Review, 8(3), 504-510.
Rowe, R. M., & McKenna, F. P. (2001). Skilled anticipation in real-world tasks:
Measurement of attentional demands in the domain of tennis. Journal of
Experimental Psychology: Applied, 7(1), 60-67.
Schaafstal, A., & Schraagen, J. M. C. (2000). Training of troubleshooting: A structured,
task analytical approach. In J. M. C. Schraagen, S. F. Chipman, & V. L. Shalin
(Eds.), Cognitive Task Analysis (pp. 57-70). Mahwah, NJ: Lawrence Erlbaum
Associates.
Schneider, W., & Fisk, A. D. (1982). Concurrent automatic and controlled visual
search: Can processing occur without resource cost? Journal of Experimental
Psychology: Learning, Memory, and Cognition, 8, 261-278.
Schneider, W., & Shiffrin, R.M. (1977). Controlled and automatic human information
processing: I. Detection, search, and attention. Psychological Review, 84(1), 1-66.
Schoenfeld, A. H. (1999). The core, the canon, and the development of research skills:
Issues in the preparation of education researchers. In E. C. Lagemann & L. S.
Shulman (Eds.), Issues in Education Research: Problems and Possibilities (pp.
166-202). San Francisco, CA: Jossey-Bass.
Schooler, J. W., Ohlsson, S., & Brooks, K. (1993). Thoughts beyond words: When
language overshadows insight. Journal of Experimental Psychology: General,
122(2), 166-183.
Schraagen, J. M. C. (1990). How experts solve a novel problem within their domain of expertise (IZF 1990 B-14). Soesterberg, The Netherlands: Netherlands Organization for Applied Scientific Research. Prepared under HDO assignment B89-35 for TNO Institute for Perception, TNO Division of National Defense Research, Soesterberg, The Netherlands.
Schraagen, J. M. C. (1993). How experts solve a novel problem in experimental design. Cognitive Science, 17(2), 285-309.
Schraagen, J. M. C., Chipman, S. F., & Shute, V. J. (2000). State-of-the-art review of cognitive task analysis techniques. In J. M. C. Schraagen, S. F. Chipman, & V. L. Shalin (Eds.), Cognitive Task Analysis (pp. 467-487). Mahwah, NJ: Lawrence Erlbaum Associates.
Schunn, C. D., & Anderson, J. R. (1998). Scientific discovery. In J. R. Anderson & C. Lebiere (Eds.), The Atomic Components of Thought (pp. 385-427). Mahwah, NJ: Lawrence Erlbaum Associates.
Schunn, C. D., & Anderson, J. R. (1999). The generality/specificity of expertise in scientific reasoning. Cognitive Science, 23(3), 337-370.
Schunn, C. D., & Klahr, D. (1996). The problem of problem spaces: When and how to go beyond a 2-space model of scientific discovery. In G. W. Cottrell (Ed.), Proceedings of the 18th Annual Conference of the Cognitive Science Society (pp. 25-26). Hillsdale, NJ: Erlbaum.
Schunn, C. D., Reder, L. M., Nhouyvanisvong, A., Richards, D. R., & Stroffolino, P. J. (1997). To calculate or not to calculate: A source activation confusion model of problem-familiarity’s role in strategy selection. Journal of Experimental Psychology: Learning, Memory, & Cognition, 23(1), 3-29.
Seamster, T. L., Redding, R. E., & Kaempf, G. L. (2000). A skill-based cognitive task
analysis framework. In J. M. C. Schraagen, S. F. Chipman, & V. L. Shalin (Eds.),
Cognitive Task Analysis (pp. 135-146). Mahwah, NJ: Lawrence Erlbaum
Associates.
Shafto, P., & Coley, J. D. (2003). Development of categorization and reasoning in the
natural world: Novices to experts, naïve similarity to ecological knowledge.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 29(4),
641-649.
Shalin, V. L., Geddes, N. D., Bertram, D., Szczepkowski, M. A., & DuBois, D. (1997).
Expertise in dynamic, physical task domains. In P. J. Feltovich, K. M. Ford, & R.
R. Hoffman (Eds.), Expertise in Context (pp. 195-217). Menlo Park, CA:
American Association for Artificial Intelligence Press.
Shiffrin, R. M. (1988). Attention. In R. C. Atkinson, R. J. Herrnstein, G. Lindzey, & R.
D. Luce, (Eds.), Stevens' Handbook of Experimental Psychology (2nd Ed.) (pp.
739-811). New York: Wiley.
Shiffrin, R. M. & Dumais, S. T. (1981). The development of automatism. In J. R.
Anderson (Ed.), Cognitive Skills and Their Acquisition (pp. 111-140). Hillsdale,
NJ: Erlbaum.
Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information
processing: II. Perceptual learning, automatic attending, and a general theory.
Psychological Review, 84, 127-190.
Simonton, D. K. (1999). Talent and its development: An emergenic and epigenetic
model. Psychological Review, 106, 435-457.
Singley, M. K., & Anderson, J. R. (1989). Transfer of cognitive skill. Cambridge, MA:
Harvard University Press.
Sloutsky, V. M., & Yarlas, A. (2000). Problem representation in experts and novices: Part
2. Underlying processing mechanisms. Proceedings of the XXII Annual
Conference of the Cognitive Science Society (pp. 475-480). Mahwah, NJ:
Erlbaum.
Stankov, L., & Crawford, J. D. (1993). Ingredients of complexity in fluid intelligence.
Learning and Individual Differences, 5, 73-111.
Stankov, L., & Raykov, T. (1993). On task complexity and “simplex” correlation
matrices. Australian Journal of Psychology, 45, 125-145.
Stankov, L., & Raykov, T. (1995). Modeling complexity and difficulty in measures of
fluid intelligence. Structural Equation Modeling, 2, 335-366.
Starkes, J. L., Deakin, J. M., Allard, F., Hodges, N. J., & Hayes, A. (1996). Deliberate
practice in sports: What is it anyway? In K. A. Ericsson (Ed.), The road to
excellence: The acquisition of expert performance in the arts and sciences, sports,
and games (pp. 81-106). Mahwah, NJ: Lawrence Erlbaum Associates.
Sternberg, R. J. (1997). Cognitive conceptions of expertise. In P. J. Feltovich, K. M.
Ford, & R. R. Hoffman (Eds.), Expertise in Context (pp. 149-162). Menlo Park,
CA: American Association for Artificial Intelligence Press.
Sternberg, R. J., Grigorenko, E. L., & Ferrari, M. (2002). Fostering intellectual
excellence through developing expertise. In. M. Ferrari (Ed.), The Pursuit of
Excellence Through Education (pp. 57-83). Mahwah, NJ: Lawrence Erlbaum
Associates.
Sternberg, R. J., & Horvath, J. A. (1998). Cognitive conceptions of expertise and their
relations to giftedness. In R. C. Friedman & K. B. Rogers (Eds.), Talent in
Context (pp. 177-191). Washington, DC: American Psychological Association.
Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive
Science, 12, 257-285.
Sweller, J. (1989). Cognitive technology: Some procedures for facilitating learning and
problem solving in mathematics and science. Journal of Educational Psychology, 81(4), 457-466.
Sweller, J. (1994). Cognitive load theory, learning difficulty, and instructional design.
Learning & Instruction, 4(4), 295-312.
Sweller, J., Chandler, P., Tierney, P., & Cooper, M. (1990). Cognitive load as a factor in
the structuring of technical material. Journal of Experimental Psychology:
General, 119(2), 176-192.
Thagard, P. (1998). Ulcers and bacteria: I. Discovery and acceptance. Studies in the
History and Philosophy of Biology and Biomedical Sciences, 9, 107-136.
Torff, B. (2003). Developmental changes in teachers' use of higher order thinking and
content knowledge. Journal of Educational Psychology, 95(3), 563-569.
VanLehn, K. (1996). Cognitive skill acquisition. Annual Review of Psychology, 47,
513-539.
VanLehn, K., Siler, S., Murray, C., Yamauchi, T., & Baggett, W. B. (2003). Why do
only some events cause learning during human tutoring? Cognition &
Instruction, 21(3), 209-249.
Velmahos, G., Toutouzas, K., Sillin, L., Chan, L., Clark, R. E., Theodorou, D., Maupin, F., Murray, J., Sullivan, M., Demetriades, D., & DeMeester, T. M. (2002).
Cognitive task analysis for teaching technical skills in an animate surgical skills
laboratory: A randomized controlled trial with pertinent clinical outcomes.
Presented at the 2002 Annual meeting of the Association for Surgical Education,
April 4-6, Baltimore, MD.
Vicente, K. J. (2000). Revisiting the constraint attunement hypothesis: Reply to
Ericsson, Patel, and Kintsch (2000) and Simon and Gobet (2000). Psychological
Review, 107(3), 601-608.
Voss, J. F., Tyler, S. W., & Yengo, L. A. (1983). Individual differences in the solving of
social science problems. In R. F. Dillon & R. R. Schmeck (Eds.), Individual
Differences in Cognition (Vol. 1, pp. 205-232). New York: Academic.
Wegner, D. M. (2002). The Illusion of Conscious Will. Cambridge, MA: MIT Press.
Wheatley, T., & Wegner, D. M. (2001). Automaticity of action, Psychology of. In N. J.
Smelser & P. B. Baltes (Eds.), International Encyclopedia of the Social and
Behavioral Sciences (pp. 991-993). Oxford, UK: Elsevier Science Limited.
Williams, K. E. (2000). An automated aid for modeling human-computer interaction. In
J. M. C. Schraagen, S. F. Chipman, & V. L. Shalin (Eds.), Cognitive Task
Analysis (pp. 165-180). Mahwah, NJ: Lawrence Erlbaum Associates.
Wilson, G. F., & Fisher, F. (1995). Cognitive task classification based upon topographic
EEG data. Biological Psychology, 40, 239-250.
Wilson, T. D., & Nisbett, R. E. (1978). The accuracy of verbal reports about the effects
of stimuli on evaluations and behavior. Social Psychology, 41(2), 118-131.
Wood, P., & Brown, D. (1994). The study of intraindividual differences by means of
dynamic factor models: Rationale, implementation, and interpretation.
Psychological Bulletin, 116(1), 166-186.
Yang, S. C. (in press). Reconceptualizing think-aloud methodology: Refining the
encoding and categorizing techniques via contextualized perspectives. Computers
in Human Behavior, 1-21.
Yarlas, A., & Sloutsky, V. M. (2000). Problem representation in experts and novices: Part
1. Differences in the content of representation. Proceedings of the XXII Annual
Conference of the Cognitive Science Society (pp. 1006-1011). Mahwah, NJ:
Erlbaum.
Zeitz, C. M. (1997). Some concrete advantages of abstraction: How experts’
representations facilitate reasoning. In P. J. Feltovich, K. M. Ford, & R. R.
Hoffman (Eds.), Expertise in Context (pp. 43-65). Menlo Park, CA: American
Association for Artificial Intelligence.