1 Protocols for Cognitive Task Analysis by Robert R. Hoffman Florida Institute for Human and Machine Cognition Beth Crandall & Gary Klein Klein Associates Division of ARA including- Goal Directed Task Analysis by Debra G. Jones and Mica R. Endsley SA Technologies This document is intended to empower people in the conduct of Cognitive Task Analysis and Cognitive Field Research. All rights reserved. Copyright 2008 by R. R. Hoffman, B. Crandall, G. Klein, D. G. Jones and M. R. Endsley. Individuals or organizations wishing to use or cite this material should request permission from: Robert R. Hoffman, Ph.D., Institute for Human & Machine Cognition 40 South Alcaniz St. Pensacola, FL 32502-6008 rhoffman@ihmc.us This document was prepared as a companion to Working Minds: A Practitioner’s Guide to Cognitive Task Analysis by B. Crandall, G. Klein, G., & R. R. Hoffman. Cambridge MA: MIT Press, 2005. The CTA protocols are best conducted through an understanding of procedural discussions in Working Minds. This document was prepared with the support of the Advanced Decision Architectures Collaborative Technology Alliance, sponsored by the US Army Research Laboratory under Cooperative Agreement DAAD19-01-2-0009. 2 1. Introduction 2. Bootstrapping Methods 2.1. Documentation Analysis 2.2. The Recent Case Walkthrough 2.3. The Knowledge Audit 2.4. Client Interviews 3. Proficiency Scaling 3.1. General Introduction 3.2. Career Interviews 3.3. Sociometry 3.3.1. General Protocol Notes 3.3.2. General Forms 3.3.3. Ratings Items 3.3.4. Sorting Items 3.3.5. Communication Patterns 3.3.5.1. Within-groups 3.3.5.2. Within-domains 3.4. Cognitive Styles Analysis 4. Workplace Observations and Interviews 4.1. General Introduction 4.2. Workspace Analysis 4.3. Activities Observations 4.4. Locations Analysis 4.5. Roles Analysis 4.6. Decision Requirements Analysis 4.7. Action Requirements Analysis 4.8. SOP Analysis 5. Modeling Practitioner Reasoning 5.1. Protocol Analysis 5.1.1. General Introduction 5.1.2. Materials 5.1.3. Coding Schemes 5.1.4. Effort 5.1.5. Coding Schemes Examples 5.1.6. Coding Verification 5.1.7. Procedural Variations and Short-cuts 5.1.8. A Final Word 5.2. Cognitive Modeling Procedure 5.2.1. Introduction and Background 5.2.2 Protocol Notes 5.2.2.1. Preparation 5.2.2.2. Round 1 5.2.2.3. Round 2 5.2.2.4. Round 3 5.2.2.5. Products 5.2.3. Templates 5.2.3.1. Round 1 5.2.3.2. Round 2 5.2.3.3. Round 3 6. The Critical Decision Method 6.1. General Description of the CDM Procedure 6.1.0. Preparation 6.1.1. Step One: Incident Selection 6.1.2 Step Two: Incident Recall 6.1.3. Step Three: Incident Retelling 6.1.4. Step Four: Timeline Verification and Decision Point Identification 6.1.5. Step Five: Deepening 6.1.6. Step Six: "What-if" Queries 6.2. Protocol Notes for the Researcher 3 6.3. Boiler Plate Data Collection Forms 6.3.0. Cover Form 6.3.1. Step One: Incident Selection 6.3.2. Step Two: Incident Recall 6.3.3. Step Three: Incident Retelling 6.3.4. Step Four: Timeline Verification and Decision Point Identification 6.3.5. Step Five: Deepening 6.3.6. Step Six: "What-if" Queries 6.3.7. Final Integration 6.3.8. Decision Requirements Table 7. Goal-Directed Task Analysis 7.1. Introduction 7.2. Elements of the GDTA 7.2.1. Goals 7.2.2 Decisions 7.2.3. Situation Awareness Requirements 7.3. Documenting the GDTA 7.3.1. Goal Hierarchy 7.3.2. Relational Hierarchy 7.3.3. Identifying Goals 7.3.4. Defining Decisions 7.3.5. Delineating SA requirements 7.4. GDTA Methodology 7.4.1. Step 1: Review Domain 7.4.2. 
Step 2: Initial interviews 7.4.3. Step 3: Develop the goal hierarchy 7.4.4. Step 4: Identify decisions and SA requirements 7.4.5. Step 5: Additional interviews / SME review of GDTA 7.4.6. Step 6: Revise the GDTA 7.4.7. Step 7: Repeat steps 5 & 6 7.4.8. Step 8: Validate the GDTA 7.5. Simplifying the Process 7.6. Conclusion 8. References

1. Introduction

The purpose of this book is to lay out guidance, including sample forms and instructions, for conducting a variety of Cognitive Task Analysis and Cognitive Field Research methods (including both knowledge elicitation and knowledge modeling procedures) that have a proven track record of utility in service of both expertise studies and the design of new technologies. Our presentation of guidance includes protocol notes. These are ideas, lessons learned, and cautionary tales for the researcher. We also present Templates—prepared forms for various tasks. We describe methods useful in bootstrapping, which involves learning about the domain and the needs of the clients. We describe methods used in proficiency scaling and procedures for modeling knowledge and reasoning. We describe methods coming from the academic laboratory (e.g., Protocol Analysis), methods coming from applied research (e.g., Sociometry), and methods coming from Cognitive Anthropology (e.g., Workpatterns Observations). Along the way we present lessons learned and rules of thumb that arose in the course of conducting a number of projects, both small-scale and large-scale. The guidance offered here spans the entire gamut—going all the way from general advice about record-keeping to detailed advice about audio recording knowledge elicitation interviews; going all the way from general issues involved in protocol analysis to detailed advice about the problems and obstacles involved in managing large numbers of multi-media resources for knowledge models.

Supporting material, including general guidance for the conduct of interviews and detailed guidance for the Critical Decision Method, can be found in: Crandall, B., Klein, G., & Hoffman, R. R. (2006). Working Minds: A Practitioner's Guide to Cognitive Task Analysis. Cambridge, MA: MIT Press.

2. CTA Bootstrapping Methods
Protocols and Templates

In many projects that use Cognitive Work Analysis methodologies, one of the first steps is bootstrapping, in which the analysts familiarize themselves with the domain.

2.1. Documentation Analysis

Bootstrapping typically involves an analysis of documents (manuals, texts, technical reports, etc.). The literature of research and applications of knowledge elicitation converges on the notion that documentation analysis can be a critically important method, one that should be utilized as a matter of necessity (Hoffman, 1987; Hoffman, Shadbolt, Burton & Klein, 1995). Documentation analysis is relied upon heavily in the Work Domain Analysis phase of Cognitive Work Analysis (see Hoffman & Lintern, 2005; Vicente, 1999). Documentation Analysis can be a time-consuming process, but can sometimes be indispensable in knowledge elicitation (Kolodner, 1983). In a study of aerial photo interpreters (Hoffman, 1987), interviews about the process of terrain analysis began only after an analysis of the readily available basic knowledge of concepts and definitions. To take up the expert's time by asking questions such as "What is limestone?" would have made no sense. Although it is usually considered to be a part of bootstrapping, documentation analysis invariably occurs throughout the entire research programme.
For example, in the weather forecasting case study (Hoffman, Coffey, & Ford, 2000), the bootstrapping process focused on an analysis of published literatures and technical reports that made reference to the cognition of forecasters. That process resulted in the creation of the project guidance document. However, documentation analyses of other types occurred throughout the remainder of the project—analysis of records of weather forecasting case studies, analysis of Standard Operating Procedures documents, analysis of the Local forecasting Handbooks, etc. Analysis of documents may range from underlining and note-taking to detailed propositional or content analysis according to functional categories Documentation Analysis can be conducted for a number of purposes or combinations of purposes. Rather than just having information flow from documents into the researcher's understanding, the analysis of the documents can involve specific procedures that generate records or analyses of the knowledge contained in the documents. Documentation Analysis can: Be suggestive of the reasoning of domain practitioners, and hence contribute to the forging of reasoning models, Be useful in the attempt to construct knowledge models, since the literature may include both useful categories for domain knowledge and important specific "tid-bits" of domain knowledge. In some domains, texts and manuals contain a great deal of specific information about domain knowledge. Be useful in the identification of leverage points—aspects of the work where even a modest infusion of or improvement in technology might result in a proportionately greater improvement in the work. 6 Often, "Standard Operating Procedures" documents are a source of information for bootstrapping. However, if one wants to use an analysis of SOP documents for other purposes, (e.g., as a way of verifying the Decision Requirements analysis), one may hold off on even looking at SOP documents during the bootstraping phase. Therefore, we include the protocol and template for SOP analysis later in this document, in the category of "Workplace Observations and Interviews. The uses and strengths of documentation analysis notwithstanding, the limitations must also be pointed out. Practitioners possess knowledge and strategies that do not appear in the documents and task descriptions and, in some cases at least, could not possibly be documented. Even in Work Domain Analysis, which is heavily oriented towards Documentation Analysis, interactions with experts are used to confirm and refine the Abstraction-Decomposition Analyses. In the weather forecasting project (Hoffman, Coffey, & Carnot, 2000) an expert told how she predicted the lifting of fog. She would look out toward the downtown and see how many floors above ground level she could count before the floors got lost in the fog deck. Her reasoning relied on a heuristic of the form, "If I cannot see the 10th floor by 10AM, pilots will not be able to take off until after lunchtime." Such a heuristic had great value, but is hardly the sort of thing that would be allowed in a formal standard operating procedure document. Many observations have been made of how engineers in process control bend rules and deviate from mandated procedures so that they can more effectively get their jobs done (see Koopman & Hoffman, 2003). 
We would hasten to generalize by saying that all practitioners who work in complex sociotechnical contexts possess knowledge and reasoning strategies that are not captured in existing procedures or documents, many of which represent (naughty) departures from what those experts are supposed to do or to believe (Johnston, 2003; McDonald, Corrigan, & Ward, 2002).

2.2. The Recent Case Walkthrough

2.2.1. Protocol Notes

The Recent Case Walkthrough is an abbreviated version of the Critical Decision Method, and is primarily aimed at:
Bootstrapping the Analyst into the domain
Establishing rapport with the Participant
The method can also lead to the identification of potential leverage points, a tentative notion of practitioner styles, or tentative ideas about other aspects of cognitive work.

2.2.2. Template "The Recent Case Walkthrough"

Interviewer   Participant   Start time   Finish time

Instructions: Please walk me through your most recent problem. When was it? What were the goals you had to achieve? What did you do, step by step?
Narrative (enlarge this cell as needed)

Build a timeline
TIME   EVENT   (Duplicate and expand these rows as needed)

Revise the Timeline
Instructions: Now let's go over this timeline. I'd like you to suggest corrections and additions.
Time   Event   (Duplicate and expand these rows as needed)

Deepening
Instructions: Now I'd like to go through this timeline a step at a time and ask some questions.

Information
What information did you need or use to make this judgment? Where did you get this information? What did you do with this information?

Mental Modeling
As you went through the process of understanding this situation, did you build a conceptual model of the problem scenario? Did you try to imagine the important causal events and principles? Did you make a spatial picture in your head? Can you draw me an example of what it looks like?

Knowledge Base
In what ways did you rely on your knowledge of this sort of case? How did this case relate to typical cases you have encountered? How did you use your knowledge of typical patterns?

Guidance
At what point did you look at any sort of guidance? Why did you look at the guidance? How did you know if you could trust the guidance? How do you know which guidance to trust? Did you need to modify the guidance? When you modify the guidance, what information do you need or use? What do you do with that information?

Legacy
What forms do you have to complete, or specific products do you have to produce?

Procedures

Deepening Form
Timeline Entry   Probe   Response   (Duplicate and expand these rows as needed)

2.3. The Knowledge Audit

2.3.1. Template "The Knowledge Audit"

Interviewer   Participant   Start time   Finish time

Your work involves a need to analyze cases in great detail. But we know that experts are also able to maintain a sense of the "big picture." Can you give me an example of what is important about the big picture for (your, this) task? What are the major elements you have to know and keep track of? (Expand these rows, as needed)

When you are conducting (this) task, are there ways of working smart, or accomplishing more with less--ways that you have found especially useful? What are some examples when you have improvised on this task, or noticed an opportunity to do something more quickly or better, and then followed up on it?

Have there been times when the data or the equipment pointed in one direction, but your own judgment told you to do something else?
Or when you had to rely on experience to avoid being led astray by equipment or the data?

Can you think of a situation when you realized that you had to change what you were doing in order to get the job done?

Can you recall an instance where you spotted a deviation from the norm, or recognized that something was amiss?

Have you ever had experiences where part of a situation just "popped" out at you?--a time when you walked into a situation and knew exactly how things got there and where they were headed?

Experts are able to detect cues and see meaningful patterns that novices cannot see. Can you tell me about a time when you predicted something that others did not, or where you noticed things that others did not catch? Were you right? Why (or why not)?

Experts can predict the unusual. I know this happens all the time, but my guess is that the more experience you have, the more accurate your seemingly unusual judgments become. Can you remember a time when you were able to detect that something unusual happened?

2.4. Client Interviews

2.4.1. Protocol Notes

The "clients" are the individuals or organizations that use the products that come from the domain practitioners (or their organizations). For example, new technologies might be designed to assist weather forecasters (who would be the "end-users" of the systems) whose job it is to produce forecasts to assist aviators. The aviators would be the "clients." One might make a great tool to help weather forecasters, but if it does not help the forecasters help their clients, it may end up an instance of how good intentions can lead to hobbled technologies. For many projects in which CTA plays a role, it is important to identify Clients' needs (for information, planning, decision-making, etc.), and to understand the differing needs and requirements of different Clients.

Some "Tips"
Keep an eye out for inadequacies in the standard forms, formats, and procedures used in the User-Client exchanges.
Keep an eye out for adaptation (including local kluges and work-arounds).
It can be useful to use Knowledge Audit probes in the Client Interviews. One need not use all of the probe questions that are included in the Template, below.

2.4.2. Template "Client Interviews"

Analyst:   Participant   Date   Start time   Finish time
Participant Age
Participant Job designation (or duty assignment), and title (or rank)

How many years have you been ___________________ ?
What kinds of projects have you been involved in? (Expand the cells, as needed)
Have you had any training in (provider's domain)?
What is the range of experience you have had at (provider's domain or products)?
What is your current (job/project)?
Can you give me an example of a project where (provider information/services/products) had a big impact?
Can you give me an example of a project where the provider's (products/information/services) made a big difference in success or failure?
What is it that you like about the provider's (products/information/services) that are provided to you?
Can you give me an example of a project where you were highly confident in the provider's (products/services/information)?
Can you give me an example of a project where you were highly skeptical of the provider's (products/services/information)?
How do you use the provider's (products/services/information) during your projects?
Have you ever been in a situation where you felt that you understood the (provider's domain) better than the provider?
16 Have you ever felt that you needed a kind of (product/service/information) that you are not given? In your opinion, what characterizes a good provider? What is the value of having providers who are highly skilled? Can you think of any things you could do to help the providers get better? 17 3. Proficiency Scaling Methods Protocols and Templates 3.1. General Introduction The psychological and sociological study of interpersonal judgments and attributions dates to the early days of both fields (e.g., Thorndike, 1918; see Mieg, 2001). In the past two decades, the areas of “Knowledge Elicitation” and “Expertise Studies” have been focused specifically on proficiency scaling, primarily for the purpose of identifying experts. The rich understanding of expert-level knowledge and reasoning provides a "gold standard" for domain knowledge and reasoning strategies. The rich understanding of journeymen, apprentices, and novices informs the researcher concerning training needs, the adaptation of software systems to individual user models, etc. In parallel, the areas of the Sociology of Scientific Knowledge and Cognitive Anthropology, have been focusing some effort on studies of work in the modern technical workplace and thus have brought to bear methods of ethnography. This has also entailed the study of proficient skill. This paradigm has been referred to as the study of "everyday cognition" and "cognition in the wild" (Hutchins, 1995a,b; Lave, 1988; Scribner, 1984; Suchman, 1993). Researchers have studied a great many domains of proficiency, including traditional farming methods used in Peru, photocopier repair, the activities of military command teams, the social interactions of collaborative teams of designers, financiers, climatologists, physicists, and engineers, and so on (Collins, 1985; Cross, Christians, and Dorst, 1996; Fleck and Williams, 1996; Greenbaum and Kyng, 1991; Hutchins, 1995; Lynch, 1991; Orr, 1985). Classic studies in this paradigm include: Hutchins' studies of navigation (by both individuals and teams, by both culturallytraditional methods and modern technology-based methods) (1990) on the interplay of navigation tools and collaborative activity in maritime navigation, Hutchins' (1995b) study of flight crew collaboration and team work, Orr's study of how photocopier repair technicians acquire knowledge and skill by sharing their "war stories" with one another. Lave's (1988) studies of traditional methods used in crafts such as tailoring, and the nature of skills used in everyday life (e.g., math skills used during supermarket shopping). Research in the area of the "sociology of scientific knowledge" (e.g., Collins, 1997; Fleck and Williams, 1996; Knorr-Cetina, 1981; Knorr-Cetina and Mulkay, 1983) also falls within this paradigm, and has involved detailed analyses of science as it is practiced, e.g., the history of the development of new lasers and of gravity wave detectors, the role of consensus in decisions about what makes for proper practice, breakdowns in the attempt to transmit skill from one laboratory to another, failures in science or in the attempt to apply new technology, etc. In psychology and cognitive science, concern with the question of how to define expertise (Hoffman, 1998) led to an awareness that determination of who is an expert in a given domain can require its own empirical effort. In a proficiency scaling procedure, the researcher determines a domain-and organizationally appropriate scale of proficiency levels. 
Proficiency scaling is the attempt to forge a domain- and organizationally-appropriate scale for distinguishing levels of proficiency. Some alternative methods are described in Table 3.1. 18 Table 3.1. Some methods that can contribute data for the creation of a proficiency scale. Method Yield In-depth career interviews about education, training, etc. Ideas about breadth and depth of experience; Estimate of hours of experience Professional standards or licensing Ideas about what it takes for individuals to reach the top of their field. Measures of performance at the familiar tasks Can be used for convergence on scales determined by other methods. Social Interaction Analysis Proficiency levels in some group of practitioners or within some community of practice (Meig, 2000; Stein, 1997) Example Weather forecasting in the armed services, for instance, involves duty assignments having regular hours and regular job or task assignments that can be tracked across entire careers. Amount of time spent at actual forecasting or forecasting-related tasks can be estimated with some confidence (Hoffman, 1991). The study of weather forecasters involved senior meteorologists US National Atmospheric and Oceanographic Administration and the National Weather Service (Hoffman, Coffey, and Ford, 2000). One participant was one of the forecasters for Space Shuttle launches; another was one of the designers of the first meteorological satellites. Weather forecasting is again a case in point since records can show for each forecaster the relation between their forecasts and the actual weather. In fact, this is routinely tracked in forecasting offices by the measurement of "forecast skill scores" (see Hoffman and Trafton, forthcoming). In a project on knowledge preservation for the electric power utilities (Hoffman and Hanes, 2003), experts at particular jobs (e.g., maintenance and repair of large turbines, monitoring and control of nuclear chemical reactions, etc.) were readily identified by plant managers, trainers, and engineers. The individuals identified as experts had been performing their jobs for years and were known among company personnel as "the" person in their specialization: "If there was that kind of problem I'd go to Ted. He's the turbine guy." If research or domain practice is predicated on any notion of expertise (e.g., a need to build a new decision support system that will help experts but can also be used to teach apprentices), then it is necessary to have some sort of empirical anchor on what it really means for a person to be an "expert," say, or an "apprentice." A proficiency scale should be based on more than one method, and ideally should be based on at least three. We refer to this as the “three legs of a tripod.” Social Interaction Analysis, the result of which is a sociogram, is perhaps the lesser known of the methods. A sociogram, that represents interaction patterns between people (e.g., frequent interactions), is used to study group clustering, communication patterns, and workflows and processes. For Social Interaction Analysis, multiple individuals within an organization or a community of practice are interviewed. Practitioners might be asked, for example, "If you have a problem of type x, who would you go to for advice?" or they might be asked to sort cards bearing the names of other domain practitioners into piles according to one or another skill dimension or knowledge category. 
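As an illustration of how such nomination data can be summarized, the sketch below (in Python, with invented practitioner names and responses) tallies how often each person is named as an advice source and reports a simple in-degree count. More formal centrality measures from social network analysis could be substituted; this is only a minimal example of the kind of tabulation a Social Interaction Analysis produces.

# Illustrative sketch only: the names and nominations are hypothetical.
# Each participant lists the colleagues they would go to for advice on problems of type x.
from collections import Counter

nominations = {
    "Practitioner A": ["Practitioner C", "Practitioner D"],
    "Practitioner B": ["Practitioner C"],
    "Practitioner C": ["Practitioner D"],
    "Practitioner D": ["Practitioner C"],
    "Practitioner E": ["Practitioner C", "Practitioner A"],
}

# In-degree: how many colleagues name this person as an advice source.
in_degree = Counter()
for respondent, advisors in nominations.items():
    for advisor in advisors:
        if advisor != respondent:   # ignore any self-nominations
            in_degree[advisor] += 1

# Practitioners who are nominated most often are candidates for the "expert" end of the scale.
for person, count in in_degree.most_common():
    print(person + ": nominated by " + str(count) + " of " + str(len(nominations)) + " respondents")

Counts of this sort are one input to a proficiency scale; as noted below, they should be converged with career-interview and performance data rather than used alone.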
Hoffman, Ford and Coffey (2000) suggested that proficiency scaling for a given project should be based on at least three of the methods listed in Table 3.1. It is important to create a scale that is both domain- and organizationally-appropriate, and that considers the full range of proficiency. A rule of thumb in psychology is that it takes 10 years of intensive practice to become an expert (Ericsson, Charness, Feltovich, and Hoffman, 2005). Assuming six hours of concentrated practice per day, five days per week, for 40 weeks per year, this would mean 12,000 hours of practice at domain tasks. While convenient, such rules of thumb are just that, rules of thumb. In the project on weather forecasting (Hoffman, Coffey and Ford, 2000), the proficiency scale distinguished three levels: experts, journeymen, and apprentices, each of which was further distinguished by three levels of seniority, up to the senior experts who had as much as 50,000 hours of practice and experience at domain tasks.

Following Hoffman (1988; Hoffman, Shadbolt, Burton, and Klein, 1995; Hoffman, 1998) we rely on a modern variation on the classification scheme used in the craft guilds (Renard, 1968).

Table 3.2. Basic proficiency categories (adapted from Hoffman, 1998).

Naive: One who is totally ignorant of a domain.

Novice: Literally, someone who is new—a probationary member. There has been some ("minimal") exposure to the domain.

Initiate: Literally, someone who has been through an initiation ceremony—a novice who has begun introductory instruction.

Apprentice: Literally, one who is learning—a student undergoing a program of instruction beyond the introductory level. Traditionally, the apprentice is immersed in the domain by living with and assisting someone at a higher level. The length of an apprenticeship depends on the domain, ranging from about one to 12 years in the craft guilds.

Journeyman: Literally, a person who can perform a day's labor unsupervised, although working under orders. An experienced and reliable worker, or one who has achieved a level of competence. It is possible to remain at this level for life.

Expert: The distinguished or brilliant journeyman, highly regarded by peers, whose judgments are uncommonly accurate and reliable, whose performance shows consummate skill and economy of effort, and who can deal effectively with certain types of rare or "tough" cases. Also, an expert is one who has special skills or knowledge derived from extensive experience with subdomains.

Master: Traditionally, a master is any journeyman or expert who is also qualified to teach those at a lower level. Traditionally, a master is one of an elite group of experts whose judgments set the regulations, standards, or ideals. Also, a master can be that expert who is regarded by the other experts as being "the" expert, or the "real" expert, especially with regard to sub-domain knowledge.

This is a useful first step, in that it adds some explanatory weight to the category labels, but it needs refinement when actually applied in particular projects. In the weather forecasting project (Hoffman, Coffey and Ford, 2000) it was necessary to refer to Senior and Junior ranks within the expert and journeyman categories. (For details, see Hoffman, Trafton, and Roebber, 2009.) The Table entries fall a step short of being operational definitions.
They are suggestive, however, and this is where the specific methods of proficiency scaling come in: Career evaluations (hours of experience, breadth and depth of experience), social evaluations and attributions (sociograms), and performance evaluations.

Mere years of experience does not guarantee expertise, however. Experience scaling can also involve an attempt to gauge the practitioners' breadth of experience—the different contexts or sub-domains they have worked in, the range of tasks they have conducted, etc. Deeper experience (i.e., more years at the job) affords more opportunities to acquire broader experience. Hence, depth and breadth of experience should be correlated. But they are not necessarily correlated, and so examination of both breadth and depth of experience is always wise. Thus, one comes to a resolution that age and proficiency are generally related, that is, partially correlated. This is captured in Table 3.3.

Table 3.3. Age and proficiency are, generally speaking, partially correlated. (Entries summarize what is typical at each proficiency stage across the lifespan, roughly from age 10 through age 60 and beyond.)

Novice: Learning can commence at any age (e.g., the school child who is avid about dinosaurs; the adult professional who undergoes job retraining).

Intern and Apprentice: Can commence at any age, although individuals less than 18 years of age are rarely considered appropriate as Apprentices.

Journeyman: It is possible to achieve journeyman status early in life in some domains (e.g., chess, computer hacking, sports), but achievement in significant domains may not be possible at the youngest ages. The stage is typically achieved in the mid- to late-20s, but development may go no further.

Expert Stage: It is possible to achieve expertise early in some domains (e.g., chess, computer hacking, sports), but achievement of expertise in significant domains may not be possible at the youngest ages. Expertise is most typical for 35 years of age and beyond.

Master Stage: Is rarely achieved early in a career; it is possible to achieve mid-career; most typical of seniors.

To summarize, a rich understanding of expert-level knowledge and reasoning provides a "gold standard" for domain knowledge and reasoning strategies. The rich understanding of journeymen, apprentices, and novices informs the researcher concerning training needs, the adaptation of software systems to individual user models, etc. A number of CTA methods can support the process of proficiency scaling. Proficiency scaling is the attempt to forge a domain- and organizationally-appropriate scale for distinguishing levels of proficiency. A proficiency scale can be based on:
1. Estimates of experience extent, breadth, and depth based on results from a Career Interview,
2. Evaluations of performance based on performance measures,
3. Ratings and skill category sortings from a sociogram.

If research is predicated on any notion of expertise (e.g., a need to build a new decision support system that will help experts but can also be used to teach apprentices), then it is necessary to have some sort of empirical anchor on what it really means for a person to be an "expert," say, or an "apprentice." Ideally, a proficiency scale will be based on more than one method.

3.2. Career Interviews

3.2.1. General Protocol Notes

Important things the Interviewer needs to accomplish:
Convey respect for the Participant for their job and the contribution their job represents.
Convey respect for the Participant for their skill level and accomplishments.
Convey a desire to help the Participant by learning about their goals and concerns.
Mitigate any concerns that the Participant might have that the Interviewer is there to finger-point, disrupt the work, or broker people's agendas.
Prove that the Interviewer has done some homework, and knows something about the Participant's domain.

Important things the Interviewer needs to learn:
What it takes for a practitioner in the domain to achieve expertise.
The range of kinds of experience (breadth and depth) that characterize the most highly experienced practitioners.

Acquire a copy of the participant's vita or resume and append it to the interview data.

Selection of Participants can depend on a host of factors, ranging from travel and availability issues to any need or requirement there might be to identify and interview experts. Whatever the criteria or constraints, it is generally useful to interview personnel who represent a range of job assignments, levels of experience, and levels of "local" knowledge. In most CTA procedures, it is necessary to have at least some Participants who conduct domain tasks on a daily basis. In most CTA procedures it is highly valuable to have Participants who can talk openly about their processes and procedures, seem to love their jobs, and have had sufficient experience to have achieved expertise at using the newest technologies that are in the operational context.

Lesson Learned
Practitioners who are verbally fluent and who seem to love their domain can sometimes be too talkative and can jump around, telling stories and getting off on tangents, especially since in some cases they rarely have opportunities to talk about their domain and their experiences to interested listeners. They sometimes need to be kept on track by saying, e.g., "That's an interesting story—let's come back to that…"

Some "Tips"
Until or unless each interviewer has achieved sufficient skill or has had sufficient practice with Participants in the domain at hand, interviews should be conducted by pairs of interviewers, who take separate notes.
Expect that the interview guides will have to be revised, even several times, during the course of the data collection phase.
Think twice about audiotape recording the interviews—it can take a great deal of time to transcribe and code the data.
For interviews that are audio recorded, do not conduct the interview in a room with ANY background noise (e.g., air conditioners, equipment, etc.), as this will significantly degrade the quality of the recordings. Consider using a y-connector allowing input from two lapel microphones, one clipped to the collar of the Participant, one to the collar of the Interviewer. After the recorder is started, preface the interview with a clear statement of the day, date, time, Participant name, Researcher name, project name, and interview type. Be sure to ask the Participant to speak clearly and loudly.

It is often valuable, and sometimes necessary, to attempt to determine how much time a Participant has spent engaged in activities that arguably fall within the domain. One might focus on the amount of time the Participant has spent at some single activity. For instance, how much time do weather forecasters spend trying to interpret infra-red satellite images? One might focus on something broader, such as how much time a forecaster has spent engaged in all forecasting-related activities. The rule of thumb from expertise studies is that to qualify as an expert one must have had 10 years of full-time experience. Assuming 40 hours per week and 40 weeks per year, this would amount to 16,000 hours. Another rule of thumb, based on studies of chess, is 10,000 hours of task experience (Chase and Simon, 1973). These rules of thumb will in all likelihood not apply in many applied research contexts. For instance, in the weather forecasting case study (Hoffman, Coffey, and Ford, 2000), the most senior expert forecasters had logged in the range of 40,000-50,000 hours of experience. The experience scale must always be appropriate for the domain, but also appropriate for the particular organizational context.

When retrospecting about their education, training, and career experiences, the reflective Participant will recall details, but there will be some gaps and uncertainties. One strategy that has proven useful is to apply the "Gilbreth Correction Factor," which is simply the estimated total divided by two. In his classic "time and motion" studies of the details of various jobs, Gilbreth (1911, 1914) found that the amount of time people spend at tasks that are directly work-related is about half of the amount of time they spend "on the job." Gilbreth also found that if the schedule of work and rest periods could be made, as we would say today, more human-centered, work output could double. In the tasks Gilbreth studied, which had a major manual component (e.g., the assembly of radios), the "factor of 2" kept cropping up (e.g., estimated time savings if a bin holding a part could be moved closer to the worker). The GCF serves as a reasonable enough anchor, a conservative estimate of time-on-task, in comparison to the raw estimate from the Career Interview, which is likely to be an over-estimate. Research in a number of domains has shown that even with the GCF applied, estimates of time spent at domain tasks by the most senior practitioners often fall well above the 10,000-hour benchmark set for the achievement of expertise (see Hoffman and Lintern, 2005).
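A minimal worked example of this arithmetic is sketched below (in Python); the activity names and the hours, days, weeks, and years figures are hypothetical, and the 10,000-hour figure is used only as a rough reference point. The sketch converts Career Interview estimates into a career total and then applies the Gilbreth Correction Factor described above.

# Illustrative sketch only: the activities and estimates are hypothetical.
GILBRETH_CORRECTION = 0.5      # time directly on task is roughly half of time "on the job"
EXPERTISE_BENCHMARK = 10000    # rough hours-of-practice benchmark from expertise studies

# (hours per day, days per week, weeks per year, years) estimated in a Career Interview
career_estimates = {
    "routine forecasting duty": (6, 5, 40, 18),
    "severe-weather desk":      (8, 5, 10, 6),
}

raw_total = sum(h * d * w * y for (h, d, w, y) in career_estimates.values())
corrected_total = raw_total * GILBRETH_CORRECTION

print("Raw estimate of time on task: " + format(raw_total, ",.0f") + " hours")
print("With the Gilbreth Correction Factor applied: " + format(corrected_total, ",.0f") + " hours")
if corrected_total > EXPERTISE_BENCHMARK:
    print("The corrected estimate falls above the 10,000-hour benchmark.")
else:
    print("The corrected estimate falls below the 10,000-hour benchmark.")

Even the corrected figure is only an anchor; as the next paragraphs note, hours of experience must be weighed alongside breadth of experience and the other legs of the proficiency-scaling tripod.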
Mere years of experience does not guarantee expertise, however. Experience scaling also involves an attempt to gauge the practitioners' breadth of experience—the different contexts or sub-domains they have worked in, the range of tasks they have conducted, etc. Deeper experience (i.e., more years at the job) affords more opportunities to acquire broader experience. Hence, depth and breadth of experience should be correlated. But they are not necessarily correlated, and so examination of both breadth and depth of experience is always wise. Thus, one comes to a resolution that age and proficiency are generally related. This is captured in Table 3.3, above (Section 3.1).

3.2.2.
Template "Career Interviews" Analyst: Participant Date Start time Finish time Name Age Title or rank or job designation or duty assignment High-school and college education (Expand these cells, as needed) Early interest in this domain? Training/College/Graduate Studies Post-Schoolhouse Experience Attempt To Estimate Hours Spent At The Tasks 28 EXPERIENCE WHERE WHEN WHAT For each task/activity: Number of hours per day Number of days per week Number of weeks per year TOTAL Notes Determination: Qualification on the Determination: Notes on Salient experiences and skills TIME 29 3.3. Sociometry 3.3.1. General Protocol Notes In the field of psychology there is a long history of studies attempting to evaluate interpersonal judgments of abilities and competencies (e.g., Thorndike, 1920). In recent times this, has extended to research on the extent to which expert judgments of the competencies of other experts conform to evidence from experts’ retrospections about previously-encountered critical incidents, and the extent to which competency judgments, and competencies shown in incident records, can be used to predict practitioner performance (McClelland, 1998). In the field of advertising there has been a similar thread, focusing on perceptions of the expertise of individuals who endorse products, and the relations of those perceptions to judgments of product quality and intention to purchase (e.g., Ohanian, 1990). There is a similar thread in the field of sociology. Sociometry is a branch of sociology in which attempts are made to measure social structures and evaluate them in terms of the measures. Psychiatrist Jacob Moreno first put the idea of a “sociogram” forward in the 1930s (see Fox, 1987) as a proposal of a means for measuring and visualizing interpersonal relations within groups. In the context of cognitive task analysis and its applications, Sociometry involves having domain practitioners make evaluative ratings about other domain practitioners. (see Stein, 1992, 1997). The method is simple to use, especially in the work setting. In its simplest form (see Stein, 1992, Appendix 2), the sociogram is a one-page form on which the participant simply checks off the names of other individuals from who the participant obtains information or guidance. Stein (1992, Appendix 2) provides guidance on using techniques from social network analysis in the analysis of sociometric data (e.g., how to calculate who is “central” in a communication network). Results from a sociogram can include information that can be useful in proficiency scaling, leverage point identification, and even cognitive modeling. They can be useful in exploring hypotheses about expertise attributions and communication patterns within organizations. For instance, one study (Stein, 1992) in a particular organization found that the number of years individuals had been in an organization correlated with the frequency with which they were approached as an information resource. However, the individual who was most often approached for information was not the individual with the most years in the organization. A sociogram can be designed for use with practitioners who know one another and work together (e.g., as a team, within an organization, etc.) or it can be designed for use in the study of a “community of practice” in which each practitioner knows something about many others, including others with whom the practitioner has not worked. Most sociograms, like social network analyses, focus on a single relation: Whom do you trust? 
From whom to you get information? With whom do you meet? And so on. There is no principled reason why a sociogrammetric interview, in the context of cognitive task analysis, might ask about a variety of social factors that contribute to proficiency attributions. The listings 30 of possible sociogram probe questions we present here is intended to illustrate a range of possibilities. Questions in sociograms can ask for self-other comparisons in terms of knowledge and experience level, in terms of reasoning styles, in terms of ability, in terms of subspecialization, etc. On those same topics, questions in a sociogram can ask about patterns of consultation and advice-seeking. Questions in a sociogram can involve a numerical ratings task, or a task involving the sorting of cards with practitioner names written upon them. The template we present is intended as suggestive. A sociogram need not precisely follow the guidance that we present here. Questions can be combined in many ways, as appropriate to the goals of the research and other pertinent project constraints. Asking practitioners who work together in an organization to make comparative ratings of one another can be socially awkward and can have negative repercussions. The Participant needs to feel that they can present their most honest and candid judgments while at the same time preserving confidentiality and anonymity. A solution is to have the Researcher who facilitates the interview procedure remain “blind” to the name and identity of the co-worker(s) that the Participant is rating. Ideally, before the sociogram is conducted, the Researchers already have some idea of who the experts in an organization are. Ideally, the sociogram should involve all of the experts as well as journeyman and apprentices. If the number of participants to be involved in a sociogram is greater than about seven, the conduct of a full Sociogram can become tedious for the Participant. (A glance ahead will show that each of the sociograms involves having each Participant answer a number of questions about each of the other Participants). One solution to this problem is to have each Participant conduct the sociogram for only a subset of the other Participants, with the constraint that each participant is assessed by at least two others. Another solution is to divide the pool of Participants into two sets, conduct the sociograms within each of the two sets, and then randomly select n Participants from set 1 and n Participants from set 2, and conduct another sociogram in which they rate individuals whom they had not rated in the first sociogram. Ideally, the same proficiency scale assignments should arise out of the converging data sets. As for all of the Templates we present, we suggest that the first page of the sociogram data collection form include the following: 3.3.2. General Forms Templates 31 Interviewer Participant Date Start time Finish time For Within-Group Sociograms We would like to ask you some questions regarding: (Name or ID number) How long (months, years) have you known/worked with one another? 32 3.3.3. Ratings Scale Items We would now like you to answer a set of specific questions that use a numerical rating scale. Please utilize the full range of the scale. Do not hesitate to say things that reflect either positively or negatively on you or your colleagues. We are not here to point fingers, or challenge anyone’s skill or authority. All the data we collect will remain anonymous and confidential. 
That is, no one’s name will be associated with particular results. Please be open, honest, and candid. Compared to you, what is their level or extent of experience? Check one of the boxes below. They have had much more experience They have had more experience They have had somewhat more experience Our levels of experience are about the same They have had somewhat less They have had less experience They have had much less experience 7 6 5 4 3 2 1 Comments (Expand the cells, as needed) Compared to your own knowledge of your domain, what is their level of knowledge? Check one of the boxes below. They know much more 7 They know more 6 They know somewhat more 5 Our levels of knowledge are about the same They know somewhat less They know less 4 3 2 They know much less Comments Compared to your level of ability at conducting your domain tasks, what is their level of 1 33 ability? Check one of the boxes below. Their work is consistently better Their work is usually better Their work is often better Their work is about as good as mine Their work is often not as good Their work is usually not as good Their work is consistently not as good 7 6 5 4 3 2 1 Comments Is s/he good at particular tasks within your domain than others? What types? _____________________________________________________________________ _____________________________________________________________________ Compared to your level of ability at conducting those kinds of tasks, what is their ability level? Circle one of the numbers below Their work is consistently better Their work is usually better Their work is often better Their work is about as good as mine Their work is often not as good Their work is usually not as good Their work is consistently not as good 7 6 5 4 3 2 1 Comments Compared to your level of ability at rapidly integrating data and quickly sizing up a situation, what is their level of ability? Check one of the boxes below. Much greater skill 7 Greater skill 6 Moderately greater skill 5 About the same skill level as me 4 Somewhat lower skill 3 Lower skill 2 Much lower skill 1 Comments Compared to the general accuracy of your own products or results, what is the general accuracy of their products or results? Check one of the boxes below. 34 Much greater Greater Moderately greater About the same as mine Somewhat lower Lower Much lower 7 6 5 4 3 2 1 Comments Compared to your own degree of ability at communicating about your domain with others, what is their level of ability? Check one of the boxes below. Much greater skill Greater skill 7 Moderately greater skill 6 5 About the same skill level as me Somewhat lower skill level 4 Lower skill 3 2 Much lower skill 1 Comments Using traditional craft guild terminology, what would you say is their degree of expertise? Circle one of the numbers below. Master Sets the standards for performance in the profession. 1 Expert Highly competent, accurate, and skillful. Has specialized skills. 2 Journeyman Can work competently without supervision or guidance. Apprentice Needs guidance or supervision; shows a need for experience and on-the-job training, but also some need for additional formal training. 3 4 Initiate Still shows a need for needs formal training. 5 Comments Working Style What seems to be their “style" of reasoning or working? Can you place them somewhere on the continuum below? Use a check mark 35 Understand the domain events as a dynamic causal system; form conceptual models of what’s going on. They conduct their work on the basis of standard procedures, checklists, etc. 
rather than a deep understanding of the domain -------------------------------------------------------------------- Comments Reflective Practitioner or Journeyman? What seems to be their “style" of reasoning or working? Can you place them somewhere on the continuum below? Use a check mark. Are reflective, and question their own reasoning and assumptions, and are even critical of their own ideas. Do not seem to be deep a thinker. They do not think about their own reasoning or question their strategies or assumptions. -------------------------------------------------------------------- Comments Leader or Follower? What seems to be their “style" of reasoning or working? Can you place them somewhere on the continuum below? Use a check mark Are they one of the visionaries or innovators? They seem to refine and extend the pathfinding ideas of others. -------------------------------------------------------------------- Comments 36 3.3.4. Sorting Task Items Experimenter's Protocol Participant is invited to write the names of 12 domain practitioners on 3x5 cards. Names can be selected by instruction or by Participant choice. “People whose names come to mind, people you know or know of.” “People you have worked with.” “People who represent a range of experience in the field.” A card bearing the Participant’s own name can be added into the pile of cards. Using traditional craft guild terminology, what would you say is their degree of expertise? Sort into piles. Master Sets the standards for performance in the profession. Comments Expert Highly competent, accurate, and skillful. Has specialized skills. Journeyman Apprentice Can work competently without supervision or guidance. Needs guidance or supervision; shows a need for experience and on-the-job training, but also some need for additional formal training. Novice Still shows a need for formal training. 37 Place the cards into two piles, according to these two general styles. “Reflective Practitioner” They understand the domain at a deep conceptual level. They understand the important issues. “Proceduralist” They conduct their work on the basis of standard procedures and have a superficial rather than a deep understanding of the domain. They do not seem to care much about underlying or outstanding issues. Comments For each name, note any specialization or specialized areas of subdomain knowledge or skill. Comments Who are the real pathfinders or founders of the field? Sort into piles. Founders and pathfinders Those who refine and extend the seminal ideas of others Comments Who would you say are the real visionaries in the field? Sort into piles. Visionaries Comments Those who refine and extend the innovations and visions of others 38 Compared to you, what is their level of skill at conducting domain procedures? Sort into piles. Much higher skill Higher skill About the same as me Less skill Much less skill Comments Compared to you, what is their level or extent of experience? Sort into piles. Much more experience More experience About the same level of experience as me Less experience Much less experience Comments Compared to your own knowledge of your domain or field, what is their level of knowledge? Sort into piles. They know much more They know more Our levels of knowledge are about the same They know less They know much less Comments Compared to the general quality or accuracy of your own products or results, what is the general quality or accuracy of their products or results? 
Greater Somewhat greater About the same as mine Somewhat less Less Comments Compared to your own degree of ability at communicating about your field both to people 39 within the field and those outside of the field, what is their level of ability? Much greater Comments Greater skill About the same skill level as me Lower skill Much lower skill 40 3.3.5. Communications Patterns Items 3.3.5.1. Within-Groups, Teams, or Organizations We would like to ask you some questions regarding: (Name or ID number) How long (months, years) have you known/worked with one another? How often do you collaborate in the conduct of tasks? How often do you conduct separate tasks, but at the same time? How many times in the last month have you gone to them for guidance or an opinion concerning the interpretation of data? (If “rarely” or “never,” is this because of rank?) Does s/he ever come to YOU for guidance or an opinion? How many times in the last month have you gone to them for guidance or an opinion concerning the understanding of causes or events? How many times in the last month have you gone to them for guidance or an opinion concerning the equipment you use or the operations you perform using the technology? If “rarely” or “never,” is this because of rank or status? Does s/he ever come to YOU for guidance or an opinion? How many times in the last month have you gone to them for guidance or an opinion concerning organizational operations and procedures? (If “rarely” or “never,” is this because of rank?) Does s/he ever come to YOU for guidance or an opinion? 41 3.3.5.2. Within-Domains Who in the field of ______________________________________ has come to you for guidance or an opinion? On what topics or issues has your opinion been solicited? Do you know of_________________________________________? How do you know of _____________________________________? Do you personally know ___________________________________? If so, how long have you known _____________________________? Have you worked or collaborated in any way with ______________________________? Have you gone to them for guidance or an opinion? Does s/he ever come to you for guidance or an opinion? 42 3.4. Cognitive Styles Analysis 3.4.1. Introduction In any given domain, one should expect to find individual differences in skill level, but also such things as motivation and style. Different practitioners prefer work problems in different ways at different points, for any of a host of reasons. Proficiency and style can be related in many ways. Proficiency can be achieved via differing styles. Proficiency can be exercised via differing styles. On the other hand, proficiency takes time to achieve and thus the styles that characterize experts should be at least partly coincident with skill level. The analysis of cognitive styles, as a part of Proficiency Scaling, can be conducted with reliance on more than one perspective. Ideas concerning cognitive style come from: The proficiency scale of the craft guilds. Cognitive style can be evaluated from the perspective that style is partly correlated with skill level. Some designations and descriptions are presented in Table 3.3. Experiences of applied researchers who have conducted CTA procedures, and research in the paradigm of Naturalistic Decision Making (Reference). Some styles designations and descriptions are presented in Table 3.4 The ideas presented in these Tables are offered as suggestions. 
The categorizations are not intended to be inclusive or exhaustive—not all practitioners will fall neatly into one or another of the categories. Any analysis of reasoning styles will have to be crafted so as to be appropriate to the domain at hand. Any given project may find itself in need of these, some of these, or other appropriate designations.

Table 3.3. Likely reasoning style as a function of skill level. (For the definitions of the levels, see Table 3.2.)

Low Skill Levels (Naïve, Initiate, Novice)
Their performance standards focus on satisfying literal job requirements. They rely too much on the guidance (e.g., outputs of computer models), and take the guidance at face value. They use a single strategy or a fixed set of procedures--they conduct tasks in a highly proceduralized manner using the specific methods taught in the schoolhouse. Their data-gathering efforts are limited, but they tend to say that they look at everything. They do not attempt to link everything together in a meaningful way—they do not form visual, conceptual models, or do not spend enough time forming and refining models. They are reactive, and "chase the observations." They are insensitive to contextual factors or local effects. They cannot do multi-tasking without suffering from overload, becoming less efficient and more prone to error. They are uncomfortable when there is any need to improvise. They are unaware of situations in which data can be misleading, and do not adjust their judgments appropriately. They are less adroit. Things that are difficult for them to do are easy for the expert. They are less aware of the actual needs of the end-users. They are not strategic opportunists.

Medium Skill Levels (Apprentice, Journeyman)
Despite motivation, it is possible to remain at this level for life and not progress to the expert level. They begin by formulating the problem, but then, like individuals at the Low Skill Levels, they tend to follow routine procedures—they tend to seek information in a proceduralized or routine way, following the same steps every time. Some time is spent forming a mental model. Their reasoning is mostly at the level of individual cues, with some ability to recognize cue configurations within and across data types. They tend to place great reliance on the available guidance. They tend to treat guidance as a whole, and not look to sub-elements—they focus on determining whether the guidance is "right" or "wrong." Journeymen can issue a competent product without supervision, but both Apprentices and Journeymen have difficulty with tough or unfamiliar scenarios.

High Skill Levels (Expert, Master)
They have high personal standards for performance. Their process begins by achieving a clear understanding of the problem situation. They are always skeptical of guidance and know the biases and weaknesses of each of the various guidance products. They look at the agreement of the various types of guidance--features or developments on which the guidance products agree. They are flexible and inventive in the use of tools and procedures, i.e., they conduct action sequences based on a reasoned selection among action sequence alternatives. They are able to take context and local effects into account. They are able to adjust data in accordance with data age. They are able to take into account data source location. They do not make wholesale judgments that guidance is either "right" or "wrong." They reason ahead of the data.
They can do multi-tasking and still conduct tasks efficiently without wasting time or resources. They are comfortable improvising. They can tell when equipment or data or data type is misleading; know when to be skeptical. They create and use strategies not taught in school. They form visual, conceptual models that depict causal forces. They can use their mental model to quickly provide information they are asked for. Things that are difficult for them to do may be either very difficult for the novice to do, or may be cases in which the difficulty or subtleties go totally unrecognized by the novice. They can issue high-quality products for tough as well as easy scenarios. They provide information of a type and format that is useful to the client. They can shift direction or attention to take advantage of opportunities. 45 Table 3.4. Some Cognitive Styles Designations (based on Pliske, Crandall, and Klein, 2000). SCIENTIST The "reflective practitioner" Thinks about their own strategies and reasoning, engages in "what if?" reasoning, questions their own assumptions, and is freely critical of their own ideas. AFFECT STYLE ACTIVITIES CORRELATED SKILL OR PROFICIENCY LEVEL They are often "lovers" of the domain. They like to experience domain events and watch patterns develop. They are motivated to improve their understanding of the domain. They are likely to act like a Mechanic when stressed or when problems are easy. They actively seek out a wide range of experience in the domain, including experience at a variety of scenarios. They show a high level of flexibility. They spend proportionately more time trying to understand problems, and building and refining a mental model. They are most likely to be able to engage in recognition-primed decisionmaking. They spend relatively little time generating products since this is done so efficiently. They possess a high level of pattern-recognition skill. They possess a high level of skill at mental simulation. They possess skill at using a wide variety of tools. They understand domain events as a dynamic system. Their reasoning is analytical and critical. They possess an extensive knowledge base of domain concepts, principles and reasoning rules. PROCEDURALIST Seem to lack the intrinsic motivation to improve their skills and expand their knowledge, so they do not move up to the next level. They do not think about their own reasoning or question their strategies or assumptions. However, in terms of performance and/or experience, they are journeyman on the proficiency scale. AFFECT STYLE ACTIVITIES CORRELATED SKILL OR PROFICIENCY LEVEL They sometimes are lovers of the domain. They like to experience domain events and see patterns develop. They are motivated to improve their understanding of the domain. They spend proportionately less time building a mental model. They engage in recognition-primed decision making less often. They are proficient with the tools they have been taught to use. They are less likely to understand domain events as a complex dynamic system. They see their job as having the goal of completing a fixed set of procedures, although these are often reliant on a knowledge base. They sometimes rely on a knowledge base of principles of rules, but this tends to be limited to types of events they have worked on in the past. Typically, they are younger and less experienced. MECHANIC They conduct their work on the basis of standard procedures and have a superficial rather than a deep 46 understanding of the domain. 
They do not seem to care much or wonder about underlying concepts or outstanding issues. AFFECT STYLE ACTIVITIES CORRELATED SKILL OR PROFICIENCY LEVEL They are not interested in knowing more than what it takes to do the job; not highly motivated to improve. They are likely to be unaware of factors that make problems difficult. They see their job as having the goal of completing a fixed set of procedures, and these are often not knowledge-based. They spend proportionately less time building a mental model. They rarely are able to engage in recognition-primed decision making. They sometimes have years of experience. They possess a limited ability to describe their reasoning. They are skilled at using tools with which they are familiar, but changes in the tools can be disruptive. DISENGAGED The most salient feature is emotional detachment and disengagement with the domain or organization. AFFECT STYLE ACTIVITIES CORRELATED SKILL OR PROFICIENCY LEVEL They do not like their job. They do not like to think about the domain. They are very likely to be unaware of factors that make problems difficult. They spend most of the time generating routine products or filling out routine forms. They spend almost no time building a mental model and proportionately much more time examining the guidance. They cannot engage in recognition-primed decision making. They possess a limited knowledge base of domain concepts, principles, and reasoning rules. Skill is limited to scenarios they have worked in the past. Their products are of minimally acceptable quality. In the Cognitive Styles analysis, the Researchers conduct a card-sorting task that categorizes the domain Practitioners according to perceived cognitive style. The procedure presupposes that all of the Researchers who will engage in the sorting task have had sufficient opportunity to interact with each of the Participants (e.g., opportunistic, unstructured interviews, initial workplace observations, etc.). The Researchers will have engaged in some interviews, perhaps unstructured and opportunistic ones, with the domain Practitioners. That is, the Researchers will have bootstrapped to the extent that they feel they have some idea of each Practitioner's style. The result of the procedure is a consensus sorting of Practitioners according to a set of cognitive styles. Individually, the styles should seem both appropriate and necessary. As a set, the styles should seem appropriate and complete, at least complete with regard to the set's functionality in helping the Researchers achieve the project purposes or goals. 3.4.2. Protocol For Cognitive Styles Analysis 47 1. The Analysis is conducted after the Researchers have become familiar with each of the domain practitioners who are serving as Participants in the project. That is, the Researchers have already had opportunities to discuss the domain with the Participants whose styles will be categorized. 2. The Researchers formulate a set of what seem to be reasonable and potentially useful styles categories, having short descriptive labels. These may be developed by the Researchers, or may be adapted from the listing provided in this doccument. 3. Each Practitioner's name is written on a file card. Working independently, all of the Researchers sort cards according to style. 4. The Researchers review and discuss each other's sorts. They converge on an agreed-upon set of categories and descriptive labels. 5. The Researchers re-sort. The Results are examined for consensus or rate of agreement. 6. 
The Researchers come to agreement that those Practitioners attributed the same style are similar with regard to the style definition. 7. The Researchers come to agreement that there are no Practitioners who seem to manifest more than one of the styles. That is, the final set of styles categories allows for a satisfactory partitioning of the set of Participants. 8. Steps 3-6 may be repeated.

3.4.3. Protocol for Emergent Cognitive Styles Analysis
A variation on this task involves having the Researchers conduct the sorting task without worrying about creating any short descriptive labels for the perceived styles, and without worrying about devising a clear description of the styles. The procedure lets the categories "emerge" from the individual Researchers' perceptions and judgments without influence from a set of pre-determined styles descriptors.
1. Each Practitioner's name is written on a file card. Working independently, each of the Researchers deliberates concerning the style of each of the Practitioners.
2. Each Researcher sorts the cards according to those judgments. An individual Practitioner may be in a category alone, or may fall in a category with one or more others.
3. The Researchers review and discuss each other's sorts. They converge on an agreed-upon set of categories. These are expressed in terms of the theme or themes of each category.
4. Working independently, the Researchers re-sort. The results are examined for consensus or rate of agreement (a simple way of computing such a rate is sketched at the end of this section).
5. Only then are the categories given descriptive labels.
6. The Researchers confirm that those Practitioners attributed the same style are similar with regard to the style definition.
7. The Researchers confirm that there are no Practitioners who seem to manifest more than one of the styles. That is, the final set of styles categories allows for a satisfactory partitioning of the set of Participants.

3.4.4. Protocol for Cognitive Styles-Proficiency/Skill Levels Comparison
Another use of Cognitive Styles Analysis is to aid in proficiency scaling. In this use, Cognitive Styles are compared to proficiency levels, since there should be some relation between them. For example, individuals who are designated as expert on the basis of a sociogram, a career interview, or a performance evaluation would be more likely to manifest the style of the "Reflective Practitioner" than that of the "Proceduralist." Proficiency data for comparison to those from a Cognitive Styles Analysis can themselves come from a sorting procedure.
1. Working individually, the Researchers sort cards containing the names of the practitioners according to expertise level (see Table 3.1) or along a continuum.
2. The Researchers compare each individual to an individual in the next lowest level and to an individual in the next highest level, to confirm the level assignments.
3. The Researchers compare the results of the Levels sort and the Styles sort to determine whether there is any consensus on a possible relation of style to skill level.

3.4.5. Validation
Researcher(s) who did not participate in the sorting task can be provided with the Styles categories' labels and definitions and can engage in the styles sorting (and the skill levels comparison). Their styles sorting should match the sorting arrived at in the final group consensus. Their skill levels sorting should affirm the consensus on the relation of styles to skill level.
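The rate of agreement among independent sorts can be quantified in a simple way. Below is a minimal sketch (in Python; the practitioner identifiers and style labels are invented for illustration) of one such measure: for every pair of Practitioners, check whether two sorts agree on placing them in the same category or in different categories. Because it compares co-membership rather than category labels, the same measure can be used for the emergent variation, in which categories have not yet been labeled, and for comparing the consensus Styles sort with the skill-levels sort.

from itertools import combinations

def comembership_agreement(sort_a, sort_b):
    """Proportion of Practitioner pairs on which two card sorts agree.

    A 'sort' maps each Practitioner to the category (style) in which a
    Researcher placed that card.  Two sorts agree on a pair if both place
    the pair in the same category, or both place the pair in different
    categories.  The category labels themselves need not match.
    """
    names = sorted(set(sort_a) & set(sort_b))
    pairs = list(combinations(names, 2))
    if not pairs:
        return 0.0
    hits = sum(
        (sort_a[x] == sort_a[y]) == (sort_b[x] == sort_b[y])
        for x, y in pairs
    )
    return hits / len(pairs)

# Hypothetical example: two Researchers sort six Practitioners.
researcher_1 = {"P1": "Scientist", "P2": "Proceduralist", "P3": "Proceduralist",
                "P4": "Mechanic", "P5": "Scientist", "P6": "Disengaged"}
researcher_2 = {"P1": "Scientist", "P2": "Mechanic", "P3": "Proceduralist",
                "P4": "Mechanic", "P5": "Scientist", "P6": "Disengaged"}

print(round(comembership_agreement(researcher_1, researcher_2), 2))  # 0.87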
4. Workplace Observations and Interviews

4.1. General Introduction
Workspace analysis can focus on:
The workspace in which groups work,
The spaces in which individuals work,
The activities in which workers engage,
The roles or jobs,
The requirements for decisions,
The requirements for activities,
The "standard operating procedures."
These alternative ways of looking at workplaces will overlap. For example, in a particular work group, the activities might be identified with particular locations where particular individuals work. Roles might be identified with particular work locations. And so on. The listing above is not meant to be a complete, universal, or even consistent "slicing of the pie." Indeed, there is considerable overlap in the Templates presented below. Our purpose in providing these alternative ways of empirically investigating cognitive work is to suggest options. Where appropriate, ideas from the Templates might be used, or combined in ways other than those we present here.
It must be kept in mind that no observation is unobtrusive. In arguing for the advantages of so-called unobtrusive observations, people sometimes assert that in the ideal case, the observer should be like a "fly on the wall." The analogy is telling, because generally speaking, when someone is concentrating at work, a fly in the room, on the wall or otherwise, can be very distracting. The goal is not to capture behavior that has not been influenced in any way by the presence of the Researcher/observer, but to capture authentic behavior on the part of the Worker. The Researcher/observer is far more likely to observe authentic behavior if he or she has to some extent already been accepted into the culture of the work than if he or she tries to be a "fly on the wall." Acceptance means that the Workers regard the Researcher/observer as informed, sincere, and oriented toward helping, and the Workers know that the Researcher/observer respects them for their experience and skill.
This understanding of observational methods means that the possibilities for observation include merging interviewing and observing. The analysis can be conducted with the assistance of an informant, who accompanies the Researcher/observer and answers questions.
It should not be assumed that the Observations or Interviews should be limited to the categories set out here. Furthermore, these generic categories may need to be modified to better fit the given work domain. Nor should it be assumed that the Template probe questions presented here are necessarily appropriate to the domain, organization, or workplace. Finally, it should not be assumed that the probe questions presented in the Templates exhaust all of the questions that might be appropriate to the domain, organization, or workplace.
When probing concerning the deficiencies of a tool or workstation, the practitioner's initial response to the probe may be meditative silence. Only with some patience and re-probing will deficiencies be mentioned. Often, the practitioner will have discovered deficiencies but will not have explicitly thought about them. Re-probing can focus on make-work, work-arounds, action sequence alternatives, alternative displays, changes to the tool, etc. It is critical to ask about each piece of equipment and technology—whether it is legacy, how it is used, what is good about it, whether its use is mandated, etc. As can be seen below, the Templates combine various probes in differing ways.
The following two tables provide an overview of the sorts of questions that are addressed, looking across the alternative procedures and their Templates.

Possible Probes
What sub-tasks or action sequences are involved in this activity?
What information does the practitioner need?
Where does the practitioner get this information?
What does the practitioner do with this information?
What forms have to be completed?
How does the practitioner recover when glitches cause problems?
How does the practitioner do work-arounds?
Is each piece of technology a legacy system or a mandated legacy system?

Phenomena to Look For
Multi-tasking and multiple distractions?
Do procedures require preparation of reports having categories and descriptions with little meaning? Are the categories and descriptions really relevant to job effectiveness?
Is the environment one in which it is hard to develop skill?
Do task demands match with the equipment?
Are apprentices or trainees overwhelmed—do they have to work through tons of data?
What circumstances induce conservatism?
Do workers test their own processes so as to learn the extent of their capabilities?
Do workers have leeway to make risky decisions, or other opportunities to benefit from learning?
Does any mentoring occur? What are the opportunities, or the obstacles/impediments?
Is there routine work (such as reporting functions) that has to be performed and detracts from learning or understanding?
Do tasks or duties force the worker to adopt one or another style or strategy?

The Roles Analysis and Locations Analysis interviews are conducted for the purpose of specifying work patterns at the social level (information-sharing) and at the individual level (cognitive activities). To focus on Roles, the Participant is asked questions about each of the jobs (duty assignments) that focus on the flow of information, such as "What information does this person need?" and "What does s/he do with the information?" To focus on Locations, the Participant is asked questions about each of the tools (workstations), such as "What tasks are conducted here?," "What is the action sequence involved?" and "What makes the task difficult?"
Workplace, Locations and Roles Analyses, Activities Observations, and SOP Analyses involve variations on similar probe questions, but the purposes of the analyses can differ, even though all can culminate in Decision Requirements Tables (DRTs) and/or Action Requirements Tables (ARTs). It is important that the Analyses be conducted in the work place, since artifacts in the workplace can serve as contextual cues to both the Participant and the Researcher.

The Importance of Opportunism
Once the Work Place Map has been prepared, every time the researchers visit the workplace they should bring with them a folder containing copies of the Work Place Map and also copies of the Template for the Activities Observations. At any moment there may be an opportunity to take notes about work patterns (e.g., information-sharing) or an observation that suggests a leverage point. For example, during the weather forecasting case study (Hoffman, Coffey, and Ford, 2000), there were a number of occasions where pilots and forecasters made comments that were of use. On one occasion, a group of pilots was effectively stranded at the airfield due to inclement weather and was gathered chatting in a hallway. This was taken as an opportunity to conduct an informal and unplanned interview about weather impact on air operations.
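If the team wants to keep such opportunistic notes in a consistent, searchable form, one option is to give every note the same small set of fields, keyed to the Work Place Map. The sketch below is only an illustration of that bookkeeping; the field names, the date, and the example entry (a paraphrase of the stranded-pilots episode above) are ours, not part of any Template.

from dataclasses import dataclass, field
from typing import List

@dataclass
class OpportunisticNote:
    """One opportunistic field observation, keyed to the Work Place Map."""
    date: str                      # when the note was taken
    map_location: str              # label of the spot on the Work Place Map
    actors: List[str]              # who was involved (roles, not names)
    note: str                      # free-text description of what was observed
    possible_leverage_point: bool = False
    tags: List[str] = field(default_factory=list)   # e.g., "information-sharing"

# Hypothetical entry, paraphrasing the stranded-pilots episode described above.
log: List[OpportunisticNote] = [
    OpportunisticNote(
        date="2000-06-12",
        map_location="Hallway outside the Operations floor",
        actors=["pilots", "researcher"],
        note=("Pilots stranded by inclement weather; informal, unplanned "
              "interview about weather impact on air operations."),
        possible_leverage_point=True,
        tags=["information-sharing", "weather impact"],
    )
]

# Notes can later be filtered, e.g., to review every candidate leverage point.
leverage_points = [n for n in log if n.possible_leverage_point]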
4.2. Workspace Analysis

4.2.1. Protocol Notes
The main goal of this analysis is to create a detailed representation of the workspace, to inform subsequent analyses of work patterns and work activities. The analysis consists "simply" of creating a detailed map of the workplace. We use scare quotes because a detailed map, drawn to scale, takes considerable care and attention to detail. The effort includes:
Photographing the workplace to any level of detail, including views of individual workspaces, photographs of physical spaces alone, photographs of individuals at work, and so on. The workplace should be photographed from a variety of perspectives, e.g., from the entranceways, from each desk or workstation looking toward the other desks, and so on. It is important to note all of the desks, individual workers' workplaces, the locations of cabinets, records, operating manuals, etc.
Creating preliminary sketch maps of the workplace that are subsequently used to create a refined map of the workplace, noting any special features that are pertinent to the research goals (e.g., locations of information resources, obstacles to communication, etc.).
One never knows beforehand which things in the workplace may seem unimportant at first but may, in a later analysis, turn out to be very important. A detailed map and set of photographs can be repeatedly referred to throughout the research program, and mined for details that are likely to be missed at the time that the photographs and map are made. A simple device such as a paper cutter or a jumbled pile of reference manuals could turn out in a later analysis to be an important clue to aspects of the work. Here are examples from the weather forecasting project (Hoffman, Coffey, and Ford, 2000):
A particular workstation was adorned with numerous Post-It Notes, serving as an indicator of user-hostility. One of the Participant interviews revealed information about problems forecasters were having with a particular software system. Subsequently, the photographs were examined and revealed that the forecasters had applied a relatively large number of Post-It Notes at the workstation, confirming the interview statements.
Interviews with Participating expert forecasters revealed that some of them had kept files of hardcopies of images and other data regarding tough cases of forecasting that they had encountered. Using the workplace map, the researchers were able to create a second map showing the locations of the information resources, which in turn could be used in the study of work patterns.
Photographs of the workplace showing the Participants at work showed how forecasters provided information to pilots, and revealed obstacles to communication and ways in which the workspace layout was actually a hindrance. The workplace was subsequently reconfigured.
The workplace map from the weather project is presented in Figure 4.1, below. We present this figure to illustrate the level of detail that can be involved.
Figure 4.1. An example workplace map.
Related to the importance of subtle indicators, and the importance of conducting a photographic survey, it may be advisable that photographs of the workplace be taken throughout the course of the investigation, if not at planned periodic intervals. The complex sociotechnical workplace is always a moving target, and changes made in the workplace can be important clues to the nature of the work.
For instance, in the course of conducting the weather forecasting case study, the main forecasting workstation and its software were changed out completely. Activities Observations, Roles analysis, Locations analysis, Decision Requirements analysis, Action Requirements analysis, and SOP analysis can all rely on the workplace map, and in many circumstances will rely heavily on it. See the comment above concerning opportunism. 55 4.3. Activities Observations 4.3.1. Protocol Notes This simple and deceptively straight-forward protocol is for observing people at work. What is not so simple or straight-forward is deciding on what to observe and how to record the observations. This necessarily involves some sort of reasoned, specific, and functional means or categorizing or labeling work activities. Like the Workspace Analysis, this procedure does not involve interviewing. This distinguishes Workspace Analysis and Activities Observations from the other Workplace Observations and Interviews procedures, which combine an element of observing with an element of interviewing. 4.3.2. Template "Activities Observations" Observer Informant(s) (if any) Date Start time: Finish time: Observation Record Time Actor Activities (Duplicate and expand the rows as needed.) 56 4.4. Locations Analysis This analysis looks at the work from the perspective of individual locations (desks, cubicles, etc.) within the total workplace. 4.4.1. Template "Work Locations Analysis" Researcher/Observer Participant/Informant Date Start time: Finish time: Instructions for the Participant We would like to go through the workplace one work location at a time, and ask you some questions about it. This may take some time and we need not rush through it in a single session. Some of our questions may require a bit of thought. Feel free to take your time, and of course ask any questions of us that come to mind. (Expand the cells and duplicate this table, as needed.) Work Space (desk, workstation) Location (see Workspace Map) Work Space Layout (Researcher sketches or photographs) Who works here? 57 Name Notes (Duplicate and expand the rows, as needed.) Tools and Technologies Tool/Technologies Description, Uses, Notes Role or activities that are enacted at the work space (Expand the cells and duplicate this table, as needed.) Activity What are the goals of this activity? What skills, knowledge, and experience are needed for successful accomplishment of this activity? What information is needed for successful accomplishment of this activity? What about this work space makes the goal easy to achieve? What about the work space makes the goal difficult to achieve? Kluges and work-arounds 58 59 4.5. Roles Analysis 4.5.1. Protocol Notes The analysis of Roles can be aided by keeping on hand the finished workplace map, but even more important is to conduct the analysis in the workplace rather than outside of it. For the Roles Analysis, a Participant/informant is asked about goals, tasks, needed knowledge, and technology. Finally, it should not be assumed that the probe questions presented in the Template exhaust all of the questions that might be appropriate to the domain, organization, or workplace. 60 4.5.2. Template "Roles (jobs, duty assignments)" Researcher Participant/ Informant (if any) Date Start time Finish time Instructions for the Participant We would like to go through the workplace one job assignment at a time and ask you some questions about it. This may take some time and we need not rush through it in a single session. 
Some of our questions may require a bit of thought. Feel free to take your time, and of course ask any questions of us that come to mind. Role (job, post, or duty assignment) Goals Tasks Needed skills, knowledge, and experience. What makes the job easy? What makes the job difficult? 61 4.6. Decision Requirements Analysis 4.6.1. Protocol Notes This procedure is an interview for workpatterns analysis, and is conducted using Template Decision Requirements Tables (DRT) to provide interview structure. For each DRT, the Participant is asked to confirm/flesh out the action sequence, confirm/flesh out the specification of support and tools, confirm/flesh out the notes on information needs, confirm/flesh out the notes on usability and usefulness, and to confirm/flesh out the notes on deficiencies. The Interviewer makes notes right on the draft forms. Following the Interview, the DRTs are re-drafted into their final form. A DRT is an identification and codification of the important judgments and decisions that are involved in performing a particular task. In addition, the table captures the dynamic flow of decision-making. In the analysis, one DRT is completed for each task or worker goal. The DRT specifies: The action sequence involved in the task, The equipment, tools, forms, etc. that are used to conduct the task, A specification of the information the person needs in order to conduct the task, Notes about what is good and useful about the support, Notes about ways in which the support makes the task unnecessarily difficult or awkward, or that might be improved. As a consequence of these depictions, the DRT is intended to suggest new approaches to display design, workspace design, and work patterns design. 62 4.6.2. Template "Decision Requirements Analysis" Interviewer Decision Designation or Identifying goal Participant Date Start time Finish time What is the decision? What are the goals, purposes, or functions? Why does this decision have to be made? How is the decision made? What information is needed? What are the informational cues? In what ways can the decision be difficult? How do you assess the situation or broader context for this decision? How much time or effort is involved in making this decision? What is the technology or aid, and how does it help? Are any work-arounds? Are there any local "kludges" to compensate for workplace or technology deficiencies? What kinds of errors can be made? What kinds of additional aids might be useful? 63 When this decision has to be made, are there any hypotheticals or "what ifs?" What are your options when making this decision? 64 4.7. Action Requirements Analysis 4.7.1. Protocol Notes This procedure is an interview for workpatterns analysis, and is conducted using Template Action Requirements Tables (ART) to provide interview structure. For each ART, the Participant is asked to confirm/flesh out the action sequence, confirm/flesh out the specification of support and tools, confirm/flesh out the notes on information needs, confirm/flesh out the notes on usability and usefulness, and to confirm/flesh out the notes on deficiencies. The Interviewer makes notes right on the Template forms. Following the Interview, the ARTs are re-drafted into their final form. An ART is an identification and codification of the important activities that are involved in performing a particular task. In addition, the table captures the dynamic flow of activity In the analysis, one ART is completed for each task or worker goal. 
The ART specifies: The action sequence involved in the task, The equipment, tools, forms, etc. that are used to conduct the task, A specification of the information the person needs in order to conduct the task, Notes about what is good and useful about the support, Notes about ways in which the support makes the task unnecessarily difficult or awkward, or that might be improved. As a consequence of these depictions, the ART is intended to suggest new approaches to display design, workspace design, and work patterns design. 65 4.7.2. Template "Action Requirements Analysis" Interviewer Task Designation or Identifying goal Participant Date Start time Finish time What is the action sequence? What cognitive activities are involved in this task/activity? In what ways can the activity be difficult? What about the support or information depiction makes the action sequence difficult? What are the informational cues? How are they depicted? What is the technology or aid, and how does it help? What is good or useful about it? Are any work-arounds? Are there any local "kludges" to compensate for workplace or technology deficiencies? What kinds of errors can be made? What kinds of additional aids might be useful? 4.8. SOP Analysis 66 4.8.1. Protocol Notes It is not unheard of for "Standard Operating Procedures" (SOP) documents to be a poor reflection of actual practice, for a variety of reasons. Workers may rely on short-cuts and workarounds that they have discovered or created. SOPs may specify (sub)procedures that are not regarded as necessary. SOPs may simply be ignored, and the reasons for this might be informative. An SOP might be outdated (given that task requirements evolve) or ill-specified. For these and other reasons, the analysis of SOP documents can be a window to the "true work." The analysis of SOP documents depends on the cooperation of a highly-experienced domain practitioner who is willing to talk openly and candidly. For each SOP, the Participant is asked the following questions: 1. Can you briefly describe that this procedure is really about? What are its goals and purposes; what is the basic action sequence? 2. Who does this procedure? 3. How often is the procedure conducted? 4. What is good or easy about the action sequence? 5. When you conduct the sequence, do you really do it in a way that departs from the specifications in the SOP? Are there short-cuts? Do they use a "cheater sheet?" 6. How often have you actually referred to the SOP document for guidance? When probing concerning the deficiencies of a tool or workstation, the practitioner's initial response to the probe may be meditative silence. Only with some patience and re-probing will deficiencies be mentioned. Often, the practitioner will have discovered deficiencies but will not have explicitly thought about them. Re-probing can focus on make-work, work-arounds, action sequence alternatives, alternative displays, changes to the tool, etc. It is critical to ask about each piece of equipment and technology—whether it is legacy, how it is used, what is good about it, whether its use is mandated, etc. In many projects that use Cognitive Work Analysis methodologies, one of the first steps is bootstrapping, in which the analysts familiarize themselves with the domain. Ordinarily, SOP documents would be a prime source of information for bootstrapping. 
However, if one wants to use the SOP Interview and analysis as a way of verifying the DRT analysis, one may hold off on even looking at SOP documents during the bootstrapping phase. In the weather forecasting project (Hoffman, Coffeyand Ford, 2000), it took about 5 minutes to flesh out the answers for each SOP. From the Interviewer's notes, formal SOP tables can be created. An example appears below in Table 4.2. 67 Table 4.2. An example completed SOP Table, for Terminal Air Forecast Verification Number and Name Description Who does this procedure? How often? Frequency of reference to the SOP Comments 2110 TAF Verification Takes about 2 minutes. You input your own TAF and verify the previous one. Forecaster Every 6 hours (four times per day) unless the field is closed. P has referred to this SOP a couple of times. Once you've done it a few times, you know it. Is done on time about 80 percent of the time. Sometimes you get busy, you get tired. You forget to check the home page and when you do you see that GOES isn't downloading or something and so you get involved in fixing that, etc. The OPs Assistant looks at the TAF Verifications to prepare the skill scores. Invariably, there will be some missing information in the SOP tables that are created after a single interview. Therefore, SOP Interviews generally have to be conducted in two waves, one to draft the SOP tables and the second to flesh them out and fine-tune them. Another common circumstance is when the domain includes workers with special areas of proficiency or responsibility. Groups of tasks (and the corresponding SOPs) may be largely unknown to any one individual. In such cases, SOP Interviews will need to be conducted with more than one individual. 68 4.8.2. Template Analysis of "Standard Operating Procedures" Documents Interviewer Participant Date Start time Finish time SOP Number and Name What are its goals and purposes; what is the basic action sequence? Can you briefly describe that this procedure is really about? What are the purposes and goals of the procedure? Who conducts this procedure? How often? What are the circumstances that trigger the procedure? Is the procedure optional, or mandatory, and why? Frequency of reference to the SOP. How often have you actually referred to the SOP document for guidance? What information is needed to conduct the task? What knowledge does a person have to possess in order to be able to conduct this procedure? 69 Can the procedure be conducted easily or well by individuals who have not practiced it? What is easy about the procedure? What is difficult about the procedure? When you conduct the sequence, do you really do it in a way that departs from the specifications in the SOP? Are there short-cuts? Do you use a "cheater sheet?" Comments 70 5. Modeling Practitioner Reasoning 5.1. Protocol Analysis 5.1.1. General Introduction In this document we have been using the word "protocol" to refer to descriptions of procedures and guidance to the researcher for the conduct of procedures. For the suite of methods that is referred to as "Protocol Analysis," we have a different use and intended meaning of "protocol." In this context, a protocol is a record of a process in which a domain practitioner has performed some sort of task. The record might include data concerning actions and action sequences, cognitions, and sometimes, also often includes data concerning communication and collaboration. 
This protocol must somehow be analyzed or "mined" for data that can be used in building or verifying a cognitive model (e.g., of reasoning, heuristics, strategies, cognitive styles, knowledge, etc.) or testing hypotheses about cognition. Exploration of the protocol for such information is protocol analysis.
"Protocol Analysis" is often referred to as a research method, when in fact it is a data analysis method. The reason is that protocol analysis is usually employed in conjunction with a single knowledge elicitation task, the "Think-Aloud Problem Solving" (TAPS) task (see Hoffman, Militello, and Eccles, 2005). In the TAPS task, the participant is presented with some sort of puzzle or problem case and is asked to work on the problem while thinking out loud. In their verbalization the participant is asked to attempt to explicate the problem. An audio recording is made of the deliberations. That recording is transcribed and the transcription is then analyzed for content. It is this last activity that is protocol analysis. Thus, many references to Protocol Analysis as a task or as a CTA method are in fact references to TAPS+PA, that is, TAPS plus Protocol Analysis as the data analysis method. In this section of the document we offer some guidance concerning protocol analysis.

5.1.2. Materials
To conduct a study in which experts' performance at their familiar tasks is examined, one must select particular problems or cases to present to the expert. Materials can come from a number of sources. Since experts often reason in terms of their memories of past experiences or cases (their "war stories") (Kolodner, 1991; Slade, 1991; Wood and Ford, 1993), experts can be asked to retrospect about cases that they themselves encountered in the past. Hypothetical test cases (e.g., Kidd and Cooper, 1985; Prerau, 1989) can be generated from archival data or can be generated by other experts. A set of test cases can be intended to sample the domain, or it can be intended to focus on prototypical cases, sometimes to sample along a range of difficulty. For example, Senjen (1988) used archived data to generate test cases of plans for sampling the insects in orchards. The test cases were presented to expert entomologists, who then conducted their familiar task--the generation of advice about the design of sampling plans.
Occasionally, experts come across a particularly difficult or challenging case. Reliance on tough cases for TAPS knowledge elicitation can be more revealing than observing experts solving common or routine problems (Klein and Hoffman, 1993). Mullin (1989) emphasized the need to select so-called well-structured test case problems to reveal ordinary "top-down" reasoning, and to use so-called ill-structured or novel test case problems in order to reveal flexible or "bottom-up" reasoning. Hoffman (1987) tape-recorded the deliberations of two expert aerial photo interpreters who had encountered a difficult case of radar image interpretation. The case evoked deliberate, pensive problem solving and quite a bit of "detective work." In this way, the transcripts were informative of the experts' refined or specialized reasoning.

5.1.3. Coding Schemes
In the traditional analysis, each and every statement in the protocol is coded according to some sort of a priori scheme that reflects the goal of the research (i.e., the creation of models of reasoning). Hence, the coding categories include, for example, expressions of goals, observations, and hypotheses.
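As a concrete illustration of such an a priori scheme, the sketch below (not drawn from any published coding manual; the category definitions and the example statements are invented) shows one way to hold the categories and to record the code assigned to each statement.

from dataclasses import dataclass
from typing import Optional

# An a priori coding scheme: category name -> working definition.
CODING_SCHEME = {
    "GOAL":        "Statement expresses what the practitioner is trying to achieve.",
    "OBSERVATION": "Statement reports a datum or cue the practitioner noticed.",
    "HYPOTHESIS":  "Statement proposes a tentative explanation or expectation.",
}

@dataclass
class CodedStatement:
    statement_id: int
    text: str
    code: Optional[str] = None    # a key of CODING_SCHEME, or None if uncodable

    def assign(self, code: str) -> None:
        if code not in CODING_SCHEME:
            raise ValueError(f"{code!r} is not a category in the scheme")
        self.code = code

# Invented statements, coded by hand against the scheme.
protocol = [
    CodedStatement(1, "I want to know whether the front will stall."),
    CodedStatement(2, "The satellite loop shows the cloud edge has not moved."),
    CodedStatement(3, "So the front has probably stalled over the coast."),
]
protocol[0].assign("GOAL")
protocol[1].assign("OBSERVATION")
protocol[2].assign("HYPOTHESIS")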
A number of alternative coding schemes for protocol analysis have been discussed in the literature (see, for instance, Cross, Christiaans, and Dorst, 1996; Newell, 1997; Pressley and Afflerbach, 1995; Simon, 1979). Without exception, the coding scheme the researcher uses is a function of the task domain and the purposes of the analysis. If the study concerns the reasoning involved in the control of an industrial process, categories would include, for instance, statements about processes (e.g., catalysis), statements about quantitative relations among process variables, and so on (see Wielinga and Breuker, 1985). If the study is about puzzle-solving (e.g., cryptarithmetic), categories might include, for example, statements about goals, about the states of operators, about elementary functions, and so on. If the study concerns expert decision making, categories might include: noticing informational cues or patterns, hypothesis formation, hypothesis testing, seeking information in service of hypothesis testing, sensemaking, reference to knowledge, procedural rules, inference, metacognition, and so on.
Statements need not just be categorized with reference to a list of coding categories. The scheme might be more complex and might involve sub-categories; for instance, statements about operators might be broken down into statements about assigning values to variables, generating values of a variable, testing an equation for variable y at value x, and so on, to a fine level of analysis (see Newell, 1997). Also, working backwards from a detailed assignment of each and every statement in a protocol, one can cluster sequences of statements into functional categories (e.g., a sequence of utterances that all involved a forward search or a means-end analysis) (see Hayes, 1989).

5.1.4. Effort
In addition to these method-shaping questions, there is an important methodological consideration for protocol analysis: No matter what task is used to generate the protocol, transcription and analysis are very time- and labor-intensive (see Burton et al., 1987, 1988; Hoffman, 1987). It takes on the order of seven to ten hours for even an experienced transcriber to transcribe each hour of audiotaped TAPS, largely because even the best typist will have to pause and rewind very frequently, and cope with many transcription snags (e.g., how do I transcribe hesitations, "um's," "ah's," etc.?). Coding takes a considerable amount of time, and the validity check (multiple coders, comparison of the codings, resolution of disagreements, etc.) takes even more time. Both Burton et al. and Hoffman compared TAPS+PA with other methods of eliciting practitioner knowledge (interviews, sorting tasks, etc.), and both found that TAPS+PA is indeed time-consuming and effortful, and yields less information about domain concepts than did contrived techniques. The upshot is that in terms of its total yield of information, TAPS with Protocol Analysis is one of the least efficient methods of eliciting domain knowledge from practitioners. On the other hand, if the analysis focuses just on identifying leverage points or culling information about practitioner reasoning, and not on the coding of each and every statement in the protocol, then a review of a protocol can be useful.

5.1.5. Coding Schemes Examples
To illustrate protocol analysis coding schemes, we present four examples. The first three all come from the weather forecasting case study (Hoffman, Coffey, and Ford, 2000).
Example 1: Abstraction-Decomposition The first coding scheme we use to illustrate protocol analysis shows how protocol analysis is shaped by project goals. The Abstraction-Decomposition Analysis scheme evolved out of research on nuclear safety conducted by engineer Jens Rasmussen at the RIS National Laboratory in Denmark (Rasmussen, Pjtersen, and Schmidt, 1990). That research was initially focused on engineering solutions to the avoidance of accidents, but the researchers discovered that human error remained a major problem. Hence, their investigation spilled over into the ergonomic domain and also the analysis of the broader organizational context. So what started as an engineering project expanded into a fuller Work Analysis. The researchers developed a scheme that can be used for coding interviews and TAPS sessions. The coding representation shows in a single stroke both the category of each proposition and how each proposition fits into the participant's strategic reasoning process and goal-orientation. This scheme is illustrated in Figure 6.1. The rows refer to the levels of abstraction for analyzing the aspect of the work domain that is under investigation. The columns refer to the decomposition of the important functional components. 73 Levels of Abstraction Levels of Decomposition Whole System Subsystem Component Goals Measures of the goals General functions and activities Specific functions and activities Workspace configuration Figure 5.1. The coding scheme of Abstraction-Decomposition Analysis. Table 5.1. shows how this scheme would apply to the domain of weather forecasting. In particular, it refers to a workstation system for the Forecast Duty Officer, called the Satellite, Alphanumerics, NEXRAD and Difax Display System (SAND). Table 5.1. An instantiation of the Abstraction-Decomposition Analysis. Forecasting Facility Goals Understand the weather, Carry out standing orders. Measures of the goals General functions and activities Duty section forecast verification statistics. Prepare forecast products and services to clients. Carry out all Standard Operating Procedures. Specific functions and activities Physical form The Operations floor layout. Forecast Duty Officer's Desk Forecast the weather FDO - SAND Workstation Supports access and analysis of weather data products from various sources. 24-hour manning by a senior expert Understanding and analysis of weather data. Forecast verification, System downtime, Ease of use. Supports access to satellite images, computer models, etc. Prepare Terminal Area forecasts, request NEXRAD products, etc. Workspace layout. Supports comparison of imagery to lightning network data to locate severe storms. CRT location and size, keyboard configuration, desk space, etc. 74 Once a template for use in a work domain has been diagrammed in such a manner, each of the statements in a specific interview or problem solving protocol can be assigned to the appropriate cell, resulting in a process tracing that codes the protocol in terms of the work domain. A hypothetical example appears in Figure 5.2. The numbered path depicts the sequence of utterances in an interview in which a forecaster was asked to describe the Standard Operating Procedure involved in the use of the SAND system, supported by probe questions about what makes the system useful and what makes it difficult. GOALS Forecasting Facility Forecast Duty Officer's Station "This is for everyone-Forecasters, Observers." "It is used to prepare packets for the student pilots." 
SAND 4 2 MEASURES "Sometimes the workstation running SAND hangs up." "SAND has a lot of potential if it works the say it is supposed to." 5 9 1 GENERAL FUNCTIONS "This SOP says what the products are and how to get them." 3 SPECIFIC FUNCTIONS 7 PHYSICAL FORM "You sometimes have to refer to the SOP folder when you need to look up something specific." "Charts get printed from here." "SAND is used every day." "SAND gives you access to NWS charts." 6 8 Figure 5.2. Protocol tracing using the Abstraction-Decomposition Analysis. (Reprinted from Hoffman and Lintern, 2005, with permission.) Example 2: Coding of Propositions for a Model of Knowledge In the analysis of an interview transcript, statements were highlighted that could be incorporated in Concept-Maps of domain knowledge. From the transcript: 75 'Cause usually in summer here we're far enough south that ah...high pressure will dominate, and you'll keep the fronts and that cold polar continental air north of us. Even if the front works its way down, usually by the time it gets here it’s a dry front and doesn't have a lot of impact... I also think of ah... I think of tornados too. Severe thunderstorms and tornadoes are all part of the summer regime. And again, tornadoes and severe thunderstorms here are not quite as severe as they are inland like we talked about last time, because they need to get far enough in to get the mixing, the shearing and the lift. And that takes a while to develop and unfold. You really don't see that kind of play until this maritime air starts to cross the land-sea interface. Starting at the beginning of this excerpt, one sees the following propositions: (In the Gulf Coast region) high pressure will dominate (in the summer) (High pressure) keeps fronts north of us (the Gulf Coast region) (High pressure) keeps cold polar continental air north of us (the Gulf Coast region). Example 3: Coding for Leverage Points In an analysis of SOP documents, the Participant went through each of the Standard Operating Procedures and was probed about each one, as illustrated in Table 5.2. The purpose of the coding was to identify leverage points, indicated by the boldings. 76 Table 5.2. Notes from an interview concerning a SOP. SIGMET stands for SIGnificant METeorologcical event. Task/Job: Observer updates SIGMETs Action sequence: Conducted every hour (valid for 2 hours) SIGMET is defined and identified Change is saved as a jpeg file Change is sent to the METOC home page and to thereby to the Wall of Thunder What supports the action sequence? What is the needed information? METOC PC How to find the SIGMETS (they come from Kansas City). Powerpoint with an old kludge. There Have to plot them - sometimes look up stations in the are a series of 4 maps station identifier book. Have to know more geography - SE Texas - East Coast, than many observers know. - FL, AL, MS, - VA through GA, - TX, OK, Arkansas. What is good or useful about the support and the depiction of needed information? The SIGMETS themselves are really good - give customers good information at a glance. The PowerPoint maps are designed for METOC - they have all geographical information and 3 letter identifiers (e.g., PNS for Pensacola) that is needed. Other forecasting offices have blank maps of the US. What about the support or information depiction makes the action sequence difficult? Too labor intensive - the whole system is archaic. There is a commercial Web site that has a Java program with map and red box for SIGMET. 
you can highlight the box and get text describing the SIGMET. This always seems to be updated. Limited capability to customize the shapes of the SIGMET areas. The map cannot move as you are in the act of drawing a SIGMET--you have to change functions and scroll the map with the mouse. The alphanumerics are hard to see even if you zoom. The map shown on the CRT is not large enough, details are hard to read. It is a sectored map--cuts off at Texas. You sometimes have to hunt for station identifiers--end up searching via "Yahoo." Some stations have several IDs. Map cannot scroll outside a limited area. Nothing in the work environment reminds the Observer to conduct the task. NOTE: The final display of SIGMETs does not support a zoom function. They aren't easy to see on Data Wall. The work is often done on the hardcopy map lying on the table--where you can see all the regions and station identifiers in a glance, and read all the details. After figuring it out on the hardcopy map the Observer inputs it into the computer. The entries in this table are the researcher's notes from the interview—including some paraphrases and synopses of what the participant said—and are not an utterance-for utterance protocol. This underscores the idea that protocol analysis, as a data analysis method, can be divorced from the TAPS task and can be used for a variety of purposes. 77 Example 4: Coding an Unstructured Interview to Identify Rules for an Expert System. A project for the U. S. Air Force Military Airlift Command was aimed at developing an expert system to assist in the creation of airlift plans (Hoffman, 1986). The software used in planning was complex, and only one individual had achieved expertise in its use. The researchers conducted an unstructured interview with that expert. An example excerpt is presented in the left-hand column in Table 5.3. The focus of the interview at this point was on the kinds and nature of the information provided in a file about airfields. The excerpt is cropped in the center column, with boldings in that column indicating the coded pieces. The right-hand column shows the encoded propositions in an intermediate representation, a step closer to implementability as concepts for a knowledge base and procedural rules for an inference engine. 78 Table 5.3. Coding of a transcript from an unstructured interview. The purpose was to identify concepts for a knowledge base and potential inference rules for an inference engine. Transcript Encoded Transcript Concepts and Rules I: What is the difference between MOG and airport capability? E: Ah… MOG maximum on the ground is parking spots… on the ramp. Airport capability is how many passengers and tons of cargo per day it can handle at the facilities. I: Throughput… ah… throughput as a function of… E: It all sorta goes together as throughput. If you've only got… if you can only have ah… if you've got only one parking ramp with the ability to handle 10,000 tons a day, then your… your throughput is gonna be limited by your parking ramp. Of the problem could be vice versa. I: Yeah?… E: Ah… MOG maximum on the ground is parking spots… on the ramp. Airport capability is how many passengers and tons of cargo per day it can handle at the facilities. Concept: Maximum number of aircraft allowed on the ground = MOG … your throughput is gonna be limited by your parking ramp. Of the problem could be vice versa. Rule: Parking ramp capacity can limit throughput.) 
Concept: Airport capability is the number of passengers and number of tons of cargo per day the airport can throughput. E: So it's a [unintelligible phrase in the audio recording] I: So what if you had only one loader, so that you could only unload one widebody airplane at a time? You wouldn't want to schedule five planes on the ground simultaneously. How would you restrict that? E: We know we're not gonna get all the error out of it. We're gonna try and minimize the error. And then we'll say that… ah… we'll say an arrival-departure interval of one hour, so that means that the probability of having two wide-bodies on the ground tryin' to get support from the loader is cut… E: We know we're not gonna get all the error out of it. We're gonna try and minimize the error. And then we'll say that… ah… we'll say an arrival-departure interval of one hour, so that means that the probability of having two wide-bodies on the ground tryin' to get support from the loader is cut… Rule: If you need to restrict the number of aircraft on the ground, then manipulate the arrival-departure interval.
Note in this example that there are many propositions in the Expert's statements (e.g., "The airport designation file only includes the field's common name, latitude, longitude, MOG, and capability") that were not coded because they were not useful in composing implementable statements. In addition, there were statements made by the Interviewer (e.g., "The airport capability can be restricted by the number of loaders.") that could also have been coded, and even implemented as rules, but were not because they were only tacitly affirmed by the Expert. (This sort of slippage is one of the disadvantages of unstructured interviews.) Like the first three examples, this fourth one also makes the point that the coding scheme depends heavily on the purposes of the analysis.

5.1.6. Coding Verification
In some cases it is valuable to have more than one coder conduct the protocol coding task. In some cases it is necessary, for demonstrating the soundness of the research method and the conclusions drawn from the research. For research in which data from the TAPS task are used to make strong claims about reasoning processes, especially reasoning models that assert cause-effect relations among mental operations, the assessment of inter-coder reliability of protocol codings is regarded as a critical aspect of the research (see Ericsson and Simon, 1993).
There are a number of ways of using multiple coders and conducting a validity check. In the simplest procedure, two researchers independently code the statements in the protocol and a percentage of agreement is calculated. Researchers typically look to find a high rate of 85 percent or greater agreement among multiple coders (see Hoffman, Crandall, and Shadbolt, 1998). Analysis of multiple codings can show that the coding scheme is well defined, consistent, and coherent. On the other hand, it is likely that there will be disagreements, even among coders who are practiced and are familiar with the task domain. Disagreements can be useful pointers to ways in which the coding scheme, and the functional categories on which it is based, might be in need of refinement.
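For the simplest check described above, two coders and a percentage of agreement, the bookkeeping can be done in a few lines. The sketch below (ours, with invented codings) computes the agreement rate over the statements that both researchers coded and also lists the disagreements, since those are the pointers to categories that may need refinement.

def agreement_report(coding_a, coding_b, threshold=0.85):
    """Percent agreement between two coders, plus the disagreements.

    Each coding maps a statement id to the category assigned by that coder.
    Only statements coded by both researchers are compared.
    """
    shared = sorted(set(coding_a) & set(coding_b))
    agreements = [s for s in shared if coding_a[s] == coding_b[s]]
    disagreements = {s: (coding_a[s], coding_b[s])
                     for s in shared if coding_a[s] != coding_b[s]}
    rate = len(agreements) / len(shared) if shared else 0.0
    return rate, rate >= threshold, disagreements

# Invented example: statement id -> assigned category.
coder_1 = {1: "GOAL", 2: "OBSERVATION", 3: "HYPOTHESIS", 4: "OBSERVATION"}
coder_2 = {1: "GOAL", 2: "OBSERVATION", 3: "OBSERVATION", 4: "OBSERVATION"}

rate, acceptable, disagreements = agreement_report(coder_1, coder_2)
print(f"Agreement: {rate:.0%}  (85 percent or greater? {acceptable})")
print("Disagreements to discuss:", disagreements)

Whether a simple percentage such as this is enough, or whether a more elaborate check is warranted, depends on the strength of the claims the research is meant to support, as discussed below.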
5.1.7. Procedural Variations and Short-cuts
In another procedure:
1. Two or more researchers, working independently, go over the typed transcript and highlight every statement that can be taken as an instance of one of the coding categories.
2. Each researcher codes each highlighted statement in terms of the coding categories.
3. Each researcher codes the highlighted statements from the other researchers' highlightings.
4. Both the highlightings and the codings from the researchers are compared.
A short-cut on this general approach to coding verification is to have multiple coders and a reliability check, and once an agreement rate of 85% or more is achieved, all remaining transcripts can be coded by individual researchers.
Because the process of transcription alone, let alone coding in addition, is very time-consuming and effortful, researchers may want to consider another short-cut. In this procedural variation, there are two researchers, one serving as Interviewer to facilitate the interview (or other empirical procedure) and the other serving as a Recorder, who takes notes about what everyone says, especially things that the participant says that are salient in terms of the project goals. After the notes are transcribed, each researcher independently highlights every statement that can be taken as a reference to a coding category.
For some research, the assessment (elaborate or otherwise) of inter-coder reliability may not be necessary. For instance, the identification of leverage points in the analysis of the Standard Operating Procedures in the weather forecasting case study (Hoffman, Coffey, and Ford, 2000) did not require an assessment of inter-coder reliability because the leverage points were explicitly elicited from the expert using interview probe questions. Furthermore, the coding scheme was simple—a statement either was or was not an expression of a possible leverage point. Likewise, in the protocol analysis of the Concept Mapping interview, the identification of statements that could be used in Concept Maps did not mandate verification that all possible statements that could be used in Concept Maps were in fact identified and used. Such an analysis would have actually detracted from the main goal of the analysis, which was to identify propositions that could be used in Concept-Maps on particular topics, and that were not already in the Concept-Maps that had been created.

5.1.8. A Final Word
In planning to conduct a protocol analysis procedure in the context of systems design, there are a number of questions the researcher might consider at the outset. These are presented in Table 5.4.
Table 5.4. Some questions for consideration when planning for a protocol analysis procedure.
What are the purposes? If the purpose is to develop reasoning models, then the categorization scheme needs to include such (slippery) categories as "observation," "goal," and "hypothesis." If the purpose is to identify leverage points, the categorization scheme can be simple (i.e., highlighting statements that refer to any sort of obstacle to problem solving).
What is the level of analysis? Do I cut a coarse grain—"notice," "choose," "act"? Or do I cut a fine grain based on a functional analysis of the particular domain (e.g., "look at data type x," "notice pattern q," "choose operation y," "perform procedure z," and so on)?
Do I need to code each and every statement? What constitutes a statement? Do I separate them by the hesitations on the recording? How do I cope with synonyms, anaphor, etc.? Statements are often obviously dependent on previous statements. Context dependence is always a feature of protocol analysis because context dependence is always a feature of discourse.
How intensive must the validity check be?
5.1.8. A Final Word

In planning to conduct a protocol analysis procedure in the context of systems design, there are a number of questions the researcher might consider at the outset. These are presented in Table 5.4.

Table 5.4. Some questions for consideration when planning for a protocol analysis procedure.

What are the purposes? If the purpose is to develop reasoning models, then the categorization scheme needs to include such (slippery) categories as "observation," "goal," and "hypothesis." If the purpose is to identify leverage points, the categorization scheme can be simple (i.e., highlighting statements that refer to any sort of obstacle to problem solving).

What is the level of analysis? Do I cut a coarse grain--"notice," "choose," "act"? Or do I cut a fine grain based on a functional analysis of the particular domain (e.g., "look at data type x," "notice pattern q," "choose operation y," "perform procedure z," and so on)?

Do I need to code each and every statement?

What constitutes a statement? Do I separate statements by the hesitations on the recording? How do I cope with synonyms, anaphora, etc.? Statements are often obviously dependent on previous statements. Context dependence is always a feature of protocol analysis because context dependence is always a feature of discourse.

How intensive must the validity check be? Indeed, must there be a validity check at all, given the purposes of the research? Do I need to have independent coders code the protocol, and then compare the codings for inter-coder reliability? How many coders--two? Three? What rate of agreement is acceptable? How do I cope with the inevitable disagreements?

Answers to these questions will determine what, precisely, is done during the protocol analysis.

5.2. The Cognitive Modeling Procedure

5.2.1. Introduction and Background

The Documentation Analysis, Work Space Analysis, Participant Interviews, and other bootstrapping methods can allow the researcher to forge a preliminary model of domain practitioner reasoning or reasoning style, and even variations that capture the reasoning of less versus more proficient Practitioners, or variations representing the ways that reasoning strategies are shaped by problem types. As is traditional in cognitive science, such models take the form of information processing flow diagrams or decision trees, with boxes indicating mental capacities such as memory, boxes indicating mental operations such as inference, and arrows indicating transformations or information flows among the hypothetical components.

The question always arises as to how to validate such models, that is, insure that the models are veridical. The main approach taken in traditional experimental psychology is to conduct programmes of experimental research in which models are developed, refined, and then tested under controlled circumstances that allow convergence on the "truth" concerning the causal relations among mental processes. For the purposes of most applied research--which is invariably conducted under significant constraints of time, resources, and particular goals--it is out of the question to step out of the Cognitive Work Analysis and Cognitive Field Research venue to design and conduct a series of formal experiments. Might there not be some other way to take a "fast track into the black box?"

One way would be to simply present to the domain Practitioners the model that the researcher forged in the bootstrapping process, and elicit their commentary, i.e., "Does this model capture your reasoning?" There are two main difficulties with this approach. First, knowledge elicitation using abstract or conceptual-level probes can be ineffective. Research using the CDM procedure, for example, has shown repeatedly that the more effective probes are those that are couched in terms of specific experiences (e.g., "How did you reason in this particular case?") rather than in terms of typical patterns (e.g., "Describe your typical pattern of reasoning."). Second, any responses to the Base Model may be "just-so stories," biased by the Base Model itself. That is, the responses may reflect the Practitioner's attempt to please the interviewer.

A second way to obtain validation would be via observations, to see if the patterns of behavior suggested by the model conform to actual task behavior. The main difficulty here is that cognitive activities (e.g., data inspection in service of mental model formation versus data inspection in service of hypothesis testing) may not be discernible on the basis of overt behaviors.

The issue of validating cognitive models is a significant one, and in theory requires formal, controlled, and costly experimentation.

5.2.2 Protocol Notes

The "Cognitive Modeling Procedure" (CMP) is one way of obtaining validation in a more direct and efficient manner. In this interview, one begins with the "Base Model of Expertise" (see Hoffman and Shadbolt, 1996).
The Base Model is shown in the following Figure.

[Figure: The Base Model of Expertise, comprising a Situation Awareness Cycle, a Mental Model Refinement Cycle, and an Action Plan Refinement Cycle. Elements include the Problem of the Day, Data Examination, Recognition Priming, judgments and predictions, hypotheses and questions, mental simulation, trend analysis, courses of action, focus, leverage points, gaps, barriers, and the Action Queue.]

5.2.2.1. Preparation

Armed with this Base Model, the researcher takes what was learned from the bootstrapping activities (e.g., documentation analysis, unstructured interviews, etc.) and tweaks the Base Model to make it accord with the domain. In the case of the weather forecasting domain, the Base Model would be modified as in the following Figure.

[Figure: The Base Model adapted to weather forecasting. Data Examination draws on satellite images, data fields, radar, and computer model outputs; the mental model is a mental model of atmospheric dynamics; judgments and predictions are expressed as forecast products; the Action Plan Refinement Cycle involves focus, leverage points, and opportunistic data search; and the Action Queue includes routine forecasts and the possible need for special warnings.]

Next, the researcher takes the elements in the Base Model and recombines them so as to create two "bogus models." The bogus models will include boxes containing concept terms and action paths (arrows) labeled by actions. The following Table lists the sorts of terms and labels that reasoning models can include. This listing of terms was compiled from the models presented in a great number of recent studies of various domains of real-world decision-making (Hoffman and Shadbolt, 1995).

Some Useful Concept-terms: UNDERSTANDING OF THE CURRENT SITUATION, MENTAL SIMULATION, KNOWLEDGE OF CONCEPTS, KNOWLEDGE OF PRINCIPLES, KNOWLEDGE OF "RULES OF THUMB", MEMORY/PAST EXPERIENCES, PATTERN, CUE, PREDICTION, HYPOTHESIS, GUIDANCE, PLAN, INPUT/OBSERVATIONS, JUDGMENT, GOAL/SUB-GOAL, EXPECTATION, PROTOTYPE/SCHEMA

Some Useful Labels for Action Paths: COMPARE, AGREE?, DISAGREE?, TEST HYPOTHESIS, REVISE/CHANGE, SEARCH, RECOGNIZE, VERIFY/CONFIRM, REFUTE, RECOGNIZE PROBLEM, MODIFY, IMPLEMENT, REMEMBER, EVALUATE, DIAGNOSE, INFER, DETECT, IDENTIFY, ASSESS CONFIDENCE, QUALITY, ALTERNATIVES/COURSES OF ACTION, ANALOGIZE, RECOMMEND, DECIDE/CHOOSE/SELECT, PRIORITIZE, SCHEDULE

The concept-terms used for boxes and the labels used for action paths are not limited, except that they need to make reference to cognitive activities and task-relevant behaviors, and that they have to make functional sense in terms of the data that have been collected up to the point when the Cognitive Modeling Procedure is conducted. To the greatest extent possible, the concept terms and action path labels should not rely on technical or psychological jargon (see the Table above). For instance, if one were to use the concept "mental model," the concept would have to be explained to the Participant. Instead, one might use lay terminology such as "Understanding of the Current Situation."

The two bogus models need to be reasonable. For example, one would probably not have a "disagree" link lead directly to a box labeled "output judgment." One or both of the bogus models should include some sort of feedback loop, to let the participant know that such loops are acceptable. This is especially important since it is a good bet that one or more such loops will be involved in proficient reasoning in any domain.
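To make the structure of these models concrete, the following is a minimal sketch, in Python, of how a bogus model might be recorded as a set of boxes and labeled action paths before it is drawn up for presentation. The box names and path labels are drawn from the table above, but this particular model is a hypothetical illustration (including one feedback loop, in keeping with the guidance above), not one of the example bogus models given in the Templates sub-section.

```python
# Minimal sketch: a reasoning model recorded as boxes (concept terms) and
# labeled action paths (arrows). The model itself is a hypothetical example.

boxes = [
    "INPUT/OBSERVATIONS",
    "UNDERSTANDING OF THE CURRENT SITUATION",
    "HYPOTHESIS",
    "PLAN",
    "JUDGMENT",
]

# Each path is (from_box, action_label, to_box).
paths = [
    ("INPUT/OBSERVATIONS", "RECOGNIZE", "UNDERSTANDING OF THE CURRENT SITUATION"),
    ("UNDERSTANDING OF THE CURRENT SITUATION", "TEST HYPOTHESIS", "HYPOTHESIS"),
    ("HYPOTHESIS", "REVISE/CHANGE", "UNDERSTANDING OF THE CURRENT SITUATION"),  # feedback loop
    ("UNDERSTANDING OF THE CURRENT SITUATION", "DECIDE/CHOOSE/SELECT", "PLAN"),
    ("PLAN", "EVALUATE", "JUDGMENT"),
]

# Print the model as a simple edge list, one labeled path per line.
for source, label, target in paths:
    print(f"{source} --[{label}]--> {target}")
```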
If one were to present two artificial models that did not contain one or more feedback loops, this might give the Participant the impression that the Interviewer was expecting purely linear models. On the other hand, one should avoid having the feedback loop in the bogus models be an actual Duncker refinement cycle. One of the main things one seeks in the Cognitive Modeling Procedure is to have the Participant re-create this cycle and thereby validate the Base Model.

One or both of the bogus models should include some notion that hearkens to hypothesis testing, and some notion that hearkens to situation awareness, again because it is a good bet that such notions will be involved in proficient reasoning in any domain. On the other hand, the two bogus models should not conform precisely to those models that actually do, at this stage of inquiry, seem to describe Practitioner reasoning.

Try to make the two bogus models fairly simple (see the Base Model that is presented here). The artificial models should have only about five boxes and three linking arrows. The bogus models should serve as an invitation for the Practitioner to adapt the terms to their domain and flesh the models out. Example "bogus models" are presented below in the Boiler Plate sub-section, and these can be used in composing the CMP.

5.2.2.2. Round 1

In Round 1 of the CMP, the Practitioner/Participant is presented with the two bogus models and is asked two probe questions:

Which of these seems to best describe how you approach your domain task(s)?

Are there any ways in which this (selected) model seems to be incorrect or in need of refinement?

(Don't be surprised if the participant shrugs, or even chuckles.) The Participant is told that they can feel free to concoct a diagram of their own, and that there is no one correct way of diagramming their reasoning or strategies. Indeed, they should be told that they may need more than one diagram to capture their different strategies or the ways in which their reasoning might depend on the nature of the case or situation at hand.

It is recommended that the participant be explicitly informed that they are allowed some quiet time alone, and perhaps even a span of a few days, to ponder the models and consider their answers to the probe questions. Otherwise, the pressure of having to create a more immediate answer might have a negative impact on the reflective thought that this task requires. The caution here is that if the researcher leaves material with the Practitioner, there is a high likelihood that on the researcher's return, nothing will have been done. Domain Practitioners, even those who may be enthused about the project, will often, if not usually, "drop the ball" if left to their own devices. If the material is to be left with the Practitioner, it is highly recommended that, at the time the material is first presented, an appointment be made for a specific date and time, no more than a week hence, when the interview and discussion will be conducted.

The notion in this interview is to have the domain Practitioner correct the bogus model that was selected as being the one that came closest to matching their own reasoning. The Practitioner is expected to add richness to the over-simplification of the bogus model. What comes out as a consequence should be in greater conformity with the Base Model, i.e., a validation of the Base Model or of a Base Model of a particular reasoning style.
Although this interview method engages the Practitioner at an abstract or conceptual level of analyzing their own thinking, it does not seem to suffer from that abstractness. When a model does not fit, the Practitioner can see this, and can use the bogus models as support in clarifying the description of their own reasoning style.

After the completion of Round 1, the Researcher prepares a polished version of the Participant's model. Inevitably, the polished models will differ from the models crafted by the participants in Round 1, in ways that can be subtle and possibly profound. This is because of the degrees of freedom one has in concocting these sorts of models, and also because of the deliberate attempt on the part of the researcher to "clean up" the Participant's descriptions and create an elegant, if not totally logical, depiction. There are many degrees of freedom involved in the construction of reasoning models, and there is no single correct way to create a model. Placement of concepts and identification of paths can vary considerably without fundamentally altering the important flows in reasoning sequences. Thus, it is prudent to present the polished model to the Participant and invite comment. Fine-tuning of the model may be called for.

5.2.2.3. Round 2

After the Round 1 data have been collected and the resulting models from a number of domain Practitioners have been placed into a polished graphic, it is possible to run Round 2. In Round 2, each Participant Expert is presented with copies of all of the models that had been derived in Round 1. Added to this set are some foils derived from the analysis of Participants who are not expert, and some (new) bogus models. The task presented to each Participant Expert is to:

Judge which model goes with which of the Practitioners (including their own), or rate the degree of expertise manifested in each of the models;

Judge which model(s) describe the reasoning of individuals who are not experts;

Judge which model(s) are bogus; and

Rate their degree of confidence in each of their judgments.

It is best to limit the number of models to fewer than 10 (say, one bogus model, two apprentice models, and four expert models). For greater numbers of models, the task can become overwhelming. The Interviewer keeps track of task time and takes notes.

Round 2 can be used to explore a number of hypotheses about the work domain. For instance, if the Practitioner-participants are all experts, one might expect that their models would tend to be similar, and hence in Round 2 they would show considerable confusion and even fail to recognize their own model. If this hypothesis is of interest, it is desirable for the researcher to wait some weeks, or even months, between the conclusion of Round 1 and the beginning of Round 2. The researcher may be interested in testing hypotheses about the workplace, such as, "Do these Practitioners discuss their reasoning amongst themselves?" "Do they all have the opportunity to learn of each other's reasoning styles and strategies by seeing each other in action?" "How closely do they communicate and cooperate in their organizational context?" Results from Round 2 can address such questions.

Hypotheses for Round 2

Hypothesis: Experts will recognize their own models, with confidence, if only because they will have already seen their polished model during the Round 1 discussion in which the Analyst's cleaned-up version is presented for confirmation.
Alternative hypothesis: If there is a time delay between CMP Round 1 and Round 2, experts will confuse the models and will fail to recognize their own. As skill approaches expertise, one would expect the reasoning models to converge, setting the stage for error or confusion. Experts may falsely recognize models of other Experts, insofar as those other models conform to some degree to their own reasoning, or to the normative model of proficient reasoning.

Hypothesis: Experts will recognize the models of individuals who are not expert, with confidence, especially if the non-expert models tend to be simplifications.

Alternative hypothesis: Expert models might be relatively simple as well as complex, depending on the level of detail the Expert went into during CMP Round 1. So, just because a model is simple, that does not mean that it is necessarily a bogus or an apprentice model. Indeed, a participant may regard even a bogus model as being an expert's model.

Hypothesis: Experts will correctly identify the bogus models, assuming that the bogus models include some sort of clear violation of domain practice or logic.

Alternative hypothesis: If the bogus models are oversimplifications rather than clear violations of domain practice, then they may be confused with the expert models.

Hypothesis: Experts are likely to use a "divide-and-conquer" strategy in which they first attempt to identify the models of the senior experts, and their own model, and then partial out the remaining alternatives into the apprentice and bogus categories.

Alternative hypothesis: Other strategies are possible. A less frequent one is to peruse all of the models carefully before making any decisions.

Hypothesis: Advanced apprentices and journeymen should be able to say which model goes with each expert, assuming that they have had the opportunity to work with each of the experts. Presumably, the apprentices and journeymen observe the experts' reasoning patterns and recognize stylistic differences and individual preferences.

Alternative hypothesis: None of the participants will show confidence in assigning the models, since they may rarely, if ever, discuss, or even have the opportunity to witness, one another's reasoning style or strategy.

5.2.2.4. Round 3

In Round 3, the Analyst observes the Participant when he or she is conducting an actual task in the operational context. The researcher notes as much as possible about everything the Practitioner does and says, seeking behavioral evidence that might confirm or disconfirm aspects of the Participant's model. For instance, if a participant asserts in Round 1 that they always begin their familiar task by looking at a particular data type, then there should be corresponding observable behavior--if the researcher is located in the workplace just at the moment the participant begins to conduct their familiar task. In addition to taking detailed notes on the Practitioner's behavior, the researcher can use the refined participant's model as a "checklist."

It is possible during Round 3 to ask occasional probe questions (assuming that such intrusion in the operational setting is appropriate, acceptable, and agreed to in advance). For instance, suppose again that a Practitioner asserted in Round 1 that they begin their familiar task by inspecting a particular data type (which should be observable in their behavior) and also said in Round 1 that they then attempt to come to an understanding of the current situation. This is mental modeling, which in all likelihood will not be readily observable. The researcher might simply ask, "What are you thinking?"
just at the point where the Practitioner has conducted the initial examination of the data for an appropriate span of time, or has turned away from the data.

Round 3 has to be conducted opportunistically. For instance, in the weather forecasting project it was decided that the best time to make the Round 3 observations would be at the very beginning of the watch period of a forecaster who had been off of the watchbill for a span of days. In this case, the forecaster would almost certainly have to begin their watch by examining data to develop their initial mental model of the overall weather situation--something most of the forecasters asserted (in Round 1) that they did. If they were observed on any other day, they would have already had a mental model of the current weather situation when they walked in the door. Furthermore, the observations had to be conducted at a time when the local weather was not in a persistence regime--when the weather is dominated by an air mass or a stationary front, making the weather situation basically the same from day to day.

Opportunism notwithstanding, the researcher needs to keep focused on the main goal of Round 3--observe whatever can be observed, and even ask probe questions, to see what elements of the Practitioner's model are mirrored in actual behavior. The Round 3 results can also be examined with regard to the Base Model of domain expertise that was developed in the Preparation phase--which elements were affirmed, and which need refinement.

5.2.2.5. Products

Following Round 3, the data collected in all three Rounds can be pulled together into the final products:

Refined Base Model of domain expertise

Refined models of the reasoning of the individual participants

Conclusions regarding any hypotheses that were testable in Round 2.

5.2.3. Templates

5.2.3.1. Round 1

Example Bogus Models.

[Figure: Example bogus model 1, with boxes including EXAMINE COMPUTER GUIDANCE, EXAMINE ERRORS MADE IN PAST SIMILAR CASES, FORM A MENTAL MODEL, DO "WHAT-IF" REASONING, INSPECT OTHER DATA, OUTPUT THE JUDGMENT, and DOUBLECHECK.]

[Figure: Example bogus model 2, with boxes including INSPECT DATA (type 1), INSPECT DATA (type 2), REJECT HYPOTHESES, UNDERSTAND THE CURRENT SITUATION, COMPARE WITH GUIDANCE, and OUTPUT JUDGMENT.]

Analyst / Participant / Date / Start Time / Finish Time

Instructions

Based on the material we have collected so far in this project, we have attempted to create something like a logical flow diagram that attempts to capture the sorts of strategies that domain Practitioners use. On the next page you will see diagrams of two possible reasoning strategies. Another way of saying this is that these are two alternative ways of describing what you do when you conduct your usual or typical task. We would like you to look at these and think about them. You may have some immediate reactions, but you may want to take this booklet with you and think about the strategies for a day or so.

Which of these seems to best describe how you approach your domain task(s)?

Are there any ways in which this (selected) model seems to be incorrect or in need of refinement?

Please feel free to concoct diagrams of your own. You might use a strategy other than those shown here. You might use more than one strategy depending on the given task situation. Please keep in mind that there is no one correct way of diagramming your reasoning or strategies. Indeed, you may need more than one diagram to capture your different strategies, or those of other Practitioners. After you have thought about the two strategies that are depicted in the diagram, please complete pages 4 and 5.
Insert your "Bogus Models" page here 95 Which of these diagrams seems to best describe how you approach your domain task(s)? In what ways does it seem to capture your strategies? Are there any ways in which this (selected) diagram seems to be incorrect or in need of refinement? What changes would you make? Feel free to make changes right on the diagram page itself. Feel free to concoct diagrams of your own, ones that better represent your strategy or strategies. DISCUSSION 5.2.3.2. Round 2 96 Analyst Participant Date Start Time Finish Time Instructions Based on the material we have collected so far we have been able to recreate graphical representations of the reasoning styles of each of our Participants. Those models appear on these pages that I will lay out on the table. For each of these models, we would like to guess "who is the owner." That is, for each model, can you determine the person whose reasoning the model seems to capture? Please write you judgment on each page. If you uncertain of the ownership of a model, you can give more than one guess. Also, try to give your reasons for assigning models to individuals. What in your experience working with each individual leads you to assign each particular model to each particular person? One of the models is yours. Can you tell which? If so, what clues you in? Not all of the models presented here go with the Experts. One or more may represent the reasoning of an individual who is less-than-expert. See if you can determine that. Not all of the models presented here are "real" models that would represent the reasoning of ANY domain Practitioner, whether expert or not. See if you can determine which of the models is "bogus." In attempting to determine the ownership of a model, you may be confident, or uncertain, for any of a variety of reasons. For instance, you may have the suspicion that a given model represents the reasoning of some particular person, but you may not have had much opportunity to actually witness that person's reasoning style. Hence, you may be uncertain. As another example, you may feel that more than one model might represent the reasoning of a particular person. Please share such reactions with the researcher. If you have any questions at any time, please ask. (reproduce the following set of cells, as needed) MODEL ID Whose Model is it? Rate your confidence 97 Check One: Name Circle one: ______________________________ A "bogus" model _______ Someone who is not an expert _______ COMMENTS 1 - Highly Confident 2 - Confident 3 - Somewhat confident 4 - Somewhat uncertain 5 - Uncertain 6 - Very Uncertain 98 5.2.3.3. Round 3 Analyst Participant Date Start Time Finish Time E =EVENT, P = PROBE R = REPLY, A = ANALYSIS TIME NOTES (reproduce this table, as needed) 99 6. Critical Decision Method 6.1. General Description of the CDM Procedure The CDM Procedure involves multi-pass retrospection. The expert is guided in the recall and elaboration of a previously-experienced case. The CDM leverages the fact that domain experts often retain detailed memories of previously-encountered cases, especially ones that were unusual, challenging, or in one way or another involved "critical decisions." The CDM works by deliberately avoiding generic questions of the kind, "Tell me everything you know about x," or, "Can you describe your typical procedure?" Such generic questions have not met with much success in the experience of many knowledge engineers. More fundamentally: 1. 
1. In complex sociotechnological domains of practice, there often is no general or typical procedure.

2. Scenarios may be "typical," or prototypical, but: There are likely to be many action sequence alternatives, even for routine tasks and routine situations. Procedures will vary as a function of style. Procedures will vary as a function of equipment status. Procedures will vary as a function of practitioner skill level.

Thus, any uniformities observed in procedures may reflect problem differences only, and are likely simply to reflect the procedures that are taught in the schoolhouse and prescribed by the operational forms and reporting procedures. Therefore, the CDM probes focus on the recall of specific, concrete experiences from the practitioner's own history as a decision maker.

6.1.0. Preparation

Training of elicitors to properly conduct the CDM procedure has involved a number of exercises, including experience in the role of interviewee, experience in the conduct of (mock) CDM procedures, and experience at preparing an interview guide. As in the exercise of any knowledge elicitation method, effective use of the CDM presumes that the elicitor is a skillful facilitator at the level of the social dynamic. The knowledge elicitor must become familiar with the domain--through analysis of research and technical documents, preliminary interviews and conversations with domain experts, on-site observations, and so on. The goals for knowledge elicitation are specified and defined operationally. Experts, in sufficient number, must be available and cooperative.

6.1.1. Step One: Incident Selection

The opening query poses a particular type of event or situation and asks for an example where the expert's decision making altered the outcome, where things would have turned out differently had s/he not been there to intervene, or where the expert's skills were particularly challenged. The incident must come from the person's own lived experience as a decision maker. The goal in this Step is to help the participant identify cases that are non-routine, especially challenging, or difficult--cases where one might expect differences between the decisions and actions of an expert and those of someone with less experience, and cases where elements of expertise are likely to emerge. One should bypass incidents that are memorable for tangential reasons (e.g., someone died during the incident) and incidents that are memorable but did not involve the expert in a key decision-making role.

6.1.2. Step Two: Incident Recall

The participant is asked to recount the episode in its entirety. The participant is asked to "walk through" the incident and to describe it from beginning to end. The elicitor asks few, if any, questions, and allows the participant to structure the account.

6.1.3. Step Three: Incident Retelling

Once the expert has completed the initial recounting of the incident, the elicitor tells the story back, matching as closely as possible the expert's own phrasing and terminology, as well as incident content and sequence. The participant is asked to attend to the details and sequence. The participant will usually offer additional details, clarifications, and corrections. This sweep allows the elicitor and the participant to arrive at a common understanding of the incident.

6.1.4. Step Four: Timeline Verification and Decision Point Identification

The expert goes back over the incident account a second time. The expert is asked for the approximate times of key events.
A timeline is composed along a domain-meaningful temporal scale, based on the elicitor's judgment about the important events, the important decisions, and the important actions taken. The timeline is shared with and verified by the expert as it is being constructed. The elicitor's goal is to capture the salient events within the incident, ordered by time and expressed in terms of the points where important input information was received or acquired, points where decisions were made, and points where actions were taken. Input information could include objectively verifiable events (e.g., "The second fire truck arrived on the scene") and subjective observations (e.g., "I noticed that the color of the smoke had changed; it was a lot darker than it had been when I arrived on the scene").

In some cases, the markers for decision points are obvious in the expert's statements (e.g., "I had to decide whether it was safe to send my firefighters inside the building"). In other cases, the expert's statements suggest feasible alternative courses of action that were considered and discarded, or suggest that the expert was making judgments that would affect the course of decision making. The goal is to specify and verify decision points, points where there existed different possible ways of understanding a situation or different possible actions.

6.1.5. Step Five: Deepening

The elicitor leads the participant back over the incident account a third time, employing probe questions that focus attention on particular aspects of each decision-making event within the incident. The step typically begins with questions about the informational cues that were involved in the initial assessment of the incident. The elicitor focuses the participant's attention on the array of cues and information available within the situation, eliciting the meanings those cues hold and the expectations, goals, and actions they engender. The elicitor then works through each segment of the story, asking for additional detail and encouraging the expert to expand on the incident account. This step is often the most intensive, sometimes taking an hour or more. The information solicited depends on the purpose of the study, but might include the presence or absence of salient cues and the nature of those cues, the assessment of the situation and the basis for that assessment, expectations about how the situation might evolve, the goals that were considered, and the options that may have been evaluated. The array of topics and specific probe questions that might be used in this sweep of the CDM are listed with the Boiler Plate forms (below). No single CDM procedure would be likely to include all of the topics or probes presented here; elicitors select some subset of these in accord with the goals of the study.

6.1.6. Step Six: "What if?" Queries

The fourth sweep through the incident involves shifting from the participant's actual experience of the event to a more analytical strategy. The elicitor poses various hypothetical changes to the incident account and asks the participant to speculate on what might have happened differently. In studies of expert decision making, for example, the query might be: "At this point in the incident, what if it had been a novice present, rather than someone with your level of proficiency? Would they have noticed Y? Would they have known to do X?"
The elicitor can ask the expert to identify potential errors at each decision point and how and why errors might occur, in order to better understand the vulnerabilities and critical junctures within the incident. The purpose of the "what if" probes is to specify pertinent dimensions of variation for key features contained in the incident account. Klein, Calderwood, and MacGregor (1989) noted that the reasons for taking a particular action are frequently illuminated through understanding choices that were not made, or that were rejected.

6.2. Protocol Notes for the Researcher

The goal is to develop a detailed account of a set of scenarios via the method of retrospection concerning "critical" decisions. The detailed account will include timelines and decision-point analyses. A timeline is a layout of the sequence of events involved in the analysis of a particular event or scenario. The fundamental dimension is the time course of the event, and laid onto this are milestones indicated by component events, situation assessments, decision points, and actions. Situation assessments are based on informational cues and component events. Decision points involve triggering cues and situation assessments, determination that information is needed, hypothetical reasoning, action options, action goals, and action rationale.

Do not plan to obtain detailed CDM accounts representing all possible tasks, scenarios, mission types, etc. Since practitioner skill levels and styles are both highly variable, and are themselves dependent on context, the analysis must focus at the functional level. If the analysis is conducted at too detailed a level, the recommendations may not address the most significant system design problems. Seek information about cognitive skills and requirements across the many different aspects of the work as performed by different types of practitioners, rather than trying to specify everything involved in each of a set of various particular types of procedures. Do not focus on specifying the details of each and every task/scenario, but instead focus on the variability in the tasks that practitioners face and the cognitive demands that are shared across tasks/scenarios. Conduct detailed CDM analyses of selected incidents only, to insure that the general models of styles and cognitive activities are representative. Do not assume that HCC decision-aids will be developed that support activities for each of a set of particular scenarios or mission types.

According to the literature on the CDM, sessions should last between one and two hours. However, in some domains the case stories that practitioners have to tell are rich indeed. The CDM sessions can last hours, and the process of transcription and analysis likewise can take hours. It is possible to split the CDM into two or three sessions.

Encourage the Participant to create sketches or diagrams to describe the situation or any of its aspects. The Participant might also be able to provide graphic resources from archived data on the case. These can all be integrated into the final case study.

In a three-way split: The first session includes Steps One (Incident Selection) and Two (Incident Recall). After these two steps, the elicitor transcribes the notes and prepares for the second session. The second session includes Steps Three (Retelling) and Four (Timelining). After these two steps, the elicitor transcribes the notes and prepares for the third session.
The third session includes Steps Four (Timeline Verification and Decision-Point Identification), Five (Deepening), and Six ("What-if" Queries).

In a two-way split: The first session includes Steps One and Two (followed by a brief break during which the elicitor creates the Timeline) and then Step Three. The second session includes Steps Four, Five, and Six.

6.3. Boiler Plate Data Collection Forms

6.3.0. Cover Form

Analyst / Participant

Time in Interview: Date / Start / Finish (repeat as needed)

Time in Transcription and Analysis: Date / Start / Finish (repeat as needed)

Total Time

6.3.1. Step One: Incident Selection

Start Time / Break / Finish Time

Instructions

In this interview we want to talk about the interesting and challenging experiences you have had. We will provide you with guidance in the form of specific questions to support you in laying out some of your past experiences.

Can you think of a case you were involved in that was particularly difficult or challenging--a time when your expertise was really put to the test?

Can you remember a situation where you did not feel you could trust or believe certain data, such as a computer model or some other product--a situation where the guidance gave a different answer than the one you came up with?

Can you remember a case when you went out on a limb--where the data might have suggested something other than what you thought, or another practitioner might have done things differently, but you were confident that you were right?

(Expand the cells and copy this table as needed.)

6.3.2. Step Two: Incident Recall

Start Time / Break / Finish Time

Instructions

This (selected event) sounds interesting. Could you please recount it again in a bit more detail, from beginning to end?

(Copy this table as needed.)

6.3.3. Step Three: Incident Retelling

Transcription: Date / Start Time / Finish Time

Retelling: Date / Start Time / Finish Time

Instructions

Now what I'd like to do is read back your account. As I do so, please check to make sure I've got it all right, and feel free to jump in and make corrections or add in details that come to mind.

Elicitor's embellishments/modifications | Participant's added details (expand the cells and copy this table as needed)

6.3.4. Step Four: Timeline Verification and Decision Point Identification

Analyst drafts the timeline: Date / Start Time / Finish Time

Verification and Decision Point Identification: Date / Start Time / Finish Time

Instructions

Now I'd like to go through the event again, and this time we will try to create a timeline of the important occurrences--when things happened, what you saw, the decisions or judgments you made, and the actions you took.

OSA = Observation / Situation Assessment, D = Decision, A = Action

Event | OSA, D, A | Time (expand the cells and copy this table as needed)

6.3.5. Step Five: Deepening

Date / Start Time / Finish Time

Instructions

Now I want to go through the incident again, but this time we want to look at it in a little more detail, and I'll guide you with some questions.

PROBE TOPIC: PROBES

Cues & Knowledge: What were you seeing?

Analogues: Were you reminded of any previous experience?

Standard Scenarios: Does this case fit a standard or typical scenario? Does it fit a scenario you were trained to deal with?

Goals: What were your specific goals and objectives at the time?

Options: What other courses of action were considered or were available?

Basis of Choice: How was this option selected / other options rejected? What rule was being followed?
Mental Modeling: Did you imagine the possible consequences of this action? Did you imagine the events that would unfold?

Experience: What specific training or experience was necessary or helpful in making this decision? What training, knowledge, or information might have helped?

Decision Making: How much time pressure was involved in making this decision? How long did it take to actually make this decision?

Aiding: If the decision was not the best, what training, knowledge, or information could have helped?

Situation Assessment: If you were asked to describe the situation to a relief officer at this point, how would you summarize the situation?

Errors: What mistakes are likely at this point? Would you have recognized if your situation assessment or option selection were incorrect? How might a novice have behaved differently?

Hypotheticals: If a key feature of the situation had been different, what difference would it have made in your decision?

Event | OSA, D, A | Time | PROBE | RESPONSE (copy this set of cells for each event in the timeline)

6.3.6. Step Six: "What-if" Queries

Date / Start Time / Finish Time

Instructions

Now I want to go through the incident one more time, but this time I want to analyze it a little, and to do that I'll ask you some hypothetical questions.

PROBES

What might have happened differently at this point?

What were the alternative decisions that could have been made here? What choices were not made, or what alternatives were rejected?

At this point in the incident, what if it had been a novice present, rather than someone with your level of proficiency? Would they have noticed Y? Would they have known to do X?

What sorts of error might have been made at this point? Why might errors have occurred here?

Event | OSA, D, A | Time | PROBE | RESPONSE (copy this set of cells for each event in the timeline)

6.3.7. Final Integration (Re-telling, Timeline, Deepening, and What-if Steps)

In the final integration, the Researcher collapses all of the comments from the Timeline Verification, Deepening, and What-if steps into the text for each entry in the verified timeline. The result is a detailed case study expressed as a timeline, with each entry specified as being either OSA, D, or A. Any sketches the Participant made during any of the steps, or any graphic resources on the case that have been procured, can be inserted at the appropriate places.

Event and Comments | OSA, D, A | Time (expand the cells and copy this table as needed)

6.3.8. Decision Requirements Table

The Researcher examines each cell in the Final Integration and culls from it all statements that refer to each of the categories in the DRT. The categories of the DRT are: Cues and Variables; Needed Information; Hypotheticals; Options; Goals; Rationale; Situation Assessment; Time/Effort.

7. Goal Directed Task Analysis

Debra G. Jones and Mica R. Endsley, SA Technologies

7.1. Introduction

In systems that involve a significant amount of cognitive work, conducted under dynamic and changing conditions, developing and maintaining situation awareness (SA) is a critical task for decision makers, forming the basis for decision making and action. Developing systems that effectively support decision makers as they carry out their mission requires that these systems support the wide range of SA requirements that underlie the decision maker's many key decisions and goals. Good system design needs to support not just part of their decisions or tasks, but the full range of work that must be accomplished.
To meet this need, a cognitive task analysis methodology for determining the SA requirements of decision makers has been developed (Endsley, 1993, 2000), which can be applied to a wide variety of domains. Key features of this method are that it (a) focuses not just on determining a list of data that is needed, but also identifies how that data is used and combined to form SA – specifically the comprehension and projection requirements associated with a job, and (b) provides a systematic approach for determining these requirements across the many varying goals of a job.

This method is based on obtaining a detailed knowledge of the specific goals the decision maker must accomplish and the decisions that people make in meeting each of these goals. As goals are a central organizing feature for information seeking and interpretation in complex systems, they form a foundation for this type of cognitive task analysis. A detailed goal structure provides a systematic foundation for insuring that the full range of decisions and information requirements involved in a person's job are considered in the analysis process. The goals, decisions, and information requirements identified through this analysis provide the foundation for both the creation of new systems and the evaluation of current systems. The ability of a system to support decision makers in successfully accomplishing their mission depends on how well the system supports their goals and information requirements.

This type of information typically cannot be accessed easily via a traditional task analysis, as traditional task analyses focus more on the physical tasks and functions that must be accomplished and the processes by which these functions and tasks are completed. This focus is restrictive in that it often does not identify the real information the decision maker needs to accomplish goals, nor does it identify how the decision maker integrates information to gain an understanding of the situation. In addition, traditional task analyses tend to assume that work is sequential and linear, when in fact much of situation awareness involves a constant juggling back and forth between multiple goals and processing of information in a very non-linear fashion. Thus, although traditional task analyses can provide information that is vital to the design process, alone they are insufficient for fully identifying the informational requirements of decision makers interacting with complex, dynamic environments.

A Goal Directed Task Analysis (GDTA) addresses these issues by identifying the goals a decision maker must achieve in order to accomplish a mission, the decisions that must be made in order to accomplish these goals, and the specific information that is needed to support these decisions (Endsley, Bolte, & Jones, 2003) (see Figure 1).

[Figure 1: Elements of the GDTA--Goals, Decisions, and Information Requirements.]

The GDTA documents what information decision makers need to perform their job and how they integrate or combine information to address a particular decision. The GDTA is not developed along task or procedural lines, is not sequenced according to a timeline, and does not reflect goal priority, as priorities change dynamically depending on the situation. Rather, the GDTA focuses on what information decision makers would ideally like to know to meet each goal, even if that information is not available given current technology.
The ideal information is the focus of the analysis, as focusing only on what current technology can provide would induce an artificial ceiling effect that would obscure much of the information the decision maker would ideally like to know. Further, the means an operator uses to acquire information are not the focus of this analysis, as methods for acquiring information can vary from person to person, from system to system, from time to time, and with advances in technology. The GDTA delineates the specific SA requirements a decision maker needs and determines the nature and format of how this information must be integrated in order to achieve each goal relevant to successful task completion.

The purpose behind the creation of a GDTA is to identify the information the decision maker really needs and uses. Once this information has been identified, systems can be evaluated to determine how well the current design meets these needs, and future designs can be created that take these needs into account from the beginning.

7.2. Elements of the GDTA

The GDTA has three main components: Goals, Decisions, and SA Requirements.

7.2.1. Goals

The goals a decision maker has with respect to a particular task, mission, or operation are one element of the GDTA. Goals are higher-order objectives essential to successful job performance. The highest-level goal reflects the overall goal of the decision maker. For example, the overall goal for the Army Operations Officer is to "Plan and Execute the Mission". Associated with this overall goal is a set of main goals that must be achieved in order to accomplish the overall goal. Although the number of main goals varies depending on the complexity and breadth of the domain under consideration, typically three to six main goals describe the overall goal. Finally, each main goal may have a varying number of subgoals associated with it. For more complex domains, subgoals may also have any number of associated sub-goals. The goals increase in specificity as they move down the hierarchy (Figure 2).

[Figure 2: Example levels of goals within the GDTA--an Overall Goal, several Main Goals, and the sub-goals associated with each Main Goal.]

7.2.2. Decisions

The next element of the GDTA involves the decisions the decision maker must make in order to achieve a particular goal. Decisions are associated with a specific goal, although a similar decision may play into more than one goal. These decisions are essentially the questions the decision maker must answer in order to achieve a specified goal. These questions require the synthesis of information in order to understand the situation and how it will impact the associated goal. For example, for the Army brigade intelligence officer goal "Determine impact of environment on friendly forces," one of the questions that must be answered in order to achieve this goal is "What is the impact of weather on friendly forces?"

7.2.3. Situation Awareness Requirements

The final element of the GDTA involves the information needed to answer the questions that form the decisions. These information needs are the decision maker's situation awareness requirements. Situation awareness (SA) can be defined as "the perception of the elements within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future" (Endsley, 1988, 1995).
From this definition, three levels of situation awareness can be identified: Level 1, which involves the most basic data that are perceived; Level 2, which involves an integration of Level 1 data elements; and Level 3, which involves projecting how the integrated information will change over time. The SA requirements analysis identifies and documents relevant information at all three of these levels.

7.3. Documenting the GDTA

Two main structures are utilized to document the GDTA: the goal hierarchy and the relational hierarchy.

7.3.1. Goal Hierarchy

The primary goal hierarchy begins with the overall operator goal, from which the major goals are identified and the subgoals essential for successfully achieving each major goal are defined. The structure and depth of the primary goal hierarchy will depend on the characteristics of the domain under consideration. The information needed to develop this goal hierarchy, as well as the subsequent components of the GDTA, is generated through a variety of mechanisms (e.g., interviews, document reviews), which will be discussed later.

The goal hierarchy is made up of the overall goal, the main goals, and the subgoals associated with a particular task or domain. These goals and subgoals are numbered in outline style for organizational purposes; the numbers are not intended to denote priority or sequential processes. The main goals are numbered 1.0, 2.0, 3.0, etc., and the associated sub-goals are numbered 1.1, 1.2, etc. (Figure 3). Although the GDTA can be completed without this numbering system, using this convention has been found to facilitate discussion and revision of the GDTA. An example of a GDTA for the Army S2 is shown in Figure 4.

[Figure 3: GDTA goal hierarchy--an Overall Operator Goal with Major Goals numbered 1.0 through 4.0, each with its associated numbered sub-goals (1.1, 1.2, and so on).]

[Figure 4: Army Operations Officer Goal Hierarchy]
Plan and Execute Mission
1.0 Develop plan to effectively execute mission
2.0 Provide ongoing operations
2.1 Optimize placement of units and assets
2.2 Provide force protection
2.3 Supervise battle preparation
2.4 Monitor battlefield / conduct recon
3.0 Accomplish mission with least casualties
3.1 Determine adjustments to plan needed to meet mission goals
3.2 Synchronize & integrate combat power in field

7.3.2. Relational Hierarchy

The relational hierarchy shows the relationship between the goals, the subgoals, the decisions related to each subgoal, and the SA requirements relevant for each decision (see Figure 5). Each of the elements in the relational hierarchy has a corresponding shape that, although not necessary for the completion of a GDTA, does facilitate information transfer and comprehension. The exact shape of the GDTA with respect to the number of goals and related sub-goals depends on the complexity of the domain and the goal being analyzed. The GDTA is a flexible tool. The structure can be varied as needed to most accurately represent the relationship between the various goals, subgoals, decisions, and SA requirements essential to the domain under consideration.
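As a concrete illustration of this structure, here is a minimal sketch, in Python, of how one branch of a relational hierarchy (a sub-goal, one of its decisions, and the SA requirements at the three levels) might be recorded before being drawn in the graphical form of Figure 5. The goal, decision, and requirement labels are hypothetical; the numbering follows the outline convention described above.

```python
# Minimal sketch: one branch of a GDTA relational hierarchy recorded as a
# nested structure. The labels are hypothetical illustrations.

subgoal = {
    "number": "1.1",
    "name": "Assess impact of weather on the operation",
    "decisions": [
        {
            "question": "How will weather affect the planned operation?",
            "sa_requirements": {
                "Level 3 - Projection": ["Projected impact of weather on the operation"],
                "Level 2 - Comprehension": ["Deviation of current weather from planning assumptions"],
                "Level 1 - Perception": ["Precipitation", "Visibility", "Winds"],
            },
        }
    ],
}

# Print the branch in the indented, stacked format used in the GDTA.
print(f'{subgoal["number"]} {subgoal["name"]}')
for decision in subgoal["decisions"]:
    print(f'  Decision: {decision["question"]}')
    for level, items in decision["sa_requirements"].items():
        print(f'    {level}:')
        for item in items:
            print(f'      {item}')
```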
[Figure 5: Relational Hierarchy--a Major Goal (1.0) with sub-goals (1.1, 1.2, 1.3); each sub-goal is linked to one or more Decisions, and each Decision is linked to SA Requirements at Level 3 (Projection), Level 2 (Comprehension), and Level 1 (Perception).]

7.3.3. Identifying Goals

Items that are appropriate for designation as goals are those items that require operator cognitive effort and that are essential to successful task completion. They are higher-level items, as opposed to basic information requirements. They are a combination of sometimes competing subgoals and objectives which must be accomplished in order to reach the person's overall goal. The goals themselves are not decisions that need to be made, although reaching them will generally require that a set of decisions and a corresponding set of SA requirements be known.

At times, a subgoal will be essential to more than one goal. In these cases, the sub-goal should be described in depth at one place in the GDTA, and then, in subsequent utilizations, it can be "called out," referencing the earlier goal rather than repeating the goal in its entirety. Cross-referencing these callouts helps maintain the clarity of the GDTA, improve cohesiveness, support consistency, and minimize redundancy. Utilizing callouts instead of repeating information also benefits the interview process, as these subgoals can be discussed at one point rather than inadvertently taking up valuable interview time reviewing the same subgoals.

Performing a task is not a goal, in part because tasks are technology dependent. A particular goal may be accomplished by means of different tasks depending on the systems involved. For example, navigation may be done very differently in an automated cockpit as compared to a non-automated cockpit. Yet the SA needs associated with the goal of navigation are essentially the same (e.g., location or deviation from desired course). Although tasks are not the same as goals, the implications of the task should be considered. A goal may be addressed by a task the subject matter expert mentions during the interview. By focusing on goals rather than tasks, future uses of the GDTA will not be constrained. In some future system, this goal may be achieved very differently than by the present task – for example, the information may be transferred through an electronic network rather than with a verbal report.

Goal names should be descriptive enough to explain the nature of the subsequent branch (i.e., the related subgoals, decisions, and SA requirements) and broad enough to encompass all elements related to the goal being described. Further, the goals should be just that – goals, not tasks nor information needs. Physical tasks are not goals: they are things the operator must physically accomplish, such as filling out a report or calling a coworker. Goals, in contrast, require the expenditure of higher-order cognitive resources, such as predicting an enemy course of action (COA) or determining the effects of an enemy's COA on the outcome of the battle.

7.3.4. Defining Decisions

The decisions that are needed to effectively meet each goal in the goal hierarchy are listed beneath the goals to which they correspond (see Figure 5). Decisions reflect the need to synthesize information in order to understand how that information is going to affect the system both now and in the future.
Although decisions are presented in the form of questions, not all questions qualify as decisions. Questions that can be answered "yes/no" are not typically considered appropriate decisions in the GDTA. For example, the question "Does the commander have all the information needed on enemy courses of action?" can be answered with yes/no and does not qualify as a decision. On the other hand, "How will enemy courses of action affect friendly objectives?" requires more than a yes/no answer and is a decision that is pertinent to the person's goals. Further, if a question's only purpose is to discern a single piece of information, it is not a decision; rather, it is an information requirement and belongs in the SA requirements portion of the hierarchy.

When a single goal or subgoal has more than one relevant decision, these decisions (and their corresponding SA requirements) can be listed separately or together within the goal hierarchy, depending on personal preference and the size of the goal hierarchy. One advantage of listing them separately is that the SA requirements for each decision can be easily discerned from one another. This aids in the process of insuring that all the information needed for a decision is present. If, after final analysis, the SA requirements for several decisions show complete overlap, they can then be combined in the representation for conciseness.

7.3.5. Delineating SA requirements

Decisions are posed in the form of questions, and the associated SA requirements provide the information needed to answer the questions. To determine the SA requirements, each decision should be analyzed individually to identify all the information the operator needs to make that decision. The information requirements should be listed without reference to technology or the manner in which the information is obtained. When delineating the SA requirements, be sure to fully identify the item for clarity – for example, instead of 'assets', identify 'friendly assets' or 'enemy assets'.

Although numerous resources can be utilized to develop an initial list of SA requirements (e.g., interview notes or job manuals), once a preliminary list is created, the list needs to be verified by domain experts. Often, experts will express many of their information needs at a data level (e.g., altitude, airspeed, pitch). Further probing may be required to find out why they need to know this information. This probing will prompt them to describe the higher-level SA requirements – how this information is used. For example, altitude may be used (along with other information) to assess deviations from assigned altitude (Level 2 SA) and deviations from terrain (Level 2 SA). When the expert provides higher-order information requirements (e.g., "I need to know the enemy strengths"), further probing will be required to find all of the lower-level information that goes into that assessment (e.g., ask "What about the enemy do you need to know to determine their strengths?").

The SA requirements are listed in the GDTA according to their level of SA, using an indented, stacked format (see Figure 5). This format helps ensure that the SA requirements at each level are considered, and it generally helps with the readability of the document. However, the three levels of SA are general descriptions that aid in thinking about SA. At times, definitively categorizing an SA requirement into a particular level will not be possible.
Further, not all decisions will include elements at all three levels; in some cases, a Level 2 requirement may not have a corresponding Level 3 item, particularly if the related decision addresses current, rather than future, operations.

At times, a series of SA requirements will be essential for numerous goals. In these cases, this frequently used subset of SA requirements can be turned into a callout block with a descriptive title. For example, weather issues at all levels of SA may be of concern across many subgoals in the hierarchy. To decrease complexity and reduce redundancy, a weather callout block can be created. This block can be called within the SA requirements section of multiple subgoals as "Weather" and fully defined at the end of the GDTA (see Figure 6). Consistency is important for the SA requirements, and SA requirement callout blocks help ensure that each time a frequently used item is referenced, all of its relevant information requirements are included.

Figure 6: Subset of Army Intelligence Officer GDTA showing the Weather call-out. The figure shows subgoal 1.3.2, Determine Ability of Assets to Collect Needed Information, with the decision "Can I safely deploy my assets?" and SA requirements including: projected safety of deployment for assets; level of risk to the assets in the area; projected enemy COAs; projected ability of the enemy to counterattack the asset; terrain; commo range; ability to move the assets stealthily; areas of cover and concealment; time of year; security needed to protect the asset; available force protection; time needed to collect the information; accessibility of the area; and Weather. The Weather call-out block itself covers icing, projected weather, inversion, temperature, barometric pressure, precipitation, thunderstorms, hurricanes, tornadoes, monsoons/flash flooding, tides, cloud ceiling, lightning, winds (directions, magnitudes, surface winds, winds aloft), visibility, illumination, fog, day/night, ambient noise, moon phases, and sand storms.

7.4. GDTA Methodology

7.4.1. Step 1: Review Domain

Prior to beginning a goal-directed task analysis, a general understanding of the domain of interest should be developed. This understanding can be gained by reviewing relevant documents pertaining to the domain (e.g., job descriptions, job taxonomies, manuals, performance standards). This review accomplishes several things. First, it provides an overview of the domain and the nature of the decision maker's job. Second, it provides a background for understanding the terminology the interviewee is accustomed to using. Finally, the interview should go more smoothly (and the interviewer's credibility be enhanced) if the interviewer can "talk the talk." The interviewer must be cautious, however, not to develop a preconceived notion of what the decision maker's goals are likely to be and, as a result, seek only confirming information in the interview.

7.4.2. Step 2: Initial interviews

Interviews with subject matter experts (SMEs) are an indispensable source of information for the GDTA. Whenever possible, each SME should be interviewed individually. When more than one person is interviewed at a time, data collection may be negatively affected if one participant is more vocal than the other or if the participants have dissenting opinions about the relevant goals and decisions.
Each interview begins with an introduction of the purpose and intent of the data collection effort and a quick review of the interviewee's experience. Next, the SME is asked about the overall goals relevant to successful task completion. As the SME explains the overall goals, make note of topics to pursue during the interview. This process of noting potential questions as they come to mind should be followed throughout the entire interview. Relying on memory for questions may work for some, but for most people the questions will be forgotten (only to crop up again after the interview, when the practitioner is trying to organize the GDTA).

Obtaining time with domain experts is often difficult, so maximizing the time when it is available is essential. Several suggestions can be offered to this end. First, conduct the interview with two interviewers present to minimize downtime; one interviewer can continue with questions while the other quickly reviews and organizes notes or questions. The presence of two interviewers also has a benefit beyond the interview: constructing the goal hierarchy is easier when two people with a common frame of reference (developed in part from the interview session) work on the task. Next, limit the number of interviews performed on any given day. Time is needed to organize the information gleaned from each interview and to update the charts to ensure that subsequent sessions are as productive as possible. Interviewing too many people without taking time to update notes and charts can negatively impact not only the interview sessions but also the resulting data quality. Finally, interviewer fatigue can be a considerable factor if too many interviews are scheduled without a break or if the interviews last too long. Two hours per session is generally the recommended maximum.

7.4.3. Step 3: Develop the goal hierarchy

Once the initial interviews have been completed, the task becomes one of organizing all the disparate pieces of information collected during the interviews into a working preliminary goal structure that will allow for adequate portrayal of the information requirements. Determining the overall goal is usually fairly straightforward. The art of the process lies in delineating the goals essential to the success of the overall goal. Typically, more questions are raised than answered during the first attempts to develop a goal hierarchy. Nonetheless, creating a preliminary goal structure, even if incomplete or sketchy, is essential and will aid in focusing future data collection efforts.

Although each practitioner will develop a unique style for developing the goal hierarchy, one approach is to begin by reorganizing the notes from the interviews into similar categories. This categorization can be done by beginning each new category on a separate page and adding statements from the notes to these categories as appropriate. Sorting information in this manner helps illuminate areas that constitute goals and may make it easier to create a preliminary goal hierarchy. Although defining an adequate goal hierarchy is the foundation of the GDTA, in the early stages this hierarchy will not be perfect, and an inordinate amount of time should not be spent trying to make it so. Further interviews with experts will most likely shed new light that will require the goal hierarchy to be revamped by adding, deleting, or rearranging goals.
Nonetheless, developing even a preliminary goal structure at this point in the process provides a baseline for future iterations, helps in the process of aggregating information, and helps direct information gathering efforts during the next round of interviews.

7.4.4. Step 4: Identify decisions and SA requirements

Once a preliminary goal hierarchy has been developed, the associated decisions and SA requirements can be developed to the extent possible given the amount of data collected. Notes taken during the interviews can be preliminarily linked with the associated decisions and goals. Further, after the process of organizing the interview notes into the relational hierarchy is complete, existing manuals and documentation can be referenced to help fill in the holes. Caution should be exercised, however, since these sorts of documents tend to be procedural and task specific in nature. Information in the GDTA is concerned with goals and information requirements, not with current methods and procedures for obtaining the information or performing the task. Although operators often don't explicitly follow the written procedures found in the related documentation, evaluating the use of this information in the hierarchy can help to ensure the GDTA is complete and can spark good discussions of what information the decision maker actually needs to achieve a goal.

7.4.5. Step 5: Additional interviews / SME review of GDTA

Once a draft GDTA is created, it can serve as a tool during future interview sessions. One way to begin an interview session is to show the primary goal hierarchy to the interviewee and discuss whether it captures all the relevant goals. After this discussion, one section of the GDTA can be selected for further review, and each component of that section (i.e., goals, subgoals, decisions, and SA requirements) can be discussed at length. The draft can be used to probe the interviewee for completeness (e.g., "What else do you need to know to assess this?") and to determine higher-level SA requirements (e.g., "How does this piece of data help you answer this question?"). Showing the entire GDTA to the participant is generally not a good idea, as its apparent complexity can be overwhelming and consequently counterproductive to data collection. If time permits after discussing one section of the GDTA, another subset can be selected for in-depth discussion.

7.4.6. Step 6: Revise the GDTA

After each round of interviews is complete, the GDTA should be revised to reflect new information. New information gleaned from iterative interviews allows the goals within the structure to be reorganized and condensed to create better logical consistency. For example, when several participants bring up a low-level goal as an important facet of the job, the goal may need to be moved to a higher place in the hierarchy. Furthermore, as more information is uncovered concerning the breadth of information encompassed by the various goals and subgoals, the goals and subgoals can often be refined to create more descriptive and accurate categories. Often goals will seem to fit more than one category. In these cases, the goal can be represented in detail in one place in the GDTA and referenced (i.e., "called out") in other places within the hierarchy as needed.

As the GDTA is reviewed further, commonalities become apparent that can be used to assist in refining the charts.
Consequently, after defining a preliminary set of goals, the goal structure should be reviewed to determine whether goals that originally seemed distinct, albeit similar, can actually be combined into a single category. Combining similar goals will reduce repetition in the analysis. For example, goals involving planning and re-planning rarely need to be listed as separate major goals; often representing them as branches under the same major goal is sufficient to adequately delineate any distinct decisions or SA requirements. When combining similar goals, all of the information requirements associated with the goals should be retained. If the information requirements are very similar but not quite the same, separate subgoals may be in order, or perhaps future interviews will resolve the inconsistency. If several goals seem to go together, this clustering might be an indication that the goals should be kept together under the umbrella of a goal one level higher in the hierarchy.

7.4.7. Step 7: Repeat steps 5 & 6

Additional interviews with subject matter experts should be conducted and the GDTA revised until a comprehensive GDTA has been developed. After discussing a set of GDTA charts with one participant, the charts should be updated to reflect the changes before being shown to another participant. Showing a chart to another participant before it has been edited may not be the best use of time; interviewer notations and crossed-out items on the charts can be confusing to the participant and thus counterproductive.

7.4.8. Step 8: Validate the GDTA

To help ensure that the final GDTA is as complete and accurate as possible, it should be validated by a larger group of subject matter experts. Printouts of the final GDTA can be distributed to experts with instructions on how to interpret it, and the SMEs asked to identify missing information or errors. Needed corrections can then be made. In some cases a particular expert will report that he or she does not consider certain information or does not perform particular subsets of the GDTA. This feedback does not necessarily mean that these items should be eliminated, however. If other experts report using that data or pursuing those goals, then these components should remain part of the GDTA. Some people will have had slightly different experiences than others and simply may never have been in a position to execute all possible subgoals. As the GDTA will form the basis for future design, the full breadth of possible operations and information requirements should be considered.

An additional way to validate the GDTA is through observation of actual or simulated operations by experienced personnel in the position. While it can be difficult to always know exactly what the operators are thinking (unless they are instructed to "think aloud," producing a verbal protocol), these observations can be used to check the completeness of the GDTA. It should be possible to trace observed actions, statements, and activities back to sections of the GDTA. If people are observed performing tasks that have no apparent basis in the GDTA, or looking for information that is not identified in the GDTA, follow-up should be conducted to determine why. Any additions to the GDTA that are needed should be made based on these validation efforts.
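Where the GDTA is maintained electronically, a simple traceability check of this kind can support the observation-based validation just described. The sketch below (Python; the function name, data layout, and sample entries are illustrative assumptions, not part of the GDTA method) flags observed information requests that have no counterpart anywhere in the GDTA; deciding whether an observed request really matches GDTA wording remains a judgment for the analyst.

    # Hypothetical sketch of the observation-based validation check described
    # above: compare information requests noted during observation against the
    # SA requirements already in the GDTA, and flag anything unaccounted for.

    def flag_unmatched(observed_requests, gdta_requirements):
        """Return observed information requests with no counterpart in the GDTA.

        observed_requests:  list of strings noted during observation
        gdta_requirements:  dict mapping a GDTA section label to its list of
                            SA requirement strings
        """
        known = {req.lower() for reqs in gdta_requirements.values() for req in reqs}
        return [obs for obs in observed_requests if obs.lower() not in known]

    # Illustrative data only; observed phrasing rarely matches GDTA wording exactly.
    gdta = {"1.1 Choose Targets": ["target location", "target mobility (dwell time)"]}
    observed = ["target location", "current weather at the target"]
    print(flag_unmatched(observed, gdta))   # -> ['current weather at the target']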
7.5. Simplifying the Process

Creating a GDTA can seem an overwhelming task at times. Several suggestions can be offered to simplify the process.

Organize the notes into the preliminary GDTA as soon as possible after the first interview, while the conversation is still fresh in memory. Once a draft of the GDTA has been created, number the pages to facilitate discussion and minimize confusion; during a collaborative session, locating page 7 is easier than locating Subgoal 4.2.3.1.

Using paper copies of the GDTA when reorganizing it is often easier than trying to edit it in electronic format, as the pages can be arrayed across the table, elements compared, and changes notated quickly. Once changes have been decided, notate all the changes on a clean copy for review before updating the electronic version. As soon as the reorganization process allows, make a new primary goal hierarchy to provide structure for other changes.

While the GDTA is being developed, make notes of any questions that arise concerning where or how something fits. This list can be brought to the next interview session for clarification.

Do not get too concerned about delineating the different SA levels; these are not concrete, black-and-white categories, and it is conceivable that a particular item can change SA levels over time. Thinking about the different SA levels is mainly an aid to considering how information is used.

If the same thing is being considered at various places to achieve essentially the same goal, combine them. For example, the questions "How does the enemy COA affect ours?" and "How does the enemy COA affect battle outcome?" are essentially the same and can be combined.

During the interview, listen closely for predictions or projections the person may be making when reaching a decision. Try to distinguish between what they are assessing about the current situation (e.g., "Where is everyone at?", the Level 1 SA requirement "location of troops") and what they are projecting in order to make a decision (e.g., "Where will the enemy strike next?", the Level 3 SA requirement "projected location of enemy attack"). Distinguishing between these types of statements will assist the practitioner in following up on higher-level SA requirements and in documenting lower-level SA requirements.

7.6. Conclusion

Creating a comprehensive GDTA for a particular job can be difficult. It often takes many interviews with subject matter experts; it is not uncommon for it to take anywhere from 3 to 10 sessions, depending on the complexity of the domain. Even then, there can be a fair degree of subjectivity involved on the part of the analyst and the experts. These problems are common to all cognitive task analysis methods, and GDTA is no exception. Nonetheless, the GDTA methodology provides a sound approach to delineating the SA requirements decision makers have with respect to specific goals. Understanding these SA requirements will assist designers in evaluating and designing systems to ensure that the decision maker's efforts to build and maintain a high level of situation awareness are supported to the maximum extent possible.
The resulting analysis feeds directly into the design of systems that support situation awareness, in that the structure provides (a) a clear delineation of what the individual really needs to know, allowing designers to integrate data in meaningful ways on displays, (b) key information on what information needs to be grouped together to support decisions, (c) guidance as to how information and decisions relate to common goals, a critical organizing feature for system design, and (d) information on critical cues/situations that dictate necessary shifts in priority between situation classes and goals. This information is used directly in SA-Oriented Design (Endsley, Bolte, & Jones, 2003) to create systems that support SA, based on a set of clearly defined design principles. In addition, the SA requirements identified through the GDTA can be used to create objective metrics for evaluating the degree to which different design concepts and technologies are successful in supporting the SA of decision makers (Endsley, 2000). Overall, the GDTA provides a useful tool for assessing and organizing the situation awareness requirements associated with cognitive work in a wide variety of domains and applications.

Army Fire Support Officer GDTA

Overall goal: Plan and Integrate Lethal and Non-lethal Fires into Scheme of Maneuver
  1.0 Support Scheme of Maneuver
    1.1 Choose Targets
    1.2 Array & Allocate Fire Support Assets
  2.0 Select Best Asset for Target
  3.0 Execute Fire Support Plan
    3.1 Find Desired Targets
    3.2 Coordinate Fire Support Mission
      3.2.1 Coordinate Changes to Plan
      3.2.2 Minimize Collateral Damage
    3.3 Deliver Fires

1.1 Choose Targets
Decision: Should identified target be engaged?
SA requirements:
  Target prioritization
    o Target value
      Target relevance to enemy plans/intentions
      Projected enemy plans/intentions
      Resources needed for enemy plans/intentions
      HVTs
    o Target payoff
      Target relevance to friendly plans/intentions
      Projected ability of target to interfere with friendly plans
      Projected impact of target on friendlies
      Weapons type/range
      Weapons fires
      HPTs
      Friendly COAs/scheme of maneuver
  Target allowability
    o ROE
    o Target location in zone
    o Confidence in target ID
    o Confidence in target location
  Impact of allocating resources to target on ability to complete mission
    o Resources required to engage target
    o Time cost of engaging target
      Projected route of artillery units
      Location of artillery units
      Location of targets
      Projected route of targets
    o Weapons/ammo required to engage target
    o Weapons/ammo required for plan
  Likelihood of successful target engagement
    o Target location
    o Target level of protection
    o Target mobility (dwell time)
      Effects of prior fires on target

1.2 Array Assets to Best Advantage
Decisions: How accurate is my representation of the battlefield? How does the enemy threaten the scheme of maneuver? Is the plan supportable with artillery?
SA requirements:
  Confidence in information
    o Verified information
    o Assumptions
  Projected friendly movement/actions (Bde, Adjacent Bde)
    o Current friendly maneuver/actions
      Friendly location/strength
      Routes of movement
      Observer location
  Projected friendly movement/actions (Artillery Battery)
    o Current maneuver/actions
      Location/strength
      Weapons
      Ammo type/quantity
      Radar types/location
    o Coverage of battlefield
  Projected enemy movement/action
    o Current enemy maneuver/action
      Enemy location/strength
      Enemy weapons
      Enemy weapon coverage area
      Enemy knowledge of friendlies (all)
      Assumptions about enemy
      Enemy doctrine
  Special areas of interest
    o Fire support coordination measures
    o Maneuver boundaries
  Projected impact on friendly COA and maneuver
  CO Intent
  Terrain
  Projected friendly (all) movement/action
    o Current friendly maneuver/action
      Location/strength
      Routes of movement
  Projected enemy movement/action
    o Current enemy maneuver/action
      Enemy location/strength
      Enemy weapons
      Enemy weapon coverage area
      Enemy knowledge of friendlies (all)
      Assumptions about enemy
      Enemy doctrine
  CO Intent
  Projected impact of weapon on enemy
    o Range and location
    o Ammo capabilities
    o Level of protection of enemy
    o Timing of plan (ability to meet with weapons)
  Projected availability of other friendly (non-organic) assets
    o Supporting fires (artillery, mortars, naval gunfire, air support)
    o Aircraft availability
    o Support artillery
      From higher
      From adjacent
    o Electronic Warfare

2.0 Select Best Asset for Target
Decision: Will the asset achieve the desired effect?
SA requirements:
  Projected ability to acquire target
    o Visibility of target
      Weather
      Terrain
      Day/night
  Projected effect of weapon on target
    o Objective (suppress/destroy/neutralize/shape)
    o Level of protection of target
      Target location
      Target size
      Target movement
      Target cover/concealment
      Terrain
      Prepared positions
      Target armored protection
  Projected system effects
    o Weapon type
    o Weapon coverage area (range)
    o Weapon limitations
      Environmental
      Acoustics
    o Weapon location
    o Weapon capabilities
    o Ammo available
    o Ammo required for target
    o Footprint
    o Dispersion pattern
  Projected action of enemy following weapons execution
Decision: Is this the best utilization of resources?
SA requirements:
  Priority and quantity of targets not covered by weapons
    o Projected effect of weapon on target
      Weapon availability (lethal/nonlethal)
      Weapon location
      Weapon type
      Weapon strength/status
      Weapon allocation
      Ammo availability
      Ammo level/type
      Ammo utilization rate
      Projected ammo resupply
  Availability of non-organic resources
    o Target prioritization
    o Availability of more responsive system
    o Availability of lower level of support to achieve desired results

3.0 Execute Fire Support Mission
3.1 Find & Acquire Desired Targets
Decision: What is the best position for observers?
SA requirements:
  Likelihood of being able to detect enemy potential positions
    o Observer capabilities (range/viewability)
    o Ability to get equipment and people into position to sight and acquire target
    o Impact of terrain on viewability of targets
    o Impact of weather on viewability of targets
    o Risk level to observers
      Cover/concealment available to observers
    o Accessibility of area (in/out)
    o Projected enemy positions/movements
      Projected enemy intent
  Likelihood of being able to communicate detections
    o Comm. capabilities
    o Projected enemy interference capabilities
  Best placement of weapon system
  Need for personnel protection
  Projected ability to protect personnel
    o Personnel location/routes
    o Level of skill of personnel
    o Visibility of personnel/weapons
      Areas of cover/concealment
      Enemy locations
      Enemy capabilities

3.2 Coordinate Fire Support Mission
3.2.1 Coordinate Changes to Plan
Decision: Is everyone aware of impending fires and residual effects?
SA requirements:
  Coordinate changes to plan
    o Reallocate fire support assets
    o Revise fire support coordination measures
    o Refine target location
  Info communicated to troops
    o Comm. to troops successful
    o Comm. availability
  Calls for fire requests on target
    o Priority of unit requesting fire support

3.2.2 Minimize Collateral Damage
Decision: Is the potential for collateral damage minimized?
SA requirements:
  Friendly troops potentially affected by fires
    o Projected friendly troop movements
    o Projected friendly location
    o Projected duration of weapons effects
    o Projected area of weapons coverage
      Weapons/ammo type
      Timing of fires
      Location of fires
  Projected impact on civilians
    o Proximity of civilians to target
    o Projected location of civilians
    o Ability to move friendlies/civilians to avoid weapons
    o Level of protection of friendlies/civilians
      Cover of friendlies/civilians

8. References

Burton, A. M., Shadbolt, N. R., Hedgecock, A. P., and Rugg, G. (1987). A formal evaluation of knowledge elicitation techniques for expert systems: Domain 1. In D. S. Moralee (Ed.), Research and development in expert systems, Vol. 4 (pp. 35-46). Cambridge: Cambridge University Press.
Burton, A. M., Shadbolt, N. R., Rugg, G., and Hedgecock, A. P. (1988).
A formal evaluation of knowledge elicitation techniques for expert systems: Domain 1. Proceedings, First European Workshop on Knowledge Acquisition for Knowledge-Based Systems (pp. D3.121). Reading, UK: Reading University.
Chase, W. G., and Simon, H. A. (1973). Perception in chess. Cognitive Psychology, 5, 55-81.
Endsley, M. R. (1988). Design and evaluation for situation awareness enhancement. Proceedings of the Human Factors Society 32nd Annual Meeting (pp. 97-101). Santa Monica, CA: Human Factors Society.
Endsley, M. R. (1993). A survey of situation awareness requirements in air-to-air combat fighters. International Journal of Aviation Psychology, 3, 157-168.
Endsley, M. R. (1995). Toward a theory of situation awareness in dynamic systems. Human Factors, 37, 32-64.
Endsley, M. R. (2000). Direct measurement of situation awareness: Validity and use of SAGAT. In M. R. Endsley and D. J. Garland (Eds.), Situation awareness analysis and measurement (pp. 147-174). Mahwah, NJ: LEA.
Endsley, M. R., Bolte, B., and Jones, D. G. (2003). Designing for situation awareness: An approach to user-centered design. London: Taylor and Francis.
Ericsson, K. A., and Simon, H. (1993). Protocol analysis: Verbal reports as data (2nd ed.). Cambridge, MA: MIT Press.
Fox, J. (Ed.) (1987). The essential Moreno. New York: Springer.
Gilbreth, F. B. (1911). Motion study. New York: Van Nostrand.
Gilbreth, F. B., and Gilbreth, L. M. (1919). Fatigue study: The elimination of humanity's greatest unnecessary waste. New York: Macmillan.
Hoffman, R. R. (1986). "Extracting experts' knowledge." Report, U.S. Air Force Summer Faculty Research Program.
Hoffman, R. R. (1987). The problem of extracting the knowledge of experts from the perspective of experimental psychology. AI Magazine, 8, 53-67.
Hoffman, R. R. (1998). How can expertise be defined? Implications of research from cognitive psychology. In R. Williams and J. Fleck (Eds.), Exploring expertise: Issues and perspectives. London: Macmillan.
Hoffman, R. R., Coffey, J. W., and Ford, K. M. (2000). "A Case Study in the Research Paradigm of Human-Centered Computing: Local Expertise in Weather Forecasting." Report on the Contract, "Human-Centered System Prototype," National Technology Alliance.
Hoffman, R. R., Crandall, B., and Shadbolt, N. (1998). A case study in cognitive task analysis methodology: The Critical Decision Method for the elicitation of expert knowledge. Human Factors, 40, 254-276.
Hoffman, R. R., and Lintern, G. (2005). Eliciting knowledge from experts. In K. A. Ericsson, N. Charness, P. Feltovich, and R. Hoffman (Eds.), Cambridge handbook of expertise and expert performance. New York: Cambridge University Press.
Hoffman, R. R., Militello, L., and Eccles, D. (2005). "Perspectives on Cognitive Task Analysis." Report to the Advanced Decision Architectures Collaborative Alliance, Army Research Laboratory, Adelphi, MD.
Hoffman, R. R., Shadbolt, N., Burton, A. M., and Klein, G. A. (1995). Eliciting knowledge from experts: A methodological analysis. Organizational Behavior and Human Decision Processes, 62, 129-158.
Hoffman, R. R., Trafton, G., and Roebber, P. (2005). Minding the weather: How expert forecasters reason. Cambridge, MA: MIT Press.
Johnston, N. (2003). The paradox of rules: Procedural drift in commercial aviation. In R. Jensen (Ed.), Proceedings of the Twelfth International Symposium on Aviation Psychology (pp. 630-635). Dayton, OH: Wright State University.
Kidd, A. L., and Cooper, M. B. (1985).
Man-machine interface issues in the construction and use of an expert system. International Journal of Man-Machine Studies, 22, 91-102.
Klein, G., Calderwood, R., and MacGregor, D. (1989). Critical decision method of eliciting knowledge. IEEE Transactions on Systems, Man, and Cybernetics, 19, 462-472.
Kolodner, J. L. (1991). Improving decision making through case-based decision aiding. The AI Magazine, 12, 52-68.
Koopman, P., and Hoffman, R. R. (November/December 2003). Work-arounds, make-work, and kludges. IEEE Intelligent Systems, pp. 70-75.
McClelland, D. C. (1998). Identifying competencies with behavioral-event interviews. Psychological Science, 9, 331-339.
McDonald, N., Corrigan, S., and Ward, M. (2002). Cultural and organizational factors in system safety: Good people in bad systems. Proceedings of the 2002 International Conference on Human-Computer Interaction in Aeronautics (HCI-Aero 2002) (pp. 205-209). Menlo Park, CA: American Association for Artificial Intelligence Press.
Ohanian, R. (1990). Construction and validation of a scale to measure celebrity endorsers' perceived expertise, trustworthiness, and attractiveness. Journal of Advertising, 19, 39-52.
Prerau, D. (1989). Developing and managing expert systems: Proven techniques for business and industry. Reading, MA: Addison Wesley.
Rasmussen, J., Pejtersen, A. M., and Schmidt, K. (1990). Taxonomy for cognitive work analysis (Report No. RIS-M-2871). Roskilde, Denmark: Risø National Laboratory.
Renard, G. (1968). Guilds in the Middle Ages. New York: A. M. Kelley.
Senjen, R. (1988). Knowledge acquisition by experiment: Developing test cases for an expert system. AI Applications in Natural Resource Management, 2, 52-55.
Slade, S. (1991). Case-based reasoning: A research paradigm. The AI Magazine, 42-55.
Stein, E. (1992). A method to identify candidates for knowledge acquisition. Journal of Management Information Systems, 9, 161-178.
Stein, E. (1997). A look at expertise from a social perspective. In P. J. Feltovich, K. M. Ford, and R. R. Hoffman (Eds.), Expertise in context: Human and machine (pp. 181-194). Cambridge, MA: MIT Press/AAAI Books.
Thorndike, E. L. (1920). The selection of military aviators. U.S. Air Service, 1 and 2, 14-17; 28-32, 29-31.
Vicente, K. J. (1999). Cognitive work analysis: Toward safe, productive, and healthy computer-based work. Mahwah, NJ: Erlbaum.
Wood, L. E., and Ford, K. M. (1993). Structuring interviews with experts during knowledge elicitation. In K. M. Ford and J. M. Bradshaw (Eds.), Knowledge acquisition as modeling (pp. 71-90). New York: Wiley.