INTERACTIVE COGNITIVE SYSTEMS METHODS: DOMAIN AND TASK ANALYSIS
Modified from Jim Warren slides

Expert Systems
  Expert System (ES) – a system that is (in some sense) a synthetic expert
    Generally for a very narrow ‘domain’ of knowledge
    Reasons in a way that emulates a human expert
  Uses
    For safety and completeness – as a check, even though the user is an expert too
    To extend the user – because they have less than ideal expertise themselves
    To train or test the user – possibly working on canned data and/or with the system providing explanation of its recommendations
  Can be more acceptable in some domains (e.g. medicine) to call it a ‘decision support system’ (DSS)
    An ES is a type of DSS, but a DSS could be other than an ES
    And not every agent is an ES, but an ES always offers some agency

Example MYCIN rule
  IF the stain of the organism is gram negative
  AND the morphology of the organism is rod
  AND the aerobicity of the organism is anaerobic
  THEN there is strongly suggestive evidence (0.8) that the class of the organism is Enterobacteriaceae.
  This rule has three predicates (yes/no, or Boolean, values that determine if it should fire)
  In this case each predicate involves the equality of a data field about a patient to a specific qualitative value (e.g., [stain of the organism] = ‘gram negative’)
  Note that human expertise is still needed – e.g., to decide that the morphology of the organism is ‘rod’ (not to mention to understand its vocabulary!)
  Notice it produces a new fact (regarding [class of the organism])
  Note this is ‘symbolic reasoning’ – working with concepts as compared to numbers (it’s not like y = x1 + 4.6 x2)

Knowledge and Inference Rules
  Knowledge rules (declarative rules) state all the facts and relationships about a problem
  Inference rules (procedural rules) advise on how to solve a problem, given that certain facts are known
  Inference rules contain rules about rules (metarules)
  Knowledge rules are stored in the knowledge base
  Inference rules become part of the inference engine
  Example: IF more than one rule applies THEN fire the one with the highest priority value first

Your turn! Inference Rules
  The knowledge base contains, amongst other facts:
    green(Kermit).
  The rule base contains, amongst other rules:
    IF green(x) THEN frog(x).
    IF frog(x) THEN hops(x).
  Query: Does Kermit hop?

Deductive reasoning
  Step 1: Knowledge base is examined to see if 'hops(Kermit)' is a recorded fact. It’s not.
  Step 2: Rule base is examined to see if there’s a rule of the form IF A THEN hops(x); x=Kermit. There is, with A=frog(x); x=Kermit. But is the premise 'frog(Kermit)' actually true?
  Step 3: Knowledge base is examined to see if 'frog(Kermit)' is a recorded fact. As with 'hops(Kermit)', it’s not, so it’s again necessary to look instead for an appropriate rule.
  Step 4: Rule base is examined to see if there’s a rule of the form IF A THEN frog(x); x=Kermit. Again, there is a suitable rule, this time with A=green(x); x=Kermit. But now is 'green(Kermit)' true?
  Step 5: Knowledge base is yet again examined, this time to see if 'green(Kermit)' is a recorded fact, and yes, this time the premise is directly known to be true. One can therefore finally conclude that the original assertion 'hops(Kermit)' was also true.
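
The five steps above are an instance of backward chaining: start from the query, look for a recorded fact, and otherwise look for a rule whose conclusion matches and try to prove its premise. A minimal Python sketch of that loop follows. The names FACTS, RULES and prove are made up for illustration, and rules are restricted to a single premise because that is all the Kermit example needs; a real expert system shell does considerably more.

```python
# Facts and rules from the Kermit example.
# Assumption: every rule has exactly one premise, IF premise(x) THEN conclusion(x).
FACTS = {("green", "Kermit")}
RULES = [("green", "frog"), ("frog", "hops")]  # (premise, conclusion)


def prove(predicate, subject, facts=FACTS, rules=RULES, trace=None, seen=None):
    """Return True if predicate(subject) can be derived by backward chaining."""
    trace = trace if trace is not None else []
    seen = seen if seen is not None else set()
    goal = (predicate, subject)

    # First look in the knowledge base for a recorded fact.
    if goal in facts:
        trace.append(f"{predicate}({subject}) is a recorded fact")
        return True

    # Guard against circular rule sets (not needed for Kermit, but cheap).
    if goal in seen:
        return False
    seen.add(goal)
    trace.append(f"{predicate}({subject}) is not a recorded fact")

    # Otherwise look for a rule that concludes the goal and prove its premise.
    for premise, conclusion in rules:
        if conclusion == predicate:
            trace.append(f"trying rule IF {premise}(x) THEN {conclusion}(x)")
            if prove(premise, subject, facts, rules, trace, seen):
                return True
    return False


steps = []
print(prove("hops", "Kermit", trace=steps))  # prints True
print("\n".join(steps))
```

The trace collected in steps has one entry per step of the walkthrough, which is also roughly what a simple explanation facility would show the user.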

Explanation Facility
  Ability to explain its recommendations has always been considered part of an ‘expert system’
  Useful for
    Tutoring – the student can learn what rules applied to the case at hand
    Use in practice – the doctor (or other expert user) can confirm that they concur with the reasoning (and possibly then translate it for the patient as appropriate)
    Accuracy/safety/debugging – the user might see an error in the explanation
    Because it’s something a human expert is also expected to do
  Readability
    Unfortunately, an automatically generated explanation can be rather difficult to read
    Aided by use of good predicate names in the production rules
  Provenance
    Ideally, the system should allow links to supporting literature to support the rules themselves (well, in a serious evidence based domain like medicine it should)
    And it’s important that we know who was involved in the knowledge engineering, and how the system has been tested
    PREDICT – based on population data

Knowledge Engineering (KE)
  KE – developing (and also potentially maintaining) a knowledge-based system, especially an Expert System
  A key aspect is Knowledge Acquisition (KA)
    Getting the rules and heuristics (and their confidence levels)
    Best to have the participation of a live expert (or preferably a panel of them)
    Suboptimal to work only from a textbook or printed guideline, esp. if the knowledge engineer has only a technical (e.g. not clinical or whatever domain) background
    This can lead to a ‘bottleneck’ (experts tend to be in-demand and expensive!)
    Corresponds closely to the ‘Discovery’ phase in user interface / user experience design (ref Heim, The Resonant Interface)

Discovery
  During the collection portion you will formally identify:
    The people who are involved with the work
    The things they use to do the work
    The processes that are involved in the work
    The information required to do the work
    The constraints imposed on the work
    The inputs required by the work
    The outputs created by the work

Organizing the Discovery Process
  Filters
    Physical: We can describe the physical aspects of the activity.
      Where is it done?
      What objects are involved?
    Cultural: We can look at the activity in terms of the relationships among the people involved.
      Are some people in a position to orchestrate and evaluate the performance of other people?
    Functional: We can also look at these activities in terms of what actually happens.
      Do some people create things?
      Do other people document procedures and communications?
    Informational: We can look at these activities in terms of the information that is involved.
      What information is necessary to perform a task?
      How does the information flow from one person to another?
      How is the information generated?
      How is the information consumed?

KA/Discovery techniques
  In addition to reviewing documents (e.g., guidelines)…
  Interviewing
    Just ask the expert how they solve problems in the domain; may first develop a number of reference cases to guide the conversation
  Protocol analysis
    Have the expert ‘think aloud’ as they solve a problem (or have two experts talk to each other as they solve the problem)
    Make a video (be sure audio is good!)

Collection – Observation
  Direct: Ethnographic methods involve going to the work site and observing the people and the infrastructure that supports the work flow
  Indirect: You can use indirect methods of observation by setting up recording devices in the work place
    The use of indirect methods may require a significant degree of transparency

KA synthesis and verification
  Once you’ve identified some initial concepts and rules, begin formalisation and review with experts
    Process diagrams – make a flowchart (or task analysis) of the process that’s followed for key decisions/actions
    Ontology formulation – collect all the concepts in the domain into a database (Protégé is a tool to support this)
      Often a concept will have more than one term (word or phrase used to refer to the concept)
    Create influence diagrams (show the relationship of factors to one another, to actions, and to outcomes)
  At that point you’ll have a good foundation for formalising the ES/agent’s decision knowledge

Program flowcharts for process modeling
  Simple notation to show order of steps in a process
  Use diamond to indicate a decision and include two outbound branches
  Use rectangle with double sidebars to indicate details are defined elsewhere
    E.g., Compute dose
  Note: fitting clinical workflow is a CDSS (clinical decision support system) success factor! [Kawamoto, 2005]

Interpretation - Task Analysis
  Task analysis is a way of documenting how people perform tasks
  A task analysis includes all aspects of the work flow
  It is used to explore the requirements of the proposed system and structure the results of the data collection phase
  Task decomposition
    A linear description of a process that captures the elements involved as well as the prevailing environmental factors.
  Hierarchical task analysis (HTA)
    HTA provides a top-down, structured approach to documenting processes.

Task decomposition
  Include the following in a task decomposition
    The reasons for the actions
    The people who perform the actions
    The objects or information required to complete the actions
  Task decompositions should capture:
    The flow of information
    Use of artefacts
    Sequence of actions and dependencies
    Environmental conditions
    Cultural constraints

Task decomposition elements
  Goal – top-level goal of the task being analysed
  Plans – the order and conditions for proceeding with the sub-tasks
  Information – all the information needed to undertake the task
  Objects – all the physical objects involved
  Methods – the various ways of doing the sub-tasks

Textual HTA description (from Dix et al. Human-Computer Interaction, 3rd ed.)
  Hierarchy description ...
    0. in order to clean the house
      1. get the vacuum cleaner out
      2. get the appropriate attachment
      3. clean the rooms
        3.1. clean the hall
        3.2. clean the living rooms
        3.3. clean the bedrooms
      4. empty the dust bag
      5. put vacuum cleaner and attachments away
  ... and plans
    Plan 0: do 1 - 2 - 3 - 5 in that order. when the dust bag gets full do 4
    Plan 3: do any of 3.1, 3.2 or 3.3 in any order depending on which rooms need cleaning
  N.B. only the plans denote order

Generating the hierarchy
  1 get list of tasks
  2 group tasks into higher level tasks
  3 decompose lowest level tasks further
  Stopping rules
    How do we know when to stop? Is “empty the dust bag” simple enough?
    Purpose: expand only relevant tasks
    Motor actions: lowest sensible level

Tasks as explanation
  imagine asking the user the question: what are you doing now?
  for the same action the answer may be:
    typing ctrl-B
    making a word bold
    emphasising a word
    editing a document
    writing a letter
    preparing a legal case

Diagrammatic HTA

Refining the description
  Given initial HTA (textual or diagram)
  How to check / improve it?
  Some heuristics:
    paired actions – e.g., where is ‘turn on gas’
    restructure – e.g., generate task ‘make pot’
    balance – e.g., is ‘pour tea’ simpler than making pot?
    generalise – e.g., make one cup … or more

Refined HTA for making tea

Types of plan
  fixed sequence - 1.1 then 1.2 then 1.3
  optional tasks - if the pot is full 2
  wait for events - when kettle boils 1.4
  cycles - do 5.1 5.2 while there are still empty cups
  time-sharing - do 1; at the same time ...
  discretionary - do any of 3.1, 3.2 or 3.3 in any order
  mixtures - most plans involve several of the above

Entity-Relationship Techniques
  Focus on objects, actions and their relationships
  Similar to OO analysis, but …
    includes non-computer entities
    emphasises domain understanding not implementation
  Running example: ‘Vera's Veggies’ – a market gardening firm
    owner/manager: Vera Bradshaw
    employees: Sam Gummage and Tony Peagreen
    various tools including a tractor ‘Fergie’
    two fields and a glasshouse
    new computer controlled irrigation system

Objects
  Start with list of objects and classify them:
    Concrete objects:
      simple things: spade, plough, glasshouse
    Actors:
      human actors: Vera, Sam, Tony, the customers
      what about the irrigation controller?
    Composite objects:
      sets: the team = Vera, Sam, Tony
      tuples: tractor may be < Fergie, plough >

Attributes
  To the objects add attributes:
    Object Pump3 simple – irrigation pump
      Attributes:
        status: on/off/faulty
        capacity: 100 litres/minute
  N.B. need not be computationally complete

Actions
  List actions and associate with each:
    agent – who performs the actions
    patient – which is changed by the action
    instrument – used to perform action
  examples:
    Sam (agent) planted (action) the leeks (patient)
    Tony dug the field with the spade (instrument)

Actions (ctd)
  implicit agents – read behind the words
    ‘the field was ploughed’ – by whom?
  indirect agency – the real agent?
    ‘Vera programmed the controller to irrigate the field’
  messages – a special sort of action
    ‘Vera told Sam to ... ’
  rôles – an agent acts in several rôles
    Vera as worker or as manager

example – objects and actions
  Object Sam human actor
    Actions:
      S1: drive tractor
      S2: dig the carrots
  Object Vera human actor – the proprietor
    Actions: as worker
      V1: plant marrow seed
      V2: program irrigation controller
    Actions: as manager
      V3: tell Sam to dig the carrots
  Object the men composite
    Comprises: Sam, Tony
  Object glasshouse simple
    Attribute:
      humidity: 0-100%
  Object Irrigation Controller non-human actor
    Actions:
      IC1: turn on Pump1
      IC2: turn on Pump2
      IC3: turn on Pump3
  Object Marrow simple
    Actions:
      M1: germinate
      M2: grow

Events
  … when something happens
    performance of action
      ‘Sam dug the carrots’
    spontaneous events
      ‘the marrow seed germinated’
      ‘the humidity drops below 25%’
    timed events
      ‘at midnight the controller turns on’

Relationships
  object-object
    social - Sam is subordinate to Vera
    spatial - pump 3 is in the glasshouse
  action-object
    agent (listed with object)
    patient and instrument
  actions and events
    temporal and causal
      ‘Sam digs the carrots because Vera told him’
  temporal relations
    use HTA or dialogue notations.
    show task sequence (normal HTA)
    show object lifecycle

example – events and relations
  Events:
    Ev1: humidity drops below 25%
    Ev2: midnight
  Relations: object-object
    location ( Pump3, glasshouse )
    location ( Pump1, Parker’s Patch )
  Relations: action-object
    patient ( V3, Sam ) – Vera tells Sam to dig
    patient ( S2, the carrots ) – Sam digs the carrots ...
    instrument ( S2, spade ) – ... with the spade
  Relations: action-event
    before ( V1, M1 ) – the marrow must be sown before it can germinate
    triggers ( Ev1, IC3 ) – when humidity drops below 25%, the controller turns on pump 3
    causes ( V2, IC1 ) – the controller turns on the pump because Vera programmed it

Interpretation - Storyboarding
  Storyboarding involves using a series of pictures that describes a particular process or work flow
  Can be used to study existing work flows or generate requirements.
  Can facilitate the process of task decomposition
  Used to brainstorm alternative ways of completing tasks.

Storyboard
  Example of a method for people who don’t own a cell phone handset to buy access for voice, SMS, etc. - http://www.dexigner.com/news/20788

Interpretation – Use Cases
  Use cases represent a formal, structured approach to interpreting work flows and processes
  Designed to describe a particular goal and explore the interaction between users and the actual system components.
  Jacobson et al. (1992)
  Incorporated into the Unified Modeling Language (UML) standard.
  The two main components of use cases are the actors and the use cases that represent their goals and tasks.
    Actors: similar to stakeholders, but can also include other systems, networks, or software that interacts with the proposed system.
    Use Cases: Each actor has a unique use case, which involves a task or goal the actor is engaged in.
      Describe discrete goals that are accomplished in a short time period
      Describe the various ways the system will be used and cover all of the potential functionality being built into the design
  Use case diagram of “schedule a meeting” process.
    Notice we use a stickman symbol for the equipment!

Use cases
  The diagram provides an overview of the entities and their relationships through activities
  This can be used to develop and explore scenarios
    Basic path – the steps proceed without diversions from error conditions
    Alternative paths – branches related to premature termination, choosing a different method of accomplishing a task, etc.
      E.g., what if the equipment isn’t available?

OWL
  Web Ontology Language builds on RDF
  It’s for machine, not human, interpretation
  It’s a knowledge representation method
    Defines classes, their properties and individuals (members of a class)
    Defines relationships between classes, cardinality, equality
    Allows restrictions, e.g., to define the data type of a property
  E.g., this ontology snippet defines a latitude property for an airport and restricts it to a floating point (‘double’ precision) number
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#latitude"/>
        <owl:allValuesFrom rdf:resource="http://www.w3.org/2001/XMLSchema#double"/>
      </owl:Restriction>
    </rdfs:subClassOf>

Ontology
  An ontology – a specification of a conceptualization
  Term borrowed by AI from philosophy (where it is the study of existence); often confused with epistemology (study of knowledge and knowing)
  Key is what an ontology is for
    An ontology is a description of concepts and relationships for a community
    Using an ontology is a commitment to use a vocabulary in a way consistent with that ontology’s specific theory
  From Tom Gruber: http://www-ksl.stanford.edu/kst/what-is-an-ontology.html

Relationships in Ontologies
  Ontologies can be chiefly taxonomic hierarchies of classes
    Subsumption (is-a): a duck ‘is-a’ bird. ‘bird’ completely subsumes ‘duck’ (bird is a superclass of duck; duck is a subclass of bird)
  Other standard relationships
    ‘Part-of’ – a wing is part-of a bird
    ‘Successor’ – 1988 was the successor year to 1987
  Multiple parents
    A wing may also be part-of an aeroplane
    Tuberculosis is-a lung disease and is-a infectious disease
  Domain-specific relationships
    In a clinical ontology, we may define a symptom as ‘pathognomonic’ of a disease

Constraints and Axioms
  Constraints may apply to the ‘properties’ (attributes) of a class in an ontology
    Cardinality constraint (a patient must have exactly one [or at least not more than one] date of birth)
    Value constraints (an SBP is non-negative)
  Class axioms
    E.g., a class may be declared to be disjoint from another (no ‘is-a’ child in common)
  Reasoners
    Software can check the assertions in an ontology and identify implied relationships and also ‘exceptions’ (constraint and axiom violations)
    Protégé is a popular free tool (out of Stanford) for managing ontologies stored in OWL

Protégé example
  From Mabotuwana and Warren, AI in Medicine, 2009
  Modelled heart disease risk management
    The drugs
    The problem / diagnoses
    The clinical terminology systems
    The patient instances
  We actually migrated electronic medical records into the ontology and created rules to identify poorly managed patients right in Protégé
    Can also integrate the OWL with Java using JENA
  In most cases, however, the ontology is just for documentation
    A simpler scheme for this is simply a data dictionary

Influence diagrams
  Arrows indicate influence (generally positive correlation / increased probability)*
  * from http://www.cs.ru.nl/~peterl/aisb.pdf

Decision Trees
  Essentially flowcharts
  A natural order of ‘micro decisions’ (Boolean – yes/no decisions) to reach a conclusion
  In simplest form all you need is
    A start (marked with an oval)
    A cascade of Boolean decisions (each with exactly two outbound branches)
    A set of decision nodes (marked with ovals) representing all the ‘leaves’ of the decision tree (no outbound branches)

Knowledge Engineering (KE) problems for flowchart
  Natural language may pack a lot in (if we try to do KE on a clinical practice guideline or other document written for humans)
    E.g., “any one of the following”
    Even harder if they say “two or more of the following”, which implies they mean to compute some score and then ask if it’s >=2
  Incompleteness
    Are there logically possible (or, worse, physically possible) cases that aren’t handled?
    ‘For example’ is a worry in guideline text, although an ‘all others’ can be interpreted as ensuring completeness here
  Inconsistency
    Natural language statements may contradict each other when followed mechanically
    And are we trying to reach one decision or a set of decisions?

KE reflection
  A guideline that may be perfectly usable to an expert may readily be misinterpreted by someone without the appropriate background
    Need a human expert to verify the knowledge engineering
  Humans can smooth over vagueness and choose the ‘sensible’ interpretation (a computer cannot)
  A decision that looks to have just a couple of parts may hide a wealth of complexity

Decision Tables
  A flowchart can get huge
  We can pack more into a smaller space if we relinquish some control on indicating the order of microdecisions
  A decision table has
    One row per ‘rule’
    One column per decision variable
    An additional column for the decision to take when that rule evaluates to true

Decision Table example
  d = doesn’t matter (True or False)
  From van Bemmel & Musen, Ch 15

Flowcharts v. Tables
  Decision table is not as natural as a flowchart
  Decision table gets us close to production rule representation
    Good as design specification to take to an expert system shell
  Completeness is more evident with a flowchart
    But a ‘real’ (complete and consistent) flowchart ends up very large (or representing a very small decision)
  Decision table could allow for multiple rules to simultaneously evaluate to true
    Messy on a flowchart (need multiple charts, or terminals that include every possible combination of decision outcomes)
  Applying either in practice requires KE in a broad sense
    E.g., may need to reformulate the goals of the guideline

Decision tree & table summary
  Decision trees are a basic design-level knowledge representation technique for ‘logical’ (rule based, Boolean-predicate-driven) decisions
  Decision tables let you compactly compile a host of decisions on a fixed set of decision variables
  These take you very close to the representation needed to encode production rules for an inference engine
  Rule induction from data provides an alternative to conventional Knowledge Engineering
    Computer figures out rules that fit past decisions instead of you pursuing experts to ask them what rules they use
    But requires lots of data with useful decision attributes, and typically doesn’t much resemble how a human would make the decision

Testing, maintenance and evaluation
  You are a long way from having a CDSS ‘success’ in healthcare delivery just because you’ve built an ES (or any decision support software)
  Is it logically correct?
    A systematic process will have been helpful, but there will still be errors
  Is it going to be usable?
    Fit clinical workflow, be efficient and natural for users?
  Is it going to improve performance?
  If all that works out, you still need to maintain it
    Clinical practice changes
    And you will identify opportunities to improve the tool (and almost certainly to make minor corrections, too)

Testing
  White box testing
    Testing with an understanding of the way the system is put together
    Design test cases that try out all the rules
  Black box testing
    Test without knowledge of (or sympathy for!) the structure of the system
    Ideally done by people removed from the main KA/KE process
    Obviously, for clinical expert systems, clinical people are the appropriate black box testers (this can be expensive)

Maintenance
  An expert system can be dead, but it can never be complete
  And medical knowledge doesn’t sit still
  A non-trivial rule base cannot be proved correct
    Must always have a feedback process for end-users to readily report issues for investigation
  MYCIN is now an expert that’s >30 years out of date!
  So production expert systems need a routine process of review, revision and re-distribution
    Last step easier if the system’s web based, but at least users need a way to find out about the latest changes
    And when you change something, you have to re-test everything

The major assignment
  30% of mark
  Initial analysis (5%) report due Tuesday 7th April, 11.59pm
  Final report and video (25%) due Sunday 31st May, 11.59pm
  What we’re looking for
    Something that ‘interacts’ with the user
    And keeps a model of the user over time (from day to day, or just from response to response)
    Make it narrow (but make it interesting!)
    Don’t try for a broad natural language dialog

Assignment deliverables
  Overview and motivation (include draft version in first deliverable)
    500-800 words, including a few references
    What’s it do and why is it interesting to try to do that?
  Analysis components (include draft version in first deliverable)
    Use 2-4 methods from this lecture to document the domain and the task interaction (use words, too; not just diagrams alone)
    Be clear on where you’re documenting current (pre-intelligent-agents) versus proposed way of working
  Design components (include proposed design in first deliverable)
    Overall architecture – major components and how they interact with each other and the user
    User modelling – what do you represent? How do you acquire and update that information?
    Planning logic – how does the system decide what to do next (including consideration of user model)?

Assignment deliverables (contd.)
  Practical (final deliverable only)
    Illustrate the system (text and log or screenshots) in terms of realistic cases and provide an informal critique of strengths and weaknesses of the approach (this serves as testing)
    State the limitations of the implementation (things you’d do if you had more time, or knew how)
    Create a video of 2-3 minutes duration illustrating the system
  The team
    Include a brief summary of what each team member did (this is in the group-authored deliverable)
    Peer assessment: separate from the group submission, give a distribution of 100% across the members of your team (e.g. 25/25/25/25 if you felt it was absolutely equal); email to ykoh@cs.auckland.ac.nz with subject “765 peer assessment [your UPI]”
  Submission
    As a PDF report with all your names and UPIs, plus (ideally) link to video, sent to ykoh@cs.auckland.ac.nz (please send peer assessments as separate individual emails)

Tools
  CLIPS
    http://clipsrules.sourceforge.net/
    CLIPS is a productive development and delivery expert system tool which provides a complete environment for the construction of rule and/or object based expert systems.
    CLIPS can be ported to any system which has an ANSI compliant C or C++ compiler.
  JESS
    http://herzberg.ca.sandia.gov/
    Some limitations, but gives a basic production rule capability with links to Java
    Can invoke from its own environment or use as API from a Java program
  Pyke
    Introduces a form of Logic Programming (inspired by Prolog) to the Python community by providing a knowledge-based inference engine (expert system) written in 100% Python.
  Other: LISP (maybe with Lisa), PROLOG, you tell me
    Do NOT recommend just diving in with C#
    And Pat and I are not programming language tutors, sorry