Knowledge Acquisition and Modelling
Introduction to Knowledge Acquisition and Elicitation

DIKW (Data, Information, Knowledge, Wisdom)
- Pyramid
- Hierarchy
- Framework
- Continuum

Data, Information, Knowledge, Wisdom
- Data
  - Is raw: it simply exists and has no significance beyond its existence (in and of itself).
  - Example: It is raining.
- Information
  - Data that has been given meaning by way of relational connection.
  - This "meaning" can be useful, but does not have to be.
  - Example: The temperature dropped 15 degrees and then it started raining.

Data, Information, Knowledge, Wisdom
- Knowledge
  - The appropriate collection of information, such that its intent is to be useful.
  - Example: If the humidity is very high and the temperature drops substantially, the atmosphere is often unlikely to be able to hold the moisture, so it rains.

“Knowledge is a fluid mix of framed experience, values, contextual information, expert insight and grounded intuition that provides an environment and framework for evaluating and incorporating new experiences and information. It originates and is applied in the minds of knowers. In organizations it often becomes embedded not only in documents and repositories but also in organizational routines, processes, practices and norms.”
Wallace, Danny P. (2007). Knowledge Management: Historical and Cross-Disciplinary Themes.

Data, Information, Knowledge, Wisdom
- Understanding
  - Cognitive and analytical.
  - The way you can take knowledge and synthesize new knowledge from previously held knowledge.
- Wisdom
  - Calls upon all previous experience and the previous levels of consciousness, and upon special types of human programming (moral and ethical codes, etc.).
  - Example: It rains because it rains.

Transition Example
I have a box.
The box is 3' wide, 3' deep, and 6' high.
The box is very heavy.
The box has a door on the front of it.
When I open the box it has food in it.
It is colder inside the box than it is outside.
You usually find the box in the kitchen.
There is a smaller compartment inside the box with ice in it.
When you open the door the light comes on.
When you move this box you usually find lots of dirt underneath it.
Junk has a real habit of collecting on top of this box.
What is it?

Types of Knowledge
- Procedural
  - "How to", e.g. I know how to drive a car
  - Processes, tasks, activities
  - The conditions under which tasks are performed
  - The sequence of tasks
- Conceptual
  - "I know that …"
  - About the ways in which things (concepts) are related to each other, and their properties

Types of Knowledge
- Explicit
  - Knowledge at the forefront of a person’s brain
  - Thought about in a deliberate, conscious way
  - Concerned with basic tasks, basic relationships between concepts, basic properties of concepts
  - Not difficult to explain
- Tacit
  - Deep, embedded knowledge
  - At the back of a person’s brain
  - Built from experience rather than being taught
  - Gained through practice
  - Leads to activities which seem to require no conscious thought at all

Types of Knowledge
- Procedural Knowledge
  - How to boil an egg
  - How to interview an expert
  - How to tie a shoelace
- Conceptual Knowledge
  - E = mc²
  - The properties of knowledge
  - The position of keys on a keyboard
- Basic, Explicit Knowledge
  - How to boil an egg: simple task, easily explained
  - E = mc²: simply relates concepts
- Deep, Tacit Knowledge
  - How to tie a shoelace: requires demonstration with commentary
  - The position of keys on a keyboard: most people know this subconsciously but few consciously
Taken from Knowledge Acquisition in Practice: A Step By Step Guide, Milton, Springer-Verlag

Exercise
- Working in groups for 10 mins
- Create a version of the previous slide with examples of your own

Knowledge Acquisition
- First need to determine what that knowledge is
  - This is the process of Knowledge Acquisition and Elicitation
  - A non-trivial process
- The information is often locked away in the heads of people (domain experts)
- The experts themselves may not be aware of the implicit conceptual models that they use
- Have to draw out and make explicit all the known knowns, unknown knowns, etc.

Example
“There are known knowns.
These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know.”
Donald Rumsfeld, 2002 (US Secretary of Defense 2001 to 2006)

Knowledge Acquisition
- Capturing knowledge about a subject domain
  - From people
  - And other sources
- Using this to create a store of knowledge
  - Usable by many different applications, users and benefits
  - Does not have to be a database
  - Can be a knowledge web, ontology, knowledge document, etc.

Eliciting Knowledge
- Most knowledge is in the heads of people
- People have vast amounts of knowledge
- People have a lot of tacit knowledge
  - They don't know all that they know and use
  - Tacit knowledge is hard (impossible) to describe
- People with knowledge in organisations are usually very busy and valuable people
- Each person doesn't know everything

Difficulties of Knowledge Acquisition
People find it difficult to:
- Express their knowledge in a manner fully comprehensible to the person who wishes to acquire it
- Know exactly what the person wants
- Give the right level of detail
- Present ideas in a clear and logical order
- Explain all the jargon and terminology of the subject domain
- Recall everything relevant to the project/topic at hand
- Avoid drifting into talking about irrelevant things

Difficulties of Knowledge Acquisition
The person attempting to acquire knowledge from someone finds it difficult to:
- Understand everything the person says
- Note down everything the person says
- Keep the person talking about relevant issues
- Maintain the high level of concentration needed
- Check they have fully understood what has been said

Difficulties of Knowledge Acquisition
- Arise due to human cognition and communication
- Humans are good at communication and at performing complex activities
- Not good at communicating complex activities to those not from the same subject areas

Knowledge Acquisition Bottleneck
- Nothing happens until knowledge is acquired
- Sources of knowledge are unreliable
- Knowledge bases are hard to build
  - Domain experts provide incomplete, even incorrect knowledge
  - Domain experts may not be able to articulate their knowledge
  - Computational knowledge representations are complex
- Techniques
  - Limited range
  - Ignorance

Knowledge Acquisition Bottleneck
- Narrow bandwidth. The channels available to convert organizational knowledge from its source (experts, documents, or transactions) are relatively narrow.
- Acquisition latency. The slow speed of acquisition is frequently accompanied by a delay between the time when knowledge (or the underlying data) is created and when the acquired knowledge becomes available to be shared.
- Knowledge inaccuracy. Experts make mistakes, and so do the tools used to mine data and information. Maintenance can introduce inaccuracies or inconsistencies into previously correct knowledge bases.
- Maintenance trap. As a knowledge base grows, so does the requirement for maintenance. Previous updates that were made with insufficient care and foresight accumulate and render future maintenance more difficult.
As summarised by Christian Wagner in his paper Breaking the Knowledge Acquisition Bottleneck Through Conversational Knowledge Management, 2006.

Terminology - Knowledge Acquisition
- A method of learning (Aristotle)
- For our purposes: the elicitation, collection, analysis, modelling and validation of knowledge for use in a project
- The process of obtaining all data, information and knowledge to get a consistent view of a person solving a problem
- Identifying sources, vetting for quality, combining findings …

Terminology - Knowledge Elicitation
- Sub-set of Acquisition
- Focuses on retrieving knowledge from humans (usually experts)
- Lots of tacit knowledge

Terminology - Knowledge Codification
- Representing knowledge in some form
  - Model
  - Rules
  - Ontology
  - Video
  - Presentation
  - etc.

Terminology - Knowledge Capture
- Can be used instead of Acquisition or Codification
- Generic term covering aspects of all three previous terms

Terminology – Knowledge Engineering
- Feigenbaum and McCorduck, 1983
- Integrating knowledge into a computer system
- To solve problems that require extensive human expertise
- Typically building a knowledge-based system
- Shares a lot with software engineering
Feigenbaum, Edward A.; McCorduck, Pamela (1983), The fifth generation (1st

© 2005 Prentice Hall, Decision Support Systems and Intelligent Systems, 7th Edition, Turban, Aronson, and Liang

Knowledge Sources
- Documented
  - Written, viewed, sensory, behavior
- Undocumented
  - Memory
- Acquired from
  - Human senses
  - Machines

Knowledge Levels
- Shallow
  - Surface level
  - Input-output
- Deep
  - Problem solving
  - Difficult to collect, validate
  - Interactions between system components

Knowledge Categories
- Declarative
  - Descriptive representation
- Procedural
  - How things work under different circumstances
  - How to use declarative knowledge
  - Problem solving
- Metaknowledge
  - Knowledge about knowledge

Knowledge Engineers
- Professionals who elicit knowledge from experts
- Integrate knowledge from various sources
- Empathetic, patient
- Broad range of understanding, capabilities
- Create and edit code
- Operate tools
- Build the knowledge base
- Validate information
- Train users

Typical problems addressed
Problem type: Description
- Diagnosis: Inferring malfunctions of an object from its behaviour and recommending solutions.
- Selection: Recommending the best option from a list of possible alternatives.
- Prediction: Predicting the future behaviour of an object from its behaviour in the past.
- Classification: Assigning an object to one of the defined classes.
- Clustering: Dividing a heterogeneous group of objects into homogeneous subgroups.
- Optimisation: Improving the quality of solutions until an optimal one is found.
- Control: Governing the behaviour of an object to meet specified requirements in real-time.

Example
Algorithm: a strategy, consisting of a series of steps, guaranteed to find the solution to a problem, if there is a solution.
Example: How do you find the area of a triangular board, standing up vertically with one edge on the ground?
Measure the length of the edge on the ground, multiply it by the vertical height, and divide by two. The answer will be exactly right, every time, which makes it an algorithm.

Example
Heuristic: a strategy to find the solution to a problem which is not guaranteed to work.
- One sort of heuristic usually gives you the right answer but sometimes gives you the wrong answer.
- Another sort gives you an answer which isn’t 100% accurate.
Example: How old are you?
Subtract the year you were born in from 2012. The answer will either be exactly right, or one year out (one too many, if you have not yet had your birthday that year), which makes it a heuristic.

Knowledge Systems Analysis and Design
- Davis’ law: “For every tool there is a task perfectly suited to it”.
- But… it would be too optimistic to assume that for every task there is a tool perfectly suited to it.

Knowledge Acquisition – Why a Collaborative Process?
KEY DIFFERENCE between knowledge-based systems and other types of software
- Knowledge engineer logic
  - Try to identify global solutions, which are appropriate and can be made legitimate for all possible contexts.
  - Aim at obtaining knowledge models which are transparent, objective, and which consider a finite number of factors.
- Domain expert logic
  - Usually oriented towards the individual case of their daily working processes, e.g. the individual patients.
  - Knowledge optimized for solutions that are appropriate for the given situation.
  - Try to consider as many factors as possible and are tolerant against inconsistencies.

Knowledge Acquisition – Why a Collaborative Process?
- Complex and highly specialized domains, e.g. medicine
  - Characterized by a distribution of knowledge between domain experts.
  - Different experts – even from one and the same discipline – will have their own personal preferences and mental models.
  - E.g. specialists for anesthesiology will rarely presume to build knowledge models for cardiac surgery.
- Different perspectives
  - Improve the quality of the resulting systems, and so ensure that the systems will meet the requirements of different user groups, especially from both the technical and the application domain.
  - Domain experts must ensure that the system will be accepted and trusted by their peers.
  - E.g. will a conservative user group of medical doctors reject a clinical decision-support system which is solely designed from an engineer’s perspective?

Knowledge Acquisition – Why a Collaborative Process?
“Knowledge is commonly socially constructed, through collaborative efforts toward shared objectives or by dialogues and challenges brought about by differences in persons’ perspectives.”
Gavriel Salomon, Distributed Cognitions: Psychological and Educational Considerations. Cambridge University Press, 1993
- Knowledge modeling must be heavily based on communication and will usually require compromises.
- Models are “negotiated in a social relationship”
  Rammert, Relations that constitute technology and media that make a difference: Toward a social pragmatic theory of technicization, 1999
- Negotiation is often difficult
KEY POINT: Experience shows that the bottleneck of building knowledge systems lies more in the social process than in the technology.

Human Cognition - Bernd Schmidt
- Human cognition and scientific theory construction are iterative processes
- Human cognition is driven by feedback
  - It is based on the construction of theoretical models exposed to experimental data from real or simulated worlds.
  - Theories must be validated or updated if new observations are made.
- Experimental acquisition of case data is essential in many scientific disciplines
  - The choice of experiments and the construction of simulation models have an impact on the resulting theoretical models.

Knowledge Acquisition – Why an Evolutionary Process?
- Acquisition as a kind of theory construction
- Human experts have to construct formal theories about the domain
  - Backed by knowledge which either resides informally in their heads or can be acquired from some other knowledge source.
- The resulting knowledge model is part of a knowledge-based system which can operate in real or simulated worlds.
- Tests in both worlds produce feedback which allows the domain expert to revise the knowledge models.
- When installed in the real application scenario, the system even changes the real world and thus produces new requirements, which recursively suggest changes to the knowledge model.
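
The test-and-revise cycle on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from the slides: the one-rule "fever" model, the `Case` record, the 0.1 degree step and the sample cases are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Case:
    """One test case (hypothetical): an observable plus the expert's judgement."""
    temperature_c: float
    expert_says_fever: bool


def model_says_fever(temperature_c: float, threshold_c: float) -> bool:
    """A one-rule knowledge model (assumption): fever at or above the threshold."""
    return temperature_c >= threshold_c


def disagreements(cases: List[Case], threshold_c: float) -> List[Case]:
    """Feedback step: the cases where the model and the expert disagree."""
    return [
        c for c in cases
        if model_says_fever(c.temperature_c, threshold_c) != c.expert_says_fever
    ]


def revise_until_consistent(
    cases: List[Case], threshold_c: float, step: float = 0.1, max_rounds: int = 100
) -> Tuple[float, int]:
    """Test the model, then revise the threshold in the direction the feedback suggests."""
    for round_no in range(max_rounds):
        wrong = disagreements(cases, threshold_c)
        if not wrong:
            return threshold_c, round_no
        missed = sum(1 for c in wrong if c.expert_says_fever)
        false_alarms = len(wrong) - missed
        # Missed fevers mean the threshold is too high; false alarms mean too low.
        threshold_c += -step if missed >= false_alarms else step
        threshold_c = round(threshold_c, 1)
    return threshold_c, max_rounds


if __name__ == "__main__":
    cases = [Case(36.8, False), Case(37.2, False), Case(38.1, True), Case(39.0, True)]
    threshold, rounds = revise_until_consistent(cases, threshold_c=39.5)
    print(f"threshold settled at {threshold} after {rounds} revisions")
```

Each pass through the loop is one round of "test, collect feedback, revise". In a real project the revision would be made by the domain expert rather than by a fixed step, and new cases would keep arriving once the system is in use.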
Knowledge Acquisition – Why an Evolutionary Process?
- We do not understand how humans carry out reasoning tasks
  - Makes it difficult to set out a detailed specification for an artefact to imitate humans
- Potential users are often unable to assess the benefits or usage scenarios of the new system
  - Especially when they are inexperienced computer users
- The artefact modifies the work processes in which it is installed
  - Users modify their environment and their use of the system
  - A new working culture emerges
  - Requirements change => knowledge models must be updated

Knowledge Acquisition – Why an Evolutionary Process?
- The process cannot be completely planned
  - Different and unknown cognitive and social perspectives
  - Hard to predict
  - Often based on incorrect assumptions
- Domain experts are required to transparently expose their daily practice, but this “practice necessarily operates with deception”
- Every resulting artefact is only an approximation of reality, and the actors involved in the process speak different “languages”

Knowledge Acquisition – Why an Evolutionary Process?
- Knowledge is inherently complex and vague
  - Especially in non-deterministic domains, e.g. medicine
- Computers require formal data structures which can be evaluated
  - E.g. threshold values of patient observables
  - Experts tend to use trial-and-error methods to determine such thresholds, until the system exhibits the expected behavior
- Cannot predict progress, which may change beliefs in the knowledge base (KB)

Knowledge Acquisition – Why an Evolutionary Process?
- The knowledge modeling process itself produces new knowledge
  - Self-observation performed during analysis of the existing work processes can lead to new insights
  - Knowledge is being translated and reorganized => it evolves in the process of being encoded and formatted for the system
- Existing work processes are challenged when analyzed, which can lead to redesign during acquisition
- Installation of knowledge-based systems may require “digitization” of the data flow in the process
  - E.g. installing a neural network, addition of a database, creation of a data warehouse

Knowledge Acquisition – Why an Evolutionary Process?
- Knowledge cannot be mined and processed like a raw material, but rather comes into existence during the communication
- Communication will influence the resulting artefacts
- The process is characterized by reciprocities between engineers and experts
  - Information provided by the expert depends on the context
  - As a domain expert gets more and more used to the formal view of the knowledge engineer, he/she will adjust his/her style, and vice versa

Personal Construct Theory (George Kelly)
- A theory that gives an account of how people experience the world and make sense of that experience
- ‘Person as a scientist’
- Emphasises the human capacity for meaning making, agency, and ongoing revision of personal systems of knowing across time
- Individuals are seen as creatively formulating hypotheses about the areas of their lives, in an attempt to make them understandable or predictable
- Predictability is sought as a guide to practical action in concrete contexts and relationships
- People engage in continuous extension, refinement, and revision of their systems of meaning
  - Moving systems towards increased meaning

Personal Construct Theory (PCT)
- Key idea
  - The world is 'perceived' by a person in terms of whatever 'meaning' that person applies to it, and the person has the freedom to choose a different 'meaning' if he or she wants
  - I.e. the person has the 'freedom to choose' the meaning that he or she prefers
- Constructive alternativism
  - The person is capable of applying alternative constructions (meanings) to any events in the past, present or future

PCT – Constructive Alternativism
We assume that all of our present interpretations of the universe are subject to revision or replacement... There are always some alternative constructions available to choose among in dealing with the world.
- Constructs are the ways in which things or people are seen as either similar or different.
  - => Reality does not reveal itself to us directly, but can be construed in a variety of ways.
  - => A construct simultaneously differentiates and integrates.
- To construe is both to abstract from past events, and to provide a reference axis for anticipating future events based on that abstraction.
- Kelly's notion of a personal scientist assumes that all people actively seek to predict and control events by forming relevant hypotheses, and then testing them against their experience.

PCT
Within the man-the-scientist model:
- the individual creates his or her own ways of seeing the world in which (s)he lives; the world does not create them for him or her;
- (s)he builds constructs and tries them on for size;
- the constructs are sometimes organized into systems, groups of constructs which embody subordinate and superordinate relationships;
- the same events can often be viewed in the light of two or more systems, yet the events do not belong to any system; and
- the individual's practical systems have particular foci and limited ranges of convenience.

PCT
Assumes a contrast between individual reality, social reality and shared reality:
- Individuality: "persons differ from each other in their construction of events."
- Communality: "to the extent one person employs a construction of experience which is similar to that employed by another, his psychological processes are similar to those of the other person."
- Sociality: "to the extent that one person construes the construction processes of another, he may play a role in a social process involving the other person."
Over the last 50 years, the theory has found its home in the areas of artificial intelligence, education, human computer interaction, and human learning.

Newell and Simon’s Human Problem Solving
- Problem space
  - A person’s internal (mental) representation of a problem, and the place where problem-solving activity takes place.
  - The model is known as the performance model
  - It represents the problem solving behavior of one person who is performing a specific task; such models are not adequate for system development, since they are constrained to a single performer on a single task.
  - The problem space is seen as consisting of knowledge states; according to Newell and Simon, problem solving proceeds by a selective search within the problem space, using rules of thumb (heuristics) to guide the search.
- Task environment
  - The physical and social environment in which problem solving takes place.
  - Situations which do not influence individual behavior can be studied by analyzing only the task environment.
  - The model is known as the task model

Newell and Simon’s Human Problem Solving
- Both task and performance models are required to enable problem solving behavior to be adequately modeled within a specific domain.
- Unstructured environments are open for individual behavior; well-structured environments encourage common behavior.

Bias
What is bias?
- All views of reality are filtered.
- Bias only exists in relation to some reference point.
Types of bias:
- Motivational bias
  - Expert makes accommodations to please the interviewer or some other audience
- Observational bias
  - Limitations on our ability to accurately observe the world
- Cognitive bias
  - Mistakes in use of statistics, estimation, memory, etc.
- Notational bias
  - Terms used to describe a problem may affect our understanding of it

Examples
- Social pressure: response to verbal and non-verbal cues from interviewer
- Group think: response to reactions of other experts
- Impression management: response to imagined reactions of managers, clients, …
- Wishful thinking: response to hopes or possible gains
- Appropriation: selective interpretation to support current beliefs
- Misrepresentation: expert cannot accurately fit a response into the requested response mode
- Anchoring: contradictory data ignored once initial solution is available
- Inconsistency: assumptions made earlier are forgotten
- Availability: some data are easier to recall than others
- Underestimation of uncertainty: tendency to underestimate by a factor of 2 or 3
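
Worked sketch for the earlier Example slides (algorithm versus heuristic)
Looking back at the two Example slides on algorithms and heuristics, the difference can be made concrete in Python. The function names and sample dates below are illustrative assumptions, not part of the slides. `triangle_area` is an algorithm because it is right for every input. `age_heuristic` is the "subtract the birth year" rule, which can differ from the exact age by one year. `exact_age` is included only for comparison.

```python
from datetime import date


def triangle_area(base: float, height: float) -> float:
    """Algorithm: base times vertical height, divided by two. Always exact."""
    return base * height / 2


def age_heuristic(birth_year: int, current_year: int) -> int:
    """Heuristic from the slide: subtract the birth year from the current year."""
    return current_year - birth_year


def exact_age(born: date, today: date) -> int:
    """Exact age: take one year off if this year's birthday has not happened yet."""
    had_birthday = (today.month, today.day) >= (born.month, born.day)
    return today.year - born.year - (0 if had_birthday else 1)


if __name__ == "__main__":
    today = date(2012, 6, 15)  # hypothetical "today", matching the slide's year
    for born in (date(1990, 3, 1), date(1990, 9, 1)):
        print(born, age_heuristic(born.year, today.year), exact_age(born, today))
```

The heuristic needs only the year of birth, which is why it is attractive in an interview setting, but it pays for that convenience with an occasional one-year error.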