Methods of Knowledge Acquisition

Knowledge Acquisition
(These notes are a copy of the slides from the lecture but saved with a smaller font
size).
Definition – The process of acquiring, organising, & studying knowledge.
Identified by many researchers and practitioners (in particular Feigenbaum) as the
bottleneck in ES development.
Two main types of sources of knowledge:
• documented (which can take many forms), and
• undocumented (usually in the expert’s mind).
Refer to shallow knowledge & deep knowledge - what is meant by these two terms?
How are shallow & deep knowledge represented?
Example: surface level information might be represented as:
IF the weather is bad THEN stay in bed.
Deeper knowledge would require much more information about how we reach the
rules above. For instance what do we class as bad weather, why does bad weather
cause us not to want to go out, what’s so good about staying in bed etc. Frames &
semantic networks enable us to represent deeper knowledge.
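As a rough illustration of the difference, the sketch below encodes the surface rule above as a one-line condition, and a slightly deeper version as a frame whose slots record what counts as bad weather. The slot names and thresholds are invented for illustration; they are not from the lecture.

```python
# Shallow knowledge: a single surface-level rule with nothing behind it.
def shallow_advice(weather):
    return "stay in bed" if weather == "bad" else "go out"

# Deeper knowledge: a frame (slots and values) describing what "bad weather"
# means. Slot names and thresholds are hypothetical.
bad_weather_frame = {
    "is_a": "weather condition",
    "max_temperature_c": 5,   # colder than this counts as bad
    "min_rainfall_mm": 10,    # wetter than this counts as bad
    "consequence": "going out is unpleasant",
}

def classify_weather(temperature_c, rainfall_mm, frame=bad_weather_frame):
    """Use the frame's slots to decide whether the weather is 'bad'."""
    cold = temperature_c < frame["max_temperature_c"]
    wet = rainfall_mm > frame["min_rainfall_mm"]
    return "bad" if cold or wet else "good"

def deep_advice(temperature_c, rainfall_mm):
    """Apply the shallow rule, but only after reasoning from the frame."""
    return shallow_advice(classify_weather(temperature_c, rainfall_mm))
```

The shallow rule can only be applied once someone has already judged the weather; the frame makes part of that judgement explicit and open to inspection.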
Categories of Knowledge (three main ones):
• Declarative – descriptive knowledge, i.e. facts.
• Procedural – how things are done, i.e. how to use the declarative knowledge.
• Semantic – concerns words & symbols, what they mean, and how they are related & manipulated. Reflects cognitive structure.
Why is it difficult to transfer knowledge?
• It is hard to get experts to express how they solve problems.
• Representation on a machine requires detailed expression, i.e. at a very low level. The knowledge must be represented in a structured way.
• The ideas of all those involved in the knowledge transfer process must be brought together.
Various attempts have been made to overcome these difficulties, & there is much research on this.
1. One approach is the use of natural language interfaces so that experts can communicate directly with the system.
2. Computer aided knowledge acquisition tools – covered later today, when we will see an example of a Machine Learning tool (See5), a software implementation of Quinlan’s C5.0 algorithm.
What kind of skill do you think a Knowledge Engineer must have?
(See p126 in extra notes for skill requirements of the Knowledge Engineer.)
Methods of Knowledge Acquisition
These range from manual to automatic:
1. Looking at available documentation
2. Interviews – i.e. between expert & KE
Knowledge is collected using tapes and questionnaires. Walk-throughs are often used.
Interviews are time consuming and require a variety of skills on the part of the expert & KE; however, they are fairly easy to set up.
May start with an unstructured interview, & move on to a more formal approach.
• Unstructured – not good to be too unstructured. May take the form of a walk through, talk through, or teach through.
• Structured – carefully planned. Adopt a strategy for asking questions. Give preliminary consideration to the types of answers etc.
3. Tracking is an alternative or an addition to interviews.
To do this, the reasoning process of the expert/s is tracked. Such methods are popular with cognitive psychologists.
One way is to use protocol analysis. The expert thinks aloud whilst carrying out the task. This is recorded, coded by the KE, and used later for further analysis (to deduce the decision process). Unlike the interview, this is a one way process.
4. Observation
Often used as support for other methods. Generates an enormous amount of extraneous information.
(See p137 of handout for list of other manual methods.)
5. Expert Driven methods.
Why not let the experts do their own KE? - i.e. cut out the middle layer.
Two approaches to this:
• Manual – expert self-reports, by means of completing open ended & closed questionnaires and maintaining an activity log. What problems can you see with this type of approach?
• Computer aided – attempts to eliminate the problems identified with manual approaches.
Various KA tools available. Many based on the idea of Repertory Grid Analysis.
6. Repertory Grid Analysis
This is based on personal construct theory, which is a model of human thinking. Knowledge & perceptions about the domain are classified & categorized by each individual as a personal perceptual model.
• Expert identifies important objects.
• Expert identifies important attributes.
• Expert establishes a bipolar scale with distinguishable characteristics (traits) and their opposites.
• Interviewer picks 3 objects & asks what distinguishes any 2 of these from the third.
• Continues for several triplets of objects.
• Each object is given a score for each attribute that represents a point on the range designated by the bipolar scale. (Usually use 1-3, or 1-5.)
Example – Programming Language selection
1. What are the important objects? (i.e. the languages)
2. What are the important attributes? (e.g. availability etc.)
3. Determine traits & opposites in order to establish the bipolar scale (e.g. is the language symbolic or numeric in orientation?).
4. KE completes the grid with the expert.
Attribute            Trait         Opposite     LISP  PROLOG  C  COBOL
Orientation          Symbolic (3)  Numeric (1)  3     3       2  1
Ease of Programming  High (3)      Low (2)      3     2       3  2
Training Time        High (1)      Low (3)      1     2       2  1
Availability         High (3)      Low (1)      1     1       2  3
Some KA tools have been developed that include an attempt to automate the RGA
process. See p141,142
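To make the grid concrete, the sketch below stores the programming language example as a nested dictionary and compares objects by summing the absolute differences of their scores. The distance measure and the function names are assumptions for illustration; the lecture does not say how a completed grid is analysed.

```python
# Repertory grid from the example: one score per language per attribute.
grid = {
    "Orientation":         {"LISP": 3, "PROLOG": 3, "C": 2, "COBOL": 1},
    "Ease of Programming": {"LISP": 3, "PROLOG": 2, "C": 3, "COBOL": 2},
    "Training Time":       {"LISP": 1, "PROLOG": 2, "C": 2, "COBOL": 1},
    "Availability":        {"LISP": 1, "PROLOG": 1, "C": 2, "COBOL": 3},
}

def distance(grid, a, b):
    """Sum of absolute score differences between objects a and b
    (an assumed similarity measure, smaller means more alike)."""
    return sum(abs(scores[a] - scores[b]) for scores in grid.values())

def closest_pair(grid):
    """Return the pair of objects with the smallest distance."""
    objects = sorted(next(iter(grid.values())))
    pairs = [(x, y) for i, x in enumerate(objects) for y in objects[i + 1:]]
    return min(pairs, key=lambda pair: distance(grid, *pair))

print(closest_pair(grid))  # the two languages the expert rated most alike
```

Under this measure LISP and PROLOG come out as the most similar pair, which matches the intuition that both are symbolic languages.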
7. Methods that support the Knowledge Engineer
The Knowledge Engineer can use Knowledge Acquisition aids alongside manual approaches:
• An explanation facility can help, i.e. trials with the knowledge coded so far.
• Special knowledge base editors act as interfaces to check for consistency and completeness.
• A KA aid known as TEIRESIAS was designed for work using EMYCIN. It uses a NL interface & has an expanded explanation facility. It translates each new rule to LISP and then back again, so it can show inconsistencies, conflicts, etc.
• KADS is a more general approach to automated KA.
• Auto-Intelligence – captures the knowledge of the expert through interactive interviews, distils the knowledge, and generates a rule based system (see the section on rule induction).
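The kind of consistency check a knowledge base editor might run can be sketched as follows. This is not how TEIRESIAS or any of the tools above actually work; the rule format and the two checks (conflicting rules and redundant rules) are simplifying assumptions for illustration.

```python
from itertools import combinations

# Each rule is (set of conditions, conclusion). The format is hypothetical.
rules = [
    (frozenset({"weather is bad"}), "stay in bed"),
    (frozenset({"weather is bad"}), "go out"),        # conflicts with rule 0
    (frozenset({"weather is bad", "it is Monday"}), "stay in bed"),  # covered by rule 0
]

def find_conflicts(rules):
    """Pairs of rules with identical conditions but different conclusions."""
    return [(i, j)
            for (i, (c1, a1)), (j, (c2, a2)) in combinations(enumerate(rules), 2)
            if c1 == c2 and a1 != a2]

def find_redundant(rules):
    """Pairs (general, specific): the specific rule adds conditions but
    reaches the same conclusion as the more general rule."""
    return [(i, j)
            for i, (c1, a1) in enumerate(rules)
            for j, (c2, a2) in enumerate(rules)
            if i != j and a1 == a2 and c1 < c2]  # c1 < c2 is proper subset

print(find_conflicts(rules))
print(find_redundant(rules))
```

Flagged pairs would be shown to the KE or the expert, who decides which rule to keep or how to refine the conditions.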
8. Automated Rule Induction
Induction means reasoning from the specific to the general. When applied to AI, it is where rules are generated by a computer system from a number of examples.
A series of examples (the training set) is provided and the inductive learning system generates rules from these. These rules can then be used to assess further examples where the outcome is unknown.
This is done using algorithms. A widely used algorithm is Quinlan’s ID3, which generates a decision tree from the knowledge in the example cases and then provides rules.
A later successor of this algorithm is C5.0 (which followed C4.5), and the software that enables its use is See5 (the Windows version).
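The idea behind ID3 can be sketched in a few lines: pick the attribute with the highest information gain, split the examples on it, and recurse. The code below is a minimal teaching sketch, not See5 or Quinlan's implementation; it omits pruning, numeric attributes and missing values, and the training set is invented.

```python
import math
from collections import Counter

def entropy(examples, target):
    """Shannon entropy (in bits) of the target attribute over the examples."""
    counts = Counter(e[target] for e in examples)
    total = len(examples)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def information_gain(examples, attribute, target):
    """Reduction in entropy obtained by splitting on the attribute."""
    total = len(examples)
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e for e in examples if e[attribute] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(examples, target) - remainder

def id3(examples, attributes, target):
    """Return a nested-dict decision tree, or a class label at a leaf."""
    labels = [e[target] for e in examples]
    if len(set(labels)) == 1:        # all examples agree: make a leaf
        return labels[0]
    if not attributes:               # nothing left to split on: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    rest = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({e[best] for e in examples}):
        subset = [e for e in examples if e[best] == value]
        tree[best][value] = id3(subset, rest, target)
    return tree

def tree_to_rules(tree, conditions=()):
    """Flatten the tree into IF ... THEN ... rule strings."""
    if not isinstance(tree, dict):
        lhs = " AND ".join(f"{a} = {v}" for a, v in conditions) or "TRUE"
        return [f"IF {lhs} THEN {tree}"]
    attribute, branches = next(iter(tree.items()))
    rules = []
    for value, subtree in branches.items():
        rules += tree_to_rules(subtree, conditions + ((attribute, value),))
    return rules

# Invented training set (not from the lecture).
training_set = [
    {"weather": "bad",  "workday": "yes", "action": "stay in bed"},
    {"weather": "bad",  "workday": "no",  "action": "stay in bed"},
    {"weather": "good", "workday": "yes", "action": "go out"},
    {"weather": "good", "workday": "no",  "action": "go out"},
]

tree = id3(training_set, ["weather", "workday"], "action")
rules = tree_to_rules(tree)
for rule in rules:
    print(rule)
```

Here the weather attribute separates the examples perfectly, so the tree has a single split and the workday attribute never appears in the rules. This also illustrates how induction can show which factors actually matter to a decision.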
Advantages of this approach
• In bigger problem domains it is harder for the expert to articulate the processes. However, they can still provide examples & solutions, which the inductive system can then make sense of.
• Helps in understanding the impact of the different factors involved in making a decision.
• Rules can be reviewed and modified by the expert, which helps experts to understand their own thinking processes.
• Such systems are fairly easy to use.
Disadvantages
• Sometimes the rules are not easy to understand.
• The expert has to select the attributes.
• Mainly good for classification type systems, not much else.
• A lot of examples are needed to be sure the results are valid, though See5 provides information on how reliable its results are.
Idea of Knowledge Handbook
One of the functions of the knowledge engineer during the knowledge acquisition
phase is to document the knowledge that has been acquired. One idea suggested
(Wolfgram et al 1987 and others) is that of building a knowledge handbook.
Wolfgram et al describe the contents of the handbook as follows:
• The general problem description.
• Who the users are and their expectations of the system.
• A breakdown of the sub-problems and sub-domains for future knowledge acquisition.
• A detailed description of the domain or sub-domain to be used for the prototype.
• A bibliography of reference documents.
• A list of vocabulary, concepts, terms, phrases and acronyms in the domain.
• A list of experts for the prototype.
• Some reasonable performance standards for the system based on consultation with the experts and users.
• Descriptions of typical reasoning scenarios gained from the knowledge acquisition.
The above is not necessarily a complete list and it may be that not all sections are
relevant. However, it is a good structure for organising and documenting knowledge
for the expert system.
Multiple experts
Many KA situations will involve more than one expert. How should this be handled?
Implementation – Next week’s lecture on Knowledge Representation
Testing – this involves:
• Verification – does it correctly implement the specification?
• Validation – how does the performance measure up w.r.t. that of the expert/s? Is it accurate?
• Evaluation – more general, e.g. is it cost effective as a whole etc.
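One simple way to put a number on validation is to run the system on a set of test cases and count how often its answer matches the expert's. The function below is a sketch under that assumption; the name and the choice of plain agreement rate as the measure are illustrative, not taken from the lecture.

```python
def agreement_rate(system_answers, expert_answers):
    """Fraction of test cases where the system's answer matches the expert's."""
    if len(system_answers) != len(expert_answers) or not expert_answers:
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(s == e for s, e in zip(system_answers, expert_answers))
    return matches / len(expert_answers)

# Hypothetical test cases: the system disagrees with the expert on one of four.
system = ["stay in bed", "go out", "go out", "stay in bed"]
expert = ["stay in bed", "go out", "stay in bed", "stay in bed"]
print(agreement_rate(system, expert))
```

With multiple experts the same comparison can be made against each expert in turn, which also shows how far the experts agree with one another.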
Exercise.
An example taken from 'Expert Systems Design and Development'
by John Durkin:
For the following, pick out the important pieces of knowledge, sort them and provide
graphical representations:
KE: What advice would you give a farmer who was considering growing some new
crop?
EXPERT: I would first want to know what the farmer is considering growing..... I
would then be concerned that the farmer had the right environment for the crop. Too
often they want to grow something that they really aren't prepared to do...I'd consider
their amount of acreage, weather, soil conditions,.....ah... I might even look at the lay
of the land.... I then look at their cultivation techniques to see if they would be good
for this new crop. There are a number of things that concern me here, but pests really
worry me.....particularly for the spawn. If they run into an infestation it can really cut
into their production and profit...After I'm satisfied that their environment is ok, I'd
want to know if they even know how to grow or market the crop.
References
Turban, 1992, Expert Systems & Applied AI, Macmillan.
Partridge & Hussain, 1995, Knowledge Based Information Systems, McGraw-Hill.
Durkin, 1994, Expert Systems Design & Development, Macmillan.