Chapter 1: Introduction to Expert Systems

Objectives
 Learn the meaning of uncertainty and explore
some theories designed to deal with it
 Find out what types of errors can be attributed to
uncertainty and induction
 Learn about classical probability, experimental,
and subjective probability, and conditional
probability
 Explore hypothetical reasoning and backward
induction
4/13/2015
Objectives
 Examine temporal reasoning and Markov chains
 Define odds of belief, sufficiency, and necessity
 Determine the role of uncertainty in inference
chains
 Explore the implications of combining evidence
 Look at the role of inference nets in expert
systems and see how probabilities are propagated
How Do Expert Systems Deal with
Uncertainty?
 Expert systems provide an advantage when
dealing with uncertainty as compared to decision
trees.
 With decision trees, all the facts must be known
to arrive at an outcome.
 Probability theory is devoted to reasoning
about uncertainty.
 There are many theories of probability – each
with advantages and disadvantages.
What is Uncertainty?
 Uncertainty is essentially a lack of the
information needed to make a decision.
 Uncertainty may result in making poor or bad
decisions.
 As living creatures, we are accustomed to dealing
with uncertainty – that’s how we survive.
 Dealing with uncertainty requires reasoning under
uncertainty along with possessing a lot of
common sense.
Dealing with Uncertainty
 Deductive reasoning – deals with exact facts and
exact conclusions
 Inductive reasoning – not as strong as deductive –
premises support the conclusion but do not
guarantee it.
 There are a number of methods to pick the best
solution in light of uncertainty.
 When dealing with uncertainty, we may have to
settle for just a good solution.
Theories to Deal with Uncertainty
 Bayesian Probability
 Hartley Theory
 Shannon Theory
 Dempster-Shafer Theory
 Markov Models
 Zadeh’s Fuzzy Theory
Errors Related to Hypothesis
 Many types of errors contribute to uncertainty.
 Type I Error – accepting a hypothesis when it is not
true – False Positive.
 Type II Error – Rejecting a hypothesis when it is true –
False Negative
Errors Related to Measurement
 Errors of precision – how well the truth is known
 Errors of accuracy – whether something is true or
not
 Unreliability stems from faulty measurement of
data – results in erratic data.
 Random fluctuations – termed random error
 Systematic errors result from bias
Errors in Induction
 Where deduction proceeds from general to
specific, induction proceeds from specific to
general.
 Inductive arguments can never be proven correct
(except in mathematical induction).
 Expert systems may consist of both deductive and
inductive rules based on heuristic information.
 When rules are based on heuristics, there will be
uncertainty.
Deductive and Inductive Reasoning about
Populations and Samples
Types of Errors
Examples of Common Types of Errors
Classical Probability
 First proposed by Pascal and Fermat in 1654
 Also called a priori probability because it deals
with ideal games or systems:
 Assumes all possible events are known
 Each event is equally likely to happen
 Fundamental theorem for classical probability is
P = W / N, where W is the number of wins and N
is the number of equally possible events.
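The formula P = W / N can be sketched in a few lines. This is a minimal illustration, not part of the original slides; the function name `classical_probability` and the die example are assumptions chosen for clarity.

```python
from fractions import Fraction


def classical_probability(wins: int, outcomes: int) -> Fraction:
    """Return P = W / N for N equally likely outcomes, W of which are wins."""
    if outcomes <= 0 or not 0 <= wins <= outcomes:
        raise ValueError("need outcomes > 0 and 0 <= wins <= outcomes")
    return Fraction(wins, outcomes)


# Illustrative case: rolling an even number on a fair six-sided die.
# W = 3 (the faces 2, 4, 6) and N = 6.
p_even = classical_probability(3, 6)
print(p_even)  # 1/2
```

Exact fractions are used here because classical probability counts ideal, equally likely outcomes, so no rounding is needed.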
Deterministic vs. Nondeterministic
Systems
 When repeated trials give the exact same results,
the system is deterministic.
 Otherwise, the system is nondeterministic.
 Nondeterministic does not necessarily mean
random – could just be more than one way to
meet one of the goals given the same input.
Three Axioms of Formal
Theory of Probability
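1. 0 ≤ P(E) ≤ 1 for every event E
2. Σi P(Ei) = 1, summed over the mutually exclusive events Ei that make up
the sample space
3. P(E1 ∪ E2) = P(E1) + P(E2), if E1 and E2 are mutually exclusive events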
Experimental and Subjective Probabilities
 Experimental probability defines the probability
of an event as the limit of a frequency
distribution: P(E) = lim (N → ∞) f(E) / N, where
f(E) is the number of times E occurs in N trials.
 Experimental probability is also called a
posteriori (after the event)
 Subjective probability deals with events that are
not reproducible and have no historical basis on
which to extrapolate.
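Experimental probability as the limit of a relative frequency can be illustrated with a simulation. This is a sketch under stated assumptions: a simulated fair coin, a fixed seed for repeatability, and the hypothetical helper name `relative_frequency`.

```python
import random


def relative_frequency(trials: int, seed: int = 0) -> float:
    """Fraction of heads observed in `trials` simulated tosses of a fair coin."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials


# The observed frequency f(E) / N should settle near 0.5 as N grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```

With few trials the frequency can be far from 0.5, which is why the definition takes the limit over many trials.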
Compound Probabilities
 Compound probabilities can be expressed by:
P(A ∩ B) = n(A ∩ B) / n(S), where
S is the sample space and A and B are events.
 Independent events are events that do not affect
each other. For pairwise independent events,
P(A ∩ B) = P(A) P(B).
Additive Law
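As a reminder, the additive law in its standard form gives the probability that at least one of two events occurs. The card-drawing example is an assumed illustration, not taken from the slides.

```latex
P(A \cup B) = P(A) + P(B) - P(A \cap B)

% Assumed example: one card drawn from a standard 52-card deck,
% with A = "the card is a king" and B = "the card is a heart".
P(\text{king} \cup \text{heart})
  = \frac{4}{52} + \frac{13}{52} - \frac{1}{52}
  = \frac{16}{52}
  = \frac{4}{13}
```

The subtracted term removes the king of hearts, which would otherwise be counted twice. For mutually exclusive events the intersection is empty and the law reduces to a plain sum.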
Conditional Probabilities
 The probability of an event A occurring, given
that event B has already occurred, is called
conditional probability:
P(A|B) = P(A ∩ B) / P(B), for P(B) ≠ 0
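Conditional probability, P(A|B) = P(A ∩ B) / P(B), can be checked by counting outcomes in a small sample space. The sketch below assumes two fair dice; the helper names `prob` and `conditional` and the two events are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space S: all 36 ordered outcomes of rolling two fair dice.
S = list(product(range(1, 7), repeat=2))


def prob(event) -> Fraction:
    """Classical probability of an event (a predicate on outcomes) over S."""
    return Fraction(sum(1 for s in S if event(s)), len(S))


def conditional(a, b) -> Fraction:
    """P(A | B) = P(A and B) / P(B); undefined when P(B) = 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ZeroDivisionError("P(B) is zero, so P(A | B) is undefined")
    return prob(lambda s: a(s) and b(s)) / p_b


def total_is_8(s):
    return s[0] + s[1] == 8


def first_is_even(s):
    return s[0] % 2 == 0


print(prob(total_is_8))                        # 5/36
print(conditional(total_is_8, first_is_even))  # 1/6
```

Knowing that the first die is even raises the probability of a total of 8 from 5/36 to 1/6, so the two events are not independent.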
Sample Space of Intersecting Events
Advantages and Disadvantages of
Probabilities
 Advantages:
   formal foundation
   reflection of reality (a posteriori)
 Disadvantages:
   may be inappropriate: the future is not always similar to the past
   inexact or incorrect: especially for subjective probabilities
   ignorance: probabilities must be assigned even if no information is
    available, which assigns an equal amount of probability to all such items
   non-local reasoning: requires the consideration of all available evidence,
    not only from the rules currently under consideration
   no compositionality: complex statements with conditional dependencies
    cannot be decomposed into independent parts
Bayes’ Theorem
 This is the inverse of conditional probability.
 Find the probability of an earlier event given
that a later one occurred.
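In its standard form, Bayes' Theorem reads P(H|E) = P(E|H) P(H) / P(E), with P(E) obtained from the law of total probability. The sketch below is illustrative only: the function name `bayes_posterior` and all the numbers are hypothetical, not taken from the slides.

```python
def bayes_posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """P(H | E) = P(E | H) P(H) / P(E), where
    P(E) = P(E | H) P(H) + P(E | H') P(H')."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    if p_e == 0:
        raise ZeroDivisionError("P(E) is zero, so P(H | E) is undefined")
    return p_e_given_h * prior / p_e


# Hypothetical numbers: the hypothesis H has a 1% prior, the evidence E
# appears 90% of the time when H holds and 5% of the time when it does not.
posterior = bayes_posterior(prior=0.01, p_e_given_h=0.90, p_e_given_not_h=0.05)
print(round(posterior, 4))  # 0.1538
```

Even with strong evidence, the low prior keeps the posterior modest. This is the inverse reasoning the slide describes: from an observed later event back to an earlier cause.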
Hypothetical Reasoning
Backward Induction
 Bayes’ Theorem is commonly used for decision tree
analysis in business and the social sciences.
 It is especially useful in diagnostic systems, such as
medicine and computer help systems.
 It gives the inverse, or a posteriori, probability: the
probability of an earlier event given that a later one
occurred.
 PROSPECTOR (expert system) achieved great fame as
the first expert system to discover a valuable molybdenum
deposit worth $100,000,000.
Bayes’ Rule for Multiple Events
Multiple hypotheses Hi, multiple events E1, …, En
P(Hi|E1, E2, …, En) =
(P(E1, E2, …, En|Hi) * P(Hi)) / P(E1, E2, …, En)
or
P(Hi|E1, E2, …, En)=
(P(E1|Hi) * P(E2|Hi) * …* P(En|Hi) * P(Hi)) /
Σk P(E1|Hk) * P(E2|Hk) * … * P(En|Hk)*P(Hk)
with independent pieces of evidence Ei
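The second form above, which assumes the pieces of evidence are independent given each hypothesis, can be sketched as follows. The function name, the hypotheses, and the numbers are all hypothetical and chosen only to illustrate the computation.

```python
from math import prod


def posterior_over_hypotheses(priors, likelihoods):
    """Return {H: P(H | E1, ..., En)} for each hypothesis H.

    priors:      {H: P(H)}
    likelihoods: {H: [P(E1 | H), ..., P(En | H)]} for the observed evidence
    Assumes the Ei are conditionally independent given each hypothesis.
    """
    # Numerator of the rule for each hypothesis.
    joint = {h: prod(likelihoods[h]) * priors[h] for h in priors}
    # Denominator: the sum over k of the same product for every Hk.
    total = sum(joint.values())
    if total == 0:
        raise ZeroDivisionError("the evidence has zero probability under every hypothesis")
    return {h: value / total for h, value in joint.items()}


# Hypothetical diagnostic example with two hypotheses and two pieces of evidence.
priors = {"flu": 0.3, "cold": 0.7}
likelihoods = {"flu": [0.9, 0.8], "cold": [0.4, 0.3]}
result = posterior_over_hypotheses(priors, likelihoods)
print({h: round(p, 2) for h, p in result.items()})  # {'flu': 0.72, 'cold': 0.28}
```

Because the denominator is the sum of the numerators, the posteriors always add up to 1 across the hypotheses.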
Advantages and Disadvantages of
Bayesian Reasoning
 Advantages:
   sound theoretical foundation
   well-defined semantics for decision making
 Disadvantages:
   requires large amounts of probability data
   subjective evidence may not be reliable
   the assumption that pieces of evidence are independent is often not valid
   the relationship between hypothesis and evidence is reduced to a
    number
   explanations are difficult to give to the user
   high computational overhead
 Major problem: the relationship between belief and disbelief,
P(H|E) = 1 - P(H'|E), may not always work!
Temporal Reasoning
 Reasoning about events that depend on time
 Expert systems designed to do temporal reasoning
to explore multiple hypotheses in real time are
difficult to build.
 One approach to temporal reasoning is with
probabilities – a system moving from one state to
another over time.
 The process is stochastic if it is probabilistic.
Markov Chain Process
 Transition matrix – represents the probabilities
that the system in one state will move to another.
 State matrix – depicts the probabilities that the
system is in any certain state.
 One can show whether the states converge on a
matrix called the steady-state matrix – a time of
equilibrium
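The relation between the transition matrix, the state matrix, and the steady state can be sketched as a repeated multiplication. This is a minimal illustration under assumptions: a row-vector convention (each row of the transition matrix sums to 1), a hypothetical two-state matrix `T`, and hypothetical helper names.

```python
def step(state, transition):
    """One step of the chain: new_state[j] = sum over i of state[i] * transition[i][j]."""
    n = len(state)
    return [sum(state[i] * transition[i][j] for i in range(n)) for j in range(n)]


def steady_state(state, transition, tol=1e-12, max_steps=10_000):
    """Apply `step` until the state matrix stops changing (within `tol`)."""
    for _ in range(max_steps):
        nxt = step(state, transition)
        if max(abs(a - b) for a, b in zip(nxt, state)) < tol:
            return nxt
        state = nxt
    raise RuntimeError("state matrix did not converge")


# Hypothetical two-state chain; T[i][j] is the probability of moving from i to j.
T = [[0.9, 0.1],
     [0.5, 0.5]]

equilibrium = steady_state([1.0, 0.0], T)
print([round(p, 4) for p in equilibrium])  # [0.8333, 0.1667]
```

For this matrix the steady state is (5/6, 1/6), and it is reached regardless of the starting state matrix. Not every chain converges this way, which is why the loop has a step limit.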
Markov Chain Characteristics
1. The process has a finite number of possible states.
2. The process can be in one and only one state at any one time.
3. The process moves or steps successively from one state to another
over time.
4. The probability of a move depends only on the immediately preceding
state.
State Diagram
Interpretation of a Transition Matrix
The Odds of Belief
 To make expert systems work for us, we must
expand the scope of events to deal with
propositions.
 Rather than interpreting the conditional probability
P(A|B) in the classical sense, we interpret it to
mean the degree of belief that A is true, given B.
 We talk about the likelihood of A, based on some
evidence B.
 This can be interpreted in terms of odds.
The Odds of Belief (cont.)
 The conditional probability could be referred to as
the likelihood.
 Likelihood could be interpreted in terms of odds
of a bet.
 The odds of A against B given some event C is:
    odds = P(A|C) / P(B|C)
  If B = A':
    odds = P(A|C) / P(A'|C) = P(A|C) / (1 - P(A|C))
  Defining P = P(A|C), then:
    odds = P / (1 - P) (= wins / losses) and P = odds / (1 + odds)
The Odds of Belief (cont.)
 The likelihood of P = 95% (for the car to start in the
morning) is thus equivalent to:
    odds = P / (1 - P) = 0.95 / (1 - 0.95) = 19 to 1
 Probability is naturally forward chaining, or deductive, while
likelihood is backward chaining and inductive.
 Although we use the same symbol, P(A|B), for probability
and likelihood, the applications are different.
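The conversion between probability and odds, odds = P / (1 - P) and P = odds / (1 + odds), can be sketched as a pair of functions. The function names are hypothetical. The example reuses the 95% figure for the car starting in the morning.

```python
def odds_from_probability(p: float) -> float:
    """odds = P / (1 - P); not defined for P = 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def probability_from_odds(odds: float) -> float:
    """P = odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


# The car starts with P = 0.95, which corresponds to odds of 19 to 1.
print(round(odds_from_probability(0.95), 6))  # 19.0
print(probability_from_odds(19.0))            # 0.95
```

The rounding in the first call only hides floating-point noise. The two functions are inverses of each other.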
Sufficiency and Necessity
 The likelihood of sufficiency, LS, is calculated as:
    LS = P(E|H) / P(E|H')
 The likelihood of necessity, LN, is defined as:
    LN = P(E'|H) / P(E'|H')
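With the usual definitions LS = P(E|H) / P(E|H') and LN = P(E'|H) / P(E'|H'), the prior odds of H are multiplied by LS when E is present and by LN when E is absent. The sketch below assumes this odds-likelihood form; the function names and numbers are hypothetical.

```python
def likelihood_ratios(p_e_given_h: float, p_e_given_not_h: float):
    """Return (LS, LN) = (P(E|H) / P(E|H'), P(E'|H) / P(E'|H'))."""
    ls_ratio = p_e_given_h / p_e_given_not_h
    ln_ratio = (1.0 - p_e_given_h) / (1.0 - p_e_given_not_h)
    return ls_ratio, ln_ratio


def updated_probability(prior: float, ratio: float) -> float:
    """Scale the prior odds of H by LS or LN, then convert back to a probability."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical rule: P(E|H) = 0.8, P(E|H') = 0.2, prior P(H) = 0.1.
ls_ratio, ln_ratio = likelihood_ratios(0.8, 0.2)
print(round(ls_ratio, 4), round(ln_ratio, 4))          # 4.0 0.25
print(round(updated_probability(0.1, ls_ratio), 4))    # 0.3077  (E present)
print(round(updated_probability(0.1, ln_ratio), 4))    # 0.027   (E absent)
```

An LS well above 1 means that E strongly supports H. An LN well below 1 means that the absence of E counts strongly against H.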
Relationship Among Likelihood Ratio,
Hypothesis, and Evidence
Relationship Among Likelihood of Necessity,
Hypothesis, and Evidence
Uncertainty in Inference Chains
 Uncertainty may be present in rules, evidence
used by rules, or both.
 One way of correcting uncertainty is to assume
that P(H|e) is a piecewise linear function.
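One way to picture the piecewise linear assumption is the sketch below. It is an illustration under assumptions rather than PROSPECTOR's actual code. It takes P(H|e) to pass through three anchor points as P(E|e) goes from 0 to P(E) to 1: P(H|E') when the evidence is certainly false, the prior P(H) when the observation adds nothing, and P(H|E) when the evidence is certainly true. All parameter names and numbers are hypothetical.

```python
def interpolate_posterior(p_e_given_obs: float, p_e: float, p_h: float,
                          p_h_given_e: float, p_h_given_not_e: float) -> float:
    """Piecewise linear P(H|e) as a function of P(E|e).

    Anchors (assumed): P(E|e) = 0    -> P(H|E')
                       P(E|e) = P(E) -> P(H)
                       P(E|e) = 1    -> P(H|E)
    """
    if not 0.0 < p_e < 1.0:
        raise ValueError("P(E) must lie strictly between 0 and 1")
    if not 0.0 <= p_e_given_obs <= 1.0:
        raise ValueError("P(E|e) must lie between 0 and 1")
    if p_e_given_obs < p_e:
        # Lower segment: from P(H|E') up to the prior P(H).
        return p_h_given_not_e + (p_h - p_h_given_not_e) / p_e * p_e_given_obs
    # Upper segment: from the prior P(H) up to P(H|E).
    return p_h + (p_h_given_e - p_h) / (1.0 - p_e) * (p_e_given_obs - p_e)


# Hypothetical rule: P(E) = 0.5, P(H) = 0.3, P(H|E) = 0.7, P(H|E') = 0.1.
for observed in (0.0, 0.5, 0.75, 1.0):
    print(observed, round(interpolate_posterior(observed, 0.5, 0.3, 0.7, 0.1), 4))
```

The two segments meet at the prior, so partial evidence moves the belief in H smoothly between the two extreme conclusions.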
Intersection of H and e
Piecewise Linear Interpolation Function for Partial
Evidence in PROSPECTOR
Combination of Evidence
 The simplest type of rule is of the form:
 IF E THEN H
where E is a single piece of known evidence from
which we can conclude that H is true.
 Not all rules may be this simple – compensation
for uncertainty may be necessary.
 As the number of pieces of evidence increases, it
becomes impossible to determine all the joint and
prior probabilities or likelihoods.
Combination of Evidence Continued
 If the antecedent is a logical combination of
evidence, then fuzzy logic and negation rules can
be used to combine evidence.
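A common reading of the fuzzy logic and negation rules is minimum for AND, maximum for OR, and 1 - P for NOT. The sketch below assumes that reading; the function names, the antecedent, and the evidence values are hypothetical.

```python
def p_and(*probabilities: float) -> float:
    """Fuzzy AND: the conjunction is only as believable as its weakest part."""
    return min(probabilities)


def p_or(*probabilities: float) -> float:
    """Fuzzy OR: the disjunction is as believable as its strongest part."""
    return max(probabilities)


def p_not(probability: float) -> float:
    """Negation rule: P(E') = 1 - P(E)."""
    return 1.0 - probability


# Hypothetical antecedent: E1 AND (E2 OR NOT E3)
e1, e2, e3 = 0.8, 0.4, 0.7
combined = p_and(e1, p_or(e2, p_not(e3)))
print(combined)  # 0.4
```

The combined value then plays the role of the single piece of evidence E in a rule of the form IF E THEN H.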
Types of Belief
 Possible – no matter how remote, the hypothesis
cannot be ruled out.
 Probable – there is some evidence favoring the
hypothesis but not enough to prove it.
 Certain – evidence is logically true or false.
 Impossible – it is false.
 Plausible – more than a possibility exists.
Figure 4.20 Relative Meaning of Some Terms
Used to Describe Evidence
Propagation of Probabilities
 The chapter examines the classic expert system
PROSPECTOR to illustrate how concepts of
probability are used in a real system.
 Inference nets like PROSPECTOR have a static
knowledge structure.
 A common rule-based system has a dynamic
knowledge structure.
Summary
 In this chapter, we began by discussing reasoning
under uncertainty and the types of errors caused
by uncertainty.
 Classical, experimental, and subjective
probabilities were discussed.
 Methods of combining probabilities and Bayes’
theorem were examined.
 PROSPECTOR was examined in detail to see
how probability concepts were used in a real
system.
Summary
 An expert system must be designed to fit the real
world, not vice versa.
 Theories of uncertainty are based on axioms;
often we don’t know the correct axioms – hence
we must introduce extra factors, fuzzy logic, etc.
 We looked at different degrees of belief which are
important when interviewing an expert.