Explanation-Based Learning
(borrowed from Mooney et al.)
Explanation-Based Learning (EBL)
One definition:
Learning general
problem-solving
techniques by
observing and
analyzing solutions
to specific problems.
SBL (vs. EBL)
lots of data (examples)
• Similarity-based learning (SBL) is inductive:
– generalizes from training data
– empirically identifies patterns that distinguish between positive
and negative examples of a target concept.
• Inductive results are justified empirically (e.g., by
statistical arguments such as those used in establishing
theoretical results in PAC learning).
• Generally requires significant numbers of training
examples in order to produce statistically justified
conclusions.
• Generally does not require or exploit background
knowledge.
EBL (vs. SBL)
lots of knowledge
• Explanation-based learning (EBL) is (usually) deductive:
– uses prior knowledge to “explain” each training example
– Explanation identifies what properties are relevant to the target
function and which are irrelevant.
• Prior knowledge is used to reduce the hypothesis space
and focus the learner on hypotheses that are consistent
with prior knowledge about the target concept.
• Accurate learning is possible from very few (0) training
examples (typically 1 example per learned rule).
The EBL Hypothesis
• By understanding why an example is a member of a target concept,
one can learn the essential properties of the concept
• Trade off the need to collect many examples
for the ability to “explain” single examples (via a domain theory)
• This assumes the domain theory is competent:
– Correct: does not entail that any negative example is positive
– Complete: each positive example can be “explained”
– Tractable: an “explanation” can be found for each positive
example.
SBL vs. EBL
entailment constraints
SBL:
Hypothesis & Descriptions ╞ Classifications
Hypothesis is selected from restricted hypothesis space.
EBL:
Hypothesis & Descriptions ╞ Classifications
Background╞ Hypothesis
EBL Task
• In addition to a set of training examples, EBL
also takes as input a domain theory,
background knowledge about the target concept
that is usually specified as a set of logical rules
(Horn clauses) and operationality criteria.
• The goal is to find an efficient or operational
definition of the target concept that is consistent
with both the domain theory and the training
examples.
EBL Task: operationality
observable vs. unobservable
• Operationality is often imposed by restricting the
hypothesis space to using only certain predicates (e.g.,
those that are directly used to describe the examples).
• Observable: predicates used to describe examples
• Unobservable: the target concept
• In “classical EBL” the learned definition is
– logically entailed by the domain theory
– a more efficient definition of the target concept
– requires only “look-up” (pattern matching) using
observable predicates rather than search (logical
inference) mapping observables to unobservables.
EBL Task
Given:
• Goal concept
• Training example
• Domain Theory
• Operationality Criteria
Find: a generalization of the training example that
is a sufficient criterion for the target concept and
satisfies the operationality criteria
EBL Example
• Goal concept: SafeToStack(x,y)
• Training Examples: One example
SafeToStack (Obj1,Obj2)
On(Obj1,Obj2)
Type(Obj1,Box)
Type(Obj2,Endtable)
Color(Obj1,Red)
Color(Obj2,Blue)
Volume(Obj1, 0.1)
Owner(Obj1,Molly)
Owner(Obj2, Muffet)
Fragile(Obj2)
Material(Obj1,Cardboard)
Material(Obj2,Wood)
Density(Obj1,0.1)
EBL Example
• Domain Theory:
SafeToStack(x,y) :- not(Fragile(y)).
SafeToStack(x,y) :- Lighter(x,y).
Lighter(x,y) :- Weight(x,wx), Weight(y,wy), wx < wy.
Weight(x,w) :- Volume(x,v), Density(x,d), w=v*d.
Weight(x,5) :- Type(x,Endtable).
Fragile(x) :- Material(x,Glass).
• Operational predicates: Type, Color, Volume,
Owner, Fragile, Material, Density, On, <, >, =.
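The domain theory above can be exercised on the training example with a small hand translation into Python. This is only a sketch: each Horn clause is written as an ordinary function rather than run through a general inference engine, the fact store and function names are illustrative, and not(...) is treated as negation by failure.

```python
# Facts from the training example, stored as (predicate, args...) tuples.
FACTS = {
    ("On", "Obj1", "Obj2"),
    ("Type", "Obj1", "Box"),
    ("Type", "Obj2", "Endtable"),
    ("Volume", "Obj1", 0.1),
    ("Density", "Obj1", 0.1),
    ("Fragile", "Obj2"),
    ("Material", "Obj1", "Cardboard"),
    ("Material", "Obj2", "Wood"),
}


def lookup(pred, obj):
    """Return v for a stored fact pred(obj, v), or None if there is none."""
    for fact in FACTS:
        if len(fact) == 3 and fact[0] == pred and fact[1] == obj:
            return fact[2]
    return None


def fragile(x):
    # Fragile is observable, and also follows from Material(x,Glass).
    return ("Fragile", x) in FACTS or ("Material", x, "Glass") in FACTS


def weight(x):
    # Weight(x,w) :- Volume(x,v), Density(x,d), w=v*d.
    v, d = lookup("Volume", x), lookup("Density", x)
    if v is not None and d is not None:
        return v * d
    # Weight(x,5) :- Type(x,Endtable).
    if ("Type", x, "Endtable") in FACTS:
        return 5
    return None


def lighter(x, y):
    # Lighter(x,y) :- Weight(x,wx), Weight(y,wy), wx < wy.
    wx, wy = weight(x), weight(y)
    return wx is not None and wy is not None and wx < wy


def safe_to_stack(x, y):
    # SafeToStack(x,y) :- not(Fragile(y)).   SafeToStack(x,y) :- Lighter(x,y).
    return (not fragile(y)) or lighter(x, y)


# Obj2 is fragile, so the proof must go through Lighter(Obj1,Obj2).
result = safe_to_stack("Obj1", "Obj2")
```

Answering the query this way requires search through the unobservable predicates Lighter and Weight; the point of EBL is to replace that search with a single rule over observables.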
EBL Method
For each positive example not correctly covered by an
“operational” rule do:
1. Explain: Use the domain theory to construct a
logical proof that the example is a member of the
concept.
2. Analyze: Generalize the explanation to determine a
rule that logically follows from the domain theory
given the structure of the proof and is operational.
Add the new rule to the concept definition.
EBL Example
Training Example:
SafeToStack (Obj1,Obj2) Type(Obj2,Endtable)
Volume(Obj1, 0.1)
Density(Obj1,0.1)
…
Domain Theory:
SafeToStack(x,y) :- Lighter(x,y).
Lighter(x,y) :- Weight(x,wx), Weight(y,wy), wx < wy.
Weight(x,w) :- Volume(x,v), Density(x,d), w=v*d.
Weight(x,5) :- Type(x,Endtable).
…
Example Explanation (Proof)
SafeToStack(Obj1,Obj2)
  Lighter(Obj1,Obj2)
    Weight(Obj1,0.6)
      Volume(Obj1,2)
      Density(Obj1,0.3)
      0.6=2*0.3
    Weight(Obj2,5)
      Type(Obj2,Endtable)
    0.6<5
Generalization
• Find the weakest preconditions A for a conclusion C such that A
entails C using the given proof.
• The general target predicate is regressed through each rule used in
the proof to produce generalized conditions at the leaves.
• To regress a set of literals P through a rule H :- B1,...,Bn
(B={B1,...,Bn}) using a literal L ∈ P:
Let Ф be the most general unifier of L and H,
apply the resulting substitution to all the literals in P and B,
and return: P = (PФ - LФ) U BФ
Also apply the substitution to update the conclusion: C=CФ
• After regressing the general target concept through each rule used
in the proof return: C :- P1,...Pn (P={P1...Pn})
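The regression step can be sketched in a few lines of Python. The encoding is an assumption made for illustration: literals are flat tuples (predicate, arg, ...), variables are lowercase strings, anything else is a constant, and there are no nested function terms, so unification reduces to argument-by-argument matching.

```python
def is_var(term):
    """Assumed convention: variables are lowercase strings (x, wy)."""
    return isinstance(term, str) and term[:1].islower()


def walk(term, theta):
    """Follow bindings in theta until an unbound variable or a constant."""
    while is_var(term) and term in theta:
        term = theta[term]
    return term


def substitute(literal, theta):
    """Apply the substitution theta to the arguments of a flat literal."""
    return (literal[0],) + tuple(walk(t, theta) for t in literal[1:])


def unify(literal, head):
    """Most general unifier of two flat literals, or None if there is none."""
    if literal[0] != head[0] or len(literal) != len(head):
        return None
    theta = {}
    for a, b in zip(literal[1:], head[1:]):
        a, b = walk(a, theta), walk(b, theta)
        if a == b:
            continue
        if is_var(b):        # prefer binding the rule's variables
            theta[b] = a
        elif is_var(a):
            theta[a] = b
        else:                # two different constants
            return None
    return theta


def regress(frontier, conclusion, literal, head, body):
    """Return (P', C') with P' = (P.theta - L.theta) U B.theta, C' = C.theta."""
    theta = unify(literal, head)
    if theta is None:
        raise ValueError("literal does not unify with the rule head")
    removed = substitute(literal, theta)
    new_frontier = [q for q in (substitute(p, theta) for p in frontier)
                    if q != removed]
    for b in body:
        b_theta = substitute(b, theta)
        if b_theta not in new_frontier:   # set union, order preserved
            new_frontier.append(b_theta)
    return new_frontier, substitute(conclusion, theta)


# Regress SafeToStack(x,y) through SafeToStack(x1,y1) :- Lighter(x1,y1).
goal = ("SafeToStack", "x", "y")
frontier1, conclusion1 = regress(
    [goal], goal, goal,
    ("SafeToStack", "x1", "y1"), [("Lighter", "x1", "y1")],
)
# frontier1 is now [("Lighter", "x", "y")]
```

Repeating regress once per rule in the proof, in proof order, leaves the operational leaf conditions in the frontier.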
Generalization Example
Regress {SafeToStack(x,y)} through
SafeToStack(x1,y1) :- Lighter(x1,y1).
Unifier: Ф = {x/x1, y/y1}
Result: {Lighter(x,y)}
Lighter(Obj1,Obj2)
Weight(Obj1,0.6)
Weight(Obj2,5)
0.6<5
Generalization Example
Regress {Lighter(x,y)} through
Lighter(x2,y2) :- Weight(x2,wx2), Weight(y2,wy2), wx2 < wy2.
Unifier: Ф = {x/x2, y/y2}
Result:{Weight(x,wx), Weight(y,wy), wx < wy}
Weight(Obj1,0.6)
Weight(Obj2,5)
Generalization Example
Regress {Weight(x,wx), Weight(y,wy), wx < wy} through
Weight(x3,w) :- Volume(x3,v), Density(x3,d), w=v*d.
Unifier: Ф = {x/x3, wx/w}
Result: {Volume(x,v), Density(x,d), wx=v*d,
Weight(y,wy), wx < wy}
Weight(Obj2,5)
Generalization Example
Regress {… Weight(y,wy) …} through
Weight(x4,5) :- Type(x4,Endtable).
Unifier: Ф = {y/x4, 5/wy}
Result: {Volume(x,v), Density(x,d), wx=v*d,
Type(y,Endtable), wx < 5}
Learned Rule:
SafeToStack(x,y) :- Volume(x,v), Density(x,d), wx=v*d,
Type(y,Endtable), wx < 5.
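Because the learned rule mentions only operational predicates, it can be checked by direct lookup with no inference through Lighter or Weight. A minimal sketch, assuming facts are stored in a dictionary keyed by (predicate, object); the function name and the extra object Obj3 are hypothetical.

```python
def safe_to_stack_learned(x, y, facts):
    """SafeToStack(x,y) :- Volume(x,v), Density(x,d), wx=v*d,
                           Type(y,Endtable), wx < 5.
    Returns True when the rule fires. False only means this rule does not
    apply, since the rule is a sufficient condition, not a necessary one.
    """
    v = facts.get(("Volume", x))
    d = facts.get(("Density", x))
    if v is None or d is None:
        return False
    wx = v * d
    return facts.get(("Type", y)) == "Endtable" and wx < 5


facts = {
    ("Volume", "Obj1"): 0.1,
    ("Density", "Obj1"): 0.1,
    ("Type", "Obj2"): "Endtable",
    # Hypothetical heavier object, not part of the original example.
    ("Volume", "Obj3"): 10.0,
    ("Density", "Obj3"): 1.0,
}

fires_on_training_example = safe_to_stack_learned("Obj1", "Obj2", facts)
fires_on_heavy_object = safe_to_stack_learned("Obj3", "Obj2", facts)
```

Color, Owner, and Material never appear in the rule, because they were never used in the explanation.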
Re Generalization
• Simply substituting variables for constants in the proof
will not work because:
– Some constants (Endtable,5) may come from the domain theory
and cannot be generalized and maintain soundness.
– Two instances of the same constant may or may not generalize
to the same variable depending on structure of the proof (e.g.
assume both the weight and density happened to be the same in
the example, but they clearly don’t have to be the same in
general).
• Since generalization is basically performing a set of
unifications and substitutions and these operations have
linear time complexity, generalization is a quick, linear-time process.
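Both failure modes of naive variablization can be seen in a short sketch. The helper below is hypothetical and deliberately wrong: it replaces every distinct constant in the proof's leaves with its own fresh variable, ignoring where each constant came from.

```python
def naive_variablize(literals):
    """Replace each distinct constant with one fresh variable (unsound)."""
    mapping = {}
    result = []
    for pred, *args in literals:
        new_args = []
        for arg in args:
            if arg not in mapping:
                mapping[arg] = f"v{len(mapping)}"
            new_args.append(mapping[arg])
        result.append((pred, *new_args))
    return result


# Leaves of the explanation, using the training example's values, where
# the volume and the density of Obj1 both happen to be 0.1.
leaves = [
    ("Volume", "Obj1", 0.1),
    ("Density", "Obj1", 0.1),
    ("Type", "Obj2", "Endtable"),
]
body = naive_variablize(leaves)
# body == [("Volume", "v0", "v1"), ("Density", "v0", "v1"), ("Type", "v2", "v3")]
```

The result forces volume and density to be equal (both became v1), which is too specific, and it turns Endtable into the variable v3, which is unsound because Weight(x,5) only holds for end tables. Regression through the proof avoids both problems.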
Knowledge as Bias
• The hypotheses produced by EBL are obviously strongly
biased by the domain theory it is given.
• Being able to alter the bias of a learning algorithm by
supplying prior knowledge in declarative form
(declarative bias) is very useful (e.g., by adding new
rules and predicates).
• EBL assumes a complete and correct domain theory, but
theory refinement and other methods can be biased by
incomplete and incorrect domain theories.
Perspectives on EBL
• EBL as theory guided generalization of examples:
Explanations are used to distinguish relevant from
irrelevant features.
• EBL as example guided reformulation of theories:
Examples are used to focus learning on the operational
reformulations of the concept that are “typical”.
• EBL as knowledge compilation: Deductive
consequences that are particularly useful (e.g., for
reasoning about the training examples) are “compiled
out” to subsequently allow for more efficient reasoning.
Standard Approach to EBL
An Explanation (detailed proof of goal):
facts → (chain of inference steps) → goal
After Learning (go directly from facts to solution):
facts → goal
Knowledge-Level Learning
(Newell, Dietterich)
Knowledge closure
all things that can be inferred from a collection of rules and facts
“Pure” EBL only learns how to solve faster, not how to solve
problems previously insoluble.
Inductive learners make inductive leaps and hence can
solve more after learning.
EBL is often called “Speed-up” learning
(not knowledge-level learning)
What about considering resource-limits (e.g., time) on
problem solving?
Utility of Knowledge Compilation
• Deductive reasoning is difficult and frequently
similar conclusions must be derived repeatedly.
• Some domains have complete and correct
theories and learning involves deriving useful
consequences that make reasoning more
efficient, e.g. chess, mathematics, etc.
Utility of Knowledge Compilation
• Different types of knowledge compilation:
– Static: Not example-based, reformulate KB up front to
make it more efficient for general inferences of a
particular type.
– Dynamic: Uses examples, perhaps, incrementally, to
tune a system to improve efficiency on a particular
distribution of problems.
• Dynamic systems like EBL make the inductive
assumption that improving performance on a set of
training cases will generalize to improved performance
on subsequent test cases.
Utility Problem
• After learning many macro-operators, macro-rules, or search control
rules, the time to match and search through this added knowledge
may start to outweigh its benefits (Minton 1988)
• A learned rule must be useful in solving new problems frequently
enough and save enough processing time in order to compensate
for the time needed to attempt to match it every time.
Utility = (AvgSavings x ApplicFreq) - AvgMatchCost
• EBL methods can frequently result in learning a set of rules with
negative overall utility resulting in slowdown rather than the intended
speedup.
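The utility formula is simple enough to compute directly. The sketch below uses invented statistics purely for illustration, and the retention filter is one possible reading of "keep a rule only if its estimated utility is positive", not a specific published algorithm.

```python
def rule_utility(avg_savings, applic_freq, avg_match_cost):
    """Utility = (AvgSavings x ApplicFreq) - AvgMatchCost.

    avg_savings and avg_match_cost share a time unit; applic_freq is the
    fraction of problems on which the rule actually applies.
    """
    return avg_savings * applic_freq - avg_match_cost


def retain_useful(rules):
    """Keep only the rules whose estimated utility is positive."""
    return [name for name, stats in rules.items() if rule_utility(*stats) > 0]


# Hypothetical statistics: (avg savings in ms, application frequency,
# avg match cost in ms). Both rules save the same time when they apply.
rules = {
    "rule_a": (200.0, 0.50, 20.0),   # applies often: worth keeping
    "rule_b": (200.0, 0.05, 20.0),   # applies rarely: match cost dominates
}
kept = retain_useful(rules)
```

The match cost is paid on every problem, while the savings are collected only when the rule applies, which is why a rarely applicable rule can slow the system down.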
Addressing the Utility Problem
• Improve Efficiency of Matching: Preprocess learned
rules to improve their match efficiency.
• Restrict Expressiveness: Prevent learning of rules with
combinatorial match costs.
• Selective Acquisition: Only learn rules whose expected
benefit outweighs their cost.
• Selective Retention: Dynamically forget expensive
rules that are rarely used.
• Selective Utilization: Restrict the use of learned rules to
avoid undue cost of application.
Imperfect Theories and EBL
Incomplete Theory Problem
Cannot build explanations of specific problems because
of missing knowledge
Intractable Theory Problem
Have enough knowledge, but not enough computer time
to build specific explanation
Inconsistent Theory Problem
Can derive inconsistent results from a theory (e.g.,
because of default rules)
Applications
• Planning (macro operators in STRIPS)
• Mathematics (search control in LEX)
Planning with Macro-Operators
• AI planning using Strips operators is search intensive.
• People seem to utilize “canned” plans to achieve
everyday goals.
• Such pre-packaged planning sequences (macro-operators) can be learned by generalizing specific
constructed or observed plans.
• Method is analogous to composing Horn-clause rules by
generalizing proofs.
• A problem is solved by first trying to use learned macro-operators, falling back on general planning as a last
resort.
STRIPS
Original planning system which used means-ends analysis
and theorem proving in robot planning
Sample actions:
GoThru(A,D,R1,R2)
Preconditions: In(A,R1), Connects(D,R1,R2)
Effects: In(A,R2), ¬In(A,R1)
PushThru(A,O,D,R1,R2)
Preconditions: In(A,R1), In(O,R1), Connects(D,R1,R2)
Effects: In(A,R2), In(O,R2), ¬In(A,R1), ¬In(O,R1)
STRIPS
• Sample Problem:
State:
In(r,room1), In(box,room2),
Connects(d1,room1,room2),
Connects(d2,room2,room3)
Goal: In(box,room1)
• Sample Solution:
GoThru(r,d1,room1,room2)
PushThru(r,box,d1,room2,room1)
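The sample solution can be replayed with a small state-transition sketch. States are sets of ground facts, and each operator checks its preconditions, removes its negated effects, and adds its positive effects. One assumption is needed to make the plan executable as written: doors are treated as two-way, so Connects(d1,room1,room2) also permits travel from room2 to room1.

```python
def connects(state, d, r1, r2):
    # Assumption: a door connects its two rooms in both directions.
    return ("Connects", d, r1, r2) in state or ("Connects", d, r2, r1) in state


def go_thru(state, a, d, r1, r2):
    """GoThru(A,D,R1,R2): agent A moves from R1 to R2 through door D."""
    if not (("In", a, r1) in state and connects(state, d, r1, r2)):
        raise ValueError("GoThru preconditions not satisfied")
    return (state - {("In", a, r1)}) | {("In", a, r2)}


def push_thru(state, a, o, d, r1, r2):
    """PushThru(A,O,D,R1,R2): agent A pushes object O from R1 to R2."""
    ok = (("In", a, r1) in state and ("In", o, r1) in state
          and connects(state, d, r1, r2))
    if not ok:
        raise ValueError("PushThru preconditions not satisfied")
    return (state - {("In", a, r1), ("In", o, r1)}) | {("In", a, r2), ("In", o, r2)}


initial = frozenset({
    ("In", "r", "room1"),
    ("In", "box", "room2"),
    ("Connects", "d1", "room1", "room2"),
    ("Connects", "d2", "room2", "room3"),
})

# Sample solution: GoThru, then PushThru back through the same door.
s1 = go_thru(initial, "r", "d1", "room1", "room2")
final = push_thru(s1, "r", "box", "d1", "room2", "room1")
goal_reached = ("In", "box", "room1") in final
```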
Learned Macro-Operator
EBL generalizing this plan produces the following macro-operator:
GoThruPushThru(A,D1,R1,R2,O,D2,R3)
Preconditions:
InRoom(A,R1), InRoom(O,R2), Connects(D1,R1,R2),
Connects(D2,R2,R3), ¬(A=O & R1=R2)
Effects:
InRoom(O,R3), InRoom(A,R3), ¬InRoom(A,R2),
¬InRoom(O,R2), ¬(R3=R1) → ¬InRoom(A,R1)
• Extra preconditions needed to prevent precondition clobbering
during execution of generalized plan.
• Conditional effects come from possible deletions in the generalized
plan.
Representing Plan MACROPS
Strips actually used a “triangle table” to implicitly store
macros for every subsequence of the actions in the plan.
Plan: [State] OP1 → OP2 → OP3 → OP4 → OP5 [Goal]
Op1
Op1 Op2
Op1 Op2 Op3
Op1 Op2 Op3 Op4
Op1 Op2 Op3 Op4 Op5
The triangle table supports treating any of the 10
subsequences of the generalized plan as a macrop in
future problems.
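The count of 10 can be checked by enumeration, assuming it refers to contiguous runs of two or more operators in the five-step plan (single operators are not counted as macros). The function name is illustrative.

```python
def contiguous_subplans(ops, min_len=2):
    """All contiguous subsequences of ops with at least min_len operators."""
    return [
        ops[i:j]
        for i in range(len(ops))
        for j in range(i + min_len, len(ops) + 1)
    ]


plan = ["Op1", "Op2", "Op3", "Op4", "Op5"]
macros = contiguous_subplans(plan)
count = len(macros)  # 4 + 3 + 2 + 1 = 10
```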
Experimental Results
Planning time with and without learning (min:sec)
Trial      1      2      3      4      5
No learn   3:05   9:42   7:03   14:09  --
Learning   3:05   3:54   6:34   4:37   9:13
Learning Search Control
• Search control rules are used to select operators during search.
IF the state is of the form ∫ r f(x) dx,
THEN apply the operator MoveConstantOutsideIntegral
• Such search control rules can be learned by explaining how the
application of an operator in a sample problem led to a solution:
∫ 3sin(x)dx → 3 ∫ sin(x)dx → -3 cos(x)
• Positive examples of when to apply an operator are states in which
applying that operator leads to a solution, negative examples are
states in which applying the operator leads away from the solution
(i.e. another operator leads to the solution).
• Induction and combinations of explanation and induction can also be
used to learn search control rules.
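A search control rule of this kind is a pattern test on the state paired with an operator. The sketch below assumes a hypothetical encoding of expressions as nested tuples, for example ("integral", integrand, variable) and ("times", r, f); it is not the representation used by LEX.

```python
from numbers import Number


def is_constant_times_f(state):
    """IF the state is of the form: integral of r * f(x) dx, with r a number."""
    if not (isinstance(state, tuple) and len(state) == 3
            and state[0] == "integral"):
        return False
    integrand = state[1]
    return (isinstance(integrand, tuple) and len(integrand) == 3
            and integrand[0] == "times" and isinstance(integrand[1], Number))


def move_constant_outside_integral(state):
    """Rewrite integral(r * f) as r * integral(f)."""
    _, (_, r, f), var = state
    return ("times", r, ("integral", f, var))


def select_operator(state):
    """THEN apply MoveConstantOutsideIntegral; otherwise no recommendation."""
    if is_constant_times_f(state):
        return move_constant_outside_integral
    return None


# The first step of the sample problem: integral of 3 sin(x) dx.
state = ("integral", ("times", 3, ("sin", "x")), "x")
operator = select_operator(state)
next_state = operator(state)
# next_state == ("times", 3, ("integral", ("sin", "x"), "x"))
```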
EBL variations
• Generalizing to N: handling recursive rules in proofs
• Knowledge Deepening: explaining shallow rules
• Explanation-based induction and abductive generalization
Generalizing to N
(Shavlik, BAGGER2)
Handling recursive or iterative concepts
(recursive rules in proofs).
[Figure: proof tree for the goal in which P appears repeatedly (recursively), with rule applications numbered 1 to 6.]
Learned rules:
Goal ← P & gen-2
P ← gen-3 V gen-5 V gen-6 V recursive-gen-1 V recursive-gen-2
Knowledge Deepening
When two proofs, A and B, exist for a proposition, and proof
A involves a single (shallow) rule, P→Q, and the
weakest preconditions of proof B are equivalent to P, then
proof B “explains” rule P→Q.
Shallow rule: “leaves are green”
Explanation: “leaves are green because they contain
mesophylls, which contain chlorophyll, which is a green
pigment.”
Knowledge Deepening
(leaf ?x)
Part(?x ?y) & (isa ?y Mesophyll)
Part(?y ?z) & (isa ?z Chlorophyll)
(green ?x)
(green ?y)
(green ?z)
The weakest preconditions of both proofs are the
same: (leaf ?x)
Use the more complicated proof to explain the
shallow rule.
Explanation-Based Induction
Teleology: function suggests structure
• Identify a “teleologic explanation”
Structural properties supporting physiological goal:
“leaf dehydration is avoided by the cuticle covering
the leaf’s epidermis”
• Identify the weakest preconditions of the explanation.
• Separate into
– Structural preconditions: epidermis covered by cuticle
– Qualifying preconditions: performs transpiration
• Find other organs satisfying the qualifying conditions:
stems, flowers, fruit.
• Hypothesize they also have the structural conditions:
“are the epidermises of stems, flowers, and fruit also
covered by a cuticle?”