INFERENCE BY REFINEMENT INDUCTIVE Abstract

From: AAAI-86 Proceedings. Copyright ©1986, AAAI (www.aaai.org). All rights reserved.
INFERENCE
INDUCTIVE
BY REFINEMENT
P. D. Laird*
of Computer
Department
Yale
New
University
Haven,
Abstract
that modify a hypothesis
by making it more general or more specific in response to examples.
The separate
effects of the syntax
(rule space) and semantics,
and the relevant orderings
on these,
are precisely specified.
Relations
called refinement
operators
are
defined, one for generalization
and one for specialization.
General and particular
properties
of these relations
are considered,
and algorithm
schemas
for top-down
and bottom-up
inference
difficulties
common
to refinement
Ct.,
06520
The essential
features
of this simple illustration
apply to
many inductive
learning
problems
in a variety of domains:
formal languages
and automata
(e.g., [Angluin,
19821, [CrespiReghizzi,
19721); programming
languages
(e.g., [Hardy,
19751,
A model is presented
for the class of inductive
inference problems
that are solved by refinement
algorithms
- that is, algorithms
are given. Finally,
are reviewed.
Science
algorithms
[Shapiro,
19821, [Summers,
19771 ); functions
and
([Hunt et al., 19661, [Langley,
19801); propositional
icate logic ([Michalski,
19751, [Shapiro,
19811, [Valiant,
and a variety of representations
specific to a particular
(e.g., [Feigenbaum,
19631, [Winston,
19751).
to explain
Introduction
The topic of this paper is the familiar
problem of inductive
learning: determining
a rule from examples.
Humans
exhibit a strikto solve this problem
in a variety
of situations
- to
that it is difficult to believe that a separate
algorithm
is at work in each case. Hence,
implementing
real systems
that
challenge of identifying
sort of learning.
As an illustration
concept-learning
jects have three
low, blue),
ordered
pair
determined.
represents
object”.
of the basic
problem
attributes:
and shape
in addition
to the problems
of
learn by example,
there is the
fundamental
principles
ideas,
that
consider
underlie
circle).
A concept
consists
Obyelof an
of objects,
possibly
with some attributes
left unFor example,
C = {(large ? circle), (small ? ?)}
the concept “a large circle of any color, and any small
There is a most-general
concept
( { (? ? ?), (? ? ?)})
and a large number
(144) o f most-specific
concepts
(both objects fully specified).
Examples,
or training
instances,
can be
positive or negative:
{(large blue circle),
(small blue triange)}
is a positive
example
of the concept
C above, whereas
{(large
red triangle),
(large blue circle)} is a negative
example.
If the
current
hypothesis
excludes
a positive
example,
the inference
procedure
must generalize
it; and if it includes a negative
example, the procedure
must make it more specific.
Every domain
has
rules
for making
a hypothesis
more
or less general;
here,
a concept
can be generalized
by changing
an attribute
from
by the inverse operation.
specific value to ‘?‘, or specialized
Work funded in part by the National
MCS8002447 and DCR8404226.
l
472
/ SCIENCE
Science Foundation,
under Grants
a
of examples
and a space
of rules rich enough
any set of examples.
Given some examples
and a set of possible
generalize
the hypotheses
that fail to explain
hypotheses,
positive ex-
amples,
amples.
negative
and specialize
If possible,
represent
to express
the rules.
hypotheses
examples
that
imply
in the same
language
ex-
used
Our goal is to present a formal model to unify many of the
ideas common
to these domains.
The value of such a formalism
is that
the following
(adapted
from [Mitchell,
19821).
size (large, small),
color (red,
(triangle,
this
19841))
domain
From the experiences
of many researchers
(see, for example, [Angluin
and Smith,
19831, [Banerji,
19851, [Cohen, 19823,
[Michalski,
19831 f or summaries),
a number of general guidelines
have been suggested:
Define a space
ing ability
the extent
sequences
and pred-
the
essential
projected
application
rithms
constructed,
from first
properties
features
of the
can be identified
without
the need
inductive
component
of a
quickly, and basic
to rediscover
these
algoideas
principles.
Another
advantage
is that the abstract
and limitations
common to algorithms
based on this
model
can
details
of a particular
be identified
and
studied
without
reference
to the
application.
This report
is necessarily
brief, with only the outlines
of
the principal
concepts
given, plus examples
to illustrate
their
application.
More details, examples,
and proofs are available
in
the full report
([Laird,
Inductive
Definition
nents:
19851).
Inference Problems
1 An inductive
ordered
inference
problem
l
A partially
set (D, >), called
l
A set E of expressions
over a finitely
called the syntactic domain.
has
six compo-
the semantic
presented
domain.
algebra,
A mapping
h: E + D such
some expression
e in E.
A designated
that
do of D, called
element
d in D is h(e) for
every
the target object.
An oracle, EX, for “examples”
of do, in the form of signed
expressions
in E. If EX() returns
+e, then do 2 h(e), and
if EX() returns
-e, then do 2 h(e).
An oracle
for the partial
1 if h(el)
returns
The
(>?)
examples
order,
such
that
below
will help
are simply elements
over some algebra
name every possible target
ciation
between
rules and
this
definition
seem
etc.);
according
a new syntax
less ab-
of inductive
of some set;
of a partial
ordering.
Rules
which is expressive
enough to
object.
The mapping
h is the assothe objects they denote.
Note that
h may map more than one expression
to the same
ement; if h(el) = h(ez), th en ei and ez are called
rules. There may, of course, be different syntactical
tions of one semantic
domain
(grammars,
automata,
ioms,
to this model,
the problem
semantic
elh-equivalent
representalogical axchanges
when
& is adopted.
Examples
are elements of the same set & of expressions:
i.e.,
every expression
in E is potentially
an example - positive in case
its semantic
and negative
representation
is no greater
than than the target,
otherwise.
In practice,
the set of examples
is often
limited to a subset of the expressions
EX represents
the mechanism
which
target;
in actuality,
examples
dom source, a query-answering
Note that
The
concerned
ple. In
easy to
nore the
EX depends
of inductive
Example
the
set
(see below).
The oracle
produces
examples
of the
may come from a teacher,
a ransource, or combinations
of these.
on the target
inference.
1 Let X = {xi,.
of subsets
the funcintegers
to integers.
If P is such a program,
then h(P) is the partial
function
computed
by P. Consider
the function
f(x) = x2. A
(LAMBDA (X) (COND ((=
positive
example
of f is the program
X 2) 4))).
Intuitively,
this example states that
out giving any other values of f. The problem
two arbitrary
programs
recursively
unsolvable.
Pi and P2 whether
f (2) = 4 withof deciding
for
Pr > Pz is, of course,
of X.
. . ,xt}
be a finite
For example,
The general inductive
inference
problem
allows any expresin E to be an example.
More often we are limited to a
subset of E for examples
(e.g., in Example
2 above, we may be
given only programs
defined on a single integer rather than arsion
bitrary
programs).
But what property
guarantees
of E .is sufficient
to identify a target uniquely?
Definition
2 Let S be a set of examples
e is said to agree with S if for every
h(e) 2 h(e+), and for every negative
that
a subset
of do. An expression
positive
example
e+ in S,
example
e- in S, h(e) 2
h(e-).
Definition 3 A suficient set of examples of do is a signed
set S of E with the property
that all expressions
that agree
S are h-equivalent
Example
by h to do.
and are mapped
3 In Example
1 above,
subwith
a sufficient
set of examples
for any target
can be formed from only the t minterms
with
exactly one uncomplemented
variable.
In Example
2, a sufficient
set of examples
can be constructed
from programs
of the form:
X i> j 1))) where i and j are integer
(LAMBDA (X) (COND ((=
constants.
do.
object
oracle (z?) serves to abstract
the aspect of the problem
with testing whether
an expression
imples an exampractice,
the complexity
of this problem
ranges from
By oracularizing
it we are choosing
to igunsolvable.
complexity
of this problem while we study the abstract
properties
language
for representing
= fi(x)- A convenient
mapping
in D is the subset of LISP expressions
2 h(ea), and 0 otherwise.
It is more general
than most definitions
stract.
inference,
in that target objects need not be subsets
instead,
they
are expressions
(ei > ez?)
fdx)
tions
set,
X might
Example 4 In many concept-learning
problems,
objects
possess a subset of some group of t binary attributes,
and a concept
is a subset of the set of all possible distinct
objects.
A “ball”,
for example,
might denote the concept
consisting
of all objects
with the “spherical?”
attribute,
regardless
of other attributes
(“red?“,
“wooden?“,
etc.).
As a formal expression
of this domain, let (51,. . . , xl} be Boolean
variables,
and D the set of
sets of assignments
of values (0 or 1) to all of the variables,
par-
and D = 2x,
tially
denote
expressions;
a set
ordered
by 2.
For syntax
h maps
we may take
an expression
the set of Boolean
to the set
of satisfying
as-
of automobile
attributes
etc.), while an element
(sedan, convertible,
front-wheel
drive,
of D specifies those attributes
possessed
signments.
by a particular
model.
containment
relation.
Let D be partially
ordered
by 2,
There are many possible
languages
since then any assignment
satisfying
ei then must also satisfy
eg. A sufficient set of examples for any expression
can be formed
the
for
representing
D, such as the conventional
one for set theory.
A
more algebraic
language
is a Boolean algebra
over the elements
it with elements of D represented
by monomials
of degree
t (minterms). Thus the empty set is represented
by xi . , . xi (x’
expression
Expression
“er +
er is an example
e2” is a tautology
from the set of minterms,
ment.
since
these
(+
of ez iff the
denotes
represent
Boolean
implication),
a single
assign-
Xl,...,
denotes the complement
this case h is a bijection
of z), and {xi}
between minterms
= h(x1xhxi . . . xi). In
and elements of D. If
ml and mz are minterms,
then ml is a positive
every uncomplemented
variable
in ml occurs
example of m2 iff
uncomplemented
Finally, note that for every inductive
inference problem there
is a dual problem,
differing
only in that D is partially
ordered
by 5 (rather
than 2).
e2 5 ei, and a negative
Then er is a positive
example otherwise.
example
of eg if
in mz.
Refinements
Example 2 Let
mapping
integers
define
fl
1
f2
D be the class of partial
recursive
functions
to integers.
If fr and f2 are functions
in D,
iff for every
integer
x such
that
f2(x)
is defined,
In most applications,
the mapping
h: & --) D is not just an
unstructured
assignment
of expressions
to objects.
Usually there
LEARNING
/ 473
is an ordering
on the
k of the expressions
underlying
that
is closely
related
to that
semantics.
For example,
referring
again to
the size-color-shape
example
in the introduction,
we see that
the syntactic
operation
of replacing
an attribute
(“red”) by the
don’t-care
token (“?“)
corresponds
semantically
to replacing
the set of expressed
objects by a larger set. The ordering
k may
only be a quasi-ordering
(reflexive and transitive
symmetric).
But we can still use it to advantage
generalizations
three properties:
statements.
The
but not antiin computing
and specializations
of hypotheses,
provided it has
an order-homomorphism
property
with respect
to h; a computational
completeness
property.
sub-relation
called a refiraement;
These we shall now define.
and
a
Definition
4 Let ? be an ordering
of E. The mapping
h: E +
D is said to be an order-homomorphism
if, for all el and e:! in E
such
el k e2, h(el)
that
There is a dual definition
of a downward refinement p for
an ordering
5: p* = 5, and if (el,e2) E p then h(el) < h(e2).
Nearly
everything
true of upward
refinements
has a dual for
downward
refinements,
but to save space we shall omit dual
2 h(e2).
r.e.
condition
on refinements
means
that
they
can be
effectively
computed.
Thus if e is found to be too specific, an
inference
algorithm
can compute
the set 7(e), and y of each
hyof these expressions,
etc., in order to find a more general
pothesis.
We would like to know that, by continuing
to refine
e upward
for every
in this way, we will eventually
obtain
object d more general than h(e). This
completeness
Definition
property
an expression
motivates
the
of refinements.
6 An upward refinement 7 is said
for e E & if h(r*(e))
= {d 1 d 2 h(e)}.
to be complete
If 7 is complete
for all
e E E then 7 is said to be complete.
Example
5 Referring
1, let rn:! k ml if every
back to Example
uncomplemented
variable in ml also occurs uncomplemented
in
m2. Then h is an order-homomorphism.
In Example
2, it is not clear how to define P2 > PI on arbitrary LISP programs
so as to form an order-homomorphism.
In
[Summers,
19771, this problem
is solved by restricting
the class
of LISP
programs
Example
to have a very specific
6 Because
many systems
as the syntactic
of the expressiveness
form.
of predicate
calculus,
use a first-order
language
L (or subset thereof)
component
of the inference problem.
But what
is the semantic
component,
4 is the analogous
case for
sisted of sets of assignments,
and how is it ordered?
Example
propositional
logic, where D conordered
by 2.
With first-order
logic, Herbrand
models (i.e., sets of variable-free
atomic formulas) take the place of assignments,
and D consists of classes of
(first-order
definable)
H er b rand models, ordered by 2. The sentence ‘dx (red(;c)V
which everything
A syntactic
(~1 implies ‘p2, then any model of ‘p1 is also a model
hence the models of cp1 are a subset of those of 92.
importance
of t
to inductive
inference
5 An
enumerable
(r.e.)
reflexive-transitive
upward
binary
closure
k
is a recursively
relation
on & such that (i) 7* (the
of 7) is the relation
2; and (ii) for all
el,e2 E l, if (el, e2) E 7 then
denotes the set of espressions
/ SCIENCE
7 for
h(el) 2 h(e2). The notation
el such that (el, e) E y.
1, let 7(m)
one of the uncomplemented
Example
8 Let & be the set of first-order
tions of atomic
literals
or their negation)
quantification,
assuming
some fixed language
finement for E (with respect to the ordering
as follows
([Shapiro,
be the
Unify
clauses (i.e., disjuncwith only universal
Z. An upward re? of Example
6) is
7(C) is the set of clauses obtained
one of the following
operations:
two distinct
currences
attributes.
19811):
Let C be a clause.
applying exactly
variables
x and y in C (replace
from
all oc-
of one by the other).
Substitute
for every occurrence
of a variable
call or constant)
general
term (i.e., function
is as follows:
Similarly,
if e is too specific,
e ? e’ can be eliminated
from
refinement
of Example
x a mostwith fresh
variables.
(For example,
replace every 5 by f(x1) where
is a function
symbol in C and x1 does not occur in C.)
and obtain from it expressions
that are more general
or more
specific.
This leads to the notion of a refinement relation.
t’t
by complementing
of cp2, and
In order to take advantage
of the efficiency
induced by the
syntactic
ordering
k, we need the means to take an expression
Definition
language
from m by uncomplementing
exattributes.
Thus 7(z~z~. . . xi) =
and
{(XIX:. . . xi), (5’1x24.. . xi),. . . ) (2’1.. *x:‘tqxCt)},
7 is a complete
upward
7(zlz2
- - . xt) = 0. It is easily seen that
refinement
for minterms.
A complete
downward
refinement
for
minterms
7(m) computes
the set of minterms
obtained
from m
imif
Suppose
a hypothesized
rule e is found to be too general in the
sense that there exists a negative
example
e- such that h(e) 2
h(e-).
Then (assuming
h is an order-homomorphism)
any new
hypothesis
e’ such that e’ t e will also be too general,
and
hence need not be considered.
then any hypothesis
e’ such that
consideration.
7 In the
set of minterms
m’ obtained
actly one of the complemented
- large(x))
designates
the class of models in
that is large is also red.
ordering
is as follows:
given sentences
(~1 and
‘p2 in l, define ‘p2 k ‘p1 iff k ‘p1 - ~32 (where -+ indicates
plication).
It is easy to see that h is an order-homomorphism:
The
Example
7(e)
f
Disjoin
a most-general
literal
with fresh variables
(i.e.,
symbol
and x1
P(XI> or - 14x1>, where p is a predicate
does not occur in C).
For example,
let r(x, y) stand for the relation
Z- is-a-bloodthe-father-ojx,
and m(s) for
relative-ojy,
j(x) f or tl le f unction
the function
the-mother-ojx.
Let C be the clause 7.(x, j(y)) -+
r(x, y), meaning
that if someone is related to a person’s father,
then
he is related
to that
person.
The following
clauses
are all
in 7(C):
0 r(x, j(x))
l
f+4 4 ,
l
+,
f(y))
Each
of the
more
models.
-+ r(x:x)
f M )
-
derived
-
+( 4 ,
(4x,
d
Y) V +I,
clauses
~2))
is easier
to satisfy
and
hence
has
More examples of refinements
over various domains are given
in [Laird, 19851. The task of constructing
a refinement
can be
tricky, because
specializations
one must ensure that all useful
have been included.
But given
tion, it is basically
a problem
problem
as has usually
been
essence, the refinement
rules” or “generalization
(e.g.,
[Michalski,
Below
tive
inference
assume
in algebra,
rather
the case for most
a simple
using
than a heuristic
applications.
In
is a formal expression
of the “production
rules” found in many implementations
19801, [Mitchell,
we sketch
generalizations
or
the formal defini-
an upward
the trivial
the correct
19771).
“bottom-up”
is finite
refinement.
2. There
for all e. (7 is then
is an expression
said
emin E & such
for induc-
For simplicity
we
to be locally finite.)
that
e ? emin for all
3. 7 is complete
The algorithm
for emin.
repeatedly
calls
on EX and tests
the current
hypothesis
against
the resulting
example.
If the hypothesis
is
too specific,
it refines it, placing the more general expressions
onto a queue and taking
a new hypothesis
from the front of
If the hypothesis
no refinement)
and takes
shown that the algorithm
hypothesis
the target.
as a possibility,
algorithms
since
provided
such as the one above
the
is too general,
a new one from
will eventually
EX presents
it discards
it (with
the queue.
It can be
converge
to a correct
a sufficient
set of examples
hypotheses.
Among
these are expressions
which extend
R to
R + WI; but other expressions,
such as R*, will also be constructed
and held on the queue in order.
If another
positive
R + w1 will be discarded,
but R’ will
example
w2 is presented,
be considered
before the refinements
of R + w1 (including
the
trivial one R + w1 + ~2). It can be shown that the algorithm
will converge
to a correct expression,
whether or not the target
is a finite set of strings.
for
of the Refinement
The induction-by-refinement
Initialize
cient algorithms
directly since it is too general
of specific properties
of the domain.
Instead,
H +
emin.
QUEUE +
emptyo.
EXAMPLES
+ empty0.
EXAMPLES +
EXAMPLES
{EXO}.
U
refinement
operators
and specializations
While H disagrees
with
any EXAMPLES:
Using
the (>?) oracle,
check
that
Hz
no negative
examples,
and
H 2
some
positive
example.
If
so,
as measured
sentation
et. al.,
we can construct
a top-down
algorithm
using
7
and emoz.
Other
algorithms
are also possible,
depending
on
the properties
of the domain and the refinements.
Most inductive inference
algorithms
in the literature
are either top-down
or bottom-up
([Mitchell,
19821 suggests
using both in parallel).
And for some domains,
one direction
seems advantageous
over
the other (conjunctive
logical domains,
for example, seem to prefer generalization
seems to occur
to specialization).
mainly when the
for computing
of hypotheses.
This directional
asymmetry
refinement
is locally finite in
by the probability
eration
is clearly
refinements
7(e),
sibilities
more
expressions
that
it is worth
dle the so-called
Disjunction
when the rules are expressed
as
([Biermann
and Feldman,
19721,
when the rules are expressed
as
19851).
observing
how refinement
Problem.
algorithms
In the context
appropriate
generalizations
Recently
several researchers
han-
of classical
governing
the pre-
using
the example
are consistent
with
Z. yielding
.z.
There are many domains
in which the partial
order is too
linear or too “flat” to be of much use in searching
for hypotheses. Consider,
for example,
the problem of finding an arithmetic
recursion
relation
of the form sn = j(sn-1)
to explain
a seWe might, for instance,
try the hypothesis
quence of integers.
2
only one integer in the
s, = s,-1 + 5 and find that it explains
general
([Laird,
important
roles played
and in the definition
of
distribution
- e.g., 7(e, Z) is computed
general
ple, are easier
expressions
to yield effi-
to take advantage
the primary
value
being employed;
but instead
of generating
all
the examples are used to reduce the set of pos-
sequence.
At this point,
dering used in Example
to infer bottom-up
is not expected
of examples
([Valiant,
19841, [V&liant, 19851. [Blumer
19861). I n many of these algorithms,
a refinement
op-
one direction
but not in the other. Note that this is a syntactic
property,
not a semantic
one: regular sets of strings,
for examor as logical axioms
1981]), but top-down
Approach
have been looking for efficient inference
algorithms
that yield
(with high probability)
rules whose “error” is arbitrarily
small,
add 7(H) to QUEUE.
H + front(QUEUE)
.
By duality
model
of the model is the way it clarifies the
by the semantic
and syntactic
orderings,
Do Forever:
Finally,
apply
are strings in and not in the
the alphabet
(0, l} and examples
target set. If the current
hypothesis
is represented
by a regular
expression
R, and a string w1 is presented
that is not included
ALGORITHM UP-INFER:
regular
lead to
operator
7 to hypotheses
in the order in which they are discarded, the minimal generalization
(adding only the one element
to the set) is not necessarily
the first one tried.
For example.
suppose
the domain
is the class of regular
sets of strings
over
Limitations
automata
[Shapiro,
it might
algorithm
will generalize
by applying
7 to
by R, a refinement
R, producing
a set of new expressions
to be tried in turn as
eE t.
the queue.
generalization
rule.
Since refinement
algorithm
that
1. 7(e)
concept-learn
7, this refers to the problem
of forming
a “reasonable”
gener. _ization from examples in a domain that includes
a disjunction
operation.
The trivial generalization,
consisting
of
the disjunction
of all the positive
examples,
is usually
unsuitable since it will never converge to a hypothesis
representing
an
infinite set. On the other hand, it is undesirable
to eliminate
function
than
the “less defined than or equal
2 is no more useful for finding
a simple
generate-and-test
Refinement
algorithms
have generally
the examples are subject to “noise” (e.g.,
performed
[Buchanan
to” ora more
approach.
poorly when
and Mitchell,
19781). They also tend to require that all examples
be stored,
that later refinements
can be tested to avoid over-generalization
or -specialization
(e.g.,
[Shapiro,
19821).
These
two limitations
LEARNING
/ 4’5
so
are inevitably
related:
for, a procedure
which
is tolerant
[8] Crespi-Reghizzi,
of faulty
examples
it chooses
to retain.
all refinement
algorithms
rection (up or down).
It is interesting
in the literature
Consequently
to note that
refine
these
nearly
in only one di-
algorithms
cannot
recover if they over-refine
in response
to a faulty example.
By
contrast,
an algorithm
which can refine upward
and downward
has the potential
for correcting
an over-generalization
resulting
from a false positive example when subsequent
examples
so indicate (e.g., [Shapiro,
19821).
Finally
- and most
on a fixed
incorporate
serious
- the refinement
learning
of geometric
shapes, for example,
define complex
patterns
in terms of the
of the
would
language
(edge,
be too complex
space
(e.g.,
technique
relies
algebraic
language,
without
suggesting
any
new “terms” or “concepts”
into the language.
way to
In the
we could in principle
elementary
relations
E. By contrast,
a suitable
triangle,
box, inside) could
set of higher-level
make the refinement
concepts
path to
a successful
rule short enough to find by searching.
But I am
unaware
of any general technique
for discovering
such new terms
(other than having a friendly
“teacher”
present them explicity).
A successful
in the study
model of this process would
of inductive
learning.
be a significant
advance
[lo]
[11) Hardy, S. Synthesis
IJCAI-75, pp.
[12] Hunt,
of LISP programs
268-273.
E. B., J. Marin,
Induction.
and
New York:
[13] Laird, P. D. Inductive
376, Department
New Haven,
the exposition.
Infor-
limit.
Proc.
Experiments
P. J. Stone.
Press,
in
1966.
inference by refinement.
Tech. Rpt.
of Computer
Science,
Yale UniverCt.,
1986.
[14] Langley,
P. W. Descriptive
discovery
ments in Baconian
science.
Tech.
Computer
Science
versity,
1980.
Department,
processes:
experiRep.
CS-80-121,
Carnegie-Mellon
Uni-
logic and its applications
to
[15] Michalski,
R. V ariable-valued
pattern
recognition
and machine
learning.
In Computer Science and multipte-valued logic theory and applications, D. Rine, ed. Amsterdam:
North-Holland,
506-534.
R. Pattern
recognition
inference.
IEEE Trans. Pat.
(PAMI-2):4
(1980) 349-361.
I am especially
grateful
to Dana Angluin
for many helpful discussions of this work. Thanks
also to Takeshi Shinohara,
whose
careful reading of the original report identified
some errors and
infer-
from examples.
Academic
-[16]_ Michalski,
Acknowledgements
for grammar
71, B. Gilchrist,
ed.
1972, pp. 524-529.
Gold, E. M. Language
identification
in the
mation and Control IO (1967) 447-474.
1975, pp.
improved
model
[9] Feigenbaum,
E. A. The simulation
of verbal learning
behavior.
In Computers and Thought, E. A. Feigenbaum
and J. Feldman,
eds. New York: McGraw-Hill,
1963.
sity,
circle, above, etc.), but the description
to find by searching
through
the rule
S. An effective
ence. In Information Processing
New York: Elsevier North-Holland,
examples
cannot expect to find a hypothesis
consistent
with every example
seen so far and hence must be selective about the
[17] Michalski,
ing.
[18] Mitchell,
R. Theory
AI 20:2
and methodology
(1983)
of inductive
learn-
111-161.
T. M. Version
approach
as rule-guided
inductive
Anal. and Mach. Intel.
Spaces:
a candidate
Proc.
to rule learning.
elimination
IJCAI-77,
pp.
305-
310.
[l] Angluin,
293
D. Inference
of Reversible
(1982) 741-765.
Languages.
J. ACM
[2] Angluin,
D. and C. H. Smith. Inductive
Inference:
theory
and methods.
Computing Surveys 15~3 (1983) 237-269.
The logic of learning.
In Advances in
[3] Banerji,
Ranan.
Computers 24, M. Yovits,
ed., Orlando:
Academic
Press,
1985, pp.
177-216.
[20] Shapiro,
E. Y. (1981).
facts.
Tech. Rep.
ence,
[21] Shapiro,
Ph.
A. W. and
J. Feldman.
On the
synthesis
of
finite-state
machines
from samples
of their behavior.
IEEE Trans. Comput. C-21 :6 (1972) 592 - 597.
[5] Blumer,
A., A. Ehrenfeucht,
D. Haussler,
Classifying
learnable
geometric
concepts
nik-Chervonekis
Dimension.
Theory of Comp.
[6] Buchanan,
learning
May,
B. G. and
T.
of production
Proc.
M. Warmuth.
with the Vap-
18th ACM
Symp.
1986.
M. Mitchell.
rules.
Model-Directed
In Pattern-directed
ference systems, D. A. Waterman
eds. New York: Academic
Press,
in-
and F. Hayes-Roth,
1978, pp. 297-312.
[7] Cohen, P. R. and E. A. Feigenbaum,
eds. The handbook
of artificial intelligence, Vol. III. Los Altos:
William
Kaufmann,
476
/
SCIENCE
Inc.,
1982.
Intel-
Inductive
inference of theories
192, Department
of Computer
from
Sci-
Yale University,
E. Y. (1982).
D. dissertation,
Yale University,
M.I.T. Press.
(221 Summers,
[4] Biermann,
Artificial
[19] Mitchell,
T. M. Generalization
ligence l8:2 (1982) 203-226.
References
struction
as search.
New Haven,
Algorithmic
Computer
New Haven,
P. D. A methodology
from
examples.
[23] Valiant,
L. G. A theory
(1984) 1134-1142.
Ct.,
1981.
program
debugging.
Science Department,
Ct.,
1982.
for
LISP
J.ACM
Published
program
24 :1 (1977)
of the learnable.
of conjunctions.
[25] Winston,
descriptions
P. H. Learning
structur’al
con-
161-175.
C. ACM
[24] Valiant, L. G. Learning
disjunctions
IJCAI-85, pp. 560 - 566.
amples.
In Psychology of Computer
ston, ed. New York: McGraw-Hill,
by
27:ll
Proc.
from
ex-
Vision, P. H. Win1975.