MODELING PERFECT BEHAVIOR: A GOAL-DRIVEN LEARNING ANALYSIS
From: AAAI Technical Report SS-94-02. Compilation copyright © 1994, AAAI (www.aaai.org). All rights reserved.
Leona F. Fass

We have taken a formal, foundational approach to problems of
reasoning and learning, considering a body of knowledge
to-be-acquired as a "desired behavior". Learning is, to us,
determination of a "device" that models the specified behavior,
precisely and perfectly. With suitable constraints on such a
learning system, guidance of the learning process, selection of
experiments and descriptions of the goal device can also be precise,
or well-defined. Here we briefly discuss our formal approach to
reasoning and learning, particularly as it is related to the
fundamental issues of goal-driven learning described by Leake and
Ram [8]. We also describe some applications we have found (and hope
to find) for our theory.

During the earliest development of formal theories for computer
science much attention was given to designing "abstract devices" and
to reasoning about their "states". What we now consider classical
work along these lines (e.g., Moore [11], Myhill [12], Nerode [13])
investigated such areas as relationships between component-state
structure and device behavior; necessity or elimination of device
states; choice or certainty in state-to-state transitions; and
achievement of goals through accessing of "final states". If a
device could be shown to produce a precisely specified behavior,
then it realized its "function", behaved "perfectly", and was
verifiably correct.

As time progressed the focus of formal computer science shifted from
such machine foundations to mathematics-of-the-feasible and
time-and-space efficiency. However, there has been renewed interest
in the classical foundational approach, in connection with
artificial intelligence research into such areas as "human-like"
reasoning, logical/philosophical epistemics and computational
learning theory. We have used just such an algebraic and logical
approach to problems of reasoning and learning. We have defined
learning in terms of modeling a behavior, based on reasoning about
behavioral observations, with the goal of determining precisely the
model’s function, structure, choices and states.

Once a learning system determines an appropriate modeling device,
the behavior it models is learned. When, as part of the procedure,
the system decides what to accept as an appropriate model;
determines what necessary and sufficient behavioral information
characterizes the model; and determines experiments or constructive
steps that, given the behavioral information, define or produce the
model, then the learning process may be viewed as goal-driven.

We began to investigate reasoning or learning problems within the
framework of a particular process: formal language acquisition. This
is a reasoning problem concerned with acquiring what may well be an
infinite body of knowledge (e.g., an infinite set of sentences).
While efficiency was not our concern, effectiveness certainly was:
we did require that reasoning or learning be completed, and that a
result be obtained, finitely! A learning system could not possibly
acquire the (infinite) linguistic knowledge by, say, storing or
"memorizing" each element (e.g., sentence) of a language. It would
have to generalize from some representation of the language,
conveyed in a finite way. If this goal could be achieved, we
determined learning would be achieved, through the acquisition of a
perfectly characterizing finite model. We first chose to investigate
the problem for languages that are context-free [2].

In the spirit of our learning approach, we defined the language as a
behavior within a constrained containing domain, so that elements of
the language were found within the behavior, while its complement
(relative to the domain) contained syntactic structures that were
not. The goal behavioral model of the language was a grammar that
produced exactly the language (realizing that function), or a
recognitive device that accepted all and only the language’s
elements. The states of the grammar or device were "goal oriented",
corresponding to how far along the generative or recognitive
reasoning process had progressed, in determining whether an element
was in the language (or not). We were able to show that choice could
be eliminated (and thus, no wrong choices made), making all steps
functional or deterministic.

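As a minimal sketch of such a deterministic recognitive device,
consider a toy context-free language that is not taken from this
paper: the balanced strings within the domain of all strings over
"(" and ")". The names accepts_balanced and split_domain, and the
choice of language, are assumptions made purely for illustration.
The single counter plays the role of the "goal oriented" state,
recording how far recognition has progressed, and every step is
functional: no choice is ever made.

```python
from itertools import product

ALPHABET = "()"  # hypothetical containing domain: all strings over "(" and ")"


def accepts_balanced(s: str) -> bool:
    """Deterministically accept exactly the balanced bracket strings."""
    depth = 0  # the state: brackets opened and not yet closed
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                return False  # a close with nothing open: in the complement
            depth -= 1
        else:
            return False  # symbol outside the containing domain
    return depth == 0  # goal ("final") state only if everything is closed


def split_domain(max_len: int):
    """Split all domain strings up to max_len into behavior and complement."""
    behavior, complement = [], []
    for n in range(max_len + 1):
        for chars in product(ALPHABET, repeat=n):
            s = "".join(chars)
            (behavior if accepts_balanced(s) else complement).append(s)
    return behavior, complement


if __name__ == "__main__":
    inside, outside = split_domain(4)
    print(inside)        # the behavior, restricted to short strings
    print(len(outside))  # size of the complement over the same strings
```

The recognizer is finite, yet it decides membership for strings of
any length, which is the sense in which a finite model can
characterize an infinite behavior.
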
From the perspective of formal (context-free) language theory, we
established that such perfect models of language existed. From the
perspective of goal-driven learning, we established that, with an
unlimited class of possible models, we could constrain the search
process to discover as "learnable", specific perfect behavioral
models. The actual learning process required a system to experiment
with behavioral samples until it had enough information to
generalize, from the observed behavior, to the perfect behavioral
model. In the language acquisition case, this meant that the
learning system would experiment with language information samples
and learn not just a given set of correct sentences. Rather, the
system would acquire a model for more than what it had observed: it
would have a finite means to determine everything that is in the
language (as opposed to what is not).

It was only through appropriate representation of the (context-free)
language that we succeeded in modeling the linguistic behavior
perfectly, for structural properties of such languages made
representation a critical factor in obtaining our results. We were
able to show, in our specific language case, that the components of
perfect behavioral models corresponded to congruence classes of
language structures. A finite model had finite classes. With
sufficient distinguishing experiments (generalizing Moore [11]) a
learning system could actively construct a perfect language model,
in an effective fashion, from finite behavioral information
representing each of the model’s component (congruence) classes. The
learning system algorithm succeeded by generalizing Myhill [12] and
Nerode [13].

As we have just described, using a logical and algebraic
"foundational" approach, we developed a theory of reasoning about
(infinite) information through discovery/determination of models of
such "behaviors" and their component classes and "states". We have
had varying levels of success in applying our theory to several
behavioral domains, dependent on what can be discovered about a
behavior’s mathematical structure (e.g., can it be described
functionally? finitely? can it be processed deterministically?).
This structure can define the learnable, perfect behavioral model,
if it exists. The learning system can exploit the mathematical
structure to choose its information and experiments, and its
learnable model, with a goal-driven process, learning effectively.

In the original formal language acquisition problem we investigated
we found complete success in reasoning about language models, with
inference or goal-driven testing. A learning system could easily
adapt, through changes in its experiments, to determine components
of alternate behavioral models that it defines as "perfect" and
worthy of being learned. (E.g., our system finds a minimal-component
language model, but could redirect its goal to find a model that
processes language more time-efficiently.)

The language example is just one instance of such learnability. Once
it is determined that a finite perfect learnable model exists, it is
a relatively simple matter to find it effectively, achieving a
learning goal. In a sense we were the goal-driven learning system,
when we sought to solve the language acquisition problem we first
approached. We determined what model to acquire (i.e., what to
learn) and how to acquire it. We then conveyed this capability to
our algorithm, or learning system [2-4].

We find relationships between our approach, to generalizing from
behavioral observations, and the overarching of tasks described in
Ng and Bereiter [14] as cited in Leake and Ram [8]. We find very
strong relationships between our constructive approach to
discovering behavioral models, and Michalski’s inferential theory of
learning [10] also cited in [8].

Based on such work as Cherniavsky, Statman and Velauthapillai [1],
we extended our original work to show that "adversarial" reasoning
was possible, and that potential given behavioral models could be
tested (for incorrectness) to determine (by default) that they were
correct. In this case, the states and structure may or may not be
known when the reasoning system determines the tests. Successful
testing results in default verification if it is possible to
effectively characterize both behavior that is correct (taking the
reasoner to a goal, "final state") and behavior that is not
(relative to a specific behavioral domain). The tester tries to show
the potential model is wrong (e.g., under known conditions, "goes
into the wrong state", and does not realize its "intended
function"). But if, after sufficient tests, no incorrectness is
detected, the tester can only conclude the model is perfectly
correct. In such an approach the learning system is also
goal-driven, when, as "the tester", it chooses experiments based on
what it considers to be correct or incorrect.

In the language acquisition example, and similarly mathematically
constrained knowledge acquisition problems, we have been able to
show that a perfect behavioral model can be conclusively determined
through a finite selection of "adversarial" goal-driven tests (with
mathematical constraints, we can determine finite characterizing
complements: what is "in" a behavior and what is "not" [3-6]).

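The idea of verification by default can be sketched for a much
simpler setting than the one treated here: finite-state models over
strings. The sketch is an illustration under stated assumptions, not
the procedure of [3-6]. The names DFA and find_counterexample, the
"even number of a's" behavior, and the membership function standing
in for the specified behavior are all hypothetical. The tester
enumerates every string up to an assumed length bound and tries to
catch the candidate disagreeing with the specified behavior; if no
test fails, it accepts the candidate by default.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass
class DFA:
    """A candidate model: a deterministic finite-state device."""
    start: str
    finals: Set[str]
    delta: Dict[Tuple[str, str], str]  # (state, symbol) -> next state

    def accepts(self, s: str) -> bool:
        state = self.start
        for ch in s:
            state = self.delta[(state, ch)]
        return state in self.finals


def find_counterexample(candidate: DFA,
                        in_behavior: Callable[[str], bool],
                        alphabet: str,
                        max_len: int) -> Optional[str]:
    """Return a shortest string on which the candidate is wrong, else None."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if candidate.accepts(s) != in_behavior(s):
                return s  # the model "goes into the wrong state" on s
    return None  # no incorrectness detected within the bound


# Hypothetical specified behavior: strings with an even number of 'a'.
def even_as(s: str) -> bool:
    return s.count("a") % 2 == 0


good = DFA("even", {"even"}, {
    ("even", "a"): "odd", ("even", "b"): "even",
    ("odd", "a"): "even", ("odd", "b"): "odd",
})
bad = DFA("q", {"q"}, {("q", "a"): "q", ("q", "b"): "q"})  # accepts everything

if __name__ == "__main__":
    print(find_counterexample(good, even_as, "ab", 4))  # None
    print(find_counterexample(bad, even_as, "ab", 4))   # a
```

The bound max_len is an assumption of the sketch. The default
conclusion is sound only when the constraints on the domain
guarantee that every incorrect candidate fails on some test within
the bound, which is the role played by the finite characterizing
sets described above.
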
We have had less success with applications of our theory to natural
language learning, reasoning about arbitrary computational (program)
processes, or the common sense "human-like" reasoning under study
today. A perfect model, that a goal-driven system might seek, may
not exist. The system might choose an alternative model, and deal,
in future, with anomalies. Taking into account the difficulty of the
problems we have been investigating, we may consider as successful
such imperfect behavioral modeling that is correct "sometimes" or
"approximately".

In the case of reasoning about natural language, not only is there
great disagreement over such language’s mathematical structure
[3, 6], there is the additional problem that new language may be
created, out-dating a model once it is found. But we show that if
there is a formal (context-free) finite model then through adaptive
techniques it may be identified in the limit [7] or tested,
similarly. If there is no such formal finite model, we may accept as
successful an adaptively-obtained model that is identified
"approximately". A significant knowledge subset could thus be
acquired.

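Identification in the limit [7] can be illustrated with the
enumeration strategy discussed by Gold. The sketch below is hedged
accordingly: the hypothesis class (strings of a's whose length is a
multiple of k), the function names, and the cap max_k are
hypothetical, and the class is far simpler than context-free models.
After each labeled example the learner guesses the first candidate
consistent with everything seen so far; on a presentation of the
target it changes its guess only finitely often.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Example = Tuple[str, bool]  # (string, is it in the language?)


def hypothesis(k: int) -> Callable[[str], bool]:
    """Candidate k accepts a^n exactly when n is a multiple of k."""
    return lambda s: set(s) <= {"a"} and len(s) % k == 0


def learn_in_the_limit(presentation: Iterable[Example],
                       max_k: int = 50) -> Iterator[Optional[int]]:
    """After each example, yield the first k consistent with all data so far."""
    seen: List[Example] = []
    for example in presentation:
        seen.append(example)
        for k in range(1, max_k + 1):
            h = hypothesis(k)
            if all(h(s) == label for s, label in seen):
                yield k
                break
        else:
            yield None  # no candidate up to max_k fits the data


# Hypothetical target: lengths that are multiples of 3.
target = hypothesis(3)
presentation = [("a" * n, target("a" * n)) for n in range(6)]

if __name__ == "__main__":
    print(list(learn_in_the_limit(presentation)))  # [1, 2, 3, 3, 3, 3]
```

The guesses stabilize on the target once enough negative evidence
has eliminated the earlier candidates. If no candidate in the class
fits, the learner can only report the best guess it has, which
corresponds to the "approximate" case mentioned above.
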
In the instance of learning or reasoning about "function", "states"
and structure of arbitrary computational processes or programs, we
conjecture that only approximation of results is possible, and in
general, such reasoning can never be perfectly correct. Our approach
can lead to some (at least partial) assessment of program
correctness that compares not unfavorably with processes often used
today [4, 5]. Similar conclusions have been reached by other
theoreticians, e.g., Cherniavsky et al. [1]. Techniques that
establish some behavioral correctness would appear to be preferable
to existing ad hoc processes that have no theoretical foundations
and thus may never establish correct behavior at all. A flexible
goal-driven system might be satisfied with a "best possible" result.

Among the most interesting and confounding learning or behavioral
modeling problems we have been investigating are the non-monotonic,
"non-algebraic", common sense reasoning processes as described by
Lenat et al. [9]. Here we have only begun to examine the possibility
of modeling behaviors, and to understand the difficulties involved.
At best we would expect a theory of "weakly approximate" reasoning,
unlike the deterministic, certain, and correct
algebraic/automata-theory reasoning of [2], and more in line with
human reasoning as evidenced each day. In such processes it may be
as difficult to specify the intended function of a behavioral model
as it is to construct/determine it. We expect that in such a case, a
system using goal-driven, and substantial non-goal-driven, learning
would be required.

Acknowledgment: We are grateful to the unidentified reviewers who
suggested improvements that refocused our earlier submittal.†

SELECTED REFERENCES

[1] Cherniavsky, J. C., R. Statman and M. Velauthapillai, "Testing
and Inductive Inference: Abstract Approaches", Georgetown Univ.
Dept. of Computer Science Series, TR-5 1987. Also appears in Proc.
of the First Workshop on Computational Learning Theory, Morgan
Kaufmann, 1988.

[2] Fass, L. F., "Learnability of CFLs: Inferring Syntactic Models
from Constituent Structure", presented at the 1987 Linguistic
Institute, Mtg on the Theoretical Interactions of Linguistics and
Logic, Stanford, Jul 1987. Abstract, J. Symbolic Logic, Vol 53,
No. 4 (Dec 1988) pp. 1277-1278. Research Note, SIGART Special Issue
on Knowledge Acquisition, Apr 1989, pp. 175-176.

[3] Fass, L. F., "Applying Some CFL Learnability Results to Natural
Language Learning", presented at AAAI-Stanford Spring Symposium
Series, Symposium on Machine Learning of Natural Language and
Ontology, Stanford, Mar 1991. Appears in Symposium Notes, pp. 48-52.

[4] Fass, L. F., "Inference and Testing: When ‘Prior Knowledge’ is
Essential to Learning", in Notes of AAAI-92 Workshop on Constraining
Learning Through Prior Knowledge, San Jose, CA, Jul 1992, pp. 88-92.

[5] Fass, L. F., "Software Design as a Problem in Learning Theory",
in Notes of the AAAI-92 Workshop on Automating Software Design, San
Jose, CA, Jul 1992, pp. 48-49.

[6] Fass, L. F., "Canonical (CF) Grammars and Natural Language",
presented at the 1993 Annual Mtg of the Linguistic Society of
America, Los Angeles, Jan 1993, Research Overview, 15 pp.,
abstracted in Mtg Handbook, p. 23.

[7] Gold, E. M., "Language Identification in the Limit", Information
and Control, Vol 10 (1967), pp. 447-474.

[8] Leake, D. and A. Ram, "Goal-Driven Learning: Fundamental
Issues", AI Magazine, Vol 14, No. 4 (1993), pp. 67-72.

[9] Lenat, D. B., and R. V. Guha, K. Pittman, D. Pratt, M. Shepherd,
"CYC: Toward Programs With Common Sense", Communications of the ACM,
Vol 33 (1990), pp. 30-49.

[10] Michalski, R., "Inferential Theory of Learning: Developing
Foundations for Multistrategy Learning", forthcom. paper (1993)
cited in [8].

[11] Moore, E. F., "Gedanken-Experiments on Sequential Machines", in
Automata Studies, Princeton Univ. Press, Princeton 1956,
pp. 129-153.

[12] Myhill, J., "Finite Automata and the Representation of Events",
WADC Tech. Rpt. 57-624, Wright-Patterson AFB, Ohio, Nov 1957.

[13] Nerode, A., "Linear Automaton Transformations", Proc. of the
American Mathematical Society, Vol 9, (1958), pp. 541-544.

[14] Ng, E. and C. Bereiter, "Three Levels of Goal Orientation in
Learning", J. of the Learning Sciences, Vol 1, No. 3-4, pp. 243-271.

Dr. Fass may be reached at mailing address:
P.O. Box 2914; Carmel CA 93921

† Space limitations preclude extensions suggested, relating to
Valiant’s and to Haussler’s work.