PYTHIA: An Expert System for the ... Based on the Exemplar Learning ...

From: AAAI Technical Report FS-92-01. Copyright © 1992, AAAI (www.aaai.org). All rights reserved.
PYTHIA:An Expert System for the Optimal Selection of PDESolvers
Based on the Exemplar Learning Approach
E.N. Houstis, J.R. Rice, P. Varodoglou,S. Weerawarana
*
Department of Computer Sciences
Purdue University
West Lafayette, IN 47906, USA.
C.E. Houstis
University of Crete
Department of Computer Science
Heraklion, Greece.
November 16, 1992
may be supplied by the knowledge engineer in a predefined
format. PYTHIA’sinference engine tries to "match" the
It is clear that accuracy and execution time are often the given problem to the available population of PDEclasses
main performance objectives of a user of numerical simula- and problems and then indirectly infers the appropriate setion models based on partial differential equations (PDEs). lection of parameters from the known performance data of
In a PDEcomputation, accuracy is controlled through the the "matched" problem or class of problems.
refinement of the grid and the discretization scheme used,
In this extended abstract, we provide an overview of
while execution time depends on the speed of the targeted
problem
features and their extraction, the rules used in
machiae and the efficiency of the PDEsolver. For s given
PYTHIA,
the inference algorithm, the architecture,
and
machiae, the computation of s solution within a certain acfinally
the
results
of
some
initial
experiments.
curacy ("epsilon") and time frame ("W") requires the selection of appropriate grid, discretization and solution scheme
(method) plus an appropriate machine. If the machine
parallel, then the selection of its configuration (numberof
processors) and the partitioning of the computation into
Problem
Features
load balanced, optimally parallel subtasks is also required. 3
Unfortunately, the above parameters that affect the performance of a computation depend on various mathematical
properties and characteristics of the specified PDEmodel Weclassify problem features in terms of the following comand its unknownsolution. Hence, the acquisition of knowl- ponents: (i) PDEoperator, (ii) PDEequation right
(iii)
solution,
(iv)boundary
conditions,
(v)initial
edge about the mathematical behavior of the given PDE side,
and(vi)domain
ofdefinition.
Thus,thesetofcharmodel and the appropriate selection of these parameters to ditions,
acteristics
of a PDEproblem
includes
someclassification
achieve the computational objectives of the user constitute
information
for
the
above
components
and somequantitaa "hard" problem. Webelieve that future numerical simuaboutthebehavior
(i.e.smoothness
and
lation systems should be capable of makingthese decisions tiveinformation
of theI/Ofunctions
(i.e.coefficients
for the average user while allowing knowledgeable users to localvariation)
the operators, right side of the operator equations, and the
be involved with the decision process.
solution) of the PDEproblem.
1
Introduction
In the PYTHIA
environment, the characteristic vector v
is determined automatically through symbolic processing of
and by symbolic/numeric
PYTIIIAis a research projectthat attemptsto solve the PDEproblem specification
processing
of
its
I/O
functions.
Features that cannot be
the (grid,configuration)
and (method,machine)
selecascertained
automatically
must
be
provided by the user.
tionproblems
for the classof "field"problemswithin
the //ELLPACK[HRC+90]
numericalsimulationsystem
Problem features are represented by a data structure
by applyingan expertsystemmethodology.
PYTHIAap- that contains slots for booleans quantities (for example,
pliessymbolic
techniques
toderive
thecharacteristics
ofthe whether the operator is self-adjoint or not) as well as nuPDEmodelor allowstheuserto specify
themtogether
with merical quantities (for example, the smoothness and losomeweighting
factors.
PYTHIAknowledge
baserulesare cal variation of the boundary conditions). Also, if any
derivedfroma database
of performance
dataassociatedperformance data is available for the problem, then that
witha knownpopulation
of PDE problemsand a library information is kept in another slot as a list of (method,
of PDEsolvers.
Alternatively,
performance
dataor rules performance profile) tuples. The characteristic vector of
* Worksupported
inpartbyAFOSR
@rant
91-F49620,
NSF~prmata problem is then a vector that lists the values of all the
boolean and numerical slots of a problem structure.
CCR86-19817.
2
PYTHIA
52
w
Report FS-92-01. Copyright © 1992, AAAI (www.aaai.org).
All rights
reserved.
4From: AAAI
TheTechnical
Rules
problems
in C.
Then we used
the first 20 problems in C
(or, 50% of C) as the training set and as before, made
Theknowledge
basecontains
twoclasses
of rules.Thefirst predictions for the remaining problems. The Table below
area priori
rulesprovided
by a humanexpertandthelat- shows the results of this test. Each percentage indicates
ter are the rulesgenerated
by PYTHIA.We expectthat the percentage of correct predictions for each (training set,
the numberof human-given
ruleswillbe smallas many relative error level) pair. The "NumberPredicted" is the
immeasurable
quantities
suchas theefficiency
of theim- 1number of valid predictions made by PYTHIA.
plementation
alsoplaya significant
rolein theperformance
Training Number
Relative Error
of algorithms.
Set
Predicted0.1% 0.01% 0.001%
Thesecond
typeof rulesaregenerated
by deliberate
ac37
76%
73%
57%
tionsof the knowledge
engineer.
The knowledge
engineer
19
95%
89%
58%
"trains"PYTHIAby executingmanytest problemsthen
allowing
PYTHIAto gatherinteresting
performance
data
fromthem.Theseperformance
profiles
arestoredas facts In the next test, we wish to observe the effect of a larger
in theknowledge
base.A statistical
analyzer
is alsoused training set on the correctness of predictions on the same
to analyze
performance
datain orderto ranktheperfor-set of problems. Hence, we used the same training sets as
above, but, instead of makingpredictions for all remaining
manceof variousmethodson one problemand alsothe
performance
of one methodon manyproblemsof belong- problems for each training set (i.e., from problems 11 and
ing to one "class."The knowledge
en~neeridentifies
a 21 onwards, respectively), in both cases, we made predictions for just problems 21 onwards. The table below shows
classof problems
andrequests
PYTHIAto generate
rules
these results.
for that class using applicable performance profiles. Thus,
there are two sub-types of rules generated by PYTHIA:
Training
Number
Relative Error
The first type of rules indicate which method works best
Set
Predicted
0.1% 0.01% 0.001%
for a given class of problems at each error level while the
7
86%
86%
100%
second type of rules indicate with method works best for a
12
100%
92%
33%
given problem at each error level.
8
5
Conclusions
The Inference Algorithm
Although
the abovetestsare extremely
preliminary,
we
observe
that
PYTHIA
does
behave
quite
well.
We
hope
Giventheuser’sproblem,
PYTHIAfirstattempts
to find
environthe classof problems
thatis "closest"
by matching
the to closelycouplePYTHIAwith the//ELLPACK
ment
in
order
to
allow
it
to
learn
each
time
a
problem
is
properties
oftheproblem
withproperties
of theclass.
Once
solved
by
a
user.
This
will
allow
PYTHIA
to
improve
with
theclosest
classhasbeenfound,
ifclassrules
areavailable,
evenmoreaccurate.
thenthosecanbe usedto provide
information
to theuser. usageandmakeher predictions
Otherwise,
PYTHIAfindstheclosestexemplar
withinthat
classandthenusestherulesforthatproblem
to provide
References
thenecessary
information
to theuser.
[HR82]E. N. Houstis and J. R. Rice. High order
methods for elliptic partial differential equa6 Architecture
tions with singularities. International dournal
for Numerical Methods in Engineering, 18:737PYTHIAis being implemented as an editor within//ELL754, 1982.
PACK.The rule generation component uses the ELLPACK
Performance Evaluation System to execute test problems, [HRC+90] E. N. Houstis,
J. R. Rice,N. P. Chrisochoides,
generate performance profiles and to execute the FORH. C. Karathanasis,
P. N. Papachiou,
M. K.
TRANstatistical
analyzer. These profiles are then conSamartzis,
E. A. Vavalis,Ko-YangWang,and
verted in to knowledge base facts by PERLscripts. The
S. Weerawarana.//ELLPACK:A numerical
facts themselves are represented as CommonLisp strucsimulation
programming
environment
for partures. The expert system component is written in OPS5.
allelMIMDmachines.
In J. Sopka,editor,
ProThe user interface and the driver of the entire system is
ceedings
of Supercomputing ’go, pages 96-107.
written in C.
ACMPress, 1990.
7
PreliminaryStudies
Weused the class of problems studied in [HR82]to evaluate
the PYTHIA.Let C be that class of problems. First, we
used the first 10 problems in C (or, 25%of C) as a "training
set" to make performance predictions for the remaining
53
1 A prediction is not valid if other factors,
cability, are not satim~ied.
such as method appli-