From: AAAI Technical Report FS-92-01. Copyright © 1992, AAAI (www.aaai.org). All rights reserved. PYTHIA:An Expert System for the Optimal Selection of PDESolvers Based on the Exemplar Learning Approach E.N. Houstis, J.R. Rice, P. Varodoglou,S. Weerawarana * Department of Computer Sciences Purdue University West Lafayette, IN 47906, USA. C.E. Houstis University of Crete Department of Computer Science Heraklion, Greece. November 16, 1992 may be supplied by the knowledge engineer in a predefined format. PYTHIA’sinference engine tries to "match" the It is clear that accuracy and execution time are often the given problem to the available population of PDEclasses main performance objectives of a user of numerical simula- and problems and then indirectly infers the appropriate setion models based on partial differential equations (PDEs). lection of parameters from the known performance data of In a PDEcomputation, accuracy is controlled through the the "matched" problem or class of problems. refinement of the grid and the discretization scheme used, In this extended abstract, we provide an overview of while execution time depends on the speed of the targeted problem features and their extraction, the rules used in machiae and the efficiency of the PDEsolver. For s given PYTHIA, the inference algorithm, the architecture, and machiae, the computation of s solution within a certain acfinally the results of some initial experiments. curacy ("epsilon") and time frame ("W") requires the selection of appropriate grid, discretization and solution scheme (method) plus an appropriate machine. If the machine parallel, then the selection of its configuration (numberof processors) and the partitioning of the computation into Problem Features load balanced, optimally parallel subtasks is also required. 3 Unfortunately, the above parameters that affect the performance of a computation depend on various mathematical properties and characteristics of the specified PDEmodel Weclassify problem features in terms of the following comand its unknownsolution. Hence, the acquisition of knowl- ponents: (i) PDEoperator, (ii) PDEequation right (iii) solution, (iv)boundary conditions, (v)initial edge about the mathematical behavior of the given PDE side, and(vi)domain ofdefinition. Thus,thesetofcharmodel and the appropriate selection of these parameters to ditions, acteristics of a PDEproblem includes someclassification achieve the computational objectives of the user constitute information for the above components and somequantitaa "hard" problem. Webelieve that future numerical simuaboutthebehavior (i.e.smoothness and lation systems should be capable of makingthese decisions tiveinformation of theI/Ofunctions (i.e.coefficients for the average user while allowing knowledgeable users to localvariation) the operators, right side of the operator equations, and the be involved with the decision process. solution) of the PDEproblem. 1 Introduction In the PYTHIA environment, the characteristic vector v is determined automatically through symbolic processing of and by symbolic/numeric PYTIIIAis a research projectthat attemptsto solve the PDEproblem specification processing of its I/O functions. Features that cannot be the (grid,configuration) and (method,machine) selecascertained automatically must be provided by the user. tionproblems for the classof "field"problemswithin the //ELLPACK[HRC+90] numericalsimulationsystem Problem features are represented by a data structure by applyingan expertsystemmethodology. PYTHIAap- that contains slots for booleans quantities (for example, pliessymbolic techniques toderive thecharacteristics ofthe whether the operator is self-adjoint or not) as well as nuPDEmodelor allowstheuserto specify themtogether with merical quantities (for example, the smoothness and losomeweighting factors. PYTHIAknowledge baserulesare cal variation of the boundary conditions). Also, if any derivedfroma database of performance dataassociatedperformance data is available for the problem, then that witha knownpopulation of PDE problemsand a library information is kept in another slot as a list of (method, of PDEsolvers. Alternatively, performance dataor rules performance profile) tuples. The characteristic vector of * Worksupported inpartbyAFOSR @rant 91-F49620, NSF~prmata problem is then a vector that lists the values of all the boolean and numerical slots of a problem structure. CCR86-19817. 2 PYTHIA 52 w Report FS-92-01. Copyright © 1992, AAAI (www.aaai.org). All rights reserved. 4From: AAAI TheTechnical Rules problems in C. Then we used the first 20 problems in C (or, 50% of C) as the training set and as before, made Theknowledge basecontains twoclasses of rules.Thefirst predictions for the remaining problems. The Table below area priori rulesprovided by a humanexpertandthelat- shows the results of this test. Each percentage indicates ter are the rulesgenerated by PYTHIA.We expectthat the percentage of correct predictions for each (training set, the numberof human-given ruleswillbe smallas many relative error level) pair. The "NumberPredicted" is the immeasurable quantities suchas theefficiency of theim- 1number of valid predictions made by PYTHIA. plementation alsoplaya significant rolein theperformance Training Number Relative Error of algorithms. Set Predicted0.1% 0.01% 0.001% Thesecond typeof rulesaregenerated by deliberate ac37 76% 73% 57% tionsof the knowledge engineer. The knowledge engineer 19 95% 89% 58% "trains"PYTHIAby executingmanytest problemsthen allowing PYTHIAto gatherinteresting performance data fromthem.Theseperformance profiles arestoredas facts In the next test, we wish to observe the effect of a larger in theknowledge base.A statistical analyzer is alsoused training set on the correctness of predictions on the same to analyze performance datain orderto ranktheperfor-set of problems. Hence, we used the same training sets as above, but, instead of makingpredictions for all remaining manceof variousmethodson one problemand alsothe performance of one methodon manyproblemsof belong- problems for each training set (i.e., from problems 11 and ing to one "class."The knowledge en~neeridentifies a 21 onwards, respectively), in both cases, we made predictions for just problems 21 onwards. The table below shows classof problems andrequests PYTHIAto generate rules these results. for that class using applicable performance profiles. Thus, there are two sub-types of rules generated by PYTHIA: Training Number Relative Error The first type of rules indicate which method works best Set Predicted 0.1% 0.01% 0.001% for a given class of problems at each error level while the 7 86% 86% 100% second type of rules indicate with method works best for a 12 100% 92% 33% given problem at each error level. 8 5 Conclusions The Inference Algorithm Although the abovetestsare extremely preliminary, we observe that PYTHIA does behave quite well. We hope Giventheuser’sproblem, PYTHIAfirstattempts to find environthe classof problems thatis "closest" by matching the to closelycouplePYTHIAwith the//ELLPACK ment in order to allow it to learn each time a problem is properties oftheproblem withproperties of theclass. Once solved by a user. This will allow PYTHIA to improve with theclosest classhasbeenfound, ifclassrules areavailable, evenmoreaccurate. thenthosecanbe usedto provide information to theuser. usageandmakeher predictions Otherwise, PYTHIAfindstheclosestexemplar withinthat classandthenusestherulesforthatproblem to provide References thenecessary information to theuser. [HR82]E. N. Houstis and J. R. Rice. High order methods for elliptic partial differential equa6 Architecture tions with singularities. International dournal for Numerical Methods in Engineering, 18:737PYTHIAis being implemented as an editor within//ELL754, 1982. PACK.The rule generation component uses the ELLPACK Performance Evaluation System to execute test problems, [HRC+90] E. N. Houstis, J. R. Rice,N. P. Chrisochoides, generate performance profiles and to execute the FORH. C. Karathanasis, P. N. Papachiou, M. K. TRANstatistical analyzer. These profiles are then conSamartzis, E. A. Vavalis,Ko-YangWang,and verted in to knowledge base facts by PERLscripts. The S. Weerawarana.//ELLPACK:A numerical facts themselves are represented as CommonLisp strucsimulation programming environment for partures. The expert system component is written in OPS5. allelMIMDmachines. In J. Sopka,editor, ProThe user interface and the driver of the entire system is ceedings of Supercomputing ’go, pages 96-107. written in C. ACMPress, 1990. 7 PreliminaryStudies Weused the class of problems studied in [HR82]to evaluate the PYTHIA.Let C be that class of problems. First, we used the first 10 problems in C (or, 25%of C) as a "training set" to make performance predictions for the remaining 53 1 A prediction is not valid if other factors, cability, are not satim~ied. such as method appli-