From: AAAI-80 Proceedings. Copyright © 1980, AAAI (www.aaai.org). All rights reserved. A BASIS FOR A THEORY OF PROGRAM SYNTHESIS’ P.A.Subrahmanyam USC/Information Sciences Institute and Cepartment of Computer Science University of Utah, Salt Lake City, Utah 84112 With these requirements 1. Introduction In order to reliability theory of obtain a quantum software, it jump is imperative in the quality basic and principles argue (interactive) that synthesis viewing as that of abstract the that underly data types development problem of provides syntheeis. of such a theory of the applications tool. (automatic) synthesizing We 2 program implementations consisting a viable basis for a general We brlefly describe A basis for a Theory ---_ Intuitively, conttructlon the of “good” 1.1. Roquiromonts of for en Accrptable Theory of Program any partial We view some of the essential requirements of program synthesis of en acceptable come recursive objects the following additional on a sound primitives employed that data a file, a queue, a symbol table, etc., as an abstract hardware. such are those Various ways of achieving representations, is to be useful, we desire that it possess an attributes: exists, therefore, implementation problem “good” the basic admit that (if reliable rupportrd in prrt by ln IBM Followskip 74 of programs. evidence of layers of an effective _ in favor of type correponding of interest”) to some representation we think should to the in terms of another (the “target by the following underly the synthesis programs are to be produced consistently): 1. The programming one of program spocificrtions of primarily analytic then verifying it) program and then new and insightful perspectives of programming and problem by the methodologies to provide is’ further supported This perspective principles data (the “type process type.“) available; are provided organizations compelling of domain of program synthesis as one of obtaining for of interest problem to be already that in attempting the process abstractions programming means of coping with the complexity there the to a given are presumed available advocate mathematical that such primitives - it should possess adequate flexibility to being tailored to specific application tasks1 warn commonly relevant data type that corresponds work are representing involves as the basis for a program - it should serve development system which can generate provably correct non-trivial programs; - it should provide into the nature solving, independent the illustrations function can be presented ultimately, viewing if a theory mind and operations using - it must account for the “state of the art” of program synthesis; in particular, it must allow for the generation of “efficient” programs. Further, Although to be the following: - it must adhere to Q coherent set of underlying principles, and not be based on an ad hoa collection of heuristics; based to and has the important data type. should be general; - it must be framework; as to be performed representation of a problem. Programming - the theory a such as a stack, structures Synthesis. data typo” providing characterization readily set of functions set of objects. Such a collection of objects and is an “abstract advantage of the theory. of a problem can be viewed of an appropriate functions [5, Sj, and conclude by listing some Program Synthesis of the abstraction on an associated the salient most 1This the to characterize programs. to have a coherent software (automatically) of program features theory in mind, we now examino the nature process in an attempt of the programming of program synthesis which can serve as the basis for a sophisticated theory and Summary process should essentially be synthesis proceeding from the a problem, rather than being (e.g. constructing a program and or empirical (e.g. constructtng a testing it). implementations 2. The specification of a problem should be representation independent. This serves to guarantee complete freedom in the program synthesis process, in that no particular program is excluded a priori due to overspecification of the problem caused by representation dependencies. 3. The synthesis should be guided primarily semantics of the problem specification. relative of frequency This separation structure The for led us to adopt of our hardware. to behavior”. then defined the behavior of the develop methods to seek, interest. interest target type. the crux “mathematically” the of appropriately inherent by the rather than characterization where based upon there externally a requirement any provably formalism that viable This theory already is for the criteria The objectives programs theory, case allows dependent” conventional truly correct, There be admitted is in consonance of synthesis existing should empirical data to its domain. the by a fixed is that functions set of rules of the pivotal and an attempt is made is leeway making imposed for requirements. relative efficiency an indexed that such include: NEWST type of lies in done refinem@rnt is on the type of also of desired those for to the see associated [6].) formal Although an identifier the most specifications implementations have Identifiers), been we recent may for chosen table local next outer has already this from scope,) been (retrieve the definition complex (which to the entries be found more for naming attributes RETRIEVE generated have a new the and The SymbolTable the identifier block,) with a and associated if (cf. [4].) of a symbol (enter re-establish current SymbolTables type of SymbolTable, interest here is of in [4] definitions include definition an (see tests for because of its familiarity, choices example (test program block-structured SymbolTables. instance (discard scope, The manipulating an identifier current “Global” synthesis to pin-point LEAVEBLOCK Identifier.) An formal alternative (add attributes of for a new table,) in instances Symbol for a as a target ENTERBLOCK ADDID declared from (spawn symbol most paradigm of array defined scope,) ISINBLOCK by of are outermost the involved are instances Boolean, etc.; our primary scope,) the semantic in the the the stepwise is (An using manipulation (e.g. [7]>. steps synthesis functions and distinction the of the on and define aspects is to of An important SymbolTable some outline then, paradigm syntactic we observable the This illustrate of objects Attributes, of of To target functions principle systems is the that are synthesized of backtracking. “environment as a limiting The sorts Identifier, synthesis proposed the of the of some is provided, stages the both based semantics they use a of framework programs of implementations. that to theory mathematical of the proposed belief synthesis of one (the objective, specifications in a problem. transformation our relevant an instance “externally the programming. interpreting structure The the automatic the preserves incorporating into include outcomes now as is possible, 4. An Example: The Synthesis of Block Structured Table Using an Indexed Array of another between automate on with the different the the the of these feasible The in nature; “efficiency” theory approximate machine of an implementation of to is types trrget the the in as far formal in the underlying of to as valid Synthesis in terms which based refinement notion as a map implementations of of the to target i.e. for Prowam types type and the interest leeway possible to aid in each computationally without the implementations. of interest) two point any object representing characterized by its The type Intuitively, of nature applicative obtained relating paradigm to the it can even Paradigm (the of relating aid efficient the view that is completely observable such was and theory incorporation formulation was that the synthesis problems which an algebraic [l, 21, [3, 43. An important theory In fact, particular We adopt a type process goal sound are 3. the Proposed interest, our (a) (b) manner. primarily architectures guided independent are underlying most -- it becomes and assumptions is In summary, by specification, attempt b. decision type) in a relatively task which is algebraic. of this objects tasks synthesis modules The of any type build our principles their context of use, and (c) to further subdivide synthesis. independent data to adequate upon imposed problem underlying consequence of of the seek mathematically user interaction with the system becomes more viable, since the level of reasoning is now “visible” to the user. depending constraints the program development suited the in a. existing paradigms of programming such as “stepwise refinement” can be viewed in a mathematical framework above the functions demanded by the these two, serves of complexity 4. The level of reasoning used by the synthesis paradigm should be appropriate to “human reasoning,” rather than being machine oriented (see [6]). In addition to making the paradigm computationally more feasible, this has two major advantages: of inherent requirements interface by the different of use.) The of overall functions the SymbolTable) 75 synthesis defined into on one proceeds the of the type flowing by first of three categorizing interest categories: (here, the the 0) Base Constructor functions that type NEWST); (ii) (e.g. generate new ENTERBLOCK, return instances instances of (termed step to functions) of SymbolTables: these A major provide step for identify a type, are the kernel serve (e.g. than B(NEWST) E <NEWARRAY,ZERO,NEWAOTl’ B(AOOIO(s,id,al)) - <ASSIGN(a,SUCC(i),<id,al>),SUCC(i),adt g(ENTERBLOCK(s)) - <a,& ENTERBLOCK.AOTl(adtl,i)> aLEAVEBLOCK(s)) - <a,D.AOTl(adt 1), LEAVEBLOCK.ADTl BIISINBLOCK(s,id 1)) = ISINBLOCKTT(<a,i,adt l>,id 1) bKRETRIEVE(s,idl)) - RETRIEVETT(<a,i,adt l>,ldl) to ADDID, functions that SymbolTable model (e.g. Specifically, is partitioned defined implementation indicate therefore imposed upon terms syntactic structure an (of domain into implementation; of an indexed of to capture. how it how of the of space, Since no of the behavior the the details of how which Array (automaticlly) can, in turn, by a recursive Here, Other proj(i,exl..xn”) - Xi. Figure 1. A Symbol Implementations that are map. isomorphic to defined in course of the be synthesized invocation Othw in We note is almost generated a “Block a Examples Several Programs of synthesis the implementations Strut tured in terms Text-Editor, of the synthesis for Mark” ENTERBLOCK, implementation similar to a “hash table” suggested upon examining the semantics defined on the Symbol Table. --l----------cII--------------------~~~==~=-------~~~~~~~”~~-- the of the successor and been algorithms for a text the developed Stsck, SymbolTable, o of synthesized dispteys, so far. a Queue, an formatter, Synthesis interactive a hidden Paradigm. by direct applications These a Oeque, include a Block line-orlented surface elimination and an execution engine for a Symbol and eferencso an implementation is of the functions menus Applications have to identify [l] J.Goguen, J.Thatcher, E.Wagner, J.Wright. Initial Algebra Semantics and Continuous Algebras. JACM 24:68-95, 1977. (23 J.Goguen, J.Thatcher, E.Wagner. An Initial Algebra Approach to the Specification, Correctness, and implementation of in Current Trends in Programming Abstract Data Types, Methodology, Vol IV, Ed. R.Yeh, Prentice-Hall, NJ, 1979, pages 80-149. [33 J.Guttag, E.Horowitz, O.Musser. The Design of Data Type Specifications, in Current Trends in Programming Methodology, Vol IV, Ed. R.Yeh, Prentice-Hall, N.J., 1979. [43 J.Guttag, E.Horowitz, O.Musser. Abstract Data Types and Software Validation. CACM 21:1048-64, 1978. [53 P.A.Subrahmanyam. Towards a Theory of Program Synthesis: Automating Implementations of Abstract Data Types. PhO thesis, Dept. of Comp. SC., State University of New York a! Stony Brook, August, 1979. 16 J P.A.Subrahmanyam. A Basis for a Theory of Program Synthesis. Technical Report, Dept. of Computer Science, University of Utah, February, 1980. [7 3 OBarstow. Knowledge-Based Program Construction, Elsevier North-Holland Inc., NY., 1979. We list below one of the implementations generated for a SymbolTable, using an Array as the initially specified target The final representation consists of the triple type. <Array,lnteger,AOT1*1 the integer represents the current index into the array, whereas AOTl (for &diary Eata lype-1) is introduced in course of the synthesis process, and is isomorphic to a Stack that records the index-values corresponding to each ENTERBLOCK performed on a particular instance of a SymbolTable.) We denote this by-writing t#s) - <a,i,edtl>. informally, ENTERBLOCK.ADTl serves to “push” the current index value onto the stack adtl, LEAVEBLOCK.ADTl serves to “POP” the stack, and D.ADTl returns the -topmost element in the Stack, returning a zero if the stack is empty. WCC and PREO are lmplemen!a!iOn is done. is shown algorithm for graphical data driven machine, include an implementation using application Of the function Table to the procedures. Table the I>, idl) an equations this (automatically) data Stack l*,id 1) terms, the “implementation this type what is related (adtl)> as follows: RETRIEVETT(<a,i,adt l>,id 1) = if i - ZERO then UNDEFINED else if proj(l,OATA)a,i)) - id1 then proj(2, OATA(a,i)) else RETRIEVETT(<a, PRED(i),adt by the The defining are defined 1’ of “contributes” towards this this “semantic structure” generated type of is precisely 0 denotes was terms classes SymbolTable we omit implemetations integers) from its) equivalence of the underlying auxiliary the TOI is to in the specification -- and this is attempting 1, wherein for and RETRlEVETT ISINBLOCKTT(~a,i,adt 1~’ id 1) if i - ZERO then FALSE else if D.ADTl(adt 1) < i then if proj(l, OATA = idl, then TRUE else ISINBLOCKTT(<a,PRED(i),adt else FALSE all instances functions. be inferred the function and to lack functions and ENTERBLOCK. kernel is explicit must of each partitioning, of the to generate ADOID, the on the type the these on the type. Such an inference follows of the axioms defining the extraction SymbolTable extractors of an implementation for functions functions. that ones E&&or serve NEWST, in obtaining a suitable Stack of the that a subset which the functions defined from an examination figure (iii) other an implementation Due instances lSlNE3LOCKTT is kernel One existing and types new functions ISINBLOCK). next model to spawn from LEAVEBLOCKh RETRIEVE, The serve Canstructor functions on Integers. 76