A BASIS FOR A THEORY ... P.A.Subrahmanyam USC/Information Sciences Institute

From: AAAI-80 Proceedings. Copyright © 1980, AAAI (www.aaai.org). All rights reserved.
A BASIS
FOR A THEORY OF PROGRAM
SYNTHESIS’
P.A.Subrahmanyam
USC/Information Sciences Institute
and
Cepartment of Computer Science
University of Utah, Salt Lake City, Utah 84112
With these requirements
1. Introduction
In order
to
reliability
theory
of
obtain
a quantum
software,
it
jump
is imperative
in the
quality
basic
and
principles
argue
(interactive)
that
synthesis
viewing
as that
of abstract
the
that
underly
data types
development
problem
of
provides
syntheeis.
of such a theory
of the applications
tool.
(automatic)
synthesizing
We
2
program
implementations
consisting
a viable basis for a general
We brlefly
describe
A basis for a Theory
---_
Intuitively,
conttructlon
the
of
“good”
1.1. Roquiromonts
of
for en Accrptable Theory of Program
any partial
We view some of the essential requirements
of program synthesis
of en acceptable
come
recursive
objects
the following
additional
on
a
sound
primitives
employed
that
data
a file, a queue, a symbol table, etc.,
as an abstract
hardware.
such
are those
Various
ways of achieving
representations,
is to be useful, we desire that it possess
an
attributes:
exists,
therefore,
implementation
problem
“good”
the
basic
admit
that
(if reliable
rupportrd in prrt by ln IBM Followskip
74
of programs.
evidence
of layers of
an effective
_
in favor
of
type
correponding
of interest”)
to some representation
we think
should
to
the
in terms of another
(the “target
by the following
underly
the synthesis
programs are to be produced consistently):
1. The programming
one of program
spocificrtions
of
primarily analytic
then verifying
it)
program and then
new and insightful perspectives
of programming
and problem
by the
methodologies
to provide
is’ further supported
This perspective
principles
data
(the “type
process
type.“)
available;
are provided
organizations
compelling
of
domain
of program synthesis as one of obtaining
for
of interest
problem
to be already
that
in attempting
the process
abstractions
programming
means of coping with the complexity
there
the
to a given
are presumed
available
advocate
mathematical
that
such primitives
- it should possess adequate flexibility
to
being tailored to specific application tasks1
warn
commonly
relevant
data type that corresponds
work
are
representing
involves
as the basis for a program
- it should serve
development system which can generate provably
correct non-trivial programs;
- it should provide
into the nature
solving,
independent
the illustrations
function can be presented
ultimately,
viewing
if a theory
mind
and operations
using
- it must account for the “state of the art” of
program synthesis; in particular, it must allow for
the generation of “efficient”
programs.
Further,
Although
to be the following:
- it must adhere to Q coherent set of underlying
principles,
and not be based on an ad hoa
collection of heuristics;
based
to
and has the important
data type.
should be general;
- it must be
framework;
as
to be performed
representation
of a problem.
Programming
- the theory
a
such as a stack,
structures
Synthesis.
data typo”
providing
characterization
readily
set of functions
set of objects. Such a collection of objects and
is an “abstract
advantage
of the theory.
of a problem can be viewed
of an appropriate
functions
[5, Sj, and conclude by listing some
Program Synthesis
of
the abstraction
on an associated
the salient
most
1This
the
to characterize
programs.
to have a coherent
software
(automatically)
of program
features
theory
in mind, we now examino the nature
process in an attempt
of the programming
of program synthesis which can serve as the basis for a
sophisticated
theory
and Summary
process should essentially be
synthesis proceeding from the
a problem, rather than being
(e.g. constructing a program and
or empirical (e.g. constructtng a
testing it).
implementations
2. The
specification
of
a problem
should
be
representation
independent.
This
serves
to
guarantee
complete
freedom
in the
program
synthesis
process,
in that no particular
program
is
excluded
a priori
due to overspecification
of the
problem
caused by representation
dependencies.
3. The synthesis
should
be guided
primarily
semantics
of the problem
specification.
relative
of
frequency
This
separation
structure
The
for
led us to adopt
of our
hardware.
to
behavior”.
then
defined
the
behavior
of
the
develop
methods
to seek,
interest.
interest
target
type.
the
crux
“mathematically”
the
of
appropriately
inherent
by the
rather
than
characterization
where
based
upon
there
externally
a requirement
any
provably
formalism
that
viable
This
theory
already
is
for
the
criteria
The
objectives
programs
theory,
case
allows
dependent”
conventional
truly
correct,
There
be admitted
is in consonance
of synthesis
existing
should
empirical
data
to its domain.
the
by a fixed
is that
functions
set of rules
of the
pivotal
and an attempt
is made
is leeway
making
imposed
for
requirements.
relative
efficiency
an indexed
that
such
include:
NEWST
type
of
lies
in
done
refinem@rnt
is
on the type
of
also
of
desired
those
for
to
the
see
associated
[6].)
formal
Although
an
identifier
the
most
specifications
implementations
have
Identifiers),
been
we
recent
may
for
chosen
table
local
next
outer
has
already
this
from
scope,)
been
(retrieve
the
definition
complex
(which
to the
entries
be found
more
for
naming
attributes
RETRIEVE
generated
have
a new
the
and
The
SymbolTable
the identifier
block,)
with
a
and associated
if
(cf. [4].)
of a symbol
(enter
re-establish
current
SymbolTables
type
of SymbolTable,
interest
here is
of
in [4]
definitions
include
definition
an
(see
tests
for
because
of
its familiarity,
choices
example
(test
program
block-structured
SymbolTables.
instance
(discard
scope,
The
manipulating
an identifier
current
“Global”
synthesis
to pin-point
LEAVEBLOCK
Identifier.)
An formal
alternative
(add
attributes
of
for
a new
table,)
in
instances
Symbol
for
a
as a target
ENTERBLOCK
ADDID
declared
from
(spawn
symbol
most
paradigm
of
array
defined
scope,)
ISINBLOCK
by
of
are
outermost
the
involved
are instances
Boolean, etc.; our primary
scope,)
the
semantic
in the
the
the
stepwise
is
(An
using
manipulation
(e.g. [7]>.
steps
synthesis
functions
and
distinction
the
of
the
on
and
define
aspects
is to
of
An important
SymbolTable
some
outline
then,
paradigm
syntactic
we
observable
the
This
illustrate
of objects
Attributes,
of
of
To
target
functions
principle
systems
is the
that
are
synthesized
of backtracking.
“environment
as a limiting
The sorts
Identifier,
synthesis
proposed
the
of the
of some
is provided,
stages
the
both
based
semantics
they
use
a
of
framework
programs
of implementations.
that
to
theory
mathematical
of the proposed
belief
synthesis
of one
(the
objective,
specifications
in a problem.
transformation
our
relevant
an instance
“externally
the
programming.
interpreting
structure
The
the
automatic
the
preserves
incorporating
into
include
outcomes
now
as is possible,
4. An Example: The Synthesis of Block Structured
Table Using an Indexed Array
of another
between
automate
on
with
the
different
the
the
the
of these
feasible
The
in nature;
“efficiency”
theory
approximate
machine
of an implementation
of
to
is
types
trrget
the
the
in as far
formal
in the underlying
of
to
as valid
Synthesis
in terms
which
based
refinement
notion
as a map
implementations
of
of the
to target
i.e.
for Prowam
types
type
and the
interest
leeway
possible
to aid in each
computationally
without
the
implementations.
of interest)
two
point
any object
representing
characterized
by its
The
type
Intuitively,
of
nature
applicative
obtained
relating
paradigm
to the
it can even
Paradigm
(the
of
relating
aid efficient
the view that
is completely
observable
such
was
and
theory
incorporation
formulation
was that the synthesis
problems
which
an algebraic
[l, 21, [3, 43. An important
theory
In fact,
particular
We adopt
a type
process
goal
sound
are
3. the Proposed
interest,
our
(a)
(b)
manner.
primarily
architectures
guided
independent
are
underlying
most
-- it becomes
and
assumptions
is
In summary,
by
specification,
attempt
b.
decision
type)
in a relatively
task
which
is algebraic.
of this
objects
tasks
synthesis
modules
The
of any
type
build
our
principles
their
context
of use, and (c)
to further
subdivide
synthesis.
independent
data
to
adequate
upon
imposed
problem
underlying
consequence
of
of the
seek
mathematically
user
interaction
with the system
becomes
more viable,
since the level of reasoning
is
now “visible”
to the user.
depending
constraints
the
program
development
suited
the
in
a. existing
paradigms
of programming
such as
“stepwise
refinement”
can be viewed
in a
mathematical
framework
above
the
functions
demanded
by the
these
two,
serves
of
complexity
4. The
level
of reasoning
used by the synthesis
paradigm
should
be
appropriate
to
“human
reasoning,”
rather
than being
machine
oriented
(see
[6]).
In addition
to making
the paradigm
computationally
more feasible,
this has two major
advantages:
of
inherent
requirements
interface
by the
different
of use.)
The
of
overall
functions
the
SymbolTable)
75
synthesis
defined
into
on
one
proceeds
the
of the
type
flowing
by
first
of
three
categorizing
interest
categories:
(here,
the
the
0) Base
Constructor
functions
that
type
NEWST);
(ii)
(e.g.
generate
new
ENTERBLOCK,
return
instances
instances
of
(termed
step
to
functions)
of SymbolTables:
these
A major
provide
step
for
identify
a type,
are
the
kernel
serve
(e.g.
than
B(NEWST)
E <NEWARRAY,ZERO,NEWAOTl’
B(AOOIO(s,id,al))
- <ASSIGN(a,SUCC(i),<id,al>),SUCC(i),adt
g(ENTERBLOCK(s))
- <a,& ENTERBLOCK.AOTl(adtl,i)>
aLEAVEBLOCK(s))
- <a,D.AOTl(adt
1), LEAVEBLOCK.ADTl
BIISINBLOCK(s,id
1)) = ISINBLOCKTT(<a,i,adt
l>,id 1)
bKRETRIEVE(s,idl))
- RETRIEVETT(<a,i,adt
l>,ldl)
to
ADDID,
functions
that
SymbolTable
model
(e.g.
Specifically,
is partitioned
defined
implementation
indicate
therefore
imposed
upon
terms
syntactic
structure
an
(of
domain
into
implementation;
of an indexed
of
to capture.
how
it
how
of the
of space,
Since
no
of
the behavior
the
the details
of how
which
Array
(automaticlly)
can,
in turn,
by a recursive
Here,
Other
proj(i,exl..xn”)
- Xi.
Figure
1. A Symbol
Implementations
that
are
map.
isomorphic
to
defined
in course
of the
be synthesized
invocation
Othw
in
We note
is almost
generated
a “Block
a
Examples
Several
Programs
of
synthesis
the
implementations
Strut tured
in terms
Text-Editor,
of the synthesis
for
Mark”
ENTERBLOCK,
implementation
similar
to a “hash
table”
suggested
upon
examining
the semantics
defined
on the Symbol
Table.
--l----------cII--------------------~~~==~=-------~~~~~~~”~~--
the
of
the
successor
and
been
algorithms
for
a
text
the
developed
Stsck,
SymbolTable,
o
of
synthesized
dispteys,
so far.
a Queue,
an
formatter,
Synthesis
interactive
a hidden
Paradigm.
by direct
applications
These
a Oeque,
include
a
Block
line-orlented
surface
elimination
and an execution
engine
for
a
Symbol
and
eferencso
an
implementation
is
of the functions
menus
Applications
have
to identify
[l]
J.Goguen,
J.Thatcher,
E.Wagner,
J.Wright.
Initial Algebra
Semantics
and Continuous
Algebras.
JACM 24:68-95,
1977.
(23 J.Goguen,
J.Thatcher,
E.Wagner.
An Initial Algebra
Approach
to
the
Specification,
Correctness,
and implementation
of
in Current
Trends
in Programming
Abstract
Data
Types,
Methodology,
Vol IV, Ed. R.Yeh, Prentice-Hall,
NJ, 1979, pages
80-149.
[33 J.Guttag,
E.Horowitz,
O.Musser.
The Design of Data Type
Specifications,
in Current
Trends in Programming
Methodology,
Vol IV, Ed. R.Yeh, Prentice-Hall,
N.J., 1979.
[43 J.Guttag,
E.Horowitz,
O.Musser.
Abstract
Data Types
and
Software
Validation.
CACM 21:1048-64,
1978.
[53 P.A.Subrahmanyam.
Towards
a Theory of Program
Synthesis:
Automating
Implementations
of Abstract
Data
Types.
PhO
thesis,
Dept.
of Comp. SC., State University
of New York a!
Stony
Brook,
August,
1979.
16 J P.A.Subrahmanyam.
A Basis
for
a Theory
of Program
Synthesis.
Technical
Report,
Dept.
of Computer
Science,
University
of Utah, February,
1980.
[7 3 OBarstow.
Knowledge-Based
Program
Construction,
Elsevier
North-Holland
Inc., NY., 1979.
We list below
one of the implementations
generated
for a
SymbolTable,
using
an Array
as the initially
specified
target
The
final
representation
consists
of
the
triple
type.
<Array,lnteger,AOT1*1
the integer
represents
the current
index
into the array,
whereas
AOTl
(for &diary
Eata lype-1)
is
introduced
in course
of the synthesis
process,
and is isomorphic
to a Stack that records
the index-values
corresponding
to each
ENTERBLOCK
performed
on
a particular
instance
of
a
SymbolTable.)
We denote
this by-writing
t#s) - <a,i,edtl>.
informally, ENTERBLOCK.ADTl
serves
to “push”
the current
index
value onto the stack adtl, LEAVEBLOCK.ADTl
serves
to
“POP” the stack, and D.ADTl
returns
the -topmost
element in the
Stack, returning
a zero if the stack is empty.
WCC and PREO are
lmplemen!a!iOn
is done.
is shown
algorithm
for graphical
data driven
machine,
include
an implementation
using
application
Of
the
function
Table
to the
procedures.
Table
the
I>, idl)
an
equations
this
(automatically)
data
Stack
l*,id 1)
terms,
the “implementation
this
type
what
is related
(adtl)>
as follows:
RETRIEVETT(<a,i,adt
l>,id 1) =
if i - ZERO
then UNDEFINED
else if proj(l,OATA)a,i))
- id1
then proj(2,
OATA(a,i))
else RETRIEVETT(<a,
PRED(i),adt
by the
The defining
are defined
1’
of
“contributes”
towards
this
this
“semantic
structure”
generated
type
of
is precisely
0 denotes
was
terms
classes
SymbolTable
we omit
implemetations
integers)
from
its) equivalence
of the underlying
auxiliary
the TOI is to
in the specification
-- and this
is attempting
1, wherein
for
and RETRlEVETT
ISINBLOCKTT(~a,i,adt
1~’ id 1) if i - ZERO
then FALSE
else if D.ADTl(adt
1) < i
then if proj(l, OATA
= idl,
then TRUE
else ISINBLOCKTT(<a,PRED(i),adt
else FALSE
all instances
functions.
be inferred
the
function
and
to lack
functions
and ENTERBLOCK.
kernel
is explicit
must
of each
partitioning,
of the
to generate
ADOID,
the
on the type
the
these
on the type.
Such an inference
follows
of the axioms
defining
the extraction
SymbolTable
extractors
of
an implementation
for
functions
functions.
that
ones
E&&or
serve
NEWST,
in obtaining
a suitable
Stack
of the
that
a subset
which
the functions
defined
from
an examination
figure
(iii)
other
an implementation
Due
instances
lSlNE3LOCKTT
is
kernel
One
existing
and
types
new
functions
ISINBLOCK).
next
model
to spawn
from
LEAVEBLOCKh
RETRIEVE,
The
serve
Canstructor
functions
on
Integers.
76