Programming without programmers: towards an industrial revolution for software?

Colin G. Johnson
University of Kent at Canterbury

October 18, 2002

The challenge presented in this document is simple to state: to make the production of software possible without any need for people to hand-write any code at all.

Careful craft versus throw-away caprice: an analogy from data mining

Programs, at present, are seen as large works of craft, often involving many people working on a single object for many months. It has been commented that a revolution in data mining occurred when it was realized that models of data could be thought of as “throw-away” things, created as and when needed, used for a particular purpose, and then discarded. Previously, models were seen as something to be built up carefully over several weeks or months of delicate work. Clearly the availability of inductive techniques such as neural networks, and of efficient algorithms for building decision trees, were important technological drivers of this change.

Might there be a similar revolution in the production of computer software? What technological requirements must be met before ordinary software can be created in such a “throw-away” fashion? Or is there something in the complexity of the problems software tackles which makes this impossible?

Towards an industrial revolution for software

The phrase “the software industry” is in common parlance. However, it might be argued that software is currently in a pre-industrial stage of development. Individual pieces of software are produced by individuals or teams working in a craft-like fashion rather than by any kind of industrial process.

Is there any meaning in the idea of an “industrial revolution” for software? After such a revolution, the vast majority of software would be produced by “software factories” (inside our own computers) which take problem descriptions and automatically output software to tackle them. Occasionally “bespoke” software might still be produced; but this would be an obscure and rather erudite affair, whose practitioners would be viewed with the same mixture of admiration and pity currently heaped upon practitioners of assembly-language programming.

What might post-industrial-revolution software look like? The remainder of this document suggests a number of directions.

Software as sculpture: clay, stone and meccano

Traditionally, programming is a process of accretion. A problem is taken into the programmer’s mind (possibly supported by notations, tools and methodologies) and the programmer creates an accretion of programming-language statements which solves that problem. This is like sculpting with clay.

Another view, which might tie in nicely with ideas of object-orientation, is that of assembly. In such a process, existing components are assembled together in ways which tackle the problem. This is like building from meccano. Nonetheless, this might still be seen as accretion at a higher level of generality.

A third attitude, rarely adopted, is that of specialization: the removal of irrelevant detail. Might there be a way of programming which works not by accretion but by removal, starting from some large, general structure and deciding what is to be removed in order to create something which solves just the problem at hand? This is like sculpting in stone. Sculptors talk of an intuitive feeling of there being a final sculpture “hidden” inside a lump of rock.
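As a toy illustration of the “stone” attitude (a minimal sketch, not drawn from any real system; all names in it are hypothetical), consider a deliberately over-general structure from which everything irrelevant to the problem at hand is deleted:

```python
# Sculpting in stone: specialization by removal, not accretion.

import math

# The "lump of rock": a deliberately over-general repertoire of operations.
GENERAL_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b,
    "pow": lambda a, b: a ** b,
    "log": lambda a, b: math.log(a, b),
}

def carve(required):
    """Specialize by removal: delete every operation the problem does not need."""
    return {name: op for name, op in GENERAL_OPS.items() if name in required}

# The "sculpture": a small calculator for one problem, obtained by deciding
# what to take away rather than by writing new code.
tiny_calculator = carve({"add", "mul"})

assert tiny_calculator["add"](2, 3) == 5
assert "div" not in tiny_calculator   # the discarded stone
```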
There are a couple of places where this attitude peeps into current computer science. One is in the application of meta-heuristics, where a general “problem-solving algorithm” (e.g. an optimization algorithm) is made specific to a particular application. There are nice analogies with the biological immune system (and with attempts to create computer systems inspired by it) too: the body learns what is self (so that the immune system does not attack the body) by a process of “negative selection”, whereby self-matching objects are deleted, leaving a repertoire of objects which match everything other than what has been deleted.

The second area is the partial evaluation of programs, where a general program is made more specific by specializing it on a certain subset of its input data, or by fixing certain internal parameters. It is perhaps bizarre that these two extreme ends of the computer science spectrum have produced almost the same solution.

A fourth attitude takes us outside the sculpture analogy: fashioning software via the distortion of prototypes rather than the specialization of idealized programs. This might be likened to distorted photography, or a hall of mirrors.

Do any of these approaches support the industrial-scale production of software? Can specialization produce anything more profound than the “personalization” which current industrial processes offer (“any colour as long as it is black”; “the product can be personalized with up to three initials” (26³ possibilities seems rather thin))?

Inductive and deductive program creation

We might want to say that there are two main ways in which the goal of eliminating programmers from the programming process might be achieved. First, there is a deductive approach, in which programming vanishes because the process of going from requirements (“what”) to concrete statements of “how” becomes automated. Second, there is an inductive approach, which automates the process of going from an “oversimple”, “incorrect” or “partial” program to one which is more comprehensive, correct or complete, using some measure to decide when an automated process of moving through program space is “going in the right direction”.

Prototypes for both of these already exist. For the deductive approach, the process of refining a formal specification into a program is a prototype of the kind of process which could eliminate the need for programmers. For the inductive approach, search-based program creation methods such as genetic programming (the use of genetic algorithms to search a program space) provide a similar prototype.

Heuristics and theory are part of the same discipline

It is interesting that both very “high church” computer science (such as the theory of program transformation) and the “suck it and see” opposite extreme (e.g. the pragmatic end of genetic programming) have a similar attitude towards program code. At both of these extremes, code is seen as malleable stuff, capable of being transformed and combined under a certain set of rules and working fine at the end. By contrast, programmers could be said to take a very brittle view of code: it is common to read, in commentaries on the programming process, of bugs as small, well-hidden things, with the implication that small changes to code can take it from being a stable, working thing to something completely broken. The former attitude suggests an area of commonality where previously there was none.
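To give a flavour of the “suck it and see” extreme, here is a deliberately stripped-down sketch of search-based program creation in the spirit of genetic programming: programs are expression trees, treated as malleable stuff and transformed by random mutation, with an error measure deciding when the search is “going in the right direction”. It is an illustrative toy rather than a real GP system (mutation and truncation selection only; genuine genetic programming also recombines trees by crossover), and every name in it is hypothetical.

```python
# A toy search through program space, in the spirit of genetic programming.

import random
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1, 2]

def random_tree(depth=3):
    """Grow a random expression tree over OPS and TERMINALS."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    return [random.choice(list(OPS)), random_tree(depth - 1), random_tree(depth - 1)]

def evaluate(tree, x):
    """Interpret a tree as a function of x."""
    if tree == "x":
        return x
    if not isinstance(tree, list):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def error(tree, target=lambda x: x * x + x + 1):
    """The measure of 'going in the right direction': lower is better."""
    return sum(abs(evaluate(tree, x) - target(x)) for x in range(-5, 6))

def mutate(tree):
    """Code as malleable stuff: splice a fresh random subtree in somewhere."""
    if not isinstance(tree, list) or random.random() < 0.3:
        return random_tree(2)
    op, left, right = tree
    if random.random() < 0.5:
        return [op, mutate(left), right]
    return [op, left, mutate(right)]

# Evolve: keep the fittest programs, perturb them, repeat.
population = [random_tree() for _ in range(50)]
for generation in range(40):
    population.sort(key=error)
    survivors = population[:10]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(40)]

best = min(population, key=error)
print(best, error(best))
```

Even this crude search can often recover a small target polynomial; the point is that at no stage does anyone hand-edit the program text.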
It seems feasible (with care) for heuristics and theory to co-exist, without the heuristics breaking the rigour of the theory and without the theory constraining the search capacity of the heuristics.

“It’s déjà vu all over again.”

It is interesting to note that during the early development of what we now call programming languages, such languages were occasionally referred to by phrases such as “automated program production systems”. The view (this is a deliberately facetious exaggeration) was that once high-level languages were available, the problems of programming would be eliminated; you would just “tell the computer what to do” in some approximation to natural language. Of course, things turned out not to be so simple. Perhaps what is being proposed here is similar. There is an important difference, however: in the ideas proposed above there is a clear change of mode from “how the computer should do it” to “what the computer should do”. This may also be the biggest threat to the achievement of this challenge: perhaps it is just as hard to write an accurate description of desired functionality as it is to describe how to achieve that functionality?
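To make that closing question concrete: the “what” and the “how” are genuinely different artefacts, and it is not obvious that the first is easier to write precisely than the second. A minimal, hypothetical sketch:

```python
# The "what" versus the "how", in miniature. Hypothetical names; illustrative only.

from collections import Counter

def satisfies_spec(inp, out):
    """The 'what': the output is the input rearranged into non-decreasing order."""
    return (Counter(inp) == Counter(out)
            and all(a <= b for a, b in zip(out, out[1:])))

def insertion_sort(xs):
    """One 'how' among many that a software factory might emit against the spec."""
    result = []
    for x in xs:
        i = len(result)
        while i > 0 and result[i - 1] > x:
            i -= 1
        result.insert(i, x)
    return result

data = [3, 1, 2, 1]
assert satisfies_spec(data, insertion_sort(data))
```

Nothing in satisfies_spec mentions loops, insertions or indices; whether such descriptions stay this much simpler than implementations for realistic problems is exactly the open question above.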