The Modest Software Engineer

Martyn Thomas
Visiting Professor of Software Engineering, Oxford University Computing Laboratory
Email: mct@thomas-associates.co.uk

Abstract

Autonomous Decentralised Systems must be dependable or they will be unhelpful, at best, and may cause serious harm. Despite major advances in theoretical computer science and software engineering, industrial practice has not kept pace, and most software-intensive system development projects fail. If we are to achieve the benefits promised by new system architectures, industry will have to adopt much more rigorous software engineering practices, and universities will have to extend the amount of software engineering they teach, and to use such methods routinely on their own projects. This paper is dedicated to the great computer scientist and programmer, Edsger Dijkstra, who influenced a generation and who died last year.

Key words: autonomous decentralised systems, computer science, software engineering, theory, practice, specification, dependability, verification, education.

1 Introduction

The invention of the stored-programme computer in 1948 [1] started a revolution in human capabilities that will not be complete in our lifetime. Indeed, we are still only at the beginning, despite the remarkable progress since the first business computer ran its first program [2], half a century ago.

This conference provides a good illustration of how far we still have to go. In the words of the Call for Papers: "… possibilities and opportunities for realizing highly efficient and dependable business and control systems have been steadily increasing. Dynamically changing social and economic situations demand next-generation systems based on emerging technologies and applications. Such systems are expected to have the characteristics of living systems composed of largely autonomous and decentralized components."

That is a challenging vision, but few researchers would argue that it is impossible in principle. In practice, the greatest barriers to progress will lie in engineering problems rather than in science. They will be problems identified three decades ago and still unsolved, largely because of a failure of industrial will rather than a lack of scientific knowledge. To illustrate this thirty-year gap, I have included throughout this paper quotations from two sources: the 1969 second NATO conference on Software Engineering Techniques [3] and Edsger Dijkstra's 1972 Turing Award lecture, The Humble Programmer [4].

If you doubt that there are still major engineering problems, consider the evidence of two commercial surveys in the USA and Europe. In 1995, The Standish Group [5] reported that the average US software project overran its budgeted time by 190%, its budgeted costs by 222%, and delivered only 60% of the planned functionality. Only 16% of projects were delivered at the estimated time and cost, and 31% of projects were cancelled before delivery, with larger companies performing much worse than smaller ones. Later Standish Group surveys show an improving trend, but success rates are still low. A UK survey, published in the 2001 Annual Review of the British Computer Society [6], showed a similar picture: of more than 500 development projects, only three met the survey's criteria for success. In 2002, the annual cost of poor quality software to the US economy was estimated at $60B [7]; I imagine that the cost to the European economy is of a similar order.
I do not know the detailed sampling methods used by these surveys, nor have I seen their raw data, so I cannot comment on the statistical significance of the results. In my experience, as someone who has seen a lot of software projects and who has investigated many system failures, the survey results may exaggerate the situation somewhat, but the general state of software development in industry and commerce is depressingly amateurish. Despite this disgraceful rate of failure, a majority of customers, managers and programmers value recent experience with specific software products more highly, both as a skill and as a contributor to project success, than qualifications in computer science or software engineering or full membership of a professional institution. In other words, they value technician skills above engineering.

In 1972, Edsger Dijkstra won the Turing Award. In his Turing Lecture [4], he said: "The vision is that, well before the seventies have run to completion, we shall be able to design and implement the kind of systems that are now straining our programming ability at the expense of only a few percent in man-years of what they cost us now, and that besides that, these systems will be virtually free of bugs". He provided convincing arguments to justify the claim that this vision was technically feasible.

In November 2002, leading computer scientists in the UKCRC [8] met in Edinburgh, UK, to formulate "Grand Challenges" in computing: significant scientific and engineering goals that seem out of reach but that could credibly be achieved in ten to fifteen years. The software engineering Grand Challenge sets out to solve the problem of software dependability within fifteen years. Dijkstra would have recognised it as a modern version of his 1972 vision. Why is it still so far in the future?

2 Theory and Practice

"A study of program structure has revealed that programs—even alternative programs for the same task—can differ tremendously in their intellectual manageability. A number of rules have been discovered, violation of which will either seriously impair or totally destroy the intellectual manageability of the program." [Dijkstra, 4]

One reason for the delay in achieving Dijkstra's vision is the long-standing gulf between theory and practice. Thirty years ago, computer scientists understood the reasons why large programs are difficult to design dependably. Today, despite degree courses that have explained the problems and their solutions to a whole generation of computing undergraduates (hundreds of thousands of them), most of the software industry still resists these insights.

There has been progress, of course. Many practical aspects of software engineering have been addressed through advances in software engineering methods, tools, and processes. Configuration management and version control have been imported from the engineer's drawing office (although lamentably few software projects address these issues professionally, even today). Software development processes are defined in standards, taught to software engineers, performed and quality controlled. Occasionally, processes are measured and the results are used for process improvement, in accordance with levels 4 and 5 of the Software Engineering Institute's Capability Maturity Model [9].
In these and many other ways, great progress has been made in addressing practical issues, and the advances have been made both on practical projects and in research projects. Yet progress has not been steady and uniform, and a great gulf remains between what is known and what is done. Most software development today is carried out by staff with little or no formal education in software engineering or computer science, who do not have a good grounding in either the processes or the theories that underpin their profession. They are often managed by people who have even less understanding than their junior staff of the great complexity of the engineering tasks they are undertaking, and of the risks inherent in departing from what is known to work. The result is the high rate of project failure reported earlier, and a delivered fault density of between 6 and 30 errors in every 1000 lines of program source [10].

University researchers must share the blame, because few research groups develop their own software using the software engineering methods that they teach to their students. I have yet to find a university team following any documented and auditable software development process. In general they just hack and, although their hacking may use abstruse notations and arcane tools, their students are not fooled. The message is dangerously wrong but very clear: software engineering processes are for other people; experts don't need them.

3 Software Specification

Computer systems are among our most complex artefacts. A large computer system may occupy several hundred people for a decade, and perhaps half of this effort involves taking design decisions, from large-scale architectural decisions to the choices made during detailed programming. The design work of most other engineers is small by comparison: the projects are no larger, and design represents a much smaller proportion of the total project budget. There are also vastly more computer systems under development each year than other engineering designs, making the computer industry unique both in the scale of individual systems and in their total number. Yet, in 2003, this daunting task is still largely tackled by unqualified staff using inadequate methods and tools, at excessive cost and with a high rate of failure.

In 1969, at the time of the NATO Software Engineering conference, it was already recognised that a useful computer system was unlikely to have a fixed specification that could be determined early in development: "No matter how precisely you try to specify a system, once you have built it you find it isn't exactly what is wanted." [Oestreicher, 3]. Today, almost every computer system encapsulates one or more business processes or other assumptions about the human world. Business processes, and other human processes, change constantly, and autonomous decentralised systems, the subject of this conference, are embedded in a changing world. So the specifications of autonomous decentralised systems, and of any other complex system, are likely to change during development unless the development period is very short, and will certainly change after release to operational service. Software engineering methods must therefore support this process of change as a central part of the engineering process.
Software change is the most important step in the software lifecycle: most software costs far more after delivery than before (and most current "software maintenance" destroys more of the value in your software assets than it preserves). When requirements change, it is important to be able to make controlled changes to the specification. (In these circumstances, modifying software by going directly to the detailed design or code is vandalism.) The specification therefore needs to be expressed in such a way that the nature, scope and impact of any change can be assessed and accommodated.

The form of the specification should be such that it mirrors the real-world nature of the requirement. This principle was set out clearly (and with great humour) by Michael Jackson in the 1970s [11]. It is a matter of forming the appropriate abstractions, so that localised changes in the business involve localised changes to the system specification, and so that the abstractions are closely related to real-world objects or activities. The notation used for the specification should permit unambiguous expression of requirements, and should support rigorous analysis to uncover contradictions and omissions. This then makes it straightforward to carry out a rigorous impact analysis of any changes.

Mathematics is the branch of knowledge that deals with well-defined notations and the rules for manipulating them, and any rigorous method for specifying computer systems must have a mathematical basis. This is not a new insight: it was foreseen by Alan Turing in the 1940s, and the Vienna Method (later VDM) was already in use at the time of the 1969 NATO workshop [3]. Numerous specification methods have been developed during the past three decades, and the most successful have all had some mathematical foundations: Jackson JSP is based on formal grammars, JSD on CSP, Entity-Relationship diagrams on sets and maps, and so on. Unfortunately, the advocates of proprietary methods often claim that their methods have strengths that, on examination, prove to have no real foundations. Many systems are specified using box-and-arrow notations that have ill-defined (or no) semantics; such specifications create a dangerous sense of security in the development team (and especially amongst its managers and clients), because the notations make it easy to give the impression that you understand precisely what is required, when your understanding may well be superficial, incomplete, self-contradictory or plain wrong.

The two most important characteristics of a specification notation are that it should permit problem-oriented abstractions to be expressed, and that it should have rigorous semantics, so that specifications can be analysed for anomalies. VDM and Z are two well-known notations that have both these characteristics (a small illustrative sketch appears below). "In this connection it might be worthwhile to point out that the purpose of abstracting is not to be vague, but to create a new semantic level in which one can be absolutely precise" [Dijkstra, 4].

Unfortunately, most software specifications are still written in English or some other natural language. Such specifications rely on strong and persistent understanding between team members (and superhuman feats of analysis) if most defects in the specification are to be uncovered. Such extreme talents are not common in software teams, and most projects suffer the consequences. This insight is not new.
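To show the flavour of such notations, here is a minimal, invented sketch of a single operation specified by a precondition and a postcondition, written loosely in the style of Z (primed names denote the after-state). The operation, the state variable and all the names are hypothetical, chosen only to illustrate the idea rather than the exact concrete syntax of either notation:

\[
\begin{array}{ll}
\textit{state:} & \mathit{balance} \in \mathbb{N} \\
\textit{operation:} & \mathit{Withdraw}(\mathit{amount} \in \mathbb{N}) \\
\textit{pre:} & \mathit{amount} \le \mathit{balance} \\
\textit{post:} & \mathit{balance}' = \mathit{balance} - \mathit{amount}
\end{array}
\]

Even so small a fragment can be analysed rigorously: the precondition states exactly when the operation is defined, and the postcondition can be checked for consistency against any invariant on the state. A natural-language sentence such as "the system shall debit the account" supports no comparable analysis.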
In 1972, Dijkstra said: "we [must] confine ourselves to the design and implementation of intellectually manageable programs. … If someone fears that this restriction is so severe that we cannot live with it, I can reassure him: the class of intellectually manageable programs is still sufficiently rich to contain many very realistic programs for any problem capable of algorithmic solution." [4]. I would go further: if the requirement is not intellectually manageable then we do not know what we are trying to achieve; any project that starts in this way (and most do) is out of control from the very beginning.

The use of notations such as VDM or Z can lead to systems that have very low levels of specification defects, coupled with a low and predictable development cost. Why then are they used so rarely? The main reason appears to be that most software developers and most managers have a low level of confidence in their own mathematical abilities. I sympathise with such fears, but the mathematics involved in using VDM or Z is not deep, and although most people find it difficult to form suitable abstractions, these difficulties are inherent in the task of system specification and design, rather than in the particular notation chosen to express the abstractions.

Developing software systems is a complex engineering task, and engineering involves the application of appropriate science. For software engineers, this is computer science, and its foundations are mathematical. This is not surprising: all engineering requires mathematics, and it is common for aeronautical, civil, electronic, and other engineers to build mathematical models so that they can explore the characteristics of new systems before investing the effort to build them. Software developers need mathematics as a tool to enable them to communicate with other engineers and to give themselves the power to model and manage the complexity of real-world applications. In my opinion, a developer who cannot (or will not) use appropriate mathematics should not claim to be an engineer, software or otherwise, and is not qualified to carry out the specification or design of complex systems. For this reason, asking a firm of management consultants to specify or design a complex system (as often happens) is like asking the occupants of an airport departure lounge to specify or design an aircraft. They may have useful insights into the requirements, but turning those opinions into a specification is a task for a specialist engineer.

4 Software Dependability

Dependability includes the issues of safety, confidentiality, availability, trustworthiness, integrity and reliability. These are system issues to which software makes an essential contribution. When software fails, it does so by performing actions that were not foreseen or intended, but which result from undetected oversights or mistakes in the specification, design or programming. Mechanical systems, being far less complex, usually fail because some physical component has worn or degraded: it has undergone some physical change. Historically, system reliability analysis has focused on these unavoidable physical failures of components; design failures, being rarer and avoidable, have been individually investigated and have led to improvements in standard methods.
In contrast, all software failures are design failures (in the sense that they are incorporated in the software as built), but they are common rather than rare, because of software complexity and because of the poor skills, methods and tools that are commonly used. The techniques of traditional reliability assessment cannot be employed in the same way: it is unhelpful to ask, as people did fifteen years ago, "how reliable is an assignment statement, or a procedure call?", and then try to fit the answers into a fault tree. If a program contains a fault, it will fail whenever the input data activates the erroneous path. Because the fault is always present, nothing changes when the component fails. The program will still handle other data satisfactorily, and will fail again when the triggering conditions are repeated. The apparent randomness of software failures comes from the randomness of the operating environment, and useful forecasts of failure probability can be made if the rate of failure can be measured under representative operating conditions, and if any changes to the software can be shown not to affect the measured results significantly.

The field of software metrics has a history of measuring what can be measured easily, and of using these measurements to predict system attributes that are only loosely correlated with them. There is no reason for this to continue: progress has been made in recent years in developing experimental protocols and statistical models to assess software reliability and the relative strengths of software engineering methods.

The modern approach to system dependability involves detailed analysis of possible failure modes and their consequences, feeding back into the system specification and design, to eliminate the greatest number of possible failures from the design and to minimise the amount of critical software. The developers then use the appropriate combination of methods for fault prevention, fault detection, fault removal, fault tolerance and fault estimation, leading to a system that has been engineered to achieve the required degree of dependability and whose dependability has been analysed and documented in a dependability case. This analysis and documentation is important because, for any important application, it is not enough to develop a system that works; it is also necessary to be able to show that it works dependably enough to be released into operational service.

A dependability case should contain all the evidence that is needed to argue that the system is adequately dependable. There should be dependability targets, describing the failure modes that must be considered and the tolerable failure rate for each one. There should be a hazard analysis to support these targets, including a fault tree that shows the system, subsystem and component failures that could contribute to the identified failure modes. There should be arguments based on the architecture and design, explaining how failures are eliminated architecturally (perhaps by providing a low-complexity mechanical protection system that is known to be highly reliable), or reduced in probability, or detected and recovered from. There should be product data, showing the historic failure rates of re-used components, and the results of tests and analysis of newly developed software.
There should be process data, giving the audit trail for the development, showing that rigorous processes were followed, that the software was controlled throughout development, and that the fielded system has been built out of identifiable versions of each component, corresponding to the verification and testing data given. Finally, all this evidence should be summarised in a dependability argument. This should be underwritten by an independent authority, whose engineering credentials and experience should be given, and who should state that in their professional judgement the system meets the dependability targets and that they are willing to stake their professional reputation and accept personal liability if they have been negligent. I believe that this approach is an essential part of professional software engineering, and should exist in an appropriate form on every major software project.

It is not sufficient to rely on testing because, as was recognised 30 years ago: "Testing shows the presence, not the absence, of bugs." [Dijkstra, 3].

"One can construct convincing proofs quite readily of the ultimate futility of exhaustive testing of a program and even of testing by sampling. So how can one proceed? The role of testing, in theory, is to establish the base propositions of an inductive proof. You should convince yourself, or other people, as firmly as possible, that if the program works a certain number of times on specified data, then it will always work on any data. This can be done by an inductive approach to the proof." [Hoare, 3].

As Hoare says, testing must be combined with detailed knowledge of the properties of the software if it is to give the greatest confidence in the dependability of the system. The alternative is statistical testing under operational conditions, but this too has limitations. Even the most extensive testing can only give limited confidence in system dependability. In general, even heroic amounts of testing cannot provide justified confidence for failure rates better than about 10⁻⁴ per hour, which would need more than 10,000 hours of fault-free testing (a brief sketch of the underlying statistical argument appears below).

5 Verification

There will almost always be a gap between the functionality of any useful system and the needs of its users: because the world changes, the users' requirements are always legitimately developing, and the software always lags the real need. I call this the quality gap. Functionality is usually deficient for other reasons too: some functions will have been omitted to reduce the cost of the software or to meet delivery schedules; the software will contain unintentional deficiencies (bugs) resulting from errors in specification, design or programming; and errors will be made during modification. Software quality management involves continuous action to minimise the quality gap.

Software engineering methods and notations should be chosen so that they provide the greatest support for verification and validation, for analysing the impact of proposed changes, and for ensuring that the software and all its documentation remain consistent. Each step in the development should include an independent review, ideally involving the user. Software engineering is a team activity; all work must be open to constructive criticism and improvement. Software development is very difficult. It needs excellent people, using methods and tools that provide strong support for their work.
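The statistical-testing limit quoted at the end of the previous section can be motivated by a deliberately simplified sketch, offered as an illustration rather than a rigorous treatment. Assume, purely for illustration, that failures occur randomly at a constant unknown rate \( \lambda \) per hour under a representative operational profile. Then the probability of observing no failure in \( t \) hours of testing is

\[
P(\text{no failure in } t \text{ hours}) = e^{-\lambda t},
\]

so a failure-free test of length \( t \) justifies, with confidence \( 1-\alpha \), only claims of the form

\[
\lambda \;\le\; \frac{\ln(1/\alpha)}{t}.
\]

For a target of \( \lambda = 10^{-4} \) per hour, even the weak choice \( \alpha = e^{-1} \approx 0.37 \) requires \( t \ge 1/\lambda = 10{,}000 \) failure-free hours, and 95% confidence requires roughly 30,000 hours. The sketch ignores many practical complications, but it shows why testing alone cannot justify the far smaller failure rates demanded of the most critical systems: the additional confidence has to come from analysis of the software itself.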
One of the strongest tools is verification: checking that the output from a development stage is self-consistent, and consistent with the output from the preceding stage. We have made great progress in 30 years, although less than perhaps we should have done, and less than an observer at the 1969 Rome NATO conference would have predicted. E. S. Lowry, of IBM, remarked:

"I will start with a strong statement of opinion. I think that any significant advance in the programming art is sure to involve very extensive automated analyses of programs. … Doing thorough analyses of programs is a big job. … It requires a programming language which is susceptible to analysis. I think other programming languages will head either to the junk pile or to the repair shop for overhaul, or they will not be effective tools for the production of large programs." [3]

Thirty years later we have such tools, although the strongest of them, SPARK Ada and the SPARK Examiner [12], are not widely used outside the defence and aerospace industries. Nevertheless, SPARK shows what can be achieved when strong computer science is combined with strong software engineering. A recent paper by Peter Amey [13] shows that building software so that it is correct by construction can also cost less than traditional software development. Dijkstra knew this in 1972, although few people believed him: "Those who want really reliable software will find that they must find means of avoiding the majority of bugs to start with, and as a result the programming process will become cheaper." [4]

Yet the programming language C has conquered the world, despite its well-documented weaknesses, and the main hope to replace it seems to be Java, which is a far better language but which is winning mainly in the niche areas of Internet and mobile handset applications. Perhaps, as more and more autonomous decentralised systems are introduced, Java usage will become far wider. I fear, however, that Java is not being adopted for its stronger support for verification, but for the flexibility and power of the Java Virtual Machine. The programmers who rejoice in the weakness of type-checking in C will not become software engineers simply because Java supports better methods.

6 Software Engineering Education

"The way in which people are taught to program is abominable. … I think we will not make it possible to have a subject of software engineering until we can have some proper professional standards about how to write programs; and this has to be done by teaching people right at the beginning how to write programs properly." [Christopher Strachey, 3]

I believe that the relationship between computer science and software engineering is the same as the relationship between chemistry and chemical engineering. The chemistry graduate is educated in the principles of chemistry and the state of research; they are taught what they will need if they are to become research chemists, extending the boundaries of knowledge or advising others on the chemistry of specific compounds. The chemical engineer must be taught enough chemistry to be able to work with the research chemists. But they must also understand the principles and processes of industrial chemistry: what happens when a laboratory reaction is scaled up to industrial quantities, how to maximise yields, plant safety, quality assurance, what has worked and what has failed, standard methods, measurements, and process control.
Computer science is an important discipline, but most computing graduates will be involved in commercial or industrial software development, not in fundamental research. These graduates need to be educated as engineers, not as scientists. I have no doubt that software engineering is a discipline quite distinct from computer science. Software engineers must understand computer science, so that they can apply it when they need to. But they should also understand the principles of building systems of industrial scale: how to specify them, design them, program them, and assure them; why maintenance is harder than development and needs the best people; what architectures and designs have worked in the past, and why; how to tell a good design from a poor one; how to estimate, plan and manage a project; quality control and quality assurance; risk management; when reuse is better than innovation; when, how and why to test; how to participate effectively in reviews; what to measure and how to do process improvement; engineering ethics and the importance of safety; personal development and professionalism.

"It is clearly wrong to teach undergraduates the state of the art; one should teach them things that will still be valid in 20 years time: the fundamental concepts and the underlying principles. Anything else is dishonest." [Strachey, 3]

Software engineering, like computer science, has "fundamental concepts and underlying principles". They, too, can be taught.

7 Concluding Remarks

Software development is an engineering discipline, one of the most demanding. Yet most software developers approach these most challenging projects with bravado and ignorance. Where hubris leads, nemesis follows: the high rate of failure of computer projects is largely attributable to lack of professionalism and a wilful refusal to learn from the experience of fifty years. Industry must accept the need to change: to retool, train, and adopt rigorous engineering methods. Universities need to change too, to teach much more software engineering and to use what they teach, routinely, on their own research projects.

The advances we seek in autonomous decentralised systems cannot be made until we software and systems engineers have learnt to use the best abstractions and the best notations, to limit ourselves to problems that we can make intellectually manageable, and to learn from the past and adopt strong software engineering methods. Then, with due humility, we can call ourselves engineers.

8 References

1. The first stored-program computer, the Manchester University "Baby", ran its first program in June 1948. The designer, F. C. Williams, later wrote "… and there, in the expected place, was the expected answer. That was June 1948, and nothing was ever the same again". See Towards the 50th Anniversary of the Manchester Mark 1 Computer, Briefing Note 1, The Birth of The Baby, Manchester University Department of Computer Science, November 1995.
2. On 17 November 1951, LEO ran its first business application, providing management information for the Joe Lyons bakery in London. See LEO and the Computer Revolution, David Tresman Caminer, CCEJ, Vol. 13, No. 6, December 2002.
3. J. N. Buxton and B. Randell (eds), Software Engineering Techniques, NATO Science Committee Report, April 1970.
4. E. W. Dijkstra, The Humble Programmer, CACM, Vol. 15, No. 10, October 1972, pp. 859-866.
5. The Standish Group. http://www.standishgroup.com/chaos
6. British Computer Society, 2001. http://www.bcs.org.uk
7. The Economic Impacts of Inadequate Infrastructure for Software Testing, RTI Project Number 7007.011, Final Report, US National Institute of Standards and Technology, May 2002.
8. UK Computing Research Committee. http://www.ukcrc.org.uk/events
9. Capability Maturity Models. http://www.sei.cmu.edu/cmm/cmms/transition.html
10. Shari Lawrence Pfleeger and Les Hatton, Investigating the Influence of Formal Methods, IEEE Computer, pp. 33-42, February 1997.
11. M. A. Jackson, Principles of Program Design, Academic Press, 1975.
12. SPARK. http://www.sparkada.com
13. Peter Amey, Correctness by Construction: Better Can Also Be Cheaper, http://www.sparkada.com/downloads/Mar2002Amey.pdf