Chapter 3 - Language Design Principles
Programming Languages: Principles and Practice, 2nd Ed.
Louden, 2003

The language design problem
Language design is difficult, and success is hard to predict:
– Pascal a success, Modula-2 a failure
– Algol60 a success, Algol68 a failure
– FORTRAN a success, PL/I a failure
Nevertheless, there have been some basic goals or principles that have been important over the years, and that can contribute to success.
Chapter 3 K. Louden, Programming Languages

Efficiency
The “first” goal (FORTRAN): execution efficiency.
Still an important goal in some settings (C++, C).
Many other criteria can be interpreted from the point of view of efficiency:
– programming efficiency: writability, reliability (security).
– maintenance efficiency: readability.

Features that aid efficiency of execution
Static data types allow efficient allocation and access.
Manual memory management avoids overhead of “garbage collection”.
Simple semantics allow for simple structure of running programs (simple environments - Chapter 8).

Features that aid other design goals (note efficiency conflicts):
Writability, expressiveness: no static data types (variables can hold anything, no need for type declarations).
Reliability, writability, readability: automatic memory management (no need for pointers).
Expressiveness, writability, readability: more complex semantics, allowing greater abstraction.

Internal consistency of a language design: Regularity
Regularity is a measure of how well a language integrates its features, so that there are no unusual restrictions, interactions, or behavior.
Regularity issues can often be placed in subcategories:
– Generality: are constructs general enough? (Or too general?)
– Orthogonality: are there strange interactions?
– Uniformity: Do similar things look the same, and do different things look different?

Regularity examples from C
Functions are not general: there are no local functions (simplicity of environment).
Declarations are not uniform: data declarations must be followed by a semicolon, function definitions must not.
Parameters are not orthogonal with data type: arrays are references, other parameters are copies.

What about Java?
Are function declarations non-general?
– There are no functions, so a non-issue. (Well, what about static methods?)
Are class declarations non-general?
– No multiple inheritance (but there is a reason: complexity of environment).
– Java has a good replacement: interface inheritance.
Do declarations require semicolons?
– Local variables do, but is that an issue? (Not really - they look like statements.)

Java regularity, continued
Are some parameters references, others not?
– Yes: objects are references, simple data are copies.
– This is a result of the non-uniformity of data in Java, in which not every piece of data is an object.
– The reason is efficiency: simple data have fast access.
What is the worst non-regularity in Java?
– My vote: arrays. But there are excuses.

Other design principles
Simplicity: make things as simple as possible, but not simpler. (Pascal, C)
Expressiveness: make it possible to express conceptual abstractions directly and simply. (Scheme)
Extensibility: allow the programmer to extend the language in various ways. (Scheme, C++)
Security: programs cannot do unexpected damage. (Java)

Other design principles (cont.)
Preciseness: having a definition that can answer programmers' and implementors' questions.
(Most languages today, but only one has a mathematical definition: ML)
Machine-independence: should run the same on any machine. (Java)
Consistent with accepted notations. (Most languages today, but not Smalltalk & Perl)
Restrictability: a programmer can program effectively in a subset of the full language. (C++: avoids runtime penalties)

C++ case study
Thanks to Bjarne Stroustrup, C++ is not only a great success story, but also the best-documented language development effort in history:
– 1997: The C++ Programming Language, 3rd Edition (Addison-Wesley).
– 1994: The Design and Evolution of C++ (Addison-Wesley).
– 1993: A History of C++ 1979-1991, SIGPLAN Notices 28(3).

Major C++ design goals
OO features: class, inheritance
Strong type checking for better compile-time debugging
Efficient execution
Portable
Easy to implement
Good interfaces with other tools

Supplemental C++ design goals
C compatibility (but not an absolute goal: no gratuitous incompatibilities)
Incremental development based on experience.
No runtime penalty for unused features.
Multiparadigm
Stronger type checking than C
Learnable in stages
Compatibility with other languages and systems

C++ design errors
Too big?
– C++ programs can be hard to understand and debug
– Not easy to implement
– Defended by Stroustrup: multiparadigm features are worthwhile
No standard library until late (and even then lacking major features)
– Stroustrup agrees this has been a major problem

DUPLICATE MATERIAL?

What makes a good language?
It is almost impossible to get two CS types to agree on what is most important.
Important features of the language include:

1. Readability - important because ease of maintenance is greatly influenced by readability.
– Restricting identifiers to a short length hurts readability.

2. Writability
– Trade-off: readability and writability

3. Clarity, simplicity and unity of language concept
– Language is an aid to the programmer.
– Conceptual integrity: a minimum number of different concepts, with simple rules.
– Programmers who must use a large language sometimes tend to learn a subset of the language and ignore its other features. This is used to justify the large number of language components, but readability problems occur when someone must read code written in another programmer's subset of the language.
– Having more than one way to accomplish the same thing (multiplicity) is a detriment to simplicity. (In C, c = c+1; c++; ++c; and c += 1; all have the same effect when used as standalone statements.)
– Operator overloading (a single operator has more than one meaning) is another problem. (+: integer add, float add, set addition, etc.)
– Sometimes statements are TOO simple - no complex control structure - and thus hard to read.
– Control statements: sequence, selection, iteration, recursion.
– Data structures: the presence of adequate facilities for defining data types and data structures is a significant aid to readability.

4. Orthogonality: a relatively small set of primitive constructs can be combined in a relatively small number of ways. Every possible combination is legal.
– For example, in IBM assembly language there are different instructions for adding memory to register or register to register (non-orthogonal). In Vax, a single add instruction can have arbitrary operands.
– Closely related to simplicity - the more orthogonal, the fewer rules to remember.
For examples of non-orthogonality consider C++:
– We can convert from integer to float simply by assigning an integer value to a float variable, but the reverse assignment does not convert cleanly: it discards the fractional part.
– We can use a derived class instance in place of a parent class instance, but not vice versa.
– A switch statement works with integers, characters, or enumerated types, but not doubles.
– Arrays are passed by reference while integers are passed by value.
Too much orthogonality can also be a problem: it sometimes produces unnatural or extremely complex results. When any combination is legal, errors in writing programs can go undetected, and programmers can accidentally use features they do not know.
Functional languages (so named because computations are made primarily by applying functions to given parameters; Lisp and Scheme are examples) are completely orthogonal and very simple, as they have a single control construct: the function. In contrast, an imperative language (like C++) has computations specified by variables and assignment statements.

5. Naturalness for Application - reason for proliferation of languages
Ease of use: readable
– Cryptic programs may be easy to write - but impossible to read.
What makes a program difficult to read?
Form of special words
– begin-end or {} suffer in that all constructs are terminated the same way.
– Also, if special words can also be used as variable names, the result can be very confusing.
Different meanings LOOK different
– In Snobol a single blank can alter the statement's meaning (here _ marks a blank):
    X_Y_=_Z   vs   XY_=_Z
  The latter is assignment of Z to XY. The former means the pattern Y is looked for in X and, if found, is replaced in X by Z.
    X = 'A GOOD EXAMPLE'
    Y = 'GOOD'
    Z = 'BAD'
    XY = Z     gives XY = 'BAD' (variable XY is assigned the string)
    X Y = Z    gives X = 'A BAD EXAMPLE'
Meaning of statement is obvious from form
– Grep in Unix can be deciphered only through prior knowledge. Even the name grep is hard to remember: it comes from g/regular expression/p, where the leading g means "global" and the trailing p means "print".
Reflects logical structure of program
– Will be easier to write - won't have to redesign logic to code.

6.
Support for Abstraction (Extension)
Abstraction means that complicated structures can be stated in simple ways by ignoring many of the details.
The goal is to allow data structures, types, and operations to be defined and maintained as self-contained abstractions, so that the programmer may use them in other parts of the program knowing only their abstract properties.
– Procedures and functions are a start in this direction.
– User-defined types are a start.
– Black box vs. clear box

7. Ease of program verification - External Support
– Formal (proofs of correctness) OR informal (testing)
– Debugging aids
– Test facilities
– Structure of language - proofs

8. Programming environment
– implementation
– documentation
– editors
– testing packages

9. Portability (transportability)

10. Reliability - affected by readability and writability
– Type checking: testing for type compatibility. Important for reliability because run-time checking is expensive; if checking cannot be done at compile time, the user may suspend it, with dangerous results. Trade-off: reliability and cost of execution.
– Exception handling: the ability of a program to intercept run-time errors, take corrective measures, and continue is a great aid to reliability.
– Aliasing: two distinct referencing methods or names for the same cell. May be too dangerous to justify.

11. Cost
– training programmers to use the language
– writing/testing: programmer time is very important now
– translation: not very important except in an educational environment
– execution: may not be important in some applications
– cost of the language implementation system
– cost of poor reliability
– maintenance: the largest cost if the program is used for several years; includes errors in the original, changes caused by hardware or OS changes, and extensions and enhancements to the original product
Cost depends on a number of language features, but primarily on readability.
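
The aliasing hazard listed under Reliability (two names for the same cell) can be illustrated with a short sketch. This is an illustration only, written in Python rather than in one of the languages the slides discuss; the names append_total, scores, report, original, and independent are hypothetical and do not come from the slides.

```python
import copy


def append_total(values):
    """Append the sum of `values` to that same list object (mutates the argument)."""
    values.append(sum(values))
    return values


# Two names for the same cell: `report` is an alias of `scores`, not a copy.
scores = [10, 20, 30]
report = scores
append_total(report)   # the change is visible through `scores` as well

# Breaking the alias with an explicit copy keeps the original untouched.
original = [10, 20, 30]
independent = copy.copy(original)
append_total(independent)   # only `independent` changes
```

A reader of the code that uses scores has no local hint that a call made through report changed it, which is the reliability concern the slide raises.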