CmSc 315 Programming languages

Chapter 7. Semantics

1. Introduction

Semantics: What is the meaning of a program?

Motivation: To provide an authoritative definition of the meaning of all language constructs for:
1. Programmers
2. Compiler writers
3. Standards developers

A programming language is complete only when its syntax, type system, and semantics are well-defined.

Several approaches to semantics

Grammatical models
Grammar rules are paired with semantic rules. The resulting grammars are called attribute grammars.

    Rule           Attribute
    E → E + T      value(E1) = value(E2) + value(T)
    E → T          value(E) = value(T)
    T → T x P      value(T1) = value(T2) x value(P)
    T → P          value(T) = value(P)
    P → I          value(P) = value of number I
    P → (E)        value(P) = value(E)

Where a non-terminal occurs on both sides of a rule, subscript 1 marks the occurrence on the left-hand side and subscript 2 the occurrence on the right-hand side.

Attribute grammars, which associate a set of attributes with each non-terminal in the grammar, are one of the earliest semantic models (Knuth, Donald E., Semantics of context-free languages, Mathematical Systems Theory 2(2):127-145, 1968) and are still in use. Their most beneficial feature is that they can be used for efficient automatic translation. However, attribute grammars are not sufficiently powerful to represent the entire semantics of programming languages: they are too tightly coupled with parse trees.

Operational models
Describe the meaning of the language constructs in terms of machine states, i.e. memory and register contents.

Denotational semantics
The meaning of programs is described in terms of mathematical functions on programs and program components. Programs are translated into functions, about which properties can be proved using the standard mathematical theory of functions. Based on lambda calculus. (Scott, D.S. and Strachey, C., Towards a mathematical semantics for computer languages, in Proc. Symp. Computers and Automata, pages 19-46. Polytechnic Press, NY, 1971.)
Axiomatic models
Describe the meaning of constructs as pre-conditions and post-conditions. Used in program verification.

Specification models
Describe the relationship among the various functions that implement a program, e.g. pop(push(S,x)) = S

2. Operational semantics

Sequence control: the control of the order of execution of the operations, both primitive and user-defined.

    Implicit: determined by the order of the statements in the source program or by the built-in execution model.
    Explicit: the programmer uses statements to change the order of execution (e.g. an if statement).

Levels of sequence control

    Expressions: how data are manipulated using precedence rules and parentheses.
    Statements: conditional and iteration statements change the sequential execution.
    Declarative programming: an execution model that does not depend on the order of the statements in the source program.
    Subprograms: transfer control from one program unit to another.

3. Sequencing with expressions

The issue: given a set of operations and an expression involving these operations, in what sequence are the operations performed? How is the sequence defined, and how is it represented?

An operation is defined in terms of an operator and operands. The number of operands determines the arity of the operator.

Basic sequence-control mechanism: functional composition

Given an operation with its operands, the operands may be:
    Constants
    Data objects
    Other expressions

Example: 3 * (var1 + 5)

    operation: multiplication, operator: *, arity: 2
    operand 1: constant (3)
    operand 2: operation addition
        operand 1: data object (var1)
        operand 2: constant (5)

Functional composition imposes a tree structure on the expression: there is one main operation, decomposable into an operator and operands, and each operand may itself be an operation.

In a parenthesized expression the main operation is clearly indicated. However, we may have expressions without parentheses.

Example 2: 3 * var1 + 5
Question: is this expression equivalent to the one above?
Example 3: 3 + var1 + 5
Question: is this equivalent to (3 + var1) + 5, or to 3 + (var1 + 5)?

In order to answer these questions we need to know:
    the operators' precedence
    the operators' associativity

Precedence concerns the order in which operations are applied; associativity concerns the order of operations that have the same precedence.

Precedence and associativity are fixed when the language is defined, within the semantic rules for expressions.

3.1. Arithmetic operations / expressions

In arithmetic expressions the standard precedence and associativity of operations are applied to obtain the tree structure of the expression.

Linear representations of the expression tree:

    Prefix notation     (- (+ a b) (* c d))
    Postfix notation    a b + c d * -
    Infix notation      a + b - c * d

Postfix notation is parenthesis-free. Problems arise with unary operators, for example when the same symbol, such as -, denotes both a unary and a binary operation.

Cambridge prefix notation: the operator is followed by its operands, and the whole expression is enclosed in parentheses. Used in LISP and Scheme.

There are algorithms to evaluate prefix and postfix expressions, and algorithms to convert an infix expression into prefix or postfix notation according to the operators' precedence and associativity.

History: In the 1920s, Jan Lukasiewicz (1878-1956) developed a formal logic system that allowed mathematical expressions to be specified without parentheses by placing the operators before (prefix notation) or after (postfix notation) the operands. For example, the infix expression

    (4 + 5) × 6

could be expressed in prefix notation as

    × 6 + 4 5    or    × + 4 5 6

and in postfix notation as

    4 5 + 6 ×    or    6 4 5 + ×

Prefix notation also came to be known as Polish Notation in honor of Lukasiewicz. Postfix notation became known as RPN: Reverse Polish Notation.

From these ideas, Charles Hamblin (Australian philosopher and computer scientist) developed a postfix notation for use in computers. Hamblin's work on postfix notation dates from the mid-1950s.
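The stack-based evaluation of postfix expressions mentioned above can be sketched as follows. This is a minimal illustration in Python, not an algorithm taken from the notes: the function name eval_postfix, the whitespace-separated token format, and the restriction to the four binary operators + - * / are all assumptions made for the example.

```python
import operator

# Binary operators supported by this sketch (an assumption, not a full language).
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def eval_postfix(expr):
    """Evaluate a whitespace-separated postfix expression using a stack."""
    stack = []
    for tok in expr.split():
        if tok in OPS:
            # The operand pushed last is the right operand.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            # Anything that is not an operator is treated as a number.
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]
```

With this sketch, both postfix forms of (4 + 5) × 6 given above, written with * for ×, evaluate to the same value: eval_postfix("4 5 + 6 *") and eval_postfix("6 4 5 + *") each return 54.0. No parentheses or precedence rules are needed, because the order of the tokens already fixes the order of the operations.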
Calculators, notably those from Hewlett-Packard, used various postfix formats beginning in the 1960s.

3.2. Other expressions

Languages may have some specific operations, e.g. for processing arrays and vectors, built-in or user-defined. Precedence and associativity still need to be defined, either explicitly in the language definition or implicitly in the language implementation.

3.3. Execution-time representation of expressions

    Machine code sequence
    Tree structures: software simulation
    Prefix or postfix form: requires a stack, executed by an interpreter

3.4. Evaluation of tree representation

    Eager evaluation: evaluate all operands before applying operators.
    Lazy evaluation: evaluate an operand only when its value is needed.

Problems:

    Side effects: some operations may change the operands of other operations.
    Error conditions: whether an error occurs may depend on the evaluation strategy (eager or lazy).
    Boolean expressions: results may differ depending on the evaluation strategy.

Example: if (a < b < c)
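The evaluation problems listed above can be made concrete with a small Python sketch. All function names here are hypothetical and chosen only for illustration. The sketch also assumes that the example if (a < b < c) refers to a C-like language, where the expression is grouped left to right as (a < b) < c; Python itself treats a < b < c as a chained comparison, so the C-like reading is written out explicitly.

```python
def ratio_exceeds_one_lazy(x, y):
    # `and` is evaluated lazily: x / y is computed only when y != 0 holds.
    return y != 0 and x / y > 1

def eager_and(left, right):
    # An ordinary function call evaluates both arguments before the body runs.
    return left and right

def ratio_exceeds_one_eager(x, y):
    # Both operands are evaluated first, so x / y is computed even when y == 0.
    return eager_and(y != 0, x / y > 1)

def c_style_chain(a, b, c):
    # Assumed C-like reading: (a < b) yields 0 or 1, which is then compared with c.
    return int(a < b) < c

def math_chain(a, b, c):
    # Mathematical reading: a < b and b < c.
    return a < b and b < c
```

Calling ratio_exceeds_one_lazy(4, 0) returns False, while ratio_exceeds_one_eager(4, 0) raises ZeroDivisionError: the same Boolean expression either produces a result or fails, depending on the evaluation strategy. Similarly, for a = 3, b = 2, c = 1 the C-like reading c_style_chain(3, 2, 1) is True (0 < 1), whereas math_chain(3, 2, 1) is False, which shows why the meaning of such an expression has to be fixed by the language semantics.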