Semantics

advertisement

Programming Languages

2nd edition

Tucker and Noonan

Chapter 7

Semantics

Surely all this is not without meaning.

Ishmael, Moby Dick by Herman Melville

Copyright © 2006 The McGraw-Hill Companies, Inc.

Contents

7.1 Motivation

7.2 Expression Semantics

7.3 Program State

7.4 Assignment Semantics

7.5 Control Flow Semantics

Copyright © 2006 The McGraw-Hill Companies, Inc.

Semantics

Semantics is a precise definition of the meaning of a program that is syntactically correct and valid according to the type rules.

Semantics describes run-time behavior

A programming language is complete only when its syntax, type system, and semantics are well-defined.

Copyright © 2006 The McGraw-Hill Companies, Inc.

7.1 Motivation (for Semantics)

To provide an authoritative definition of the meaning of all language constructs for:

1. Programmers

2. Compiler writers

3. Standards developers

A program written in a given language must always produce the same results when given the same data, regardless of the compiler or computer.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Ways to Define Meaning:

• Operational Semantics : What happens when a program is run on a machine – how statements change the machine’s state.

Early languages were designed for a specific computer; e.g, Fortran on IBM 709; today, semantics are general

Axiomatic semantics

– Chapter 18

Abstract, based on formal logic

Denotational semantics: Statements as program state-transforming functions – (Chapter 8)

Rigorous, mathematical, defines functions that specify exactly how state is affected by each language object.

Copyright © 2006 The McGraw-Hill Companies, Inc.

7.2 Expression Semantics – how to interpret the meaning of an expression

Copyright © 2006 The McGraw-Hill Companies, Inc.

Semantic Issues

Notation (infix, Polish, etc.)

Precedence/associativity rules (if not completely specified in the grammar)

Special issues short-circuit evaluation representational issues subexpression evaluation and side effects

Copyright © 2006 The McGraw-Hill Companies, Inc.

Expression Notation – Infix

• Infix : a binary operator is between its two operands.

• What is the meaning of a + b – c * d

Is it (((a + b)- c)* d) ?

Or ( a + ( b – c) * d) ? Or …?

• Without knowledge of the language, the expression is ambiguous

• Languages that use infix may also use associativity and precedence rules to disambiguate.

Parentheses can be used to override normal evaluation order to achieve the desired meaning.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Expression Notation

Four ways to write the expression for the tree on the earlier slide

• Infix : (a + b) - (c * d)

• Polish Prefix : - + a b * c d

• Polish Postfix : a b + c d * -

• Cambridge Polish : (- (+ a b) (* c d))

Identify the subexpressions.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Expression Notation – Polish

• Completely specifying evaluation order leads to large grammars; one alternative: use an un ambiguous notation

Polish prefix: a binary operator precedes its operands

Polish Prefix : - + a b * c d

Polish postfix: a binary operator follows its operands

Polish Postfix : a b + c d * -

• Preorder and postorder traversals of the parse tree generate the Polish notations.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Pre-order: visit the root, visit the left subtree, visit the right subtree

Post-order: visit the left subtree, visit the right subtree, visit the root

Copyright © 2006 The McGraw-Hill Companies, Inc.

Expression Notation - Polish

• If all operators are binary, this notation eliminates the need for parentheses, precedence rules, or associativity rules

– a*b*c : **abc (left associativity) or * a*bc (right)

– a*b*c : abc** or ab*c* (which is left associativity?)

• Flaw: what about unary operators?

- a + b * c (+-a*bc won’t work)

One solution: Use a different symbol (e.g. ‘~’) for the unary operator

Copyright © 2006 The McGraw-Hill Companies, Inc.

Expression Notation – Cambride prefix

• Polish prefix and postfix fail if operators aren’t binary

- a + b * c (+-a*bc won’t work)

• Solution 1: use different symbol for unary operators

• Solution 2: Cambridge prefix - operators can be n-ary:

Rewrite a + b – c * d as (-(+ a b)(* c d))

Rewrite a + b + c + d as (+ a b c d)

Rewrite -a + b * c as (+(-a)(*b c)

Lisp and Scheme use Cambridge prefix.

Operators precede operands, fully parenthesized.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Associativity of Operators – Table 7.1

Language + - * / Unary - ** == != < ...

C-like

Ada

Fortran

L

L

L

R non

R non

R

L non

L

In some languages the meaning of a < b < c is not the same as the meaning of (a < b) && (b < c)

Argues for non-associativity of logical operators

Copyright © 2006 The McGraw-Hill Companies, Inc.

Precedence of Operators – Table 7.2

Operators C-like

Unary 7

**

* / 6

+ -

== !=

< <= ...

not (!) log. and 2 log. or 1

3

7

5

4

Ada Fortran

2

2

3

2

1

1

3

5

4

3

5

4

3

2

2

2

1

1

Copyright © 2006 The McGraw-Hill Companies, Inc.

Other Languages

Smalltalk: left-to-right evaluation, uniform precedence; left associative

APL: right-to-left evaluation, uniform precedence, right associative

APL example:

The following expression sorts a word list stored in matrix X according to word length:

X[ ⍋ X+.≠' ';]

Copyright © 2006 The McGraw-Hill Companies, Inc.

Summary

Grammars can capture some semantic features, notably operator precedence and associativity

To simplify the grammar, languages with many operators often write an ambiguous grammar and use tables such as those in the previous examples during translation.

Copyright © 2006 The McGraw-Hill Companies, Inc.

7.2.3 Short Circuit Evaluation

Short-circuit evaluation starts evaluating from the left and stops when the expression value is known

So a && b is evaluated as: if a then b else false

Or if (!a) false else B

Copyright © 2006 The McGraw-Hill Companies, Inc.

7.2.3 Short Circuit Evaluation

Likewise a || b is evaluated as: if a then true else b

One reason for short-circuit is slightly more efficient.

Another reason: allows for situations where the second operand might be undefined.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Example

Node p = head; while (p != null && p.info

!= key) p = p.next; if (p == null) // not in list

...

else // found it

// without short-circuiting, what happens if p is null?

Copyright © 2006 The McGraw-Hill Companies, Inc.

Without Short-circuit Evaluation:

} boolean found = false; while (p != null && !found) { if (p.info == key) found = true; else p = p.next;

Copyright © 2006 The McGraw-Hill Companies, Inc.

7.2.4: Expression Meaning

If an expression has been statically type checked its meaning should depend only on sub-expression values and operator meaning.

But …

• Number representation (finite-ness)

• Side effects

Copyright © 2006 The McGraw-Hill Companies, Inc.

Number Representation

In computers, unlike the real world, numbers are finite: there is a largest and a smallest.

Associativity does not always hold.

Is (a + b) + c = a + (b + c) ?

What if a = IntMax , b = 3 , c = -5 ?

See example for floats in a previous lecture.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Side Effects – of Functions or Expressions

“A function or expression is said to have a side effect if, in addition to producing a value, it also modifies some state or has an observable interaction with calling functions or the outside world.

For example, a function might modify a global or a static variable , modify one of its arguments, raise an exception, write data to a display or file, read data, or call other side-effecting functions.” http://en.wikipedia.org/wiki/Side_effect_(computer_science)

Copyright © 2006 The McGraw-Hill Companies, Inc.

Side Effects – of Functions or Expressions

• Side effects are not necessarily bad

– They are how void functions accomplish their job

• However, sometimes they have unexpected consequences, particularly if there are language issues that are semantically undefined.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Side Effects & SubExpression Evaluation

What is the value of a = b * i++ + c * i; when i = 2; b = 2; c = 5;

Order of evaluation for subexpressions is not defined in most languages: ( b * i++) may be evaluated before or after ( c * i)

In C-like languages, a is semantically undefined.

Or what about the expression y + f(x, y); if f modifies the value of y

Copyright © 2006 The McGraw-Hill Companies, Inc.

7.3 Program State

“The state of a program is the binding of all active

[alive] objects to their current values.” – page 160

State-related maps:

1. The pairing of active objects with specific memory locations ( environment )

2. The pairing of active memory locations with their current values ( memory map )

Copyright © 2006 The McGraw-Hill Companies, Inc.

Example

Given a program with two variables, i and j , currently associated with memory addresses 154 and 155 and having values 13 and -1: environment = { < i , 154>, < j , 155>} memory = {<0, undef

>, …, <154, 13 >, <155, -1 >, …} program state = memory X environment

= {< i, 13 >, < j, -1 >}

Copyright © 2006 The McGraw-Hill Companies, Inc.

State Transitions

The individual steps that occur during a program run can be viewed as a series of state transformations.

For example, in the previous case, executing statement i = i + 2 * j produces this state: {< i,

11 >, < j, -1 >}

A program trace (similar to what could be provided by a debugger) shows changes in program state.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Figure 7.2 Factorial Program

// compute the factorial of n

1 void main ( ) {

2 int n, i, f;

3 n = 3;

4 i = 1;

5 f = 1;

6 while (i < n) {

7 i = i + 1;

8

9 } f = f * i;

10 }

Copyright © 2006 The McGraw-Hill Companies, Inc.

// compute the factorial of n

1 void main ( ) {

2 int n, i, f;

3 n = 3;

4 i = 1;

5 f = 1;

6 while(i < n) {

7 i = i + 1;

8 f = f * i;

9 }

10 } n i f undef? undef undef

3 undef undef

Copyright © 2006 The McGraw-Hill Companies, Inc.

// compute the factorial of n

1 void main( ) {

2 int n, i, f;

3 n = 3;

4 i = 1;

5 f = 1;

6 while(i < n) {

7 i = i + 1;

8 f = f * i;

9 }

10 } n i f

3

3 undef undef

1 undef

Copyright © 2006 The McGraw-Hill Companies, Inc.

// compute the factorial of n

1 void main( ) {

2 int n, i, f;

3 n = 3;

4 i = 1;

7

8

5 f = 1;

6 while(i < n) { i = i + 1; f = f * i;

9 }

10 } n

3

3

3

Copyright © 2006 The McGraw-Hill Companies, Inc.

i f undef undef

1 undef

1 1

// compute the factorial of n

1 void main( ) {

2 int n, i, f;

3 n = 3;

4 i = 1;

7

8

5 f = 1;

6 while(i < n) { i = i + 1; f = f * i;

9 }

10 } n

3

3

3

3

Copyright © 2006 The McGraw-Hill Companies, Inc.

i f

1

1 undef undef

1 undef

1

1 (no change)

// compute the factorial of n

1 void main( ) {

2 int n, i, f;

3 n = 3;

4 i = 1;

7

8

5 f = 1;

6 while(i < n) { i = i + 1 f = f * i;

;

9 }

10 } n

3

3

Copyright © 2006 The McGraw-Hill Companies, Inc.

i

1

2 f

1

1

// compute the factorial of n

1 void main( ) {

2 int n, i, f;

3 n = 3;

4 i = 1;

7

8

5 f = 1;

6 while(i < n) { i = i + 1; f = f * i ;

9 }

10 } n

3

Copyright © 2006 The McGraw-Hill Companies, Inc.

i

2 f

3

Discussion

(In C/C++ undefined variables are initialized according to their types but not in every language.)

Loop test – no side effects

Assignment statements – have side effects

T he program state is altered.

Order of sub-expression evaluation will be an issue if the sub-expressions have side effects.

Copyright © 2006 The McGraw-Hill Companies, Inc.

7.4 Assignment Semantics

Abstract syntax for Clite assigment statement

Assignment = Variable target : Expression source

Semantic interpretation :

• Evaluate the source expression, using current state

• Get value

• Replace value of target variable with value, getting a new state.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Meaning Rules (Chapter 8)

Meaning rules map states to states

Meaning Rule 8.3 (page 202) The meaning of an assignment statement is the result of replacing the value of the target Variable by the value of the source Expression in the current state.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Denotational Semantics (Chapter 8)

Assignment = Variable target : Expression source

M: Assignment X State → State

M(Assignment a, State state )

= state

ט

{<a.target, M(a.source, state)>}

Loose interpretation: the meaning of an assignment a given the state state is the state in which the only change is that the target variable of a has a new value which is defined as the meaning of the source expression.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Assignment Statement Issues

Issues

• Multiple assignment

• Assignment statement vs. expression

• Copy vs. reference semantics

Copyright © 2006 The McGraw-Hill Companies, Inc.

Multiple Assignment

Example: a = b = c = 0;

Sets all 3 variables to zero.

a = b = c;

Associativity???

Copyright © 2006 The McGraw-Hill Companies, Inc.

Assignment Statement vs. A. Expression

In most languages, assignment is a statement; cannot appear in an expression.

In most C-like languages, assignment is an expression.

• while (*p++ = *q++) ; // strcpy

• while (ch = getc(fptr)) ... // ???

• while (p = p->next) ... // ???

Expression value: the value assigned to the target var.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Ass’t Statement vs. Ass’t. Expression

if (a = 0) ...

// an error ???

The meaning (semantics) of an assignment statement is not the same as the meaning of an assignment expression.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Copy vs. Reference Semantics

• Copy: a = b;

– a, b have same value.

– Changes to either have no effect on other.

– Used in imperative languages.

• Reference

– a, b point to the same object.

– A change in object state affects both

– Used by many object-oriented languages.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Turing Complete Languages

Textbook definition: A programming language is

Turing complete if its programs are capable of computing any computable functions.

Or… if it can compute the same functions that a

Turing machine can compute

Turing machine = a simple, theoretical model of a computer; can recognize all languages described by any grammar (including the unrestricted grammars)

Importance: in theoretical CS – decidability issues

Copyright © 2006 The McGraw-Hill Companies, Inc.

7.5 Control Flow Semantics

To be Turing complete, an imperative language needs:

• Statement sequencing

• Conditional statement

• Looping statement

Bohm-Jacopini Theorem

Copyright © 2006 The McGraw-Hill Companies, Inc.

Sequence

s1 s2

Semantics: in the absence of a branching statement,

• First execute s1

• Then execute s2

• Output state of s1 becomes the input state of s2

Branching statements: return , break , continue , goto,

Copyright © 2006 The McGraw-Hill Companies, Inc.

Conditional

IfStatement

 if ( Expresion ) Statement

[ else Statement ]

Example:

Meaning: if (a > b) z = a; else z = b;

If the test expression is true, then the output state of the conditional is the output state of the then branch, else the output state of the conditional is the output state of the else branch.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Conditional

Switch (Case) Statement

Originated in Fortran as a computed goto: goto (100, 200, 300, 400), i

Some issues for a switch statement:

• Data type of switch variable – some integral type

• Whether or not to require a break after each case

– Use of implicit break (Ada) is less error prone

• Whether or not to allow a range of values

Copyright © 2006 The McGraw-Hill Companies, Inc.

Loops

WhileStatement

 while ( Expression ) Statement

The expression is evaluated.

If it is true, first the statement is executed, and then the loop statement is executed again.

Otherwise the loop terminates.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Loop Variations

• Begin-test loop (e.g. while )

• End-test loop (e.g. do_while )

• Counting loop (e.g. for )

• Iterator loops (e.g., for-each )

– An iterator is any finite set of values over which a loop can be repeated

– foreach $number (@ number) {

$number = $number + 1;

}

Copyright © 2006 The McGraw-Hill Companies, Inc.

Loop Variations

• foreach loops exist in several languages (e.g., Ada, C#)

• Simple form of a for loop in which there is no explicit

LCV, no requirement to specify increment

• The assumption is that every item in the collection will be processed in order.

– Another Perl example

– foreach $key (sort keys %list) { print $key, “\t”, $list{$key},“\n”;

}

Copyright © 2006 The McGraw-Hill Companies, Inc.

Branching Statements & Loop Control

return: exit from a function break : non-standard exit from a loop continue : non-standard exit from current loop iteration goto : transfer of control to any named program statement

Structured Programming: In part, advocates program design based on sequence, conditional, while loops only – no goto.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Spaghetti code?

j = 0 ct = 0 sum = 0.0

100 if (j .ge. n) goto 300 j = j + 1 if (a(j) .lt. 0) goto 100 ct = ct + 1 sum = sum + a(j) goto 100

300 continue

Copyright © 2006 The McGraw-Hill Companies, Inc.

Summary

Semantics = program meaning i.e

., how the program’s state changes as it executes

This chapter presents an informal discussion expressions, assignments, conditionals, branching, looping statements (skipped I/O and exceptions)

Chapter 8 shows how the semantics of a language can be defined using a set of meaning functions; a formal statement of the semantics of various program constructs.

Copyright © 2006 The McGraw-Hill Companies, Inc.

Examples of Meaning Functions for Clite

The meaning of a Program is … the meaning of its body when given an initial state consisting of the variables of the decpart , each initialized to the undef value corresponding to its type.

The meaning of an assignment statement is the result of replacing the value of the target Variable by the value of the source Expression in the current state.

Copyright © 2006 The McGraw-Hill Companies, Inc.

7.6 Input/Output Semantics

• Background Concepts

– open , close

• open binds a file to a file id, sets up tables, etc.

Access: sequential vs. random

• Sequential access reflects order of data in file; random access allows elements to be accessed out of order, multiple times

Stream vs. fixed length records

• Data is viewed as either a stream of characters or a series of fixed length records

Copyright © 2006 The McGraw-Hill Companies, Inc.

Standard Files

• Used to represent keyboard, screen display, and a standard error file to store run-time errors

– Unix: stdin, stdout, stderr

– C: stdin, stdout, stderr

– C++: cin, cout, cerr

– Java: System.in, System.out, System.err

Copyright © 2006 The McGraw-Hill Companies, Inc.

Unformatted vs Formatted I/O

Unformatted I/O corresponds to what C++ and Java refer to as streams, and Fortran calls list-directed .

Characters are read or written as a stream; input routines group them into tokens and convert to an internal data type based on the next list item

C++ Example : cin >> x >> y >> p;

Fortran Example : READ(*,*) x, y, p

Copyright © 2006 The McGraw-Hill Companies, Inc.

Unformatted Input/Output Streams

• Fortran code to read data into an array using list integer :: i, a(8) write(*,*) “Enter 8 integers: “ read(*,*) a write(*,*) a

• Java stream types for sequential I/O:

– file: transfers data to or from a file

– pipe: transfers data to another stream

– memory: transfer data between an array and the program

Copyright © 2006 The McGraw-Hill Companies, Inc.

Formatted I/O

Requires the programmer to write descriptions of the input/output format

C: fscanf(input, “%d”, &a[i]); hybrid; permits arbitrary number of blanks

Fortran syntax for formatted reads and writes:

READ (file-unit,format-expr)var-list)

WRITE (file-unit,format-expr)var-list)

Fortran doesn’t automatically skip blanks

Copyright © 2006 The McGraw-Hill Companies, Inc.

Formatted I/O

WRITE (*, 120) a

120 FORMAT ( ‘Contents of array a: ‘, 8I5)

Position 1 in an output format statement controls line feeds

Blank indicates single space, ‘1’ (?) double space, anything else is a form feed

Copyright © 2006 The McGraw-Hill Companies, Inc.

Formats

• C

– Codes: d, e, f, c, s (decimal. float, float, char, string)

– Specifier: % opt-width code

– Ex: %s %5d %20s %8.2f

• Fortran

– Codes: i, f, a (integer, float, string)

– Specifier: op-repeat code width

– Ex: 8i4, f8.2, a20

Copyright © 2006 The McGraw-Hill Companies, Inc.

Download