Chapter 1 - Introduction

For Java (and most other languages we will cover), you are given a ticket.
We can’t possibly cover everything you need to know, but we have allowed
you admittance to the “show”.
You know the debugger. You can run hello world. You understand key
differences. You know where the documentation is. What you do with your
admission ticket is up to you!
Chapter 1
Louden, Programming Languages
1
Chapter 1 - Introduction
Programming Languages:
Principles and Practice, 2nd Ed.
© Kenneth C. Louden, 2003 Adapted by Vicki Allan 2006
Course Motivation

Why are programming languages the
way they are?

How are particular language features
implemented/supported?
Course Motivation cont…

• understand the underlying ideas of the main programming paradigms
• know more about the huge variety of programming languages
• understand how the syntax and semantics of languages can be defined precisely
• know how important features are supported
• have a deeper understanding of the history and rationale behind languages like C++
Relationship between thought and
language.

• The Sapir-Whorf hypothesis in linguistics states that the structure of one's mother tongue influences the way one's mind perceives the world. It has found at best very limited experimental support, at least in its strong form.
• One study has shown that subjects in memory tests are more likely to remember a given color if their mother language includes a word for that color.
• Example: if you had no identified concept of recursion, how would that affect your ability to reason about it?
Why study programming languages?
• Increased capacity to express ideas
• Improved background for choosing a language
• Increased ability to learn new languages
• Better understanding of the significance of implementation
• Ability to design new languages, or user interfaces

Example
• Beginning students always wanted to know specific answers: Can I do X? What happens if I do Y?
• Often hadn’t tried the specific test, but could reason about it from general knowledge of the implementation.
• Ex: What happens if I try to return a reference to a local variable?

The human-computer semantic gap
Human:
• Interested in modelling the real world
• More interested in what the computer should do than how
Computer:
• The only data it can manipulate are sequences of zeros and ones.
• Understands low-level “how” instructions.
What are programming
languages…
High-level languages bridge the
human-computer semantic gap by
providing a higher level notation
that can still be executed by
computer
What is a Programming Language?

Definition: A programming language
is a notational system for describing
computation in machine-readable and
human-readable form.
Computation:
• Described by a Turing machine: a very simple computer that can carry out all known computations (albeit not very efficiently).
• A programming language is Turing complete if it can be used to describe any computation performed by a Turing machine.

Turing Machine – 1936 Alan Turing




• Based on the idea of a person executing a well-defined procedure by changing the contents of an unlimited paper tape, divided into squares that can contain one of a finite set of symbols.
• "If your state is 42 and the symbol you see is a '0' then replace this with a '1', move one symbol to the right, and assume state 17 as your new state."
• A Turing machine is equivalent to a pushdown automaton made more powerful by relaxing the last-in-first-out requirement of its stack.
• More precisely, a Turing machine consists of:
– A tape which is divided into cells, one next to the other. Each cell contains a symbol from some finite alphabet. The alphabet contains a special blank symbol (here written as '0') and one or more other symbols. The tape is assumed to be arbitrarily extendible to the left and to the right, i.e., the Turing machine is always supplied with as much tape as it needs for its computation. Cells that have not been written to before are assumed to be filled with the blank symbol.
– A head that can read and write symbols on the tape and move left and right.
– A state register that stores the state of the Turing machine.
– An action table (or transition function) that tells the machine what symbol to write, how to move the head ('L' for one step left, and 'R' for one step right) and what its new state will be, given the symbol it has just read on the tape and the state it is currently in. If there is no entry in the table for the current combination of symbol and state then the machine will halt.
– Note that every part of the machine is finite; it is the potentially unlimited amount of tape that gives it an unbounded amount of storage space.
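The four parts listed above (tape, head, state register, action table) can be sketched in a few lines of Java. This is only an illustrative sketch, not a definitive implementation: the class name TuringMachineSketch, the "state,symbol" key encoding, the maxSteps guard, and the sample increment table are all assumptions made for this example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class TuringMachineSketch {
    // One action-table entry: symbol to write, head move ('L' or 'R'), next state.
    static final class Action {
        final char write;
        final char move;
        final int next;
        Action(char write, char move, int next) {
            this.write = write;
            this.move = move;
            this.next = next;
        }
    }

    // Runs the machine until no table entry applies (or maxSteps is reached,
    // an assumed safety guard) and returns the written portion of the tape.
    static String run(Map<String, Action> table, String input, int state, int maxSteps) {
        Map<Integer, Character> tape = new HashMap<>(); // unwritten cells read as blank '0'
        for (int i = 0; i < input.length(); i++) {
            tape.put(i, input.charAt(i));
        }
        int head = 0;
        for (int step = 0; step < maxSteps; step++) {
            char symbol = tape.getOrDefault(head, '0');
            Action a = table.get(state + "," + symbol);
            if (a == null) {
                break; // no entry for this state and symbol: the machine halts
            }
            tape.put(head, a.write);
            head += (a.move == 'R') ? 1 : -1;
            state = a.next;
        }
        if (tape.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        int last = Collections.max(tape.keySet());
        for (int i = Collections.min(tape.keySet()); i <= last; i++) {
            out.append(tape.getOrDefault(i, '0'));
        }
        return out.toString();
    }

    // Hypothetical sample table: appends one '1' to a block of '1's (unary increment).
    static Map<String, Action> incrementTable() {
        Map<String, Action> t = new HashMap<>();
        t.put("0,1", new Action('1', 'R', 0)); // skip over existing 1s
        t.put("0,0", new Action('1', 'R', 1)); // write a 1 on the first blank; state 1 has no entries
        return t;
    }

    public static void main(String[] args) {
        System.out.println(run(incrementTable(), "111", 0, 1000)); // prints 1111
    }
}
```

The tape is stored as a map from cell index to symbol, so it can grow in either direction without preallocating storage, which mirrors the "arbitrarily extendible" tape in the description.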
What is needed for Turing
completeness?

Virtually nothing:
– A programming language is Turing complete provided it has
• integer variables,
• arithmetic, and
• sequential execution of statements, which include assignment, selection (if), and loop (while) statements.

Even if statements are unnecessary: a while loop can simulate selection.
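One way to see why selection is not strictly needed is to replace an if with a while loop whose body runs at most once. The sketch below is an illustration only; the class name IfViaWhile and the variable guard are assumptions introduced for this example.

```java
public class IfViaWhile {
    // Computes the absolute value without an if statement.
    // The loop behaves like: if (x < 0) { result = -x; }
    static int abs(int x) {
        int result = x;
        int guard = x;
        while (guard < 0) {
            result = -x;
            guard = 0; // force the loop to exit after a single pass
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(abs(-5)); // prints 5
        System.out.println(abs(3));  // prints 3
    }
}
```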
Machine-readability:
• Also not a huge requirement: basically, the existence of a (more or less) linear-time translation algorithm.
• Usually boils down to: the syntax must be given by a context-free grammar.

Human-readability:
• This is the real issue!
• Virtually all the complex details of a programming language are there to (supposedly) enhance human readability.
• Still not very well understood.
• Is strongly dependent on good choice of abstractions.

What about human “writability”?
• Aren’t programming languages there to promote the writing of programs, not the reading of them?
• Nonsense! Writability is a hacker’s goal: Perl is very writable, but try to read it!
• Readability is the real goal: many people are going to have to read your program after you have written it.

Abstractions:
          Simple       Structured                               Unit
Data      int, char    class, struct                            file, package, API, ADT
Control   goto, =      if { } else { }, while { }, procedure    file, package, API, ADT
Computational Paradigms
• Programming languages began by imitating the operations of a computer.
• It is not surprising that the kind of computer for which they were written had a significant effect on their design:
– variables representing memory
– assignment to change values
– sequential execution of statements
Language Paradigms:

• Imperative (procedural): traditional sequential programming (passive data, active control). Characterized by variables, assignment, and loops.
• Object-oriented: data-centric, data controls its own use, action by request to data objects. Characterized by messages, instance variables, and protection. An extension of the imperative paradigm.
• Functional: passive data, but no sequential control; all action by function evaluation (“call”), particularly recursion. No local variables! Similar to mathematics.
Language Paradigms (cont.):

• Logic: assertions are the basic data; logical inference is the basic control. Again, no sequential operation. Similar to mathematics.
  Ex: I am your sister if I am female and we have common parents.
• Parallel: well, maybe not really a paradigm, but some think so. Again, no sequential operation.
• “Declarative”: the logic and functional paradigms share this property: state “what” needs computing, not “how” (sequence).
Languages and paradigms

Imperative: C, Pascal, core Ada,
FORTRAN

Functional: Lisp (Scheme), ML,
Haskell

Object-oriented: C++, Java, Smalltalk

Logic: Prolog

Parallel: Java (threads), Ada (tasks)
Perl






The overall structure of Perl derives broadly from the
programming language C. Perl is a procedural
programming language, with variables, expressions,
assignment statements, brace-delimited code blocks,
control structures, and subroutines.
Like the Unix shells, Perl has many built-in functions for
common tasks, like sorting, and for accessing system
facilities.
Perl takes lists from Lisp, associative arrays from awk,
and regular expressions from sed. These simplify and
facilitate all manner of parsing, text handling, and data
management tasks.
Perl has many and varied applications.
It has been used since the early days of the Web to write
CGI scripts, and is an integral component of the popular
LAMP (Linux / Apache / MySQL / (Perl / PHP / Python))
platform for web development.
Perl is often used as a "glue language", tying together
systems and interfaces that were not specifically
designed to interoperate.
Are functional languages Turing-complete?

• The previous theorem on Turing-completeness depends on the existence of variables and loops.
• Functional programs do not have variables or loops. Can all computation be expressed?
• Yes!
– A programming language is Turing complete if it has integer values, arithmetic functions on those values, and a mechanism for defining new functions using existing functions, selection, and recursion.
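To make the recursion-based criterion concrete, the sketch below writes two computations using only integer values, arithmetic, selection, and recursion, with no loops and no assignment. It is written in Java for consistency with the other examples here; the class name RecursiveGcd and the helper sumTo are assumptions for illustration, and inputs are assumed to be non-negative.

```java
public class RecursiveGcd {
    // Euclid's algorithm with selection and recursion only.
    static int gcd(int u, int v) {
        return v == 0 ? u : gcd(v, u % v);
    }

    // What an imperative program would write as a loop (sum of 1..n),
    // expressed through recursion instead.
    static int sumTo(int n) {
        return n == 0 ? 0 : n + sumTo(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(gcd(12, 18)); // prints 6
        System.out.println(sumTo(4));    // prints 10
    }
}
```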
Paradigm use is rarely “pure”:
• The C program (in the text) defined the gcd function in a purely functional style, even though C is mainly imperative.
• The Java program used some imperative code to compute the gcd, and was not completely object-oriented (integers aren’t objects).
• The Scheme code used sequencing to do I/O, an imperative feature.

Examples of languages that are
pure (mostly):

Imperative: (old) FORTRAN

Functional: Haskell

Object-oriented: Smalltalk
Language definition

• Syntax: the structure of a program. Usually given a formal (i.e., mathematical) definition using a context-free grammar. (Lexical structure, the structure of the words or tokens, uses regular expressions.)
• Semantics: the actual result of execution. Usually described in English, but can be done mathematically.
• Semantics can have a static component: type checking, definition checking, other consistency checks prior to execution.
• What are dynamic components of semantics?
Language translation
Diagram: source -> compiler -> executable; then inputs -> run (executable) -> outputs
Compiler: two-step process in which (1) the compiler translates source code into target code; then (2) the user executes the target code.
Language translation
Diagram: source + inputs -> run (interpreter) -> outputs
• Interpreter: one-step process in which the source code is executed directly.
• Hybrids are also possible (Java).
Diagram: source -> Java compiler -> bytecode; then bytecode + inputs -> machine-dependent interpreter -> outputs
Bytecode is composed of instructions that have been brought to the lowest level possible without making them machine dependent.
Compilation, Interpretation, and Hybrid systems
Consider this piece of code:
public class Test
{
    public static void main(String args[])
    {
        int i;
        i = 2;
        i = i + 7;
    }
}

• If we were to compile it, we would change it to machine instructions that would only work for one architecture.
• If we were to interpret it, our interpreter would have to be able to understand the high-level code AND would repeatedly parse it (if the code was in a loop).
• When we use the hybrid approach of Java, we produce the file Test.class, which is a binary file that's not readable by most humans. We can convert the file to a readable form with the javap tool, as shown here:



C:\ > javap -c Test
Compiled from Test.java
public class Test extends java.lang.Object {
public Test(); // a default constructor created
public static void main(java.lang.String[]);
}
Method Test()
0 aload_0
1 invokespecial #3
4 return
Method void main(java.lang.String[])
0 iconst_2 // Put integer 2 on stack
1 istore_1 // Store the top stack value at location 1
2 iload_1 // Put the value at location 1 on stack
3 bipush 7 // Put the value 7 on the stack
5 iadd // Add two top stack values together
6 istore_1 // The sum, on top of stack, stored at location 1
7 return // Finished processing

Although the bytecode cannot access registers or directly reference memory
locations and must obey various other restrictions, the actual JVM (Java
Virtual Machine) program can internally use whatever techniques are
convenient for a particular platform. As long as the Java bytecode sees only
a system that complies with the JVM specification, the JVM programmer has
broad discretion over its implementation.
Language Implementation Methods
Compilation:
• lexical analysis: characters are grouped into logical chunks (keywords, constants, etc.)
• syntax analysis: figure out what it means; usually uses parse trees (defined by grammars). Like diagramming sentences.
  "small dogs and cats": what is meant?
• optimization: to make the code smaller or faster
• linking: supplying missing addresses to system code
• load module: user code augmented with system code
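The lexical analysis step, grouping characters into logical chunks, can be sketched with a regular expression. This is a minimal illustration, not how any particular compiler does it; the class name LexerSketch and the three token classes (identifiers or keywords, integer constants, any other single symbol) are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LexerSketch {
    // Alternatives, in order: identifier or keyword, integer constant,
    // or any single non-whitespace symbol. Whitespace matches nothing and is skipped.
    private static final Pattern TOKEN =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*|[0-9]+|\\S");

    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("i = i + 7;")); // prints [i, =, i, +, 7, ;]
    }
}
```

Syntax analysis would then take this token list and build a parse tree according to the grammar.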
Sentence diagramming examples (diagrams not reproduced):
• The tall boy ran fast.
• "Maria gave Joe the rice"
Language Implementation Methods
(cont)
Pure Interpretation:
• no translation phase: fetch, decode, and execute the source code (not machine code)
• Advantages/Disadvantages
1. easy to write a debugger, as source lines are unchanged
2. execution is 10-100 times slower; statement decoding is the bottleneck
3. better for languages with simple structure, as they are not so slow to decode
4. natural for some kinds of features, like dynamic binding of type.
Ex: a = a + b. If a may be an integer, a string, or a set, how can we know what code to generate?
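The a = a + b question can be illustrated by deferring the choice of operation until the operand types are known at run time, which is roughly what an interpreter for a dynamically typed language has to do. The sketch below is an assumption-laden illustration: the class DynamicAdd, the method add, and the choice of set union as the meaning of + on sets are all invented for this example.

```java
import java.util.HashSet;
import java.util.Set;

public class DynamicAdd {
    // Picks the meaning of "+" from the run-time types of the operands.
    static Object add(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer) {
            return (Integer) a + (Integer) b; // arithmetic addition
        }
        if (a instanceof String) {
            return (String) a + b;            // string concatenation
        }
        if (a instanceof Set && b instanceof Set) {
            Set<Object> union = new HashSet<>((Set<?>) a);
            union.addAll((Set<?>) b);         // set union (assumed meaning of + on sets)
            return union;
        }
        throw new IllegalArgumentException("unsupported operand types");
    }

    public static void main(String[] args) {
        System.out.println(add(2, 7));       // prints 9
        System.out.println(add("ab", "cd")); // prints abcd
    }
}
```

A compiler for a statically typed language makes this choice once, before execution; here the type tests run on every call, which is one source of the slowdown mentioned above.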
What is meant by dynamic binding?

Girls' choice dance:
– Will you go with Sofie? (early binding)
– Will you go with Sofie/Ann/Betty (whoever shows up at your door)? (delayed binding)
– No specific partner assigned, and the partner will change throughout the night. (changing binding)
Hybrid Implementation System







• Compile into intermediate code. Java compiles to bytecode, and then the bytecode is interpreted.
• The JVM (Java Virtual Machine) is the bytecode interpreter and run-time system.
• Java is delivered partially compiled. Developers compile it into bytecode, which is downloaded across the network. Then the bytecode is interpreted by the browser.
• Architecture-neutral: Sun designed the language so that it is also partially interpreted.
• Creating bytecode does about 80% of the compilation work. However, one set of bytecode can run on any Java-enabled computer.
• The last 20% is performed at run time by the Java environment provided by the machine-specific browser.
• Eliminates version mismatch problems: all external program references are resolved when the application is executed.
Compilation Steps




• Figure out the full name of the class to be invoked.
• Determine which method signature to use. If there is more than one matching signature, the one that is most specific is chosen.
  Example: doit(Object o) or doit(ColoredPoint p) or doit(Point p)
• A method is applicable if
– the number of parameters matches
– the type of each actual argument can be converted to the type of the corresponding parameter.
• A method is accessible if the access modifier allows access.
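The doit example can be tried directly. In the sketch below, the class OverloadSketch and the empty Point and ColoredPoint classes are assumptions made so the example is self-contained; each overload returns its parameter type so the chosen signature is visible.

```java
public class OverloadSketch {
    static class Point {}
    static class ColoredPoint extends Point {}

    static String doit(Object o)       { return "Object"; }
    static String doit(Point p)        { return "Point"; }
    static String doit(ColoredPoint p) { return "ColoredPoint"; }

    public static void main(String[] args) {
        ColoredPoint cp = new ColoredPoint();
        Point p = cp; // same object, but the declared type is Point

        System.out.println(doit(cp));  // prints ColoredPoint: all three apply, most specific wins
        System.out.println(doit(p));   // prints Point: the signature is chosen from the declared type
        System.out.println(doit("x")); // prints Object: the only applicable signature
    }
}
```

The second call shows that the signature is selected at compile time from the declared type of the argument, not from the run-time class of the object.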
Error classification





Lexical: character-level error, such as
illegal character (hard to distinguish from
syntax).
Syntax: error in structure (e.g., missing
semicolon or keyword).
Static semantic: non-syntax error prior to
execution (e.g., undefined vars, type
errors).
Dynamic semantic: non-syntax error
during execution (e.g., division by 0).
Logic: programmer error, program not at
fault.
Notes on error reporting

• A compiler will report lexical, syntax, and static semantic errors. It cannot report dynamic semantic errors.
• An interpreter will often only report lexical and syntax errors when loading the program. Static semantic errors may not be reported until just prior to execution. Indeed, most interpreted languages (e.g., Lisp, Smalltalk) do not define any static semantic errors.
• No translator can report a logic error.
Sample Errors (Java):
public int gcd ( int v# )            // lexical
{   int z = value                    // syntax - missing ;
    y = v;                           // static semantic - y undefined
    while ( y >= 0 )                 // dynamic semantic - division by zero
    {   int t = y; y = z % y; z = t;
    }
    return y;                        // logic - should return z
}
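For comparison, a version of the method with each of the five errors repaired might look like the following sketch. The enclosing class GcdFixed, its constructor, and the field value are assumptions: the slide uses value without showing where it is declared. Inputs are assumed to be non-negative.

```java
public class GcdFixed {
    private final int value; // assumed field; the slide refers to "value" without declaring it

    GcdFixed(int value) {
        this.value = value;
    }

    public int gcd(int v) {      // lexical: '#' removed from the parameter name
        int z = value;           // syntax: semicolon added
        int y = v;               // static semantic: y is now declared
        while (y > 0) {          // dynamic semantic: > instead of >=, so z % y never divides by zero
            int t = y;
            y = z % y;
            z = t;
        }
        return z;                // logic: return z, not y
    }

    public static void main(String[] args) {
        System.out.println(new GcdFixed(12).gcd(18)); // prints 6
    }
}
```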
Language design
• Good, consistent set of abstractions.
• Tie-in to other technology:
– C : Unix
– Java : Internet
– C++ : most efficient OO language
• Now also:
– Ease of interface with other languages and systems
– Good libraries