Computer Programming Languages
Draft (incomplete)
Computer programming languages
Low-level vs. high-level languages
Machine code
Assembly language
Interpreters and compilers
Internal vs. external storage
Source code, object code, executable files
Development of high-level languages
Teaching languages vs. production languages
Scripting languages
C. Herbert 9/15/2005
Computer Programming Languages
Almost all modern high-speed digital electronic computers are based on the binary numbering system. In the
heart of the computer, inside the central processing unit (CPU), there are one or more arithmetic logic units
(ALUs) that process information by performing binary arithmetic. All of the audio, video, word processing,
Internet access and so on, which we associate with modern computers, is processed with binary arithmetic
inside the CPU by these relatively simple electronic circuits. All data to be handled by the computer, and all
of its instructions, must be processed in the CPU as a stream of binary numbers.
Figure 2.1 – All information in modern computers is processed as base two binary numbers.
The set of binary digits, or bits, that the CPU understands as its instructions to perform this binary arithmetic is
called the computer’s “machine code.” Each CPU, or each family of CPUs, such as the Intel 8086 family, has
its own machine code. So, there are as many machine codes as there are families of processing units.
Eventually, everything that a computer does must be translated into its machine code.
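To make the idea of binary arithmetic concrete, here is a minimal Python sketch. It takes two four-bit groups of the kind shown in Figure 2.1 (0111 and 0101, used here only as sample values) and adds them one column at a time with a carry, roughly the way an adder circuit in an ALU would. The function name add_binary is hypothetical and chosen only for this illustration.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary strings bit by bit, carrying as an adder circuit would."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    result = []
    # Work from the rightmost (least significant) bit to the leftmost.
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))   # the bit that stays in this column
        carry = total // 2              # the bit carried to the next column
    if carry:
        result.append("1")
    return "".join(reversed(result))


if __name__ == "__main__":
    total = add_binary("0111", "0101")   # 7 + 5
    print(total, "=", int(total, 2))     # prints: 1100 = 12
```

A real ALU does this with electronic circuits rather than strings, but the column-by-column carry is the same idea.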
Insert image here
The same code in Java, assembler, and machine code.
When a new processing unit is first invented and manufactured, it understands only its machine code. Systems
programmers work with these binary codes to create a new language called assembly language. They do this by
using machine code to build an assembler, which is a program that translates assembly language into machine
code. Assembly languages are made up of very primitive instructions, just like machine code, but they can be
written using numbers in bases other than base two; mnemonics, or short words that sound like the instructions
they represent, such as ADD for addition or SUB for subtraction; and symbolic names instead of numbers to
refer to memory locations.
In the sample code on the left, …
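The job of an assembler can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the 8-bit instruction format, the opcode numbers, the mnemonics LOAD and STORE, and the symbolic names PRICE, TAX and TOTAL do not belong to any real processor. ADD and SUB are the mnemonics mentioned above.

```python
# Hypothetical 8-bit instruction format: 4-bit opcode followed by a 4-bit address.
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "SUB": 0b0011, "STORE": 0b0100}


def assemble(lines, symbols):
    """Translate assembly lines such as 'ADD TAX' into binary strings."""
    machine_code = []
    for line in lines:
        mnemonic, operand = line.split()
        # An operand is either a symbolic name or a number (hex is allowed, e.g. 0xA).
        address = symbols[operand] if operand in symbols else int(operand, 0)
        if not 0 <= address <= 0b1111:
            raise ValueError(f"address out of range: {operand}")
        word = (OPCODES[mnemonic] << 4) | address
        machine_code.append(format(word, "08b"))
    return machine_code


if __name__ == "__main__":
    symbols = {"PRICE": 0, "TAX": 1, "TOTAL": 2}   # names for memory locations
    program = ["LOAD PRICE", "ADD TAX", "STORE TOTAL", "SUB 0xA"]
    for source, binary in zip(program, assemble(program, symbols)):
        print(f"{source:<12} {binary}")
```

The sketch shows all three conveniences described above: mnemonics instead of bit patterns, symbolic names instead of numeric addresses, and numbers written in a base other than two.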
Writing sophisticated software such as word processors and video games in assembly language is still rather
difficult and very time consuming. Eventually, computer scientists and software engineers built translators that
can handle high-level languages, which are closer to human languages. Java, JavaScript, Visual Basic, C, C++,
C#, and Python are all examples of modern high-level computer programming languages.
The translators that convert high-level languages into machine code fall into two categories: compilers and
interpreters. Using a compiler, a programmer ends up with two stored copies of the program. The first, in the
original high-level programming language, is called the “source code.” The second stored copy of the
program, which is the same program after translation into a particular machine code, is called the “object
code.” Even after translation into machine code, a program may still need to be processed so that it will run
on a particular computer with a particular operating system. There is often another step necessary after
compiling to combine the object code with subroutines from the operating system. This step is sometimes called
“linking and loading” or “making” an executable program. Sometimes linking and loading happens when we try
to run object code, and sometimes the compiler makes and stores an executable program as another step in the
process of compiling. So, with a compiler, there are two stored copies of the program, the original source code
and the object code, and sometimes a third copy called an executable program.
[Figure 2.3 diagram labels: High-level Languages; Interpreters; Compilers; Assembly Language; Assembler; Machine Code]
Figure 2.3 – All programs must be translated into machine code.
Compilers, interpreters and assemblers perform this translation.
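The separation between source code and its translated form can be glimpsed with Python's built-in compile function. This is only an analogy: Python translates source into bytecode for its own virtual machine, not into the CPU's machine code, and no linking step is shown. The variable names price, tax and total are made up for the example.

```python
source_code = "total = price + tax\n"          # the program as the programmer wrote it

# Translate the whole program once, before running any of it.
code_object = compile(source_code, "<example>", "exec")

# The translated form is a separate artifact that could be stored and reused.
print(type(code_object).__name__)               # prints: code
print(len(code_object.co_code) > 0)             # prints: True

# Running the translated program does not require translating it again.
namespace = {"price": 100, "tax": 8}
exec(code_object, namespace)
print(namespace["total"])                       # prints: 108
```

The point of the sketch is that translation and execution are two distinct steps, and that the translated copy exists alongside the original source.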
An interpreter is much simpler than a compiler. Rather than translating an entire source code program into
object code at once, the interpreter translates each instruction one at a time and then feeds it to the CPU to be
processed before translating the next instruction. The only stored copy of the program is the original source
code program. Scripting languages, such as JavaScript or Visual Basic for Applications (VBA), often work
this way. Scripting languages are simplified high-level programming languages that allow someone to program
in a particular environment. JavaScript can be added to the HTML code for Web pages to provide them with
some data processing capability. VBA allows someone to program features in Microsoft Office
products such as Microsoft Word or PowerPoint.
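The one-instruction-at-a-time behavior of an interpreter can be sketched with a toy example. The three-instruction language below (SET, ADD, PRINT) is hypothetical and exists only for this illustration; it is not JavaScript, VBA, or any real scripting language.

```python
def interpret(source_lines):
    """Translate and carry out one instruction at a time; return what was printed."""
    variables = {}
    output = []
    for line in source_lines:
        # Each line is decoded only when it is reached, then executed at once.
        command, name, *rest = line.split()
        if command == "SET":
            variables[name] = int(rest[0])
        elif command == "ADD":
            variables[name] += int(rest[0])
        elif command == "PRINT":
            output.append(variables[name])
        else:
            raise ValueError(f"unknown instruction: {line}")
    return output


if __name__ == "__main__":
    program = ["SET total 100", "ADD total 8", "PRINT total"]
    print(interpret(program))   # prints: [108]
```

Notice that nothing is saved between runs except the source lines themselves; running the program again means decoding every line again.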
Interpreters are also used for teaching languages. Serious computer programming languages such as Java and
C# that are used by professional programmers are sometimes referred to as production languages. Teaching
languages are languages that are not generally used in production environments, but are instead used to teach
someone the logic of computer programming or the processes used in creating computer software before
attempting to teach them to use production languages. The Alice programming language, which is included
with this book, is an example of a teaching language. Its primary purpose is to be used as a tool to teach people
to be better programmers.
The first widely used high-level language was FORTRAN, released in 1957 by a team at IBM led by John
Backus. IBM was, at the time, by far the world’s largest computer company.
FORTRAN was intended to be used by scientists and engineers working on the large, primitive mainframe
computers of the day (about 20 years before the first personal computers appeared) to program mathematical
formulas and processes. For example, it is rather easy, assuming one knows the math, to write FORTRAN
programs to perform matrix algebra on large sets of data or to perform the fast Fourier transforms that
electrical engineers use in signal processing. In fact, the name FORTRAN comes from the two
words “formula translator.”
Before FORTRAN, all software had to be created using assembly language and machine code. Once
FORTRAN appeared, people began to use it for much more than science and engineering. The increasing use
of FORTRAN to process commercial business data led to problems for financial accountants and auditors. A
bank auditor, for example, needs to be able to read a computer’s instructions to see what the computer is doing
with figures that represent bank deposits, account interest, and transaction fees. This was nearly impossible
with FORTRAN, unless the auditor was also a trained computer programmer.
The problems of using FORTRAN in the business world were addressed by the appearance in 1960 of
the COBOL language. Like the name FORTRAN, the name COBOL is formed from a phrase: “common
business-oriented language.” COBOL was designed by a committee of government and industry representatives
(CODASYL) convened with the support of the U.S. Department of Defense. It drew heavily on FLOW-MATIC, an
earlier language created by Grace Hopper, a U.S. Navy officer who rose to the rank of rear admiral before she
retired in 1986. It is estimated that as of the year 2000 there were more lines of code written in COBOL than
in any other computer programming language. COBOL has functions and instructions that are more suited to
commercial data processing than FORTRAN, and is a wordier language, which makes it easier for financial
auditors to understand without extensive training.
Yet COBOL, like FORTRAN, takes a while to master. For college students, this often meant that several
semesters had to be spent learning programming before anything useful could be done with a computer. At the
same time, computers were becoming smaller, less expensive, and more accessible to the public. Personal
computers were still some years away, but by the mid-1960s many college campuses had computers that
students could use. In 1964, in response to the promise of the computer on campus, and in order to make
programming as accessible to students as the new “mini-computers” that had begun to appear, two professors at
Dartmouth College in Hanover, New Hampshire, John Kemeny and Thomas Kurtz, invented the BASIC
programming language. BASIC was designed to be easy to learn and easy to use, and most of the versions that
later ran on small computers were interpreter-based rather than compiled like FORTRAN and COBOL. It caught
on quickly, and when personal computers began to appear in the late 1970s nearly every machine was expected
to come with a BASIC interpreter. For a time, more people learned BASIC than any other language.
Yet, as BASIC increased in popularity, one major problem with the language became evident. The BASIC
language had a GOTO command, which is sometimes referred to as an “unconditional branching”
command. Each line in a BASIC program was numbered, and at any point in the program the GOTO
instruction could suddenly redirect the flow of control to a line number in another part of the program. The
command was intended to let users set up branching and looping commands linked to IF…THEN statements, but
it was so flexible that, in anything more than a simple straight-line sequence of instructions, programmers often
ended up with poorly designed logic that jumped repeatedly back and forth throughout the code. People other
than the original programmer often had to spend hours trying to figure out how the program worked. People
had to be trained to avoid creating what was referred to as “spaghetti code.”
The BASIC language was so easy to learn and so easy to use that people often developed very bad
programming habits, such as creating spaghetti code, before they were properly trained in how to design the
logic of computer programs.
Insert loop example: Pascal vs. BASIC here
In response to this problem, a computer scientist from Switzerland named Niklaus Wirth invented the Pascal
programming language around 1970. He named the language after the 17th-century French mathematician and
philosopher Blaise Pascal, who 300 years earlier had been one of the first people to build a working mechanical
calculator. From the beginning, Wirth’s Pascal programming language was intended to be used as a teaching
language, and contained commands with built-in structured logic, which we will see in chapter xx.
Pascal was among the first widely taught languages whose built-in commands for looping and branching
encouraged the user to write programs according to good principles of structured design. In Pascal, it became
natural for programmers to construct programs with a logical flow of instructions, and much harder for them to
end up with spaghetti code. Like BASIC, Pascal was designed to be easy to learn and easy to use, although it
was usually compiled rather than interpreted.
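The difference between GOTO-driven control flow and structured control flow can be sketched as follows. Python has no GOTO, so the first function imitates one with a hypothetical "current line number" variable. The BASIC-like listing in the comments is made up for this illustration and is not taken from any particular BASIC dialect. The second function uses Python's own for loop rather than Pascal syntax, but the structured idea is the same.

```python
def run_goto_style():
    """Count 1..3 using numbered lines and jumps, as an early BASIC program might."""
    # Hypothetical BASIC-like listing, shown as comments:
    # 10 LET N = 1
    # 20 IF N > 3 THEN GOTO 60
    # 30 PRINT N
    # 40 LET N = N + 1
    # 50 GOTO 20
    # 60 END
    output, n, line = [], 0, 10
    while line != 60:
        if line == 10:
            n, line = 1, 20
        elif line == 20:
            line = 60 if n > 3 else 30
        elif line == 30:
            output.append(n)
            line = 40
        elif line == 40:
            n, line = n + 1, 50
        elif line == 50:
            line = 20
    return output


def run_structured():
    """The same count written with a built-in loop; no jumps are needed."""
    output = []
    for n in range(1, 4):
        output.append(n)
    return output


if __name__ == "__main__":
    print(run_goto_style())   # prints: [1, 2, 3]
    print(run_structured())   # prints: [1, 2, 3]
```

Both functions produce the same result, but in the first a reader has to trace the jumps to discover that the code is a loop at all, while in the second the loop is visible in the structure of the code.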
[Continue with C, Smalltalk, C++, and Java.]