Implementation Issues Cs2403 Programming Language Spring 2005 Kun-Yuan Hsieh PLLab, NTHU Cs2403 Programming Languages Why studying? • Increased capacity to express ideas – Language feature limits the form of algorithm you construct – Simulating other language’s feature to break this limit • Improved ability of choosing right PL – To give you the ability to make informed language choices • Increase ability to learn new PL – Programming language is in a state of continuous evolution – Vocabulary and fundamental concepts helps you learn PLLab, NTHU Cs2403 Programming Languages Why studying? (cont.) • Better understanding of implementation – To use a language in the way it was designed to be used • Overall advancement of computing – Most popular languages are the best available? – Better educated programmer leads to better language population? PLLab, NTHU Cs2403 Programming Languages Evaluation criteria • Readability – Important because maintenance is cost – Overall simplicity since we tend to learn subset of it • Feature multiplicity: more than one way to accomplish one operation – Count = count +1 – Count += 1 – Count ++ • operation overloading – Reducing # of operations • Less readable if carried too far – Assembly language is simple but less readable • Orthogonality PLLab, NTHU Cs2403 Programming Languages Evaluation criteria • Readability (cont.) – Control statement • Read from top to bottom is easier to understand than jump to follow the execution order – Data type and structure makes your meaning clear • timeOut = 1 v.s. timeOut = true!! – Syntax – Identifier forms: using connotative names – Special words: program appearance – Form and meaning: statement indicates the purpose PLLab, NTHU Cs2403 Programming Languages Evaluation criteria • Writability – What affects readability affect writability – Simplicity and orthogonality • Simplicity avoid misusing but may leads to undetected error – Support for abstraction to define then use complicated structures in a simple way • Subprogram • Tree (using point or array!!) – Expressivity PLLab, NTHU Cs2403 Programming Languages Evaluation criteria • Reliability – Performs to its specification under all conditions – Type checking – Exception handling – Aliasing – Readability and writability PLLab, NTHU Cs2403 Programming Languages Evaluation criteria • Cost – Training programmers to use language – Writing programs – Compiling programs, optimization – Executing programs – Language implementation system • Java interpreter is free – Reliability – Maintaining programs PLLab, NTHU Cs2403 Programming Languages Influences on language design • Computer architecture – Software executes on hardware • Programming methodologies – 1950s and early 1960s: machine efficiency – Late 1960s: programming efficiency, readability, better control structures – Late 1970s: data abstraction – Middle 1980s: object-oriented programming PLLab, NTHU Cs2403 Programming Languages The von Neumann Arch (cont.) Memory (stores both instructions and data) Results of operations Arithmetic and logic unit Instructions and data Control unit PLLab, NTHU Cs2403 Programming Languages Input and output devices A grid-computer arch (non-Von Neumann) CPU CPU CPU CPU CPU CPU CPU CPU CPU PLLab, NTHU Cs2403 Programming Languages Categories & trade-off • PL categories: – – – – Imperative (C, Pascal, FORTRAN) Functional (LISP, Scheme) Logic (Prolog) Object-Oriented (C++, C Sharp, Java) • Design trade-off – Reliability v.s. cost of execution – Writability v.s. readability – Flexibility v.s. safety PLLab, NTHU Cs2403 Programming Languages Implementation methods • Compilation (Pascal, C, C++, FORTRAN) – Translate high-level program to machine code – Slow translation – Fast execution • Interpretation (LISP, Scheme, scripts) – No translation, interpreted by interpreter – Slow execution (translated on execution) – Easy to debug, run-time error refers to source level units • Hybrid method (Java, CSharp, Perl) – Translated to intermediate language (Java byte code, MSIL) – Medium execution speed PLLab, NTHU Cs2403 Programming Languages Compiler examples PLLab, NTHU Cs2403 Programming Languages Interpreter v.s. Compiler Input data Source Program Source Program Interpreter Compiler PLLab, NTHU Cs2403 Programming Languages Results Executable (sample.exe) Hybrid method Java Source Code Java Compiler (private property) Java Interpreter on Windows Java Byte Code Java Interpreter on Unix (Stored in the disk & Publicly available) PLLab, NTHU Cs2403 Programming Languages Hybrid method example- .Net C# Visual J# .NET VB .NET Compile into MSIL MSIL CLR do this Linux native code Windows native code Will Support soon Support now Mac OS native code Will Support soon PLLab, NTHU Cs2403 Programming Languages Helloworld.il PLLab, NTHU Cs2403 Programming Languages Virtual computer concept Virtual C++ Computer C++ compiler LISP interpreter Operating system FORTRAN compiler Macroinstruction interpreter Operating System Command interpreter Bare machine C compiler Assembler Virtual C Computer Ada compiler … PLLab, NTHU Cs2403 Programming Languages Levels of Programming Languages High-level program class Triangle { ... float surface() return b*h/2; } Low-level program ldloc.i4 b ldloc.i4 h mul ldc.i4 2 div ret Executable Machine code 0001001001000101001 0010011101100101011 01001... PLLab, NTHU Cs2403 Programming Languages Compiler phases Source program Lexical analyzer Syntax analyzer Semantic analyser Symbol-table manager Error handler Intermediate code generator Code optimizer Code generator PLLab, NTHU Target program Cs2403 Programming Languages Lexical analysis • Source program is read from left-to-right and group into tokens • Tokens: sequences of characters have a meaning • Ex: – a= b + c * 10 • • • • • • • identifier a assignment symbol = identifier b plus sign + identifier c multiplication sign * number 10 PLLab, NTHU Cs2403 Programming Languages Syntax analysis • Grouping the tokens into grammatical phrases (hierarchical) • Usually presented as a parse tree • Expressed by recursive rules – Any identifier is an expression – Any number is an expression – If exp1 and exp2 are expressions, then so are Assignment statement identifier = expression expression + expression expression * a identifier b identifier c • exp1 + exp2 • exp1 * exp3 PLLab, NTHU Cs2403 Programming Languages expression number 10 Semantics • Specification of semantics is concerned with specifying the “meaning” of well-formed programs. • Terminology: – Expressions are evaluated and yield values (and may or may not perform side effects) – Commands are executed and perform side effects. – Declarations are elaborated to produce bindings • Side effects: – change the values of variables – perform input/output PLLab, NTHU Cs2403 Programming Languages Syntax Analysis Dataflow chart Source Program Stream of Characters Scanner Error Reports Stream of “Tokens” Parser Error Reports Abstract Syntax Tree PLLab, NTHU Cs2403 Programming Languages Type Checking • For most statically typed programming languages, a bottom up algorithm over the AST: • Types of expression AST leaves are known immediately: – literals => obvious – variables => from the ID table – named constants => from the ID table • Types of internal nodes are inferred from the type of the children and the type rule for that kind of expression PLLab, NTHU Cs2403 Programming Languages