Final Examination - CS 4610 - Fall 2009

This final examination tests your knowledge of, and your ability to communicate coherently about, high-level concepts from the course. It is an essay test. Complete the exam by writing text in the boxes provided, saving the file as uvaid.doc (e.g., wrw6y.doc), and emailing the resulting file back to me.

There are thirteen pages on this exam, including this one. There are eight questions, but you should only answer seven of them. Skip one question. The final page has extra credit questions.

This exam is open book: you may use any reference you like. You may spend as much time as you like on the exam, but it is due MONDAY DECEMBER 14th AT NOON.

For the essay questions, include the most important relevant details you deem necessary. Your answers are space limited: you cannot exceed the box space provided, and you must use a 12-point Times New Roman font. Demonstrating knowledge is more important than English prose; if you can get your ideas across in an outline or other loose format, feel free to do so. The most important part is to convince us that you have understood and retained the material and key concepts. If there is not enough room to tell us everything you know, part of the question is for you to select the details that best show off your knowledge.

You should skip one question. Do not answer all eight of the questions in this exam.

Type your name here: Westley Weimer
Type your UVA ID here: wrw6y

Q1. Compiler Overview                        ?? / 10 pts
Q2. Language & Paradigm Analysis             ?? / 10 pts
Q3. Formal Semantics                         ?? / 10 pts
Q4. Reality Check: Improving A Language      ?? / 10 pts
Q5. Adding A Feature: Variable Argument      ?? / 10 pts
Q6. Type Checking Overview                   ?? / 10 pts
Q7. Language Security: Case Studies          ?? / 10 pts
Q8. Language Feature Interaction             ?? / 10 pts
Extra Credit                                 ??
Total                                        ?? / 70 pts (recall: skip one question)

1. Compiler Overview

The stages of a modern compiler are often taken to be lexing, parsing, semantic analysis, optimization and code generation. Write an essay. For each stage:

(1) explain its purpose
(2) explain how it is achieved (e.g., underlying algorithms)
(3) explain how we write that phase (e.g., tool support)
(4) explain its running time
(5) explain what would happen if we did not have that phase (e.g., anything catastrophic, why can't we parse with a lexer or check types with a parser?)

Your answer must fit in this box using 12-point Times New Roman.

2. Language & Paradigm Analysis

Consider the languages C, Python, OCaml and Java. In addition, consider the paradigms imperative programming, functional programming, and object-oriented programming. For each of the following applications and sets of requirements, choose a language and paradigm pair and argue convincingly that they would be the best choice for an implementation.

Assume the developers have equal expertise in all languages and paradigms, but are not perfect. The paradigms represent the majority, rather than the totality of the program (e.g., a mostly-functional program can still have imperative state updates). In each situation below, the requirements are listed in descending order of importance, with the most important requirement first. Each situation should be answered in at most four or five sentences. Note that the "best" answer may not be the current state of the art in industrial practice.

(A) An operating system scheduler. Requirements: (1) correctness, (2) real-time performance, (3) access to hardware and low-level operations.

(B) A scientific application (e.g., containing actions like matrix-matrix multiplication). Libraries with efficient implementations of high-level matrix operations are available for all languages. Requirements: (1) correctness, (2) scalability to large input sizes, (3) performance.
(C) Business logic for the middle tier of a three-tier web application (e.g., the part of Amazon.com that lists reviews, suggests related products, and calculates shipping and taxes). Libraries with efficient interfaces to databases are available for all languages. Requirements: (1) correctness, then extensibility, then maintainability.

(D) A proof-of-concept throw-away graphical demo for a marketing meeting (e.g., a brainstorming session). Efficient libraries for graphical user interfaces are available for all languages. Requirements: rapid development time, then ease of run-time modification during the meeting to try out ideas.

(E) A vertex or pixel shader for a modern graphical game (i.e., a small program that will be uploaded to a graphics card and used to render parts of a scene). Since the result will be viewed by humans, some errors (e.g., minor blur) can be tolerated. Requirements: real-time performance, then small code and data size.

(F) A small graphical game that will be downloaded by clients and run on their local computers. Efficient libraries for graphics are available for all languages. Requirements: correctness, portability to multiple architectures, performance.

(G) A compiler. Requirements: correctness, manipulation of recursive data structures, performance.

Answer on next page.

Your answer must fit in this box using 12-point Times New Roman.

3. Formal Semantics

(Note that this question tests knowledge of the details of formal semantics; students seeking a "high-level" test should skip this question.)
Consider this simple "C-Lite" language with pointers:

    exp := let var : type = malloc in exp   ;; allocate space for a new local variable
         | var = exp                        ;; variable assignment
         | *var = exp                       ;; assignment through a pointer
         | &var                             ;; evaluates to the address of the variable
         | *exp                             ;; dereference a pointer (i.e., load its value)
         | var                              ;; evaluates to the value of the variable
         | constant                         ;; integer constants
         | exp1 ; exp2                      ;; sequencing
         | print exp                        ;; side effects

    type := int                             ;; basic scalar type
          | ptr to type                     ;; pointer to another type

Given that language, the printed output of this program:

    let x : int = malloc in
    x = 11 ;
    let y : int = malloc in
    y = 22 ;
    let my_ptr : ptr to int = malloc in
    my_ptr = & x ;
    *my_ptr = 33 ;
    print x ;
    print y

... is "33" followed by "22". Example types are "int" and "ptr to ptr to int".

The typing rules follow:

    O[x/T1] ├ e : T2
    -------------------------------------
    O ├ let x : T1 = malloc in e : T2

    O(x) = T1    O ├ e : T2    T2 <= T1
    -------------------------------------
    O ├ x = e : T2

    O(x) = ptr to T1    O ├ e : T2    T2 <= T1
    --------------------------------------------
    O ├ *x = e : T2

    O(x) = T
    ------------------------
    O ├ &x : ptr to T

    O(x) = T
    --------------------
    O ├ x : T

    --------------------
    O ├ constant : int

    O ├ e1 : T1    O ├ e2 : T2
    --------------------------------
    O ├ e1 ; e2 : T2

    O ├ e : int
    --------------------------------
    O ├ print e : int

    ------------------------------
    ptr to type <= int

    t1 <= t2
    ------------------------------
    ptr to t1 <= ptr to t2

The operational semantics follows:

    L = newloc()    S[L/0],E[x/L] ├ e : S2
    --------------------------------------------
    S,E ├ let x : T1 = malloc in e : S2

    E(x) = L    S(L) = v
    ----------------------------
    S,E ├ x : v,S

    S,E ├ e : v,S2    E(x) = L
    ----------------------------
    S,E ├ x = e : v,S2[L/v]

    ----------------------------
    S,E ├ constant : constant,S

    S,E ├ e : v_rhs,S2    E(x) = L    S(L) = L_lhs
    ------------------------------------------------
    S,E ├ *x = e : v,S2[L_lhs/v_rhs]

    E(x) = L
    ----------------------------
    S,E ├ &x : L,S

    S,E ├ e1 : v1,S2    S2,E ├ e2 : v2,S3
    ----------------------------------------
    S,E ├ x = e1 ; e2 : v2,S3

    S,E ├ e : v,S2
    ------------------------------
    S,E ├ print e : v,S2

Note that null-pointer dereferences are checked for at run-time and are not errors.

Give a program that type checks (starting with an empty object environment) but fails at run-time (i.e., a program for which there is no operational semantics derivation).

Your answer must fit in this box using 12-point Times New Roman.

4. Reality Check: Improving A Language

Consider the Cool programming language as defined in this class. You are charged with making design changes to Cool so that it can be used by professional developers to rapidly produce correct and maintainable code. Identify five problems with Cool that prevent it from attaining those goals. Explain each problem, explain how you would change Cool, and justify your change. Be concrete. Choose problems and solutions to show off your knowledge. The aspects you change can occur at any level (e.g., lexing, parsing, semantic analysis, run-time, language features, etc.).

Your answer must fit in this box using 12-point Times New Roman.

5. Adding A Feature: Variable Argument Functions

You are tasked with adding variable argument functions (e.g., like printf in C or OCaml) to Cool. For each stage of the compiler (i.e., lexer, parser, semantic analysis, optimization, or code generation) indicate what you would have to change and why. When your changes have been implemented, Cool programmers should be able to write functions such as many_add() that work as follows:

    out_int( many_add(1,2,3) * many_add(4,5) ) ;   (* prints 54 *)
    out_int( many_add(1,"hello") )                 (* type error *)

Be specific. You will lose points if you fail to address relevant issues (e.g., inheritance of variable argument functions).

Your answer must fit in this box using 12-point Times New Roman.

6. Type Checking: Overview

What is the difference between a static type and a dynamic type?
Describe differences between the type checking of and the evaluation of a dynamic method dispatch expression. Carefully explain the roles of static and dynamic types in dispatch. Include an explanation of subtyping and how it is important to object-oriented programs.

Your answer must fit in this box using 12-point Times New Roman.

There are two broad categories of mistakes related to type checkers. That is, a type system can "go wrong" or "do the wrong thing" in two ways. Describe and comment on both failure modes. Which one is more desirable? When is a type system formally sound?

Your answer must fit in this box using 12-point Times New Roman.

7. Language Security: Case Study

You are tasked with operating a website with a webform that accepts ZIP files uploaded by users. Received ZIP files are then unpacked. If a file named main.py is found inside, that file is assumed to be the source code to a type-checker and it is interpreted repeatedly against a series of held-out AST test cases stored locally. Its output is compared against reference output also stored locally. A web page report is generated for the uploading user indicating which outputs matched. This is exactly the "PA4 Automated Test Server" scenario you used in class.

Describe three distinct attacks villainous evil-doers might launch against such a website. For each attack, describe how you might mitigate the threat. If no solution is possible to a particular threat, describe a partial solution that fails to work in all cases and explain why. Each example you choose should involve programming language concepts in the attack or in the defense; explain all such relevant concepts.

Security attacks typically occur with respect to some policy such as "outside users should not be able to crash my machine" or "outside users should not be able to see the contents of the test cases" or "outside users should not be able to cheat". For each attack you should make up and state a relevant policy that it violates.
Similarly, you may make up and state any reasonable server assumptions you like (e.g., "a database is used to store the file contents" or "the normal filesystem is used to store file contents" or "a static buffer is used to store the uploaded file").

Answer on next page.

Your answer must fit in this box using 12-point Times New Roman.

8. Language Feature Interactions

Consider the following language features:

(1) linking and shared libraries
(2) garbage collection
(3) profiling
(4) debugging
(5) arrays
(6) subtyping and/or polymorphism
(7) compiler optimizations
(8) exception handling
(9) static type checking

Choose three different pairs of features. For each feature pair, describe how the interaction or combination of those two features leads to errors or challenges for the language implementor or user. Then describe the best possible solution you can think of. For example, you might note that arrays combined with compiler optimizations that remove bound checks lead to buffer overflow vulnerabilities, explain all of the concepts involved, and then discuss solutions (you may not use this example in your answer). Be very specific: this is an easy question to mistakenly gloss over, but I want details.

Your answer must fit in this box using 12-point Times New Roman.

Your Comments

(2 points of Extra Credit.) Indicate if you have completed Zak Fry's human study, located at http://church.cs.virginia.edu/~zpf5a/bugFinder/ . Thoughts? Comments?

Answer goes here. Use as much space as you like.

(1 point of Extra Credit.) What were your favorite aspects of CS 4610? Favorite topics? Favorite things the professor did or didn't do?

Answer goes here. Use as much space as you like.

(1 point of Extra Credit.) What were your least favorite aspects of CS 4610? What would you change for next time?

Answer goes here. Use as much space as you like.

(1 point of Extra Credit.)
Write an "advertisement" (e.g., "you should take this class because ...") or bit of "advice" (e.g., "you should start PA4 early because ...") for students taking this class next year. I will present your text verbatim (but anonymously) to next year's students when they are considering taking the course (e.g., in the first week of class) and also add your advice to the project description pages.

Answer goes here. Use as much space as you like.