2 Values and Types

Types of values. Primitive, composite, recursive types.
Type systems: static vs dynamic typing, type completeness.
Expressions.
Implementation notes.

© 2004, D.A. Watt, University of Glasgow

Types (1)

Values are grouped into types according to the operations that may be performed on them.
Different PLs support different types of values (according to their intended application areas):
• Ada: booleans, characters, enumerands, integers, real numbers, records, arrays, discriminated records, objects (tagged records), strings, pointers to data, pointers to procedures.
• C: enumerands, integers, real numbers, structures, arrays, unions, pointers to variables, pointers to functions.
• Java: booleans, integers, real numbers, arrays, objects.
• Haskell: booleans, characters, integers, real numbers, tuples, disjoint unions, lists, recursive types.

Types (2)

Roughly speaking, a type is a set of values:
• v is a value of type T if v ∈ T.
• E is an expression of type T if E is guaranteed to yield a value of type T.
But only certain sets of values are types:
• {false, true} is a type, since the operations not, and, and or operate uniformly over the values false and true.
• {…, –2, –1, 0, +1, +2, …} is a type, since operations such as addition and multiplication operate uniformly over all these values.
• {13, true, Monday} is not considered to be a type, since there are no useful operations over this set of values.

Types (3)

More precisely, a type is a set of values, equipped with one or more operations that can be applied uniformly to all these values.
The cardinality of a type T, written #T, is the number of values of type T.

Primitive types

A primitive value is one that cannot be decomposed into simpler values.
A primitive type is one whose values are primitive.
Every PL provides built-in primitive types. Some PLs also allow programs to define new primitive types.
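The view of a type as a set of values with uniformly applicable operations can be sketched in Python. This is only an illustration under stated assumptions: the names Boolean, Mixed, cardinality, and closed_under are hypothetical (they do not come from the slides), and only small finite sets are modelled.

```python
# A minimal sketch: model a small finite type as a frozenset of its values.
# All names here are illustrative assumptions, not part of the slides.

Boolean = frozenset({False, True})
Mixed = frozenset({13, True, "Monday"})   # an arbitrary set, not a sensible type


def cardinality(t):
    """Return #T, the number of values in the (finite) type t."""
    return len(t)


def closed_under(t, op):
    """Check that the unary operation op can be applied to every value of t
    and always yields a value of t again."""
    try:
        return all(op(v) in t for v in t)
    except TypeError:
        # op cannot be applied to some value, so it is not uniform over t.
        return False


print(cardinality(Boolean))                     # number of boolean values
print(closed_under(Boolean, lambda b: not b))   # "not" is uniform over Boolean
print(closed_under(Mixed, lambda v: v + 1))     # "+ 1" is not uniform over Mixed
```

The check only covers one unary operation at a time; it is meant to show why {false, true} qualifies as a type while an arbitrary mixed set does not.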
Built-in primitive types (1)

Typical built-in primitive types:
• Boolean = {false, true}
• Character = {…, ‘A’, …, ‘Z’, …, ‘0’, …, ‘9’, …}: a PL- or implementation-defined set of characters (ASCII, ISO-Latin, or Unicode)
• Integer = {…, –2, –1, 0, +1, +2, …}: a PL- or implementation-defined set of whole numbers
• Float = {…, –1.0, …, 0.0, +1.0, …}: a PL- or implementation-defined set of real numbers
The names of these types vary from one PL to another; the differences are not significant.

Built-in primitive types (2)

Cardinalities:
• #Boolean = 2
• #Character = 128 (ASCII), 256 (ISO-Latin), or 65 536 (16-bit Unicode)
• #Integer = max integer – min integer + 1
Note: In some PLs (such as C), booleans and characters are just small integers.
Some languages provide not one but several integer types. For example, Java provides:
• byte {–128, …, +127}, #byte = 2^8
• short {–32 768, …, +32 767}, #short = 2^16
• int {–2 147 483 648, …, +2 147 483 647}, #int = 2^32
• long {–9 223 372 036 854 775 808, …, +9 223 372 036 854 775 807}, #long = 2^64

Defined primitive types

In Ada we can define new numeric types.
In Ada and C we can define new enumeration types simply by enumerating their values (called enumerands).
Ada type definition:
    type Month is (jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec);
C/C++ type definition:
    enum Month {jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec};

Example: Ada numerics

Type declaration:
    type Population is range 0 .. 1e10;
Set of values:
    Population = {0, 1, …, 10^10}
Cardinality:
    #Population = 10^10 + 1

Example: Ada enumerations

Type declaration:
    type Color is (red, green, blue);
Set of values:
    Color = {red, green, blue}
Cardinality:
    #Color = 3

Composite types

A composite value is one that is composed from simpler values.
A composite type is a type whose values are composite.
PLs support a huge variety of composite types.
All these can be understood in terms of a few concepts:
• Cartesian products (tuples, structures, records)
• mappings (arrays)
• disjoint unions (algebraic data types, discriminated records, objects)
• recursive types (lists, trees, etc.)

Cartesian products (1)

In a Cartesian product, values of several types are grouped into tuples.
Let (x, y) stand for the pair whose first component is x and whose second component is y.
Let S × T stand for the set of all pairs (x, y) such that x is chosen from set S and y is chosen from set T:
    S × T = { (x, y) | x ∈ S; y ∈ T }
Cardinality:
    #(S × T) = #S × #T    (hence the “×” notation)

Cartesian products (2)

We can generalise from pairs to tuples. Let S1 × S2 × … × Sn stand for the set of all n-tuples such that the ith component is chosen from Si:
    S1 × S2 × … × Sn = { (x1, x2, …, xn) | x1 ∈ S1; x2 ∈ S2; …; xn ∈ Sn }
Basic operations on pairs:
• construction of a pair from its component values
• selection of the first or second component of a pair.
Records (Ada), structures (C), and tuples (Haskell) can all be understood in terms of Cartesian products.

Example: Ada records (1)

Type declarations:
    type Month is (jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec);
    type Day_Number is range 1 .. 31;
    type Date is
       record
          m: Month;
          d: Day_Number;
       end record;
Application code:
    someday: Date := (jan, 1);            -- record construction
    …
    put(Month'Pos(someday.m) + 1);        -- component selection
    put("/"); put(someday.d);
    someday.d := 29; someday.m := feb;    -- component selection

Example: Ada records (2)

Set of values:
    Date = Month × Day_Number = {jan, feb, …, dec} × {1, …, 31}
viz:
    (jan, 1)    (feb, 1)    …   (dec, 1)
    (jan, 2)    (feb, 2)    …   (dec, 2)
    …
    (jan, 30)   (feb, 30)   …   (dec, 30)
    (jan, 31)   (feb, 31)   …   (dec, 31)
NB: some of these pairs, such as (feb, 30) and (feb, 31), do not correspond to real dates.
Cardinality:
    #Date = #Month × #Day_Number = 12 × 31 = 372

Example: C structures (1)

Type declarations:
    enum Month {jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec};
    typedef unsigned char byte;    /* byte is not a built-in C type */
    struct Date {
        enum Month m;
        byte d;
    };
Structure construction:
    struct Date someday = {jan, 1};

Example: C structures (2)

Structure selection:
    printf("%d/%d", someday.m + 1, someday.d);
    someday.d = 29; someday.m = feb;
Set of values:
    Date = Month × Byte = {jan, feb, …, dec} × {0, …, 255}
Cardinality:
    #Date = #Month × #Byte = 12 × 256 = 3072

Example: Haskell tuples

Declarations:
    data Month = Jan | Feb | Mar | Apr | May | Jun
               | Jul | Aug | Sep | Oct | Nov | Dec
               deriving (Enum)    -- needed for succ below
    type Date = (Month, Int)
Set of values:
    Date = Month × Integer = {Jan, Feb, …, Dec} × {…, –1, 0, 1, 2, …}
Application code:
    someday = (Jan, 1)             -- tuple construction
    (m, d) = someday               -- component selection (by pattern matching)
    anotherday = (succ m, d)

Mappings (1)

We write m : S → T to state that m is a mapping from set S to set T. In other words, m maps every value in S to some value in T.
If m maps value x to value y, we write y = m(x). The value y is called the image of x under m.
Some of the mappings in {u, v} → {a, b, c}:
    m1 = {u → a, v → c}
    m2 = {u → c, v → c}
    m3 = {u → c, v → b}    (image of u is c, image of v is b)

Mappings (2)

Let S → T stand for the set of all mappings from S to T:
    S → T = { m | x ∈ S ⇒ m(x) ∈ T }
What is the cardinality of S → T?
There are #S values in S. Each value in S has #T possible images under a mapping in S → T. So there are #T × #T × … × #T possible mappings.
Thus:
    #(S → T) = (#T)^(#S)    (#S copies of #T multiplied together)
For example, in {u, v} → {a, b, c} there are 3^2 = 9 possible mappings.

Arrays (1)

Arrays (found in all imperative and OO PLs) can be understood as mappings.
If an array’s components are of type T and its index values are of type S, the array has one component of type T for each value in type S. Thus the array’s type is S → T.
An array’s length is the number of components, #S.
Basic operations on arrays:
• construction of an array from its components
• indexing: using a computed index value i to select a component (so we can select the ith component).

Arrays (2)

An array of type S → T is a finite mapping.
Here S is nearly always a finite range of consecutive values {l, l+1, …, u}, where l is the lower bound and u is the upper bound. This is called the array’s index range.
In C and Java, the index range must be {0, 1, …, n–1}. In Ada, the index range may be any primitive (sub)type other than Float.
We can generalise to n-dimensional arrays. If an array has index ranges of types S1, …, Sn, the array’s type is S1 × … × Sn → T.
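The counting argument for mappings can be checked by brute force. The following Python sketch enumerates every mapping in S → T as a dictionary; the helper name all_mappings is an assumption made for illustration, and the approach is only practical for very small sets.

```python
from itertools import product


def all_mappings(S, T):
    """Enumerate every mapping in S -> T, each represented as a dict."""
    S = list(S)
    T = list(T)
    # Choose one image in T for each value of S, in every possible way.
    return [dict(zip(S, images)) for images in product(T, repeat=len(S))]


mappings = all_mappings(["u", "v"], ["a", "b", "c"])
print(len(mappings))                       # (#T) ** (#S) mappings in total
print({"u": "c", "v": "b"} in mappings)    # m3 from the slides is among them

# A boolean array with index range {0, 1, 2} is one such finite mapping.
arrays = all_mappings(range(3), [False, True])
print(len(arrays))
```

Each dictionary plays the role of one array value: indexing the array corresponds to looking up a key.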
Example: C++ arrays (1)

Type declarations:
    bool p[3];
Application code:
    bool p[] = {true, false, true};    // array construction
    …
    p[c] = !p[c];                      // indexing (on both sides)

Example: C++ arrays (2)

Set of values:
    {0, 1, 2} → {false, true}
viz:
    {0 → false, 1 → false, 2 → false}
    {0 → false, 1 → false, 2 → true}
    {0 → false, 1 → true,  2 → false}
    {0 → false, 1 → true,  2 → true}
    {0 → true,  1 → false, 2 → false}
    {0 → true,  1 → false, 2 → true}
    {0 → true,  1 → true,  2 → false}
    {0 → true,  1 → true,  2 → true}
Cardinality:
    (#Boolean)^(#index) = 2^3 = 8

Example: Ada arrays (1)

Type declarations:
    type Color is (red, green, blue);
    type Pixel is array (Color) of Boolean;
Application code:
    p: Pixel := (true, false, true);    -- array construction
    c: Color;
    …
    p(c) := not p(c);                   -- indexing (on both sides)

Example: Ada arrays (2)

Set of values:
    Pixel = Color → Boolean = {red, green, blue} → {false, true}
viz:
    {red → false, green → false, blue → false}
    {red → false, green → false, blue → true}
    {red → false, green → true,  blue → false}
    {red → false, green → true,  blue → true}
    {red → true,  green → false, blue → false}
    {red → true,  green → false, blue → true}
    {red → true,  green → true,  blue → false}
    {red → true,  green → true,  blue → true}
Cardinality:
    #Pixel = (#Boolean)^(#Color) = 2^3 = 8

Example: Ada 2-dimensional arrays

Type declarations:
    type Xrange is range 0 .. 511;
    type Yrange is range 0 .. 255;
    type Window is array (Yrange, Xrange) of Pixel;
Set of values:
    Window = Yrange × Xrange → Pixel = {0, 1, …, 255} × {0, 1, …, 511} → Pixel
Cardinality:
    #Window = (#Pixel)^(#Yrange × #Xrange) = 8^(256 × 512)

Functions as mappings

Functions (found in all PLs) can also be understood as mappings. A function maps its argument(s) to its result.
If a function has a single argument of type S and its result is of type T, the function’s type is S → T.
Basic operations on functions:
• construction (or definition) of a function
• application: calling the function with a computed argument.
We can generalise to functions with n arguments.
If a function has arguments of types S1, …, Sn and its result type is T, the function’s type is S1 × … × Sn → T.

Example: C++ functions

Definition:
    bool isEven (int n) {
        return (n % 2 == 0);
    }
Type:
    Integer → Boolean
Value:
    {…, 0 → true, 1 → false, 2 → true, 3 → false, …}
Other functions of the same type: isOdd, isPrime, etc.

Example: Ada functions

Definition (or any other code that achieves the same effect):
    function is_even (n: Integer) return Boolean is
    begin
       return (n mod 2 = 0);
    end;
Type:
    Integer → Boolean
Value:
    {…, 0 → true, 1 → false, 2 → true, 3 → false, …}
Other functions of the same type: is_odd, is_prime, etc.

Disjoint unions (1)

In a disjoint union, a value is chosen from one of several different types.
Let S + T stand for a set of disjoint-union values, each of which consists of a tag together with a variant chosen from either type S or type T. The tag indicates the type of the variant:
    S + T = { left x | x ∈ S } ∪ { right y | y ∈ T }
• left x is a value with tag left and variant x chosen from S
• right y is a value with tag right and variant y chosen from T.
Let us write left S + right T (instead of S + T) when we want to make the tags explicit.

Disjoint unions (2)

Cardinality:
    #(S + T) = #S + #T    (hence the “+” notation)
Basic operations on disjoint-union values in S + T:
• construction of a disjoint-union value from its tag and variant
• tag test, to determine whether the variant was chosen from S or T
• projection, to recover either the variant in S or the variant in T.
Algebraic data types (Haskell), discriminated records (Ada), and objects (Java) can all be understood in terms of disjoint unions.
We can generalise to multiple variants: S1 + S2 + … + Sn.

Example: Haskell algebraic data types (1)

Type declaration:
    data Number = Exact Int | Inexact Float
Each Number value consists of a tag, together with either an Integer variant (if the tag is Exact) or a Float variant (if the tag is Inexact).
Set of values:
    Number = Exact Integer + Inexact Float
viz:
    … Exact(–2)  Exact(–1)  Exact 0  Exact 1  Exact 2 …
    … Inexact(–1.0) … Inexact 0.0 … Inexact 1.0 …
Cardinality:
    #Number = #Integer + #Float

Example: Haskell algebraic data types (2)

Application code:
    pi = Inexact 3.1416             -- disjoint-union construction

    rounded :: Number -> Int
    rounded num =
      case num of                   -- tag test
        Exact i   -> i              -- projection (by pattern matching)
        Inexact r -> round r

Example: Ada discriminated records (1)

Type declarations:
    type Accuracy is (exact, inexact);
    type Number (acc: Accuracy := exact) is
       record
          case acc is
             when exact   => ival: Integer;
             when inexact => rval: Float;
          end case;
       end record;
Each Number value consists of a tag field named acc, together with either an Integer variant field named ival (if the tag is exact) or a Float variant field named rval (if the tag is inexact).

Example: Ada discriminated records (2)

Set of values:
    Number = exact Integer + inexact Float
viz:
    … exact(–2)  exact(–1)  exact 0  exact 1  exact 2 …
    … inexact(–1.0) … inexact 0.0 … inexact 1.0 …
Cardinality:
    #Number = #Integer + #Float

Example: Ada discriminated records (3)

Type declarations:
    type Form is (pointy, circular, rectangular);
    type Figure (f: Form := pointy) is
       record
          x, y: Float;
          case f is
             when pointy      => null;
             when circular    => r: Float;
             when rectangular => w, h: Float;
          end case;
       end record;
Each Figure value consists of a tag field named f, together with a pair of Float fields named x and y, together with either an empty variant, or a Float variant field named r, or a pair of Float variant fields named w and h.
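The three basic operations on disjoint unions (construction, tag test, projection) can also be sketched in Python, here for the Number example. This is an illustration under assumptions, not how any of the PLs above implement it: the class names mirror the Haskell tags, the Python class itself serves as the tag, and disjoint_union is a hypothetical helper for small finite sets.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Exact:
    value: int        # variant chosen from Integer


@dataclass(frozen=True)
class Inexact:
    value: float      # variant chosen from Float


Number = Union[Exact, Inexact]


def rounded(num: Number) -> int:
    if isinstance(num, Exact):     # tag test
        return num.value           # projection of the Integer variant
    return round(num.value)        # projection of the Float variant


def disjoint_union(S, T):
    """Build S + T for small finite sets by pairing each value with a tag."""
    return {("left", x) for x in S} | {("right", y) for y in T}


print(rounded(Exact(2)))
print(rounded(Inexact(3.1416)))
# The tags keep equal values from S and T apart, so #(S + T) = #S + #T.
print(len(disjoint_union({0, 1}, {0, 1, 2})))
```

Without the tags, the plain set union of {0, 1} and {0, 1, 2} would have only three elements, which is why the union must be disjoint for the cardinality rule to hold.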
Example: Ada discriminated records (4)

Set of values:
    Figure = pointy(Float × Float)
           + circular(Float × Float × Float)
           + rectangular(Float × Float × Float × Float)
e.g.:
    pointy(1.0, 2.0)                   represents the point (1, 2)
    circular(0.0, 0.0, 5.0)            represents a circle of radius 5 centered at (0, 0)
    rectangular(1.5, 2.0, 3.0, 4.0)    represents a 3 × 4 rectangle centered at (1.5, 2)
    …

Example: Ada discriminated records (5)

Application code:
    box: Figure := (rectangular, 1.5, 2.0, 3.0, 4.0);   -- discriminated-record construction

    function area (fig: Figure) return Float is
    begin
       case fig.f is                                     -- tag test
          when pointy =>
             return 0.0;
          when circular =>
             return 3.1416 * fig.r**2;                   -- projection
          when rectangular =>
             return fig.w * fig.h;                       -- projection
       end case;
    end;

Example: Java objects (1)

Type declarations:
    class Point {
        private float x, y;
        … // methods
    }
    class Circle extends Point {       // inherits x and y from Point
        private float r;
        … // methods
    }
    class Rectangle extends Point {    // inherits x and y from Point
        private float w, h;
        … // methods
    }

Example: Java objects (2)

Set of objects in this program:
    Point(Float × Float)
    + Circle(Float × Float × Float)
    + Rectangle(Float × Float × Float × Float)
    + …
The set of objects is open-ended. It is augmented by any further class declarations.
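The open-endedness of the set of objects can be illustrated with a small Python class hierarchy that mirrors the Point, Circle, and Rectangle classes and the Ada area function above. The Square class and the total_area helper are hypothetical additions, included only to show that a later declaration extends the set of objects without changing existing code.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def area(self):
        return 0.0


class Circle(Point):
    def __init__(self, x, y, r):
        super().__init__(x, y)       # x and y are inherited from Point
        self.r = r

    def area(self):                  # overrides Point's area()
        return 3.1416 * self.r ** 2


class Rectangle(Point):
    def __init__(self, x, y, w, h):
        super().__init__(x, y)
        self.w, self.h = w, h

    def area(self):                  # overrides Point's area()
        return self.w * self.h


# A later (hypothetical) declaration adds a new variant to the set of objects.
class Square(Rectangle):
    def __init__(self, x, y, side):
        super().__init__(x, y, side, side)


def total_area(figures):
    # Each call selects the area() method of the object's own class.
    return sum(fig.area() for fig in figures)


figures = [Point(1.0, 2.0), Circle(0.0, 0.0, 5.0),
           Rectangle(1.5, 2.0, 3.0, 4.0), Square(0.0, 0.0, 2.0)]
print(total_area(figures))
```

Note that total_area needs no explicit tag test: the object's class acts as the tag, and method dispatch performs the test implicitly.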
Example: Java objects (3)

Methods:
    class Point {
        …
        public float area() { return 0.0f; }
    }
    class Circle extends Point {
        …
        public float area() { return 3.1416f * r * r; }    // overrides Point’s area() method
    }
    class Rectangle extends Point {
        …
        public float area() { return w * h; }              // overrides Point’s area() method
    }

Example: Java objects (4)

Application code:
    Rectangle box = new Rectangle(1.5f, 2.0f, 3.0f, 4.0f);
    float a1 = box.area();
    Point it = …;             // it can refer to a Point, Circle, or Rectangle object
    float a2 = it.area();     // calls the appropriate area() method

Composite values

    Concept             Set of values   Cardinality             C++, Java               Ada                     General set of values
    Cartesian product   S × T           #(S × T) = #S × #T      structures              records                 S1 × S2 × … × Sn
    Mappings            S → T           #(S → T) = (#T)^(#S)    arrays (S is integer)   arrays                  S1 × S2 × … × Sn → T
    Disjoint unions     S + T           #(S + T) = #S + #T      objects                 discriminated records   S1 + S2 + … + Sn

Recursive types

A recursive type is one defined in terms of itself.
Examples of recursive types:
• lists
• trees

Lists (1)

A list is a sequence of 0 or more component values.
The length of a list is its number of components. The empty list has no components.
A non-empty list consists of a head (its first component) and a tail (all but its first component).
A list is homogeneous if all its components are of the same type. Otherwise it is heterogeneous.

Lists (2)

Typical list operations:
• length
• emptiness test
• head selection
• tail selection
• concatenation.

Lists (3)

For example, an integer-list may be defined recursively to be either empty or a pair consisting of an integer (its head) and a further integer-list (its tail):
    Integer-List = nil Unit + cons(Integer × Integer-List)
or
    Integer-List = { nil } ∪ { cons(i, l) | i ∈ Integer; l ∈ Integer-List }
where Unit is a type with only one (empty) value.
Solution:
    Integer-List = { nil }
                 ∪ { cons(i, nil) | i ∈ Integer }
                 ∪ { cons(i, cons(j, nil)) | i, j ∈ Integer }
                 ∪ { cons(i, cons(j, cons(k, nil))) | i, j, k ∈ Integer }
                 ∪ …

Example: Haskell lists

Type declaration for integer-lists:
    data IntList = Nil | Cons Int IntList    -- recursive
Some IntList constructions:
    Nil
    Cons 2 (Cons 3 (Cons 5 (Cons 7 Nil)))
Actually, Haskell has built-in list types:
    [Int]    [String]    [[Int]]
Some list constructions:
    []    [2,3,5,7]    ["cat","dog"]    [[1],[2,3]]

Example: Ada lists

Type declarations for integer-lists (mutually recursive):
    type IntNode;
    type IntList is access IntNode;
    type IntNode is
       record
          head: Integer;
          tail: IntList;
       end record;
An IntList construction:
    new IntNode'(2, new IntNode'(3, new IntNode'(5, new IntNode'(7, null))))

Example: Java lists (1)

Class declarations for integer-lists:
    class IntList {
        public int head;
        public IntList tail;    // recursive
        public IntList (int h, IntList t) {
            head = h; tail = t;
        }
    }
An integer-list construction:
    new IntList(2, new IntList(3, new IntList(5, new IntList(7, null))));

Example: Java lists (2)

Class declarations for object-lists:
    class List {
        public Object head;
        public List tail;
        public List (Object h, List t) {
            head = h; tail = t;
        }
    }
Note that List objects are heterogeneous lists (since head can refer to an object of any class). By contrast, IntList objects are homogeneous lists.

Strings

A string is a sequence of 0 or more characters.
Some PLs (ML, Python) treat strings as primitive.
Haskell treats strings as lists of characters. Strings are thus equipped with general list operations (length, head selection, tail selection, concatenation, …).
Ada treats strings as arrays of characters. Strings are thus equipped with general array operations (length, indexing, slicing, concatenation, …).
Java treats strings as objects, of class String.

Type systems

A PL’s type system groups values into types:
• to enable programmers to describe data effectively
• to help prevent type errors.
A type error occurs if a program performs a nonsensical operation such as multiplying a string by a boolean.
Possession of a type system distinguishes high-level PLs from low-level languages (such as assembly languages). In the latter, the only “types” are bytes and words, so nonsensical operations cannot be prevented.

Static vs dynamic typing (1)

Before any operation is performed, its operands must be type-checked to prevent a type error. E.g.:
• mod operation: check that both operands are integers
• and operation: check that both operands are booleans
• indexing operation: check that the left operand is an array, and that the right operand is a value of the array’s index type.

Static vs dynamic typing (2)

In a statically typed PL:
• all variables and expressions have fixed types (either stated by the programmer or inferred by the compiler)
• all operands are type-checked at compile-time.
Most PLs are statically typed, including Ada, C, C++, Java, Haskell.

Static vs dynamic typing (3)

In a dynamically typed PL:
• values have fixed types, but variables and expressions do not
• operands must be type-checked when they are computed at run-time.
Some PLs and many scripting languages are dynamically typed, including Smalltalk, Lisp, Prolog, Perl, Python.

Example: C++ static typing

C++ function definition:
    bool even (int n) {
        return (n % 2 == 0);
    }
The compiler doesn’t know the value of n. But, knowing that n’s type is int, it infers that the type of “n % 2 == 0” will be bool.
Call:
    int p;
    … even(p+1) …
The compiler doesn’t know the value of p. But, knowing that p’s type is int, it infers that the type of “p+1” will be int.
Even without knowing the values of variables and parameters, the C++ compiler can guarantee that no type errors will happen at run-time.

Example: Ada static typing

Ada function definition:
    function is_even (n: Integer) return Boolean is
    begin
       return (n mod 2 = 0);
    end;
The compiler doesn’t know the value of n. But, knowing that n’s type is Integer, it infers that the type of “n mod 2 = 0” will be Boolean.
Call:
    p: Integer;
    … if is_even(p+1) …
The compiler doesn’t know the value of p. But, knowing that p’s type is Integer, it infers that the type of “p+1” will be Integer.
Even without knowing the values of variables and parameters, the Ada compiler can guarantee that no type errors will happen at run-time.

Example: Python dynamic typing (1)

Python function definition:
    def even (n):
        return (n % 2 == 0)
The type of n is unknown. So the “%” (mod) operation must be protected by a run-time type check.
The types of variables and parameters are not declared, and cannot be inferred by the Python compiler. So run-time type checks are needed to detect type errors.

Example: Python dynamic typing (2)

Python function definition:
    def respond (prompt):
        # Print prompt and return the user’s response,
        # as an integer if possible, otherwise as a string.
        try:
            response = input(prompt)    # yields a string
            return int(response)        # converts the string to an integer,
                                        # or raises ValueError if impossible
        except ValueError:
            return response
Application code:
    m = respond("Month? ")
    if m == "Jan":
        m = 1
    elif m == "Feb":
        m = 2

Static vs dynamic typing (4)

Pros and cons of static and dynamic typing:
• Static typing is more efficient. Dynamic typing requires run-time type checks (which make the program run slower), and forces all values to be tagged (to make the type checks possible). Static typing requires only compile-time type checks, and does not force values to be tagged.
• Static typing is more secure: the compiler can guarantee that the object program contains no type errors. Dynamic typing provides no such security.
• Dynamic typing is more flexible. This is needed by some applications where the types of the data are not known in advance.

Expressions

An expression is a PL construct that may be evaluated to yield a value.
Forms of expressions:
• literals (trivial)
• constant/variable accesses (trivial)
• constructions
• function calls
• conditional expressions
• iterative expressions.

Literals

The simplest kind of expression is a literal, which denotes a fixed value of some type.
Here are some typical examples of literals in programming languages:
    365    3.1416    false    '%'    "What?"
These denote an integer, a real number, a boolean, a character, and a string, respectively.

Constructions

A construction is an expression that constructs a composite value from its component values.
In C, the component values are restricted to be literals.
In Ada, Java, and Haskell, the component values are computed by evaluating subexpressions.

Example: Ada record and array constructions

Record constructions:
    type Date is
       record
          m: Month;
          d: Day_Number;
       end record;
    today: Date := (Dec, 25);
    tomorrow: Date := (today.m, today.d+1);
Array construction:
    leap: Integer range 0 .. 1;
    …
    month_length: array (Month) of Integer :=
       (31, 28+leap, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);

Example: C structures

Assume:
    struct Date {
        enum Month m;
        byte d;
    };
Structure construction:
    struct Date someday = {jan, 1};
or:
    someday.d = 1; someday.m = jan;
    struct Date tomorrow = {someday.m, someday.d+1};

Example: Java object constructions

Assume:
    class Date {
        public int m, d;
        public Date (int m, int d) {
            this.m = m; this.d = d;
        }
        …
    }
Object constructions:
    Date today = new Date(12, 25);
    Date tomorrow = new Date(today.m, today.d+1);

Example: C++ array constructions

Array construction:
    int size[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    …
    if (is_leap(this_year)) size[1] = 29;

Example: Haskell tuple and list constructions

Tuple constructions:
    today = (Dec, 25)
    (m, d) = today
    tomorrow = (m, d+1)
List construction:
    monthLengths = [31, if isLeap y then 29 else 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

Function calls (1)

A function call computes a result by applying a function to some arguments.
If the function has a single argument, a function call typically has the form “F(E)”, or just “F E”, where F determines the function to be applied, and the expression E is evaluated to determine the argument.
In most PLs, F is just the identifier of a specific function. However, in PLs where functions are first-class values, F may be any expression yielding a function. E.g., this Haskell function call:
    (if … then sin else cos)(x)

Function calls (2)

If a function has n parameters, the function call typically has the form “F(E1, …, En)”.
We can view this function call as passing a single argument that is an n-tuple.

Function calls (3)

An operator may be thought of as denoting a function.
Applying a unary operator ⊕ to its operand is essentially a function call with one argument:
    ⊕ E    is essentially equivalent to    ⊕(E)
Applying a binary operator ⊗ to its operands is essentially a function call with two arguments:
    E1 ⊗ E2    is essentially equivalent to    ⊗(E1, E2)
Thus a conventional arithmetic expression is essentially equivalent to a composition of function calls:
    a * b + c / d    is essentially equivalent to    +(*(a, b), /(c, d))

Conditional expressions

A conditional expression chooses one of its subexpressions to evaluate, depending on a condition.
An if-expression chooses from two subexpressions, using a boolean condition.
A case-expression chooses from several subexpressions.
Conditional expressions are commonplace in functional PLs, but less common in imperative/OO PLs.

Example: Java if-expressions

Java if-expression:
    x>y ? x : y
Conditional expressions tend to be more elegant than conditional commands. Compare:
    int max1 (int x, int y) {
        return (x>y ? x : y);
    }
    int max2 (int x, int y) {
        if (x>y) return x;
        else return y;
    }

Example: Haskell if- and case-expressions

Haskell if-expression:
    if x>y then x else y
Haskell case-expression:
    case m of
      Feb -> if isLeap y then 29 else 28
      Apr -> 30
      Jun -> 30
      Sep -> 30
      Nov -> 30
      _   -> 31

Iterative expressions

An iterative expression is one that performs a computation over a series of values (typically the components of an array or list), yielding some result.
Iterative expressions are uncommon, but they are supported by Haskell in the form of list comprehensions.

Example: Haskell list comprehensions

Given a list of characters cs, convert all lowercase letters to uppercase, yielding a modified list of characters:
    [if isLower c then toUpper c else c | c <- cs]
Given a list of year numbers ys, compute a list (in the same order) of those year numbers in ys that are not leap years:
    [y | y <- ys, not (isLeap y)]

Implementation notes

Values and types are mathematical abstractions.
In a computer, each value is represented by a bit sequence stored in one or more bytes or words.
Important principle: all values of the same type must be represented in a uniform way. (But values of different types can be represented in different ways.)
Sometimes the representation of a type is PL-defined (e.g., Java primitive types).
More commonly, the representation is implementation-defined, i.e., chosen by the compiler.

Representation of primitive types (1)

Each primitive type T is typically represented by single or multiple bytes: usually 8 bits, 16 bits, 32 bits, or 64 bits.
The choice of representation is constrained by the type’s cardinality, #T:
• With n bits we can represent at most 2^n different values.
• So the smallest possible representation is ⌈log2(#T)⌉ bits.
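The relationship between a type’s cardinality and the minimum size of its representation can be computed directly. The sketch below assumes a hypothetical helper named min_bits and uses integer arithmetic rather than a floating-point logarithm, so that large cardinalities are handled exactly.

```python
def min_bits(cardinality):
    """Smallest n such that 2**n >= cardinality, i.e. ceil(log2(cardinality))."""
    if cardinality < 1:
        raise ValueError("a type must have at least one value")
    # (cardinality - 1) is the largest code needed when values are numbered from 0.
    return (cardinality - 1).bit_length()


print(min_bits(2))        # Boolean
print(min_bits(12))       # an enumeration with 12 enumerands, such as Month
print(min_bits(256))      # an 8-bit character set
print(min_bits(2 ** 16))  # a 16-bit character set
```

This gives only a lower bound: a compiler will usually round the result up to a whole number of bytes or words.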
Representation of primitive types (2)

Booleans can in principle be represented by a single bit (0 for false and 1 for true). In practice, the compiler is likely to choose a whole byte.
Characters have a representation determined by the character set:
• ASCII or ISO-Latin characters have an 8-bit representation
• Unicode characters have a 16-bit representation.
Enumerands are typically represented by unsigned integers starting from 0.
• E.g., the enumerands of type Month above would be represented by the integers {0, …, 11}. The representation must have at least 4 bits. In practice the compiler is likely to choose a whole byte.

Representation of primitive types (3)

Integers have a representation influenced by the desired range.
Assuming two’s complement representation, in n bits we can represent the integers {–2^(n–1), …, 2^(n–1) – 1}:
• In a PL where the compiler gets to choose the number of bits n, from that we can deduce the range of integers.
• In a PL where the programmer defines the range of integers, the compiler must use that range to determine the minimum n. E.g., if the range is {0, …, 10^10}, the representation must have at least 35 bits. In practice the compiler is likely to choose 64 bits.
Real numbers have a representation influenced by the desired range and precision. Nowadays most compilers adopt the IEEE floating-point standard (either 32 or 64 bits).

Representation of Cartesian products

Tuples, records, and structures are represented by juxtaposing the components in a fixed order.
Example (Ada):
    type Date is
       record
          y: Year_Number;
          m: Month;
          d: Day_Number;
       end record;
Two example values, each laid out as consecutive fields y, m, d:
    y 2000 | m jan | d 1
    y 2004 | m dec | d 25
Implementation of component selection:
• Let r be a record or structure.
• Each component r.f has a fixed offset (determined by the compiler) relative to the base address of r.

Representation of arrays (1)

The values of an array type are represented by juxtaposing the components in ascending order of indices.
Example (Ada):
    type Vector is array (1 .. 3) of Float;
Two example values, with components stored at indices 1, 2, 3:
    1: 3.0 | 2: 4.0 | 3: 0.0
    1: 1.0 | 2: 1.0 | 3: 0.5

Representation of arrays (1)

Example (Ada): arrays of type Pixel
Example (C++): arrays of type bool[]

Representation of arrays (2)

Implementation of array indexing:
• Let a be an array with index range {l, …, u}.
• Assume that each component occupies s bytes (determined by the compiler).
• Then a(i) has offset s × (i – l) bytes relative to the base address of a. (In C and Java l = 0, so this simplifies to s × i bytes.)
• The offset computation must be done at run-time (since the value of i is not known until run-time).
• A range check must also be done at run-time, to ensure that l ≤ i ≤ u.

Representation of disjoint unions (1)

Each value of a disjoint-union type is represented by juxtaposing a tag with one of the possible variants. The type (and therefore representation) of the variant depends on the current value of the tag.
Example (Haskell):
    data Number = Exact Int | Inexact Float
Two example values:
    tag Exact   | variant 2
    tag Inexact | variant 3.1416

Representation of disjoint unions (2)

Example (Ada):
    type Accuracy is (exact, inexact);
    type Number (acc: Accuracy := exact) is
       record
          case acc is
             when exact   => ival: Integer;
             when inexact => rval: Float;
          end case;
       end record;
Two example values:
    acc exact   | ival 2
    acc inexact | rval 3.1416

Representation of disjoint unions (3)

Example (Ada):
    type Form is (pointy, circular, rectangular);
    type Figure (f: Form := pointy) is
       record
          x, y: Float;
          case f is
             when pointy      => null;
             when circular    => r: Float;
             when rectangular => w, h: Float;
          end case;
       end record;
Three example values:
    f pointy | x 1.0 | y 2.0
    f circ.  | x 0.0 | y 0.0 | r 5.0
    f rect.  | x 1.5 | y 2.0 | w 3.0 | h 4.0

Representation of objects (simplified)

Example (Java):
    class Point {
        private float x, y;
        … // methods
    }
    class Circle extends Point {
        private float r;
        … // methods
    }
    class Rectangle extends Point {
        private float w, h;
        … // methods
    }
Three example objects, each starting with a class tag:
    tag Point  | x 1.0 | y 2.0
    tag Circle | x 0.0 | y 0.0 | r 5.0
    tag Rect.  | x 1.5 | y 2.0 | w 3.0 | h 4.0

Representation of disjoint unions (4)

Implementation of tag test and projection:
• Let u be a disjoint-union value/object.
• The tag of u has an offset of 0 relative to the base of u.
• Each variant of u has a fixed offset (determined by the compiler) relative to the base of u.

Representation of recursive types (1)

Each value of a recursive type is represented by a pointer (whether the PL has explicit pointers or not).
Example (Ada):
    type IntNode;
    type IntList is access IntNode;
    type IntNode is
       record
          head: Integer;
          tail: IntList;
       end record;
The list 2, 3, 5, 7 is represented by a chain of four nodes, each holding a head and a tail pointer to the next node (null in the last node).

Representation of recursive types (3)

Example (Java):
    class IntList {
        public int head;
        public IntList tail;
        …
    }
The list 2, 3, 5, 7 is represented by a chain of four IntList objects, each holding a tag, a head, and a tail pointer to the next object.

Representation of recursive types (2)

Example (Haskell):
    data IntList = Nil | Cons Int IntList
The list is represented by a chain of tagged cells:
    Cons 2 → Cons 3 → Cons 5 → Cons 7 → Nil
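The pointer-based representation of a recursive type can be sketched in Python, where every object reference is implicitly a pointer. This is a minimal illustration under assumptions: the names IntNode, from_values, and length are hypothetical, and None plays the role of the empty list (nil or null).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IntNode:
    head: int
    tail: Optional["IntNode"]    # reference to the rest of the list, or None


def from_values(values):
    """Build a linked list by adding nodes from the last value to the first."""
    lst = None
    for v in reversed(values):
        lst = IntNode(v, lst)
    return lst


def length(lst):
    """Follow the tail references until the empty list is reached."""
    n = 0
    while lst is not None:
        n += 1
        lst = lst.tail
    return n


primes = from_values([2, 3, 5, 7])
print(length(primes))
print(primes.head, primes.tail.head)
```

Because each node stores only a reference to its tail, nodes all have the same fixed size even though lists can be arbitrarily long.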