Slide Set 2 for ENCM 339 Fall 2014 Lecture Section 01

Steve Norman, PhD, PEng
Electrical & Computer Engineering
Schulich School of Engineering
University of Calgary
Fall Term, 2014

Contents

- The int and char types are both integer types
- Introduction to Pointers
- First example program with pointers
- Pointers as function arguments

The int and char types are both integer types

In mathematics, the integers are all of the members of this infinite set: { ..., −2, −1, 0, 1, 2, 3, ... }.

(In mathematics, numbers such as 0.5, √2, and π are real numbers that are not integers.)

C and C++ share a large collection of integer types, and share a bunch of messy and complicated rules regarding these types.

In contrast to the set of integers in mathematics, the range of values for a C or C++ integer type is finite. For any given integer type, there’s a minimum value, and there’s a maximum value.

Probably the two most frequently used integer types in C and C++ are int and char.

Really, char is an integer type!

It’s common to think of a char variable as a container for something like a letter (A–Z, a–z), a digit (0–9), a punctuation mark, or a space, etc.

However, to understand how C and C++ programs work, it’s extremely useful to know that char is an integer type, with a relatively small range of values.

The range of values for char is not the same for every C/C++ development system. The most common range is { −128, −127, −126, ..., 126, 127 }; this is the default for C programming on Linux, Mac OS X, and Windows. The second-most common range is { 0, 1, 2, ..., 254, 255 }.
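The ranges described above can be inspected on any particular system. The following is a minimal sketch that relies on the standard header <limits.h>; the function names char_value_count and print_integer_ranges are invented for illustration and do not come from the slides.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: how many distinct values a char can hold. */
long char_value_count(void)
{
    return (long)CHAR_MAX - (long)CHAR_MIN + 1;
}

/* Hypothetical helper: print the minimum and maximum values of
   char and int for the system the program was compiled on. */
void print_integer_ranges(void)
{
    printf("char: %d to %d\n", CHAR_MIN, CHAR_MAX);
    printf("int:  %d to %d\n", INT_MIN, INT_MAX);
}
```

Calling print_integer_ranges() from main shows which of the two char ranges a given development system uses. With either range, and assuming 8-bit bytes, char_value_count() comes out as 256.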
char variables are often used to hold character codes

For example, a very common character set is ASCII, the American Standard Code for Information Interchange, in which

- character codes 48 to 57 stand for digits 0 to 9, in that order;
- character codes 65 to 90 stand for letters A to Z, in that order;
- character codes 97 to 122 stand for letters a to z, in that order;
- there are various other character codes for spaces, tabs, newlines, punctuation, and so on.

Most current computers use extensions of the ASCII character set: ASCII is supported, along with some kind of encoding for characters that don’t appear in North American English.

Here’s a demonstration:

    #include <stdio.h>

    int main(void)
    {
        int i;
        printf("C program says ... ");
        for (i = 65; i < 68; i++)
            printf("%d %c ", i, i);
        printf("... bye from C!\n");
        return 0;
    }

What’s going on with %d and %c here, and what will the program output be?

Character constants

These begin and end with ', the “single quote” character, also known as “forward quote” or “apostrophe”.

Each character constant is really a convenient symbol for a number: a specific code for a single character. Here are a few examples, which assume that the ASCII character set is supported:

    constant   meaning               value
    'a'        code for letter a        97
    'B'        code for letter B        66
    '3'        code for digit 3         51
    '{'        code for left brace     123
    '\n'       code for newline         10
    '\\'       code for backslash       92

Important: Do not mix up single quotes and double quotes!

Double quotes are used for string constants, things like "hello", "h", and "". (Another term for string constant is string literal.)

In C and C++, a single-character string constant, say, "x", does not mean the same thing as the similar-looking character constant 'x'. (But in the Python programming language, "x" and 'x' do mean exactly the same thing!)
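Because a character constant is just a number, it can take part in ordinary integer arithmetic. The short sketch below illustrates that idea; the function names digit_value and next_code are invented for illustration.

```c
/* Hypothetical helper: convert the character code of a digit to the
   number it stands for. Assumes c really is a code for '0' to '9'. */
int digit_value(int c)
{
    return c - '0';   /* with ASCII, '3' - '0' is 51 - 48, which is 3 */
}

/* Hypothetical helper: the code that comes right after c. */
int next_code(int c)
{
    return c + 1;     /* with ASCII, 'B' + 1 is 67, the code for C */
}
```

Nothing comparable can be done with "3" or "B", because a string constant is not a single number.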
Finally, don’t get ' mixed up with the ` (“backtick”) character. You can’t make a C character constant with a backtick at either end.

Conversions between char and int

In most expressions involving one or more char values, each char value is converted to int before any arithmetic or comparison takes place.

Example 1. Let’s describe in detail how the assignment statement works in this code fragment:

    char a = 'G', b;
    int c = 3;
    b = a + c;

Example 2. Let’s explain how the comparison works in this code fragment, then write out the output . . .

    char x;
    for (x = 48; x < 58; x++)
        printf("%c", x);

Because arithmetic and comparisons involving char values are really done using int values, functions that receive character codes as arguments often have int arguments, not char arguments.

Example: Write a definition for isdigit. This function has one argument, called c, and its return value indicates whether or not c is a character code for a digit.

Connections to ENEL 353 material

Right about now, students in ENEL 353 are learning about unsigned integer systems and two’s complement signed integer systems.

Fact: In C and C++ on most current laptop, desktop, and server computers,

- the char type uses 8-bit two’s complement numbers;
- the int type uses 32-bit two’s complement numbers.

Introduction to Pointers

The upcoming major topics in ENCM 339 are pointer types and the relationships between arrays and pointers.

These topics are extremely important. You can’t be an effective C programmer if you don’t know them well.
You also can’t get a good grade in ENCM 339 if you don’t know them well.

The rest of Slide Set 2 will provide an introduction to pointer types. Slide Set 3 will show how pointers and arrays are related.

Computer memory: bits and bytes

One of the key uses for main memory (the RAM circuits, not the hard drive) in a computer system is to hold data (such as variables and function arguments) belonging to running programs.

The value of a bit is either 0 or 1. In a single-bit memory cell, a voltage close to ground represents a bit value of 0, and a somewhat higher voltage represents a bit value of 1.

In memory systems, bits are grouped together to form bytes. In modern systems, there are 8 bits per byte; in other words, the width of a byte is 8 bits.

Let’s write some examples of possible values for a byte, using the ENEL 353 notation for binary numbers.

Memory can be modeled as a giant array of bytes

(Reminder: A model is a simplified description of a natural or engineered system; a good model helps to understand and perhaps to predict the behaviour of a system.)

Each byte of memory has its own unique address, which is simply a number. The address of a byte indicates where the byte is located, not what the value of the byte is.

The address space of a computer is the set of all possible addresses.

Let’s sketch the address space of a computer system in which memory addresses are 32 bits wide.

Address spaces and memory capacity

Typically, the capacity of the memory system of a computer is much smaller than the address space.

Example: In 2014, a mid-range laptop has 8 GB (2^33 bytes) of RAM, and a 64-bit address space. What is the memory capacity expressed as a fraction of the address space size?

And a typical program uses much less than the entire memory capacity of a computer.
Usually a program has access to only a few relatively small regions within the address space.

Memory storage of variables and function arguments

If a C variable or function argument is in memory, it is stored in a group of adjacent bytes, that is, a sequence of bytes with consecutive addresses.

Let’s sketch this out for some common C types, for a few different systems:

1. typical laptop, desktop or server in 2014;
2. 32-bit smartphone or 32-bit embedded system;
3. low-power embedded system designed for minimal battery use.

General rules about sizes of types

A char is always exactly 1 byte in size.

For all other types, sizes vary, depending on hardware (design of processor and memory circuits), operating system, and sometimes on compiler settings.

The fact that most sizes are hardware- and OS-dependent is a major headache for C and C++ developers trying to “port” software from one “platform” to another.

The address of a variable

The address of a variable is the lowest address of all of the addresses of the bytes used for the variable.

Let’s illustrate that with a sketch of storage for an int variable called x. (To do this, we’ll have to make up some addresses for the bytes of x.)

Expressions

At this point it would be useful to provide a rough definition for the term expression.

An expression is an identifier (the name of a variable, argument or function), or a constant, or a meaningful chunk of C built using identifiers, constants, and/or operators.

Consider the statement

    y = -x + 7 * f(z);

Let’s list

- expressions within the statement that are identifiers;
- expressions that are constants;
- all the more complex expressions.

Types and values for expressions

Every expression has a specific type.
The type of a C expression is determined when a program is compiled, not when a program is run. (In “dynamically typed” languages such as Python, types of expressions often are determined at run-time.)

C expressions that are not constants usually have values that are computed as a program runs.

Pointers

A pointer expression is an expression that has a memory address as a value.

A pointer variable is a container for a memory address. A pointer argument is also a container for a memory address.

This is a similar idea to the idea that an int variable or argument is a container for an integer value within the range of the int type.

Example pointer variable declaration

    int *p;

(This use of * has nothing to do with multiplication.)

Let’s write down a couple of different ways to describe exactly what the example variable declaration means.

Attention! The name of the variable here is simply p; it is not *p! The * character is part of the type information, not part of the variable name.

I am not fussing unduly over a microscopic detail here. Trust me, getting this right really helps you to understand the use of pointer variables and arguments!

Key operators related to pointer types

A binary operator has two operands, but a unary operator has only one. Examples of binary and unary operators:

    a = b - c;  // Here - is binary: subtraction.
    d = -e;     // Here - is unary: negation.

The binary * operator does multiplication. The unary * operator is called the pointer dereference operator. Dereferencing a pointer means “accessing the data a pointer points to”. I’ll explain that with examples very soon.

The unary & operator is called the address-of operator. Use of & for “address-of” is unrelated to the use of & in C++ for reference types.

Let’s write out a very simple example use of the address-of operator.
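A very simple use of the address-of operator might look like the sketch below. The function name read_through_pointer and the variable x are invented for illustration; the dereference on the last line is only there so the effect of & can be observed.

```c
/* Hypothetical example: take the address of an int variable and
   store that address in a pointer variable. */
int read_through_pointer(void)
{
    int x = 5;
    int *p;      /* p is a container for the address of an int */

    p = &x;      /* &x is the address of x, so p now points to x */

    return *p;   /* dereference p: this reads the value of x, 5 */
}
```

Note that the assignment is to p, not to *p, which fits the point that the name of the variable is simply p.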
First example program with pointers

In textbooks, lecture slides and notes, some program examples resemble “production code”, that is, software that is intended to provide useful services to people or machines.

Other examples don’t resemble production code at all. Instead, they’re intended to explain programming language features in a way that is as clear and as brief as possible.

The upcoming example is definitely in the second of those two categories!

Let’s copy the program below, make some remarks about it, and mark points 1 to 5 so we can draw some memory diagrams.

    #include <stdio.h>

    int main(void)
    {
        int j, k;
        int *p;

        p = &j;
        *p = 3;
        p = &k;
        *p = 7;
        printf("j = %d, k = %d, *p = %d.\n", j, k, *p);
        return 0;
    }

Arrow notation for diagrams with pointers

Writing things like “addr. of j” in a diagram is inconvenient, so usually we’ll use “blobs and arrows” to indicate which addresses are contained in pointer variables and arguments.

Let’s redraw the diagram for point 5 in our most recent example, using arrow notation.

Addresses are numbers

Usually programmers don’t know the exact addresses of variables, but to understand the use of pointers, it’s sometimes useful to pretend that we do know those addresses.

Let’s make up some addresses for the bytes of p, k, and j in our most recent example, and draw yet another diagram for point 5.
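Instead of making up addresses, it is also possible to ask a running program what they are. The sketch below repeats the statements of the example and prints addresses with the %p conversion of printf; the function name show_addresses and its return value are invented for illustration. The addresses printed will differ between systems, and often between runs on the same system.

```c
#include <stdio.h>

/* Hypothetical helper: repeat the example's statements, then print
   the addresses involved. Returns 1 if p ends up holding the address
   of k and the variables have the expected values. */
int show_addresses(void)
{
    int j, k;
    int *p;

    p = &j;
    *p = 3;
    p = &k;
    *p = 7;

    /* %p expects a void * argument, so each address is cast. */
    printf("address of j:       %p\n", (void *)&j);
    printf("address of k:       %p\n", (void *)&k);
    printf("address of p:       %p\n", (void *)&p);
    printf("address held in p:  %p\n", (void *)p);

    return p == &k && j == 3 && k == 7;
}
```

The last two lines of output make the distinction between where p is stored and what p contains: the address held in p matches the address of k, not the address of p.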
Pointers as function arguments

Here’s an example problem: Write a C function to convert a measurement in inches only to the equivalent in feet and inches. For example, 67 inches is 5 feet, 7 inches.

Notice that the function receives one number and must communicate back two numbers. The function can’t do its job simply by returning an int.

The solution is given in the next two code listings . . .

    #include <stdio.h>

    void foot_and_inch(int inch_only, int *feet, int *extra_inch);

    int main(void)
    {
        int total_in = 75;
        int ft;
        int in;

        foot_and_inch(total_in, &ft, &in);
        printf("%d inches is the same as %d ft, %d in.\n",
               total_in, ft, in);
        return 0;
    }

    void foot_and_inch(int inch_only, int *feet, int *extra_inch)
    {
        // point one
        *feet = inch_only / 12;
        *extra_inch = inch_only % 12;
        // point two
    }

What is the program output? Let’s write down a bunch of remarks, and make diagrams for point one and point two.

Would it not have been easier to use reference types?

A good way to solve the problem in C++ would be to use reference types, as in

    void foot_and_inch(int inch_only, int& feet, int& extra_inch);

Why does this approach not work in C, a simpler and older language than C++?

A common mistake with pointers

What if we changed main to the code below? Let’s make some remarks, then explain what is likely to go wrong.

    // Example of DEFECTIVE code!
    int main(void)
    {
        int total_in = 75;
        int *ft;
        int *in;

        foot_and_inch(total_in, ft, in);
        printf("%d inches is the same as %d ft, %d in.\n",
               total_in, *ft, *in);
        return 0;
    }
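One way to see the defect: ft and in are pointer variables that were never given the address of any int, so foot_and_inch stores its results through whatever addresses happen to be in those variables. The sketch below shows one possible repair that keeps the pointer variables but first makes them point at real int variables. The names ft_value, in_value, and converted_ok are invented for illustration.

```c
void foot_and_inch(int inch_only, int *feet, int *extra_inch)
{
    *feet = inch_only / 12;
    *extra_inch = inch_only % 12;
}

/* Hypothetical check: returns 1 if 75 inches converts to 6 ft, 3 in. */
int converted_ok(void)
{
    int ft_value;            /* real storage for the two results */
    int in_value;
    int *ft = &ft_value;     /* ft now holds the address of an actual int */
    int *in = &in_value;

    foot_and_inch(75, ft, in);

    return *ft == 6 && *in == 3;
}
```

The original main, which passes &ft and &in directly, is the simpler way to get the same effect; the extra pointer variables here are only meant to show what the defective version was missing.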