CMPUT229 - Fall 2002
TopicB: Fundamentals of C
Originally Created By: José Nelson Amaral
Modifications By: Shane A. Brewer
CMPUT 229 - Computer Organization and Architecture I

Who Am I?
Shane A. Brewer
2nd Year Masters Student
Supervisor: Dr. Nelson Amaral
Research: Java Virtual Machine Optimizations
http://www.cs.ualberta.ca/~brewer
brewer@cs.ualberta.ca

Reading Material
The slides for this topic were prepared based on chapters 11 and 12 of:
    Patt, Yale N., and Patel, Sanjay J., Introduction to Computing Systems: from bits & gates to C & Beyond, McGraw-Hill Press, 2001.
An excellent reference book for the C Language is:
    Harbison, Samuel P., and Steele Jr., Guy, C: A Reference Manual, Prentice Hall, 4th Edition, 1995.

Shane's Recommended Reading Material
    Brian W. Kernighan and Dennis M. Ritchie, "The C Programming Language", Prentice Hall, 2nd Edition, 1988.
For good high level programming habits:
    Steve McConnell, "Code Complete", Microsoft Press, 1993.

Why Learn C?
- C code is fast (often as fast as or faster than equivalent C++ code).
- C is a procedural language.
- C is lower level than C++. It is generally seen as one step up from assembly language.
- Fairly portable.
- Faster development compared with assembly language.
- Much easier to read compared to assembly language.

The C Compiler
[Figure: compilation pipeline. C Source and Header Files -> C Preprocessor -> Compiler (Source Code Analysis and Target Code Synthesis, sharing a Symbol Table) -> Linker (which also takes Library Object Files) -> Executable]

The C Preprocessor
The C Preprocessor transforms the original C program before the program is handed off to the compiler. Preprocessor directives start with the # character.
The define directive is used to give symbolic names to constants in a program:

Before Preprocessing:
    #define FIRST_ELEMENT 10
    #define ARRAY_LENGTH 1000
    #define INCREMENT 5
    /* ... */
    unsigned int k;
    for(k = FIRST_ELEMENT; k < ARRAY_LENGTH; k += INCREMENT)
    /* ... */

After Preprocessing:
    /* ... */
    unsigned int k;
    for(k = 10; k < 1000; k += 5)
    /* ... */

The C Preprocessor (cont.)
The include directive instructs the preprocessor to insert another file into the source file:
    #include <stdio.h>
    #include "program.h"
If the file name is surrounded by < > the preprocessor will look for the file in a predefined directory, usually defined when the system is configured.
If the file name is surrounded by double quotes (" ") the preprocessor will look first in the same directory as the C source file.

The C Preprocessor (cont.)
The #ifdef, #ifndef, #else, #elif, and #endif directives define conditional inclusion.
    #define IA64
    #ifdef IA64
    #include "ia64.h"
    #else
    #include "ia32.h"
    #endif
You always need an #endif to delimit the end of the conditional.
This is useful for customizing your code to different environments:
- Testing, Production
- Different Computer Architectures

The C Preprocessor (cont.)
The C preprocessor also allows macros to be defined. A macro is an identifier that is replaced by a series of tokens.
    #define min(X, Y) ((X) < (Y) ? (X) : (Y))
    ...
    min(3, 4)  ->  ((3) < (4) ? (3) : (4))
A function-like macro is only expanded if its name appears with a pair of parentheses after it.
Macros are not typed, and thus can take any type of argument. This can be both good and bad.
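As a rough illustration of the "both good and bad" point, the sketch below reuses the min macro from the slide. The helper names (next_value, count_evaluations, min_of_ints, min_of_doubles) are hypothetical and exist only for this example. It shows the same macro accepting int and double arguments, and it shows that an argument with a side effect is evaluated twice because the macro pastes its argument text into both the test and the result.

```c
#include <stdio.h>

#define min(X, Y) ((X) < (Y) ? (X) : (Y))

/* Hypothetical counter, used only to make argument evaluation visible. */
static int calls = 0;

/* Returns 1, 2, 3, ... on successive calls. */
static int next_value(void)
{
    calls++;
    return calls;
}

/* Untyped: the same macro text works for int arguments ... */
int min_of_ints(int a, int b)
{
    return min(a, b);
}

/* ... and for double arguments. */
double min_of_doubles(double a, double b)
{
    return min(a, b);
}

/* Returns how many times next_value() ran inside a single use of min().
   The macro expands to
       ((next_value()) < (10) ? (next_value()) : (10))
   so the argument is evaluated once for the test and once for the result. */
int count_evaluations(void)
{
    int m;

    calls = 0;
    m = min(next_value(), 10);
    printf("min returned %d after %d call(s) to next_value\n", m, calls);
    return calls;
}
```

A real function would evaluate each argument exactly once, which is one reason the following slides prefer functions over function-like macros.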
Why Function Macros Are Bad
- Error prone:
      #define square(x) x * x
  For example, square(a + 1) expands to a + 1 * a + 1, which is not (a + 1) * (a + 1).
- Require excessive parentheses, reducing readability.
- Can cause line numbering to be confused, making debugging harder.
- Not typed, unlike functions.

Using The C Preprocessor
Advantages:
- Can improve readability
- No runtime cost
- Makes code easier to modify
Disadvantages:
- Can also hinder readability
- Errors are not detected by compiler
- Not entirely necessary as they can be replaced by functions

Comments
C (as of C89) provides only one style for commenting; C99 later added // line comments.
Any characters between the /* and */ tokens are ignored by the compiler.

    #include <stdio.h>

    /* Print Fahrenheit-Celsius Table
       for fahr = 0, 20, ..., 300 */
    main()
    {
        int fahr;     /* Holds the Fahrenheit Temperature */
        int celsius;  /* Holds the Celsius Temperature */
        ...
    }

Good Commenting
Comments don't repeat the code, they describe the code's intent.
The following should always be commented:
- Functions
- Variables
- Paragraphs of code
- Complex algorithms
- Source code files
Remember that what may be obvious to you won't be to someone else looking at your code.
Avoid abbreviations.
Keep comments up-to-date! Nothing is worse than a comment that is WRONG!

Input and Output
We can describe the output function in C, and illustrate its format string, through some examples:

    printf("This is the meaning of life: %d.\n", 42);
    printf("43 plus 59 as a decimal is %d.\n", 43+59);
    printf("43 plus 59 in hexadecimal is %x.\n", 43+59);
    printf("43 plus 59 as a character is %c.\n", 43+59);
    printf("The wind speed is %d km/hr.\n", windSpeed);

A run of this program will produce:

    This is the meaning of life: 42.
    43 plus 59 as a decimal is 102.
    43 plus 59 in hexadecimal is 66.
    43 plus 59 as a character is f.
    The wind speed is 35 km/hr.
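The output above comes from one integer value, 43 + 59 = 102, rendered under three different conversion specifiers. A minimal sketch of that idea follows; the helper names format_value and print_sum_three_ways are assumptions made for this illustration and are not part of the course material. It relies on snprintf from C99 and on an ASCII character set, in which code 102 is the letter f.

```c
#include <stdio.h>

/* Hypothetical helper: writes `value` into `out` using one printf
   conversion character ('d', 'x' or 'c'). Returns the number of
   characters produced, or -1 for an unsupported conversion. */
int format_value(char *out, size_t size, char conversion, int value)
{
    switch (conversion) {
    case 'd': return snprintf(out, size, "%d", value);  /* decimal     */
    case 'x': return snprintf(out, size, "%x", value);  /* hexadecimal */
    case 'c': return snprintf(out, size, "%c", value);  /* character   */
    default:  return -1;
    }
}

/* Prints the same sum three ways, as in the slide's printf examples. */
void print_sum_three_ways(void)
{
    int sum = 43 + 59;

    printf("decimal: %d, hexadecimal: %x, character: %c\n", sum, sum, sum);
}
```

The value stored in memory never changes; only the format string decides how its bits are presented.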
Input and Output (cont.)
For data input from the keyboard, C uses the function scanf. scanf requires a format string that is similar to the one for printf.
scanf does automatic type conversion: from the ASCII characters that we type on the keyboard to the type specified in the format string.
Any literal text in a scanf format string must be matched by the input, so prompts are printed with printf.
Examples:

    /* Reads in a character and stores it in nextChar */
    printf("Next Character: ");
    scanf("%c", &nextChar);

    /* Reads in a floating point number and stores it in the variable radius */
    printf("Radius: ");
    scanf("%f", &radius);

    /* Reads in two decimal numbers and stores them in the variables length and width */
    printf("Length\t width: ");
    scanf("%d %d", &length, &width);

Variable's Scope
The scope of a variable determines the region of the program in which the variable is accessible.
A global variable can be accessed throughout the program.
A local variable is accessible only within the block in which it is defined.

Blocks
Blocks are used to group declarations and statements together using braces { and }.
The braces that surround the statements of a function are one example. However, braces can also be found after if, else, while, and for statements to group statements together.
Local variables must be declared at the beginning of a block (a C89 rule). Once the block has finished its execution, the variables are no longer available for use.

    main()
    {
        int loopCount;
        int numExecutions;
        ...
        functionCall();
        ...
    }

    functionCall()
    {
        int anotherLoopCount;
        ...
    }

Variable Naming
The name of a variable is referred to as an identifier.
While the name may not seem very important at first, it matters more and more as your program grows and the longer you use it.
For example, explain what the use would be for the following variable names:
- x, y, z;
- count;
- foo;
- intCount;
- num;
- numTeamMembers;
- firstItem, lastItem;
- NumberOfPeopleOnTheCanadianOlympicTeam;

Why Global Variables Are Bad
Why is it necessary to use local variables when you could just make all of your variables global?
Limiting the scope of a variable reduces the amount of code in which the variable can change. This becomes extremely important when debugging code.

    int number1 = 2;   /* First number to be added */
    int number2 = 3;   /* Second number */
    int sum;           /* The sum of number1 and number2 */

    main()
    {
        badFunction();
        sum = number1 + number2;
        printf("%d", sum);
    }

    badFunction()
    {
        printf("%d", number1++);
        printf("%d", number2++);
    }

Here badFunction silently changes number1 and number2, so sum is 7 rather than the 5 suggested by the initializers.

Symbol Table
A compiler uses a symbol table to keep track of variables in a program.
The compiler creates a new entry in the symbol table for every variable declaration that it encounters in the code.
Typically each entry in the symbol table contains:
(1) its name
(2) its type
(3) a place in memory where the value of the variable is stored
(4) an identifier to indicate the block in which the variable is declared (the scope of the variable).

Symbol Table (example)
For instance, the following variable declarations in main:

    main()
    {
        int counter;
        int starPoint;
        ...

will produce these entries in the symbol table:

    Name        Type   Offset   Scope
    counter     int    3        main
    starPoint   int    4        main

Memory Allocation in C
In C each function has an activation frame, or activation record, on the stack.
The exact organization of this frame depends on the compiler.
Some of the data stored in an activation frame is shown below. In MIPS, parameters 0-3 are passed in registers.

high addresses
    parameter n
    ...
    parameter 4
    temporaries        Area to "spill" the values of temporaries that cannot be kept in registers.
    local variables    Storage of variables whose scope is local to the function.
low addresses

Memory Organization in C
The overall organization of the runtime memory in C is given below.

    Program Code
    Static Data    Constants and global variables
    Stack          Function activation records
    Heap           Dynamically allocated memory

Operators
Arithmetic operators (examples):

    distance = rate * time;
    netIncome = income - taxesPaid;
    fuelEconomy = mileTraveled / fuelConsumed;
    area = 3.14159 * radius * radius;
    y = a*x*x + b*x + c;

C has an integer division (/) and a modulus (%) operator:

    z = x / y;   /* If x and y are integers, the result is the integral portion: e.g., 7/2 = 3 */
    z = x % y;   /* The result is x mod y, e.g., 7 % 2 = 1 */

Operators (cont.)
Bitwise operators:

    Operator Symbol   Operation      Example Usage
    ~                 bitwise NOT    ~x
    <<                left shift     x << y
    >>                right shift    x >> y
    &                 bitwise AND    x & y
    ^                 bitwise XOR    x ^ y
    |                 bitwise OR     x | y

Logical operators:

    Operator Symbol   Operation          Example Usage
    &&                logical AND        x && y
    ||                logical OR         x || y
    !                 logical negation   !x

    int f = 7;
    int g = 8;
    int h = 0;

    h = f & g;       /* bitwise AND */
    h = f && g;      /* logical AND */
    h = f | g;       /* bitwise OR */
    h = f || g;      /* logical OR */
    h = ~f | ~g;     /* bitwise operators */
    h = !f && !g;    /* logical operators */
    h = f ^ g;       /* bitwise XOR */
    h = 29 || -52;   /* logical OR */

Special Operators in C
Special operators:

    Operator Symbol   Operation               Example Usage
    ++                increment (postfix)     x++
    --                decrement (postfix)     x--
    ++                increment (prefix)      ++x
    --                decrement (prefix)      --x
    +=                add and assign          x += y
    -=                subtract and assign     x -= y
    *=                multiply and assign     x *= y
    /=                divide and assign       x /= y
    %=                modulus and assign      x %= y
    &=                and and assign          x &= y
    |=                or and assign           x |= y
    ^=                xor and assign          x ^= y
    <<=               left-shift and assign   x <<= y
    >>=               right-shift and assign  x >>= y

C Special Conditional Expression
C Conditional Expression:

    x = a ? b : c;

Alternative code for the C Conditional Expression:

    if(a)
        x = b;
    else
        x = c;

[Figure: logical diagram of a MUX. a is the select input; input b is chosen when a is 1 and input c when a is 0; x is the output.]

Order of Evaluation
If x = 1, z = -3, and w = 9, what are the values of w, x, y, and z after the following program statement is executed?

    y = x & z + 3 || 2 - w-- % 6;

In order to evaluate this expression correctly, we need to know the rules for operator precedence and associativity in C.

Associativity and Precedence Rules

    Precedence Group   Associativity   Operators
    1                  l to r          () (function call)  [ ]  .  ->
    2                  r to l          postfix ++  postfix --
    3                  r to l          prefix ++  prefix --
    4                  r to l          * (indirection)  & (address)  unary +  unary -  ~  !  sizeof
    5                  r to l          (type) (cast)
    6                  l to r          * (multiply)  /  %
    7                  l to r          +  -
    8                  l to r          <<  >>
    9                  l to r          <  >  <=  >=
    10                 l to r          ==  !=
    11                 l to r          &
    12                 l to r          ^
    13                 l to r          |
    14                 l to r          &&
    15                 l to r          ||
    16                 r to l          ?:
    17                 r to l          =  +=  -=  *=  etc.

Applying the table to the statement one precedence group at a time:

    y = x & z + 3 || 2 - (w--) % 6;            /* group 2: postfix -- */
    y = x & z + 3 || 2 - ((w--) % 6);          /* group 6: % */
    y = x & (z + 3) || (2 - ((w--) % 6));      /* group 7: + and - */
    y = (x & (z + 3)) || (2 - ((w--) % 6));    /* group 11: & */

Order of Evaluation
If x = 1, z = -3, and w = 9, what are the values of y, x, z, and w after the following program statement is executed?

    y = x & z + 3 || 2 - w-- % 6;

Using the precedence rules the expression must be evaluated as follows:

    y = (x & (z + 3)) || (2 - ((w--) % 6));

For x = 1, z = -3, and w = 9:

    y = (1 & (-3 + 3)) || (2 - (9 % 6));
    y = (1 & 0) || (2 - 3);
    y = 0 || -1;
    y = 1;

Thus after the statement:
    x = 1
    z = -3
    w = 8
    y = 1

A C Program Example

     1  #include <stdio.h>
     2
     3  int inGlobal;
     4
     5  main()
     6  {
     7      int inLocal;
     8      int outLocalA;
     9      int outLocalB;
    10
    11      /* Initialize */
    12      inLocal = 5;
    13      inGlobal = 3;
    14
    15      /* Perform calculations */
    16      outLocalA = inLocal++ & ~inGlobal;
    17      outLocalB = (inLocal + inGlobal) - (inLocal - inGlobal);
    18
    19      /* Print out results */
    20      printf("The results are: outLocalA = %d, outLocalB = %d\n", outLocalA, outLocalB);
    21  }

Compiler-Generated MIPS code (with option -O0)
The "# n" comments give the line of the program above that each group of instructions implements.

    # 12 inLocal = 5;
    addiu $t2,$0,5                     # $t2 <- 5
    sw    $t2,0($sp)                   # inLocal <- 5
    # 13 inGlobal = 3;
    addiu $t0,$0,3                     # $t0 <- 3
    lw    $t1,%got_disp(inGlobal)($gp) # $t1 <- address of inGlobal
    sw    $t0,0($t1)                   # inGlobal <- 3
    # 16 outLocalA = inLocal++ & ~inGlobal;
    lw    $a3,0($sp)                   # $a3 <- inLocal
    addiu $a3,$a3,1                    # $a3 <- inLocal+1
    sw    $a3,0($sp)                   # save inLocal + 1
    lw    $a2,%got_disp(inGlobal)($gp) # $a2 <- address of inGlobal
    lw    $a2,0($a2)                   # $a2 <- inGlobal
    nor   $a2,$a2,$0                   # $a2 <- ~inGlobal
    lw    $a3,0($sp)                   # $a3 <- inLocal + 1
    addiu $a3,$a3,-1                   # $a3 <- (inLocal+1) - 1
    and   $a2,$a2,$a3                  # $a2 <- inLocal AND ~inGlobal
    sw    $a2,4($sp)                   # save $a2 in outLocalA
    # 17 outLocalB = (inLocal + inGlobal) - (inLocal - inGlobal);
    lw    $a1,%got_disp(inGlobal)($gp) # $a1 <- address of inGlobal
    lw    $a1,0($a1)                   # $a1 <- inGlobal
    lw    $a2,%got_disp(inGlobal)($gp) # $a2 <- address of inGlobal
    lw    $a2,0($a2)                   # $a2 <- inGlobal
    addu  $a1,$a1,$a2                  # $a1 <- inGlobal + inGlobal
    sw    $a1,8($sp)                   # outLocalB <- $a1

Compiler-Generated MIPS code (with option -O3)
For the same program, the optimizer evaluates lines 12, 13, 16, and 17 at compile time:

    # 12 inLocal = 5;
    # 13 inGlobal = 3;
    # 16 outLocalA = inLocal++ & ~inGlobal;
    # 17 outLocalB = (inLocal + inGlobal) - (inLocal - inGlobal);
    lw    $v0,%got_disp(inGlobal)($gp) # $v0 <- address of inGlobal
    addiu $a2,$0,6                     # $a2 <- 6
    addiu $t0,$0,3                     # $t0 <- 3
    addiu $a1,$0,4                     # $a1 <- 4
    sw    $t0,0($v0)                   # store $t0 in inGlobal

Changing the Example a bit

     1  #include <stdio.h>
     2
     3  int inGlobal;
     4
     5  main()
     6  {
     7      int inLocal;
     8      int outLocalA;
     9      int outLocalB;
    10
    11      /* Initialize */
    12      printf("inLocal: ");
    13      scanf("%d", &inLocal);
    14      printf("inGlobal: ");
    15      scanf("%d", &inGlobal);
    16
    17      /* Perform calculations */
    18      outLocalA = inLocal++ & ~inGlobal;
    19      outLocalB = (inLocal + inGlobal) - (inLocal - inGlobal);
    20
    21      /* Print out results */
    22      printf("The results are: outLocalA = %d, outLocalB = %d\n", outLocalA, outLocalB);
    23  }

Compiler-Generated MIPS code (with option -O3)
Now that the inputs are read at run time, the compiler can no longer fold the calculations into constants:

    # 18 outLocalA = (inLocal++) & ~inGlobal;
    # 19 outLocalB = (inLocal + inGlobal) - (inLocal - inGlobal);
    lw    $t0,0($sp)                   # $t0 <- inLocal
    lw    $a1,%got_disp(inGlobal)($gp) # $a1 <- address of inGlobal
    addiu $t0,$t0,1                    # $t0 <- inLocal+1
    lw    $a1,0($a1)                   # $a1 <- inGlobal
    addiu $a3,$t0,-1                   # $a3 <- (inLocal+1) - 1
    nor   $a1,$a1,$0                   # $a1 <- ~inGlobal
    sw    $t0,0($sp)                   # store inLocal
    and   $a1,$a1,$a3                  # $a1 <- ~inGlobal & inLocal

Post-Increment
A question about the semantics of "post" in the post-increment addressing was raised last class.
What should be the result of the following statement in C?

    z = (x++) + (x++);

Should the statement above have the same effect as the following pair of statements?

    z = (x++);
    z = z + (x++);

We can write a simple C program and compile it to find out what code the compiler is generating.
Simple Program to Study Post-Increment

    #include <stdio.h>

    main()
    {
        int inLocalA;
        int inLocalB;
        int outLocalA;
        int outLocalB;

        /* Initialize */
        inLocalA = 4;
        inLocalB = 4;

        /* Perform calculations */
        outLocalA = (inLocalA++) + (inLocalA++);
        outLocalB = (inLocalB++);
        outLocalB = outLocalB + (inLocalB++);

        /* Print out results */
        printf("The results are: outLocalA = %d, outLocalB = %d\n", outLocalA, outLocalB);
    }

Compiling and Running the postinc.c Program
First I compiled and ran the code at -O0 on caslan, using MIPSpro:

    bash-2.01$ cc -version
    MIPSpro Compilers: Version 7.2.1
    bash-2.01$ cc postinc.c -o postincO0
    bash-2.01$ ./postincO0
    The results are : outLocalA = 10 outLocalB = 9

Then I compiled and ran the code at -O3 on caslan, using MIPSpro:

    bash-2.01$ cc -O3 postinc.c -o postincO3
    bash-2.01$ ./postincO3
    The results are : outLocalA = 10 outLocalB = 9

Compiling and Running the postinc.c Program
Next I compiled and ran the code using gcc at -O0 on caslan:

    bash-2.01$ /usr/gnu/bin/gcc -v
    Reading specs from /usr/gnu/lib/gcc-lib/mips-sgi-irix5.3/2.7.2/specs
    gcc version 2.7.2
    bash-2.01$ /usr/gnu/bin/gcc postinc.c -o postinc-gcc
    bash-2.01$ ./postinc-gcc
    The results are : outLocalA = 9 outLocalB = 9

Finally I compiled and ran the code using gcc version 2.96 at -O0 on kinsella (a Pentium III machine):

    [amaral@kinsella postinc]$ cc -v
    Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
    gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-85)
    [amaral@kinsella postinc]$ cc postinc.c -o postinc-gcc-kin
    [amaral@kinsella postinc]$ ./postinc-gcc-kin
    The results are : outLocalA = 8 outLocalB = 9
    [amaral@kinsella postinc]$

What is the lesson?
Keep things simple.
If compilers cannot agree on the semantics of multiple post-increments in the same statement, how many programmers will be able to agree on it?
In fact, the C standard leaves the behavior undefined when an object is modified more than once between sequence points, so each of the three results above is permitted.
Avoid expressions such as:

    z = (x++) + (x++);

The effect of the two-statement version can be obtained by the following statements:

    z = x + x + 1;
    x = x + 2;

No one will question the semantics of these statements.
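To check that the recommended rewrite agrees with the two-statement version of the post-increment code, the sketch below wraps each form in a small function. The function names sum_then_advance and sum_in_two_steps are hypothetical, introduced only for this illustration. Each statement modifies x at most once, so the result does not depend on the compiler.

```c
/* Recommended form from the slide:
       z = x + x + 1;
       x = x + 2;
   Returns z and updates the caller's variable through the pointer. */
int sum_then_advance(int *x)
{
    int z = *x + *x + 1;

    *x = *x + 2;
    return z;
}

/* Two-statement form from the Post-Increment slide:
       z = (x++);
       z = z + (x++);
   Each statement increments x exactly once, so this is well defined. */
int sum_in_two_steps(int *x)
{
    int z = (*x)++;

    z = z + (*x)++;
    return z;
}
```

Starting from x = 4, both functions return 9 and leave x at 6, which matches the outLocalB value reported by every compiler in the experiments above.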