A C++ Crash Course
Part I

UW Association for Computing Machinery
http://www.cs.washington.edu/orgs/acm/tutorials
acm@cs.washington.edu
Questions & Feedback to Hannah C. Tang (hctang) and Albert J. Wong (awong)

What We’ll Cover
• C/C++ fundamentals
  – Functions
  – Primitive data types
• The stack
• C-style types
  – Typedefs
  – Structs
• Arrays (whole story)
  – Arrays (working model)
  – Pointers
• A practice program
• More C++-isms
  – C++-style vs Java-style references
  – C++ gotchas

What We’re NOT Covering
• Topics related to C++ classes
  – Multiple files
  – The preprocessor
• C++ classes
  – Inheritance and dynamic dispatch
• Memory management
  – The heap
  – Destructors
• Advanced topics
  – Modifiers: const, static, and extern
  – Operator overloading
  – Templates

Goals of Java
Java, C, and C++ have different design goals.
Java
  – Simple
  – Consistent
  – Huge OO Focus
  – Cross platform via a virtual machine
  – Originally for embedded systems

Goals of C and C++
C and C++ are popular because they have met, with reasonable success, their goals.
C
  – Low level
  – No Runtime Type info
  – Easy implementation
C++
  – Originally to add some OO functionality to C
  – Attempt to be a higher-level language
  – Now it’s a totally different language

A simple program snippet

Java:
    public void printSum() {
        int x, y;
        // … get user input …
        int sum = x + y;
        // … print sum …
    }

C++:
    void printSum(void) {
        int x, y;
        // … get user input …
        int sum = x + y;
        // … print sum …
    }

The simple program – Java version
    class Calculator {
        public void printSum() {
            int x, y;
            // … get user input …
            int sum = x + y;
            // … print sum …
        }
    }

    class App {
        public static void main(String[] args) {
            Calculator c = new Calculator();
            c.printSum();
        }
    }

[Diagram: App calls printSum() on a Calculator]

The simple program – C++ version
    void printSum(void) {
        int x, y;
        // … get user input …
        int sum = x + y;
        // … print sum …
    }

    int main(int argc, const char * argv[]) {
        printSum();
        return 0;
    }

[The Java version from the previous slide is repeated alongside for comparison]

Procedural Programming
• Functions are free-floating “methods” disassociated from any class
• Function declarations can be separate from function implementations
  – The declaration provides the signature, which specifies the name, return type, and parameter list of the function
• C is completely procedural
• C++ mixes object-oriented and procedural programming

Discussion Point I
Which of these programs can be written procedurally? In an object-oriented style?
• HelloWorld
• A traffic simulator
  – Must simulate cars, roads, and the interactions between these entities
• A calculator
  – Accepts two numbers, then calculates the sum or difference, depending on a user-selected operator
• An mp3 player
  – Accepts a list of files, and plays them in the specified order.
    Needs to support skins
• Come up with your own example

Function Syntax and Semantics
    <ReturnType> functionName( … <parameter list> … );
    int calculatePower(int base, int exponent);
• <ReturnType> can be any type except an array
• Class-scoped methods and free-floating functions are basically the same, except …

Parameter Passing in Java – Part I
    class Example {
        public static void moveToDiagonal(Point p) {
            p.setY(p.getX());
        }

        public static void main( String[] args ) {
            Point pt;
            pt = new Point(3, 4);
            moveToDiagonal(pt);
            // What are the coordinates of pt now?
        }
    }

In Java, everything is a reference
[Diagram: pt (in main) and p (in moveToDiagonal) refer to the same Point, which changes from (x: 3, y: 4) to (x: 3, y: 3) when p.setY(p.getX()) runs]
In Java, modifying a method parameter means modifying the original instance

… almost everything is a reference
Java atomic types:
• int
• double
• boolean
• etc …
C++ atomic types:
• int
• double
• bool
• etc …
In Java, modifying an atomically-typed parameter does NOT modify the original instance. In Java, atomic types are passed by copy. The same semantics hold for C++ atomic types.

C/C++ Function Parameters
• In C++, all function parameters are passed by copy – even if they’re not of atomic type
• Why?
  – First, a brief detour …

Detour: Functions & Memory
• Every function needs a place to store its local variables.
  Collectively, this storage is called the stack
• This storage (memory, aka “RAM”) is a series of storage spaces and their numerical addresses
• Instead of using raw addresses, we use variables to attach a name to an address
• All of the data/variables for a particular function call are located in a stack frame

    void aFunc(int x, int y)
    {
        double d1, d2;
        int i;
    }

[Diagram: memory locations in aFunc’s stack frame, top to bottom: i, d2, d1, y, x]

Detour: Functions & Memory (cont)
• When a function is called, a new stack frame is set aside
• Parameters and return values are passed by copy (i.e., they’re copied into and out of the stack frame)
• When a function finishes, its stack frame is reclaimed

    void aFunc(int x, int y) {
        double d1 = x + y;
    }

    int main(int argc, const char * argv[]) {
        int x = 7;
        aFunc(1, 2);
        aFunc(2, 3);
        return 0;
    }

[Diagram: the stack during a call to aFunc. aFunc’s frame (d1, y, x) sits on top of main’s frame (x = 7)]

C/C++ Function Parameters (cont.)
• In C++, all function parameters are passed by copy – even if they’re not of atomic type
• Why?
  – In C++, all variables exist on the stack by default
  – In C++, parameters are copied into the callee’s stack frame
  – We’ll talk about Java parameter passing later (when we compare C++ and Java references)

Discussion Point II
• Examine the code fragment below.
  – Draw the stack frame(s) for some sample input.
  – If you see any bugs, what are they? How would the program behave?

    void sillyRecursiveFunction(int i) {
        if(i == 0) {
            return;
        }
        else {
            sillyRecursiveFunction(i - 1);
        }
    }

Arrays
    <ArrayType> arrayName[ numElements ]
• An array is a block of contiguous memory locations, and its name refers only to the address of the first element
• Indexing into an array is the same as adding an offset to the address of the first element
• When declaring an array, its size must be known at compile-time

[Diagram: contiguous memory cells labeled, top to bottom, myArray[5], myArray[4], myArray[3], myArray[2], myArray[1], and myArray[0] (or simply myArray)]

Arrays as function parameters
    <ReturnType> funcName( ArrayType arrName[ ] )
    int sumOfArray( int values[], int numValues )
• Arrays are not passed by copy.
  Instead, the address of the first element is passed to the function
  – Note how array parameters and non-parameter arrays behave identically

Discussion Point III
• Why are arrays not passed by copy?
  – Hint: the size of a stack frame is computed long before the program is run (specifically, at compile time)

Pointers
What if we had variables that contained addresses? They could contain addresses of anything! We could use these variables in functions to modify the caller’s data (we could implement Java’s parameter-passing semantics!)

[Diagram: three storage spaces, each labeled with a variable name and its address: x (4104), y (4100), n (4096)]

Pointers: vocabulary
• A pointer is a variable which contains the address of another variable
• Accessing the data at the contained address is called “dereferencing a pointer” or “following a pointer”

    Variable (address)   Contents
    x (4104)
    y (4100)             4096
    n (4096)             7

Pointer Syntax

Declaring Pointers
Declaring a pointer:
    <Type> * ptrName;
  “ptrName is a variable which contains the address of something of type <Type>”
For example:
    int * nPtr1, * nPtr2;
    void aFunc( int aParam, int * ptrParam);

Using Pointers
Dereferencing a pointer:
    *ptrName
  “Go to the address contained in the variable ptrName”
Getting the address of a variable:
    &aVar
  “Get the address of aVar”
For example:
    aFunc(myInt, &anotherInt);
    anInt = *myPtr * 4;
    *dinner = 100;

Pointers: Putting it all together
The code
    int * p;
    int q;
    p = &q;
    *p = 5;
Box Diagrams
  “p’s type is int pointer. q’s type is int.”
  “Assign 5 to where p points (which is q).”
  [Diagram: p points to q, which holds 5]
Memory Layout
  p contains the address of an int. q contains an int.
  Go to the address that p contains, and place a 5 there.

    Variable (address)   Contents
    p (8200)             8196
    q (8196)             5

Pointers: Putting it all together (cont.)
The code
    void doubleIt(int x, int * p)
    {
        *p = 2 * x;
    }

    int main(int argc, const char * argv[])
    {
        int a = 16;
        doubleIt(9, &a);
        return 0;
    }
Box diagram
  [Diagram: main’s frame holds a = 16; doubleIt’s frame holds x = 9 and p, which points to a]
Memory Layout

    Frame      Variable (address)   Contents
    doubleIt   p (8200)             8192
    doubleIt   x (8196)             9
    main       a (8192)             16

Pointer Arithmetic
Pointers are numbers, so you can do math on them!
    Statement          p (8200)   b (8196)   a (8192)
    int * p = &a;      8192       9          16
    *p = 200;          8192       9          200
    *(p+1) = 300;      8192       300        200

Pointer p refers to an int, so adding 1 to p increments the address by the size of one int. The C/C++ expression for this size is sizeof(int).

Pointers and Arrays
Pointers and arrays are (almost) interchangeable
Given:
    int myArray[5];
    int * p = myArray;
These are equivalent:
• *p
• myArray[0]
• *(p+0)
• *myArray
• p[0]
• 0[p]

    Variable (address)   Contents
    myArray[4] (8200)
    myArray[3] (8196)
    myArray[2] (8192)
    myArray[1] (8188)
    myArray[0] (8184)
    p (8180)             8184

Discussion Point IV
• How do pointers and arrays differ?
  – Hint: how are pointers implemented in memory? Arrays?

Exercise
• Get up and stretch!
• Do the worksheet exercise
• Then, write a program to do the following:
  – Read some numbers from the user (up to a max number of numbers)
  – Calculate the average value of those numbers
  – Print the user’s values which are greater than the average
• Get up and stretch again!

Pointer Problems
• Pointers can refer to other variables, but they:
  – Create an additional variable
  – Have an ugly syntax

Function Pointers
    <ReturnType> (*ptrName)(arg type list );
• Functions are pieces of code in memory
• Pointers can point to functions.
• This syntax is U-G-L-Y (the ugliest in C)
• Notice that the name of the variable appears in the middle of the statement!
• You do not have to dereference a function pointer
Function pointers are not scary. They are useful!
Function Pointers - example
    void foo(int i, char b);
    void bar(int i, char b);

    int main(void) {
        void (*p)(int, char);
        p = foo;
        p(1, 'c');      // equivalent to foo(1, 'c');
        p = bar;
        p(2, 'b');      // equivalent to bar(2, 'b');
        (*p)(2, 'b');   // Exactly the same
        return 0;
    }

References
A reference is an additional name for an existing memory location
If we wanted something called “ref” to refer to a variable x:
  Pointer: [Diagram: ref is a separate variable that points to x, which holds 9]
  Reference: [Diagram: x and ref are two names for the same storage, which holds 9]

Properties of References
Reference properties:
  – Cannot be reassigned
  – Must be assigned a referee at construction
Therefore:
  – References cannot be NULL
  – You cannot make an array of references.
Given what you know about references, can you explain where these properties come from?

Reference Syntax

References
Declaring a reference:
    <Type> & refName = referee;
Usage:
    int n;
    int & referee = n;
    void aFunc( int aParam, int & ptrParam);
    aFunc(1, n);

Pointers
Declaring a pointer:
    <Type> * ptrName;
Usage:
    int n;
    int * nPtr1 = &n;
    void aFunc( int aParam, int * ptrParam);
    aFunc(1, &n);

Discussion Point V
• What are the differences between Java references and C++ references? What about Java references and C++ pointers?
C-style struct
A struct is used to group related data items
    struct student {
        int id;
        char name[80];
    };
Note that naming a struct is optional
• To the programmer
  – id and name are now related
  – struct student creates a convenient grouping
• To the compiler
  – id and name have a fixed ordering (not offset) in memory
  – struct student is a first-class type that can be passed to functions

struct Syntax

Declaring a Struct
Declaring a struct:
    struct [optional name] {
        <type> field1;
        <type> field2;
        …
    } [instance list];
Examples:
    struct Foo {
        int field1;
        char field2;
    } foo, *foo_ptr;
    struct Foo foo2;
    struct { int a; } blah;

Access struct fields
Accessing a field in a struct:
    foo.field1;
  “gets field1 from the instance foo of struct Foo”

Pointers syntax and structs
The * has lower precedence than the ‘.’:
    *foo_ptr.field1;
means
    *(foo_ptr.field1);
which won’t compile
Accessing a field in a struct pointer:
    (*foo_ptr).field1;
    foo_ptr->field1;

enum
An enum creates an enumerated type: a set of named options, each with an associated integer value
    enum PrimaryColors {
        RED = 0,
        GREEN,
        BLUE
    };
• Note that naming an enum is optional
• By default, the first option is given the value 0
• You can assign an option any integer
• Subsequent options have the previous option’s value + 1
All enumeration values are in the same namespace

enum Syntax

Declaring an enum
Declaring an enum:
    enum [optional name] {
        OptionName [= int],
        OptionName [= int],
        …
    } [instance list];
Example of an enum:
    enum Color { RED, GREEN, BLUE } color, *color_ptr;
    enum Color c;
    void drawCircle (enum Color c);

Enum quirks
Problems with enums:
• Frail abstraction
• Treated as integers
• Can be assigned invalid values
• Flat namespace
Proper use guidelines:
• Avoid breaking abstraction
• Mangle name of enum into option name (so ColorRed instead of Red)
Here is one sanctioned abstraction break
    enum Color { RED, GREEN, BLUE, NumberOfColors };

union
A union creates a union type; all fields share the same memory location
    union
Argument {
        int intVal;
        double doubleVal;
        char charVal;
    };
Note that naming a union is optional
• Changing intVal changes doubleVal and charVal!
• Can be used to create constrained-type containers
• Usually used in conjunction with an enum that says which field is currently valid.

union Syntax

Declaring a union
Declaring a union:
    union [optional name] {
        <type> name1;
        <type> name2;
        …
    } [instance list];
Example of a union:
    union Argument {
        int value;
        char *string;
    } arg1, *ptr;
    union Argument arg2;
    arg1.value = 3;
    arg2.string = NULL;

Union quirks
Problems with unions:
• Assume that only the last field written to is valid.
• Don’t use to “save space.”
Proper use guidelines:
• Ensure you have another method of knowing which field is currently valid.

Typedef
Typedef is used to create an alias to a type
    typedef unsigned char byte;
    unsigned char mybyte;
    byte mybyte;
• byte now represents an unsigned char
• Both definitions of mybyte are equivalent to the compiler.
• The second definition is preferred as it gives more info

Typedef – common uses
• Abstraction
  – The user may easily change the type used to represent a variable.
• Clarification
  – More informative names for a type can be given
  – Variables that use the same type in different ways can be separated easily
• Convenience
  – Type names can get very long
  – People like structs to look like real types
  – Some type names (like function pointers or array pointers) are really hard to read/write

Typedefs – structs/enums/unions
People often make a typedef of an anonymous struct, enum, or union

    typedef struct {
        int id;
        char name[80];
    } Student;
    Student st;

    struct Student {
        int id;
        char name[80];
    };
    struct Student st;

These are almost the same. However, anonymous structs cannot refer to themselves.
    struct List {
        int data;
        struct List *next;
    };

Discussion Point VI
• What advantages do named structs/unions have over anonymous ones? Are enums different?
  – How would you try to pass anonymous structs, enums, or unions to a function? Can you?

C++ “Gotcha” I
Don’t use exceptions unless you know what you’re doing!
• Uncaught C++ exceptions do not produce a stack trace.
• C++ does not automatically reclaim new’d resources (more in a later tutorial)

    void someFunc(void) {
        throw "Exception!";
    }

    int main(int argc, const char * argv[]) {
        someFunc();
        return 0;
    }

    $ ./myProg
    Aborted
    $

C++ “Gotcha” II
Don’t return pointers (or references) to local variables!

    double * aFunc(void) {
        double d;
        return &d;
    }

    int main(int argc, const char * argv[]) {
        double * pd = aFunc();
        *pd = 3.14;
        return 0;
    }

Boom! (maybe)

C++ “Gotcha” III
Uninitialized pointers are bad!

    int * i;
    if( someCondition ) {
        …
        i = new int;
    } else if( anotherCondition ) {
        …
        i = new int;
    }
    *i = someVariable;

Does the phrase “null pointer exception” sound familiar?

C++ “Gotcha” IV
Never use an array without knowing its size
• C++ arrays do not know their own size.
  – Always pass a size variable with the array
  – Always check the bounds manually (C++ won’t do it for you!)

    int myArray[5];
    myArray[0] = 85;
    myArray[1] = 10;
    myArray[2] = 2;
    myArray[3] = 45;
    myArray[4] = 393;
    myArray[5] = 9;     // No Error! Undefined Behavior!
    myArray[-1] = 4;    // No Error! Undefined Behavior!

What We Covered
• The procedural programming paradigm
• Functions and parameter passing
• The C/C++ memory model
  – Part I (the stack)
  – Pointers
  – Arrays
  – C++-style References
• C type constructs
  – Structs, enums, unions, typedefs
Any questions?

Acknowledgements & References
• Books:
  – Essential C++ (C++ In-Depth Series), Stanley B. Lippman, 1999, 304 pgs.
  – The C++ Primer, 3rd edition, Stanley B. Lippman, 1998, 1237 pgs.
  – Effective C++, 2nd edition, Scott Meyers, 1997, 304 pgs.
  – The C++ Language, 2nd Edition, Bjarne Stroustrup, 2000, 1019 pgs.
  – Thinking in C++, 2nd Edition, Bruce Eckel, 2000, 814 pgs.
    Also available online (for free): http://mindview.net/Books/TICPP/ThinkingInCPP2e.html
• Nathan Ratliff
  – Version 1 of the C++ tutorial
• Doug Zongker
  – Version 1 of the handouts
• Hannah C. Tang & Albert J. Wong
  – Wrote, proofread, and presented the current version of the tutorial and handouts

It’s basically over now
The next few slides are here for completeness. You do not need to know most of the following info. The majority of C developers probably do not know the following material on arrays. If you are not comfortable with the material on pointers and arrays presented previously, just skip the next slides. If you are terminally curious, keep going.

Arrays (the whole story)
Arrays are not pointers. They are not first-class types either.
• Arrays know their size!
• Arrays forget their size after they get passed to a function!
• You CANNOT return arrays of any type

    void foo(int ar[ ]) {
        printf("%zu\n", sizeof(ar));
    }

    int main(void) {
        int ar[10];
        printf("%zu\n", sizeof(ar));
        foo(ar);
        return 0;
    }

The output of this, assuming a 4-byte int and a 4-byte pointer, would be:
    40
    4

Pointers to Arrays
    int (*ar)[3] vs. int *ar[3]
• The first is a pointer to an array of 3 integers.
• The second is an array of 3 elements, where each element is an int-pointer.
• This is how multidimensional arrays work

    int a[3];
    int *p = a;            p+1 == 8188
    int (*p2)[3] = &a;     p2+1 == 8196
    (*p2)[0] == p2[0][0] == 122
    (*(p2+1))[0] == p2[1][0] == p2 == 8184

    Variable (address)   Contents
    p (8200)             8184
    p2 (8196)            8184
    &a[2] (8192)         16
    &a[1] (8188)         485
    &a[0] (8184)         122