Programming in C
Chars, Strings and Structs

ASCII
• The American Standard Code for Information Interchange (ASCII) character set has 128 characters, designed to encode the Roman alphabet as used in English. C was designed to work with ASCII, and we will only use ASCII in this course.
• Each ASCII character is a seven-bit code (0 to 127) stored in one eight-bit byte with a leading 0 bit. In C, sizeof(char) is 1 by definition, so a single char can hold any ASCII character.
• Characters 0 through 31 are control characters (e.g., line feed, backspace), 32 through 126 are printable characters, and 127 is delete.

char type
• Like Java, C supports the char data type for storing a single character.
• char uses one byte of memory
• char constants are enclosed in single quotes, 'A'
• Use %c in printf( ) to print a single character

ASCII Character Chart
[Figure: ASCII character chart]

ASCII Observations
• Characters for digits are consecutive
– An ASCII digit can be changed to its decimal value by subtracting '0'
• ASCII digit - '0' = digit value (e.g. '7' - '0' = 7)
– Can use a loop
    for (d = '0'; d <= '9'; d++)
        printf("%c", d);
• Uppercase letters are consecutive, and so are lowercase letters
– Can use a loop
    for (k = 0; k < 26; k++)
    {
        printf("%c", 'A' + k);
        printf("%c", 'a' + k);
    }
    for (c = 'A'; c <= 'Z'; c++)
        printf("%c", c);
– Can use them as an index
    char grade = 'D';
    int x = score[grade - 'A'];

Special Characters
• The backslash character, \, indicates that the character that follows has special meaning. It is used for unprintable characters and for characters that would otherwise be interpreted specially.
• For example
– \n is the newline character
– \t is the tab character
– \" is the double quote (necessary since double quotes are used to enclose strings)
– \' is the single quote (necessary since single quotes are used to enclose chars)
– \\ is the backslash (necessary since \ now has special meaning)
– \a is the alert (beep) character, which is unprintable

Special Char Example Code
• What is the output from these statements?
• printf("\t\tMove over\n\nWorld, here I come\n");
                Move over

World, here I come
• printf("I\'ve written \"Hello World\"\n\t many times\n\a");
I've written "Hello World"
         many times
<beep>

Character Library
• There are many functions written to handle characters. To use these functions, include <ctype.h>.
• Note that the function parameter type is int, not char. Why is this ok?
• Note that the return type of the test functions is int (zero for false, nonzero for true), since C originally had no boolean type.
• A few of the commonly used functions are listed on the next slide. For a full list of ctype.h functions, type man ctype.h at the unix prompt.

ctype.h
• int isdigit (int c);
– Determines if c is a decimal digit ('0' thru '9')
• int isxdigit (int c);
– Determines if c is a hexadecimal digit ('0' - '9', 'a' - 'f', 'A' - 'F')
• int isalpha (int c);
– Determines if c is an alphabetic character (upper- or lower-case)
• int isspace (int c);
– Determines if c is a whitespace character (space, tab, etc)
• int isprint (int c);
– Determines if c is a printable character
• int tolower (int c);
• int toupper (int c);
– Return c converted to lower- or upper-case respectively, if possible; otherwise return c unchanged

Strings in C
• In C, a string is an array of characters terminated with the "null" character ('\0', value = 0).
• Char arrays are permitted a special initialization using a string constant
    char name[4] = "bob";
which is shorthand for, and equivalent to,
    char name[4] = {'b', 'o', 'b', '\0'};

    name:   'b' 'o' 'b' '\0'

Let the compiler count
• If the size of your initialized char array is not specified, the compiler will count the characters (including the null character) and size your array for you
    char name[ ] = "bob";
    char title[ ] = "Mr.";

    name:   'b' 'o' 'b' '\0'
    title:  'M' 'r' '.' '\0'

C String Library
• C provides a library of string functions. Note that the assignment (=) and equality (==) operators don't do the job: they do not copy or compare the characters of a string.
• To use the string functions, include <string.h>.
• Some of the more common functions are listed on the next slides.
• To see all the string functions, type man string.h at the unix prompt.

C String Library (2)
• Commonly used string functions
– strlen( char string[ ] )
• Returns the number of characters in the string, not including the "null" character.
– strcpy( char s1[ ], char s2[ ] )
• Copies s2 on top of s1. The order of the parameters mimics the assignment operator.
– strcmp( char s1[ ], char s2[ ] )
• Returns < 0, 0, > 0 if s1 < s2, s1 == s2 or s1 > s2 lexicographically
– strcat( char s1[ ], char s2[ ] )
• Appends (concatenates) s2 to s1

C String Library (3)
• Some functions in the C String library have an additional size parameter.
– strncpy( char s1[ ], char s2[ ], int n )
• Copies at most n characters of s2 on top of s1. If s2 has n or more characters, no null character is written to s1. Again, the order of the parameters mimics the assignment operator.
– strncmp( char s1[ ], char s2[ ], int n )
• Compares up to n characters of s1 with s2.
• Returns < 0, 0, > 0 if s1 < s2, s1 == s2 or s1 > s2 lexicographically
– strncat( char s1[ ], char s2[ ], int n )
• Appends at most n characters of s2 to s1
• Use %s in printf( ) to print a string.
String Code
    char first[10] = "bobby";
    char last[15]  = "smith";
    char name[30];
    char you[5]    = "bobo";

    strcpy( name, first );                           /* name is "bobby" */
    strcat( name, last );                            /* name is "bobbysmith" */
    printf( "%d, %s\n", (int) strlen(name), name );  /* prints 10, bobbysmith */
    strncpy( name, last, 2 );                        /* replaces 2 chars, adds no '\0': "smbbysmith" */
    printf( "%d, %s\n", (int) strlen(name), name );  /* prints 10, smbbysmith */
    int result = strcmp( you, first );               /* > 0: 'o' comes after 'b' at index 3 */
    result = strncmp( you, first, 3 );               /* 0: both begin with "bob" */
    strcat( first, last );                           /* overflow: needs 11 bytes, first has 10 */

Simple Encryption
    #include <stdio.h>
    #include <ctype.h>

    int main( )
    {
        char c, msg[] = "this is a secret message";
        int i = 0;
        /* Initialize our encryption code: code[k] replaces letter 'a' + k */
        char code[26] =
            {'t','f','h','x','q','j','e','m','u','p','i','d','c',
             'k','v','b','a','o','l','r','z','w','g','n','s','y'};

        /* Print the original phrase */
        printf("Original phrase: %s\n", msg);

        /* Encrypt: replace each letter, leave other characters alone */
        while( msg[ i ] != '\0' )
        {
            if( isalpha( (unsigned char) msg[ i ] ) )
            {
                c = tolower( (unsigned char) msg[ i ] );
                msg[ i ] = code[ c - 'a' ];
            }
            ++i;
        }
        printf("Encrypted: %s\n", msg);
        return 0;
    }

"Big Enough"
• The "owner" of a string is responsible for allocating array space which is "big enough" to store the string (including the null character)
• Most string library functions (e.g. strcpy) do not check the size of the destination memory.

What can happen?
    #include <stdio.h>
    #include <string.h>

    int main( )
    {
        char first[10] = "bobby";
        char last[15] = "smith";

        printf("first contains %d chars: %s\n", (int) strlen(first), first);
        printf("last contains %d chars: %s\n", (int) strlen(last), last);
        strcpy(first, "1234567890123");   /* too big: 14 bytes into a 10-byte array */
        printf("first contains %d chars: %s\n", (int) strlen(first), first);
        printf("last contains %d chars: %s\n", (int) strlen(last), last);
        return 0;
    }

    /* output (writing past the end of first is undefined behavior, so results vary) */
    first contains 5 chars: bobby
    last contains 5 chars: smith
    first contains 13 chars: 1234567890123
    last contains 5 chars: smith
    Segmentation fault

No Classes in C
• Because C is not an OOP language, there is no way to combine data and code into a single entity. C does allow us to combine related data into a structure using the keyword struct. All data in a struct can be accessed by any code.
The general form of a structure definition is
    struct tag
    {
        member1_declaration;
        member2_declaration;
        member3_declaration;
        . . .
        memberN_declaration;
    };      /* note the semi-colon */
where struct is the keyword, tag names this kind of struct, and member_declarations are variable declarations which define the members.

C struct Example
• Defining a struct to represent a point in a coordinate plane
    struct point        /* point is the struct tag */
    {
        int x;          /* x-coordinate */
        int y;          /* y-coordinate */
    };
• Given the declarations
    struct point p1;
    struct point p2;
we can access the members of these struct variables:
* the x-coordinate of p1 is p1.x
* the y-coordinate of p1 is p1.y
* the x-coordinate of p2 is p2.x
* the y-coordinate of p2 is p2.y

Using structs and members
• Like other variable types, struct variables (e.g. p1, p2) can be passed to functions as parameters and returned from functions as return values.
• The members of a struct are variables just like any other and can be used wherever any other variable of the same type may be used. For example, the members of struct point can be used just like any other integer variables.

printPoint.c
    #include <stdio.h>

    /* struct point is defined as on the previous slide */

    struct point inputPoint( )
    {
        struct point p;
        printf("please input the x- and y-coordinates: ");
        scanf("%d %d", &p.x, &p.y);
        return p;
    }

    void printPoint( struct point point )
    {
        printf("( %2d, %2d )", point.x, point.y);
    }

    int main( )
    {
        struct point endpoint;
        endpoint = inputPoint( );
        printPoint( endpoint );
        return 0;
    }

struct assignment
• The contents of a struct variable may be copied to another struct variable of the same type using the assignment (=) operator
• After this code is executed
    struct point p1;
    struct point p2;
    p1.x = 42;
    p1.y = 59;
    p2 = p1;    /* structure assignment copies members */
the values of p2's members are the same as p1's members, i.e. p1.x and p2.x are both 42, and p1.y and p2.y are both 59.

struct within a struct
• A data element in a struct may be another struct (similar to composition in Java / C++).
• This example defines a line in the coordinate plane by specifying its endpoints as point structs
    struct line
    {
        struct point leftEndPoint;
        struct point rightEndPoint;
    };
• Given the declarations below, how do we access the x- and y-coordinates of each line's endpoints?
    struct line line1, line2;

    line1.leftEndPoint.x
    line1.rightEndPoint.x
    line2.leftEndPoint.x
    line2.rightEndPoint.x
and likewise with .y for the y-coordinates.

Arrays of struct
• Since a struct is a variable type, we can create arrays of structs just like we create arrays of int, char, double, etc.
• Write the declaration for an array of 5 line structures named "lines"
    struct line lines[ 5 ];
• Write the code to print the x-coordinate of the left end point of the 3rd line in the array
    printf( "%d\n", lines[2].leftEndPoint.x );

Array of struct Code
    /* assume same point and line struct definitions */
    int main( )
    {
        struct line lines[5];
        int k;

        /* write code to initialize all data members to zero */
        for (k = 0; k < 5; k++)
        {
            lines[k].leftEndPoint.x = 0;
            lines[k].leftEndPoint.y = 0;
            lines[k].rightEndPoint.x = 0;
            lines[k].rightEndPoint.y = 0;
        }

        /* call the printPoint( ) function to print
        ** the left end point of the 3rd line */
        printPoint( lines[2].leftEndPoint );
        return 0;
    }

Bitfields
• When saving space in memory or in a communications message is of paramount importance, we sometimes need to pack lots of information into a small space. We can use struct syntax to define "variables" which are as small as 1 bit in size. These variables are known as "bit fields".
    struct weather
    {
        unsigned int temperature : 7;   /* 7 bits hold 0 to 127; 5 bits (0 to 31) could not hold 44 */
        unsigned int windSpeed : 6;
        unsigned int isRaining : 1;
        unsigned int isSunny : 1;
        unsigned int isSnowing : 1;
    };

Using Bitfields
    struct weather todaysWeather;

    todaysWeather.temperature = 44;
    todaysWeather.isSnowing = 0;
    todaysWeather.windSpeed = 23;
    /* etc */

    if (todaysWeather.isRaining)
        printf( "%s\n", "Take your umbrella" );
    if (todaysWeather.temperature < 32 )
        printf( "%s\n", "Stay home" );

More on Bit fields
• Almost everything about bit fields is implementation (machine and compiler) specific.
• Bit fields may only be defined as (unsigned) ints
• Bit fields do not have addresses, so the & operator cannot be applied to them
• We'll see more on this later

Unions
• A union is a variable type that may hold members of different types and sizes, BUT only one member at a time. All members of the union share the same memory. The compiler assigns enough memory for the largest of the member types.
• The syntax for defining a union and using its members is the same as the syntax for a struct.

Formal Union Definition
• The general form of a union definition is
    union tag
    {
        member1_declaration;
        member2_declaration;
        member3_declaration;
        . . .
        memberN_declaration;
    };
where union is the keyword, tag names this kind of union, and member_declarations are variable declarations which define the members. Note that the syntax for defining a union is exactly the same as the syntax for a struct.

union.c
    #include <stdio.h>

    union data
    {
        int x;
        char c[8];
    };

    int main( )
    {
        int i;
        union data item;

        item.x = 42;
        printf("%d, %o, %x\n", item.x, item.x, item.x);

        /* print each byte of the shared memory; the bytes beyond
        ** sizeof(int) were not set by the assignment to item.x */
        for (i = 0; i < 8; i++)
            printf("%x ", (unsigned char) item.c[i]);
        printf("%s", "\n");

        printf("size of union data = %lu\n", (unsigned long) sizeof(union data));
        return 0;
    }

Union vs. Struct
• Similarities
– Definition syntax virtually identical
– Member access syntax identical
• Differences
– Members of a struct each have their own address in memory.
– The size of a struct is at least as big as the sum of the sizes of the members (more on this later)
– Members of a union share the same memory. The size of a union is at least the size of the largest member.