Chars, Strings and Structs

Programming in C
ASCII
The American Standard Code for Information Interchange (ASCII)
character set has 128 characters designed to encode the Roman
alphabet used in English, along with digits and punctuation.
C was designed to work with ASCII and we will only use ASCII in this
course. Each ASCII character is a seven-bit code (0 to 127) stored in
one eight-bit byte with a leading 0. A char occupies exactly one byte,
so every ASCII code fits in a char; sizeof(char) is 1 by definition.
Characters 0 through 31 are control characters (e.g., line
feed, backspace), 32-126 are printable characters, and 127 is delete.
char type
• Like Java, C supports the char data type for storing
a single character.
• char uses one byte of memory
• char constants are enclosed in single quotes, e.g. 'A'
• Use %c in printf( ) to print a single character
ASCII Character Chart
ASCII Observations
• Characters for digits are consecutive
– Can change an ASCII digit to its decimal value by subtracting '0'
• ASCII digit - '0' = digit value (e.g. '7' - '0' = 7)
– Can use a loop
    for (d = '0'; d <= '9'; d++)
        printf("%c", d);
• Uppercase and lowercase letters are consecutive
– Can use a loop
    for (k = 0; k < 26; k++) {
        printf("%c", 'A' + k);
        printf("%c", 'a' + k);
    }
    for (c = 'A'; c <= 'Z'; c++)
        printf("%c", c);
– Can use them as an index
    char grade = 'D';
    int x = score[grade - 'A'];
Special Characters
• The backslash character, \, indicates that the char that
follows has special meaning. It is used for unprintable characters
and for characters that would otherwise be treated as syntax.
• For example
– \n is the newline character
– \t is the tab character
– \" is the double quote (necessary since double quotes are used to
enclose strings)
– \' is the single quote (necessary since single quotes are used to
enclose chars)
– \\ is the backslash (necessary since \ now has special meaning)
– \a is the alert (beep) character, which is unprintable
Special Char Example Code
• What is the output from these statements?
    printf("\t\tMove over\n\nWorld, here I come\n");
                Move over

World, here I come
    printf("I\'ve written \"Hello World\"\n\t many times\n\a");
I've written "Hello World"
         many times <beep>
Character Library
• There are many functions written to handle characters. To use
these functions, include <ctype.h>.
• Note that the function parameter type is int, not char. Why is
this ok?
• Note that the return type for some functions is int since C
(before C99) has no boolean type; zero means false, nonzero means true.
• A few of the commonly used functions are listed on the next
slide. For a full list of ctype.h functions, type man ctype.h at
the unix prompt.
ctype.h
• int isdigit (int c);
– Determines if c is a decimal digit ('0' thru '9')
• int isxdigit (int c);
– Determines if c is a hexadecimal digit ('0' - '9', 'a' - 'f', 'A' - 'F')
• int isalpha (int c);
– Determines if c is an alphabetic character (upper or lower-case)
• int isspace (int c);
– Determines if c is a whitespace character (space, tab, etc)
• int isprint (int c);
– Determines if c is a printable character
• int tolower (int c);
• int toupper (int c);
– Changes c to lower- or upper-case respectively, if possible
Strings in C
• In C, a string is an array of characters terminated with the
“null” character ('\0', value = 0).
• Char arrays are permitted a special initialization using a string
constant
    char name[4] = "bob";
which is shorthand for, but equivalent to
    char name[4] = {'b', 'o', 'b', '\0'};
name: 'b' 'o' 'b' '\0'
Let the compiler count
• If the size of your initialized char array is not specified, the
compiler will count the characters and size your array for you
    char name[ ] = "bob";
    char title[ ] = "Mr.";
name:  'b' 'o' 'b' '\0'
title: 'M' 'r' '.' '\0'
C String Library
• C provides a library of string functions. Note that the
assignment (=) and equality (==) operators don’t do the job:
an array cannot be assigned with =, and == compares addresses,
not contents.
• To use the string functions, include <string.h>.
• Some of the more common functions are listed on the
next slides.
• To see all the string functions, type man string.h at the
unix prompt.
C String Library (2)
• Commonly used string functions
– strlen( char string[ ] )
• Returns the number of characters in the string, not including
the “null” character.
– strcpy( char s1[ ], char s2[ ] )
• Copies s2 on top of s1. The order of the parameters mimics
the assignment operator.
– strcmp ( char s1[ ] , char s2[ ] )
• Returns < 0, 0, > 0 if s1 < s2, s1 == s2 or s1 > s2 lexicographically
– strcat( char s1[ ] , char s2[ ])
• Appends (concatenates) s2 to s1
C String Library (3)
• Some functions in the C String library have an
additional size parameter.
– strncpy( char s1[ ], char s2[ ], int n )
• Copies at most n characters of s2 on top of s1. Again, the
order of the parameters mimics the assignment operator.
• If s2 has n or more characters, s1 is not null-terminated.
– strncmp ( char s1[ ] , char s2[ ], int n )
• Compares up to n characters of s1 with s2.
• Returns < 0, 0, > 0 if s1 < s2, s1 == s2 or s1 > s2 lexicographically
– strncat( char s1[ ], char s2[ ] , int n)
• Appends at most n characters of s2 to s1
• Use %s in printf( ) to print a string.
String Code
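char first[10] = "bobby";
char last[15] = "smith";
char name[30];
char you[5] = "bobo";
strcpy( name, first );                            /* name is "bobby" */
strcat( name, last );                             /* name is "bobbysmith" */
printf( "%d, %s\n", (int) strlen(name), name );   /* 10, bobbysmith */
strncpy( name, last, 2 );                         /* overwrites 2 chars only */
printf( "%d, %s\n", (int) strlen(name), name );   /* 10, smbbysmith */
int result = strcmp( you, first );                /* > 0: 'o' > 'b' at index 3 */
result = strncmp( you, first, 3 );                /* 0: "bob" equals "bob" */
strcat( first, last );                            /* overflow: needs 11 bytes, first has 10 */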
Simple Encryption
char c, msg[] = "this is a secret message";
int i = 0;
char code[26] = /* Initialize our encryption code */
    {'t','f','h','x','q','j','e','m','u','p','i','d','c',
     'k','v','b','a','o','l','r','z','w','g','n','s','y'} ;
/* Print the original phrase */
printf ("Original phrase: %s\n", msg);
/* Encrypt: replace each letter with its entry in code[ ] */
while( msg[ i ] != '\0' ){
    /* the cast keeps the ctype argument in the valid range */
    if( isalpha( (unsigned char) msg[ i ] ) ) {
        c = tolower( (unsigned char) msg[ i ] ) ;
        msg[ i ] = code[ c - 'a' ] ;
    }
    ++i;
}
printf("Encrypted: %s\n", msg ) ;
“Big Enough”
• The “owner” of a string is responsible for allocating
array space which is “big enough” to store the string
(including the null character)
• Most string library functions do not check the size of
the string memory. E.g. strcpy
What can happen?
#include <stdio.h>
#include <string.h>

int main( )
{
    char first[10] = "bobby";
    char last[15] = "smith";
    printf("first contains %d chars: %s\n", (int) strlen(first), first);
    printf("last contains %d chars: %s\n", (int) strlen(last), last);
    strcpy(first, "1234567890123");   /* too big: undefined behavior */
    printf("first contains %d chars: %s\n", (int) strlen(first), first);
    printf("last contains %d chars: %s\n", (int) strlen(last), last);
    return 0;
}
/* output (one possible result) */
first contains 5 chars: bobby
last contains 5 chars: smith
first contains 13 chars: 1234567890123
last contains 5 chars: smith
Segmentation fault
1/14/10
No Classes in C
• Because C is not an OOP language, there is no way to combine data and
code into a single entity. C does allow us to combine related data into a
structure using the keyword struct. All data in a struct can be accessed
by any code. The general form of a structure definition is
    struct tag
    {
        member1_declaration;
        member2_declaration;
        member3_declaration;
        . . .
        memberN_declaration;
    };    /* note the semi-colon */
where struct is the keyword, tag names this kind of struct, and
member_declarations are variable declarations which define the
members.
C struct Example
• Defining a struct to represent a point in a coordinate plane
    struct point    /* point is the struct tag */
    {
        int x;    /* x-coordinate */
        int y;    /* y-coordinate */
    };
• Given the declarations
struct point p1;
struct point p2;
we can access the members of these struct variables:
* the x-coordinate of p1 is p1.x
* the y-coordinate of p1 is p1.y
* the x-coordinate of p2 is p2.x
* the y-coordinate of p2 is p2.y
Using structs and members
• Like other variable types, struct variables (e.g. p1, p2) can be
passed to functions as parameters and returned from
functions as return types.
• The members of a struct are variables just like any other and
can be used wherever any other variable of the same type may
be used. For example, the members of the struct point can
be used just like any other integer variables.
printPoint.c
#include <stdio.h>

struct point
{
    int x;
    int y;
};

/* read a point from standard input */
struct point inputPoint( )
{
    struct point p;
    printf("please input the x- and y-coordinates: ");
    scanf("%d %d", &p.x, &p.y);
    return p;
}

/* print a point as ( x, y ) */
void printPoint( struct point point )
{
    printf ("( %2d, %2d )", point.x, point.y);
}

int main ( )
{
    struct point endpoint;
    endpoint = inputPoint( );
    printPoint( endpoint );
    return 0;
}
struct assignment
• The contents of a struct variable may be copied to another
struct variable of the same type using the assignment (=)
operator
• After this code is executed
    struct point p1;
    struct point p2;
    p1.x = 42;
    p1.y = 59;
    p2 = p1;    /* structure assignment copies members */
the values of p2’s members are the same as p1’s members,
i.e. p2.x is 42 and p2.y is 59.
struct within a struct
• A data element in a struct may be another struct
(similar to composition in Java / C++).
• This example defines a line in the coordinate plane by
specifying its endpoints as point structs
struct line
{
struct point leftEndPoint;
struct point rightEndPoint;
};
• Given the declarations below, how do we access the x- and
y-coordinates of each line’s endpoints?
    struct line line1, line2;
    line1.leftEndPoint.x     line1.leftEndPoint.y
    line1.rightEndPoint.x    line1.rightEndPoint.y
    line2.leftEndPoint.x     line2.leftEndPoint.y
    line2.rightEndPoint.x    line2.rightEndPoint.y
Arrays of struct
• Since a struct is a variable type, we can create
arrays of structs just like we create arrays of int,
char, double, etc.
• Write the declaration for an array of 5 line structures
named “lines”
    struct line lines[ 5 ];
• Write the code to print the x-coordinate of the left
end point of the 3rd line in the array
    printf( "%d\n", lines[2].leftEndPoint.x);
1/20/10
Array of struct Code
/* assume same point and line struct definitions */
int main( )
{
struct line lines[5];
int k;
/* write code to initialize all data members to zero */
for (k = 0; k < 5; k++)
{
lines[k].leftEndPoint.x = 0;
lines[k].leftEndPoint.y = 0;
lines[k].rightEndPoint.x = 0;
lines[k].rightEndPoint.y = 0;
}
/* call the printPoint( ) function to print
** the left end point of the 3rd line */
printPoint( lines[2].leftEndPoint);
return 0;
}
Bitfields
• When saving space in memory or a communications
message is of paramount importance, we sometimes need
to pack lots of information into a small space. We can use
struct syntax to define “variables” which are as small as 1
bit in size. These variables are known as “bit fields”.
    struct weather
    {
        unsigned int temperature : 7;   /* 0..127 */
        unsigned int windSpeed   : 6;   /* 0..63 */
        unsigned int isRaining   : 1;
        unsigned int isSunny     : 1;
        unsigned int isSnowing   : 1;
    };
Using Bitfields
    struct weather todaysWeather;
    todaysWeather.temperature = 44;
    todaysWeather.isSnowing = 0;
    todaysWeather.windSpeed = 23;
    /* etc */
    if (todaysWeather.isRaining)
        printf( "%s\n", "Take your umbrella");
    if (todaysWeather.temperature < 32 )
        printf( "%s\n", "Stay home");
More on Bit fields
• Almost everything about bit fields is implementation
(machine and compiler) specific.
• Bit fields may only be defined as (unsigned) ints
• Bit fields do not have addresses, so the & operator
cannot be applied to them
• We’ll see more on this later
Unions
• A union is a variable type that may hold members of
different types and sizes, BUT only one
member at a time. All members of the union share
the same memory. The compiler assigns enough
memory for the largest of the member types.
• The syntax for defining a union and using its
members is the same as the syntax for a struct.
Formal Union Definition
• The general form of a union definition is
union tag
{
member1_declaration;
member2_declaration;
member3_declaration;
. . .
memberN_declaration;
};
where union is the keyword, tag names this kind of union, and
member_declarations are variable declarations which
define the members. Note that the syntax for defining a union
is exactly the same as the syntax for a struct.
union.c
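union data
{
    int x;
    char c[8];
} ;
int i;
union data item;
item.x = 42;
printf("%d, %o, %x\n", item.x, item.x, item.x );
/* bytes beyond sizeof(int) were never written; their values are unspecified */
for (i = 0; i < 8; i++ )
    printf("%x ", (unsigned char) item.c[i]);
printf( "%s", "\n");
printf("size of union data = %lu\n", (unsigned long) sizeof(union data));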
Union vs. Struct
• Similarities
– Definition syntax virtually identical
– Member access syntax identical
• Differences
– Members of a struct each have their own address in
memory.
– The size of a struct is at least as big as the sum of the
sizes of the members (more on this later)
– Members of a union share the same memory. The size of a
union is at least the size of its largest member.