ch10 V2 - ccsa221-4

CCSA 221
Programming in C
Chapter 10
Character Strings
AlHanouf AlAmr
1
Outline
• Arrays of Characters
• Variable-Length Character Strings
• Initializing and Displaying Character Strings
• Testing Two Character Strings for Equality
• Inputting Character Strings
• Single-Character Input
• The Null String
• Escape Characters
• More on Constant Strings
• Character Strings, Structures, and Arrays
• A Better Search Method
• Character Operations
2
What is a character string?
• In the statement: printf ("Programming in C is fun.\n");
the argument that is passed to the printf function is the character string
"Programming in C is fun.\n"
• The double quotation marks are used to delimit the character string, which can
contain any combination of letters, numbers, or special characters.
• When introduced to the data type char, you learned that a variable that is
declared to be of this type can contain only a single character.
• To assign a single character to such a variable, the character is enclosed within a
pair of single quotation marks.
• Thus, the assignment plusSign = '+'; has the effect of assigning the
character '+' to the variable plusSign.
3
Arrays of Characters
• If you want to be able to deal with variables that can hold more than a single
character, you can use an array of characters.
• You can define an array of characters called word as follows:
char word [] = { 'H', 'e', 'l', 'l', 'o', '!' };
 To print out the contents of the array word, you run through each element in the
array and display it using the %c format characters.
 To process the word (copy it, concatenate two words, etc.), you need to keep the
actual length of the character array in a separate variable!
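Without a terminating character, every routine that works on the array has to be told its length. A minimal sketch of what that looks like, assuming two hypothetical helpers, copyChars and printChars, that are not part of the original slides:

```c
#include <stdio.h>

// Hypothetical helper: copies exactly n characters from src to dest.
// Neither array is null-terminated, so the caller must supply n.
void copyChars (char dest[], const char src[], int n)
{
    int i;

    for ( i = 0; i < n; ++i )
        dest[i] = src[i];
}

// Hypothetical helper: displays n characters one at a time with %c.
void printChars (const char array[], int n)
{
    int i;

    for ( i = 0; i < n; ++i )
        printf ("%c", array[i]);

    printf ("\n");
}
```

For the six-element array word defined above, the calls would be copyChars (copy, word, 6) and printChars (copy, 6), with the length 6 passed by hand each time.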
4
Variable-Length Character Strings
• A method for dealing with character arrays without knowing the actual length of the
character array:
 Placing a special character at the end of every character string. In this manner,
the function can then determine for itself when it has reached the end of a
character string after it encounters this special character.
• In the C language, the special character that is used to
signal the end of a string is known as the null character
and is written as '\0'.
char word [] = { 'H', 'e', 'l', 'l', 'o', '!', '\0' };
• How are these variable-length character strings used?
 For example, by a function that counts the number of characters
in a character string.
5
Variable-Length Character Strings
// Function to count the number of characters in a string
#include <stdio.h>

// The function declares its argument as a const array of characters
// because it does not make any changes to the array.
int stringLength (const char string[])
{
    int count = 0;

    while ( string[count] != '\0' )
        ++count;

    return count;
}

int main (void)
{
    const char word1[] = { 'a', 's', 't', 'e', 'r', '\0' };
    const char word2[] = { 'a', 't', '\0' };
    const char word3[] = { 'a', 'w', 'e', '\0' };

    printf ("%i %i %i\n", stringLength (word1),
            stringLength (word2), stringLength (word3));

    return 0;
}

Output:
5 2 3
6
Initializing and Displaying Character Strings
• Initializing a string:
char word[] = "Hello!";
• This is equivalent to:
char word[] = { 'H', 'e', 'l', 'l', 'o', '!', '\0' };
As you can see, the size of the array is not specified in the previous two examples.
• We can specify the size of the string as follows:
char word[7] = { "Hello!" };
In this example, the compiler has enough room in the array to place the terminating
null character.
• However, in: char word[6] = { "Hello!" };
the compiler can't fit a terminating null character at the end of the array, and so it
doesn't put one there.
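The difference between the sized and unsized definitions can be checked directly. The sketch below assumes a hypothetical helper, hasTerminator, that simply scans an array of known size for the null character; it is an illustration, not something defined in the slides.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: reports whether any of the first size elements
// of array is the null character '\0'.
bool hasTerminator (const char array[], size_t size)
{
    size_t i;

    for ( i = 0; i < size; ++i )
        if ( array[i] == '\0' )
            return true;

    return false;
}
```

With char word[] = "Hello!"; the expression sizeof (word) is 7 (six characters plus the null), and hasTerminator (word, sizeof (word)) is true. For a six-element array holding only 'H', 'e', 'l', 'l', 'o', '!' it is false, so such an array must not be handed to functions that look for the null character.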
7
Initializing and Displaying Character Strings
• In general, character-string constants in the C language are automatically terminated by
the null character wherever they appear in your program.
 In this example: printf ("Programming in C is fun\n");
Function printf can determine when the end of a character string has been reached
when it finds the null character that is automatically placed after the newline character.
• To display a null-terminated array of characters, we use the special format characters %s, e.g.
printf ("%s\n", word);
8
String processing-Concatenate strings
#include <stdio.h>

// Function to concatenate two character strings
void concat (char result[], const char str1[], const char str2[])
{
    int i, j;

    // copy str1 to result
    for ( i = 0; str1[i] != '\0'; ++i )
        result[i] = str1[i];

    // copy str2 to result
    for ( j = 0; str2[j] != '\0'; ++j )
        result[i + j] = str2[j];

    // Terminate the concatenated string with a null character
    result[i + j] = '\0';
}

int main (void)
{
    const char s1[] = "Test ";
    const char s2[] = "works.";
    char s3[20];

    concat (s3, s1, s2);
    printf ("%s\n", s3);

    return 0;
}

Output:
Test works.
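The concat function trusts the caller to make result large enough. As a hedged sketch of one way to guard against that, the hypothetical variant below (concatBounded, not from the slides) takes the size of result as an extra argument and refuses to write past it.

```c
#include <stdbool.h>

// Hypothetical bounded variant of concat: size is the total number of
// elements in result. Returns false (and stores a null string) when
// str1, str2, and the terminating null would not all fit.
bool concatBounded (char result[], int size,
                    const char str1[], const char str2[])
{
    int i, j;

    if ( size < 1 )
        return false;

    // copy str1, always leaving room for the terminator
    for ( i = 0; str1[i] != '\0'; ++i ) {
        if ( i >= size - 1 ) {
            result[0] = '\0';
            return false;
        }
        result[i] = str1[i];
    }

    // copy str2 after it, with the same check
    for ( j = 0; str2[j] != '\0'; ++j ) {
        if ( i + j >= size - 1 ) {
            result[0] = '\0';
            return false;
        }
        result[i + j] = str2[j];
    }

    result[i + j] = '\0';
    return true;
}
```

"Test " and "works." need 12 elements in total (11 characters plus the null), so the call succeeds for char s3[20] and fails for an 11-element array.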
9
Testing Two Character Strings for Equality
• You cannot directly test two strings to see if they are equal with a statement
such as
if ( string1 == string2 )
The comparison compiles, but it tests where the two arrays are stored, not the
characters they contain.
• We can develop a function that can be used to compare two character strings as
follows:
bool equalStrings (const char s1[], const char s2[])
{
    int i = 0;
    bool areEqual;

    while ( s1[i] == s2[i] && s1[i] != '\0' && s2[i] != '\0' )
        ++i;

    if ( s1[i] == '\0' && s2[i] == '\0' )
        areEqual = true;
    else
        areEqual = false;

    return areEqual;
}
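For comparison, the standard header <string.h> already provides strcmp, which returns 0 when two strings contain the same characters. The wrapper below is only a sketch showing how it relates to equalStrings; the name equalStringsStd is hypothetical and the slides themselves do not use <string.h>.

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical wrapper: gives the same answer as equalStrings above,
// but relies on the standard library function strcmp, which returns 0
// when the two strings contain the same characters.
bool equalStringsStd (const char s1[], const char s2[])
{
    return strcmp (s1, s2) == 0;
}
```

Writing the loop by hand, as equalStrings does, shows what such a library routine has to do: walk both strings until they differ or one of them ends.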
10
// Program to Test Strings for Equality
#include <stdio.h>
#include <stdbool.h>

/***** Insert equalStrings function here (declared on the previous slide) *****/

int main (void)
{
    const char stra[] = "string compare test";
    const char strb[] = "string";

    printf ("%i\n", equalStrings (stra, strb));      // different strings
    printf ("%i\n", equalStrings (stra, stra));      // comparing the same string
    printf ("%i\n", equalStrings (strb, "string"));  // comparing equal strings

    return 0;
}

Output:
0
1
1
11
Inputting Character Strings
• The scanf function can be used with the %s format characters to read in a string of
characters up to a blank space, tab character, or the end of the line, whichever occurs
first.
char string[81];
scanf ("%s", string);
 Note that unlike previous scanf calls, in the case of reading strings, the & is not
placed before the array name!
 If the following characters are entered: Shawshank
the string "Shawshank" is read in by the scanf function and is stored inside the string
array.
 If the following line of text is typed instead: iTunes playlist
just the string "iTunes" is stored inside the string array because the blank space after
it terminates the string.
 If the scanf call is executed again, this time the string "playlist" is stored inside the
string array because the scanf function always continues scanning from the most recent
character that was read in.
12
Inputting Character Strings
• If s1, s2, and s3 are defined to be character arrays of appropriate sizes, execution of
the statement:
scanf ("%s%s%s", s1, s2, s3);
with the input line of text:
micro computer system
results in the assignment of the string "micro" to s1, "computer" to s2, and "system" to
s3.
• If the following line of text is typed instead:
system expansion
it results in the assignment of the string "system" to s1 and "expansion" to s2. Because
no further characters appear on the line, scanf then waits for more input before it can
assign s3.
13
// Program to illustrate the %s scanf format characters
#include <stdio.h>

int main (void)
{
    char s1[81], s2[81], s3[81];

    printf ("Enter text:\n");
    scanf ("%s%s%s", s1, s2, s3);

    printf ("\ns1 = %s\ns2 = %s\ns3 = %s\n", s1, s2, s3);

    return 0;
}

Output:
Enter text:
system expansion
bus

s1 = system
s2 = expansion
s3 = bus
• If you type in more than 80 consecutive characters to the preceding program without pressing
the spacebar, the tab key, or the Enter (or Return) key, scanf overflows the character array!
 scanf has no way of knowing how large its character array parameters are! When handed a %s
format, it simply continues to read and store characters until one of the noted terminator
characters is reached.
• If you place a number after the % in the scanf format string, this tells scanf the maximum
number of characters to read.
 Correct use of scanf for reading strings into an 81-element array: scanf ("%80s", string);
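The effect of the field width can be tried without typing anything by using sscanf, the standard library function that scans a string instead of the keyboard but interprets the format the same way. A minimal sketch, assuming a hypothetical helper named firstWord and a deliberately small width of 5:

```c
#include <stdio.h>

// Hypothetical helper: reads the first word of line into word, which
// must have room for at least 6 characters (5 plus the terminating
// null). Returns the number of items assigned (1 on success).
int firstWord (const char line[], char word[])
{
    return sscanf (line, "%5s", word);
}
```

Given the line "iTunes playlist", firstWord stores "iTune": scanning stops after five characters even though the blank has not been reached yet, so a six-element array is never overrun.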
14
Single-Character Input
#include <stdio.h>

// Function to read a line of text from the terminal
void readLine (char buffer[])
{
    char character;
    int i = 0;

    // Read one line by repeated calls to getchar, which returns successive
    // single characters from the input. When the end of the line is reached,
    // getchar returns the newline character '\n'.
    do
    {
        character = getchar ();
        buffer[i] = character;
        ++i;
    }
    while ( character != '\n' );

    buffer[i - 1] = '\0';    // place a null at the end of every line
}

int main (void)
{
    int i;
    char line[81];

    for ( i = 0; i < 3; ++i )    // read 3 lines of text
    {
        readLine (line);
        printf ("%s\n\n", line);
    }

    return 0;
}

Output:
This is a sample line of text.
This is a sample line of text.

abcdefghijklmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxyz

runtime library routines
runtime library routines
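Like scanf with a plain %s, readLine above stores characters without checking the size of buffer, and it never stops if the input ends before a newline arrives. The sketch below is one possible bounded variant, under stated assumptions: the name readLineBounded, the extra size parameter, and the stream parameter (pass stdin to read from the terminal) are all hypothetical additions, not part of the slides.

```c
#include <stdio.h>

// Hypothetical bounded variant of readLine: stores at most size - 1
// characters of the next line from stream in buffer, discards the
// rest of the line, and always null-terminates. size must be >= 1.
// Stops at end of file as well as at '\n'.
void readLineBounded (FILE *stream, char buffer[], int size)
{
    int character;    // int, so that EOF can be told apart from a char
    int i = 0;

    while ( (character = fgetc (stream)) != '\n' && character != EOF )
        if ( i < size - 1 )
            buffer[i++] = (char) character;

    buffer[i] = '\0';
}
```

With char line[81]; the call readLineBounded (stdin, line, 81) behaves like readLine (line) for lines of up to 80 characters and silently truncates longer ones.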
15
The Null String
• The null string: A character string that contains no characters other than the null
character.
• The length of the null string is zero.
• In a program that reads a line of text from the user (for example, with readLine), if the
user presses the Enter key without typing any characters first, the string that is read is a
null string.
• Examples:
char empty[] = "";
char buf[100] = "";
16
// Program to count words in a piece of text
#include <stdio.h>
#include <stdbool.h>

/***** Insert readLine function here (see slide 15) *****/

// Function to determine if a character is alphabetic
bool alphabetic (const char c)
{
    if ( (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') )
        return true;
    else
        return false;
}

// Function to count the number of words in a string
int countWords (const char string[])
{
    int i, wordCount = 0;
    bool lookingForWord = true;

    for ( i = 0; string[i] != '\0'; ++i )
        if ( alphabetic(string[i]) )
        {
            if ( lookingForWord )
            {
                ++wordCount;
                lookingForWord = false;
            }
        }
        else
            lookingForWord = true;
17
    return wordCount;
}

int main (void)
{
    char text[81];
    int totalWords = 0;
    bool endOfText = false;

    printf ("Type in your text.\n");
    printf ("When you are done, press 'Enter'.\n\n");

    while ( ! endOfText )
    {
        readLine (text);    // readLine function from slide 15

        // If just Enter was pressed, the input line is a null string.
        if ( text[0] == '\0' )
            endOfText = true;
        else
            totalWords += countWords (text);
    }

    printf ("\nThere are %i words in the above text.\n", totalWords);

    return 0;
}
18
Escape Characters
• The backslash character has a special significance that extends beyond its use in
forming the newline ( \n ) and null ( \0 ) characters.
• Other characters can be combined with the backslash character to perform special
functions. These are referred to as escape characters.
\a    Audible alert
\b    Backspace
\f    Form feed
\n    Newline
\r    Carriage return
\t    Horizontal tab
\v    Vertical tab
\\    Backslash
\"    Double quotation mark
\'    Single quotation mark
\?    Question mark
19
Escape Characters
Examples:
• printf ("\aSYSTEM SHUT DOWN IN 5 MINUTES!!\n");
sounds an alert and displays the indicated message.
• printf ("%i\t%i\t%i\n", a, b, c);
displays the value of a , spaces over to the next tab setting, displays the value of b,
spaces over to the next tab setting, and then displays the value of c.
• printf ("\\t is the horizontal tab character.\n");
displays the following:
\t is the horizontal tab character.
Note that because the \\ is encountered first in the string, a tab is not displayed in this case.
• printf ("\"Hello,\" he said.\n");
results in the display of the message:
"Hello," he said.
• c = '\'';
assigns a single quotation character to c .
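One way to see what an escape sequence really stores is to count the characters in the resulting string. The sketch below reuses the counting loop of the stringLength function shown earlier in the chapter; the three array names are hypothetical and chosen only for this illustration.

```c
// Same counting loop as the stringLength function shown earlier.
int stringLength (const char string[])
{
    int count = 0;

    while ( string[count] != '\0' )
        ++count;

    return count;
}

// Each escape sequence below is written with two characters in the
// source file but is stored as one character in the string.
const char tabChar[]     = "\t";            // 1 character: a tab
const char tabText[]     = "\\t";           // 2 characters: \ and t
const char quotedHello[] = "\"Hello,\"";    // 8 characters, quotes included
```

stringLength (tabChar) is 1 and stringLength (tabText) is 2, which is why printing "\\t" shows a backslash followed by the letter t instead of moving to the next tab setting.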
20
More on Constant Strings
• Without the line continuation character, your C compiler generates an error message
if you attempt to initialize a character string across multiple lines; for example:
char letters[] =
{ "abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ" };
 By placing a backslash character at the end of each line to be continued, a character
string constant can be written over multiple lines:
char letters[] =
{ "abcdefghijklmnopqrstuvwxyz\
ABCDEFGHIJKLMNOPQRSTUVWXYZ" };
21
More on Constant Strings
 Another way to break up long character strings is to divide them into two or more
adjacent strings. Adjacent strings are constant strings separated by zero or more spaces,
tabs, or newlines.
 The compiler automatically concatenates adjacent strings together.
So, the letters array can also be set to the letters of the alphabet by writing
char letters[] =
{ "abcdefghijklmnopqrstuvwxyz"
"ABCDEFGHIJKLMNOPQRSTUVWXYZ" };
 The three printf calls:
printf ("Programming in C is fun\n");
printf ("Programming" " in C is fun\n");
printf ("Programming" " in C" " is fun\n");
all pass a single argument to printf because the compiler concatenates the strings
together in the second and third calls.
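The two techniques can be compared side by side. In this sketch the array names oneLine, continued, and adjacent are hypothetical; all three are meant to hold the same 52 letters.

```c
// The same 52-character string written three ways.
const char oneLine[] =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Line continuation: the second line must start in the first column,
// because any leading spaces would become part of the string.
const char continued[] =
    "abcdefghijklmnopqrstuvwxyz\
ABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Adjacent strings: the compiler joins them, so indentation is free.
const char adjacent[] =
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
```

sizeof reports 53 for each array (52 letters plus the terminating null). Adjacent strings are usually the easier form to keep tidy, since the continued line cannot be indented.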
22
Character Strings, Structures, and Arrays
• Suppose we want to write a computer program that acts like a dictionary. We could
type a word into the program, and the program could then automatically “look
up” the word inside the dictionary and tell us its definition.
• Because the word and its definition are logically related, we can represent them
inside the computer as a structure. We can define a structure
called entry to hold the word and its definition:
struct entry
{
    char word[15];
    char definition[50];
};
• The following is an example of a variable defined to be of type struct entry that is
initialized to contain the word “blob” and its definition.
struct entry word1 = { "blob", "an amorphous mass" };
• Because we want to provide for many words inside our dictionary, it seems logical to
define an array of entry structures, such as in
struct entry dictionary[100];
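Initializers only work at the point of definition; filling in an element of the dictionary array later means copying characters into its word and definition members. A minimal sketch, assuming two hypothetical helpers (copyString and setEntry) that are not part of the slides and that check the member sizes 15 and 50 declared above:

```c
#include <stdbool.h>

struct entry
{
    char word[15];
    char definition[50];
};

// Hypothetical helper: copies src into dest, which has size elements
// (size must be >= 1). Returns false, leaving dest as a null string,
// if src does not fit.
static bool copyString (char dest[], int size, const char src[])
{
    int i;

    for ( i = 0; src[i] != '\0'; ++i ) {
        if ( i >= size - 1 ) {
            dest[0] = '\0';
            return false;
        }
        dest[i] = src[i];
    }

    dest[i] = '\0';
    return true;
}

// Hypothetical helper: stores a word and its definition in
// dictionary[index]. Returns true only if both strings fit.
bool setEntry (struct entry dictionary[], int index,
               const char word[], const char definition[])
{
    bool wordFits = copyString (dictionary[index].word, 15, word);
    bool definitionFits = copyString (dictionary[index].definition, 50,
                                      definition);

    return wordFits && definitionFits;
}
```

For example, setEntry (dictionary, 0, "blob", "an amorphous mass") fills the first element, while a word of 15 or more characters is rejected because word[15] has room for only 14 characters plus the null.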
23
// Program to use the dictionary lookup program
#include <stdio.h>
#include <stdbool.h>

struct entry
{
    char word[15];
    char definition[50];
};

/***** Insert equalStrings function here (see slide 10) *****/

// Function to look up a word inside a dictionary
int lookup (const struct entry dictionary[], const char search[],
            const int entries)
{
    int i;

    for ( i = 0; i < entries; ++i )
        if ( equalStrings (search, dictionary[i].word) )
            return i;

    return -1;
}
24
int main (void)
{
    const struct entry dictionary[100] =
    {
        { "aardvark", "a burrowing African mammal"        },
        { "abyss",    "a bottomless pit"                  },
        { "acumen",   "mentally sharp; keen"              },
        { "addle",    "to become confused"                },
        { "aerie",    "a high nest"                       },
        { "affix",    "to append; attach"                 },
        { "agar",     "a jelly made from seaweed"         },
        { "ahoy",     "a nautical call of greeting"       },
        { "aigrette", "an ornamental cluster of feathers" },
        { "ajar",     "partially opened"                  }
    };
    char word[10];
    int entries = 10;
    int entry;

    printf ("Enter word: ");
    scanf ("%9s", word);    // at most 9 characters, leaving room for the null
    entry = lookup (dictionary, word, entries);

    if ( entry != -1 )
        printf ("%s\n", dictionary[entry].definition);
    else
        printf ("Sorry, the word %s is not in my dictionary.\n", word);

    return 0;
}

Output:
Enter word: agar
a jelly made from seaweed

Output (Rerun):
Enter word: accede
Sorry, the word accede is not in my dictionary.
25
A Better Search Method
• The method used by the lookup function is perfectly fine for a small-sized dictionary.
If you start dealing with large dictionaries containing hundreds or perhaps even
thousands of entries, this approach might no longer be sufficient because of the time
it takes to sequentially search through all of the entries.
• What you are really looking for is an algorithm that reduces the search time in most
cases, not just in one particular case. Such an algorithm exists under the name of the
binary search.
• Binary Search Algorithm
Search x in SORTED array M of size n.
Step 1: Set low to 0, high to n – 1.
Step 2: If low > high, x does not exist in M and the algorithm terminates.
Step 3: Set mid to (low + high) / 2.
Step 4: If M[mid] < x, set low to mid + 1 and go to step 2.
Step 5: If M[mid] > x, set high to mid – 1 and go to step 2.
Step 6: M[mid] equals x and the algorithm terminates.
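The steps above can be sketched directly for a sorted array of integers before applying them to strings. The function name binarySearch and the use of int elements are assumptions made for this illustration.

```c
// Binary search following the steps above.
// M must be sorted in ascending order; n is the number of elements.
// Returns the index of x in M, or -1 if x does not exist in M.
int binarySearch (const int M[], int n, int x)
{
    int low = 0;                  // step 1
    int high = n - 1;
    int mid;

    while ( low <= high )         // step 2: stop once low > high
    {
        mid = (low + high) / 2;   // step 3

        if ( M[mid] < x )         // step 4
            low = mid + 1;
        else if ( M[mid] > x )    // step 5
            high = mid - 1;
        else                      // step 6
            return mid;
    }

    return -1;
}
```

Each pass through the loop halves the range between low and high, so an array of 1,000 sorted elements needs at most about 10 comparisons instead of up to 1,000 for a sequential search.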
26
// Dictionary lookup program using binary search
#include <stdio.h>

struct entry
{
    char word[15];
    char definition[50];
};

// Function to compare two character strings
int compareStrings (const char s1[], const char s2[])
{
    int i = 0, answer;

    while ( s1[i] == s2[i] && s1[i] != '\0' && s2[i] != '\0' )
        ++i;

    if ( s1[i] < s2[i] )
        answer = -1;              /* s1 < s2  */
    else if ( s1[i] == s2[i] )
        answer = 0;               /* s1 == s2 */
    else
        answer = 1;               /* s1 > s2  */

    return answer;
}
27
// Function to look up a word inside a dictionary
int lookup (const struct entry dictionary[], const char search[],
            const int entries)
{
    int low = 0;
    int high = entries - 1;
    int mid, result;

    while ( low <= high )
    {
        mid = (low + high) / 2;
        result = compareStrings (dictionary[mid].word, search);

        if ( result == -1 )
            low = mid + 1;
        else if ( result == 1 )
            high = mid - 1;
        else
            return mid;           /* found it */
    }

    return -1;                    /* not found */
}
28
int main (void)
{
    const struct entry dictionary[100] =
    {
        { "aardvark", "a burrowing African mammal"        },
        { "abyss",    "a bottomless pit"                  },
        { "acumen",   "mentally sharp; keen"              },
        { "addle",    "to become confused"                },
        { "aerie",    "a high nest"                       },
        { "affix",    "to append; attach"                 },
        { "agar",     "a jelly made from seaweed"         },
        { "ahoy",     "a nautical call of greeting"       },
        { "aigrette", "an ornamental cluster of feathers" },
        { "ajar",     "partially opened"                  }
    };
    int entries = 10;
    char word[15];
    int entry;

    printf ("Enter word: ");
    scanf ("%14s", word);    // at most 14 characters, leaving room for the null
    entry = lookup (dictionary, word, entries);

    if ( entry != -1 )
        printf ("%s\n", dictionary[entry].definition);
    else
        printf ("Sorry, the word %s is not in my dictionary.\n", word);

    return 0;
}

Output:
Enter word: aigrette
an ornamental cluster of feathers

Output (Rerun):
Enter word: acerb
Sorry, the word acerb is not in my dictionary.
29
Character Operations
• Character variables and constants are frequently used in relational and arithmetic
expressions.
• Whenever a character constant or variable is used in an expression in C, it is
automatically converted to an integer value.
 if ( c >= 'a' && c <= 'z' )
The first part of this expression is actually comparing the value of c against the internal
numerical representation of the character 'a'. In ASCII, the character 'a' has the value 97,
the character 'b' has the value 98, and so on.
 printf ("%i \n", 'a'); displays 97.
 c = 'a' + 1;
printf ("%c \n", c);
Because the value of 'a' is 97 in ASCII, c = 97 + 1 = 98. The value 98 represents 'b' in ASCII,
so 'b' will be displayed by the printf call.
 i = c - '0';
The character '0' is not the same as the integer 0; the character '0' has the numerical
value 48 in ASCII. Therefore, if c contains the character '5', which in ASCII is the
number 53, then i = 53 - 48 = 5, which results in the integer value 5 being assigned to i.
30
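The same arithmetic on character values can be packaged into small functions. The two helpers below, toUpperCase and digitValue, are hypothetical names used for this sketch; both assume a character set such as ASCII in which the letters of each case and the digits are stored contiguously.

```c
// Hypothetical helper: converts a lowercase letter to uppercase by
// shifting it by the distance between 'a' and 'A'. Characters that
// are not lowercase letters are returned unchanged.
char toUpperCase (char c)
{
    if ( c >= 'a' && c <= 'z' )
        return (char) (c - 'a' + 'A');

    return c;
}

// Hypothetical helper: returns the value 0-9 of a digit character,
// or -1 if c is not a digit.
int digitValue (char c)
{
    if ( c >= '0' && c <= '9' )
        return c - '0';

    return -1;
}
```

In ASCII, toUpperCase ('b') computes 98 - 97 + 65 = 66, which is 'B', and digitValue ('5') computes 53 - 48 = 5.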
// Function to convert a string to an integer
#include <stdio.h>

int strToInt (const char string[])
{
    int i, intValue, result = 0;

    // Stop at the first character that is not a digit
    for ( i = 0; string[i] >= '0' && string[i] <= '9'; ++i )
    {
        intValue = string[i] - '0';
        result = result * 10 + intValue;
    }

    return result;
}

int main (void)
{
    printf ("%i\n", strToInt("245"));
    printf ("%i\n", strToInt("100") + 25);
    printf ("%i\n", strToInt("13x5"));

    return 0;
}

Output:
245
125
13
31