Primitive Data Part 2 (character representation, ASCII)

Characters
For the USA, there are two standard encodings for 8-bit characters:
EBCDIC - Extended Binary Coded Decimal Interchange Code is used mainly on IBM mainframe computers. It descended from 6-bit punch card representations.
ASCII - American Standard Code for Information Interchange has become the dominant encoding. (Strictly, ASCII defines 7-bit codes, which are stored one per 8-bit byte.)
Letter   ASCII (Decimal)   ASCII (Hex)   EBCDIC (Hex)
A        65                41            C1
B        66                42            C2
C        67                43            C3
D        68                44            C4
E        69                45            C5
F        70                46            C6
G        71                47            C7
H        72                48            C8
I        73                49            C9
J        74                4A            D1
K        75                4B            D2
L        76                4C            D3
M        77                4D            D4
N        78                4E            D5
O        79                4F            D6
P        80                50            D7
Q        81                51            D8
R        82                52            D9
S        83                53            E2
T        84                54            E3
U        85                55            E4
V        86                56            E5
W        87                57            E6
X        88                58            E7
Y        89                59            E8
Z        90                5A            E9

Do you notice anything that might be annoying about the EBCDIC value sequence?

Printable Character Representations of Numbers

Number   ASCII (Hex)   EBCDIC (Hex)
0        30            F0
1        31            F1
2        32            F2
3        33            F3
4        34            F4
5        35            F5
6        36            F6
7        37            F7
8        38            F8
9        39            F9

Punctuation Symbols and Operators in ASCII

Symbol    ASCII (Hex)   Meaning
(space)   20            blank
!         21            exclamation point
"         22            double quote
#         23            hash
$         24            dollar sign
%         25            percent
&         26            ampersand
'         27            apostrophe
(         28            left parenthesis
)         29            right parenthesis
*         2A            asterisk
+         2B            addition
,         2C            comma
-         2D            subtraction
.         2E            period
/         2F            slash
:         3A            colon
;         3B            semicolon
<         3C            less than
=         3D            equal to
>         3E            greater than
?         3F            question mark
@         40            at symbol
[         5B            left bracket
\         5C            backslash
]         5D            right bracket
^         5E            caret
_         5F            underscore
{         7B            left brace
|         7C            vertical bar (or)
}         7D            right brace
~         7E            tilde
Some Control Characters in ASCII

Dec   ASCII (Hex)   Meaning
0     00            Null Character (\0)
7     07            Bell Character (\a)
8     08            Backspace (\b)
9     09            Tab (\t)
10    0A            Line Feed (\n)
12    0C            Form Feed (\f)
13    0D            Carriage Return (\r)

Printing Hexadecimal Values for Character Values
Note that a single char value is promoted to int when passed as an argument to printf. If char is signed and the first bit is 1, the negative sign is propagated into the upper bytes. To avoid this, replace the higher part of the value with 00 using a bitwise and (& 0x00FF). (See the example for Recognizing Printable Characters.)
// Sample code
char szValue[] = "#1 San Antonio Spurs";
int i;
// Print each character
for (i = 0; i < strlen(szValue); i++)
    printf("%c ", szValue[i]);
printf("\n");
// Print each character's value in hex, keeping only the low byte
for (i = 0; i < strlen(szValue); i++)
    printf("%02X ", szValue[i] & 0x00FF);
printf("\n");
// Output
# 1   S a n   A n t o n i o   S p u r s
23 31 20 53 61 6E 20 41 6E 74 6F 6E 69 6F 20 53 70 75 72 73
We will discuss the bitwise and in more detail later.
Recognizing Printable Characters
In C, there are several functions (declared in <ctype.h>) which are supposed to help. Unfortunately, the standard only defines their behavior when the argument fits in an unsigned char (or is EOF). Some implementations give a runtime error when the first bit is 1 (a negative char value), which makes them less valuable.
isprint(c)   returns non-zero if c is a printable character or space; otherwise, 0 is returned
isalpha(c)   returns non-zero if c is an alphabetic character (A-Z, a-z); otherwise, 0 is returned
isdigit(c)   returns non-zero if c is a numeric character; otherwise, 0 is returned
We can use the %X format code to print character values in hexadecimal. (See also the notes on printing integer values as hexadecimal.)
Try the sample code without the highlighted text (the & 0x00FF mask). What happens?
// For ASCII, this is the set of printables
#define PRINTABLE(c) (((c) >= ' ' && (c) <= '~') ? 1 : 0)
char cValue[] = { 'A', 'z', '~', '\t', '5', '$', '\f', 0xF5, '\0' };
int i;
for (i = 0; cValue[i] != '\0'; i++)
{
    if (PRINTABLE(cValue[i]))
        printf("%c %02X\n", cValue[i], cValue[i]);
    else
        printf("%c %02X\n", '.', cValue[i] & 0x00FF);
}
// Output
A 41
z 7A
~ 7E
. 09
5 35
$ 24
. 0C
. F5