Data Representation Bits and Bytes ASCII Code

Data Representation
Bits and Bytes
A computer stores data in units called bits and bytes. The elements inside computer chips
(integrated circuits) are in one of two states, off or on. A number system that uses only two
digits, 0 and 1, was therefore adopted: 0 represents off and 1 represents on. You can think of
each element as a sort of light switch. Each switch is called a bit.
Bits are grouped together in sets of eight. Each set of eight bits is called a byte. Different
combinations of those eight on and off settings stand for letters, numbers, spaces, and
symbols. For practical purposes, think of a byte as one character. When computers report
memory or storage, they use the following units of measurement.
8 bits = 1 byte
1024 bytes = 1 Kilobyte (KB)
1024 Kilobytes = 1 Megabyte (MB)
1024 Megabytes = 1 Gigabyte (GB)
1024 Gigabytes = 1 Terabyte (TB)
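As a rough illustration of these units, the following Python sketch converts a byte count into the largest fitting unit, using the factor 1024 from the list above. The function name describe_size is made up for this example.

```python
# Units from the list above; each one is 1024 times the previous one.
UNITS = ["bytes", "KB", "MB", "GB", "TB"]

def describe_size(num_bytes):
    """Express a byte count in the largest unit that keeps the value >= 1."""
    value = float(num_bytes)
    unit_index = 0
    while value >= 1024 and unit_index < len(UNITS) - 1:
        value /= 1024
        unit_index += 1
    return f"{value:g} {UNITS[unit_index]}"

print(2 ** 8)                    # number of distinct patterns in one byte: 256
print(describe_size(1024))       # 1 KB
print(describe_size(1024 ** 3))  # 1 GB
```

The loop simply divides by 1024 until the value drops below 1024 or the largest unit in the list is reached.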
ASCII Code
Text characters, as well as numbers, must be binary coded to get them into a computer's memory.
A code that covers alphabetic characters, a . . . z, A . . . Z, as well as digits, 0 . . . 9, is
called alphanumeric. Until recently, the most frequently used code was ASCII (American
Standard Code for Information Interchange). ASCII uses 7 bits (codes 0 to 127); this includes
printable and control characters, see the following ASCII table.
Name                          Hex          Control
Bel                           07           ctrl-G
line-feed                     0a           ctrl-J
carriage-return               0d           ctrl-M
blank-space                   20
Digits 0 . . . 9              30 . . . 39
Alphabetic upper-case A . . . Z   41 . . . 5a
Alphabetic lower-case a . . . z   61 . . . 7a
ASCII table
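The codes in the ASCII table can be checked with Python's built-in ord() and chr() functions. This is only a small sketch; the helper is_ascii_7bit is a name invented here to illustrate the 0 to 127 range.

```python
# ord() gives the numeric code of a character; chr() goes the other way.
# "\a" is the Bel character, "\n" is line-feed, "\r" is carriage-return.
for ch in ["\a", "\n", "\r", " ", "0", "9", "A", "Z", "a", "z"]:
    print(repr(ch), format(ord(ch), "02x"))

def is_ascii_7bit(text):
    """Return True if every character fits in 7 bits (codes 0 to 127)."""
    return all(ord(ch) <= 127 for ch in text)

print(chr(0x41))               # A
print(is_ascii_7bit("Hello"))  # True
```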
UNICODE
ASCII has proved adequate for English text. But what about languages with accented letters, or
those whose scripts do not use the Latin alphabet at all? UNICODE is a newer international
standard, governed by the UNICODE consortium, aimed at allowing a richer character set. Java,
for example, uses a 16-bit UNICODE code, so the Java char type is 16 bits, compared to 8 bits
in C and C++. According to (Tanenbaum 1999), the total of known characters in the world's
languages numbers some 200,000; thus, there is already pressure on UNICODE.
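To make the 16-bit point concrete, the sketch below looks at a few characters outside ASCII. It assumes that the 16-bit code in question corresponds to the UTF-16 encoding, which is how Python exposes it; the helper name utf16_units is hypothetical.

```python
import unicodedata

# Characters outside ASCII have codes above 127.
# Escapes are used so the example does not depend on the file encoding.
for ch in ["A", "\u00e9", "\u03a9", "\u4e2d"]:
    print(hex(ord(ch)), unicodedata.name(ch))

def utf16_units(text):
    """Count the 16-bit units needed to store text as UTF-16."""
    return len(text.encode("utf-16-be")) // 2

print(utf16_units("\u00e9"))      # one 16-bit unit
print(utf16_units("\U0001F600"))  # a code above 0xFFFF needs two units
```

The last line hints at the pressure mentioned above: a code that does not fit in 16 bits has to be stored as a pair of 16-bit units.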
Number System
Digital computers use binary (base 2) as their 'native' representation; however, in addition to
binary and decimal, it is also useful to discuss hexadecimal (base 16) and octal (base 8)
representations. Our everyday representation of numbers is decimal; it uses the 10 digits
{0,1,2,3,4,5,6,7,8,9}. 10 is its base, or radix.
The decimal number system is a positional system: the meaning of any digit is determined by its
position in the string of digits that represent the number. Thus:
123 = 1 x 10^2 + 2 x 10^1 + 3 x 10^0
    = 1 x 100 + 2 x 10 + 3 x 1
    = 100 + 20 + 3
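The positional rule can be sketched in a few lines of Python. The function name expand_decimal is chosen for this example only; it splits a non-negative integer into its digits and their weights.

```python
def expand_decimal(number):
    """Return the (digit, weight) pairs of a non-negative decimal integer."""
    digits = str(number)
    highest = len(digits) - 1  # exponent of the leftmost digit
    return [(int(d), 10 ** (highest - i)) for i, d in enumerate(digits)]

terms = expand_decimal(123)
print(terms)                         # [(1, 100), (2, 10), (3, 1)]
print(sum(d * w for d, w in terms))  # 123
```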
Binary Representation
Base 2 requires just two symbols, 0 and 1, but the pattern remains the same as for decimal. The
digits are termed binary digits, or bits. The binary number
(1011)_2 = 1 x 2^3 + 0 x 2^2 + 1 x 2^1 + 1 x 2^0 = (11)_10
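Python can be used to check this result: int() accepts a base as its second argument, and bin() converts back to binary. The weighted sum is also written out by hand for comparison.

```python
value = int("1011", 2)  # parse the string as a base 2 number
print(value)            # 11
print(bin(value))       # 0b1011

# The same sum written out bit by bit, most significant bit first.
weights = [2 ** 3, 2 ** 2, 2 ** 1, 2 ** 0]
bits = [1, 0, 1, 1]
by_hand = sum(b * w for b, w in zip(bits, weights))
print(by_hand)          # 11
```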
Octal Representation
Octal is base 8: octal numbers are made of the octal digits {0,1,2,3,4,5,6,7}.
The octal number (4536)_8 = 4 x 8^3 + 5 x 8^2 + 3 x 8^1 + 6 x 8^0 = (2398)_10
Hexadecimal Representation
Hexadecimal is base 16: the sixteen symbols used are {0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F}.
The hexadecimal number (3A9F)_16 = 3 x 16^3 + 10 x 16^2 + 9 x 16^1 + 15 x 16^0 = (15007)_10
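The same positional pattern works for any base up to 16. The sketch below, with the invented name to_decimal, evaluates a digit string from left to right: at each step the running value is multiplied by the base and the next digit is added.

```python
DIGITS = "0123456789ABCDEF"

def to_decimal(text, base):
    """Evaluate a digit string in the given base (2 to 16)."""
    value = 0
    for ch in text.upper():
        digit = DIGITS.index(ch)      # raises ValueError for unknown symbols
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + digit  # shift one position, add the new digit
    return value

print(to_decimal("4536", 8))   # 2398
print(to_decimal("3A9F", 16))  # 15007
```

The results agree with Python's built-in int("4536", 8) and int("3A9F", 16).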
A table of the values for all bases is shown below.
Dec   Bin    Oct   Hex
0     0      0     0
1     1      1     1
2     10     2     2
3     11     3     3
4     100    4     4
5     101    5     5
6     110    6     6
7     111    7     7
8     1000   10    8
9     1001   11    9
10    1010   12    A
11    1011   13    B
12    1100   14    C
13    1101   15    D
14    1110   16    E
15    1111   17    F
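A table like this one can be generated rather than typed. The following sketch uses Python's format() with the codes "b", "o" and "X" for binary, octal and hexadecimal; base_table is a name made up for the example.

```python
def base_table(limit=16):
    """Build rows of (decimal, binary, octal, hexadecimal) strings."""
    return [(format(n, "d"), format(n, "b"), format(n, "o"), format(n, "X"))
            for n in range(limit)]

# Print the rows in right-aligned columns.
print(f"{'Dec':>3} {'Bin':>4} {'Oct':>3} {'Hex':>3}")
for dec, bin_, oct_, hex_ in base_table():
    print(f"{dec:>3} {bin_:>4} {oct_:>3} {hex_:>3}")
```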
Note: Conversion between the different bases will be discussed in detail, along with many
examples, in the next face-to-face meeting.