Representation of Data Within the
Computer
Brian Bramer, DeMontfort University, UK (bb@dmu.ac.uk)
Web based version:
http://www.cse.dmu.ac.uk/~bb/Teaching/ComputerSystems/RepresentationOfData/RepresentationOfData.htm
Contents
1 Decimal and binary integer numbers
2 Binary Addition
3 Signed Binary Numbers
4 Overflow
5 Hexadecimal Numbers
6 Conversion between the binary and hexadecimal number systems
7 Conversion of Decimal Numbers to Binary
8 Conversion of binary numbers to decimal
9 Conversion of decimal numbers to hexadecimal
10 Conversion of hexadecimal numbers to decimal
11 Fixed Point Real Numbers
12 Floating Point Real Numbers
13 ASCII Character Code
When humans use numeric data they usually represent the numbers using the decimal
system, i.e. base 10. Working in decimal numbers requires the ability to differentiate
between ten different states, the digits 0 through 9. For the human brain this is
straightforward, and may even be extended to take account of alphabetic information
(i.e. the characters a to z and A to Z). Computer systems are built from large numbers
of similar electronic circuits. Although it is possible to build electronic circuits which can
store and manipulate ten states, it is easier and cheaper to build electronic switches that
may be in one of two states, either ON or OFF. Such circuits can therefore be used to
represent binary data (base 2) with, for example, binary 1 and 0 being represented by
the ON and OFF states respectively.
Computer systems internally represent all information, data and instructions, in binary
form, with conversion between binary and human readable forms for input and output.
When working in machine code or assembly language it is sometimes necessary to use
binary or some similar number system. Binary numbers tend to be very long and hence
it is easy to make mistakes when dealing with such data. In such situations the
hexadecimal (base 16) number system is commonly used (it is very easy to convert
between binary and hexadecimal).
1 Decimal and binary integer numbers
                 Decimal Digit                  Binary Bit
number base      10                             2
possible states  0, 1, 2, 3, 4, 5, 6, 7, 8, 9   0 or 1
The above table shows that a decimal digit can represent one of ten states, 0 through 9,
and a binary bit (a single binary digit is called a bit) can represent two states, 0 or 1. It is
possible, however, to represent more states by joining a sequence of digits or bits
together, and in such a case it is assumed that the least significant digit or bit is the
rightmost and the most significant is the leftmost. The bits or digits are generally
numbered starting with the least significant from 0.
1.1 An eight digit decimal number
digit        7     6     5     4     3     2     1     0
digit value  10^7  10^6  10^5  10^4  10^3  10^2  10^1  10^0
In the decimal system the least significant (rightmost) digit represents units (10^0), the
next tens (10^1), the next hundreds (10^2), etc., therefore the above eight digit number
can represent values in the range 0 (all digits 0) to 99999999 (all digits 9).
1.2 An eight-bit binary number
bit        7    6    5    4    3    2    1    0
bit value  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
In the binary system the least significant (rightmost) bit represents units (2^0), the next
twos (2^1), the next fours (2^2), etc., therefore the above 8-bit binary number can
represent values in the range 0 (all bits 0) to 11111111 binary (all bits are 1). It is
possible to convert between number bases and 11111111 binary is equivalent to 255
decimal. Larger values can be represented by more bits, for example a 16-bit binary
number can represent 0 to 65535 decimal, and a 32-bit number 0 to 4294967295.
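As a quick check of these ranges, the largest unsigned value that n bits can hold is 2^n - 1. The following is a minimal Python sketch; the function name max_unsigned is illustrative and does not come from the original text.

```python
def max_unsigned(bits: int) -> int:
    """Largest value an unsigned binary number of the given width can hold."""
    return 2 ** bits - 1

# Print the upper limit for the word sizes mentioned in the text.
for width in (8, 16, 32):
    print(width, max_unsigned(width))
```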
Within a computer system a memory word is built up from a number of bits. Typical
word sizes are eight bits (usually called a byte), 16 bits, 32 bits or 64 bits. In practice the
majority of modern computer systems use a memory based on bytes of storage, with
sequences of bytes being used to store 16-, 32- or 64-bit numeric data.
2 Binary Addition
The following truth tables show all the possible combinations of the addition of:
(a) two bits A and B
(b) two bits A and B plus a carry in from a previous addition.
In both cases the addition results in a SUM and a carry out.
A + B                     A + B + carry in
A  B  | carry  SUM        A  B  carry in | carry out  SUM
0  0  |   0     0         0  0     0     |     0       0
0  1  |   0     1         0  1     0     |     0       1
1  0  |   0     1         1  0     0     |     0       1
1  1  |   1     0         1  1     0     |     1       0
                          0  0     1     |     0       1
                          0  1     1     |     1       0
                          1  0     1     |     1       0
                          1  1     1     |     1       1
The following are examples of decimal and binary addition:
decimal  binary      decimal  binary      decimal  binary
    5      101          10     1010          27     11011
   +2     + 10         + 9    +1001         +15    + 1111
    7      111          19    10011          42    101010
The rightmost bits are added using the left hand table above. This results in a SUM and
a carry bit which is carried out to be added into the addition of the next two bits (using
the right hand table above). This addition then results in a sum and a carry out, etc.
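The bit-by-bit procedure described above can be sketched in Python as follows. This is an illustrative sketch only: the function names full_adder and add_binary are assumptions, and binary numbers are held as strings of 0s and 1s for readability. The first bit position is handled by the same function with a carry in of 0, which gives the same results as adding two bits alone.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (carry_out, sum) for one bit position, as in case (b)."""
    total = a + b + carry_in
    return total // 2, total % 2

def add_binary(x: str, y: str) -> str:
    """Add two binary strings bit by bit, rightmost bits first."""
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)
    carry, bits = 0, []
    for a, b in zip(reversed(x), reversed(y)):
        carry, s = full_adder(int(a), int(b), carry)
        bits.append(str(s))
    if carry:
        bits.append("1")  # final carry out becomes the new leftmost bit
    return "".join(reversed(bits))

print(add_binary("11011", "1111"))  # 101010 (27 + 15 = 42)
```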
The majority of modern computer systems store numeric values in sequences of bytes,
i.e. 8-bit words of storage. A single byte is limited to representing a number in the range
0 to 255 decimal. If the addition of two bytes results in a carry out of bit 7, the result is
greater than 255, and an error has occurred. When carrying out integer arithmetic on a
computer system care must be taken to ensure that the results will fit the word size
being used (generally 16 or 32 bits are used for integer number calculations).
3 Signed Binary Numbers
Mathematical and scientific calculations require the storage of negative as well as
positive integer numbers. To represent a positive or negative number using the binary
system one bit, usually the leftmost bit, is reserved for the sign. A negative number can
then be represented in a number of forms, e.g. to represent -10 decimal as an eight bit
signed binary number:
(a) sign-true magnitude   10001010
(b) ones-complement       11110101
(c) twos complement       11110110
Sign-True Magnitude Form. The leftmost bit holds the sign of the number, 0 for positive
and 1 for negative, and the other seven bits represent the magnitude. In the example (a)
0001010 is the magnitude equivalent to 10 decimal, and the leftmost bit is 1 to indicate
that the value is negative. This system is not commonly used in computer systems
because it requires separate addition and subtraction circuits.
Ones Complement Form. To obtain the negative of a number each bit of the positive
binary value is complemented, i.e. 0s are replaced with 1s and 1s with 0s. In example
(b) +10 decimal, 00001010 binary, is complemented to form -10 decimal, i.e. 11110101
binary. This form is used in some computer systems, e.g. CDC 7600 series, but it has
the problem that 0 can take two forms +0 (00000000) or -0 (11111111).
Twos Complement Form. To obtain the negative value of a number the ones
complement is obtained, and then 1 added, i.e. in (c) above the value of +10 decimal,
00001010, is ones complemented to obtain 11110101, and then 1 added to obtain
11110110 (-10 decimal).
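The three forms can be compared with a short Python sketch. The function names and the fixed 8-bit width are assumptions made for illustration, and no check is made that the value fits in eight bits.

```python
BITS = 8  # word size assumed for this illustration

def sign_magnitude(value: int) -> str:
    """Sign bit followed by the 7-bit magnitude."""
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{BITS - 1}b")

def ones_complement(value: int) -> str:
    """Complement every bit of the positive value when value is negative."""
    positive = format(abs(value), f"0{BITS}b")
    if value >= 0:
        return positive
    return "".join("1" if bit == "0" else "0" for bit in positive)

def twos_complement(value: int) -> str:
    """Ones complement plus 1 when value is negative."""
    if value >= 0:
        return format(value, f"0{BITS}b")
    ones = int(ones_complement(value), 2)
    return format((ones + 1) % (1 << BITS), f"0{BITS}b")

print(sign_magnitude(-10), ones_complement(-10), twos_complement(-10))
# 10001010 11110101 11110110
```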
The advantage of complemented numbers is that separate addition and subtraction
circuits are not required. To subtract a number, its complement is formed (a very easy
operation), and the result added (using the normal adder circuits) to the other number.
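As an illustration of subtraction using only complement and addition, the following Python sketch works on 8-bit values. The names MASK, negate and subtract are hypothetical; the mask simply discards any carry out of bit 7.

```python
MASK = 0xFF  # keep results to 8 bits

def negate(value: int) -> int:
    """Twos complement negation: invert every bit, then add 1."""
    return ((value ^ MASK) + 1) & MASK

def subtract(a: int, b: int) -> int:
    """Compute a - b by adding the twos complement of b."""
    return (a + negate(b)) & MASK

print(format(subtract(27, 10), "08b"))  # 00010001 (17 decimal)
```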
The majority of modern computer systems use twos complement form to represent
signed binary numbers. In practice signed numbers are used for normal arithmetic
calculations, and unsigned numbers for addresses, e.g. in assembly language
programs. The range that can be represented by signed and unsigned 8-bit, 16-bit and
32-bit binary numbers is shown in Chapter 1 Table 1.1.
4 Overflow
Overflow occurs if the number of bits is too small to store the result of an arithmetic
operation. For example, when using 8-bit signed numbers the binary addition 01101110
+ 00101101 (decimal: 110 + 45) would result in the value 10011011 binary. It can be
seen that the addition of the two positive numbers has resulted in the incorrect negative
value -101 decimal. After the computer hardware has carried out an arithmetic operation
it sets condition code bits that indicate if:
- the result was 0;
- the result was negative;
- a carry resulted from the operation; or
- an overflow occurred during the operation.
The condition code bits can be used in program control structures and for checking for
error conditions. Many high-level language run-time systems automatically check for
overflow errors, and special instructions can be used by assembly language programs
to test the condition code bits.
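The following Python sketch imitates how the four condition codes might be derived for an 8-bit addition. It is an illustration under stated assumptions, not a description of any particular processor: the function names and the flag names are invented here, and the overflow rule used is the usual one for twos complement (both operands have the same sign but the result has the other sign).

```python
def add_signed_8bit(a: int, b: int) -> tuple[int, dict[str, bool]]:
    """Add two 8-bit patterns and report illustrative condition flags."""
    raw = a + b
    result = raw & 0xFF
    sign_a, sign_b, sign_r = a >> 7, b >> 7, result >> 7
    flags = {
        "zero": result == 0,
        "negative": sign_r == 1,
        "carry": raw > 0xFF,
        # Overflow: operands share a sign but the result's sign differs.
        "overflow": sign_a == sign_b and sign_r != sign_a,
    }
    return result, flags

def as_signed(pattern: int) -> int:
    """Interpret an 8-bit pattern as a twos complement value."""
    return pattern - 256 if pattern & 0x80 else pattern

result, flags = add_signed_8bit(0b01101110, 0b00101101)  # 110 + 45
print(format(result, "08b"), as_signed(result), flags["overflow"])
# 10011011 -101 True
```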
5 Hexadecimal Numbers
When working in assembly languages it is often necessary to specify memory
addresses and bit patterns. To do this using binary numbers would be cumbersome and
error prone, i.e. to represent a 16-bit binary number sixteen 0s and 1s would have to be
entered. In practice, the hexadecimal (base 16) number system is commonly used:
- it is a very concise way to represent numbers (each hexadecimal digit represents
  four binary bits); and
- it is easy to convert between binary and hexadecimal.
decimal  hexadecimal  binary      decimal  hexadecimal  binary
0        0            0000        16       10           00010000
1        1            0001        17       11           00010001
2        2            0010        18       12           00010010
3        3            0011        19       13           00010011
4        4            0100        20       14           00010100
5        5            0101        21       15           00010101
6        6            0110        22       16           00010110
7        7            0111        23       17           00010111
8        8            1000        24       18           00011000
9        9            1001        25       19           00011001
10       A            1010        26       1A           00011010
11       B            1011        27       1B           00011011
12       C            1100        28       1C           00011100
13       D            1101        29       1D           00011101
14       E            1110        30       1E           00011110
15       F            1111        31       1F           00011111
Table 1: Decimal, hexadecimal and binary numbers
6 Conversion between the binary and hexadecimal number systems
To convert a binary number to hexadecimal:
1. working from the least significant (rightmost) bit split the binary number up into
groups of four bits;
2. using Table 1 convert each group of four bits into the equivalent hexadecimal
digit.
For example:
0001001110011110 to 0001 0011 1001 1110 = 139E hexadecimal
To convert from hexadecimal to binary, replace each hexadecimal digit with the
equivalent four bit binary value.
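The two conversions can be sketched in Python as follows. The function names and the HEX_DIGITS lookup string are assumptions made for illustration; the lookup string plays the role of Table 1.

```python
HEX_DIGITS = "0123456789ABCDEF"  # index = value of the four-bit group

def binary_to_hex(bits: str) -> str:
    """Split into groups of four from the right and convert each group."""
    # Pad on the left so the length is a multiple of four.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)

def hex_to_binary(digits: str) -> str:
    """Replace each hexadecimal digit with its four-bit binary value."""
    return "".join(format(HEX_DIGITS.index(d), "04b") for d in digits.upper())

print(binary_to_hex("0001001110011110"))  # 139E
print(hex_to_binary("139E"))              # 0001001110011110
```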
7 Conversion of Decimal Numbers to Binary
To convert a positive decimal integer the following algorithm starts by generating the
least significant binary bit, then the next, etc.:
LOOP
    next binary bit = remainder of DECIMAL_VALUE/2
    DECIMAL_VALUE = DECIMAL_VALUE/2 (ignoring remainder)
UNTIL DECIMAL_VALUE=0
e.g. convert decimal 38 to binary
                                          result
38/2 = 19 remainder 0 gives binary 0           0
19/2 =  9 remainder 1 gives binary 1          10
 9/2 =  4 remainder 1 gives binary 1         110
 4/2 =  2 remainder 0 gives binary 0        0110
 2/2 =  1 remainder 0 gives binary 0       00110
 1/2 =  0 remainder 1 gives binary 1      100110
To obtain the binary equivalent of a negative decimal number, convert the absolute
value to binary then take the twos complement.
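A possible Python rendering of the repeated-division algorithm is given below. The function names are illustrative, and the 8-bit default width used for negative values is an assumption made for the example.

```python
def decimal_to_binary(value: int) -> str:
    """Convert a non-negative integer to binary, least significant bit first."""
    bits = ""
    while True:
        bits = str(value % 2) + bits   # remainder is the next bit
        value //= 2                    # integer division ignores the remainder
        if value == 0:
            break
    return bits

def negative_to_binary(value: int, width: int = 8) -> str:
    """Convert the absolute value, then take the twos complement."""
    positive = decimal_to_binary(abs(value)).zfill(width)
    inverted = "".join("1" if b == "0" else "0" for b in positive)
    return format((int(inverted, 2) + 1) % (1 << width), f"0{width}b")

print(decimal_to_binary(38))    # 100110
print(negative_to_binary(-10))  # 11110110
```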
8 Conversion of binary numbers to decimal
The following algorithm converts a binary number into decimal:
DECIMAL_VALUE=0
LOOP starting with the most significant binary bit
    BIT_VALUE = value of current binary bit
    DECIMAL_VALUE = DECIMAL_VALUE*2 + BIT_VALUE
UNTIL current bit is the least significant
For example, convert 100110 binary (remember the least significant bit is bit 0):
bit processed      5     4      3      2      1      0
DECIMAL_VALUE  ((((1*2 + 0)*2 + 0)*2 + 1)*2 + 1)*2 + 0 = 38
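The same multiply-and-add loop can be written in Python as a short sketch (the function name binary_to_decimal is illustrative):

```python
def binary_to_decimal(bits: str) -> int:
    """Process bits from the most significant (leftmost) to the least."""
    value = 0
    for bit in bits:
        value = value * 2 + int(bit)
    return value

print(binary_to_decimal("100110"))  # 38
```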
9 Conversion of decimal numbers to hexadecimal
The following algorithm generates the least significant (rightmost) hexadecimal digit,
then the next digit, etc.:
LOOP
    REMAINDER = remainder of DECIMAL_VALUE/16
    next hexadecimal digit = hexadecimal equivalent of REMAINDER
    DECIMAL_VALUE = DECIMAL_VALUE/16 (ignoring remainder)
UNTIL DECIMAL_VALUE=0
e.g. convert 1567 to hexadecimal
                                                  result
1567/16 = 97 remainder 15 gives hexadecimal F          F
  97/16 =  6 remainder  1 gives hexadecimal 1         1F
   6/16 =  0 remainder  6 gives hexadecimal 6        61F
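A Python sketch of the same loop follows; the function name and the digit lookup string are assumptions made for illustration.

```python
def decimal_to_hex(value: int) -> str:
    """Convert a non-negative integer to hexadecimal, rightmost digit first."""
    digits = "0123456789ABCDEF"
    result = ""
    while True:
        result = digits[value % 16] + result  # remainder selects the digit
        value //= 16                          # ignore the remainder
        if value == 0:
            break
    return result

print(decimal_to_hex(1567))  # 61F
```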
10 Conversion of hexadecimal numbers to decimal
DECIMAL_VALUE=0
LOOP starting with the most significant hexadecimal digit
    DIGIT_VALUE = decimal value of current hexadecimal digit
    DECIMAL_VALUE = DECIMAL_VALUE*16 + DIGIT_VALUE
UNTIL current hexadecimal digit is the least significant
For example, convert 61F hexadecimal to decimal:
hex digit processed  2       1       0
DECIMAL_VALUE      ((6*16) + 1)*16 + 15 = 1567
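In Python the hexadecimal to decimal loop might look like this (a sketch; the function name is illustrative):

```python
def hex_to_decimal(digits: str) -> int:
    """Process hexadecimal digits from the most significant to the least."""
    value = 0
    for digit in digits.upper():
        digit_value = "0123456789ABCDEF".index(digit)
        value = value * 16 + digit_value
    return value

print(hex_to_decimal("61F"))  # 1567
```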
11 Fixed Point Real Numbers
So far only integer numbers have been considered. Such numbers are useful when
calculations on whole number values are required, e.g. for loop control in programs. In
practice, however, it is necessary to be able to represent fractional components of
numbers as well. These are called real numbers and one means by which these may be
represented is in fixed point form. The following shows a 16-bit binary value in which the
whole number part (with sign) is stored in eight bits (bits 8 to 15) and the fractional
component in eight bits (bits 0 to 7):
bit    15    14   13   12   11   10   9    8    7     6     5     4     3     2     1     0
value  Sign  2^6  2^5  2^4  2^3  2^2  2^1  2^0  2^-1  2^-2  2^-3  2^-4  2^-5  2^-6  2^-7  2^-8
For example decimal 10.75 would be 1010.11 binary. The major limitations of this
system of real number representation are that:
1. The maximum absolute size of numbers is limited by the number of bits assigned
to hold the whole number part (as with a normal binary integer number).
2. When dealing with small fractional components accuracy is lost, and very small
values cannot be represented at all, e.g. the smallest non-zero value that can be
represented by the above fixed point number is 0.00390625 (2^-8).
In practice these restrictions on fixed point numbers do not make it worth while
providing the extra software or hardware within the computer system to process them.
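The 16-bit layout above can be imitated in Python by scaling a value by 2^8 and storing the result as an integer. This is a minimal sketch for non-negative values only; sign handling is omitted, and the names FRACTION_BITS, SCALE, to_fixed and from_fixed are assumptions made for illustration.

```python
FRACTION_BITS = 8
SCALE = 1 << FRACTION_BITS  # 256: one unit of the integer is 2^-8

def to_fixed(value: float) -> int:
    """Scale a non-negative real value to an 8.8 fixed point integer."""
    return round(value * SCALE)

def from_fixed(raw: int) -> float:
    """Convert an 8.8 fixed point integer back to a real value."""
    return raw / SCALE

print(format(to_fixed(10.75), "016b"))  # 0000101011000000
print(from_fixed(1))                    # 0.00390625, the smallest step
print(to_fixed(0.001))                  # 0: too small to be represented
```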
12 Floating Point Real Numbers
In many scientific and engineering applications very small or very large numbers have
to be represented, e.g. from the sizes of atomic particles to intergalactic distances. In
the floating point number system the real value is represented by a signed fractional
component called the mantissa and a signed exponent. For example, decimal floating
point numbers (using base 10) can be represented:
mantissa * 10^exponent where 0.1 <= mantissa < 1.0
To maintain accuracy the absolute value of the mantissa is maintained within the range
shown (this process is called normalisation), e.g. 6520000.0 would be 0.652*10^7
and -0.00000000652 would be -0.652*10^-8. In practice many printers cannot print
superscripts so the above examples would be printed as follows: 6520000.0 as 0.652E7
and -0.00000000652 as -0.652E-8 where the E indicates an exponent of 10.
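Decimal normalisation can be tried out with Python's decimal module, which keeps the digits exact. The function name normalise is hypothetical; the sketch assumes the convention given above, with the absolute value of the mantissa at least 0.1 and less than 1.0.

```python
from decimal import Decimal

def normalise(value: Decimal) -> tuple[Decimal, int]:
    """Return (mantissa, exponent) with 0.1 <= |mantissa| < 1.0."""
    if value == 0:
        return Decimal(0), 0
    # adjusted() is the power of ten of the most significant digit.
    exponent = value.adjusted() + 1
    mantissa = value.scaleb(-exponent).normalize()
    return mantissa, exponent

print(normalise(Decimal("6520000.0")))       # mantissa 0.652, exponent 7
print(normalise(Decimal("-0.00000000652")))  # mantissa -0.652, exponent -8
```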
Within computer systems the fractional component is held as a binary fraction and the
exponent is a power of 2 (or possibly 16). A typical system may store each floating point
number in 32 bits with 24 bits to hold the signed mantissa and 8 bits for the signed
exponent. In this case the accuracy of the mantissa is 23 binary bits (which is equivalent
to 6 or 7 decimal figures of accuracy), and the range of the exponent would be -128 to
+127. Greater accuracy can be obtained by using 64-bit storage in which 53 bits may be
used to store the signed mantissa (giving 15 to 17 decimal figures of accuracy) and 11
bits for the exponent.
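For a binary mantissa and a power-of-2 exponent, Python's standard math.frexp function gives a comparable split. Note that this is only an illustration of the idea: frexp keeps the mantissa in the range 0.5 to 1.0, and the exact bit layout used by real hardware differs from the simplified 24-bit plus 8-bit scheme described above. The wrapper name binary_normalise is an assumption.

```python
import math

def binary_normalise(x: float) -> tuple[float, int]:
    """Split x into (mantissa, exponent) with x == mantissa * 2**exponent."""
    # math.frexp returns a mantissa m with 0.5 <= |m| < 1 for non-zero x.
    return math.frexp(x)

mantissa, exponent = binary_normalise(6520000.0)
print(mantissa, exponent)
print(mantissa * 2 ** exponent)  # reconstructs the original value
```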
Floating point calculations can be carried out using floating point co-processor chips, or
emulated in software that uses the integer arithmetic operations of a computer. The
advantage of floating point hardware is that it can be several orders of magnitude faster
than software emulation, but it requires more complex and expensive hardware.
13 ASCII Character Code
Table 2 lists the ASCII character codes (American Standard Code for Information
Interchange), with the columns being the decimal value, the hexadecimal value, then
the corresponding character. ASCII is the most widely used character code for data
transmission between computers, terminals and printers. As with all information within
the computer system, characters are represented by binary patterns. In the ASCII code
each character is represented by a seven bit code that is stored one character per byte
(with bit 7 set to 0 or used as a parity check).
The characters below 32 decimal (20 hexadecimal) are non-printing control characters.
These are used to control the action of printers, display screens, communications
systems, etc. Important control characters are:
NUL  null: no action (used as a fill or delay character)
BEL  bell: rings the keyboard bell or buzzer
BS   backspace: move back one character width
HT   horizontal tabulate: move horizontally to next tabulate position
LF   line feed: move page vertically one character height
FF   form feed: new page on printer, clear display screen
CR   carriage return: move to start of current line
ESC  escape: used in many systems as a program control character
SP   space: move horizontal by one character width
For example to move a printer or a display screen to a new line position the characters
CR (carriage return) then LF (line feed) will be output. In addition some of the printable
characters will depend upon the printer font being used.
It is worth noting that the ASCII codes for the numeric characters 0 to 9, and alphabetic
characters A to Z and a to z, are arranged in ascending numerical order. This property
can be used for:
1. Testing if a character is within a range, e.g. in the range A to Z.
2. The conversion of numeric decimal data, entered at a keyboard, into internal
binary form.
Do not confuse the code for a numeric character with the equivalent numeric binary
value, e.g. the code for the character 1 is 31 hexadecimal (49 decimal). When a number
composed of several digits is read from a keyboard the character codes are read,
turned into the equivalent binary numeric value and then added to any previous total.
The following algorithm reads a decimal number from a keyboard (until a non-digit is
entered):
NUMBER=0
READ(character)
LOOP WHILE character is in the range '0' to '9'
    DIGIT_VALUE = character - '0'
    NUMBER = NUMBER*10 + DIGIT_VALUE
    READ(character)
END LOOP
In the majority of programming languages a character code value is specified by
enclosing it in quote marks. In the above algorithm characters are read from the
keyboard until a non-digit character is hit. If the character is a digit, say 7 was hit, the
ASCII code for 0 is subtracted from it to get the equivalent numeric value
DIGIT_VALUE, i.e. in this case 30 hexadecimal (the code for '0'), will be subtracted from
37 hexadecimal (the code for '7'), to give DIGIT_VALUE=7. The NUMBER entered so
far is then multiplied by ten and the current DIGIT_VALUE added.
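The algorithm can be sketched in Python as shown below. Reading from a string stands in for reading from the keyboard, which is an assumption made so the example is self-contained; the function names are illustrative. Python's built-in ord function returns the character code. The second function shows the range test that the ascending order of the codes makes possible.

```python
def read_number(text: str) -> int:
    """Accumulate leading decimal digits of text into a binary integer."""
    number = 0
    for character in text:
        if not ("0" <= character <= "9"):
            break  # stop at the first non-digit
        digit_value = ord(character) - ord("0")  # e.g. 0x37 - 0x30 = 7
        number = number * 10 + digit_value
    return number

def is_upper_case(character: str) -> bool:
    """Range test relying on A to Z having ascending codes."""
    return "A" <= character <= "Z"

print(read_number("1567 "))  # 1567
print(is_upper_case("Q"))    # True
```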
dec hex char   dec hex char   dec hex char   dec hex char
  0  00 NUL     32  20 SP      64  40 @       96  60 `
  1  01 SOH     33  21 !       65  41 A       97  61 a
  2  02 STX     34  22 "       66  42 B       98  62 b
  3  03 ETX     35  23 #       67  43 C       99  63 c
  4  04 EOT     36  24 $       68  44 D      100  64 d
  5  05 ENQ     37  25 %       69  45 E      101  65 e
  6  06 ACK     38  26 &       70  46 F      102  66 f
  7  07 BEL     39  27 '       71  47 G      103  67 g
  8  08 BS      40  28 (       72  48 H      104  68 h
  9  09 HT      41  29 )       73  49 I      105  69 i
 10  0A LF      42  2A *       74  4A J      106  6A j
 11  0B VT      43  2B +       75  4B K      107  6B k
 12  0C FF      44  2C ,       76  4C L      108  6C l
 13  0D CR      45  2D -       77  4D M      109  6D m
 14  0E SO      46  2E .       78  4E N      110  6E n
 15  0F SI      47  2F /       79  4F O      111  6F o
 16  10 DLE     48  30 0       80  50 P      112  70 p
 17  11 DC1     49  31 1       81  51 Q      113  71 q
 18  12 DC2     50  32 2       82  52 R      114  72 r
 19  13 DC3     51  33 3       83  53 S      115  73 s
 20  14 DC4     52  34 4       84  54 T      116  74 t
 21  15 NAK     53  35 5       85  55 U      117  75 u
 22  16 SYN     54  36 6       86  56 V      118  76 v
 23  17 ETB     55  37 7       87  57 W      119  77 w
 24  18 CAN     56  38 8       88  58 X      120  78 x
 25  19 EM      57  39 9       89  59 Y      121  79 y
 26  1A SUB     58  3A :       90  5A Z      122  7A z
 27  1B ESC     59  3B ;       91  5B [      123  7B {
 28  1C FS      60  3C <       92  5C \      124  7C |
 29  1D GS      61  3D =       93  5D ]      125  7D }
 30  1E RS      62  3E >       94  5E ^      126  7E ~
 31  1F US      63  3F ?       95  5F _      127  7F DEL
Table 2: The ASCII Character Codes: columns are decimal and hexadecimal numeric
character code value followed by the character
When character information is transmitted over a noisy communications channel a
parity bit can replace bit 7 (which is not used in the ASCII code) or be added to make
a total character length of 9 bits (for more details of parity checking see the Problem
for Chapter 12).
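As a small illustration of the first option, the Python sketch below places a parity bit in bit 7 of a 7-bit ASCII code. It assumes even parity (the total number of 1 bits in the byte is made even); the text does not say which convention is used, and the function names are hypothetical.

```python
def add_even_parity(code: int) -> int:
    """Place an even parity bit in bit 7 of a 7-bit ASCII code."""
    ones = bin(code & 0x7F).count("1")
    return (code & 0x7F) | ((ones % 2) << 7)

def parity_ok(byte: int) -> bool:
    """A received byte is accepted if it holds an even number of 1 bits."""
    return bin(byte).count("1") % 2 == 0

sent = add_even_parity(ord("C"))   # 'C' is 43 hexadecimal, three 1 bits
print(format(sent, "08b"))         # 11000011
print(parity_ok(sent))             # True
print(parity_ok(sent ^ 0x04))      # False: one bit corrupted in transit
```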