國立聯合大學電子工程學系蕭裕弘

Chapter 3
Numeral System and
Data Representation
Department of Electronic Engineering, National United University
蕭裕弘
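Chapter Goals
• Introduce different numeral systems
• Explain how to convert between numeral systems
• Introduce binary arithmetic
• Explain the difference between analog and digital signals
• Introduce the numeral systems and encodings commonly used in computer systems
• Introduce the data representations commonly used in computer systems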
國立聯合大學電子工程學系 – 計算機概論 – 蕭裕弘
Chapter 3: Page 2 / 63
1. Numeral Systems
 A numeral is a symbol or group of symbols that
represents a number.


0, 1, 2, 3, 4, 5, 6, 7, 8, 9
I, II, III, IV, V, VI, VII, VIII, IX, X, ...
 A numeral system (or system of numeration) is a
framework in which a set of numbers is represented by
numerals in a consistent manner.
 Number system:


A set of objects on which arithmetic operations can be
performed.
E.g.: the real numbers, the rational numbers
Types of Numeral Systems - 1
 The unary numeral system

Every natural number is represented by a corresponding
number of symbols.
 E.g.:
If the symbol $ is chosen, then the number seven would
be represented by $$$$$$$.

The unary notation can be abbreviated by introducing
different symbols for certain new values.
 E.g.:
if $ stands for one, % for ten and # for 100, then the
number 304 can be compactly represented as ###$$$$ and
number 123 as #%%$$$.
Types of Numeral Systems - 2
 The positional system:



A system in which each position has a value represented
by a unique symbol or character.
The value contributed by each position is the value of its
character multiplied by a power of the base of that
numeral system.
The position of each character or symbol (usually called
a digit) counting from the right determines the power of
the base that is to be multiplied by that digit.
0123456789
Decimal Numeral System
 Decimal is the base 10 numeral system:



The symbols 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9 are used
The decimal point
The sign symbols + (plus) and − (minus)
  Digit:     2  6  7  4
  Position:  3  2  1  0
$2674 = (2 \times 10^3) + (6 \times 10^2) + (7 \times 10^1) + (4 \times 10^0)$
$12.345 = (1 \times 10^1) + (2 \times 10^0) + (3 \times 10^{-1}) + (4 \times 10^{-2}) + (5 \times 10^{-3})$
K Base Numeral System
 The symbols 0, 1, 2, ..., K-1 are used.
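$N_K = (d_p d_{p-1} \cdots d_1 d_0 . d_{-1} d_{-2} \cdots d_{-(q-1)} d_{-q})_K$
$N_{10} = (d_p \times K^p) + (d_{p-1} \times K^{p-1}) + \cdots + (d_1 \times K^1) + (d_0 \times K^0) + (d_{-1} \times K^{-1}) + (d_{-2} \times K^{-2}) + \cdots + (d_{-(q-1)} \times K^{-(q-1)}) + (d_{-q} \times K^{-q})$
$N_{10} = \sum_{i=0}^{p} d_i K^i + \sum_{i=1}^{q} d_{-i} K^{-i}$
$d_p$: Most significant digit
$d_{-q}$: Least significant digit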
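The expansion above can be checked with a short Python sketch. The function name `positional_value` and its digit-list interface are illustrative assumptions, not part of the slides; exact fractions are used so that the negative powers introduce no rounding.

```python
from fractions import Fraction

def positional_value(int_digits, frac_digits, base):
    """Evaluate a base-K numeral given as lists of digit values.

    int_digits  : d_p ... d_0, most significant digit first
    frac_digits : d_-1 ... d_-q
    """
    total = Fraction(0)
    p = len(int_digits) - 1
    for i, d in enumerate(int_digits):
        total += d * Fraction(base) ** (p - i)   # d_i * K^i
    for i, d in enumerate(frac_digits, start=1):
        total += d * Fraction(base) ** (-i)      # d_-i * K^-i
    return total

print(float(positional_value([1, 2], [3, 4, 5], 10)))  # 12.345
print(float(positional_value([1, 0], [1, 1], 2)))      # 2.75
```

The same function works for any base K, as long as every digit value lies between 0 and K-1.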
Binary Numeral System
 The binary numeral system is a system for
representing numbers in which a radix of two is
used; that is, each digit in a binary numeral may
have either of two different values.
 Typically, the symbols 0 and 1 are used to
represent binary numbers.
 Owing to its relatively straightforward
implementation in electronic circuitry, the binary
system is used internally by virtually all modern
computers.
  Decimal   Binary
  0         0000
  1         0001
  2         0010
  3         0011
  4         0100
  5         0101
  6         0110
  7         0111
  8         1000
  9         1001
The Octal and Hexadecimal Numeral Systems
  Decimal   Binary   Octal   Hexadecimal
  0         0000     00      0
  1         0001     01      1
  2         0010     02      2
  3         0011     03      3
  4         0100     04      4
  5         0101     05      5
  6         0110     06      6
  7         0111     07      7
  8         1000     10      8
  9         1001     11      9
  10        1010     12      A
  11        1011     13      B
  12        1100     14      C
  13        1101     15      D
  14        1110     16      E
  15        1111     17      F
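2. Convert Binary to and from Decimal System
• Binary → Decimal
    $10110_2 = 1 \times 2^4 + 1 \times 2^2 + 1 \times 2^1 = 22_{10}$
    $10.11_2 = 1 \times 2^1 + 1 \times 2^{-1} + 1 \times 2^{-2} = 2.75_{10}$
• Decimal → Binary
    Integer part: divide by 2 repeatedly; the remainders, read from last to first, are the binary digits.
      22 ÷ 2 = 11  remainder 0
      11 ÷ 2 = 5   remainder 1
       5 ÷ 2 = 2   remainder 1
       2 ÷ 2 = 1   remainder 0
       1 ÷ 2 = 0   remainder 1
      $22_{10} = 10110_2$
    Fraction part: multiply by 2 repeatedly; the integer parts, read from first to last, are the binary digits.
      0.75 × 2 = 1.50  → 1
      0.50 × 2 = 1.00  → 1
      $0.75_{10} = 0.11_2$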
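The repeated-division and repeated-multiplication procedures can be sketched in Python as follows. The function names and the `base` parameter are assumptions made for illustration; the fraction is passed as a numerator and denominator so that the arithmetic stays exact.

```python
DIGITS = "0123456789ABCDEF"

def int_to_base(n, base):
    """Repeated division: the remainders, read last to first, are the digits."""
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, r = divmod(n, base)
        out.append(DIGITS[r])
    return "".join(reversed(out))

def frac_to_base(numerator, denominator, base, max_digits=12):
    """Repeated multiplication: the integer parts, read first to last, are the digits.

    Stops after max_digits because many fractions never terminate.
    """
    out = []
    while numerator and len(out) < max_digits:
        numerator *= base
        digit, numerator = divmod(numerator, denominator)
        out.append(DIGITS[digit])
    return "".join(out)

print(int_to_base(22, 2))      # 10110
print(frac_to_base(3, 4, 2))   # 11, i.e. 0.75 = 0.11 in binary
```

Passing 8 or 16 as `base` applies the same procedure to octal and hexadecimal.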
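Convert Octal to and from Decimal System
• Octal → Decimal
    $723_8 = 7 \times 8^2 + 2 \times 8^1 + 3 \times 8^0 = 467_{10}$
    $7.23_8 = 7 \times 8^0 + 2 \times 8^{-1} + 3 \times 8^{-2} = 7.296875_{10}$
• Decimal → Octal
    Integer part:
      467 ÷ 8 = 58  remainder 3
       58 ÷ 8 = 7   remainder 2
        7 ÷ 8 = 0   remainder 7
      $467_{10} = 723_8$
    Fraction part:
      0.3125 × 8 = 2.5000  → 2
      0.5000 × 8 = 4.0000  → 4
      $0.3125_{10} = 0.24_8$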
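Convert Hexadecimal to and from Decimal System
• Hexadecimal → Decimal
    $\mathrm{AB}_{16} = \mathrm{A} \times 16^1 + \mathrm{B} \times 16^0 = 171_{10}$
    $\mathrm{A.8}_{16} = \mathrm{A} \times 16^0 + 8 \times 16^{-1} = 10.5_{10}$
• Decimal → Hexadecimal
      171 ÷ 16 = 10  remainder 11 (B)
       10 ÷ 16 = 0   remainder 10 (A)
      $171_{10} = \mathrm{AB}_{16}$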
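Conversion among Base 2, 8, 16
• Octal → Binary (each octal digit becomes 3 bits)
    $5762.13_8 = 101\ 111\ 110\ 010.001\ 011_2$
• Binary → Octal (group the bits in threes outward from the binary point, padding with zeros)
    $11\ 010\ 111.101\ 1_2 = 011\ 010\ 111.101\ 100_2 = 327.54_8$
• Hexadecimal → Binary (each hexadecimal digit becomes 4 bits)
    $\mathrm{E8C4.B}_{16} = 1110\ 1000\ 1100\ 0100.1011_2$
• Binary → Hexadecimal (group the bits in fours outward from the binary point)
    $10\ 1101\ 0111\ 1010.1110\ 01_2 = \mathrm{2D7A.E4}_{16}$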
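The grouping rule can be sketched in Python as below. The function name `regroup_binary` and its string interface are hypothetical, chosen only for this illustration.

```python
def regroup_binary(bits, group):
    """Convert a binary string such as '11010111.1011' to base 2**group.

    group=3 gives octal and group=4 gives hexadecimal.
    """
    int_part, _, frac_part = bits.partition(".")
    # Pad the integer part on the left and the fraction part on the right
    # so that both lengths are multiples of the group size.
    int_part = int_part.zfill(-(-len(int_part) // group) * group)
    frac_part = frac_part.ljust(-(-len(frac_part) // group) * group, "0")

    def digits(s):
        return "".join("0123456789ABCDEF"[int(s[i:i + group], 2)]
                       for i in range(0, len(s), group))

    result = digits(int_part)
    return result + "." + digits(frac_part) if frac_part else result

print(regroup_binary("11010111.1011", 3))          # 327.54
print(regroup_binary("10110101111010.111001", 4))  # 2D7A.E4
```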
3. Binary Arithmetic - Addition
  0 + 0 = 0
  0 + 1 = 1
  1 + 0 = 1
  1 + 1 = 10 (the 1 is carried)

      1 1 1 1          (carry)
      0 1 1 0 1        13
    + 1 0 1 1 1        23
    -----------
    1 0 0 1 0 0        36

      1   1            (carry)
      1 . 0 1          1.25
    + 0 . 1 1          0.75
    ---------
    1 0 . 0 0          2.00
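The column-by-column procedure can be sketched in Python as follows. The function `add_binary` is an illustrative helper that handles unsigned bit strings without a binary point; a fractional example can be handled by aligning the points and removing them first.

```python
def add_binary(a, b):
    """Add two unsigned binary strings column by column, right to left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    out = []
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        out.append(str(total % 2))   # sum bit for this column
        carry = total // 2           # carry into the next column
    if carry:
        out.append("1")              # final carry becomes a new leading digit
    return "".join(reversed(out))

print(add_binary("01101", "10111"))  # 100100
```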
Binary Arithmetic - Subtraction
  0 - 0 = 0
  0 - 1 = 1 (with borrow)
  1 - 0 = 1
  1 - 1 = 0
  (A borrow is taken from the next higher column, as in decimal subtraction.)

      1 1 0 1 1 1 0    110
    -     1 0 1 1 1     23
    ---------------
      1 0 1 0 1 1 1     87

      1 . 1 0 1        1.625
    - 0 . 0 1 1        0.375
    -----------
      1 . 0 1 0        1.250
Binary Arithmetic - Multiplication
  0 * 0 = 0
  0 * 1 = 0
  1 * 0 = 0
  1 * 1 = 1

        1 0 1 0        10
    *       1 0         2
    -----------
        0 0 0 0
      1 0 1 0
    -----------
      1 0 1 0 0        20

        1.0 1          1.25
    *     1 0          2
    ---------
        0 0 0
      1 0 1
    ---------
      1 0.1 0          2.50
Binary Arithmetic - Division
  11101001 (233) ÷ 1001 (9) = 11001 (25), remainder 1000 (8)

               1 1 0 0 1      quotient (25)
  1001 ) 1 1 1 0 1 0 0 1      dividend (233)
         1 0 0 1
           1 0 1 1
           1 0 0 1
             0 1 0 0
               1 0 0 0
               1 0 0 0 1
                 1 0 0 1
                 1 0 0 0      remainder (8)
4. Analog and Digital Information
 Analog signal

A signal that has a continuous nature rather than a pulsed or discrete
nature.
 Digital signal

A signal in which discrete steps are used to represent information.
Advantages and Disadvantages of Digitization
 The advantages of digitization

reliable high-speed signal transmission

quality duplication

easy manipulation and processing
 The primary disadvantage of digital
signals is their large size resulting in
high-storage requirements.
Analog-to-Digital Conversion
 The continuous signal
is usually sampled at
regular intervals by an
analog to digital
converter (ADC) and
the value of the
continuous signal in
that interval is
represented by a
discrete value.
 Sampling
Why Do We Use Binary?
 Modern computers are designed to use and manage
binary values because the devices that store and
manage the data are far less expensive and far more
reliable if they only have to represent one of two
possible values.
[Figure: voltage (V) plotted against time (T) for the two states, On (1) and Off (0)]
Data and Computer
• Computers are multimedia devices that deal with many
categories of information:





Numbers
Text
Images and graphics
Audio
Video
5. Representing Integer Data
 In computer science, the term integer is used to refer to any data type which
can represent some subset of the mathematical integers.
 The most common representation of a positive
integer is a string of bits, using the
binary numeral system.
 Four different ways to represent negative
numbers in a binary numeral system:

Signed-magnitude

One’s complement

Two’s complement

Excess N
xxxx xxxx (x: 0 or 1)
can represent the values 0 to $2^8 - 1 = 255$
Signed-Magnitude Representation
• In mathematics, a negative number in any base is
written in the usual way, by prefixing it with a "-" sign.
On a computer, however, there is no single way of
representing a number's sign.
 One may first approach this problem of
representing a number's sign by allocating one
bit to represent the sign:

Set that bit (often the most significant bit) to 0
for a positive number.

Set that bit to 1 for a negative number.

The remaining bits in the number indicate the
(positive) magnitude.
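  Sign bit followed by magnitude (8 bits):
    0111 1111   +127
    0000 0000   +0
    1000 0000   -0
    1111 1111   -127
  Range for $N$ bits: $-2^{N-1} + 1$ to $2^{N-1} - 1$
  Disadvantages:
    1. There are two zeros, +0 and -0.
    2. $X - Y \neq X + (-Y)$ when the bit patterns are added directly, so subtraction needs separate handling.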
One's Complement Representation
 The 1's complement representation in binary of a positive integer is no
different from the sign-magnitude representation of that integer.
 The 1's complement in binary of a negative integer is obtained by
subtracting its magnitude from $2^n - 1$, where $n$ is the number of bits used to
store the integer in binary.
* Convert -36 in a byte to 1's complement form
  Step 1: Convert the magnitude of the integer to binary
      $+36_{10} = 0010\ 0100_2$
  Step 2: Subtract it from $1111\ 1111_2$ ($2^8 - 1$)
        1111 1111
      - 0010 0100
      -----------
        1101 1011

    0111 1111   +127
    0000 0000   +0
    1111 1111   -0
    1000 0000   -127
  Range for $N$ bits: $-2^{N-1} + 1$ to $2^{N-1} - 1$
Two’s Complement Representation - 1
 With two's complement notation, all
integers are represented using a fixed
number of bits with the leftmost bit
given a negative weight.

E.g.:
    $1001\ 0010_2 = -1 \times 2^7 + 1 \times 2^4 + 1 \times 2^1 = -128 + 16 + 2 = -110_{10}$
    $1000\ 0000_2 = -1 \times 2^7 = -128_{10}$
    $1111\ 1111_2 = -1_{10}$
  Range for $N$ bits: $-2^{N-1}$ to $2^{N-1} - 1$

    0111 1111   +127
    0111 1110   +126
    ...
    0000 0010   +2
    0000 0001   +1
    0000 0000   +0
    1000 0000   -128
    1000 0001   -127
    ...
    1111 1110   -2
    1111 1111   -1
Advantages of Two's Complement Representation
• It is easy to negate any integer: simply complement each bit and add 1 to the result.
• The leftmost bit tells you whether the integer is non-negative (0) or negative (1).
• The normal rules for adding (unsigned) binary integers still work (throw away any bit carried out of the leftmost position).
• Only an adder circuit is needed to perform both addition and subtraction.

* Convert -36 in a byte to 2's complement form
  Step 1: Convert the magnitude of the integer to binary
      $+36_{10} = 0010\ 0100_2$
  Step 2: Complement each bit
      0010 0100 => 1101 1011
  Step 3: Add 1 to the result
        1101 1011
      +         1
      -----------
        1101 1100
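The negation rule and the negative weight of the leftmost bit can be sketched in Python as follows. The three function names are assumptions made for this illustration, and bit patterns are handled as strings for readability.

```python
def to_twos_complement(value, bits=8):
    """Return the two's complement bit pattern of a signed integer."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given width")
    return format(value & ((1 << bits) - 1), "0{}b".format(bits))

def from_twos_complement(pattern):
    """Decode a bit string, giving the leftmost bit a negative weight."""
    unsigned = int(pattern, 2)
    return unsigned - (1 << len(pattern)) if pattern[0] == "1" else unsigned

def negate(pattern):
    """Complement each bit, add 1, and discard any carry out."""
    mask = (1 << len(pattern)) - 1
    flipped = int(pattern, 2) ^ mask
    return format((flipped + 1) & mask, "0{}b".format(len(pattern)))

print(to_twos_complement(-36))           # 11011100
print(negate("00100100"))                # 11011100
print(from_twos_complement("10010010"))  # -110
```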
Excess-N Representation
• This representation is primarily used for the exponent field of floating-point numbers.
• It uses a fixed number N as a bias: a value v is stored as the unsigned binary number v + N, so the value 0 is represented by the binary encoding of N.
• For example, the Excess-3 representation for 3 bits is:

  Unsigned   Binary   Actual value
  0          000      -3
  1          001      -2
  2          010      -1
  3          011       0
  4          100       1
  5          101       2
  6          110       3
  7          111       4
Comparison of Different Representations
  Decimal   Sign-M   1's    2's        Decimal   Sign-M   1's    2's
  +8        --       --     --         -8        --       --     1000
  +7        0111     0111   0111       -7        1111     1000   1001
  +6        0110     0110   0110       -6        1110     1001   1010
  +5        0101     0101   0101       -5        1101     1010   1011
  +4        0100     0100   0100       -4        1100     1011   1100
  +3        0011     0011   0011       -3        1011     1100   1101
  +2        0010     0010   0010       -2        1010     1101   1110
  +1        0001     0001   0001       -1        1001     1110   1111
  +0        0000     0000   0000       -0        1000     1111   0000
Calculating Two's Complement
• Addition: 5 + (-5)
     +5 => 0000 0101
     -5 => 1111 1010 + 1 => 1111 1011

        0000 0101   (+5)
      + 1111 1011   (-5)
      -----------
      1 0000 0000   (0; the leading carry is discarded)
• Subtraction: 35 - 15 = 35 + (-15)
    +35 => 0010 0011
    +15 => 0000 1111
    -15 => 1111 0000 + 1 => 1111 0001

        0010 0011   (+35)
      + 1111 0001   (-15)
      -----------
      1 0001 0100   (20; the leading carry is discarded)
  X - Y = X + (-Y)
Common Integral Data Types
  Bits   Name                           Range                                              Uses
  8      byte, octet                    Signed: -128 ($-2^7$) to +127 ($2^7 - 1$)          C: char
                                        Unsigned: 0 to +255                                Java: byte
  16     word                           Signed: -32,768 to +32,767                         C: short int
                                        Unsigned: 0 to +65,535 ($2^{16} - 1$)              Java: short
  32     word, double word, long word   Signed: $-2^{31}$ to $2^{31} - 1$                  C: long int
                                        Unsigned: 0 to $2^{32} - 1$                        Java: int
  64     long word, quadword            Signed: $-2^{63}$ to $2^{63} - 1$                  C99: long long int
                                        Unsigned: 0 to $2^{64} - 1$                        Java: long
  Note: the sizes of the C types are implementation-defined; the C entries show typical widths.
Arithmetic Overflow
 In a digital computer, the condition that occurs when a
calculation produces a result that is greater than a
given register or storage location can store or
represent.

E.g.: In 8-bit 2's complement representation

      0111 1111   (+127)
    + 0000 0001   (+1)
    -----------
      1000 0000   (-128)

      1000 0011   (-125)
    + 1000 0001   (-127)
    -----------
    1 0000 0100   (+4; the leading carry is discarded)

  Positive + Positive → Negative
  Negative + Negative → Positive
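Overflow of this kind can be detected by comparing signs, as in the following Python sketch. The function `add_8bit` is a hypothetical helper that imitates an 8-bit adder; Python integers themselves do not overflow.

```python
def add_8bit(a, b):
    """Add two signed 8-bit integers the way an 8-bit adder would.

    Returns (result, overflow), where result is the wrapped signed value.
    """
    raw = (a + b) & 0xFF                         # keep 8 bits, discard the carry out
    result = raw - 256 if raw & 0x80 else raw    # reinterpret as two's complement
    # Overflow: both operands share a sign that the result does not have.
    overflow = (a < 0) == (b < 0) and (result < 0) != (a < 0)
    return result, overflow

print(add_8bit(127, 1))      # (-128, True)
print(add_8bit(-125, -127))  # (4, True)
print(add_8bit(35, -15))     # (20, False)
```

Adding operands of opposite sign can never overflow, which is why only the same-sign case is tested.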
6. Other Numeral Systems - 1
• Binary coded decimal (BCD)
• 2421

  Digit   BCD    2421
  0       0000   0000
  1       0001   0001
  2       0010   0010
  3       0011   0011
  4       0100   0100
  5       0101   1011
  6       0110   1100
  7       0111   1101
  8       1000   1110
  9       1001   1111
Other Numeral Systems - 2
• 84-2-1
• Biquinary code (weights 5043210)

  Digit   84-2-1   5043210
  0       0000     0100001
  1       0111     0100010
  2       0110     0100100
  3       0101     0101000
  4       0100     0110000
  5       1011     1000001
  6       1010     1000010
  7       1001     1000100
  8       1000     1001000
  9       1111     1010000
Other Numeral Systems - 3
 Gray code


A code assigning to each of a contiguous set of integers, or to each member
of a circular list, a word of symbols such that each two adjacent code words
differ by one symbol.
There can be more than one Gray code for a given word length, but the term
was first applied to a particular binary code for the non-negative integers,
the binary-reflected Gray code or BRGC.
$G_1 = \{0, 1\}$
$G_2 = \{00, 01, 11, 10\}$, representing 0, 1, 2, 3
$G_{n+1} = \{0\,G_n,\ 1\,G_n^{\mathrm{ref}}\}$, $G_1 = \{0, 1\}$, $n \ge 1$
$G_3$:
    0 00 = 0        1 10 = 4
    0 01 = 1        1 11 = 5
    0 11 = 2        1 01 = 6
    0 10 = 3        1 00 = 7
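The recursive definition can be sketched in Python as follows, reading the "ref" superscript as the previous code listed in reverse order. The function name `gray_code` is an assumption made for this illustration.

```python
def gray_code(n):
    """Return the n-bit binary-reflected Gray code as a list of bit strings."""
    codes = ["0", "1"]                      # G1
    for _ in range(n - 1):
        # G(n+1) = 0 + Gn, followed by 1 + Gn in reverse order
        codes = ["0" + c for c in codes] + ["1" + c for c in reversed(codes)]
    return codes

print(gray_code(3))
# ['000', '001', '011', '010', '110', '111', '101', '100']
```

Each code word differs from its neighbour in exactly one bit, including the last word and the first.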
7. Floating-Point Representations
 A floating-point number is a digital representation for a number in a certain
subset of the rational numbers, and is often used to approximate an arbitrary
real number on a computer.


In particular, it represents an integer or fixed-point number (the significand
or, informally, the mantissa) multiplied by a base (usually 2 in computers)
to some integer power (the exponent).
When the base is 2, it is the binary analog of scientific notation (in base 10).
 A floating-point number a can be represented by two numbers m and e, such
that $a = m \times b^e$.



m is a p digit number of the form ±d.ddd...ddd (each digit being an integer
between 0 and b−1 inclusive).
If the leading digit of m is non-zero, then the number is said to be
normalized.
Some descriptions use a separate sign bit (s, which represents −1 or +1) and
require m to be positive.
IEEE Floating-Point Standard (IEEE 754)
 The IEEE floating-point standard (IEEE 754) is an IEEE standard, used by many
CPUs and FPUs, which





defines formats for representing floating-point numbers;
representations of special values (i.e., zero, infinity, very small values (denormal
numbers), and bit combinations that don't represent a number (NaN));
five exceptions, when they occur, and what happens when they do occur;
four rounding modes;
a set of floating-point operations that will work identically on any conforming
system.
 IEEE 754 specifies four formats for representing floating-point values:





single-precision (32-bit)
double-precision (64-bit)
single-extended precision (>= 43-bit, not commonly used)
double-extended precision (>= 79-bit, usually implemented with 80 bits).
Only 32-bit values are required by the standard, the others are optional.
IEEE 754 – Single-Precision
 A binary floating-point number is stored in a 32 bit word.
  Field:   S    Exponent (e)   Mantissa or fraction (f)
  Width:   1    8              23
  Bits:    31   30-23          22-0

Value = $s \times m \times 2^{e-127}$
$s = 1$ if $S = 0$; $s = -1$ if $S = 1$. $m = 1.f$.
• The set of possible data values can be divided into the following classes:
    Zeroes: exponent 0, fraction 0
    Normalised numbers: exponent 1-254 (bias 127), fraction any
    Denormalised numbers: exponent 0, fraction non-zero
    Infinities: exponent 255, fraction 0
    NaN (Not a Number): exponent 255, fraction non-zero
IEEE 754 – Single-Precision - Examples
• $10.5_{10} = 1010.1_2 = 1.0101_2 \times 2^3$
    S:         0                             (+)
    Exponent:  1000 0010                     (130; 3 = 130 - 127)
    Mantissa:  0101 0000 0000 0000 0000 000  (m = 1.0101)
• $-0.5_{10} = -0.1_2 = -1.0_2 \times 2^{-1}$
    S:         1                             (-)
    Exponent:  0111 1110                     (126; -1 = 126 - 127)
    Mantissa:  0000 0000 0000 0000 0000 000  (m = 1.0000)
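These bit patterns can be checked with Python's standard `struct` module, which packs a number into the IEEE 754 single-precision format. The helper name `float32_fields` is an assumption made for this sketch.

```python
import struct

def float32_fields(x):
    """Split a number into its IEEE 754 single-precision bit fields."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))  # 32 bits as an unsigned integer
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]                 # sign, exponent, fraction

print(float32_fields(10.5))  # ('0', '10000010', '01010000000000000000000')
print(float32_fields(-0.5))  # ('1', '01111110', '00000000000000000000000')
```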
IEEE 754 – Double-Precision
  Field:   S    Exponent (e)   Mantissa or fraction (f)
  Width:   1    11             52
  Bits:    63   62-52          51-0

Value = $s \times m \times 2^{e-1023}$
$s = 1$ if $S = 0$; $s = -1$ if $S = 1$. $m = 1.f$.
Problems with Floating-Point
 Floating-point numbers usually behave very similarly to the real numbers they are
used to approximate. However, this can easily lead programmers into overconfidently ignoring the need for numerical analysis.
 Errors in floating-point computation can include:

Rounding

Non-representable numbers: for example, the literal 0.1 cannot be represented
exactly by a binary floating-point number

Rounding of arithmetic operations: for example 2/3 might yield 0.6666667

Absorption: $1 \times 10^{15} + 1 = 1 \times 10^{15}$ (for example, in single precision)

Cancellation: subtraction between nearly equivalent operands

Overflow, which usually yields an infinity

Underflow

Invalid operations (such as an attempt to calculate the square root of a non-zero
negative number). Invalid operations yield a result of NaN (not a number).
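Several of these effects can be observed directly in Python, as sketched below. Python's `float` is double precision, so absorption only appears at a larger magnitude ($10^{16}$ here) than it would in single precision; the variable names are illustrative.

```python
import math

inexact = (0.1 + 0.2 == 0.3)    # False: 0.1, 0.2 and 0.3 are not exactly representable
absorbed = (1e16 + 1 == 1e16)   # True: the added 1 is lost (absorption)
overflowed = 1e308 * 10         # inf: the result exceeds the largest double
invalid = math.inf - math.inf   # nan: an invalid operation

print(inexact, absorbed, overflowed, invalid)  # False True inf nan
```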
8. The Hierarchy of Data Organization
  Level         Example
  Bit           0 or 1
  Character     A (ASCII = 65)
  Data field    John
  Data record   John, 20, Male
  File          John, 20, Male; Mary, 21, Female
  Database      File1, file2, …
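Bits and Bytes
• Bit
    In a digital computer system, all data is composed of bits. Each bit has the value 0 or 1.
    "Bit" is short for "binary digit".
• Byte or character
    Because a single bit can represent only 0 or 1, 8 bits are grouped into one byte to make data easier for people to remember and communicate, and the byte serves as the basic unit of data processing.
    So that combinations of bits can represent data that people understand, various encoding systems have been designed to map bit patterns to characters.
    Most common character encodings use a whole number of bytes, such as one byte ($2^8 = 256$ values) or two bytes ($2^{16} = 65536$ values).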
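From Data Field to Database
• Data field
    The smallest logical unit that gives data a meaning, composed of several characters or bytes.
    Examples: a name field, an age field, a gender field.
• Data record
    A unit of data composed of several related data fields that together describe one event or item.
    Example: a student record composed of name, age, and gender fields.
• File
    A unit of data composed of several related data records.
    Example: a class file composed of the records of the students in the same class.
• Database
    A unit of data composed of related files. Logical relationships between the files are established by various techniques.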
Common Units for Storage Devices
• Byte: 8 bits
• KB: Kilobyte = $2^{10}$ bytes = 1024 bytes (KiB)
• MB: Megabyte = $2^{20}$ bytes = 1,048,576 bytes (MiB)
• GB: Gigabyte = $2^{30}$ bytes = 1,073,741,824 bytes (GiB)
• TB: Terabyte = $2^{40}$ bytes = 1,099,511,627,776 bytes (TiB)
• PB: Petabyte = $2^{50}$ bytes = 1,125,899,906,842,624 bytes (PiB)
• EB: Exabyte = $2^{60}$ bytes = 1,152,921,504,606,846,976 bytes (EiB)
Byte < KB < MB < GB < TB < PB < EB
Word: A group of one or more bytes.
9. Representing Text
 To represent a text document in digital form, we need to be
able to represent every possible character that may appear.
• There is a finite number of characters to represent, so the
general approach is to list them all and assign each a binary
string.
 A character set is a particular mapping between characters and
binary strings.
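Common Character Encoding Systems
• ASCII (American Standard Code for Information Interchange)
    Code length: 8 bits (originally 7 bits)
    Contents: the English letters, Arabic numerals, and symbols found on a keyboard, plus some control characters.
• Big-5
    Code length: 16 bits
    Contents: commonly used Chinese characters and symbols.
• Unicode
    Code length: 16 bits in the original design
    Contents: the written characters and symbols in common use around the world.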
ASCII Examples
 Control characters (0 ~ 31, 127)
 Printable characters (32 ~ 126)
  ASCII values are shown in decimal, with hexadecimal in parentheses.

  ASCII  Char   ASCII     Char     ASCII     Char   ASCII     Char   ASCII      Char
  0      NUL    32 (20)   (space)  48 (30)   0      65 (41)   A      97 (61)    a
  1      SOH    33 (21)   !        49 (31)   1      66 (42)   B      98 (62)    b
  2      STX    34 (22)   "        50 (32)   2      67 (43)   C      99 (63)    c
  3      ETX    35 (23)   #        51 (33)   3      68 (44)   D      100 (64)   d
  4      EOT    36 (24)   $        52 (34)   4      69 (45)   E      101 (65)   e
  5      ENQ    37 (25)   %        53 (35)   5      70 (46)   F      102 (66)   f
  6      ACK    38 (26)   &        54 (36)   6      71 (47)   G      103 (67)   g
  7      BEL    39 (27)   '        55 (37)   7      72 (48)   H      104 (68)   h
  8      BS     40 (28)   (        56 (38)   8      73 (49)   I      105 (69)   i
  9      HT     41 (29)   )        57 (39)   9      74 (4A)   J      106 (6A)   j
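Big-5 Code
• Big-5 code
    An encoding system for processing Traditional Chinese characters on computers.
    Uses 16 bits: a high byte followed by a low byte.

  Byte        Code range (hex)
  High byte   81 ~ FE
  Low byte    40 ~ 7E, A1 ~ FE

  Category                          Count   Code range
  Frequently used characters        5401    0xA4 0x40 ~ 0xC6 0x7E
  Less frequently used characters   7652    0xC9 0x40 ~ 0xF9 0xD5
  Special symbols                   408     0xA1 0x40 ~ 0xA3 0xBF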
Unicode
 The Unicode Standard is the universal character
encoding standard used for representation of text for
computer processing.
 The original goal was to use a single 16-bit encoding
that provides code points for more than 65,000
characters.
 The Unicode Standard defines codes for characters
used in all the major languages written today.
10. Representing Images and Graphics
• Image and graphics data consist of still pictures.
• Methods for storing graphics data:

Bitmap
• Bitmap graphics form images as a map
of hundreds of thousands of dots, called
pixels.
 The number of pixels used to represent a
picture is called the resolution.
Vector
 Use of geometrical primitives such as
points, lines, curves, and polygons to
represent images in computer graphics.
Representing Monochrome Graphics
 A monochrome graphic is the simplest type of bitmap.



It differentiates between only a foreground color and a
background color.
Suppose that these colors are black (0) and white (1).
Each pixel then needs one bit.
[Figure: a 5 × 7 monochrome bitmap, one bit per pixel]
    0 0 1 0 0
    0 1 0 1 0
    1 0 0 0 1
    1 0 0 0 1
    1 1 1 1 1
    1 0 0 0 1
    1 0 0 0 1
Representing Grayscale Graphics
 In grayscale images, each pixel can be not only pure
black or pure white but also any of the 254 shades of
gray in between.
 One byte per pixel.
Representing Color Graphics - 1
 Color is a sensation caused by light as it interacts with
the eye, brain, and our experience.
 Media that transmit light (such as television) use
additive color mixing with primary colors of red,
green, and blue, each of which stimulates one of the
three types of the eye's color receptors with as little
stimulation as possible of the other two.
Representing Color Graphics - 2
 Color depth: The amount of data used to represent
a color.

1-bit color ($2^1 = 2$ colors): black or white.

8-bit color uses 8 bits per pixel, giving 256 colors.
This is called a limited color palette.

16-bit color uses 16 bits per pixel, for a
total of 65,536 colors.

24-bit and 32-bit color allow 16,777,216 and
4,294,967,296 colors, respectively. This type of color
is called true color, since it can approximate
many colors found in the real world.

In 24-bit color, each of the three components of an RGB
value gets 8 bits.
Representing Color Graphics - 3
[Figure: the same image shown with 16,777,216 colors, 256 colors, and 16 colors]
11. Data Compression
• Storing a true color image of 640 * 480 pixels requires
    640 * 480 * 3 = 921,600 bytes = 900 KB
• Storing full-color video with a resolution of 352 * 240 per frame, played at
30 frames per second, requires for each second of video
    30 * 352 * 240 * 3 = 7,603,200 bytes ≈ 7.25 MB
 In computer science, data compression is the process of
encoding data so that it takes less storage space or less
transmission time than it would if it were not compressed.
Types of Data Compression Algorithms
 Lossless data compression

The original data can be reconstructed exactly from the compressed data.

Lossless data compression is used in software compression tools such as the
highly popular Zip format, used by PKZIP and WinZip.
 Lossy data compression

One where compressing a file and then decompressing it retrieves a file that
may well be different to the original, but is "close enough" to be useful in
some way.

Lossy methods are most often used for compressing sound or images.

The advantage of lossy methods over lossless methods:

In some cases a lossy method can produce a much smaller compressed file
than any known lossless method, while still meeting the requirements of the
application.
Data Compression Algorithm Examples
 Run-length encoding

A very simple form of data compression in which runs of data are
stored as a single data value and count, rather than as the original
run.
WWWWWBBBBWWWWWBWWWWWWWWWWWWBBB
→ W5B4W5B1W12B3
 Huffman coding

An entropy encoding algorithm used for data compression that finds
the optimal system of encoding strings based on the relative
frequency of each character.
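The run-length encoding example above can be reproduced with a short Python sketch. The function names are assumptions, and the output format (symbol followed by run length) simply follows the example; the decoder assumes the symbols themselves are not digits.

```python
from itertools import groupby

def rle_encode(text):
    """Encode each run as its symbol followed by the run length."""
    return "".join("{}{}".format(symbol, len(list(run)))
                   for symbol, run in groupby(text))

def rle_decode(encoded):
    """Invert rle_encode; assumes symbols are non-digit characters."""
    out = []
    i = 0
    while i < len(encoded):
        symbol = encoded[i]
        i += 1
        start = i
        while i < len(encoded) and encoded[i].isdigit():
            i += 1                      # consume the multi-digit run length
        out.append(symbol * int(encoded[start:i]))
    return "".join(out)

sample = "W" * 5 + "B" * 4 + "W" * 5 + "B" + "W" * 12 + "B" * 3
print(rle_encode(sample))  # W5B4W5B1W12B3
```

Because the original can be rebuilt exactly from the encoded form, run-length encoding is a lossless method.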
Common Image File Formats
• .bmp (Windows Bitmap)
    Commonly used by Microsoft Windows programs, and the Windows operating system itself. Lossless compression can be specified, but some programs use only uncompressed files.
• .gif (Graphics Interchange Format)
    Used extensively on the web, but sometimes avoided due to patent issues. Supports animated images. Supports at most 256 colors per frame, so requires lossy quantization for full-color photos; using multiple frames can improve color precision. Uses lossless, patented LZW compression.
• .jpg, .jpeg (Joint Photographic Experts Group)
    Used extensively for photos on the web. Uses lossy compression; the quality can vary greatly depending on the compression settings.
• .png (Portable Network Graphics)
    Lossless compressed bitmap image format, originally designed to replace the use of GIF on the web.
12. Representing Audio Data
• Computers often process audio data after it has been digitally encoded by a
method called waveform audio.
• Audio CD:

Sampling rate: 44,100 samples per second

Number of bits per sample: 16
• Common audio file formats

WAV: Microsoft and IBM audio file format standard for storing audio on
PCs.

MP3 (MPEG-1 Audio Layer III): it is lossy.

WMA: a proprietary compressed audio file format used by Microsoft.

Quicktime: a digital video technology developed and produced by Apple
Computer.

RealAudio: an audio codec developed by RealNetworks.
Representing Video Data
 Video is basically a three-dimensional array of color pixels

Two dimensions serve as spatial (horizontal and vertical) directions of the (moving)
pictures.

One dimension represents the time domain.
 A frame is a set of all pixels that (approximately) correspond to a single point in
time. Basically, a frame is the same as a (still) picture.
 However, video data contains spatial and temporal redundancy. Video compression
typically reduces this redundancy using lossy compression. Usually this is achieved
by image compression techniques to reduce spatial redundancy from frames and
motion compensation techniques to reduce temporal redundancy.
 The Moving Picture Experts Group (MPEG) is a small group charged with the
development of video and audio encoding standards.
Common Video Formats

  Format   Resolution (NTSC/PAL)   Video compression          Size/min   Quality
  VCD      352*240 / 352*288       MPEG1                      10 MB      Good
  SVCD     480*480 / 480*576       MPEG2                      10-20 MB   Great
  DVD      720*480 / 720*576       MPEG1/2                    30-70 MB   Excellent
  QT       640*480                 Sorenson, Cinepak, MPEG4   4-20 MB    Great
  RM       320*240                 RM                         2-5 MB     Decent
  DV       720*480 / 720*576       DV                         216 MB     Excellent