Chapter 3 Numeral System and Data Representation
National United University (國立聯合大學), Department of Electronic Engineering, 蕭裕弘

Chapter Goals
- Introduce different numeral systems
- Explain how to convert between numeral systems
- Introduce binary arithmetic
- Explain the difference between analog and digital signals
- Introduce the numeral systems and encoding schemes commonly used in computer systems
- Introduce the data representations commonly used in computer systems

National United University, Department of Electronic Engineering – Introduction to Computers – 蕭裕弘

1. Numeral Systems
- A numeral is a symbol or group of symbols that represents a number.
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    I, II, III, IV, V, VI, VII, VIII, IX, X, ...
- A numeral system (or system of numeration) is a framework in which a set of numbers is represented by numerals in a consistent manner.
- Number system: a set of objects on which arithmetic operations can be performed, e.g., the real numbers or the rational numbers.

Types of Numeral Systems - 1
The unary numeral system
- Every natural number is represented by a corresponding number of symbols.
    E.g.: if the symbol $ is chosen, the number seven is represented by $$$$$$$.
- The unary notation can be abbreviated by introducing different symbols for certain values.
    E.g.: if $ stands for one, % for ten, and # for 100, then the number 304 can be written compactly as ###$$$$ and the number 123 as #%%$$$.

Types of Numeral Systems - 2
The positional system
- A system in which the value contributed by each symbol depends on its position.
- The symbol in each position (usually called a digit) contributes its own value multiplied by a power of the base of the numeral system.
- The position of a digit, counted from the right starting at 0, determines the power of the base by which that digit is multiplied.
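The positional rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `positional_value` and the choice to pass digits as lists (most significant digit first) are assumptions made here, not part of the slides. `Fraction` is used so that fractional digits are evaluated exactly.

```python
from fractions import Fraction

def positional_value(int_digits, frac_digits=(), base=10):
    """Value of a positional numeral given as digit lists (most significant first).

    Hypothetical helper for illustration; digits are plain integers in 0..base-1.
    """
    value = Fraction(0)
    # Integer part: the rightmost digit has position 0, the next one position 1, ...
    for position, digit in enumerate(reversed(int_digits)):
        value += digit * Fraction(base) ** position
    # Fractional part: the first digit after the point has position -1, then -2, ...
    for position, digit in enumerate(frac_digits, start=1):
        value += digit * Fraction(base) ** -position
    return value

print(positional_value([2, 6, 7, 4]))               # decimal 2674
print(positional_value([1, 0], [1, 1], base=2))     # binary 10.11
print(positional_value([7, 2, 3], base=8))          # octal 723
```

The same function works for any base, because only the weight attached to each position changes.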
Decimal Numeral System
Decimal is the base-10 numeral system:
- The symbols 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 are used.
- The decimal point
- The sign symbols + (plus) and − (minus)

  Digit     2 6 7 4
  Position  3 2 1 0

  $2674 = (2 \times 10^{3}) + (6 \times 10^{2}) + (7 \times 10^{1}) + (4 \times 10^{0})$
  $12.345 = (1 \times 10^{1}) + (2 \times 10^{0}) + (3 \times 10^{-1}) + (4 \times 10^{-2}) + (5 \times 10^{-3})$

Base-K Numeral System
- The symbols 0, 1, 2, ..., K−1 are used.

  $N_K = (d_p d_{p-1} \cdots d_1 d_0 . d_{-1} d_{-2} \cdots d_{-(q-1)} d_{-q})_K$

  $N_{10} = (d_p \times K^{p}) + (d_{p-1} \times K^{p-1}) + \cdots + (d_1 \times K^{1}) + (d_0 \times K^{0}) + (d_{-1} \times K^{-1}) + (d_{-2} \times K^{-2}) + \cdots + (d_{-(q-1)} \times K^{-(q-1)}) + (d_{-q} \times K^{-q})$

  $N_{10} = \sum_{i=0}^{p} d_i K^{i} + \sum_{i=1}^{q} d_{-i} K^{-i}$

- $d_p$: most significant digit
- $d_{-q}$: least significant digit

Binary Numeral System
- The binary numeral system represents numbers using a radix of two; that is, each digit in a binary numeral can take one of two values.
- Typically, the symbols 0 and 1 are used to represent binary numbers.
- Because it is relatively straightforward to implement in electronic circuitry, the binary system is used internally by virtually all modern computers.

  Decimal  Binary
  0        0000
  1        0001
  2        0010
  3        0011
  4        0100
  5        0101
  6        0110
  7        0111
  8        1000
  9        1001

The Octal and Hexadecimal Numeral Systems

  Decimal  Binary  Octal  Hexadecimal
  0        0000    00     0
  1        0001    01     1
  2        0010    02     2
  3        0011    03     3
  4        0100    04     4
  5        0101    05     5
  6        0110    06     6
  7        0111    07     7
  8        1000    10     8
  9        1001    11     9
  10       1010    12     A
  11       1011    13     B
  12       1100    14     C
  13       1101    15     D
  14       1110    16     E
  15       1111    17     F

2.
Convert Binary to and from Decimal System
Binary to decimal
  $10110_2 = 1 \times 2^{4} + 1 \times 2^{2} + 1 \times 2^{1} = 22_{10}$
  $10.11_2 = 1 \times 2^{1} + 1 \times 2^{-1} + 1 \times 2^{-2} = 2.75_{10}$
Decimal to binary
  Integer part: divide by 2 repeatedly and read the remainders from last to first.
    22 ÷ 2 = 11  remainder 0
    11 ÷ 2 = 5   remainder 1
     5 ÷ 2 = 2   remainder 1
     2 ÷ 2 = 1   remainder 0
     1 ÷ 2 = 0   remainder 1
    $22_{10} = 10110_2$
  Fractional part: multiply by 2 repeatedly and read the integer parts from first to last.
    0.75 × 2 = 1.50  integer part 1
    0.50 × 2 = 1.00  integer part 1
    $0.75_{10} = 0.11_2$

Convert Octal to and from Decimal System
Octal to decimal
  $723_8 = 7 \times 8^{2} + 2 \times 8^{1} + 3 \times 8^{0} = 467_{10}$
  $7.23_8 = 7 \times 8^{0} + 2 \times 8^{-1} + 3 \times 8^{-2} = 7.296875_{10}$
Decimal to octal
    467 ÷ 8 = 58  remainder 3
     58 ÷ 8 = 7   remainder 2
      7 ÷ 8 = 0   remainder 7
    $467_{10} = 723_8$
    0.3125 × 8 = 2.5000  integer part 2
    0.5000 × 8 = 4.0000  integer part 4
    $0.3125_{10} = 0.24_8$

Convert Hexadecimal to and from Decimal System
Hexadecimal to decimal
  $AB_{16} = A \times 16^{1} + B \times 16^{0} = 171_{10}$
  $A.8_{16} = A \times 16^{0} + 8 \times 16^{-1} = 10.5_{10}$
Decimal to hexadecimal
    171 ÷ 16 = 10  remainder 11 (B)
     10 ÷ 16 = 0   remainder 10 (A)
    $171_{10} = AB_{16}$

Conversion among Base 2, 8, 16
Octal to binary (each octal digit becomes three bits)
  $5762.13_8 = 101\ 111\ 110\ 010.001\ 011_2$
Binary to octal (group bits in threes from the point outward, padding with zeros)
  $11\ 010\ 111.101\ 1_2 = 011\ 010\ 111.101\ 100_2 = 327.54_8$
Hexadecimal to binary (each hexadecimal digit becomes four bits)
  $E8C4.B_{16} = 1110\ 1000\ 1100\ 0100.1011_2$
Binary to hexadecimal (group bits in fours from the point outward, padding with zeros)
  $10\ 1101\ 0111\ 1010.1110\ 01_2 = 2D7A.E4_{16}$

3. Binary Arithmetic - Addition
  Rules: 0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, 1 + 1 = 10 (the 1 is carried)

      1 1 1 1        (carry)
      0 1 1 0 1      13
    + 1 0 1 1 1      23
    -----------
    1 0 0 1 0 0      36

    1 1              (carry)
      1 . 0 1        1.25
    + 0 . 1 1        0.75
    ---------
    1 0 . 0 0        2.00

Binary Arithmetic - Subtraction
  Rules: 0 − 0 = 0, 0 − 1 = 1 (with borrow), 1 − 0 = 1, 1 − 1 = 0

      1 1 0 1 1 1 0    110
    −     1 0 1 1 1     23
    ---------------
      1 0 1 0 1 1 1     87

      1 . 1 0 1        1.625
    − 0 . 0 1 1        0.375
    -----------
      1 . 0 1 0        1.250

Binary Arithmetic - Multiplication
  Rules: 0 × 0 = 0, 0 × 1 = 0, 1 × 0 = 0, 1 × 1 = 1

        1 0 1 0      10
    ×       1 0       2
    -----------
        0 0 0 0
      1 0 1 0
    -----------
      1 0 1 0 0      20

        1 . 0 1      1.25
    ×       1 0      2
    -----------
      1 0 . 1 0      2.50

Binary Arithmetic - Division
  11101001 (233) ÷ 11001 (25) = 1001 (9), remainder 1000 (8)

                1001        quotient (9)
    11001 ) 11101001        divisor (25), dividend (233)
          − 11001
            --------
            00100001
          −    11001
            --------
                1000        remainder (8)

4.
Analog and Digital Information
- Analog signal: a signal that is continuous in nature rather than pulsed or discrete.
- Digital signal: a signal in which discrete steps are used to represent information.

Advantages and Disadvantages of Digitization
- The advantages of digitization:
    reliable high-speed signal transmission
    high-quality duplication
    easy manipulation and processing
- The primary disadvantage of digital signals is their large size, which results in high storage requirements.

Analog-to-Digital Conversion
- The continuous signal is usually sampled at regular intervals by an analog-to-digital converter (ADC), and the value of the continuous signal in each interval is represented by a discrete value.
  [Figure: sampling of a continuous signal]

Why Do We Use Binary?
- Modern computers are designed to use and manage binary values because the devices that store and manage the data are far less expensive and far more reliable if they only have to represent one of two possible values.
  [Figure: voltage V against time T; the "On" level represents 1 and the "Off" level represents 0]

Data and Computer
Computers are multimedia devices that deal with a wide variety of information:
- Numbers
- Text
- Images and graphics
- Audio
- Video

5. Representing Integer Data
- In computer science, the term integer refers to any data type that can represent some subset of the mathematical integers.
- The most common representation of a positive integer is a string of bits, using the binary numeral system.
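As a small illustration of a positive integer stored as a string of bits, the sketch below applies repeated division by 2 and pads the result to a fixed width. The function name `to_bits` and the default width of 8 bits are assumptions chosen for this example only.

```python
def to_bits(n, width=8):
    """Return the unsigned binary representation of n as a string of `width` bits.

    Hypothetical helper: uses repeated division by 2, collecting remainders
    from least significant to most significant bit.
    """
    if not 0 <= n < 2 ** width:
        raise ValueError(f"{n} does not fit in {width} unsigned bits")
    bits = []
    for _ in range(width):
        n, remainder = divmod(n, 2)   # remainder is the next bit, right to left
        bits.append(str(remainder))
    return "".join(reversed(bits))

print(to_bits(22))    # 22 in 8 bits
print(to_bits(255))   # largest value that fits in 8 unsigned bits
```

With 8 bits the representable range is 0 to 2^8 − 1 = 255, which is why `to_bits(256)` raises an error in this sketch.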
- Four different ways to represent negative numbers in a binary numeral system:
    Signed-magnitude
    One's complement
    Two's complement
    Excess N
- An unsigned 8-bit pattern xxxx xxxx (x: 0 or 1) can represent 0 to $2^{8} - 1 = 255$.

Signed-Magnitude Representation
- In mathematics, a negative number in any base is written in the usual way, by prefixing it with a "−" sign. On a computer, however, there is no single way of representing a number's sign.
- A first approach is to allocate one bit to represent the sign:
    Set that bit (often the most significant bit) to 0 for a positive number.
    Set that bit to 1 for a negative number.
    The remaining bits indicate the (positive) magnitude.

  Pattern (leftmost bit = sign bit)  Value
  0111 1111                          +127
  0000 0000                          +0
  1000 0000                          −0
  1111 1111                          −127

  Range for N bits: $-2^{N-1} + 1$ to $2^{N-1} - 1$

- Disadvantages:
    1. Both +0 and −0 exist.
    2. X − Y cannot be computed simply as X + (−Y) with ordinary binary addition.

One's Complement Representation
- The 1's complement representation of a positive integer is the same as the sign-magnitude representation of that integer.
- The 1's complement representation of a negative integer is obtained by subtracting its magnitude from $2^{n} - 1$, where n is the number of bits used to store the integer.

  Example: convert −36 in a byte to 1's complement form
    Step 1: convert the magnitude of the integer to binary
      $+36_{10} = 0010\ 0100_2$
    Step 2: subtract from $1111\ 1111_2$ ($2^{8} - 1$)
        1111 1111
      − 0010 0100
      -----------
        1101 1011

  Pattern    Value
  0111 1111  +127
  0000 0000  +0
  1111 1111  −0
  1000 0000  −127

  Range for N bits: $-2^{N-1} + 1$ to $2^{N-1} - 1$

Two's Complement Representation - 1
- With two's complement notation, all integers are represented using a fixed number of bits, with the leftmost bit given a negative weight.
    $1001\ 0010_2 = -1 \times 2^{7} + 1 \times 2^{4} + 1 \times 2^{1} = -128 + 16 + 2 = -110_{10}$
    $1000\ 0000_2 = -1 \times 2^{7} = -128_{10}$
    $1111\ 1111_2 = -1_{10}$
- Range for N bits: $-2^{N-1}$ to $2^{N-1} - 1$

  Pattern    Value
  0111 1111  +127
  0111 1110  +126
  ...
  0000 0010  +2
  0000 0001  +1
  0000 0000  +0
  1000 0000  −128
  1000 0001  −127
  ...
  1111 1110  −2
  1111 1111  −1

Advantages of Two's Complement Representation
- It is easy to negate any integer: complement each bit and add 1 to the result.
- The leftmost bit tells you whether the integer is non-negative (0) or negative (1).
- The normal rules for adding (unsigned) binary integers still work (discard any bit carried out of the leftmost position).
- Only an adder circuit is needed to perform both addition and subtraction.

  Example: convert −36 in a byte to 2's complement form
    Step 1: convert the magnitude of the integer to binary
      $+36_{10} = 0010\ 0100_2$
    Step 2: complement each bit
      0010 0100 => 1101 1011
    Step 3: add 1 to the result
        1101 1011
      +         1
      -----------
        1101 1100

Excess-N Representation
- This representation is used primarily for the exponent of floating-point numbers.
- It uses a specific number N as a bias.
- Under excess-N, the standard binary representation is shifted so that the value 0 is stored as the binary number N; in general, a value v is stored as the unsigned binary number v + N.
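The four encodings of negative numbers described in this section can be compared side by side with a short sketch. All function names below are hypothetical, and each function returns the bit pattern as a string; the two's complement version uses masking, which gives the same pattern as "complement each bit and add 1".

```python
def _pattern(value, bits):
    """Format a non-negative integer as a zero-padded bit string."""
    return format(value, f"0{bits}b")

def sign_magnitude(value, bits=8):
    """Sign bit followed by the magnitude."""
    if abs(value) > 2 ** (bits - 1) - 1:
        raise ValueError("out of range")
    sign = 1 << (bits - 1) if value < 0 else 0
    return _pattern(sign | abs(value), bits)

def ones_complement(value, bits=8):
    """Negative values are stored as (2^bits - 1) - magnitude."""
    if abs(value) > 2 ** (bits - 1) - 1:
        raise ValueError("out of range")
    return _pattern(value if value >= 0 else (2 ** bits - 1) - abs(value), bits)

def twos_complement(value, bits=8):
    """Negative values are stored as 2^bits - magnitude (via masking)."""
    if not -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1:
        raise ValueError("out of range")
    return _pattern(value & (2 ** bits - 1), bits)

def excess_n(value, n, bits):
    """The stored pattern is value + n, interpreted as unsigned."""
    if not 0 <= value + n < 2 ** bits:
        raise ValueError("out of range")
    return _pattern(value + n, bits)

for encode in (sign_magnitude, ones_complement, twos_complement):
    print(encode.__name__, encode(-36))
print("excess-3:", excess_n(-3, n=3, bits=3), excess_n(0, n=3, bits=3))
```

Because these functions take an ordinary Python integer, they cannot produce the separate −0 patterns that sign-magnitude and one's complement allow; that limitation belongs to this sketch, not to the encodings.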
- For example, the Excess-3 representation for 3 bits is:

  Digits  Binary  Actual value
  0       000     −3
  1       001     −2
  2       010     −1
  3       011     0
  4       100     1
  5       101     2
  6       110     3
  7       111     4

Comparison of Different Representations

  Decimal  Sign-M  1's   2's       Decimal  Sign-M  1's   2's
  +8       --      --    --        −8       --      --    1000
  +7       0111    0111  0111      −7       1111    1000  1001
  +6       0110    0110  0110      −6       1110    1001  1010
  +5       0101    0101  0101      −5       1101    1010  1011
  +4       0100    0100  0100      −4       1100    1011  1100
  +3       0011    0011  0011      −3       1011    1100  1101
  +2       0010    0010  0010      −2       1010    1101  1110
  +1       0001    0001  0001      −1       1001    1110  1111
  +0       0000    0000  0000      −0       1000    1111  0000

Calculating Two's Complement
  X − Y = X + (−Y)

  Addition: 5 + (−5)
    +5 => 0000 0101
    −5 => 1111 1010 + 1 => 1111 1011

        0000 0101    (+5)
      + 1111 1011    (−5)
      -----------
      1 0000 0000    (0; the carry out of the leftmost bit is discarded)

  Subtraction: 35 − 15 = 35 + (−15)
    +35 => 0010 0011
    +15 => 0000 1111
    −15 => 1111 0000 + 1 => 1111 0001

        0010 0011    (+35)
      + 1111 0001    (−15)
      -----------
      1 0001 0100    (20; the carry out of the leftmost bit is discarded)

Common Integral Data Types
  8 bits (byte, octet)
    Signed: −128 ($-2^{7}$) to +127 ($2^{7} - 1$)
    Unsigned: 0 to +255
    Uses: C: char; Java: byte
  16 bits (word)
    Signed: −32,768 to +32,767
    Unsigned: 0 to +65,535 ($2^{16} - 1$)
    Uses: C: short int; Java: short
  32 bits (word, double word, long word)
    Signed: $-2^{31}$ to $2^{31} - 1$
    Unsigned: 0 to $2^{32} - 1$
    Uses: C: long int; Java: int
  64 bits (long word, quadword)
    Signed: $-2^{63}$ to $2^{63} - 1$
    Unsigned: 0 to $2^{64} - 1$
    Uses: C99: long long int; Java: long

Arithmetic Overflow
- In a digital computer, overflow is the condition that occurs when a calculation produces a result that is outside the range that a given register or storage location can store or represent.
- E.g.: in 8-bit 2's complement representation

      0111 1111    (+127)
    + 0000 0001    (+1)
    -----------
      1000 0000    (−128)    Positive + Positive gives a Negative result

      1000 0011    (−125)
    + 1000 0001    (−127)
    -----------
    1 0000 0100    (+4)      Negative + Negative gives a Positive result

6.
Other Numeral Systems - 1
Binary coded decimal (BCD) and the 2421 code

  Digit  BCD   2421
  0      0000  0000
  1      0001  0001
  2      0010  0010
  3      0011  0011
  4      0100  0100
  5      0101  1011
  6      0110  1100
  7      0111  1101
  8      1000  1110
  9      1001  1111

Other Numeral Systems - 2
The 84-2-1 code and the biquinary code (5043210)

  Digit  84-2-1  5043210
  0      0000    0100001
  1      0111    0100010
  2      0110    0100100
  3      0101    0101000
  4      0100    0110000
  5      1011    1000001
  6      1010    1000010
  7      1001    1000100
  8      1000    1001000
  9      1111    1010000

Other Numeral Systems - 3
Gray code
- A code that assigns to each of a contiguous set of integers, or to each member of a circular list, a word of symbols such that any two adjacent code words differ in exactly one symbol.
- There can be more than one Gray code for a given word length, but the term was first applied to a particular binary code for the non-negative integers, the binary-reflected Gray code (BRGC).

  $G_1 = \{0, 1\}$
  $G_{n+1} = \{0\,G_n,\ 1\,G_n^{ref}\},\quad n \ge 1$
  where $G_n^{ref}$ is $G_n$ listed in reverse order.

  $G_2$: 00 = 0, 01 = 1, 11 = 2, 10 = 3
  $G_3$: 000 = 0, 001 = 1, 011 = 2, 010 = 3, 110 = 4, 111 = 5, 101 = 6, 100 = 7

7. Floating-Point Representations
- A floating-point number is a digital representation of a number in a certain subset of the rational numbers, and is often used to approximate an arbitrary real number on a computer.
- In particular, it represents an integer or fixed-point number (the significand or, informally, the mantissa) multiplied by a base (usually 2 in computers) raised to some integer power (the exponent).
- When the base is 2, it is the binary analog of scientific notation (in base 10).
- A floating-point number a can be represented by two numbers m and e, such that $a = m \times b^{e}$.
- m is a p-digit number of the form ±d.ddd...ddd (each digit being an integer between 0 and b−1 inclusive).
- If the leading digit of m is non-zero, the number is said to be normalized.
- Some descriptions use a separate sign bit (s, which represents −1 or +1) and require m to be positive.

IEEE Floating-Point Standard (IEEE 754)
The IEEE floating-point standard (IEEE 754) is used by many CPUs and FPUs. It defines:
- formats for representing floating-point numbers;
- representations of special values: zero, infinity, very small values (denormal numbers), and bit combinations that do not represent a number (NaN);
- five exceptions, when they occur, and what happens when they do occur;
- four rounding modes;
- a set of floating-point operations that work identically on any conforming system.
IEEE 754 specifies four formats for representing floating-point values:
- single-precision (32-bit)
- double-precision (64-bit)
- single-extended precision (>= 43-bit, not commonly used)
- double-extended precision (>= 79-bit, usually implemented with 80 bits)
Only 32-bit values are required by the standard; the others are optional.

IEEE 754 – Single-Precision
A binary floating-point number is stored in a 32-bit word.

  Field  S   Exponent (e)  Mantissa or fraction (f)
  Width  1   8             23
  Bits   31  30-23         22-0

  $Value = s \times m \times 2^{e-127}$
  s = 1 if S = 0; s = −1 if S = 1.
  m = 1.f.
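The single-precision layout can be inspected from Python with the standard `struct` module, which packs a value as a 32-bit IEEE 754 number. The sketch below is illustrative only: the function names are made up for this example, and `normalized_value` applies the formula above, so it is meaningful only for normalized numbers (stored exponent 1 to 254).

```python
import struct

def decode_single(x):
    """Split the 32-bit single-precision encoding of x into (S, e, f)."""
    # Pack as a big-endian 32-bit float, then reinterpret the bytes as an unsigned int.
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign_bit = word >> 31             # bit 31
    exponent = (word >> 23) & 0xFF    # bits 30-23
    fraction = word & 0x7FFFFF        # bits 22-0
    return sign_bit, exponent, fraction

def normalized_value(sign_bit, exponent, fraction):
    """Apply Value = s * m * 2^(e-127) with m = 1.f (normalized numbers only)."""
    s = -1 if sign_bit else 1
    m = 1 + fraction / 2 ** 23
    return s * m * 2.0 ** (exponent - 127)

S, e, f = decode_single(10.5)
print(S, format(e, "08b"), format(f, "023b"))
print(normalized_value(S, e, f))
```

Printing the fields in binary makes it easy to check a hand-worked encoding bit by bit.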
The set of possible data values can be divided into the following classes:
- Zeroes: Exp: 0, Fraction: 0
- Normalised numbers: Exp: 1-254 (bias 127), Fraction: any
- Denormalised numbers: Exp: 0, Fraction: non-zero
- Infinities: Exp: 255, Fraction: 0
- NaN (Not a Number): Exp: 255, Fraction: non-zero

IEEE 754 – Single-Precision - Examples
  $10.5_{10} = 1010.1_2 = 1.0101_2 \times 2^{3}$
    S:        0 (+)
    Exponent: 1000 0010 (3 = 130 − 127)
    Mantissa: 0101 0000 0000 0000 0000 000 (m = 1.0101)

  $-0.5_{10} = -0.1_2 = -1.0_2 \times 2^{-1}$
    S:        1 (−)
    Exponent: 0111 1110 (−1 = 126 − 127)
    Mantissa: 0000 0000 0000 0000 0000 000 (m = 1.0000)

IEEE 754 – Double-Precision

  Field  S   Exponent (e)  Mantissa or fraction (f)
  Width  1   11            52
  Bits   63  62-52         51-0

  $Value = s \times m \times 2^{e-1023}$
  s = 1 if S = 0; s = −1 if S = 1.
  m = 1.f.

Problems with Floating-Point
- Floating-point numbers usually behave very similarly to the real numbers they are used to approximate. However, this can easily lead programmers to overconfidently ignore the need for numerical analysis.
- Errors in floating-point computation can include:
    Rounding
      Non-representable numbers: for example, the literal 0.1 cannot be represented exactly by a binary floating-point number.
      Rounding of arithmetic operations: for example, 2/3 might yield 0.6666667.
    Absorption: $1 \times 10^{15} + 1 = 1 \times 10^{15}$ (for example, in single precision)
    Cancellation: subtraction between nearly equal operands
    Overflow, which usually yields an infinity
    Underflow
    Invalid operations (such as an attempt to calculate the square root of a negative number). Invalid operations yield a result of NaN (not a number).

8.
The Hierarchy of Data Organization

  Level        Example
  Bit          0 or 1
  Character    A (ASCII = 65)
  Data field   John
  Data record  John, 20, Male
  File         John, 20, Male
               Mary, 21, Female
  Database     File1, file2, …

Bit and Bytes
Bit
- In a digital computer system, all data is made up of a set of bits. The value of each bit can be 0 or 1.
- Bit is an abbreviation of "binary digit".
Byte or character
- Because a single bit can only represent 0 or 1, 8 bits are grouped into one byte to make data easier for people to remember and communicate, and the byte is used as the basic unit of data processing.
- So that combinations of bits can represent data that people understand, many encoding systems have been designed to map bit patterns to characters.
- Common character encodings use a whole number of bytes per character, such as one byte ($2^{8} = 256$ values) or two bytes ($2^{16} = 65536$ values).

From Data Field to Database
Data field
- The smallest logical unit that gives data a meaning, made up of several characters or bytes.
- Examples: a name field, an age field, a gender field.
Data record
- A unit of data made up of several related data fields that describes one event or item.
- Example: a student record made up of a name field, an age field, and a gender field.
File
- A unit of data made up of several related data records.
- Example: a class file made up of the records of the students in the same class.
Database
- A unit of data made up of related files. Logical relationships between the files are established using various techniques.

Common Units for Storage Devices
- Byte: 8 bits
- KB: Kilobyte = $2^{10}$ bytes = 1024 bytes (KiB)
- MB: Megabyte = $2^{20}$ bytes = 1,048,576 bytes (MiB)
- GB: Gigabyte = $2^{30}$ bytes = 1,073,741,824 bytes (GiB)
- TB: Terabyte = $2^{40}$ bytes = 1,099,511,627,776 bytes (TiB)
- PB: Petabyte = $2^{50}$ bytes = 1,125,899,906,842,624 bytes (PiB)
- EB: Exabyte = $2^{60}$ bytes = 1,152,921,504,606,846,976 bytes (EiB)
- In increasing order of size: Byte, KB, MB, GB, TB, PB, EB
- Word: a group of one or more bytes.

9. Representing Text
- To represent a text document in digital form, we need to be able to represent every possible character that may appear.
- There is a finite number of characters to represent, so the general approach is to list them all and assign each a binary string.
- A character set is a particular mapping between characters and binary strings.
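The idea of a character set as a mapping between characters and binary strings can be seen directly in Python. This is a minimal sketch, assuming the ASCII character set by default; the function name `to_binary_strings` is hypothetical, while `str.encode`, `ord`, and `chr` are standard Python built-ins.

```python
def to_binary_strings(text, encoding="ascii"):
    """Return the 8-bit binary string of each byte used to encode `text`."""
    return [format(byte, "08b") for byte in text.encode(encoding)]

print(ord("A"))                   # code assigned to the character A
print(chr(65))                    # character assigned to the code 65
print(to_binary_strings("John"))  # one binary string per character in ASCII
```

Passing a different `encoding` name changes the mapping, which is the sense in which two character sets can assign different binary strings to the same text.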
Common Character Encoding Systems
ASCII (American Standard Code for Information Interchange)
- Code length: 8 bits (originally 7 bits)
- Contents: the English letters, Arabic numerals, and symbols found on a keyboard, plus some control characters.
Big-5
- Code length: 16 bits
- Contents: commonly used Chinese characters and symbols.
Unicode
- Code length: 16 bits
- Contents: the commonly used written characters and symbols of the world.

ASCII Examples
Control characters (0 ~ 31, 127) and printable characters (32 ~ 126); hexadecimal codes are shown in parentheses.

  ASCII  Char    ASCII    Char     ASCII    Char    ASCII    Char    ASCII     Char
  0      NUL     32 (20)  (space)  48 (30)  0       65 (41)  A       97 (61)   a
  1      SOH     33 (21)  !        49 (31)  1       66 (42)  B       98 (62)   b
  2      STX     34 (22)  "        50 (32)  2       67 (43)  C       99 (63)   c
  3      ETX     35 (23)  #        51 (33)  3       68 (44)  D       100 (64)  d
  4      EOT     36 (24)  $        52 (34)  4       69 (45)  E       101 (65)  e
  5      ENQ     37 (25)  %        53 (35)  5       70 (46)  F       102 (66)  f
  6      ACK     38 (26)  &        54 (36)  6       71 (47)  G       103 (67)  g
  7      BEL     39 (27)  '        55 (37)  7       72 (48)  H       104 (68)  h
  8      BS      40 (28)  (        56 (38)  8       73 (49)  I       105 (69)  i
  9      HT      41 (29)  )        57 (39)  9       74 (4A)  J       106 (6A)  j

Big-5 Code
- An encoding system used by computers to process Traditional Chinese characters.
- Uses 16 bits: a high byte and a low byte.

  Byte       Code range
  High byte  81 ~ FE
  Low byte   40 ~ 7E, A1 ~ FE

  Category                         Count  Code range
  Frequently used characters       5401   0XA4 0X40 ~ 0XC6 0X7E
  Less frequently used characters  7652   0XC9 0X40 ~ 0XF9 0XD5
  Special symbols                  408    0XA1 0X40 ~ 0XA3 0XBF

Unicode
- The Unicode Standard is the universal character encoding standard used for representation of text for computer processing.
- The original goal was to use a single 16-bit encoding that provides code points for more than 65,000 characters.
- The Unicode Standard defines codes for characters used in all the major languages written today.
- Example character: 綉

10. Representing Images and Graphics
- Image and graphics data consist of still pictures.
- Methods for storing graphics data:
    Bitmap
      Bitmap graphics form images as a map of hundreds of thousands of dots, called pixels.
      The number of pixels used to represent a picture is called the resolution.
    Vector
      Uses geometric primitives such as points, lines, curves, and polygons to represent images in computer graphics.

Representing Monochrome Graphics
- A monochrome graphic is the simplest type of bitmap.
- It differentiates only between a foreground color and a background color. Suppose that these colors are black (0) and white (1).
- One bit per pixel.
  [Figure: a small bitmap shown as a grid of 0 and 1 pixel values]

Representing Grayscale Graphics
- In grayscale images, each pixel can be not only pure black or pure white but also any of the 254 shades of gray in between.
- One byte per pixel.

Representing Color Graphics - 1
- Color is a sensation caused by light as it interacts with the eye, the brain, and our experience.
- Media that transmit light (such as television) use additive color mixing with the primary colors red, green, and blue, each of which stimulates one of the eye's three types of color receptors with as little stimulation as possible of the other two.

Representing Color Graphics - 2
- Color depth: the amount of data used to represent a color.
- 1-bit color ($2^{1}$): black or white.
- 8-bit color uses 8 bits per color, resulting in 256 colors. This is called a limited color palette.
- 16-bit color uses 16 bits to create life-like colors, with a total of 65,536 colors.
- 24-bit and 32-bit color can have 16,777,216 and 4,294,967,296 colors, respectively. This type of color is called true color, since it can potentially mimic many colors found in the real world.
- In 24-bit color, each number in an RGB value gets 8 bits.

Representing Color Graphics - 3
  [Figure: the same image shown with 16,777,216 colors, 256 colors, and 16 colors]

11.
Data Compression
- Storing a true-color 640 * 480 image file requires
    640 * 480 * 3 = 921,600 bytes, or 900 KB (with 1 KB = 1024 bytes).
- Storing full-color video with a resolution of 352 * 240 per frame, played at 30 frames per second, requires for each second of video
    30 * 352 * 240 * 3 = 7,603,200 bytes, or about 7 MB.
- In computer science, data compression is the process of encoding data so that it takes less storage space or less transmission time than it would if it were not compressed.

Types of Data Compression Algorithms
Lossless data compression
- The original data can be reconstructed exactly from the compressed data.
- Lossless data compression is used in software compression tools such as the highly popular Zip format, used by PKZIP and WinZip.
Lossy data compression
- Compressing a file and then decompressing it yields a file that may well differ from the original but is "close enough" to be useful in some way.
- Lossy methods are most often used for compressing sound or images.
- The advantage of lossy methods over lossless methods: in some cases a lossy method can produce a much smaller compressed file than any known lossless method, while still meeting the requirements of the application.

Data Compression Algorithm Examples
Run-length encoding
- A very simple form of data compression in which runs of data are stored as a single data value and count, rather than as the original run.
    WWWWWBBBBWWWWWBWWWWWWWWWWWWBBB
    W5B4W5B1W12B3
Huffman coding
- An entropy encoding algorithm used for data compression that finds the optimal system of encoding strings based on the relative frequency of each character.

Common Image File Formats
  File extensions covered: .bmp, .gif, .jpg/.jpeg, .png

  File extension: .bmp
    Proper name: Windows Bitmap
    Description: Commonly used by Microsoft Windows programs and by the Windows operating system itself. Lossless compression can be specified, but some programs use only uncompressed files.
  File extension: .gif
    Proper name: Graphics Interchange Format
    Description: Used extensively on the web, but sometimes avoided due to patent issues. Supports animated images. Supports only 256 colors per frame, so it requires lossy quantization for full-color photos; using multiple frames can improve color precision. Uses lossless, patented LZW compression.

  File extension: .jpg, .jpeg
    Proper name: Joint Photographic Experts Group
    Description: Used extensively for photos on the web. Uses lossy compression; the quality can vary greatly depending on the compression settings.

  File extension: .png
    Proper name: Portable Network Graphics
    Description: Lossless compressed bitmap image format, originally designed to replace the use of GIF on the web.

12. Representing Audio Data
- Computers often process audio data after it has been digitally encoded by a method called waveform audio.
- Audio CD:
    Sampling rate: 44,100 samples per second
    Number of bits per sample: 16
- Common audio file formats:
    WAV: Microsoft and IBM audio file format standard for storing audio on PCs.
    MP3 (MPEG-1 Audio Layer III): a lossy format.
    WMA: a proprietary compressed audio file format used by Microsoft.
    QuickTime: a digital video technology developed and produced by Apple Computer.
    RealAudio: an audio codec developed by RealNetworks.

Representing Video Data
- Video is basically a three-dimensional array of color pixels.
    Two dimensions serve as the spatial (horizontal and vertical) directions of the (moving) pictures.
    One dimension represents the time domain.
- A frame is the set of all pixels that (approximately) correspond to a single point in time. Basically, a frame is the same as a (still) picture.
- However, video data contains spatial and temporal redundancy.
- Video compression typically reduces this redundancy using lossy compression. Usually this is achieved by image compression techniques that reduce spatial redundancy within frames and motion compensation techniques that reduce temporal redundancy.
- The Moving Picture Experts Group (MPEG) is a working group charged with the development of video and audio encoding standards.

Common Video Formats

  Format  Resolution (NTSC/PAL)  Video compression          Size/min  Quality
  VCD     352*240 / 352*288      MPEG1                      10 MB     Good
  SVCD    480*480 / 480*576      MPEG2                      10-20 MB  Great
  DVD     720*480 / 720*576      MPEG1/2                    30-70 MB  Excellent
  QT      640*480                Sorenson, Cinepak, MPEG4   4-20 MB   Great
  RM      320*240                RM                         2-5 MB    Decent
  DV      720*480 / 720*576      DV                         216 MB    Excellent
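To put the Size/min column in perspective, the sketch below estimates how much storage one minute of uncompressed video would need at a given resolution. It is a rough illustration under stated assumptions: 24-bit color (3 bytes per pixel), 30 frames per second, video only with no audio, and 1 MB = 2^20 bytes. The function name and its default parameters are chosen here for the example and do not come from the slides.

```python
def uncompressed_mb_per_minute(width, height, fps=30, bytes_per_pixel=3):
    """Storage for one minute of raw video, in MB (1 MB = 2**20 bytes).

    Assumes every frame is stored in full: width * height pixels,
    bytes_per_pixel bytes each, fps frames per second.
    """
    bytes_per_second = width * height * bytes_per_pixel * fps
    return bytes_per_second * 60 / 2 ** 20

raw = uncompressed_mb_per_minute(352, 240)   # VCD resolution (NTSC)
print(f"raw: {raw:.1f} MB/min")
print(f"approximate ratio against 10 MB/min: {raw / 10:.0f} to 1")
```

Under these assumptions the raw figure is several hundred MB per minute, which is why every format in the table relies on compression.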