CS 2110: Data Types and Representations 2 Lecture Slides

Aaron Hillegass
Georgia Tech

Bitwise NOT

Denoted ~ and ! and NOT and ¬. (In C, ~ is bitwise NOT; ! is logical NOT.)

Truth table:

A   NOT A
0   1
1   0

Example on 8-bit word: ~(0b00110101) = 0b11001010

Bitwise AND

Denoted & and ∧ and AND.

Truth table:

A   B   A AND B
0   0   0
1   0   0
0   1   0
1   1   1

n inputs? All must be 1.

Example on 8-bit words: 0b00110101 & 0b11100011 = 0b00100001

Bitwise OR

Denoted | and ∨ and OR.

Truth table:

A   B   A OR B
0   0   0
1   0   1
0   1   1
1   1   1

n inputs? At least one must be 1.

Example on 8-bit words: 0b00110101 | 0b11100011 = 0b11110111

Bitwise XOR

Denoted ^ and ⊕ and XOR.

Truth table:

A   B   A XOR B
0   0   0
1   0   1
0   1   1
1   1   0

n inputs? The result is 1 iff an odd number of inputs are 1.

Example on 8-bit words: 0b00110101 ^ 0b11100011 = 0b11010110

NAND and NOR

A   B   A NAND B   A NOR B
0   0   1          1
1   0   1          0
0   1   1          0
1   1   0          0

n inputs? NAND: at least one must be 0. NOR: all must be 0.

Left Shift

All bits move to the left n positions. The leftmost bits are lost; the rightmost bits are filled with zeros.

    uint16_t v = 7;
    uint16_t u = v << 2;
    // v = 0000000000000111
    // u = 0000000000011100

Shifting left by one is equivalent to multiplying by 2 (as long as no 1 bit is shifted out).

Logical Right Shift

All bits move to the right n positions. The rightmost bits are lost; the leftmost bits are filled with zeros.

    uint16_t v = 32768;   // v = 1000000000000000
    uint16_t u = v >> 2;  // u = 0010000000000000

Shifting right by one is equivalent to dividing by 2 and discarding the fractional part.

Arithmetic Right Shift

All bits move to the right n positions. The rightmost bits are lost; the leftmost bits are filled with copies of the sign bit.

    int16_t v = -1536;   // v = 1111101000000000
    int16_t u = v >> 2;  // u = 1111111010000000

$1111111010000000_2 = -384_{10} = -\frac{1536}{4}$

Shifting right by one is equivalent to dividing by 2 and rounding toward negative infinity.

In this class, if we say "right shift" we mean logical right shift.

Creating Bitmasks using bitwise OR

    int fd = open("myfile.txt", O_RDWR | O_CREAT | O_TRUNC, 0644);

Each bit represents a flag.
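To make "each bit represents a flag" concrete, here is a minimal sketch in C. The flag names (FLAG_READ and so on) and the helper functions are hypothetical and chosen only for illustration; they are not the real <fcntl.h> constants.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical flags for illustration; these are NOT the real <fcntl.h> values.
enum {
    FLAG_READ   = 1 << 0,  // 0x1
    FLAG_WRITE  = 1 << 1,  // 0x2
    FLAG_APPEND = 1 << 2   // 0x4
};

// Build a mask with only bit n turned on (n counts from 0 at the right).
uint32_t bit(unsigned n) {
    assert(n < 32);  // shifting a 32-bit value by 32 or more is undefined in C
    return (uint32_t)1 << n;
}

// Turn on every bit of `flag` in `mask` (bitwise OR).
uint32_t set_flag(uint32_t mask, uint32_t flag) {
    return mask | flag;
}

// Turn off every bit of `flag` in `mask` (AND with the inverted flag).
uint32_t clear_flag(uint32_t mask, uint32_t flag) {
    return mask & ~flag;
}

// Return 1 if any bit of `flag` is on in `mask`, else 0 (bitwise AND).
int has_flag(uint32_t mask, uint32_t flag) {
    return (mask & flag) != 0;
}
```

Because each flag occupies its own bit, OR-ing flags together never loses information, and each one can later be tested or cleared independently.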
    // Values as defined on Linux; other platforms use different bits.
    int O_RDWR  = 1 << 1;  // = 0x2
    int O_CREAT = 1 << 6;  // = 0x40
    int O_TRUNC = 1 << 9;  // = 0x200

Reading Bitmasks using bitwise AND

    int open(const char *pathname, int flags, ...) {
        if (flags & O_RDWR) {
            // Handle read-write mode
        }
        if (flags & O_CREAT) {
            // Handle create mode
        }
        if (flags & O_TRUNC) {
            // Handle truncate mode
        }
        if (flags & O_APPEND) {
            // Handle append mode
        }
    }

Octal Notation

A lot of bitmasks are declared using octal notation. It is just base 8, and you can recognize it because it always starts with a leading zero:

$0127 = 1 \times 8^2 + 2 \times 8^1 + 7 \times 8^0 = 87_{10}$

Here's some C code:

    uint16_t x = 0x1234;
    x = x & ~0777;  // clears the low 9 bits; x is now 0x1200

Scientific Notation

$-2.45 \times 10^7$

• Sign
• Mantissa: One non-zero digit to the left of the decimal point.
• Exponent: A signed integer.

How would you convert this idea to binary?

Reminder: Radix Point for Binary Numbers

$1010.11001_2 = 2^3 + 2^1 + 2^{-1} + 2^{-2} + 2^{-5} = 10.78125_{10}$

$.a_1 a_2 a_3 \ldots a_n = \sum_{i=1}^{n} a_i \times 2^{-i}$

32-bit IEEE-754 Floating Point Numbers

Bits:    31      30-23      22-0
Value:   1       10000001   01100000000000000000000
Width:   1 bit   8 bits     23 bits
Field:   S       E          M

$(-1)^S \times 1.M \times 2^{E-127}$

Example: $(-1)^1 \times 1.011_2 \times 2^{129-127} = -1.375_{10} \times 2^2 = -5.5$

About 7 decimal digits of precision; max is about $3.4 \times 10^{38}$.

64-bit IEEE-754 Floating Point Numbers

Bits:    63      62-52      51-0
Width:   1 bit   11 bits    52 bits
Field:   S       E          M

$(-1)^S \times 1.M \times 2^{E-1023}$

About 16 decimal digits of precision; max is about $1.8 \times 10^{308}$.

But...how do we represent zero?

Bits:    31      30-23      22-0
Width:   1 bit   8 bits     23 bits
Field:   S       E          M

And NaN and ∞ and −∞?

And the IEEE spoketh

Bits:    31      30-23      22-0
Width:   1 bit   8 bits     23 bits
Field:   S       E          M

Special numbers:

• Zero: M = 0, E = 0
• ∞: M = 0, E = 255, S = 0
• −∞: M = 0, E = 255, S = 1
• NaN: M ≠ 0, E = 255

Subnormals

To get really small numbers, we have subnormals.

For all the normals, the mantissa M represents 1.M.

For subnormals, the mantissa M represents 0.M.

If E is zero and M is not, it is a subnormal.
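As a rough sketch of the field layout and special cases described so far, the following C code splits a 32-bit float into S, E, and M and classifies it. The names float_fields, split_float, and classify are hypothetical (they are not from the slides), and the code assumes the platform's float is a 32-bit IEEE-754 value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper type: the three fields of a 32-bit float.
typedef struct {
    uint32_t s;  // 1 sign bit
    uint32_t e;  // 8 exponent bits, stored with a bias of 127
    uint32_t m;  // 23 mantissa bits
} float_fields;

// Split a float into its sign, exponent, and mantissa fields.
float_fields split_float(float f) {
    uint32_t bits;
    assert(sizeof f == sizeof bits);  // assumption: float is 32 bits wide
    memcpy(&bits, &f, sizeof bits);   // copy the raw bit pattern, no conversion

    float_fields r;
    r.s = bits >> 31;            // bit 31
    r.e = (bits >> 23) & 0xFF;   // bits 30-23
    r.m = bits & 0x7FFFFF;       // bits 22-0
    return r;
}

// Name the category of a value using the rules for E = 0 and E = 255.
const char *classify(float_fields r) {
    if (r.e == 255) {
        return (r.m == 0) ? "infinity" : "NaN";
    }
    if (r.e == 0) {
        return (r.m == 0) ? "zero" : "subnormal";
    }
    return "normal";
}
```

For the worked example, split_float(-5.5f) gives S = 1, E = 129, and M = 0x300000 (binary 011 followed by twenty zeros), and classify reports "normal".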
Smallest positive number is

0 00000000 00000000000000000000001

It represents $2^{-126} \times 2^{-23} = 2^{-149}$.

bfloat16

Deep learning likes more numbers with less precision. Google developed bfloat16:

Bits:    15      14-7       6-0
Width:   1 bit   8 bits     7 bits
Field:   S       E          M

$(-1)^S \times 1.M \times 2^{E-127}$

ASCII text encoding

00 nul  08 bs   10 dle  18 can  20 sp   28 (    30 0    38 8    40 @    48 H    50 P    58 X    60 `    68 h    70 p    78 x
01 soh  09 ht   11 dc1  19 em   21 !    29 )    31 1    39 9    41 A    49 I    51 Q    59 Y    61 a    69 i    71 q    79 y
02 stx  0a nl   12 dc2  1a sub  22 "    2a *    32 2    3a :    42 B    4a J    52 R    5a Z    62 b    6a j    72 r    7a z
03 etx  0b vt   13 dc3  1b esc  23 #    2b +    33 3    3b ;    43 C    4b K    53 S    5b [    63 c    6b k    73 s    7b {
04 eot  0c np   14 dc4  1c fs   24 $    2c ,    34 4    3c <    44 D    4c L    54 T    5c \    64 d    6c l    74 t    7c |
05 enq  0d cr   15 nak  1d gs   25 %    2d -    35 5    3d =    45 E    4d M    55 U    5d ]    65 e    6d m    75 u    7d }
06 ack  0e so   16 syn  1e rs   26 &    2e .    36 6    3e >    46 F    4e N    56 V    5e ^    66 f    6e n    76 v    7e ~
07 bel  0f si   17 etb  1f us   27 '    2f /    37 7    3f ?    47 G    4f O    57 W    5f _    67 g    6f o    77 w    7f del

ASCII strings in memory

    char *greeting = "Hello!";

'H'  'e'  'l'  'l'  'o'  '!'  '\0'
48   65   6c   6c   6f   21   00

Notice:

• A is $65_{10}$
• Z is $90_{10}$
• a is $97_{10}$
• z is $122_{10}$

'K' + 32 = 'k'

Is there a cheaper way?

Newline Madness

In 1972, Unix used '\n' (ASCII $10_{10}$).

In 1982, Microsoft DOS used '\r\n' (ASCII $13_{10}$, ASCII $10_{10}$).

We have never recovered. Most systems have dos2unix and unix2dos.

UTF-8 is a superset of ASCII

• UTF-8 is the most common encoding for the web.
• ASCII uses only 7 bits. A byte with the 8th bit turned on means "this is part of a multi-byte UTF-8 sequence."
• UTF-8 represents any Unicode "code point" using 1 to 4 bytes.
• Sometimes a single character requires multiple code points.

First     Last       byte 1     byte 2     byte 3     byte 4
U+0000    U+007F     0yyyzzzz
U+0080    U+07FF     110xxxyy   10yyzzzz
U+0800    U+FFFF     1110wwww   10xxxxyy   10yyzzzz
U+10000   U+10FFFF   11110uvv   10vvwwww   10xxxxyy   10yyzzzz

Questions?

Reading

Patt: 2.1 - 2.6

Slides by Aaron Hillegass
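The UTF-8 byte layout table can be turned into code fairly directly. The following is a minimal sketch, not a production encoder: utf8_encode is a hypothetical function name, and rejecting the surrogate range U+D800 to U+DFFF is an added assumption taken from the UTF-8 definition rather than from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: encode one Unicode code point as UTF-8.
// Writes 1 to 4 bytes into out and returns how many were written,
// or 0 if cp cannot be encoded.
size_t utf8_encode(uint32_t cp, uint8_t out[4]) {
    assert(out != NULL);

    if (cp <= 0x7F) {                       // 0yyyzzzz (plain ASCII)
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp <= 0x7FF) {                      // 110xxxyy 10yyzzzz
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF) {                     // 1110wwww 10xxxxyy 10yyzzzz
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return 0;                       // assumption: surrogates are rejected
        }
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   // 11110uvv 10vvwwww 10xxxxyy 10yyzzzz
        out[0] = (uint8_t)(0xF0 | (cp >> 18));
        out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (uint8_t)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                               // beyond U+10FFFF
}
```

Each continuation byte carries six payload bits behind the 10 prefix, which is why every step shifts by a multiple of 6 and masks with 0x3F. For example, U+20AC (the euro sign) encodes as the three bytes E2 82 AC.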
