Bits, Bytes, Ints

advertisement
CS 105
“Tour of the Black Holes of Computing!”
Computer Systems
Introduction
Rett Bull, Geoff Kuenning
Spring 2013
Topics:


–1–
Class Intro
Data Representation
CS 105
Course Theme

Abstraction is good, but don’t forget reality!
Many CS Courses emphasize abstraction


Abstract data types
Asymptotic analysis
These abstractions have limits


Especially in the presence of bugs
Need to understand underlying implementations
Useful outcomes

Become more effective programmers
 Able to find and eliminate bugs efficiently
 Able to tune program performance

Prepare for later “systems” classes in CS
 Compilers, Operating Systems, File Systems, Computer
Architecture, Robotics, etc.
–2–
CS 105
Textbooks
Randal E. Bryant and David R. O’Hallaron,

“Computer Systems: A Programmer’s Perspective”, 2nd
Edition, Prentice Hall, 2011.
Brian Kernighan and Dennis Ritchie,

“The C Programming Language, Second Edition”, Prentice
Hall, 1988
Larry Miller and Alex Quilici

–3–
The Joy of C, Wiley, 1997
CS 105
Syllabus



–4–
Syllabus on Web: http://www.cs.hmc.edu/~geoff/cs105
Calendar defines due dates
Labs: cs105submit for some, others have specific directions
CS 105
Notes:
Work groups



You must work in pairs on all labs
Honor-code violation to work without your partner!
Corollary: showing up late doesn’t harm only you
Handins


Check calendar for due dates
Electronic submissions only
Grading Characteristics

Lab scores tend to be high
 Serious handicap if you don’t hand a lab in

Tests & quizzes typically have a wider range of scores
 I.e., they’re primary determinant of your grade
» …but not the ONLY one

–5– 
Do your share of lab work and reading, or bomb tests
Do practice problems in book
CS 105
Facilities
Assignments will use Intel computer systems

Not all machines are created alike
 Even Intel Macs aren’t necessarily compatible
 Most modern machines (including Knuth) are 64-bit

Wilkes: x86/Linux specifically set up for this class

Log in on a Mac, then ssh to Wilkes
 If you want fancy programs, start X11 first
 Directories are cross-mounted, so you can edit on Knuth
or your Mac, and Wilkes will see your files
…or ssh into Wilkes from your dorm
 All programs must run on Wilkes: that’s where we
grade

–6–
CS 105
CS 105
“Tour of the Black Holes of Computing”
Integers
Topics

Numeric Encodings
 Unsigned & two’s complement

Programming Implications
 C promotion rules

Basic operations
 Addition, negation, multiplication

Programming Implications
 Consequences of overflow
 Using shifts to perform power-of-2 multiply/divide
CS 105
C Puzzles



Taken from old exams
Assume machine with 32-bit word size, two’s complement
integers
For each of the following C expressions, either:
 Argue that it is true for all argument values, or
 Give example where it is not true
• x < 0
 ((x*2) < 0)
• ux >= 0
Initialization
• x & 7 == 7
 (x<<30) < 0
int x = foo();
• ux > -1
int y = bar();
• x > y
unsigned ux = x;
• x * x >= 0
unsigned uy = y;
• x > 0 && y > 0  x + y > 0
–8–
 -x < -y
• x >= 0
 -x <= 0
• x <= 0
 -x >= 0
CS 105
Encoding Integers
Unsigned
B2U(X ) 
w1
 xi 2
Two’s Complement
i
B2T (X )   xw1 2
i0
w1

w2
 xi 2 i
i0
short int x = 15213;
short int y = -15213;

Sign
Bit
C short 2 bytes long
x
y
Decimal
15213
-15213
Hex
3B 6D
C4 93
Binary
00111011 01101101
11000100 10010011
Sign Bit

For 2’s complement, most-significant bit indicates sign
 0 for nonnegative
 1 for negative
–9–
CS 105
Encoding Integers (Cont.)
x =
y =
– 10 –
Weight
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
-32768
Sum
15213: 00111011 01101101
-15213: 11000100 10010011
15213
1
1
0
0
1
4
1
8
0
0
1
32
1
64
0
0
1
256
1
512
0
0
1
2048
1
4096
1
8192
0
0
0
0
15213
-15213
1
1
1
2
0
0
0
0
1
16
0
0
0
0
1
128
0
0
0
0
1
1024
0
0
0
0
0
0
1
16384
1 -32768
-15213
CS 105
Numeric Ranges
Unsigned Values
 UMin
= 0
Two’s-Complement Values
 TMin
=
–2w–1
000…0

UMax
100…0
=
2w – 1

111…1
TMax
=
2w–1 – 1
011…1
Other Values
 Minus 1
111…1
Values for W = 16
UMax
TMax
TMin
-1
0
– 11 –
Decimal
65535
32767
-32768
-1
0
Hex
FF FF
7F FF
80 00
FF FF
00 00
Binary
11111111 11111111
01111111 11111111
10000000 00000000
11111111 11111111
00000000 00000000
CS 105
Values for Different Word Sizes
W
UMax
TMax
TMin
8
255
127
-128
16
65,535
32,767
-32,768
32
4,294,967,295
2,147,483,647
-2,147,483,648
Observations

|TMin | =
C Programming
TMax + 1

 Asymmetric range

UMax
=
64
18,446,744,073,709,551,615
9,223,372,036,854,775,807
-9,223,372,036,854,775,808
2 * TMax + 1
#include <limits.h>
 K&R Appendix B11

Declares constants, e.g.,
 ULONG_MAX
 LONG_MAX
 LONG_MIN

– 12 –
Values platform-specific
CS 105
An Important Detail
No Self-Identifying Data




Looking at a bunch of bits doesn’t tell you what they mean
Could be signed, unsigned integer
Could be floating-point number
Could be part of a string
Only the Program (Instructions) Knows for Sure!
– 13 –
CS 105
Unsigned & Signed
Numeric Values
X
0000
0001
0010
0011
0100
0101
0110
0111
1000
1001
1010
1011
1100
1101
1110
1111
– 14 –
B2U(X)
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
B2T(X)
0
1
2
3
4
5
6
7
–8
–7
–6
–5
–4
–3
–2
–1
Equivalence

Same encodings for
nonnegative values
Uniqueness


Every bit pattern represents
unique integer value
Each representable integer
has unique bit encoding
CS 105
Casting Signed to Unsigned
C Allows Conversions from Signed to Unsigned
short int
x = 15213;
unsigned short int ux = (unsigned short) x;
short int
y = -15213;
unsigned short int uy = (unsigned short) y;
Resulting Value


No change in bit representation
Nonnegative values unchanged
 ux = 15213

Negative values change into (large) positive values
 uy = 50323
– 15 –
CS 105
Relation Between
Signed & Unsigned
Weight
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
Sum

– 16 –
uy
=
-15213
1
1
1
2
0
0
0
0
1
16
0
0
0
0
1
128
0
0
0
0
1
1024
0
0
0
0
0
0
1
16384
1 -32768
-15213
y + 2 * 32768
=
50323
1
1
1
2
0
0
0
0
1
16
0
0
0
0
1
128
0
0
0
0
1
1024
0
0
0
0
0
0
1
16384
1
32768
50323
y + 65536
CS 105
Signed vs. Unsigned in C
Constants


By default are considered to be signed integers
Unsigned if have “U” as suffix
0U, 4294967259u
Casting

Explicit casting between signed & unsigned same as U2T and
T2U
int tx, ty;
unsigned ux, uy;
tx = (int) ux;
uy = (unsigned) ty;

Implicit casting also occurs via assignments and procedure calls
tx = ux;
uy = ty;
– 17 –
CS 105
Casting Surprises
Expression Evaluation
 If
you mix unsigned and signed in single expression, signed
values are implicitly cast to unsigned
 Including comparison operations <, >, ==, <=, >=
 Examples
for W = 32
Constant1
– 18 –
Constant2
0
-1
-1
0u
0
0u
2147483647
2147483647u
-1
(unsigned) -1
2147483647
2147483647
-2147483648
-2147483648
-2
-2
2147483648u
(int) 2147483648u
Relation Evaluation
CS 105
Casting Surprises
Expression Evaluation
 If
you mix unsigned and signed in single expression, signed
values are implicitly cast to unsigned
 Including comparison operations <, >, ==, <=, >=
 Examples
for W = 32
Constant1
– 19 –
0
-1
-1
2147483647
2147483647U
2147483647u
-1
(unsigned) -1
2147483647
2147483647
2147483647
2147483647
Constant2
Relation Evaluation
0U
0u
0
0U
0u
-2147483648
-2147483648
-2
-2
2147483648U
2147483648u
(int) 2147483648U
2147483648u
==
<
>
>
<
>
>
<
>
unsigned
signed
unsigned
signed
unsigned
signed
unsigned
unsigned
CS 105
signed
Explanation of Casting Surprises
2’s Comp.  Unsigned


Ordering Inversion
Negative  Big Positive
TMax
2’s Comp.
Range
– 20 –
0
–1
–2
TMin
UMax
UMax – 1
TMax + 1
TMax
Unsigned
Range
0
CS 105
Sign Extension
Task:


Given w-bit signed integer x
Convert it to w+k-bit integer with same value
Rule:


Make k copies of sign bit:
X  = xw–1 ,…, xw–1 , xw–1 , xw–2 ,…, x0
w
k copies of MSB
X
• • •
• • •
X
– 21 –
• • •
k
• • •
w
CS 105
Sign Extension Example
short int x = 15213;
int
ix = (int) x;
short int y = -15213;
int
iy = (int) y;
Decimal
Hex
15213
3B
15213 00 00 3B
-15213
C4
-15213 FF FF C4
x
ix
y
iy


– 22 –
6D
6D
93
93
Binary
00111011
00000000 00000000 00111011
11000100
11111111 11111111 11000100
01101101
01101101
10010011
10010011
Converting from smaller to larger integer data type
C automatically performs sign extension
CS 105
Why Should I Use Unsigned?
Be careful using

C compilers on some machines generate less efficient code
unsigned i;
for (i = 1; i < cnt; i++)
a[i] += a[i-1];

Easy to make mistakes
for (i = cnt-2; i >= 0; i--)
a[i] += a[i+1];
Do use when performing modular arithmetic


Multiprecision arithmetic
Other esoteric stuff
Do use when need extra bit’s worth of range


– 23 –
Working right up to limit of word size
E.g., array sizes
CS 105
Negating with Complement &
Increment
Claim: Following holds for 2’s complement
~x + 1 == -x
Complement

Observation: ~x + x == 1111…112 == -1
x
1001 1101
+ ~x
0110 0010
-1
1111 1111
Increment


~x + x + (-x + 1)
~x + 1
== -1 + (-x + 1)
== -x
Warning: Be cautious treating int’s as integers
– 24 – 
OK here (associativity holds)
CS 105
Comp. & Incr. Examples
x = 15213
Decimal
x
15213
~x
-15214
~x+1 -15213
y
-15213
Hex
3B 6D
C4 92
C4 93
C4 93
Binary
00111011 01101101
11000100 10010010
11000100 10010011
11000100 10010011
0
0
~0
~0+1
– 25 –
Decimal
0
-1
0
Hex
00 00
FF FF
00 00
Binary
00000000 00000000
11111111 11111111
00000000 00000000
CS 105
Unsigned Addition
u
• • •
v
• • •
u+v
• • •
UAddw(u , v)
• • •
Operands: w bits
+
True Sum: w+1 bits
Discard Carry: w bits
Standard Addition Function

Ignores carry output
Implements Modular Arithmetic
s =
UAddw(u , v)
= u + v mod 2w
 u  v
UAddw (u,v)  
w
u  v  2
– 26 –
u  v  2w
u  v  2w
CS 105
Two’s-Complement Addition
u
• • •
v
• • •
u+v
• • •
TAddw(u , v)
• • •
Operands: w bits
+
True Sum: w+1 bits
Discard Carry: w bits
TAdd and UAdd have identical bit-level behavior
– 27 –

Signed vs. unsigned addition in C:
int s, t, u, v;
s = (int) ((unsigned) u + (unsigned) v);
t = u + v

Will give s == t
CS 105
Detecting 2’s-Comp. Overflow
Task



2w–1
Given s = TAddw(u , v)
Determine if s = Addw(u , v)
Example
int s, u, v;
s = u + v;
PosOver
2w –1
0
Claim

Overflow iff either:
NegOver
u, v < 0, s  0 (NegOver)
u, v  0, s < 0 (PosOver)
– 28 –
CS 105
Multiplication
Computing exact product of w-bit numbers x, y

Either signed or unsigned
Ranges

Unsigned: 0 ≤ x * y ≤ (2w – 1) 2 = 22w – 2w+1 + 1
 Up to 2w bits

Two’s complement min: x * y ≥ (–2w–1)*(2w–1–1) = –22w–2 + 2w–1
 Up to 2w–1 bits (including 1 for sign)

Two’s complement max: x * y ≤ (–2w–1) 2 = 22w–2
 Up to 2w bits, but only for (TMinw)2
Maintaining exact results


– 29 –
Would need to keep expanding word size with each product
computed
Done in software by “arbitrary-precision” arithmetic
packages
CS 105
Power-of-2 Multiply by Shifting
Operation

u << k gives u * 2k

Both signed and unsigned
Operands: w bits
True Product: w+k bits
Discard k bits: w bits
Examples
u · 2k
2k
0 ••• 0 1 0 ••• 0 0
• • •
UMultw(u , 2k)
•••
0 ••• 0 0
0 ••• 0 0
TMultw(u , 2k)

u << 3
u << 5 - u << 3

Most machines shift and add much faster than multiply

– 30 –
*
u
k
• • •
==
==
u * 8
u * 24
 Compiler generates this code automatically
CS 105
Unsigned Power-of-2 Divide
by Shifting
Quotient of unsigned by power of 2

u >> k gives  u / 2k 

Uses logical shift
k
u
Operands:
Division:
Result:
x
x >> 1
x >> 4
x >> 8
– 31 –
/
2k
•••
•••
0 ••• 0 1 0 ••• 0 0
u / 2k
0 •••
•••
 u / 2k 
0 •••
•••
Division
15213
7606.5
950.8125
59.4257813
Binary Point
Computed
15213
7606
950
59
Hex
3B 6D
1D B6
03 B6
00 3B
.
•••
Binary
00111011 01101101
00011101 10110110
00000011 10110110
00000000 00111011
CS 105
Download