Stuff to know!
Introductions
Instruction Set Architecture (ISA)
Today's topics
- Architecture overview
- Machine instructions
- Instruction Execution Cycle
- CISC machines
- RISC machines
- Parallelism
  - Instruction-level
  - Processor-level
- Internal representation
  - Microprograms
  - Limits of representation
- External representation
  - Binary, octal, decimal, hexadecimal number systems
Terms
- CPU: Central Processing Unit
- ALU: Arithmetic/Logic Unit
- Memory: storage for data and programs (separate from the CPU)
- Register: fast temporary storage inside the CPU
- Bus: parallel "wires" for transferring a set of electrical signals simultaneously
  - Internal: transfers signals among CPU components
  - Control: carries signals for memory and I/O operations
  - Address: links to specific memory locations
  - Data: carries data between the CPU and memory
- Microprogram: sequence of micro-instructions required to execute a machine instruction
- Cache: temporary storage for faster access
  - Note: caching takes place at many levels in a computer system
Registers
- General/Temporary: fast local memory inside the CPU (one type of cache)
- Control: dictates the current state of the machine
- Status: indicates error conditions
- IR: Instruction Register (holds the current instruction)
- IP: Instruction Pointer (holds the memory address of the next instruction) [often called the Program Counter (PC)]
- MAR: Memory Address Register (holds the address of the memory location currently referenced)
- MDR: Memory Data Register (holds data being sent to or retrieved from the memory address in the MAR)
Machine instructions
- Each computer architecture provides a set of machine-level instructions
  - Instruction Set Architecture (ISA)
  - Specific to one particular architecture
- Like everything inside a computer, machine instructions are implemented electrically
  - Micro-instructions set the switches in the control register
Hypothetical CISC* machine
- Shows hardware components
  - Does not show the digital logic level or microprograms.
  - Shows how machine-level instructions can be stored and executed.
- Illustrates
  - Finite-state machine
  - Von Neumann architecture

*CISC: Complex Instruction Set Computer
Instruction execution cycle
- Real computers ...
  - use the "stored program" concept (Von Neumann architecture)
  - Program is stored in memory, and is executed under the control of the operating system
  - operate using an Instruction Execution Cycle
Instruction Execution Cycle
1. Fetch the next instruction (at the address in IP) into IR.
2. Increment IP to point to the next instruction.
3. Decode the instruction in IR.
4. If the instruction requires memory access:
   A. Determine the memory address.
   B. Fetch the operand from memory into a CPU register, or send the operand from a CPU register to memory.
5. Execute the micro-program for the instruction.
6. Go to step 1.

Note: default execution is sequential.
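As a rough sketch (not any particular machine's instruction format), the cycle can be written as a loop over a toy instruction memory; the opcodes, operand layout, and register model here are invented for illustration:

```python
# Minimal sketch of the instruction execution cycle for a made-up toy machine.
memory = {0: ("ADD", "R1", 100), 1: ("HALT",), 100: 5}   # program plus one data cell
registers = {"R1": 7, "IP": 0, "IR": None}

while True:
    registers["IR"] = memory[registers["IP"]]   # 1. fetch instruction at IP into IR
    registers["IP"] += 1                        # 2. increment IP
    opcode = registers["IR"][0]                 # 3. decode
    if opcode == "ADD":                         # 4. memory access, 5. execute
        _, reg, addr = registers["IR"]
        registers[reg] += memory[addr]
    elif opcode == "HALT":
        break                                   # otherwise: 6. go back to step 1

print(registers["R1"])                          # 12
```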
Example CISC Instruction

ADD R1, mem1        ; (Add contents of memory location mem1 to register R1)
Example ADD Microprogram
(each microinstruction executes in one clock cycle)
1. Copy contents of R1 to ALU Operand_1.
2. Move the address of mem1 to MAR.
3. Signal memory fetch (gets the contents of the memory address currently in MAR into MDR).
4. Copy contents of MDR into ALU Operand_2.
5. Signal ALU addition.
6. Check the Status Register.
7. Copy contents of ALU Result to R1.
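A hedged sketch of those seven micro-steps as explicit register transfers; the register names follow the slides, while the address and contents of mem1 are made up:

```python
# Each assignment below stands in for one micro-instruction (one clock cycle).
memory = {0x20: 5}               # assume mem1 is address 0x20 and currently holds 5
R1, mem1 = 7, 0x20

alu_operand_1 = R1               # 1. copy R1 to ALU Operand_1
MAR = mem1                       # 2. move address of mem1 to MAR
MDR = memory[MAR]                # 3. memory fetch: contents at MAR arrive in MDR
alu_operand_2 = MDR              # 4. copy MDR into ALU Operand_2
alu_result = alu_operand_1 + alu_operand_2   # 5. signal ALU addition
overflow = alu_result > 0xFFFF   # 6. check Status Register (simplified to one flag)
R1 = alu_result                  # 7. copy ALU Result to R1
print(R1, overflow)              # 12 False
```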
Improving CISC
- CISC speed (and convenience) is increased by
  - more efficient microprograms
  - more powerful ISA-level instructions
  - cache memory
  - more registers
  - wider buses
  - making it smaller
  - more processors
  - floating-point instructions
  - etc.

Clock Cycles
- So how "slow" is this?
  - It isn't slow
  - Execution is near light-speed
- Clock cycle length determines CPU speed (mostly)
Limitations of CISC
Improving a specific architecture requires
instructions to be backward compatible.
So … how about a different architecture?
RISC machines
- Reduced Instruction Set Computer
- Much smaller set of instructions at the ISA level
- Instructions are like CISC micro-instructions
- RISC assembly-level programs look much longer (more instructions) than CISC assembly-level programs, but they execute faster. Why?
RISC design principles
- Instructions are executed directly by hardware (no microprograms).
- Maximize the rate of fetching instructions.
  - Instructions easy to decode
  - Instruction cache
  - Fetching operands, etc.
- Only LOAD and STORE instructions reference memory.
- Plenty of registers
More speed improvement
- Minimize memory and I/O accesses
  - Cache
  - Separate I/O unit (buffers/processing)
  - Separate network communication unit (NIC)
  - Etc.
- Parallel processing
Parallel processing
Parallelism (overview)

Instruction-level parallelism



pipeline
cache
Processor-level parallelism


multiprocessor (multiple CPUs, common memory)
multicomputer (multiple CPUs, each with own
memory)
Pipelining
Pipelining Equations!

For k execution stages, n instructions require k + (n - 1) clock cycles.
[Timing diagrams: instructions I-1 and I-2 moving through stages S1-S6 over clock cycles 1-12. With no pipelining the two instructions take 12 cycles; with a 6-stage pipeline they take 7.]
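Assuming one stage per clock cycle and no stalls, the two cycle counts can be computed directly (a sketch, not a model of any real pipeline):

```python
def pipelined_cycles(k, n):
    """Clock cycles for n instructions on a k-stage pipeline with no stalls."""
    return k + (n - 1)

def unpipelined_cycles(k, n):
    """Clock cycles if each instruction finishes all k stages before the next starts."""
    return k * n

print(pipelined_cycles(6, 2))     # 7, matching the 6-stage pipeline diagram
print(unpipelined_cycles(6, 2))   # 12, matching the no-pipelining diagram
```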
Instruction Caching
- Hardware provides an area for multiple instructions in the CPU
  - Reduces the number of memory accesses
  - Instructions are available for immediate execution
  - Might cause problems with decision, repetition, and procedure structures in programs

Multiprocessor (shared memory)

Multicomputer (distributed memory)
Comparisons
- Cache and pipelining
  - Implemented in hardware
- Multiprocessor
  - Difficult to build
  - Relatively easy to program
- Multicomputer
  - Easy to build (given networking technology)
  - Extremely difficult to program
Other types of parallelism
- Hybrid systems
- Scalable architectures
  - Add more processors (nodes) without having to re-invent the system
- Simulated parallelism
- Super-scalar
Applications of Parallelism
- Multi-user systems
  - Networks
  - Internet
- Speeding up single processes
  - Chess example
  - Other AI applications
Questions?

- Parallelism
  - ... more later
- Internal representation (tomorrow!)
  - Data
  - Instructions
  - Addresses
Internal representation
- Just like everything else in a computer, the representation of numbers is implemented electrically
  - switches set to off or on
  - with open (transparent) / closed (opaque) gates
  - There are two states for each gate
  - The binary number system uses two digits (0 and 1)
- In order to simplify discussion, we use the standard shorthand to transcribe the computer representation:
  - off is written as digit 0
  - on is written as digit 1
External representation
- Use the binary number system to represent numeric values electrically.
- Switches (gates) are grouped into bytes, words, etc., to represent the digits of a binary number.
- Note: the number of gates in a group depends on the computer architecture and the type of data represented. E.g., for Intel-based architectures:
  - byte = 8 bits, word = 2 bytes (16 bits)
  - integers use 2, 4, 8, or 10 bytes
Binary number system
- has 2 digits: 0 and 1 (each is a binary digit, or "bit")
- has places and place values determined by powers of 2.
- (in theory) can uniquely represent any integer value
  - A binary representation is just another way of writing a number that we are accustomed to seeing in decimal form.
- (in practice, inside the computer) representation is finite
  - Representations with too many digits get chopped.
Internal representation
- Place values (right-to-left) are 2^0, 2^1, 2^2, 2^3, 2^4, etc.
- Bits are numbered (right-to-left) starting at 0.
- Place value depends on the number of "bits" defined for the type.
- Example: a 16-bit integer might be
  [Diagram: 16 switches labeled with bit numbers 15 down to 0; bits 7, 5, 4, and 1 are "on" (shown in red)]
  ... transcribed by a human as 0000000010110010
- To convert to its familiar decimal representation, just add up the place values of the places that are "on".
Converting binary to decimal

Place:   2^15  2^14  2^13  2^12  2^11  2^10  2^9  2^8  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
Value:  32768 16384  8192  4096  2048  1024  512  256  128   64   32   16    8    4    2    1
Bit:        0     0     0     0     0     0    0    0    1    0    1    1    0    0    1    0

in decimal form: 128 + 32 + 16 + 2 = 178
- How many different codes (integers) can be represented using 16 bits?
- What is the largest (unsigned) integer that can be represented using 16 bits?
- What is the largest (unsigned) integer that can be represented using 32 bits?
- Prove that for an n-bit representation, the number of codes is 2^n, the largest unsigned integer is 2^n - 1, and the largest signed integer is 2^(n-1) - 1.
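A quick numeric check of those questions (relying on Python's arbitrary-precision integers, so none of these values overflow):

```python
n = 16
print(2**n)             # 65536 distinct codes with 16 bits
print(2**n - 1)         # 65535, largest unsigned 16-bit integer
print(2**32 - 1)        # 4294967295, largest unsigned 32-bit integer
print(2**(n - 1) - 1)   # 32767, largest signed 16-bit integer (twos complement)
print(int("0000000010110010", 2))   # 178, the worked example above
```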
Converting decimal to binary
- Method 1: Removing largest powers of 2
- Method 2: Successive division by 2
Converting decimal to binary
- Example: 157
- Method 1: Removing largest powers of 2
    157 - 128 = 29
     29 -  16 = 13
     13 -   8 =  5
      5 -   4 =  1
      1 -   1 =  0
    result: 10011101
- Method 2: Successive division by 2
    157 ÷ 2 = 78 R 1
     78 ÷ 2 = 39 R 0
     39 ÷ 2 = 19 R 1
     19 ÷ 2 =  9 R 1
      9 ÷ 2 =  4 R 1
      4 ÷ 2 =  2 R 0
      2 ÷ 2 =  1 R 0
      1 ÷ 2 =  0 R 1
    result (remainders read bottom to top): 10011101
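Both methods, sketched as small functions; the function names are ours, chosen for this example, and Method 1 is limited to 8-bit values to match the slide:

```python
def to_binary_by_powers(value):
    """Method 1: repeatedly remove the largest power of 2 (8-bit sketch)."""
    bits = ["0"] * 8
    for position in range(7, -1, -1):        # powers 128 down to 1
        if value >= 2 ** position:
            bits[7 - position] = "1"
            value -= 2 ** position
    return "".join(bits)

def to_binary_by_division(value):
    """Method 2: successive division by 2, reading remainders bottom-up."""
    digits = ""
    while value > 0:
        digits = str(value % 2) + digits     # remainder becomes the next digit to the left
        value //= 2
    return digits or "0"

print(to_binary_by_powers(157))    # 10011101
print(to_binary_by_division(157))  # 10011101
```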
Numeric representation
- We will show (later) exactly how an electrical operation can be performed on two electrical numeric representations to give an electrical result that is consistent with the rules of arithmetic.

... but not quite consistent ...
- Since the number of gates in each group (byte, word, etc.) is finite, computers can represent numbers with finite precision only.
- Example:
  - Suppose that unsigned integer data is represented using 16 gates. Then the largest integer that can be represented is 65535. What happens if we add 1?
- If necessary, representations are truncated; overflow/underflow can occur, and the Status Register will be set.
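A small illustration of that finite width, masking every result to 16 bits (the mask below is just shorthand for "16 gates"):

```python
MASK_16 = 0xFFFF                 # 16 gates can hold only 16 bits
largest = 0xFFFF                 # 65535, the largest unsigned 16-bit value
wrapped = (largest + 1) & MASK_16
print(wrapped)                   # 0 -- the true sum 65536 needs a 17th bit, so it is truncated
```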
Representing negative integers
- Must specify size!
  - Specify n: the number of bits (8, 16, 32, etc.)
  - There are 2^n possible "codes"
- Separate the "codes" so that half of them represent negative numbers.
  - (Note that exactly half of the codes have 1 in the "leftmost" bit.)
Binary form of negative numbers
- Several methods, each with disadvantages.
- We will focus on twos-complement form.
- For a negative number x:
  - specify the number of bits
  - start with the binary representation of |x|
  - change every bit to its opposite, then add 1 to the result.
Binary form of negative numbers
- Example: -13 in 16-bit twos-complement
  - |-13| = 13 = 0000000000001101
  - the ones-complement is 1111111111110010
  - add 1 to get 1111111111110011 = -13
- Note that -(-13) should give 13. Try it.
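The same steps in code, keeping every intermediate value to 16 bits (a sketch using Python's bitwise operators):

```python
def twos_complement_16(x):
    """16-bit twos complement: flip every bit, add 1, keep only 16 bits."""
    return (~x + 1) & 0xFFFF

neg_13 = twos_complement_16(13)
print(format(neg_13, "016b"))                      # 1111111111110011
print(format(twos_complement_16(neg_13), "016b"))  # 0000000000001101 -- so -(-13) gives 13 back
```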

Hexadecimal representation?
- Convert binary to hex in the usual way
  - -13 = 1111111111110011 = FFF3h = 0xfff3
- Note: for a byte, word, etc., if the first hex digit is greater than or equal to 8, the value is negative.
- Convert negative binary to decimal?
  - Find the twos complement, convert, and prepend a minus sign.
Signed numbers using 4-bit twos-complement form

Decimal:   -8   -7   -6   -5   -4   -3   -2   -1    0    1    2    3    4    5    6    7
Binary:  1000 1001 1010 1011 1100 1101 1110 1111 0000 0001 0010 0011 0100 0101 0110 0111

- Notice that all of the negative numbers have 1 in the leftmost bit, and all of the non-negative numbers have 0 in the leftmost bit.
  - For this reason, the leftmost bit is called the sign bit.
- Note: nobody uses 4-bit representations ("nibble"), but there's not enough room to show 8-bit representations here.
  - You can extend this diagram to 8-bit, 16-bit, etc.
n-bit twos-complement form
- The 2^n possible codes give
  - one code for zero
  - 2^(n-1) - 1 positive numbers
  - 2^(n-1) negative numbers
- Note: zero is its own complement.
- Note: there is one "weird" number
  - 01111111 + 1 = 10000000
  - 127 + 1 = -128 (inconsistent with the rules of arithmetic)
  - 127 is the largest signed number that can be represented in 8 bits. This means that -(-128) cannot be represented with 8 bits.
    - i.e., the 2's-complement of 10000000 is 10000000
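The "weird" number can be checked the same way at 8 bits:

```python
def twos_complement_8(x):
    """8-bit twos complement: flip every bit, add 1, keep only 8 bits."""
    return (~x + 1) & 0xFF

print(format(twos_complement_8(0b10000000), "08b"))  # 10000000 -- -(-128) does not fit in 8 bits
print(format(twos_complement_8(0), "08b"))           # 00000000 -- zero is its own complement
```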
Signed or Unsigned?
- A 16-bit representation could be used for signed or unsigned numbers
  - 16-bit signed range is -32768 .. +32767
  - 16-bit unsigned range is 0 .. 65535
  - Both forms use the same 2^16 codes
- Example:
  - 1010101010101010 unsigned is 43690 decimal
  - 1010101010101010 signed is -21846
- The programmer must tell the computer which form is being used.
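One way to see both readings of the same 16-bit code; the sign test below is the standard twos-complement rule (check the leftmost bit), not anything specific to these slides:

```python
code = 0b1010101010101010
unsigned_value = code                                       # 43690
signed_value = code - 0x10000 if code & 0x8000 else code    # -21846 when the sign bit is set
print(unsigned_value, signed_value)
```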
Other representations
- Every integer number has a unique representation in each "base" ≥ 2.
- Hexadecimal is commonly used for easily converting binary to a more manageable form.
- Example, 16-bit binary to hexadecimal:

  Binary:       0001 0111 1011 1101
  Hexadecimal:     1    7    B    D

  Write it as 0x17BD or 17BDh
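Grouping the bits four at a time is exactly what the built-in conversion does; a quick check:

```python
code = 0b0001011110111101          # the 16-bit example above
print(format(code, "04X"))         # 17BD
print(hex(code))                   # 0x17bd
# by hand: 0001 0111 1011 1101  ->  1 7 B D
```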
Questions?
Read Irvine Chapter 17.1