Instruction Set Architecture Y86 Processor State Encoding Registers

advertisement
Lecture 7A
Y86 Processor State
Computer Architecture I
Program registers
Instruction Set Architecture
%eax
%ecx
%edx
%ebx
Assembly Language View
Application
Program
Processor state
Registers, memory, …
Compiler
OS
PC
Single-bit flags set by arithmetic or logical instructions
CPU
Design
Layer of Abstraction
Above: how to program machine
» OF: Overflow
sequence
Use variety of tricks to make it run fast
Indicates address of instruction
Byte-addressable storage array
Words stored in little-endian byte order
E.g., execute multiple instructions
Datorarkitektur 2007
simultaneously
Y86 Instructions
Format
1--6 bytes of information read from memory
» Can determine instruction length from first byte
» Not as many instruction types, and simpler encoding than
with IA32
Each accesses and modifies some part(s) of the program state
Datorarkitektur 2007
F7A – 2 –
Instruction Example
Addition Instruction
Generic Form
Encoded Representation
addl rA, rB
Encoding Registers
6 0 rA rB
Add value in register rA to that in register rB
Store result in register rB
Each register has 4-bit ID
%esi
%edi
%esp
%ebp
SF:Negative
Memory
Chip
Layout
Below: what needs to be built
ZF: Zero
Program Counter
Circuit
Design
Processor executes instructions in a
Note that Y86 only allows addition to be applied to register data
6
7
4
5
Set condition codes based on result
e.g., addl %eax,%esi Encoding: 60 06
Two-byte encoding
Same encoding as in IA32
First indicates instruction type
Register ID 8 indicates “no register”
Second gives source and destination registers
Will use this in our hardware design in multiple places
F7A – 3 –
OF ZF SF
Condition Codes
ISA
How instructions are encoded as bytes
0
1
2
3
Memory
Same 8 as with IA32. Each 32 bits
addl, movl, andl, …
%eax
%ecx
%edx
%ebx
Condition
codes
Program Registers
Instructions
F7A – 1 –
%esi
%edi
%esp
%ebp
Datorarkitektur 2007
F7A – 4 –
Datorarkitektur 2007
Arithmetic and Logical Operations
Instruction Code
Add
addl rA, rB
Function Code
6 0 rA rB
Refer to generically as “OPl”
Subtract (rA from rB)
Low-order 4 bytes in first
6 1 rA rB
2 0 rA rB
Register --> Register
irmovl V, rB
3 0 8 rB
V
Immediate --> Register
rmmovl rA, D(rB)
4 0 rA rB
D
Register --> Memory
mrmovl D(rB), rA
5 0 rA rB
D
Memory --> Register
instruction word
Set condition codes as side
And
andl rA, rB
rrmovl rA, rB
Encodings differ only by
“function code”
subl rA, rB
Move Operations
effect
6 2 rA rB
Exclusive-Or
xorl rA, rB
Like the IA32 movl instruction
6 3 rA rB
Simpler format for memory addresses
Give different names to keep them distinct
Datorarkitektur 2007
F7A – 5 –
Move Instruction Examples
Datorarkitektur 2007
F7A – 6 –
Jump Instructions
Jump Unconditionally
Little endian
IA32
Y86
Encoding
movl $0xabcd, %edx
irmovl $0xabcd, %edx
30 82 cd ab 00 00
jmp Dest
7 0
Dest
Jump When Less or Equal
jle Dest
7 1
Encodings differ only by
Dest
Jump When Less
movl %esp, %ebx
rrmovl %esp, %ebx
20 43
movl -12(%ebp),%ecx
mrmovl -12(%ebp),%ecx
50 15 f4 ff ff ff
movl %esi,0x41c(%esp)
rmmovl %esi,0x41c(%esp)
40 64 1c 04 00 00
jl Dest
7 2
Dest
7 3
movl $0xabcd, (%eax)
—
Jump When Not Equal
7 4
jne Dest
movl %eax, 12(%eax,%edx)
—
Jump When Greater or Equal
movl (%ebp,%eax,4),%ecx
—
jge Dest
7 5
“function code”
Based on values of condition
codes
Jump When Equal
je Dest
Refer to generically as “jXX”
Dest
Same as IA32 counterparts
Encode full destination
Dest
address
Unlike PC-relative addressing
Dest
seen in IA32
Jump When Greater
jg Dest
F7A – 7 –
Datorarkitektur 2007
F7A – 8 –
7 6
Dest
Datorarkitektur 2007
Y86 Program Stack
Stack “Bottom”
Stack Operations
Region of memory holding program
pushl rA
a 0 rA 8
data
•
Increasing
•
Addresses
•
Used in Y86 (and IA32) for
Decrement %esp by 4
supporting procedure calls
Store word from rA to memory at %esp
Stack top indicated by %esp
Like IA32
Address of top stack element
Stack grows toward lower
addresses
popl rA
Top element is at highest address in
%esp
Stack “Top”
the stack
Read word from memory at %esp
Save in rA
stack pointer
Increment %esp by 4
pointer
Like IA32
Datorarkitektur 2007
Subroutine Call and Return
call Dest
8 0
0 rA 8
When pushing, must first decrement
When popping, increment stack
F7A – 9 –
b
Datorarkitektur 2007
F7A – 10 –
Miscellaneous Instructions
nop
Dest
Push address of next instruction onto stack
0 0
Don’t do anything
Start executing instructions at Dest
Like IA32
ret
halt
9
1 0
Stop executing instructions
0
IA32 has comparable instruction, but can’t execute it in user
Pop value from stack
mode
Use as address for next instruction
We will use it to stop the simulator
Like IA32
F7A – 11 –
Datorarkitektur 2007
F7A – 12 –
Datorarkitektur 2007
Writing Y86 Code
Y86 Code Generation Example
Try to Use C Compiler as Much as Possible
First Try
Write code in C
/* Find number of elements in
null-terminated list */
int len1(int a[])
{
int len;
for (len = 0; a[len]; len++)
;
return len;
}
Transliterate into Y86
Coding Example
Find number of elements in null-terminated list
int len1(int a[]);
5043
6125
7395
Hard to do array indexing on
Y86
Compile for IA32 with gcc -S
a
Problem
Write typical array code
⇒ 3
Since don’t have scaled
addressing modes
Similar to SPARC
loop:
incl %eax
entry:
cmpl $0,(%edx,%eax,4)
jne L18
Compile with gcc -O2 -S
0
Datorarkitektur 2007
F7A – 13 –
Datorarkitektur 2007
F7A – 14 –
Y86 Code Generation Example #2
Y86 Code Generation Example #3
Second Try
IA32 Code
Write with pointer code
Result
Don’t need to do indexed
addressing
/* Find number of elements in
null-terminated list */
int len2(int a[])
{
int len = 0;
while (*a++)
len++;
return len;
}
Setup
len2:
pushl %ebp
xorl %ecx,%ecx
movl %esp,%ebp
movl 8(%ebp),%edx
movl (%edx),%eax
jmp entry
loop:
movl (%edx),%eax
incl %ecx
entry:
addl $4,%edx
testl %eax,%eax
jne loop
Y86 Code
Setup
len2:
pushl %ebp
#
xorl %ecx,%ecx
#
rrmovl %esp,%ebp
#
mrmovl 8(%ebp),%edx #
mrmovl (%edx),%eax #
jmp entry
#
Save %ebp
len = 0
Set frame
Get a
Get *a
Goto entry
Compile with gcc -O2 -S
F7A – 15 –
Datorarkitektur 2007
F7A – 16 –
Datorarkitektur 2007
Y86 Program Structure
Y86 Code Generation Example #4
IA32 Code
Y86 Code
Loop + Finish
Loop + Finish
loop:
movl (%edx),%eax
loop:
mrmovl (%edx),%eax
irmovl $1,%esi
addl %esi,%ecx
entry:
irmovl $4,%esi
addl %esi,%edx
andl %eax,%eax
jne loop
rrmovl %ebp,%esp
rrmovl %ecx,%eax
popl %ebp
ret
incl %ecx
entry:
addl $4,%edx
testl %eax,%eax
jne loop
movl %ebp,%esp
movl %ecx,%eax
popl %ebp
ret
# Get *a
# len++
#
#
#
#
#
a++
*a == 0?
No--Loop
Pop
Rtn len
Assembling Y86 Program
F7A – 19 –
address 0
Must set up stack
Make sure don’t
overwrite code!
Must initialize data
Can use symbolic
names
Datorarkitektur 2007
F7A – 18 –
unix> yis eg.yo
Instruction set simulator
Actually looks like disassembler output
b3130000
ed170000
e31c0000
00000000
Program starts at
# List of elements
# Function
len2:
. . .
Generates “object code” file eg.yo
308400010000
2045
308218000000
a028
8028000000
10
# Push argument
# Call Function
# Halt
Simulating Y86 Program
unix> yas eg.ys
0x000:
0x006:
0x008:
0x00e:
0x010:
0x015:
0x018:
0x018:
0x018:
0x01c:
0x020:
0x024:
# Set up stack
# Set up frame
# Allocate space for stack
.pos 0x100
Stack:
Datorarkitektur 2007
F7A – 17 –
irmovl Stack,%esp
rrmovl %esp,%ebp
irmovl List,%edx
pushl %edx
call len2
halt
.align 4
List:
.long 5043
.long 6125
.long 7395
.long 0
|
|
|
|
|
|
|
|
|
|
|
|
irmovl Stack,%esp
rrmovl %esp,%ebp
irmovl List,%edx
pushl %edx
call len2
halt
.align 4
List:
.long 5043
.long 6125
.long 7395
.long 0
Computes effect of each instruction on processor state
Prints changes in state from original
# Set up stack
# Set up frame
Stopped in 41 steps at PC = 0x16. Exception 'HLT', CC Z=1 S=0 O=0
Changes to registers:
%eax:
0x00000000
0x00000003
%ecx:
0x00000000
0x00000003
%edx:
0x00000000
0x00000028
%esp:
0x00000000
0x000000fc
%ebp:
0x00000000
0x00000100
%esi:
0x00000000
0x00000004
# Push argument
# Call Function
# Halt
# List of elements
Changes to memory:
0x00f4:
0x00f8:
0x00fc:
Datorarkitektur 2007
F7A – 20 –
0x00000000
0x00000000
0x00000000
0x00000100
0x00000015
0x00000018
Datorarkitektur 2007
RISC Instruction Sets
CISC Instruction Sets
Reduced Instruction Set Computer
Complex Instruction Set Computer
Internal project at IBM, later popularized by Hennessy (Stanford)
Dominant style through mid-80’s
and Patterson (Berkeley)
Stack-oriented instruction set
Fewer, simpler instructions
Use stack to pass arguments, save program counter
Might take more to get given task done
Explicit push and pop instructions
Can execute them with small and fast hardware
Arithmetic instructions can access memory
Register-oriented instruction set
addl %eax, 12(%ebx,%ecx,4)
Many more (typically 32) registers
requires memory read and write
Use for arguments, return pointer, temporaries
Complex address calculation
Only load and store instructions can access memory
Condition codes
Similar to Y86 mrmovl and rmmovl
Set as side effect of arithmetic and logical instructions
No Condition codes
Philosophy
Test instructions return 0/1 in register
Add instructions to perform “typical” programming tasks
SPARC has condition codes
Datorarkitektur 2007
F7A – 21 –
Datorarkitektur 2007
F7A – 22 –
CISC vs. RISC
Summary
Original Debate
Y86 Instruction Set Architecture
Similar state and instructions as IA32
Strong opinions!
Simpler encodings
CISC proponents---easy for compiler, fewer code bytes
Somewhere between CISC and RISC
RISC proponents---better for optimizing compilers, can make run
How Important is ISA Design?
fast with simple chip design
Less now than before
Current Status
With enough hardware, can make almost anything go fast
For desktop processors, choice of ISA not a technical issue
Intel is moving away from IA32
With enough hardware, can make anything run fast
Does not allow enough parallel execution
Code compatibility more important
Introduced IA64
» 64-bit word sizes (overcome address space limitations)
» Radically different style of instruction set with explicit parallelism
» Requires sophisticated compilers
For embedded processors, RISC makes sense
Smaller, cheaper, less power
F7A – 23 –
Datorarkitektur 2007
F7A – 24 –
Datorarkitektur 2007
Download