Ch 4 - Personal.kent.edu

CS 35101 Computer
Architecture
Section 600
Dr. Angela Guercio
Fall 2010
The Microarchitecture Level
• The level above the digital logic level is the
microarchitecture level.
 Its job is to implement the ISA (Instruction Set
Architecture) level above it.
 The design of the microarchitecture level
depends on the ISA being implemented, as well
as the cost and performance goals of the
computer.
 Many modern ISAs (particularly RISC) have
simple instructions that execute in a single cycle.
The Microarchitecture Level
 More complex ISAs (such as the Pentium 4)
require many cycles per instruction.
 Executing the instruction may require:
• locating the operands in memory
• reading them
• storing the results back into memory
 The sequencing of operations within a single
instruction often leads to a different approach to
control than for simple ISAs.
An Example
Microarchitecture
 We will now show an example
microarchitecture level based on a subset of the
Java Virtual Machine ISA.
 The subset contains only integer operations and
is called the IJVM.
 IJVM has some relatively complex instructions.
• Many such architectures have been implemented
through microprogramming.
• The microarchitecture will contain a microprogram
(in ROM), whose job is to fetch, decode, and
execute IJVM instructions.
An Example
Microarchitecture
 We can think of the design of the
microarchitecture as a programming problem,
where each instruction of the ISA level is a
function to be called by a master program.
 In this model, the master program is a simple,
endless loop that determines a function to be
invoked, calls the function, then starts over.
• The microprogram has a set of variables called the
state of the computer, that can be accessed by all the
functions.
An Example
Microarchitecture
• Each function changes at least some of the variables
making up the state.
• For example, the Program Counter (PC) is part of
the state. It indicates the memory location
containing the next function to be executed.
• During the execution of each instruction, the PC is
advanced to point to the next instruction to be
executed.
• Each IJVM instruction has a few fields, usually one
or two. The first field of every instruction is the
opcode. Many instructions have an additional field,
which specifies the operand.
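The master-program model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the microprogram itself: the opcode values for BIPUSH and IADD are assumed to follow the standard JVM encoding, HALT is a hypothetical opcode added so the sketch terminates, and the function and variable names are invented for this example.

```python
# Sketch of the "master program" model: an endless loop that fetches an
# opcode, picks the matching function, calls it, and starts over.
# HALT (0xFF) is a hypothetical opcode added only so this sketch terminates.

BIPUSH, IADD, HALT = 0x10, 0x60, 0xFF  # BIPUSH/IADD values assumed from the JVM encoding


def bipush(state, program):
    # One-byte operand follows the opcode; PC already points at it.
    # The operand is treated as unsigned here for brevity.
    state["stack"].append(program[state["pc"]])
    state["pc"] += 1


def iadd(state, program):
    # No operand field: pop two words and push their sum.
    b = state["stack"].pop()
    a = state["stack"].pop()
    state["stack"].append(a + b)


HANDLERS = {BIPUSH: bipush, IADD: iadd}


def run(program):
    state = {"pc": 0, "stack": []}        # the shared "state of the computer"
    while True:                           # the master loop
        opcode = program[state["pc"]]     # first field of every instruction
        state["pc"] += 1                  # advance PC past the opcode
        if opcode == HALT:
            return state
        HANDLERS[opcode](state, program)  # call the function for this opcode


final = run(bytes([BIPUSH, 2, BIPUSH, 3, IADD, HALT]))
print(final["stack"])  # [5]
```

Each handler reads and updates the shared state, which mirrors how every function in the model changes some of the variables that make up the state of the computer.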
The Data Path
 The data path is that part of the CPU containing
the ALU, its inputs, and its outputs.
 The registers at the microarchitecture level are
accessible only at that level.
 Most registers can drive their contents onto the
B bus.
 The output of the ALU drives the shifter and
the C bus, whose value can be written into one
or more registers at the same time.
 There are six ALU control lines.
The Data Path
The data path of the example
microarchitecture used in this
chapter.
The Data Path
• F0 and F1 determine the ALU operation.
• ENA and ENB individually enable the inputs.
• INVA inverts the left input.
• INC adds 1 to the result.
 Not all 64 combinations of input lines do
something useful, however.
• The ALU needs two data inputs: a left input (A) and
a right input (B). Attached to the left input is a
holding register, H. Attached to the right input is the
B bus, which can be loaded from any of nine
sources.
The Data Path
Useful combinations of ALU signals and the function performed.
The Data Path
 H can be loaded initially by choosing an ALU
function that just passes the right input (from
the B bus) through to the ALU output.
• For example, add the ALU inputs, but with ENA
negated, so that the left input is forced to 0.
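The effect of the six ALU control lines can be sketched as a small Python function. The slides state only that F0 and F1 select the operation, so the encoding used here (00 = AND, 01 = OR, 10 = NOT B, 11 = ADD) is an assumption, as is treating INC as a carry-in that affects only the adder. The function name and the returned N/Z tuple are illustrative.

```python
MASK = 0xFFFFFFFF  # 32-bit data path


def alu(a, b, f0, f1, ena, enb, inva, inc):
    """Return (result, N, Z) for one ALU operation on 32-bit inputs."""
    a = (a & MASK) if ena else 0      # ENA gates the left input (H)
    b = (b & MASK) if enb else 0      # ENB gates the right input (B bus)
    if inva:
        a = ~a & MASK                 # INVA inverts the gated left input
    # Assumed F0/F1 encoding; the slides do not spell it out.
    if (f0, f1) == (0, 0):
        result = a & b
    elif (f0, f1) == (0, 1):
        result = a | b
    elif (f0, f1) == (1, 0):
        result = ~b & MASK
    else:
        result = (a + b + inc) & MASK  # INC assumed to be the adder's carry-in
    n = result >> 31                   # N: sign bit of the result
    z = int(result == 0)               # Z: result is zero
    return result, n, z


# Pass the B bus through so it can be loaded into H: add with ENA negated.
print(alu(123, 42, f0=1, f1=1, ena=0, enb=1, inva=0, inc=0))  # (42, 0, 0)
```

With the same assumed encoding, asserting INVA and INC together on an add yields B - A, since inverting A and adding 1 forms its two's complement.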
 Two other control lines can be used
independently to control the output from the
ALU.
• SLL8 (Shift Left Logical) shifts the contents left by
1 byte, filling the 8 least significant bits with 0s.
The Data Path
• SRA1 (Shift Right Arithmetic) shifts the contents
right by 1 bit, leaving the most significant bit
unchanged.
• It is explicitly possible to read and write the same
register on one cycle since reading and writing are
done at different times within the cycle.
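The two shifter controls described above, SLL8 and SRA1, can be sketched as follows. This is a minimal illustration on 32-bit values; the function name and keyword arguments are assumptions made for the example.

```python
MASK = 0xFFFFFFFF  # 32-bit data path


def shifter(value, sll8=0, sra1=0):
    """Apply at most one of the two shifter controls to a 32-bit value."""
    value &= MASK
    if sll8:
        # Shift left by one byte; the 8 least significant bits become 0.
        return (value << 8) & MASK
    if sra1:
        # Shift right by one bit; the most significant bit is left unchanged.
        return (value >> 1) | (value & 0x80000000)
    return value


print(hex(shifter(0x00ABCDEF, sll8=1)))  # 0xabcdef00
print(hex(shifter(0x80000004, sra1=1)))  # 0xc0000002
```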
 The timing of the events in a cycle is shown in
the next slide.
 A short pulse is derived from the main clock.
 The subcycles can best be thought of as being
implicit.
The Data Path
Timing diagram of one data path cycle.
Memory Operation
 The machine has two different ways to
communicate with memory:
• a 32-bit word-addressable memory port
• an 8-bit, byte addressable memory port
• The 32-bit port is controlled by two registers, MAR
(Memory Address Register) and MDR (Memory
Data Register).
• The 8-bit port is controlled by one register, PC,
which reads 1 byte into the low-order 8 bits of
MBR. The port can only read data from memory,
not write it.
Memory Operation
 Each of these registers is driven by one or two
control signals.
• An open arrow under a register indicates a control
signal that enables the register’s output onto the B
bus.
• A solid black arrow indicates a control signal that
writes the register from the C bus.
• MAR contains word addresses.
• PC contains byte addresses.
• In the actual physical implementation, there is only
one real memory and it is byte oriented.
Memory Operation
 When MAR is placed on the address bus, its 32
bits do not map onto the 32 address lines, 0-31,
directly.
 Instead MAR bit 0 is wired to bus line 2, MAR
bit 1 to 3, etc.
 The upper 2 bits of MAR are discarded since
they are needed only for word addresses above
2^32, which are illegal on our 4-GB machine.
 When MAR is 2, address 8 is put on the bus.
When MAR is 1, address 4 is put on the bus.
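The MAR wiring amounts to multiplying the word address by 4 and dropping the two bits that fall off the top. A minimal sketch, with a hypothetical function name:

```python
def mar_to_byte_address(mar):
    """Map a 32-bit word address in MAR onto the 32-bit byte address bus."""
    # MAR bit 0 drives bus line 2, bit 1 drives line 3, and so on;
    # the upper 2 bits of MAR are discarded.
    return (mar << 2) & 0xFFFFFFFF


print(mar_to_byte_address(1))  # 4
print(mar_to_byte_address(2))  # 8
```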
Memory Operation
Mapping of the bits in MAR to the address bus.
Memory Operation
 Data read from the memory through the 8-bit
memory port are returned in MBR, an 8-bit
register.
• MBR can be copied onto the B bus in one of two
ways:
 The 8-bit value can be treated as unsigned and extended
to 32 bits by filling the upper 24 bits with zeros.
 The value can be treated as a signed value between -128
and 127 and the sign bit extended to the upper 24 bits.
This is known as sign extension.
• The choice of these two options is given by the
control signals asserted.
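The two ways of placing MBR on the B bus can be sketched as follows. The function names are illustrative; the slides only say that the choice depends on which control signal is asserted.

```python
def mbr_unsigned(mbr):
    """Zero-extend the 8-bit MBR value to 32 bits."""
    return mbr & 0xFF


def mbr_signed(mbr):
    """Sign-extend the 8-bit MBR value to 32 bits (two's complement)."""
    byte = mbr & 0xFF
    # Copy the sign bit (bit 7) into the upper 24 bits.
    return (byte | 0xFFFFFF00) if (byte & 0x80) else byte


print(hex(mbr_unsigned(0xF0)))  # 0xf0
print(hex(mbr_signed(0xF0)))    # 0xfffffff0, i.e. -16 as a 32-bit value
```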
Microinstructions
 To control the data path, we need 29 signals.
These can be divided into five groups:
• 9 signals to control writing data from the C bus into
the registers.
• 9 signals to control enabling registers onto the B bus
for ALU input.
• 8 signals to control the ALU and shifter functions.
• 2 signals (not shown) to indicate memory read/write
via MAR/MDR.
• 1 signal (not shown) to indicate memory fetch via
PC/MBR.
Microinstructions
 The values of these 29 control signals specify
the operations for one cycle of the data path.
 A cycle consists of moving values from the
registers, through the CPU, and back to the
registers.
 In addition, if a memory read data signal is
asserted, the memory operation is started at the
end of the data path cycle, after MAR has been
loaded. The data will be available at the end of
the following cycle, and usable in the cycle
after that.
Microinstructions
 We are assuming that the cache hit ratio is
100%.
 The output on the C bus can be written into
more than one register, but only one register
can be enabled onto the B bus in a cycle.
• We use 4 bits to determine which register, and use a
decoder (7 of these 16 signals are not needed).
 The data path can be controlled with 9 + 4 + 8
+ 2 + 1 = 24 signals. However, we need
additional signals to determine what to do on
the next cycle.
Microinstructions
The microinstruction format for the Mic-1.
Microinstructions
 For this, we will use a 9-bit NEXT_ADDRESS
field and a 3-bit JAM field.
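The field widths add up as 9 + 3 + 8 + 9 + 3 + 4 = 36 bits. The sketch below packs and unpacks such a word. The field widths come from the slides, but the order of the fields within the word and the short field names (ALU, C, MEM, B) are assumptions made for this example.

```python
# (name, width) pairs, most significant field first. The order is assumed.
FIELDS = [
    ("NEXT_ADDRESS", 9),  # address of the next microinstruction
    ("JAM", 3),           # JMPC, JAMN, JAMZ
    ("ALU", 8),           # ALU and shifter control
    ("C", 9),             # which registers are written from the C bus
    ("MEM", 3),           # read/write via MAR/MDR, fetch via PC/MBR
    ("B", 4),             # encoded B bus source, expanded by a decoder
]
WIDTH = sum(width for _, width in FIELDS)


def pack(**values):
    """Pack field values (default 0) into one microinstruction word."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word):
    """Split a microinstruction word back into its fields."""
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


print(WIDTH)  # 36
```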
 A sequencer is responsible for stepping through
the sequence of operations necessary for the
execution of a single ISA instruction.
 The sequencer produces two kinds of
information each cycle:
• The state of every control signal in the system.
• The address of the microinstruction that is to be
executed next.
Microinstructions
The complete block
diagram of our example
microarchitecture, the
Mic-1.
Microinstruction Control
 The control store can be thought of as a
memory that holds the complete microprogram.
 It holds 512 words, each one containing a 36-bit
microinstruction.
 The words in the control store are not usually
executed in address order since
microinstruction sequences tend to be short.
 Each microinstruction specifies its successor.
 MPC is the address register and MIR the data
register for the control store.
Microinstruction Control
 At the start of each clock cycle, MIR is loaded
from the word in the control store pointed to by
MPC.
 Once MIR is loaded, the various signals
propagate out into the data path. A register is
put on the B bus, the ALU performs an
operation, etc. When the ALU, N, Z, and shifter
outputs are stable, the N and Z values are saved
in a pair of 1-bit flip-flops. Now, the registers
are loaded via the C bus which receives its
value from the shifter. Finally, MPC is loaded.
Microinstruction Control
 To determine which microinstruction to execute
next, the nine-bit NEXT_ADDRESS field is
copied to MPC. Then the JAM field is
inspected. If it is zero, nothing more needs to be
done.
 If JAMN is set, the 1-bit N flip-flop is ORed
into the high-order bit of MPC.
 If JAMZ is set, the 1-bit Z flip-flop is ORed into
the high-order bit of MPC.
 If both are set, both are ORed.
Microinstruction Control
 Thus MPC takes on the value of
NEXT_ADDRESS or of NEXT_ADDRESS
with the high-order bit ORed with 1.
 If JMPC is set, the 8 MBR bits are bitwise
ORed with the 8 low-order bits of the
NEXT_ADDRESS field (which will usually be
zero in this case).
• This allows for an efficient multiway branch (jump)
to be implemented. Typically MBR contains an
opcode, so JMPC results in the selection of the next
microinstruction to be executed for every opcode.
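The next-address rules above can be condensed into one function. This is a sketch of the rules as stated on the slides; the function name and argument order are assumptions made for the example.

```python
def next_mpc(next_address, jmpc, jamn, jamz, n, z, mbr):
    """Compute the 9-bit MPC value for the next microinstruction."""
    mpc = next_address & 0x1FF
    # JAMN/JAMZ: OR the saved N or Z flip-flop into the high-order bit.
    high = (jamn & n) | (jamz & z)
    mpc |= high << 8
    # JMPC: OR the 8 MBR bits into the 8 low-order bits.
    if jmpc:
        mpc |= mbr & 0xFF
    return mpc


# A microinstruction with JAMZ set has two potential successors.
print(hex(next_mpc(0x092, 0, 0, 1, n=0, z=0, mbr=0)))  # 0x92
print(hex(next_mpc(0x092, 0, 0, 1, n=0, z=1, mbr=0)))  # 0x192
```

With JMPC set and NEXT_ADDRESS equal to zero, the result is simply the opcode held in MBR, which gives the multiway branch described above.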
Microinstruction Control
A microinstruction with JAMZ set to 1 has two potential successors.
An Example ISA: IJVM
 The IJVM uses a stack for storing local
variables and parameters.
• The register LV points to the base of local variables
for the current procedure.
• The register SP points to the highest word of local
variables.
• The data structure between LV and SP is called the
local variable frame.
 When another procedure is called, another
frame is pushed onto the stack. This is
illustrated in the following slide.
IJVM Stacks
Use of a stack for storing local variables.
(a) While A is active. (b) After A calls B.
(c) After B calls C. (d) After C and B return and A calls D.
IJVM Stacks
 Stacks have another use, in addition to holding
local variables. They can be used for holding
operands during the computation of an
arithmetic expression.
 When used this way, the stack is referred to as
the operand stack.
 The operation of the stack during the execution
of the computation a1 = a2 + a3 is shown in the
next slide.
IJVM Stacks
Use of an operand stack for doing an arithmetic computation.
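The computation a1 = a2 + a3 can be traced with a plain Python list standing in for the operand stack. This is an illustrative sketch: the variables are looked up by name for readability, whereas IJVM refers to local variables by number, and the function name is hypothetical.

```python
def add_locals(local_vars):
    """Evaluate a1 = a2 + a3 on an operand stack; return the stack after each step."""
    stack = []
    trace = []

    stack.append(local_vars["a2"])   # ILOAD a2: push a2
    trace.append(list(stack))

    stack.append(local_vars["a3"])   # ILOAD a3: push a3
    trace.append(list(stack))

    right = stack.pop()              # IADD: pop two words, push their sum
    left = stack.pop()
    stack.append(left + right)
    trace.append(list(stack))

    local_vars["a1"] = stack.pop()   # ISTORE a1: pop the result into a1
    trace.append(list(stack))
    return trace


variables = {"a1": 0, "a2": 4, "a3": 5}
print(add_locals(variables))  # [[4], [4, 5], [9], []]
print(variables["a1"])        # 9
```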
The IJVM Memory Model
 The IJVM memory can be seen as an array of
4,294,967,296 bytes (4 GB) or an array of
1,073,741,824 words.
 The following areas of memory are defined:
• The Constant Pool is a non-writable area consisting
of constants, strings, and pointers to other areas of
memory.
• The Local Variable Frame is the stack used for
methods.
• The Operand Stack.
• The Method Area contains the code for methods.
The IJVM Memory Model
The various parts of the IJVM memory.
IJVM Instruction Set
The IJVM instruction set. The operands byte, const, and varnum
are 1 byte. The operands disp, index, and offset are 2 bytes.
IJVM Instruction Set
 The INVOKEVIRTUAL function invokes
another method. IRETURN returns to the
calling method.
 The INVOKEVIRTUAL described here works
only on methods within its own object (the
method is not determined dynamically).
 This is not the way Java does it, but is similar to
C or Pascal. This is done for simplicity.
• A pointer to the method (OBJREF) and the
parameters are pushed on the stack prior to invoking
the method.
IJVM Instruction Set
(a) Memory before executing INVOKEVIRTUAL.
(b) After executing it.
IJVM Instruction Set
(a) Memory before executing IRETURN.
(b) After executing it.
Example
(a) A Java fragment.
(b) The corresponding Java assembly language.
(c) The IJVM program in hexadecimal.
Example
The stack after each instruction of Fig. 4-14(b).