Complex Instruction Set Computer

Manish Kulkarni
Department of Electrical and Computer Engineering
Auburn University, Auburn, AL 36849
mmk0002@auburn.edu
4/28/2008
Computer Architecture & Design (6200)
Class Presentation
Overview
• What is CISC and why learn about it?
• History
• Architecture
• Typical x86 design
• Characteristics & Addressing modes
• CISC vs. RISC
• Example Programs
• The Performance Equation
• FAQs
• Recent Developments & Future Scope
• Resources
• Questions
What is CISC?
• Definition: CISC (pronounced "sisk") stands for Complex
Instruction Set Computer. It is a microprocessor architecture
that aims to perform complex operations with single
instructions, favoring the richness of the instruction set
(typically as many as 200 unique instructions) over the speed
with which individual instructions are executed.
Why should I know about CISC?
• Today's computers still use processors that are based on CISC
designs.
• It has been a prominent architecture since 1978.
• Most emerging processor designs combine features of CISC
and RISC to create better designs.
History
Generation | First introduced | Prominent consumer CPU brands | Linear / physical address space | Notable (new) features
-----------------------------------------------------------------------------------------------------------------------
1 (IA-16) | 1978 | Intel 8086, Intel 8088 | 16-bit / 20-bit (segmented) | first x86 microprocessors
2 | 1982 | Intel 80186, Intel 80188, NEC V20 | see above | hardware for fast address calculations, fast mul/div etc.
2 | 1982 | Intel 80286 | 16-bit (30-bit virtual) / 24-bit (segmented) | MMU, for protected mode and a larger address space
3 (IA-32) | 1985 | Intel386, AMD Am386 | 32-bit (46-bit virtual) / 32-bit | 32-bit instruction set, MMU with paging
4 | 1989 | Intel486 | see above | RISC-like pipelining, integrated FPU, on-chip cache
5 | 1993 | Pentium, Pentium MMX | see above | superscalar, 64-bit databus, faster FPU, MMX
5/6 | 1996 | Cyrix 6x86, Cyrix MII | see above | register renaming, speculative execution
6 | 1995 | Pentium Pro, AMD K5 | see above / 36-bit physical (PAE) | μ-op translation, PAE (not K5), integrated L2 cache (not K5)
History (continued)
Generation | First introduced | Prominent consumer CPU brands | Linear / physical address space | Notable (new) features
-----------------------------------------------------------------------------------------------------------------------
6 | 1997 | AMD K6/-2/3, Pentium II/III | see above | L3-cache support, 3D Now, SSE
7 | 1999 | Athlon, Athlon XP | see above | superscalar FPU, wide design (up to three x86 instr./clock)
7 | 2000 | Pentium 4 | see above | deeply pipelined, high frequency, SSE2, hyper-threading
6/7-M | 2003 | Pentium M | see above | optimized for low power
8 (x86-64) | 2003 | Athlon 64 | 64-bit / 40-bit physical in first impl. | x86-64 instruction set, on-die memory controller
8 | 2004 | Prescott | see above | very deeply pipelined, very high frequency, SSE3
9 | 2006 | Intel Core, Intel Core 2 | see above (some are 32-bit only) | low power, multi-core, lower clock frequency
10 | 2007-2008 | AMD Phenom | see above | monolithic quad-core, 128-bit FPUs, SSE4a, HyperTransport 3, native memory controller, on-die L3 cache
Architecture
A typical x86 architecture
[Figure: Intel 8086 architecture, the first member of the x86 family]
Characteristics
o CISC machines are mostly von Neumann architectures (there are a few exceptions).
o The same bus serves program memory, data memory, I/O, registers, etc.
o Generally microcoded, with variable-length instructions.
o Segmentation is possible with segment registers such as DS and ES plus an offset,
which can be common to all segments.
o Many powerful instructions are supported, making the assembly language
programmer's job much easier.
o Physical memory extension is possible.
Addressing modes
o Register Addressing Mode
o Memory Addressing Modes
o Displacement Only Addressing Mode
o Register Indirect Addressing Modes
o Indexed Addressing Modes
o Based Indexed Addressing Modes
o Based Indexed Plus Displacement Addressing
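As a rough illustration of the memory addressing modes listed above, the sketch below computes 8086-style offsets and physical addresses in Python. The register contents and the helper names (effective_address, physical_address) are hypothetical and chosen only for this example; the 16-bit offset wrap and the "segment times 16 plus offset" rule follow the 16-bit / 20-bit segmented scheme of the 8086.

```python
# Sketch of 8086-style address formation.
# Helper names and register values are made up for illustration.

def effective_address(base=0, index=0, displacement=0):
    """Offset within a segment: base + index + displacement, wrapped to 16 bits."""
    return (base + index + displacement) & 0xFFFF


def physical_address(segment, offset):
    """20-bit physical address: segment shifted left by 4 bits, plus the offset."""
    return ((segment << 4) + offset) & 0xFFFFF


# Example register contents (assumed values, not taken from the slides).
regs = {"BX": 0x1000, "SI": 0x0020, "DS": 0x2000}

modes = {
    "displacement only": effective_address(displacement=0x0100),
    "register indirect": effective_address(base=regs["BX"]),
    "indexed": effective_address(index=regs["SI"], displacement=0x0100),
    "based indexed": effective_address(base=regs["BX"], index=regs["SI"]),
    "based indexed plus displacement": effective_address(
        base=regs["BX"], index=regs["SI"], displacement=0x0100
    ),
}

for name, offset in modes.items():
    address = physical_address(regs["DS"], offset)
    print(f"{name:32s} offset={offset:#06x} physical={address:#07x}")
```

Register addressing is left out of the sketch because it involves no memory address at all: the operand is the register itself.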
CISC Vs RISC
Example Program
[Figure: main memory, general purpose registers, and ALU]
Consider the following multiplication task.
Operands:
M[2:3] = operand 1 (15)
M[5:2] = operand 2 (20)
Task: Multiplication
Result:
M[2:3] <= result
The CISC Approach
• Instruction:
    MULT 2:3, 5:2
• Operations:
  1. Loads the two operands into separate registers
  2. Multiplies the operands in the execution unit
  3. Stores the product in a temporary register
  4. Stores the value back to memory location 2:3
• MULT is what is known as a "complex instruction."
• It operates directly on the computer's memory banks.
• It does not require the programmer to explicitly call any loading or storing
functions.
• It closely resembles a command in a higher-level language,
e.g. the C statement "a = a * b."
The RISC Approach
• Instructions:
    LW   A, 2:3
    LW   B, 5:2
    MULT A, B
    SW   2:3, A
• Operations:
  1. Load operand 1 into register A
  2. Load operand 2 into register B
  3. Multiply the operands in the execution unit and store the result in A
  4. Store the value of A back to memory location 2:3
• This set of instructions is known as "reduced instructions."
• They cannot operate directly on the computer's memory banks.
• They require the programmer to explicitly call loading and storing functions.
• RISC processors only use simple instructions that can be executed within one
clock cycle.
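To make the contrast between the two approaches concrete, here is a minimal sketch of a toy machine in Python that runs both versions of the multiplication task. The run function, the tuple encoding of instructions, and the use of strings such as "2:3" as memory keys are assumptions made for this illustration; it does not model any real ISA or its timing.

```python
# Toy machine for the multiplication example; not a model of any real processor.

def run(program, memory):
    """Execute (opcode, destination, source) tuples; return the instruction count."""
    regs = {}
    for op, dst, src in program:
        if op == "LW":            # load: register <- memory
            regs[dst] = memory[src]
        elif op == "SW":          # store: memory <- register
            memory[dst] = regs[src]
        elif op == "MULT":
            if dst in memory:     # CISC style: memory-to-memory multiply
                memory[dst] = memory[dst] * memory[src]
            else:                 # RISC style: register-to-register multiply
                regs[dst] = regs[dst] * regs[src]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return len(program)


cisc_program = [("MULT", "2:3", "5:2")]
risc_program = [
    ("LW", "A", "2:3"),
    ("LW", "B", "5:2"),
    ("MULT", "A", "B"),
    ("SW", "2:3", "A"),
]

cisc_memory = {"2:3": 15, "5:2": 20}
risc_memory = {"2:3": 15, "5:2": 20}
cisc_count = run(cisc_program, cisc_memory)
risc_count = run(risc_program, risc_memory)

print("CISC:", cisc_count, "instruction(s), M[2:3] =", cisc_memory["2:3"])
print("RISC:", risc_count, "instruction(s), M[2:3] =", risc_memory["2:3"])
```

Both programs leave the same product in M[2:3]; the difference is only in how many instructions the programmer (or compiler) has to write.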
CISC | RISC
-----------------------------------------------------------------------------------------------------------------------
Primary goal is to complete a task in as few lines of assembly as possible | Primary goal is to speed up individual instructions
Emphasis on hardware | Emphasis on software
Includes multi-clock complex instructions | Single-clock, reduced instructions only
Memory-to-memory: "LOAD" and "STORE" incorporated in instructions | Register-to-register: "LOAD" and "STORE" are independent instructions
Small code sizes | Large code sizes
More cycles per instruction | Fewer cycles per instruction
Variable-length instructions | Equal-length instructions, which make pipelining possible
The Performance Equation
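The following equation is commonly used to express a computer's
performance:

\[
\frac{\text{time}}{\text{program}} =
\frac{\text{time}}{\text{cycle}} \times
\underbrace{\frac{\text{cycles}}{\text{instruction}}}_{(1)} \times
\underbrace{\frac{\text{instructions}}{\text{program}}}_{(2)}
\]

The CISC approach
• minimizes the number of instructions per program (2)
• at the cost of more cycles per instruction (1)
RISC does the opposite
• reduces the cycles per instruction (1)
• at the cost of more instructions per program (2)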
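The trade-off can be made concrete with a small Python calculation. Everything numeric here is invented for illustration: the 100 MHz clock, the assumption that the single complex instruction takes 4 cycles, and the assumption that each of the 4 simple instructions takes 1 cycle. The numbers are deliberately chosen so that the two products tie, to show that each approach trades one factor of the equation against the other.

```python
# Time per program = instructions/program * cycles/instruction * time/cycle.
# All numbers below are hypothetical and chosen only to illustrate the trade-off.

def time_per_program(instructions, cycles_per_instruction, seconds_per_cycle):
    """Evaluate the performance equation for one program."""
    return instructions * cycles_per_instruction * seconds_per_cycle


clock_hz = 100e6                  # assumed clock rate
seconds_per_cycle = 1 / clock_hz

# CISC style: one complex instruction, assumed to take 4 cycles.
cisc_time = time_per_program(1, 4, seconds_per_cycle)

# RISC style: four simple instructions, assumed to take 1 cycle each.
risc_time = time_per_program(4, 1, seconds_per_cycle)

print(f"CISC: {cisc_time * 1e9:.1f} ns")
print(f"RISC: {risc_time * 1e9:.1f} ns")
```

With other cycle counts or clock rates either side can come out ahead, which is why the equation, and not the instruction count alone, is the useful comparison.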
FAQs
Which one is faster?
It is commonly accepted that RISC ISAs should make computers faster.
The main reason is that, thanks to pipelining, RISC computers process more
instructions in a shorter amount of time.
So why isn't my computer a RISC?
• CISC ISAs were implemented in the first personal computers.
• As more people bought computers, CISC ISAs became more prominent.
• Software (especially operating systems) was developed and "translated" so that
personal computers speaking x86 could interact with their users.
• Because there was so much software written for computers "speaking" x86,
people continued to buy those computers.
• If we tried to switch to another ISA, we would not have all of the software
choices we have now.
So why would someone want to develop another ISA?
• x86 (and CISC in general) makes poor use of the faster hardware we have now.
• Another problem with x86 is that people have been trying to make it faster for
a long time, at least 20 years, and by now most of the ways to speed it up
significantly have already been found.
Why don't we just switch to RISC?
• Although they are not used in your desktop PC, RISC ISAs are implemented in
many mainframe computers.
• Programmers have been trying to make RISC faster for a long time too, and they
have already found many of the areas in which it can be sped up significantly.
Where are we running into problems speeding up RISC and
CISC?
We are running into problems speeding up the computer in two areas:
1. Branch decisions and predictions consume a good amount of processing
time.
2. Memory accesses to fetch instructions and data.
So what are we going to do?
Recent Developments & Future Scope
o The terms RISC and CISC have become less meaningful with
the continued evolution of both CISC and RISC designs and
implementations.
o Modern x86 processors decode and split the more complex
instructions into a series of smaller internal "micro-operations",
which can then be executed in a pipelined (parallel) fashion,
achieving high performance on a much larger subset of
instructions.
o Attempts have been made to combine features of both RISC
and CISC to develop a new approach.
o Intel has teamed up with Hewlett-Packard to design a new
type of ISA, called IA-64 (Intel Architecture 64).
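The idea of splitting a complex instruction into micro-operations can be sketched in Python as follows. This is only a toy decoder under stated assumptions: the decode function, the temporary names t0 and t1, and the textual micro-op format are hypothetical and do not reflect the internal encoding of any real x86 processor. Operands containing a colon (such as 2:3) are treated as memory locations, as in the multiplication example of this presentation.

```python
# Toy decoder: turn "OP dst, src" into load / operate / store micro-ops.
# The micro-op format and temporary register names are invented for illustration.

def decode(instruction):
    """Return a list of micro-op strings for one two-operand instruction."""
    opcode, operands = instruction.split(maxsplit=1)
    dst, src = [part.strip() for part in operands.split(",")]
    micro_ops = []
    temporaries = iter(("t0", "t1"))

    def to_register(operand):
        # Memory operands are first loaded into a temporary register.
        if ":" in operand:
            temp = next(temporaries)
            micro_ops.append(f"load {temp}, [{operand}]")
            return temp
        return operand

    left = to_register(dst)
    right = to_register(src)
    micro_ops.append(f"{opcode.lower()} {left}, {right}")
    if ":" in dst:
        # A memory destination needs an explicit store at the end.
        micro_ops.append(f"store [{dst}], {left}")
    return micro_ops


for op in decode("MULT 2:3, 5:2"):
    print(op)
```

A register-only instruction such as "MULT A, B" passes through this decoder as a single micro-op, while the memory-to-memory form expands into a load, load, multiply, store sequence much like the RISC program shown earlier.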
IA-64
What is IA-64?
• IA-64 is a new instruction set architecture.
• IA-64 seeks to address branch delays and memory latency.
What main principles is IA-64 designed around?
• IA-64 seeks to exploit instruction-level parallelism to the highest degree.
• Intel and HP call their method of exploiting this parallelism in IA-64 EPIC
(Explicitly Parallel Instruction Computing).
• EPIC achieves parallelism by having the compiler find which instructions can be
executed in parallel and "explicitly" package them for the CPU.
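The compiler-side grouping described above can be sketched as a simple greedy pass in Python. This is a simplified illustration, not the actual IA-64 bundling rules or any real compiler's scheduler: the group_independent function, the tuple encoding of instructions, and the register names are all assumptions. An instruction starts a new group whenever it reads or overwrites a register written earlier in the current group.

```python
# Greedy in-order grouping of instructions with no register dependencies
# inside a group. Instruction encoding and names are hypothetical.

def group_independent(instructions):
    """instructions: list of (text, destination or None, list of sources)."""
    groups, current, written = [], [], set()
    for text, dest, sources in instructions:
        depends = bool(written & set(sources)) or dest in written
        if current and depends:
            # Close the current group; this instruction must wait for it.
            groups.append(current)
            current, written = [], set()
        current.append(text)
        if dest is not None:
            written.add(dest)
    if current:
        groups.append(current)
    return groups


program = [
    ("load r1", "r1", []),
    ("load r2", "r2", []),
    ("mul r3 = r1, r2", "r3", ["r1", "r2"]),
    ("add r4 = r1, r1", "r4", ["r1"]),
    ("store r3", None, ["r3"]),
]

for number, group in enumerate(group_independent(program), start=1):
    print(f"group {number}: {group}")
```

In this sketch the two loads land in one group, the multiply and add in a second, and the store in a third; instructions within a group could, in principle, be issued in parallel.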
How does IA-64 help with branch delays?
• IA-64 takes a unique approach, predication, to reduce the consequences of branch
delays.
• The compiler can append a predicate to any instruction it chooses. It appends
predicates to instructions that depend on the outcome of a branch in order to
reduce branch penalties.
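A rough way to picture predication is to compare a branching function with a version in which both sides of the branch are always computed and a predicate decides which result is kept. The Python below is only a model of the idea under stated assumptions: the commit helper and the function names are hypothetical, and real predicated hardware discards the result of an instruction whose predicate is false rather than calling a helper.

```python
# Model of predicated execution: every instruction carries a predicate and
# its result is kept only when that predicate is true. Names are hypothetical.

def commit(predicate, old_value, new_value):
    """Keep new_value if the predicate is true, otherwise keep old_value."""
    return new_value if predicate else old_value


def branching_abs_diff(a, b):
    """Conventional version: a branch selects which subtraction runs."""
    if a > b:
        return a - b
    return b - a


def predicated_abs_diff(a, b):
    """Predicated version: both subtractions are issued, one result survives."""
    p_true = a > b           # the compare produces a predicate...
    p_false = not p_true     # ...and its complement
    result = 0
    result = commit(p_true, result, a - b)    # (p_true)  result = a - b
    result = commit(p_false, result, b - a)   # (p_false) result = b - a
    return result


print(predicated_abs_diff(15, 20), branching_abs_diff(15, 20))
```

The predicated version does some redundant work, but its instruction stream contains no branch that could be mispredicted.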
How does IA-64 deal with memory latency issues?
• Memory latency occurs because CPU processing speed is significantly faster than the
speed of fetching data from memory.
• IA-64 suggests a new way to eliminate some memory latency problems: speculative
loading.
IA-64 Realities:
• "A study in ISCA '95 by S. Mahlke et al. demonstrated that predication removed over
50% of the branches and 40% of the mis-predicted branches from several popular
benchmark programs."
( http://www.hp.com/esy/technology/ia_64/products/isapress.html )
• IA-64 is not natively compatible with the Intel x86 and HP PA-RISC architectures, so
the additional compatibility logic will take a lot of die space.
• Presently, the compilers are in an experimental phase and IA-64 has no OS support.
Resources
o http://www.pctechguide.com/glossary/WordFind.php?wordInput=CISC
o http://www.cs.umd.edu/class/fall2001/cmsc411/projects/IA64/
o http://cse.stanford.edu/class/sophomore-college/projects00/risc/risccisc/index.html
o http://en.wikipedia.org/wiki/Complex_instruction_set_computer
o http://en.wikipedia.org/wiki/X86
o http://arstechnica.com/cpu/4q99/risc-cisc/rvc-6.html
Questions?