Lecture 11 - Evolution and Performance 1

+
CS 325: CS Hardware and Software Organization and Architecture
Computer Evolution and Performance 1
+
Outline

Generations in Computer Organization

Milestones in Computer Organization

Von Neumann Architecture

Moore’s Law

CPU Transistor sizes and count

Memory Hierarchy

Performance

Cache Memory

Performance Issues and Solutions
+
History – Generations in Computer
Organization

Zeroth generation – mechanical computers (1642 – 1945)
  Blaise Pascal’s mechanical calculator

First generation – vacuum tubes (1945 – 1955)
  ENIAC, Colossus

Second generation – transistors (1955 – 1965)
  Harwell CADET, TX-0

Third generation – integrated circuits (1965 – 1980)
  IBM System/360

Fourth generation – VLSI (Very Large Scale Integration) (1980 – ?)
  Microprocessors
  Supercomputers
+
History – Milestones in Computer
Organization

1642 – Blaise Pascal
  Mechanical machine with gears used to add and subtract.

c. 1670 – Baron Gottfried Wilhelm von Leibniz
  Mechanical machine with gears used to add, subtract, multiply, and divide.

1834 – Charles Babbage
  Analytical Engine
  Ada Lovelace wrote its “assembler”

1936 – Konrad Zuse
  Z1, calculator made of electromagnetic relays
+
History – Milestones in Computer
Organization

1940’s – John Atanasoff, George Stibitz, Howard Aiken
  Each worked independently on calculating machines with properties such as:
    Binary arithmetic
    Capacitors for memory

1943 – Alan Turing and British Government
  COLOSSUS, the first electronic computer.

1946 – John Mauchly, John Presper Eckert
  ENIAC, vacuum tubes

1949 – Maurice Wilkes
  EDSAC, first stored program computer.
+
History – Milestones in Computer
Organization

1952 – John von Neumann
  von Neumann architecture – most current computers use this basic design.
  Stored program computer w/ shared memory for instructions and data.

1950’s – Researchers at MIT
  Transistorized Experimental Computer Zero (TX-0), first computer to use transistors (3600 transistors).

1960 – Digital Equipment Corporation (DEC)
  PDP-1, first minicomputer

1961 – IBM
  1401, popular small business computer.
+
History – Milestones in Computer
Organization

1962 – IBM
  7094 and 709 using transistors.

1963 – Burroughs
  B500, first computer designed for a high-level language (Algol, precursor to C).

1964 – Seymour Cray, Control Data Group
  6600, 10x faster than IBM 7094
  Used mainly for scientific computing.
  Highly parallelized CPU

1965 – PDP-8
  First mass-market minicomputer.
  Used a single bus.
+
History – Milestones in Computer
Organization

1964 – IBM
  System/360, family of compatible computers (low end to high end).
  First computers with multiprogramming.
  Used integrated circuits (dozens of transistors on one “chip”).

1974 – Intel
  8080, first general purpose computer on a chip.

1974 – Cray-1
  First vector computer (single instructions on vectors of numbers).

1978 – DEC
  VAX, first 32-bit minicomputer.
+
History – Milestones in Computer
Organization

1978 – Steve Jobs, Steve Wozniak
  Apple personal computer

1981 – IBM
  IBM PC becomes the most popular personal computer.
  Used MS-DOS by Microsoft as OS.
  CPU developed by Intel.

1985 – MIPS (company)
  MIPS, first commercial RISC computer.

1987 – Sun Microsystems
  SPARC, popular RISC computer.

1990 – IBM
  RS6000, first superscalar machine
+
History – Milestones in Computer
Organization

1993 – Intel
  Pentium CPU released.
  32-bit, 60 MHz
  3.2 million transistors
  $878

1993 – NVIDIA is founded.

1997 – Intel
  Pentium 2 CPU released.

1999 – Intel
  Pentium 3 CPU released.
  32-bit, 550 MHz
+
History – Milestones in Computer
Organization

2003 – AMD
  Athlon 64 CPU is released.

2005 – AMD, Intel
  Dual-core CPUs released.
+
Computer and CPU Organization

Definition:

The terms processor, CPU, and computational engine refer broadly to
any mechanism that drives computation.
+
Von Neumann Architecture
Stored Program/Data concept
  Characteristic of most modern CPUs

Main memory stores programs (instructions) and data.

ALU operates on binary data.

Control unit interprets instructions from memory and passes information along to ALU.
+
Von Neumann Architecture – Three
Basic Components
CPU

Memory

I/O facilities

Control units

Busses

All interact to form a complete computer (a minimal fetch-decode-execute sketch follows below).
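
To make the stored-program idea concrete, here is a minimal sketch of a von Neumann-style machine: a hypothetical four-instruction processor (not from the lecture) whose program and data share one memory array, with the control unit played by the fetch-decode-execute loop and the ALU by the arithmetic inside it.

```python
# Minimal von Neumann-style simulator (illustrative sketch, hypothetical ISA).
# Program and data live in the same memory; the while-loop plays the role of
# the control unit, and the arithmetic on `acc` plays the role of the ALU.

MEMORY = [0] * 32
LOAD, ADD, STORE, HALT = "LOAD", "ADD", "STORE", "HALT"

# Program in cells 0..3, data in cells 16..18 of the shared memory.
MEMORY[0:4] = [(LOAD, 16), (ADD, 17), (STORE, 18), (HALT, 0)]
MEMORY[16], MEMORY[17] = 7, 5

def run(memory):
    pc, acc = 0, 0                      # program counter, accumulator
    while True:
        opcode, addr = memory[pc]       # fetch
        pc += 1
        if opcode == LOAD:              # decode + execute
            acc = memory[addr]
        elif opcode == ADD:
            acc += memory[addr]         # ALU operation
        elif opcode == STORE:
            memory[addr] = acc
        elif opcode == HALT:
            return

run(MEMORY)
print(MEMORY[18])                       # 12 = 7 + 5, written back to shared memory
```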
+
Structure of Von Neumann
Architecture
+
Milestones in Computer
Organization

Moore’s Law:

Number of transistors on a chip doubles every 18 months.
+
Milestones in Computer
Organization

Moore’s Law:

Increased density of components on chip.

Originally, Moore observed that the number of transistors on a chip doubled every year.

Since the 1970’s, the pace has slowed: the number of transistors doubles roughly every 18 months.

Cost of chip has remained almost unchanged.

Higher packing density means shorter electrical path, giving higher
performance.

Smaller size gives increased flexibility.

Reduced power and cooling requirements.

Fewer interconnections increases reliability.
+
CPU Transistor Sizes - Intel

8086 – 29K transistors, 3µm

80186 – 29K transistors, 2µm

80286 – 134K transistors, 1.5µm

80386 – 855K transistors, 1µm

80486 – 1.6M transistors, 0.6µm

Pentium 1 – 4.5M transistors, 0.35µm

Pentium 2 – 7.5M transistors, 0.35µm

Pentium 3 – 9.5M transistors, 0.25µm

Pentium 4 – 42M transistors, 0.18µm

Pentium M – 140M transistors, 0.13µm

Pentium D – 230M transistors, 90nm

Core 2 – 291M transistors, 65nm

Current Intel architecture: Core i series “Haswell”

i7: 1.4B transistors, 22nm
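
As a rough cross-check of Moore's Law against the counts above, the sketch below computes the doubling time implied by the 8086 and Haswell i7 figures. The release years (1978 and 2013) are added here for the arithmetic; they are not stated on the slide.

```python
import math

# Transistor counts from the list above; release years are assumed here.
old_count, old_year = 29_000, 1978          # Intel 8086
new_count, new_year = 1_400_000_000, 2013   # Haswell Core i7

doublings = math.log2(new_count / old_count)                      # about 15.6 doublings
years = new_year - old_year                                       # 35 years
print(f"observed doubling time: {years / doublings:.2f} years")   # ~2.2 years

# For comparison, the 18-month rule would project:
print(f"{old_count * 2 ** (years / 1.5):.2e} transistors")        # ~3e11, far above 1.4e9
```

In other words, the parts listed here doubled roughly every two years, somewhat slower than the 18-month rule of thumb.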
+
Growth in CPU Transistor Count
+
Memory Hierarchy Importance

1980: No cache memory on CPU.

1989: First Intel CPU that included cache memory.

1995: 2-level cache on CPU.

2003: 3-level cache on CPU.

2013: 4-level cache on CPU.

Intel Haswell architecture w/ integrated Iris Pro Graphics
+
How to Increase Performance?
Pipelining

On-board cache memory

Branch prediction (a simple predictor is sketched below)

Data flow analysis

Speculative execution
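
Branch prediction is often introduced with a 2-bit saturating counter; the sketch below is a generic textbook illustration, not any particular CPU's predictor. The counter only flips its prediction after two consecutive surprises, so a loop branch that is almost always taken is predicted well.

```python
# 2-bit saturating-counter branch predictor (generic illustration).
# States 0-1 predict "not taken"; states 2-3 predict "taken".

class TwoBitPredictor:
    def __init__(self):
        self.state = 2                          # start weakly "taken"

    def predict(self):
        return self.state >= 2                  # True = predict taken

    def update(self, taken):
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)

# A loop branch: taken 9 times, then falls through once.
history = [True] * 9 + [False]
predictor, correct = TwoBitPredictor(), 0
for taken in history:
    correct += predictor.predict() == taken
    predictor.update(taken)

print(f"{correct}/{len(history)} branches predicted correctly")   # 9/10
```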
+
Performance Balance
CPU performance increasing

Memory capacity increasing

Memory speed lagging behind CPU performance
+
Core Memory
1950’s – 1970’s

1 core = 1 bit
  Polarity determines logical “1” or “0”

Roughly 1 MHz clock rate.

Up to 32kB storage.
+
Semiconductor Memory
1970’s – Today

Fairchild
  Size of a single core
    i.e. 1 bit of magnetic core storage
  Holds 256 bits

Non-destructive read, but volatile
  SDRAM most common, uses capacitors.

Much faster than core
  Today: 1.3 – 3.1 GHz

Capacity approximately doubles each year.
  Today: 64GB per single DIMM
+
CPU (Logic) and Memory
Performance Gap
+
Solutions
Increase number of bits retrieved at one time
  Make DRAM “wider” rather than “deeper”

Change DRAM interface
  Cache

Reduce frequency of memory access
  More complex cache and cache on chip (see the average access time sketch below)

Increase interconnection bandwidth
  High speed buses
  Hierarchy of buses
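
The cache items above are usually quantified with the average memory access time: AMAT = hit time + miss rate x miss penalty. The latencies and miss rates in the sketch below are made-up illustrative numbers, not figures from the lecture.

```python
# Average memory access time for a single cache level (illustrative numbers).

def amat(hit_time, miss_rate, miss_penalty):
    return hit_time + miss_rate * miss_penalty

no_cache     = amat(hit_time=0, miss_rate=1.00, miss_penalty=100)  # every access goes to DRAM
small_cache  = amat(hit_time=1, miss_rate=0.10, miss_penalty=100)  # 90% hit rate
bigger_cache = amat(hit_time=2, miss_rate=0.02, miss_penalty=100)  # 98% hits, slower hit time

print(no_cache, small_cache, bigger_cache)   # 100.0 vs 11.0 vs 4.0 cycles on average
```

Cutting how often the processor must go out to DRAM helps far more than shaving a cycle off the hit time, which is why larger and more complex on-chip caches keep appearing.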
+
Improvements in CPU Organization
and Architecture
Increase hardware speed of processor
  Fundamentally due to shrinking logic gate size
    More gates, packed more tightly, increasing clock rate
    Propagation time for signals reduced

Increase size and speed of caches
  Dedicating part of processor chip
    Cache access times drop significantly

Change processor organization and architecture
  Increase effective speed of execution
  Parallelism
+
Problems with Clock Speed and
Logic Density

Power
  Power density increases with density of logic and clock speed
  Dissipating heat

Resistor-Capacitor (RC) delay
  Speed at which electrons flow limited by resistance and capacitance of metal wires connecting them
  Delay increases as RC product increases (a rough estimate is sketched below)
  Wire interconnects thinner, increasing resistance
  Wires closer together, increasing capacitance

Memory latency
  Memory speeds lag processor speeds

Solution:
  More emphasis on organizational and architectural approaches
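
To put a rough number on the RC-delay point, the sketch below models a wire with R = ρL/(WT) and a simple parallel-plate capacitance to its neighbor. The dimensions and the model itself are assumptions for illustration only, not process data from the lecture.

```python
# Very rough RC delay estimate for a 1 mm on-chip copper wire.
# The geometry and the parallel-plate model are illustrative assumptions.

RHO   = 1.7e-8       # copper resistivity, ohm*m
EPS0  = 8.85e-12     # vacuum permittivity, F/m
EPS_R = 3.9          # assumed dielectric constant of the insulator

def rc_delay(length, width, thickness, spacing):
    resistance  = RHO * length / (width * thickness)              # R = rho * L / A
    capacitance = EPS_R * EPS0 * length * thickness / spacing     # plate cap to neighbor
    return resistance * capacitance

wide_far   = rc_delay(1e-3, 200e-9, 200e-9, 200e-9)   # wider wires, farther apart
thin_close = rc_delay(1e-3, 100e-9, 200e-9, 100e-9)   # scaled: thinner and closer together

print(f"{wide_far * 1e12:.0f} ps vs {thin_close * 1e12:.0f} ps")   # ~15 ps vs ~59 ps
```

Halving the width doubles the resistance and halving the spacing doubles the capacitance, so the RC product, and therefore the wire delay, roughly quadruples, which is the effect the slide describes.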
+
Intel CPU Performance
+
Increased Cache Capacity

Typically two or three levels of cache between processor and main
memory.

Chip density increased

More cache memory on chip

Faster cache access

Pentium chip devoted about 10% of chip area to cache.

Pentium 4 devotes about 50%.
+
Increased Cache Capacity
+
More Complex Execution Logic
Enable parallel execution of instructions

Pipeline works like assembly line (a timing sketch follows below)
  Different stages of execution of different instructions at same time along pipeline

Superscalar allows multiple pipelines within single processor
  Instructions that do not depend on one another can be executed in parallel
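
The assembly-line analogy can be put in numbers: with k pipeline stages, n instructions take about k + (n - 1) cycles instead of k*n. The sketch below covers the ideal case only, ignoring hazards, stalls, and branch penalties.

```python
# Ideal pipeline timing (ignores hazards, stalls, and branch penalties).

def unpipelined_cycles(n_instructions, n_stages):
    return n_instructions * n_stages            # each instruction runs start to finish

def pipelined_cycles(n_instructions, n_stages):
    return n_stages + (n_instructions - 1)      # fill the pipe once, then one per cycle

n, k = 1000, 5
serial = unpipelined_cycles(n, k)               # 5000 cycles
piped  = pipelined_cycles(n, k)                 # 1004 cycles
print(f"speedup: {serial / piped:.2f}x")        # ~4.98x, approaching k for large n
```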
+
Diminishing Returns
Internal organization of processors complex
  Can get a great deal of parallelism
  Further significant increases likely to be relatively modest

Benefits from cache are reaching limit

Increasing clock rate runs into power dissipation problem
  Some fundamental physical limits are being reached
+
New Approach – Multiple Cores
Multiple processors on single chip
  Large shared cache

Within a processor, increase in performance proportional to square root of increase in complexity (sketched below)

If software can use multiple processors, doubling number of processors almost doubles performance

So, use two simpler processors on the chip rather than one more complex processor

With two processors, larger caches are justified
  Power consumption of memory logic less than processing logic
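
The square-root relationship above is commonly known as Pollack's rule. The sketch below uses it to compare one core of doubled complexity against two simpler cores, assuming (as the slide does) software that can keep both cores busy; the numbers are purely illustrative.

```python
import math

# Pollack's rule: single-core performance grows roughly with the
# square root of the core's complexity (area / transistor budget).

def core_performance(complexity):
    return math.sqrt(complexity)

budget = 2.0                                          # twice the area of a baseline core

one_complex_core = core_performance(budget)           # ~1.41x a baseline core
two_simple_cores = 2 * core_performance(budget / 2)   # ~2.00x, if the work parallelizes

print(f"one complex core: {one_complex_core:.2f}x, "
      f"two simple cores: {two_simple_cores:.2f}x")
```

The two-core option only wins when software can actually use both processors, which is exactly the caveat on the slide.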