Uploaded by Enrique Martin Franco Vera

COMPUTER ORGANIZATION AND DESIGN

5th Edition
The Hardware/Software Interface
Chapter 1
Computer Abstractions and Technology
§1.6 Performance
Defining Performance
■ Which airplane has the best performance?
[Figure: four bar charts comparing the Boeing 777, Boeing 747, BAC/Sud Concorde, and Douglas DC-8-50 by Passenger Capacity, Cruising Range (miles), Cruising Speed (mph), and Passengers × mph]
Chapter 1 — Computer Abstractions and Technology — 26
Relative Performance
■ Define Performance = 1/Execution Time
■ “X is n times faster than Y”
$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$
■ Example: time taken to run a program
  ■ 10s on A, 15s on B
  ■ Execution Time_B / Execution Time_A = 15s / 10s = 1.5
  ■ So A is 1.5 times as fast as B (50% faster)
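The ratio above can be checked with a few lines of Python. This is a minimal sketch; the function name `relative_performance` is an illustrative choice, not something from the slides.

```python
def relative_performance(time_x: float, time_y: float) -> float:
    """Return n such that X is n times as fast as Y (n = time_Y / time_X)."""
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return time_y / time_x


# Slide example: the program takes 10s on A and 15s on B.
n = relative_performance(time_x=10.0, time_y=15.0)
print(f"A is {n:.1f} times as fast as B")
```

Because performance is defined as the reciprocal of execution time, the slower machine's time goes in the numerator.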
CPU Time
$$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$
■ Performance improved by
  ■ Reducing number of clock cycles
  ■ Increasing clock rate
  ■ Hardware designer must often trade off clock rate against cycle count
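CPU Time Example
■ Computer A: 2GHz clock, 10s CPU time
■ Designing Computer B
  ■ Aim for 6s CPU time
  ■ Can do faster clock, but causes 1.2 × clock cycles
  ■ How fast must Computer B clock be?
$$\text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{1.2 \times \text{Clock Cycles}_A}{6\,\text{s}}$$
$$\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^9$$
$$\text{Clock Rate}_B = \frac{1.2 \times 20 \times 10^9}{6\,\text{s}} = \frac{24 \times 10^9}{6\,\text{s}} = 4\,\text{GHz}$$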
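The same calculation as a short Python sketch, using only the numbers given on the slide. The helper names are illustrative.

```python
def clock_cycles(cpu_time_s: float, clock_rate_hz: float) -> float:
    """Clock cycles = CPU time x clock rate."""
    return cpu_time_s * clock_rate_hz


def required_clock_rate(cycles: float, target_time_s: float) -> float:
    """Clock rate needed to finish `cycles` clock cycles in `target_time_s` seconds."""
    return cycles / target_time_s


cycles_a = clock_cycles(cpu_time_s=10.0, clock_rate_hz=2e9)  # Computer A: 10s at 2GHz
cycles_b = 1.2 * cycles_a                                    # B needs 1.2x the cycles
rate_b = required_clock_rate(cycles_b, target_time_s=6.0)    # B must finish in 6s

print(f"Computer B clock rate: {rate_b / 1e9:.1f} GHz")
```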
Instruction Count and CPI
$$\text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction}$$
$$\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}$$
■ Instruction Count for a program
  ■ Determined by program, ISA and compiler
■ Average cycles per instruction
  ■ Determined by CPU hardware
  ■ If different instructions have different CPI
    ■ Average CPI affected by instruction mix
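The two forms of the CPU time equation can be compared numerically. In this sketch the instruction count, CPI, and clock rate are made-up values chosen for illustration; they do not come from the slides.

```python
def cpu_time_from_cycle_time(ic: float, cpi: float, cycle_time_s: float) -> float:
    """CPU time = instruction count x CPI x clock cycle time."""
    return ic * cpi * cycle_time_s


def cpu_time_from_clock_rate(ic: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = instruction count x CPI / clock rate."""
    return ic * cpi / clock_rate_hz


# Assumed example values (not from the slides).
IC = 1e9            # instructions executed
CPI = 2.0           # average cycles per instruction
CLOCK_RATE = 2e9    # 2 GHz, so the cycle time is 1 / 2e9 seconds

t_rate = cpu_time_from_clock_rate(IC, CPI, CLOCK_RATE)
t_cycle = cpu_time_from_cycle_time(IC, CPI, 1.0 / CLOCK_RATE)
print(f"{t_rate:.3f}s via clock rate, {t_cycle:.3f}s via cycle time")
```

Both forms give the same result because clock cycle time is the reciprocal of clock rate.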
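CPI Example
■ Computer A: Cycle Time = 250ps, CPI = 2.0
■ Computer B: Cycle Time = 500ps, CPI = 1.2
■ Same ISA
■ Which is faster, and by how much?
$$\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps}$$
$$\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}$$
$$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2$$
A is faster, by 20%.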
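A small Python sketch of this comparison follows. The instruction count is an arbitrary assumed value: both machines run the same ISA, so it cancels in the ratio.

```python
PICOSECOND = 1e-12


def cpu_time(ic: float, cpi: float, cycle_time_s: float) -> float:
    """CPU time = instruction count x CPI x clock cycle time."""
    return ic * cpi * cycle_time_s


ic = 1_000_000  # assumed; any positive value gives the same ratio

t_a = cpu_time(ic, cpi=2.0, cycle_time_s=250 * PICOSECOND)  # Computer A
t_b = cpu_time(ic, cpi=1.2, cycle_time_s=500 * PICOSECOND)  # Computer B

ratio = t_b / t_a
print(f"CPU Time B / CPU Time A = {ratio:.2f}")
```

A has the higher CPI but the shorter cycle time, and the product of the two decides which machine is faster.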
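CPI in More Detail
■ Different instruction types (i) use different numbers of CPU clock cycles
$$\text{Clock Cycles} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \text{Instruction Count}_i \right)$$
■ Weighted average CPI
$$\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}} \right)$$
where the fraction Instruction Count_i / Instruction Count is the relative frequency of instruction type i.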
CPI Example
■ Alternative compiled code sequences using instructions in classes A, B, C

Class               A    B    C
CPI for class       1    2    3
IC in sequence 1    2    1    2
IC in sequence 2    4    1    1

■ Sequence 1: IC = 5
  ■ Clock Cycles = 2×1 + 1×2 + 2×3 = 10
  ■ Avg. CPI = 10/5 = 2.0
■ Sequence 2: IC = 6
  ■ Clock Cycles = 4×1 + 1×2 + 1×3 = 9
  ■ Avg. CPI = 9/6 = 1.5
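The weighted-average calculation for both sequences can be sketched in Python using the table's values. The dictionary layout and function names are illustrative choices.

```python
# CPI for each instruction class, from the table.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def total_cycles(counts: dict[str, int]) -> int:
    """Clock cycles = sum over classes of (CPI_i x instruction count_i)."""
    return sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())


def average_cpi(counts: dict[str, int]) -> float:
    """Weighted average CPI = clock cycles / total instruction count."""
    return total_cycles(counts) / sum(counts.values())


sequence_1 = {"A": 2, "B": 1, "C": 2}
sequence_2 = {"A": 4, "B": 1, "C": 1}

for name, seq in (("Sequence 1", sequence_1), ("Sequence 2", sequence_2)):
    print(f"{name}: IC = {sum(seq.values())}, "
          f"cycles = {total_cycles(seq)}, avg CPI = {average_cpi(seq):.1f}")
```

Sequence 2 executes more instructions but needs fewer clock cycles, so instruction count alone does not predict which sequence is faster.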
Performance Summary
The BIG Picture
$$\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$
■ Performance depends on
  ■ Algorithm: affects IC, possibly CPI
  ■ Programming language: affects IC, CPI
  ■ Compiler: affects IC, CPI
  ■ Instruction set architecture: affects IC, CPI, T_c
§1.10 Fallacies and Pitfalls
Pitfall: Amdahl’s Law
■ Improving an aspect of a computer and expecting a proportional improvement in overall performance
$$T_{\text{improved}} = \frac{T_{\text{affected}}}{\text{improvement factor}} + T_{\text{unaffected}}$$
■ Example: multiply accounts for 80s/100s
  ■ How much improvement in multiply performance to get 5× overall?
$$20 = \frac{80}{n} + 20$$
  ■ Cannot be done!
■ Corollary: make the common case fast
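A minimal Python sketch of why the 5× target is unreachable, using the slide's 80s/20s split. The function names and the sample improvement factors (2, 10, 1000) are illustrative assumptions.

```python
def improved_time(t_affected: float, t_unaffected: float, factor: float) -> float:
    """Amdahl's Law: only the affected part shrinks by the improvement factor."""
    if factor <= 0:
        raise ValueError("improvement factor must be positive")
    return t_affected / factor + t_unaffected


def speedup_limit(t_affected: float, t_unaffected: float) -> float:
    """Upper bound on overall speedup as the improvement factor grows without bound."""
    return (t_affected + t_unaffected) / t_unaffected


T_MULTIPLY, T_OTHER = 80.0, 20.0  # seconds, from the slide's example
T_TOTAL = T_MULTIPLY + T_OTHER

for factor in (2, 10, 1000):  # assumed sample factors
    t_new = improved_time(T_MULTIPLY, T_OTHER, factor)
    print(f"multiply {factor}x faster: {t_new:.2f}s total, "
          f"overall speedup {T_TOTAL / t_new:.2f}x")

print(f"limit as the factor grows: {speedup_limit(T_MULTIPLY, T_OTHER):.1f}x")
```

The overall speedup approaches 5× but never reaches it, because the 20s of unaffected time remains no matter how fast multiply becomes.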
Fallacy: Low Power at Idle
■ Look back at i7 power benchmark
  ■ At 100% load: 258W
  ■ At 50% load: 170W (66%)
  ■ At 10% load: 121W (47%)
■ Google data center
  ■ Mostly operates at 10% – 50% load
  ■ At 100% load less than 1% of the time
■ Consider designing processors to make power proportional to load
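The percentages on this slide are each measurement divided by the 258W peak. The sketch below recomputes them and, as an assumption for comparison only, shows what a perfectly load-proportional processor would draw at the same loads.

```python
# Measured power (watts) at each load level (percent), from the slide.
MEASURED_WATTS = {100: 258.0, 50: 170.0, 10: 121.0}
PEAK_WATTS = MEASURED_WATTS[100]


def percent_of_peak(watts: float) -> int:
    """Power as a rounded percentage of peak power."""
    return round(100 * watts / PEAK_WATTS)


def proportional_watts(load_percent: float) -> float:
    """Hypothetical ideal: power scales linearly with load (assumption, not measured)."""
    return PEAK_WATTS * load_percent / 100


for load, watts in MEASURED_WATTS.items():
    print(f"{load:3d}% load: measured {watts:.0f}W ({percent_of_peak(watts)}% of peak), "
          f"proportional ideal {proportional_watts(load):.1f}W")
```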
Pitfall: MIPS as a Performance Metric
■ MIPS: Millions of Instructions Per Second
■ Doesn’t account for
  ■ Differences in ISAs between computers
  ■ Differences in complexity between instructions
$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^6} = \frac{\text{Instruction count}}{\dfrac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} \times 10^6} = \frac{\text{Clock rate}}{\text{CPI} \times 10^6}$$
■ CPI varies between programs on a given CPU
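To make the pitfall concrete, the sketch below compares two hypothetical compilations of one program on the same CPU. The instruction counts, CPIs, and 4GHz clock rate are invented for illustration and are not from the slides.

```python
def execution_time(ic: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = instruction count x CPI / clock rate."""
    return ic * cpi / clock_rate_hz


def mips(ic: float, exec_time_s: float) -> float:
    """MIPS = instruction count / (execution time x 10^6)."""
    return ic / (exec_time_s * 1e6)


CLOCK_RATE = 4e9  # assumed 4 GHz CPU

# Hypothetical versions of the same program: (instruction count, average CPI).
versions = {
    "version 1": (5e9, 2.0),  # fewer, more complex instructions
    "version 2": (8e9, 1.5),  # more, simpler instructions
}

results = {}
for name, (ic, cpi) in versions.items():
    t = execution_time(ic, cpi, CLOCK_RATE)
    results[name] = (t, mips(ic, t))
    print(f"{name}: {t:.2f}s, {results[name][1]:.0f} MIPS")
```

With these assumed numbers, version 2 reports the higher MIPS rating but takes longer to run, so MIPS and execution time rank the two versions in opposite order.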