Execution from the Perspective of Hardware
 Low end of the Abstraction Stack (Hardware)
o A series of contracts
 Modular
 Narrows in downward direction
o Instruction set architecture
 Set of commands that a processor understands
 Tension between expressivity and viability of implementation
 Layer determines much of HW compatibility
o OS? Virtualization?
 Not strictly necessary
 Useful in practice
 Manage visibility of HW (hardware) layer, ABIs
 Abstraction stack is similar across most things we consider computers – machines supporting traditional programming models, from smartphones to supercomputers
 Computation is universal – any computer can compute anything computable
o Computer – Turing machine (theoretical model of what computation means)
o Computable – decidable on a Turing machine – can be expressed as an algorithm
o Actual computers aren't Turing machines – they have finite resources
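
As a rough illustration of the Turing-machine model, here is a minimal one-tape simulator in Python. The rule table FLIP_BITS, the state names, and the max_steps budget are all hypothetical choices for this sketch; the step budget stands in for the finite resources of a real computer.

```python
def run_turing_machine(tape_input, rules, start="scan", halt="halt",
                       blank="_", max_steps=1000):
    """Run a one-tape Turing machine and return the final tape as a string."""
    tape = dict(enumerate(tape_input))  # sparse tape: position -> symbol
    state, head, steps = start, 0, 0
    while state != halt:
        if steps >= max_steps:
            # A real machine has finite time and memory; model that as a budget.
            raise RuntimeError("step budget exhausted")
        symbol = tape.get(head, blank)
        write, move, state = rules[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    cells = [tape[i] for i in sorted(tape)]
    return "".join(cells).strip(blank)


# Hypothetical rule table: invert every bit, halt at the first blank.
# Format: (state, symbol read) -> (symbol to write, head move, next state)
FLIP_BITS = {
    ("scan", "0"): ("1", "R", "scan"),
    ("scan", "1"): ("0", "R", "scan"),
    ("scan", "_"): ("_", "R", "halt"),
}

result = run_turing_machine("1011", FLIP_BITS)
print(result)
```

The whole "program" is the rule table, and the machine itself is a few lines of bookkeeping.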
 What do hardware designers care about
o Performance
 Latency – how quickly can I do one task
 Throughput – how much work/unit time
o Compatibility/correctness
 Economic motivation – HW more useful when SW exists
 Very expensive to change mind or fix bugs
o Area
 More circuits and more complicated circuits are often bigger
 Bigger chips are superlinearly more expensive
 Finite resources have deep implications
o Power/energy
 Practical thermal/packaging constraints
 Key metric for both mobile and high-performance computing
 Modern computing is energy-constrained
o mobile
 battery size
 battery life
 energy/charge – power – thermal
o warehouse scale
 Energy is a non-trivial fraction of total cost of ownership (TCO)
 Exascale – about 10^11 FLOP/J (equivalently FLOPS/W)
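
The latency/throughput distinction under Performance can be illustrated with an idealized pipeline. Pipelining is not covered in these notes; it is used here only as an assumed example, with no stalls or hazards, and the function names and numbers are hypothetical.

```python
def unpipelined(n_tasks, n_stages, stage_time):
    """Each task runs through all stages before the next task starts."""
    latency = n_stages * stage_time          # time to finish one task
    total_time = n_tasks * latency
    throughput = n_tasks / total_time        # tasks per unit time
    return latency, throughput


def pipelined(n_tasks, n_stages, stage_time):
    """Ideal pipeline: a new task enters every stage_time."""
    latency = n_stages * stage_time          # one task still visits every stage
    total_time = (n_stages + n_tasks - 1) * stage_time
    throughput = n_tasks / total_time
    return latency, throughput


# Hypothetical workload: 1000 tasks, 5 stages, 1 time unit per stage.
lat_u, tp_u = unpipelined(1000, 5, 1.0)
lat_p, tp_p = pipelined(1000, 5, 1.0)
print(f"unpipelined: latency={lat_u}, throughput={tp_u:.3f}")
print(f"pipelined:   latency={lat_p}, throughput={tp_p:.3f}")
```

In this model the latency of a single task is unchanged, while throughput approaches one task per stage time. The two metrics can move independently.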
Exascale vs Human Brain
o Exascale
 1018 Floating Point operations
 10 s of megawatts
 Vision – SW approaches real time
o Human brain
 1015 synapses
 20 W power don't know encoding
 Vision – real time high definition
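
The efficiency figures above follow from simple arithmetic, sketched below. The 10 MW value is an assumption (the low end of "tens of megawatts"). A synapse is not a floating-point operation, so the brain figure is only a density per watt, not a comparable operation rate.

```python
# Exascale machine
EXA_FLOPS = 1e18            # floating-point operations per second
EXA_POWER_W = 10e6          # ASSUMPTION: 10 MW, low end of "tens of megawatts"

# FLOPS per watt equals FLOP per joule, since 1 W = 1 J/s.
flop_per_joule = EXA_FLOPS / EXA_POWER_W
picojoules_per_flop = 1e12 / flop_per_joule

# Human brain (figures from the notes; encoding of computation is unknown)
BRAIN_SYNAPSES = 1e15
BRAIN_POWER_W = 20.0
synapses_per_watt = BRAIN_SYNAPSES / BRAIN_POWER_W

print(f"exascale: {flop_per_joule:.1e} FLOP/J, {picojoules_per_flop:.0f} pJ per FLOP")
print(f"brain:    {synapses_per_watt:.1e} synapses per watt")
```

At 20 MW instead of 10 MW the efficiency halves, so 10^11 FLOP/J is an order-of-magnitude figure.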
Simple model of computer
o Input to memory
o Memory, datapath, control all bi-directional connections
o Output from memory
o Stored program is kept in memory
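
A minimal sketch of this model: one memory holds both the program and its data, and a fetch-decode-execute loop plays the role of control plus datapath. The four-instruction accumulator machine and its opcode names are hypothetical, chosen only for illustration.

```python
def run(memory, max_steps=1000):
    """Fetch-decode-execute loop over a single memory holding code and data."""
    pc, acc = 0, 0                      # program counter and accumulator
    for _ in range(max_steps):
        op, addr = memory[pc]           # fetch and decode the instruction
        pc += 1
        if op == "LOAD":
            acc = memory[addr]          # memory -> datapath
        elif op == "ADD":
            acc += memory[addr]
        elif op == "STORE":
            memory[addr] = acc          # datapath -> memory
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode {op!r}")
    raise RuntimeError("step budget exhausted")


memory = [
    ("LOAD", 4),    # address 0: program starts here
    ("ADD", 5),     # address 1
    ("STORE", 6),   # address 2
    ("HALT", 0),    # address 3
    2,              # address 4: input placed in memory
    40,             # address 5: input placed in memory
    0,              # address 6: output is read from memory
]

run(memory)
print(memory[6])
```

Input and output both go through memory: the inputs are written to cells 4 and 5 before the run, and the result is read from cell 6 afterwards.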
Program vs. data
o Instruction data looks same as operand data
o Program can read its own instructions
o Program may rewrite its own instructions
o Access patterns are different
o Correctness/performance/security concerns
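
The points above can be made concrete with a toy machine whose instructions are plain integers, so an instruction word is indistinguishable from an operand. The encoding (opcode * 100 + address) and the opcode numbering are assumptions made for this sketch only.

```python
HALT, LOAD, ADD, STORE = 0, 1, 2, 3


def encode(op, addr):
    """Pack an instruction into one integer word (hypothetical encoding)."""
    return op * 100 + addr


def run(memory, max_steps=100):
    pc, acc = 0, 0
    for _ in range(max_steps):
        op, addr = divmod(memory[pc], 100)   # decode the fetched word
        pc += 1
        if op == LOAD:
            acc = memory[addr]
        elif op == ADD:
            acc += memory[addr]
        elif op == STORE:
            memory[addr] = acc
        elif op == HALT:
            return memory
        else:
            raise ValueError(f"unknown opcode {op}")
    raise RuntimeError("step budget exhausted")


memory = [
    encode(LOAD, 4),    # 0: read the instruction at address 4 as data
    encode(ADD, 8),     # 1: add 1, which bumps its address field
    encode(STORE, 4),   # 2: write the modified instruction back
    encode(LOAD, 9),    # 3: acc = 10
    encode(ADD, 10),    # 4: was "ADD 10"; is "ADD 11" by the time it runs
    encode(STORE, 12),  # 5: save the result
    encode(HALT, 0),    # 6
    0,                  # 7: unused
    1,                  # 8: constant 1
    10,                 # 9: operand
    100,                # 10: operand the original ADD would have used
    5,                  # 11: operand the rewritten ADD actually uses
    0,                  # 12: result
]

run(memory)
print(memory[12], memory[4])
```

The result is 10 + 5 rather than 10 + 100, because the program rewrote its own ADD before executing it. A store that lands in the instruction stream changes later behavior, which is why self-modification raises correctness, performance, and security concerns.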
HW view of program
o Program as a graph of instructions
 Set of operations to perform
 Set of operands to perform them on
 Set of sequencing constraints among operations
o Multiple ways to express each of the above
 Implicit vs. explicit operands
 Von Neumann vs. dataflow ordering
 VN – instructions execute one after another in a fixed, highly structured sequence; a program counter marks "you are here"
 Dataflow – ordering follows the dependencies between inputs and outputs; operations whose inputs are ready can run independently
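
The contrast between the two orderings can be sketched by taking a sequential instruction list and grouping it by data dependencies. The instruction format (destination, operation, sources) is hypothetical, and the sketch assumes every name is written exactly once, so only true input/output dependencies exist.

```python
# A program in Von Neumann order: one instruction after another.
program = [
    ("a", "load", []),
    ("b", "load", []),
    ("c", "add", ["a", "b"]),
    ("d", "load", []),
    ("e", "mul", ["c", "d"]),
    ("f", "add", ["a", "d"]),
]


def dataflow_waves(program):
    """Group instructions into waves; a wave depends only on earlier waves.

    Assumes sources are defined earlier in the list and each
    destination is written once (single assignment).
    """
    if not program:
        return []
    level = {}
    for dest, _op, srcs in program:
        # An instruction is ready one wave after its latest input.
        level[dest] = 1 + max((level[s] for s in srcs), default=-1)
    waves = [[] for _ in range(max(level.values()) + 1)]
    for dest, _op, _srcs in program:
        waves[level[dest]].append(dest)
    return waves


waves = dataflow_waves(program)
print("sequential steps:", len(program))
print("dataflow waves:  ", waves)
```

Sequential order needs six steps; the dependency graph needs only three waves, because the loads are mutually independent and `c` and `f` can run side by side.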
Costs/benefits of simple models
o Simple model simple to implement
 Designed in era where area was primary constraint
 Verification still a key cost of system development
o Simple model simple to reason about
 For one processor
 Concurrency not simplified
o How to express parallelism
 Von Neumann model imposes artificial ordering constraints
 Significant inefficiency in re-extracting inherent parallelism
o How to express distribution of computation over time/space
 Dynamic vs. static
 Local vs. global – model does not aggressively capture locality
Key trends and ideas
o Moore's law
 Integration increases exponentially
 Not actually a law – more like a self-fulfilling prophecy
 Throw transistors at the problem
 Number of transistors on a chip doubles every 12-18 months
o Amdahl's law
 Actually a law
 Degree of improvement bounded by degree of applicability of optimization
o Abstraction
 Free to do a lot in the microarchitecture
 e.g., x86 micro-operations (an internal ISA)
o Local is
 Faster
 Smaller
 Less energy
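
Amdahl's law from the list above can be written as speedup = 1 / ((1 - p) + p / s), where p is the fraction of execution time the optimization applies to and s is the speedup of that fraction. A short sketch follows; the function name and the example values are illustrative.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the run time is sped up by `factor`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Hypothetical case: the optimization applies to 90% of the run time.
for factor in (2, 10, 100, 1_000_000):
    print(f"factor {factor:>9}: overall speedup {amdahl_speedup(0.9, factor):.3f}")
```

However large the factor, the overall speedup stays below 1 / (1 - p), which is 10 for p = 0.9. The part the optimization does not touch sets the bound.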