Execution from the Perspective of Hardware

Low end of the abstraction stack (hardware)
o A series of contracts
  - Modular
  - Narrows in the downward direction
o Instruction set architecture (ISA)
  - The set of commands that a processor understands
  - Tension between expressivity and viability of implementation
  - This layer determines much of hardware compatibility
o OS? Virtualization?
  - Not strictly necessary, but useful in practice
  - Manage visibility of the hardware layer; ABIs

The abstraction stack is similar across most things we consider computers: machines supporting traditional programming models, from smartphones to supercomputers.

Computation is universal: any computer can compute anything computable.
o Computer: a Turing machine (the theoretical model of what computation means)
o Computable: decidable on a Turing machine; can be expressed as an algorithm
o Actual computers aren't Turing machines: they have finite resources

What do hardware designers care about?
o Performance
  - Latency: how quickly can I do one task?
  - Throughput: how much work per unit time?
o Compatibility/correctness
  - Economic motivation: hardware is more useful when software exists for it
  - Very expensive to change your mind or fix bugs
o Area
  - More circuits, and more complicated circuits, are often bigger
  - Bigger chips are superlinearly more expensive
  - Finite resources have deep implications
o Power/energy
  - Practical thermal/packaging constraints
  - Key metric for both mobile and high-performance computing

Modern computing is energy-constrained.
o Mobile
  - Battery size, battery life
  - Energy/charge, power, thermal
o Warehouse scale
  - Energy is a non-trivial fraction of TCO
  - Exascale: 10^11 FLOPS/J

Exascale vs. the human brain
o Exascale
  - 10^18 floating-point operations per second
  - Tens of megawatts of power
  - Vision: software approaches real time
o Human brain
  - 10^15 synapses
  - 20 W of power
  - We don't know the encoding
  - Vision: real time, high definition

Simple model of a computer
o Input to memory
o Memory, datapath, and control, with bi-directional connections among them
o Output from memory
o The stored program is kept in memory

Program vs. data
o Instruction data looks the same as operand data
o A program can read its own instructions
o A program may rewrite its own instructions
o Access patterns are different
o Correctness, performance, and security concerns

Hardware view of a program
o A program as a graph of instructions
  - A set of operations to perform
  - A set of operands to perform them on
  - A set of sequencing constraints among operations
o Multiple ways to express each of the above
  - Implicit vs. explicit operands
  - Von Neumann vs. dataflow ordering
  - Von Neumann: one instruction after another; very structured; you are here
  - Dataflow: ordering comes from input/output dependencies; operations with independent inputs and outputs can run concurrently

Costs/benefits of simple models
o A simple model is simple to implement
  - Designed in an era where area was the primary constraint
  - Verification is still a key cost of system development
o A simple model is simple to reason about
  - For one processor
  - Concurrency is not simplified
o How to express parallelism?
  - The Von Neumann model imposes artificial ordering constraints
  - Significant inefficiency in re-extracting the inherent parallelism
o How to express distribution of computation over time/space?
  - Dynamic vs. static
  - Local vs. global: the model does not aggressively capture locality

Key trends and ideas
o Moore's law
  - Integration increases exponentially
  - Not actually a law; more like a self-fulfilling prophecy
  - Throw transistors at the problem
  - The number of transistors on a chip doubles every 12-18 months
o Amdahl's law
  - Actually a law
  - The degree of improvement is bounded by the degree of applicability of the optimization
o Abstraction
  - Free to do a lot in the microarchitecture
  - e.g., x86 micro-operations (an internal ISA)
o Local is faster, smaller, and uses less energy
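The latency/throughput distinction above can be made concrete with a pipelining sketch. The numbers here are hypothetical (a 4-stage unit accepting one new task per nanosecond): each task still takes the full latency, but in steady state one result completes per stage time, so throughput is set by the stage time rather than the latency.

```python
# Hypothetical 4-stage pipelined functional unit.
stage_time_ns = 1.0
stages = 4

latency_ns = stages * stage_time_ns   # time for ONE task to finish: 4 ns
throughput_per_ns = 1.0 / stage_time_ns  # steady state: 1 result per ns

def total_time_ns(n_tasks):
    """Time to finish n tasks: fill the pipeline once,
    then one result completes every stage time."""
    return latency_ns + (n_tasks - 1) * stage_time_ns

# 1000 tasks take ~1003 ns, not 1000 * 4 ns = 4000 ns:
print(total_time_ns(1000))
```

For a large batch, throughput (not latency) dominates total time, which is why hardware designers track both metrics separately.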
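The 10^11 FLOPS/J exascale figure above follows directly from the other numbers in the notes: 10^18 floating-point operations per second divided by a power budget in the tens of megawatts (10 MW is assumed here, since a watt is a joule per second).

```python
flops_per_second = 1e18   # exascale: 10^18 FLOP/s
power_watts = 1e7         # "tens of megawatts"; assume 10 MW = 10^7 J/s

# (FLOP/s) / (J/s) = FLOP/J
flops_per_joule = flops_per_second / power_watts
print(f"{flops_per_joule:.0e} FLOPS/J")  # 1e+11 FLOPS/J
```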
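Amdahl's law, noted above as "actually a law," has a standard closed form: if a fraction p of the work is accelerated by a factor s, the overall speedup is 1 / ((1 - p) + p / s). A minimal sketch (the function name is mine):

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the work
    is sped up by a factor s (Amdahl's law)."""
    return 1.0 / ((1.0 - p) + p / s)

# Doubling the speed of half the work gives only a 1.33x overall speedup:
print(amdahl_speedup(0.5, 2.0))

# Even an unbounded speedup on 90% of the work caps the gain near 10x,
# because the remaining 10% is untouched:
print(amdahl_speedup(0.9, 1_000_000))
```

This is exactly the "bounded by the degree of applicability" point: as s grows without bound, the speedup limit is 1 / (1 - p).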