Architecture/Compiler Co-Design Space Exploration
Christophe Dubach, Timothy M. Jones, Michael F.P. O'Boyle
School of Informatics, Institute for Computing Systems Architecture, University of Edinburgh
Monday 3rd November 2008, ECDF review meeting

Introduction
► Energy-efficient microprocessors / software
  ● Search large design spaces
► Machine-learning for design-space exploration
  ● Model the design space
► Training + Validation
  ● Large data set required ➜ simulations

Outline
► Past Research: Microarchitecture Design
► Current Research: Architecture/Compiler Optimisations Co-Design
► Future Research: Multicore Design Space Exploration
► Research Outcome

Past Research: Microarchitecture Design
► Parameters
  ● Pipeline width
  ● Reorder buffer, issue queue, load/store queue
  ● Register file size and number of ports
  ● Caches and branch predictor
[Figure: Intel Pentium]

Different possible designs
► configuration A
  ● small pipeline width
  ● large data cache
► configuration Z
  ● wide pipeline
  ● small caches

Design Space Exploration
[Diagram: program source (human readable) → compilation with default compiler settings → program binary (machine readable) → simulation on conf. A, conf. B, ..., conf. Z (3000 configurations)]
► Compile program once
► Simulate on different configurations
► Obtain delay & energy values

ED (Energy x Delay) space
[Figure: ED for each microarchitectural configuration]

Performance prediction
[Figure: predicted and real ED across the 3000 microarchitectural configurations; ED axis from 0 to 1.2e+08]

Experiments
► Setup
  ● 26 programs (SPEC2k)
  ● sample space: 3000 design points
  ● total simulations: 78,000
► Time
  ● simulation time on average: 16 minutes
  ● 900 days on a single machine
  ● ~2 weeks with Condor (Informatics)
  ● 3 days with ECDF

Current Research: Architecture/Compiler Co-Design
► Embedded systems
  ● mobile phones, PDAs, set-top boxes, cameras...
► Fixed set of applications
  ● run for a very long time
► Software optimisations make a difference
  ● compiler optimisations affect performance/energy

Impact of optimisations: 1000 random compiler settings
[Diagram: program source → default compiler settings → program binary; different compiler settings → program binary 1, program binary 2, ..., program binary 1000; all binaries simulated on conf. A]

Current Research: Architecture/Compiler Co-Design
[Diagram: program source → default compiler settings → program binary; different compiler settings → program binary 1, program binary 2, ..., program binary 1000; all binaries simulated on conf. A, conf. B, ..., conf. Z]

Design Space
[Figure: ED (axis from 0 to 7) for each architectural configuration; series labelled O1]

Co-Design
[Figure: ED (axis from 0 to 7) for each architectural configuration; series labelled O1]

Experiments
► Space
  ● MiBench: 35 programs
  ● sample of 200 arch & 1000 optimisations
  ● 200 x 1000 = 200,000 design points
  ● total simulations: 7,000,000
► Time
  ● simulation time on average: 2.5 minutes
  ● 12,000 days on a single machine!
  ● ~1 month with ECDF

Future Research: Multicore Design Exploration
► Analyse design space of multicore processors
  ● few cores → large/complicated cores
  ● many cores → small/simple cores
► How do you go from 1 core to 4 cores?

Experiments
► Average simulation time
  ● 1 core: 10 minutes
  ● 2 cores: 20 minutes
  ● 4 cores: 40 minutes
  ● ...
  ● 64 cores: 640 minutes
  ● 1 day in total
► Total time (12 programs, 1000 conf.)
  ● 12,000 days on a single machine
  ● 1 month with ECDF

Research Outcome
► 2007: microarch. design space
  ● 1 publication in prestigious MICRO conference
  ● 1 paper in submission
► 2008: co-design space
  ● 1 publication in CASES conference
  ● 1 paper in submission
  ● 1 paper in progress
► 2009: multicore design space
  ● 1 paper in progress
► Publications
  ● MICRO 2007: Microarchitectural Design Space Exploration Using An Architecture-Centric Approach
  ● CASES 2008: Exploring and Predicting the Architecture/Optimising Compiler Co-Design Space

Summary
► Large design spaces explored
  ● requires enormous amount of computing power
  ● not possible without ECDF
► International publications
  ● 2 published
  ● 2 in submission
  ● 2 in progress
► Thanks to the ECDF team!
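
Appendix: checking the simulation-time figures

The single-machine times quoted on the three Experiments slides follow from multiplying the number of simulations by the average simulation time. The sketch below reproduces that arithmetic. It is an illustration, not the authors' scripts: the function name is hypothetical, and for the multicore case it assumes that each program/configuration pair is simulated once at every power-of-two core count from 1 to 64, which the slides only hint at with "...".

```python
MINUTES_PER_DAY = 24 * 60


def single_machine_days(num_simulations, avg_minutes):
    """Days needed to run all simulations back to back on one machine."""
    return num_simulations * avg_minutes / MINUTES_PER_DAY


# Microarchitecture study: 26 programs x 3000 design points, 16 min each.
micro_days = single_machine_days(26 * 3000, 16)

# Co-design study: 35 programs x 200 architectures x 1000 optimisation
# settings, 2.5 min each.
codesign_days = single_machine_days(35 * 200 * 1000, 2.5)

# Multicore study. ASSUMPTION: core counts are the powers of two from 1 to 64
# and simulation time grows linearly at 10 minutes per core, as the listed
# values (10, 20, 40, ..., 640 minutes) suggest.
core_counts = [1, 2, 4, 8, 16, 32, 64]
minutes_per_point = sum(10 * cores for cores in core_counts)
multicore_days = single_machine_days(12 * 1000, minutes_per_point)

if __name__ == "__main__":
    print(f"microarchitecture: {micro_days:.0f} days")
    print(f"co-design:         {codesign_days:.0f} days")
    print(f"multicore:         {minutes_per_point} min per point, "
          f"{multicore_days:.0f} days")
```

Under these assumptions the results land close to the rounded figures on the slides: roughly 900 days for the microarchitecture study, roughly 12,000 days for the co-design study, and about one day per program/configuration pair for the multicore study. The slides appear to round that last value up to a full day, which gives 12 x 1000 x 1 day = 12,000 days.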