Specific Choice of Soft Processor Features
Mark Grover
Prof. Greg Steffan
Dept. of Electrical and Computer Engineering

Hard and Soft Processors

Hard Processors
• Made from transistors
• Cost millions to make
• Faster
• Consume less power

Soft Processors
• Built on FPGA fabric
• Are customizable
• Can cater to application-specific needs

[Diagram labels: Processor Architecture; Verilog]

Research Problem
• Choose the best micro-architectural features
– Want to optimize the use of resources
• Power consumption (as low as possible)
• Area (as small as possible)
• Wall-clock time (the lower the better)
• Time spent

SPREE
• Soft Processor Rapid Exploration Environment
• Scanned the whole design space
• Is it viable enough?
– What if a new application comes into the picture?
– What if the performance criteria change?
• Say, the user doesn't care about area any more?

Research Objective
[Diagram labels: Maximum power, area; Software Application; Enhanced Simulator Part 1; Approximates; Enhanced Simulator (MINT); Enhanced Simulator Part 2; Fastest microarchitectural combination]
• What if a new application comes into the picture?
• What if the performance criteria change?

Outline
• Motivation
• Implementation
– Implementation Scheme (in general)
– Data deciphering
• Results
– Multiplier option
• Discussion
• Conclusion
• Long Term Goal

Implementation Scheme
[Diagram boxes:]
• Experimental data for some benchmarks
• Look for trends and dependencies
• Comparing with the tradeoffs and providing the best solution
• Propose a suitable relationship

Data Deciphering
• Multiplier option (hard/soft multiplier)
– Approximate cycle count change on using them?
• Multiplication operation is converted to a set of shifts and adds
– Simulated the algorithm to find the equivalent number of instructions
– Plotted the number of equivalent instructions vs. the changes in cycle counts (experimental data)

Hard and Soft Multiplier

Hard Multiplier
• Does the multiply operation as a single instruction
• Occupies finite area
• Delays the clock by a finite time
• Consumes a finite amount of power

Soft Multiplier
• No dedicated multiplier
• Each multiply instruction is converted into simpler instructions
• No change in area, frequency or power

Method of Analysis
[Diagram boxes:]
• A*B
• Set of branch, shift and add instructions
• Total change in equivalent instructions, for all multiply instructions in the benchmark
• Plot against the change in cycle count (experimental) for all processor variants

Results
• Gnuplot used to plot graphs on log scale
• A linear correlation obtained between the points plotted

[Figure: Example 1. Labels: "Increase in cycle counts (log scale)"; "Change in equivalent instructions from hard multiplier to soft multiplier on pipe5, barrel-shift processor"]

[Figure: Example 2. Labels: "Increase in cycle counts (log scale)"; "Change in equivalent instructions from hard multiplier to soft multiplier on serial shift, high rise processor"]

Discussion
• "Fit.log" (generated by gnuplot) as a good measure of correlation
• Percentage uncertainty is expressed by the Asymptotic Standard Error (A.S.E.)
• Example 1: A.S.E. is 4.132%
• Example 2: A.S.E. is 3.166%
• A linear dependence is found on log scale

[Figure: A.S.E. of all Processor Variants (axis range 0 to 12)]

Conclusion
• The linear fit predicts quite accurately the change in cycle count with a change in feature
• This change, for all the features, serves as input to part 2 of the enhanced simulator
• Template for future work

[Figure: Example 2, annotated. "Increase in cycle counts (log scale)": this gives the approximate change in cycle count for a new application. "Change in equivalent instructions from hard multiplier to soft multiplier on serial shift, high rise processor": from part 1 of MINT, by running the application on it]

Future Work
• Presently, dealt only with the multiplier option
• Similar analysis on other features
• Comparison between user demands and approximate cycle counts

References
• Martin Labrecque and J. Gregory Steffan, "Improving Pipelined Soft Processors with Multithreading"
• Peter Yiannacouras, J. Gregory Steffan and Jonathan Rose, "Application-Specific Customization of Soft Processor Microarchitecture"

Special Thanks
• Prof. Greg Steffan
• CARG (Compiler & Architecture Reading Group)
• PaCRaT (Parallelism and Customization Research At University of Toronto)

What I learnt?
• Research is not a 9-to-5 job; it is a lifestyle of discovering something small but relevant from time to time
• At times nothing seems to bear fruit; that is the time to get up from your seat

Thanks

Any questions?
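
Backup sketch (not from the original slides): the Method of Analysis can be illustrated in Python under stated assumptions. `soft_multiply` expands A*B into branches, shifts and adds and counts them with a hypothetical cost model (one loop branch and two shifts per bit of B, one add per set bit, one exit branch); the routine actually used in the experiments is not given in the talk. `fit_log_log` is an ordinary least-squares line through the points after taking log10 of both axes, standing in for the gnuplot fit; treating both axes as logarithmic is an assumption. All function names are illustrative.

```python
import math


def soft_multiply(a, b):
    """Multiply two non-negative integers using only shifts and adds.

    Returns (product, instruction_count). The count follows a hypothetical
    cost model and is not taken from the talk.
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative operands only")
    product = 0
    instructions = 1  # the branch that finally exits the loop
    while b != 0:
        instructions += 3  # loop branch + shift of a + shift of b
        if b & 1:
            product += a  # add the shifted multiplicand
            instructions += 1
        a <<= 1
        b >>= 1
    return product, instructions


def equivalent_instruction_change(multiplies):
    """Extra instructions when each single-instruction hard multiply
    is replaced by the soft routine, summed over (a, b) operand pairs."""
    return sum(soft_multiply(a, b)[1] - 1 for a, b in multiplies)


def fit_log_log(xs, ys):
    """Least-squares fit of log10(y) = slope * log10(x) + intercept."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_cycle_increase(slope, intercept, instruction_change):
    """Map an equivalent-instruction change to a cycle-count increase
    using the fitted line (assumes instruction_change > 0)."""
    return 10 ** (slope * math.log10(instruction_change) + intercept)
```

With a fitted slope and intercept for one processor variant, `predict_cycle_increase` turns the equivalent-instruction change of a new application into an approximate cycle-count increase, the quantity the Conclusion slide passes to part 2 of the enhanced simulator.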