The Microprocessor Is No Longer General Purpose

Design Gap

Problems with Fine Grained Approach FPGAs
• Area inefficient
   – Percentage of chip area for wiring far too high
• Too slow
   – Unavoidable critical paths too long
• Routing and Placement is very complex

Problems with Fine Grained FPGAs

Coarse Grained Reconfigurable Computing
• Uses reconfigurable arrays with path-widths greater than 1 bit
• More area-efficient
• Massive reduction in configuration memory and configuration time
• Drastic reduction in complexity of Placement & Routing

Coarse Grained Architectures Classification
• Mesh-based
• Linear Arrays based
• Cross-bar based

Mesh Based Architectures
• Arranges PEs in a 2-D array
• Encourages nearest neighbor links between adjacent PEs
• E.g. KressArray, Matrix, RAW, CHESS

Matrix – Mesh Based Architecture

Architectures based on Linear Arrays
• Aimed at mapping pipelines on linear arrays
• If the pipeline has forks, longer lines spanning the whole array or part of it are used
• E.g. RaPiD, PipeRench

PipeRench – Linear Array Based Architecture

Cross-bar based Architectures
• Communication Network is easy to route
• Uses restricted cross-bars with hierarchical interconnect to save area
• E.g. PADDI-1, PADDI-2, Pleiades

PADDI-2 – Cross-bar Based Architecture

Coarse Grained Architectures

EGRA
• Architectural template to enable design space exploration
• Executes expressions as opposed to operations
• Supports heterogeneous cells and various memory interfaces

Evolution of fine grained and coarse grained architectures

EGRA – at Cell Level

Architectural Exploration

EGRA vs CGRA vs FPGA

EGRA – at Array Level
• Organized as a mesh of cells of three types
   – RACs
   – Memories
   – Multipliers
• Cells are connected using both nearest neighbor and horizontal-vertical buses
• Each cell has an I/O interface, context memory and core

Control Unit

EGRA Operation
• DMA mode
   – Used to transfer data in bursts to EGRA
   – To program cells and to read/write from scratchpad memories
• Execution mode
   – Control unit orchestrates data flow between cells

Experimental Results

EGRA Memory Interface
• Data register at the output of computational cells
• Memory cells can be scattered around in the array
• A scratchpad memory outside the reconfigurable mesh

Architectural Exploration - Area

Architectural Exploration - Delay

MORA

The Reconfigurable Cell

Operating Modes of RC

Interconnection Topology
• Hierarchical
   – Level 1 used within a 4x4 quadrant to provide nearest neighbor connectivity
   – Interleaved Horizontal and Vertical connectivity of length two
   – Each RC can receive data from at most two other RCs and send data to at most four other RCs
   – Data and control transfer across quadrants is guaranteed over the Level 2 interconnection

Computational Strategies
• Temporal computational load balancing
• Spatial computational load balancing
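
The Level 1 rules listed under Interconnection Topology (4x4 quadrant, nearest neighbor or length-two horizontal/vertical links, at most two sources and four destinations per RC) can be checked mechanically. The following is a minimal sketch under stated assumptions, not MORA's actual wiring: RCs are modelled as (row, col) pairs, and the names `link_kind`, `check_quadrant` and `example_links` are hypothetical, introduced only for illustration. The example wiring (each RC feeding its right and lower neighbor) is likewise invented to exercise the checker.

```python
from collections import Counter

QUADRANT = 4      # a Level 1 quadrant is 4x4 RCs, as stated on the slide
MAX_FAN_IN = 2    # an RC receives data from at most two other RCs
MAX_FAN_OUT = 4   # an RC sends data to at most four other RCs


def link_kind(src, dst):
    """Classify a directed link between two (row, col) RCs, or return None."""
    dr, dc = abs(src[0] - dst[0]), abs(src[1] - dst[1])
    if dr + dc == 1:
        return "nearest-neighbor"
    if (dr, dc) in ((0, 2), (2, 0)):
        return "length-two"
    return None  # diagonal, self-link, or too long for Level 1


def check_quadrant(links):
    """Return a list of rule violations for a collection of directed links."""
    problems = []
    fan_in, fan_out = Counter(), Counter()
    for src, dst in sorted(set(links)):  # count each distinct link once
        for rc in (src, dst):
            if not (0 <= rc[0] < QUADRANT and 0 <= rc[1] < QUADRANT):
                problems.append(f"{rc} lies outside the 4x4 quadrant")
        if link_kind(src, dst) is None:
            problems.append(f"{src}->{dst} is not a Level 1 link")
        fan_out[src] += 1
        fan_in[dst] += 1
    for rc, n in sorted(fan_in.items()):
        if n > MAX_FAN_IN:
            problems.append(f"{rc} receives from {n} RCs (max {MAX_FAN_IN})")
    for rc, n in sorted(fan_out.items()):
        if n > MAX_FAN_OUT:
            problems.append(f"{rc} sends to {n} RCs (max {MAX_FAN_OUT})")
    return problems


# Hypothetical wiring: every RC feeds its right and its lower neighbor.
example_links = (
    [((r, c), (r, c + 1)) for r in range(QUADRANT) for c in range(QUADRANT - 1)]
    + [((r, c), (r + 1, c)) for r in range(QUADRANT - 1) for c in range(QUADRANT)]
)

if __name__ == "__main__":
    print(check_quadrant(example_links))
    # Adding a third source for RC (1, 1) breaks the fan-in rule.
    print(check_quadrant(example_links + [((1, 2), (1, 1))]))
```

A checker of this kind only validates a candidate wiring against the stated limits; it does not say which of the permitted links a real quadrant uses, and traffic between quadrants (Level 2) is outside its scope.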