The Microprocessor is no more General Purpose
Design Gap
Problems with Fine Grained Approach
FPGAs
• Area inefficient
– Percentage of chip area spent on wiring is far too high
• Too slow
– Unavoidable critical paths are too long
• Placement and routing are very complex
Problems with Fine Grained FPGAs
Coarse Grained Reconfigurable Computing
• Uses reconfigurable arrays with datapath widths greater than 1 bit
• More area-efficient than fine grained FPGAs
• Massive reduction in configuration memory and configuration time
• Drastic reduction in the complexity of placement and routing
Coarse Grained Architectures
Classification
• Mesh-based
• Linear-array based
• Cross-bar based
Mesh Based Architectures
• Arranges processing elements (PEs) in a 2-D array
• Encourages nearest-neighbor links between adjacent PEs
• E.g. KressArray, Matrix, RAW, CHESS
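The nearest-neighbor idea above can be sketched in a few lines. This is a minimal illustration, assuming a plain 4-connected mesh (north, south, west, east); the function name `mesh_neighbors` is hypothetical, and the real interconnect of KressArray, Matrix, RAW or CHESS is not modeled here.

```python
def mesh_neighbors(row, col, rows, cols):
    """Return the coordinates of the PEs adjacent to (row, col).

    Assumes a 4-connected mesh of size rows x cols; PEs on the border
    simply have fewer neighbors (no wrap-around links).
    """
    candidates = [
        (row - 1, col),  # north
        (row + 1, col),  # south
        (row, col - 1),  # west
        (row, col + 1),  # east
    ]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]
```

A corner PE in this sketch has two neighbors and an interior PE has four, which is why a mesh keeps most wires short and local.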
Matrix – Mesh Based Architecture
Architectures based on Linear Arrays
• Aimed at mapping pipelines on linear arrays
• If the pipeline has forks, longer lines spanning the whole array or part of it are used
• E.g. RaPiD, PipeRench
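Mapping a pipeline onto a linear array can be illustrated with a toy cycle-by-cycle simulation. This is a sketch under stated assumptions: each PE applies one stage function and has one output register, `None` marks an empty pipeline slot, and forks are not modeled. The name `run_linear_pipeline` is hypothetical and does not describe how RaPiD or PipeRench actually work.

```python
def run_linear_pipeline(stages, inputs):
    """Push inputs through a linear array of PEs, one stage per PE.

    stages: list of one-argument functions, one per PE, left to right.
    inputs: sequence of values (None is reserved as the bubble marker).
    """
    regs = [None] * len(stages)  # output register of each PE
    outputs = []
    # Feed the real inputs, then enough bubbles to drain the pipeline.
    for item in list(inputs) + [None] * len(stages):
        # The value in the last register leaves the array this cycle.
        if regs[-1] is not None:
            outputs.append(regs[-1])
        # Update right to left so each PE reads its left neighbor's old value.
        for i in range(len(stages) - 1, 0, -1):
            regs[i] = stages[i](regs[i - 1]) if regs[i - 1] is not None else None
        regs[0] = stages[0](item) if item is not None else None
    return outputs
```

With three stages, the first result appears only after the pipeline has filled; after that, one result leaves the array per cycle.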
PipeRench – Linear Array Based Architecture
Cross-bar based Architectures
• Communication Network is easy to route
• Uses restricted cross-bars with hierarchical interconnect to save area
• E.g. PADDI-1, PADDI-2, Pleiades
PADDI-2 – Cross-bar Based Architecture
Coarse Grained Architectures
EGRA
• Architectural template to enable design space exploration
• Cells execute whole expressions as opposed to single operations
• Supports heterogeneous cells and various memory interfaces
EGRA
Evolution of fine grained and coarse grained architectures
EGRA – at Cell Level
Architectural Exploration
EGRA vs CGRA vs FPGA
EGRA – at array level
• Organized as a mesh of cells of three types
– RACs
– Memories
– Multipliers
• Cells are connected using both nearest-neighbor links and horizontal-vertical buses
• Each cell has an I/O interface, a context memory and a core
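The array-level organization above can be sketched as a small data structure. This is only an illustration: the one-letter layout encoding, the names `CellKind`, `Cell`, `build_mesh` and `count_kinds`, and the list used for the context memory are all assumptions made for this example, not part of the EGRA template itself.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class CellKind(Enum):
    RAC = "R"         # computational cell
    MEMORY = "M"      # memory cell
    MULTIPLIER = "X"  # multiplier cell


@dataclass
class Cell:
    kind: CellKind
    # Configuration words for this cell (hypothetical representation).
    context_memory: list = field(default_factory=list)


def build_mesh(layout):
    """Build a 2-D mesh of cells from rows of one-letter kind codes."""
    return [[Cell(CellKind(ch)) for ch in row] for row in layout]


def count_kinds(mesh):
    """Count how many cells of each kind the mesh contains."""
    return Counter(cell.kind for row in mesh for cell in row)
```

For example, `build_mesh(["RRM", "RXR"])` describes a 2x3 mesh with four RACs, one memory cell and one multiplier; changing the layout strings is a cheap way to enumerate heterogeneous candidates during exploration.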
Control Unit
EGRA Operation
• DMA mode
– Used to transfer data in bursts to the EGRA
– Used to program cells and to read from and write to scratchpad memories
• Execution mode
– Control unit orchestrates data flow between cells
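The split between the two modes can be pictured as a small state machine. The sketch below is a toy model: the class `ToyArray`, its method names, and the rule that DMA transfers are rejected while the array is executing are assumptions for illustration, not a description of the actual EGRA control unit.

```python
from enum import Enum


class Mode(Enum):
    DMA = "dma"
    EXECUTION = "execution"


class ToyArray:
    def __init__(self, num_cells, scratchpad_words):
        self.mode = Mode.DMA
        self.contexts = [None] * num_cells       # one context per cell
        self.scratchpad = [0] * scratchpad_words

    def _require(self, mode):
        if self.mode is not mode:
            raise RuntimeError(f"operation requires {mode.value} mode")

    def dma_program_cell(self, cell, context):
        """Load a configuration into one cell (DMA mode only)."""
        self._require(Mode.DMA)
        self.contexts[cell] = context

    def dma_write(self, address, words):
        """Burst-write words into the scratchpad (DMA mode only)."""
        self._require(Mode.DMA)
        self.scratchpad[address:address + len(words)] = words

    def dma_read(self, address, count):
        """Burst-read words from the scratchpad (DMA mode only)."""
        self._require(Mode.DMA)
        return self.scratchpad[address:address + count]

    def start_execution(self):
        self.mode = Mode.EXECUTION

    def stop_execution(self):
        self.mode = Mode.DMA
```

A typical sequence in this model is: program the cells and fill the scratchpad in DMA mode, switch to execution mode, then switch back to DMA mode to read the results.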
EGRA – at array level
Experimental Results
EGRA Memory Interface
• Data registers at the output of computational cells
• Memory cells that can be scattered around the array
• A scratchpad memory outside the reconfigurable mesh
Architectural exploration - Area
Architectural exploration - Delay
MORA
The reconfigurable Cell
Operating modes of RC
Interconnection Topology
• Hierarchical
– Level 1 is used within each 4x4 quadrant to provide nearest-neighbor connectivity
– Interleaved horizontal and vertical connectivity of length two
– Each RC can receive data from at most two other RCs and send data to at most four other RCs
– Data and control transfer across quadrants is guaranteed over the Level 2 interconnection
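The two-level hierarchy and the fan-in/fan-out limits above can be checked with a short sketch. The helper names (`quadrant_of`, `link_level`, `check_fan_limits`) are hypothetical, and the classification is simplified: a link is labeled Level 1 whenever both RCs sit in the same 4x4 quadrant, without verifying that it is actually a nearest-neighbor or length-two connection.

```python
from collections import Counter

QUADRANT = 4  # quadrants are 4x4 blocks of RCs, as stated above


def quadrant_of(row, col):
    """Return the (row, col) index of the quadrant containing an RC."""
    return (row // QUADRANT, col // QUADRANT)


def link_level(src, dst):
    """Return 1 for a link inside one quadrant, 2 for a link across quadrants."""
    return 1 if quadrant_of(*src) == quadrant_of(*dst) else 2


def check_fan_limits(links, max_in=2, max_out=4):
    """Check that no RC receives from more than max_in or sends to more than max_out RCs.

    links: iterable of (src, dst) pairs, each a (row, col) coordinate.
    """
    links = list(links)
    fan_in = Counter(dst for _, dst in links)
    fan_out = Counter(src for src, _ in links)
    return (all(n <= max_in for n in fan_in.values())
            and all(n <= max_out for n in fan_out.values()))
```

For instance, a link from RC (3, 3) to RC (3, 4) crosses a quadrant boundary and would use the Level 2 interconnection in this model, even though the two RCs are physically adjacent.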
Interconnection Topology
Computational Strategies
• Temporal computational load balancing
• Spatial computational load balancing