• What does ICAP stand for? What is its main use?
• Why is Partition Pin preferred over Bus Macro?
1. Module-based PR:
• Implement each Reconfigurable Module as an individual project
• Constrain each PR module to be placed in a given partition
• Initially the full bitstream is loaded; the partial bitstream of a complete PR module is loaded on demand
• Supported by PlanAhead. Will be covered in detail
2. Difference-based PR:
• Implement each Reconfigurable Module as an individual project
• Constrain each PR module to be placed in a given partition
• Compute the difference between the bitstreams of the Reconfigurable Modules to obtain the differential partial bitstream
• Initially the full bitstream is loaded; the differential partial bitstream of a PR module is loaded on demand
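The difference step can be sketched as a frame-by-frame comparison. This is only an illustration: a configuration is modeled as a plain list of frames (tuples of words), which is an assumption and not the real bitstream file format, and the names differential_frames and apply_partial are hypothetical.

```python
def differential_frames(base, target):
    """Return {frame_index: frame} for every frame of target that differs from base."""
    if len(base) != len(target):
        raise ValueError("both configurations must cover the same frames")
    return {i: t for i, (b, t) in enumerate(zip(base, target)) if b != t}


def apply_partial(config, partial):
    """Overwrite only the frames listed in the differential partial configuration."""
    updated = list(config)
    for index, frame in partial.items():
        updated[index] = frame
    return updated


# Toy configurations (made-up values): frames 0 and 2 are shared static logic.
sobel = [(0, 0, 0), (1, 2, 3), (4, 5, 6), (7, 7, 7)]
sepia = [(0, 0, 0), (9, 2, 3), (4, 5, 6), (8, 8, 8)]

partial = differential_frames(sobel, sepia)
print(sorted(partial))                         # indices of frames that must be rewritten
print(apply_partial(sobel, partial) == sepia)  # loading the difference yields sepia
```

Only the differing frames are stored, so the differential partial bitstream shrinks as the two modules share more configuration.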
Figure: Without PR vs. With PR
Figure: Original, Sobel-processed, and Sepia-processed images
The reconfigurable “Filter Engine” is replaced with the Sobel or Sepia filter at run time through partial reconfiguration.
Figure: System-level block diagram showing the static part and the reconfigurable part (implemented on a Zynq-7000 7Z020 SoC)
Ref: Application Note: Zynq-7000 All Programmable SoC
Non-PR-specific design flow
Vivado HLS: converts high-level code to RTL code
Xilinx Synthesis Tool: converts RTL code to a netlist
PR-specific design flow
PlanAhead tool, used to:
1. Define the reconfigurable partitions
2. Floorplan the design
3. Add reconfigurable modules
4. Run the implementation tools to generate the full and partial bitstreams
Partitioning Style
• A partition defines the smallest atomic area to which a module can be assigned
• Different partitioning styles are possible
• Not all are supported by commercial vendors
• Island style
• Slot Based
• Grid Based
Figure: Island, slot-based, and grid-based partitioning
Placement Flexibility
• Partitioning style affects placement and flexibility
• Island style: suffers from fragmentation.
• Offered by the current vendors Xilinx and Altera.
• Slot style: also suffers from fragmentation, but to a lesser extent.
• Some academic tools, such as ReCoBus, have explored this style.
• Grid style: reduced fragmentation, but difficult to support.
• To enhance flexibility, the PR module must be placed and routed in every region where it may need to be configured.
• This puts additional stress on bitstream size.
• In the Netlist view of the synthesized design, select FILTER ENGINE to set the partition
• The partition type should be set to Reconfigurable Partition
Select the Sobel Filter Netlist for the Reconfigurable partition
Add Sepia filter Netlist for the Reconfigurable partition
The PlanAhead tool requires the user to select the PR region manually, sizing it for the resources required by the most complex reconfigurable module
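A minimal sketch of the sizing rule, assuming per-module utilization figures taken from synthesis reports. The numbers below are made up for illustration, and required_region is a hypothetical helper, not a PlanAhead function. The maximum is taken per resource type because the module with the most LUTs is not necessarily the one with the most DSP slices.

```python
def required_region(modules):
    """Return the per-resource maximum over all reconfigurable modules."""
    kinds = set().union(*(usage.keys() for usage in modules.values()))
    return {kind: max(usage.get(kind, 0) for usage in modules.values())
            for kind in sorted(kinds)}


# Hypothetical utilization figures, for illustration only.
modules = {
    "sobel": {"LUT": 2500, "FF": 3100, "BRAM": 4, "DSP": 0},
    "sepia": {"LUT": 1800, "FF": 2000, "BRAM": 2, "DSP": 9},
}

print(required_region(modules))
```

The region drawn in the floorplanner must contain at least these amounts of each resource type.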
Resource Consideration
• Column-wise layout of different logic primitives
• Must be considered when placing
• Depending on the type of logic primitives used by the module (SLICEX, SLICEM, etc.), relocation may or may not be possible.
Global Clocks
• When possible, add frames to an RP range within the same clock region rather than adding another clock region, to avoid clock starvation
Fan Outs
Manually optimize the fanout before the automatic placement and routing done in the implementation stage, for a better design
• The final placed and routed designs for the Sobel and Sepia filters
• This step generates the full and partial bitstreams for the Sepia and Sobel filters.
• The full bitstream of the Sobel filter can be used as the initial bitstream.
• The partial bitstreams of the Sepia and Sobel filters can be loaded into the FPGA via PCAP on demand.
Frame Address Register
• Row address: 0 to 9
• Top/Bottom: selects the top or bottom half of the FPGA
• Together with the row address, it locates the tile
• Major address: columns, numbered from 0
• Minor address: frame number within the tile
• Block type: logic blocks, BRAMs, routing blocks
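The fields above can be packed into a single register word. The sketch below assumes the 7-series bit layout (block type in bits 25:23, top/bottom in bit 22, row in 21:17, column in 16:7, minor in 6:0). Other device families use different field widths, so treat the FIELDS table as an assumption to check against the configuration guide for the target device. pack_far and unpack_far are hypothetical helper names.

```python
# (name, least significant bit, width) -- assumed 7-series layout.
FIELDS = (
    ("block_type", 23, 3),
    ("top_bottom", 22, 1),
    ("row", 17, 5),
    ("column", 7, 10),   # the major address
    ("minor", 0, 7),
)


def pack_far(**values):
    """Build a frame address word from named fields; omitted fields are 0."""
    far = 0
    for name, shift, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        far |= value << shift
    return far


def unpack_far(far):
    """Split a frame address word back into its named fields."""
    return {name: (far >> shift) & ((1 << width) - 1)
            for name, shift, width in FIELDS}


far = pack_far(block_type=1, top_bottom=1, row=2, column=5, minor=3)
print(hex(far))
print(unpack_far(far))
```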
Frame Composition
• PR itself requires power
• Power during PR is spent in:
1. Configuration data access
2. Actual configuration of FPGA resources
Bonamy, R., et al. "Power Consumption Models for the Use of Dynamic and Partial Reconfiguration." Microprocessors and Microsystems (2014).
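As a rough back-of-the-envelope sketch of the two components, the energy of one reconfiguration can be estimated as power multiplied by transfer time. This is a deliberately simplified assumption (both activities run for the whole transfer at constant power) and is not the model from the cited paper. All numbers are hypothetical.

```python
def reconfiguration_energy(bitstream_bytes, throughput_bytes_per_s,
                           access_power_w, config_power_w):
    """Return (duration in seconds, energy in joules) for one partial reconfiguration.

    Assumes data access and configuration overlap for the whole transfer
    and that each draws a constant power while it runs.
    """
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    duration_s = bitstream_bytes / throughput_bytes_per_s
    access_energy_j = access_power_w * duration_s   # 1. configuration data access
    config_energy_j = config_power_w * duration_s   # 2. configuring FPGA resources
    return duration_s, access_energy_j + config_energy_j


# Hypothetical values: 1 MB partial bitstream, 100 MB/s port, 0.1 W + 0.2 W.
duration_s, energy_j = reconfiguration_energy(1_000_000, 100e6, 0.1, 0.2)
print(duration_s, energy_j)
```

Under this simplification, both reconfiguration time and energy scale linearly with partial bitstream size.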
Fault Tolerance – Self Healing Architecture
• Fault tolerant Processor
• IF, MAC, and ALU are the PRMs
• Different configurations are available for each module.
• The focus is on the self-healing feature rather than on performance.
Psarakis, Mihalis, and Andreas Apostolakis. "Fault tolerant FPGA processor based on runtime reconfigurable modules."
Reconfigurable Crypto processor
• Processor can choose from different crypto algorithms
• Major area savings
• Some power savings too
Hori, Yohei, Toshihiro Katashita, and Kazukuni Kobara. "Energy and area saving effect of Dynamic Partial Reconfiguration on a 28-nm process FPGA."
Fast Start Up
• Fast start-up is a two-step configuration
• Useful in time-critical systems to initiate a swift system start-up.
• Example: automotive safety
Meyer, Joachim, et al. "Fast start-up for Spartan-6 FPGAs using dynamic partial reconfiguration."
• Complicated design flow
• Tool Support
• Doesn’t support slot/grid style
• Manual placement steps
• Manual assistance for reconfiguring different target devices.
• Security issues
• Although an encryption option is provided, security issues persist.
• Decreased performance compared to full configuration.
• Xilinx reports a 10% degradation in clock frequency when using PR.
Xilinx PR Implementation Flow
1. HDL Design Description
2. HDL Synthesis
3. Set Design Constraints
4. Placement Analysis
5. Implement Static Design and PR Modules
6. Merge
7. Final Bitstreams
• A run-time reconfigurable motion estimation engine.
• Motion estimation (block matching) techniques are used in video stabilization.
• Switch between two algorithms (Full Search and Diamond Search) depending on external inputs such as video quality.
• Achieve a trade-off between speed and accuracy based on external inputs.
• Evaluate metrics such as area savings, power savings, and reconfiguration time.
• PR tools are not yet mature, so implementing the motion estimation algorithms using PR will be challenging. As a backup plan, we will implement the “Algorithmic approach to partial bit stream relocation”.
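For reference, Full Search block matching can be sketched in software as an exhaustive scan of every displacement in a search window, scored by the sum of absolute differences (SAD). This is an illustrative model only, not the hardware design. Frames are plain lists of pixel rows, and the function names and test frame are assumptions. Diamond Search would visit far fewer candidates at the cost of possibly missing the global minimum.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block of cur at (bx, by)
    and the block of ref displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def full_search(cur, ref, bx, by, n, search_range):
    """Try every displacement in the window; return (cost, dx, dy) of the best match."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose block would fall outside the reference frame.
            if not (0 <= by + dy and by + dy + n <= height
                    and 0 <= bx + dx and bx + dx + n <= width):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Synthetic 8x8 reference frame; the current frame is the reference
# content moved by 2 pixels horizontally and 1 pixel vertically.
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0
        for x in range(8)] for y in range(8)]

print(full_search(cur, ref, bx=2, by=2, n=3, search_range=2))
```

With a search range of r, Full Search evaluates up to (2r + 1)^2 candidates per block, which is the cost that a faster algorithm such as Diamond Search trades against accuracy.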