Component Based Design using Constraint Programming for

advertisement
Component Based Design using Constraint
Programming for Module Placement on FPGAs
Alexander Wold, Dirk Koch, Jim Torresen
Department of Informatics, University of Oslo, Norway
Email: {alexawo,koch,jimtoer}@ifi.uio.no
Abstract—Constraint satisfaction modeling is both an efficient,
and an elegant approach to model and solve many real world
problems. In this paper, we present a constraint solver targeting
module placement in static and partial run-time reconfigurable
systems. We use the constraint solver to compute feasible
placement positions. Our placement model incorporates communication, implementation variants and device configuration
granularity. In addition, we model heterogeneous resources such
as embedded memory, multipliers and logic. Furthermore, we
take into account that logic resources consist of different types
including logic only LUTs, arithmetic LUTs with carry chains,
and LUTs with distributed memory. Our work targets state of the
art field-programmable gate arrays (FPGAs) in both design-time
and run-time applications. In order to evaluate our placement
model and module placer implementation, we have implemented
a repository containing 200 fully functional, placed and routed
relocatable modules. The modules are used to implement complete
systems. This validates the feasibility of both the model and the
module placer. Furthermore, we present simulated results for
run-time applications, and compare this to other state of the art
research. In run-time applications, the results point to improved
resource utilization. This is a result of using a finer tile grid and
complex module shapes.
Keywords—Constraint programming, FPGA, relocatable modules, reconfigurable architectures, partial run-time reconfiguration.
I.
I NTRODUCTION
Run-time reconfigurable systems have been an active research field since the first FPGAs were introduced. At that
time, research into run-time reconfiguration was motivated
by the fact that FPGA devices were too small to fit many
designs [1]. However, FPGA vendors have been quick to
exploit advances in semiconductor process technology. This
has allowed larger device sizes implementing more complex
designs.
The methodology used to design FPGA-based systems
however, has not seen the same advances. On large devices,
implementation of systems with current FPGA tools can take
significant time. For example, to place and route complex systems implemented on large devices can take several days [2].
A component-based design flow has been proposed to
address long place and route times [3]. Here, fully placed and
routed modules are used to implement the system. In addition,
using a component-based design flow to implement run-time
reconfigurable systems can yield improvements in terms of
area and power [4]. Placement positions are computed either
at design time (i.e. offline) or at run-time, depending on the
application.
The work presented in this paper targets static componentbased designs and run-time reconfigurable systems. The contribution of this paper is the improvement in modeling module
placement with realistic real world constraints. In particular,
we improve upon previous work in the field by taking into
account that logic resources are heterogeneous. In Xilinx
FPGAs, there are different types of logic resources: logic only
look up tables (LUTs), arithmetic LUTs with carry chains,
and LUTs with distributed memory. In addition, as in previous
work in the area [5], we model other heterogeneous resources
such as embedded memory, multipliers and logic.
Optimal packing of shapes into a bin is a NP-hard problem [6]. However, given the constraints present in modern
FPGA devices, the solution space is reduced. Therefore, in
this paper, we investigate the scalability of using constraint
programming to place modules given real world constraints.
Module placement is constrained by heterogeneous resources,
configuration granularity, communication and available free
area. The solution space is also determined by the device size,
number of modules, and module layout variants.
With the given constraints, modules can no longer be freely
placed on the reconfigurable device. This is exploited in our
implementation to compute feasible module placements meeting all constraints. The computed module locations are used to
implement complete systems. In this work, we have extended
our previous model [17] and improved the implementation in
the following areas:
•
Heterogeneous logic resources (in addition to other
heterogeneous resources such as memory and multipliers) are taken into consideration when placing
modules.
•
Bus-based and module-to-module communication can
be modeled.
•
Configuration aware placement based on the addressing granularity of the FPGA.
978-1-4673-6180-4/13/$31.00 ©2013 IEEE
Table I.
OVERVIEW OF RELATED WORK
Authors
Design-/
Run-time
Placement
Style
Module
layout
Implementation
variants
Heterogeneous
resources
Communication
Configuration
granularity
Bazargan and
Sarrafzadeh [7]
Both
2D
Rectangular
No
No
No
No
Fekete, Kohler and
Teich [8]
Design
time
2D
Rectangular
No
No
No
No
Yuh, Yang and
Chang [9]
Design
time
2D
Rectangular
No
No
Yes
No
Singhal and
Bozorgzadeh [10]
Design
time
2D
Rectangular
No
No
No
Yes
Singhal and
Bozorgzadeh [5]
Design
time
2D
Rectangular
No
Yes
No
No
Monotone, Sciuto and
Santambrogio [11]
Design
time
2D
Rectangular
No
Yes
Yes
Yes
Belaid, Muller and
Benjemaa [12]
Design
time
2D
Rectangular
No
Yes
No
No
Koester, Porrmann and
Kalte [13]
Run-time
1D/2D
Rectangular
No
Yes
No
NA
for 1D
Koester, Luk, Kalte,
Hagemeyer and Porrmann [14]
Run-time
2D
Rectangular
Yes
Yes
Yes
Yes
Angermeier, Sibirko
Wildermann and Teich [15]
Run-time
2D
Rectangular
No
Yes
Yes
Yes
Feng and
Mehta [16]
Design
time
2D
Arbitrary
No
Yes
No
No
Wold, Koch and
Torresen [17]
Design
time
2D
Arbitrary
Yes
Yes
No
No
This
work
Both
2D
Arbitrary
Yes
Yes
Yes
Yes
The remainder of the paper is organized as follows: background and related work are presented in the following section.
Section III describes the placement model. In Section IV, we
present the module placer implementation. This is followed
by experimental results in Section V, and a conclusion in
Section VI.
II.
BACKGROUND AND RELATED WORK
A large amount of previous research has been undertaken in
the area of module placement targeting reconfigurable devices.
An overview of related work is presented in Table I. The
different approaches can be classified by their properties as
presented in the following sections:
A. Module Placement Algorithms
In [7]–[10], module placement is approached as a fully flexible 2D packing problem. As stated in the introduction, finding
the optimal solution is known to be a NP-hard problem [6].
Heuristics developed for module placement are typically the
same as that of generic 2D bin packing problems. A survey on
placement algorithms is presented in [18]. Besides heuristics,
exact placement techniques and combinations of exact techniques and heuristics have been proposed. This includes ILP
(integer linear programming) [19] and constraint programming.
B. Module Implementation Variants
In [16], floorplanning for static only systems without
presynthesized and implemented modules were examined.
This achieves higher resource utilization compared to models
considering presynthesized and implemented modules. This
is a consequence of fragmentation that occurs when fitting
functions in module bounding boxes.
One method to increase resource utilization when placing
presynthesized and implemented modules is to use implementation variants [14]. An implementation variant can be a
different layout (i.e. a different shape of the module) or a
different design with the same functionality. Implementation
variants improve the average resource utilization as modules
can be placed with less fragmentation. However, implementation variants increase the search space. This results in longer
execution time for the placer as reported in [17].
For online module placement, implementation variants are
selected at run-time. Storing several module variants requires
more memory in the embedded system. However, not all
possible variants that are available at design time will be used
by the actual system at run-time. Consequently, only selected
modules have to be stored in the system.
Table II.
Type
SliceM
SliceL
SliceX
Logic
Yes
Yes
Yes
Table III.
S LICES ON S PARTAN -6 X ILINX FPGA S
Carry chains
Distributed Memory
Yes
Yes
No
CLBXL
Yes
No
No
CLBXM
SLICEX
Shift Register
LUT
Yes
No
No
LUT
FF
LUT
Type
Logic
Carry chains
Distributed Memory
Shift Register
SliceM
SliceL
Yes
Yes
Yes
Yes
Yes
No
Yes
No
C. Heterogeneous Resources
As observed by Montone et al. [11], when the placement
is more constrained (i.e. more realistically modeled), the
results appear less impressive. For example, homogeneous
models [7]–[10] typically achieve better packing results than
heterogeneous models [5], [11]–[17]. However, most FPGAs
today are heterogeneous devices. These devices have different
resources which are limiting the placement flexibility.
Several different methods have been proposed for heterogeneous module placement. Koester [13], [14], takes heterogeneous resources into account by determining the likelihood
for placement positions on an FPGA. By assigning a higher
cost to likely placement positions, the overall module reject
rate is improved in a run-time reconfigurable system.
All related publications considering heterogeneous resources, consider logic primitives as a single resource type.
However, on Xilinx FPGAs, there are different types of logic
primitives, called configurable logic blocks (CLBs). For example, on Spartan-6 FPGAs, a CLB consists of two slices [20],
[21]. There are three different types of slices, SliceM, SliceL
and SliceX. Table II and Table III show the different Slices in
Spartan-6 and Virtex-6 devices. SliceM can be used as logic,
distributed memory, or as a shift register. In addition, SliceM
has carry chains for arithmetic operations. SliceL can be used
as logic, and has carry chains. SliceX does not have carry
chains. A CLB (on a Spartan-6 device) contains one SliceX
and one SliceM, or SliceL, as depicted in Figure 1. A SliceM
can act as a SliceL and SliceX. A SliceL can act as a SliceX.
Therefore, we can relocate modules using SliceL to SliceM
positions. The opposite is however, not possible. Therefore, we
have refined our model to take the different logic primitives
available on recent FPGAs into account.
D. Communication Infrastructure
For run-time reconfigurable systems, different communication architectures have been proposed [15], [22]–[24].
This includes buses, network on chip (NoC), and point-topoint connections. The main difference between a traditional
approach and a communication architecture for componentbased design (and for partial reconfiguration), is to provide
compatible interfaces to different modules. This is particularly
FF
FF
FF
SLICEL
S LICES ON V IRTEX -6 X ILINX FPGA S
SLICEX
SLICEM
LUT
RAM
SRL
FF
Carry
Logic
FF
Carry
Logic
FF
FF
Figure 1. Two CLBs consisting of SliceX and SliceL, and SliceX and SliceM,
respectively.
challenging with relocatable modules to be connected when
placed in different locations, as modeled in this work.
Systems providing regularly structured communication architectures have been presented in [15], [23], [24]. In these
systems, the communication architectures are built from preallocated resources that provide a compatible interface for
several tiles within a reconfigurable region.
In this work, we use GoAhead [25] to implement the
communication infrastructure. FPGA vendor tools (e.g., Xilinx PlanAhead) do not target systems with many relocatable
modules. However, both OpenPR [26] and GoAhead [25]
target such systems. The latter one can generate regularly
structured communication architectures. In addition, GoAhead
allows very fine grained tiling of the reconfigurable resources.
Moreover, it can be used to generate complex shaped modules
as depicted in Figure 2. This reduces fragmentation and allows
implementation variants.
E. Configuration Granularity
In run-time reconfigurable systems, routing modules online
is infeasible. This is because of long execution time and
memory requirements incurring run-time overhead. This is
commonly implemented by tiling a reconfigurable area and
using a communication architecture, as described in the previous section. In this case, no additional routing steps are
needed after placing a module. This implies that modules are
not freely relocatable (i.e. in terms of CLBs on Xilinx FPGAs),
but relocatable on a coarser grained tile grid.
When defining a tile grid for recent Xilinx FPGAs, the
grid should take into account that the smallest addressable
reconfigurable unit is a configuration frame. A configuration
frame contains the configuration data for a set of vertically
aligned resources (e.g. 16 CLBs, 4 BRAMs, or 4 DSP blocks
on Spartan-6 FPGAs). The configuration frame is atomically
updated (i.e. the whole frame is activated in one operation).
In order to prevent hazards due to reconfiguration, two
modules must not share the same frame. As a consequence, the
tile grid should be coarsely grained in vertical direction with
respect to the frame granularity. In horizontal direction, a fine
tile grid is preferable, as this reduces internal fragmentation.
The tile grid is applicable to 2D floorplanners where dynamic
partial reconfiguration is modeled [5], [7]–[12]. Even when
considering the very fine grained configuration capabilities of
Altera Stratix V FPGAs, where the configuration granularity
can be as little as an individual LUT [27], reconfigurable
systems will typically be coarser tiled than what is available
by the underlying FPGA fabric.
III.
R0 (x1 = 32, y1 = 210)
R1 (x1 = 56, y1 = 210)
P LACEMENT MODEL
Using set theory, we formulate the placement model as
follows: A module, M , consist of a sequence of one or more
layouts, M = {L1 , ..., Ln }. A layout, L is an implementation
variant of the module, consisting of a sequence of one or more
resources, L = {R1 , ..., Rn }. A resource, R, represents the
physical resource required by the implementation alternative.
R is defined as a bounding box, R(x0 , y0 , x1 , y1 , k, c), in
which x and y represent the bounding box. k is a finite
sequence where its elements denote the type and order of
resource primitives (i.e. actual resource primitives on the
FPGA) the module requires. c is a set defining compatible
communication interfaces.
The model allows a module to be represented with complex
layout. In Figure 2, a fully functional polyomino shaped
module has been placed on a Xilinx Virtex-6 FPGA. The
module is defined (using FPGA coordinates) as follows:


R0 (x0 = 13, y0 = 0, x1 = 32, y1 = 210, 





k = {CLEXL, ..., DSP }),
M0 =
(1)
R (x = 32, y0 = 126, x1 = 56, y1 = 210, 



 1 0

k = {CLEXM, ..., CLEXM })
Similarly the module definition, the FPGA-area is modeled
as the sequence of resources, F P GA = {R1 , ..., Rn }. The
FPGA-area can also be of complex shape. For example, an
FPGA with a non-reconfigurable region can be modeled. We
use this to accurately model the reconfigurable resources in
Xilinx Virtex-6 (xc6vlx240) device as depicted in Figure 2.
Montone et al. [11] and Smith et al. [19] establish real
world constraints present in modern FPGA devices. This
includes constraints such as intersection between modules (i.e.
preventing multiple modules to use the same area). Using constraint programming syntax ( [28]), we define the intersection
(between R bounding boxes) constraint as follows:
R0 (x0 = 13, y0 = 0)
CLEXM
CLEXL
R1 (x0 = 32, y0 = 126)
BRAM
DSP
Figure 2.
A relocatable module placed on a part of a Xilinx Virtex-6
(xc6vlx240tff784-3) FPGA. Used resources within the bounding box (gray
area) are depicted in black. The remaining resources within the bonding box
are unused (i.e. internal fragmentation). The figure is generated from the XDL
file of a fully functional, placed and routed, SHA-256 module. GoAhead was
used to constraint both logic and routing resources to within the bounding
boxes depicted in the figure. The module took 18 minutes to implement
(synthesis, mapping and place and route) from VHDL source code on a Xeon
X5690.
sequence, Rk , of the FPGA model. The constraint can only
be met when the module is placed within the reconfigurable
region. Thus, this constraint implicitly constrains the module
to within the bounding box of the reconfigurable region. We
formulate this as follows:
{(L(Rk ), F P GA(Rk )|L(Rk ) == F P GA(Rk )}
(3)
For communication we specify a producer/consumer constraint as follows:
{(Ln (Rc ), Lm (Rc ))|Ln (Rc ) == Lm (Rc ))}
(4)
This abstract constraint definition allows any type of communication to be modeled where a module communications with
another module or the static part of the system.
{((Ax0 , Ax1 , Ay0 , Ay1 ), (Bx0 , Bx1 , By0 , By1 ))|
Ax0 < Bx1 , Ax1 > Bx0 , Ay0 < By1 , Ay1 > By0 }
(2)
The resource constraint is satisfied when the resource
sequence, Rk , of the layout is a subsequence of the resource
IV.
C ONSTRAINT S OLVER I MPLEMENTATION
We have implemented the model, with the flow as depicted
in Figure 3. The module placer is implemented as a constraint solver which computes feasible placement positions for
Input
Constraint program
Constraint computation
FPGA Model
Constraint solver
Relocatable
modules
Constraints
Figure 3.
Constraint
program
generation
Constraint
program
Computed
placement
positions
The figure depicts the components used in the constraint solver to compute feasible floorplans.
relocatable modules. Computing feasible placement positions
consist of executing a search and evaluating whether the
constraints are met. If all constraints are met for a module
at a placement location, the placement location is feasible. If
all constraints are met for all modules, the a feasible solution
exists.
State 0
State 1
State 2
State 5
State 3
State 6
State 4
State 7
A. Constraint Evaluation
We implement two types of constraints: static and dynamic
constraints.
1) Static Constraints: Static constraints are constraints
which, when evaluated, always yield the same result. We
evaluate the static constraints before the search is executed.
This removes part of the search space, and avoids constraint
recomputation during search.
For example, FPGAs often have a heterogeneous fabric
with resources of the same type in rows over the device. This
efficiently reduces the horizontal domain for the modules. The
search space in vertical direction can be relatively large (particularly when not considering configuration frame granularity).
However, due to the heterogeneous resources, the search space
will be more constrained in horizontal direction.
The search space is further reduced when taking the
reconfiguration granularity into account. The reconfiguration
granularity is required for partial run-time reconfigurable systems. We take advantage of this by implementing vertical shifts
at the granularity of resources within a configuration frame.
This results in a reduction of the vertical domain, and hence
the search space.
2) Dynamic Constraints: Constraints which depend on the
placement of other modules are dynamic. These constraints
need to be computed to evaluate whether a location is valid.
For example, if module A has a communication constraint
related to another module, B, the position of module B
affects whether the placement of module A is valid. Similarly,
the intersection constraint depends on the placement of other
modules.
State 8
Figure 4. Example of possible module placement during constraint solving.
The rightmost branch results in a feasible floorplan. For clarity, we have only
used rectangular modules in the figure.
B. Constraint Search
The constraint search is implemented with a recursive
depth first search (DFS) function. For each recursive call, the
dynamic constraints are evaluated. The search backtracks if the
constraints cannot be met. figure 4 illustrates the search and
backtracking during module placement. The search returns the
computed placement positions when all modules have been
placed.
C. Search Heuristics
Constraint solvers typically improve performance (i.e. time
it takes to solve the constraint problem) by reducing the search
space. The order in which modules are placed, and the order
in which constraints are tested, have impact on how fast the
search space is pruned. Typically, the execution time of the
module placer improves when placing the most constrained
modules first. Therefore, we order the modules by the largest
Implementation of modules
Module placement
18
Module
implementations
(source code)
Bounding box
constraints
(GoAhead)
Generating
placement problem
15
21
16
11
19
20
Synth/MAP/PAR
Solve placement
(constraint solver)
17
14
9
8
22
7
12
Implemented
modules
Floorplan
23
2
5
3
10
13
Constraint computation
System
implementation
1
Figure 5. An overview of the experimental tool flow used. A set of modules
know to fill 100% of the FPGA is generated. The constraint solved computes
feasible placement for the modules. The computed placement is used to
implement the design from a set of modules.
module first (LMF) heuristic. For example, larger modules
are more constrained with respect to the placement as the
heterogeneous resources reduce the horizontal domain. It is
trivial to generate the initial placement order from the module
size. Thus, the module size order is a good compromise
compared to a heuristic ordering based explicitly on how
constrained the modules are.
V.
E XPERIMENTAL R ESULTS
We have undertaken experiments of the module placer
both in a simulated run-time application and in a designtime application. Currently, no common benchmark is available
to determine the performance (i.e. execution time, packing
quality) of module placement.
To evaluate the module placer in a design-time environment, we have generated synthetic module placement problems. The synthetic placement problems consist of placing implemented (i.e. fully functional, placed and routed) relocatable
modules. The modules have been implemented using CalyptoC, a high level synthesis (HLS) tool and Xilinx tools. We have
constrained the module logic and routing to bounding boxes
using the GoAhead tool flow introduced in Beckhoff et al. [25].
The complete experimental flow is depicted in figure 5.
We have used the experimental scenario provided by
Koester et al. [14] for our run-time experiments. In the runtime environment we simulate the benchmark scenario with
the modules parameters (i.e. not implemented modules) given
in [14]. However, we use a finer tiled grid as this is one of the
advantages with GoAhead.
The experimental results were obtained as follows:
4
6
0
Figure 6. The figure depicts a floorplan computed with the constraint solver.
The FPGA model is a Xilinx Virtex-6 (xc6vlx240tff784-3) device. The middle
part is allocated for the system CPU. The shadings are used to depict the
different modules.
A. Design-time Benchmarks
We first generated a repository of 200 placed and routed
relocatable modules. This was accomplished using a custom
build program supporting multiple cores on multiple computers. The build program automates the implementation of the
modules according to the flow depicted in Figure 5. In addition
the build program allows implementation of many modules
concurrently. This process took approximately 2 days on a 24
core Xeon X5690 host. This includes time spent on modules
that were not possible to fully route successfully.
We have used modules from the YaSSL [29] project.
YaSSL has implementations of standardized cryptographic
and hashing algorithms including AES, DES, MD5 and SHA
variants. The YaSSL modules are written in C. We have implemented the YaSSL modules as hardware modules using the
HLS tool, GoAhead, the Xilinx tool chain, and the concurrent
build program. One of the advantages of using a HLS tool to
implement hardware from software, is the ability to implement
multiple hardware variants from a single software implementation. For example, different transformations including partial
or complete loop unrolling and various number of pipeline
stages are parameterizable and automated in the HLS tool. For
our experiments, this allows us to implement many working
modules with different resource and performance qualities.
From the repository of implemented relocatable modules,
we have generated test cases. The test cases are all solvable
(i.e. a given placement exists where the constraints are met).
All test cases utilized 100% of reconfigurable area allocated
for modules. The test cases are artificial generated placement
problems, using actual placed and routed modules. Each test
case consist of relocatable modules of different shapes and
Table IV.
S IMULATED RUN - TIME B ENCHMARK RESULTS
Number of
modules
Average unsuccessful
placements
Average unsuccessful placements
with our improvements
3
4
5
6
7
8
0%
0%
10%
25%
0%
0%
5%
14%
33%
52%
on the FPGA. For each iteration, the earliest placed module is
removed, and a new random module is placed. If the module
cannot be placed within the area, it must either be delayed or
rejected. The average number of unsuccessful placements is
shown in the Table IV. This occurs when a task is scheduled
to be to be placed, but it cannot be placed.
Figure 7. The figure depicts a system implemented from a set of modules. The
figure is generated from the XDL file of the final system based on modules
from the repository and the computed floorplan. The depicted device is a
Xilinx Virtex-6 (xc6vlx240tff784-3) FPGA.
resource requirements. The generated placement problems
contain up to 30 different modules. In average, the test cases
contain 15% complex shaped ( , , and
shaped variants)
modules, and 15% of the modules span multiple clock regions.
The modules were placed on a model of the Xilinx Virtex-6
(xc6vlx240tff784-3) FPGA.
We used arbitrary module layouts, implementation variants
and a finer vertical grid granularity than Koester et al. The
selected grid granularity consist of 64 slices (i.e. 2 CLB
columns), 4 BRAMs, or 4 DSP blocks. A finer vertical
grid allows the modules to have less internal fragmentation,
thus utilizing the resources better. ReCoBus/GoAhoad allows
implementation of communication for fine grained placement
grids. This allows us to extend the reconfigurable area slightly
while maintaining the average resource utilization. Thus, our
simulations point to improved results compared to [14]. As
shown Table IV, we were able to place more modules with
less placement violations in our simulations.
VI.
The generated test cases were solved by the module placer.
This was accomplished by solving the constraint problem for
the set of modules in the test case. Experimental results of a
solved placement problem are depicted in Figure 6. The Figure
shows the computed floorplan. The computed floorplan is used
to implement the system. In Figure 7, a system implemented
with modules from the repository and placed by the module
placer is depicted. The Figure is generated by parsing the
XDL file of the implemented system. Currently, the generated
systems do not perform any function other than to show the
feasibility of our module placer when integrated with FPGA
tools.
B. Run-time Benchmarks
For run-time experiments, we have used the scenario
provided in [14] by Koester et al. We have used the same
FPGA fabric (Xilinx Virtex4 FX100), modules, schedule, and
the same number of iterations (10000). Our run-time experimental results are based on simulations. In order to get a
valid comparison between the work of Koester et al. and our
implementation, we performed the experiments under the same
conditions.
In the benchmark scenario by Koester et al. [14], we
place n modules. These modules are executed concurrently
C ONCLUSION
In this paper, we have introduced an improved placement
model and module placer implementation. Our work targets
systems implemented on state of the art FPGAs using relocatable modules. We have presented a realistic model. Our placement model supports heterogeneous logic resources, implementation variants, communication and device configuration
granularity. In addition, we model heterogeneous resources
such as embedded memory, multipliers and logic.
The module placer is implemented using constraint solver
techniques. We have combined the module placer fwith other
tools to implement FPGA based systems from a repository of
pre-implemented modules. We have generated complete placed
and routed designs in order to validate that both the model is
realistic, and that our approach is feasible.
Simulations of run-time module placement points to improved results compared to previous work in the area. Here,
the benefit is the result of reduced internal fragmentation and
a finer tile grid. We have reduced the internal fragmentation
by modeling complex shaped modules.
ACKNOWLEDGMENT
This work is funded by T HE R ESEARCH C OUNCIL OF
N ORWAY as part of the Context Switching Reconfigurable
Hardware for Communication Systems (COSRECOS) [30]
project, under grant 191156V30.
[16]
D. Mehta, “Heterogeneous floorplanning for FPGAs,” in 19th
International Conference on VLSI Design held jointly with 5th
International Conference on Embedded Systems Design (VLSID’06).
IEEE, 2006, p. 6 pp.
[17]
[2] S. Areibi, G. Grewal, D. Banerji, and P. Du, “Hierarchical FPGA
placement,” Canadian Journal of Electrical and Computer Engineering,
vol. 32, no. 1, pp. 53–64, Jan. 2007.
A. Wold, D. Koch, and J. Torresen, “Enhancing Resource Utilization
with Design Alternatives in Runtime Reconfigurable Systems,” in 2011
IEEE International Symposium on Parallel and Distributed Processing
Workshops and Phd Forum. Anchorage: IEEE, May 2011, pp.
264–270.
[18]
[3] C. Lavin, M. Padilla, S. Ghosh, B. Nelson, B. Hutchings, and
M. Wirthlin, “Using Hard Macros to Reduce FPGA Compilation
Time,” 2010 International Conference on Field Programmable Logic
and Applications, pp. 438–441, Aug. 2010.
A. Lodi, S. Martello, and M. Monaci, “Two-dimensional packing
problems: A survey,” European Journal of Operational Research, vol.
141, no. 2, pp. 241–252, Sep. 2002.
[19]
A. M. Smith, G. A. Constantinides, and P. Y. K. Cheung, “Integrated
Floorplanning, Module-Selection, and Architecture Generation for
Reconfigurable Devices,” IEEE Transactions on Very Large Scale
Integration (VLSI) Systems, vol. 16, no. 6, pp. 733–744, Jun. 2008.
[20]
Xilinx, “Configurable logic block with AND gate for efficient
multiplication in FPGAS,” pp. 1–58, 2010. [Online]. Available:
www.xilinx.com/support/documentation/user guides/ug384.pdf
[21]
——, “Spartan-6 Family Overview Summary of Spartan-6 FPGA
Features,” pp. 1–11, 2011. [Online]. Available: http://www.xilinx.com/
support/documentation/data sheets/ds160.pdf
[22]
M. Shelburne, C. Patterson, P. Athanas, M. Jones, B. Martin, and
R. Fong, “MetaWire: using FPGA configuration circuitry to emulate a
network-on-chip,” IET Computers & Digital Techniques, vol. 4, no. 3,
p. 159, 2010.
[23]
A. Oetken, S. Wildermann, J. Teich, and D. Koch, “A Bus-based
SoC Architecture for Flexible Module Placement on Reconfigurable
FPGAs,” in 2010 International Conference on Field Programmable
Logic and Applications. Milan, Italy: IEEE, Aug. 2010, pp. 234–239.
[24]
T. S. T. Mak, N. P. Sedcole, P. Y. K. Cheung, W. Luk, T. T.
Mak, P. Sedcole, and P. K. Cheung, “On-FPGA Communication
Architectures and Design Factors,” in 2006 International Conference
on Field Programmable Logic and Applications. IEEE, 2006, pp.
1–8.
[9] P.-H. Yuh, C.-L. Yang, and Y.-W. Chang, “Temporal floorplanning using
the three-dimensional transitive closure subGraph,” ACM Transactions
on Design Automation of Electronic Systems, vol. 12, no. 4, pp.
37–es, Sep. 2007.
[25]
C. Beckhoff, D. Koch, and J. Torresen, “Go Ahead: A Partial
Reconfiguration Framework,” in 2012 IEEE 20th International
Symposium on Field-Programmable Custom Computing Machines.
IEEE, Apr. 2012, pp. 37–44.
[10] L. Singhal and E. Bozorgzadeh, “Multi-layer Floorplanning on a
Sequence of Reconfigurable Designs,” in 2006 International Conference
on Field Programmable Logic and Applications. IEEE, 2006, pp.
1–8.
[26]
A. A. Sohanghpurwala, P. Athanas, T. Frangieh, and A. Wood,
“OpenPR: An Open-Source Partial-Reconfiguration Toolkit for Xilinx
FPGAs,” in 2011 IEEE International Symposium on Parallel and
Distributed Processing Workshops and Phd Forum, no. Xdl. IEEE,
May 2011, pp. 228–235.
[27]
M. Bourgeault, “Altera’s Partial Reconfiguration Flow,” 2011.
[Online]. Available: http://www.eecg.utoronto.ca/∼jayar/FPGAseminar/
FPGA Bourgeault June23 2011.pdf
[28]
K. Apt, Principles of Constraint Programming.
Press, 2003.
[29]
WolfSSL,
“Yassl,”
2013,
available
from
http://www.yassl.com/yaSSL/benchmarks-cyassl.html.
[30]
“Context switching reconfigurable hardware for communication systems,” available from http://www.mn.uio.no/ifi/english
/research/projects/cosrecos.
R EFERENCES
[1] M. Wirthlin and B. Hutchings, “A dynamic instruction set computer,”
in Proceedings IEEE Symposium on FPGAs for Custom Computing
Machines. IEEE Comput. Soc. Press, 1995, pp. 99–107.
[4] S. Liu, R. N. Pittman, A. Form, and J.-L. Gaudiot, “On
energy efficiency of reconfigurable systems with run-time partial
reconfiguration,” in ASAP 2010 - 21st IEEE International Conference
on Application-specific Systems, Architectures and Processors. IEEE,
Jul. 2010, pp. 265–272.
[5] L. Singhal and E. Bozorgzadeh, “Heterogeneous Floorplanner for
FPGA,” in 15th Annual IEEE Symposium on Field-Programmable
Custom Computing Machines (FCCM 2007). IEEE, Apr. 2007, pp.
311–312.
[6] S. Martello and D. Vigo, “Exact Solution of the Two-Dimensional
Finite Bin Packing Problem,” Management Science, vol. 44, no. 3, pp.
388–399, Mar. 1998.
[7] K. Bazargan and M. Sarrafzadeh, “Fast online placement for
reconfigurable computing systems,” in Seventh Annual IEEE
Symposium on Field-Programmable Custom Computing Machines
(Cat. No.PR00375). IEEE Comput. Soc, 1999, pp. 300–302.
[8] S. Fekete, E. Kohler, and J. Teich, “Optimal FPGA module
placement with temporal precedence constraints,” in Proceedings
Design, Automation and Test in Europe. Conference and Exhibition
2001. IEEE Comput. Soc, 2001, pp. 658–665.
[11] A. Montone, M. D. Santambrogio, D. Sciuto, and S. O. Memik,
“Placement and Floorplanning in Dynamically Reconfigurable FPGAs,”
ACM Transactions on Reconfigurable Technology and Systems, vol. 3,
no. 4, pp. 1–34, Nov. 2010.
[12] I. Belaid, F. Muller, and M. Benjemaa, “Off-line placement of
hardware tasks on FPGA,” in 2009 International Conference on
Field Programmable Logic and Applications. IEEE, Aug. 2009, pp.
591–595.
[13] M. Koester, M. Porrmann, and H. Kalte, “Task placement for
heterogeneous reconfigurable architectures,” in Proceedings. 2005
IEEE International Conference on Field-Programmable Technology,
2005., no. m. IEEE, 2005, pp. 43–50.
[14] M. Koester, W. Luk, J. Hagemeyer, M. Porrmann, and U. Ruckert,
“Design Optimizations for Tiled Partially Reconfigurable Systems,”
IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
vol. 19, no. 6, pp. 1048–1061, Jun. 2011.
[15] J. Angermeier, S. Wildermann, E. Sibirko, and J. Teich, “Placing
Streaming Applications with Similarities on Dynamically Partially
Reconfigurable Architectures,” in 2010 International Conference on
Reconfigurable Computing and FPGAs. IEEE, Dec. 2010, pp. 91–96.
Cambridge University
Related documents
Download