Reconfigurable Computing
(EN2911X)
Lecture 01: Introduction
Prof. Sherief Reda
Division of Engineering, Brown University
Fall 2007
S. Reda EN2911X FALL’07
Methods for executing algorithms
Hardware (Application-Specific Integrated Circuits)
Advantages:
• very high performance and efficiency
Disadvantages:
• not flexible (can’t be altered after fabrication)
• expensive
Reconfigurable computing
Advantages:
• fills the gap between hardware and software
• much higher performance than software
• higher level of flexibility than hardware
Software-programmed processors
Advantages:
• software is very flexible to change
Disadvantages:
• performance can suffer if clock is not fast
• fixed instruction set by hardware
Temporal vs. spatial based computing
Temporal-based execution
(software)
Spatial-based execution
(reconfigurable computing)
Ability to extract parallelism (or concurrency)
from algorithm descriptions is the key to
acceleration using reconfigurable computing
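As a rough illustration of the temporal/spatial distinction, the C++ sketch below counts time steps for summing n values in two ways: one addition per step on a single reused adder (temporal), and a balanced adder tree in which all additions of one level happen in the same step (spatial). The names `SumResult`, `temporal_sum`, and `spatial_sum`, and the one-step-per-tree-level cost model, are assumptions made for illustration and are not taken from the lecture.

```cpp
#include <cstddef>
#include <vector>

// Result of a simulated summation: the value and the number of time steps used.
struct SumResult {
    long long value;
    std::size_t steps;
};

// Temporal model: a single adder reused over time, one addition per step.
SumResult temporal_sum(const std::vector<int>& xs) {
    SumResult r{0, 0};
    if (xs.empty()) return r;
    r.value = xs[0];
    for (std::size_t i = 1; i < xs.size(); ++i) {
        r.value += xs[i];
        ++r.steps;
    }
    return r;
}

// Spatial model: an adder tree; all additions within one level share a step.
SumResult spatial_sum(const std::vector<int>& xs) {
    std::vector<long long> level(xs.begin(), xs.end());
    SumResult r{0, 0};
    if (level.empty()) return r;
    while (level.size() > 1) {
        std::vector<long long> next;
        for (std::size_t i = 0; i + 1 < level.size(); i += 2)
            next.push_back(level[i] + level[i + 1]);
        if (level.size() % 2 == 1) next.push_back(level.back());  // odd element passes through
        level = next;
        ++r.steps;
    }
    r.value = level[0];
    return r;
}
```

Under this simplified model, n values need n - 1 steps temporally but only about log2(n) steps spatially, at the cost of instantiating many adders at once.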
Reconfigurable devices
[Figure: array of programmable logic blocks joined by programmable interconnect]
• Field-Programmable Gate Arrays (FPGAs) are one example of reconfigurable devices
• An FPGA consists of an array of programmable logic blocks whose functionality is determined by programmable configuration bits
• The logic blocks are connected by a set of routing resources that are also programmable
→ Custom logic circuits can be mapped to the reconfigurable fabric
Configuring FPGAs
[Maxfield’04]
FPGAs can be programmed before runtime or dynamically reprogrammed during runtime (virtual hardware). Reconfiguration can be:
• full
• partial
Uses of reconfigurable devices
1. Low/med volume IC production
2. Early prototyping and logic emulation
3. Accelerating algorithms in reconfigurable computing environments
i. Reconfigurable functional units within a host processor (custom instructions)
ii. Reconfigurable units used as coprocessors
iii. Reconfigurable units that are accessed through external I/O or a network
[Compton’02]
Current problems with conventional computing
Intel VP Patrick Gelsinger (ISSCC 2001):
“If scaling continues at present pace, by 2005, high speed processors would have power density of nuclear reactor, by 2010, a rocket nozzle, and by 2015, surface of sun.”
• Technology scaling doubled the number of devices in an IC (processors, FPGAs, etc.) every 2-3 years
• Scaling also provided devices with reduced delay → frequency doubling (with aggressive pipelining) → increased power density
• Increases in clock frequency slowed down (or stopped); available devices are used to create multi-processor (multi-core) processors
Why is reconfigurable computing more relevant these days?
• Demand for high-performance computation is soaring:
– large-scale optimization problems, physics and earth simulation, bioinformatics, signal processing (e.g. HDTV), etc.
• Why are software-programmed processors no longer attractive?
– Temporal execution of instructions is no longer getting faster
– General-purpose multi-core processors require coarse-grained thread-level parallelism
• Why are reconfigurable fabrics currently attractive?
– Increased integration densities allow large FPGAs that can implement substantial functions
– They provide the spatial computational resources required to implement massively parallel computations directly in hardware
Topics that will be covered in this class…
(entry survey time)
Topic 01: Programmable logic technology overview
Required function: y = (a & b) | !c
Truth table, as programmed into the LUT's SRAM cells (abc drives the select lines of an 8:1 multiplexer):
a b c | y
0 0 0 | 1
0 0 1 | 0
0 1 0 | 1
0 1 1 | 0
1 0 0 | 1
1 0 1 | 0
1 1 0 | 1
1 1 1 | 1
Programming information could be stored in SRAM
4-input Look-Up Table (LUT) is the typical size
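A programmed LUT can be modeled in a few lines of C++. The sketch below is only a software analogy: the type `Lut3` and the helper `required_function` are hypothetical names, the eight `sram` entries stand in for the SRAM cells, and indexing by abc plays the role of the 8:1 multiplexer.

```cpp
#include <array>

// A 3-input LUT: 8 configuration bits (the "SRAM cells") read through an 8:1 mux.
struct Lut3 {
    std::array<bool, 8> sram{};

    // "Program" the LUT by evaluating the required function for every input combination.
    template <typename F>
    void program(F f) {
        for (unsigned addr = 0; addr < 8; ++addr) {
            bool a = (addr >> 2) & 1u;
            bool b = (addr >> 1) & 1u;
            bool c = addr & 1u;
            sram[addr] = f(a, b, c);
        }
    }

    // Evaluate: the inputs abc act as the mux select lines (a is the most significant bit).
    bool eval(bool a, bool b, bool c) const {
        unsigned addr = (a ? 4u : 0u) | (b ? 2u : 0u) | (c ? 1u : 0u);
        return sram[addr];
    }
};

// The required function from the slide: y = (a & b) | !c
inline bool required_function(bool a, bool b, bool c) { return (a && b) || !c; }
```

After `program(required_function)`, the stored bits for addresses 000 through 111 are the y column of the truth table; loading different bits implements a different 3-input function with the same hardware structure.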
Topic 01: Programmable logic technology overview
[Figure: logic block built from a 4-input LUT (inputs a, b, c, d; output y), a mux, and a flip-flop (output q), with signals e and clock, connected to a switch box]
Topic 02: Reconfigurable computing methodologies
[Figure: design flow]
System specification → partitioning into software and hardware
• software: compile for target processor
• hardware: design entry (textual HDL, graphical state diagram, graphical flowchart, or top-level block-level schematic) → synthesis (compilation) → mapping (placement & routing) → configuration data
Textual HDL example:
When clock rises
If (s == 0)
then y = (a & b) | c;
else y = c & !(d ^ e);
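To make the clocked pseudo-HDL fragment from this slide concrete, here is a small behavioral model in C++. It is a sketch only: the names `Inputs` and `ClockedBlock`, the explicit edge detection, and the initial value of y (0) are assumptions for illustration and do not come from the slides.

```cpp
// Input signals sampled at a rising clock edge.
struct Inputs {
    bool s, a, b, c, d, e;
};

// Models a registered output y that changes only on a rising clock edge.
class ClockedBlock {
public:
    // Call with the current clock level; y updates only on a 0 -> 1 transition.
    void tick(bool clock, const Inputs& in) {
        if (clock && !prev_clock_) {
            y_ = (!in.s) ? ((in.a && in.b) || in.c)      // s == 0: y = (a & b) | c
                         : (in.c && !(in.d != in.e));    // else:   y = c & !(d ^ e)
        }
        prev_clock_ = clock;
    }

    bool y() const { return y_; }

private:
    bool prev_clock_ = false;
    bool y_ = false;  // assumed power-up value
};
```

Input changes while the clock stays high or low leave y untouched, which is the behavior a synthesis tool would typically map onto a flip-flop fed by combinational logic.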
Topic 03: Hardware programming languages
(Verilog)
• Verilog is a hardware description
language used to model digital systems
• Similar in syntax to C
• Differs from conventional programming languages in that statements do not execute strictly in order: a design can contain both sequential and concurrent statements
• The language can be synthesized into
logic circuits
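// 2-to-1 multiplexer: y follows a when select is 1, otherwise b
module mux(a, b, select, y);
  input a, b, select;
  output y;
  reg y;  // assigned inside an always block, so it must be declared reg
  // combinational logic: re-evaluate whenever any input changes
  always @ (a or b or select)
    if (select)
      y = a;
    else
      y = b;
endmodule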
Topic 04: Rapid prototyping with Altera DE2 board
No need to design our board; we will use Altera’s DE2 board and Quartus II software.
Features:
• Cyclone II FPGA 35K LUTs
• 10/100 Ethernet
• RS232
• Video out (VGA 10-bit DAC)
• Video in (NTSC/PAL/multi-format)
• USB 2.0 (type A and type B)
• PS/2 mouse or keyboard port
• Line in/out, microphone in (24-bit Audio CODEC)
• Expansion headers (76 signal pins)
• Infrared port
• Memory: 8 MB SDRAM, 512 KB SRAM, 4 MB flash
• SD memory card slot
• Displays: 16 x 2 LCD, eight 7-segment displays
• Switches and LEDs
Topic 05: High-level synthesis languages (SystemC)
• SystemC is a system description language for hardware/software systems
• SystemC is a set of libraries and macros implemented in C++ to allow specification and simulation of concurrent processes
• Allows high-level description of hardware modules
• A subset of the language can be synthesized into logic circuits. We will use the Celoxica Agility compiler as our synthesis tool
#include "systemc.h"
SC_MODULE(adder)
{
  sc_in<int> a, b;
  sc_out<int> sum;
  // recompute the sum whenever a or b changes
  void do_add() {
    sum = a + b;
  }
  SC_CTOR(adder) {
    SC_METHOD(do_add);
    sensitive << a << b;
  }
};
Topic 06: Algorithm acceleration using reconfigurable computing
• Learn how to use FPGAs and reconfigurable computing principles to accelerate algorithms: sorting, dynamic programming, NP-hard problems, etc.
• Accelerating applications in various fields
– Signal and image processing
– Cryptology
– Bioinformatics
– Pattern recognition
– etc.
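Sorting is one of the algorithms named above. As one illustrative candidate (the slides do not say which sorting scheme the course uses), the C++ sketch below simulates odd-even transposition sort. Within each phase the compare-exchange operations touch disjoint neighboring pairs, so a hardware implementation could perform them all at once; the inner loop here only emulates that sequentially. The function name is a hypothetical choice.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Odd-even transposition sort: n phases for n elements.
// Even phases compare pairs (0,1), (2,3), ...; odd phases compare (1,2), (3,4), ...
// Pairs within a phase are disjoint, so they are independent of each other.
void odd_even_transposition_sort(std::vector<int>& v) {
    const std::size_t n = v.size();
    for (std::size_t phase = 0; phase < n; ++phase) {
        for (std::size_t i = phase % 2; i + 1 < n; i += 2) {
            if (v[i] > v[i + 1]) std::swap(v[i], v[i + 1]);  // compare-exchange unit
        }
    }
}
```

With one compare-exchange unit per pair, the sort would finish in n phases, which is the kind of parallelism a spatial implementation can exploit.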
Topic 07: Soft multi-core computing environments
[Figure: two Nios processor cores (Core 1, Core 2), an accelerator, and memory connected by a bus]
• Learn about hard and soft processors
• Design multi-core-based reconfigurable computing systems
• Design of on-chip networks for multi-core systems
• Design of custom instructions
• Design of pluggable acceleration function units
Goals of this class
• Learn principles of reconfigurable computing with minimum hardware background
• Acquire hands-on experience and useful
implementation skills
– Verilog / SystemC / Quartus II
• Develop/strengthen research skills
Class organization
• HW assignments (paper reviews + mini labs): 20%
• Class participation: 10%
• Midterm: 20%
• Class project (progress/final reports and presentation): 50%
• Sources: papers, lecture slides, manuals and book
chapters.
• Class website:
http://ic.engin.brown.edu/classes/EN2911F07