Embedded System Hardware

November 21, 2001, Tampere, Finland
Reiner Hartenstein
University of Kaiserslautern
Enabling Technologies for Reconfigurable Computing
Part 4: FPGAs: recent developments
Wednesday, November 21, 16.00 – 17.30 hrs.
>> Configware Market
Configware Market
FPGA Market
Embedded Systems (Co-Design)
Hardwired IP Cores on Board
Run-Time Reconfiguration (RTR)
Rapid Prototyping & ASIC Emulation
Evolvable Hardware (EH)
Academic Expertise
ASICs dead
Soft CPU
HLLs
Problems to be solved
Configware heading for mainstream
1. Configware market taking off for mainstream
2. FPGA-based designs more complex, even SoC
3. No design productivity and quality without good configware libraries
(soft IP cores) from various application areas.
4. Growing number of independent configware houses (soft IP core
vendors) and design services
5. Alliance CORE & Reference Design Alliance
6. Currently the top FPGA vendors are the key innovators and meet most
configware demands.
bleeding edge designs
1. An infinite number of gates is not yet available on a chip
2. 3 million gates (10 million in 2003?) is far from "infinite"
3. Bleeding-edge designs are feasible only with sophisticated EDA tools
4. Excessive optimization needed
5. Hardware expertise is indispensable for the designer
6. Improve and simplify the design flow for the user
7. Provide rich configware libraries of soft IP cores
8. APPLICATIONS:
1. control applications,
2. networking,
3. wireless telecommunication,
4. data communication,
5. embedded and consumer markets.
Configware (soft IP Products)
1. For libraries, creation and reuse of configware
2. To search for IPs see: List of all available IP
3. The AllianceCORE program is a cooperation between Xilinx
and third-party core developers
4. The Xilinx Reference Design Alliance Program
5. The Xilinx University Program
6. LogiCORE soft IP with LogiCORE PCI Interface.
7. Consultants
EDA as the Key Enabler (major EDA vendors)
1. Select EDA quality / productivity, not FPGA architectures
2. EDA often has massive software quality problems
3. Customer: highest priority is an EDA center of excellence
   1. collecting EDA expertise and EDA user experience
   2. assembling the best possible tool environments
   3. for optimum support of design teams
   4. coping with interoperability problems
   5. keeping track of the EDA scene as a rapidly moving target
4. Being fabless, FPGA vendors spend their most qualified manpower on the development of EDA, IP cores, applications, and support
5. Xilinx and Altera are morphing into EDA companies.
OS for FPGAs
separate EDA software market, comparable to the compiler / OS market in
computers,
Cadence, Mentor, Synopsys just jumped in.
< 5% Xilinx / Altera income from EDA SW
Changing EDA Tools Market
Major configware EDA vendors
• Altera
• Cadence
• Mentor Graphics
• Synopsys
• Xilinx
EDA Software for Xilinx
1. Full design flow from Cadence, Mentor, & Synopsys
2. Xilinx Software AllianceEDA Program:
1. Alliance Series Development System.
2. Foundation Series Development Systems.
3. Xilinx Foundation Series ISE (Integrated Synthesis
Environment)
4. free WebPOWERED software with WebFitter & WebPACK ISE
5. StateCAD XE and HDL Bencher
6. Foundation Base Express
7. Foundation ISE Base Express
Foundation ISE Base Express
ModelSim Xilinx Edition (ModelSim XE)
Forge Compiler
Modular Design
Chipscope ILA
The Xilinx System Generator
XPower
JBits SDK
The Xilinx XtremeDSP Initiative
MathWorks / Xilinx Alliance
System Generator
Wind River / Xilinx alliance
Altera EDA
• Altera was founded in June 1983
• EDA: synthesis, place & route, and verification
• Quartus II: APEX, Excalibur, Mercury, FLEX 6000 families
• MAX+PLUS II: FLEX, ACEX & MAX families
• Flow with Quartus II: Mentor Graphics, Synopsys, Synplicity deliver design software to support Altera SOPC solutions.
• Mentor: only EDA vendor with a complete design environment for APEX II, incl. IP, design capture, simulation, synthesis, and hardware/software co-verification
• Configware: Altera offers over a hundred IP cores
• Third party IP core design services and consultants
Cadence
• FPGA Designer: top-down FPGA design system,
• high-level mapping, architecture-specific optimization,
• Verilog, VHDL, schematic-level design entry.
• Verilog, VHDL to Synergy (logic synthesis) and FPGA Designer
• FPGAs simulated by themselves using Cadence's Verilog-XL or Leapfrog VHDL simulators, and
• simulated with the rest of the system design with the Logic Workbench board/system verification environment.
• Libraries for the leading FPGA manufacturers.
Mentor Graphics
• System Design and Verification
• PCB design and analysis
• IC Design and Verification
• shifts ASIC design flow to FPGAs (Altera, Xilinx)
   • by FPGA Advantage with IP support
   • by ModuleWare,
   • Xilinx CORE Generator
   • Altera MegaWizard integration
Synopsys
• FPGA Compiler II
• Version of ASIC Design Compiler Ultra
• Block Level Incremental Synthesis (BLIS)
• ASIC <-> FPGA migration
• Actel, Altera, Atmel, Cypress, Lattice, Lucent,
Quicklogic, Triscend, Xilinx
>> FPGA Market
Top 4 PLD Manufacturers 2000
Xilinx    42%
Altera    37%
Lattice   15%
Actel      6%
$3.7 billion
FPGA market 1998 / 1999: global sales (million $)

1999 rank  Vendor       1998   1999
1          Xilinx        629    899
2          Altera        654    837
3          Lattice       206    410
4          Actel         154    172
5          Lucent        100    120
6          Cypress        41     43
7          Quicklogic     30     40
8          Atmel          32     38
Source: IC Insights Inc.

Meanwhile, Xilinx acquired Philips' MOS PLD business, and Lattice purchased Vantis.
.... into every application
[Dataquest] PLD market > $7 billion by 2003.
"fastest growing segment of semiconductor market."
IP reuse and "pre-fabricated" components for the efficiency of design and use for PLDs
FPGAs are going into every type of application.
.... going into every type of application
[Gordon Bell]
Xilinx
• fabless FPGA semiconductor vendor, San Jose, CA, founded 1984
• key patents on FPGAs (expiring in a few years)
• Fortune 2001: No. 14 Best Company to Work For (Intel: no. 42, HP: no. 64, TI: no. 65).
• DARPA grant (Nov '99) to develop JBits API tools for internet reconfigurable / upgradable logic (with VT)
• Less brilliant in the early/mid 90s (president Curt Wozniak): 1995 market share down from 84% to 62% [Dataquest]
• As designs got larger, Xilinx lost its advantage (bug fixes did not require burning new chips)
• meanwhile, weeks of expensive debug time are needed
Xilinx Flexware
• Virtex, Virtex-II: first with 1 million system gates
   • Virtex-E series: > 3 million system gates
• Virtex-EM on a copper process with additional on-chip memory for network switch applications
• The Virtex XCV3200E: > 3 million gates, 0.15-micron technology
• Spartan, Spartan-XL, Spartan-II
   • for low-cost, high-volume applications as ASIC replacements
   • multiple I/O standards, on-chip block RAM, digital delay lock loops
   • eliminate phase lock loops, FIFOs, I/O translators, system bus drivers
• XC4000XV, XC4000XL/XLA, CPLD: low-cost families
   • rapid development, longer system life, robust field upgradability
   • support In-System Programming (ISP), in-board debugging,
   • test during manufacturing, field upgrades, fully JTAG-compliant interface
• CoolRunner: low power, high speed/density, standby mode
• Military & Aerospace: QPRO high-reliability, QML certified
• Configuration Storage Devices
Altera Flexware
Newer families: APEX 20KE, APEX 20KC, APEX II, MAX 7000B, ACEX 1K, Excalibur, Mercury families.
• APEX EP20K1500E (0.18 µm), up to 2.4 million system gates
• APEX II (all-copper 0.13 µm) for data path applications, supports many I/O standards, 1-Gbps True-LVDS performance
• wQ2001, an ARM-based Excalibur device
Altera mainstream: MAX 7000A, 3000A; FLEX 6000, 10KA, 10KE; APEX 20K families.
Mature and other: Classic, MAX 7000, 7000S, 9000; FLEX 8000, 10K families.
Triscend CSoC [Kean]
[Figure labels: Configurable system logic; ARM; Digital Filter; Display Interface; Viterbi; A/D Interface; CSI Socket; Configurable System Interconnect (CSI) Bus; Memory; Other System Resources]
>> Embedded Systems (Co-Design)
Goal: away from complex design flow
[Figure, à la S. Guccione: the traditional flow Schematics/HDL -> Netlister -> Netlist -> Place and Route -> Bitstream, contrasted with HLL -> Compiler]
Overcome traditional separate design flow
[Figure, à la S. Guccione: hardware flow Schematics/HDL -> Netlister -> Netlist -> Place and Route -> Bitstream next to software flow User Code -> Compiler -> Executable; target: HLL -> Compiler]
Overcome traditional co-processing design separate flow -> JBits Design Flow
[Figure, à la S. Guccione: hardware flow Schematics/HDL -> Netlister -> Netlist -> Place and Route -> Bitstream and software flow User Code -> Compiler -> Executable, contrasted with User Java Code -> Java Compiler -> Executable using the JBits API]
Embedded hardware. CPU & memory cores on chip.
[Figure, à la S. Guccione: one HLL -> Compiler path targets the FPGA core, another HLL -> Compiler path targets the CPU core and memory core]
New directions in application development
1. automatic partitioning compilers: designer productivity
2. like CoDe-X (Jürgen Becker, Univ. of Karlsruhe)
3. supports Run-Time Reconfiguration (RTR),
   • a key enabler of error handling and fault correction by partially rerouting the FPGA at run time,
   • as well as remote patching for upgrading, remote debugging, and remote repair by reconfiguration, even over the internet.
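To give a feel for the decision an automatic partitioning compiler has to make, here is a minimal sketch. It is a toy greedy heuristic, not the algorithm used by CoDe-X; the kernel names, timings, and area figures are invented for illustration. Kernels are moved to the FPGA in order of time saved per unit of area until an assumed area budget is used up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Kernel:
    name: str
    sw_time: float  # run time on the CPU (arbitrary units, assumed)
    hw_time: float  # run time on the FPGA (arbitrary units, assumed)
    area: int       # FPGA area cost (arbitrary units, assumed)

def partition(kernels, area_budget):
    """Greedy HW/SW split: best time saving per area goes to hardware first."""
    candidates = [k for k in kernels if k.sw_time > k.hw_time and k.area > 0]
    candidates.sort(key=lambda k: (k.sw_time - k.hw_time) / k.area, reverse=True)
    hardware, used = [], 0
    for k in candidates:
        if used + k.area <= area_budget:
            hardware.append(k.name)
            used += k.area
    software = [k.name for k in kernels if k.name not in hardware]
    return hardware, software

# Hypothetical workload
kernels = [
    Kernel("fir_filter", sw_time=100.0, hw_time=10.0, area=300),
    Kernel("viterbi", sw_time=80.0, hw_time=20.0, area=500),
    Kernel("ui_logic", sw_time=5.0, hw_time=4.0, area=200),
    Kernel("crc", sw_time=30.0, hw_time=5.0, area=50),
]
hardware, software = partition(kernels, area_budget=600)
print("hardware:", hardware)
print("software:", software)
```

A real co-compiler would also account for communication cost between the two partitions, which this sketch ignores.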
>> Run-Time Reconfiguration (RTR)
CPU use for configuration management
•on-board microprocessor CPU is available anyhow - even along
with a little RTOS
•use this CPU for configuration management
Run-Time Reconfiguration
Hard CPU & memory core on the same chip
[Figure: RTR System Design; one HLL -> Compiler path targets the FPGA core, another HLL -> Compiler path targets the CPU core and memory core]
Converging factors for RTR
• Converging factors make RTR-based system design viable
• 1) million-gate FPGA devices and co-processing with standard microprocessors are commonplace
   • direct implementation of complex algorithms in FPGAs
   • this alone has already revolutionized FPGA design
• 2) new tools like the Xilinx JBits software tool suite directly support coprocessing and RTR
[Figure: User Java Code -> Java Compiler -> Executable, using the JBits API]
RTR
• Divides application into a series of sequentially executed stages, each implemented as a
separate execution module.
• Partial RTR partitions these stages into finer-grain sub-modules to be swapped in as
needed.
• Without RTR, all configurable platforms are just ASIC emulators.
   • RTR needs a new kind of application development environment,
   • which directly supports development and debugging of RTR applications,
   • is essential for the advancement of configurable computing,
   • and will also heavily influence future system organization.
• Xilinx, VT, BYU work on run-time kernels, run-time support, RTR debugging tools and
other associated tools.
• smaller, faster circuits, simplified hardware interfacing, fewer IOBs; smaller, cheaper
packages, simplified software interfaces.
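The idea of dividing an application into sequentially executed stages, each swapped in as a separate execution module, can be sketched with a toy model. The class and function names below are hypothetical and the "modules" are plain Python functions standing in for bitstreams; nothing here uses the JBits API.

```python
class ReconfigurableDevice:
    """Toy model of a device that holds one execution module at a time."""

    def __init__(self):
        self.loaded = None          # (name, module) currently configured
        self.reconfigurations = 0   # how many times a module was swapped in

    def configure(self, name, module):
        # Stands in for downloading a (partial) bitstream.
        if self.loaded is None or self.loaded[0] != name:
            self.loaded = (name, module)
            self.reconfigurations += 1

    def run(self, data):
        return self.loaded[1](data)

def run_stages(device, stages, data):
    """Execute the stages one after another, swapping each module in before use."""
    for name, module in stages:
        device.configure(name, module)
        data = device.run(data)
    return data

# Hypothetical three-stage application
stages = [
    ("scale", lambda xs: [2 * x for x in xs]),
    ("offset", lambda xs: [x + 1 for x in xs]),
    ("accumulate", lambda xs: sum(xs)),
]
device = ReconfigurableDevice()
result = run_stages(device, stages, [1, 2, 3])
print(result, device.reconfigurations)
```

Only one stage occupies the device at any time, which is why the same fabric can serve an application larger than its capacity, at the price of one reconfiguration per stage.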
Run-time Mapping
1. Run-time reconfigurable are: the Xilinx VIRTEX FPGA family
2. RAs being part of Chameleon CS2000 series systems
3. Using such devices changes many of the basic assumptions in the HW/SW co-design process:
4. Host/RL interaction is dynamic and needs a tiny OS like eBIOS, also to organize RL reconfiguration under host control
5. Typical goal is minimization of reconfiguration latency (especially important in communication processors), to hide configuration loading latency, and
6. Scheduling to find the 'best' schedule for eBIOS calls (C~side).
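Hiding configuration loading latency can be illustrated with a small timing model. This is a sketch under stated assumptions: each stage has an invented (load time, run time) pair, and the device is assumed to be able to load the next configuration while the current one executes, with loading starting as soon as the current stage begins. It does not model eBIOS or any real device.

```python
def total_time_sequential(stages):
    """Every stage first loads its configuration, then runs."""
    return sum(load + run for load, run in stages)

def total_time_prefetch(stages):
    """The next configuration is loaded while the current stage executes."""
    if not stages:
        return 0
    exec_start = stages[0][0]             # the first load cannot be hidden
    exec_end = exec_start + stages[0][1]
    for load, run in stages[1:]:
        load_done = exec_start + load     # loading began when the previous stage started
        exec_start = max(exec_end, load_done)
        exec_end = exec_start + run
    return exec_end

# Hypothetical (load_time, run_time) pairs in arbitrary time units
stages = [(4, 10), (6, 3), (5, 8)]
print(total_time_sequential(stages))  # 36
print(total_time_prefetch(stages))    # 27
```

In this example the second load is fully hidden behind the first stage, while the third load is only partly hidden because the second stage runs for less time than the load takes.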
>> Rapid Prototyping & ASIC Emulation
ASIC emulation: a new business model?
1. ASIC emulation / Rapid Prototyping: to replace simulation
2. Quickturn (Cadence), IKOS (Synopsys), Celaro (Mentor)
3. From rack to board to chip (from other vendors, e.g. the Virtex and Virtex-E families, emulating up to 3 million gates)
4. Easy configuration using SmartMedia FLASH cards
5. ASIC emulators will become obsolete within years
6. By RTR: in-circuit execution debugging instead of emulation
>> Evolvable Hardware (EH)
EH, EM, ...
1. "Evolvable Hardware" (EH), "Evolutionary Methods" (EM), "digital DNA", "Darwinistic Methods", and biologically inspired electronic systems
2. New research area, also a new application area of FPGAs
3. Revival of cybernetics or bionics: stimulated by technology
4. "Evolutionary" and "DNA" metaphors create awareness
5. EM sucks, although there are mushrooming funds in the EU, in Japan, Korea, and the USA
6. EM-related international conference series are in their stormy visionary phase, like EH, ICES, EuroGP, GP, CEC, GECCO, EvoWorkshops, MAPLD, ICGA
EH, EM, ...
1. Shake-out phenomena expected, like in the past with "Artificial Intelligence"
2. Should be considered as a specialized EDA scene, focusing on theoretical issues.
3. Genetic algorithms suck - often replaceable by more efficient ones from EDA
4. It is advisable to set up an interwoven competence in both scenes, the EM scene and the highly commercialized EDA scene
5. EH should be done by EDA people, rather than EM freaks.
>> Academic Expertise
BRASS (1)
1. UC Berkeley, the BRASS group: Prof. Dr. John Wawrzynek
2. The Pleiades Project, Prof. Jan Rabaey: ultra-low-power, high-performance multimedia computing through reconfiguration of heterogeneous system modules, reducing energy by overhead elimination, programmability at just the right granularity, parallelism, pipelining, dynamic voltage scaling.
3. Garp integrates processor and FPGA; developed in parallel with its compiler, using software compile techniques (VLIW SW pipelining): simple pipelining functionalities, broad class of loops.
4. SCORE, a stream-based computation model: a unifying computational model. Fast Mapping for Datapaths: by a tree-parsing compiler tool for datapath module mapping
BRASS (2)
HSRA. new FPGA (& related tools) supports pipelining,
w. retiming capable CLB architecture, implemented in a
0.4um DRAM process supporting 250MHz operation
OOCG. Object Oriented Circuit-Generators in Java
MESCAL (GSRC), the goal is: to provide a programmer's
model and software development environment for
efficient implementation of an interesting set of
applications onto a family of fully-programmable
architectures / microarchitectures.
Berkeley claiming (1)
1. SCORE, a stream-based computation model: the BRASS group claims to have solved the primary impediment to widespread reconfigurable computing by a unifying computational model.
2. Remark: a clean stream-based model was introduced ~1980: the systolic array
3. 1995: Rainer Kress introduces a reconfigurable stream-based model
4. Fast Mapping for Datapaths (SCORE): BRASS claims to have introduced in 1998 "the first tree-parsing compiler tool for datapath module mapping." "Further, it is the first work to integrate simultaneous placement with module mapping in a way that preserves linear time complexity."
Berkeley claiming (2)
1. Remark: The DPSS (Data Path Synthesis System), using tree covering with simultaneous datapath placement and routing, was published in 1995 by Rainer Kress
2. "Chip-in-a-Day" Bee Project: Prof. Dr. Bob Brodersen's "radical rethink of the ASIC design flow aimed at shortening design time, relying on stream-based DPU arrays." [published in 2000]
3. Remark: the KressArray, a scalable rDPU array [1995], is stream-based
.... Stream Processors - MSP-3
3rd Workshop on Media and Stream Processors (MSP-3)
• http://www.pdcl.eng.wayne.edu/msp01
in conjunction with the 34th Int'l Symp. on Microarchitecture (MICRO-34)
• http://www.microarch.org/micro34
Austin, Texas, December 1-2, 2001
Topics of interest include, but are not limited to:
• Hardware/compiler techniques for improving memory performance of media and stream-based processing
• Application-specific hardware architectures for graphics, video, audio, communications, and other media and streaming applications
• System-on-a-chip architectures for media & stream processors
• Hardware/software co-design of media and stream processors
• and others ....
Berkeley: "Chip-in-a-Day" Bee Project
Chip-in-a-Day Project.
• Prof. Dr. Bob Brodersen, BWRC: targeting a radical rethink of the ASIC design flow aimed at shortening design time.
• Relying on stream-based DPU arrays (not rDPU) and related EDA tools.
• Davis: "... 50x decrease in power requirements over typical TI C64X design."
1. New design flow to break up the highly iterative EDA process, allowing designers to spend more time defining the device and far less time implementing it in silicon. "... developers to start by creating data flow graphs rather than C code,"
2. It is stream-based computing by a DPU array (hardwired DPA)
3. For hardwired and reconfigurable DPU array and rDPU array
From Stanford to BYU
1. Stanford: Prof. Flynn went emeritus, Oskar Mencer moved to Bell Labs.
   • no activities seen other than YAFA (yet another FPGA application)
2. UCLA: Prof. Jason Cong, expert on FPGA architectures and R & P algorithms. 9 projects, multiple sponsors under the California MICRO Program
3. Prof. Majid Sarrafzadeh directs the SPS project: "versatile IPs", a new routing architecture, architecture-aware CAD, IP-aware SPS compiler
4. USC: Prof. Viktor Prasanna (EE dept.) works 20% on reconfigurable computing: MAARC project, DRIVE project and Efficient Self-Reconfiguration. Prof. Dubois: RPM Project, FPGA-based emulation of scalable multiprocessors.
   DEFACTO project: compilation, architecture-independent at all levels
5. MIT. MATRIX web pages removed '99. "RAW project": a conglomerate
6. VT. Prof. Athanas: JBits API for internet RTR logic ($2.7 million DARPA), with Prof. Brad Hutchings, BYU, on programming approaches for RTR systems
7. BYU. Prof. Brad Hutchings works on JHDL (Java Hardware Description Language) and compilation of JHDL sources into FPGAs.
From Toronto to Karlsruhe
U. Toronto. Prof. J. Rose, expert in FPGA architectures and R & P algorithms. The group has developed Transmogrifier C, a C compiler creating netlists for Xilinx XC4000 and Altera's Flex 8000 and Flex 10000 series FPGAs. Founder of Right Track CAD Corporation, acquired by Altera in 1999.
Los Alamos National Laboratory, Los Alamos, New Mexico (Jeff Arnold). Project Streams-C: programming FPGAs from C sources.
Catholic University of Leuven, and IMEC: Prof. Rudy Lauwereins, methods for MPEG-4-like multimedia applications on dynamically reconfigurable platforms, and on reconfigurable instruction set processors.
University of Karlsruhe. Prof. Dr.-Ing. Juergen Becker: hardware/software co-design, reconfigurable architectures and related synthesis for future mobile communication systems, synthesis with distributed internet-based CAD methods, partitioning co-compilers.
>> ASICs dead ?
(When) Will FPGAs Kill ASICs?
[Jonathan Rose]
ASICs Are Already Dead
My Position
[Jonathan Rose]
They Just Don’t Know It Yet!
Why? [Jonathan Rose]
You have to fabricate an ASIC
• Very hard, getting harder
An FPGA is pre-fabricated
• A standard part
• immense economic advantages
Making ASICs is Damn Difficult
[Jonathan Rose]
Testing
Yield
Cross Talk
Noise
Leakage
Clock Tree Design
Horrible very deep submicron effects we don’t even know about
yet
Did I Mention Inventory? [Jonathan Rose]
ASIC users must predict # parts
• 2 or 3 months in advance!
Never guess the Right Amount
• Make Too Many – You Pay holding costs
• Make Too Few – Competitor gets the Sale
FPGAs Give You [Jonathan Rose]
Instant Fabrication
• Get to Market Fast
• Fix 'em quick
Zero NRE Charges
• Low Risk
• Low Cost at good volume
FPGAs: "Too Pricey & Too Slow?"
[Jonathan Rose]
9 Times Out of 10
• You can make the thing fast by breaking it into multiple parallel slower pieces
Custom IC Designer Can Make Logic
• 20x Faster,
• 20x Smaller than Programmable
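The "multiple parallel slower pieces" argument is simple arithmetic, sketched below. The clock rates are invented for illustration and the workload is assumed to be perfectly parallelizable, which real designs rarely are.

```python
def throughput(units, clock_mhz, results_per_cycle=1):
    """Results per microsecond delivered by `units` identical pipelines."""
    return units * clock_mhz * results_per_cycle

def units_needed(target, clock_mhz):
    """Smallest number of parallel units that reaches the target throughput."""
    return -(-target // clock_mhz)  # integer ceiling division

# Assumed numbers: custom logic 20x faster than the programmable version
custom = throughput(units=1, clock_mhz=1000)
pieces = units_needed(custom, clock_mhz=50)
programmable = throughput(units=pieces, clock_mhz=50)
print(pieces, custom, programmable)  # 20 1000 1000
```

Under these assumptions, twenty parallel units at one twentieth of the clock rate match the custom design's throughput; the cost moves from speed to area.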
What's Wrong with This Picture?
What About PLD Cores on ASICs?
[Figure: ASIC with embedded FPGA fabric]
Still Have to Make the Chip
Need Two Sets of Software to Build It
• The ASIC Flow
• The PLD Flow
Have No Idea What to Connect the PLD Pins to
• Chances Are, You Are Going to Get It Wrong!
[Jonathan Rose]
What's Right with This Picture!
[Figure: PLD with embedded CPU, serial link, analog, "etc."]
Pre-Fabricated
One CAD Tool Flow!
Can Connect Anything to Anything
• PLDs are built for general connectivity
[Jonathan Rose]
>> Soft CPU
Free 32 bit processor core
Processors in PLDs: Excalibur [Jonathan Rose]
High-Speed Processors Integrated with PLDs: Available Today!
[Figure: Excalibur device with ARM 922T core, single-port RAM, dual-port RAM and general-purpose PLD]
Soft CPU: new job for compilers
[Figure: HLL -> Compiler targeting an FPGA that holds a soft CPU and a memory core]
Some soft CPU core examples
core
MicroBlaze 125
MHz 70 D-MIPS
Nios
architecture
platform
32 bit standard RISC
32 reg. by 32 LUT
RAM-based reg.
Xilinx up to 100 on
one FPGA
core
architecture
platform
Leon
25 Mhz
SPARC
ARM7 clone
ARM
uP1232 8-bit
CISC, 32 reg.
200 XC4000E CLBs
16-bit
instr. set
Altera
Mercury
REGIS
8 bits Instr. + ext.
ROM
2 XILINX 3020 LCA
Nios
50 MHz
32-bit
instr. set
Altera
22 D-MIPS
Reliance-1
12 bit DSP
Lattice
4 isp30256,
4 isp1016
Nios
8 bit
Altera – Mercury
1Popcorn-1
8 bit CISC
Altera, Lattice, Xilinx
gr1040
16-bit
gr1050
32-bit
My80
i8080A
FLEX10K30 or
EPF6016
YARD-1A
16-bit RISC,
2 opd. Instr.
old Xilinx FPGA Board
DSPuva16
16 bit DSP
Spartan-II
xr16
RISC integer C
SpartanXL
Acorn-1
1 Flex 10K20
Nios Architecture (Altera)
free DSP or Processor Cores

CPU core     Description                                  Language    Implementation
Reliance 1   12-bit DSP and peripherals                   Schematic   Viewlogic 7 Lattice CPLDs
PopCorn 1    small 8-bit CISC                             Verilog     1 Lattice CPLD isp3256-90
Acorn 1      small 8-bit CISC                             VHDL        Max2PlusII+ 1 Altera 10k20
16-bit DSP   16-bit Harvard DSP with 5 pipeline stages    VHDL        Xilinx XC4000
Free-6502    6502 compatible core                         VHDL
DLX          generic 32-bit RISC CPU                      VHDL
DLX2         generic 32-bit RISC CPU                      VHDL
GL85         i8085 clone                                  VHDL
AMD 2901     AMD 2901 4-bit slice                         VHDL
AMD 2910     AMD 2910 bit slice                           VHDL
i8051        8-bit micro-controller                       VHDL        Synopsys
i8051        another i8051 clone                          VHDL        Mentor Graphics, Synopsys
FPGA CPUs in teaching and academic research
UCSC: 1990!
Mälardalen University, Eskilstuna, Sweden
Chalmers University, Göteborg, Sweden
Cornell University
Gray Research
Georgia Tech
Hiroshima City University, Japan
Michigan State
Universidad de Valladolid, Spain
Virginia Tech
Washington University, St. Louis
New Mexico Tech
UC Riverside
Tokai University, Japan
Xilinx 10Mg, 500Mt, .12 mic
Soft rDPA feasible ?
[à la S. Guccione]
Array I/O examples
data streams, or, from / to embedded memory banks
[Figure: Processor-Memory Performance Gap, performance (log scale, 1 to 1000) over 1980 to 2000; µProc (CPU) 60%/yr., DRAM 7%/yr.; the gap grows 50% / year]
[à la S. Guccione]
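The 50% per year figure for the processor-memory gap follows from the two growth rates quoted on the chart (60% per year for the processor, 7% per year for DRAM). A short check, using only those two numbers:

```python
cpu_growth = 0.60   # processor performance growth per year (from the chart)
dram_growth = 0.07  # DRAM performance growth per year (from the chart)

# The gap is the ratio of the two curves, so its yearly growth is the ratio of the factors.
gap_growth = (1 + cpu_growth) / (1 + dram_growth) - 1

def gap_after(years):
    """Relative processor/DRAM performance ratio after the given number of years."""
    return ((1 + cpu_growth) / (1 + dram_growth)) ** years

print(f"gap grows {gap_growth:.1%} per year")
print(f"after 10 years the gap is about {gap_after(10):.0f}x")
```

The ratio 1.60 / 1.07 is roughly 1.495, which is where the rounded 50% comes from.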
HLL 2 Soft Array
[Figure, à la S. Guccione: HLL -> Compiler targeting soft CPU, memory and miscellaneous]
HLL 2 "flex" rDPA
[Figure, à la S. Guccione: HLL -> Compiler targeting CPU, memory and miscellaneous]
>> HLLs
HLLs for Hardware Design vs. System Design vs. RTR System Design
[Figure, à la S. Guccione: separate HLL -> Compiler paths for hardware design, System Design and RTR System Design]
CPU and memory on Chip
[Figure, à la S. Guccione: RTR System Design; one HLL -> Compiler path targets the FPGA core, another HLL -> Compiler path targets the CPU core and memory core]
JBits Environment
[Figure, à la S. Guccione; components: RTP Core Library, JRoute API, JBits API, User Code, BoardScope Debugger, XHWIF, TCP/IP, Device Simulator]
HLLs for Hardware Design vs. System Design vs. RTR System Design
[Figure, à la S. Guccione: HLL -> Compiler paths for System Design and Embedded System Design; targets shown are an FPGA core with CPU core and memory core, and an FPGA holding a soft CPU with a memory core]
>> Problems to be solved
Why Can't Reconfig. Software Survive?
Resource constraints/sizes are exposed:
• to programmer
• in low-level representation (netlist)
Design revolves around device size
• Algorithmic structure
• Exploited parallelism
Target technologies
Processing units
• Power efficiency of target technologies
• ASICs
• Processors
   • Energy efficiency
   • Code size efficiency and code compaction
   • Run-time efficiency
   • DSP processors
   • Multimedia processors
   • Very long instruction word (VLIW) & EPIC machines
   • Micro-controllers
• Reconfigurable Hardware
• Memory
Reconfigurable Logic
Full custom chips may be too expensive, software too slow.
Combine the speed of HW with the flexibility of SW
• HW with programmable functions and interconnect.
• Use of configurable hardware; common form: field programmable gate arrays (FPGAs)
Applications: bit-oriented algorithms like
• encryption,
• fast "object recognition" (medical and military)
• adapting mobile phones to different standards.
Very popular devices from
• XILINX (XILINX Virtex II are very recent devices)
• Actel and others
Floor-plan of VIRTEX II FPGAs
Virtex II Configurable Logic Block (CLB)
Virtex II Slice (simplified)
Example:
Look-up tables LUT F and G can be used to compute any Boolean function of up to 4 variables.

a b c d | G
0 0 0 0 | 0
0 0 0 1 | 1
0 0 1 0 | 1
0 0 1 1 | 0
0 1 0 0 | 1
0 1 0 1 | 0
0 1 1 0 | 0
0 1 1 1 | 1
1 0 0 0 | 1
1 0 0 1 | 0
1 0 1 0 | 0
1 0 1 1 | 1
1 1 0 0 | 0
1 1 0 1 | 1
1 1 1 0 | 1
1 1 1 1 | 0
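A look-up table is just a 16-entry memory addressed by the four inputs. The sketch below models that in software; the helper names are made up for illustration. The G column of the example, read from the first row to the last, is 0110 1001 1001 0110, which is the 4-input XOR (parity) function, so that is the function loaded here.

```python
def make_lut(func, inputs=4):
    """Precompute the truth table of `func` as a list of 2**inputs bits."""
    table = []
    for index in range(2 ** inputs):
        # The first argument is the most significant address bit.
        bits = [(index >> (inputs - 1 - i)) & 1 for i in range(inputs)]
        table.append(func(*bits) & 1)
    return table

def lut_read(table, a, b, c, d):
    """Evaluate the stored function by indexing the table with the inputs."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]

# G from the example: parity of a, b, c, d
G = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d)
print(G)
print(lut_read(G, 1, 0, 1, 1))
```

Changing the function only changes the 16 stored bits, not the circuit, which is what makes the LUT the basic reconfigurable element.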
Virtex II (Pro) Slice
[© and source: Xilinx Inc.: Virtex-II Pro™ Platform FPGAs: Functional Description, Sept. 2002, //www.xilinx.com]
2 carry paths per CLB (Virtex II Pro)
Enables efficient implementation of adders.
[© and source: Xilinx Inc.: Virtex-II Pro™ Platform FPGAs: Functional Description, Sept. 2002, //www.xilinx.com]
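Why a dedicated carry path helps adders can be seen from a bit-level ripple-carry model: every bit position needs the carry from the position below it, so a fast path for exactly that signal shortens the critical path. The following is a generic textbook sketch, not a description of the Xilinx circuit.

```python
def full_adder(a, b, carry_in):
    """One bit position: sum bit plus the carry handed to the next position."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out

def ripple_add(x, y, width=8):
    """Add two unsigned `width`-bit numbers by chaining full adders along the carry path."""
    carry = 0
    result = 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry

print(ripple_add(100, 55))   # (155, 0)
print(ripple_add(200, 100))  # (44, 1): the carry out signals overflow of 8 bits
```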
Shift register configuration
Slices can be configured as shift registers
Implementing sums of products
16-input AND gate
Dedicated OR chain for computing sum of products
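A sum of products can be modeled logically as wide AND terms combined along an OR chain. The sketch below builds a 16-input AND from 4-input pieces and ORs the product terms one after another. It is a logical model only; how the device actually wires its LUTs and dedicated chains is not reproduced here.

```python
def and4(a, b, c, d):
    """A 4-input AND, the size of function one 4-input LUT can hold."""
    return a & b & c & d

def and16(bits):
    """16-input AND built from four 4-input ANDs whose outputs are combined."""
    if len(bits) != 16:
        raise ValueError("expected exactly 16 input bits")
    partial = [and4(*bits[i:i + 4]) for i in range(0, 16, 4)]
    return and4(*partial)

def sum_of_products(product_terms):
    """OR the product terms one after another, as along a chain."""
    result = 0
    for term in product_terms:
        result |= term
    return result

all_ones = [1] * 16
one_zero = [1] * 15 + [0]
print(and16(all_ones), and16(one_zero))
print(sum_of_products([and16(one_zero), and16(all_ones)]))
```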
Number of resources
available in Virtex II Pro devices
[© and source: Xilinx Inc.: Virtex-II Pro™ Platform FPGAs:
Functional Description, Sept. 2002, //www.xilinx.com]
Embedded Multipliers
A Virtex-II Pro multiplier block is an 18-bit by 18-bit signed multiplier.
Multipliers are connected to a switch matrix, share some bits with RAM (MAC instruction).

Device      Columns   Multipliers
XC2VP2      4         12
XC2VP4      4         28
XC2VP7      6         44
XC2VP20     8         88
XC2VP30     8         136
XC2VPX20    8         88
XC2VP40     10        192
XC2VP50     12        232
XC2VP70     14        328
XC2VPX70    14        308
XC2VP100    16        444
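The arithmetic of an 18-bit by 18-bit signed multiplier can be sketched in a few lines: both operands are interpreted as two's-complement numbers, and the product always fits in 36 bits. The function names are illustrative, and this models the arithmetic only, not the hardware block.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value

def mult18x18(a, b):
    """Multiply two 18-bit two's-complement operands; the result fits in 36 bits."""
    return to_signed(a, 18) * to_signed(b, 18)

print(mult18x18(3, 5))              # 15
print(mult18x18(0x3FFFF, 2))        # -2, since 0x3FFFF encodes -1
print(mult18x18(0x20000, 0x20000))  # 17179869184, the largest magnitude product
```

The extreme case, (-131072) squared, equals 2**34, which is below the 36-bit signed maximum of 2**35 - 1.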
Hierarchical Routing Resources
Interconnect
Virtex II Pro devices include up to 4 PowerPC processor cores
[© and source: Xilinx Inc.: Virtex-II Pro™ Platform FPGAs: Functional Description, Sept. 2002, //www.xilinx.com]
Memory for processor cores
Cores are connected to local block RAM that can be used as a scratchpad.
Summary
Processing units
• Power efficiency of target technologies
• ASICs
• Processors
   • Energy efficiency
   • Code size efficiency and code compaction
   • Run-time efficiency
   • DSP processors
   • Multimedia processors
   • Very long instruction word (VLIW) machines
   • Micro-controllers
• Reconfigurable Hardware
• Memory
Covered today