November 21, 2001, Tampere, Finland
Reiner Hartenstein, University of Kaiserslautern
Enabling Technologies for Reconfigurable Computing
Part 4: FPGAs: recent developments
Wednesday, November 21, 16.00 – 17.30 hrs.

>> Configware Market

• Configware Market
• FPGA Market
• Embedded Systems (Co-Design)
• Hardwired IP Cores on Board
• Run-Time Reconfiguration (RTR)
• Rapid Prototyping & ASIC Emulation
• Evolvable Hardware (EH)
• Academic Expertise
• ASICs dead
• Soft CPU
• HLLs
• Problems to be solved

Configware heading for mainstream
1. Configware market taking off for mainstream
2. FPGA-based designs more complex, even SoC
3. No design productivity and quality without good configware libraries (soft IP cores) from various application areas.
4. Growing number of independent configware houses (soft IP core vendors) and design services
5. AllianceCORE & Reference Design Alliance
6. Currently the top FPGA vendors are the key innovators and meet most configware demands.

Bleeding edge designs
1. Infinite amount of gates not yet available on a chip
2. 3 million gates (10 million in 2003?) is far away from "infinite"
3. Bleeding edge designs only with sophisticated EDA tools
4. Excessive optimization needed
5. Hardware expertise is indispensable for the designer.
6. Improve and simplify the design flow for the user
7. Provide rich configware libraries of soft IP cores,
8. APPLICATIONS:
   1. control applications,
   2. networking,
   3. wireless telecommunication,
   4. data communication,
   5. embedded and consumer markets.

Configware (soft IP Products)
1. For libraries, creation and reuse of configware
2. To search for IPs see: List of all available IP
3. The AllianceCORE program is a cooperation between Xilinx and third-party core developers
4. The Xilinx Reference Design Alliance Program
5. The Xilinx University Program
6. LogiCORE soft IP with LogiCORE PCI Interface.
7. Consultants

EDA as the Key Enabler (major EDA vendors)
1. Select by EDA quality / productivity, not by FPGA architecture
2. EDA often has massive software quality problems
3. Customer: highest priority is an EDA center of excellence
   1. collecting EDA expertise and EDA user experience
   2. to assemble the best possible tool environments
   3. for optimum support of design teams
   4. to cope with interoperability problems
   5. to keep track of the EDA scene as a rapidly moving target
4. Being fabless, FPGA vendors spend their most qualified manpower on development of EDA, IP cores, applications, support
5. Xilinx and Altera are morphing into EDA companies.

OS for FPGAs
• Separate EDA software market, comparable to the compiler / OS market in computers
• Cadence, Mentor, Synopsys just jumped in.
• < 5% of Xilinx / Altera income from EDA SW

Changing EDA Tools Market

Major configware EDA vendors
• Altera
• Cadence
• Mentor Graphics
• Synopsys
• Xilinx

EDA Software for Xilinx
1. Full design flow from Cadence, Mentor, & Synopsys
2. Xilinx Software AllianceEDA Program:
   1. Alliance Series Development System.
   2. Foundation Series Development Systems.
   3. Xilinx Foundation Series ISE (Integrated Synthesis Environment)
   4. free WebPOWERED SW w. WebFitter & WebPACK ISE
   5. StateCAD XE and HDL Bencher
   6. Foundation Base Express
   7. Foundation ISE Base Express
• ModelSim Xilinx Edition (ModelSim XE)
• Forge Compiler
• Modular Design
• ChipScope ILA
• The Xilinx System Generator
• XPower
• JBits SDK
• The Xilinx XtremeDSP Initiative
• MathWorks / Xilinx Alliance
• System Generator
• Wind River / Xilinx alliance

Altera EDA
• Altera was founded in June 1983
• EDA: synthesis, place & route, and verification
• Quartus II: APEX, Excalibur, Mercury, FLEX 6000 families
• MAX+PLUS II: FLEX, ACEX & MAX families
• Flow with Quartus II: Mentor Graphics, Synopsys, Synplicity deliver design software to support Altera SOPC solutions.
• Mentor: only EDA vendor w. complete design environment f. APEX II incl. IP, design capture, simulation, synthesis, and h/s co-verification
• Configware: Altera offers over a hundred IP cores
• Third party IP core design services and consultants

Cadence
• FPGA Designer: top-down FPGA design system,
• high-level mapping, architecture-specific optimization,
• Verilog, VHDL, schematic-level design entry.
• Verilog, VHDL to Synergy (logic synthesis) and FPGA Designer
• FPGAs simulated by themselves using Cadence's Verilog-XL or Leapfrog VHDL simulators and
• simulated with the rest of the system design with the Logic Workbench board/system verification environment.
• Libraries for the leading FPGA manufacturers.

Mentor Graphics
• System Design and Verification.
• PCB design and analysis
• IC Design and Verification
• shifts ASIC design flow to FPGAs (Altera, Xilinx) by FPGA Advantage with IP support by ModuleWare, Xilinx CORE Generator, Altera MegaWizard integration

Synopsys
• FPGA Compiler II
• Version of ASIC Design Compiler Ultra
• Block Level Incremental Synthesis (BLIS)
• ASIC <-> FPGA migration
• Actel, Altera, Atmel, Cypress, Lattice, Lucent, Quicklogic, Triscend, Xilinx

>> FPGA Market

Top 4 PLD Manufacturers 2000 ($3.7 billion)
Xilinx   42%
Altera   37%
Lattice  15%
Actel     6%

FPGA market: 1998 / 1999 global sales (million $)
1999 rank | Vendor | 1998 | 1999
1 | Xilinx | 629 | 899
2 | Altera | 654 | 837
3 | Lattice | 206 | 410
4 | Actel | 154 | 172
5 | Lucent | 100 | 120
6 | Cypress | 41 | 43
7 | Quicklogic | 30 | 40
8 | Atmel | 32 | 38
Source: IC Insights Inc.
Meanwhile, Xilinx acquired Philips' MOS PLD business, and Lattice purchased Vantis.

.... into every application
[Dataquest] PLD market > $7 billion by 2003.
"fastest growing segment of semiconductor market."
IP reuse and "pre-fabricated" components for the efficiency of design and use for PLDs
FPGAs are going into every type of application.

.... going into every type of application [Gordon Bell]

Xilinx
• fabless FPGA semi vendor, San Jose, CA, founded 1984
• key patents on FPGAs (expiring in a few years)
• Fortune 2001: No. 14 Best Company to work for (Intel: no. 42, HP: no. 64, TI: no. 65).
• DARPA grant (Nov '99) to develop JBits API tools for internet reconfigurable / upgradable logic (w. VT)
• Less brilliant early/mid 90s (president Curt Wozniak): 1995 market share from 84% down to 62% [Dataquest]
• As designs got larger, Xilinx lost its advantage (bug fixes did not require burning new chips)
• meanwhile, weeks of expensive debug time needed

Xilinx Flexware
• Virtex, Virtex-II, first w. 1 million system gates. Virtex-E series > 3 million system gates.
• Virtex-EM on a copper process & addit. on-chip memory f. network switch appl.
• The Virtex XCV3200E: > 3 million gates, 0.15-micron technology
• Spartan, Spartan-XL, Spartan-II for low-cost, high-volume applications as ASIC replacements. Multiple I/O standards, on-chip block RAM, digital delay lock loops eliminate phase lock loops, FIFOs, I/O translators, system bus drivers
• XC4000XV, XC4000XL/XLA, CPLD: low-cost families; rapid development, longer system life, robust field upgradability; support In-System Programming (ISP), in-board debugging, test during manufacturing, field upgrades, full JTAG compliant interface
• CoolRunner: low power, high speed/density, standby mode.
• Military & Aerospace: QPRO high-reliability QML certified
• Configuration Storage Devices

Altera Flexware
Newer families: APEX 20KE, APEX 20KC, APEX II, MAX 7000B, ACEX 1K, Excalibur, Mercury families.
APEX EP20K1500E (0.18-µ), up to 2.4 million system gates; APEX II (all-copper 0.13-µ) f. data path applications, supports many I/O standards.
1-Gbps True-LVDS performance wQ2001, an ARM-based Excalibur device
Altera mainstream: MAX 7000A, 3000A; FLEX 6000, 10KA, 10KE; APEX 20K families.
Mature and other: Classic, MAX 7000, 7000S, 9000; FLEX 8000, 10K families.

Triscend CSoC [Kean]
[Figure: Triscend CSoC block diagram. Labels: Configurable system logic, ARM, Digital Filter, Display Interface, Viterbi, A/D Interface, CSI Socket, Configurable System Interconnect (CSI) Bus, Memory, Other System Resources]

>> Embedded Systems (Co-Design)

Goal: away from complex design flow [à la S. Guccione]
[Figure: Schematics/HDL, Netlister, Netlist, Place and Route, Bitstream; versus HLL, Compiler]

Overcome traditional separate design flow [à la S. Guccione]
[Figure: hardware side: HLL, Schematics/HDL, Netlister, Netlist, Place and Route, Bitstream; software side: User Code, Compiler, Executable]

Overcome traditional separate co-processing design flow -> JBits Design Flow [à la S. Guccione]
[Figure: Schematics/HDL, Netlister, Netlist, Place and Route, Bitstream; User Code, Compiler, Executable; versus JBits API, User Java Code, Java Compiler, Executable]

Embedded hardware. CPU & memory cores on chip. [à la S. Guccione]
[Figure: HLL, Compiler; FPGA core, CPU core, Memory core]

New directions in application development
1. New directions in application development.
2. Automatic partitioning compilers: designer productivity
3. like CoDe-X (Jürgen Becker, Univ. of Karlsruhe),
4. supports Run-Time Reconfiguration (RTR),
   • a key enabler of error handling and fault correction by partially rerouting the FPGA at run time,
   • as well as remote patching for upgrading, remote debugging, and remote repair by reconfiguration, even over the internet.
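To make the idea of an automatic partitioning compiler more concrete, here is a minimal sketch of a hardware/software partitioning step. It is not the CoDe-X algorithm; it is a simple greedy heuristic chosen for illustration, and every name and number in it (Task, partition, the run times and areas) is a hypothetical assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    sw_time: float  # run time on the CPU (arbitrary units, hypothetical)
    hw_time: float  # run time when mapped onto the FPGA
    hw_area: int    # FPGA area the hardware version needs (e.g. CLBs)


def partition(tasks, area_budget):
    """Greedy HW/SW split (illustrative only, not CoDe-X).

    Tasks with the best time saving per unit of FPGA area are moved to
    hardware first, as long as they still fit into the area budget.
    """
    candidates = [t for t in tasks if t.sw_time > t.hw_time and t.hw_area > 0]
    candidates.sort(key=lambda t: (t.sw_time - t.hw_time) / t.hw_area,
                    reverse=True)
    hardware, used = [], 0
    for t in candidates:
        if used + t.hw_area <= area_budget:
            hardware.append(t.name)
            used += t.hw_area
    software = [t.name for t in tasks if t.name not in hardware]
    return hardware, software


# Hypothetical task set: two compute kernels and one control task.
tasks = [
    Task("fir_filter", sw_time=90.0, hw_time=10.0, hw_area=400),
    Task("viterbi", sw_time=120.0, hw_time=20.0, hw_area=800),
    Task("ui_control", sw_time=5.0, hw_time=4.0, hw_area=300),
]
hardware, software = partition(tasks, area_budget=1000)
```

A greedy split like this is not guaranteed to be optimal; real partitioning compilers also have to account for communication between the two sides, which this sketch ignores.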
>> Run-Time Reconfiguration (RTR)

CPU use for configuration management
• An on-board microprocessor CPU is available anyhow, even along with a little RTOS
• Use this CPU for configuration management

Run-Time Reconfiguration: hard CPU & memory core on the same chip
[Figure: HLL, Compiler, RTR System Design; FPGA core, CPU core, Memory core]

Converging factors for RTR
• Converging factors make RTR-based system design viable
• 1) Million-gate FPGA devices and co-processing with standard microprocessors are commonplace
   • direct implementation of complex algorithms in FPGAs.
   • This alone has already revolutionized FPGA design.
• 2) New tools like the Xilinx JBits software tool suite directly support co-processing and RTR.
[Figure: JBits API, User Java Code, Java Compiler, Executable]

RTR
• Divides the application into a series of sequentially executed stages, each implemented as a separate execution module.
• Partial RTR partitions these stages into finer-grain sub-modules to be swapped in as needed.
• Without RTR, all configurable platforms are just ASIC emulators.
• RTR needs a new kind of application development environment
   • which directly supports development and debugging of RTR applications
   • essential for the advancement of configurable computing
   • will also heavily influence future system organization
• Xilinx, VT, BYU work on run-time kernels, run-time support, RTR debugging tools and other associated tools.
• Smaller, faster circuits, simplified hardware interfacing, fewer IOBs; smaller, cheaper packages, simplified software interfaces.

Run-time Mapping
1. Run-time reconfigurable are: Xilinx Virtex FPGA family
2. RAs being part of Chameleon CS2000 series systems
3. Using such devices changes many of the basic assumptions in the HW/SW co-design process:
4. Host/RL interaction is dynamic and needs a tiny OS like eBIOS, also to organize RL reconfiguration under host control
5. Typical goal is minimization of reconfiguration latency (especially important in communication processors), to hide configuration loading latency, and
6. Scheduling to find the "best" schedule for eBIOS calls (C side).

>> Rapid Prototyping & ASIC Emulation

ASIC emulation: a new business model?
1. ASIC emulation / Rapid Prototyping: to replace simulation
2. Quickturn (Cadence), IKOS (Synopsys), Celaro (Mentor)
3. From rack to board to chip (from other vendors), e.g. Virtex and Virtex-E family (emulate up to 3 million gates)
4. Easy configuration using SmartMedia FLASH cards
5. ASIC emulators will become obsolete within years
6. By RTR: in-circuit execution debugging instead of emulation

>> Evolvable Hardware (EH)

EH, EM, ...
1. "Evolvable Hardware" (EH), "Evolutionary Methods" (EM), "digital DNA", "Darwinistic Methods", and biologically inspired electronic systems
2. New research area, also a new application area of FPGAs
3. Revival of cybernetics or bionics: stimulated by technology
4. "Evolutionary" and "DNA" metaphors create awareness
5. EM sucks, although there are mushrooming funds in the EU, in Japan, Korea, and the USA
6. EM-related international conference series are in their stormy visionary phase, like EH, ICES, EuroGP, GP, CEC, GECCO, EvoWorkshops, MAPLD, ICGA

EH, EM, ...
1. Shake-out phenomena expected, like in the past with "Artificial Intelligence"
2. Should be considered as a specialized EDA scene, focusing on theoretical issues.
3. Genetic algorithms suck: often replaceable by more efficient ones from EDA
4. It is advisable to set up an interwoven competence in both scenes, the EM scene and the highly commercialized EDA scene
5. EH should be done by EDA people, rather than EM freaks.

>> Academic Expertise

BRASS (1)
1. UC Berkeley, the BRASS group: Prof. Dr. John Wawrzynek
2. The Pleiades Project, Prof. Jan Rabaey: ultra-low power high-performance multimedia computing through reconfiguration of heterogeneous system modules, reducing energy by overhead elimination, programmability at just the right granularity, parallelism, pipelining, dynamic voltage scaling.
3. Garp integrates processor and FPGA; developed in parallel with compiler. Software compile techniques (VLIW SW pipelining): simple pipelining functionalities, broad class of loops.
4. SCORE, a stream-based computation model: a unifying computational model. Fast Mapping for Datapaths: by a tree-parsing compiler tool for datapath module mapping

BRASS (2)
• HSRA. New FPGA (& related tools) supports pipelining, w. retiming-capable CLB architecture, implemented in a 0.4 µm DRAM process supporting 250 MHz operation
• OOCG. Object Oriented Circuit-Generators in Java
• MESCAL (GSRC). The goal is to provide a programmer's model and software development environment for efficient implementation of an interesting set of applications onto a family of fully-programmable architectures / microarchitectures.

Berkeley claiming (1)
1. SCORE, a stream-based computation model: the BRASS group claims to have solved the problem of the primary impediment to wide-spread reconfigurable computing by a unifying computational model.
2. Remark: a clean stream-based model was introduced ~1980: the Systolic Array
3. 1995: Rainer Kress introduces a reconfigurable stream-based model
4. Fast Mapping for Datapaths (SCORE): BRASS claims to have introduced in 1998 the first tree-parsing compiler tool for datapath module mapping. "Further, it is the first work to integrate simultaneous placement with module mapping in a way that preserves linear time complexity."

Berkeley claiming (2)
1. Remark: the DPSS (Data Path Synthesis System), using tree covering with simultaneous datapath placement and routing, was published in 1995 by Rainer Kress
2. "Chip-in-a-Day" Bee Project. Prof. Dr. Bob Brodersen's "radical rethink of the ASIC design flow aimed at shortening design time, relying on stream-based DPU arrays." [published in 2000]
3. Remark: the KressArray, a scalable rDPU array [1995], is stream-based

.... Stream Processors - MSP-3
3rd Workshop on Media and Stream Processors (MSP-3)
• http://www.pdcl.eng.wayne.edu/msp01
in conj. w. 34th Int'l Symp. on Microarchitecture (MICRO-34)
• http://www.microarch.org/micro34
Austin, Texas, December 1-2, 2001
Topics of interest include, but are not limited to:
• Hardware/Compiler techniques for improving memory performance of media and stream-based processing
• Application-specific hardware architectures for graphics, video, audio, communications, and other media and streaming applications
• System-on-a-chip architectures for media & stream processors
• Hardware/Software Co-Design of media and stream processors
• and others ....

Berkeley: "Chip-in-a-Day" Bee Project
Chip-in-a-Day Project.
• Prof. Dr. Bob Brodersen, BWRC: targeting a radical rethink of the ASIC design flow aimed at shortening design time.
• Relying on stream-based DPU arrays (not rDPU) and related EDA tools.
• Davis: "... 50x decrease in power requirements over typical TI C64X design."
1. New design flow to break up the highly iterative EDA process, allowing designers to spend more time defining the device and far less time implementing it in silicon. "... developers to start by creating data flow graphs rather than C code,"
2. It is stream-based computing by DPU array (hardwired DPA)
3. For hardwired and reconfigurable DPU array and rDPU array

From Stanford to BYU
1. Stanford: Prof. Flynn went emeritus, Oskar Mencer moved to Bell Labs.
   • no activities seen other than YAFA (yet another FPGA application)
2. UCLA: Prof. Jason Cong, expert on FPGA architectures and R & P algorithms. 9 projects, mult. sponsors under California MICRO Program
3. Prof. Majid Sarrafzadeh directs the SPS project: "versatile IPs", a new routing architecture, architecture-aware CAD, IP-aware SPS compiler
5. USC: Prof. Viktor Prasanna (EE dept.) works 20% on reconfigurable computing: MAARC project, DRIVE project and Efficient Self-Reconfiguration.
   - Prof. Dubois: RPM Project, FPGA-based emulation of scalable multiprocessors. DEFACTO proj.: compilation, architecture-independent at all levels
6. MIT. MATRIX web pages removed '99. "RAW project": a conglomerate
7. VT. Prof. Athanas: JBits API f. internet RTR logic ($2.7 million DARPA). W. Prof. Brad Hutchings, BYU, on programming approaches for RTR systems
8. BYU. Prof. Brad Hutchings works on JHDL (JAVA Hardware Description Language) and compilation of JHDL sources into FPGAs.

From Toronto to Karlsruhe
• U. Toronto. Prof. J. Rose, expert in FPGA architectures and R & P alg. The group has developed Transmogrifier C, a C compiler creating netlists for Xilinx XC4000 and Altera's Flex 8000 and Flex 10000 series FPGAs. Founder of Right Track CAD Corporation, acquired by Altera in 1999
• Los Alamos National Laboratory, Los Alamos, New Mexico (Jeff Arnold) – Project Streams-C: programming FPGAs from C sources.
• Catholic University of Leuven, and IMEC: Prof. Rudy Lauwereins, methods for MPEG-4 like multimedia applications on dynamically reconfigurable platforms, & on reconf. instruction set processors.
• University of Karlsruhe. Prof. Dr.-Ing. Juergen Becker: hardware/software codesign, reconfigurable architectures & related synthesis for future mobile communication systems & synthesis with distributed internet-based CAD methods, partitioning co-compilers

>> ASICs dead ?

(When) Will FPGAs Kill ASICs? [Jonathan Rose]

My Position [Jonathan Rose]
ASICs Are Already Dead
They Just Don't Know It Yet!

Why? [Jonathan Rose]
• You have to fabricate an ASIC
   • Very hard, getting harder
• An FPGA is pre-fabricated
   • A standard part
   • immense economic advantages

Making ASICs is Damn Difficult [Jonathan Rose]
• Testing
• Yield
• Cross Talk
• Noise
• Leakage
• Clock Tree Design
• Horrible very deep submicron effects we don't even know about yet

Did I Mention Inventory? [Jonathan Rose]
• ASIC users must predict # parts 2 or 3 months in advance!
• Never guess the Right Amount
   • Make Too Many – You Pay holding costs
   • Make Too Few – Competitor gets the Sale

FPGAs Give You Instant Fabrication [Jonathan Rose]
• Get to Market Fast
• Fix 'em quick
• Zero NRE Charges
• Low Risk
• Low Cost at good volume

FPGAs: "Too Pricey & Too Slow ?" [Jonathan Rose]
• 9 Times Out of 10 You can make the thing fast by breaking it into multiple parallel slower pieces
• Custom IC Designer Can Make Logic 20x Faster, 20x Smaller than Programmable

What's Wrong with This Picture?
What About PLD Cores on ASICs ?
• Embedded FPGA Fabric
• Still Have to Make the Chip
• Need Two Sets of Software to Build It
   • The ASIC Flow
   • The PLD Flow
• Have No Idea What to Connect the PLD Pins to
   • Chances Are, You Are Going to Get It Wrong!
[Jonathan Rose]

What's Right with This Picture! [Jonathan Rose]
• Embedded CPU, Serial Link, Analog, "etc."
• Pre-Fabricated
• One CAD Tool Flow!
• Can Connect Anything to Anything
   • PLDs are built for general connectivity

>> Soft CPU

Free 32 bit processor core

Processors in PLDs: Excalibur [Jonathan Rose]
High-Speed Processors Integrated with PLDs. Available Today!
[Figure: Excalibur block diagram. Labels: Dual-Port RAM, Single-Port RAM, ARM 922T Core, General Purpose PLD]

Soft CPU: new job for compilers
[Figure: HLL, Compiler; FPGA with Memory core and soft CPU]

Some soft CPU core examples
core | architecture | platform
MicroBlaze, 125 MHz, 70 D-MIPS | 32-bit standard RISC, 32 reg. by 32 LUT RAM-based reg. | Xilinx, up to 100 on one FPGA
Nios | 16-bit instr. set | Altera Mercury
Nios, 50 MHz, 22 D-MIPS | 32-bit instr. set | Altera
Nios | 8 bit | Altera – Mercury
gr1040 | 16-bit |
gr1050 | 32-bit |
My80 | i8080A | FLEX10K30 or EPF6016
DSPuva16 | 16-bit DSP | Spartan-II
xr16 | RISC, integer C | SpartanXL
Leon, 25 MHz | SPARC |
ARM7 clone | ARM |
uP1232, 8-bit | CISC, 32 reg. | 200 XC4000E CLBs
REGIS | 8 bits instr. + ext. ROM | 2 XILINX 3020 LCA
Reliance-1 | 12-bit DSP | Lattice: 4 isp30256, 4 isp1016
Popcorn-1 | 8-bit CISC | Altera, Lattice, Xilinx
YARD-1A | 16-bit RISC, 2 opd. instr. | old Xilinx FPGA board
Acorn-1 | | 1 Flex 10K20

Nios Architecture (Altera)

Free DSP or Processor Cores
CPU core | Description | Language | Implementation
Reliance 1 | 12-bit DSP and peripherals | Schematic, Viewlogic | 7 Lattice CPLDs
PopCorn 1 | small 8-bit CISC | Verilog | 1 Lattice CPLD isp3256-90
Acorn 1 | small 8-bit CISC | VHDL, MAX+PLUS II | 1 Altera 10k20
16-bit DSP | A 16-bit Harvard DSP with 5 pipeline stages. | VHDL | Xilinx XC4000
Free-6502 | 6502 compatible core | VHDL |
DLX | Generic 32-bit RISC CPU | VHDL |
DLX2 | Generic 32-bit RISC CPU | VHDL |
GL85 | i8085 clone | VHDL |
AMD 2901 | AMD 2901 4-bit slice | VHDL |
AMD 2910 | AMD 2910 bit slice | VHDL |
i8051 | 8-bit micro-controller | VHDL | Synopsys
i8051 | another i8051 clone | VHDL | Mentor Graphics, Synopsys

FPGA CPUs in teaching and academic research
• UCSC: 1990!
• Mälardalen University, Eskilstuna, Sweden
• Chalmers University, Göteborg, Sweden
• Cornell University
• Gray Research
• Georgia Tech
• Hiroshima City University, Japan
• Michigan State
• Universidad de Valladolid, Spain
• Virginia Tech
• Washington University, St. Louis
• New Mexico Tech
• UC Riverside
• Tokai University, Japan

Soft rDPA feasible ? [à la S. Guccione]
Xilinx: 10Mg, 500Mt, .12 micron
Array I/O examples: data streams, or from / to embedded memory banks
[Figure: Processor-Memory Performance Gap, 1980 to 2000, performance axis from 1 to 1000. CPU (µProc) improves 60%/yr, DRAM 7%/yr; the gap grows 50% / year]

HLL 2 Soft Array [à la S. Guccione]
[Figure: HLL, Compiler, soft CPU, Memory, soft array, miscellaneous]

HLL 2 "flex" rDPA [à la S. Guccione]
[Figure: HLL, Compiler, CPU, Memory, "flex" rDPA, miscellaneous]

>> HLLs

HLLs for Hardware Design vs. System Design vs. RTR System Design [à la S. Guccione]
[Figure: one HLL compiler each for Hardware Design, System Design, and RTR System Design]

CPU and memory on Chip [à la S. Guccione]
[Figure: HLL, Compiler, RTR System Design; FPGA core, CPU core, Memory core]

JBits Environment [à la S. Guccione]
[Figure: RTP Core Library, JRoute API, JBits API, User Code, BoardScope Debugger, XHWIF, TCP/IP, Device Simulator]

HLLs for Hardware Design vs. System Design vs. RTR System Design [à la S. Guccione]
[Figure: HLL compilers for System Design and Embedded System Design; FPGA core, CPU core, Memory core; FPGA with Memory core and soft CPU]

>> Problems to be solved

Why Can't Reconfig. Software Survive?
• Resource constraints/sizes are exposed to the programmer in a low-level representation (netlist)
• Design revolves around device size
   • Algorithmic structure
   • Exploited parallelism

Target technologies
Processing units
Power efficiency of target technologies
ASICs
Processors
• Energy efficiency
• Code size efficiency and code compaction
• Run-time efficiency
• DSP processors
• Multimedia processors
• Very long instruction word (VLIW) & EPIC machines
• Micro-controllers
Reconfigurable Hardware
Memory

Reconfigurable Logic
• Full custom chips may be too expensive, software too slow.
• Combine the speed of HW with the flexibility of SW
• HW with programmable functions and interconnect.
• Use of configurable hardware; common form: field programmable gate arrays (FPGAs)
• Applications: bit-oriented algorithms like encryption, fast "object recognition" (medical and military), adapting mobile phones to different standards.
• Very popular devices from XILINX (XILINX Virtex II are very recent devices), Actel and others

Floor-plan of VIRTEX II FPGAs

Virtex II Configurable Logic Block (CLB)

Virtex II Slice (simplified)
Example: Look-up tables
LUT F and G can be used to compute any Boolean function of 4 variables.
a  0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
b  0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
c  0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
d  0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
G  0 1 1 0 1 0 0 1 1 0 0 1 0 1 1 0

Virtex II (Pro) Slice
[© and source: Xilinx Inc.: Virtex-II Pro™ Platform FPGAs: Functional Description, Sept. 2002, //www.xilinx.com]

2 carry paths per CLB (Virtex II Pro)
Enables efficient implementation of adders.
[© and source: Xilinx Inc.: Virtex-II Pro™ Platform FPGAs: Functional Description, Sept. 2002, //www.xilinx.com]

Shift register configuration
Slices can be configured as shift registers

Implementing sums of products
• 16-input AND gate
• Dedicated OR chain for computing sum of products

Number of resources available in Virtex II Pro devices
[© and source: Xilinx Inc.: Virtex-II Pro™ Platform FPGAs: Functional Description, Sept. 2002, //www.xilinx.com]

Embedded Multipliers
A Virtex-II Pro multiplier block is an 18-bit by 18-bit signed multiplier.
Multipliers are connected to a switch matrix and share some bits with RAM (MAC instruction).
Device | Columns | Multipliers
XC2VP2 | 4 | 12
XC2VP4 | 4 | 28
XC2VP7 | 6 | 44
XC2VP20 | 8 | 88
XC2VP30 | 8 | 136
XC2VPX20 | 8 | 88
XC2VP40 | 10 | 192
XC2VP50 | 12 | 232
XC2VP70 | 14 | 328
XC2VPX70 | 14 | 308
XC2VP100 | 16 | 444

Hierarchical Routing Resources
Interconnect

Virtex II Pro devices include up to 4 PowerPC processor cores
[© and source: Xilinx Inc.: Virtex-II Pro™ Platform FPGAs: Functional Description, Sept. 2002, //www.xilinx.com]

Memory for processor cores
Cores are connected to local block RAM that can be used as a scratchpad.
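The look-up table example above (G as a function of a, b, c, d) can be sketched in software: a 4-input LUT is simply a 16-entry truth table addressed by its inputs. The sketch below is only an illustration; the names G_TABLE, lut4 and make_table are hypothetical and not part of any Xilinx tool. It also checks that the G column shown in the table happens to be the parity (XOR) of the four inputs.

```python
from itertools import product

# G column of the truth table above, in row order abcd = 0000 .. 1111.
G_TABLE = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0]


def lut4(table, a, b, c, d):
    """Evaluate a 4-input LUT: the inputs form a 4-bit address into the table."""
    if len(table) != 16:
        raise ValueError("a 4-input LUT needs exactly 16 entries")
    address = (a << 3) | (b << 2) | (c << 1) | d
    return table[address]


def make_table(func):
    """Build the 16-entry table for any Boolean function of 4 variables."""
    return [int(bool(func(a, b, c, d)))
            for a, b, c, d in product((0, 1), repeat=4)]


# The G column is the XOR (parity) of a, b, c and d.
parity_table = make_table(lambda a, b, c, d: a ^ b ^ c ^ d)
```

Because the table has 16 one-bit entries, a single LUT can hold any of the 2^16 possible functions of 4 variables; only the table contents change, not the lookup logic.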
Summary
Processing units
Power efficiency of target technologies
ASICs
Processors
• Energy efficiency
• Code size efficiency and code compaction
• Run-time efficiency
• DSP processors
• Multimedia processors
• Very long instruction word (VLIW) machines
• Micro-controllers
Reconfigurable Hardware
Memory
Covered today