Synthesis of Transaction-Level Models to FPGAs
Prof. Jason Cong
Yiping Fan, Guoling Han, Wei Jiang, Zhiru Zhang
VLSI CAD Lab, Computer Science Department
University of California, Los Angeles

Outline
• Transaction-level model (TLM)
  • SystemC TLM
  • Metropolis Meta Model
• Synthesis from TLM
  • RDR/MCAS: our existing architectural synthesis approach
  • xPilot: ongoing synthesis infrastructure for TLM

SystemC Framework
• SystemC history
  • OO system/HW modeling and simulation
  • SystemC under development by CAD vendors/researchers
    • Synopsys
    • Frontier Design
    • CoWare (Belgium)
  • Released to public Sept. ’99
    • Open source distribution @ www.systemc.org
    • Version 2 out July ’01
• Channels and Modules
  • Basic building blocks: Module (class) instances, communicating via channel (class) instances
  • Modules’ functionality coded as concurrent processes
    • Processes communicate via channels or events

Communication Modeling in SystemC
• Primitive channels in SystemC library
  • Ordinary signal (wire) of type <T>
    • Fill in data type T when instantiated
    • Point-to-point or multi-point (1 writer, n readers)
  • Signal bus (arbitrary width)
  • FIFO, for producer/consumer connection
• Pseudo-channels
  • Mutex & semaphore, for interprocess sync
  • Accessed using channel syntax
• Complex “hierarchical” channels composed of primitive channels, processes, modules

Events and Processes
• Events: abstract occurrences used for
  • Process triggering (like VHDL sensitivity list)
  • Channel communication
  • Interprocess synchronization
• Process can call wait() to block on event
• Event occurrence tells simulator to schedule simulation of relevant process
• Process execution
  • Not called directly from your code
  • Triggered for simulation by events on ports, channels, or explicit named events
  • Registered in constructor of enclosing module (associate method with events)
• Thread process → infinite loop
  • Must call wait() to yield control
• Method process → runs to completion
  • Less scheduling overhead

Data Types in SystemC
• SystemC supports
  • Native C/C++ types
  • SystemC types
• SystemC types
  • Data types for system modeling
  • 2-value (’0’, ’1’) logic/logic vector
  • 4-value (’0’, ’1’, ’Z’, ’X’) logic/logic vector
  • Arbitrary sized integer (signed/unsigned)
  • Fixed point types (templated/untemplated)
• Objective: to reflect HW registers & ALU operations

Functional Level and RTL Modeling in SystemC
• Functional level
  • Sequential, algorithmic, software-like
  • Explore HW/SW architectures, proof of algorithms, performance modeling & analysis
• Register transfer level
  • Complete detailed functional description of hardware
    • Every register, bus, bit for every clock cycle
    • Use C++ switch/case for FSM implementation
  • At this point, can switch to HDL, but staying in SystemC leverages test benches
  • Prepare for HW synthesis step by using only synthesizable constructs

Transaction Level Modeling in SystemC
• Transaction level
  • Model includes architectural components
  • Maintain component interface accuracy
    • E.g., buses modeled as channels (read/write operations)
  • Behavioral style inside a component
  • Simulates 100-10,000x faster than RTL
  • Provide execution platform for SW development

TLM – Raise the Level of Architectural Modeling
• What is TLM?
  • Communication uses function calls
    • burst_read(char* buf, int addr, int len);
• Why is TLM interesting?
  • Simulation: fast and compact
  • Integrate HW and SW models
  • Early platform for SW development
  • Early system exploration and verification
  • Verification reuse
  • Synthesis
  • …
Reference: www.systemc.org

Typical Design Flow Using TLM
• Functional model
  • Captures system behaviour
• TLM, Transaction Level Model
  • Bus transactions
  • Accurate interaction with SW portion
  • Simulates rapidly
  • Can create TLM model initially

Introduction to Metropolis
• A UCB and GSRC project, http://www.gigascale.org/metropolis/
• Platform-based design [ASV]
  • Platforms have sufficient flexibility to support a series of applications/products
  • Choose a platform by design space exploration
  • Both of the above require models to be reusable
• Orthogonalization of concerns
  • Computation vs. Communication
  • Behavior vs. Coordination
  • Behavior vs. Architecture
  • Capability vs. Cost

Metropolis Meta Model
• A combination of imperative program and declarative constraints
• Imperative program:
  • objects (process, media, quantity, statemedia)
  • netlist
  • await
  • block and label
  • interface function call
  • quantity annotation
• Declarative constraints
  • Linear Temporal Logic (LTL) (synch)
  • Logic of Constraints (LOC)

A Metropolis Design Tutorial
[Figure: MyFncNetlist containing P1, M, P2, Env1 and Env2]

A Metropolis Design Tutorial
MyMapNetlist:
    B(P1, M.write) <=> B(mP1, mP1.writeCpu);
    E(P1, M.write) <=> E(mP1, mP1.writeCpu);
    B(P1, P1.f) <=> B(mP1, mP1.mapf);
    E(P1, P1.f) <=> E(mP1, mP1.mapf);
    B(P2, M.read) <=> B(mP2, mP2.readCpu);
    E(P2, M.read) <=> E(mP2, mP2.readCpu);
    B(P2, P2.f) <=> B(mP2, mP2.mapf);
    E(P2, P2.f) <=> E(mP2, mP2.mapf);
[Figure: MyMapNetlist relating MyFncNetlist (P1, M, P2, Env1, Env2, with Y2T write() and T2Y read()) to MyArchNetlist (Th,Wk; mP1 … mP2; Cpu, OsSched, Bus, Bus Arbiter, Mem)]

Outlook of the First Metropolis Release
• A design tutorial
• Sample MoC:
  • multi-media (Yapi, TTL)
  • Synchronous
• Sample architectural libraries:
  • coarse-simple cpu, bus, memory, arbiters
  • time quantity
[Figure: Meta model language feeding the Meta model infrastructure (Front end, Abstract syntax trees), which drives Back end 1, Back end 2, Back end 3, …, Back end N; back-end labels: SystemC simulation, LOC checking, SPIN interface, Meta model debugger]
http://www.gigascale.org/metropolis/

TLM Conclusions
• SystemC is the de facto system-level-design standard
  • Pushed by many CAD tool vendors
  • Used widely in industry and academia
    • E.g., Intel handheld system project [ICCAD’04]
  • Unified language to model a system at different levels
  • Improving path to HW synthesis from SystemC source code
  • Fits with trend to take system design to a higher level
• Metropolis is a novel academic framework for models of computation
  • Capable of representing TLM as well
  • Provides a comprehensive starting point for synthesis

Outline
• Transaction-level model (TLM)
  • SystemC TLM
  • Metropolis Meta Model
• Synthesis from TLM
  • xPilot: our ongoing synthesis infrastructure for TLM
  • RDR/MCAS: our existing architectural synthesis approach

xPilot: TLM to RTL Synthesis Flow
[Figure: flow from TLM in SystemC/Metropolis through the Frontend into SSDM; passes operate on SSDM and produce RTL for FPGAs]
• Arch-independent passes
  • Checking
  • Loop unrolling/pipelining
  • Strength reduction/Bitwidth analysis
  • Speculative-execution transformation
  • Memory analysis/allocation
  • …
• Arch-dependent passes
  • Scheduling/Binding/Memory analysis/allocation
  • Register/port binding
  • Traditional/Low power/RDR-pipe or Placement driven
  • …
• Arch-generation passes: RTL/constraints generation
  • Verilog/VHDL/SystemC
  • Altera/Xilinx
  • General/Synopsys/Magma
  • …

Integration of xPilot with Metropolis
[Figure: the Metropolis stack (Meta model language; Meta model infrastructure with Front end and Abstract syntax trees; back ends for SystemC Simulation, LOC Checking, SPIN Interface, …) extended with a Synthesis back end, xPilot/SSDM. Labels on the synthesis side: Compilation for RP (Extended Instruction, Reconfigurable Coprocessor, …); HW Implementation (Reconfigurable Interconnect, Latency Insensitive Design, …, GALS); RTL Handoff; Simulation; HW implementation; RDR/MCAS; RTL, Timing Constraints, Physical Constraints; IP Assembly; Predictable RTL Synthesis; FPGA; ASICS; IP Library; …]

SSDM Zoomed In – CDFG
• 2-level CDFG representation
  • 1st level: control flow graph
  • 2nd level: data flow graph

    if (cond1) bb1();
    else bb2();
    bb3();
    switch (test1) {
    case c1: bb4(); break;
    case c2: bb5(); break;
    case c3: bb6(); break;
    }
    bb7();

[Figure: control flow graph of the code above. cond1 branches on T/F to bb1() and bb2(), which join at bb3(); test1 branches on c1, c2, c3 to bb4(), bb5(), bb6(), which join at bb7()]

SSDM Features Different from Software IR
• Top-level: process netlist of concurrent processes
• Port/interface semantics
  • FIFO: FifoRead() / FifoWrite()
  • BUFF: BuffRead() / BuffWrite()
  • Memory: MemRead() / MemWrite()
• Bit vector manipulation
  • Bit extraction / concatenation / insertion
  • Bit-width property for every value
• Cycle-level notation
  • Scheduling / binding information / delay

Our Architectural Synthesis Approaches – RDR / MCAS
• Consideration of multi-cycle communication during architectural (or behavioral) synthesis
• Regular Distributed Register (RDR) micro-architecture [Cong et al, ISPD’03]
  • Highly regular
  • Direct support of multi-cycle on-chip communication
• MCAS: Architectural Synthesis for Multi-cycle Communication
  • Efficiently maps the behavioral descriptions to RDR uArch
  • Integrates architectural synthesis (e.g. resource binding, scheduling) with physical planning

RDR/MCAS: Support for Heterogeneous Integration with Multicycle Communication & Automatic Interconnect Pipelining
• Distribute registers to each “island”
• Choose the island size such that
  • Single cycle for intra-island computation and communication
  • Multi-cycle communication between islands
• Support interconnect pipelining
  • Inter-island pipeline register station (PRS) for global communications
  • PRS performs autonomous store-and-forward
• MCAS: Multi-cycle architectural synthesis integrated with global placement
• Can also support IP integration using latency insensitive technique [Carloni, ICCAD’99]
• Experimental results
  • MCAS vs. conventional flow:
    • 28.8% reduction in long global wirelength
    • 19.3% reduction in total wirelength
  • MCAS-Pipe vs. MCAS:
    • 36% reduction in clock period
    • 30% reduction in total latency
[Figure: RDR island grid numbered 1 to 4; labels: Pipeline Register Station (PRS), LCC, FSM, Reg. File, H channel, V channel, Adaptor, IP Library]

Synthesis Flow: MCAS-Pipe
SystemC / VHDL
  → CDFG generation (produces the CDFG)
  → Resource allocation & functional unit binding (produces the ICG)
  → Scheduling-driven placement (produces locations)
  → Placement-driven rescheduling & rebinding
  → Global interconnect sharing
  → Register and port binding
  → Datapath & FSM generation
  → RTL VHDL & floorplan constraints
• Global interconnect sharing: enables multiple data communications to share one physical link (a wire with pipeline registers)

Related Publications
• Regular distributed register (RDR) architecture and MCAS synthesis algorithms: ISPD’03, ICCAD’03
• RDR-Pipe and MCAS-Pipe synthesis algorithms: DAC’04
• Lopass: high-level synthesis for low-power FPGAs: ISLPED’03
• Multiplexor optimization through register/port binding: ASPDAC’04
• Bitwidth-aware scheduling and binding algorithms: ASPDAC’05

Conclusions
• Higher level abstraction is needed in the current SO(P)C design flow
  • SystemC is becoming the SLD standard; TLM in particular is widely used
  • Metropolis is a platform-based design framework
• It is time to build a new generation of behavioral synthesis systems from TLM
• xPilot: ongoing project
  • An architectural synthesis infrastructure from TLM to RTL (FPGAs)
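
Supplementary sketch: TLM communication as function calls
The TLM slides describe communication as function calls, with burst_read(char* buf, int addr, int len) as the example and buses modeled as channels with read/write operations. The block below is a minimal sketch of that idea in plain C++, without the SystemC library. Only the burst_read signature comes from the slides; BusIf, SimpleMemory, burst_write, copy_block and transactions() are hypothetical names introduced here for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical transaction-level bus interface: an initiator calls these
// functions instead of driving address and data wires cycle by cycle.
class BusIf {
public:
    virtual ~BusIf() = default;
    virtual void burst_read(char* buf, int addr, int len) = 0;         // signature from the slides
    virtual void burst_write(const char* buf, int addr, int len) = 0;  // assumed counterpart
};

// Hypothetical memory target that implements the interface behaviorally.
class SimpleMemory : public BusIf {
public:
    explicit SimpleMemory(std::size_t size) : data_(size, 0) {}

    void burst_read(char* buf, int addr, int len) override {
        if (!in_range(addr, len)) return;  // this sketch silently ignores bad transfers
        std::memcpy(buf, data_.data() + addr, static_cast<std::size_t>(len));
        ++transactions_;
    }

    void burst_write(const char* buf, int addr, int len) override {
        if (!in_range(addr, len)) return;
        std::memcpy(data_.data() + addr, buf, static_cast<std::size_t>(len));
        ++transactions_;
    }

    // Number of accepted transfers, a stand-in for a performance counter.
    int transactions() const { return transactions_; }

private:
    bool in_range(int addr, int len) const {
        return addr >= 0 && len >= 0 &&
               static_cast<std::size_t>(addr) + static_cast<std::size_t>(len) <= data_.size();
    }

    std::vector<char> data_;
    int transactions_ = 0;
};

// Initiator code written only against the interface: one read burst
// followed by one write burst, with no notion of wires or clocks.
void copy_block(BusIf& bus, int src, int dst, int len) {
    if (len <= 0) return;
    std::vector<char> tmp(static_cast<std::size_t>(len));
    bus.burst_read(tmp.data(), src, len);
    bus.burst_write(tmp.data(), dst, len);
}
```

Because copy_block depends only on BusIf, the same initiator code could be run against a faster or a more detailed bus model, which is one reading of the slides' point that component interfaces stay accurate while the inside of a component stays behavioral.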
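
Supplementary sketch: a two-level CDFG
The SSDM slide describes a 2-level CDFG: a control flow graph at the first level and a data flow graph at the second, with a bit-width property for every value. The block below is a rough sketch of such a structure in plain C++, populated with the if/switch example from that slide. It is not xPilot's actual SSDM data structure; every type and function name (DfgNode, CfgEdge, BasicBlock, Cdfg, build_slide_example, edge_count) is hypothetical, and the small data flow graph placed in the cond1 block is invented purely to show the second level.

```cpp
#include <string>
#include <vector>

// Second level: one data-flow operation inside a basic block.
struct DfgNode {
    std::string op;             // operation name, e.g. "cmp_ne"
    std::vector<int> operands;  // indices of producer nodes in the same block
    int bitwidth;               // bit width of the value this node produces
};

// First level: a control-flow edge, optionally labeled with its condition.
struct CfgEdge {
    int target;         // index of the successor block
    std::string label;  // "T", "F", a case label, or empty
};

struct BasicBlock {
    std::string name;
    std::vector<DfgNode> dfg;    // second-level graph for this block
    std::vector<CfgEdge> succs;  // first-level successors
};

struct Cdfg {
    std::vector<BasicBlock> blocks;

    int add_block(const std::string& name) {
        blocks.push_back(BasicBlock{name, {}, {}});
        return static_cast<int>(blocks.size()) - 1;
    }

    void add_edge(int from, int to, const std::string& label = "") {
        blocks[from].succs.push_back(CfgEdge{to, label});
    }
};

// Builds the control flow of the slide's example:
//   if (cond1) bb1(); else bb2(); bb3();
//   switch (test1) { case c1: bb4(); ... case c3: bb6(); } bb7();
Cdfg build_slide_example() {
    Cdfg g;
    int cond1 = g.add_block("cond1");
    int bb1 = g.add_block("bb1");
    int bb2 = g.add_block("bb2");
    int bb3 = g.add_block("bb3");
    int test1 = g.add_block("test1");
    int bb4 = g.add_block("bb4");
    int bb5 = g.add_block("bb5");
    int bb6 = g.add_block("bb6");
    int bb7 = g.add_block("bb7");

    g.add_edge(cond1, bb1, "T");
    g.add_edge(cond1, bb2, "F");
    g.add_edge(bb1, bb3);
    g.add_edge(bb2, bb3);
    g.add_edge(bb3, test1);
    g.add_edge(test1, bb4, "c1");
    g.add_edge(test1, bb5, "c2");
    g.add_edge(test1, bb6, "c3");
    // Implied by C semantics when no case matches; the slide's figure may not draw it.
    g.add_edge(test1, bb7, "default");
    g.add_edge(bb4, bb7);
    g.add_edge(bb5, bb7);
    g.add_edge(bb6, bb7);

    // Invented second-level content: cond1 compares an 8-bit input with a constant.
    g.blocks[cond1].dfg.push_back(DfgNode{"input", {}, 8});
    g.blocks[cond1].dfg.push_back(DfgNode{"const", {}, 8});
    g.blocks[cond1].dfg.push_back(DfgNode{"cmp_ne", {0, 1}, 1});
    return g;
}

// Counts first-level (control-flow) edges.
int edge_count(const Cdfg& g) {
    int n = 0;
    for (const BasicBlock& b : g.blocks) n += static_cast<int>(b.succs.size());
    return n;
}
```

Keeping the data flow graph inside each basic block means a scheduler can work on one block's operations at a time, while the labeled control-flow edges retain the branch structure needed later for FSM generation.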