Tutorial 4
System Level Design: An Industrial Perspective

Moderator:
Guido Stehr, Infineon Technologies

Speakers:
Laurent Maillet-Contoz, ST Microelectronics
Guido Stehr, Infineon Technologies
Sören Sonntag, Lantiq
Drew Taussig, Synopsys
Sylvian Kaiser, DOCEA Power
Enno Wein, ProximusDA

Cell phones off. No photography.

Concept of this Tutorial
“System Level Design”
– Covers a wide range of problems and solution techniques
– Young discipline: established but still evolving
Goal:
– Give you an idea of how varied this discipline is
– Show what has arrived in industrial practice

Contributions
Overview presentation:
– Guido Stehr, Infineon Technologies: Transaction Level Modeling in Practice: Motivation and Introduction
Focus presentations:
1 – Laurent Maillet-Contoz, ST Microelectronics: Standards for System Level Design
2 – Sören Sonntag, Lantiq: Design Space Exploration and Performance Evaluation at Electronic System Level for NoC-based MP-SoC
3 – Sylvian Kaiser, DOCEA Power: ESL Solutions for Low Power Design
(coffee break)
4 – Enno Wein, ProximusDA: HW / SW Co-Design of Parallel Systems
5 – Drew Taussig, Synopsys: Application Specific Processor Design

Transaction Level Modeling in Practice: Motivation and Introduction
Nov. 9th, 2010
Dr. Guido Stehr, Dr.
Josef Eckmüller
Infineon Technologies AG

Outline
Transaction Level Modeling (TLM): Motivation
– Historic and emerging trends in electronic system design
– SystemC as language for system design in TLM style
TLM: Introduction
– Modeling basics
  – Communication
  – Computation
– Consistency between TLM and RTL
– Applications
Conclusions

Historic Trends in System Design
Increasing design complexity
– Technological progress
  – Transistor count doubles every two years
– Chip-level integration
  – Combination of formerly separate chips
– Chip business → platform business
  – Vendor offers entire chip sets including software
Changing design styles
– Shift in HW/SW partitioning (focus presentation 5)
  – Custom HW → programmable cores (standard/custom) + SW
– Increasing importance of IP
  – Focus of in-house development on differentiators

Emerging Trends in System Design
Introduction of multi-core architectures (focus presentation 4)
– Heterogeneous
  – Controller + DSPs
– Homogeneous
  – Software Defined Radio with massive parallelism (> 20 cores)
Networks on chip (focus presentation 2)
– Increasing throughput requirements
  – Buses → crossbars → networks on chip (NoC)
Advanced power management (focus presentation 3)
– Battery life is key
  – Problem: increasing demand for flexibility and processing power

SystemC as Modeling Language
Challenges:
– Tight interaction between HW and SW development
– Large systems to be modeled
  – Abstraction level beyond RTL required
– Our solution: SystemC (C++ library)
  – Suitable for HW and SW development
    • Naturally inherits SW aspects
    • Adds features for HW modeling (parallelism, HW signals, etc.)
  – Applicable to large systems
    • Supports abstraction well due to the power of C++
  – Includes simulation kernel
    • Combination of model and simulator in one executable

Communication: Abstraction
Basic idea of transaction level modeling (TLM):
– Hide details of communication protocols
– Represent communication transactions by function calls
More general: hide HW implementation details
[Figure: a bus at RT level (signal protocol with valid, addr and data signals between initiator and target) next to a bus at transaction level (function calls write(addr, data) and read(addr, data) between initiator and target, e.g. TLM 2). HW signals such as RX/TX to custom HW are still possible.]
“Transaction level” does not define a precise level of abstraction

Communication: Timing
Programmer’s View (PV)
– Non-blocking transaction function calls
  – Focus: functionally correct sequence of function calls
  – Time not modeled explicitly
Programmer’s View with annotated Timing (PVT)
– Non-blocking transaction function calls
  – Transaction target returns delay time
  – Initiator lets simulation time advance by the given amount
Cycle Callable (CC)
– Blocking transaction function calls
  – Alignment with a periodic clock signal
[Figure: Communication timing examples. PV: the initiator issues write(addr, dat) calls to the target with no notion of time. PVT: a write(addr, dat, d) call issued at t0 yields the delay d, and the initiator continues at t0 + d. CC: write(addr, dat) calls are aligned with a clock of period Tclk.]

Efficiency vs. Accuracy
[Figure: efficiency plotted against accuracy, with the points PV, TLM 2 LT, PVT, TLM 2 AT and CC, listed from most efficient to most accurate.]

Computation: Timing
Clocked
– Periodic clock event providing temporal pattern
– Units triggered each cycle
Event-driven
– Units react to events (from transactions, signal changes, etc.)
– Events scheduled on demand for certain points in time
[Figure: Computation timing example, a timer that toggles the signal rdy. Clocked: on each clock event (clock period T) the counter is evaluated (cnt = 3, 2, 1 > 0) until cnt == 0, when rdy toggles. Event-driven: a transaction event sets cnt = 3 (!= 0) and a single timer event is scheduled 3T later, at which rdy toggles.]

When to Apply What Modeling Style?
Combinations of modeling styles
[Figure: matrix of communication styles (PV, PVT, CC) against computation styles (event-driven, clocked).]
Modeling style depends on required level of detail
– HW with tight feedback loops → CC
– Hardware-dependent SW → PVT, possibly with selected HW blocks in CC fashion
  – Synchronization wrappers PVT ↔ CC
– SW → PVT with simplified HW models

Stream-Driven Models
Stream-driven paradigm (Kahn Process Networks, KPNs) popular for data flow modeling
– Process: triggered by availability of data
– No notion of time!
[Figure: KPN between input and output, with processes P1, P2 and P3 connected by infinite FIFOs.]

Embedding KPN in TLM
TL wrapper adds timing to untimed KPN
[Figure: TL wrapper around a KPN, with data-in FIFO, data-out FIFO, transaction interface, clock, ready signal and a timer (t = t0 + …).]

Consistency Between TLM and RTL
Ensure consistency TL model ↔ RTL implementation
– Code generation from spec
  – Interface (registers, memories, ports)
    • TLM / RTL: stub models
    • SW: register / memory access functions
  – Function
    • State machines: UML → TLM / RTL
– High-Level Synthesis
  – Requirements on input model: clocked, wire accurate
– Assertions
  – Ensure essential properties
– Common testbench

Common Testbench: RTL in TLM
[Figure: TLM bus model with a TLM initiator; the RTL targets target_1 and target_2 are attached through a transactor (transactions to signals) and a type translator (signals to signals).]
• Transactions ↔ signals: transactors (FSMs)
• Incompatible signal types: type translators (no internal states)
• Compatible signal types: direct connect
RTL in TLM: test RTL implementation in system context
TLM in RTL: test TL model in legacy RTL testbenches

Applications
TLM up to
100x faster than RTL
Simulate entire system
– Check correct system functionality
  – Assertions
  – Signal traces
  – Event logs
  – Program traces
– Quantitative analyses
  – Inspect statistics collected by modeled units
    • CPU cores
    • Buses
    • Custom HW
– SW development / debugging
  – Target SW debugger

Check of Transmitter Performance
HW focus:
– PVT/CC TL model: bit and cycle true
[Figure: signal chain with symbol generator, DSP block 1, DSP block 2, … DSP block N (each with control/state), a CPU and an FSM, followed by timing analysis and signal analysis.]
Typical performance simulation
– 1 GSM frame (approx. 3.5 ms)
  – RTL: 30 min
  – TLM: 35 s
  – Speedup: 50x

Core Selection
Uncached core with expensive on-chip memory vs. cached core with cheap off-chip memory
Simulate UMTS data transmission on both architectures
[Figure: required clock frequency in MHz over slots for the cached and the uncached core.]
Cached core required 20% faster clock

Port of 3.5G Signal Processing SW
Challenge:
– General purpose core replaced by dedicated DSP
Efforts:
– Update system simulation model: 1 day
– Port software to DSP: 1 month
Benefit:
– Software available several months before silicon

Conclusions
Transaction level modeling (TLM):
– Enables simulation of today’s complex electronic systems
– Common ground for SW and HW development
– More an art than a science
  – Leaves and requires a great deal of modeling flexibility
– Has become indispensable in Infineon’s design flow
Thank you!

Extra Material
– Cache trade-off analysis
– Performance analysis
– Core models
– SW success stories

Cache Trade-off Analysis
Optimal cache size?
Modify cache size and re-simulate testcases
[Figure: CPU load in MHz (0 to 30) over cache size in kB (0, 4, 8, 16, 32).]
Cache larger than 4 kB not justified here

Performance Analysis
Question: Enough performance for 3.5G data transmission?
Scenario: Run critical testcases, analyze core statistics
[Figure: CPU load in % (0 to 50) over uplink data rate in Mb/s (0 to 6), for downlink data rates of 10.7, 5.4, 2.7 and 1.3 Mb/s.]
Result: Plenty of headroom available

Core Models
CPU models essential for core-based designs
– Instruction set simulator (ISS)
  – Emulates behavior of target core on host CPU
  – Accurate
  – Not very fast
– Performance optimized model
  – Maps instructions of target core to native functions of host CPU
  – Fast execution
  – High costs
– CPU interface model
  – Covers HW interface of target core (registers, interrupts, reset)
  – SW code is compiled for native CPU
    • Before compilation: HW accesses are mapped to transactions
  – Fast, cheap
  – Limited timing accuracy

SW Success Stories
3.5G signal processing
– Challenge: General purpose core replaced by dedicated DSP
– Efforts:
  – Update system simulation model: 1 day
  – Port software to DSP: 1 month
– Benefit: Software available several months before silicon
3.5G control code
– Challenge: Verification: power-up, boot phase, host/device communication
– Required model quality:
  – Full functional coverage
  – Simulation speed > 1/20 HW speed
– Benefit: Control code debugged prior to silicon availability
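
Sketch: communication timing styles
The PV, PVT and CC styles from the "Communication: Timing" slides can be illustrated with a toy model. This is a minimal sketch in plain Python, not SystemC and not the TLM 2 API. The Memory target, the Initiator class, the 10 ns access delay and the 4 ns clock period are hypothetical and chosen only for illustration, and CC is simplified to "round the completion time up to the next clock edge".

```python
"""Toy illustration of PV, PVT and CC communication timing (not SystemC)."""

ACCESS_DELAY_NS = 10   # assumed target latency per write, illustrative only
CLOCK_PERIOD_NS = 4    # assumed clock period Tclk, illustrative only


class Memory:
    """Hypothetical transaction target: a write is a single function call."""

    def __init__(self):
        self.cells = {}

    def write(self, addr, data):
        # Store the data and return the delay d that the initiator
        # may account for (only the PVT and CC styles use it).
        self.cells[addr] = data
        return ACCESS_DELAY_NS


class Initiator:
    """Hypothetical initiator that tracks its own view of simulation time."""

    def __init__(self, target):
        self.target = target
        self.now_ns = 0

    def write_pv(self, addr, data):
        # PV: only the order of the calls matters; time is not modeled.
        self.target.write(addr, data)

    def write_pvt(self, addr, data):
        # PVT: advance time by the delay returned by the target.
        self.now_ns += self.target.write(addr, data)

    def write_cc(self, addr, data):
        # CC (simplified): completion is aligned with the next clock edge.
        done = self.now_ns + self.target.write(addr, data)
        cycles = -(-done // CLOCK_PERIOD_NS)   # ceiling division
        self.now_ns = cycles * CLOCK_PERIOD_NS


def run(style, n_writes=3):
    """Issue n_writes writes in the given style; return (memory, end time)."""
    mem = Memory()
    ini = Initiator(mem)
    write = getattr(ini, "write_" + style)
    for i in range(n_writes):
        write(0x100 + i, i)
    return mem.cells, ini.now_ns
```

All three styles leave the same data in memory. Only the modeled end time differs: run("pv") ends at 0 ns, run("pvt") at 30 ns and run("cc") at 36 ns, because each CC write is rounded up to a multiple of the 4 ns clock period.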
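
Sketch: clocked vs. event-driven timer
The timer from the "Computation: Timing" example can be modeled in both styles to show why event-driven models tend to simulate faster. Again this is a minimal sketch in plain Python under stated assumptions: the function names, the 5 ns clock period and the use of a heap as event queue are illustrative and not taken from the slides.

```python
"""Toy timer in clocked and in event-driven style (not SystemC)."""
import heapq

T_NS = 5  # assumed clock period T, illustrative only


def clocked_timer(start_count):
    """Evaluate the timer on every clock event until cnt reaches 0.

    Returns (time at which rdy toggles, number of timer evaluations).
    """
    cnt, now, evaluations = start_count, 0, 0
    while True:
        evaluations += 1          # the unit is triggered each cycle
        if cnt == 0:
            return now, evaluations
        cnt -= 1
        now += T_NS               # wait for the next clock event


def event_driven_timer(start_count):
    """Schedule one timer event start_count * T after the transaction.

    Returns (time at which rdy toggles, number of events processed).
    """
    queue = []                    # entries are (time, event name)
    heapq.heappush(queue, (0, "transaction"))
    evaluations = 0
    while queue:
        now, event = heapq.heappop(queue)
        evaluations += 1
        if event == "transaction":
            # The transaction loads the counter: schedule a single
            # timer event instead of counting clock cycles.
            heapq.heappush(queue, (now + start_count * T_NS, "timer"))
        else:
            return now, evaluations
```

For a count of 3, both variants toggle rdy at 3T = 15 ns, but clocked_timer(3) needs four evaluations while event_driven_timer(3) processes two events. The gap grows linearly with the count, which is the efficiency argument behind the event-driven style.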