Freescale, the Freescale logo, AltiVec, C-5, CodeTEST, CodeWarrior, ColdFire, ColdFire+, C-Ware, the Energy Efficient Solutions logo, Kinetis, mobileGT, PowerQUICC, Processor Expert, QorIQ, Qorivva, StarCore, Symphony and VortiQa are trademarks of Freescale Semiconductor, Inc., Reg. U.S. Pat. & Tm. Off. Airfast, BeeKit, BeeStack, CoreNet, Flexis, Layerscape, MagniV, MXC, Platform in a Package, QorIQ Qonverge, QUICC Engine, Ready Play, SafeAssure, the SafeAssure logo, SMARTMOS, TurboLink, Vybrid and Xtrinsic are trademarks of Freescale Semiconductor, Inc. All other product or service names are the property of their respective owners. The Power Architecture and Power.org word marks and the Power and Power.org logos and related marks are trademarks and service marks licensed by Power.org. © 2013 Freescale Semiconductor, Inc.

• Freescale develops many different types of cores:
  − Microprocessors, DSPs, highly-embedded RISC cores.
• A variety of models/tools are needed over a project’s life-cycle:
  − Early architectural exploration.
  − Assemblers, compilers, documentation.
  − Verification models.
  − Virtual platform models for early software development.
• All must be consistent, even as the design changes!
• Little need for RTL generation:
  − These are highly optimized cores, so RTL will be done by hand.
• Need to separate architecture and micro-architecture:
  − Same functional behavior, but the same architecture may have multiple implementations.
• Need a high degree of re-use across projects and a clean way to handle minor architectural differences.

• Freescale ADL consists of two languages:
  − A language for describing a core's architecture: ADL
  − A language for describing a core’s micro-architecture: μADL
• The system is a collection of generators.
  − A single description can be used to create a consistent set of tools.
• A single executable specification may be used for a variety of purposes:
  − Early architectural exploration
  − Verification
  − Software enablement, early software development.
• In use since 2007.
• It is an open-source project!
  − http://opensource.freescale.com/fsl-oss-projects/

Advantages of having a Freescale language:
  − Modifiable in-house.
  − Keeps our IP in a format that we control.
  − Easily interact with external research groups or vendors.
  − Distribute models to customers without requiring additional licenses.

Why not use an existing language?
• Most vendor tools were developed for modelling ASPs and DSPs:
  • Cannot directly model the many complex elements of our microprocessors, e.g. MMUs, closely-coupled caches, etc.
  • Difficulty handling complex microarchitectures, e.g. out-of-order.
  • Often combine architecture and microarchitecture descriptions.

[Figure: design space plotted as uArch Complexity versus Arch Complexity. Labeled points: "Power ISA: Superscalar, out-of-order" and "Power ISA + extensions: Single-issue, in-order".]

Generated Tools
• A single specification can be used throughout the lifetime of a project.
• As opposed to:
  − Lots of individual special-purpose models and tools.
  − Each must be individually verified.
  − This represents a tremendous amount of duplication, both in terms of time and information.

[Figure: an ADL Description at the center of the product life-cycle. Phases shown: product definition and trade-off analysis; pre-silicon verification; post-silicon verification; customer support. Artifacts shown: prototype models; arch. and impl. specs; dev tools (assemblers, compilers, debuggers); random testcase generator; functional testbench; bring-up tools (silicon debuggers, etc.); production-quality models and dev. tools.]

• ADL is a declarative language for describing the resources contained within a microprocessor core.
  − A description can be decomposed into a series of architecture blocks.
  − These architecture blocks can be combined together to form a core.
  − Architecture blocks can be re-used across different cores.
• Some of the resources we model:
  − Registers and register-files
  − MMUs
  − Memory hierarchy: cache hierarchies and local memories
  − Exceptions
  − Instructions
  − Inter-processor communication facilities
• Implementation-specific behaviour is described using blocks of a subset of C++.
  − Verilog-like bit-vector manipulation is provided via a template-based bit-vector class.
  − Helper functions may be specified directly in the design in order to re-use commonly performed actions. External libraries may also be used.

An Example Instruction Declaration:

    define (instr=addi) {
      """
      The sum (GPR[rA]|0) + SI is placed into GPR[RT].
      """;
      fields=(OPCD(14),RT,RA,SI);
      syntax = ("%i %f,%f,%f",RT,RA,SI);
      action = {
        var b = signExtend(SI,regSize);
        if (RA == 0) {
          GPR(RT) = b;
        } else {
          GPR(RT) = GPR(RA) + b;
        }
      };
    }

  − Documentation string: the triple-quoted text at the top of the block.
  − Instruction encoding: the fields entry. These fields are defined elsewhere in the description.
    • OPCD(14) means that the OPCD field has a value of 14.
    • RT, RA, SI are operand fields.
  − Instruction semantics: the action block.

An Example Register File Declaration:

    define (regfile=GPR) {
      """
      General purpose registers.
      """;
      size = 32;
      prefix = r;
    }

Initial architecture definition: registers and a sparse register file collection.

    define (arch=A) {
      define (reg=LR) {}
      …
      define (regfile=SPR) {
        size=1024;
        define (entry=8) { reg = LR; }
        define (entry=9) { reg = CTR; }
        define (entry=1) { reg = XER; }
        define (entry=50) { reg = HID0; }
        define (entry=51) { reg = HID1; }
      }
    }

Partial architecture modifies the register file, removing one item and overwriting another.

    define(arch=B) {
      define (reg=FOO) {}
      defmod (regfile=SPR) {
        define (entry=50) { remove = true; }
        define (entry=51) { reg = FOO; }
      }
    }

    define (core=C) {
      archs = (A,B);
    }

define: Create or overwrite existing entity.
defmod: Modify existing entity, overwriting specified keys.

This allows for fine-grained control of the final architecture, avoiding the confusion of lots of #ifdefs. The final core definition (core=C) combines these two architectures.

[Figure: operation state machine over pipeline stages (EX, MM, WB) and resources (GPR), with transition actions such as Trg.allocate() and Write_ops().]

• A formal model: operation state machine and resources.
• The language: declarative, concise, hides ISA details.

    define (instr_class=sfx) {
      instructions = ( add, addi );
      define (operands) { … };
      action = {
        S_ID: if (Src.can_read()…){ Src.read(); …; }
        …
      }
    }

Operands hide differences between instructions:

    define (instr_class=sfx) {
      instructions = ( add, addi );
      define (operands) {
        Src1 = GPR(RA);
        Src2 = GPR(RB);
        Trg = GPR(RT);
      };
    }

  − addi does not have Src2, so Src2 is replaced with a dummy operand.
  − During template instantiation, the compiler folds away the code for the dummy operand.

[Figure: ADL at the center, linking the Architecture Team, Compiler Team, Verification, RTL Design, and Documentation Team. Artifacts and uses shown: assembler/disassembler, simulator, XML database, architectural exploration, performance analysis, pipeline model.]

Summary:
  − Open-source architecture description language.
  − Applied to a variety of projects: Power ISA, DSPs, 16-bit micro-controllers, etc.
  − Applied across a range of phases of various designs’ life-cycles, across a variety of groups within Freescale.
  − Separates the architecture (programmer’s view) and micro-architecture.
  − High-level constructs such as translation units and caches allow a high degree of re-use across a given architecture.

Current and Future Development:
  − Better integration with SystemC and other simulation environments.
  − Parallel simulation: MIT Graphite; multi-threaded simulation kernel via Boost threads.
  − Improvements for dynamic binary translation (FastISS using LLVM).
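The define/defmod composition from the architecture-block example (arch A, arch B, core C) can be sketched in a few lines of Python. This is only an illustration of the merge semantics as the slides describe them, not the ADL implementation: the dictionary layout and the helper names `define`, `defmod`, and `build_core` are assumptions made for this sketch.

```python
import copy


def define(core, kind, name, body):
    """Create an entity, or overwrite an existing one wholesale."""
    core.setdefault(kind, {})[name] = copy.deepcopy(body)


def defmod(core, kind, name, changes):
    """Modify an existing entity, overwriting only the keys that are given."""
    entity = core[kind][name]
    for key, value in changes.items():
        if key == "entries":
            # Nested entries are merged one by one; "remove" deletes an entry.
            for index, entry in value.items():
                if entry.get("remove"):
                    entity["entries"].pop(index, None)
                else:
                    entity["entries"][index] = dict(entry)
        else:
            entity[key] = value


def build_core():
    """Apply arch A, then arch B, as 'archs = (A,B)' suggests (hypothetical model)."""
    core = {}
    # Arch A: initial sparse register file.
    define(core, "regfile", "SPR", {
        "size": 1024,
        "entries": {
            8: {"reg": "LR"}, 9: {"reg": "CTR"}, 1: {"reg": "XER"},
            50: {"reg": "HID0"}, 51: {"reg": "HID1"},
        },
    })
    # Arch B: remove entry 50, overwrite entry 51.
    defmod(core, "regfile", "SPR", {
        "entries": {50: {"remove": True}, 51: {"reg": "FOO"}},
    })
    return core
```

In this sketch, calling `build_core()` yields an SPR register file that keeps its size of 1024, has no entry 50, and maps entry 51 to FOO, which matches the behaviour the slide attributes to the combined core.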
[Figure: generation flow. Inputs: the ADL Description (programming model: registers, MMU, instructions) and the μADL Description (pipeline description: pipeline stages, register read/write behavior). Generated outputs: interpreted functional simulator; trace-cache fast functional simulator; cycle-accurate simulator; assembler/disassembler; documentation (HTML, PDF); XML and Perl database.]

Roles For Various Generated Tools

[Figure: the generated tools grouped by role. Roles shown: Verification, Architectural Exploration, Software Enablement, Virtual Prototyping, Documentation/Database.]

• ADL can generate different types of models, depending upon the need:
  − Functional:
    • Interpreted models: slow, but very useful for verification (can single-step the design) and fast enough for short workloads (architectural exploration). 1-10 MIPS.
    • Byte-coded: faster than interpreted, fully portable C++. 50-80 MIPS.
    • Dynamic binary translation: uses LLVM. > 100 MIPS. Currently limited to Linux.
    • Parallel simulation kernel using Boost threads.
  − Performance:
    • Safe-mode: ISS coupled to pipeline model. Functionally correct with approximate timing.
    • Normal-mode: transactional ISS tightly coupled to pipeline model. Exposes timing errors as wrong answers, e.g. a forgotten forwarding path.

[Figure: verification flow. Elements shown: DUT, Verification Team, Testbench, several ADL Model instances, Architectural Test-case Generator, Architectural Test-cases (text).]

[Figure: four blocks labeled Core.]

• Better integration with SystemC and other simulation environments:
  − Currently, SystemC TLM2 integration is handled on a per-project basis.
  − Goal is to create a push-button flow for TLM2-compliant models (LT for ADL, AT for µADL).
• Also experimenting with parallel simulation environments:
  − MIT Graphite.
  − Multi-threaded simulation kernel via Boost threads (released with ADL 3.4).
    Future enhancements include multi-threaded tracing and more efficient memory models.
• Improvements for dynamic binary translation:
  − Uses LLVM for translation of basic blocks to native machine code.
  − Further optimizations for code generation.
  − Investigate translation at the trace level, in order to reduce branch penalties.
  − Investigate adding timing to these high-speed models.
• Continue expanding ADL to other architectures, based upon Freescale’s design needs.

• Freescale ADL is an open-source system for describing programmable cores, applied to a variety of projects including Power ISA cores, DSPs, 16-bit micro-controllers, etc.
• Because the language is open (and we develop it), we can add enhancements as needed and easily integrate it into various kinds of simulation environments, e.g. SystemC, distributed simulation environments, etc.
• The separation of architecture and micro-architecture descriptions, plus high-level constructs such as translation units and caches, allows for a high degree of re-use across a given architecture.
• This language and tooling have been successfully applied across many phases of various designs’ life-cycles, with a high degree of re-use, e.g. architecture exploration models enhanced for verification, then adapted for virtual-prototype use.
• It has been successfully deployed to a number of groups within Freescale:
  − Rather than a central modeling team, teams in individual business units do the modeling.
  − This allows for fast turn-around time (they own the model).
  − The central organization supports the infrastructure and adds enhancements as needed.
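The model-types slide notes that a normal-mode performance model exposes timing errors as wrong answers, e.g. a forgotten forwarding path. The toy Python sketch below illustrates that idea only; it is not how μADL models are built. The two-instruction program, the one-instruction write-back delay, and the function names `run_functional` and `run_pipelined` are all assumptions made for this example.

```python
def run_functional(program, regs):
    """Reference ISS: each instruction sees all earlier results."""
    regs = dict(regs)
    for dst, src, imm in program:
        regs[dst] = regs[src] + imm
    return regs


def run_pipelined(program, regs, forwarding):
    """Toy timed model: a result is written back one instruction late.

    Operands are read before the previous result retires, so a dependent
    instruction gets a stale value unless a forwarding path supplies it.
    """
    regs = dict(regs)
    pending = None  # (dst, value) computed but not yet written back
    for dst, src, imm in program:
        if forwarding and pending is not None and pending[0] == src:
            operand = pending[1]   # bypass the register file
        else:
            operand = regs[src]    # may be stale
        result = operand + imm
        if pending is not None:
            regs[pending[0]] = pending[1]  # previous result retires now
        pending = (dst, result)
    if pending is not None:
        regs[pending[0]] = pending[1]
    return regs


# r1 = r0 + 5; r2 = r1 + 1 (the second instruction depends on the first)
program = [("r1", "r0", 5), ("r2", "r1", 1)]
initial = {"r0": 10, "r1": 0, "r2": 0}
```

With these inputs, `run_pipelined(program, initial, forwarding=True)` matches `run_functional(program, initial)` (r2 is 16), while `forwarding=False` leaves r2 at 1. The missing forwarding path shows up as a functional mismatch rather than only as a cycle-count difference.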