Peter S. Magnusson, Magnus Christensson, Jesper Eskilson, Daniel Forsgren, Gustav Hallberg, Johan Högberg, Fredrik Larsson, Andreas Moestedt
Presented by Eduardo Cuervo

Simulation is an important step in
  ◦ Research
  ◦ Evaluation
  ◦ Computer design
Simulating only user-level code is not enough
  ◦ Not accurate enough
  ◦ Need for full system simulation (slower)
Simulation must be able to interface with detailed HW models in a timely manner

Scope (what is modeled?)
  ◦ Full system
  ◦ User level
Level of abstraction
  ◦ Functional behavior (what)
  ◦ Timing behavior (when)

Realistic workloads
  ◦ Functional: boot and run an unmodified OS and benchmarks
  ◦ Timing: support hardware engineering
  ◦ Fast enough to run real workloads

Full system simulator
Instruction-set level
Support for multiple architectures
  ◦ SPARC
  ◦ Alpha
  ◦ x86
  ◦ Itanium
  ◦ MIPS
  ◦ ARM
Unmodified OS support
Processor models
Device models
  ◦ Accurate enough for real drivers and firmware
Simics Central
  ◦ Creation of full-scale distributed systems
  ◦ Router
  ◦ Connects multiple Simics instances
      ◦ Multiple nodes of the same architecture per instance
  ◦ Synchronizes instances

Microprocessor design
  ◦ Interaction with memory manager and scheduler
  ◦ Approximate cache and I/O timing models
  ◦ Scalable to full server workloads (TPC-C)
  ◦ OoO support, custom ROB but no pipeline
Memory studies
  ◦ Memory spaces
      ◦ Address spaces
  ◦ Extendable with timing models
OS development
  ◦ Implementation of very specific breakpoints
      ◦ Debugging

Simics Central
  ◦ Synchronizes virtual time
  ◦ Simulation speed = speed of the slowest Simics process
Configuration
  ◦ Object oriented
Command Line Interface (CLI) and scripting
  ◦ Built-in Python runtime environment
  ◦ Scripts tied to events (TLB misses, I/O operations)
Devices
  ◦ Timer, floppy, keyboard, mouse, DMA, interrupt, etc.

    OBJECT cpu0 TYPE x86-hammer {
        freq_mhz: 3500
        physical_memory: phys_mem0
    }
    OBJECT phys_mem0 TYPE memory-space {
        map: ((0xa0000, vga0, 1, 0, 0x20000),
              (0x100000, mem0, 0, 0x100000, 0xff00000),
              ...
    }
    OBJECT con0 TYPE gfx-console {
        queue: cpu0
        x-size: 720
        y-size: 400
        keyboard: kbd0
        mouse: kbd0
    }

    from sim_core import *
    import conf

    # Stop the simulation only when eax > ecx at the breakpoint.
    def break_handler(id):
        if conf.cpu0.eax > conf.cpu0.ecx:
            raise SimExc_Break

    # Execution breakpoint on physical address 0x000f2501.
    id = SIM_breakpoint(conf.phys_mem0, Break_Physical,
                        Break_Execute, 0x000f2501, 1, 0)
    # Call break_handler whenever this breakpoint triggers.
    SIM_hap_register_callback("Core_Breakpoint", break_handler, id)

HDL interface
  ◦ Links Simics to Verilog through a C interface
Simics API
  ◦ Makes Simics extensible
  ◦ Write new device models, commands, routines

Memory
  ◦ Biggest performance challenge
  ◦ Simulator translation cache (STC)
      ◦ Speeds up loads, stores, and fetches
      ◦ Pointers to simulated memory
  ◦ Indexed by virtual address
      ◦ No side effects on hit: no alignment exception, TLB miss, cache miss, or breakpoint
Interpreter cache
  ◦ Hit inlined in the kernel
  ◦ Most complex construct

Two event queues
  ◦ Step queue
      ◦ Triggered by program counter steps
  ◦ Time queue
      ◦ Resolution of a processor clock cycle
Mix of time-driven and event-driven components

Specification language: SimGen
  ◦ Generates all permitted combinations
  ◦ Better interpreter than is practical to write manually
  ◦ Outputs an interpreter in C

    // IA32/x86-64 add to left instruction
    instruction ADD_L({REG}, {REG_OR_MEM})
        pattern op_h2 == 0 && opl == 0 && d == 1 && opm == 0
        syntax "add {REG},{REG_OR_MEM}"
        semantics #{
            ireg_t op1 = {REG};
            ireg_t op2 = {REG_OR_MEM};
            ireg_t dst = op1 + op2;
            EFLAGS_ADD(dst,op1,op2,w,os);
            SET({REG_W}, dst);
        #}
        attributes type = IT_ALU

OS boot workloads
  ◦ Modeled with 7 processor architectures
Scalability shown on Ultra II Enterprise Servers
  ◦ Increasing number of CPUs
Lower performance on OoO versions

IBM's first emulator (7070)
PDP-11
g88
gsim
  ◦ Based on g88
  ◦ Rewritten as the first version of Simics
SimOS
  ◦ MIPS-based processor
  ◦ Similar goals and solutions
  ◦ More general solution
  ◦ Three CPU simulators

Full system simulation is required for realistic workloads
Simics offers a valuable simulation tool for designing and evaluating HW
Support for scripting, networking, and multiple architectures is a great advantage
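
The simulator cache from the Memory slide (the simulator translation cache, STC, in Simics) can be illustrated with a minimal Python sketch. This is not Simics code: the class names, the 4 KiB page size, and the dictionary-based page table are all assumptions made for illustration. The idea shown is only that a hit goes straight to simulated memory, while a miss falls back to a slow path that performs the full translation and can raise exceptions.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class SlowPathMMU:
    """Hypothetical slow path: full translation with all checks."""

    def __init__(self, page_table, memory):
        self.page_table = page_table  # virtual page number -> physical page base
        self.memory = memory          # bytearray of simulated physical memory
        self.slow_lookups = 0         # counts how often the slow path runs

    def translate(self, vaddr):
        self.slow_lookups += 1
        vpn = vaddr // PAGE_SIZE
        if vpn not in self.page_table:
            raise LookupError(f"TLB miss at {vaddr:#x}")
        return self.page_table[vpn]


class TranslationCache:
    """Caches virtual page -> physical page base, indexed by virtual address."""

    def __init__(self, mmu):
        self.mmu = mmu
        self.entries = {}

    def load_byte(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        base = self.entries.get(vpn)
        if base is None:
            # Miss: run the slow path once, then remember the result.
            base = self.mmu.translate(vaddr)
            self.entries[vpn] = base
        # Hit (or freshly filled entry): direct access to simulated memory.
        return self.mmu.memory[base + offset]

    def invalidate(self, vpn):
        """Drop an entry, e.g. when a breakpoint is placed on that page."""
        self.entries.pop(vpn, None)
```

Pages that need side effects (for example, one with a breakpoint on it) would be kept out of such a cache or invalidated, so that every access to them takes the slow path.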
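
The two event queues (step queue and time queue) can likewise be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the Simics implementation: `EventQueue`, `Processor`, and the fixed `cycles_per_step` ratio are hypothetical names and simplifications. Events on the step queue fire when a given number of executed instructions is reached; events on the time queue fire when a given clock cycle is reached.

```python
import heapq
import itertools


class EventQueue:
    """Min-heap of (when, seq, callback); seq preserves posting order on ties."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()

    def post(self, when, callback):
        heapq.heappush(self._heap, (when, next(self._seq), callback))

    def run_due(self, now):
        # Fire every event whose deadline has been reached.
        while self._heap and self._heap[0][0] <= now:
            _, _, callback = heapq.heappop(self._heap)
            callback(now)


class Processor:
    """Toy processor: each step costs a fixed number of cycles (an assumption)."""

    def __init__(self, cycles_per_step=1):
        self.steps = 0
        self.cycles = 0
        self.cycles_per_step = cycles_per_step
        self.step_queue = EventQueue()  # keyed by executed steps
        self.time_queue = EventQueue()  # keyed by clock cycles

    def run(self, n_steps):
        for _ in range(n_steps):
            self.steps += 1
            self.cycles += self.cycles_per_step
            self.step_queue.run_due(self.steps)
            self.time_queue.run_due(self.cycles)
```

For example, with `cycles_per_step=2`, an event posted on the time queue for cycle 4 fires during step 2, before an event posted on the step queue for step 3.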