Microprocessor and Microcontroller Architecture
Embedded Systems — Hadassah College — Spring 2012
Processor Architecture
Dr. Martin Land

Von Neumann Architecture
  Stored-Program Digital Computer
    • Digital computation in ALU
    • Programmable via set of standard instructions
    • Internal storage of data
    • Internal storage of program
    • Automatic Input/Output
    • Automatic sequencing of instruction execution by decoder/controller
  [Block diagram: input, Arithmetic Logic Unit (ALU), output, controller, memory; connected by data/instruction path and control path]
  Von Neumann Architecture
    Data and instructions stored in a single memory unit
  Harvard Architecture
    Data and instructions stored in separate memory units

Memory Hierarchy
  Levels of data / memory storage
  Long Term Storage: All Files and Data
    Data and instructions of "all" programs
    Access by OS call, width = allocation unit
    μP: External    μC: External
  Main Memory (RAM): Running Programs and Data
    "All" data and instructions of running programs
    Access by address, width = byte
    μP: External    μC: Internal
  Cache: Next Few Instructions and Data
    Copy small section of Main Memory for faster access
    Access by address, width = byte
    μP: Internal    μC: none
  Register: Current Data
    Fastest access to small amount of data
    Access by name, width = word
    μP: Internal    μC: Virtual

μP Subsystems
  [Block diagram labels]
  Processor package
  Microprocessor
    register memory
    Arithmetic Logic Unit (ALU)
    processor control
  Bus Controller
  Cache Memory Unit
    cache memory
    cache control
  Main Memory Unit
    main memory
    memory control
  I/O System
    I/O control
    input
    output
    long-term storage
    network

μC Subsystems
  [Block diagram labels]
  Controller package
  Microprocessor
    Arithmetic Logic Unit (ALU)
    processor control
  data memory
  instruction memory
  long-term storage
  I/O devices
    input
    output
  timers

Instruction Set Architecture
  General instruction: instance of data structure
    Operation | Operand | Operand | ... | Operand
  Instruction set
    Range of data structure for
      Operation ∈ {legal actions}
      Operand ∈ {legal addressing modes}
  Machine instruction
    Binary code for processing by hardware
  Assembly instruction
    User-friendly form of machine instruction
    Binary code → words
  Typical instruction
    Machine: 0x82E31F2B
    Assembly: ADD destination, source_1, source_2
    Definition: destination ← source_1 + source_2

General Operations
  Data transfer
    Load (r ← m), store (m ← r), move (r/m ← r/m), convert data types
  Arithmetic/Logical (ALU)
    Integer arithmetic (+ – × ÷ compare shift) and logical (AND, OR, NOR, XOR)
  Decimal
    Integer arithmetic on decimal numbers
  Floating point (FPU)
    Floating point arithmetic (+ – × ÷ sqrt trig exp …)
  String
    String move, string compare, string search
  Control
    Conditional and unconditional branch, call/return, trap
  Operating System
    System calls, virtual memory management instructions
  Graphics
    Pixel operations, compression/decompression operations
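
The Instruction Set Architecture slide describes an instruction as a data structure (an operation plus operands) with a binary machine form and a readable assembly form. The sketch below illustrates that idea only. The 32-bit field layout, the opcode numbers, and the names `Instruction`, `assemble`, `disassemble`, and `execute` are assumptions made for this example; they do not belong to any real ISA and do not explain the sample word 0x82E31F2B.

```python
from typing import NamedTuple

# Hypothetical 32-bit layout, invented for illustration:
#   bits 31..24 opcode | 23..16 destination | 15..8 source_1 | 7..0 source_2
OPCODES = {"ADD": 0x01, "SUB": 0x02}
MNEMONICS = {code: name for name, code in OPCODES.items()}


class Instruction(NamedTuple):
    """Assembly-level view: one operation and three register operands."""
    operation: str
    destination: int
    source_1: int
    source_2: int


def assemble(instr: Instruction) -> int:
    """Pack the assembly form into a 32-bit machine word."""
    for field in (instr.destination, instr.source_1, instr.source_2):
        if not 0 <= field <= 0xFF:
            raise ValueError("register number must fit in 8 bits")
    return (OPCODES[instr.operation] << 24
            | instr.destination << 16
            | instr.source_1 << 8
            | instr.source_2)


def disassemble(word: int) -> Instruction:
    """Unpack a 32-bit machine word back into the assembly form."""
    return Instruction(MNEMONICS[(word >> 24) & 0xFF],
                       (word >> 16) & 0xFF,
                       (word >> 8) & 0xFF,
                       word & 0xFF)


def execute(word: int, regs: list) -> None:
    """Apply the definition destination <- source_1 (op) source_2."""
    instr = disassemble(word)
    a, b = regs[instr.source_1], regs[instr.source_2]
    regs[instr.destination] = a + b if instr.operation == "ADD" else a - b


if __name__ == "__main__":
    word = assemble(Instruction("ADD", 1, 2, 3))   # ADD R1, R2, R3
    regs = [0, 0, 5, 7]
    execute(word, regs)
    print(hex(word), disassemble(word), regs)
```

Real instruction sets choose field widths to fit their register count and addressing modes, so the fixed 8-bit fields here are a simplification.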

Addressing Modes
  Formal specification
    Immediate (IMM)
      Constant = literal = numerical value coded into instruction
    Register operands
      register name = a μP storage location
      REGS[register name] = data stored in register
      REGS[R3] = data stored in register R3 = 11223340
    Memory operands
      address = a memory storage location
      MEM[address] = data stored in memory
      MEM[11223344] = data stored at address 11223344 = 45
    Effective Address (EA): pointer arithmetic
      REGS[R3] ← &(variable)
      MEM[REGS[R3]+4] = *(&(variable)+4) = *(REGS[R3]+4) = *(11223340+4) = 45

General Addressing Modes
  Mode                  Syntax       Memory Access                            Use
  Register              R3           Regs[R3]                                 Register data
  Immediate             #3           3                                        Constant
  Direct (absolute)     (1001)       Mem[1001]                                Static data
  Register deferred     (R1)         Mem[Regs[R1]]                            Pointer
  Displacement          100(R1)      Mem[100+Regs[R1]]                        Local variable
  Indexed               (R1 + R2)    Mem[Regs[R1]+Regs[R2]]                   Array addressing
  Memory indirect       @(R3)        Mem[Mem[Regs[R3]]]                       Pointer to pointer
  Auto Increment        (R2)+        Mem[Regs[R2]]; Regs[R2] ← Regs[R2]+d     Stack access
  Auto Decrement        -(R2)        Regs[R2] ← Regs[R2]-d; Mem[Regs[R2]]     Stack access
  Scaled                100(R2)[R3]  Mem[100+Regs[R2]+Regs[R3]*d]             Indexing arrays
  PC-relative           (PC)         Mem[PC+value]                            Load instruction to data register
  PC-relative deferred  1001(PC)     Mem[PC+Mem[1001]]                        Load instruction to data register
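
The Memory Access column of the table can be read as executable expressions. The sketch below models REGS and MEM as Python dictionaries and writes several of the modes as small functions. The values REGS[R3] = 11223340 and MEM[11223344] = 45 are taken from the Addressing Modes slide; every other register value, memory cell, and function name is an assumption chosen only to make the example runnable.

```python
def make_machine():
    """Return fresh (regs, mem) dictionaries for experimenting."""
    regs = {"R1": 2000, "R2": 8, "R3": 11223340, "R4": 1001, "R5": 2}
    mem = {11223344: 45,   # value from the slide
           1001: 2000,     # the remaining cells are made up for the example
           2000: 7, 2008: 11, 2100: 9, 2108: 13}
    return regs, mem


def direct(mem, address):
    """(1001) -> Mem[1001]"""
    return mem[address]


def register_deferred(regs, mem, r):
    """(R1) -> Mem[Regs[R1]]"""
    return mem[regs[r]]


def displacement(regs, mem, offset, r):
    """100(R1) -> Mem[100 + Regs[R1]]"""
    return mem[offset + regs[r]]


def indexed(regs, mem, r_base, r_index):
    """(R1 + R2) -> Mem[Regs[R1] + Regs[R2]]"""
    return mem[regs[r_base] + regs[r_index]]


def memory_indirect(regs, mem, r):
    """@(R3) -> Mem[Mem[Regs[R3]]]"""
    return mem[mem[regs[r]]]


def auto_increment(regs, mem, r, d):
    """(R2)+ -> read Mem[Regs[R2]], then Regs[R2] <- Regs[R2] + d"""
    value = mem[regs[r]]
    regs[r] += d
    return value


def auto_decrement(regs, mem, r, d):
    """-(R2) -> Regs[R2] <- Regs[R2] - d, then read Mem[Regs[R2]]"""
    regs[r] -= d
    return mem[regs[r]]


def scaled(regs, mem, offset, r_base, r_index, d):
    """100(R2)[R3] -> Mem[100 + Regs[R2] + Regs[R3] * d]"""
    return mem[offset + regs[r_base] + regs[r_index] * d]


if __name__ == "__main__":
    regs, mem = make_machine()
    # Effective-address example from the slide: *(11223340 + 4)
    print(displacement(regs, mem, 4, "R3"))
```

Auto increment followed by auto decrement on the same register returns to the original address, which is why the pair is listed under stack access.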

Read Only Memory (ROM)
  Non-volatile memory
    Permanent configuration data
    Device initialization code (FIRMWARE)
    Basic Input/Output System (BIOS)
  ROM technologies
    OTP (One-Time Programmable)
      Cannot be changed after writing
    EPROM (Erasable Programmable Read-Only Memory)
      Glass window allows ultraviolet light to erase device
    EEPROM (Electrically Erasable Programmable Read-Only Memory)
      Can write electrically without prior erase
    Flash
      Organized as blocks
      Read
        Read any bit in any block
      Write
        0 → 1: Overwrite any bit in any block
        1 → 0: Copy block → delete entire block (write 0) → overwrite 1-bits

General Register Operation Models
  Register-Memory Model
    Operands can be stored in any REGISTER or MEMORY location
    Z = X + Y →
      load R1, X
      add R1, R1, Y
      store Z, R1
    Easier to program
  Register-Register Model
    MEMORY operands must be loaded to a REGISTER
    Also called LOAD-STORE MODEL
    Z = X + Y →
      load R1, X
      load R2, Y
      add R1, R1, R2
      store Z, R1
    Easier to implement in hardware
    Statistics → most loaded operands used more than once

Running Machine Language Program
  Program list
    Instruction 1, Instruction 2, Instruction 3, Instruction 4, Instruction 5, Instruction 6, …
  Instructions in Flat Memory
    [Memory diagram: Instructions 1 to 5 stored consecutively at byte addresses A, A+1, …, A+11, with Instruction 1 at the lowest address]
  μP
    Fetches next instruction in list
    Decodes fetched instruction
    Executes decoded instruction
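
The fetch, decode, execute cycle and the two register operation models can be tried out together in a few lines. The sketch below is a toy interpreter, not a model of any real processor: instructions are Python tuples rather than bytes in memory, and the names `run`, `LOAD_STORE`, and `REGISTER_MEMORY` are assumptions made for this example. It runs both versions of Z = X + Y from the General Register Operation Models slide.

```python
# Z = X + Y in the register-register (load-store) model
LOAD_STORE = [
    ("load", "R1", "X"),
    ("load", "R2", "Y"),
    ("add", "R1", "R1", "R2"),
    ("store", "Z", "R1"),
]

# Z = X + Y in the register-memory model (add may read memory directly)
REGISTER_MEMORY = [
    ("load", "R1", "X"),
    ("add", "R1", "R1", "Y"),
    ("store", "Z", "R1"),
]


def run(program, data):
    """Run a program against a dict of named memory variables.

    Returns (data, number of instructions executed).
    """
    regs = {}
    pc = 0          # program counter: index of the next instruction
    executed = 0

    def value(name):
        # A source operand is a register if loaded, otherwise a memory variable.
        return regs[name] if name in regs else data[name]

    while pc < len(program):
        instruction = program[pc]       # fetch
        pc += 1
        op, *operands = instruction     # decode
        if op == "load":                # execute
            reg, var = operands
            regs[reg] = data[var]
        elif op == "store":
            var, reg = operands
            data[var] = regs[reg]
        elif op == "add":
            dest, src_1, src_2 = operands
            regs[dest] = value(src_1) + value(src_2)
        else:
            raise ValueError(f"unknown operation {op!r}")
        executed += 1
    return data, executed


if __name__ == "__main__":
    print(run(LOAD_STORE, {"X": 2, "Y": 3, "Z": 0}))
    print(run(REGISTER_MEMORY, {"X": 2, "Y": 3, "Z": 0}))
```

Both programs produce the same Z. The register-memory version needs one instruction fewer, while the load-store version keeps every arithmetic instruction free of memory access.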

Modular Programming (Main + Functions)
  Build program from separate source files (modules)
    Each source module edited in a separate file
    Compile / Assemble source files into separate object files
      Object code is machine code with symbolic external references
    Link object files together
      Create executable file
  Easier to design, read and understand programs
    Write most modules in high level language
    Write critical sections in assembly language
    Write, debug, and change modules independently
  [Build flow diagram]
    main.C → compile → main.OBJ
    f1.ASM → assemble → f1.OBJ
    f2.C → compile → f2.OBJ
    f_std.LIB → load
    link → prog.EXE

Computer Design Before 1990
  Limitations
    Memory = expensive
      RAM ~ $5000/MB wholesale in 1977
    Compiler = bad
      Bad error messaging
      Weak optimization
    Efficient code ⇒ write / optimize in assembly language
  Implications
    Complex Instruction Set Computer (CISC)
    Easier assembly programming
      Closer to high level language
    Powerful assembly language
      > 300 instructions
      > 12 addressing modes
      > 10 data types
    Powerful instructions ⇒ fewer instructions ⇒ less memory

CISC Physical Implementation
  Machine Language Instruction
    SUB R1, R2, 100(R3)
  Microcode Instruction Sequence (Microprogram)
    ALU_IN ← R3
    ALU ← 100
    ADD
    MAR ← OUT
    READ
    ALU_IN ← MDR
    ALU ← R2
    SUB
    R1 ← OUT
  [Block diagram labels: ALU Subsystem (IN, OUT, ALU Operation, ALU Result Flag), Registers, Status, Decoder, control, IR, PC (+ Word), MAR (Address), MDR (Data), System Bus, Main Memory]
  PC - program counter
  MAR - memory address register
  IR - instruction register
  MDR - memory data register
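
The microprogram for SUB R1, R2, 100(R3) can be traced step by step in software. The sketch below is a minimal simulation under stated assumptions: the internal latches are modelled as a dictionary, each microinstruction is assumed to take one clock cycle, and SUB is assumed to compute R2 minus the memory operand (by analogy with destination ← source_1 op source_2). The function name `run_sub_microprogram` and the test values are made up for the example.

```python
def run_sub_microprogram(regs, mem):
    """Execute SUB R1, R2, 100(R3) as nine microinstructions.

    regs and mem are dictionaries; returns the number of micro-steps run.
    """
    s = {"ALU_IN": 0, "ALU": 0, "OUT": 0, "MAR": 0, "MDR": 0}

    microprogram = [
        ("ALU_IN <- R3",  lambda: s.update(ALU_IN=regs["R3"])),
        ("ALU <- 100",    lambda: s.update(ALU=100)),
        ("ADD",           lambda: s.update(OUT=s["ALU_IN"] + s["ALU"])),
        ("MAR <- OUT",    lambda: s.update(MAR=s["OUT"])),
        ("READ",          lambda: s.update(MDR=mem[s["MAR"]])),
        ("ALU_IN <- MDR", lambda: s.update(ALU_IN=s["MDR"])),
        ("ALU <- R2",     lambda: s.update(ALU=regs["R2"])),
        # Assumed operand order: register operand minus memory operand.
        ("SUB",           lambda: s.update(OUT=s["ALU"] - s["ALU_IN"])),
        ("R1 <- OUT",     lambda: regs.update(R1=s["OUT"])),
    ]

    for _name, step in microprogram:
        step()
    return len(microprogram)


if __name__ == "__main__":
    regs = {"R1": 0, "R2": 50, "R3": 1000}
    mem = {1100: 8}                      # operand at address 100 + R3
    steps = run_sub_microprogram(regs, mem)
    print(regs["R1"], steps)
```

Counting the steps shows why one machine instruction costs several clock cycles on such a design: every step uses the shared ALU or bus, so the steps run one at a time.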

Run Time and Clock Cycles
  μP is timed by periodic signal called a clock
    [Timing diagram: clock signal with one clock cycle marked]
    Clock Cycle time is measured in seconds per cycle
    Instruction requires 1 or more clock cycles to process
    Clock Rate is cycles per second = Hz (Hertz)
  Run time = (clock cycles to run program) × (seconds per clock cycle)
           = (clock cycles to run program) / (clock cycles per second)
    Higher clock rate ⇒ shorter run time
    More clock cycles (at constant clock rate) ⇒ longer run time
  Clock Cycles Per Instruction (CPI) = lines of microcode

CISC Limitations
  Complex microcode
    Many instruction types ⇒ many microcode sequences
    Complex operations ⇒ complex decoding and sequencing
  Central bus organization
    Permits atomic microcode operations
    System bus ⇒ bottleneck
      Microcode operations execute one-at-a-time
      Machine instructions execute one-at-a-time
    Microcode ⇒ several clock cycles to execute machine instruction
  Memory access
    Instruction length
      Non-uniform
      Depends on operation complexity
      Multiple clock cycles to load instruction

Computer Design Since 1990
  Technological developments
    Price of RAM
      $5000 / MByte (1975) → $5 / MByte (1990) → $0.01 / MByte (2012)
    Compilers
      Powerful + efficient + optimized
  Reduced Instruction Set Computer (RISC)
    Speed up most common operations
      Fewer machine instructions with uniform instruction length (in bytes)
      Ignore performance degradation to other operations
    Simpler hardware design
      No microcode
      No system bus
  All processors today use RISC technology
    Pure RISC (PowerPC, Sparc, MIPS, ARM, Arduino, PIC, …)
    RISC technology for CISC language (Pentium II – 4, Centrino, Core)
    Explicitly parallel RISC (Intel Itanium, IBM mainframes)
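
The run-time relation from the Run Time and Clock Cycles slide is easy to check numerically. The sketch below expands "clock cycles to run program" as instruction count × CPI, which is the usual decomposition but is an addition to what the slide states. The function name and the sample numbers (one million instructions, 10 MHz clock) are assumptions chosen for illustration.

```python
def run_time_seconds(instruction_count, cpi, clock_rate_hz):
    """Run time = clock cycles to run program / clock cycles per second."""
    clock_cycles = instruction_count * cpi
    return clock_cycles / clock_rate_hz


if __name__ == "__main__":
    n = 1_000_000
    # Microcoded instruction averaging 9 cycles versus a 1-cycle instruction,
    # both at 10 MHz.
    print(run_time_seconds(n, cpi=9, clock_rate_hz=10e6))
    print(run_time_seconds(n, cpi=1, clock_rate_hz=10e6))
    # Doubling the clock rate halves the run time.
    print(run_time_seconds(n, cpi=9, clock_rate_hz=20e6))
```

The comparison only holds if both programs need the same number of instructions. A design with simpler instructions usually needs more of them, so lower CPI does not translate one-to-one into shorter run time.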

Typical RISC Instructions
  Class      Instruction           Operands             Example
  Transfer   Load                  Integer / Float      R1 ← [R2 + offset]
  Transfer   Store                 Integer / Float      [R2 + offset] ← R1
  ALU        Arithmetic + – × ÷    Register/Register    R1 ← R2 + R3
                                   Register/Immediate   R1 ← R2 – Imm
  ALU        Logic ∧ ∨ ⊕           Register/Register    R1 ← R2 ∧ R3
                                   Register/Immediate   R1 ← R2 ∨ Imm
  ALU        Test & Set            Register/Register    R1 ← (R2 > R3)
  Float      + – × ÷               Register/Register    F1 ← F2 × F3
  Float      Test & Set            Register/Register    F1 ← (F2 ≠ F3)
  Control    Branch                Immediate            PC ← PC + Imm
  Control    Branch Register       Register             PC ← R1
  Control    Branch & Link         Register/Immediate   R31 ← PC; PC ← PC + Imm
  Control    Conditional Branch    Register/Immediate   PC ← PC + (Imm * (F2 = 0))

Typical RISC Pipeline
  Stage 1  IF   Instruction Fetch
  Stage 2  ID   Instruction Decode
  Stage 3  EX   Execute
  Stage 4  MEM  Data Memory Access
  Stage 5  WB   Write Back
  [Block diagram labels: Instruction Address → Instruction Memory → Instruction; Address → Data Memory → Data]
  clock cycle   1    2    3    4    5    6    7    8
  I1            IF   ID   EX   MEM  WB
  I2                 IF   ID   EX   MEM  WB
  I3                      IF   ID   EX   MEM  WB
  I4                           IF   ID   EX   MEM  WB
  I5                                IF   ID   EX   MEM
  I6                                     IF   ID   EX
  I7                                          IF   ID
  I8                                               IF

Microcontroller (μC) versus Microprocessor (μP)
  Microprocessor (μP) application
    General purpose computer
    Access + process data → data output
    Computational power
      Programming generality
      Execution speed
    Multiple processors for parallel processing
      Each μP handles thread
  Microcontroller (μC) application
    Embedded system
    Control external hardware operations
    Cost efficiency
      Small number of program tasks stored in permanent memory
      Lowest possible cost
    Multiple controllers for concurrent control problems
      Each μC applied to small group of tasks
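
The timing diagram on the Typical RISC Pipeline slide follows a simple rule: instruction i enters stage s in clock cycle i + s (counting from zero). The sketch below regenerates the diagram from that rule and compares cycle counts with and without pipelining. It assumes an ideal pipeline with no stalls or hazards, and the function names are made up for this example.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def schedule(n_instructions, n_cycles):
    """rows[i][c] is the stage of instruction i+1 in clock cycle c+1.

    An empty string means the instruction is not in the pipeline that cycle.
    """
    rows = []
    for i in range(n_instructions):
        row = []
        for c in range(n_cycles):
            stage = c - i   # instruction i starts one cycle after i-1
            row.append(STAGES[stage] if 0 <= stage < len(STAGES) else "")
        rows.append(row)
    return rows


def pipelined_cycles(n_instructions):
    """Cycles to finish n instructions when a new one starts every cycle."""
    return len(STAGES) + n_instructions - 1


def unpipelined_cycles(n_instructions):
    """Cycles if each instruction finishes all stages before the next starts."""
    return len(STAGES) * n_instructions


if __name__ == "__main__":
    for i, row in enumerate(schedule(8, 8), start=1):
        print(f"I{i}", " ".join(f"{stage:<4}" for stage in row))
    print(pipelined_cycles(4), unpipelined_cycles(4))
```

Each instruction still takes five cycles from fetch to write back. The gain comes from throughput: once the pipeline is full, one instruction completes per cycle.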

Embedded-Processor Core
  Programmable Logic Devices (PLD/FPGA)
    Generic programmable integrated circuits
    Large array of digital circuit blocks
    User defines logic blocks
      Truth tables
      Boolean functions
      Karnaugh diagrams
    User defines connections on programmable routing matrix
    Design copied to ASIC for manufacture
      ASIC: application specific integrated circuit
  Embedded-processor core
    μP or μC available on chip as programmable logic block
    Large embedded system designed on single chip
      μP or μC works with other digital system blocks
      System on chip (SoC)

Choosing a Microcontroller: Generic Requirements
  Optimum device for given application
  Device family
    Uniform ISA
    Different hardware resources
  Internal resources
    Interrupts
    Type + number of I/O lines (analog and digital)
    Size of program and data memory
  Space optimization
    Smallest footprint at reasonable cost
  Low power consumption
    Battery powers applications using microcontrollers
    Sleep state while microcontroller idle
  Copy protection
    Stored program protected against user reading

Microcontroller Components
  Microprocessor core
    Typically RISC-type μP
  Memory
    ROM holds program (FIRMWARE)
    RAM / registers for data + configuration
    Usually RAM < ROM
  Timers
    Time internal / external events
    Watchdog: timeout resets system if code loop fails
  Controller I/O
    Interrupt controller: external event grabs processor attention
    Analog ↔ digital converters (A/D and D/A)
    Digital signal processor (DSP)
    Serial ↔ parallel converters (UART)
  Oscillator
    Generates clock signal to synchronize all internal operations
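
The watchdog entry under Timers describes a counter that the running code must clear regularly; if the main loop hangs, the counter reaches its timeout and resets the system. The sketch below simulates that behaviour in software. The class `Watchdog`, the method names `kick` and `tick`, and the tick-based timing are assumptions for illustration; a real microcontroller implements this in hardware, with device-specific registers.

```python
class Watchdog:
    """Counts timer ticks since the last kick; expires at timeout_ticks."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.counter = 0

    def kick(self):
        """Called by healthy code to show that the main loop is still running."""
        self.counter = 0

    def tick(self):
        """Advance one timer tick; return True if the watchdog has expired."""
        self.counter += 1
        return self.counter >= self.timeout_ticks


def first_reset_tick(total_ticks, timeout_ticks, hang_at=None):
    """Return the tick at which the watchdog forces a reset, or None.

    The simulated main loop kicks the watchdog once per tick until it
    hangs at tick hang_at (None means it never hangs).
    """
    wdt = Watchdog(timeout_ticks)
    for t in range(total_ticks):
        if hang_at is None or t < hang_at:
            wdt.kick()
        if wdt.tick():
            return t
    return None


if __name__ == "__main__":
    print(first_reset_tick(20, timeout_ticks=3))             # healthy loop
    print(first_reset_tick(20, timeout_ticks=3, hang_at=5))  # loop hangs
```

The timeout has to be longer than the slowest legitimate pass through the main loop, otherwise the watchdog resets a system that is working correctly.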

Special Purpose Registers
  Instruction register
    Holds executing instruction
  Program counter
    Points to next instruction
  Accumulator
    Associated with ALU operations
    One operand must be in ACC
    Result stored in ACC
  Status register (flags)
    Set configuration
    Results of ALU operations
  Data address register (DAR)
    Stores data memory addresses
  Stack pointer
    Points to last element pushed to stack

Reset
  Initializes microcontroller
    Sets PC to preset value (init address)
    Microcontroller starts executing commands from init address
  Causes of reset
    Power up
      Controller resets at startup
    Manual reset
      Press reset button
    Power-glitch reset
      Detect spike on power supply
    Brown-out reset
      Input voltage drops below threshold
    Watchdog timer (WDT)

Power Consumption
  Low power consumption
    Most microcontroller applications on battery power
  Low power chip technology
    Complementary metal-oxide semiconductor (CMOS)
  Clock frequency
    Power consumed only on logic transition 1 ↔ 0
    Higher clock frequency ⇒ more transitions/second ⇒ more power
  Supply voltage
    Higher supply voltage ⇒ faster + higher power
  Sleep state
    Stop clock ⇒ 0 transitions/second ⇒ no power
    Leave low-power mode by external interrupt or reset
      Key press
      Interrupt
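
The qualitative statements on the Power Consumption slide match the common first-order model of CMOS switching power, P ≈ α · C · V² · f, where C is the switched capacitance, V the supply voltage, f the clock frequency, and α the fraction of gates switching per cycle. This formula is not on the slide; it is brought in here as a standard approximation, and the component values below are invented for illustration. The model ignores leakage current, so "no power" in the sleep state is itself an idealization.

```python
def dynamic_power_watts(c_farads, v_volts, f_hz, activity=1.0):
    """First-order CMOS switching power: activity * C * V^2 * f."""
    return activity * c_farads * v_volts ** 2 * f_hz


def average_power_watts(active_power, sleep_power, active_fraction):
    """Average power for a controller that sleeps between bursts of work."""
    return active_fraction * active_power + (1 - active_fraction) * sleep_power


if __name__ == "__main__":
    # Made-up values: 1 nF switched capacitance, 3 V supply, 8 MHz clock.
    active = dynamic_power_watts(1e-9, 3.0, 8e6)
    print(active)
    # Halving the clock halves the power; halving the voltage quarters it.
    print(dynamic_power_watts(1e-9, 3.0, 4e6) / active)
    print(dynamic_power_watts(1e-9, 1.5, 8e6) / active)
    # Awake 1% of the time with the clock stopped otherwise.
    print(average_power_watts(active, 0.0, 0.01))
```

The last line shows why the sleep state matters for battery-powered designs: average power scales with the fraction of time the clock is running.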

Common Microcontrollers
  Applied Micro (was IBM)
    PowerPC
  Atmel
    AVR family
    AT89 family (Intel 8051 architecture)
    AT91 family (ARM architecture)
  Freescale
    68HC00 family
    DSP56800
    MPC family
  Intel
    MCS-48 (8048 family)
    MCS-51 (8051 family)
    MCS-96 (8096 family)
  Microchip Technology
    PIC families
  National Semiconductor
    COP families
  Sony
    SPC families
  STMicroelectronics
    ST families
  Texas Instruments
    TMS families
  Toshiba
    TLCS families
  Zilog
    eZ8/80/16 families