Prelude to Multiprocessing

Detecting CPU and system-board capabilities with CPUID and the MP Configuration Table

CPUID
• Recent Intel processors provide a ‘cpuid’ instruction (opcode 0x0F, 0xA2) to assist software in detecting a CPU’s capabilities
• If it’s implemented, this instruction can be executed in any of the processor modes, and at any privilege level
• But it may not be implemented (e.g., 8086, 80286, 80386)

Pentium EFLAGS register
    Bits 31-22: 0 (reserved)
    Bit 21: ID     Bit 20: VIP    Bit 19: VIF    Bit 18: AC
    Bit 17: VM     Bit 16: RF     Bit 15: 0      Bit 14: NT
    Bits 13-12: IOPL              Bit 11: OF     Bit 10: DF
    Bit 9: IF      Bit 8: TF      Bit 7: SF      Bit 6: ZF
    Bit 5: 0       Bit 4: AF      Bit 3: 0       Bit 2: PF
    Bit 1: 1       Bit 0: CF
Software can ‘toggle’ the ID-bit (bit #21) in the 32-bit EFLAGS register if the processor is capable of executing the ‘cpuid’ instruction

But what if there’s no EFLAGS?
• The early Intel processors (8086, 80286) did not implement 32-bit registers
• The FLAGS register was only 16 bits wide
• So there was no ID-bit that software could try to ‘toggle’
• How can software be sure that the 32-bit EFLAGS register exists within the CPU?

Detecting 32-bit processors
• There’s a subtle difference in the way the logical shift/rotate instructions work when register CL contains the shift-factor
• On the 80286 and later processors the value in CL is truncated to 5 bits, but not so on the 8086/8088, which uses the full count
• Software can exploit this distinction as a first step in telling whether EFLAGS is implemented; because the 80286 also truncates the count, a 286 has to be ruled out separately before any 32-bit instructions are used

Detecting EFLAGS
    ; Here’s a first test for the presence of EFLAGS
    mov ax, #0xFFFF      ; a nonzero value
    mov cl, #32          ; shift-factor of 32
    shl ax, cl           ; do logical shift
    or ax, ax            ; test result in AX
    jnz is32bit          ; count was truncated: not an 8086 (EFLAGS present on 80386+)
    jmp is16bit          ; all bits shifted out: 8086/8088, EFLAGS absent

Testing for ID-bit ‘toggle’
    ; Here’s a test for the presence of the CPUID instruction
    pushfd               ; copy EFLAGS contents
    pop eax              ; to accumulator register
    mov edx, eax         ; save a duplicate image
    xor eax, 0x00200000  ; toggle the ID-bit (bit 21)
    push eax             ; copy revised contents
    popfd                ; back into EFLAGS
    pushfd               ; copy EFLAGS contents
    pop eax              ; back into accumulator
    xor eax, edx         ; do XOR with prior value
    test eax, 0x00200000 ; did ID-bit get toggled?
    jnz y_cpuid          ; yes, can execute ‘cpuid’
    jmp n_cpuid          ; else ‘cpuid’ unimplemented

How does CPUID work?
• Step 1: Load value 0 into register EAX
• Step 2: Execute the ‘cpuid’ instruction
• Step 3: Verify the ‘GenuineIntel’ character string in registers (EBX, EDX, ECX)
• Step 4: Find the maximum CPUID input-value in the EAX register

Version and Features
• Load 1 into EAX and execute CPUID
• Processor model and stepping information is returned in register EAX
    Bits 27-20: Extended Family ID
    Bits 19-16: Extended Model ID
    Bits 13-12: Type
    Bits 11-8:  Family ID
    Bits 7-4:   Model
    Bits 3-0:   Stepping ID

Some Feature Flags in EDX
    Bit 28: HTT  = HyperThreading Technology (1 = yes, 0 = no)
    Bit 9:  APIC = Advanced Programmable Interrupt Controller on-chip (1 = yes, 0 = no)
    Bit 3:  PSE  = Page-Size Extensions (1 = yes, 0 = no)
    Bit 1:  VME  = Virtual-8086 Mode Enhancements (1 = yes, 0 = no)

Multiprocessor Specification
• Industry standard allowing OS software to use multiple processors in a uniform way
• Software searches in three regions of the physical address-space below 1 megabyte for a “paragraph-aligned” data-structure of length 16 bytes called the MP Floating Pointer Structure:
  – Search in the lowest KB of the Extended BIOS Data Area
  – Search in the topmost KB of conventional 640K RAM
  – Search in the 64KB ROM-BIOS (0xF0000-0xFFFFF)

MP Floating Pointer Structure
• This structure may contain an ID-number for one of a small number of standard SMP system architectures, or may contain the memory address of a more extensive MP Configuration Table whose entries specify a “more customized” system architecture
• Our classroom machines employ the latter of these two options

The processor’s Local-APIC
• The purpose of each processor’s APIC is to allow the CPUs in a multiprocessor system to transmit messages to one another, and to manage the delivery of interrupts from the various peripheral devices to one or more CPUs in a carefully controlled way
• The Local-APIC has a variety of registers which are ‘memory mapped’
  to paragraph-aligned addresses in the 4KB page at 0xFEE00000

Each CPU has its own timer!
• Four of the Local-APIC registers are used to implement a programmable timer
• It can privately deliver a periodic interrupt just to its own CPU
  – 0xFEE00320: Timer Vector register
  – 0xFEE00380: Initial Count register
  – 0xFEE00390: Current Count register
  – 0xFEE003E0: Divider Configuration register

Timer’s Local Vector Table (register at 0xFEE00320)
    Bit 17:   MODE
    Bit 16:   MASK
    Bit 12:   BUSY
    Bits 7-0: Interrupt ID-number

In-class exercise
• Run the ‘cpuid.cpp’ Linux application (on our course website) to see if the CPUs in our classroom implement HyperThreading (i.e., multiple logical processors within one CPU)
• Then run the ‘smpinfo.cpp’ application, to see if the MP Base Configuration Table has entries for more than one processor
• If both results hold true, then we can write our own multiprocessing software in here!

In-class exercise #2
• Run the ‘apictick.s’ demo (on our website) to observe the APIC’s periodic interrupt drawing bytes onto the screen
• It executes for ten milliseconds (the 8042 is used to create this timed delay)
• Try reprogramming the APIC’s Divider Configuration register, to cut the interrupt frequency in half (or to double it)
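
For the last step of exercise #2, it may help to work out the register values on paper first. The sketch below is not taken from ‘apictick.s’; it is a small C helper with hypothetical names (apic_divider_encode and friends) that only computes bit patterns and never touches the hardware. It assumes the encoding described in Intel's documentation of the timer's divide configuration register: a 3-bit selector stored in bits 3, 1 and 0, with bit 2 left as 0. Check the manual before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the Divider Configuration register, as listed on the timer slide. */
#define APIC_TIMER_DIVIDE_CONFIG 0xFEE003E0u

/*
 * Encode a divisor (1, 2, 4, ..., 128) as a register value.
 * Assumed layout: selector n sits in bits 3, 1 and 0 (bit 2 stays 0);
 * n = 0..6 selects divide-by 2^(n+1), and n = 7 selects divide-by-1.
 * Returns -1 if the divisor is not one of the eight supported values.
 */
int apic_divider_encode(unsigned divisor)
{
    unsigned n;

    if (divisor == 1) {
        n = 7;
    } else {
        unsigned power = 2;
        for (n = 0; n < 7 && power != divisor; n++)
            power <<= 1;
        if (n == 7)
            return -1;              /* not a power of two in 2..128 */
    }
    return (int)(((n & 4u) << 1) | (n & 3u));
}

/* Recover the divisor from a register value (only bits 3, 1, 0 are examined). */
unsigned apic_divider_decode(uint32_t reg)
{
    unsigned n = ((reg >> 1) & 4u) | (reg & 3u);
    return (n == 7) ? 1u : (2u << n);
}

/*
 * Register value that halves the interrupt frequency by doubling the divisor.
 * Returns -1 when the divisor is already 128 and cannot be doubled.
 */
int apic_divider_halve_frequency(uint32_t reg)
{
    return apic_divider_encode(apic_divider_decode(reg) * 2u);
}

/* Round-trip check over all eight supported divisors. */
void apic_divider_selfcheck(void)
{
    for (unsigned d = 1; d <= 128; d <<= 1)
        assert(apic_divider_decode((uint32_t)apic_divider_encode(d)) == d);
}
```

Under these assumptions, a register that currently holds 0x3 (divide by 16) would be rewritten as 0x8 (divide by 32) to halve the tick rate. Doubling the frequency is the same calculation with the divisor halved instead. Actually storing the value at APIC_TIMER_DIVIDE_CONFIG has to be done by privileged code with that page mapped, as in the demo.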