Prelude to Multiprocessing
Detecting CPU and system-board capabilities with CPUID and the MP Configuration Table
CPUID
• Recent Intel processors provide a ‘cpuid’
instruction (opcode 0x0F, 0xA2) to assist
software in detecting a CPU’s capabilities
• If it’s implemented, this instruction can be
executed in any of the processor modes,
and at any privilege level
• But it may not be implemented (e.g., 8086,
80286, 80386)
Pentium EFLAGS register
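  bits 31-22  reserved (0)
  bit 21      ID
  bit 20      VIP
  bit 19      VIF
  bit 18      AC
  bit 17      VM
  bit 16      RF
  bit 15      0
  bit 14      NT
  bits 13-12  IOPL
  bit 11      OF
  bit 10      DF
  bit 9       IF
  bit 8       TF
  bit 7       SF
  bit 6       ZF
  bit 5       0
  bit 4       AF
  bit 3       0
  bit 2       PF
  bit 1       1
  bit 0       CF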
Software can ‘toggle’ the ID-bit (bit #21) in the 32-bit EFLAGS register
if the processor is capable of executing the ‘cpuid’ instruction
But what if there’s no EFLAGS?
• The early Intel processors (8086, 80286)
did not implement 32-bit registers
• The FLAGS register was only 16 bits wide
• So there was no ID-bit that software could
try to ‘toggle’
• How can software be sure that the 32-bit
EFLAGS register exists within the CPU?
Detecting 32-bit processors
• There’s a subtle difference in the way the
logical shift/rotate instructions work when
register CL contains the shift-factor
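• On the 32-bit processors (e.g., 80386+)
the value in CL is truncated to 5 bits, but
not so on the original 8086/8088
• Caution: Intel's manuals state that the 80286
also truncates the count, so this test alone
cannot separate an 80286 from an 80386
• Software can exploit this distinction as a
first step in telling if EFLAGS is implemented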
Detecting EFLAGS
; Here’s a test for the presence of EFLAGS
        mov     ax, #0xFFFF     ; a nonzero value
        mov     cl, #32         ; shift-factor of 32
        shl     ax, cl          ; do logical shift
        or      ax, ax          ; test result in AX
        jnz     is32bit         ; EFLAGS present
        jmp     is16bit         ; EFLAGS absent
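The effect of truncating the shift-factor can be modeled in ordinary C++ without touching real hardware. The sketch below is only an illustration: `model_shl16` and its `masks_count` switch are hypothetical names, and the function imitates the instruction rather than executing it.

```cpp
#include <cassert>
#include <cstdint>

// Model of "shl ax, cl" on a 16-bit register (illustrative only).
// masks_count == true  : the CPU truncates the count in CL to 5 bits
// masks_count == false : the CPU shifts by the full count in CL
constexpr std::uint16_t model_shl16(std::uint16_t ax, std::uint8_t cl, bool masks_count)
{
    unsigned count = masks_count ? (cl & 0x1Fu) : cl;
    // Shifting a 16-bit value left by 16 or more positions leaves zero.
    if (count >= 16) return 0;
    return static_cast<std::uint16_t>(ax << count);
}

// Mirrors the slide's test: a nonzero AX means the count was truncated.
constexpr bool count_was_truncated(bool masks_count)
{
    return model_shl16(0xFFFF, 32, masks_count) != 0;
}

inline void example_shift_test()
{
    assert(count_was_truncated(true));    // 32 & 31 == 0, so AX keeps 0xFFFF
    assert(!count_was_truncated(false));  // a full shift by 32 clears AX
}
```

With a shift-factor of 32 the truncated count is zero, so AX is left unchanged and the `jnz` branch is taken.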
Testing for ID-bit ‘toggle’
; Here’s a test for the presence of the CPUID instruction
        pushfd                  ; copy EFLAGS contents
        pop     eax             ; to accumulator register
        mov     edx, eax        ; save a duplicate image
        xor     eax, 0x00200000 ; toggle the ID-bit (bit 21)
        push    eax             ; copy revised contents
        popfd                   ; back into EFLAGS
        pushfd                  ; copy EFLAGS contents
        pop     eax             ; back into accumulator
        xor     eax, edx        ; do XOR with prior value
        test    eax, 0x00200000 ; did ID-bit get toggled?
        jnz     y_cpuid         ; yes, can execute ‘cpuid’
        jmp     n_cpuid         ; else ‘cpuid’ unimplemented
How does CPUID work?
• Step 1: load value 0 into register EAX
• Step 2: execute ‘cpuid’ instruction
• Step 3: Verify the ‘GenuineIntel’ character-string in registers (EBX, EDX, ECX)
• Step 4: Find maximum CPUID input-value
in the EAX register
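Steps 3 and 4 amount to unpacking four registers. The sketch below assumes the register values have already been captured by some other means (for example, an assembly stub); the `CpuidLeaf0` holder type and the function names are hypothetical, not taken from the course demos.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Register values left behind by 'cpuid' when EAX was 0 on entry.
struct CpuidLeaf0 {
    std::uint32_t eax;  // maximum CPUID input-value
    std::uint32_t ebx;  // vendor string, characters 0..3
    std::uint32_t ecx;  // vendor string, characters 8..11
    std::uint32_t edx;  // vendor string, characters 4..7
};

// Append the four bytes of a register, least-significant byte first.
inline void append_reg(std::string& s, std::uint32_t reg)
{
    for (int i = 0; i < 4; ++i)
        s.push_back(static_cast<char>((reg >> (8 * i)) & 0xFF));
}

// The string is read in the order EBX, EDX, ECX.
inline std::string vendor_string(const CpuidLeaf0& r)
{
    std::string s;
    append_reg(s, r.ebx);
    append_reg(s, r.edx);
    append_reg(s, r.ecx);
    return s;
}

inline bool is_genuine_intel(const CpuidLeaf0& r)
{
    return vendor_string(r) == "GenuineIntel";
}

inline void example_vendor()
{
    // "Genu", "ntel", "ineI" packed little-endian; EAX = 2 is a made-up maximum.
    CpuidLeaf0 r{ 2, 0x756E6547, 0x6C65746E, 0x49656E69 };
    assert(vendor_string(r) == "GenuineIntel");
    assert(is_genuine_intel(r));
}
```

Note the register order: the middle four characters come from EDX, not ECX.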
Version and Features
• load 1 into EAX and execute CPUID
• Processor model and stepping information
is returned in register EAX
  bits 27-20  Extended Family ID
  bits 19-16  Extended Model ID
  bits 13-12  Type
  bits 11-8   Family ID
  bits 7-4    Model
  bits 3-0    Stepping ID
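The fields of EAX can be pulled apart with shifts and masks. This is a minimal sketch of that decoding, using the bit positions shown on this slide; the `CpuSignature` type and the sample value are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Raw fields of the value returned in EAX for CPUID input-value 1.
struct CpuSignature {
    unsigned stepping;    // bits 3-0
    unsigned model;       // bits 7-4
    unsigned family;      // bits 11-8
    unsigned type;        // bits 13-12
    unsigned ext_model;   // bits 19-16
    unsigned ext_family;  // bits 27-20
};

constexpr CpuSignature decode_signature(std::uint32_t eax)
{
    return CpuSignature{
        eax & 0xFu,
        (eax >> 4) & 0xFu,
        (eax >> 8) & 0xFu,
        (eax >> 12) & 0x3u,
        (eax >> 16) & 0xFu,
        (eax >> 20) & 0xFFu };
}

inline void example_signature()
{
    // 0x000306A9 is just a sample input chosen for illustration.
    constexpr CpuSignature s = decode_signature(0x000306A9);
    assert(s.stepping == 9);
    assert(s.model == 0xA);
    assert(s.family == 6);
    assert(s.type == 0);
    assert(s.ext_model == 3);
    assert(s.ext_family == 0);
}
```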
Some Feature Flags in EDX
  bit 28  HTT
  bit 9   APIC
  bit 3   PSE
  bit 1   VME
HTT = HyperThreading Technology (1 = yes, 0 = no)
APIC = Advanced Programmable Interrupt Controller on-chip (1 = yes, 0 = no)
PSE = Page-Size Extensions (1 = yes, 0 = no)
VME = Virtual-8086 Mode Enhancements (1 = yes, 0 = no)
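Testing one of these flags is a single shift-and-mask. The sketch below covers only the four bits named on this slide; the `FeatureBit` enum and `has_feature` are hypothetical names, and the EDX value is assumed to have been captured from 'cpuid' with EAX = 1.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions within EDX, as listed on the slide.
enum FeatureBit : unsigned { VME = 1, PSE = 3, APIC = 9, HTT = 28 };

// True when the given feature bit is set in the EDX image.
constexpr bool has_feature(std::uint32_t edx, FeatureBit bit)
{
    return ((edx >> bit) & 1u) != 0;
}

inline void example_features()
{
    // A made-up EDX image with only the HTT and APIC bits set.
    constexpr std::uint32_t edx = (1u << 28) | (1u << 9);
    assert(has_feature(edx, HTT));
    assert(has_feature(edx, APIC));
    assert(!has_feature(edx, PSE));
    assert(!has_feature(edx, VME));
}
```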
Multiprocessor Specification
• Industry standard allowing OS software to use
multiple processors in a uniform way
• Software searches in three regions of the
physical address-space below 1-megabyte for a
“paragraph-aligned” data-structure of length 16 bytes called the MP Floating Pointer Structure:
– Search in lowest KB of Extended Bios Data Area
– Search in topmost KB of conventional 640K RAM
– Search in the 64KB ROM-BIOS (0xF0000-0xFFFFF)
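The search itself can be sketched as a scan over a byte buffer. Two details below come from the Intel MultiProcessor Specification rather than from these slides, and should be checked there: the structure begins with the signature "_MP_", and all of its bytes sum to zero modulo 256. The function name is hypothetical, and the buffer stands in for a memory region that real code would first have to map.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scan 'region' (assumed to start on a paragraph boundary) at 16-byte
// steps. Returns the offset of a valid structure, or -1 if none is found.
inline long find_mp_floating_pointer(const std::uint8_t* region, std::size_t size)
{
    for (std::size_t off = 0; off + 16 <= size; off += 16) {
        const std::uint8_t* p = region + off;
        if (std::memcmp(p, "_MP_", 4) != 0) continue;   // signature check
        std::size_t bytes = 16u * p[8];                 // length in paragraphs
        if (bytes == 0 || off + bytes > size) continue;
        std::uint8_t sum = 0;
        for (std::size_t i = 0; i < bytes; ++i)
            sum = static_cast<std::uint8_t>(sum + p[i]);
        if (sum == 0) return static_cast<long>(off);    // checksum is good
    }
    return -1;
}

inline void example_mp_search()
{
    std::uint8_t ram[64] = {};          // stand-in for a mapped region
    std::uint8_t* s = ram + 32;         // plant a structure at offset 32
    std::memcpy(s, "_MP_", 4);
    s[8] = 1;                           // length: one paragraph
    s[9] = 4;                           // specification revision
    std::uint8_t sum = 0;
    for (int i = 0; i < 16; ++i) sum = static_cast<std::uint8_t>(sum + s[i]);
    s[10] = static_cast<std::uint8_t>(0u - sum);   // make bytes sum to zero
    assert(find_mp_floating_pointer(ram, sizeof ram) == 32);
    s[10] ^= 0xFF;                      // corrupt the checksum
    assert(find_mp_floating_pointer(ram, sizeof ram) == -1);
}
```

The same function would be called once for each of the three regions, stopping at the first hit.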
MP Floating Pointer Structure
• This structure may contain an ID-number
for one of a small number of standard SMP
system architectures, or may contain the
memory address for a more extensive MP
Configuration Table whose entries specify
a “more customized” system architecture
• Our classroom machines employ the latter
of these two options
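Deciding between the two options takes one byte. The sketch below assumes the field offsets given in the Intel MultiProcessor Specification (they do not appear on these slides): the table's physical address at bytes 4-7, and a feature byte at offset 11 that is zero when a configuration table is present and otherwise holds the ID-number of a default configuration. The type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// The fields of interest from a 16-byte MP Floating Pointer Structure.
struct MpFloatingPointer {
    std::uint32_t config_table_addr;  // physical address of the MP Configuration Table
    std::uint8_t  spec_rev;           // specification revision
    std::uint8_t  default_config;     // 0 = table present, else default-configuration ID
};

inline MpFloatingPointer parse_mp_floating_pointer(const std::uint8_t* p)
{
    MpFloatingPointer f{};
    // Assemble the little-endian address from bytes 4..7.
    f.config_table_addr = static_cast<std::uint32_t>(p[4])
                        | (static_cast<std::uint32_t>(p[5]) << 8)
                        | (static_cast<std::uint32_t>(p[6]) << 16)
                        | (static_cast<std::uint32_t>(p[7]) << 24);
    f.spec_rev = p[9];
    f.default_config = p[11];
    return f;
}

inline bool uses_config_table(const MpFloatingPointer& f)
{
    return f.default_config == 0;
}

inline void example_mp_parse()
{
    // A made-up structure pointing at a table at 0x000F1400.
    std::uint8_t raw[16] = { '_', 'M', 'P', '_', 0x00, 0x14, 0x0F, 0x00, 1, 4 };
    MpFloatingPointer f = parse_mp_floating_pointer(raw);
    assert(f.config_table_addr == 0x000F1400u);
    assert(uses_config_table(f));
    raw[11] = 5;                      // now names a default configuration instead
    assert(!uses_config_table(parse_mp_floating_pointer(raw)));
}
```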
The processor’s Local-APIC
• The purpose of each processor’s APIC is to
allow CPUs in a multiprocessor system to
transmit messages among one another and to
manage the delivery of interrupts from the
various peripheral devices to one or more CPUs
in a carefully controlled way
• The Local-APIC has a variety of registers which
are ‘memory mapped’ to paragraph-aligned
addresses in the 4KB page at 0xFEE00000
Each CPU has its own timer!
• Four of the Local-APIC registers are used
to implement a programmable timer
• It can privately deliver a periodic interrupt
just to its own CPU
– 0xFEE00320: Timer Vector register
– 0xFEE00380: Initial Count register
– 0xFEE00390: Current Count register
– 0xFEE003E0: Divider Configuration register
Timer’s Local Vector Table
Timer Vector register at 0xFEE00320:
  bit 17    MODE
  bit 16    MASK
  bit 12    BUSY
  bits 7-0  Interrupt ID-number
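A value for this register can be composed from the fields shown above. In this sketch the meaning of the MODE bit (1 = periodic, 0 = one-shot) is taken from Intel's Local-APIC documentation rather than from the slide, the helper name is hypothetical, and the BUSY bit is treated as a status bit that software does not set.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t LVT_BUSY = 1u << 12;  // status bit, not written by software here
constexpr std::uint32_t LVT_MASK = 1u << 16;  // 1 = interrupt masked
constexpr std::uint32_t LVT_MODE = 1u << 17;  // 1 = periodic, 0 = one-shot (assumed)

// Compose the 32-bit value to store at 0xFEE00320.
constexpr std::uint32_t make_timer_lvt(std::uint8_t vector, bool periodic, bool masked)
{
    return std::uint32_t{vector}
         | (periodic ? LVT_MODE : 0u)
         | (masked ? LVT_MASK : 0u);
}

inline void example_timer_lvt()
{
    // Interrupt ID-number 0x20 is an arbitrary choice for illustration.
    assert(make_timer_lvt(0x20, true, false) == 0x00020020u);
    assert(make_timer_lvt(0x20, false, true) == 0x00010020u);
    assert((make_timer_lvt(0xFF, true, true) & LVT_BUSY) == 0);
}
```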
In-class exercise
• Run the ‘cpuid.cpp’ Linux application (on
our course website) to see if the CPUs in
our classroom implement HyperThreading
(i.e., multiple logical processors within one physical CPU)
• Then run the ‘smpinfo.cpp’ application, to
see if the MP Base Configuration Table
has entries for more than one processor
• If both results hold true, then we can write
our own multiprocessing software in here!
In-class exercise #2
• Run the ‘apictick.s’ demo (on our website)
to observe the APIC’s periodic interrupt
drawing bytes onto the screen
• It executes for ten milliseconds (the 8042
is used to create this timed delay)
• Try reprogramming the APIC’s Divider
Configuration register, to cut the interrupt
frequency in half (or to double it)
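As a starting point for the last step, the divisor has to be translated into the bit pattern the register expects. The encodings below are recalled from Intel's Local-APIC documentation, not from these slides, so treat them as an assumption to verify; the function name is hypothetical.

```cpp
#include <cassert>

// Value to store in the Divider Configuration register (0xFEE003E0)
// for a given divisor, or -1 if the divisor is not supported.
constexpr int divide_config_value(unsigned divisor)
{
    switch (divisor) {
        case 1:   return 0xB;
        case 2:   return 0x0;
        case 4:   return 0x1;
        case 8:   return 0x2;
        case 16:  return 0x3;
        case 32:  return 0x8;
        case 64:  return 0x9;
        case 128: return 0xA;
        default:  return -1;
    }
}

inline void example_divider()
{
    assert(divide_config_value(2) == 0x0);
    assert(divide_config_value(32) == 0x8);
    assert(divide_config_value(1) == 0xB);
    assert(divide_config_value(3) == -1);   // only powers of two up to 128
}
```

Doubling the divisor halves the rate at which the Current Count register decrements, and so halves the interrupt frequency.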