80386 Summary

80386DX
Features
• It supports 8-, 16- and 32-bit data operands
• It has 32-bit internal registers
• It supports a 32-bit data bus and a 32-bit non-multiplexed address bus
• It supports
– Physical Address of 4GB
– Virtual Address of 64TB
– Maximum Segment size of 4GB
• It operates in 3 different modes
– Real
– Protected
– Virtual 8086
• The MMU provides virtual memory, paging and 4 levels of protection
• Clock Frequency: 20, 25 and 33MHz
• It has a 132-pin package
Architecture of 80386
UQ: Draw the block diagram of the 80386 DX Processor and explain each block in brief
The internal architecture of 80386 is divided into three sections:
1. Central Processing Unit
2. Memory Management Unit
3. Bus Interface unit
Central Processing Unit
• The CPU is further divided into an Instruction Unit and an Execution Unit
• Instruction Unit:
– It decodes the opcode bytes received from the 16-byte instruction queue and arranges them into a 3-deep decoded instruction queue.
– After decoding, the instruction is passed to the control section, which derives the necessary control signals
• Execution Unit:
– It has 8 general purpose and 8 special purpose registers, which handle either data or addresses
– The 64-bit barrel shifter increases the speed of all shift, rotate, multiply and divide operations
– The multiply/divide logic implements the bit-shift-rotate algorithms to complete the operations in minimum time (a typical 32-bit multiplication completes in under 1µs)
• Elements of Execution Unit
1. Arithmetic/logic unit (ALU): Performs the operation identified by ADD, SUB, AND, etc.
2. Flags register: Holds status and control information
3. General-purpose registers: Hold address or data information
4. Control ROM: Contains microcode sequences that define the operations performed by machine instructions
5. Special multiply, shift, and barrel shift hardware: Accelerates multiply, divide, and rotate operations
Memory Management Unit
• The MMU consists of a segmentation unit and a paging unit
• Segmentation Unit:
– It allows the use of two address components, segment and offset, for relocatability and sharing of data
– It allows a maximum segment size of 4GB
– It provides a 4-level protection mechanism for protecting and isolating the system's code and data from those of application programs
– The limit and attribute PLA checks segment limits and attributes at segment level to avoid invalid accesses to code and data in a memory segment.
• Paging Unit:
– It organizes physical memory in terms of pages of 4KB size
– It works under the control of the segmentation unit
– It converts linear addresses into physical addresses
– The control and attribute PLA checks privileges at page level.
Bus Interface Unit
• It has a prioritizer that resolves the priority of the various bus requests and so controls access to the bus
• The address driver drives the byte enable signals (BE0#-BE3#) and the address signals A2-A31
• The pipeline/bus size unit handles the control signals for address pipelining and dynamic bus sizing
• The data buffers interface the internal data bus with the system bus
Signal Interface of 80386DX
• Signals are arranged by functional groups.
• The # symbol indicates an active low signal.
• When no # is present, the signal is active high.
• Example: M/IO#
– High voltage indicates memory selected
– Low voltage indicates I/O selected
• Clock (CLK2):
– It is divided by two internally to generate the internal processor clock.
– The phase of the internal processor clock can be synchronized to a known phase.
• Data Bus (D0 through D31):
– It has three-state bidirectional signals.
– It can transfer data on 32- and 16-bit buses using a data bus sizing feature.
• Address Bus (A2 through A31):
– These three-state outputs provide memory or I/O port addresses.
– It can access 4GB of physical memory from 00000000H to FFFFFFFFH
– Of the 32 address bits, only the upper 30 (A2-A31) are brought out on pins by the microprocessor
– A1 and A0 are used internally by the microprocessor to produce the 4 bank enable signals (BE0#-BE3#)
• Byte Enable Outputs (BE0#-BE3#):
– They enable the 4 memory banks
– They indicate which bytes of the 32-bit data bus are involved in the current transfer.
– BE0# applies to D0-D7
– BE1# applies to D8-D15
– BE2# applies to D16-D23
– BE3# applies to D24-D31
– The number of byte enables asserted indicates the physical size of the operand being transferred (1, 2, 3 or 4 bytes).
BE3#  BE2#  BE1#  BE0#  Operation
1     1     1     1     No Operation
1     1     1     0     Bank 0 (8-bit)
1     1     0     1     Bank 1 (8-bit)
1     0     1     1     Bank 2 (8-bit)
0     1     1     1     Bank 3 (8-bit)
1     1     0     0     Bank 0,1 (16-bit)
1     0     0     1     Bank 1,2 (16-bit)
0     0     1     1     Bank 2,3 (16-bit)
1     0     0     0     Bank 0,1,2 (24-bit)
0     0     0     1     Bank 1,2,3 (24-bit)
0     0     0     0     Bank 0,1,2,3 (32-bit)
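The byte-enable table can be reproduced with a short sketch. It assumes that the two low address bits (A1, A0) select the starting bank and that one bus cycle never crosses a 32-bit boundary; the function name byte_enables is hypothetical and only illustrates the table, it is not Intel's logic.

```python
def byte_enables(address, size):
    """Return (BE3#, BE2#, BE1#, BE0#) for one bus cycle; 0 means asserted."""
    first_lane = address & 0b11  # internal A1, A0 pick the starting bank
    if size < 1 or first_lane + size > 4:
        raise ValueError("transfer must fit inside one aligned 32-bit doubleword")
    active = range(first_lane, first_lane + size)
    # Active-low outputs: 0 when the bank takes part in the transfer.
    be = [0 if lane in active else 1 for lane in range(4)]
    return be[3], be[2], be[1], be[0]


# A 16-bit operand at an address ending in ...01 uses banks 1 and 2.
print(byte_enables(0x1001, 2))
```

A transfer that does not fit in one doubleword would need a second bus cycle, which this sketch rejects rather than models.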
Bus Cycle Definition Signals (W/R#, D/C#, M/IO# , LOCK#)
– three-state outputs
– W/R#: distinguishes between write and read cycles.
– D/C#: distinguishes between data and control cycles.
– M/IO#: distinguishes between memory and I/O cycles.
– LOCK#: distinguishes between locked and unlocked bus cycles. It enables the CPU to prevent
other bus masters (such as a DMA controller) from gaining control of the system bus.
These control signals are decoded by the bus control logic to decide which bus cycle is to be
performed:
M/IO#  D/C#  W/R#  Bus Cycle Type         Locked?
0      0     0     Interrupt Acknowledge  Yes
0      0     1     Inactive               --
0      1     0     I/O Data Read          No
0      1     1     I/O Data Write         No
1      0     0     Memory Code Read       No
1      0     1     HALT                   No
1      1     0     Memory Data Read       Sometimes
1      1     1     Memory Data Write      Sometimes
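As a minimal sketch, the decoding done by the bus control logic can be written as a table lookup. The dictionary simply restates the bus cycle table; the names BUS_CYCLES and bus_cycle are hypothetical.

```python
# Keys are (M/IO#, D/C#, W/R#) pin levels; values are (cycle type, locked?).
BUS_CYCLES = {
    (0, 0, 0): ("Interrupt Acknowledge", "Yes"),
    (0, 0, 1): ("Inactive", "--"),
    (0, 1, 0): ("I/O Data Read", "No"),
    (0, 1, 1): ("I/O Data Write", "No"),
    (1, 0, 0): ("Memory Code Read", "No"),
    (1, 0, 1): ("HALT", "No"),
    (1, 1, 0): ("Memory Data Read", "Sometimes"),
    (1, 1, 1): ("Memory Data Write", "Sometimes"),
}


def bus_cycle(m_io, d_c, w_r):
    """Look up the bus cycle type for the three definition signals."""
    return BUS_CYCLES[(m_io, d_c, w_r)]


print(bus_cycle(1, 1, 0))
```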
Bus Control Signals (ADS#, READY#, NA#, BS16#):
– They indicate when a bus cycle has begun and allow other system hardware to control address
pipelining, data bus width and bus cycle termination.
– ADDRESS STATUS (ADS#) : indicates that a valid address is being driven at the 80386DX pins.
– TRANSFER ACKNOWLEDGE (READY#) : indicates that the current bus cycle is
complete, so the bus is ready for the next bus cycle. It is useful for interfacing slow peripherals
– NEXT ADDRESS REQUEST (NA#) :
– This is used to enable address pipelining.
– It indicates that the system is prepared to accept the next address even if the end
of current cycle is not being acknowledged on READY#.
– BUS SIZE 16 (BS16#) :
– Asserting this input constrains current bus cycle to use only D0-D15 of data bus.
Bus Arbitration Signals (HOLD, HLDA)
• BUS HOLD REQUEST (HOLD):
– This input indicates that some other device requires bus mastership.
– HOLD must remain asserted as long as any other device is a local bus master.
– HOLD is not recognized while RESET is asserted (i.e. RESET has priority over HOLD and places the bus into an idle state rather than the hold acknowledge state).
– HOLD is level-sensitive.
• BUS HOLD ACKNOWLEDGE (HLDA):
– This output indicates that the 80386 has relinquished control of its local bus in response to HOLD being asserted and is in the Bus Hold Acknowledge state.
– This state offers near-complete signal isolation (HLDA is the only signal being driven by the 80386).
– The other output signals (D0-D31, BE0#-BE3#, A2-A31, W/R#, D/C#, M/IO#, LOCK# and ADS#) are in a high-impedance state so the requesting bus master may control them.
Coprocessor Interfacing
Intel 387DX numeric coprocessor is I/O mapped
• As the Intel386 DX begins executing a coprocessor instruction, it tests the BUSY# and
ERROR# signals to determine if the coprocessor can accept its next instruction
• The Intel 387DX can be given its command opcode immediately
COPROCESSOR REQUEST (PEREQ) :
• This input signal indicates a coprocessor request for a data operand to be transferred
to/from memory by Intel386 DX.
• In response, Intel 386DX transfers information between the coprocessor and memory
• Since Intel386 DX has internally stored the coprocessor opcode being executed, it
performs the requested data transfer with the correct direction and memory address.
• PEREQ is level-sensitive
COPROCESSOR BUSY (BUSY#) :
• This input indicates that coprocessor is still executing an instruction and is not yet able
to accept another.
• This sampling of BUSY# input prevents overrunning the execution of a previous
coprocessor instruction.
• BUSY# is level-sensitive
COPROCESSOR ERROR (ERROR#) :
• This input signal indicates that the previous coprocessor instruction
generated a coprocessor error of a type not masked by coprocessor's
control register.
• This input is automatically sampled by Intel386 DX when a coprocessor
instruction is encountered, and if asserted it generates exception 16 to
access the error-handling software.
• ERROR# is level-sensitive
Interrupt Signals (INTR, NMI, RESET)
• MASKABLE INTERRUPT REQUEST (INTR):
– This input indicates a request for interrupt service, which can be masked by Flag
Register IF bit.
– When processor responds to INTR input, it performs two interrupt acknowledge
cycles and at the end of second, it latches an 8-bit interrupt vector on D0-D7 to
identify source of interrupt.
– INTR is level-sensitive
– To assure recognition of an INTR request, INTR should remain asserted until the
first interrupt acknowledge bus cycle begins.
• NON-MASKABLE INTERRUPT REQUEST(NMI):
– This input indicates a request for interrupt service which cannot be masked by
software.
– Because of fixed NMI slot, no interrupt acknowledge cycles are performed when
processing NMI.
– NMI is rising edge-sensitive
– Once NMI processing has begun, no additional NMIs are processed until the next
IRET instruction, which is typically the end of the NMI service routine.
• RESET (RESET) :
– This input signal suspends any operation in progress and places the Intel386 DX
in a known reset state.
– The Intel386 DX is reset by asserting RESET for 15 or more CLK2 periods
– When RESET is asserted, all other input pins are ignored, and all other bus pins
are driven to an idle bus state.
– If RESET and HOLD are both asserted at a point in time, RESET takes priority.
– RESET is level-sensitive
• Vcc: These are system power supply lines
• GND: These are return lines for the power supply
Programming Model
The basic programming model consists of the following aspects:
– Registers
– Instruction Set
– Addressing Modes
– Data Types
– Memory Organization
– Interrupts and Exceptions
Register Overview
The Intel386 DX has 32 register resources in the following categories:
– General Purpose Registers
– Segment Registers
– Instruction Pointer and Flags
– Control Registers
– System Address Registers
– Debug Registers
– Test Registers
General Purpose Registers
• They hold data or address values.
• They support data operands of 1, 8, 16, 32 and 64 bits.
• 32-bit registers: EAX, EBX, ECX, EDX, ESI, EDI, EBP, and ESP.
• The least significant 16 bits of the registers can be accessed, as in the 8086, under the names AX, BX, CX, DX, SI, DI, BP, and SP.
• When a register is accessed as a 16-bit operand, the upper 16 bits of the register are neither used nor changed.
• 8-bit operations can be performed with the low bytes AL, BL, CL and DL.
• The higher bytes are AH, BH, CH and DH.
• The individual byte accessibility offers flexibility for data operations.
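The overlap between a 32-bit register and its 16-bit and 8-bit names can be modelled with a small sketch. The class GeneralRegister and its property names are hypothetical; it models only the aliasing rule stated above (a narrower write leaves the remaining bits unchanged), not the processor itself.

```python
class GeneralRegister:
    """Models one of EAX/EBX/ECX/EDX with its 16-bit and 8-bit views."""

    def __init__(self, value=0):
        self.value32 = value & 0xFFFFFFFF  # e.g. EAX

    @property
    def low16(self):  # e.g. AX
        return self.value32 & 0xFFFF

    @low16.setter
    def low16(self, v):
        # The upper 16 bits are neither used nor changed.
        self.value32 = (self.value32 & 0xFFFF0000) | (v & 0xFFFF)

    @property
    def low8(self):  # e.g. AL
        return self.value32 & 0xFF

    @low8.setter
    def low8(self, v):
        self.value32 = (self.value32 & 0xFFFFFF00) | (v & 0xFF)

    @property
    def high8(self):  # e.g. AH (bits 8-15)
        return (self.value32 >> 8) & 0xFF

    @high8.setter
    def high8(self, v):
        self.value32 = (self.value32 & 0xFFFF00FF) | ((v & 0xFF) << 8)


eax = GeneralRegister(0x12345678)
eax.low16 = 0xABCD   # like MOV AX, 0ABCDH
eax.high8 = 0xEE     # like MOV AH, 0EEH
print(hex(eax.value32))
```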
Segment Registers
• The segment registers:
– CS indicates the current code segment
– SS indicates the current stack segment
– DS, ES, FS and GS indicate four current data segments.
• On any data reference the DS-pointed data segment is assumed by default.
• In order to access any other data segment, an override directive is used
Instruction Pointer
• It is a 32-bit register named EIP.
• EIP holds the offset of the next instruction to be executed.
• The offset is always relative to the base of the code segment (CS).
• The lower 16 bits of EIP contain the 16-bit instruction pointer named IP, which is used by 16-bit addressing.
Flag Register
• Bit 17 (VM Bit, Virtual Mode):
– When the VM bit is set, the processor works in Virtual 8086 mode
• Bit 16 (RF Bit, Resume Flag):
– The RF flag is used in conjunction with the debug register breakpoints.
– When RF is set, it causes any debug fault to be ignored on the next instruction. RF is then automatically reset at the successful completion of every instruction
• Bit 15: Reserved
• Bit 14 (NT Bit, Nested Task):
– This flag applies to Protected Mode.
– NT is set to indicate that the execution of this task is nested within another task.
– If set, it indicates that the current nested task's Task State Segment (TSS) has a valid back link to the previous task's TSS.
• Bits 13,12 (IOPL Bits, Input/Output Privilege Level):
– They hold the maximum CPL (current privilege level) value permitted to execute I/O instructions without generating an exception 13 fault or consulting the I/O Permission Bitmap.
• Bit 11 (OF Bit, Overflow Flag):
– OF is set if the operation resulted in a signed overflow.
• Bit 10 (DF Bit, Direction Flag):
– DF defines whether the ESI/EDI registers post-decrement or post-increment during the string instructions.
– Post-increment occurs if DF is reset.
– Post-decrement occurs if DF is set.
• Bit 9 (IF Bit, Interrupt Enable Flag):
– When IF = 1, the processor allows recognition of external interrupts on the INTR pin
• Bit 8 (TF Bit, Trap Enable Flag):
– When TF = 1, the processor enables the single step mode for debugging.
• Bit 7 (SF Bit, Sign Flag):
– SF is set if the high-order bit of the result is set; it is reset otherwise.
• Bit 6 (ZF Bit, Zero Flag):
– ZF is set if all bits of the result are 0.
• Bit 4 (AF Bit, Auxiliary Carry Flag):
– AF is used to simplify the addition and subtraction of packed BCD numbers.
– AF is set if the operation resulted in a carry out of bit 3 (addition) or a borrow into bit 3 (subtraction).
– Otherwise AF is reset.
– AF reflects bit 3 only, whatever the operand size.
• Bit 2 (PF Bit, Parity Flag):
– PF is set if the low-order 8 bits of the result contain an even number of 1s (even parity).
• Bit 0 (CF Bit, Carry Flag):
– CF is set for 8-, 16- or 32-bit operations if the operation results in a carry out of (addition) or a borrow into (subtraction) the high-order bit.
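The bit positions listed above can be turned into a small decoder. This is only an illustrative sketch: the names EFLAGS_BITS and decode_eflags are hypothetical, and the sample value is made up for the example.

```python
# Single-bit flags and their positions, as listed in the Flag Register section.
EFLAGS_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "TF": 8,
    "IF": 9, "DF": 10, "OF": 11, "NT": 14, "RF": 16, "VM": 17,
}


def decode_eflags(value):
    """Split a 32-bit flag register image into named fields."""
    flags = {name: (value >> bit) & 1 for name, bit in EFLAGS_BITS.items()}
    flags["IOPL"] = (value >> 12) & 0b11  # two-bit field in bits 13,12
    return flags


sample = 0x00003202  # illustrative value: IF set, IOPL = 3
decoded = decode_eflags(sample)
print(decoded["IF"], decoded["IOPL"], decoded["CF"])
```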
Control Registers
• The Intel386 DX has three 32-bit control registers, CR0, CR2 and CR3, which hold machine state of a global nature
• These registers, along with the System Address Registers, hold machine state that affects all tasks in the system
• Load and store instructions are defined to access the Control Registers
CR0 : Machine Control Register
• CR0 contains 6 defined bits for control and status purposes.
• The low-order 16 bits of CR0 are defined as the Machine Status Word
• To operate only on the low-order 16 bits of CR0, the LMSW and SMSW instructions are used.
• For 32-bit operations the system should use the MOV CR0, Reg instruction.
• Bit 31 (PG Bit, Paging Enable): The PG bit is set to enable the on-chip paging unit.
• Bit 4 (Reserved): This bit is reserved by Intel.
• Bit 3 (TS Bit, Task Switched): TS is automatically set whenever a task switch operation is performed.
• Bit 2 (EM Bit, Emulate Coprocessor):
– This bit is set to cause all coprocessor opcodes to generate a Coprocessor Not Available fault (exception 7).
• Bit 1 (MP Bit, Monitor Coprocessor):
– The MP bit is used with the TS bit to determine whether the WAIT opcode will generate a Coprocessor Not Available fault (exception 7).
– When both MP = 1 and TS = 1, the WAIT opcode generates this fault, so that the handler can check for coprocessor availability.
• Bit 0 (PE Bit, Protection Enable):
– The PE bit is set to enable the Protected Mode.
– If PE is reset, the processor operates in Real Mode.
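The CR0 bits described above can be sketched as a decoder. The function names are hypothetical; the sketch covers only the bits named in this section (PG, TS, EM, MP, PE) and the MP/TS rule for the WAIT opcode.

```python
CR0_BITS = {"PE": 0, "MP": 1, "EM": 2, "TS": 3, "PG": 31}


def decode_cr0(cr0):
    """Return the named CR0 bits as booleans."""
    return {name: bool((cr0 >> bit) & 1) for name, bit in CR0_BITS.items()}


def machine_status_word(cr0):
    """Low-order 16 bits of CR0, the part LMSW and SMSW operate on."""
    return cr0 & 0xFFFF


def wait_raises_exception_7(cr0):
    """WAIT faults only when MP = 1 and TS = 1."""
    bits = decode_cr0(cr0)
    return bits["MP"] and bits["TS"]


cr0 = 0x80000001  # illustrative value: paging and protected mode enabled
print(decode_cr0(cr0))
```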
CR1 : Reserved
• CR1 is reserved for use in future Intel processors
CR2 : Page Fault Linear Address
• CR2 holds the 32-bit linear address that caused the last page fault detected.
CR3 : Page Directory Base Address
• CR3 contains the physical base address of the page directory table.
• The Intel386 DX page directory table is always page-aligned (4 Kbyte-aligned).
• Thus the lowest twelve bits of CR3 are ignored.
• A task switch through a TSS that changes the value in CR3 invalidates all page table entries cached in the paging unit.
System Address Registers
• Four special registers are defined to reference the tables.
• These tables or segments are:
– GDT (Global Descriptor Table)
– IDT (Interrupt Descriptor Table)
– LDT (Local Descriptor Table)
– TSS (Task State Segment)
• The addresses of these tables and segments are stored in special registers, the System Address and System Segment Registers.
• These registers are named GDTR, IDTR, LDTR and TR, respectively
GDTR and IDTR
• These registers hold the 32-bit linear base address and the 16-bit limit of the GDT and IDT respectively.
• GDT and IDT segments are global to all tasks in the system.
LDTR and TR
• These registers hold the 16-bit selector for
– the LDT descriptor and
– the TSS descriptor
• Since they are task specific, they are defined by selector values stored in the system segment registers.
• A system descriptor register, which is not visible to the programmer, is associated with each system segment register
Debug Registers
UQ: Explain the debug registers of Pentium
UQ: Explain the Debug Registers of 80386 DX processor
• Debugging on the 80386 allows data access breakpoints as well as code execution breakpoints.
• The 80386 contains 6 debug registers to specify
– 4 breakpoints
– Breakpoint Control options
– Breakpoint Status
Linear Breakpoint Address Registers
• The breakpoint addresses specified are 32-bit linear addresses
• While debugging, the Intel386 hardware continuously compares the linear breakpoint addresses in DR0-DR3 with the linear addresses generated by executing software.
Debug Control Register
• LENi (i = 0-3): Breakpoint Length Specification Bits
– 2-bit field for each breakpoint
– Specifies the length of the breakpoint field
– The choices for data breakpoints are 1 byte, 2 bytes and 4 bytes
– For an instruction execution breakpoint, the length is 1 (the beginning byte address)
[Figure: LENi encoding]
• RWi (i = 0-3): Memory Access Qualifier Bits
– 2-bit field for each breakpoint
– Specifies the type of access that must occur in order to activate the associated breakpoint
• GD: Global Debug Register Access Detect
– Debug registers can only be accessed in Real Mode or at privilege level 0 in Protected Mode
– The GD bit, when set, provides extra protection against any Debug Register access even in Real Mode or at privilege level 0 in Protected Mode.
– This additional protection feature is provided to guarantee that a software debugger can have full control over the Debug Register resources when required.
– The GD bit, when set, causes an exception 1 fault if an instruction attempts to read or write any Debug Register.
– The GD bit is then automatically cleared when the exception 1 handler is invoked, allowing the exception 1 handler free access to the debug registers.
• GE and LE bits: Exact data breakpoint match, global and local
– If either GE or LE is set, any data breakpoint trap will be reported exactly after completion of the instruction that caused the operand transfer.
– The LE bit is cleared during a task switch and is used for task-local breakpoints.
– The GE bit is unaffected during a task switch and remains enabled during all tasks executing in the system.
• Gi and Li (i = 0-3): Breakpoint Enable, global and local
– If either Gi or Li is set, the associated breakpoint is enabled.
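The fields of the Debug Control Register (DR7) can be decoded with a short sketch. The bit positions and encodings used here are not given in the text above; they are assumed from the standard Intel DR7 layout (Li/Gi in bits 0-7, RWi and LENi packed from bit 16 upward, LEN 00/01/11 for 1/2/4 bytes, RW 00/01/11 for execute/write/read-or-write) and should be checked against the datasheet. The function name decode_breakpoint is hypothetical.

```python
# Assumed encodings (standard Intel layout); the value 0b10 is undefined for both fields.
LEN_BYTES = {0b00: 1, 0b01: 2, 0b11: 4}
RW_KIND = {0b00: "execute", 0b01: "write", 0b11: "read/write"}


def decode_breakpoint(dr7, i):
    """Decode enable, access type and length for breakpoint i (0-3)."""
    local = bool((dr7 >> (2 * i)) & 1)        # Li
    global_ = bool((dr7 >> (2 * i + 1)) & 1)  # Gi
    rw = (dr7 >> (16 + 4 * i)) & 0b11         # RWi
    length = (dr7 >> (18 + 4 * i)) & 0b11     # LENi
    return {
        "enabled": local or global_,          # either Gi or Li enables it
        "kind": RW_KIND.get(rw, "undefined"),
        "length": LEN_BYTES.get(length),      # None when undefined
    }


dr7 = 0x000D0001  # illustrative: L0 = 1, RW0 = 01, LEN0 = 11
print(decode_breakpoint(dr7, 0))
```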
Debug Status Register
• A Debug Status Register allows the exception 1 handler to easily determine why it was
invoked.
• It can be invoked as a result of one of several events:
1) DR0 Breakpoint fault/trap.
2) DR1 Breakpoint fault/trap.
3) DR2 Breakpoint fault/trap.
4) DR3 Breakpoint fault/trap.
5) Single-step (TF) trap.
6) Task switch trap.
7) Fault due to attempted debug register access when GD = 1.
• Bi: Debug fault/trap due to breakpoint 0-3
– Four breakpoint indicator flags, B0-B3, correspond one-to-one with the breakpoint registers DR0-DR3.
– A flag Bi is set when the condition described by DRi, LENi, and RWi occurs.
• BD: Debug fault due to attempted register access when the GD bit is set
– This bit is set if the exception 1 handler was invoked due to an instruction attempting to read or write the debug registers when the GD bit was set.
• BS: Debug trap due to single step
– This bit is set if the exception 1 handler was invoked due to the TF bit in the flag register being set
• BT: Debug trap due to task switch
– This bit is set if the exception 1 handler was invoked due to a task switch occurring to a task having an Intel386 DX TSS with the T bit set.
Test Registers
• They are used to control the testing of the Translation Look-aside Buffer of the Intel386 DX.
• TR6 is the command test register
• TR7 is the data register, which contains the data of the Translation Look-aside Buffer test
Instruction Set
• The instruction set is divided into 9 categories of operations:
• Data Transfer
• Arithmetic
• Shift/Rotate
• String Manipulation
• Bit Manipulation
• Control Transfer
• High Level Language Support
• Operating System Support
• Processor Control
• These instructions operate on either 0, 1, 2 or 3 operands
• An operand resides in
– a register
– the instruction itself
– memory
• Most zero operand instructions take only one byte
• One operand instructions are generally two bytes long
• The average instruction is 3.2 bytes long
• Since the 80386 has a 16-byte queue, an average of 5 instructions are prefetched.
• The use of 2 operands permits the following types of common instruction:
– Register to Register
– Memory to Register
– Immediate to Register
– Register to Memory
– Immediate to Memory
• The operands can be either 8, 16 or 32 bits long
Addressing Modes
• The Intel386 DX provides 11 addressing modes for instructions to specify operands.
• Register Operand Mode:
– The operand is located in one of the 8-, 16- or 32-bit general registers.
– Example: ADD EAX, EBX
• Immediate Operand Mode:
– The operand is included in the instruction itself, following the opcode.
– Example: MOV EAX, 12345678H
• The remaining 9 modes provide a mechanism for specifying the effective address of an operand.
• The linear address consists of two components:
– the segment base address and
– an effective address.
• The effective address is calculated by using four address elements:
– DISPLACEMENT: An 8- or 32-bit immediate value
– BASE: The contents of any general purpose register. It is generally used by compilers to point to the start of the local variable area.
– INDEX: The contents of any general purpose register except for ESP. The index registers are used to access the elements of an array, or a string of characters.
– SCALE: The index register's value can be multiplied by a scale factor, either 1, 2, 4 or 8. Scaled index mode is especially useful for accessing arrays or structures.
• Combinations of these 4 components make up the 9 additional addressing modes
• The effective address (EA) of an operand is calculated according to the following formula:
EA = Base Register + (Index Register * Scaling) + Displacement
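The effective address formula can be checked with a minimal sketch. It assumes 32-bit addressing with wrap-around at 2^32; the function names and the sample register values are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def effective_address(base=0, index=0, scale=1, displacement=0):
    """EA = Base + (Index * Scale) + Displacement, truncated to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale factor must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & MASK32


def linear_address(segment_base, ea):
    """Linear address = segment base address + effective address."""
    return (segment_base + ea) & MASK32


# Based scaled index mode with displacement, e.g. element 3 of a dword array.
ea = effective_address(base=0x1000, index=3, scale=4, displacement=0x20)
print(hex(ea), hex(linear_address(0x00400000, ea)))
```

Leaving an element at its default (base = 0, index = 0, scale = 1 or displacement = 0) gives the simpler modes, such as direct or register indirect.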
• Direct Mode:
– The operand's offset is contained as part of the instruction as an 8- or 32-bit displacement.
– Example: INC WORD PTR [500]
• Register Indirect Mode:
– A base register contains the address of the operand
– Example: MOV [ECX], EDX
• Based Mode:
– The contents of a BASE register are added to a DISPLACEMENT to form the operand's offset.
– Example: MOV ECX, [EAX+24]
• Index Mode:
– The contents of an INDEX register are added to a DISPLACEMENT to form the operand's offset.
– Example: ADD EAX, TABLE[ESI]
• Scaled Index Mode:
– The contents of an INDEX register are multiplied by a scaling factor and the result is added to a DISPLACEMENT to form the operand's offset.
– Example: IMUL EBX, TABLE[ESI*4], 7
• Based Index Mode:
– The contents of a BASE register are added to the contents of an INDEX register to form the effective address of an operand.
– Example: MOV EAX, [ESI] [EBX]
• Based Scaled Index Mode:
– The contents of an INDEX register are multiplied by a SCALING factor and the result is added to the contents of a BASE register to obtain the operand's offset.
– Example: MOV ECX, [EDX*8] [EAX]
• Based Index Mode with Displacement:
– The contents of an INDEX register, the contents of a BASE register and a DISPLACEMENT are all summed together to form the operand's offset.
– Example: ADD EDX, [ESI] [EBP+00FFFFF0H]
• Based Scaled Index Mode with Displacement:
– The contents of an INDEX register are multiplied by a SCALING factor, and the result is added to the contents of a BASE register and a DISPLACEMENT to form the operand's offset.
– Example: MOV EAX, LOCALTABLE[EDI*4] [EBP+80]
Data Types
• The Intel386 DX supports all of the data types commonly used in high level languages:
• Bit: A single bit quantity.
• Bit Field: A group of up to 32 contiguous bits, which spans a maximum of four bytes.
• Bit String: A set of contiguous bits. On the Intel386 DX, bit strings can be up to 4 gigabits long.
• Byte: A signed 8-bit quantity.
• Unsigned Byte: An unsigned 8-bit quantity.
• Integer (Word): A signed 16-bit quantity.
• Long Integer (Double Word):
– A signed 32-bit quantity.
– All operations assume a 2's complement representation.
• Unsigned Integer (Word): An unsigned 16-bit quantity.
• Unsigned Long Integer (Double Word): An unsigned 32-bit quantity.
• Signed Quad Word: A signed 64-bit quantity.
• Unsigned Quad Word: An unsigned 64-bit quantity.
• Offset: A 16- or 32-bit offset-only quantity which indirectly references another memory location.
• Pointer: A full pointer which consists of a 16-bit segment selector and either a 16- or 32-bit offset.
• Char: A byte representation of an ASCII alphanumeric or control character.
• String: A contiguous sequence of bytes, words or dwords. A string may contain between 1 byte and 4 GB.
• BCD: A byte (unpacked) representation of the decimal digits 0 to 9.
• Packed BCD: A byte (packed) representation of two decimal digits 0 to 9, storing one digit in each nibble.
• When the 80386 DX is coupled with a 387 Numeric Coprocessor, the following common floating point types are supported:
• Floating Point: A signed 32-, 64-, or 80-bit real number representation.
Memory Organization and Segmentation
Introduction
• Memory is divided into bytes, words and dwords.
• Words are stored in two consecutive bytes and dwords in 4 consecutive bytes
• It supports larger units of memory: pages and segments.
• Segmentation: Memory is divided into one or more variable length segments which can be swapped to disk or shared between programs.
• Paging: Memory is organized into one or more 4KB pages.
• Segmentation and Paging can be combined to gain the advantages of both systems.
• Segmentation is used for organizing memory in logical modules
• Pages are useful to the system programmer for managing the physical memory of the system.
Address Spaces
• The 80386DX has three distinct address spaces:
• Logical (Virtual) Address:
– It consists of a selector and an offset
– Selector: contents of a segment register
– Offset: effective address (sum of base, index and displacement)
– Each task has a maximum of 16K selectors (2^14) and the offset can be 4GB (2^32), giving a total of 2^46 bytes or 64TB
• Linear Address:
– The segmentation unit translates the logical address space into a 32-bit linear address space.
– If there is no paging, the linear address is the physical address
• Physical Address:
– The paging unit translates the linear address space to the physical address space
– It is what appears on the address pins.
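The 64TB figure for the virtual address space follows from simple arithmetic, sketched here with hypothetical constant names.

```python
SELECTORS_PER_TASK = 2 ** 14   # 16K selectors
MAX_SEGMENT_BYTES = 2 ** 32    # 4GB maximum offset range per segment

virtual_bytes = SELECTORS_PER_TASK * MAX_SEGMENT_BYTES
virtual_terabytes = virtual_bytes // 2 ** 40

print(virtual_bytes == 2 ** 46, virtual_terabytes)
```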
Operating Modes
UQ: State and explain the operating modes of X86 family of processors.
• The Intel 386DX has three modes of operation
– Real Mode (Real Address Mode)
– Protected Mode (Protected Virtual Address Mode)
– Virtual 8086 Mode
• Real Mode:
– It works as a very fast 8086 with 32-bit extensions.
– It is required to set up the processor for protected mode
• Protected Mode:
– It provides access to the sophisticated memory management, paging and privilege capabilities of the processor
• Virtual 8086 Mode:
– It allows the execution of 8086 applications with the protection mechanism
Real Mode Architecture
• It has the same base architecture as the 8086
• When the processor is reset, it is initialized in real mode.
• Real mode is used to set up the processor for Protected Mode.
• The segment size of the 80386 in real mode is 64KB
• The maximum memory size is 1MB
• Only address lines A2-A19 are active
• Paging is not allowed in real mode and hence the physical address is the same as the linear address.
• The physical address is formed by adding the contents of the segment register, shifted left by 4 bits, to an effective address.
• This results in a physical address from 00000000H to 0010FFEFH (FFFF0H + FFFFH)
• Real mode segments always start on 16-byte boundaries since the segment value is shifted left by 4 bits.
• All segments in Real Mode are exactly 64KB long, and exception 13 is generated if a data operand or instruction fetch occurs past the end of a segment.
• Segments may be overlapped in Real Mode. If a particular segment does not use all 64KB, another segment can be overlaid on top of the unused portion of the previous segment.
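Real-mode address formation can be sketched in a few lines. The function name real_mode_address is hypothetical; the arithmetic is the shift-and-add rule stated above.

```python
def real_mode_address(segment, offset):
    """Physical address = (segment register << 4) + 16-bit effective address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return (segment << 4) + offset


# The highest reachable address: FFFF0H + FFFFH
print(hex(real_mode_address(0xFFFF, 0xFFFF)))
```

Because the segment value is multiplied by 16, two different segment:offset pairs can name the same byte, which is what makes overlapping segments possible.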
Protected Mode
Concepts of Virtual Memory
• Though virtual memory does not exist physically, it still appears to be available within the system.
• The concept of virtual memory is implemented using the physical memory that the CPU can directly access and the secondary memory that is used as storage for data and programs.
• The segment of a program or data required for actual execution at that instant is fetched from the secondary memory into the physical memory. After the execution of this fetched segment, the next segment required for further execution is again fetched from the secondary memory, while the results of the executed segment are stored back into secondary memory for further reference. This continues until the complete program is executed.
• The procedure of fetching the chosen program segments or data from the secondary memory into physical memory is called swapping.
• The procedure of storing the results or data back into the secondary storage is called unswapping.
• Huge programs are divided into smaller segments or pages, which are arranged in an appropriate sequence and are swapped in or out of primary memory as required for execution of the complete program
• These segments or pages are associated with a data structure called a descriptor, which contains information about the segment or page.
• A set of such descriptors arranged in a proper sequence describes the complete program. This set of descriptors is called a descriptor table.
• In a multiprogramming environment, many such descriptor tables may be available in the system at an instant of time
• Descriptor tables are prepared and managed by the operating system.
• For different types of program segments, there may be different types of descriptors
• The descriptors are automatically referred to by the CPU when a segment register is loaded with a selector.
Selector (Segment Register)
A selector in protected mode has 3 fields:
TI (Table Indicator): selects the Local or Global Descriptor Table
Index (Descriptor Entry Index): selects one of 8K descriptors in that table
RPL (Requestor Privilege Level): allows testing of the selector's privilege attributes
Level 0: most privileged level
Level 3: least privileged level
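The three selector fields can be extracted with a short sketch. The bit positions are not stated in the text above; the sketch assumes the standard layout (RPL in bits 1-0, TI in bit 2 with 1 meaning LDT, Index in bits 15-3). The function name decode_selector and the sample values are hypothetical.

```python
def decode_selector(selector):
    """Split a 16-bit protected-mode selector into Index, TI and RPL."""
    selector &= 0xFFFF
    return {
        "index": selector >> 3,                         # 13 bits: one of 8K descriptors
        "table": "LDT" if selector & 0b100 else "GDT",  # TI bit
        "rpl": selector & 0b11,                         # 0 = most privileged, 3 = least
    }


print(decode_selector(0x002B))
```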
Descriptor Tables
• They define all the segments used in an x86 system
• There are 3 types of table:
– Global Descriptor Table (GDT)
– Local Descriptor Table (LDT)
– Interrupt Descriptor Table (IDT)
• All tables are variable length memory arrays
• They can range in size from 8 bytes to 64KB (2^13 x 2^3; the low 3 bits are implied because the descriptor size is fixed at 8 bytes)
• Each table can hold up to 8192 (2^13) 8-byte descriptors.
• The tables have registers associated with them, named GDTR, LDTR and IDTR, which hold the 32-bit linear base address and 16-bit limit of each table.
• These tables are manipulated by the OS using privileged instructions.
Global Descriptor Table
• Every Intel386 DX system contains a GDT.
• The GDT contains descriptors which are possibly available to all of the tasks in the system.
• The GDT contains code and data segments used by the operating system, task state segments, and descriptors for the LDTs in a system.
• The first slot of the GDT corresponds to the null selector and is not used.
[Figure: Global Descriptor Table Register]
Local Descriptor Table
• LDTs contain descriptors which are task specific.
• An LDT may contain only code, data, stack, task gate and call gate descriptors.
• LDTs isolate a given task's code and data segments from the rest of the OS.
• The visible portion of the LDT register contains only a 16-bit selector.
• This selector refers to a Local Descriptor Table descriptor in the GDT.
Interrupt Descriptor Table
•
IDT contains descriptors to point to the location of up to 256 interrupt service routines.
•
The IDT should be at least 256 bytes in size to hold the 32 Intel reserved interrupts (32 × 8 bytes).
•
Every interrupt used must have an entry in IDT.
•
IDT entries are referenced via INT instructions, external interrupt vectors and
exceptions.
•
Interrupts handle asynchronous external events (e.g. pressing a key) and exceptions
handle instruction faults (e.g. page permission fault)
Descriptors
•
The object to which a segment selector points is called a descriptor.
•
Descriptors are 8-byte quantities which contain the 32-bit base address, the 20-bit limit and
12 attribute bits of a given segment.
General Format of a Descriptor
UQ: Write short note on structure of segment descriptor field
•
All segments have 3 attribute fields in common: the P bit, the DPL field and the S bit.
•
P bit: P = 1 → the segment is loaded in physical memory. If P = 0, any attempt to access the segment raises exception 11 (segment not present).
•
DPL field: a 2-bit field which specifies the protection level (0-3) associated with a segment.
•
There are two main categories of segments: system segments and non-system segments
(for code and data).
•
S bit: S = 1 → non-system segment.
S = 0 → system segment
Non-system segment descriptor
•
G-bit (Granularity Bit): It specifies whether the segment length is byte granular or page granular.
G = 0 → byte granular (max 1MB)
segment size may be 1, 2, ..., 2^20 bytes
G = 1 → page granular (max 4GB)
segment size may be 1 × 2^12, 2 × 2^12, ..., 2^20 × 2^12 bytes
•
D-bit: It indicates the default length for operands and effective addresses.
D = 1 → 32-bit operands and 32-bit addressing modes
D = 0 → 16-bit operands and 16-bit addressing modes
• B-bit: It indicates the size of the stack pointer.
B = 1 → 32-bit ESP register
B = 0 → 16-bit SP register
• AVL-bit (Available field): It is available for use by the OS or user software.
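To make the base, limit and G-bit arithmetic concrete, here is a minimal sketch that unpacks an 8-byte code/data descriptor. The byte layout used (limit 15-0, base 15-0, base 23-16, access byte, a flags byte holding G, D/B, AVL and limit 19-16, then base 31-24) is the standard Intel format and is an assumption here, since the descriptor figure is missing from this text. The size computed is for an expand-up segment only.

```python
import struct


def decode_descriptor(raw):
    """Decode the base, limit and flag bits of an 8-byte segment descriptor."""
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes")
    # Little-endian: limit[15:0], base[15:0], base[23:16], access, flags, base[31:24]
    limit_lo, base_lo, base_mid, access, flags, base_hi = struct.unpack("<HHBBBB", raw)
    limit = limit_lo | ((flags & 0x0F) << 16)        # 20-bit limit
    base = base_lo | (base_mid << 16) | (base_hi << 24)
    g = (flags >> 7) & 1                             # granularity
    unit = 4096 if g else 1                          # 2^12-byte pages or single bytes
    return {
        "base": base,
        "limit": limit,
        "access": access,
        "G": g,
        "D": (flags >> 6) & 1,                       # D bit (B bit for stack segments)
        "AVL": (flags >> 4) & 1,
        "size": (limit + 1) * unit,                  # expand-up segment size in bytes
    }


# A flat 4GB code segment: base 0, limit 0xFFFFF, G = 1, D = 1.
seg = decode_descriptor(bytes.fromhex("FFFF0000009ACF00"))
print(hex(seg["size"]))  # 0x100000000
```

With G = 0 the same 20-bit limit would describe at most 2^20 bytes (1MB).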
Access Right Byte
(Bit position, name and function)
• Bit 7, Present (P):
– P = 1 → Segment is mapped into physical memory.
– P = 0 → No mapping to physical memory exists.
• Bits 6-5, Descriptor Privilege Level (DPL):
– Segment privilege attributes
• Bit 4, Segment Descriptor (S):
– S = 1 → Code or Data segment descriptor
– S = 0 → System Segment or Gate Descriptor
• Bit 3, Executable (E):
– E = 0 → Descriptor type is data segment
– E = 1 → Descriptor type is code segment
• Bit 2, Expansion Direction (ED), when E = 0 (data):
– ED = 0 → Expand-up (data) segment, offsets must be ≤ limit.
– ED = 1 → Expand-down (stack) segment, offsets must be > limit.
• Bit 2, Conforming (C), when E = 1 (code):
– C = 1 → Code segment may only be executed when CPL ≥ DPL, and CPL remains unchanged.
• Bit 1, Writeable (W), when E = 0 (data):
– W = 0 → Data segment is read only
– W = 1 → Data segment may be written into.
• Bit 1, Readable (R), when E = 1 (code):
– R = 0 → Code segment may not be read (execute only)
– R = 1 → Code segment may be read (execute/read)
• Bit 0, Accessed (A):
– A = 0 → Segment has not been accessed
– A = 1 → Segment has been accessed
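The access right byte can be decoded bit by bit. The sketch below assumes the usual positions (P in bit 7, DPL in bits 6-5, S in bit 4, E in bit 3, ED or C in bit 2, W or R in bit 1, A in bit 0); for system descriptors (S = 0) it simply returns the low four bits as a type code without interpreting them. The function name is illustrative only.

```python
def decode_access_byte(access):
    """Decode the access right byte of a segment descriptor."""
    fields = {
        "P": (access >> 7) & 1,
        "DPL": (access >> 5) & 0b11,
        "S": (access >> 4) & 1,
    }
    if fields["S"] == 0:
        # System segment or gate: the low 4 bits are a type code.
        fields["type"] = access & 0x0F
        return fields
    fields["E"] = (access >> 3) & 1
    if fields["E"]:
        fields["C"] = (access >> 2) & 1   # conforming (code)
        fields["R"] = (access >> 1) & 1   # readable (code)
    else:
        fields["ED"] = (access >> 2) & 1  # expansion direction (data)
        fields["W"] = (access >> 1) & 1   # writeable (data)
    fields["A"] = access & 1
    return fields


print(decode_access_byte(0x9A))
# {'P': 1, 'DPL': 0, 'S': 1, 'E': 1, 'C': 0, 'R': 1, 'A': 0}
```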
System Segment Descriptor
Paging
• Paging is used to implement virtual memory in multitasking operating systems.
• Pages are fixed-size portions of a program module or its data.
• The complete task need not be in physical memory at any one time; only a few pages
are required. The remaining space can be allocated to other tasks, which makes
multitasking possible.
Paging Organization
Page Mechanism
• Intel386DX uses a two level table mechanism to convert linear address to physical
address.
• There are 3 components in paging mechanism:
1. Page Directory
2. Page Table
3. Page Frame (page itself)
• Page size of Intel 386DX is 4KB
Page Descriptor Base Register
•
CR2 is used to store the 32-bit linear address that caused the last page fault.
•
CR3 (Page Directory Physical Base Address Register) stores the physical starting
address of the Page Directory.
•
The lower 12 bits of CR3 are always zero to ensure that the Page Directory is always
page aligned.
•
Loading CR3 via a MOV CR3, reg instruction causes the Page Table Entry cache to be flushed,
as does a task switch through a TSS that changes the value of CR3.
Page Directory
• It is at most 4KB in size and holds up to 1024 entries (selected by 10 bits of the
linear address).
• The upper 10 bits of the linear address are used as an index to corresponding page
directory entry
• Page directory entry points to page tables.
Page Directory Entry
Page Tables
• Each Page Table is 4K bytes(page aligned) and holds up to 1024 Page Table
Entries(PTE).
• PTEs contain the starting address of the page frame and statistical information about
the page.
• The upper 20 bits of the page frame address are concatenated with the lower 12 bits of the
linear address to form the physical address.
• Page tables can be shared between tasks and swapped to disk.
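The way a 32-bit linear address is carved up by the two-level scheme can be sketched as follows: the upper 10 bits index the page directory, the next 10 bits index the page table, and the lower 12 bits are the offset within the 4KB page. The function names are illustrative.

```python
def split_linear_address(linear):
    """Return (directory index, table index, page offset) of a linear address."""
    directory_index = (linear >> 22) & 0x3FF  # bits 31..22
    table_index = (linear >> 12) & 0x3FF      # bits 21..12
    offset = linear & 0xFFF                   # bits 11..0
    return directory_index, table_index, offset


def physical_address(frame_address, linear):
    """Concatenate a 20-bit page frame address with the 12-bit offset."""
    return (frame_address & 0xFFFFF000) | (linear & 0xFFF)


print(split_linear_address(0x00403025))                # (1, 3, 37)
print(hex(physical_address(0x0009A000, 0x00403025)))   # 0x9a025
```

Ten index bits per level give 1024 entries of 4 bytes each, which is why both the page directory and each page table fit exactly in one 4KB page.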
Page Table Entry
•
P (Present) bit: indicates whether the PDE/PTE can be used in address translation. The P bit of the
currently executing page is always set.
•
A (Accessed) bit: It is set before any access to the page.
•
D (Dirty) bit: It is set before a write operation to the page is carried out. The D bit is
undefined for PDEs.
•
OS Reserved bits: They are defined by the operating system software.
•
U/S (0 = Supervisor, 1 = User) bit and R/W (0 = Read, 1 = Write) bit: They are used to
provide page-level or page-table-level protection. They are decoded as:
– Supervisor level (U/S = 0): for the operating system, other systems software and
related data.
– User level (U/S = 1): for application procedures and data.
– Read-only access (R/W = 0)
– Read/write access (R/W = 1)
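The entry bits described above can be read out with simple masks. The entry figure is missing from this text, so the bit positions below (P = bit 0, R/W = bit 1, U/S = bit 2, A = bit 5, D = bit 6, OS-reserved = bits 11-9, frame address = bits 31-12) are taken from the standard Intel entry format and should be treated as an assumption; the function name is illustrative.

```python
def decode_page_entry(entry):
    """Decode a 32-bit page directory or page table entry."""
    return {
        "present": bool(entry & 0x001),        # P
        "writable": bool(entry & 0x002),       # R/W: 1 = read/write
        "user": bool(entry & 0x004),           # U/S: 1 = user level
        "accessed": bool(entry & 0x020),       # A
        "dirty": bool(entry & 0x040),          # D (undefined for PDEs)
        "os_bits": (entry >> 9) & 0b111,       # available to the OS
        "frame_address": entry & 0xFFFFF000,   # upper 20 bits
    }


entry = decode_page_entry(0x0009A067)
print(hex(entry["frame_address"]), entry["present"], entry["user"])
# 0x9a000 True True
```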
Translation Lookaside Buffer(TLB)
• 80386 supports demand paged virtual memory systems
•
Performance would degrade if the processor had to access two levels of tables for every memory
reference.
•
To solve this problem, the Intel386 DX keeps a cache of the most recently accessed pages
and this cache is called Translation Lookaside Buffer (TLB).
•
TLB is a 4 way set associative 32 entry page table cache
•
TLB has 4 sets of eight entries each.
•
Each entry consists of a TAG and a DATA.
•
Tags are 24 bits wide. They contain the upper 20 bits of the linear address, a valid bit (validation
of the entry) and three attribute bits (D, U/S and R/W).
•
Data portion of each entry contains higher 20 bits of the Physical address.
•
It automatically keeps the most commonly used Page Table Entries.
•
A 32-entry TLB coupled with a 4KB page size covers 128KB (32 × 4KB) of memory
addresses.
Paging Operation
•
Paging unit receives a 32-bit linear address from the segmentation unit.
•
The upper 20 linear address bits are compared with all 32 entries in the TLB to determine
if there is a match.
•
If there is a match (i.e. a TLB hit), then the 32-bit physical address is calculated and placed on
the address bus.
•
However, if the PTE is not in the TLB, the Intel386 DX will read the appropriate PDE.
•
If P = 1 on PDE (the page table is in memory), then the Intel386 DX will read the
appropriate PTE and set the Access bit.
•
If P = 1 on PTE ( the page is in memory), then the Intel386 DX will update the Access
and Dirty bits as needed and fetch the operand.
•
The upper 20 bits of the linear address read from the page table will be stored in the TLB
for future accesses.
•
If P = 0 for either the PDE or the PTE, the processor generates a page fault exception (exception 14).
•
This exception is also generated when protection rules are violated. In either case CR2 is loaded
with the linear address that caused the page fault.
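The translation steps listed above can be modelled in a few lines. This is a simplified sketch, not the processor's actual logic: memory is a Python dict mapping physical addresses to 32-bit entries, the TLB is an unbounded dict rather than a 32-entry 4-way cache, protection (U/S, R/W) checks are left out, and a TLB hit does not update the Dirty bit. All names are hypothetical.

```python
P_BIT, A_BIT, D_BIT = 0x001, 0x020, 0x040


class PageFault(Exception):
    """Raised when a PDE or PTE is not present; cr2 holds the faulting address."""

    def __init__(self, linear):
        super().__init__(f"page fault at linear address {linear:#010x}")
        self.cr2 = linear


def translate(linear, cr3, memory, tlb, write=False):
    """Translate a linear address to a physical address."""
    page, offset = linear >> 12, linear & 0xFFF
    if page in tlb:                                   # TLB hit
        return tlb[page] | offset
    pde_addr = (cr3 & 0xFFFFF000) + 4 * (linear >> 22)
    pde = memory.get(pde_addr, 0)
    if not pde & P_BIT:                               # page table not in memory
        raise PageFault(linear)
    memory[pde_addr] = pde | A_BIT                    # mark the PDE accessed
    pte_addr = (pde & 0xFFFFF000) + 4 * ((linear >> 12) & 0x3FF)
    pte = memory.get(pte_addr, 0)
    if not pte & P_BIT:                               # page not in memory
        raise PageFault(linear)
    memory[pte_addr] = pte | A_BIT | (D_BIT if write else 0)
    frame = pte & 0xFFFFF000
    tlb[page] = frame                                 # cache for future accesses
    return frame | offset


# Page directory at 0x1000, one page table at 0x2000, one page frame at 0x9A000.
memory = {0x1004: 0x2000 | P_BIT, 0x200C: 0x9A000 | P_BIT}
tlb = {}
print(hex(translate(0x00403025, 0x1000, memory, tlb)))  # 0x9a025
```

A second access to the same page is served from `tlb` without touching `memory`, which is the saving the TLB is meant to provide.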
Protection Mechanism of 80386
•
80386 DX has four levels of protection which isolate and protect user programs from
each other and the operating system.
•
It offers an additional type of protection on a page basis, when paging is enabled(using
U/S and R/W fields)
•
The four-level hierarchical privilege system is illustrated as follows:
•
The privilege levels (PL) are numbered 0 through 3.
•
Level 0 is the most privileged or trusted level.
Rules for Privileges
•
The Intel386 DX controls access to both data and procedures according to the following rules:
(1) A data segment with privilege level p can be accessed only by code executing at a
privilege level at least as privileged as p
(E.g. application programs are prevented from reading or changing OS tables)
(2) A code segment with privilege level p can only be called by a task executing at the
same or a lesser privilege level than p
(E.g. an application program may call an OS routine)
Privilege Level
•
Three different types of privilege level enter into the privilege level checks:
– Current Privilege Level (CPL)
– Descriptor Privilege Level (DPL)
– Requestor Privilege Level (RPL)
Current Privilege Level (CPL)
• CPL is stored in the selector currently held in the CS register
•
It represents the privilege level (PL) of the currently executing task.
•
It is also the PL in the descriptor of the currently executing code segment.
•
It is also designated as the Task Privilege Level (TPL)
Descriptor Privilege Level (DPL)
•
It is the PL of the object that the current task is attempting to access
•
It is the PL of the target segment and is contained in the descriptor of that segment
Requestor Privilege Level (RPL)
•
It is the lowest two bits of any selector.
•
It can be used to weaken the CPL if desired.
•
The Effective Privilege Level(EPL) is
EPL = max (CPL, RPL) (numerically)
•
Thus the task becomes less privileged.
Accessing Data Segments
•
Assume that a task needs data from data segment.
•
The privilege levels are checked at the time a selector for the target segment is loaded
into the data segment register.
•
Three privilege levels enter into privilege checking mechanism
– CPL of the currently executing task
– RPL of the selector which specifies target segment
– DPL of the descriptor of the target segment
•
Instructions may load the DS register only if the DPL of the target segment is
numerically greater than or equal to the maximum of the CPL and the selector's RPL,
i.e. DPL of target segment ≥ max (CPL,RPL)
•
A procedure can only access data that is at the same or a less privileged level.
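The data-segment rule reduces to one comparison. A minimal sketch, with a hypothetical function name, using the EPL = max(CPL, RPL) definition given earlier:

```python
def can_load_data_segment(cpl, rpl, dpl):
    """Return True if a data segment register may be loaded with this selector.

    cpl: privilege level of the executing task
    rpl: requestor privilege level from the selector
    dpl: privilege level in the target segment's descriptor
    """
    epl = max(cpl, rpl)  # effective privilege level (numerically larger = weaker)
    return dpl >= epl


print(can_load_data_segment(cpl=3, rpl=3, dpl=0))  # False: application touching OS data
print(can_load_data_segment(cpl=0, rpl=0, dpl=3))  # True: OS reading application data
print(can_load_data_segment(cpl=0, rpl=3, dpl=0))  # False: RPL weakens the request
```

The third case shows why RPL exists: privileged code acting on a selector supplied by a level 3 caller is checked as if it were level 3.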
Restricting Control Transfer
•
Control transfers (except interrupts) are accomplished by JMP, CALL and RET
instructions.
•
The near forms of JMP and CALL transfer within the current code segment and require only
limit checking
•
The far forms of JMP and CALL refer to other segments and require privilege checking.
•
The far JMP and CALL can be done in 2 ways:
1. Without Call Gate Descriptor
2. With Call Gate Descriptor
Without Call Gate Descriptor
•
The processor permits a JMP or CALL directly to another segment only if one of the following holds:
1. DPL of the target segment = CPL of the calling segment
2. The conforming bit of the target code segment is set and DPL of the target segment ≤ CPL
•
Conforming Segment: These segments may be called from various privilege levels but
execute at the privilege level of the calling procedure.
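The two conditions for a direct far transfer can be written as a small predicate. This is a sketch of the rule as stated here, with hypothetical names; it treats the two conditions as alternatives, one for non-conforming and one for conforming targets.

```python
def direct_far_transfer_allowed(cpl, target_dpl, conforming):
    """Check a far JMP/CALL that does not go through a call gate."""
    if conforming:
        # Conforming code may be entered from the same or a less privileged level;
        # it then runs at the caller's CPL.
        return target_dpl <= cpl
    # Non-conforming code must be at exactly the caller's level.
    return target_dpl == cpl


print(direct_far_transfer_allowed(cpl=3, target_dpl=3, conforming=False))  # True
print(direct_far_transfer_allowed(cpl=3, target_dpl=0, conforming=False))  # False
print(direct_far_transfer_allowed(cpl=3, target_dpl=0, conforming=True))   # True
```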
With Call Gate Descriptor
•
The far pointer of the control transfer instruction uses the selector part of the pointer to
select a gate.
•
The selector and offset fields of the gate form a pointer to the entry of a procedure.
•
Four privilege levels are used to check the validity of the control transfer via a call gate:
1. CPL
2. RPL of the selector used to specify call gate
3. DPL of the gate descriptor
4. DPL of the descriptor of target segment.
•
Only the CALL instruction can use gates to transfer to numerically smaller (more privileged) levels.
•
For a JMP instruction, the privilege rules are
MAX(CPL,RPL) ≤ gate DPL
target segment DPL = CPL (numerically)
•
For a CALL instruction, the rules are
MAX(CPL,RPL) ≤ gate DPL
target segment DPL ≤ CPL (numerically)
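The call gate rules for JMP and CALL can be expressed the same way. A minimal sketch with hypothetical names, covering only the two rule sets given here:

```python
def gate_transfer_allowed(instruction, cpl, rpl, gate_dpl, target_dpl):
    """Check a far JMP or CALL made through a call gate."""
    if max(cpl, rpl) > gate_dpl:
        return False                  # caller is not privileged enough to use the gate
    if instruction == "JMP":
        return target_dpl == cpl      # JMP cannot change privilege level
    if instruction == "CALL":
        return target_dpl <= cpl      # CALL may move to a more privileged level
    raise ValueError("instruction must be 'JMP' or 'CALL'")


# An application (CPL 3) calling an OS routine (DPL 0) through a gate with DPL 3:
print(gate_transfer_allowed("CALL", cpl=3, rpl=3, gate_dpl=3, target_dpl=0))  # True
print(gate_transfer_allowed("JMP", cpl=3, rpl=3, gate_dpl=3, target_dpl=0))   # False
```

The gate's own DPL decides who may use the entry point; the target segment's DPL decides where the call may land.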
Privilege Check via Call Gate
Task Switching
•
The multitasking/multiuser OS needs the ability to rapidly switch between tasks.
•
The Intel386 DX supports this operation by providing a task switch instruction in hardware.
•
The task switch operation
– saves the entire state of the machine (all of the registers, the address space and a link to
the previous task),
– loads a new execution state,
– performs protection checks and
– commences execution in the new task
in about 17 μs.
•
The task switch operation is invoked by an intersegment JMP or CALL instruction which
refers to a TSS or task gate descriptor in GDT or LDT
•
The TSS Descriptor points to a segment containing the entire execution state while a
Task Gate Descriptor contains a TSS Selector
Task Gate Descriptor
TSS Descriptor
•
Each task must have a TSS associated with it.
•
The current TSS is identified by the Task State Segment Register(TR)
•
TR contains the selector referring to current TSS Descriptor.
•
Returning from a task is done by IRET.
•
IRET transfers control back to the task which was interrupted: the state of the current task
is saved in its TSS and the state of the old task is restored from its TSS.
Task State Segment(TSS)
•
All information the processor needs to manage a task is stored in task state segment
(TSS).
•
The fields of a TSS belong to two classes:
•
Dynamic set: the fields the processor updates with each task switch. It includes the fields that
store:
– General registers (EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI)
– The segment registers (ES, CS, SS, DS, FS, GS).
– The flags register (EFLAGS).
– The instruction pointer (EIP).
– The selector of the TSS of the previously executing task (back link)
• Static set: the fields the processor reads but does not change.
• This set includes the fields that store:
– The selector of the task's LDT.
– Page Directory Base Register (PDBR)
– Pointers to the stacks.
– T-bit (debug trap bit)
– I/O map base
Virtual 8086
UQ: Write short note on V-86 mode of operation(05Marks)
• It allows the execution of 8086 applications with protection mechanism
• It allows simultaneous execution of 8086 OS and its applications
Virtual Mode Memory Management
• The segment registers are used as in REAL mode
• The contents of the segment register are shifted left by 4 bits and added to the offset to form
the linear address
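The address formation used in REAL and virtual 8086 mode is a shift and an add. A minimal sketch, with an illustrative function name:

```python
def v86_linear_address(segment, offset):
    """Form a linear address from a 16-bit segment and a 16-bit offset."""
    # Shifting left by 4 multiplies the segment value by 16.
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


print(hex(v86_linear_address(0x1234, 0x0010)))  # 0x12350
```

Different segment:offset pairs can name the same byte; for example 0x1235:0x0000 gives the same linear address as the pair above.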
Paging in Virtual Mode
•
It allows multiple VM tasks to run concurrently and provides protection and OS
isolation
• Paging is not necessary for a single task, but it is required for multiple tasks
• Paging divides the 20-bit (1MB) linear address space into 256 pages, each of which can be
placed anywhere in the 4GB physical memory
• Since CR3 is loaded by a task switch, each VM task can use a different mapping scheme
to map pages to different physical locations
• The paging hardware allows the 8086 OS to be shared between multiple 8086 applications.
Protection in Virtual Mode
• All VM tasks execute at privilege level 3.
• Virtual 8086 mode tasks are subject to all protection checks defined in protected mode
• An attempt to execute a privileged instruction will cause an exception
Structure of a V86 Task
• A V-86 task consists
– Partly of the 8086 program
– Partly of 80386 code that serves as the VM (Virtual Machine) monitor (80386
protected-mode code at PL = 0 with initialization and exception handling
procedures)
• To run in V86 mode, 8086 program needs:
– A V-86 monitor
– Operating System Services
Entering and Leaving V86 Mode
• The processor can enter V86 mode by two means:
Case 1:
– A task switch to an 80386 task loads the image of EFLAGS from the new TSS.
– The TSS of the new task contains the VM flag.
– VM = 1 in the new EFLAGS indicates that the new task is executing 8086
instructions, and therefore the segment registers loaded from the TSS form base
addresses as the 8086 would.
Case 2:
– An IRET from a procedure of an 80386 task loads the image of EFLAGS from
the stack.
– VM = 1 indicates that the procedure to which control is being returned is an 8086
procedure.
– CPL at the time IRET is executed must be zero, else the processor does not
change VM.
• The processor leaves V86 mode when an interrupt or exception occurs.
Case 1:
– The interrupt or exception causes a task switch, which loads EFLAGS from the TSS of
the new task.
– If the new TSS is an 80386 TSS and the VM bit is 0 in the new EFLAGS, then the processor
clears the VM bit of EFLAGS, loads the segment registers from the new TSS and
begins executing the instructions of the new task according to 80386 protected-mode
semantics.
Case 2:
– The interrupt or exception vectors to a privilege-level zero procedure.
– The processor stores the current setting of EFLAGS on the stack, then clears the VM
bit.
– The interrupt or exception handler, therefore, executes as 80386 protected-mode code.