Deferred segment-loading

An exercise on implementing the
concept of ‘load-on-demand’ for the
program-segments in an ELF
executable file
Background
• Recall our previous in-class exercise: we
wrote a demo-program that could execute
a Linux application (named ‘hello’)
• A working version of that demo is now on
our class website (named ‘tryexec.s’)
• That demo simulated ‘loading’ of the .text
and .data program-segments, by copying
the ‘hello’ file’s memory-image into two
distinct locations in extended memory
Memory-to-memory copying
• We used the Pentium’s ‘movsb’ instruction
to perform those two copying operations
• The number of bytes we copied was equal
to the size of five disk-sectors (5 * 512)
• To ‘load’ the ‘.text’ program-segment, we
copied from 0x00011800 to 0x08048000
• To ‘load’ the ‘.data’ program-segment, we
copied from 0x00011800 to 0x08049000
Copying to extended memory
• The ‘movsb’ instruction is an example of a
‘complex’ instruction – it requires setup of
several CPU registers prior to its execution
• Setup required for ‘movsb’ involves:
– Setup DS : ESI to address the source buffer
– Setup ES : EDI to address the dest’n buffer
– Setup ECX with the number of bytes to copy
– Clear the DF-bit in the EFLAGS register
• Then ‘rep movsb’ performs the string-copying
• Note that 32-bit addressing is required here!
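The register setup listed above can be modelled in C to show what the CPU does on each repetition. This is only an illustrative sketch: the `movs_regs` structure, the `rep_movsb` function and the flat `mem` array (standing in for a segment whose base-address is zero) are hypothetical names and are not taken from ‘tryexec.s’.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the registers that 'rep movsb' depends on. */
struct movs_regs {
    uint32_t esi;   /* source offset (DS:ESI)            */
    uint32_t edi;   /* destination offset (ES:EDI)       */
    uint32_t ecx;   /* number of bytes still to copy     */
    int      df;    /* direction flag: 0 means 'forward' */
};

/* Mimics 'rep movsb' on a flat byte array: one byte is copied per
   repetition, and the repetitions stop when ECX reaches zero. */
void rep_movsb(uint8_t *mem, struct movs_regs *r)
{
    assert(mem != NULL && r != NULL);
    while (r->ecx != 0) {
        mem[r->edi] = mem[r->esi];  /* 'movsb' copies one byte        */
        if (r->df == 0) {           /* DF=0 (after 'cld'): count up   */
            r->esi++;
            r->edi++;
        } else {                    /* DF=1 (after 'std'): count down */
            r->esi--;
            r->edi--;
        }
        r->ecx--;                   /* 'rep' decrements ECX each time */
    }
}
```

When the copy finishes, ESI and EDI have each advanced by the original count and ECX is zero, which is why all three registers must be set up afresh before the second copy.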
Example assembly code
; Source-statements to ‘load’ the ‘.text’ program-segment:
        USE32                   ; assemble for 32-bit code-seg
        mov  ax, #sel_fs        ; selector for 4GB data-segment
        mov  ds, ax             ; with base-address=0x000000
        mov  es, ax             ; is used for both DS and ES
        mov  esi, #0x00011800   ; offset-address for ‘source’
        mov  edi, #0x08048000   ; offset-address for ‘dest’n’
        mov  ecx, #2560         ; number of bytes to be copied
        cld                     ; use ‘forward’ string-copying
        rep                     ; ‘repeat-prefix’ is inserted
        movsb                   ; before the ‘movsb’ opcode
Segments were ‘preloaded’
• In our ‘tryexec.s’ demo, ‘.text’ and ‘.data’
segments were initialized in advance of
transferring control to the ‘hello’ program
• That technique is called ‘preloading’
• But the Pentium supports an alternative
approach to program-loading (it’s called
‘load-on-demand’)
• Segments remain ‘uninitialized’ until they
are actually accessed by the application
Segment-Not-Present
• The ‘Segment-Not-Present’ exception can
be utilized to implement ‘demand-loading’
• Segment-descriptors are initially marked
as ‘Not Present’ (i.e., the P-bit is zero)
• When any instruction attempts to access
these memory-segments (by moving the
segment-selector into a segment-register),
the CPU will generate an interrupt (int-11)
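The P-bit mentioned above is bit 47 of the 8-byte segment-descriptor (that is, bit 7 of its access-rights byte). A minimal C sketch of testing and changing that bit follows; the function names are hypothetical, and the descriptor is treated as a single 64-bit value purely for illustration.

```c
#include <stdint.h>

/* The P-bit is bit 7 of the access-rights byte, i.e. bit 47 overall. */
#define DESC_P_BIT (1ULL << 47)

/* Nonzero if the descriptor is marked 'Present'. */
int desc_is_present(uint64_t desc)
{
    return (desc & DESC_P_BIT) != 0;
}

/* Returns the same descriptor with P=0 ('Not Present'). */
uint64_t desc_mark_not_present(uint64_t desc)
{
    return desc & ~DESC_P_BIT;
}

/* Returns the same descriptor with P=1 ('Present'). */
uint64_t desc_mark_present(uint64_t desc)
{
    return desc | DESC_P_BIT;
}

/* Extracts the access-rights byte (bits 47..40). */
uint8_t desc_access_byte(uint64_t desc)
{
    return (uint8_t)(desc >> 40);
}
```

For example, a descriptor whose access-rights byte is 0x7A has P=0, and setting the P-bit changes that byte to 0xFA.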
The Fault-Handler
• The interrupt service routine for INT-0x0B
(Segment-Not-Present Fault) can perform
the initialization of the specified memory
region (i.e., the ‘loading’ operation), mark
the segment-descriptor as ‘Present’ and
then ‘retry’ the instruction that triggered
the fault (by executing an ‘iret’ or ‘iretd’)
Error-Code Format
 31             16 15                   3   2     1     0
+-----------------+----------------------+-----+-----+-----+
|    reserved     |     table-index      | TI  | IDT | EXT |
+-----------------+----------------------+-----+-----+-----+
Legend:
EXT = An external event caused the exception (1=yes, 0=no)
IDT = table-index refers to Interrupt Descriptor Table (1=yes, 0=no)
TI = The Table Indicator flag, used when IDT=0 (0=GDT, 1=LDT)
This same error-code format is used with exceptions 0x0B, 0x0C, and 0x0D
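The fields of this error-code can be separated with a few shifts and masks. The C sketch below is illustrative only; the `selector_error` structure and the `decode_error_code` function are hypothetical names. It follows the segment-selector convention in which TI=1 refers to the LDT.

```c
#include <stdint.h>

/* Hypothetical container for the decoded error-code fields. */
struct selector_error {
    unsigned ext;    /* bit 0: external event caused the exception */
    unsigned idt;    /* bit 1: table-index refers to the IDT       */
    unsigned ti;     /* bit 2: 0 = GDT, 1 = LDT (only if idt == 0) */
    unsigned index;  /* bits 15..3: index of the descriptor        */
};

/* Splits a 32-bit error-code into its four fields. */
struct selector_error decode_error_code(uint32_t code)
{
    struct selector_error e;
    e.ext   = code & 1u;
    e.idt   = (code >> 1) & 1u;
    e.ti    = (code >> 2) & 1u;
    e.index = (code >> 3) & 0x1FFFu;   /* 13-bit table-index */
    return e;
}
```

An error-code of 0x000C, for instance, decodes as table-index 1 in the LDT, with the IDT and EXT bits clear.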
Benefits of deferred loading?
• With a small-size program (like ‘hello’) we
might not see much benefit from using the
‘load-on-demand’ mechanism, since both
of the program-segments sooner-or-later
would have to be ‘loaded’ into memory
• The only apparent benefit is that copying
can be done by ONE program-fragment
(i.e., within the fault-handler) instead of by
two fragments in the ‘pre-load’ procedure
Table-driven ‘handler’
• Balanced against the fewer instructions required
with ‘load-on-demand’ is the need to provide a
table-driven interrupt-handler that can ‘load’
whichever ‘not present’ program-segments
happen to get accessed
• A very simple implementation for such a handler
could use a table like this one:
memmap: ;     from     to          count  type
        .LONG 0x11800, 0x08048000, 2560,  0xFA
        .LONG 0x11800, 0x08049000, 2560,  0xF2
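The same table can be written out as a C array to make its layout explicit. This is a sketch under stated assumptions: the structure and function names are hypothetical, and reading the ‘type’ column as the access-rights byte that the handler stores when it marks a segment ‘Present’ is an interpretation, suggested by the values 0xFA (present, readable code) and 0xF2 (present, writable data).

```c
#include <stddef.h>
#include <stdint.h>

/* One row of the table: four longwords, 16 bytes in all. */
struct memmap_entry {
    uint32_t from;   /* offset of the source bytes (file image)   */
    uint32_t to;     /* offset of the destination in extended RAM */
    uint32_t count;  /* number of bytes to copy                   */
    uint32_t type;   /* assumed: access-rights byte once loaded   */
};

const struct memmap_entry memmap[] = {
    { 0x00011800, 0x08048000, 2560, 0xFA },   /* '.text' segment */
    { 0x00011800, 0x08049000, 2560, 0xF2 },   /* '.data' segment */
};

enum { MEMMAP_ENTRIES = sizeof memmap / sizeof memmap[0] };

/* Returns the row for a given table index, or NULL if out of range. */
const struct memmap_entry *memmap_lookup(unsigned index)
{
    return index < MEMMAP_ENTRIES ? &memmap[index] : NULL;
}
```

A handler built this way needs only one copy loop; supporting another segment means adding one more row.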
Big/Complex programs
• With complex applications that use many more
program-segments, ‘demand-loading’ could
potentially offer some runtime efficiencies
• For example, with interactive programs that can
display various error-messages: If error-handling
routines are in separate program-segments,
then those segments would not need to be
loaded unless -- and until -- the error-condition
actually occurs (maybe never)
In-class exercise
• To get practical ‘hands on’ experience with
implementing the demand-loading concept,
we propose the following exercise:
• Modify the ‘tryexec.s’ demo (see website)
by deferring the memory-to-memory copy
operations until the program-segments are
actually referenced by the ‘hello’ program
• Then perform the copying within an ISR
Some exercise details
• Copy the ‘tryexec.s’ demo-program to a
new file, named ‘ondemand.s’
• In the ‘load_and_exec_demo’ procedure,
comment out the two memory-to-memory
copy operations, and then mark the LDT
segment-descriptors for .text and .data as
‘NOT PRESENT’ segments (i.e., P=0)
• Create a ‘memmap’ table that describes
the copying operations that will be needed
Create a fault-handler
• Add an interrupt-gate for exception 0x0B and a
fault-handler that will perform the copy-operation
for a ‘not-present’ segment
• Remember that the CPU will automatically push
an error-code onto the ring0 stack if a
‘segment-not-present’ exception occurs
• Don’t forget to discard that error-code as the
final step before exiting from the ISR:
        add  esp, #4    ; discard error-code
        iretd           ; retry the instruction
Parallel table-entries
theLDT (4-words per entry)       memmap (4-longwords per entry)
offset  descriptor               offset  From     To         Size  Type
0x00    0x00CF7A000000FFFF       0x00    0x11800  0x8048000  2560  0xFA
0x08    0x00CF72000000FFFF       0x10    0x11800  0x8049000  2560  0xF2
0x10    0x00CF72000000FFFF       0x20    0        0          0     0xF2
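Because each LDT descriptor occupies 8 bytes and each memmap entry occupies 16 bytes, the fault-handler can derive both table offsets from the error-code alone. The C sketch below illustrates that arithmetic; the function names are hypothetical, and the `ldt` array is simply a byte-for-byte stand-in for the descriptor table.

```c
#include <stdint.h>

/* Clearing the EXT, IDT and TI bits leaves table-index * 8, which is
   the byte offset of the faulting descriptor within its table. */
uint32_t ldt_offset(uint32_t error_code)
{
    return error_code & 0xFFF8u;
}

/* A memmap entry is twice the size of a descriptor, so the matching
   entry lies at twice the descriptor's offset. */
uint32_t memmap_offset(uint32_t error_code)
{
    return ldt_offset(error_code) * 2;
}

/* Stores a new access-rights byte (byte 5 of the 8-byte descriptor),
   e.g. the 'Type' value from the matching memmap entry. */
void set_access_byte(uint8_t *ldt, uint32_t offset, uint8_t type)
{
    ldt[offset + 5] = type;
}
```

With an error-code of 0x000C, the descriptor offset is 0x08 and the memmap offset is 0x10, matching the second row of each table.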