Linking, Loading and Mapping

A look at how Operating System utilities and services support application deployment

Overview of source translation
[Diagram. Boxes shown: user-created files (Makefile, C/C++ source and header files, assembly source files, linker command file); tools (make utility, preprocessor, compiler, assembler, archive utility, linker and locator); generated files (object files, library files, shared object file, linkable image file, executable image file, link map file).]

Executable versus Linkable

Linkable File                      Executable File
ELF Header                         ELF Header
Program-Header Table (optional)    Program-Header Table
Section 1 Data                     Segment 1 Data
Section 2 Data                     Segment 2 Data
Section 3 Data                     Segment 3 Data
…                                  …
Section n Data                     Segment n Data
Section-Header Table               Section-Header Table (optional)

Role of the Linker
[Diagram: two linkable files (each an ELF Header, Section 1 … Section n Data, and a Section-Header Table) are combined into one executable file (ELF Header, Program-Header Table, Segment 1 … Segment n Data).]

ELF Header
Fields: e_ident[ EI_NIDENT ], e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
Section-Header Table: e_shoff, e_shentsize, e_shnum, e_shstrndx
Program-Header Table: e_phoff, e_phentsize, e_phnum, e_entry

Section-Headers
Fields: sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size, sh_link, sh_info, sh_addralign, sh_entsize

Program-Headers
Fields: p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_flags, p_align

Linux ‘Executable’ ELF files
• The Executable ELF files produced by the Linux linker are configured for execution in a private ‘virtual’ address space, whereby every program gets loaded at the identical virtual memory-address (i.e., 0x08048000)
• We will soon study the Pentium’s paging mechanism, which makes all this possible (also read Chapter 8 in the Stallings textbook)

Linux ‘Linkable’ ELF files
• However, some ‘linkable’ ELF files are self-contained (i.e., they do not need to be linked with other object-files or libraries)
• Our ‘courseid.o’ is such an example; you use this command to produce ‘courseid.o’:
    $ gcc -c courseid.cpp
• The GNU linker (named ‘ld’) will let us override its usual linking rules if we use a special-format linker script (ours is named ‘ldscript’ on the website)

Our ‘elfinfo.cpp’ tool
• We wrote a program that ‘parses’ an ELF file, showing a breakdown of its sections
• You can use it to examine ‘courseid.o’:
    $ ./elfinfo courseid.o
• It shows the section-names and locations

‘Binary-Executable’ format
• Instead of the usual ELF executable-format file, we can produce a ‘binary-executable’ file (intended to be ‘loaded’ at address 0):
    $ ld courseid.o -T ldscript -o courseid.b
• The linker-script instructs ‘ld’ to combine our object-file’s ‘.text’, ‘.data’ and ‘.bss’ sections
• The ELF header and other sections get omitted
• ‘ld’ performs memory-address ‘relocations’ as needed, so that any global addresses become offsets from the starting-address (i.e., from zero)

What ‘ld’ does with ‘courseid.o’
[Diagram: the linkable file ‘courseid.o’ holds an ELF Header, Section (.text), Section (.data), Section (.bss), Section (.rodata), other sections, Section (.rel.text) and a Section-Header Table. ‘ld’ extracts the .text, .data, .bss and .rodata sections into ‘courseid.b’ and performs address-relocations. Diagram note: “These two sections happen to be empty in our particular example.”]

Memory: Physical vs. Virtual
Portions of physical memory are “mapped” by the CPU into regions of each task’s ‘virtual’ address-space
[Diagram labels: Virtual address space (4 GB); Physical address space (1 GB)]

Mapping a file to memory
• The ‘mmap()’ function allows a program to ask the kernel to set up a memory-mapping
• The contents of a file can be ‘mapped’ to a designated virtual address in user-space
• If the file contains executable instructions, the CPU can be made to ‘execute’ them by simply calling the entry-point address
• We will illustrate this with ‘launch.cpp’

Memory-Mapping a disk-file
Portions of secondary memory can be “mapped” to unused regions of a task’s virtual address-space via the CPU’s page-fault mechanism
[Diagram labels: Disk storage (files); Physical address space; Virtual address space]

How file-mapping is requested
The ‘mmap()’ function takes six arguments:
-- the region’s desired virtual-address
-- the region’s length (expressed in bytes)
-- access privileges (e.g., PROT_READ)
-- mapping-type flags (e.g., MAP_FIXED)
-- the file’s ID-number (from ‘open()’)
-- the file’s starting offset
If successful, it returns the mapping’s virtual address

Steps to follow
• 1) Open the file with the ‘open()’ function
• 2) Determine the length of the region to map
• 3) Request the mapping with ‘mmap()’
• 4) Close the file-descriptor with ‘close()’
• 5) Use the mapped region as normal data
• 6) Destroy the mapping with ‘munmap()’

Executing ‘courseid.b’
• Declare a function-pointer and initialize it with the virtual address of the entry-point:
    void (*my_program)( void ) = 0x00000000;
• Map the executable file to virtual memory at its expected load-address (0x00000000)
• Execute the memory-mapped program using an ordinary (indirect) C function-call:
    my_program();

Our ‘launch.cpp’ Demo
• We have written an application program that illustrates the foregoing memory-map steps
• After you try out this demo, you can gain a better understanding of ‘mmap()’ if you add a few extra statements to ‘launch.cpp’ and recompile it
• Add calls to ‘getchar()’ before and after each of the key operations, and then examine the ‘/proc/<pid>/maps’ file (from another virtual console) each time the demo pauses for input

Alternatives
• Can I map my file to a different address and still execute it just as easily?
• Yes, if you adjust the ‘load-address’ used by your linker script, and also adjust your ‘mmap()’ argument and function-pointer accordingly
• But you have to use a ‘load-address’ that’s a multiple of the CPU’s page-size (i.e., 4K)

In-class exercise #1
• Try using 0x00040000 as your linker script ‘load-address’ and re-link the ‘courseid.o’ object-file to get a new ‘courseid.b’ file
• Then modify the ‘launch.cpp’ source-code so it uses this new virtual-address as the ‘mmap()’ start-address and the function pointer’s value, i.e., use the assignment:
    myexec = ( void (*)( ) )0x00040000;

In-class exercise #2
• See if you can modify the ‘courseid.cpp’ program so that it will use one or more function-arguments passed to it from its caller: e.g.,
    int main( int len, char *msg );
• Also make it return some function-value: e.g.,
    return len;
• Then modify ‘launch.cpp’ accordingly: e.g.,
    int retval = myexec( 6, "Hello\n" );
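
A sketch of the six mapping steps

The six steps listed under ‘Steps to follow’ can be sketched in C++ roughly as shown here. This is not the code of ‘launch.cpp’ (which is not reproduced in these slides); the helper name ‘read_file_via_mmap’ is hypothetical, and the sketch makes two simplifying assumptions: it lets the kernel choose the virtual address (NULL, without MAP_FIXED), and it only reads the mapped bytes as data rather than executing them.

```cpp
#include <cstddef>
#include <cstdio>       // perror()
#include <string>
#include <fcntl.h>      // open()
#include <sys/mman.h>   // mmap(), munmap()
#include <sys/stat.h>   // fstat()
#include <unistd.h>     // close()

// Hypothetical helper (an illustration, not taken from launch.cpp):
// maps the named file read-only, copies its bytes into 'out', and then
// removes the mapping.  Returns false if any step fails.
bool read_file_via_mmap( const char *path, std::string &out )
{
    // 1) Open the file
    int fd = open( path, O_RDONLY );
    if ( fd < 0 ) { perror( "open" ); return false; }

    // 2) Determine the length of the region to map
    //    (a zero-length mapping is rejected by mmap(), so give up early)
    struct stat info;
    if ( fstat( fd, &info ) < 0 || info.st_size == 0 )
    {
        close( fd );
        return false;
    }
    size_t length = (size_t)info.st_size;

    // 3) Request the mapping: address (NULL lets the kernel choose),
    //    length, access privileges, mapping-type flags, file ID, offset
    void *region = mmap( NULL, length, PROT_READ, MAP_PRIVATE, fd, 0 );

    // 4) Close the file-descriptor; an established mapping stays valid
    close( fd );
    if ( region == MAP_FAILED ) { perror( "mmap" ); return false; }

    // 5) Use the mapped region as normal data
    out.assign( static_cast<const char *>( region ), length );

    // 6) Destroy the mapping
    munmap( region, length );
    return true;
}
```

To execute a file such as ‘courseid.b’ instead of merely reading it, the call in step 3 would presumably pass the linker script’s load-address as the first argument, include PROT_EXEC among the access privileges, and add MAP_FIXED to the flags. Note that current Linux kernels normally refuse to let an unprivileged process map address 0 (see the ‘vm.mmap_min_addr’ setting), so a non-zero load-address such as the 0x00040000 of exercise #1 is the more likely one to work.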