Linking and Loading
Fred Prussack
CS 518

L&L: Overview
Wake-up Questions
Terms and Definitions / General Information
Loading
Linking
  – Static vs. Dynamic
  – ELF
  – Other Linking Information/Processing
References

L&L: Topics Not Covered
Windows
Alternate binary file formats
Different versions of glibc
Different versions of the kernel (from 2.4)

L&L: Questions for the Sleepy
What is the name of the compile-time linker in Linux?
  – ld
What is the name of the run-time linker in Linux?
  – ld.so
Where is the loader located in Linux?
  – Part kernel / part ld.so

L&L: Terms
Linking
  – Taking object files and creating loadable modules with correct references to required libraries, data, and procedures
Loading
  – Placing a program image into main memory for execution

L&L: General Information
Static libraries generally named xxx.a (archives)
Dynamic libraries generally named xxx.so (shared objects)
Object files generally named xxx.o
ELF: Executable and Linking Format. Currently the most common object file format on Linux systems.
Other formats: a.out, COFF, etc.

L&L: Static vs. Dynamic
Fully statically linked executables
  – Provide faster load-to-execution time because no run-time linking is required
  – Generate larger executables requiring more disk space
Executables with dynamic dependencies
  – Require run-time linking, which can add to startup time
  – Allow for easier and better code re-use

L&L: Loading
do_execve()
  – search_binary_handler: searches all the registered binary handlers
  – load_elf_binary: loads current binary and ELF interpreter
  – start_thread: sets up correct registers
Question: What does the instruction pointer have in it now?
Answer: Entry point of the ELF interpreter

L&L: Loading/Linking
At this point ld.so now has control
Determine what libraries need to be loaded for this binary
Determine dependencies for these libraries
In what order are these loaded and what type of list is produced from this dependency list?
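One way to explore the load-order question on a glibc system is to ask the dynamic linker which objects it has loaded. The sketch below is a minimal illustration, not part of the original slides: it uses glibc's dl_iterate_phdr(), which is documented to report loaded objects in the order they were loaded. The helper name list_loaded_objects and the output format are hypothetical. glibc is generally described as walking NEEDED entries breadth-first and keeping the result in a linked list of link_map entries; the sketch only prints the resulting order and does not prove that.

```c
#define _GNU_SOURCE          /* needed for dl_iterate_phdr() on glibc */
#include <link.h>
#include <stddef.h>
#include <stdio.h>

/* Callback invoked once per loaded object, in load order. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    (void)size;  /* size of the info structure; unused in this sketch */

    /* An empty name usually denotes the main program itself. */
    const char *name = (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
                           ? info->dlpi_name
                           : "(unnamed, usually the main program)";

    printf("%2d: %s (%d program headers)\n",
           *count, name, (int)info->dlpi_phnum);
    (*count)++;
    return 0;    /* a non-zero return would stop the iteration */
}

/* Hypothetical helper: prints every loaded object and returns the count. */
int list_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

Calling list_loaded_objects() from main() in a dynamically linked program should list the program first, followed by the libraries ld.so pulled in; the exact names depend on the system.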
L&L: Linking
Basic job: clean up unresolved symbols
At compile time this is accomplished by running ld on object files to produce an executable
At run time this is accomplished by loading all required shared libraries (.so’s) and fixing unresolved symbols found in the libraries

L&L: Dynamic Linking
Load Time Dynamic Linking Done By ld.so
  – Most likely on your system it is ld-linux.so, which is a symbolic link to ld-2.3.2.so
  – All symbols that can be resolved are resolved during compilation/first link (run of ld). Remaining unresolved symbols are resolved at load time
Lazy Binding: function symbols are resolved on first call unless LD_BIND_NOW is set
Run Time Dynamic (inline) Linking
  – Allows applications to open shared object files during run time and execute their functions
  – Interface declared in <dlfcn.h> (dlopen, dlsym, dlclose)

L&L: [more] Questions for the Sleepy
In what package is ld.so built and distributed?
  – glibc
Can gcc be made to not link files automatically?
  – Yes, of course! Use the -c option.

L&L: ELF File Format
The standard binary format for Linux since the late 90’s. Created in the late 80’s.
Three types of object files
  – Shared Object Files (.so; shared object file)
  – Relocatable Object Files (.o; object file)
  – Executable Object Files (executable binary file)
The first 4 bytes of this type of file are 0x7f (ASCII 127, DEL) followed by the characters ELF

L&L: ELF File Format
First the ELF Header
  – 52 bytes in length on a 32 bit system
Sections and Segments for libraries and binaries
Various ELF sections
  – text: program instructions
  – data: initialized data
  – plt: procedure linkage table
  – got: global offset table
Checking for NEEDED entries in the dynamic segment lets ld.so know what it needs to load

L&L: ld.so & Library Location
ld.so must be able to correctly locate the libraries identified in the executable.
It does this by looking for them in the following order:
  – DT_RPATH (set with the -rpath linker option): entry in the dynamic section of the ELF file
  – LD_LIBRARY_PATH: environment variable
  – /etc/ld.so.cache: compiled list of candidate libraries, built by ldconfig using /etc/ld.so.conf
  – /lib; /usr/lib

L&L: ld.so processing
Loop over all the program headers to find necessary info
  – PHDR (program header): where the program headers start. This must be found first.
  – DYNAMIC: indicates where to find the dynamic segment (what must be loaded)
      NEEDED: name of file needed
  – INTERP: used to find the interpreter, which generally turns out to be ld.so

L&L: ld.so processing
Load all required libraries found in NEEDED portions of the DYNAMIC segment
Get all necessary information from library
  – Dynamic header; phdr; load header

L&L: ld.so info
[Figure: regions labeled Read-Only and Read-Write]

L&L: ld.so processing
What about when we actually call a function that hasn’t been resolved yet?
  – First need to resolve addressing issues
  – Probably best to permanently fix them
  – Then we need to call the actual procedure

L&L: ld.so processing
PLT0: pushl GOT + 4
      jmp *GOT + 8
PLTN: jmp *GOT + n
      push #reloc_offset
      jmp PLT0
[Figure labels: Next Procedure Run; Stack: library, reloc_offset; Procedure Start Loc; Routine to fix GOT then jump to procedure after locating correct symbol]
Question: What is the name of the fix routine?
Answer: fixup

L&L: ld.so misc. info
You can run ld.so from the command line with an executable
  – This provides a great ability to test out new ld.so’s if necessary
  – /lib/ld-linux.so [executable [args…]]

L&L: linking helper tools
ldd – lists the dynamic dependencies
readelf – displays information from ELF files
objdump – shows information from object files
nm – shows symbol information from object files
strip – removes symbols from object files
LD_DEBUG/LD_DEBUG_OUTPUT – shows debug output from ld.so

L&L: References
Stallings, William. Operating Systems: Internals and Design Principles, 4th Edition.
Upper Saddle River, NJ: Prentice-Hall, 2001
http://efrw01.frascati.enea.it/Software/Unix/IstrFTU/cern-cnl-2001003-25-link.html
http://www.iecc.com/linker/linker10.html
http://www.ibiblio.org/oswg/oswg-nightly/oswg/en_GB.ISO_88591/books/linux-c-programming/GCC-HOWTO/x796.html
http://linux.about.com/library/cmd/blcmdl2_execve.htm
http://www.iecc.com/linker/
http://www.suse.de/~bastian/Export/linking.txt
http://linux.about.com/library/cmd/blcmdl8_ld.so.htm
http://www.linuxjournal.com/node/6463
http://www.ibiblio.org/oswg/oswg-nightly/oswg/en_GB.ISO_88591/books/linux-c-programming/GCC-HOWTO/x575.html

L&L: References (cont.)
http://www.moses.uklinux.net/patches/lki-single.html
http://whatis.techtarget.com/definition/0,,sid9_gci212493,00.html
http://encyclopedia.thefreedictionary.com/position%20independent%20code
http://www.faqs.org/docs/Linux-HOWTO/Program-LibraryHOWTO.html
http://sources.redhat.com/autobook/autobook/autobook_71.html
http://www.educ.umu.se/~bjorn/linux/howto/ELF-HOWTO-1.html
http://www.tcfs.it/docs/manpages/BSD/gcc-howto-6.html
http://www.cs.ucdavis.edu/~haungs/paper/
http://www-106.ibm.com/developerworks/linux/library/ldll.html?dwzone=linux