Quirks of Shared Libraries

Static & Dynamic Libraries

With a static library, the linker copies the required functions out of the library and into the executable at link time. With a dynamic library, the link step only records the dependency; the dynamic linker (loader) maps the library into the process when the executable is run and resolves the missing symbols then.

Static libraries have an obvious benefit: a single executable file that contains all the code it requires. Dynamic libraries have benefits too: multiple executables can share the same in-memory copy of the library's code and read-only data.

Dynamic libraries are typically compiled with the gcc flag -fPIC (Position Independent Code) and linked with -shared. With -fPIC, the generated code refers to its own functions and data through relative offsets rather than absolute addresses. Without it, references would be absolute addresses that are valid at only one load address. Relative addressing allows the library to be loaded at any address, so a single copy in physical memory can be mapped into the virtual address spaces of different processes and reused by all of them.
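
As a rough sketch of the two build styles described above, the file below is a one-function library with the usual gcc and ar commands in its header comment. The names calc.c, libcalc, calc_add and main.c are hypothetical and chosen only for illustration; main.c stands for any program that calls calc_add().

```c
/* calc.c: hypothetical one-function library used to contrast the two builds.
 *
 * Static library (the code is copied into the executable at link time):
 *   gcc -c calc.c -o calc.o
 *   ar rcs libcalc.a calc.o
 *   gcc main.c libcalc.a -o main_static
 *
 * Dynamic library (the code is mapped into the process at run time):
 *   gcc -fPIC -c calc.c -o calc_pic.o
 *   gcc -shared calc_pic.o -o libcalc.so
 *   gcc main.c -L. -lcalc -o main_dynamic
 *   LD_LIBRARY_PATH=. ./main_dynamic
 */

/* Public function exported by the library. */
int calc_add(int a, int b)
{
    return a + b;
}
```

The source file is identical in both cases; only the compile and link steps differ.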

Symbol Interposition

A dynamic library maintains a table of pointers to its public functions (PLT, the procedure linkage table) and a table of pointers to its global variables (GOT, the global offset table). Calls to these public functions and accesses to these global variables, whether from within the dynamic library or from an external executable, are routed through these two tables. If the executable defines a global variable or public function with the same name, that definition overrides the library's own for every reference, both from within the library and from outside. Only the symbol name is matched; the linker and loader do not compare types or signatures.
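
The effect can be sketched with a small library in which one public function calls another. The names msg.c, libmsg.so, msg_name and msg_greeting are hypothetical, and the behaviour described in the comments assumes gcc on Linux without optimization; an optimizing compiler, or a compiler that assumes no interposition, may bind the internal call directly.

```c
/* msg.c: hypothetical library, assumed to be built with
 *   gcc -O0 -fPIC -shared msg.c -o libmsg.so
 */

/* Public function: another module can interpose it by defining msg_name(). */
const char *msg_name(void)
{
    return "library";
}

/* Calls msg_name() by name. In a -fPIC shared library this call is
 * resolved at run time, so the result is "library" only if no earlier
 * module in the search order defines msg_name().
 *
 * If a hypothetical main.c linked against libmsg.so contains
 *     const char *msg_name(void) { return "executable"; }
 * then msg_greeting() is expected to return "executable" instead. */
const char *msg_greeting(void)
{
    return msg_name();
}
```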

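With static libraries, interposition works similarly: global functions and variables defined in the main executable take precedence over those in the library, because the linker only pulls an object out of the archive to satisfy a symbol that is still undefined.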

Unwanted symbol interposition can be avoided. When a function must remain visible outside the shared library, yet every call to it from within the library must bind to the library's own definition, give it protected visibility: __attribute__((visibility("protected"))). See the gcc documentation on Common Function Attributes.
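
A minimal sketch of the attribute in use follows. The names cfg.c, libcfg.so, cfg_name and cfg_describe are hypothetical, and the comments assume gcc targeting ELF on Linux.

```c
/* cfg.c: hypothetical library, assumed to be built with
 *   gcc -fPIC -shared cfg.c -o libcfg.so
 */

/* Still exported, so other modules can call cfg_name(). Because of the
 * protected visibility, references from inside this library bind to this
 * definition even if the executable defines its own cfg_name(). */
__attribute__((visibility("protected")))
const char *cfg_name(void)
{
    return "library";
}

/* This internal call is expected to always reach the definition above. */
const char *cfg_describe(void)
{
    return cfg_name();
}
```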

Interposition can be used to call a function of the same name from a library other than the one the executable was linked against. The LD_PRELOAD environment variable gives the loader a list of libraries to load before any of the executable's other shared libraries. A preloaded library can define a function with the same name, and the loader then binds the PLT entry to that definition, ahead of the one in any library loaded afterwards.
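
For illustration, a preload library that replaces the C library's rand() could look like the sketch below. The file name unrandom.c, the library name and some_program are hypothetical; only rand() and LD_PRELOAD are real.

```c
/* unrandom.c: hypothetical preload library, assumed to be built and used as
 *   gcc -fPIC -shared unrandom.c -o libunrandom.so
 *   LD_PRELOAD=./libunrandom.so ./some_program
 */

/* Same name as the C library's rand(). Because the preloaded library comes
 * first in the search order, calls to rand() from some_program are expected
 * to land here instead of in the C library. */
int rand(void)
{
    return 42;
}
```

Without LD_PRELOAD set, some_program runs unchanged, which makes this a convenient way to swap an implementation without relinking.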

Code Injection

Interposition can also be used to inject code. A call to a function in a shared library can be overridden by preloading another library that defines a function with the same name. This function can execute some code and then use dlsym(RTLD_NEXT, "function_name") to look up the function provided by the original library and invoke it. The RTLD_NEXT pseudo-handle asks for the next occurrence of the symbol in the library search order after the current library.
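
The wrapper pattern might be sketched as follows, assuming glibc on Linux. The file name trace_rand.c, the library name, some_program and the rand_calls counter are hypothetical; rand() is used only because it is small and easy to observe.

```c
/* trace_rand.c: hypothetical preload library, assumed to be built and used as
 *   gcc -fPIC -shared trace_rand.c -o libtrace_rand.so -ldl
 *   LD_PRELOAD=./libtrace_rand.so ./some_program
 */
#define _GNU_SOURCE /* needed for RTLD_NEXT in <dlfcn.h> on glibc */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/* Number of intercepted calls; exists only for this illustration. */
int rand_calls = 0;

int rand(void)
{
    static int (*real_rand)(void);

    if (real_rand == NULL) {
        /* Look up the next definition of "rand" after this object. */
        real_rand = (int (*)(void))dlsym(RTLD_NEXT, "rand");
        if (real_rand == NULL) {
            const char *err = dlerror();
            fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol not found");
            abort();
        }
    }

    /* Injected code: count and log the call. */
    rand_calls++;
    fprintf(stderr, "rand() call #%d\n", rand_calls);

    /* Forward to the original implementation. */
    return real_rand();
}
```

Caching the pointer in a static variable avoids repeating the lookup on every call.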

-fPIC

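The -fPIC compiler flag makes the code section position independent: calls to public functions are routed through the PLT and accesses to public data through the GOT, while static data is typically addressed relative to the code or to the GOT base, so it needs no absolute address either.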
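
One way to see this is to compile a small file with and without the flag and compare the assembly. The sketch below assumes gcc on x86-64 Linux; the file name counter.c and all identifiers in it are hypothetical, and the exact instructions depend on the compiler version and target.

```c
/* counter.c: hypothetical library for inspecting what -fPIC changes.
 *   gcc -O0 -fPIC   -S counter.c -o counter_pic.s   (position independent)
 *   gcc -O0 -fno-pic -S counter.c -o counter_abs.s  (absolute addressing)
 *   gcc -fPIC -shared counter.c -o libcounter.so
 *   readelf -r libcounter.so   (lists the GOT and PLT relocations)
 */

int counter_total = 0;        /* public data: with -fPIC, reached via the GOT */
static int counter_calls = 0; /* static data: addressed relative to the code */

/* Public function. */
int counter_step(void)
{
    return 1;
}

int counter_bump(void)
{
    counter_calls++;
    counter_total += counter_step(); /* with -fPIC, typically a call via the PLT */
    return counter_total;
}

int counter_call_count(void)
{
    return counter_calls;
}
```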


On 32-bit Linux, a shared library can still be created without the -fPIC flag; the loader then has to patch the absolute addresses in the code at load time (text relocations), so the patched code pages can no longer be shared between processes. On 64-bit systems the problem takes a different form: without -fPIC, 32-bit absolute addresses are used for static arrays, which fails if the shared library is loaded at an address above 2 GB, and the linker typically rejects such code when building a shared library. A partial solution on 64-bit systems is the -fPIE flag; however, it restricts the creation of public variables.