Subject: New idea on ELF prebinding
To: None <tech-userlevel@netbsd.org>
From: Bang Jun-Young <junyoung@netbsd.org>
List: tech-userlevel
Date: 11/22/2002 15:18:28
Hi folks,
Here's a summary of what I have been thinking about ELF prebinding
since people raised objections a couple of times to my previous
not-so-good implementation ;-):
- Every binary, whether executable or shared object, has a .csum
section inserted by ld(1) at link time. It is 32 bits long and
stores a checksum (CRC32) of the binary.
- Actual prebinding and prerelocation are done by ld.elf_so(1). After
ld.elf_so(1) loads a binary for the first time, it creates a disk
file in /usr/libexec/reloc (call it the "cache") and writes all of the
relocated GOT and PLT sections in memory to that file (along with the
checksum and other necessary information). In any subsequent execution
of the same binary, ld.elf_so(1) no longer performs relocation.
Instead it loads the cache from the previously created disk file and
compares the cache information with the in-memory data. If they don't
differ, it patches GOT/PLT pointers so that they point to locations in
the cache. But if they differ, ld.elf_so(1) does the same job as on
the first load.
- As time goes by, more caches will accumulate in /usr/libexec/reloc.
If needed, an elfreld(1) daemon regularly checks whether they are still
valid and removes invalid files. Or you can remove all of them, and ld.elf_so(1)
will perform the same job again for each binary it loads.
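To make the checksum and cache check above a bit more concrete, here is
a minimal sketch in C. It is only an illustration, not existing
ld.elf_so(1) code: the header layout (struct reloc_cache_hdr), the
magic value, and the function names crc32_buf() and reloc_cache_valid()
are all hypothetical. It assumes the cache file starts with a small
header that records the binary's .csum value and the sizes of the saved
GOT and PLT images, and that the cache is used only when all of them
match what the loader sees in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Standard reflected CRC-32 (polynomial 0xEDB88320), computed bit by
 * bit for brevity. A real loader would more likely use a table.
 */
uint32_t
crc32_buf(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t crc = 0xFFFFFFFFu;

	assert(p != NULL || len == 0);
	while (len-- > 0) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical on-disk header at the start of a cache file. */
#define RELOC_CACHE_MAGIC 0x52454c43u	/* arbitrary tag, an assumption */

struct reloc_cache_hdr {
	uint32_t magic;		/* identifies a cache file */
	uint32_t csum;		/* .csum value of the binary when cached */
	uint32_t got_size;	/* size of the saved GOT image, in bytes */
	uint32_t plt_size;	/* size of the saved PLT image, in bytes */
};

/*
 * Return 1 if the cache header matches the binary being loaded, so the
 * saved GOT/PLT images may be used; return 0 if the loader must fall
 * back to ordinary relocation.
 */
int
reloc_cache_valid(const struct reloc_cache_hdr *h, uint32_t bin_csum,
    uint32_t got_size, uint32_t plt_size)
{
	if (h == NULL || h->magic != RELOC_CACHE_MAGIC)
		return 0;
	if (h->csum != bin_csum)
		return 0;
	return h->got_size == got_size && h->plt_size == plt_size;
}
```

In this sketch the loader would read the header from the cache file,
call reloc_cache_valid() with the values taken from the binary it has
just mapped, and relocate normally whenever the call returns 0. How the
checksum treats the .csum section itself is left open here.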
Advantages of this method include:
- Minimal modification to the binary. Only .csum is inserted, and it is
ignored by old versions of ld.elf_so(1).
- No additional executable is required (elfreld(1) is fully optional).
- It doesn't break ELF semantics. The cache is just an image of the
in-memory data after relocation is done.
- It is much simpler and can be (significantly) faster than our
competitor's implementation (prelinking in Red Hat 8.0). You don't
even have to bother to run prelink against newly created binaries in
the system regularly. Everything is automagically done by ld.elf_so(1).
- Better CPU cache utilization is possible, since it is likely
that all the GOT/PLT entries for a binary and shared objects it
depends on are stored together in a single page, or at least, adjacent
in memory.
Disadvantages include:
- When ld.elf_so(1) loads a binary for the first time, it takes more
time (rarely, much more) to get it done, since it creates the cache
and writes it to disk.
- Security considerations (?).
- (please put your comments here ;-).
Comments would be appreciated,
Jun-Young
--
Bang Jun-Young <junyoung@netbsd.org>