Hi all, we have a number of places in our system where matching an input word against a table using binary search. An example is Roy's terminfo code. This is suboptimal from a performance point of view. I'd like to fix this issue in two steps. First, I'd like the integrate the Jenkins hash function into libc. It is using only simple operations and known to provide few collisions. It can be used with power-of-two hash tables as well as modular hashes. As the man page states, the prototype goes into string.h. The primary reason to make this libc and not e.g. libutil is to allow all parts of the system to use it without pulling in libutil. This applies e.g. to the new libterminfo. The second part is the provided minimal perfect hash function generator. The generated hash function is very simple, using two lookups, a Jenkins hash and three unsigned constant divisions. Space usage is c * n * ld n * x Bytes with: n = number of keys c >= 2 x = 1 for cn < 256, x = 2 for 256 <= cn < 65536, x = 4 otherwise The process itself is propabilistic and generally quite fast. To process all non-comment lines of /etc/services on my laptop, it needs around 6s. The output is much smaller than the function created by gperf, but doesn't contain the check if key actually is valid. E.g. you still have to do a strcmp after that. The generator is also much faster than gperf. Comments? Joerg
Attachment:
nbperf.tar.gz
Description: application/tar-gz