Because your set of strings is fixed, you should look for a **perfect hash function**: a hash function built for one specific set of keys so that no two of those keys collide. There are many tools for generating such functions; one that I know is freely available is `gperf` (not to be confused with `gprof`). I would strongly suggest this.
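To make the idea concrete, here is a minimal sketch of a perfect hash found by brute force: it tries successive seeds for a salted hash until every key lands in a distinct slot. This is not how `gperf` works internally, only an illustration of the concept; `find_perfect_seed`, `perfect_hash`, and the `table_size` parameter are hypothetical names chosen for this example.

```python
import hashlib


def perfect_hash(key: str, seed: int, table_size: int) -> int:
    """Map key to a slot in [0, table_size) using a seed-salted BLAKE2b digest."""
    digest = hashlib.blake2b(
        key.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little") % table_size


def find_perfect_seed(keys, table_size, max_seed=100_000):
    """Return the first seed under which all keys get distinct slots, else None."""
    for seed in range(max_seed):
        slots = {perfect_hash(k, seed, table_size) for k in keys}
        if len(slots) == len(keys):  # no two keys share a slot
            return seed
    return None


if __name__ == "__main__":
    keys = ["apple", "banana", "cherry", "date", "elderberry"]
    seed = find_perfect_seed(keys, table_size=8)
    print(seed, [perfect_hash(k, seed, 8) for k in keys])
```

The search is only practical for small key sets, since the chance that a random seed is collision-free drops quickly as the table fills up. Dedicated generators use smarter constructions for larger sets.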

If you later need to change the set of strings and want a lightweight, simple hash function, you may want to consider the **Rabin-Karp rolling hash function**. It can be computed for a string of length n using O(n) additions, multiplications, and modulo operations, and if the base is chosen at random and the modulus is a large prime, any two distinct strings collide only with small probability. Moreover, you could probably code this up in about half an hour to test whether it performs better than the Adler checksum.
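As a rough sketch of what that might look like, the function below evaluates the Rabin-Karp polynomial over the bytes of a string using Horner's rule. The modulus 2^61 - 1 and the random base range are assumptions made for this example, and `make_poly_hash` is a hypothetical helper name, not part of any library.

```python
import random

# Assumed modulus: 2**61 - 1, a Mersenne prime that keeps values within 61 bits.
MODULUS = (1 << 61) - 1


def make_poly_hash(base=None):
    """Return a polynomial hash function; the base is random unless given."""
    if base is None:
        base = random.randrange(256, MODULUS)

    def poly_hash(s: str) -> int:
        h = 0
        # Horner's rule: one multiply, one add, one modulo per byte.
        for byte in s.encode("utf-8"):
            h = (h * base + byte) % MODULUS
        return h

    return poly_hash


if __name__ == "__main__":
    h = make_poly_hash()
    print(h("hello"), h("hellp"))
```

Fixing the base once per process keeps the hash values consistent across all your strings. Pass an explicit `base` if the values need to be reproducible between runs.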

That said, using a well-known hash function like MD5 is still probably a good idea if you aren't trying to achieve cryptographic security. Even a simple CRC32 might be sufficient in that case.
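If you want to try the off-the-shelf route first, something like the following should be enough for a quick comparison. It uses `zlib.crc32` and `hashlib.md5` from the Python standard library; the wrapper names and the choice to truncate MD5 to its first 8 bytes are assumptions made for this sketch.

```python
import hashlib
import zlib


def crc32_hash(s: str) -> int:
    """32-bit CRC of the UTF-8 bytes of s."""
    return zlib.crc32(s.encode("utf-8"))


def md5_hash64(s: str) -> int:
    """First 8 bytes of the MD5 digest, read as a big-endian 64-bit integer."""
    digest = hashlib.md5(s.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


if __name__ == "__main__":
    for word in ["alpha", "beta", "gamma"]:
        print(word, crc32_hash(word), md5_hash64(word))
```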
