Datafu should offer the murmur3 hash.
The attached patch uses Guava to add murmur3 (a fast hash with good statistical properties), SipHash-2-4 (a fast cryptographically secure hash), crc32, adler32, md5 and sha.
From the javadoc:
- 'murmur3-32', [optional seed] or 'murmur3-128', [optional seed]: Returns a murmur3 hash of the given length. Murmur3 is fast and has exceptionally good statistical properties; it's a good choice if all you need is good mixing of the inputs. It is not cryptographically secure; that is, given an output value from murmur3, there are efficient algorithms to find an input yielding the same output value. Supply the seed as a string that Integer.decode can handle.
- 'sip24', [optional seed]: Returns a 64-bit SipHash-2-4. SipHash is competitive in performance with Murmur3, and is simpler and faster than the cryptographic algorithms below. When used with a seed, it can be considered cryptographically secure: given the output from a sip24 instance but not the seed used, we cannot efficiently craft a message yielding the same output from that instance.
- 'adler32': Returns an Adler-32 checksum (32 hash bits) by delegating to Java's Adler32 Checksum.
- 'crc32': Returns a CRC-32 checksum (32 hash bits) by delegating to Java's CRC32 Checksum.
- 'md5': Returns an MD5 hash (128 hash bits) using Java's MD5 MessageDigest.
- 'sha1': Returns a SHA-1 hash (160 hash bits) using Java's SHA-1 MessageDigest.
- 'sha256': Returns a SHA-256 hash (256 hash bits) using Java's SHA-256 MessageDigest.
- 'sha512': Returns a SHA-512 hash (512 hash bits) using Java's SHA-512 MessageDigest.
- 'good-(integer number of bits)': Returns a general-purpose, non-cryptographic-strength, streaming hash function that produces hash codes at least as long as the requested number of bits. Users without specific compatibility requirements and who do not persist the hash codes are encouraged to choose this hash function. (Cryptographers, like dieticians and fashionistas, occasionally realize that We've Been Doing it Wrong This Whole Time. Using 'good-*' lets you track What the Experts From (Milan|NIH|IEEE) Say To (Wear|Eat|Hash With) this Fall.) Values for this hash will change from run to run.
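As a rough illustration of the JDK primitives the list above says the checksum and digest options delegate to, here is a minimal sketch using only the standard library. The class `HashSizes` and its helper methods are hypothetical names made up for this example; they are not part of the patch, and the patch itself goes through Guava rather than calling these classes directly. The last line of `main` shows the kinds of seed strings Integer.decode accepts (decimal, hex, octal).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.Adler32;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical helper class for illustration only; not part of the patch.
class HashSizes {
    // Number of bits in the digest produced by a JDK MessageDigest algorithm.
    static int digestBits(String algorithm, byte[] input) {
        try {
            return MessageDigest.getInstance(algorithm).digest(input).length * 8;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown digest: " + algorithm, e);
        }
    }

    // Feed the input to a java.util.zip Checksum and return its value,
    // which occupies the low 32 bits of the returned long.
    static long checksum(Checksum c, byte[] input) {
        c.update(input, 0, input.length);
        return c.getValue();
    }

    public static void main(String[] args) {
        byte[] input = "hello".getBytes(StandardCharsets.UTF_8);
        for (String alg : new String[] {"MD5", "SHA-1", "SHA-256", "SHA-512"}) {
            System.out.println(alg + ": " + digestBits(alg, input) + " bits");
        }
        System.out.printf("crc32:   %08x%n", checksum(new CRC32(), input));
        System.out.printf("adler32: %08x%n", checksum(new Adler32(), input));
        // Seed strings in decimal, hex, and octal all decode to the same int.
        System.out.println(Integer.decode("42") + " " + Integer.decode("0x2A")
                + " " + Integer.decode("052"));
    }
}
```

The digest lengths reported are 128, 160, 256 and 512 bits for MD5, SHA-1, SHA-256 and SHA-512 respectively, matching the bit counts in the list.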
Important notes about this patch:
- It should be applied after the patches for DATAFU-46 and DATAFU-48.
- It expands the dependence on Guava. Does pull req 75 mean there's momentum to de-Guava datafu?
- The patch has commented-out code that shows what life would be like if the sip24, crc32 and adler32 hashes were available. On your advice, I will either (a) put in a patch removing the spurious comments or (b) file a separate bug to update guava, push in a patch for that, and put in a patch restoring to glory the extra hashes.