Details
-
Bug
-
Status: Open
-
Minor
-
Resolution: Unresolved
-
3.17.0
-
None
-
None
Description
XOR is commutative so its already a poor choice for pairs as both permutations {A, B} and {B, A} will hash to the same value. Triples are worse as all 6 permutations have the same hash code. Also of note is that all Pairs {A, A} hash to 0.
There are way more collisions than that with large numbers of values; e.g. if we randomly generate 2.5 million Strings, and 8 million pairs of those, inserting those Pairs into a HashSet takes longer than an hour. Comparably, inserting the same 8 million pairs into a TreeSet takes 10 seconds, and inserting 8 million Strings into a HashSet takes a few seconds.
Did some digging into HashMap implementation and its never using compareTo() on the Pairs on the collisions. That could be a separate issue and a bug in HashMap - Pair is comparable and HashMap should use it but it instead degrades to linked list-like performance. The worst collisions had 150,000+ Pairs with the same hashCode value. But it may be prudent to examine why ImmutablePair<String, String> doesn't satisfy the logic in HashMap.comparableClassFor() - https://github.com/openjdk/jdk/blob/bf0debc023a42ccdf2f589039e4d98e11424b4dd/src/java.base/share/classes/java/util/HashMap.java#L345 as it may be possible and prudent to work around it too.