Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Won't Fix
-
3.2.1
-
None
-
None
-
None
-
java 1.6.0_24
Ubuntu 11.10
Description
Hi,
I am encountering a performance problem in ListUtils.retainAll(). It
appears in version 3.2.1 and also in revision 1355448. I attached a
test that exposes this problem and a one-line patch that fixes it. On
my machine, for this test, the patch provides a 238X speedup.
To run the test, just do:
$ java Test
The output for the un-patched version is:
Time is 5485
The output for the patched version is:
Time is 23
As the patch shows, the problem is that
"ListUtils.retainAll(Collection<E> collection, Collection<?> retain)"
performs "retain.contains(obj)" for each element in "collection",
which can be very expensive if "retain.contains(obj)" is expensive,
e.g., when "retain" is a list.
The one-line patch I attached puts the elements of "retain" in a
HashSet (which has very fast "contains()"), if "retain" is not already
a set:
"if (!(retain instanceof java.util.Set<?>)) retain = new HashSet<Object>(retain);"
Is this a bug, or am I misunderstanding the intended behavior? If so,
can you please confirm that the patch is correct?
Thanks,
Adrian