Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Won't Fix
Affects Version/s: 3.2.1
Fix Version/s: None
Component/s: None
Labels: None
Environment: java 1.6.0_24, Ubuntu 11.10
Description
Hi,
I am encountering a performance problem in ListUtils.removeAll(). It
appears in version 3.2.1 and also in revision 1355448. I attached a
test that exposes this problem and a one-line patch that fixes it. On
my machine, for this test, the patch provides a 217X speedup.
To run the test, just do:
$ java Test
The output for the un-patched version is:
Time is 5430
The output for the patched version is:
Time is 25
As the patch shows, the problem is that
"ListUtils.removeAll(Collection<E> collection, Collection<?> remove)"
performs "remove.contains(obj)" for each element in "collection".
This is very expensive when "remove.contains(obj)" is itself slow,
e.g., when "remove" is a list and every lookup is a linear scan.
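To make the cost concrete, here is a minimal sketch of the loop described above. It is written from the description in this report, not copied from the library source, and the class name RemoveAllSketch is made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class RemoveAllSketch {
    // Hypothetical stand-in for the behavior described in this report;
    // not the actual ListUtils source.
    public static <E> List<E> removeAll(Collection<E> collection, Collection<?> remove) {
        List<E> list = new ArrayList<E>();
        for (E obj : collection) {
            // When "remove" is a List, contains() scans it linearly, so the
            // whole loop costs about collection.size() * remove.size() comparisons.
            if (!remove.contains(obj)) {
                list.add(obj);
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> collection = Arrays.asList(1, 2, 3, 4, 5);
        List<Integer> remove = Arrays.asList(2, 4);
        System.out.println(removeAll(collection, remove)); // prints [1, 3, 5]
    }
}
```

With two lists of n elements each, this loop does on the order of n * n element comparisons in the worst case.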
The one-line patch I attached puts the elements of "remove" in a
HashSet (which has very fast "contains()"), if "remove" is not already
a set:
"if (!(remove instanceof java.util.Set<?>)) remove = new HashSet<Object>(remove);"
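For readers without the attachment, the following sketch shows where that line would sit in a method of this shape. The surrounding method body and the class name PatchedRemoveAllSketch are assumptions for illustration; only the "instanceof Set" line comes from the patch quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PatchedRemoveAllSketch {
    // Hypothetical method body; only the HashSet conversion is from the patch.
    public static <E> List<E> removeAll(Collection<E> collection, Collection<?> remove) {
        // The one-line patch: copy "remove" into a HashSet unless it is already a Set.
        if (!(remove instanceof Set<?>)) {
            remove = new HashSet<Object>(remove);
        }
        List<E> list = new ArrayList<E>();
        for (E obj : collection) {
            // HashSet.contains() is expected constant time, so the loop is now
            // roughly linear in collection.size() + remove.size().
            if (!remove.contains(obj)) {
                list.add(obj);
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> collection = Arrays.asList(1, 2, 3, 4, 5);
        List<Integer> remove = Arrays.asList(2, 4);
        System.out.println(removeAll(collection, remove)); // prints [1, 3, 5]
    }
}
```

Two trade-offs of this approach: the copy costs extra memory proportional to the size of "remove", and HashSet membership relies on equals() and hashCode(), so results could differ for a "remove" collection whose own contains() uses other rules.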
Is this a bug, or am I misunderstanding the intended behavior? If it
is a bug, can you please confirm that the patch is correct?
Thanks,
Adrian