Details
-
Improvement
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
Description
ListUtils.intersect is quite slow and can be improved by using a HashSet for the contains operation which cuts the complexity from n^2 to n. I ran into this by intersecting two lists with a few hundred thousand elements.
current:
public static List intersection(final List list1, final List list2) {
final ArrayList result = new ArrayList();
final Iterator iterator = list2.iterator();
while (iterator.hasNext()) {
final Object o = iterator.next();
if (list1.contains(o))
{ result.add(o); }}
return result;
Basically would work by inserting list1 into a HashSet like this:
Set objs = new HashSet();
objs.addAll(list1);
and then instead of list1.contains do a objs.contains
Further performance can be gained by picking the smallest list for this with a simple size comparison.
BTW what is this method supposed to do with duplicate entries in lists? Semantics are not really clear here as opposed to set intersection.