Details
-
Improvement
-
Status: Open
-
Major
-
Resolution: Unresolved
-
4.0
-
None
-
Ubuntu 14.04
Description
CollectionUtils provides a linear time implementation of containsAll using additional linear space.
There is a linear time implementation of retainAll as follows:
// Eliminates from coll1 all elements that are not in coll2. // It runs in O(m+n) size, requiring additional O(m) space. public static boolean efficientRetainAll(final Collection<?> coll1,final Collection<?> coll2) { // If coll2 is empty there are no elements to retain. if(coll2.isEmpty()) { return coll1.removeAll(coll1); } // Simple case when we're supposed to retain all elements // in the first collection. if(coll1==coll2) return false; // it1 iterates over coll1 and it2 iterares over coll2, // and seen contains all the elements before it2, allowing // us to never revisit previous elements. // The algorithm iterates through it1, checks to see if we've // already seen the current element of it1 via a Hashset // efficient check, or traverses elements of it2 until we find // it or it2 ends. At each iteration over it2 we add the // elements to seen to avoid revisiting items. Iterator<?> it1 = coll1.iterator(); Iterator<?> it2 = coll2.iterator(); HashSet<Object> seen = new HashSet<Object>(); boolean changed=false; // Traverse all the elements in coll1. while(it1.hasNext()) { final Object o=it1.next(); // If we've seen this element in coll2, keep it. if(seen.contains(o)) continue; // Otherwise, check for its containment in coll2, while // adding the elements to seen. boolean contained=false; while(it2.hasNext()) { final Object o2=it2.next(); seen.add(o2); // Found the element in coll2. if(o2.equals(o)) { contained=true; break; } } // If the element was not found in coll2, remove it from it1. if(!contained) { changed=true; it1.remove(); } } return changed; }
Besides providing this in CollectionUtils for the user, this function could be
beneficial within other functions, where the existing retainAll exhibits a significant slowdown for large collections:
1) https://issues.apache.org/jira/browse/COLLECTIONS-534
2) https://issues.apache.org/jira/browse/COLLECTIONS-548