Uploaded image for project: 'Commons Collections'
  1. Commons Collections
  2. COLLECTIONS-545

Undocumented performance issue in the removeAll method in CollectionUtils

    XMLWordPrintableJSON

Details

    Description

      This bug is analogous to https://issues.apache.org/jira/browse/COLLECTIONS-544

      The method removeAll in CollectionUtils is inefficient when the second parameter collection has a slow containment method.

      The following is the current implementation with its documentation:
      ============================
      /**

      • Removes the elements in <code>remove</code> from <code>collection</code>. That is, this
      • method returns a collection containing all the elements in <code>c</code>
      • that are not in <code>remove</code>. The cardinality of an element <code>e</code>
      • in the returned collection is the same as the cardinality of <code>e</code>
      • in <code>collection</code> unless <code>remove</code> contains <code>e</code>, in which
      • case the cardinality is zero. This method is useful if you do not wish to modify
      • the collection <code>c</code> and thus cannot call <code>collection.removeAll(remove);</code>.
        *
      • @param <E> the type of object the {@link Collection}

        contains

      • @param collection the collection from which items are removed (in the returned collection)
      • @param remove the items to be removed from the returned <code>collection</code>
      • @return a <code>Collection</code> containing all the elements of <code>collection</code> except
      • any elements that also occur in <code>remove</code>.
      • @throws NullPointerException if either parameter is null
      • @since 4.0 (method existed in 3.2 but was completely broken)
        */
        public static <E> Collection<E> removeAll(final Collection<E> collection, final Collection<?> remove) { return ListUtils.removeAll(collection, remove); }

      =======================================

      We can notice the inefficiency by looking at the removeAll method in ListUtils.
      The removeAll method from ListUtils is implemented and documented as follows:

      =======================================

      /**

      • Removes the elements in <code>remove</code> from <code>collection</code>. That is, this
      • method returns a list containing all the elements in <code>collection</code>
      • that are not in <code>remove</code>. The cardinality of an element <code>e</code>
      • in the returned collection is the same as the cardinality of <code>e</code>
      • in <code>collection</code> unless <code>remove</code> contains <code>e</code>, in which
      • case the cardinality is zero. This method is useful if you do not wish to modify
      • <code>collection</code> and thus cannot call <code>collection.removeAll(remove);</code>.
      • <p>
      • This implementation iterates over <code>collection</code>, checking each element in
      • turn to see if it's contained in <code>remove</code>. If it's not contained, it's added
      • to the returned list. As a consequence, it is advised to use a collection type for
      • <code>remove</code> that provides a fast (e.g. O(1)) implementation of
      • {@link Collection#contains(Object)}

        .
        *

      • @param <E> the element type
      • @param collection the collection from which items are removed (in the returned collection)
      • @param remove the items to be removed from the returned <code>collection</code>
      • @return a <code>List</code> containing all the elements of <code>c</code> except
      • any elements that also occur in <code>remove</code>.
      • @throws NullPointerException if either parameter is null
      • @since 3.2
        */
        public static <E> List<E> removeAll(final Collection<E> collection, final Collection<?> remove) {
        final List<E> list = new ArrayList<E>();
        for (final E obj : collection)
        Unknown macro: { if (!remove.contains(obj)) { list.add(obj); } }

        return list;
        }

      =======================================

      In the case of ListUtils:removeAll, the inefficiency is properly documented.

      Perhaps the disclaimer about potential inefficiencies depending on the type
      of the parameter collection in ListUtils:removeAll should also be included in CollectionUtils:removeAll.

      Attachments

        Activity

          People

            Unassigned Unassigned
            oswaldo_o Oswaldo Olivo
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: