Uploaded image for project: 'Groovy'
  1. Groovy
  2. GROOVY-5785

HashSet removeAll() operations are slow

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 2.0.5
    • None
    • groovy-jdk
    • None

    Description

      Consider the following (simple but not very Groovy) code:

      def hashSet1 = (0..<100000) as HashSet
      def hashSet2 = (50000..<150000) as HashSet
      def hashSet3 = hashSet2 - hashSet1
      println hashSet3.size()

      This script takes a very long time to complete due to the implementation of the minus() operation used in computing hashSet3 since it is a non-linear operation. By comparison, using Java methods it completes in less than a second:

      def hashSet1 = (0..<100000) as HashSet
      def hashSet2 = (50000..<150000) as HashSet
      def hashSet3 = hashSet2.clone()
      hashSet3.removeAll(hashSet1)
      println hashSet3.size()

      It seems to me that a simple fix is just to call removeAll() in Collection or worst case just for HashSets to solve this problem but it seems too simple (maybe implications for equals() that I'm missing?). If I'm not missing anything, then I'd be happy to create a patch which fixes this performance bug but I just want to make sure that I'm not missing anything.

      Attachments

        Activity

          People

            Unassigned Unassigned
            balor123 Uri Moszkowicz
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated: