Uploaded image for project: 'Groovy'
  1. Groovy
  2. GROOVY-7448

AbstractConcurrentMap performing rehash() on every insert



    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 2.1.2, 2.4.3
    • 2.4.4
    • groovy-jdk
    • None


      The problem happens in inner class org.codehaus.groovy.util.AbstractConcurrentMap.Segment.put() method (line 105):

      When the current count is greater then map's threshold a rehash() happens. Now the tricky part is, that the Map holds soft references to objects and rehash() validates those references. So when GC discards soft references (e.isValid() == false), the resulting segment doesn't expand (as it is assumed in the put() method).

      in the last line of rehash() the Segmen't internal counter is updated count = newCount (which is the number of undiscarded (e.isValid() == true) references and can be smaller than previous count as described above)

      after rehash() is done, the put() method continues, however the buggy part is, that it disregards the previous setting of the internal count and it sets the previous count+1 value in every case on lines 124, 143 and 159 of AbstractConcurrentMapBase class.
      Since the rehash() is called only from put() method, the internal counter is not synchronized with real elements count contained in underlying array and rehash() is called on all subsequent calls of put() method.

      The following steps are happening:

      1) Map state: threshold = 786432; count=786432
      2) New element is inserted into the map: count = 786433; threshold = 786432
      3) since new count would be greater than threshold rehash() happens
      rehash() finds out, that most of the objects were garbage collected, thus it doesn't increase the size of the Segment, however anyway it copies all the objects from one table to another (System.arrayCopy()). (question: why this is done, in this case the old array with repaired invalid elements whould be sufficient ...? )
      4) rehash() sets the internal count to new value, which is smaller, because many objects were garbage collected (soft references), lets say: count = 486 000
      5) The put() continues, disregards the count = 486 000 and sets the count to count = 786433
      6) When another element is inserted, the count is still greater than threshold so the rehash happens again
      From now on every element added to map will trigger a rehash(), which has a huge performance impact.

      When this happens in multithreaded environment, all the other threads are waiting (parked) for lock() until the rehash() and put() is done (and then the next one is doing the rehash() again and so on). You can imagine what performance impact this is.

      Proposed solution:

      Update the c variable after rehash is done. Between lines 105 and 106 of org.codehaus.groovy.util.AbstractConcurrentMap add:
      104: if (c++ > threshold)

      { 105: rehash(); 106: c = count + 1 107:}




            glaforge Guillaume Laforge
            barti Tomas Bartalos
            0 Vote for this issue
            3 Start watching this issue