Lucene - Core
  1. Lucene - Core
  2. LUCENE-3656

IndexReader's add/removeCloseListener should not use ConcurrentHashMap, just a synchronized set

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 3.5, 4.0-ALPHA
    • Fix Version/s: 3.6, 4.0-ALPHA
    • Component/s: core/index
    • Labels:
    • Lucene Fields:
      New

      Description

      The use-case for ConcurrentHashMap is when many threads are reading and less writing to the structure. Here this is just funny: The only reader is close(). Here you can just use a synchronized HashSet. The complexity of CHM is making this just a joke

      1. LUCENE-3656.patch
        5 kB
        Uwe Schindler
      2. LUCENE-3656.patch
        3 kB
        Uwe Schindler

        Activity

        Uwe Schindler created issue -
        Hide
        Robert Muir added a comment -

        +1

        Show
        Robert Muir added a comment - +1
        Uwe Schindler made changes -
        Field Original Value New Value
        Labels curiosity
        Hide
        Michael McCandless added a comment -

        +1

        Show
        Michael McCandless added a comment - +1
        Hide
        Uwe Schindler added a comment -

        Simple patch.

        I had to add some additional synchronization in the notifier, as the iterators are not thread-safe. Collections.synchronized*() returns a set, thats synced on itsself, so you can for more complex changes like iteration, sync on the result (the Set itsself), see for complete explanation: http://docs.oracle.com/javase/1.5.0/docs/api/java/util/Collections.html#synchronizedSet(java.util.Set)

        Show
        Uwe Schindler added a comment - Simple patch. I had to add some additional synchronization in the notifier, as the iterators are not thread-safe. Collections.synchronized*() returns a set, thats synced on itsself, so you can for more complex changes like iteration, sync on the result (the Set itsself), see for complete explanation: http://docs.oracle.com/javase/1.5.0/docs/api/java/util/Collections.html#synchronizedSet(java.util.Set)
        Uwe Schindler made changes -
        Attachment LUCENE-3656.patch [ 12507916 ]
        Hide
        Uwe Schindler added a comment -

        One more thing, because Robert reminded me:
        I changed to use a LinkedHashSet instead simple HashSet, because the caller adding listeners would expect that they are called in order on close. Adding them to a HashSet makes they executed in arbitrary order and thats not what a user expects. The Linked* does not cost us much, as the number of listeners should be low

        Show
        Uwe Schindler added a comment - One more thing, because Robert reminded me: I changed to use a LinkedHashSet instead simple HashSet, because the caller adding listeners would expect that they are called in order on close. Adding them to a HashSet makes they executed in arbitrary order and thats not what a user expects. The Linked* does not cost us much, as the number of listeners should be low
        Hide
        Simon Willnauer added a comment -

        Uwe, this seems a pretty good usecase for CopyOnWriteArraySet. No sync but thread-safe and reasonably cheap.

        Show
        Simon Willnauer added a comment - Uwe, this seems a pretty good usecase for CopyOnWriteArraySet. No sync but thread-safe and reasonably cheap.
        Hide
        Yonik Seeley added a comment -

        this seems a pretty good usecase for CopyOnWriteArraySet.

        +1, good call.

        Show
        Yonik Seeley added a comment - this seems a pretty good usecase for CopyOnWriteArraySet. +1, good call.
        Hide
        Uwe Schindler added a comment -

        I agree that for small ordered sets (like Handlers/Listender, see example from Javadocs) this is a good idea, but its still the wrong set for this use case:

        The listeners on close are registered by code that may (unlikely) come from different threads (e.g. FieldCacheImpl registers an event to purge the caches on IndexReader close, or maybe in future Solr registers a handler - this is generally done on setup of IndexReader). In general synchronization would not really be required at all, but as also different threads may register listeners, there should be some basic synchronization. If you would use CopyOnWriteArraySet, registering/removing new listeners gets slow, as it has to copy the array each time, so registering event handlers will not block but just be slow. On the other hand we have very fast access just for exactly one single iteration on thi set (when the listeners are triggered, on closing the reader). We get this for free without sync, but who cares, IndexReader.close() is the last operation on an IR where you have no concurrency anymore.

        I think we should stay with a simply synchronized LinkedHashSet which is cheap as concurrency is no issue at all (not many threads will ever register an event, the whole synchronization is just to guard the set for concurrent modifications (e.g. if two threads create a new FieldCache entry at same time). The addition cost for sync on IR.close() is a no-op, as explained above (no concurrecy anymore).

        Show
        Uwe Schindler added a comment - I agree that for small ordered sets (like Handlers/Listender, see example from Javadocs) this is a good idea, but its still the wrong set for this use case: The listeners on close are registered by code that may (unlikely) come from different threads (e.g. FieldCacheImpl registers an event to purge the caches on IndexReader close, or maybe in future Solr registers a handler - this is generally done on setup of IndexReader). In general synchronization would not really be required at all, but as also different threads may register listeners, there should be some basic synchronization. If you would use CopyOnWriteArraySet, registering/removing new listeners gets slow, as it has to copy the array each time, so registering event handlers will not block but just be slow. On the other hand we have very fast access just for exactly one single iteration on thi set (when the listeners are triggered, on closing the reader). We get this for free without sync, but who cares, IndexReader.close() is the last operation on an IR where you have no concurrency anymore. I think we should stay with a simply synchronized LinkedHashSet which is cheap as concurrency is no issue at all (not many threads will ever register an event, the whole synchronization is just to guard the set for concurrent modifications (e.g. if two threads create a new FieldCache entry at same time). The addition cost for sync on IR.close() is a no-op, as explained above (no concurrecy anymore).
        Hide
        Uwe Schindler added a comment -

        Any comments about my explanation to still use LHM, otherwise I will commit this in the evening?

        Show
        Uwe Schindler added a comment - Any comments about my explanation to still use LHM, otherwise I will commit this in the evening?
        Hide
        Michael McCandless added a comment -

        I think we should stay with a simply synchronized LinkedHashSet

        +1

        Show
        Michael McCandless added a comment - I think we should stay with a simply synchronized LinkedHashSet +1
        Hide
        Robert Muir added a comment -

        +1, its the right data structure here.

        Show
        Robert Muir added a comment - +1, its the right data structure here.
        Hide
        Uwe Schindler added a comment -

        Simple refactorings to make the code in IR and SCR/SR similar.

        Will commit this now.

        Show
        Uwe Schindler added a comment - Simple refactorings to make the code in IR and SCR/SR similar. Will commit this now.
        Uwe Schindler made changes -
        Attachment LUCENE-3656.patch [ 12508104 ]
        Uwe Schindler made changes -
        Fix Version/s 3.6 [ 12319070 ]
        Fix Version/s 4.0 [ 12314025 ]
        Hide
        Uwe Schindler added a comment -

        Committed trunk revision: 1221369
        Merged 3.x revision: 1221371

        Show
        Uwe Schindler added a comment - Committed trunk revision: 1221369 Merged 3.x revision: 1221371
        Uwe Schindler made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Uwe Schindler made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Uwe Schindler
            Reporter:
            Uwe Schindler
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development