Lucene - Core
  1. Lucene - Core
  2. LUCENE-1609

Eliminate synchronization contention on initial index reading in TermInfosReader ensureIndexIsRead

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.9
    • Fix Version/s: 2.9
    • Component/s: core/index
    • Labels:
      None
    • Environment:

      Solr
      Tomcat 5.5
      Ubuntu 2.6.20-17-generic
      Intel(R) Pentium(R) 4 CPU 2.80GHz, 2Gb RAM

    • Lucene Fields:
      New

      Description

      synchronized method ensureIndexIsRead in TermInfosReader causes contention under heavy load

      Simple to reproduce: e.g. Under Solr, with all caches turned off, do a simple range search e.g. id:[0 TO 999999] on even a small index (in my case 28K docs) and under a load/stress test application, and later, examining the Thread dump (kill -3) , many threads are blocked on 'waiting for monitor entry' to this method.

      Rather than using Double-Checked Locking which is known to have issues, this implementation uses a state pattern, where only one thread can move the object from IndexNotRead state to IndexRead, and in doing so alters the objects behavior, i.e. once the index is loaded, the index nolonger needs a synchronized method.

      In my particular test, this uncreased throughput at least 30 times.

      1. LUCENE-1609.patch
        44 kB
        Michael McCandless
      2. LUCENE-1609.patch
        50 kB
        Michael McCandless
      3. LUCENE-1609.patch
        3 kB
        Dan Rosher
      4. LUCENE-1609.patch
        3 kB
        Dan Rosher

        Issue Links

          Activity

          Hide
          Uwe Schindler added a comment -

          Are you sure, this works correct? If the indexState is changed in the synchronized block, another thread not synchronizing on the lock may still see the old indexState. At least, the indexState must be volatile, but this only works correct with Java 1.5 (and Lucene only needs Java 1.4 as requirement).

          Show
          Uwe Schindler added a comment - Are you sure, this works correct? If the indexState is changed in the synchronized block, another thread not synchronizing on the lock may still see the old indexState. At least, the indexState must be volatile, but this only works correct with Java 1.5 (and Lucene only needs Java 1.4 as requirement).
          Hide
          Earwin Burrfoot added a comment - - edited

          The problem is not with indexState not being volatile. You can unsafely publish objects that have no internal state, or their state is consistent enough for you under any memory visibility/reordering effects. See working example of it in LUCENE-1607, Yonik's hash for interning strings.

          The problem is that indexState guards indexTerms, indexInfos, indexPointers, which aren't volatile too and aren't guarded by the lock. It is possible that one thread does load these fields and then sets indexState = new IndexRead(), but another thread sees only the last write and dies with NPE.

          The thing I don't get, is why do we want lazy loading here at all? Is there any usage for TermInfosReader that prevents it from initializing in constructor?

          Show
          Earwin Burrfoot added a comment - - edited The problem is not with indexState not being volatile. You can unsafely publish objects that have no internal state, or their state is consistent enough for you under any memory visibility/reordering effects. See working example of it in LUCENE-1607 , Yonik's hash for interning strings. The problem is that indexState guards indexTerms, indexInfos, indexPointers, which aren't volatile too and aren't guarded by the lock. It is possible that one thread does load these fields and then sets indexState = new IndexRead(), but another thread sees only the last write and dies with NPE. The thing I don't get, is why do we want lazy loading here at all? Is there any usage for TermInfosReader that prevents it from initializing in constructor?
          Hide
          Uwe Schindler added a comment -

          You could fix this, if you put all these field into the state object, too (an abstract class instead of interface containing these variables) and cloning those on creating the new state. But then you have the mentioned problem, that one thread may exchange the state object to a IndexRead, but another one still sees the reference to the IndexNotRead object, not used any longer. As log as you not also sychronize the state object change or make it volatile in Java 1.5 it will still not work. That was, what I meant.
          In my opinion, this is not fixable in any case with these type of state objects, yes?

          Show
          Uwe Schindler added a comment - You could fix this, if you put all these field into the state object, too (an abstract class instead of interface containing these variables) and cloning those on creating the new state. But then you have the mentioned problem, that one thread may exchange the state object to a IndexRead, but another one still sees the reference to the IndexNotRead object, not used any longer. As log as you not also sychronize the state object change or make it volatile in Java 1.5 it will still not work. That was, what I meant. In my opinion, this is not fixable in any case with these type of state objects, yes?
          Hide
          Earwin Burrfoot added a comment -

          You cannot put all these fields into state object, because you introduce state to it and it can no longer be unsafely published.

          > one thread may exchange the state object to a IndexRead, but another one still sees the reference to the IndexNotRead object
          Nothing terrible here, a thread hitting stale IndexNotRead synchronizes and short-circuits in the beginning of the method. The problem is that seeing proper state object doesn't guarantee seeing fields it is supposed to guard

          Yes, it's not fixable here without volatile or proper synchronization. But I still have a feeling that lazy loading (and consequent synchronization) is not needed here at all.

          Show
          Earwin Burrfoot added a comment - You cannot put all these fields into state object, because you introduce state to it and it can no longer be unsafely published. > one thread may exchange the state object to a IndexRead, but another one still sees the reference to the IndexNotRead object Nothing terrible here, a thread hitting stale IndexNotRead synchronizes and short-circuits in the beginning of the method. The problem is that seeing proper state object doesn't guarantee seeing fields it is supposed to guard Yes, it's not fixable here without volatile or proper synchronization. But I still have a feeling that lazy loading (and consequent synchronization) is not needed here at all.
          Show
          Yonik Seeley added a comment - Re: why the lazy load http://www.lucidimagination.com/search/document/97e73361748808b/terminfosreader_lazy_term_index_reading#2a73aaca25d516ec
          Hide
          Michael McCandless added a comment -

          If it's only for segment merging, couldn't we up front conditionalize the loading of the index?

          Show
          Michael McCandless added a comment - If it's only for segment merging, couldn't we up front conditionalize the loading of the index?
          Hide
          Dan Rosher added a comment -

          I've added the TermInfosReader object to the constructor of the IndexNotRead object so that it can be synchronised on during readIndex, and then protect indexTerms, indexInfos, indexPointer

          Show
          Dan Rosher added a comment - I've added the TermInfosReader object to the constructor of the IndexNotRead object so that it can be synchronised on during readIndex, and then protect indexTerms, indexInfos, indexPointer
          Hide
          Michael McCandless added a comment -

          Could we instead change TermInfosReader (it's package private so that's OK) to take "boolean loadIndex", and fix callers to thread down that argument?

          Show
          Michael McCandless added a comment - Could we instead change TermInfosReader (it's package private so that's OK) to take "boolean loadIndex", and fix callers to thread down that argument?
          Hide
          Yonik Seeley added a comment -

          If it's only for segment merging, couldn't we up front conditionalize the loading of the index?

          Since the indexWriter now does deletes, doesn't it sometimes need the term index too?

          Show
          Yonik Seeley added a comment - If it's only for segment merging, couldn't we up front conditionalize the loading of the index? Since the indexWriter now does deletes, doesn't it sometimes need the term index too?
          Hide
          Michael McCandless added a comment -

          Since the indexWriter now does deletes, doesn't it sometimes need the term index too?

          True, so when IW opens the SR in order to apply deletes it should request that the terms index be loaded?

          Though, because iW now internally pools open SR's (for NRT search), it's entirely possible that an SR already opened for merging is re-used when it's time to apply deletes (even in the NRT case). To handle that we could add a package private method to load the terms index, which only IW would call (we do something similar with the doc stores). IW's synchronization ensures only one thread would call the method.

          It would then not be necessary to check on every Term lookup whether the index was loaded.

          In general I prefer up-front initialization when possible. Ie, when I call IndexReader.open, it should do as much initializing as it can instead of deferring initialization until the first query.

          Show
          Michael McCandless added a comment - Since the indexWriter now does deletes, doesn't it sometimes need the term index too? True, so when IW opens the SR in order to apply deletes it should request that the terms index be loaded? Though, because iW now internally pools open SR's (for NRT search), it's entirely possible that an SR already opened for merging is re-used when it's time to apply deletes (even in the NRT case). To handle that we could add a package private method to load the terms index, which only IW would call (we do something similar with the doc stores). IW's synchronization ensures only one thread would call the method. It would then not be necessary to check on every Term lookup whether the index was loaded. In general I prefer up-front initialization when possible. Ie, when I call IndexReader.open, it should do as much initializing as it can instead of deferring initialization until the first query.
          Hide
          Jed Wesley-Smith added a comment -

          We get hit by this too. We'd love to see a fix and we'd agree that up-front initialisation would work for us.

          AFAICT there are a number of other potential subtle concurrency issues with TermInfosReader:

          1. lack of final on fields - a number of fields (directory, segment, fieldInfos, origEnum, enumerators etc.) are never written to after construction and should be declared final for better publication semantics
          2. unsafe publication of indexDivisor and totalIndexInterval these fields are not written to under lock and in a worst-case could be unstable under use.
          3. close() calls enumerators.set(null) which only clears the value for the calling thread.

          Making the TermInfosReader more immutable would address some of these issues.

          As far as the root problem goes, uncontended synchronisation is generally very fast, but significantly slows down once a lock becomes contended. The kind of pattern employed here (do something quite expensive but only once) is not an ideal use of synchronisation as it commonly leads to a contended lock, which remains a slow lock well after it is required*. That being said, it isn't easy to do correctly and performantly under 1.4.

          * An alternative approach is something like this LazyReference class, although this kind of thing really requires Java5 for full value.

          Show
          Jed Wesley-Smith added a comment - We get hit by this too. We'd love to see a fix and we'd agree that up-front initialisation would work for us. AFAICT there are a number of other potential subtle concurrency issues with TermInfosReader : lack of final on fields - a number of fields ( directory , segment , fieldInfos , origEnum , enumerators etc.) are never written to after construction and should be declared final for better publication semantics unsafe publication of indexDivisor and totalIndexInterval these fields are not written to under lock and in a worst-case could be unstable under use. close() calls enumerators.set(null) which only clears the value for the calling thread. Making the TermInfosReader more immutable would address some of these issues. As far as the root problem goes, uncontended synchronisation is generally very fast , but significantly slows down once a lock becomes contended. The kind of pattern employed here (do something quite expensive but only once) is not an ideal use of synchronisation as it commonly leads to a contended lock, which remains a slow lock well after it is required*. That being said, it isn't easy to do correctly and performantly under 1.4. * An alternative approach is something like this LazyReference class, although this kind of thing really requires Java5 for full value.
          Hide
          Michael McCandless added a comment -

          I agree, we need to fix this for 2.9. I'll explore the "initialize up front" approach.

          Show
          Michael McCandless added a comment - I agree, we need to fix this for 2.9. I'll explore the "initialize up front" approach.
          Hide
          Yonik Seeley added a comment -

          It would be nice to fix, but I wonder how important it is (i.e. a must fix for 2.9 vs 3.0 where we can use Java5)
          Wouldn't an application that hit contention on this very fast test, simply go on to hit a different contention point after this one is removed?

          A much better application fix is to simply limit the number of threads executing queries.

          Show
          Yonik Seeley added a comment - It would be nice to fix, but I wonder how important it is (i.e. a must fix for 2.9 vs 3.0 where we can use Java5) Wouldn't an application that hit contention on this very fast test, simply go on to hit a different contention point after this one is removed? A much better application fix is to simply limit the number of threads executing queries.
          Hide
          Michael McCandless added a comment -

          I agree we could ship 2.9 without this fix (in fact I think most of
          the issues marked 2.9 could be postponed until 3.x), but...

          Because the ability of the JRE to downgrade an inflated lock back to a
          lightweight lock seems to be very implementation/version dependent,
          coupled with the fact that term info lookup is very frequently called
          (especially for MultiTermQuery), makes me think this is indeed
          something rather important to fix.

          And a nice side effect of up-front init is that it front-loads
          initialization more, ie, the work is done in IndexReader.open instead
          of the first query. (Other things, like populating FieldCache,
          loading norms, still happen on the first query of course...).

          A much better application fix is to simply limit the number of threads executing queries.

          Well... if your hardware has concurrency, is it really the case that
          Lucene never gets in the way? I'd love to see this confirmed
          empirically. Ie you should ideally be able to run as many threads as
          will saturate your hardware, and see it scale properly.

          Show
          Michael McCandless added a comment - I agree we could ship 2.9 without this fix (in fact I think most of the issues marked 2.9 could be postponed until 3.x), but... Because the ability of the JRE to downgrade an inflated lock back to a lightweight lock seems to be very implementation/version dependent, coupled with the fact that term info lookup is very frequently called (especially for MultiTermQuery), makes me think this is indeed something rather important to fix. And a nice side effect of up-front init is that it front-loads initialization more, ie, the work is done in IndexReader.open instead of the first query. (Other things, like populating FieldCache, loading norms, still happen on the first query of course...). A much better application fix is to simply limit the number of threads executing queries. Well... if your hardware has concurrency, is it really the case that Lucene never gets in the way? I'd love to see this confirmed empirically. Ie you should ideally be able to run as many threads as will saturate your hardware, and see it scale properly.
          Hide
          Michael McCandless added a comment -

          Alas... the big problem with doing this up-front is the IndexReader.setTermIndexInterval, which relies on the fact that the index is loaded lazily.

          So, maybe we need to wait until 3.0 to do this up-front.

          But perhaps for this issue we should make it possible to pass in the termIndexInterval to IndexReader.open, and deprecate the current methods, and then in 3.0 we could switch to up-front loading.

          Show
          Michael McCandless added a comment - Alas... the big problem with doing this up-front is the IndexReader.setTermIndexInterval, which relies on the fact that the index is loaded lazily. So, maybe we need to wait until 3.0 to do this up-front. But perhaps for this issue we should make it possible to pass in the termIndexInterval to IndexReader.open, and deprecate the current methods, and then in 3.0 we could switch to up-front loading.
          Hide
          Michael McCandless added a comment -

          Attached patch. This addresses this issue and LUCENE-1718.

          I added 2 new static IndexReader.open expert methods that allow you to
          pass in the TermInfos index divisor. You can pass in -1 to disable
          loading of the index entirely (eg, IndexWriter does this when merging
          segments). I also added the param to IndexWriter.getReader, so you
          can get an NRT reader w/ subsampled index terms.

          This replaces the set/getTermInfosIndexDivisor methods (they are now
          deprecated), ie you now must specify the divisor when opening the
          reader. If these methods are called, an UnsupportedOperationException
          is thrown. This is technically a break in back compat (previously you
          could call it before the terms index was used, eg if no searches had
          been run) but I think we should make an exception here. Very few
          users make use of these expert methods, and having these users switch
          to specifying the index divisor up front is a small code change in
          exchange for removing all synchronization from the terms dict.

          I also made all attrs in TermInfosReader final, and there is no longer
          any synchronization. To handle the case in IndexWriter, where a merge
          first opens a segment (which does not need the index) and then an NRT
          reader (or, applyDeletes) needs to share the same pooled reader and
          needs the terms index, I added a PrivateTermsDict static class to
          SegmentReader. This class just wraps a no-index-loaded
          TermInfosReader, which merging will use, and then can open a new
          index-is-loaded TermInfosReader when/if needed.

          Show
          Michael McCandless added a comment - Attached patch. This addresses this issue and LUCENE-1718 . I added 2 new static IndexReader.open expert methods that allow you to pass in the TermInfos index divisor. You can pass in -1 to disable loading of the index entirely (eg, IndexWriter does this when merging segments). I also added the param to IndexWriter.getReader, so you can get an NRT reader w/ subsampled index terms. This replaces the set/getTermInfosIndexDivisor methods (they are now deprecated), ie you now must specify the divisor when opening the reader. If these methods are called, an UnsupportedOperationException is thrown. This is technically a break in back compat (previously you could call it before the terms index was used, eg if no searches had been run) but I think we should make an exception here. Very few users make use of these expert methods, and having these users switch to specifying the index divisor up front is a small code change in exchange for removing all synchronization from the terms dict. I also made all attrs in TermInfosReader final, and there is no longer any synchronization. To handle the case in IndexWriter, where a merge first opens a segment (which does not need the index) and then an NRT reader (or, applyDeletes) needs to share the same pooled reader and needs the terms index, I added a PrivateTermsDict static class to SegmentReader. This class just wraps a no-index-loaded TermInfosReader, which merging will use, and then can open a new index-is-loaded TermInfosReader when/if needed.
          Hide
          Michael McCandless added a comment -

          Updated patch to resolve conflicts after LUCENE-1726... I plan to commit soon.

          Show
          Michael McCandless added a comment - Updated patch to resolve conflicts after LUCENE-1726 ... I plan to commit soon.
          Hide
          Michael McCandless added a comment -

          Reopening to un-deprecate getTermInfosIndexDivisor.

          Show
          Michael McCandless added a comment - Reopening to un-deprecate getTermInfosIndexDivisor.

            People

            • Assignee:
              Michael McCandless
              Reporter:
              Dan Rosher
            • Votes:
              3 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development