Solr
  1. Solr
  2. SOLR-1296

Enable opening IndexReader with termInfosIndexDivisor and setting IndexWriter's termIndexInterval

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Trivial Trivial
    • Resolution: Fixed
    • Affects Version/s: 1.3
    • Fix Version/s: 1.4
    • Component/s: None
    • Labels:
      None

      Description

      We need to enable opening an IndexReader with termInfosIndexDivisor set in solrConfig.

      We need to enable setting the termIndexInterval in SolrIndexConfig.

      1. SOLR-1296.patch
        24 kB
        Jason Rutherglen
      2. SOLR-1296.patch
        7 kB
        Jason Rutherglen
      3. SOLR-1296.patch
        24 kB
        Jason Rutherglen
      4. SOLR-1296.patch
        4 kB
        Jason Rutherglen

        Activity

        Hide
        Jason Rutherglen added a comment -
        • termInfosIndexDivisor is set in StandardIndexReaderFactory as
          an optional parameter
        • termIndexInterval is obtained from SolrIndexConfig and set in
          SolrIndexWriter
        • Needs a test case
        Show
        Jason Rutherglen added a comment - termInfosIndexDivisor is set in StandardIndexReaderFactory as an optional parameter termIndexInterval is obtained from SolrIndexConfig and set in SolrIndexWriter Needs a test case
        Hide
        Jason Rutherglen added a comment -
        • Test case placed into TestConfig
        • Created solrconfig-termindex.xml
        • IndexReader.getTermInfosIndexDivisor is deprecated and probably can be turned on again in Lucene trunk
        Show
        Jason Rutherglen added a comment - Test case placed into TestConfig Created solrconfig-termindex.xml IndexReader.getTermInfosIndexDivisor is deprecated and probably can be turned on again in Lucene trunk
        Hide
        Jason Rutherglen added a comment -

        Added the remainder of testTermIndexDivisor

        Show
        Jason Rutherglen added a comment - Added the remainder of testTermIndexDivisor
        Hide
        Grant Ingersoll added a comment -

        Looks like solrconfig-termindex.xml was not included in the patch.

        Also, not sure about exposing getIndexWriter() on the DUH2 just for testing purposes. Is there another way to get at testing this?

        Show
        Grant Ingersoll added a comment - Looks like solrconfig-termindex.xml was not included in the patch. Also, not sure about exposing getIndexWriter() on the DUH2 just for testing purposes. Is there another way to get at testing this?
        Hide
        Jason Rutherglen added a comment -
        • Added solrconfig-termindex.xml
        • I'm not sure about IW access?
        Show
        Jason Rutherglen added a comment - Added solrconfig-termindex.xml I'm not sure about IW access?
        Hide
        Hoss Man added a comment -

        This should do the trick...

        public class ExposeWriterHandler extends DirectUpdateHandler2 {
          public ExposeWriterHandler() { super(h.getCore()); }
          public getWriter() {
            forceOpenWriter();
            return writer;
          }
        };
        IndexWriter writer = (new ExposeWriterHandler()).getWriter();
        

        ...since all that maters is you get a writer using the configsfrom the core.

        If i'm missing something then the next obvious solution would be to changed the <updateHandler class="..."/> to pointat a concrete public class created for this test.

        BTW: do we really need solrconfig-termindex.xml ? .. why not just make these changes to the solrconfig.xml that TestConfig already uses?

        Show
        Hoss Man added a comment - This should do the trick... public class ExposeWriterHandler extends DirectUpdateHandler2 { public ExposeWriterHandler() { super (h.getCore()); } public getWriter() { forceOpenWriter(); return writer; } }; IndexWriter writer = ( new ExposeWriterHandler()).getWriter(); ...since all that maters is you get a writer using the configsfrom the core. If i'm missing something then the next obvious solution would be to changed the <updateHandler class="..."/> to pointat a concrete public class created for this test. BTW: do we really need solrconfig-termindex.xml ? .. why not just make these changes to the solrconfig.xml that TestConfig already uses?
        Hide
        Jason Rutherglen added a comment -

        This should do the trick...

        Why would we go through the effort? IW is a public Lucene class.

        do we really need solrconfig-termindex.xml

        We probably don't want all the tests to have different behavior.

        Show
        Jason Rutherglen added a comment - This should do the trick... Why would we go through the effort? IW is a public Lucene class. do we really need solrconfig-termindex.xml We probably don't want all the tests to have different behavior.
        Hide
        Grant Ingersoll added a comment -

        Why would we go through the effort? IW is a public Lucene class.

        Because it's not publicly exposed in Solr and exposing it just for the sake of testing doesn't seem wise.

        Show
        Grant Ingersoll added a comment - Why would we go through the effort? IW is a public Lucene class. Because it's not publicly exposed in Solr and exposing it just for the sake of testing doesn't seem wise.
        Hide
        Jason Rutherglen added a comment -

        Because it's not publicly exposed in Solr and exposing it just for the sake of testing doesn't seem wise.

        Can you describe the worst case scenario you imagine will happen if IndexWriter is exposed?

        Show
        Jason Rutherglen added a comment - Because it's not publicly exposed in Solr and exposing it just for the sake of testing doesn't seem wise. Can you describe the worst case scenario you imagine will happen if IndexWriter is exposed?
        Hide
        Mark Miller added a comment -

        People may start to use it and count on it, and we will have to needlessly support that.

        A public class in Lucene will still occasionally break back compat, and come with maintenance/deprecation as well. Just because its public in Lucene, that doesn't mean it should be public in Solr.

        For each little thing, its arguably never that big a deal - but a good policy keeps a bunch of little things from creeping.

        -1 on exposing anything just for tests.

        Show
        Mark Miller added a comment - People may start to use it and count on it, and we will have to needlessly support that. A public class in Lucene will still occasionally break back compat, and come with maintenance/deprecation as well. Just because its public in Lucene, that doesn't mean it should be public in Solr. For each little thing, its arguably never that big a deal - but a good policy keeps a bunch of little things from creeping. -1 on exposing anything just for tests.
        Hide
        Grant Ingersoll added a comment -

        Can you describe the worst case scenario you imagine will happen if IndexWriter is exposed?

        Sure, someone who thinks they know what they are doing closes the IW when it shouldn't be closed causing exceptions, etc and emails to solr-user, etc. and wasting the communities time. However, the impetus is not on me to defend why it shouldn't be exposed, it's on you to show why it is proper to take something that is currently private and make it public just to pass a test. If it needs to be public for other use cases, fine, but generally speaking, I don't think variables, etc. should be made public just for testing purposes. That's bad OOD.

        Show
        Grant Ingersoll added a comment - Can you describe the worst case scenario you imagine will happen if IndexWriter is exposed? Sure, someone who thinks they know what they are doing closes the IW when it shouldn't be closed causing exceptions, etc and emails to solr-user, etc. and wasting the communities time. However, the impetus is not on me to defend why it shouldn't be exposed, it's on you to show why it is proper to take something that is currently private and make it public just to pass a test. If it needs to be public for other use cases, fine, but generally speaking, I don't think variables, etc. should be made public just for testing purposes. That's bad OOD.
        Hide
        Jason Rutherglen added a comment -

        For a test I agree it's not worth exposing IW publicly however
        there should be a simple way to access it as a package protected
        variable?

        A great example of a public Lucene API being exposed from Solr
        that can easily break the system is getWrappedReader. IR is as
        canonical to Lucene as IW. And calling close on IR will also
        cause numerous errors for users. Why is it public, it's only
        used internally to Solr?

        The Solr policy as it's being described isn't making sense to me.

        An answer to the second question?

        do we really need solrconfig-termindex.xml

        We probably don't want all the tests to have different behavior.

        Show
        Jason Rutherglen added a comment - For a test I agree it's not worth exposing IW publicly however there should be a simple way to access it as a package protected variable? A great example of a public Lucene API being exposed from Solr that can easily break the system is getWrappedReader. IR is as canonical to Lucene as IW. And calling close on IR will also cause numerous errors for users. Why is it public, it's only used internally to Solr? The Solr policy as it's being described isn't making sense to me. An answer to the second question? do we really need solrconfig-termindex.xml We probably don't want all the tests to have different behavior.
        Hide
        Mark Miller added a comment -

        I'm not against exposing something package private for tests - anyone that jumps the fence to use that should know what they are getting themselves into.

        Show
        Mark Miller added a comment - I'm not against exposing something package private for tests - anyone that jumps the fence to use that should know what they are getting themselves into.
        Hide
        Grant Ingersoll added a comment -

        Committed revision 814160.

        Moved the termInfosIndexDivisor up to the abstract class.

        Implemented Hoss's Expose Writer option. Added unit tests for the IndexReaderFactory.

        Show
        Grant Ingersoll added a comment - Committed revision 814160. Moved the termInfosIndexDivisor up to the abstract class. Implemented Hoss's Expose Writer option. Added unit tests for the IndexReaderFactory.
        Hide
        Grant Ingersoll added a comment -

        Bulk close for Solr 1.4

        Show
        Grant Ingersoll added a comment - Bulk close for Solr 1.4

          People

          • Assignee:
            Grant Ingersoll
            Reporter:
            Jason Rutherglen
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              Original Estimate - 48h
              48h
              Remaining:
              Remaining Estimate - 48h
              48h
              Logged:
              Time Spent - Not Specified
              Not Specified

                Development