Lucene - Core / LUCENE-2025

Ability to turn off the store for an index

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 4.9, 5.0
    • Component/s: core/index
    • Labels:
    • Lucene Fields:
      New

      Description

      It would be really good in combination with parallel indexing if the
      Lucene store could be turned off entirely for an index.

      The reason is that part of the store is the FieldIndex (.fdx file),
      which contains an 8-byte pointer for each document in a segment, even
      if a document does not contain any stored fields.

      With parallel indexing we will want to rewrite certain parallel
      indexes to update them, and if such an update affects only a small
      number of documents, it is wasteful to have to rewrite the whole .fdx
      file every time.
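
      To put a rough number on that cost, here is a minimal sketch. The only
      figure taken from this issue is the 8-byte pointer per document; the
      segment size, the update size, and the class and method names are
      hypothetical, and file headers or other per-file overhead are ignored.

```java
// Back-of-the-envelope estimate of the .fdx rewrite cost described above.
// Assumption: one 8-byte pointer per document, nothing else counted.
final class FdxOverhead {
    static final long BYTES_PER_DOC_POINTER = 8L;

    /** Approximate .fdx size in bytes for a segment holding maxDoc documents. */
    static long fdxBytes(long maxDoc) {
        if (maxDoc < 0) {
            throw new IllegalArgumentException("maxDoc must be >= 0");
        }
        // multiplyExact fails loudly instead of silently overflowing
        return Math.multiplyExact(maxDoc, BYTES_PER_DOC_POINTER);
    }

    public static void main(String[] args) {
        long maxDoc = 10_000_000L; // hypothetical segment size
        long updatedDocs = 100L;   // hypothetical number of updated documents
        long bytes = fdxBytes(maxDoc);
        System.out.println("approx. .fdx bytes rewritten: " + bytes);
        System.out.println("approx. bytes per updated doc: " + bytes / updatedDocs);
    }
}
```

      With these made-up numbers, every rewrite of the parallel index would
      write about 80 MB of pointers for an index that stores no fields at all.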

      So in the case where you only want to update a data structure in the
      inverted index it makes sense to separate your index into multiple
      parallel indexes, where the ones you want to update don't contain any
      stored fields.

      It would also be great to not only allow turning off the store but to make
      it customizable, similar to what flexible indexing wants to achieve
      for the inverted index.

      As a start I'd be happy with the ability to simply turn off the store and to
      add more flexibility later.

          Activity

          Michael Busch created issue -
          Michael Busch made changes -
          Field Original Value New Value
          Link This issue is part of LUCENE-1879 [ LUCENE-1879 ]
          Michael Busch made changes -
          Link This issue is related to LUCENE-1458 [ LUCENE-1458 ]
          Mark Thomas made changes -
          Workflow jira [ 12481151 ] Default workflow, editable Closed status [ 12563357 ]
          Mark Thomas made changes -
          Workflow Default workflow, editable Closed status [ 12563357 ] jira [ 12584501 ]
          Simon Willnauer made changes -
          Labels gsoc2011 lucene-gsoc-11
          Simon Willnauer made changes -
          Labels gsoc2011 lucene-gsoc-11 mentor
          Simon Willnauer made changes -
          Labels mentor gsoc2011, lucene-gsoc-11 mentor,
          Michael McCandless added a comment -

          Simon, watch out for INFRA-3517 – we have to be careful, when labeling, to not use the label with a trailing comma stuck on!

          I.e., this issue now has two such labels: 'gsoc2011,' and 'mentor,'

          Simon Willnauer made changes -
          Labels gsoc2011, lucene-gsoc-11 mentor, gsoc2011 lucene-gsoc-11 mentor
          Simon Willnauer added a comment -

          "I.e., this issue now has two such labels: 'gsoc2011,' and 'mentor,'"

          Thanks Mike, I changed them back to have no commas.

          Simon Willnauer made changes -
          Labels gsoc2011 lucene-gsoc-11 mentor gsoc2011 gsoc2012 lucene-gsoc-11 mentor
          Simon Willnauer made changes -
          Labels gsoc2011 gsoc2012 lucene-gsoc-11 mentor gsoc2011 gsoc2012 lucene-gsoc-11 lucene-gsoc-12 mentor
          Simon Willnauer added a comment -

          Moving this over to 4.1; this won't happen in 4.0 anymore.

          Simon Willnauer made changes -
          Fix Version/s 4.1 [ 12321140 ]
          Fix Version/s 4.0 [ 12314025 ]
          Robert Muir added a comment -

          One simple way to do this today is to use a codec that has a NoStoredFieldsImpl,
          which throws an exception in its writer impl if you ask it to actually write any
          stored fields (e.g. startDocument(n) is called where n > 0), and does nothing in
          its reader impl.

          I think this case is fairly uncommon in typical use. I looked into whether we could
          optimize it in Lucene40's impl, but that introduces a lot of scary situations
          for things like bulk merge.

          So for now I really think this is the simple, safe way: if someone wants to
          turn the store off, they just set this as their codec on IndexWriter.
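
          The behaviour described above can be sketched roughly as follows. This is
          a simplified model, not Lucene code: the interfaces StoredFieldsSink and
          StoredFieldsSource, and every class and method name below, are hypothetical
          stand-ins. A real implementation would plug into the codec's stored fields
          format, whose signatures differ between Lucene versions.

```java
import java.util.Collections;
import java.util.Map;

// Small demo of the writer/reader pair below.
final class NoStoredFieldsDemo {
    public static void main(String[] args) {
        NoStoredFieldsWriter writer = new NoStoredFieldsWriter();
        writer.startDocument(0); // fine: the document has no stored fields
        try {
            writer.startDocument(2); // rejected: someone tried to store fields
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("docs seen: " + writer.docCount());
        System.out.println("doc 0 fields: " + new NoStoredFieldsReader().document(0));
    }
}

// Hypothetical, simplified stand-in for a stored-fields writer (NOT Lucene's API).
interface StoredFieldsSink {
    void startDocument(int numStoredFields);
    void writeField(String name, String value);
}

// Hypothetical, simplified stand-in for a stored-fields reader (NOT Lucene's API).
interface StoredFieldsSource {
    Map<String, String> document(int docId);
}

// Writer side: accepts documents only when they carry zero stored fields.
final class NoStoredFieldsWriter implements StoredFieldsSink {
    private int docCount = 0;

    @Override
    public void startDocument(int numStoredFields) {
        if (numStoredFields > 0) {
            throw new UnsupportedOperationException(
                "stored fields are disabled for this index, got " + numStoredFields);
        }
        docCount++; // nothing is written, the document is only counted
    }

    @Override
    public void writeField(String name, String value) {
        throw new UnsupportedOperationException(
            "stored fields are disabled for this index: " + name);
    }

    int docCount() {
        return docCount;
    }
}

// Reader side: does nothing, so every document appears to have no stored fields.
final class NoStoredFieldsReader implements StoredFieldsSource {
    @Override
    public Map<String, String> document(int docId) {
        return Collections.emptyMap();
    }
}
```

          Failing loudly on the write path, rather than silently dropping fields,
          means a misconfigured field shows up at indexing time instead of as
          missing data at search time.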

          Steve Rowe made changes -
          Fix Version/s 4.2 [ 12323899 ]
          Fix Version/s 4.1 [ 12321140 ]
          Robert Muir made changes -
          Fix Version/s 4.3 [ 12324143 ]
          Fix Version/s 4.2 [ 12323899 ]
          Adrien Grand made changes -
          Labels gsoc2011 gsoc2012 lucene-gsoc-11 lucene-gsoc-12 mentor gsoc2013
          Uwe Schindler made changes -
          Fix Version/s 4.4 [ 12324323 ]
          Fix Version/s 4.3 [ 12324143 ]
          Steve Rowe added a comment -

          Bulk move 4.4 issues to 4.5 and 5.0

          Steve Rowe made changes -
          Fix Version/s 5.0 [ 12321663 ]
          Fix Version/s 4.5 [ 12324742 ]
          Fix Version/s 4.4 [ 12324323 ]
          Adrien Grand made changes -
          Fix Version/s 4.6 [ 12324999 ]
          Fix Version/s 5.0 [ 12321663 ]
          Fix Version/s 4.5 [ 12324742 ]
          Simon Willnauer made changes -
          Fix Version/s 4.7 [ 12325572 ]
          Fix Version/s 4.6 [ 12324999 ]
          Michael McCandless made changes -
          Labels gsoc2013 gsoc2014
          David Smiley made changes -
          Fix Version/s 4.8 [ 12326269 ]
          Fix Version/s 4.7 [ 12325572 ]
          Uwe Schindler added a comment -

          Move issue to Lucene 4.9.

          Uwe Schindler made changes -
          Fix Version/s 4.9 [ 12326730 ]
          Fix Version/s 5.0 [ 12321663 ]
          Fix Version/s 4.8 [ 12326269 ]

            People

            • Assignee:
              Michael Busch
              Reporter:
              Michael Busch
            • Votes:
              1
              Watchers:
              3
