Lucene - Core / LUCENE-6898

Avoid reading last stored field value when StoredFieldVisitor.Status.NO

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.4
    • Component/s: core/codecs
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      CompressingStoredFieldsReader.visitDocument (line 597) loops through the fields in the input, consulting the StoredFieldVisitor on what to do with each one. There is a small optimization that could be made on the last loop iteration: if the visitor returns Status.NO, it should be treated as equivalent to Status.STOP. As it is now, it calls skipField(), which reads bytes from the DataInput that will never be used.

      With this optimization in place, it is advisable to put the largest text field last in sequence – something the user or search platform (e.g. ES/Solr) could do.
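
      The idea can be illustrated with a small, self-contained model. This is only a sketch and not the Lucene code or the attached patch: the Field class, the simplified Visitor interface (taking a field name rather than a FieldInfo), the treatLastNoAsStop flag, and the byte counts are all hypothetical. It counts how many value bytes the loop has to consume, with and without treating a NO on the last field as a STOP.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of the field-visiting loop; all names here are illustrative.
final class LastFieldSkipSketch {

    // Mirrors the three outcomes of StoredFieldVisitor.Status.
    enum Status { YES, NO, STOP }

    // Hypothetical stand-in for a stored field: a name plus the size of its value.
    static final class Field {
        final String name;
        final int length;

        Field(String name, int length) {
            this.name = name;
            this.length = length;
        }
    }

    // Hypothetical, simplified visitor: decides per field name.
    interface Visitor {
        Status needsField(String name);
    }

    // Returns the number of value bytes consumed (read or skipped) for one document.
    static long visit(List<Field> fields, Visitor visitor, boolean treatLastNoAsStop) {
        long bytesConsumed = 0;
        for (int i = 0; i < fields.size(); i++) {
            Field field = fields.get(i);
            boolean isLast = (i == fields.size() - 1);
            switch (visitor.needsField(field.name)) {
                case YES:
                    bytesConsumed += field.length; // value is read and handed to the visitor
                    break;
                case NO:
                    if (treatLastNoAsStop && isLast) {
                        return bytesConsumed;      // nothing follows, so skipping is pointless
                    }
                    bytesConsumed += field.length; // skipping still advances past the value
                    break;
                case STOP:
                    return bytesConsumed;
            }
        }
        return bytesConsumed;
    }

    // Example document with the largest field placed last; only "id" is wanted.
    static long bytesConsumedForDemo(boolean treatLastNoAsStop) {
        List<Field> doc = Arrays.asList(
                new Field("id", 8),
                new Field("title", 64),
                new Field("body", 100_000));
        Visitor idOnly = name -> name.equals("id") ? Status.YES : Status.NO;
        return visit(doc, idOnly, treatLastNoAsStop);
    }

    public static void main(String[] args) {
        System.out.println("without optimization: " + bytesConsumedForDemo(false));
        System.out.println("with optimization:    " + bytesConsumedForDemo(true));
    }
}
```

      In this model the saving equals the size of the last field, which is why placing the largest field last gives the most benefit. The counts are a model of bytes consumed, not a measurement of actual disk reads.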

      1. LUCENE-6898.patch
        0.8 kB
        David Smiley

        Activity

        David Smiley added a comment -

        Here's a simple patch.

        I have no idea how much this optimization helps, but I imagine it would help for medium to large docs.

        David Smiley added a comment -

        I plan to commit this tomorrow at about this time.

        Adrien Grand added a comment -

        +1

        For the record, this would only help if the last stored field value is larger than 16KB. Otherwise we're just skipping over data that is already decompressed.

        ASF subversion and git services added a comment -

        Commit 1715299 from David Smiley in branch 'dev/trunk'
        [ https://svn.apache.org/r1715299 ]

        LUCENE-6898: Don't fully read the last stored field value from disk if the StoredFieldVisitor doesn't want it.

        ASF subversion and git services added a comment -

        Commit 1715300 from David Smiley in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1715300 ]

        LUCENE-6898: Don't fully read the last stored field value from disk if the StoredFieldVisitor doesn't want it.


          People

          • Assignee:
            David Smiley
            Reporter:
            David Smiley
          • Votes:
            0
            Watchers:
            3

            Dates

            • Created:
              Updated:
              Resolved:
