Lucene - Core / LUCENE-5268

Cutover more postings formats to the inverted "pull" API

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0, master
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      In LUCENE-5123, we added a new, more flexible, "pull" API for writing
      postings. This API allows the postings format to iterate the
      fields/terms/postings more than once, and mirrors the API for writing
      doc values.

      But that was just the first step (only SimpleText was cut over to the
      new API). I want to cut over more components, so we can (finally)
      play with, e.g., different encodings depending on the term's postings,
      such as using a bitset for high-frequency DOCS_ONLY terms (LUCENE-5052).
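
To illustrate why iterating a term's postings more than once matters, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class `PullEncodingSketch`, the method `writeTerm`, the `Supplier`-based re-iterable postings, and the "at least half of all docs" threshold are all assumptions made for this example. The first pass counts documents, the second pass encodes with whichever representation was chosen.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

// Toy model only: hypothetical names, not Lucene classes or signatures.
final class PullEncodingSketch {

    /** Hypothetical threshold: a term in at least half of the docs counts as high-frequency. */
    static boolean isHighFreq(int docFreq, int maxDoc) {
        return docFreq * 2 >= maxDoc;
    }

    /**
     * Pull-style writer: the writer drives the iteration, so it can make two passes.
     * Each call to postings.get() starts a fresh pass over the term's doc IDs.
     */
    static String writeTerm(String term, Supplier<Iterable<Integer>> postings, int maxDoc) {
        // Pass 1: count documents to pick an encoding.
        int docFreq = 0;
        for (int ignored : postings.get()) {
            docFreq++;
        }
        // Pass 2: encode with the chosen representation.
        if (isHighFreq(docFreq, maxDoc)) {
            BitSet bits = new BitSet(maxDoc);
            for (int doc : postings.get()) {
                bits.set(doc);
            }
            return term + " -> bitset " + bits;
        }
        List<Integer> deltas = new ArrayList<>();
        int prev = 0;
        for (int doc : postings.get()) {
            deltas.add(doc - prev); // gap to the previous doc ID
            prev = doc;
        }
        return term + " -> deltas " + deltas;
    }

    public static void main(String[] args) {
        int maxDoc = 8;
        Map<String, List<Integer>> index = new TreeMap<>();
        index.put("rare", List.of(3, 7));
        index.put("the", List.of(0, 1, 2, 4, 5, 6));
        for (Map.Entry<String, List<Integer>> e : index.entrySet()) {
            List<Integer> docs = e.getValue();
            System.out.println(writeTerm(e.getKey(), () -> docs, maxDoc));
        }
    }
}
```

With a push-style API the writer sees each document exactly once, as it is handed over, so it would have to buffer the postings itself before it could make this kind of decision.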

      1. LUCENE-5268.patch
        99 kB
        Michael McCandless
      2. LUCENE-5268.patch
        138 kB
        Michael McCandless

        Activity

        Michael McCandless created issue -
        Michael McCandless added a comment -

        Patch with these changes:

        • Cut over BlockTreeTermsWriter, BlockTermsWriter, FST/OrdTermsWriter
          from PushFieldsConsumer to FieldsConsumer.
        • Changed PostingsBaseWriter to a "pull" API, with a single method
          to write the current term's postings, and then added a new
          PushPostingsBaseWriter that has the "push" API.
        • Cut over some formats to the new PostingsBaseWriter; pulsing and
          bloom were nice cleanups. For the rest I just switched them to
          PushPostingsBaseWriter.
        • Only two PushFieldsConsumers remain: MemoryPF and RAMOnlyPF
          (test-framework); I'm tempted to just cut those over and then
          remove PushFieldsConsumer here.

        Still a few nocommits but I think it's close ...
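
The split between a single-method "pull" writer and a "push" variant can be sketched roughly as follows. This is a toy model under stated assumptions, not the code in the patch: `PullTermWriter`, `PushTermWriter`, `LoggingWriter`, and their method names are hypothetical, and the idea that the push class implements the pull method by replaying the postings as callbacks is one plausible arrangement rather than a description of the actual classes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names and signatures, not the classes in the patch.
final class PushOverPullSketch {

    /** "Pull" shape: one method receives the whole term and iterates it itself. */
    interface PullTermWriter {
        void writeTerm(String term, Iterable<Integer> docs);
    }

    /** "Push" shape: per-event callbacks, layered on top of the pull method. */
    static abstract class PushTermWriter implements PullTermWriter {
        abstract void startTerm(String term);
        abstract void addDoc(int docID);
        abstract void finishTerm(int docFreq);

        // Adapter: implement the pull method by replaying postings as push callbacks.
        @Override
        public final void writeTerm(String term, Iterable<Integer> docs) {
            startTerm(term);
            int docFreq = 0;
            for (int doc : docs) {
                addDoc(doc);
                docFreq++;
            }
            finishTerm(docFreq);
        }
    }

    /** A push-style format that simply records the events it receives. */
    static final class LoggingWriter extends PushTermWriter {
        final List<String> events = new ArrayList<>();
        @Override void startTerm(String term) { events.add("start " + term); }
        @Override void addDoc(int docID) { events.add("doc " + docID); }
        @Override void finishTerm(int docFreq) { events.add("finish df=" + docFreq); }
    }

    /** Drives the push-style writer through the pull interface and returns its event log. */
    static List<String> demo() {
        LoggingWriter writer = new LoggingWriter();
        PullTermWriter pull = writer; // callers only see the pull interface
        pull.writeTerm("lucene", List.of(1, 5, 9));
        return writer.events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Under this arrangement a format written against the callbacks keeps working unchanged, while a format that wants to inspect the postings first can implement the pull method directly.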

        Michael McCandless made changes -
        Field Original Value New Value
        Attachment LUCENE-5268.patch [ 12607614 ]
        Robert Muir added a comment -

        This looks awesome; it's good to see how it simplified pulsing. I think that means the new API is working...

        Han Jiang added a comment -

        +1, the pulsing code is much cleaner!

        Michael McCandless added a comment -

        New patch, cutting over the last two holdouts from PushFieldsConsumer -> FieldsConsumer, and removing PushFieldsConsumer.

        I think it's nearly done ... nocommits are gone ... I still need to do javadocs ...

        Michael McCandless made changes -
        Attachment LUCENE-5268.patch [ 12607863 ]
        ASF subversion and git services added a comment -

        Commit 1531949 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1531949 ]

        LUCENE-5268: cutover all postings formats to FieldsConsumer

        Michael McCandless made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        ASF subversion and git services added a comment -

        Commit 1532060 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1532060 ]

        LUCENE-5268: fix test failures: bloom must first call delegate.write, then write its own
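
The ordering named in this commit message can be pictured with a small wrapper sketch. The commit states only the order (delegate first, then bloom); it does not say why, and this example does not try to. Everything here is a hypothetical toy: `FieldsWriter`, `DelegateWriter`, `BloomWriter`, the shared log, and the 64-bit hashed filter are assumptions for illustration, not Lucene's bloom postings format.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical types, not Lucene's bloom postings format.
final class BloomOrderSketch {

    interface FieldsWriter {
        void write(Map<String, List<String>> fields, List<String> log);
    }

    /** Stand-in for the wrapped postings format. */
    static final class DelegateWriter implements FieldsWriter {
        @Override
        public void write(Map<String, List<String>> fields, List<String> log) {
            for (String field : fields.keySet()) {
                log.add("delegate wrote " + field);
            }
        }
    }

    /** Wrapper that lets the delegate write first, then makes its own pass over the fields. */
    static final class BloomWriter implements FieldsWriter {
        private final FieldsWriter delegate;
        final Map<String, BitSet> filters = new TreeMap<>();

        BloomWriter(FieldsWriter delegate) {
            this.delegate = delegate;
        }

        @Override
        public void write(Map<String, List<String>> fields, List<String> log) {
            // Step 1: the delegate writes everything.
            delegate.write(fields, log);
            // Step 2: a second iteration over the same fields builds the filters.
            for (Map.Entry<String, List<String>> e : fields.entrySet()) {
                BitSet bits = new BitSet(64);
                for (String term : e.getValue()) {
                    bits.set(Math.floorMod(term.hashCode(), 64));
                }
                filters.put(e.getKey(), bits);
                log.add("bloom wrote " + e.getKey());
            }
        }
    }

    /** Runs the wrapper over two fields and returns the order in which writes happened. */
    static List<String> demo() {
        Map<String, List<String>> fields = new TreeMap<>();
        fields.put("body", List.of("pull", "push"));
        fields.put("title", List.of("lucene"));
        List<String> log = new ArrayList<>();
        new BloomWriter(new DelegateWriter()).write(fields, log);
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The wrapper can make its own pass after the delegate has finished only because the fields can be iterated more than once, which is the property the pull API provides.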

        Robert Muir added a comment -

        reopen for backport

        Robert Muir made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        Robert Muir made changes -
        Fix Version/s 4.11 [ 12327844 ]
        ASF subversion and git services added a comment -

        Commit 1620250 from Robert Muir in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1620250 ]

        LUCENE-5123, LUCENE-5268: invert codec postings api (backport from trunk)

        Robert Muir made changes -
        Status Reopened [ 4 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        ASF subversion and git services added a comment -

        Commit 1620252 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1620252 ]

        LUCENE-5123, LUCENE-5268: move CHANGES 5.0 -> 4.11

        Michael McCandless added a comment -

        Thanks Rob!

        Anshum Gupta added a comment -

        Bulk close after 5.0 release.

        Anshum Gupta made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition            Time In Source Status  Execution Times  Last Executer       Last Execution Date
        Open -> Resolved      4d 21h 40m             1                Michael McCandless  14/Oct/13 15:56
        Resolved -> Reopened  314d 9h 30m            1                Robert Muir         25/Aug/14 01:27
        Reopened -> Resolved  25m 13s                1                Robert Muir         25/Aug/14 01:52
        Resolved -> Closed    182d 3h 8m             1                Anshum Gupta        23/Feb/15 05:01

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Michael McCandless
          • Votes:
            0
            Watchers:
            5
