Lucene - Core / LUCENE-5268

Cutover more postings formats to the inverted "pull" API

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0, Trunk
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      In LUCENE-5123, we added a new, more flexible, "pull" API for writing
      postings. This API allows the postings format to iterate the
      fields/terms/postings more than once, and mirrors the API for writing
      doc values.

      But that was just the first step (only SimpleText was cut over to the
      new API). I want to cut over more components, so we can (finally)
      experiment with, for example, different encodings depending on the
      term's postings, such as a bitset for high-frequency DOCS_ONLY terms
      (LUCENE-5052).
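Editor's sketch (not part of the patch): the point of iterating postings more than once can be illustrated with a small standalone example. The first pass counts the term's documents, and the second pass writes them using whichever encoding suits that count. Every name here (`PullEncodingSketch`, `Encoded`, `writeTerm`, `denseRatio`) is hypothetical and only loosely mirrors the idea; none of it is Lucene's actual API, and the density threshold is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.PrimitiveIterator;
import java.util.function.Supplier;
import java.util.stream.IntStream;

// Hypothetical illustration only; these are not Lucene classes.
class PullEncodingSketch {

    /** Result of encoding one term: exactly one of the two fields is non-null. */
    static final class Encoded {
        final BitSet bits;          // set when the dense (bitset) encoding was chosen
        final List<Integer> deltas; // set when the sparse (delta list) encoding was chosen

        Encoded(BitSet bits, List<Integer> deltas) {
            this.bits = bits;
            this.deltas = deltas;
        }
    }

    /**
     * "Pull" style: the writer is handed a re-openable source of sorted doc IDs
     * and may iterate it as many times as it needs.
     */
    static Encoded writeTerm(Supplier<PrimitiveIterator.OfInt> postings,
                             int maxDoc, double denseRatio) {
        // Pass 1: count the documents containing this term.
        int docFreq = 0;
        PrimitiveIterator.OfInt first = postings.get();
        while (first.hasNext()) {
            first.nextInt();
            docFreq++;
        }

        // Pass 2: re-open the postings and encode them based on the count.
        PrimitiveIterator.OfInt second = postings.get();
        if (docFreq >= denseRatio * maxDoc) {
            BitSet bits = new BitSet(maxDoc);
            while (second.hasNext()) {
                bits.set(second.nextInt());
            }
            return new Encoded(bits, null);
        }
        List<Integer> deltas = new ArrayList<>();
        int prev = 0;
        while (second.hasNext()) {
            int doc = second.nextInt();
            deltas.add(doc - prev); // store the gap to the previous doc ID
            prev = doc;
        }
        return new Encoded(null, deltas);
    }

    public static void main(String[] args) {
        Encoded sparse = writeTerm(() -> IntStream.of(3, 10, 42).iterator(), 100, 0.5);
        Encoded dense = writeTerm(() -> IntStream.range(0, 8).iterator(), 10, 0.5);
        System.out.println("sparse deltas: " + sparse.deltas);
        System.out.println("dense bits: " + dense.bits);
    }
}
```

With a single-pass "push" API the writer would have to buffer the doc IDs itself before it could make this choice; here the second call to `postings.get()` stands in for re-iterating the term's postings.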

      1. LUCENE-5268.patch
        138 kB
        Michael McCandless
      2. LUCENE-5268.patch
        99 kB
        Michael McCandless

        Activity

        Michael McCandless added a comment -

        Patch with these changes:

        • Cut over BlockTreeTermsWriter, BlockTermsWriter, FST/OrdTermsWriter
          from PushFieldsConsumer to FieldsConsumer
        • Changed PostingsBaseWriter to a "pull" API, with a single method
          to write the current term's postings, and then added a new
          PushPostingsBaseWriter that has the "push" API.
        • Cut over some formats to the new PostingsBaseWriter; pulsing and bloom
          were nice cleanups. For the rest I just switched them to
          PushPostingsBaseWriter.
        • Only two PushFieldsConsumers remain: MemoryPF and RAMOnlyPF
          (test-framework); I'm tempted to just cut those over and then
          remove PushFieldsConsumer here.
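Editor's sketch (not taken from the patch): the relationship described in the list above, a "pull" writer with a single write-the-current-term method plus a "push" variant layered on top of it, might look roughly like the following. The names `PullTermWriter`, `PushTermWriter`, `startTerm`, `addDoc`, and `finishTerm` are hypothetical stand-ins chosen for illustration; the real PostingsBaseWriter and PushPostingsBaseWriter signatures are in the attached patch and are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PrimitiveIterator;
import java.util.function.Supplier;
import java.util.stream.IntStream;

// Hypothetical illustration only; these are not Lucene classes.
class PushOverPullSketch {

    /** Pull style: one method gets the whole term and iterates it as it likes. */
    interface PullTermWriter {
        void writeTerm(String term, Supplier<PrimitiveIterator.OfInt> docs);
    }

    /**
     * Push style, implemented once on top of the pull method: it walks the
     * postings a single time and fires per-event callbacks, so formats that
     * only need one pass can keep their callback-based code.
     */
    static abstract class PushTermWriter implements PullTermWriter {
        abstract void startTerm(String term);
        abstract void addDoc(int docID);
        abstract void finishTerm(int docFreq);

        @Override
        public final void writeTerm(String term, Supplier<PrimitiveIterator.OfInt> docs) {
            startTerm(term);
            int docFreq = 0;
            PrimitiveIterator.OfInt it = docs.get();
            while (it.hasNext()) {
                addDoc(it.nextInt());
                docFreq++;
            }
            finishTerm(docFreq);
        }
    }

    /** Toy push-style format that just records the callbacks it receives. */
    static final class LoggingWriter extends PushTermWriter {
        final List<String> events = new ArrayList<>();

        @Override void startTerm(String term) { events.add("start:" + term); }
        @Override void addDoc(int docID) { events.add("doc:" + docID); }
        @Override void finishTerm(int docFreq) { events.add("finish:" + docFreq); }
    }

    public static void main(String[] args) {
        LoggingWriter writer = new LoggingWriter();
        writer.writeTerm("lucene", () -> IntStream.of(1, 4).iterator());
        System.out.println(writer.events);
    }
}
```

The design choice this is meant to show: the pull method is the primitive, and the push API becomes a convenience adapter, so switching a format to the adapter needs no change to its callback code.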

        Still a few nocommits but I think it's close ...

        Robert Muir added a comment -

        This looks awesome; it's good to see how it simplified pulsing. I think that means the new API is working...

        Han Jiang added a comment -

        +1, the pulsing code is much cleaner!

        Michael McCandless added a comment -

        New patch, cutting over the last two holdouts from PushFieldsConsumer -> FieldsConsumer, and removing PushFieldsConsumer.

        I think it's nearly done ... nocommits are gone ... I still need to do javadocs ...

        ASF subversion and git services added a comment -

        Commit 1531949 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1531949 ]

        LUCENE-5268: cutover all postings formats to FieldsConsumer

        ASF subversion and git services added a comment -

        Commit 1532060 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1532060 ]

        LUCENE-5268: fix test failures: bloom must first call delegate.write, then write its own
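Editor's sketch (not the committed fix): the ordering named in this commit message, where the wrapping format calls the delegate's write first and only then writes its own data, can be illustrated as below. All names (`FieldsWriter`, `RecordingWriter`, `BloomWriter`, `numBits`) are hypothetical, the fields are modelled as a plain map of field name to terms, and the one-hash "filter" is a deliberate simplification. The commit message does not say why the order matters, so the sketch shows only the order itself.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these are not Lucene classes.
class BloomWrapperSketch {

    /** Pull-style consumer: receives all fields and their terms at once. */
    interface FieldsWriter {
        void write(Map<String, List<String>> fields);
    }

    /** Stand-in for the wrapped postings format; it only logs that it ran. */
    static final class RecordingWriter implements FieldsWriter {
        final List<String> log;

        RecordingWriter(List<String> log) { this.log = log; }

        @Override
        public void write(Map<String, List<String>> fields) {
            log.add("delegate wrote " + fields.size() + " field(s)");
        }
    }

    /** Wrapper that lets the delegate write first, then builds its own filters. */
    static final class BloomWriter implements FieldsWriter {
        final FieldsWriter delegate;
        final List<String> log;
        final int numBits;
        final Map<String, BitSet> filters = new HashMap<>();

        BloomWriter(FieldsWriter delegate, List<String> log, int numBits) {
            this.delegate = delegate;
            this.log = log;
            this.numBits = numBits;
        }

        @Override
        public void write(Map<String, List<String>> fields) {
            // Step 1: the delegate writes the postings.
            delegate.write(fields);

            // Step 2: iterate the same fields again to build one filter per field.
            for (Map.Entry<String, List<String>> field : fields.entrySet()) {
                BitSet bits = new BitSet(numBits);
                for (String term : field.getValue()) {
                    bits.set(Math.floorMod(term.hashCode(), numBits)); // single toy hash
                }
                filters.put(field.getKey(), bits);
            }
            log.add("bloom wrote " + filters.size() + " filter(s)");
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        BloomWriter bloom = new BloomWriter(new RecordingWriter(log), log, 64);
        bloom.write(Collections.singletonMap("body", List.of("lucene", "postings")));
        System.out.println(log);
    }
}
```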

        Robert Muir added a comment -

        reopen for backport

        ASF subversion and git services added a comment -

        Commit 1620250 from Robert Muir in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1620250 ]

        LUCENE-5123, LUCENE-5268: invert codec postings api (backport from trunk)

        ASF subversion and git services added a comment -

        Commit 1620252 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1620252 ]

        LUCENE-5123, LUCENE-5268: move CHANGES 5.0 -> 4.11

        Michael McCandless added a comment -

        Thanks Rob!

        Anshum Gupta added a comment -

        Bulk close after 5.0 release.


          People

          • Assignee: Michael McCandless
          • Reporter: Michael McCandless
          • Votes: 0
          • Watchers: 5
