Lucene - Core
  1. Lucene - Core
  2. LUCENE-6165

Change merging APIs to work on CodecReader instead of LeafReader

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Patch factors out "reader based on codec apis" and changes all merge policy/addIndexes apis to use this.

      If you want to do slow wrapping, you can still do it, just use SlowCodecReaderWrapper.wrap(LeafReader) yourself (versus SegmentMerger doing it always if its not a SegmentReader).

      Also adds FilterCodecReader, to make it easier to start efficiently filtering on merge. I cutover all the index splitters to this. This means they should be much much faster with this patch, they just change the deletes as you expect, and the merge is as optimal as a normal one.

      In other places, for now I think we should just do a rote conversion with SlowCodecReaderWrapper.wrap. Its no slower than today, just explicit, and we can incrementally fix them to do the right thing in the future rather than all at once.

        Activity

        Hide
        Robert Muir added a comment -

        patch. all tests pass.

        Show
        Robert Muir added a comment - patch. all tests pass.
        Hide
        Adrien Grand added a comment -

        +1 I like the patch

        Show
        Adrien Grand added a comment - +1 I like the patch
        Hide
        Michael McCandless added a comment -

        +1

        Show
        Michael McCandless added a comment - +1
        Hide
        Uwe Schindler added a comment -

        Hi, I think this is a clean approach. People can still reuse their old & slow code with conventional filterReaders, but people who take care of speed can use CodecReader.

        I just wonder why - in contrast to SlowCompositeReaderWrapper - the SlowCodecReaderWrapper extends Object and does not implement CodecReader directly. Instead the impl is anonymous subclass in wrap(). I think we should make the wrapper itsself implement CodecReader, but still with private ctor. This would also prevent the synthetic access$XX methods to work around the private methods. In addition this would allow to check if a CodecReader is slow via instanceof.

        Show
        Uwe Schindler added a comment - Hi, I think this is a clean approach. People can still reuse their old & slow code with conventional filterReaders, but people who take care of speed can use CodecReader. I just wonder why - in contrast to SlowCompositeReaderWrapper - the SlowCodecReaderWrapper extends Object and does not implement CodecReader directly. Instead the impl is anonymous subclass in wrap(). I think we should make the wrapper itsself implement CodecReader, but still with private ctor. This would also prevent the synthetic access$XX methods to work around the private methods. In addition this would allow to check if a CodecReader is slow via instanceof.
        Hide
        Robert Muir added a comment -

        just wonder why - in contrast to SlowCompositeReaderWrapper - the SlowCodecReaderWrapper extends Object and does not implement CodecReader directly. Instead the impl is anonymous subclass in wrap(). I think we should make the wrapper itsself implement CodecReader, but still with private ctor. This would also prevent the synthetic access$XX methods to work around the private methods. In addition this would allow to check if a CodecReader is slow via instanceof.

        Because it is not here to stay. We need to remove it, but I cannot do this shit all in one patch. I don't think we need to put such investments into the slow wrapper for that reason.

        Show
        Robert Muir added a comment - just wonder why - in contrast to SlowCompositeReaderWrapper - the SlowCodecReaderWrapper extends Object and does not implement CodecReader directly. Instead the impl is anonymous subclass in wrap(). I think we should make the wrapper itsself implement CodecReader, but still with private ctor. This would also prevent the synthetic access$XX methods to work around the private methods. In addition this would allow to check if a CodecReader is slow via instanceof. Because it is not here to stay. We need to remove it, but I cannot do this shit all in one patch. I don't think we need to put such investments into the slow wrapper for that reason.
        Hide
        Robert Muir added a comment -

        It also has no state (unlike slowwrapper). No need for overengineering here.

        Show
        Robert Muir added a comment - It also has no state (unlike slowwrapper). No need for overengineering here.
        Hide
        Robert Muir added a comment -

        I'm committing this as-is. Uwe if you want to refactor that reader, i have no problem with it.

        The current code is simply moved out of SegmentMerger and is the safe approach.

        Show
        Robert Muir added a comment - I'm committing this as-is. Uwe if you want to refactor that reader, i have no problem with it. The current code is simply moved out of SegmentMerger and is the safe approach.
        Hide
        Uwe Schindler added a comment -

        No problem i will refactor this with a small patch afterwards. I have to fix Jenkins first, I broke it today.... arg

        Show
        Uwe Schindler added a comment - No problem i will refactor this with a small patch afterwards. I have to fix Jenkins first, I broke it today.... arg
        Hide
        ASF subversion and git services added a comment -

        Commit 1650301 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1650301 ]

        LUCENE-6165: Change merging APIs from LeafReader to CodecReader

        Show
        ASF subversion and git services added a comment - Commit 1650301 from Robert Muir in branch 'dev/trunk' [ https://svn.apache.org/r1650301 ] LUCENE-6165 : Change merging APIs from LeafReader to CodecReader
        Hide
        ASF subversion and git services added a comment -

        Commit 1650308 from Robert Muir in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1650308 ]

        LUCENE-6165: Change merging APIs from LeafReader to CodecReader

        Show
        ASF subversion and git services added a comment - Commit 1650308 from Robert Muir in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1650308 ] LUCENE-6165 : Change merging APIs from LeafReader to CodecReader
        Hide
        Anshum Gupta added a comment -

        Bulk close after 5.0 release.

        Show
        Anshum Gupta added a comment - Bulk close after 5.0 release.

          People

          • Assignee:
            Unassigned
            Reporter:
            Robert Muir
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development