Lucene - Core
  1. Lucene - Core
  2. LUCENE-4371

consider refactoring slicer to indexinput.slice

    Details

    • Type: Task Task
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.9, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      From LUCENE-4364:

      In my opinion, we should maybe check, if we can remove the whole Slicer in all Indexinputs? Just make the slice(...) method return the current BufferedIndexInput-based one. This could be another issue, once this is in.

      1. LUCENE-4371.patch
        40 kB
        Robert Muir
      2. LUCENE-4371.patch
        40 kB
        Robert Muir
      3. LUCENE-4371.patch
        28 kB
        Robert Muir
      4. LUCENE-4371.patch
        28 kB
        Robert Muir
      5. LUCENE-4371.patch
        28 kB
        Robert Muir

        Activity

        Hide
        Robert Muir added a comment -

        fyi there are some things i dont like about it, and i didnt look at the directories in misc/.

        but its worth thinking about i guess.

        Show
        Robert Muir added a comment - fyi there are some things i dont like about it, and i didnt look at the directories in misc/. but its worth thinking about i guess.
        Hide
        Uwe Schindler added a comment -

        +1 to rermove the slicer! I was thinking about that the whole time and it really makes it easier.

        In general for 4.0, we should make the slice method abstract and not provide a default implementation. In most cases its as easy to implement like clone().

        Show
        Uwe Schindler added a comment - +1 to rermove the slicer! I was thinking about that the whole time and it really makes it easier. In general for 4.0, we should make the slice method abstract and not provide a default implementation. In most cases its as easy to implement like clone().
        Hide
        Robert Muir added a comment -

        Well realistically I'm not sure we have to fix this for 4.0? We could do this one later in a fairly backwards compatible/compatiblish way:

        E.g. a good approach might be to make Directory.createSlicer just call clone/slice.

        Another safer alternative might introduce a minor break: by making that deprecation final and having CFS code just use slice() directly,
        listing it in the backwards break section. this would just be a minor break in expert usage.

        Show
        Robert Muir added a comment - Well realistically I'm not sure we have to fix this for 4.0? We could do this one later in a fairly backwards compatible/compatiblish way: E.g. a good approach might be to make Directory.createSlicer just call clone/slice. Another safer alternative might introduce a minor break: by making that deprecation final and having CFS code just use slice() directly, listing it in the backwards break section. this would just be a minor break in expert usage.
        Hide
        Michael McCandless added a comment -

        +1

        I think having II implement slice is much cleaner than Directory having to implement createSlicer returning an IndexInputSlicer with only one method.

        Show
        Michael McCandless added a comment - +1 I think having II implement slice is much cleaner than Directory having to implement createSlicer returning an IndexInputSlicer with only one method.
        Hide
        Michael McCandless added a comment -

        I don't think the default impl (SlicedIndexInput) should overrided BII's copyBytes? Seems ... spooky.

        Show
        Michael McCandless added a comment - I don't think the default impl (SlicedIndexInput) should overrided BII's copyBytes? Seems ... spooky.
        Hide
        Robert Muir added a comment -

        I agree Mike, i wanted to remove it... but I'm afraid!

        I also dont understand why we have DataOutput.copyBytes(DataInput), and also IndexInput.copyBytes(IndexOutput).
        Is this all really necessary?

        Show
        Robert Muir added a comment - I agree Mike, i wanted to remove it... but I'm afraid! I also dont understand why we have DataOutput.copyBytes(DataInput), and also IndexInput.copyBytes(IndexOutput). Is this all really necessary?
        Hide
        Robert Muir added a comment -

        just syncing the patch up to trunk.

        part of the funkiness i dont like is e.g. NIOFSIndexInput extends SimpleIndexInput. This is not good. I will see if i can clear that up in a separate issue.

        Show
        Robert Muir added a comment - just syncing the patch up to trunk. part of the funkiness i dont like is e.g. NIOFSIndexInput extends SimpleIndexInput. This is not good. I will see if i can clear that up in a separate issue.
        Hide
        Robert Muir added a comment -

        just an updated patch after LUCENE-4380. still work to be done.

        Show
        Robert Muir added a comment - just an updated patch after LUCENE-4380 . still work to be done.
        Hide
        Robert Muir added a comment -

        Cleaned up patch (as trunk has changed much), I think I'm happy with this one:

        Just a summary:

        • Remove IndexInputSlicer and Directory.createSlicer()
        • Add abstract IndexInput.slice(), behaves like clone(), but takes offset and length
        • MMap, NIOFS, SimpleFS just clone with adjusted offset.
        • MMap "slice-of-slice" assert (cfs within cfs) is removed because Lucene40DocValues does this, and slicing a slice now does what you think (before you got the default buffering impl over mmap...)
        • the previous "Default impl" is moved to BufferedIndexInput.wrap(). Its only used by RAMdir and a few other wierd ones. This makes it explicit they are using buffering wrapper for CFS.
        Show
        Robert Muir added a comment - Cleaned up patch (as trunk has changed much), I think I'm happy with this one: Just a summary: Remove IndexInputSlicer and Directory.createSlicer() Add abstract IndexInput.slice(), behaves like clone(), but takes offset and length MMap, NIOFS, SimpleFS just clone with adjusted offset. MMap "slice-of-slice" assert (cfs within cfs) is removed because Lucene40DocValues does this, and slicing a slice now does what you think (before you got the default buffering impl over mmap...) the previous "Default impl" is moved to BufferedIndexInput.wrap(). Its only used by RAMdir and a few other wierd ones. This makes it explicit they are using buffering wrapper for CFS.
        Hide
        Uwe Schindler added a comment -

        Looks cool.

        I was a bit confused about ByteBufferIndexInput, because this one already has slice(...). We should add @Override here, because it now implements abstract method.

        I still have to think if close works as expected, but this did not change as before. Maybe this is my misunderstanding, but it is really confusing:
        Slices are always closed by consumer code (not like clones) or not? If yes, all looks fine, but we should document this: clones do not need to be closed, but what about slices? I think we use the same FileDescriptor, so we also don't need to close the slices?

        Show
        Uwe Schindler added a comment - Looks cool. I was a bit confused about ByteBufferIndexInput, because this one already has slice(...) . We should add @Override here, because it now implements abstract method. I still have to think if close works as expected, but this did not change as before. Maybe this is my misunderstanding, but it is really confusing: Slices are always closed by consumer code (not like clones) or not? If yes, all looks fine, but we should document this: clones do not need to be closed, but what about slices? I think we use the same FileDescriptor, so we also don't need to close the slices?
        Hide
        Uwe Schindler added a comment -

        Btw, thanks for hiding and making the concrete FSDirIndexInputs hidden and especially final! Great step. The protected annoyed me for long time, but for backwards compatibility I never removed them (although I am sure nobody was ever able to subclass them correctly!).

        In ByteBufferIndexInput.slice() the return value is a package-protected class, so we should change this to the general IndexInput like in the abstract base class, otherwise the Javadocs will be look broken. This applies to the other classes and their clone(), too. The caller only needs the abstract IndexInput (especially if the impl class is invisible).

        Show
        Uwe Schindler added a comment - Btw, thanks for hiding and making the concrete FSDirIndexInputs hidden and especially final! Great step. The protected annoyed me for long time, but for backwards compatibility I never removed them (although I am sure nobody was ever able to subclass them correctly!). In ByteBufferIndexInput.slice() the return value is a package-protected class, so we should change this to the general IndexInput like in the abstract base class, otherwise the Javadocs will be look broken. This applies to the other classes and their clone(), too. The caller only needs the abstract IndexInput (especially if the impl class is invisible).
        Hide
        Michael McCandless added a comment -

        +1, this is an awesome simplification!

        Show
        Michael McCandless added a comment - +1, this is an awesome simplification!
        Hide
        Robert Muir added a comment -

        We should add @Override here, because it now implements abstract method.

        Oh, thanks, I forgot this.

        I think we use the same FileDescriptor, so we also don't need to close the slices?

        Slices are just like clones. So for example CFSDirectory holds an input over the entire .cfs file, and when you ask to open a "file" within the cfs it returns a slice (clone) of it. when you close the cfs it closes the real one.

        In ByteBufferIndexInput.slice() the return value is a package-protected class, so we should change this to the general IndexInput like in the abstract base class, otherwise the Javadocs will be look broken.

        What javadocs? This is not a public class

        Show
        Robert Muir added a comment - We should add @Override here, because it now implements abstract method. Oh, thanks, I forgot this. I think we use the same FileDescriptor, so we also don't need to close the slices? Slices are just like clones. So for example CFSDirectory holds an input over the entire .cfs file, and when you ask to open a "file" within the cfs it returns a slice (clone) of it. when you close the cfs it closes the real one. In ByteBufferIndexInput.slice() the return value is a package-protected class, so we should change this to the general IndexInput like in the abstract base class, otherwise the Javadocs will be look broken. What javadocs? This is not a public class
        Hide
        Uwe Schindler added a comment -

        What javadocs? This is not a public class

        You are right, because MMapIndexInput is private, too!

        Show
        Uwe Schindler added a comment - What javadocs? This is not a public class You are right, because MMapIndexInput is private, too!
        Hide
        Robert Muir added a comment -

        Added missing @Override

        By the way, i noticed something when refactoring the code: slicing/cloning currently has no safety (except for MMAP). we should think about this for NIO/Simple too: simple range checks that the slice is in bounds and maybe that the channel.isOpen. CFSDir could check some of this too, because its "handle" is now an ordinary input.

        But i didn't want to stir up controversy in this refactor (it is unrelated to this patch). I think there is no performance impact of adding such checks to NIO/Simple because they already must suffer a buffer refill here anyway. So maybe we can just open a followup.

        Show
        Robert Muir added a comment - Added missing @Override By the way, i noticed something when refactoring the code: slicing/cloning currently has no safety (except for MMAP). we should think about this for NIO/Simple too: simple range checks that the slice is in bounds and maybe that the channel.isOpen. CFSDir could check some of this too, because its "handle" is now an ordinary input. But i didn't want to stir up controversy in this refactor (it is unrelated to this patch). I think there is no performance impact of adding such checks to NIO/Simple because they already must suffer a buffer refill here anyway. So maybe we can just open a followup.
        Hide
        ASF subversion and git services added a comment -

        Commit 1595480 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1595480 ]

        LUCENE-4371: Replace IndexInputSlicer with IndexInput.slice

        Show
        ASF subversion and git services added a comment - Commit 1595480 from Robert Muir in branch 'dev/trunk' [ https://svn.apache.org/r1595480 ] LUCENE-4371 : Replace IndexInputSlicer with IndexInput.slice
        Hide
        Robert Muir added a comment -

        Reopen for 4.9 backport.

        Show
        Robert Muir added a comment - Reopen for 4.9 backport.
        Hide
        Uwe Schindler added a comment -

        Thanks! We need this backported for the ByteBufferIndexInput improvements.

        Show
        Uwe Schindler added a comment - Thanks! We need this backported for the ByteBufferIndexInput improvements.
        Hide
        ASF subversion and git services added a comment -

        Commit 1599274 from Robert Muir in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1599274 ]

        LUCENE-4371: Replace IndexInputSlicer with IndexInput.slice

        Show
        ASF subversion and git services added a comment - Commit 1599274 from Robert Muir in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1599274 ] LUCENE-4371 : Replace IndexInputSlicer with IndexInput.slice
        Hide
        ASF subversion and git services added a comment -

        Commit 1599275 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1599275 ]

        LUCENE-4371: move CHANGES entry

        Show
        ASF subversion and git services added a comment - Commit 1599275 from Robert Muir in branch 'dev/trunk' [ https://svn.apache.org/r1599275 ] LUCENE-4371 : move CHANGES entry
        Hide
        ASF subversion and git services added a comment -

        Commit 1599284 from Robert Muir in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1599284 ]

        LUCENE-4371: remove bogus and bogusly placed assert

        Show
        ASF subversion and git services added a comment - Commit 1599284 from Robert Muir in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1599284 ] LUCENE-4371 : remove bogus and bogusly placed assert

          People

          • Assignee:
            Unassigned
            Reporter:
            Robert Muir
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development