Lucene - Core
Parent: LUCENE-3924 Optimize buffer size handling in RAMDirectory to make it more GC friendly
LUCENE-3659

Allow per-RAMFile buffer sizes based on IOContext and source of data (e.g. copy from another directory)

    Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: 4.9, 5.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Spinoff from several dev@lao issues:

      The use cases for RAMDirectory are very limited, and to prevent users from using it for e.g. loading a 50-gigabyte index from a file on disk, we should improve the javadocs.

      1. LUCENE-3659.patch
        10 kB
        Uwe Schindler
      2. LUCENE-3659.patch
        14 kB
        Uwe Schindler
      3. LUCENE-3659.patch
        15 kB
        Uwe Schindler
      4. LUCENE-3659.patch
        15 kB
        Uwe Schindler

        Activity

        Erick Erickson added a comment -

        From the dev list, didn't want to lose this background (or make Uwe type it again <G>)

        The idea was to maybe replace RAMDirectory by a “clone” of MMapDirectory that uses large DirectByteBuffers outside the JVM heap. The current RAMDirectory is very limited (the buffer size is hardcoded to 8 KB); if you have a 50-gigabyte index in this RAMDirectory, your GC simply goes crazy – we investigated this several times for customers. RAMDirectory was in fact several times slower than a simple disk-based MMapDir. Also, the locking on the RAMFile class is horrible: for large indexes you have to change buffers several times when seeking/reading/…, which causes heavy locking. In contrast, MMapDir is completely lock-free!

        Until there is a replacement we will not remove it, but the current RAMDirectory is not usable for large indexes. That’s a limitation, and the design of this class does not support anything else. It’s currently unfixable, and instead of putting work into fixing it, the time should be spent working on a new ByteBuffer-based RAMDir with larger blocks (or blocks that merge), or with IOContext helping to calculate the file size before writing it (e.g. when triggering a merge you know the approximate size of the file beforehand, so you can allocate a buffer that’s better than 8 kilobytes). A DirectByteBuffer also helps to make GC happy, as the RAMDir is then outside the JVM heap.

        The RAMDir spends more time switching buffers than reading the data. The problem is that MMapDir does not support writing, and that’s why we plan to improve this. If you compare MMapDir for read access against RAMDirectory for a larger index, MMapDir outperforms it several times over (depending on the OS and whether the file data is in the FS cache already). The new directory will simply mimic MMapIndexInput and add an MMapIndexOutput, but based not on an mmapped buffer but on an in-memory (Direct)ByteBuffer (outside or inside the JVM heap – both will be supported). This simplifies the code a lot.

        The limitations of the crappy RAMDirectory were discussed at conferences, sorry. We did *not* decide to remove it (without a patch/replacement). The whole “message” of the issue was that RAMDirectory is a bad idea. The recommended approach at the moment for handling large in-RAM directories is to use a tmpfs on Linux/Solaris and MMapDir on top of it (for larger indexes). The mmap would then directly map the RAM of the underlying tmpfs.

        Uwe Schindler added a comment -

        It's even worse; it uses a buffer size of 1 kilobyte:

        public class RAMOutputStream extends IndexOutput {
          static final int BUFFER_SIZE = 1024;
          // ...
        }
        A 50-gigabyte file means 52,428,800 byte[] arrays.
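A back-of-the-envelope check of that figure: with a fixed buffer size, the array count is just a ceiling division. The class and method names below are illustrative, not Lucene API.

```java
// Count how many byte[] buffers a file of a given size needs when the
// directory slices it into fixed-size buffers (1 KB in RAMOutputStream).
public class BufferCount {
    static long buffersNeeded(long fileSizeBytes, int bufferSize) {
        // ceiling division: every started buffer counts as a full array
        return (fileSizeBytes + bufferSize - 1) / bufferSize;
    }

    public static void main(String[] args) {
        long fiftyGb = 50L * 1024 * 1024 * 1024;
        // 50 GB at 1 KB per buffer -> 52,428,800 arrays for the GC to track
        System.out.println(buffersNeeded(fiftyGb, 1024));
    }
}
```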

        Robert Muir added a comment -

        I actually think heap versus direct is just an optimization (and ideally would just be an option; in my opinion not the default).
        The problem is mostly the tiny buffers.

        I think a good idea is to rename MMapIndexInput to ByteBufferIndexInput; it does not really care whether something is mapped or not, and it has all the logic for dealing with multiple fixed-size buffers.

        And RAMDirectory could then just use a 1 MB shift by default (normal heap array-backed buffers). Sure, it wastes at most 1 MB for tiny files, but the RAMFile is wasteful today too.

        Down the road we could optimize this: e.g. add IOContext.METADATA for tiny files (segments.gen, .fnm, segments_N, .per, etc.), and RAMDir could use, say, a 256 KB shift there and 4 MB otherwise. This IOContext flag could also be used if someone didn't want to mmap tiny files.
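The context-dependent sizing Robert describes could be sketched as picking a power-of-two block shift per file kind. Note that IOContext.METADATA does not exist at this point; the enum and the 256 KB / 4 MB values below are taken from his suggestion, everything else is an assumption.

```java
// Sketch: choose a power-of-two block size ("shift") based on whether
// the file being written is tiny metadata or a potentially huge file.
public class BlockSizing {
    enum Context { METADATA, DEFAULT } // hypothetical stand-in for IOContext

    static int blockShift(Context ctx) {
        // 1 << 18 = 256 KB for metadata, 1 << 22 = 4 MB otherwise
        return ctx == Context.METADATA ? 18 : 22;
    }

    static int blockSize(Context ctx) {
        return 1 << blockShift(ctx);
    }
}
```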

        Shay Banon added a comment -

        Uwe: Hey, have you looked at LUCENE-2292? It could be a good candidate to replace RAM directory. I just "refreshed" it and it seems to work fine (all tests pass).

        Simon Willnauer added a comment -

        should we make this a blocker for 4.0?

        Uwe Schindler added a comment -

        Given the misinformation about using RAMDir going around, yes, it's a blocker - just look at java-user@lao last week: those mails from people saying "I want to copy my 20 Gigabyte FSDir to a RAMDir because it's faster, as it has the word 'RAM' in it" look more like an XY problem than a good reason for using it. MMapDir and even NIOFSDir work mostly from RAM, as the OS will cache for you.

        On the other hand, it's just documentation; we can do that at any time.

        Fixing the default buffer size to something more suitable for real-world use cases can be done with a one-line patch. I would prefer a 64-kilobyte buffer size.

        Uwe Schindler added a comment -

        I think we should fix at least the javadocs for 3.6 and maybe raise the buffer size. I will provide a patch tomorrow.

        Uwe Schindler added a comment -

        I started to work on this; here is just a first step (trunk). This patch removes the BUFFER_SIZE constant and moves it up to RAMDirectory (but for now only as a default, see below!). RAMDirectory passes its default buffer size down to its RAMFile children (newRAMFile() method) for now, but this can likely change (see below).

        As every RAMFile has its own buffer size, optimizations are possible:

        • When you open an IndexOutput, in trunk we get the IOContext, which may contain a merge/flush descriptor with the complete segment size (unfortunately only the complete segment size, not per-file sizes). But this number can be used as an order of magnitude for specifying the buffer size.

        The patch does not yet implement that, but an idea would be to allocate maybe 1/32 of the segment size as the buffer size. That way the buffer size does not get too big, but on the other hand the number of slices has an upper limit (approx. 32 slices per merged segment). Currently a merged segment with a size of, say, 32 gigabytes would have 32 million byte[] arrays; after the change, only 32 byte[] arrays with a size of 1 gigabyte each. This should make GC happy.
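The 1/32 heuristic could be sketched as below. The clamping bounds (8 KB floor, 1 GB cap) and names are assumptions for illustration, not taken from the patch.

```java
// Sketch of the 1/32-of-segment-size heuristic: derive a per-file buffer
// size from the estimated merged-segment size, clamped to sane bounds.
public class MergeBufferHeuristic {
    static final int MIN_BUFFER = 8 * 1024; // fall back to 8 KB for small segments
    static final int MAX_BUFFER = 1 << 30;  // cap buffers at 1 GB

    static int bufferSizeFor(long estimatedSegmentBytes) {
        long size = estimatedSegmentBytes / 32; // at most ~32 slices per segment
        if (size < MIN_BUFFER) return MIN_BUFFER;
        if (size > MAX_BUFFER) return MAX_BUFFER;
        return (int) size;
    }
}
```

With this sizing, a 32 GB merged segment gets 1 GB buffers (32 slices) instead of millions of 1 KB arrays.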

        When backporting to 3.x, the IOContext is not yet available, so RAMDirectory always uses the default buffer size (maybe randomized in tests). Raising the buffer size should bring improvements here.

        We should still add some warnings to the javadocs that for large indexes it is often preferable to use MMapDir, especially when the index is stored on disk. We should also tell people that new RAMDirectory(otherDirectory) may be a bad idea...

        The new default buffer size was raised from 1024 to 8192.

        Robert Muir added a comment -

        Honestly, I don't even have the time to review the patch: I'm sure Uwe's changes (as always)
        are very nice and thorough.

        I just want to propose the idea of a javadocs-only fix for 3.6: I am afraid of any .store changes,
        except serious bugfixes (with serious tests to go with them), this close to release.

        Uwe Schindler added a comment -

        I played around a little bit and implemented the IOContext- / filename-dependent buffer sizes for RAMFiles.

        The code currently prints out lots of size information (like buffer sizes) on RAMDirectory.close(). This is just for debugging and to show what happens.

        To actually see real-world use cases, execute the tests with ant test -Dtests.directory=RAMDirectory -Dtests.nightly=true

        Uwe Schindler added a comment -

        More improvements:

        • If you use new RAMDirectory(existingDir), the RAMFiles in the created RAMDirectory will have the original fileSize (if less than 1L << 30 bytes) as their bufferSize, as we know the file size upfront.
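The copy case above could be sketched as follows: when the source file length is known upfront, one buffer of exactly that size suffices, with a fallback for files at or above the 1 GB cap. Names and the fallback choice are illustrative assumptions.

```java
// Sketch: size the (single) buffer to the known file length when copying
// from an existing directory, falling back to the directory default for
// files of 1 GB or more.
public class CopyBufferSize {
    static final long CAP = 1L << 30; // 1 GB, as in the comment above

    static long bufferSizeFor(long knownFileSize, long defaultSize) {
        return knownFileSize < CAP ? knownFileSize : defaultSize;
    }
}
```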
        Michael McCandless added a comment -

        This looks great Uwe!

        I'm a little worried about the tiny file case; you're checking for
        SEGMENTS_* now, but many other files can be much smaller than 1/64th
        of the estimated segment size.

        I wonder if we should "improve" IOContext to hold the [rough]
        estimated file size (not just overall segment size)... the thing is
        that's sort of a hassle on codec impls.

        Or: maybe, on closing the ROS/RAMFile, we can downsize the final
        buffer (yes, this means copying the bytes, but that cost is vanishingly
        small as the RAMDir grows). Then tiny files stay tiny, though they
        are still [relatively] costly to create...

        I don't think RAMDir.createOutput should publish the RAMFile until the
        ROS is closed? Ie, you are not allowed to openInput on something
        still opened with createOutput in any Lucene Dir impl..? This would
        allow us to make RAMFile frozen (eg if ROS holds its own buffers and
        then creates the RAMFile on close), which requires no sync when reading?

        I also don't think RAMFile should be public, ie, the only way to make
        changes to a file stored in a RAMDir is via RAMOutputStream. We can
        do this separately...

        Maybe we should pursue a growing buffer size...? Ie, where each newly
        added buffer is bigger than the one before (like ArrayUtil.oversize's
        growth function)... I realize that adds complexity
        (RAMInputStream.seek is more fun), but this would let tiny files use
        tiny RAM and huge files use few buffers. Ie, RAMDir would scale up
        and scale down well.
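The growing-buffer idea could be sketched as below. Simple doubling with a cap stands in here for a growth function in the spirit of ArrayUtil.oversize; the 32 MB cap and all names are assumptions for illustration.

```java
// Sketch: each newly added buffer is larger than the previous one, so
// tiny files use tiny RAM while huge files need only a few buffers.
public class GrowingBuffers {
    static final int CAP = 1 << 25; // 32 MB cap per buffer (an assumption)

    static int nextBufferSize(int previous) {
        if (previous >= CAP) return CAP;
        return Math.min(CAP, previous * 2); // double until the cap
    }

    // How many buffers does a file of the given size need, starting small?
    static int buffersNeeded(long fileSize, int firstBuffer) {
        int count = 0;
        int current = firstBuffer;
        long remaining = fileSize;
        while (remaining > 0) {
            remaining -= current;
            count++;
            current = nextBufferSize(current);
        }
        return count;
    }
}
```

The trade-off Michael mentions is visible in the seek logic: with variable buffer sizes, RAMInputStream.seek can no longer divide by a constant and must walk (or binary-search) the buffer boundaries.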

        Separately: I noticed we still have IndexOutput.setLength, but, nobody
        calls it anymore I think? (In 3.x we call this when creating a CFS).
        Maybe we should remove it...

        Robert Muir added a comment -

        I'm a little worried about the tiny file case; you're checking for
        SEGMENTS_* now, but many other files can be much smaller than 1/64th
        of the estimated segment size.

        I wonder if we should "improve" IOContext to hold the [rough]
        estimated file size (not just overall segment size)... the thing is
        that's sort of a hassle on codec impls.

        Maybe it's enough for IOContext to specify that it's writing a 'metadata'
        file? These are all the tiny ones (fieldinfos, segmentinfos, .cfe, etc.),
        as opposed to 'real' files like frq or prx that are expected to be possibly huge.

        Uwe Schindler added a comment - edited

        Robert: That was the first idea that came to my mind, too. I think that's a good idea. It is especially strange that the segments_xx/segments.gen file (which is not part of the current segment) is written with a MERGE/FLUSH context. Shouldn't it be written with a standard context? Or am I missing something? (This was the reason why I added the file name check.) Initially I expected that writing the commit would be done with a separate IOContext, but it isn't - the noisy debugging helped.

        Robert Muir added a comment -

        I think if we were to implement it this way, it's not a burden on codecs.
        Lucene core always initializes the codec APIs with a context somewhere by default.
        For example SegmentInfos.write():

        infosWriter.writeInfos(directory, segmentFileName, codec.getName(), this, IOContext.DEFAULT);
        

        and DocFieldProcessor/SegmentMerger for fieldinfos:

        infosWriter.write(state.directory, state.segmentName, state.fieldInfos, IOContext.DEFAULT);
        

        These callers would just set this in the IOContext. Most/all codecs just pass it along.
        If a codec wants to ignore the IOContext and lie about it, that's its own choice.
        So I think it's an easy change.

        Michael McCandless added a comment -

        I'm torn on the binary "metadata" idea... not all files fall cleanly into one category?

        Eg what about live-doc bits? They can easily be tiny (we write a sparse set sparsely).

        Indices with immense docs will also start to look like they have tiny files that are not metadata (eg the fdx file, if they don't store fields).

        Robert Muir added a comment -

        But also codecs that write their own private tiny metadata files (like .per from PerFieldPostingsFormat)
        should set this in the context.

        Robert Muir added a comment -

        Live docs aren't metadata. I think you are conflating 'tiny' with 'metadata'.

        I'm saying we should declare it metadata, that's all. This is pretty black and white!

        If a directory wants to, as a heuristic, interpret metadata == tiny, then that's fine,
        but that's separate.

        Uwe Schindler added a comment -

        I also don't think RAMFile should be public, ie, the only way to make
        changes to a file stored in a RAMDir is via RAMOutputStream. We can
        do this separately...

        RAMFile's public ctor without a directory is only used by PrefixCodedTerms, itself used only by FrozenBufferedDeletes. I don't really see the real use case for doing it like that. We could maybe replace that with an FST (it is already sorted by BytesRef) or with PagedBytes? Alternatively, replace the whole thing with OutputStreamDataOutput(new ByteArrayOutputStream())?

        Robert Muir added a comment -

        oops: I did that, sorry.

        It just wants a thing that combines byte[] slices and that you can
        get a DataInput from, so it seemed like the right thing?

        Uwe Schindler added a comment -

        Hi,
        I will soon work again on that. I have some comments:

        • We can remove the heavy synchronization bottleneck on RAMFile. RAMFile should only have final fields and should be created after the file is written. This should improve performance altogether. The current synchronization is needed to "emulate" real file-system behaviour (the file is visible in the directory with 0 bytes once created). This behaviour is not needed at all by Lucene. We should make the file visible in the ConcurrentHashMap of RAMDirectory only once the IndexOutput is closed! We should create the RAMFile instance at that stage, not before (so everything is final). By this, all sync on RAMFile can be removed.
        • We should add IOContext.META
        • Maybe we should rename RAMDirectory in trunk/4.x to HeapDirectory. Then we could have other impls like DirectBufferDirectory or whatever (see Shay Banon's LUCENE-2292)
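The publish-on-close idea in the first bullet can be sketched as follows: the output stream owns mutable state while writing, and only close() builds an immutable file object and makes it visible in the directory's map. All class and field names here are illustrative, not the actual Lucene classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

public class PublishOnClose {
    // Immutable once constructed: all fields final, so readers need no sync.
    static final class Frozen {
        final byte[][] buffers;
        final long length;
        Frozen(byte[][] buffers, long length) {
            this.buffers = buffers;
            this.length = length;
        }
    }

    final ConcurrentHashMap<String, Frozen> files = new ConcurrentHashMap<>();

    final class Output {
        private final String name;
        private final List<byte[]> pending = new ArrayList<>();
        private long length;

        Output(String name) { this.name = name; }

        void writeBuffer(byte[] buf) {
            pending.add(buf);
            length += buf.length;
        }

        // The file only becomes visible here, fully written and immutable.
        void close() {
            files.put(name, new Frozen(pending.toArray(new byte[0][]), length));
        }
    }
}
```

Until close() runs, the file simply does not exist in the map, which matches the observation that Lucene never needs to openInput a file still open for writing.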
        Robert Muir added a comment -

        There is a patch somewhere to factor out MMapIndexInput into a general ByteBufferIndexInput if you follow the same rules.

        I think we can just use that? You can have a direct and an array-backed version (just have some hook to allocate a new ByteBuffer of some size). I think we should just start with the array-backed one for simplicity. Maybe the direct one can avoid some array bounds checks, but otherwise it's not really
        related to the stupidity of the current RAMDirectory. Arrays are fine; it's just that they shouldn't be so tiny.

        Uwe Schindler added a comment -

        Patch updated to trunk. I will work on this soon.

        Steve Rowe added a comment -

        Bulk move 4.4 issues to 4.5 and 5.0

        Uwe Schindler added a comment -

        Move issue to Lucene 4.9.


          People

          • Assignee:
            Uwe Schindler
          • Reporter:
            Uwe Schindler
          • Votes:
            0
          • Watchers:
            4

            Dates

            • Created:
              Updated:

              Development