Lucene - Core
  1. Lucene - Core
  2. LUCENE-1743

MMapDirectory should only mmap large files, small files should be opened using SimpleFS/NIOFS

    Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 2.9
    • Fix Version/s: 4.9, 5.0
    • Component/s: core/store
    • Labels:
    • Lucene Fields:
      New

      Description

      This is a followup to LUCENE-1741:
      Javadocs state (in FileChannel#map): "For most operating systems, mapping a file into memory is more expensive than reading or writing a few tens of kilobytes of data via the usual read and write methods. From the standpoint of performance it is generally only worth mapping relatively large files into memory."
      MMapDirectory should get a user-configureable size parameter that is a lower limit for mmapping files. All files with a size<limit should be opened using a conventional IndexInput from SimpleFS or NIO (another configuration option for the fallback?).

        Activity

        Uwe Schindler created issue -
        Hide
        Eks Dev added a comment -

        indeed! obvious idea,

        the only thing I do not like with it is making these hidden, deceptive decisions "I said I want MMapDirectory and someone else decided something else for me"... it does not matter if we have conses here now, it may change tomorrow

        probably better way would be to turbo charge FileSwitchDirectory with sexy parametrization options,
        MMapDirectory <- F(fileExtension, minSize, maxSize) // If <fileExtension> and file size less than <maxSize> and greater than <minSize> than open file with MMapDirectory... than go on on next rule... (can be designed upside down as well... changes nothing in idea)

        the same for RAMDir, NIO, FS...

        With this, we can make UwesBestOfMMapDirectoryFor32BitOSs (your proposal here) or
        HighlyConcurentForWindows64WithTermDictionaryInRamAndStoredFieldsOnDiskDirectory just for me

        So the most of the end users take some smart defaults we provide in core, and freaks (Expert users in official lingo have their job easy, just to configure TurboChargedFileSwitchDirectory

        Should be easy to come up with clean design for these "Concrete Directory selection rules" by keeping concrete Directories "pure"

        Cheers, Eks

        Show
        Eks Dev added a comment - indeed! obvious idea, the only thing I do not like with it is making these hidden, deceptive decisions "I said I want MMapDirectory and someone else decided something else for me"... it does not matter if we have conses here now, it may change tomorrow probably better way would be to turbo charge FileSwitchDirectory with sexy parametrization options, MMapDirectory <- F(fileExtension, minSize, maxSize) // If <fileExtension> and file size less than <maxSize> and greater than <minSize> than open file with MMapDirectory... than go on on next rule... (can be designed upside down as well... changes nothing in idea) the same for RAMDir, NIO, FS... With this, we can make UwesBestOfMMapDirectoryFor32BitOSs (your proposal here) or HighlyConcurentForWindows64WithTermDictionaryInRamAndStoredFieldsOnDiskDirectory just for me So the most of the end users take some smart defaults we provide in core, and freaks (Expert users in official lingo have their job easy, just to configure TurboChargedFileSwitchDirectory Should be easy to come up with clean design for these "Concrete Directory selection rules" by keeping concrete Directories "pure" Cheers, Eks
        Hide
        Michael McCandless added a comment -

        At a minimum we should make FileSwitchDirectory friendly for subclassing, eg so you can override the currently private getDirectory method to implement your own custom logic. Hmm... we should somehow pass the "context" (at least 'read' vs 'write') to getDirectory() as well...

        Show
        Michael McCandless added a comment - At a minimum we should make FileSwitchDirectory friendly for subclassing, eg so you can override the currently private getDirectory method to implement your own custom logic. Hmm... we should somehow pass the "context" (at least 'read' vs 'write') to getDirectory() as well...
        Hide
        Uwe Schindler added a comment - - edited

        HighlyConcurentForWindows64WithTermDictionaryInRamAndStoredFieldsOnDiskDirectory just for me

        By the way, for MMapDirectory the MappedByteBuffer.load() method should be somehow accesible/configureable to create this TermDictionaryInRam part (IndexInput would call load() and tells the OS to swap as much as possible of the mmaped file into RAM). Just an idea.

        The Hyper FileSwitchDirectory was my idea yesterday, too. As Mike said, at least the getDirectory() should be configureable.

        And for some good defaults, a factory could be provided like getUwesBestOfMMapDirectoryFor32BitOSs(File, LockFactory). What I do not like with the current FileSwitchDir is the fact, that you must create instances with the same Dir and LockFactory for each sub-directory (e.g. what happens if you use 2 different LockFactories on the same physical dir inside a FileSwitchDir?). Maybe the FileSwitchDirectory could just get the File and LockFactory once and creates the instances? Many ideas...

        Show
        Uwe Schindler added a comment - - edited HighlyConcurentForWindows64WithTermDictionaryInRamAndStoredFieldsOnDiskDirectory just for me By the way, for MMapDirectory the MappedByteBuffer.load() method should be somehow accesible/configureable to create this TermDictionaryInRam part (IndexInput would call load() and tells the OS to swap as much as possible of the mmaped file into RAM). Just an idea. The Hyper FileSwitchDirectory was my idea yesterday, too. As Mike said, at least the getDirectory() should be configureable. And for some good defaults, a factory could be provided like getUwesBestOfMMapDirectoryFor32BitOSs(File, LockFactory). What I do not like with the current FileSwitchDir is the fact, that you must create instances with the same Dir and LockFactory for each sub-directory (e.g. what happens if you use 2 different LockFactories on the same physical dir inside a FileSwitchDir?). Maybe the FileSwitchDirectory could just get the File and LockFactory once and creates the instances? Many ideas...
        Hide
        Eks Dev added a comment -

        right, it is not everything about reading index, you have to write it as well...

        why not making it an abstract class with
        abstract Directory getDirectory(String file, int minSize, int maxSize, String [read/write/append], String context);
        String getName(); // for logging

        What do you understand under "context"? Something along the lines /Give me directory for "segment merges", "read only" for search./
        ...Maybe one day we will have possibility not to kill OS cache by merging,

        Show
        Eks Dev added a comment - right, it is not everything about reading index, you have to write it as well... why not making it an abstract class with abstract Directory getDirectory(String file, int minSize, int maxSize, String [read/write/append] , String context); String getName(); // for logging What do you understand under "context"? Something along the lines /Give me directory for "segment merges", "read only" for search./ ...Maybe one day we will have possibility not to kill OS cache by merging,
        Hide
        Uwe Schindler added a comment -

        Another idea to think about:
        Maybe we make the base FSDirectory configureable, that you can define rules for the IndexInput or IndexOutput instances used. Why create three directories, when you could only use on that just decides, what is the right IndexInput/IndexOutput? In this case you need no MMapDirectory, no NIOFSDir, no SimpleFSDir. The basic list files, rename and so on are always identical (for FSDirs). The only difference is the IndexInput and IndexOutput.

        In this case you would only have one LockFactory, which would be good in my opinion (see above).

        So we only provide the Input/Output classes as separate top-level classes: SimpleFSIndexInput, MMapIndexInput (internal two classes for chunking or not), NIOIndexInput and for output at the beginning only the current SimpleFSIndexOutput. You can then configure your FSDirectory to use different impls depending on an method or rule defined like above. For backwards compatibility, the current MMapDirectory and so on are simple subclasses of this universal FSDir with a fixed configuration.

        FileSwitchDirectory should only used, when you really want to separate two directories in complete (stored fields on disk, index on SSD in two physical different dirs).

        Show
        Uwe Schindler added a comment - Another idea to think about: Maybe we make the base FSDirectory configureable, that you can define rules for the IndexInput or IndexOutput instances used. Why create three directories, when you could only use on that just decides, what is the right IndexInput/IndexOutput? In this case you need no MMapDirectory, no NIOFSDir, no SimpleFSDir. The basic list files, rename and so on are always identical (for FSDirs). The only difference is the IndexInput and IndexOutput. In this case you would only have one LockFactory, which would be good in my opinion (see above). So we only provide the Input/Output classes as separate top-level classes: SimpleFSIndexInput, MMapIndexInput (internal two classes for chunking or not), NIOIndexInput and for output at the beginning only the current SimpleFSIndexOutput. You can then configure your FSDirectory to use different impls depending on an method or rule defined like above. For backwards compatibility, the current MMapDirectory and so on are simple subclasses of this universal FSDir with a fixed configuration. FileSwitchDirectory should only used, when you really want to separate two directories in complete (stored fields on disk, index on SSD in two physical different dirs).
        Hide
        Earwin Burrfoot added a comment - - edited

        The initial motive for the issue seems wrong to me.

        "For most operating systems, mapping a file into memory is more expensive than reading or writing a few tens of kilobytes of data via the usual read and write methods. From the standpoint of performance it is generally only worth mapping relatively large files into memory."

        It is probably right if you're doing a single read through the file. If you're opening/mapping it and do thousands of repeated reads, mmap would be superior, because after initial mapping it's just a memory access VS system call for file.read().

        Add: In case you're not doing repeated reads, and just read these small files once from time to time, you can totally neglect speed difference between mmap and fopen. At least it doesn't warrant increased complexity.

        Show
        Earwin Burrfoot added a comment - - edited The initial motive for the issue seems wrong to me. "For most operating systems, mapping a file into memory is more expensive than reading or writing a few tens of kilobytes of data via the usual read and write methods. From the standpoint of performance it is generally only worth mapping relatively large files into memory." It is probably right if you're doing a single read through the file. If you're opening/mapping it and do thousands of repeated reads, mmap would be superior, because after initial mapping it's just a memory access VS system call for file.read(). Add: In case you're not doing repeated reads, and just read these small files once from time to time, you can totally neglect speed difference between mmap and fopen. At least it doesn't warrant increased complexity.
        Hide
        Uwe Schindler added a comment -

        A typical example, where MMap would be the wrong thing are e.g. norms. They are read one time and then the file is not accessed anymore. It would only be cool, if MMapDir could directly return the mapped array (but MappedByteBuffer.array() does not work) and use it as norms array. That would be cool.

        My problem was more with all these small files like segments_XXXX and segments.gen or *.del files. They are small and only used one time. Mapping them is just waste of resources and completely useless (even slower that opening them directly). This is why I said, some limit or file extension based mapping would be good.

        Show
        Uwe Schindler added a comment - A typical example, where MMap would be the wrong thing are e.g. norms. They are read one time and then the file is not accessed anymore. It would only be cool, if MMapDir could directly return the mapped array (but MappedByteBuffer.array() does not work) and use it as norms array. That would be cool. My problem was more with all these small files like segments_XXXX and segments.gen or *.del files. They are small and only used one time. Mapping them is just waste of resources and completely useless (even slower that opening them directly). This is why I said, some limit or file extension based mapping would be good.
        Hide
        Earwin Burrfoot added a comment -

        My problem was more with all these small files like segments_XXXX and segments.gen or *.del files. They are small and only used one time.

        I can only reiterate my point. These files aren't opened like 10k files per second, so your win is going to be in the order of microseconds per reopen - at the cost of increased complexity.

        Show
        Earwin Burrfoot added a comment - My problem was more with all these small files like segments_XXXX and segments.gen or *.del files. They are small and only used one time. I can only reiterate my point. These files aren't opened like 10k files per second, so your win is going to be in the order of microseconds per reopen - at the cost of increased complexity.
        Mark Thomas made changes -
        Field Original Value New Value
        Workflow jira [ 12470484 ] Default workflow, editable Closed status [ 12563127 ]
        Mark Thomas made changes -
        Workflow Default workflow, editable Closed status [ 12563127 ] jira [ 12584210 ]
        Robert Muir made changes -
        Fix Version/s 4.1 [ 12321140 ]
        Fix Version/s 4.0 [ 12314025 ]
        Steve Rowe made changes -
        Fix Version/s 4.2 [ 12323899 ]
        Fix Version/s 4.1 [ 12321140 ]
        Robert Muir made changes -
        Fix Version/s 4.3 [ 12324143 ]
        Fix Version/s 4.2 [ 12323899 ]
        Uwe Schindler made changes -
        Fix Version/s 4.4 [ 12324323 ]
        Fix Version/s 4.3 [ 12324143 ]
        Greg Bowyer made changes -
        Labels dead
        Hide
        Steve Rowe added a comment -

        Bulk move 4.4 issues to 4.5 and 5.0

        Show
        Steve Rowe added a comment - Bulk move 4.4 issues to 4.5 and 5.0
        Steve Rowe made changes -
        Fix Version/s 5.0 [ 12321663 ]
        Fix Version/s 4.5 [ 12324742 ]
        Fix Version/s 4.4 [ 12324323 ]
        Adrien Grand made changes -
        Fix Version/s 4.6 [ 12324999 ]
        Fix Version/s 5.0 [ 12321663 ]
        Fix Version/s 4.5 [ 12324742 ]
        Simon Willnauer made changes -
        Fix Version/s 4.7 [ 12325572 ]
        Fix Version/s 4.6 [ 12324999 ]
        David Smiley made changes -
        Fix Version/s 4.8 [ 12326269 ]
        Fix Version/s 4.7 [ 12325572 ]
        Hide
        Uwe Schindler added a comment -

        Move issue to Lucene 4.9.

        Show
        Uwe Schindler added a comment - Move issue to Lucene 4.9.
        Uwe Schindler made changes -
        Fix Version/s 4.9 [ 12326730 ]
        Fix Version/s 5.0 [ 12321663 ]
        Fix Version/s 4.8 [ 12326269 ]

          People

          • Assignee:
            Uwe Schindler
            Reporter:
            Uwe Schindler
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:

              Development