Lucene - Core / LUCENE-5951

Detect when index is on SSD and set dynamic defaults

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      E.g. ConcurrentMergeScheduler should default maxMergeThreads to 3 if it's on SSD and 1 if it's on spinning disks.

      I think the new NIO2 APIs can let us figure out which device we are mounted on, and from there maybe we can do os-specific stuff e.g. look at /sys/block/dev/queue/rotational to see if it's spinning storage or not ...
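      A minimal sketch of the sysfs check described above, assuming a Linux-style layout where <root>/<device>/queue/rotational starts with 1 for spinning storage and 0 otherwise. The class and method names are hypothetical, and the root directory is a parameter (an assumption of this sketch) so the logic can be exercised against a temporary directory; this is not the code from the attached patches.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class RotationalFlag {
  private RotationalFlag() {}

  /**
   * Returns true if the flag file for the given block device starts with '1'.
   * On Linux, sysBlockRoot would be /sys/block.
   */
  static boolean isRotational(Path sysBlockRoot, String device) throws IOException {
    Path flag = sysBlockRoot.resolve(device).resolve("queue").resolve("rotational");
    try (InputStream in = Files.newInputStream(flag)) {
      // Only the first byte matters: '1' means rotational, anything else does not.
      return in.read() == '1';
    }
  }
}
```

      On a real Linux machine the call would look like RotationalFlag.isRotational(Paths.get("/sys/block"), "sda").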

      1. LUCENE-5951.patch
        39 kB
        Michael McCandless
      2. LUCENE-5951.patch
        37 kB
        Robert Muir
      3. LUCENE-5951.patch
        18 kB
        Robert Muir
      4. LUCENE-5951.patch
        18 kB
        Michael McCandless
      5. LUCENE-5951.patch
        15 kB
        Michael McCandless
      6. LUCENE-5951.patch
        15 kB
        Robert Muir
      7. LUCENE-5951.patch
        14 kB
        Michael McCandless

        Activity

        Michael McCandless added a comment -

        Patch w/ tests.

        After I told Rob it's impossible to detect if a Path is backed by an
        SSD with pure Java, he of course went and did it.

        I added his isSSD method to IOUtils: it's a rough, Linux-only (for
        now) method to determine if a Path is backed by an SSD (thank you
        Rob!).

        Then I fixed CMS to have dynamic defaults, so that the first time
        merge is invoked, it checks the writer's directory. If it's on an SSD,
        it uses the pre LUCENE-4661 defaults (good for SSDs), else it uses the
        current defaults (good for spinning disks). It also logs this to infoStream
        so we can use that to see what it did.

        Adrien Grand added a comment -

        +1

        Shalin Shekhar Mangar added a comment -

        +1

        Very nice!

        Robert Muir added a comment -

        Try to improve the SSD detector more to make it safe to use for this purpose. It was mostly a joke and really ... not good code.

        • fix contract to throw IOException when incoming path does not exist. This is important not to mask.
        • for our internal heuristics, we could easily trigger SecurityException / AIOOBE, we are doing things that are not guaranteed at all. So those are important to mask.
        • don't use Files.readAllBytes, that method is too dangerous in these heuristics. Just read one byte.

        We should improve getDeviceName too, but it's less critical.
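        The contract in the list above can be sketched roughly as follows: a missing path surfaces as an IOException, while any failure inside the heuristic is masked and treated as "spinning". The Heuristic interface and the class name are hypothetical stand-ins for the OS-specific logic, and using Path.toRealPath() for the existence check is an assumption of this sketch, not necessarily what the patch does.

```java
import java.io.IOException;
import java.nio.file.Path;

final class ConservativeSpins {
  private ConservativeSpins() {}

  /** Hypothetical hook for the OS-specific heuristic; it may throw anything. */
  interface Heuristic {
    boolean spins(Path path) throws Exception;
  }

  static boolean spins(Path path, Heuristic heuristic) throws IOException {
    // Not masked: toRealPath() throws an IOException subclass if the path does not exist.
    Path real = path.toRealPath();
    try {
      return heuristic.spins(real);
    } catch (Exception e) {
      // Masked: SecurityException, ArrayIndexOutOfBoundsException and friends
      // all mean "we could not tell", so assume spinning storage.
      return true;
    }
  }
}
```

        Returning true on failure keeps the conservative (spinning-disk) defaults whenever the heuristic cannot decide.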

        Michael McCandless added a comment -

        New patch, renaming to "spins", and also unwrapping FileSwitchDir, and returning "false" for RAMDirectory.

        Robert Muir added a comment -

        Thanks, I will take another crack at the FSDir logic. We should be able to handle tmpfs etc. better here (likely on Mac, too).

        Hoss Man added a comment -
        +  public static int AUTO_DETECT_MERGES_AND_THREADS = -1;
        

        ...that's supposed to be a final (sentinel value), correct? Nothing should be allowed to modify it at run time?

        +  public synchronized void setMaxMergesAndThreads(int maxMergeCount, int maxThreadCount) {
        +    if (maxMergeCount == AUTO_DETECT_MERGES_AND_THREADS && maxThreadCount == AUTO_DETECT_MERGES_AND_THREADS) {
        +      // OK
        +      maxMergeCount = AUTO_DETECT_MERGES_AND_THREADS;
        +      maxThreadCount = AUTO_DETECT_MERGES_AND_THREADS;
        

        ...is that supposed to be setting this.maxMergeCount and this.maxThreadCount? It looks like it's just a no-op (and this.maxMergeCount and this.maxThreadCount never get set in this case).
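        For illustration, a minimal sketch of what the two points raised so far amount to: the sentinel declared final, and the setter assigning to the fields rather than to its own parameters. MergeLimits is a hypothetical stand-in for the scheduler class, and the validation rules are illustrative assumptions, not taken from the patch.

```java
final class MergeLimits {
  /** Sentinel meaning "pick values from the storage type"; final so nothing can change it. */
  static final int AUTO_DETECT_MERGES_AND_THREADS = -1;

  private int maxMergeCount = AUTO_DETECT_MERGES_AND_THREADS;
  private int maxThreadCount = AUTO_DETECT_MERGES_AND_THREADS;

  synchronized void setMaxMergesAndThreads(int maxMergeCount, int maxThreadCount) {
    boolean auto = maxMergeCount == AUTO_DETECT_MERGES_AND_THREADS
        && maxThreadCount == AUTO_DETECT_MERGES_AND_THREADS;
    if (!auto) {
      // Illustrative sanity checks for explicit values.
      if (maxThreadCount < 1) {
        throw new IllegalArgumentException("maxThreadCount should be at least 1");
      }
      if (maxThreadCount > maxMergeCount) {
        throw new IllegalArgumentException("maxThreadCount should be <= maxMergeCount");
      }
    }
    // Assign to the fields, not the parameters, so the call actually takes effect.
    this.maxMergeCount = maxMergeCount;
    this.maxThreadCount = maxThreadCount;
  }

  synchronized int getMaxMergeCount() {
    return maxMergeCount;
  }

  synchronized int getMaxThreadCount() {
    return maxThreadCount;
  }
}
```

        With the parameter-only assignment quoted above, switching back to auto-detect after setting explicit values would silently keep the old values.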

        +  public static boolean spins(Path path) throws IOException {
        

        ...is it worth using a ternary enum (or a nullable Boolean) here to track the difference between:

        • confident it's a spinning disk
        • confident it's not a spinning disk
        • unknown what type of storage this is

        ...that way we can make the default behavior of CMS conservative, and only be aggressive if we are confident it's not-spinning; but app devs can be more aggressive – call the same spins() utility and only use conservative values if they are confident it's a spinning disk, otherwise call setMaxMergesAndThreads with higher values.
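        A rough sketch of the three-state idea, only to make the proposal concrete. All names are hypothetical, and the thread counts (3 for non-spinning, 1 for spinning) are the ones from the issue description.

```java
final class MergeThreadPolicy {
  private MergeThreadPolicy() {}

  /** Three-state detection result, as proposed in this comment. */
  enum Storage { SPINNING, NOT_SPINNING, UNKNOWN }

  /** Conservative: use more threads only when confident the storage does not spin. */
  static int conservativeThreads(Storage storage) {
    return storage == Storage.NOT_SPINNING ? 3 : 1;
  }

  /** Aggressive: back off only when confident the storage spins. */
  static int aggressiveThreads(Storage storage) {
    return storage == Storage.SPINNING ? 1 : 3;
  }
}
```

        The two policies differ only in how they treat UNKNOWN, which is the case a plain boolean cannot express.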

        Robert Muir added a comment -

        I don't think we should make things complicated for app developers. We are not writing a generic spins() method for developers; it's a lucene.internal method for good defaults.

        Michael McCandless added a comment -

        ...that's supposed to be a final (sentinel value), correct? Nothing should be allowed to modify it at run time?

        Whoa, nice catch! I'll fix.

        ...it looks like it's just a No-Op

        Gak, good catch. I'll add a test that exposes this, then fix it.

        Michael McCandless added a comment -

        New patch fixing Hoss's issues (thanks!).

        Robert Muir added a comment -

        I cleaned up the code to remove the hashmap, not try to lookup 'rotational' for obviously bogus names (like nfs), return false for tmpfs, etc.

        Hoss Man added a comment -
        +    for (FileStore store : FileSystems.getDefault().getFileStores()) {
        +      String desc = store.toString();
        +      int start = desc.lastIndexOf('(');
        +      int end = desc.indexOf(')', start);
        +      mountToDevice.put(desc.substring(0, start-1), desc.substring(start+1, end));
        +    }
        

        ...I don't see anything in the javadocs for FileStore making any guarantees about the toString – so the results of these lastIndexOf and indexOf calls should probably have bounds checks to prevent IOOBE from substring. (either that or just catch the IOOBE and give up)
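        A sketch of the bounds-checked variant suggested here, assuming the description has the shape "mountpoint (device)" that the quoted loop relies on. FileStore.toString() makes no such promise, so anything else yields null. The class and method names are hypothetical.

```java
final class FileStoreDescription {
  private FileStoreDescription() {}

  /**
   * Splits "mountpoint (device)" into {mountpoint, device}.
   * Returns null when the text does not have that shape.
   */
  static String[] parse(String desc) {
    int start = desc.lastIndexOf('(');
    int end = desc.indexOf(')', start);
    // start < 1: no '(' at all, or nothing in front of it.
    // end < start: no ')' after the '(' (indexOf returned -1).
    if (start < 1 || end < start) {
      return null;
    }
    // Drop the single separator character in front of '(' (a space in the assumed shape).
    return new String[] {desc.substring(0, start - 1), desc.substring(start + 1, end)};
  }
}
```

        For example, parse("/mnt/ssd (/dev/sdc1)") gives the pair "/mnt/ssd" and "/dev/sdc1", while parse("garbage") gives null instead of throwing.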

        +        if (!devName.isEmpty() && Character.isDigit(devName.charAt(devName.length()-1))) {
        +          devName = devName.substring(0, devName.length()-1);
        

        ...what about people with lots of partitions? e.g. "/dev/sda42"

        Robert Muir added a comment -

        ...I don't see anything in the javadocs for FileStore making any guarantees about the toString – so the results of these lastIndexOf and indexOf calls should probably have bounds checks to prevent IOOBE from substring. (either that or just catch the IOOBE and give up)

        Maybe you missed the try-catch when looking at the patch.

        } catch (Exception ioe) {
          // our crazy heuristics can easily trigger SecurityException, AIOOBE, etc ...
          return true;
        }
        

        ...what about people with lots of partitions? e.g. "/dev/sda42"

        Maybe if you quoted more of the context, you would see this was in a loop?

        Uwe Schindler added a comment - - edited

        +1
        The heavy funny heuristics method is a masterpiece of coding in contrast to Hadoop's detection. I am so happy that it does not exec "df" or "mount" commands! Many thanks. Java 7 is cool!

        Hoss Man added a comment -

        Maybe you missed the try-catch when looking at the patch.

        That still seems sketchy because it's only in the spins() method ... it's going to be trappy if/when this code gets refactored and getDeviceName is called from somewhere else. Why not just include some basic exception handling in getDeviceName as well?

        Maybe if you quoted more of the context, you would see this was in a loop?

        I did see that, but I didn't realize the purpose was to chomp away at individual digits in the path until it resolved as a valid file...

        too much voodoo for me, i'll shut up now.

        Robert Muir added a comment -

        The method is private; it's not getting called from anywhere else. When an exception strikes we need it to propagate, so that it causes the whole thing to return true. It also has a comment above it: '// these are hacks that are not guaranteed'.

        Robert Muir added a comment -

        I did see that, but i didn't realize the purpose was to chomp away at individual digits in the path until it resolved as a valid file...

        It has this comment:

              // tear away partition numbers until we find it.
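        A small sketch of that loop, assuming the device name is shortened one trailing digit at a time until it names a known block device. The existence check is passed in as a predicate (a hypothetical stand-in for looking under /sys/block), and the exact order of checks in the patch may differ.

```java
import java.util.function.Predicate;

final class PartitionNames {
  private PartitionNames() {}

  /**
   * Strips trailing digits (e.g. "sda42" -> "sda4" -> "sda") until
   * blockDeviceExists accepts the name. Returns null if nothing matches.
   */
  static String wholeDevice(String devName, Predicate<String> blockDeviceExists) {
    String candidate = devName;
    while (!candidate.isEmpty()) {
      if (blockDeviceExists.test(candidate)) {
        return candidate;
      }
      if (!Character.isDigit(candidate.charAt(candidate.length() - 1))) {
        return null; // no partition number left to tear away
      }
      candidate = candidate.substring(0, candidate.length() - 1);
    }
    return null;
  }
}
```

        Because only one digit is removed per iteration, multi-digit partition numbers such as sda42 are handled by repetition rather than by a special case.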
        
        Robert Muir added a comment -

        I added tests to the previous patch.

        I only pulled the main logic into a separate package-private method (spinsLinux) so we can test all logic with mocks directly on all operating systems and not mask any exceptions or problems.

        Michael McCandless added a comment -

        Thanks Rob, I love the new tests.

        I'll revert my over-zealous changes to CreateIndexTask (just fix to use the new default) and commit soon ...

        Uwe Schindler added a comment - - edited

        I am interested to see whether the detection works correctly on Policeman Jenkins. This machine has an SSD, so what is the best way to see from the test output whether it detected an SSD? To me the algorithm looks correct!

        serv1:~# mount
        /dev/md1 on / type ext3 (rw)
        proc on /proc type proc (rw)
        sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
        none on /sys/fs/fuse/connections type fusectl (rw)
        none on /sys/kernel/debug type debugfs (rw)
        none on /sys/kernel/security type securityfs (rw)
        udev on /dev type devtmpfs (rw,mode=0755)
        devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
        tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
        none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
        none on /run/shm type tmpfs (rw,nosuid,nodev)
        /dev/md0 on /boot type ext2 (rw)
        /dev/sdc1 on /mnt/ssd type ext4 (rw,noatime,discard)
        serv1:~# cat /sys/block/sda/queue/rotational
        1
        serv1:~# cat /sys/block/sdb/queue/rotational
        1
        serv1:~# cat /sys/block/sdc/queue/rotational
        0
        serv1:~#
        

        FYI: The workspace is on /mnt/ssd.

        Uwe Schindler added a comment -

        I only found out that our detection may not work with LVM or MD devices, unfortunately I cannot try it out:

        http://lkml.iu.edu/hypermail/linux/kernel/1202.2/01578.html

        Theoretically, the software RAID should pass this flag through unmodified when all member devices agree on it. But it seems it doesn't.

        For the Policeman machine, sda and sdb (both rotational) together form an md0 mirror RAID device, which reports rotational=1. This is good, but I am not sure whether this works if both RAID devices are really SSDs (some people do this using RAID0 devices to speed up sequential reads). Maybe somebody else can report back, but I think the Linux kernel either sets rotational=1 for RAID devices as a fixed value, or this has changed in the meantime.

        Robert Muir added a comment -

        Then those are bugs in the Linux kernel. It's not our problem.

        The worst that happens is you get the same behavior as today. People seem to have difficulty understanding this.

        Uwe Schindler added a comment -

        Robert Muir: I just wanted to be sure that the inverse does not happen: a RAID device of spinning disks suddenly reporting non-spinning because of another bug in Linux. My test has verified that it returns rotational=1 for my example, so I am happy. I just wanted to write this down here to have a reference that someone looked at it.

        There are other things to mention: the SSD/rotational flag also does not work correctly in VMware's vSphere or VirtualBox unless the creator of the virtual machine selects "SSD" as the virtual device type (in VirtualBox you can do this). I created a virtual machine here with a virtual SSD sitting on a spinning disk... So we should document that the whole detection only works correctly if you use raw disks on bare-metal hardware. People should also make sure that their vSphere infrastructure is configured correctly.

        Robert Muir added a comment -

        The worst that happens is you get the same behavior as today.

        Uwe Schindler added a comment -

        If you have a spinning disk and you falsely detect it as an SSD, then it's a problem...

        Robert Muir added a comment -

        I don't think it's really a problem at all. It's a heuristic for defaults. If there are bugs in the Linux kernel, or virtualizers, or device drivers, it's not our duty to fix them. Please complain on the Linux kernel list instead.

        Today it's a far bigger problem that we always falsely assume you have a spinning disk, and hurt performance on any modern hardware.

        Users can always set their merge threads etc. explicitly.

        Uwe Schindler added a comment -

        Sorry Robert, it is my personal decision to comment on this. It was not a complaint, just a notice, so that anybody who wants to look up more on this issue can get the relevant information.

        My concern was just that some user with a misconfigured system could suddenly get the SSD optimization on a spinning disk, and then their IO system gives up. So it should be documented; I am just asking for some hints in the documentation that one should take care to configure virtual machines correctly.

        I DON'T COMPLAIN! But now I complain: Why are you attacking me? I just bring in here useful items that might help others.

        Michael McCandless added a comment -

        Uwe Schindler how about this disclaimer in CMS's top javadocs?

         *  <p>This class attempts to detect whether the index is
         *  on rotational storage (traditional hard drive) or not
         *  (e.g. solid-state disk) and changes the default max merge
         *  and thread count accordingly.  This detection is currently
         *  Linux-only, and relies on the OS to put the right value
         *  into /sys/block/&lt;dev&gt;/queue/rotational.</p>
        
        Uwe Schindler added a comment -

        Hi Mike,
        I am OK with that. I would only add one more sentence: "For all other operating systems it currently assumes a rotational disk for backwards compatibility."
        Another idea: we should maybe add a special convenience setter to set optimal settings. That way, you don't rely on the auto-detection and can still set "automatic" settings for both types of drives:

        • setMaxMergesAndThreads(int maxMergeCount, int maxThreadCount) (expert)
        • setDefaultMergesAndThreads(boolean optimizeForNonRotational) (convenience)

        Uwe

        Michael McCandless added a comment -

        New patch, adding Uwe's sentence to the javadocs, and a new setDefaultMaxMergesAndThreads(boolean spins) method. I think it's ready.
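        Roughly, such a convenience setter might look like the sketch below. The class is a hypothetical stand-in for the scheduler. The non-spinning values (3 threads, 5 merges) are taken from the log line quoted later in this thread and the single thread for spinning disks from the issue description; the spinning maxMergeCount is a placeholder assumption, since the thread does not state it.

```java
final class DefaultMergeSettings {
  private int maxThreadCount;
  private int maxMergeCount;

  /** Picks defaults from the storage type without relying on auto-detection. */
  synchronized void setDefaultMaxMergesAndThreads(boolean spins) {
    if (spins) {
      maxThreadCount = 1; // spinning disks: one merge thread, per the issue description
      maxMergeCount = 2;  // placeholder assumption, not stated in the thread
    } else {
      maxThreadCount = 3; // matches "spins=false maxThreadCount=3 maxMergeCount=5"
      maxMergeCount = 5;
    }
  }

  synchronized int getMaxThreadCount() {
    return maxThreadCount;
  }

  synchronized int getMaxMergeCount() {
    return maxMergeCount;
  }
}
```

        A caller on an operating system without detection could then simply call setDefaultMaxMergesAndThreads(false) when it knows the index lives on an SSD.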

        Uwe Schindler added a comment - - edited

        Yeah, looks good. Maybe just add a reference to the new method in the introduction:

        To enable default settings for spinning or solid state disks for other
        operating systems, use {@link #setDefaultMaxMergesAndThreads(boolean)}.

        I am currently investigating detection for Windows, but it's unlikely that we can detect SSDs there without native code or spawning processes (no file system with device data). But I think for MacOSX there may be a similar solution? I'll investigate.

        Uwe Schindler added a comment - - edited

        I have here another item on the TODO list: I am currently investigating the new Linux filesystem BTRFS, which might also bring some cool things for Lucene. Some Linux distributions are now starting to make it the default file system (like OpenSUSE; Ubuntu not yet, but soon). BTRFS is more like ZFS from Slowlaris, so the mount table no longer gives you all the information (no raw devices anymore, just some symbolic "volume" name), because you now have "sub-filesystems" that you can mount anywhere. Of course, the current code cannot handle that, but we might improve it. Correction: this is not a problem, the device name of the mount is still the raw device. The subvolume is given as a parameter (-o subvol=xxx) to mount/fstab. So the current code should be able to handle that.

        The same applies to "bind" mounts, which I prefer in some situations. Bind mounts are those where you mount part of one file system at another place (like a symlink, but more "hard").

        Michael McCandless added a comment -

        Actually, I run BTRFS on my current dev box. It is a symlink, but spins() gets through that:

          mike@haswell:/$ df -h .
          Filesystem                    Size  Used Avail Use% Mounted on
          /dev/mapper/haswell--vg-root  466G  138G  324G  30% /
          mike@haswell:/$ ls -l /dev/mapper/haswell--vg-root 
          lrwxrwxrwx 1 root root 7 Dec 14 17:08 /dev/mapper/haswell--vg-root -> ../dm-0
          mike@haswell:/$ cat /sys/block/dm-0/queue/rotational 
          0
        

        To verify spins() is working, I just run:

          ant test -Dtestcase=TestConcurrentMergeScheduler -Dtestmethod=testDynamicDefaults -Dtests.verbose=true -Dtests.directory=MMapDirectory
        

        and then look for the line where CMS logs the "spins" result:

         [junit4]   1> CMS 1 [Fri Dec 19 16:36:47 UZT 2014; main]: initMaxMergesAndThreads spins=false maxThreadCount=3 maxMergeCount=5
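
If you want to check that log line programmatically rather than by eye, a small parser along these lines may help. It is only a sketch: the class name `CmsLogLine` is hypothetical, and the pattern is derived solely from the single line shown above, so it may not match other Lucene versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the values CMS logged in initMaxMergesAndThreads. */
final class CmsLogLine {
  // Pattern based only on the log line quoted in this comment.
  private static final Pattern PATTERN = Pattern.compile(
      "initMaxMergesAndThreads spins=(true|false) maxThreadCount=(\\d+) maxMergeCount=(\\d+)");

  final boolean spins;
  final int maxThreadCount;
  final int maxMergeCount;

  private CmsLogLine(boolean spins, int maxThreadCount, int maxMergeCount) {
    this.spins = spins;
    this.maxThreadCount = maxThreadCount;
    this.maxMergeCount = maxMergeCount;
  }

  /** Returns the parsed values, or null if the line is not an initMaxMergesAndThreads line. */
  static CmsLogLine parse(String line) {
    Matcher m = PATTERN.matcher(line);
    if (!m.find()) {
      return null;
    }
    return new CmsLogLine(
        Boolean.parseBoolean(m.group(1)),
        Integer.parseInt(m.group(2)),
        Integer.parseInt(m.group(3)));
  }

  public static void main(String[] args) {
    CmsLogLine r = parse("[junit4]   1> CMS 1 [Fri Dec 19 16:36:47 UZT 2014; main]: "
        + "initMaxMergesAndThreads spins=false maxThreadCount=3 maxMergeCount=5");
    System.out.println(r.spins + " " + r.maxThreadCount + " " + r.maxMergeCount);
  }
}
```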
        
        Michael McCandless added a comment -

        Maybe just add a reference to the new method in the introduction:

        I'll add that!

        Uwe Schindler added a comment -

        Thanks Mike! I already changed my original message, because the raw device still appears in the mount output; the virtual subvolume is part of the mount options. So our code works with BTRFS. I was about to start up another virtual machine.

        I just want to list all the "special" cases here, so when people open bug reports we already know where there might be problems. And we also know what our customers might ask.

        +1 to commit!

        ASF subversion and git services added a comment -

        Commit 1646775 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1646775 ]

        LUCENE-5951: try to detect if index is on an SSD and default CMS's settings accordingly

        ASF subversion and git services added a comment -

        Commit 1646778 from Michael McCandless in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1646778 ]

        LUCENE-5951: try to detect if index is on an SSD and default CMS's settings accordingly

        ASF subversion and git services added a comment -

        Commit 1646791 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1646791 ]

        LUCENE-5951: these test cases can't run on Windows

        ASF subversion and git services added a comment -

        Commit 1646792 from Michael McCandless in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1646792 ]

        LUCENE-5951: these test cases can't run on Windows

        Anshum Gupta added a comment -

        Bulk close after 5.0 release.


          People

          • Assignee:
            Michael McCandless
            Reporter:
            Michael McCandless
          • Votes:
            1
            Watchers:
            8
