Lucene - Core / LUCENE-2825

FSDirectory.open should return MMap on 64-bit Solaris

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: core/store
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      MMap is ~ 30% faster than NIOFS on this platform.

        Activity

        Robert Muir added a comment -

        Here's the patch, and luceneutil 10M doc results:

        (solaris 9, ultra60 2GB ram, 1.6u23 jre with -d64)

        Query                              QPS trunk  QPS patch  Pct diff
        spanFirst(unit, 5)                      0.68       0.71      3.7%
        spanNear([unit, state], 10, true)       0.14       0.15      7.3%
        "unit state"                            0.34       0.37      9.1%
        unit state                              0.37       0.45     21.7%
        unit*                                   1.12       1.40     25.4%
        uni*                                    0.61       0.79     30.2%
        +unit +state                            0.52       0.69     33.1%
        unit~1.0                                0.28       0.38     34.9%
        unit~2.0                                0.27       0.37     36.2%
        state                                   1.10       1.52     38.0%
        united~1.0                              0.42       0.60     42.7%
        united~2.0                              0.10       0.15     46.8%
        un*d                                    1.97       3.09     56.5%
        u*d                                     0.52       0.85     61.3%
        +nebraska +state                        3.51       6.73     91.6%
        Uwe Schindler added a comment -

        Is it slower on Linux-64 because you only enable for windows-64 and solaris-64?

        In general I expected it to be faster on all 64-bit systems.

        Robert Muir added a comment -

        In the case of Linux-64 and Mac OS X, Mike was testing these, but it looked
        like some queries were actually faster with NIOFS on these platforms.

        (This makes no sense to me, but I figured we could set it for Solaris, where it's obviously always faster.)
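To make the platform-dependent default concrete, here is a minimal sketch of the selection rule as described in this thread (MMap on 64-bit Windows and Solaris, NIOFS elsewhere). It is not the committed patch: `DirectoryChoice`, `Kind`, and `choose` are hypothetical names, the OS is detected from an `os.name`-style string as an assumption, and falling back to NIOFS for every other platform simplifies whatever the real `FSDirectory.open` does.

```java
import java.util.Locale;

/** Hypothetical sketch: pick a directory implementation name per platform. */
final class DirectoryChoice {

    /** Names mirror the two implementations compared in this issue. */
    enum Kind { MMAP, NIOFS }

    /**
     * @param osName  a value in the style of System.getProperty("os.name")
     * @param is64Bit whether the JVM runs in 64-bit mode (assumed to be known by the caller)
     */
    static Kind choose(String osName, boolean is64Bit) {
        String os = osName.toLowerCase(Locale.ROOT);
        boolean windows = os.startsWith("windows");
        boolean solaris = os.startsWith("sunos"); // Solaris JVMs report "SunOS"
        // mmap only where a 64-bit address space is available and the
        // platform is one the thread says was enabled.
        if (is64Bit && (windows || solaris)) {
            return Kind.MMAP;
        }
        // Linux, Mac OS X, and all 32-bit JVMs stay on NIOFS in this sketch.
        return Kind.NIOFS;
    }
}
```

A caller would pass `System.getProperty("os.name")` as the first argument; how the 64-bit flag is determined is left open here.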

        Jason Rutherglen added a comment -

        some queries were actually faster with NIOFS on these platforms

        I think that's to be expected. MMap should be faster than NIOFS in all cases; if it's not, then the JVM could be to blame (on 64-bit). There are notes at: LUCENE-753

        Doesn't MMap automatically consume a lot more memory?

        Robert Muir added a comment -

        Doesn't MMap automatically consume a lot more memory?

        No.

        Uwe Schindler added a comment - edited

        Doesn't MMap automatically consume a lot more memory?

        No, it consumes address space, which is why it only works without problems on 64-bit. The index files are mapped into the address space and are handled like a swap file; that's the trick. They consume only as much memory from the cache as the operating system decides. In particular, it does not consume memory from the Java heap!
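The point about address space versus heap can be illustrated with plain `java.nio`, independent of Lucene. The sketch below maps a small temporary file read-only and reads one byte through the mapping; `MmapReadSketch`, `readByteMapped`, and `demo` are hypothetical names, and the file contents are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: read a file through a memory mapping. */
final class MmapReadSketch {

    /** Maps the whole file read-only and returns the byte at the given offset. */
    static byte readByteMapped(Path file, int offset) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // map() reserves address space for the file; the OS pages the
            // contents in on access, and the buffer is not a heap byte[].
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return buf.get(offset);
        }
    }

    /** Writes a four-byte temp file and reads the third byte back via the mapping. */
    static byte demo() {
        try {
            Path tmp = Files.createTempFile("mmap-sketch", ".bin");
            tmp.toFile().deleteOnExit(); // best-effort cleanup
            Files.write(tmp, new byte[] {10, 20, 30, 40});
            return readByteMapped(tmp, 2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the mapped buffer is a direct buffer, the file data does not count against `-Xmx`; how much of it stays resident is up to the operating system's page cache.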

        Yonik Seeley added a comment -

        There could be a number of things that could make this very OS dependent:

        • kernel read-ahead can differ between mmap and traditional IO, and differ between operating systems
          (think about reading 10 bytes from position 4094... if mmap doesn't read-ahead at least one page then you will suffer a double page fault).
        • CPU cache / TLB effects? Using more address space isn't completely free.
        • Java mmap overhead - you don't get access to a raw byte[]
        • userspace-kernel transition times (i.e. for a read system call) differ between operating systems. Linux is very good here, probably leading to less of a penalty for read vs mmap.
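The read-ahead example in the first bullet comes down to simple page arithmetic, sketched below. The 4096-byte page size is an assumption (it is what the "position 4094" example implies), and `PageSpan` and `pagesTouched` are hypothetical names.

```java
/** Hypothetical sketch: how many pages a single read overlaps. */
final class PageSpan {

    /** Number of pages overlapped by a read of len bytes starting at pos. */
    static long pagesTouched(long pos, long len, long pageSize) {
        if (len <= 0) {
            return 0;
        }
        long firstPage = pos / pageSize;
        long lastPage = (pos + len - 1) / pageSize; // page holding the last byte read
        return lastPage - firstPage + 1;
    }
}
```

With these assumptions, `PageSpan.pagesTouched(4094, 10, 4096)` is 2: the read covers bytes 4094 through 4103, so without read-ahead each of the two pages could fault separately.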
        Robert Muir added a comment -

        There could be a number of things that could make this very OS dependent:

        I agree, we should test any OS before changing any defaults.

        In the Solaris case it's clear that mmap is better for our purposes, though (as to exactly why, no idea).

        Robert Muir added a comment -

        CPU cache / TLB effects? Using more address space isn't completely free.

        In the case of Solaris I think there is much less of a chance of TLB effects?
        For example, on Solaris Java automatically uses large pages (unlike Linux, Windows, etc.).

        http://www.oracle.com/technetwork/java/javase/tech/largememory-jsp-137182.html

        Robert Muir added a comment -

        Committed revision 1052980, 1052981 (3x)

        Earwin Burrfoot added a comment -

        CPU cache / TLB effects? Using more address space isn't completely free.

        In the case of Solaris I think there is much less of a chance of TLB effects?
        For example on Solaris java automatically uses large pages (unlike Linux, Windows etc).

        For, like, 13 GB of memory-mapped index, I've seen no noticeable difference between having large pages on and off under Linux. That's some anecdotal evidence, as I haven't done any extensive research, but still.
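For a sense of scale in the TLB discussion, the sketch below counts how many pages are needed to map an index of a given size. The page sizes (4 KiB and 2 MiB) and reading "13 GB" as 13 GiB are assumptions for illustration, not figures from this thread, and `PageCount` and `pagesNeeded` are hypothetical names. The arithmetic says nothing about measured performance.

```java
/** Hypothetical sketch: pages needed to map a region of a given size. */
final class PageCount {

    /** Rounds up so that a partially filled final page is counted. */
    static long pagesNeeded(long bytes, long pageSize) {
        return (bytes + pageSize - 1) / pageSize;
    }
}
```

Under these assumptions, `PageCount.pagesNeeded(13L << 30, 4096)` is 3,407,872 pages, while `PageCount.pagesNeeded(13L << 30, 2L << 20)` is 6,656, which is why large pages are expected to reduce TLB pressure in principle even if the anecdote above saw no noticeable difference.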

        Grant Ingersoll added a comment -

        Bulk close for 3.1


          People

          • Assignee:
            Unassigned
            Reporter:
            Robert Muir