Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core/store
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      I'm trying to test NRT performance, and noticed when I dump the thread stacks that the darned threads often seem to be in java.nio.Bits.copyToByteArray(Native Method)... so I wondered whether we could/should use direct ByteBuffers, and whether that would gain performance in general. We currently just use our own byte[] buffer via BufferedIndexInput.

      It's hard to test since it's likely platform specific, but if it does result in gains it could be an easy win.
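For readers unfamiliar with the distinction: the sketch below reads the same file through a heap buffer (`ByteBuffer.allocate`) and a direct buffer (`ByteBuffer.allocateDirect`). As far as I understand the Sun/OpenJDK implementation, a channel read into a heap buffer goes through a temporary direct buffer and is then copied into the `byte[]`, which is the extra copy this issue hopes to avoid. The class and method names are made up for illustration and are not part of Lucene; it uses the Java 7+ `java.nio.file` API for brevity.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of Lucene or of the attached patch.
final class HeapVsDirectRead {

    /** Reads the whole file through the given buffer and returns the sum of its unsigned bytes. */
    static long sumBytes(Path file, ByteBuffer buf) throws IOException {
        long sum = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            buf.clear();
            while (ch.read(buf) != -1) {
                buf.flip();                  // switch from filling to draining
                while (buf.hasRemaining()) {
                    sum += buf.get() & 0xFF; // one get() call per byte
                }
                buf.clear();                 // ready for the next refill
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("buf-demo", ".bin");
        try {
            byte[] data = new byte[1 << 16];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) i;
            }
            Files.write(tmp, data);
            // Same file, same logic; only the buffer kind differs.
            long heap = sumBytes(tmp, ByteBuffer.allocate(1024));
            long direct = sumBytes(tmp, ByteBuffer.allocateDirect(1024));
            System.out.println("heap=" + heap + " direct=" + direct);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Both calls return the same value; any difference would only show up in timing, which is platform and JVM specific and is not measured here.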

      1. LUCENE-2056.patch
        9 kB
        Michael McCandless

        Activity

        mikemccand Michael McCandless added a comment -

        Thanks Steven! Was this with the above alg (ie, 4 threads doing searching)?

        Could you also try the search using NIOFSDirectory?

        Also, if possible, it'd be better to test against a larger index – such super-fast queries allow the query init cost to unduly impact the results (eg, allocating a direct buffer is more costly than allocating a non-direct buffer).

        steve_rowe Steve Rowe added a comment -

        I used an index built from Reuters line docs, and for the queries, the 92
        English queries from AnswerBus's most recent 100 queries, with question and quotation marks stripped.

        On Windows Vista 64-bit, 2 CPU cores (Intel Core 2 6600@2.40GHz), Sun JDK 1.6.0_15 64-bit:

        Directory             runCnt  recsPerRun  rec/s
        FSDirectory           40      2279        379.73
        DirectNIOFSDirectory  40      2171        361.73  (5% slower)

        On Windows 7 64-bit, 4 CPU cores (Intel Core i5 750 @ 2.67 GHz), Sun JDK 1.6.0_20 64-bit:

        Directory             runCnt  recsPerRun  rec/s
        FSDirectory           40      2754        458.92
        DirectNIOFSDirectory  40      2658        442.61  (4% slower)
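
The "slower" figures can be checked from the rec/s columns. A small sketch, assuming the slowdown is measured relative to the FSDirectory baseline (the helper name is only illustrative):

```java
import java.util.Locale;

// Hypothetical helper for re-deriving the percentages quoted in the tables above.
final class SlowdownPercent {

    /** Relative slowdown of candidate versus baseline throughput, in percent. */
    static double slowdownPercent(double baselineRecPerSec, double candidateRecPerSec) {
        return (baselineRecPerSec - candidateRecPerSec) / baselineRecPerSec * 100.0;
    }

    public static void main(String[] args) {
        // Vista, 2 cores: FSDirectory 379.73 rec/s vs DirectNIOFSDirectory 361.73 rec/s
        System.out.println(String.format(Locale.ROOT, "%.2f", slowdownPercent(379.73, 361.73)));
        // Windows 7, 4 cores: FSDirectory 458.92 rec/s vs DirectNIOFSDirectory 442.61 rec/s
        System.out.println(String.format(Locale.ROOT, "%.2f", slowdownPercent(458.92, 442.61)));
    }
}
```

This gives roughly 4.7% and 3.6%, consistent with the rounded 5% and 4% reported.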
        mikemccand Michael McCandless added a comment -

        It's remotely possible that using direct byte buffers (in the above patch) works around the nasty Sun JVM bug (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6265734) that makes NIOFSDirectory useless on Windows...

        Can someone w/ access to a multi-CPU/core Windows box test this?

        You just need an existing index, and then something like this alg (searches w/ 4 threads) EXCEPT you have to temporarily edit FSDirectory.java to return this DirectNIOFSDirectory on Windows:

        analyzer=org.apache.lucene.analysis.core.WhitespaceAnalyzer
        directory=FSDirectory
        work.dir = /x/lucene/trunkwiki
        
        log.step=100000
        
        search.num.hits=10
        
        query.maker=org.apache.lucene.benchmark.byTask.feeds.FileBasedQueryMaker
        file.query.maker.file = queries.txt
        
        # task at this depth or less would print when they start
        task.max.depth.log=2
        
        log.queries=true
        # -------------------------------------------------------------------------------------
        
        { "Rounds"
        
            OpenReader
            [ { "topDocs" Search > : 6.0s ]: 4
            CloseReader
        
            RepSumByPref topDocs
        
            NewRound
        
        } : 10
        
        mikemccand Michael McCandless added a comment -

        Attached patch, creating a DirectNIOFSDirectory, using direct ByteBuffers for read (IndexInput) and write (IndexOutput).

        With some simple initial tests (a TermQuery, OR query, PhraseQuery), on CentOS 5.4, Java 1.6.0_17 64-bit, it seems to be a bit (~1-3%) faster than NIOFSDirectory.
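
The patch itself is not reproduced in this thread. As a rough illustration of the idea only, the sketch below is a buffered reader that refills a direct ByteBuffer with positional `FileChannel.read` calls. The class name, fields, and buffer handling are assumptions for illustration and are not taken from LUCENE-2056.patch or from Lucene's BufferedIndexInput.

```java
import java.io.Closeable;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of a direct-buffer-backed input; NOT the attached patch.
final class DirectBufferedInput implements Closeable {
    private final FileChannel channel;
    private final ByteBuffer buffer;
    private long filePos = 0; // file offset at which the next refill starts

    DirectBufferedInput(Path file, int bufferSize) throws IOException {
        this.channel = FileChannel.open(file, StandardOpenOption.READ);
        this.buffer = ByteBuffer.allocateDirect(bufferSize);
        this.buffer.limit(0); // start empty so the first readByte() triggers a refill
    }

    /** Returns the next byte, refilling from the channel when the buffer is drained. */
    byte readByte() throws IOException {
        if (!buffer.hasRemaining()) {
            refill();
        }
        return buffer.get();
    }

    private void refill() throws IOException {
        buffer.clear();
        // Positional read: does not move the channel's own position.
        int n = channel.read(buffer, filePos);
        if (n < 0) {
            buffer.limit(0); // stay empty so later reads also fail
            throw new EOFException("read past EOF");
        }
        filePos += n;
        buffer.flip();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Because the channel fills the direct buffer without an intermediate `byte[]`, the bulk copy seen in the thread dumps is avoided; the cost moves to the per-byte `get()` calls instead.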

        mikemccand Michael McCandless added a comment -

        Hmm, not quite a ringing endorsement. I don't think I'll look into this any time soon (plenty on my plate!!), so if someone else wants to try, maybe with Java 7, go for it!

        markrmiller@gmail.com Mark Miller added a comment -

        Will be interesting to see what you come up with. I replaced the byte buffer in BufferedIndexInput with a direct buffer a year or two ago and it slowed things down. Then I read a bunch about how there were various issues with direct buffers - they really got NIO right on the first go or two. They are supposed to be much faster in Java 7.

        Who knows though - I was lazy and went with a pretty much straight port to direct buffers. Can prob get a much better kick with some pooling or something.
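
One possible reading of "pooling" here: since allocating a direct buffer is more expensive than allocating a heap buffer, buffers could be recycled instead of allocated per input. A minimal sketch under that assumption (the class is hypothetical; nothing like it is claimed to exist in Lucene or in the patch):

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical fixed-size pool of direct buffers; illustration only.
final class DirectBufferPool {
    private final ConcurrentLinkedQueue<ByteBuffer> free = new ConcurrentLinkedQueue<>();
    private final int bufferSize;

    DirectBufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    /** Returns a cleared buffer, reusing a released one when available. */
    ByteBuffer acquire() {
        ByteBuffer buf = free.poll();
        if (buf == null) {
            buf = ByteBuffer.allocateDirect(bufferSize); // slow path: fresh allocation
        }
        buf.clear();
        return buf;
    }

    /** Hands a buffer back for reuse; buffers of the wrong kind or size are dropped. */
    void release(ByteBuffer buf) {
        if (buf.isDirect() && buf.capacity() == bufferSize) {
            free.offer(buf);
        }
    }
}
```

The pool is unbounded and never frees native memory, so a real implementation would need a size cap and a policy for callers that forget to release.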

        yseeley@gmail.com Yonik Seeley added a comment -

        I have an uncomfortable feeling that it will be slower.
        IIRC, there's no way to get a byte[] from a direct byte buffer, so all of our methods that get a byte at a time will be making method calls. If those calls were directly implemented by the JVM as intrinsics... perhaps it would be faster. In general though, I've learned to lower my expectations (compared to the hype we've sometimes heard from Sun) and sometimes I'm pleasantly surprised.
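
To make the concern concrete: `ByteBuffer.hasArray()` reports whether a backing `byte[]` is accessible, and on the JVMs I am aware of it returns false for buffers from `allocateDirect`, so byte-at-a-time decoding has to go through `get()`. The sketch below decodes a variable-length int in the style of Lucene's VInt (7 data bits per byte, low-order group first, high bit as continuation flag). The class is a standalone illustration, not Lucene code.

```java
import java.nio.ByteBuffer;

// Hypothetical standalone demo; mirrors the VInt format but is not Lucene's implementation.
final class DirectBufferAccess {

    /** Decodes a VInt-style value using one ByteBuffer.get() call per byte. */
    static int readVInt(ByteBuffer buf) {
        byte b = buf.get();
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = buf.get();
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    /** Bulk alternative: copy the remaining bytes into a byte[] with a single call. */
    static byte[] drain(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(16);
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        System.out.println("heap.hasArray()   = " + heap.hasArray());
        System.out.println("direct.hasArray() = " + direct.hasArray());

        // 300 encodes as 0xAC 0x02: low 7 bits with continuation flag, then the rest.
        direct.put((byte) 0xAC).put((byte) 0x02).flip();
        System.out.println("decoded = " + readVInt(direct));
    }
}
```

Whether the per-byte `get()` calls end up slower than array indexing depends on how well the JIT inlines them, which is the open question raised in the comment above.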


          People

          • Assignee:
            Unassigned
            Reporter:
            mikemccand Michael McCandless
          • Votes:
            0
            Watchers:
            2
