Lucene - Core
  1. Lucene - Core
  2. LUCENE-3515

Possible slowdown of indexing/merging on 3.x vs trunk

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Opening an issue to pursue the possible slowdown Marc Sturlese uncovered.

      1. LUCENE-3515.patch
        15 kB
        Michael McCandless
      2. LUCENE-3515.patch
        12 kB
        Michael McCandless
      3. LUCENE-index-34.patch
        6 kB
        Michael McCandless
      4. LUCENE-index-40.patch
        6 kB
        Michael McCandless
      5. stdout-snow-leopard.tar.gz
        96 kB
        Marc Sturlese
      6. TestGenerationTime.java.3x
        5 kB
        Michael McCandless
      7. TestGenerationTime.java.40
        6 kB
        Michael McCandless

        Activity

        Uwe Schindler made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Michael McCandless committed 1188431 (1 file)
        Reviews: none

        LUCENE-3515: remove bogus change entry on 3.x for this issue (it wasn't a bug in 3.x)

        Michael McCandless made changes -
        Fix Version/s 3.5 [ 12317877 ]
        Hide
        Michael McCandless added a comment -

        This bug was only present in 4.0.

        Show
        Michael McCandless added a comment - This bug was only present in 4.0.
        Michael McCandless made changes -
        Link This issue is blocked by LUCENE-3524 [ LUCENE-3524 ]
        Michael McCandless made changes -
        Link This issue is blocked by LUCENE-3524 [ LUCENE-3524 ]
        Michael McCandless made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hide
        Michael McCandless added a comment -

        Thank you Marc and Erick! This was a devious issue and severely impacted merge performance for non-MMapDir impls.

        Show
        Michael McCandless added a comment - Thank you Marc and Erick! This was a devious issue and severely impacted merge performance for non-MMapDir impls.
        Michael McCandless made changes -
        Attachment LUCENE-3515.patch [ 12499025 ]
        Hide
        Michael McCandless added a comment -

        New patch.

        I changed the test to just use random terms, and also added the
        workaround for Pulsing's non-reuse.

        I discovered SimpleText also fails to properly reuse (it failed the
        new test) and fixed that.

        I think it's ready!

        Show
        Michael McCandless added a comment - New patch. I changed the test to just use random terms, and also added the workaround for Pulsing's non-reuse. I discovered SimpleText also fails to properly reuse (it failed the new test) and fixed that. I think it's ready!
        Hide
        Robert Muir added a comment -

        i opened LUCENE-3517 for the pulsing bug.

        But i think we should fix this 'general bug' which affects all codecs first?
        I think LUCENE-3517 might involve attributes-policeman

        we could do this:

            String bodyCodec = CodecProvider.getDefault().getFieldCodec("body");
            assumeFalse("PulsingCodec fails this test because of over-cloning", bodyCodec.equals("Pulsing") || bodyCodec.equals("MockRandom"));
        
        Show
        Robert Muir added a comment - i opened LUCENE-3517 for the pulsing bug. But i think we should fix this 'general bug' which affects all codecs first? I think LUCENE-3517 might involve attributes-policeman we could do this: String bodyCodec = CodecProvider.getDefault().getFieldCodec("body"); assumeFalse("PulsingCodec fails this test because of over-cloning", bodyCodec.equals("Pulsing") || bodyCodec.equals("MockRandom"));
        Hide
        Michael McCandless added a comment -

        This seed will fail due to pulsing codec:

        otv TestForTooMuchCloning.test -seed 6407e19e4835e90d:-d7cdae0d4a378eb:-ee1a92b677887
        
        Show
        Michael McCandless added a comment - This seed will fail due to pulsing codec: otv TestForTooMuchCloning.test -seed 6407e19e4835e90d:-d7cdae0d4a378eb:-ee1a92b677887
        Michael McCandless made changes -
        Assignee Michael McCandless [ mikemccand ]
        Michael McCandless made changes -
        Attachment LUCENE-3515.patch [ 12498922 ]
        Hide
        Michael McCandless added a comment -

        Patch, fixing the lack of reuse in MultiTermsEnum (used during merging).

        The lack of re-use meant we were cloning 2 IndexInputs (frq/prx) per term being merged! For NIOFSDir this then meant we would read the same 4K region of the file over and over and over again.

        I added a test to catch "over-cloning", but, the test sometimes fails when it gets pulsing codec because that codec does not properly reuse (a known issue that we should now fix!).

        Show
        Michael McCandless added a comment - Patch, fixing the lack of reuse in MultiTermsEnum (used during merging). The lack of re-use meant we were cloning 2 IndexInputs (frq/prx) per term being merged! For NIOFSDir this then meant we would read the same 4K region of the file over and over and over again. I added a test to catch "over-cloning", but, the test sometimes fails when it gets pulsing codec because that codec does not properly reuse (a known issue that we should now fix!).
        Hide
        Robert Muir added a comment -

        it looks like the bug will especially affect any directory that uses bufferedindexinput (NIOFS/SimpleFS).

        The problem is multitermsenum doesnt reuse the sub-docs&positionsenums, so for each term*segment we clone the input, and
        bufferedindexinput.clone() sets the clone's buffer to null.

        so across lots of low freq-terms we re-read 4096 bytes (MERGE_BUFFER_SIZE) to refill the buffer on each one...

        mmapdirectory is less affected because it has no buffer to re-read but seems like fixing the reusing would even help it...

        Show
        Robert Muir added a comment - it looks like the bug will especially affect any directory that uses bufferedindexinput (NIOFS/SimpleFS). The problem is multitermsenum doesnt reuse the sub-docs&positionsenums, so for each term*segment we clone the input, and bufferedindexinput.clone() sets the clone's buffer to null. so across lots of low freq-terms we re-read 4096 bytes (MERGE_BUFFER_SIZE) to refill the buffer on each one... mmapdirectory is less affected because it has no buffer to re-read but seems like fixing the reusing would even help it...
        Hide
        Marc Sturlese added a comment -

        I've tried snow leopard and MMap. Trunk went much better than before:

        trunk
        Time taken indexing: 416
        Time taken closing: 1
        Time taken whole process: 417

        3.4:
        Time taken indexing: 321
        Time taken closing: 46
        Time taken whole process: 368

        Show
        Marc Sturlese added a comment - I've tried snow leopard and MMap. Trunk went much better than before: trunk Time taken indexing: 416 Time taken closing: 1 Time taken whole process: 417 3.4: Time taken indexing: 321 Time taken closing: 46 Time taken whole process: 368
        Hide
        Robert Muir added a comment -

        simplefs is slow too.

        Show
        Robert Muir added a comment - simplefs is slow too.
        Hide
        Robert Muir added a comment -

        something is severely wrong on mac os X with NIOFSDirectory (when merging, on the read side).

        with mmapdirectory, its fast.

        Show
        Robert Muir added a comment - something is severely wrong on mac os X with NIOFSDirectory (when merging, on the read side). with mmapdirectory, its fast.
        Marc Sturlese made changes -
        Attachment stdout-snow-leopard.tar.gz [ 12498904 ]
        Hide
        Marc Sturlese added a comment - - edited

        Seeing others results the "problem" seems to be merges are very slow on trunk on Snow Leopard (where I've been running the tests).
        I've attached the whole stdout for both executions. Soon will run the tests on a debian box and post the results too.
        Here are the results:
        trunk
        Time taken indexing: 927
        Time taken closing: 283
        Time taken whole process: 1211

        3.4
        Time taken indexing: 303
        Time taken closing: 37
        Time taken whole process: 340

        Show
        Marc Sturlese added a comment - - edited Seeing others results the "problem" seems to be merges are very slow on trunk on Snow Leopard (where I've been running the tests). I've attached the whole stdout for both executions. Soon will run the tests on a debian box and post the results too. Here are the results: trunk Time taken indexing: 927 Time taken closing: 283 Time taken whole process: 1211 3.4 Time taken indexing: 303 Time taken closing: 37 Time taken whole process: 340
        Hide
        Michael McCandless added a comment -

        I actually see 3.x running slower:

        Trunk:

          Time taken indexing: 184
          Time taken closing: 2
          Time taken whole process: 187
        

        3.x:

        Time taken indexing: 205
          Time taken closing: 40
          Time taken whole process: 245
        

        But, this is because 3.x does a level 2 merge just before close, and
        the close must wait for that merge to complete. Whereas trunk never
        gets to the level 2 merge (only level 0/1), likely because trunk's RAM
        efficiency is a better and so it packs more docs into each level 0 segment.

        If we pushed the number of doc up to maybe 1.1M then we should
        similarly see trunk trigger a level 2 merge.

        Really when benchmarking indexing it's best to just close the IW
        without waiting for merges; otherwise you're comparing apples &
        oranges.

        Show
        Michael McCandless added a comment - I actually see 3.x running slower: Trunk: Time taken indexing: 184 Time taken closing: 2 Time taken whole process: 187 3.x: Time taken indexing: 205 Time taken closing: 40 Time taken whole process: 245 But, this is because 3.x does a level 2 merge just before close, and the close must wait for that merge to complete. Whereas trunk never gets to the level 2 merge (only level 0/1), likely because trunk's RAM efficiency is a better and so it packs more docs into each level 0 segment. If we pushed the number of doc up to maybe 1.1M then we should similarly see trunk trigger a level 2 merge. Really when benchmarking indexing it's best to just close the IW without waiting for merges; otherwise you're comparing apples & oranges.
        Hide
        Robert Muir added a comment -

        3.x:

        [junit] Time taken indexing: 114
        [junit] Time taken closing: 20
        [junit] Time taken whole process: 134

        4.x:

        [junit] Time taken indexing: 133
        [junit] Time taken closing: 1
        [junit] Time taken whole process: 134

        So for me, its the same speed.

        I didnt use the public static void main, i used the test as-is, except i disabled assertions and forced StandardCodec.

        Show
        Robert Muir added a comment - 3.x: [junit] Time taken indexing: 114 [junit] Time taken closing: 20 [junit] Time taken whole process: 134 4.x: [junit] Time taken indexing: 133 [junit] Time taken closing: 1 [junit] Time taken whole process: 134 So for me, its the same speed. I didnt use the public static void main, i used the test as-is, except i disabled assertions and forced StandardCodec.
        Michael McCandless made changes -
        Attachment TestGenerationTime.java.3x [ 12498895 ]
        Attachment TestGenerationTime.java.40 [ 12498896 ]
        Hide
        Michael McCandless added a comment -

        Same tests, just cut over to static void main instead.

        Show
        Michael McCandless added a comment - Same tests, just cut over to static void main instead.
        Hide
        Robert Muir added a comment -

        we should run the comparisons as a public static void main i think instead of extending LuceneTestCase.

        4.0 could be getting SimpleText etc

        Show
        Robert Muir added a comment - we should run the comparisons as a public static void main i think instead of extending LuceneTestCase. 4.0 could be getting SimpleText etc
        Michael McCandless made changes -
        Field Original Value New Value
        Attachment LUCENE-index-34.patch [ 12498890 ]
        Attachment LUCENE-index-40.patch [ 12498891 ]
        Hide
        Michael McCandless added a comment -

        Patches from Marc showing the issue (thanks Marc!).

        Show
        Michael McCandless added a comment - Patches from Marc showing the issue (thanks Marc!).
        Michael McCandless created issue -

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Michael McCandless
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development