Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.10, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      I don't know why this was removed, but this is buggy and just asking for trouble.

        Activity

        Hide
        Shai Erera added a comment -

        IIRC, I changed that on LUCENE-2245. The core motivation was to keep the addIndexes versions doing the minimal work that's required for them to function properly. So the Directory version merely copies the files over to the target index and the Reader version migrates them to the target index. Other than that, we should let the MP workout the best merges to execute etc.

        This lets the caller decide how expensive should addIndexes be, on his part. You can freely call forceMerge/maybeMerge before/after addIndexes, or not execute merges at all. I found this beneficial e.g. when creating indexes via MapReduce, where I used addIndexes, but wanted to wait with merges until the end. When addIndexes always calls maybeMerge, it means it always does this work, and the only way to avoid that is use NoMergePolicy. Perhaps NoMP came after the addIndexes cleanup, not sure...

        I like it that addIndexes does the minimum work that's needed. Can you elaborate why is this buggy? What prevents the app from invoking merges itself? I mean, this is not like addDocument() calls ... I believe that addIndexes are less common, and more ... expert?

        Show
        Shai Erera added a comment - IIRC, I changed that on LUCENE-2245 . The core motivation was to keep the addIndexes versions doing the minimal work that's required for them to function properly. So the Directory version merely copies the files over to the target index and the Reader version migrates them to the target index. Other than that, we should let the MP workout the best merges to execute etc. This lets the caller decide how expensive should addIndexes be, on his part. You can freely call forceMerge/maybeMerge before/after addIndexes, or not execute merges at all. I found this beneficial e.g. when creating indexes via MapReduce, where I used addIndexes, but wanted to wait with merges until the end. When addIndexes always calls maybeMerge, it means it always does this work, and the only way to avoid that is use NoMergePolicy. Perhaps NoMP came after the addIndexes cleanup, not sure... I like it that addIndexes does the minimum work that's needed. Can you elaborate why is this buggy? What prevents the app from invoking merges itself? I mean, this is not like addDocument() calls ... I believe that addIndexes are less common, and more ... expert?
        Hide
        Robert Muir added a comment -

        I don't agree with this argument about "This lets the caller decide how expensive should addIndexes be, on his part."

        The user can freely configure this with MergePolicy. Its no different from any other index operation. This is a bug.

        There is lots of confusion, including a current discussion on the ML.

        Show
        Robert Muir added a comment - I don't agree with this argument about "This lets the caller decide how expensive should addIndexes be, on his part." The user can freely configure this with MergePolicy. Its no different from any other index operation. This is a bug. There is lots of confusion, including a current discussion on the ML.
        Hide
        Mark Miller added a comment -

        The user can freely configure this with MergePolicy. Its no different from any other index operation. This is a bug.

        I was leaning towards Shai's argument at first, but after a bit of deeper thought, I agree with Robert.

        I don't know that having an option to not use the merge policy will add any confusion if the default is right, but it does seem the merge policy itself is sufficient for cases I can think of. I don't know that you need this extra way to control merges.

        Show
        Mark Miller added a comment - The user can freely configure this with MergePolicy. Its no different from any other index operation. This is a bug. I was leaning towards Shai's argument at first, but after a bit of deeper thought, I agree with Robert. I don't know that having an option to not use the merge policy will add any confusion if the default is right, but it does seem the merge policy itself is sufficient for cases I can think of. I don't know that you need this extra way to control merges.
        Hide
        Robert Muir added a comment -

        FYI: this is the third time i've heard of this trap hitting people and creating hundreds or thousands of index segments: once was from coworkers at a past job, twice was lucene user list discussion "Merger performance degradation on 3.6.1", thrice was Erick's recent ML post.

        For people that don't want merging they have NoMergePolicy, maybeMerge() is even documented as "expert" and "Explicit calls to maybeMerge() are usually not necessary. The most common case is when merge policy parameters have changed". So requiring the user to manually invoke this after index operations to prevent segment explosion is wrong IMO.

        Show
        Robert Muir added a comment - FYI: this is the third time i've heard of this trap hitting people and creating hundreds or thousands of index segments: once was from coworkers at a past job, twice was lucene user list discussion "Merger performance degradation on 3.6.1", thrice was Erick's recent ML post. For people that don't want merging they have NoMergePolicy, maybeMerge() is even documented as "expert" and "Explicit calls to maybeMerge() are usually not necessary. The most common case is when merge policy parameters have changed". So requiring the user to manually invoke this after index operations to prevent segment explosion is wrong IMO.
        Hide
        Erick Erickson added a comment -

        Agreed, we rely on segment merging in the normal state, to have it fail in this case is trappy.

        Commit it I say.

        Show
        Erick Erickson added a comment - Agreed, we rely on segment merging in the normal state, to have it fail in this case is trappy. Commit it I say.
        Hide
        David Smiley added a comment -

        Commit it I say.

        +1; this is a bug.

        Show
        David Smiley added a comment - Commit it I say. +1; this is a bug.
        Hide
        Mark Miller added a comment -

        I smell the consensus of silence - can we fix this now?

        Show
        Mark Miller added a comment - I smell the consensus of silence - can we fix this now?
        Hide
        Mark Miller added a comment -

        Can we get this in for 4.10 Robert Muir?

        Show
        Mark Miller added a comment - Can we get this in for 4.10 Robert Muir ?
        Hide
        Michael McCandless added a comment -

        +1

        Show
        Michael McCandless added a comment - +1
        Hide
        Shai Erera added a comment -

        +1. Now that we can set the MP dynamically, if someone doesn't want to perform merges, he can set the MP to NoMP, call addIndexes, then change it back.

        Show
        Shai Erera added a comment - +1. Now that we can set the MP dynamically, if someone doesn't want to perform merges, he can set the MP to NoMP, call addIndexes, then change it back.
        Hide
        Ryan Ernst added a comment -

        If this was that important, why wasn't it pushed on weeks ago? I hate to play the bad guy, but I'm a little scared to throw this into the release branch, when it has undergone minimal testing (ie it hasn't be hammered by jenkins for days or weeks).

        Show
        Ryan Ernst added a comment - If this was that important, why wasn't it pushed on weeks ago? I hate to play the bad guy, but I'm a little scared to throw this into the release branch, when it has undergone minimal testing (ie it hasn't be hammered by jenkins for days or weeks).
        Hide
        Mark Miller added a comment -

        I think we would have way more major concerns if this was a dangerous change.

        This is a really ugly bug with a simple fix.

        Show
        Mark Miller added a comment - I think we would have way more major concerns if this was a dangerous change. This is a really ugly bug with a simple fix.
        Hide
        Robert Muir added a comment -

        While we think about it i will go to trunk and 4x with the change.

        Show
        Robert Muir added a comment - While we think about it i will go to trunk and 4x with the change.
        Hide
        Ryan Ernst added a comment -

        I think we would have way more major concerns if this was a dangerous change.
        This is a really ugly bug with a simple fix.

        I don't know enough about the history of this issue here, so I trust your judgement. I'm just trying to be protective from a RM standpoint.

        Show
        Ryan Ernst added a comment - I think we would have way more major concerns if this was a dangerous change. This is a really ugly bug with a simple fix. I don't know enough about the history of this issue here, so I trust your judgement. I'm just trying to be protective from a RM standpoint.
        Hide
        ASF subversion and git services added a comment -

        Commit 1619642 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1619642 ]

        LUCENE-5672: Fix addIndexes to call maybeMerge

        Show
        ASF subversion and git services added a comment - Commit 1619642 from Robert Muir in branch 'dev/trunk' [ https://svn.apache.org/r1619642 ] LUCENE-5672 : Fix addIndexes to call maybeMerge
        Hide
        ASF subversion and git services added a comment -

        Commit 1619646 from Robert Muir in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1619646 ]

        LUCENE-5672: Fix addIndexes to call maybeMerge

        Show
        ASF subversion and git services added a comment - Commit 1619646 from Robert Muir in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1619646 ] LUCENE-5672 : Fix addIndexes to call maybeMerge
        Hide
        ASF subversion and git services added a comment -

        Commit 1619838 from Ryan Ernst in branch 'dev/branches/lucene_solr_4_10'
        [ https://svn.apache.org/r1619838 ]

        LUCENE-5672: Fix addIndexes to call maybeMerge

        Show
        ASF subversion and git services added a comment - Commit 1619838 from Ryan Ernst in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1619838 ] LUCENE-5672 : Fix addIndexes to call maybeMerge
        Hide
        ASF subversion and git services added a comment -

        Commit 1619840 from Ryan Ernst in branch 'dev/trunk'
        [ https://svn.apache.org/r1619840 ]

        LUCENE-5672,LUCENE-5897,LUCENE-5400: move changes entry to 4.10.0

        Show
        ASF subversion and git services added a comment - Commit 1619840 from Ryan Ernst in branch 'dev/trunk' [ https://svn.apache.org/r1619840 ] LUCENE-5672 , LUCENE-5897 , LUCENE-5400 : move changes entry to 4.10.0
        Hide
        ASF subversion and git services added a comment -

        Commit 1619841 from Ryan Ernst in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1619841 ]

        LUCENE-5672,LUCENE-5897,LUCENE-5400: move changes entry to 4.10.0

        Show
        ASF subversion and git services added a comment - Commit 1619841 from Ryan Ernst in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1619841 ] LUCENE-5672 , LUCENE-5897 , LUCENE-5400 : move changes entry to 4.10.0
        Hide
        ASF subversion and git services added a comment -

        Commit 1619842 from Ryan Ernst in branch 'dev/branches/lucene_solr_4_10'
        [ https://svn.apache.org/r1619842 ]

        LUCENE-5672,LUCENE-5897,LUCENE-5400: move changes entry to 4.10.0

        Show
        ASF subversion and git services added a comment - Commit 1619842 from Ryan Ernst in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1619842 ] LUCENE-5672 , LUCENE-5897 , LUCENE-5400 : move changes entry to 4.10.0

          People

          • Assignee:
            Robert Muir
            Reporter:
            Robert Muir
          • Votes:
            0 Vote for this issue
            Watchers:
            9 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development