SOLR-5811

The Overseer will retry work items until success.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.7.1, 4.8, 6.0
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      This means that if you get a bad item in the ZK distributed queue, it can lock up your Overseer as it continuously retries the bad command. The workaround is to manually clear the Overseer ZK queue.

      1. SOLR-5811.patch
        25 kB
        Mark Miller
      2. SOLR-5811.patch
        18 kB
        Mark Miller

          Activity

          Mark Miller added a comment - edited

          When the Overseer was first considered, one of the primary ideas was that commands could fail over if not completed or be retried on failures, etc. A lot of this is not there yet. The Overseer will actually retry failed commands now, but it's much too dumb about it - a command that cannot or will not succeed will tie up the whole processing pipeline.

          In the short term, I don't think we should retry most work items.

          Longer term, it seems like we should track retries and perhaps give up at some point - or do something smarter than retrying as fast as possible until success.
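A minimal sketch of the "track retries and give up at some point" idea from the comment above. This is not the Overseer code or the attached patch; the class `BoundedRetryQueue`, its `maxAttempts` parameter, and the `drain` method are hypothetical names chosen for illustration, and work items are modeled as plain strings rather than ZK queue entries.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical illustration only; not Solr's Overseer implementation. */
class BoundedRetryQueue {
    private final int maxAttempts;
    private final Deque<String> queue = new ArrayDeque<>();

    BoundedRetryQueue(int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.maxAttempts = maxAttempts;
    }

    /** Adds a work item to the tail of the queue. */
    void offer(String item) {
        queue.addLast(item);
    }

    /**
     * Processes every queued item in order. An item whose handler keeps
     * throwing is attempted at most maxAttempts times, then set aside so
     * that later items are not blocked behind it.
     *
     * @return the items that were given up on, in queue order
     */
    List<String> drain(Consumer<String> handler) {
        List<String> givenUp = new ArrayList<>();
        while (!queue.isEmpty()) {
            String item = queue.pollFirst();
            boolean done = false;
            for (int attempt = 1; attempt <= maxAttempts && !done; attempt++) {
                try {
                    handler.accept(item);
                    done = true;
                } catch (RuntimeException e) {
                    // A real system would log the failure here (and probably
                    // back off) before the next attempt.
                }
            }
            if (!done) {
                givenUp.add(item);
            }
        }
        return givenUp;
    }
}
```

With `maxAttempts` set to 1 this behaves like the short-term suggestion of not retrying work items at all: a bad item fails once, is reported, and the rest of the queue proceeds.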

          Mark Miller added a comment -

          Another patch with some improved checking around finding a collection name so that we will have better errors in a similar situation.

          ASF subversion and git services added a comment -

          Commit 1574280 from Mark Miller in branch 'dev/trunk'
          [ https://svn.apache.org/r1574280 ]

          SOLR-5811: The Overseer will retry work items until success, which is a serious problem if you hit a bad work item.

          ASF subversion and git services added a comment -

          Commit 1574281 from Mark Miller in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1574281 ]

          SOLR-5811: The Overseer will retry work items until success, which is a serious problem if you hit a bad work item.

          ASF subversion and git services added a comment -

          Commit 1574580 from Mark Miller in branch 'dev/trunk'
          [ https://svn.apache.org/r1574580 ]

          SOLR-5811: Additional cleanup.

          ASF subversion and git services added a comment -

          Commit 1574581 from Mark Miller in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1574581 ]

          SOLR-5811: Additional cleanup.

          ASF subversion and git services added a comment -

          Commit 1574753 from Mark Miller in branch 'dev/trunk'
          [ https://svn.apache.org/r1574753 ]

          SOLR-5811: Improve logged message.

          ASF subversion and git services added a comment -

          Commit 1574763 from Mark Miller in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1574763 ]

          SOLR-5811: Improve logged message.

          Steve Rowe added a comment -

          Mark Miller, any reason not to backport this to 4.7.1?

          ASF subversion and git services added a comment -

          Commit 1581184 from Steve Rowe in branch 'dev/branches/lucene_solr_4_7'
          [ https://svn.apache.org/r1581184 ]

          SOLR-5811: The Overseer will retry work items until success, which is a serious problem if you hit a bad work item. (merged branch_4x r1574281)

          ASF subversion and git services added a comment -

          Commit 1581185 from Steve Rowe in branch 'dev/trunk'
          [ https://svn.apache.org/r1581185 ]

          SOLR-5811: move CHANGES.txt entry to 4.7.1 section

          ASF subversion and git services added a comment -

          Commit 1581187 from Steve Rowe in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1581187 ]

          SOLR-5811: move CHANGES.txt entry to 4.7.1 section (merged trunk r1581185)

          ASF subversion and git services added a comment -

          Commit 1581193 from Steve Rowe in branch 'dev/branches/lucene_solr_4_7'
          [ https://svn.apache.org/r1581193 ]

          SOLR-5811: Additional cleanup and improve logged message. (merged branch_4x r1574581 and r1574763)

          Steve Rowe added a comment -

          Mark, can this issue be resolved?

          Steve Rowe added a comment -

          Bulk close 4.7.1 issues


            People

            • Assignee:
              Mark Miller
              Reporter:
              Mark Miller
            • Votes:
              0
              Watchers:
              5
