Solr / SOLR-3126

We should try to do a quick sync on standard startup recovery before trying to do a full-blown replication.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      Just more efficient - especially on cluster shutdown/start, where the replicas may all be up to date and match anyway.
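
      The ordering this issue proposes can be pictured with a minimal sketch. It assumes a hypothetical `Replica` interface; none of the names below are Solr APIs, and the real recovery code may differ in detail. The replica announces that it is recovering, tries the cheap sync first, and only falls back to an explicit commit plus full replication when the sync cannot bring it up to date.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: every name here is hypothetical, not a Solr API. */
class RecoverySketch {

    /** Hypothetical view of the steps a recovering replica can take. */
    interface Replica {
        void prepRecovery();     // tell the leader we are recovering (no commit here)
        boolean quickSync();     // true if recent updates were enough to catch up
        void commitOnLeader();   // explicit commit, only needed before replicating
        void replicate();        // full index replication from the leader
    }

    enum Outcome { SYNCED, REPLICATED }

    /** Try the cheap sync first; fall back to full replication if it fails. */
    static Outcome recover(Replica replica) {
        replica.prepRecovery();
        if (replica.quickSync()) {
            return Outcome.SYNCED;
        }
        replica.commitOnLeader();
        replica.replicate();
        return Outcome.REPLICATED;
    }

    /** Test double that records the order of calls. */
    static class RecordingReplica implements Replica {
        final List<String> calls = new ArrayList<>();
        final boolean syncSucceeds;

        RecordingReplica(boolean syncSucceeds) { this.syncSucceeds = syncSucceeds; }

        public void prepRecovery() { calls.add("prep"); }
        public boolean quickSync() { calls.add("sync"); return syncSucceeds; }
        public void commitOnLeader() { calls.add("commit"); }
        public void replicate() { calls.add("replicate"); }
    }

    public static void main(String[] args) {
        RecordingReplica upToDate = new RecordingReplica(true);
        System.out.println(recover(upToDate) + " " + upToDate.calls);

        RecordingReplica behind = new RecordingReplica(false);
        System.out.println(recover(behind) + " " + behind.calls);
    }
}
```

      In the cluster shutdown/start case described above, every replica would take the `SYNCED` path and skip replication entirely.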

      1. SOLR-3126.patch
        13 kB
        Mark Miller
      2. SOLR-3126.patch
        6 kB
        Mark Miller

          Activity

          Yonik Seeley added a comment -

          IMO the best way forward on this issue is to get sane logging so we can figure out what's happening to what core.

          Mark Miller added a comment -

          I think I've made some progress on tracking this down. It looks like the 4 second wait, which we do to make sure that no updates that started out seeing stale state are still finishing, might not be long enough after some stuff was rearranged. Boosting that wait is getting me better results - still testing though.

          Mark Miller added a comment -

          Hmm...somehow this has made regular replication recovery unstable in some situations (fairly often on apache jenkins, less often locally)...trying to figure out where/how.

          Mark Miller added a comment -

          Patch for this - I stop committing in the prep recovery cmd so that it can also be used in the sync case. In the replicate case, we do a prep recovery cmd and then an explicit commit.

          Mark Miller added a comment -

          Actually I should probably do one more thing here - wait to start sync until we are sure the leader sees us as recovering.
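
          A wait like this is often written as a bounded poll. The sketch below is only an illustration under that assumption: `leaderSeesRecovering` is a hypothetical check supplied by the caller, and the class and method names are not taken from Solr or from the attached patch.

```java
import java.util.function.BooleanSupplier;

/** Illustrative sketch only: names are hypothetical, not Solr APIs. */
class WaitForLeaderSketch {

    /**
     * Polls the given check until it returns true or the timeout expires.
     * Returns true if the condition was observed, false on timeout or interrupt.
     */
    static boolean waitUntil(BooleanSupplier leaderSeesRecovering,
                             long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (leaderSeesRecovering.getAsBoolean()) {
                return true;
            }
            // Overflow-safe comparison against the deadline.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

          A caller would start the sync only when `waitUntil` returns true, and otherwise retry or fall back to replication.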

          Mark Miller added a comment -

          Alright, this is in.

          Mark Miller added a comment -

          Whoops - I was not building the leader URL correctly - fixed. I'll commit this soon.

          Mark Miller added a comment -

          Current WIP.

          Still trying to track down an issue around FullSolrCloudTest#brindDownShardIndexSomeDocsAndRecover


            People

            • Assignee:
              Mark Miller
              Reporter:
              Mark Miller