SOLR-11794

PULL replicas stop replicating after a RELOAD collection action


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 7.1, 7.2
    • Fix Version/s: 7.3, 8.0
    • Environment: Linux version 2.6.32-642.15.1.el6.x86_64 (mockbuild@c1bm.rdu2.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC) ) #1 SMP Fri Feb 24 14:31:22 UTC 2017

    Description

      UPDATE

      PULL replica replication stops after calling the RELOAD collection API, even without any config or schema changes.
      It also happens when the Schema API is used to add a new field.

      Setup: a running SolrCloud cluster with NRT, TLOG, and PULL replicas.
      Solr - 7.1.0
      ZK - 3.4.10

      Used config set - sample_techproducts_configs
      Shards - 1

      Whenever a schema change (adding new fields or changing field types) is pushed to ZK and the collection is reloaded using
      /solr/admin/collections?action=RELOAD&name=sample, index changes stop replicating to the PULL replicas. NRT and TLOG replicas continue to replicate the index.
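
For reference, a minimal sketch of building the RELOAD request URL described above. The base URL `http://localhost:8983` and the helper name `build_reload_url` are assumptions for illustration; only the path and the `action` and `name` parameters come from this report.

```python
from urllib.parse import urlencode


def build_reload_url(base_url: str, collection: str) -> str:
    """Return the Collections API URL that reloads `collection`.

    `base_url` is a hypothetical host:port prefix; adjust it to your node.
    """
    query = urlencode({"action": "RELOAD", "name": collection})
    return f"{base_url.rstrip('/')}/solr/admin/collections?{query}"


if __name__ == "__main__":
    # Issue an HTTP GET against the printed URL to trigger the reload.
    print(build_reload_url("http://localhost:8983", "sample"))
```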

      Before the schema change, I can see the indexFetcher thread running on the PULL replica:
      2017-12-26 10:17:11.802 INFO (indexFetcher-14-thread-1) [c:sample s:shard1 r:core_node6 x:sample_shard1_replica_p5] o.a.s.h.IndexFetcher Master's generation: 2
      2017-12-26 10:17:11.802 INFO (indexFetcher-14-thread-1) [c:sample s:shard1 r:core_node6 x:sample_shard1_replica_p5] o.a.s.h.IndexFetcher Master's version: 1514283298419
      2017-12-26 10:17:11.802 INFO (indexFetcher-14-thread-1) [c:sample s:shard1 r:core_node6 x:sample_shard1_replica_p5] o.a.s.h.IndexFetcher Slave's generation: 2
      2017-12-26 10:17:11.802 INFO (indexFetcher-14-thread-1) [c:sample s:shard1 r:core_node6 x:sample_shard1_replica_p5] o.a.s.h.IndexFetcher Slave's version: 1514283298419
      2017-12-26 10:17:11.802 INFO (indexFetcher-14-thread-1) [c:sample s:shard1 r:core_node6 x:sample_shard1_replica_p5] o.a.s.h.IndexFetcher Slave in sync with master.
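
To check log excerpts like the one above mechanically, the following sketch extracts the generation and version values that IndexFetcher reports and compares them. The function names are hypothetical, and the regular expression assumes the exact message wording shown in these log lines.

```python
import re

# Matches messages such as "o.a.s.h.IndexFetcher Master's generation: 2".
_FIELD = re.compile(r"IndexFetcher (Master|Slave)'s (generation|version): (\d+)")


def parse_fetch_state(lines):
    """Collect the last reported generation/version for master and slave."""
    state = {}
    for line in lines:
        match = _FIELD.search(line)
        if match:
            role, field, value = match.groups()
            state[(role.lower(), field)] = int(value)
    return state


def in_sync(state):
    """True only if all four values are present and both pairs match."""
    try:
        return (
            state[("master", "generation")] == state[("slave", "generation")]
            and state[("master", "version")] == state[("slave", "version")]
        )
    except KeyError:
        return False
```

Feeding the four generation and version lines above into `parse_fetch_state` and then `in_sync` should agree with the "Slave in sync with master." message.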

      After that, the following change is made to the managed-schema of sample_techproducts_configs, pushed to ZK, and the collection is reloaded:
      <field name="testpoint_1" type="point" indexed="true" stored="true"/>
      <field name="testpoint_2" type="point" indexed="true" stored="true"/>
      <field name="testpoint_3" type="point" indexed="true" stored="true"/>
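
Since the problem is also reported when the Schema API is used to add a field, here is a sketch of an equivalent `add-field` request body for the three fields above. The helper name is hypothetical, and the target endpoint (a POST to the collection's `/schema` path) is an assumption about the setup, not something recorded in this report.

```python
import json


def add_field_payload(names, field_type="point"):
    """Build a Schema API "add-field" body mirroring the XML definitions."""
    return {
        "add-field": [
            {"name": name, "type": field_type, "indexed": True, "stored": True}
            for name in names
        ]
    }


if __name__ == "__main__":
    payload = add_field_payload(["testpoint_1", "testpoint_2", "testpoint_3"])
    # Send this JSON as the POST body, e.g. to /solr/sample/schema (assumed path).
    print(json.dumps(payload, indent=2))
```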

      I can no longer see the IndexFetcher thread running on the PULL replica, and no further logs are printed. The logs end with the collection reload entry:
      2017-12-26 10:22:09.256 INFO (qtp128526626-16) [c:sample s:shard1 r:core_node6 x:sample_shard1_replica_p5] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores params={core=sample_shard1_replica_p5&qt=/admin/cores&action=RELOAD&wt=javabin&version=2} status=0 QTime=624

      The index is never updated after this, and the leader no longer receives polls from the PULL replica.
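
The symptom can be spotted in a replica log with a small check like the one below. This is only a sketch: the function name is hypothetical, and it relies on the `action=RELOAD` and `o.a.s.h.IndexFetcher` substrings seen in the excerpts in this report. A `False` result right after a reload may simply mean the next poll is not due yet, so the log should cover more than one poll interval.

```python
def polling_after_reload(lines):
    """Report whether any IndexFetcher entry follows the last core RELOAD.

    Returns None if the log contains no RELOAD entry at all.
    """
    last_reload = None
    for index, line in enumerate(lines):
        if "action=RELOAD" in line:
            last_reload = index
    if last_reload is None:
        return None
    # Look only at entries logged after the most recent reload.
    return any("o.a.s.h.IndexFetcher" in line for line in lines[last_reload + 1:])
```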

      Observations:

      • Manually forcing an index fetch using /replication?command=fetchindex syncs the index, but does not restart IndexFetcher polling.
      • Restarting the replica syncs the index and restarts the IndexFetcher thread and polling.
      • Removing the replica and adding it back as PULL syncs the index and restarts the IndexFetcher thread and polling.
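
The manual fetch from the first observation can be scripted as a stopgap. The sketch below only builds the URL; the base URL and helper name are assumptions, while the core name and the `command=fetchindex` parameter come from this report. As noted above, this syncs the index once and does not bring back periodic polling.

```python
from urllib.parse import urlencode


def build_fetchindex_url(base_url: str, core: str) -> str:
    """Return the replication handler URL that forces a one-off index fetch."""
    query = urlencode({"command": "fetchindex"})
    return f"{base_url.rstrip('/')}/solr/{core}/replication?{query}"


if __name__ == "__main__":
    # Hypothetical node address; the core name is the PULL replica from the logs.
    print(build_fetchindex_url("http://localhost:8983", "sample_shard1_replica_p5"))
```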

      Attachments

        1. SOLR-11794.patch
          6 kB
          Tomas Eduardo Fernandez Lobbe
        2. SOLR-11794.patch
          0.7 kB
          Samuel Tatipamula


          People

            Assignee: tflobbe Tomas Eduardo Fernandez Lobbe
            Reporter: samuel.tatipamula Samuel Tatipamula
            Votes: 1
            Watchers: 4
