SOLR-11794

PULL replicas stop replicating after a RELOAD collection action

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 7.1, 7.2
    • Fix Version/s: 7.3, 8.0
    • Environment: Linux version 2.6.32-642.15.1.el6.x86_64 (mockbuild@c1bm.rdu2.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC) ) #1 SMP Fri Feb 24 14:31:22 UTC 2017

    Description

      UPDATE

      PULL replica replication stops after calling the RELOAD collection API, even without any config/schema changes.
      It also happens when the Schema API is used to add a new field.

      Setup: a running SolrCloud cluster with NRT, TLOG, and PULL replicas.
      Solr - 7.1.0
      ZK - 3.4.10

      Config set - sample_techproducts_configs
      Shards - 1

      Whenever a schema change (adding new fields or changing field types) is pushed to ZK and the collection is reloaded using
      /solr/admin/collections?action=RELOAD&name=sample, index changes stop replicating to PULL replicas. NRT and TLOG replicas continue to replicate the index.
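For anyone reproducing this, the reload call above can be scripted. This is only a minimal sketch: the base URL http://localhost:8983 and the helper names reload_url and reload_collection are assumptions for illustration; only the path and the action/name parameters come from this report.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL; the report does not state where Solr is listening.
SOLR_BASE = "http://localhost:8983"


def reload_url(collection, base=SOLR_BASE):
    """Build the Collections API RELOAD URL quoted in the report."""
    query = urlencode({"action": "RELOAD", "name": collection})
    return f"{base}/solr/admin/collections?{query}"


def reload_collection(collection, base=SOLR_BASE, timeout=30):
    """Send the RELOAD request and return the raw response body as text."""
    with urlopen(reload_url(collection, base), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling reload_collection("sample") against a running node would issue the same request as in the description; reload_url("sample") just builds the URL without touching the network.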

      Before the schema change, I can see the indexFetcher thread running on the PULL replica:
      2017-12-26 10:17:11.802 INFO (indexFetcher-14-thread-1) [c:sample s:shard1 r:core_node6 x:sample_shard1_replica_p5] o.a.s.h.IndexFetcher Master's generation: 2
      2017-12-26 10:17:11.802 INFO (indexFetcher-14-thread-1) [c:sample s:shard1 r:core_node6 x:sample_shard1_replica_p5] o.a.s.h.IndexFetcher Master's version: 1514283298419
      2017-12-26 10:17:11.802 INFO (indexFetcher-14-thread-1) [c:sample s:shard1 r:core_node6 x:sample_shard1_replica_p5] o.a.s.h.IndexFetcher Slave's generation: 2
      2017-12-26 10:17:11.802 INFO (indexFetcher-14-thread-1) [c:sample s:shard1 r:core_node6 x:sample_shard1_replica_p5] o.a.s.h.IndexFetcher Slave's version: 1514283298419
      2017-12-26 10:17:11.802 INFO (indexFetcher-14-thread-1) [c:sample s:shard1 r:core_node6 x:sample_shard1_replica_p5] o.a.s.h.IndexFetcher Slave in sync with master.
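To check many replicas for this symptom, the poll summary lines above can be parsed from the log. The following is a rough sketch that assumes the message format shown in this report; parse_poll and in_sync are hypothetical helper names, not part of Solr.

```python
import re

# Matches the IndexFetcher summary lines quoted above, for example:
# "... o.a.s.h.IndexFetcher Master's generation: 2"
_LINE = re.compile(r"IndexFetcher (Master|Slave)'s (generation|version): (\d+)")


def parse_poll(lines):
    """Collect the leader/replica generation and version from one poll's log lines."""
    state = {}
    for line in lines:
        match = _LINE.search(line)
        if match:
            role, field, value = match.groups()
            state[(role.lower(), field)] = int(value)
    return state


def in_sync(state):
    """True when the replica logged the same generation and version as the leader."""
    return all(
        state.get(("master", field)) is not None
        and state.get(("master", field)) == state.get(("slave", field))
        for field in ("generation", "version")
    )
```

If no such lines appear at all after a reload, parse_poll returns an empty dict and in_sync returns False, which matches the "polling stopped" case described in this issue.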

      After that, the following fields are added to the managed-schema of sample_techproducts_configs, the config is pushed to ZK, and the collection is reloaded:
      <field name="testpoint_1" type="point" indexed="true" stored="true"/>
      <field name="testpoint_2" type="point" indexed="true" stored="true"/>
      <field name="testpoint_3" type="point" indexed="true" stored="true"/>
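Since the same behaviour is reported when the Schema API is used to add a field, the three fields above could also be added that way. This is a hedged sketch: the base URL and the helper names are assumptions, and it relies on the Schema API accepting a JSON body with an "add-field" command posted to /solr/<collection>/schema.

```python
import json
from urllib.request import Request, urlopen


def add_field_payload(names, field_type="point"):
    """Build a Schema API body that adds one indexed, stored field per name."""
    fields = [
        {"name": name, "type": field_type, "indexed": True, "stored": True}
        for name in names
    ]
    return json.dumps({"add-field": fields})


def add_fields(collection, names, base="http://localhost:8983", timeout=30):
    """POST the add-field command; the base URL is an assumed default."""
    request = Request(
        f"{base}/solr/{collection}/schema",
        data=add_field_payload(names).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, add_fields("sample", ["testpoint_1", "testpoint_2", "testpoint_3"]) would request the same three fields as the XML snippet above.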

      I can no longer see the IndexFetcher thread running on the PULL replica, and no further IndexFetcher logs are printed. The log ends with the collection reload entry:
      2017-12-26 10:22:09.256 INFO (qtp128526626-16) [c:sample s:shard1 r:core_node6 x:sample_shard1_replica_p5] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores params={core=sample_shard1_replica_p5&qt=/admin/cores&action=RELOAD&wt=javabin&version=2} status=0 QTime=624

      The index on the PULL replica is never updated after this, and the leader no longer receives polls from the PULL replica.

      Observations:

      • Manually forcing an index fetch using /replication?command=fetchindex syncs the index, but does not restart IndexFetcher polling.
      • Restarting the replica syncs the index and restarts the IndexFetcher thread and polling.
      • Removing the replica and adding it back as PULL syncs the index and restarts the IndexFetcher thread and polling.
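The first observation can be used as a stopgap on an affected replica. A minimal sketch follows, assuming the replica's core is reachable at http://localhost:8983 (an assumption) and using the core name from the logs above; fetchindex_url and force_fetch are hypothetical helper names. As noted, this only performs a one-off sync and does not bring polling back.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def fetchindex_url(core, base="http://localhost:8983"):
    """Build the replication handler URL that forces a one-off index fetch."""
    query = urlencode({"command": "fetchindex"})
    return f"{base}/solr/{core}/replication?{query}"


def force_fetch(core, base="http://localhost:8983", timeout=30):
    """Trigger the fetch on the given core and return the raw response body."""
    with urlopen(fetchindex_url(core, base), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For the replica in this report, the call would be force_fetch("sample_shard1_replica_p5").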

      Attachments

        1. SOLR-11794.patch (0.7 kB, Samuel Tatipamula)
        2. SOLR-11794.patch (6 kB, Tomas Eduardo Fernandez Lobbe)

            People

              Assignee: Tomas Eduardo Fernandez Lobbe (tflobbe)
              Reporter: Samuel Tatipamula (samuel.tatipamula)
              Votes: 1
              Watchers: 4
