SOLR-7869: Overseer does not handle BadVersionException correctly

Details

    Description

      If /clusterstate.json is modified externally, the Overseer can enter an infinite loop: each write fails with a BadVersionException, and the Overseer alternates between retrying the main queue and the work queue without ever succeeding:

      ERROR - 2015-08-04 18:49:56.224; [   ] org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer work queue loop
      org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /clusterstate.json
              at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
              at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
              at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1270)
              at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:362)
              at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:359)
              at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
              at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:359)
              at org.apache.solr.cloud.overseer.ZkStateWriter.writePendingUpdates(ZkStateWriter.java:180)
              at org.apache.solr.cloud.overseer.ZkStateWriter.enqueueUpdate(ZkStateWriter.java:67)
              at org.apache.solr.cloud.Overseer$ClusterStateUpdater.processQueueItem(Overseer.java:286)
              at org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:168)
              at java.lang.Thread.run(Thread.java:745)
      INFO  - 2015-08-04 18:49:56.224; [   ] org.apache.solr.cloud.Overseer$ClusterStateUpdater; processMessage: queueSize: 1, message = {
        "operation":"state",
        "state":"down",
        "base_url":"http://127.0.1.1:7574/solr",
        "core":"test_shard1_replica1",
        "roles":null,
        "node_name":"127.0.1.1:7574_solr",
        "shard":null,
        "collection":"test",
        "core_node_name":"core_node1"} current state version: 9
      INFO  - 2015-08-04 18:49:56.224; [   ] org.apache.solr.cloud.overseer.ReplicaMutator; Update state numShards=null message={
        "operation":"state",
        "state":"down",
        "base_url":"http://127.0.1.1:7574/solr",
        "core":"test_shard1_replica1",
        "roles":null,
        "node_name":"127.0.1.1:7574_solr",
        "shard":null,
        "collection":"test",
        "core_node_name":"core_node1"}
      INFO  - 2015-08-04 18:49:56.224; [   ] org.apache.solr.cloud.overseer.ReplicaMutator; shard=shard1 is already registered
      ERROR - 2015-08-04 18:49:56.225; [   ] org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer main queue loop
      org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /clusterstate.json
              at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
              at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
              at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1270)
              at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:362)
              at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:359)
              at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
              at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:359)
              at org.apache.solr.cloud.overseer.ZkStateWriter.writePendingUpdates(ZkStateWriter.java:180)
              at org.apache.solr.cloud.overseer.ZkStateWriter.enqueueUpdate(ZkStateWriter.java:67)
              at org.apache.solr.cloud.Overseer$ClusterStateUpdater.processQueueItem(Overseer.java:286)
              at org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:213)
              at java.lang.Thread.run(Thread.java:745)
      INFO  - 2015-08-04 18:49:56.225; [   ] org.apache.solr.common.cloud.ZkStateReader; Updating data for gettingstarted to ver 8
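
The failure mode in the log above can be illustrated with a minimal sketch of a version-checked write that refreshes its cached state after a conflict instead of retrying with the same stale version. This is not the SOLR-7869 patch and does not use Solr or ZooKeeper classes: `VersionedNode`, `CachingWriter`, and the local `BadVersionException` are hypothetical in-memory stand-ins, and the bounded retry count is an assumption made for illustration.

```java
import java.util.function.UnaryOperator;

class VersionedWriteSketch {

    /**
     * Hypothetical stand-in for ZooKeeper's BadVersionException.
     * Unchecked here to keep the sketch short; the real ZooKeeper exception is checked.
     */
    static class BadVersionException extends RuntimeException {
        BadVersionException(String msg) { super(msg); }
    }

    /** Hypothetical in-memory stand-in for one versioned node such as /clusterstate.json. */
    static class VersionedNode {
        private String data;
        private int version; // starts at 0 and increases by one on every successful write

        VersionedNode(String data) { this.data = data; }

        synchronized String getData() { return data; }
        synchronized int getVersion() { return version; }

        /** Conditional write: succeeds only when expectedVersion matches the stored version. */
        synchronized int setData(String newData, int expectedVersion) {
            if (expectedVersion != version) {
                throw new BadVersionException(
                    "expected version " + expectedVersion + " but node is at " + version);
            }
            data = newData;
            return ++version;
        }
    }

    /** Writer that caches the node's data and version, as a state updater might. */
    static class CachingWriter {
        private final VersionedNode node;
        private String cachedData;
        private int cachedVersion;

        CachingWriter(VersionedNode node) {
            this.node = node;
            refresh();
        }

        /** Re-reads data and version together so the cache is consistent. */
        void refresh() {
            synchronized (node) {
                cachedData = node.getData();
                cachedVersion = node.getVersion();
            }
        }

        /** Applies the update; on a version conflict, refreshes the cache and retries. */
        int apply(UnaryOperator<String> update, int maxAttempts) {
            if (maxAttempts < 1) {
                throw new IllegalArgumentException("maxAttempts must be at least 1");
            }
            BadVersionException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    String next = update.apply(cachedData);
                    cachedVersion = node.setData(next, cachedVersion);
                    cachedData = next;
                    return cachedVersion;
                } catch (BadVersionException e) {
                    last = e;
                    refresh(); // drop the stale version; retrying without this would fail forever
                }
            }
            throw last; // give up after a bounded number of attempts instead of looping
        }
    }

    public static void main(String[] args) {
        VersionedNode node = new VersionedNode("a");
        CachingWriter writer = new CachingWriter(node);
        node.setData("external", 0); // simulate an external edit behind the writer's back
        int version = writer.apply(s -> s + "+down", 3);
        System.out.println(version + " " + node.getData());
    }
}
```

In this sketch the first attempt fails because the writer still holds version 0, the refresh picks up the externally written state, and the second attempt succeeds. Omitting the `refresh()` call reproduces the pattern in the log: every retry presents the same stale version and fails the same way.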
      

      Attachments

        1. SOLR-7869.patch
          23 kB
          Shalin Shekhar Mangar
        2. SOLR-7869.patch
          23 kB
          Shalin Shekhar Mangar
        3. SOLR-7869.patch
          8 kB
          Shalin Shekhar Mangar
        4. SOLR-7869.patch
          3 kB
          Scott Blum

          People

            Shalin Shekhar Mangar
