Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Version: 5.2.1
Description
If /clusterstate.json is modified externally, the Overseer can enter an infinite loop after a BadVersionException, alternately trying to process the main queue and then the work queue:
ERROR - 2015-08-04 18:49:56.224; [ ] org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer work queue loop
org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /clusterstate.json
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1270)
    at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:362)
    at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:359)
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
    at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:359)
    at org.apache.solr.cloud.overseer.ZkStateWriter.writePendingUpdates(ZkStateWriter.java:180)
    at org.apache.solr.cloud.overseer.ZkStateWriter.enqueueUpdate(ZkStateWriter.java:67)
    at org.apache.solr.cloud.Overseer$ClusterStateUpdater.processQueueItem(Overseer.java:286)
    at org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:168)
    at java.lang.Thread.run(Thread.java:745)
INFO - 2015-08-04 18:49:56.224; [ ] org.apache.solr.cloud.Overseer$ClusterStateUpdater; processMessage: queueSize: 1, message = {
  "operation":"state",
  "state":"down",
  "base_url":"http://127.0.1.1:7574/solr",
  "core":"test_shard1_replica1",
  "roles":null,
  "node_name":"127.0.1.1:7574_solr",
  "shard":null,
  "collection":"test",
  "core_node_name":"core_node1"} current state version: 9
INFO - 2015-08-04 18:49:56.224; [ ] org.apache.solr.cloud.overseer.ReplicaMutator; Update state numShards=null message={
  "operation":"state",
  "state":"down",
  "base_url":"http://127.0.1.1:7574/solr",
  "core":"test_shard1_replica1",
  "roles":null,
  "node_name":"127.0.1.1:7574_solr",
  "shard":null,
  "collection":"test",
  "core_node_name":"core_node1"}
INFO - 2015-08-04 18:49:56.224; [ ] org.apache.solr.cloud.overseer.ReplicaMutator; shard=shard1 is already registered
ERROR - 2015-08-04 18:49:56.225; [ ] org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer main queue loop
org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /clusterstate.json
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1270)
    at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:362)
    at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:359)
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
    at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:359)
    at org.apache.solr.cloud.overseer.ZkStateWriter.writePendingUpdates(ZkStateWriter.java:180)
    at org.apache.solr.cloud.overseer.ZkStateWriter.enqueueUpdate(ZkStateWriter.java:67)
    at org.apache.solr.cloud.Overseer$ClusterStateUpdater.processQueueItem(Overseer.java:286)
    at org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:213)
    at java.lang.Thread.run(Thread.java:745)
INFO - 2015-08-04 18:49:56.225; [ ] org.apache.solr.common.cloud.ZkStateReader; Updating data for gettingstarted to ver 8
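The failure mode in the log above is a conditional (versioned) write that keeps being retried with a stale expected version. The following is a minimal, self-contained sketch of that pattern, not Solr's or ZooKeeper's actual code: VersionedNode is a hypothetical in-memory stand-in for a znode, and writeWithStaleCache and writeWithRefresh are illustrative names. It assumes only that a versioned write is rejected when the expected version does not match the stored one, which is what BadVersion signals.

```java
public class VersionedWriteSketch {

    /** Hypothetical stand-in for a znode: data plus a version that increments on every write. */
    static final class VersionedNode {
        private String data = "{}";
        private int version = 0;

        synchronized int version() { return version; }
        synchronized String data() { return data; }

        /** Conditional write: applied only if expectedVersion matches the stored version. */
        synchronized boolean setData(String newData, int expectedVersion) {
            if (expectedVersion != version) {
                return false; // the analogue of a BadVersion error
            }
            data = newData;
            version++;
            return true;
        }
    }

    /**
     * Retries with the same cached version every time. Once the cache is stale,
     * no attempt can succeed. Returns the attempt number that succeeded, or -1.
     */
    static int writeWithStaleCache(VersionedNode node, int cachedVersion, String data, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (node.setData(data, cachedVersion)) {
                return attempt;
            }
        }
        return -1;
    }

    /**
     * Re-reads the current version after a rejected write before retrying.
     * Returns the attempt number that succeeded, or -1.
     */
    static int writeWithRefresh(VersionedNode node, int cachedVersion, String data, int maxAttempts) {
        int expected = cachedVersion;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (node.setData(data, expected)) {
                return attempt;
            }
            expected = node.version(); // refresh the stale expectation
        }
        return -1;
    }

    public static void main(String[] args) {
        VersionedNode node = new VersionedNode();
        int cached = node.version();             // the writer caches version 0
        node.setData("{\"external\":true}", 0);  // an external edit bumps the version to 1

        System.out.println(writeWithStaleCache(node, cached, "{\"state\":\"down\"}", 5)); // -1
        System.out.println(writeWithRefresh(node, cached, "{\"state\":\"down\"}", 5));    // 2
    }
}
```

In this sketch the refreshing writer simply overwrites the external edit. A real writer would presumably need to re-read the data as well as the version and re-apply its pending changes on top of it; the sketch only shows why retrying without refreshing the expected version cannot make progress.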