  1. Solr
  2. SOLR-8843

Missing fallback for NoChildrenForEphemeralsException on ZkController.getLeaderPropsWithFallback for rolling upgrade

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 5.4.1, 5.5
    • Fix Version/s: None
    • Component/s: SolrCloud
    • Labels: None

    Description

      When doing a rolling upgrade from 5.3.2 to 5.4.1 (or 5.5.0), leader election fails with the following error (NoChildrenForEphemeralsException):

      ERROR org.apache.solr.cloud.ShardLeaderElectionContext  [c:collection s:shard1 r:core_node1 x:collection_shard1_replica1] – There was a problem trying to register as the leader:org.apache.solr.common.SolrException: Could not register as the leader because creating the ephemeral registration node in ZooKeeper failed
          at org.apache.solr.cloud.ShardLeaderElectionContextBase.runLeaderProcess(ElectionContext.java:214)
          at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:406)
          at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:198)
          at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:158)
          at org.apache.solr.cloud.LeaderElector.access$200(LeaderElector.java:59)
          at org.apache.solr.cloud.LeaderElector$ElectionWatcher.process(LeaderElector.java:389)
          at org.apache.solr.common.cloud.SolrZkClient$3$1.run(SolrZkClient.java:264)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
          at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:232)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
          at java.lang.Thread.run(Thread.java:745)
      Caused by: org.apache.zookeeper.KeeperException$NoChildrenForEphemeralsException: KeeperErrorCode = NoChildrenForEphemerals
          at org.apache.zookeeper.KeeperException.create(KeeperException.java:117)
          at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
          at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:915)
          at org.apache.solr.common.cloud.SolrZkClient$11.execute(SolrZkClient.java:570)
          at org.apache.solr.common.cloud.SolrZkClient$11.execute(SolrZkClient.java:567)
          at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
          at org.apache.solr.common.cloud.SolrZkClient.multi(SolrZkClient.java:567)
          at org.apache.solr.cloud.ShardLeaderElectionContextBase$1.execute(ElectionContext.java:197)
          at org.apache.solr.common.util.RetryUtil.retryOnThrowable(RetryUtil.java:50)
          at org.apache.solr.common.util.RetryUtil.retryOnThrowable(RetryUtil.java:43)
          at org.apache.solr.cloud.ShardLeaderElectionContextBase.runLeaderProcess(ElectionContext.java:179)
          ... 12 more
      

      A similar issue was resolved in SOLR-8561, but that fix does not handle the NoChildrenForEphemeralsException case.
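
      To illustrate the failure mode, here is a minimal in-memory sketch of the ZooKeeper rule that ephemeral nodes cannot have children. It is not Solr or ZooKeeper code: the class, the exception stand-in, the Result enum and tryRegisterLeader are all hypothetical names, and the path layout (an older node holding an ephemeral node at the shard's leader path while an upgraded node tries to create a child named "leader" beneath it) is an assumption about what happens during the rolling upgrade, not something stated in this report. The handling in the catch block is only one possible shape for a fallback.

```java
import java.util.HashMap;
import java.util.Map;

public class EphemeralParentSketch {
    /** Stand-in for KeeperException.NoChildrenForEphemeralsException (unchecked here for brevity). */
    static class NoChildrenForEphemerals extends RuntimeException {}

    /** Hypothetical outcome of a registration attempt. */
    enum Result { REGISTERED, OLD_FORMAT_LEADER_PRESENT }

    // path -> true if the node is ephemeral
    private final Map<String, Boolean> nodes = new HashMap<>();

    /** Creates a node; rejects it if the parent is ephemeral. Parent existence is not checked in this toy model. */
    void create(String path, boolean ephemeral) {
        int idx = path.lastIndexOf('/');
        String parent = idx > 0 ? path.substring(0, idx) : "/";
        if (Boolean.TRUE.equals(nodes.get(parent))) {
            throw new NoChildrenForEphemerals();
        }
        nodes.put(path, ephemeral);
    }

    /** Assumed new-style registration: an ephemeral child named "leader" under the shard's leader path. */
    Result tryRegisterLeader(String shardLeaderPath) {
        try {
            create(shardLeaderPath + "/leader", true);
            return Result.REGISTERED;
        } catch (NoChildrenForEphemerals e) {
            // Hypothetical handling: report the condition instead of failing the election outright.
            return Result.OLD_FORMAT_LEADER_PRESENT;
        }
    }

    public static void main(String[] args) {
        String shard = "/collections/collection/leaders/shard1";

        EphemeralParentSketch mixed = new EphemeralParentSketch();
        mixed.create(shard, true); // assumed: old-version node registered as an ephemeral node
        System.out.println(mixed.tryRegisterLeader(shard)); // OLD_FORMAT_LEADER_PRESENT

        EphemeralParentSketch upgraded = new EphemeralParentSketch();
        upgraded.create(shard, false); // persistent parent, children allowed
        System.out.println(upgraded.tryRegisterLeader(shard)); // REGISTERED
    }
}
```

      In a real cluster the exception surfaces from the ZooKeeper multi call shown in the stack trace above; what the caller should do once it detects this condition (wait, retry, or something else) is left open here.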

            People

              Assignee: Unassigned
              Reporter: Enrico Hartung
              Votes: 0
              Watchers: 1
