  1. Solr
  2. SOLR-9471

Another race condition in ClusterStatus.getClusterStatus


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 6.1
    • Fix Version/s: None
    • Component/s: SolrCloud
    • Labels: None

    Description

      Reading cluster state information using /collections?action=CLUSTERSTATUS can fail if there's a concurrent deletion of a collection with its configset.

      The code in ClusterStatus.getClusterStatus

      • gets collection names
      • for every collection, reads its "config name" from ZooKeeper

      The problem is that if a collection and its configset are deleted concurrently between these two steps, ClusterState.getCollection can fail, which causes the whole operation to fail. It would be better to catch ZooKeeper's NoNodeException for this particular case and handle it somehow (can we ignore this collection right away, or should we re-check?).

      Error loading config name for collection test (500)
      Trace: org.apache.solr.common.SolrException: Error loading config name for collection test
      	at org.apache.solr.common.cloud.ZkStateReader.readConfigName(ZkStateReader.java:196)
      	at org.apache.solr.handler.admin.ClusterStatus.getClusterStatus(ClusterStatus.java:141)
      	at org.apache.solr.handler.admin.CollectionsHandler$CollectionOperation$21.call(CollectionsHandler.java:695)
      ...
      Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/test
      	at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
      	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
      	at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
      	at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:348)
      	at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:345)
      	at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
      	at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:345)
      	at org.apache.solr.common.cloud.ZkStateReader.readConfigName(ZkStateReader.java:178)
      	... 32 more
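
The handling suggested in the description could look roughly like the sketch below. It is not the Solr code: `FakeZk` is a hypothetical in-memory stand-in for the ZooKeeper tree, the `NoNodeException` class here only imitates `org.apache.zookeeper.KeeperException.NoNodeException`, and `ClusterStatusSketch.readConfigNames` is an invented name. It takes the "ignore this collection right away" option; re-checking the collection list would be the alternative raised in the description.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Stand-in for ZooKeeper's NoNodeException (assumption, for illustration only). */
class NoNodeException extends Exception {
    NoNodeException(String path) {
        super("KeeperErrorCode = NoNode for " + path);
    }
}

/** Hypothetical in-memory substitute for the ZooKeeper tree; not a Solr or ZooKeeper API. */
class FakeZk {
    private final Map<String, String> nodes = new ConcurrentHashMap<>();

    void create(String path, String data) {
        nodes.put(path, data);
    }

    void delete(String path) {
        nodes.remove(path);
    }

    /** Returns the data stored at the path, or throws if the node does not exist. */
    String getData(String path) throws NoNodeException {
        String data = nodes.get(path);
        if (data == null) {
            throw new NoNodeException(path);
        }
        return data;
    }
}

class ClusterStatusSketch {
    /**
     * Reads the config name of every listed collection. A collection whose node
     * vanished after the names were listed is skipped instead of failing the call.
     */
    static Map<String, String> readConfigNames(FakeZk zk, List<String> collectionNames) {
        Map<String, String> configNames = new LinkedHashMap<>();
        for (String name : collectionNames) {
            try {
                configNames.put(name, zk.getData("/collections/" + name));
            } catch (NoNodeException e) {
                // Deleted concurrently between listing and reading: ignore this collection.
            }
        }
        return configNames;
    }
}
```

With this shape, a collection deleted after the names were gathered is simply missing from the returned map, and the remaining collections are still reported.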
      


          People

            Assignee: Unassigned
            Reporter: Alexey Serba (alexey.serba)
            Votes: 0
            Watchers: 2
