SOLR-7940

[CollectionAPI] Frequent Cluster Status timeout

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.10.2
    • Fix Version/s: 5.3
    • Component/s: SolrCloud
    • Labels: None
    • Environment: Ubuntu on Azure

    Description

      Very often we get a timeout when we call http://server2:8080/solr/admin/collections?action=CLUSTERSTATUS&wt=json:

      {"responseHeader": 
      {"status": 500,
      "QTime": 180100},
      "error": 
      {"msg": "CLUSTERSTATUS the collection time out:180s",
      "trace": "org.apache.solr.common.SolrException: CLUSTERSTATUS the collection time out:180s\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:368)\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:320)\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleClusterStatus(CollectionsHandler.java:640)\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:220)\n\tat org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)\n\tat org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:729)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:267)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1338)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:350)\n\tat org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)\n\tat org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)\n\tat org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)\n\tat org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:630)\n\tat org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)\n\tat org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:77)\n\tat org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:606)\n\tat org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:46)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:603)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:538)\n\tat java.lang.Thread.run(Thread.java:745)\n",
      "code": 500}}
      

      The cluster has 3 Solr nodes with 6 small collections replicated on all nodes.
      We were using this API to monitor cluster state, but it was failing every 10 minutes. We switched to using ZkStateReader in CloudSolrServer (see the sketch below) and it has been working for a day without problems.
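
      A minimal sketch of that workaround against the SolrJ 4.x API (the ZooKeeper connect string below is a placeholder for our ensemble):

      import org.apache.solr.client.solrj.impl.CloudSolrServer;
      import org.apache.solr.common.cloud.ClusterState;
      import org.apache.solr.common.cloud.Replica;
      import org.apache.solr.common.cloud.Slice;
      import org.apache.solr.common.cloud.ZkStateReader;

      public class ZkClusterStateMonitor {
          public static void main(String[] args) throws Exception {
              // Placeholder ZooKeeper connect string; use your own ensemble
              CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
              server.connect();
              // Reads the cluster state cached from ZooKeeper watches,
              // so no Collections API round trip is involved
              ZkStateReader reader = server.getZkStateReader();
              ClusterState state = reader.getClusterState();
              System.out.println("Live nodes: " + state.getLiveNodes());
              for (String collection : state.getCollections()) {
                  for (Slice slice : state.getSlices(collection)) {
                      for (Replica replica : slice.getReplicas()) {
                          System.out.println(collection + "/" + slice.getName() + " "
                              + replica.getName() + " state="
                              + replica.getStr(ZkStateReader.STATE_PROP));
                      }
                  }
              }
              server.shutdown();
          }
      }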

      Is there some kind of deadlock, since this call was being made on all three nodes concurrently?

    People

    • Assignee: Unassigned
    • Reporter: Stephan Lagraulet (stephlag)
    • Votes: 0
    • Watchers: 2
