Solr / SOLR-4165

Queries blocked when stopping a node

    Details

    • Type: Bug
    • Status: Reopened
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 5.0
    • Fix Version/s: 4.9, 5.0
    • Component/s: search, SolrCloud
    • Labels:
      None
    • Environment:

      5.0-SNAPSHOT 1366361:1420056M - markus - 2012-12-11 11:52:06

      Description

      Our 10-node test cluster (10 shards, 20 cores) blocks incoming queries briefly when a node is stopped gracefully, and blocks queries again for at least a few seconds when the node is started again.

      We're using siege to send roughly 10 queries per second to a pair of load balancers. Those load balancers ping (admin/ping) each node every few hundred milliseconds. The ping queries continue to operate normally while the requests to our main request handler are blocked. A manual request sent directly to a live Solr node is also blocked for the same duration.

      There are no errors logged, but it is clear that the entire cluster blocks queries as soon as the starting node is reading its config from ZooKeeper, likely even slightly earlier.

      The blocking time when stopping a node varies between 1 and 5 seconds. The blocking time when starting a node varies between 10 and 30 seconds. The blocked queries come rushing in again after a queue of ping requests is served. The ping request sets the main request handler via the qt parameter.

      UPDATE:
      Since SOLR-3655, queries are no longer blocked when starting a node. They are still blocked for a few seconds when stopping a node, as observed with Solr 5.0.0.2013.02.15.13.26.04.
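
      The measurement setup described above can be approximated without siege or the load balancers. The following is a minimal sketch in Python, not the reporter's actual tooling: it probes one query URL at roughly 10 requests per second and groups slow or failed responses into stall windows. The function names (probe, find_stalls, run), the 1-second threshold, and any URL passed in are assumptions made for illustration.

```python
import http.client
import time
import urllib.request
from typing import List, Tuple


def probe(url: str, timeout: float = 60.0) -> float:
    """Return the wall-clock seconds one GET took, or infinity if it failed."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    except (OSError, http.client.HTTPException):
        # Connection errors, HTTP errors and timeouts all count as a failed probe.
        return float("inf")
    return time.monotonic() - start


def find_stalls(
    samples: List[Tuple[float, float]], threshold: float
) -> List[Tuple[float, float]]:
    """Group consecutive slow samples into (first_sent_at, last_completed_at) windows.

    samples holds (sent_at, latency) pairs in seconds, ordered by sent_at.
    A sample is slow when its latency is at least `threshold`.
    """
    stalls: List[Tuple[float, float]] = []
    current = None
    for sent_at, latency in samples:
        if latency >= threshold:
            end = sent_at + latency
            if current is None:
                current = [sent_at, end]
            else:
                current[1] = max(current[1], end)
        elif current is not None:
            # A fast response closes the current stall window.
            stalls.append((current[0], current[1]))
            current = None
    if current is not None:
        stalls.append((current[0], current[1]))
    return stalls


def run(
    url: str, rate_hz: float = 10.0, duration: float = 60.0, threshold: float = 1.0
) -> List[Tuple[float, float]]:
    """Probe `url` sequentially at about `rate_hz` for `duration` seconds."""
    samples: List[Tuple[float, float]] = []
    t0 = time.monotonic()
    while time.monotonic() - t0 < duration:
        sent_at = time.monotonic() - t0
        latency = probe(url)
        samples.append((sent_at, latency))
        # Sleep only for what is left of this probe's time slot.
        time.sleep(max(0.0, 1.0 / rate_hz - latency))
    return find_stalls(samples, threshold)
```

      To use it, call run() with the query URL of a live node while another node is being stopped or started. Each returned pair gives the start and end of one blocked period, in seconds since the run began. Because the probes are sequential, a blocked query shows up as one long sample rather than many, which is enough to measure the blocking time but will not reproduce the queueing effect seen behind the load balancers.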

        Issue Links

          Activity

          Markus Jelsma created issue -
          Mark Miller made changes -
          Field Original Value New Value
          Link This issue relates to SOLR-3655 [ SOLR-3655 ]
          Mark Miller made changes -
          Assignee Mark Miller [ markrmiller@gmail.com ]
          Mark Miller made changes -
          Fix Version/s 4.1 [ 12321141 ]
          Mark Miller made changes -
          Fix Version/s 4.2 [ 12323893 ]
          Fix Version/s 4.1 [ 12321141 ]
          Mark Miller made changes -
          Link This issue relates to SOLR-3655 [ SOLR-3655 ]
          Mark Miller made changes -
          Link This issue duplicates SOLR-3655 [ SOLR-3655 ]
          Mark Miller made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Duplicate [ 3 ]
          Markus Jelsma made changes -
          Resolution Duplicate [ 3 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Markus Jelsma made changes -
          Summary: "Queries blocked when stopping and starting a node" changed to "Queries blocked when stopping a node"
          Description: UPDATE paragraph about SOLR-3655 appended; the rest of the text is unchanged from the Description above.
          Mark Miller made changes -
          Link This issue is duplicated by SOLR-4421 [ SOLR-4421 ]
          Robert Muir made changes -
          Fix Version/s 4.3 [ 12324128 ]
          Fix Version/s 5.0 [ 12321664 ]
          Fix Version/s 4.2 [ 12323893 ]
          Uwe Schindler made changes -
          Fix Version/s 4.4 [ 12324324 ]
          Fix Version/s 4.3 [ 12324128 ]
          Steve Rowe made changes -
          Fix Version/s 5.0 [ 12321664 ]
          Fix Version/s 4.5 [ 12324743 ]
          Fix Version/s 4.4 [ 12324324 ]
          Adrien Grand made changes -
          Fix Version/s 4.6 [ 12325000 ]
          Fix Version/s 5.0 [ 12321664 ]
          Fix Version/s 4.5 [ 12324743 ]
          Uwe Schindler made changes -
          Fix Version/s 4.7 [ 12325573 ]
          Fix Version/s 4.6 [ 12325000 ]
          David Smiley made changes -
          Fix Version/s 4.8 [ 12326254 ]
          Fix Version/s 4.7 [ 12325573 ]
          Uwe Schindler made changes -
          Fix Version/s 4.9 [ 12326731 ]
          Fix Version/s 5.0 [ 12321664 ]
          Fix Version/s 4.8 [ 12326254 ]

            People

            • Assignee:
              Mark Miller
              Reporter:
              Markus Jelsma
            • Votes:
              0
              Watchers:
              4
