Solr
  1. Solr
  2. SOLR-2308

Race condition still exists in StreamingUpdateSolrServer which could cause it to hang

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.1
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: clients - java
    • Labels:
      None

      Description

      We are still seeing the same issue as SOLR-1711 & SOLR-1885 with Solr1.4.1
      We get into this situation when all the runner threads die due to a broken pipe, while the BlockingQueue is still full. All of the producer threads are all blocked on the BlockingQueue.put() method. Since the runners are spawned by the producers, which are all blocked, runner threads never get created to drain the queue.

      Here's a potential fix. In the runner code, replace these lines:

      // remove it from the list of running things...
      synchronized (runners) {
          runners.remove( this );
      }
      

      with these lines:

      // remove it from the list of running things unless we are the last runner and the queue is full...
      synchronized (runners) {
          if (runners.size() == 1 && queue.remainingCapacity() == 0) {
              // keep this runner alive
              scheduler.execute(this);
          } else {
              runners.remove( this );
          }
      }
      

        Issue Links

          Activity

          Hide
          Jan Høydahl added a comment -

          Linking to the issue that actually got committed

          Show
          Jan Høydahl added a comment - Linking to the issue that actually got committed
          Hide
          Grant Ingersoll added a comment -

          Bulk close for 3.1.0 release

          Show
          Grant Ingersoll added a comment - Bulk close for 3.1.0 release
          Hide
          Yonik Seeley added a comment -

          Oops, I saw Johannes fix posted in SOLR-1744 and didn't know this issue existed.
          Anyway, this should now be fixed in 3x and trunk.

          Show
          Yonik Seeley added a comment - Oops, I saw Johannes fix posted in SOLR-1744 and didn't know this issue existed. Anyway, this should now be fixed in 3x and trunk.
          Hide
          Russell Teabeault added a comment - - edited

          It seems that this condition happens when HttpClient.executeMethod(method) is called in the Runner and it throws an exception. This would cause the runner to get removed from the runners queue. And if this happened quickly for all the runners then callers to the request method would block trying to add the request to the blocking queue if the request queue was full.

          Would another possible solution be to add a catch for the try block where HttpClient.executeMethod(method) is called?

              int statusCode = getHttpClient().executeMethod(method);
              if (statusCode != HttpStatus.SC_OK) {
                  StringBuilder msg = new StringBuilder();
                  msg.append( method.getStatusLine().getReasonPhrase() );
                  msg.append( "\n\n" );
                  msg.append( method.getStatusText() );
                  msg.append( "\n\n" );
                  msg.append( "request: "+method.getURI() );
                  handleError( new Exception( msg.toString() ) );
               }
           }  /** catch added here */
           catch(Exception e) {
               handleError( e);
           } finally {
             try {
                // make sure to release the connection
                 if(method != null)
                     method.releaseConnection();
                }
                catch( Exception ex ){}
          }
          
          Show
          Russell Teabeault added a comment - - edited It seems that this condition happens when HttpClient.executeMethod(method) is called in the Runner and it throws an exception. This would cause the runner to get removed from the runners queue. And if this happened quickly for all the runners then callers to the request method would block trying to add the request to the blocking queue if the request queue was full. Would another possible solution be to add a catch for the try block where HttpClient.executeMethod(method) is called? int statusCode = getHttpClient().executeMethod(method); if (statusCode != HttpStatus.SC_OK) { StringBuilder msg = new StringBuilder(); msg.append( method.getStatusLine().getReasonPhrase() ); msg.append( "\n\n" ); msg.append( method.getStatusText() ); msg.append( "\n\n" ); msg.append( "request: " +method.getURI() ); handleError( new Exception( msg.toString() ) ); } } /** catch added here */ catch (Exception e) { handleError( e); } finally { try { // make sure to release the connection if (method != null ) method.releaseConnection(); } catch ( Exception ex ){} }

            People

            • Assignee:
              Unassigned
              Reporter:
              Johannes
            • Votes:
              1 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development