  1. Solr
  2. SOLR-6983

SocketExceptions no longer trigger retries when processing distributed updates

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 4.7
    • Fix Version/s: None
    • Component/s: SolrCloud
    • Labels: None

      Description

      Our production Solr cluster frequently places replicas into leader-initiated recovery when a "java.net.SocketException: Connection reset" is thrown while processing distributed updates.

      This problem surfaced after upgrading from Solr 4.6.1 to Solr 4.10.2. In the old version, the request was retried whenever a SocketException was encountered while a leader was updating a replica. After the upgrade to Solr 4.10.2, no such retry is attempted.

      Here is an example stack trace:

      2015-01-11 09:38:00.913 [updateExecutor-1-thread-35734] ERROR org.apache.solr.update.StreamingSolrServers  – error
      java.net.SocketException: Connection reset
              at java.net.SocketInputStream.read(SocketInputStream.java:196)
              at java.net.SocketInputStream.read(SocketInputStream.java:122)
              at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
              at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
              at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
              at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
              at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
              at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
              at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
              at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
              at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
              at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
              at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
              at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682)
              at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486)
              at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
              at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
              at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
              at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
              at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)
      2015-01-11 09:38:00.917 [qtp268575911-3616964] WARN  org.apache.solr.update.processor.DistributedUpdateProcessor  – Error sending update
      java.net.SocketException: Connection reset
              at java.net.SocketInputStream.read(SocketInputStream.java:196)
              at java.net.SocketInputStream.read(SocketInputStream.java:122)
              at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
              at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
              at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
              at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
              at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
              at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
              at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
              at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
              at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
              at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
              at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
              at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682)
              at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486)
              at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
              at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
              at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
              at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
              at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)
      2015-01-11 09:38:00.958 [qtp268575911-3616964] ERROR org.apache.solr.update.processor.DistributedUpdateProcessor  – Setting up to try to start recovery on replica http://solr26:8983/solr/listings/ after: java.net.SocketException: Connection reset
      

      The underlying SocketException may be related to SOLR-6931 and the disabling of the HttpClient stale connection check.

      It appears that the retry logic was removed from SolrCmdDistributor starting with 4.7 to address SOLR-5509. See https://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/update/SolrCmdDistributor.java?r1=1546672&r2=1546164&pathrev=1546672

      Is there any way the retry logic could be restored for Solr 4.10.x?
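
      To illustrate the kind of behavior being asked for, here is a minimal, self-contained sketch of retrying a request only when a SocketException is thrown. It is not the code that was removed from SolrCmdDistributor; the class name, the callWithRetry helper, and the maxRetries and backoffMs parameters are all hypothetical, and it uses Java 8 lambdas for brevity.

```java
import java.net.SocketException;
import java.util.concurrent.Callable;

// Hypothetical illustration only; not taken from the Solr code base.
public class RetryOnSocketException {

    /**
     * Runs the request and retries it only when a SocketException is thrown.
     * Any other exception propagates immediately. After maxRetries failed
     * retries, the last SocketException is rethrown to the caller.
     */
    static <T> T callWithRetry(Callable<T> request, int maxRetries, long backoffMs)
            throws Exception {
        int attempt = 0;
        while (true) {
            try {
                return request.call();
            } catch (SocketException e) {
                if (attempt >= maxRetries) {
                    throw e; // retries exhausted, let the caller decide what to do
                }
                attempt++;
                // Simple linear backoff before the next attempt (an assumption).
                Thread.sleep(backoffMs * attempt);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated request: the first call fails with a connection reset,
        // the second one succeeds.
        String result = callWithRetry(() -> {
            if (calls[0]++ == 0) {
                throw new SocketException("Connection reset");
            }
            return "ok";
        }, 3, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

      With something along these lines in the leader-to-replica update path, a single connection reset would lead to another attempt first, and the replica would only be put into recovery once the retries are exhausted. Whether blindly resending an update is safe in every case is a separate question.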

              People

              • Assignee: Unassigned
              • Reporter: lmartin Lisa Martin