Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-10698

StreamHandler should allow connections to be closed early

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.6.1, 7.0, master (8.0)
    • Component/s: None
    • Security Level: Public (Default Security Level. Issues are Public)
    • Labels:
      None

      Description

      Before a stream is drained out, if we call close() we get an exception like this:

      at
      org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:215)
      at
      org.apache.http.impl.io.ChunkedInputStream.close(ChunkedInputStream.java:316)
      at
      org.apache.http.impl.execchain.ResponseEntityProxy.streamClosed(ResponseEntityProxy.java:128)
      at
      org.apache.http.conn.EofSensorInputStream.checkClose(EofSensorInputStream.java:228)
      at
      org.apache.http.conn.EofSensorInputStream.close(EofSensorInputStream.java:174)
      at sun.nio.cs.StreamDecoder.implClose(StreamDecoder.java:378)
      at sun.nio.cs.StreamDecoder.close(StreamDecoder.java:193)
      at java.io.InputStreamReader.close(InputStreamReader.java:199)
      at
      org.apache.solr.client.solrj.io.stream.JSONTupleStream.close(JSONTupleStream.java:91)
      at
      org.apache.solr.client.solrj.io.stream.SolrStream.close(SolrStream.java:186)
      
      

      As quoted from https://www.mail-archive.com/solr-user@lucene.apache.org/msg130676.html the problem seems to when we hit an exception the /steam handler does not close the stream.

      1. SOLR-10698.patch
        4 kB
        Joel Bernstein

        Activity

        Hide
        joel.bernstein Joel Bernstein added a comment -

        I started to dig into this ticket. What I found straight off is that when a client disconnects, the streaming code continues to write data to the outputstream and and an exception is never thrown. The expected behavior would be to receive an org.eclipse.jetty.io.EofException. I'm currently investigating how this could be happening. One possible reason for this is that the exception is being swallowed somewhere in Jetty's outputstream stack, so that client disconnects don't throw exceptions. I'll dig into this and see what I find.

        Show
        joel.bernstein Joel Bernstein added a comment - I started to dig into this ticket. What I found straight off is that when a client disconnects, the streaming code continues to write data to the outputstream and and an exception is never thrown. The expected behavior would be to receive an org.eclipse.jetty.io.EofException. I'm currently investigating how this could be happening. One possible reason for this is that the exception is being swallowed somewhere in Jetty's outputstream stack, so that client disconnects don't throw exceptions. I'll dig into this and see what I find.
        Hide
        joel.bernstein Joel Bernstein added a comment -

        Update on this ticket. It turned out the problem had nothing to do with jetty. The issue is that the HttpClient continues to consume the stream even after the inputstream from the httpclient has been closed. Varun Thacker, noticed the odd looking read following the close in the stack trace. This is the ChunkedInputStream close behavior, which is to consume the entire stream when close is called.

        I'll post a patch shortly that closes the HttpResponse rather then the inputstream. The patch also catches all exceptions that occur while writing to the client and closes the underlying tuple stream.

        The effect of these two changes should be that when the client closes a TupleStream:

        1) The server writing the tuples will get a broken pipe exception and close it's underlying TupleStream.
        2) This will cause a chain reaction of broken pipes and stream closings that will eventually shut down all streams and stop the /export handler with a broken pipe exception.

        This patch has not yet been tested.

        Show
        joel.bernstein Joel Bernstein added a comment - Update on this ticket. It turned out the problem had nothing to do with jetty. The issue is that the HttpClient continues to consume the stream even after the inputstream from the httpclient has been closed. Varun Thacker , noticed the odd looking read following the close in the stack trace. This is the ChunkedInputStream close behavior, which is to consume the entire stream when close is called. I'll post a patch shortly that closes the HttpResponse rather then the inputstream. The patch also catches all exceptions that occur while writing to the client and closes the underlying tuple stream. The effect of these two changes should be that when the client closes a TupleStream: 1) The server writing the tuples will get a broken pipe exception and close it's underlying TupleStream. 2) This will cause a chain reaction of broken pipes and stream closings that will eventually shut down all streams and stop the /export handler with a broken pipe exception. This patch has not yet been tested.
        Hide
        joel.bernstein Joel Bernstein added a comment -

        In my local testing it looks like the patch resolves the issue. I'll do some more testing but it looks promising.

        Show
        joel.bernstein Joel Bernstein added a comment - In my local testing it looks like the patch resolves the issue. I'll do some more testing but it looks promising.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 02b1c8aa360c8c87bf4cc20b688d7993ec6d7b1b in lucene-solr's branch refs/heads/master from Joel Bernstein
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=02b1c8a ]

        SOLR-10698: StreamHandler should allow connections to be closed early

        Show
        jira-bot ASF subversion and git services added a comment - Commit 02b1c8aa360c8c87bf4cc20b688d7993ec6d7b1b in lucene-solr's branch refs/heads/master from Joel Bernstein [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=02b1c8a ] SOLR-10698 : StreamHandler should allow connections to be closed early
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit c71ce16bb90c5fc58acbc8f9ee2e74249f11f3ac in lucene-solr's branch refs/heads/master from Joel Bernstein
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c71ce16 ]

        SOLR-10698: Fix precommit

        Show
        jira-bot ASF subversion and git services added a comment - Commit c71ce16bb90c5fc58acbc8f9ee2e74249f11f3ac in lucene-solr's branch refs/heads/master from Joel Bernstein [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c71ce16 ] SOLR-10698 : Fix precommit
        Hide
        joel.bernstein Joel Bernstein added a comment -

        There is now a fix for this in master. We should discuss the backporting of this ticket. A couple of things to think about:

        1) Will there be a Lucene/Solr 6.7. Even though 7.0 is imminent I think a 6.7 would be useful for users that will not be able to immediately upgrade to 7.0.

        2) If there is not going to be a 6.7 do we need a bug fix release on 6.6 for this issue.

        3) There may be a need to backport this even further to a build prior to the refactoring for binary transport. If that's the case then it's not a straight backport and will involve different classes. We can discuss what classes need to be changed in this scenario.

        Show
        joel.bernstein Joel Bernstein added a comment - There is now a fix for this in master. We should discuss the backporting of this ticket. A couple of things to think about: 1) Will there be a Lucene/Solr 6.7. Even though 7.0 is imminent I think a 6.7 would be useful for users that will not be able to immediately upgrade to 7.0. 2) If there is not going to be a 6.7 do we need a bug fix release on 6.6 for this issue. 3) There may be a need to backport this even further to a build prior to the refactoring for binary transport. If that's the case then it's not a straight backport and will involve different classes. We can discuss what classes need to be changed in this scenario.
        Hide
        erickerickson Erick Erickson added a comment -

        My vote would be to back-port for 6.7. I suspect that there will be one just to wrap up the 6x code line with a bow. If a 6.6.x is released we can back port then I should think if we want.

        Show
        erickerickson Erick Erickson added a comment - My vote would be to back-port for 6.7. I suspect that there will be one just to wrap up the 6x code line with a bow. If a 6.6.x is released we can back port then I should think if we want.
        Hide
        joel.bernstein Joel Bernstein added a comment -

        +1. This should be a simple backport.

        Show
        joel.bernstein Joel Bernstein added a comment - +1. This should be a simple backport.
        Hide
        yseeley@gmail.com Yonik Seeley added a comment -

        My vote would be to back-port for 6.7. I suspect that there will be one just to wrap up the 6x code line with a bow.

        I agree - there's no reason any release needs to be particularly large, so having 6.7 as just a wrap-up release (probably concurrently with 7.0?) makes sense.

        Show
        yseeley@gmail.com Yonik Seeley added a comment - My vote would be to back-port for 6.7. I suspect that there will be one just to wrap up the 6x code line with a bow. I agree - there's no reason any release needs to be particularly large, so having 6.7 as just a wrap-up release (probably concurrently with 7.0?) makes sense.
        Hide
        erickerickson Erick Erickson added a comment -

        Grabbed to back-port to 6.7

        Show
        erickerickson Erick Erickson added a comment - Grabbed to back-port to 6.7
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit a8db3701e93bca385bf15058c1abdbfd786a88ae in lucene-solr's branch refs/heads/branch_6x from Joel Bernstein
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a8db370 ]

        SOLR-10698: StreamHandler should allow connections to be closed early

        (cherry picked from commit 02b1c8a)

        Show
        jira-bot ASF subversion and git services added a comment - Commit a8db3701e93bca385bf15058c1abdbfd786a88ae in lucene-solr's branch refs/heads/branch_6x from Joel Bernstein [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a8db370 ] SOLR-10698 : StreamHandler should allow connections to be closed early (cherry picked from commit 02b1c8a)
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit f9f64a9574d6da18a7395a22ce73c40438899c11 in lucene-solr's branch refs/heads/branch_6x from Joel Bernstein
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f9f64a9 ]

        SOLR-10698: Fix precommit

        (cherry picked from commit c71ce16)

        Show
        jira-bot ASF subversion and git services added a comment - Commit f9f64a9574d6da18a7395a22ce73c40438899c11 in lucene-solr's branch refs/heads/branch_6x from Joel Bernstein [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f9f64a9 ] SOLR-10698 : Fix precommit (cherry picked from commit c71ce16)
        Hide
        shalinmangar Shalin Shekhar Mangar added a comment -

        Just for some additional context – The close method tries to consume the entire stream because if you don't do that, the connection can no longer be re-used and must be discarded. So in normal cases, this behavior of consuming the stream is a good thing because it allows you the connection to go back to the pool and be reused. But obviously for long running requests that receive huge amounts of data from StreamHandler or ExportHandler, attempting to consume the stream fully on abort is a problem.

        Show
        shalinmangar Shalin Shekhar Mangar added a comment - Just for some additional context – The close method tries to consume the entire stream because if you don't do that, the connection can no longer be re-used and must be discarded. So in normal cases, this behavior of consuming the stream is a good thing because it allows you the connection to go back to the pool and be reused. But obviously for long running requests that receive huge amounts of data from StreamHandler or ExportHandler, attempting to consume the stream fully on abort is a problem.
        Hide
        varunthacker Varun Thacker added a comment -

        Hi Joel,

        Should we add a CHANGES.txt entry for this?

        Show
        varunthacker Varun Thacker added a comment - Hi Joel, Should we add a CHANGES.txt entry for this?
        Hide
        joel.bernstein Joel Bernstein added a comment -

        Yes, let's add this to CHANGES.txt. I'll be working on docs and updates to CHANGES for the 7.0 release today. I'll include this ticket.

        Show
        joel.bernstein Joel Bernstein added a comment - Yes, let's add this to CHANGES.txt. I'll be working on docs and updates to CHANGES for the 7.0 release today. I'll include this ticket.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 5786d092545f323c0c13adcb4633d949d08b1bd3 in lucene-solr's branch refs/heads/master from Joel Bernstein
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5786d09 ]

        SOLR-10698: Updated CHANGES.txt

        Show
        jira-bot ASF subversion and git services added a comment - Commit 5786d092545f323c0c13adcb4633d949d08b1bd3 in lucene-solr's branch refs/heads/master from Joel Bernstein [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5786d09 ] SOLR-10698 : Updated CHANGES.txt
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 92f3caf8cfc93b8525895d4d2fa9a2c383b90c08 in lucene-solr's branch refs/heads/branch_7_0 from Joel Bernstein
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=92f3caf ]

        SOLR-10698: Updated CHANGES.txt

        Show
        jira-bot ASF subversion and git services added a comment - Commit 92f3caf8cfc93b8525895d4d2fa9a2c383b90c08 in lucene-solr's branch refs/heads/branch_7_0 from Joel Bernstein [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=92f3caf ] SOLR-10698 : Updated CHANGES.txt
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 39ac9def8529347a3b0c3912b2797f4021ac72cb in lucene-solr's branch refs/heads/branch_7x from Joel Bernstein
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=39ac9de ]

        SOLR-10698: Updated CHANGES.txt

        Show
        jira-bot ASF subversion and git services added a comment - Commit 39ac9def8529347a3b0c3912b2797f4021ac72cb in lucene-solr's branch refs/heads/branch_7x from Joel Bernstein [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=39ac9de ] SOLR-10698 : Updated CHANGES.txt
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit f31c5a2906efb92900bf66373ecdd4d21ba4110e in lucene-solr's branch refs/heads/branch_6_6 from Varun Thacker
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f31c5a2 ]

        SOLR-10698: StreamHandler should allow connections to be closed early

        Show
        jira-bot ASF subversion and git services added a comment - Commit f31c5a2906efb92900bf66373ecdd4d21ba4110e in lucene-solr's branch refs/heads/branch_6_6 from Varun Thacker [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f31c5a2 ] SOLR-10698 : StreamHandler should allow connections to be closed early

          People

          • Assignee:
            erickerickson Erick Erickson
            Reporter:
            varunthacker Varun Thacker
          • Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development