Derby / DERBY-5271

Client may hang if the server crashes due to a java.lang.Error

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 10.1.3.1, 10.2.2.0, 10.3.3.0, 10.4.2.0, 10.5.3.0, 10.6.2.1, 10.7.1.1, 10.8.1.2, 10.9.1.0
    • Fix Version/s: 10.8.2.2, 10.9.1.0
    • Component/s: Network Server
    • Labels:
      None

      Description

      When certain types of errors are raised while the network server is processing a client request, the server is left in a semi-degraded state. The problem this issue is concerned with is that the client socket is kept open even though the server is in a degraded state (the server JVM is still alive). This causes the client to hang in a read call on the socket until the server JVM is killed.

      I'm able to reproduce this with an OOME being raised on the server.

      In my opinion, hanging when there is no chance of progression is bad behavior. Furthermore, it causes trouble for automated testing.
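
      To illustrate the failure mode described above, here is a minimal Java sketch. It is not Derby's code and not the committed patch: the class CloseOnErrorSketch and the methods processRequest, serve, and readAfterServerError are hypothetical names made up for this example. It assumes a worker thread that hits an Error while handling a request, and shows that closing the client socket before the thread dies lets the client's blocking read return instead of waiting forever.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class CloseOnErrorSketch {

    /** Hypothetical stand-in for request processing that fails with an Error. */
    static void processRequest(Socket client) {
        throw new OutOfMemoryError("simulated failure");
    }

    /** Hypothetical worker body: close the client socket if an Error escapes. */
    static void serve(Socket client) {
        try {
            processRequest(client);
        } catch (Error e) {
            // Close the socket first so the peer's blocking read ends,
            // then rethrow so the Error is not swallowed.
            try {
                client.close();
            } catch (IOException ignored) {
                // Nothing more can be done for this connection.
            }
            throw e;
        }
    }

    /** Returns what the client's read() yields after the worker has died. */
    static int readAfterServerError() throws Exception {
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client =
                 new Socket(server.getInetAddress(), server.getLocalPort())) {
            Socket accepted = server.accept();
            Thread worker = new Thread(() -> {
                try {
                    serve(accepted);
                } catch (Error expected) {
                    // The worker thread ends here; the JVM stays alive.
                }
            });
            worker.start();
            worker.join();
            // Safety net: without the close() in serve(), this read would
            // block until the timeout instead of returning.
            client.setSoTimeout(5000);
            return client.getInputStream().read();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("client read returned " + readAfterServerError());
    }
}
```

      With the close in place, the client's read should return -1 (end of stream) right away. If the close() call in serve() is removed, the same read blocks until the socket timeout fires, which mirrors the hang reported here.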

        Activity

        Kristian Waagan added a comment -

        Re-closing issue.

        Kathey Marsden added a comment -

        Reopening to fix the Affects Version field. I think this goes all the way back.

        Kristian Waagan added a comment -

        Closing issue.

        Kristian Waagan added a comment -

        Added two extra comments with revision 1160667.

        Kristian Waagan added a comment -

        Backported to 10.8 with revision 1160590.
        Resolving issue.

        Kristian Waagan added a comment -

        Committed patch 1a to trunk with revision 1158108.
        I plan to backport this fix.

        Regarding making the server hang, I can't guarantee that it won't happen. If it happens it would hopefully affect only the worker thread that crashed. Most of the Error subclasses are pretty serious and the JVM will come down in many cases. OOME isn't necessarily one of these - what happens depends on the nature of the shortage and in which threads an OOME is raised. Further, a single hung ClientThread may not be a problem, assuming that database resources (like locks) have been released.

        Dag H. Wanvik added a comment -

        Agreed. Just wondering, as I haven't studied the code: is there a chance that trying to close the session might make the server hang (and not get to throw the Error)?

        Knut Anders Hatlen added a comment -

        This looks like a reasonable fix. +1

        Kristian Waagan added a comment -

        Attaching an initial fix proposal with patch 1a to get the discussion started.
        It addresses the problem I encountered, where the client hung due to an OOME on the server (same machine/JVM). This was part of a test run, and I had to manually kill the JVM to get the test script to continue (I think the first time this happened the process was left untouched for more than 12 hours).


          People

          • Assignee: Kristian Waagan
          • Reporter: Kristian Waagan
          • Votes: 0
          • Watchers: 0
