Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-313

Threads in servers should not die silently.

    Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      If there is an uncaught exception, some threads in a server may die silently. The corresponding error message does not show up in the log.

        Issue Links

          Activity

          Hide
          Tsz Wo Nicholas Sze added a comment -

          We should register a Thread.UncaughtExceptionHandler. When there is an uncaught exception, the handler should

          • log the exception,
          • restart the corresponding thread if possible,
          • shutdown the server if necessary.
          Show
          Tsz Wo Nicholas Sze added a comment - We should register a Thread.UncaughtExceptionHandler. When there is an uncaught exception, the handler should log the exception, restart the corresponding thread if possible, shutdown the server if necessary.
          Hide
          Nigel Daley added a comment -

          And/or ensure (via code review) that Thread and Runnable classes catch and handle Throwable exceptions properly.

          Show
          Nigel Daley added a comment - And/or ensure (via code review) that Thread and Runnable classes catch and handle Throwable exceptions properly.
          Hide
          Steve Loughran added a comment -

          I dont think JVM shutdown is an action that should be taken by a handler, as it is making assumptions about how the process is running that may not always hold. It will certainly make a mess of JUnit tests, for example.

          How about we identify wherever threads get used and make sure that exceptions get caught and reported. In my codebase we have replaced Runnable with an interface that throws things

          public interface Executable

          { void execute() throws Throwable; }

          There is a matching thread class that implements this and calls any implementation it is bonded to; threads are caught and a notification object called. This lets us catch everything and listen for problems.

          public void run() {
          try

          { execute(); }

          catch (Throwable throwable)

          { setThrown(throwable); }

          finally {
          synchronized (notifyObject)

          { //set the finished bit finished = true; //notify any waiters notifyObject.notifyAll(); }

          }
          }

          If the services aren't catching failures, it also means they are probably leaking threads. Not good for extended use.

          I'd be happier with

          • something like the code above
          • every thread to have a useful toString() override
          • the UncaughtExceptionHandler only logs the uncaught exception at ERROR level, with the relevant toString(). The error message should tell the viewer to log a bugrep on apache jira

          Alternatively: use the java5 concurrency stuff instead of threads.

          Show
          Steve Loughran added a comment - I dont think JVM shutdown is an action that should be taken by a handler, as it is making assumptions about how the process is running that may not always hold. It will certainly make a mess of JUnit tests, for example. How about we identify wherever threads get used and make sure that exceptions get caught and reported. In my codebase we have replaced Runnable with an interface that throws things public interface Executable { void execute() throws Throwable; } There is a matching thread class that implements this and calls any implementation it is bonded to; threads are caught and a notification object called. This lets us catch everything and listen for problems. public void run() { try { execute(); } catch (Throwable throwable) { setThrown(throwable); } finally { synchronized (notifyObject) { //set the finished bit finished = true; //notify any waiters notifyObject.notifyAll(); } } } If the services aren't catching failures, it also means they are probably leaking threads. Not good for extended use. I'd be happier with something like the code above every thread to have a useful toString() override the UncaughtExceptionHandler only logs the uncaught exception at ERROR level, with the relevant toString(). The error message should tell the viewer to log a bugrep on apache jira Alternatively: use the java5 concurrency stuff instead of threads.
          Hide
          Tom White added a comment -

          Alternatively: use the java5 concurrency stuff instead of threads.

          +1

          Callable is almost equivalent to your Executable interface.

          Show
          Tom White added a comment - Alternatively: use the java5 concurrency stuff instead of threads. +1 Callable is almost equivalent to your Executable interface.
          Hide
          Steve Loughran added a comment -

          so it is. +1 to using it, adding handlers for failures that log and retain errors, report to hosting services

          Show
          Steve Loughran added a comment - so it is. +1 to using it, adding handlers for failures that log and retain errors, report to hosting services
          Hide
          Allen Wittenauer added a comment -

          This probably needs to get revisited.... so PING!

          Show
          Allen Wittenauer added a comment - This probably needs to get revisited.... so PING!

            People

            • Assignee:
              Unassigned
              Reporter:
              Tsz Wo Nicholas Sze
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:

                Development