Hadoop YARN
  1. Hadoop YARN
  2. YARN-68

NodeManager will refuse to shutdown indefinitely due to container log aggregation

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.3
    • Fix Version/s: 2.0.2-alpha, 0.23.3
    • Component/s: nodemanager
    • Labels:
      None
    • Environment:

      QE

      Description

      The nodemanager is able to get into a state where containermanager.logaggregation.AppLogAggregatorImpl will apparently wait
      indefinitely for log aggregation to complete for an application, even if that application has abnormally terminated and is no longer present.

      Observed behavior is that an attempt to stop the nodemanager daemon will return but have no effect, the nm log continually displays messages similar to this:

      [Thread-1]2012-08-21 17:44:07,581 INFO
      org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl:
      Waiting for aggregation to complete for application_1345221477405_2733

      The only recovery we found to work was to 'kill -9' the nm process.

      What exactly causes the NM to enter this state is unclear but we do see this behavior reliably when the NM has run a task which failed, for example when debugging oozie distcp actions and having a distcp map task fail, the NM that was running the container will now enter this state where a shutdown on said NM will never complete, 'never' in this case was waiting for 2 hours before killing the nodemanager process.

      1. YARN-68-1.patch
        10 kB
        Daryn Sharp
      2. YARN-68.patch
        9 kB
        Daryn Sharp

        Activity

        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #1188 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1188/)
        YARN-68. NodeManager will refuse to shutdown indefinitely due to container log aggregation (daryn via bobby) (Revision 1381317)

        Result = ABORTED
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381317
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregator.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #1188 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1188/ ) YARN-68 . NodeManager will refuse to shutdown indefinitely due to container log aggregation (daryn via bobby) (Revision 1381317) Result = ABORTED bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381317 Files : /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregator.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #1157 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1157/)
        YARN-68. NodeManager will refuse to shutdown indefinitely due to container log aggregation (daryn via bobby) (Revision 1381317)

        Result = FAILURE
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381317
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregator.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #1157 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1157/ ) YARN-68 . NodeManager will refuse to shutdown indefinitely due to container log aggregation (daryn via bobby) (Revision 1381317) Result = FAILURE bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381317 Files : /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregator.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-0.23-Build #366 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/366/)
        svn merge -c 1381317 FIXES: YARN-68. NodeManager will refuse to shutdown indefinitely due to container log aggregation (daryn via bobby) (Revision 1381320)

        Result = UNSTABLE
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381320
        Files :

        • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregator.java
        • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
        • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
        • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-0.23-Build #366 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/366/ ) svn merge -c 1381317 FIXES: YARN-68 . NodeManager will refuse to shutdown indefinitely due to container log aggregation (daryn via bobby) (Revision 1381320) Result = UNSTABLE bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381320 Files : /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregator.java /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #2710 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2710/)
        YARN-68. NodeManager will refuse to shutdown indefinitely due to container log aggregation (daryn via bobby) (Revision 1381317)

        Result = FAILURE
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381317
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregator.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #2710 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2710/ ) YARN-68 . NodeManager will refuse to shutdown indefinitely due to container log aggregation (daryn via bobby) (Revision 1381317) Result = FAILURE bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381317 Files : /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregator.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #2749 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2749/)
        YARN-68. NodeManager will refuse to shutdown indefinitely due to container log aggregation (daryn via bobby) (Revision 1381317)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381317
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregator.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #2749 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2749/ ) YARN-68 . NodeManager will refuse to shutdown indefinitely due to container log aggregation (daryn via bobby) (Revision 1381317) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381317 Files : /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregator.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Hide
        Robert Joseph Evans added a comment -

        Thanks Daryn, I put this into 0.23 branch-2, and trunk

        Show
        Robert Joseph Evans added a comment - Thanks Daryn, I put this into 0.23 branch-2, and trunk
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #2686 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2686/)
        YARN-68. NodeManager will refuse to shutdown indefinitely due to container log aggregation (daryn via bobby) (Revision 1381317)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381317
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregator.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Show
        Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #2686 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2686/ ) YARN-68 . NodeManager will refuse to shutdown indefinitely due to container log aggregation (daryn via bobby) (Revision 1381317) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381317 Files : /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregator.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Hide
        Robert Joseph Evans added a comment -

        Looks good +1. I'll check this in.

        Show
        Robert Joseph Evans added a comment - Looks good +1. I'll check this in.
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12543326/YARN-68.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/14//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/14//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12543326/YARN-68.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/14//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/14//console This message is automatically generated.
        Hide
        Daryn Sharp added a comment -

        Your wish is my command!

        Show
        Daryn Sharp added a comment - Your wish is my command!
        Hide
        Robert Joseph Evans added a comment -

        The changes look good for the most part. I would like to see the join method in AppLogAgregator removed. It looks like it is no longer used, and the implementation has changed so it looks like it will no longer work anyways.

        Show
        Robert Joseph Evans added a comment - The changes look good for the most part. I would like to see the join method in AppLogAgregator removed. It looks like it is no longer used, and the implementation has changed so it looks like it will no longer work anyways.
        Hide
        Daryn Sharp added a comment -

        Try much harder to shutdown the aggregators. Will stop all the threads in the thread pool instead of assuming every aggregator has an active thread. Better exception handling and setting of state to make it harder to get into a bad state. It's not perfect because jammed threads can still block shutdown/restart, but the improved logic it makes it much less likely.

        Show
        Daryn Sharp added a comment - Try much harder to shutdown the aggregators. Will stop all the threads in the thread pool instead of assuming every aggregator has an active thread. Better exception handling and setting of state to make it harder to get into a bad state. It's not perfect because jammed threads can still block shutdown/restart, but the improved logic it makes it much less likely.
        Hide
        Daryn Sharp added a comment -

        This also prevents the NM from internally restarting after the RM is bounced, or the NM goes out of sync for too long. The stop sets a boolean to signal the (nonexistent or stuck) thread to finish, and then waits for the (nonexistent or stuck) thread to set another boolean that it's finished. This will cause the NM to wait forever and be unresponsive to shutdowns or internal restarts.

        Show
        Daryn Sharp added a comment - This also prevents the NM from internally restarting after the RM is bounced, or the NM goes out of sync for too long. The stop sets a boolean to signal the (nonexistent or stuck) thread to finish, and then waits for the (nonexistent or stuck) thread to set another boolean that it's finished. This will cause the NM to wait forever and be unresponsive to shutdowns or internal restarts.

          People

          • Assignee:
            Daryn Sharp
            Reporter:
            patrick white
          • Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development