Hadoop YARN / YARN-3878

AsyncDispatcher can hang while stopping if it is configured for draining events on stop

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0, 2.7.2, 2.6.3, 3.0.0-alpha1
    • Component/s: None
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      The sequence of events is as follows:

      1. The RM is stopped while an RMStateStore event is being put into the queue of RMStateStore's AsyncDispatcher. This causes an InterruptedException to be thrown.
      2. As the RM is being stopped, RMStateStore's AsyncDispatcher is stopped as well. In serviceStop, we wait for the event queue to drain, because the RMStateStore dispatcher is configured to drain its queue on stop.
      3. The drained condition never becomes true, so the AsyncDispatcher keeps waiting for the event queue to drain until the JVM exits.

      Initial exception while posting the RMStateStore event to the queue:

      2015-06-27 20:08:35,922 DEBUG [main] service.AbstractService (AbstractService.java:enterState(452)) - Service: Dispatcher entered state STOPPED
      2015-06-27 20:08:35,923 WARN  [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:handle(247)) - AsyncDispatcher thread interrupted
      java.lang.InterruptedException
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
      	at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
      	at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:338)
      	at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:244)
      	at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.updateApplicationAttemptState(RMStateStore.java:652)
      	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.rememberTargetTransitionsAndStoreState(RMAppAttemptImpl.java:1173)
      	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$3300(RMAppAttemptImpl.java:109)
      	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ContainerFinishedTransition.transition(RMAppAttemptImpl.java:1650)
      	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ContainerFinishedTransition.transition(RMAppAttemptImpl.java:1619)
      	at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
      	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
      	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
      	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
      	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:786)
      	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108)
      	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:838)
      

      JStack of AsyncDispatcher hanging on stop

      "AsyncDispatcher event handler" prio=10 tid=0x00007fb980222800 nid=0x4b1e waiting on condition [0x00007fb9654e9000]
         java.lang.Thread.State: WAITING (parking)
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for  <0x0000000700b79250> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
              at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
              at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:113)
              at java.lang.Thread.run(Thread.java:744)
      
      "main" prio=10 tid=0x00007fb98000a800 nid=0x49c3 in Object.wait() [0x00007fb989851000]
         java.lang.Thread.State: TIMED_WAITING (on object monitor)
      	at java.lang.Object.wait(Native Method)
      	- waiting on <0x0000000700b79430> (a java.lang.Object)
      	at org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:156)
      	- locked <0x0000000700b79430> (a java.lang.Object)
      	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
      	- locked <0x0000000700b79420> (a java.lang.Object)
      	at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStop(RMStateStore.java:515)
      	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
      	- locked <0x0000000700b79630> (a java.lang.Object)
      	at org.apache.hadoop.service.AbstractService.close(AbstractService.java:250)
      	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStop(ResourceManager.java:599)
      

      The following log lines then repeat indefinitely:

      2015-06-27 20:08:35,926 INFO  [main] event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(140)) - AsyncDispatcher is draining to stop, igonring any new events.
      2015-06-27 20:08:36,926 INFO  [main] event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(144)) - Waiting for AsyncDispatcher to drain. Thread state is :WAITING
      2015-06-27 20:08:37,927 INFO  [main] event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(144)) - Waiting for AsyncDispatcher to drain. Thread state is :WAITING
      2015-06-27 20:08:38,927 INFO  [main] event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(144)) - Waiting for AsyncDispatcher to drain. Thread state is :WAITING
      2015-06-27 20:08:39,928 INFO  [main] event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(144)) - Waiting for AsyncDispatcher to drain. Thread state is :WAITING
      2015-06-27 20:08:40,929 INFO  [main] event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(144)) - Waiting for AsyncDispatcher to drain. Thread state is :WAITING
      2015-06-27 20:08:41,929 INFO  [main] event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(144)) - Waiting for AsyncDispatcher to drain. Thread state is :WAITING
      2015-06-27 20:08:42,930 INFO  [main] event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(144)) - Waiting for AsyncDispatcher to drain. Thread state is :WAITING
      2015-06-27 20:08:43,930 INFO  [main] event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(144)) - Waiting for AsyncDispatcher to drain. Thread state is :WAITING
      2015-06-27 20:08:44,931 INFO  [main] event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(144)) - Waiting for AsyncDispatcher to drain. Thread state is :WAITING
      2015-06-27 20:08:45,931 INFO  [main] event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(144)) - Waiting for AsyncDispatcher to drain. Thread state is :WAITING
      2015-06-27 20:08:46,932 INFO  [main] event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(144)) - Waiting for AsyncDispatcher to drain. Thread state is :WAITING
      
      Attachments

      1. YARN-3878.01.patch (2 kB, Varun Saxena)
      2. YARN-3878.02.patch (5 kB, Varun Saxena)
      3. YARN-3878.03.patch (6 kB, Varun Saxena)
      4. YARN-3878.04.patch (6 kB, Varun Saxena)
      5. YARN-3878.05.patch (7 kB, Varun Saxena)
      6. YARN-3878.06.patch (7 kB, Varun Saxena)
      7. YARN-3878.07.patch (7 kB, Varun Saxena)
      8. YARN-3878.08.patch (7 kB, Varun Saxena)
      9. YARN-3878.09_reprorace.pat_h (7 kB, Anubhav Dhoot)
      10. YARN-3878.09.patch (5 kB, Varun Saxena)
      11. YARN-3878-branch-2.6.01.patch (5 kB, Varun Saxena)


          Activity

          varun_saxena Varun Saxena added a comment -

          The cause of this issue is as follows:

          • On a call to GenericEventHandler#handle, the drained flag is set to false before the event is put in the queue. If an InterruptedException occurs during the put, the flag is never reset.
            public void handle(Event event) {
              if (blockNewEvents) {
                return;
              }
              drained = false;
              .......
              try {
                eventQueue.put(event);
              } catch (InterruptedException e) {
                if (!stopped) {
                  LOG.warn("AsyncDispatcher thread interrupted", e);
                }
                throw new YarnRuntimeException(e);
              }
            }
            
          • The code of AsyncDispatcher#serviceStop is shown below. As can be seen, it waits for the drained flag to become true.
             protected void serviceStop() throws Exception {
                if (drainEventsOnStop) {
                  blockNewEvents = true;
                  LOG.info("AsyncDispatcher is draining to stop, igonring any new events.");
                  synchronized (waitForDrained) {
                    while (!drained && eventHandlingThread.isAlive()) {
                      waitForDrained.wait(1000);
                      LOG.info("Waiting for AsyncDispatcher to drain. Thread state is :" +
                          eventHandlingThread.getState());
                    }
                  }
                }
            
          • In the event handling thread's run method, however, we call LinkedBlockingQueue#take, which blocks until an element is available in the queue. The drained flag is only recomputed at the top of the loop, so if the queue is already empty and nothing is posting to it, the thread stays blocked in take and the drained flag never becomes true.
              Runnable createThread() {
                return new Runnable() {
                  @Override
                  public void run() {
                    while (!stopped && !Thread.currentThread().isInterrupted()) {
                      drained = eventQueue.isEmpty();
                      ......
                      try {
                        event = eventQueue.take();
                      } catch(InterruptedException ie) {
                        ....
                      }
                      if (event != null) {
                        dispatch(event);
                      }
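
The excerpts above describe a flag that is cleared before a put that then fails. A minimal, self-contained sketch of the resulting state follows. StaleDrainedFlagDemo and its methods are hypothetical stand-ins written for illustration, not Hadoop classes, and the consumer thread is left out because in the reported scenario it is already blocked in take() and never recomputes the flag.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical, simplified stand-in for AsyncDispatcher; not the Hadoop class. */
class StaleDrainedFlagDemo {
  private final BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();
  private volatile boolean drained = true;

  /** Mirrors the order of operations quoted from GenericEventHandler#handle. */
  void handle(String event) {
    drained = false;                  // flag is cleared before the put
    try {
      eventQueue.put(event);          // throws if the caller is interrupted
    } catch (InterruptedException e) {
      // The event was never enqueued, yet drained stays false.
      throw new RuntimeException(e);
    }
  }

  boolean isDrained() {
    return drained;
  }

  boolean isQueueEmpty() {
    return eventQueue.isEmpty();
  }

  /** Returns true if the queue is empty while the flag still says "not drained". */
  static boolean reproduce() {
    StaleDrainedFlagDemo d = new StaleDrainedFlagDemo();
    Thread.currentThread().interrupt();   // simulate an interrupt during shutdown
    try {
      d.handle("store-event");
    } catch (RuntimeException expected) {
      // the put did not happen
    } finally {
      Thread.interrupted();               // clear the interrupt status for the caller
    }
    return d.isQueueEmpty() && !d.isDrained();
  }

  public static void main(String[] args) {
    System.out.println("queue empty but drained == false: " + reproduce());
  }
}
```

The sketch relies on LinkedBlockingQueue#put throwing InterruptedException when the calling thread is already interrupted, which matches the lockInterruptibly frames in the stack trace from the description.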
            
          varun_saxena Varun Saxena added a comment -

          In my view, the drained flag is unnecessary. We can call LinkedBlockingQueue#isEmpty() directly instead. It is a thread-safe call, as it uses an AtomicInteger internally, and it accurately determines whether the queue has been drained.
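
As a rough illustration of this suggestion, the sketch below derives the drained condition from the queue itself, so an interrupted put cannot leave a stale flag behind. QueueBackedDrainDemo is a hypothetical mini-dispatcher written for this note; it is not the attached patch, and the 100 ms wait interval is an arbitrary choice.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical mini-dispatcher that asks the queue whether it is drained. */
class QueueBackedDrainDemo {
  private final BlockingQueue<Runnable> eventQueue = new LinkedBlockingQueue<>();
  private final Object waitForDrained = new Object();
  private volatile boolean stopped = false;
  private volatile boolean blockNewEvents = false;
  private final Thread eventHandlingThread = new Thread(() -> {
    while (!stopped && !Thread.currentThread().isInterrupted()) {
      try {
        eventQueue.take().run();          // blocks while the queue is empty
      } catch (InterruptedException ie) {
        return;                           // interrupted by stop()
      }
    }
  }, "demo event handler");

  void start() {
    eventHandlingThread.start();
  }

  void handle(Runnable event) {
    if (blockNewEvents) {
      return;
    }
    try {
      eventQueue.put(event);              // no separate flag to keep in sync
    } catch (InterruptedException e) {
      throw new RuntimeException(e);
    }
  }

  /** Waits until the queue is empty, stops the handler, returns the number of waits. */
  int stop() {
    blockNewEvents = true;
    int waits = 0;
    boolean interrupted = false;
    synchronized (waitForDrained) {
      while (!eventQueue.isEmpty() && eventHandlingThread.isAlive()) {
        try {
          waitForDrained.wait(100);
        } catch (InterruptedException e) {
          interrupted = true;
          break;
        }
        waits++;
      }
    }
    stopped = true;
    eventHandlingThread.interrupt();
    try {
      eventHandlingThread.join();
    } catch (InterruptedException e) {
      interrupted = true;
    }
    if (interrupted) {
      Thread.currentThread().interrupt(); // preserve the caller's interrupt status
    }
    return waits;
  }

  /** Replays the reported scenario: an interrupted put followed by stop(). */
  static int stopAfterInterruptedPut() {
    QueueBackedDrainDemo d = new QueueBackedDrainDemo();
    d.start();
    Thread.currentThread().interrupt();
    try {
      d.handle(() -> { });
    } catch (RuntimeException expected) {
      // the put was interrupted; nothing was enqueued
    } finally {
      Thread.interrupted();
    }
    return d.stop();
  }
}
```

One limitation of this sketch: an empty queue only shows that every event has been taken, not that the handler has finished processing the last one.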

          hadoopqa Hadoop QA added a comment -



          -1 overall

          Vote  Subsystem        Runtime  Comment
          0     pre-patch        16m 10s  Pre-patch trunk compilation is healthy.
          +1    @author          0m 0s    The patch does not contain any @author tags.
          -1    tests included   0m 0s    The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1    javac            7m 44s   There were no new javac warning messages.
          +1    javadoc          9m 46s   There were no new javadoc warning messages.
          +1    release audit    0m 22s   The applied patch does not increase the total number of release audit warnings.
          +1    checkstyle       0m 51s   There were no new checkstyle issues.
          +1    whitespace       0m 0s    The patch has no lines that end in whitespace.
          +1    install          1m 34s   mvn install still works.
          +1    eclipse:eclipse  0m 34s   The patch built with eclipse:eclipse.
          +1    findbugs         1m 34s   The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1    yarn tests       2m 4s    Tests passed in hadoop-yarn-common.
                                 40m 43s

          Subsystem                    Report/Notes
          Patch URL                    http://issues.apache.org/jira/secure/attachment/12743200/YARN-3878.01.patch
          Optional Tests               javadoc javac unit findbugs checkstyle
          git revision                 trunk / a78d507
          hadoop-yarn-common test log  https://builds.apache.org/job/PreCommit-YARN-Build/8418/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          Test Results                 https://builds.apache.org/job/PreCommit-YARN-Build/8418/testReport/
          Java                         1.7.0_55
          uname                        Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output               https://builds.apache.org/job/PreCommit-YARN-Build/8418/console

          This message was automatically generated.

          jianhe Jian He added a comment -

          Varun Saxena, thanks for reporting this. Would you mind adding a test case to verify it?

          varun_saxena Varun Saxena added a comment -

          Jian He, added a test case

          jianhe Jian He added a comment -

          Hi Varun Saxena, the test does not seem adequate. It doesn't prove that the AsyncDispatcher will hang in this case. Could you update the test case so that it simulates this scenario and actually hangs without the core changes of the patch?

          varun_saxena Varun Saxena added a comment -

          Jian He, the test case is adequate as it stands.
          I have added two assert statements. If the first condition is true and the second is false, the hang will occur.
          But because the assert statements are there, the test fails before the hang can occur.
          Without the core changes of the patch, the test fails at the second assertion.
          If you remove the second assertion, the hang occurs and the test case times out.

              Assert.assertTrue("Event Queue should have been empty",
                  eventQueue.isEmpty());
              Assert.assertTrue("Async Dispatcher should have been drained as event " +
                  "queue is empty", disp.isDrained());
          
          

          So do you want me to remove the second assertion so that, without the core changes, the test case hangs instead of failing first?
          Let me know.
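
One way to turn such a hang into an ordinary test failure is to run the stop call under a time limit. The helper below is a generic sketch using java.util.concurrent; HangDetectionDemo and finishesWithin are assumed names and are not part of the patch or of TestAsyncDispatcher.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for detecting a hanging shutdown call in a test. */
class HangDetectionDemo {

  /** Returns true only if the action completes normally within the timeout. */
  static boolean finishesWithin(Runnable action, long timeoutMillis) {
    // A daemon thread keeps a truly stuck action from blocking JVM exit.
    ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "action-under-test");
      t.setDaemon(true);
      return t;
    });
    Future<?> result = executor.submit(action);
    try {
      result.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return true;
    } catch (TimeoutException e) {
      result.cancel(true);                  // interrupt the stuck action
      return false;
    } catch (ExecutionException e) {
      return false;                         // the action threw an exception
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();   // preserve the caller's interrupt status
      return false;
    } finally {
      executor.shutdownNow();
    }
  }
}
```

A test could then assert that finishesWithin(dispatcher::stop, someTimeout) is true, where dispatcher and someTimeout are placeholders for the object under test and a limit chosen by the test author.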

          jianhe Jian He added a comment -

          Ah, sorry, I overlooked that.
          lgtm, thanks !

          devaraj.k Devaraj K added a comment -

          Thanks Varun Saxena for the patch and Jian He for the review. There are a few minor comments on the patch; you can address these before getting it in.

          • I think the AsyncDispatcher.isDrained() method can now be removed from AsyncDispatcher, and eventQueue.isEmpty() can be verified directly in the tests.
          • In TestAsyncDispatcher, can you remove this JIRA number comment and add a comment about what the test does?
            /* Test to verify fix for YARN-3878 */
            
          • In TestAsyncDispatcher, please use disp.close() instead of disp.stop().
          varun_saxena Varun Saxena added a comment -

          Thanks for the review Devaraj K

          I think the AsyncDispatcher.isDrained() method can now be removed from AsyncDispatcher, and eventQueue.isEmpty() can be verified directly in the tests.

          Hmm. As the queue is a private variable, I think we need some method to return its status. Maybe isDrained isn't semantically correct; I can change the method name. Although, as isDrained is currently not checked anywhere except in DrainDispatcher#await, we can easily remove it. And if somebody else requires it, they can add it later. Thoughts?

          In TestAsyncDispatcher, can you remove this jira number comment and add the comment about what test does?

          Ok.

          In TestAsyncDispatcher, Please use disp.close() instead of disp.stop().

          In this case, close() directly calls stop(), so the two are in essence the same. Any reason you prefer close over stop?

          devaraj.k Devaraj K added a comment -

          Although, as isDrained is currently not checked anywhere except in DrainDispatcher#await, we can easily remove it. And if somebody else requires it, they can add it later. Thoughts?

          eventQueue is passed in from the DrainDispatcher constructors, and the same queue can be used in DrainDispatcher.await(). If anybody else wants this method in the future, they can add it based on their scenario. I don't see any real reason for keeping it.

          In this case, close() directly calls stop(), so the two are in essence the same. Any reason you prefer close over stop?

          I agree both do the same, but we can avoid the javac warning for disp, i.e. Resource leak: 'disp' is never closed.

          Also, there are some rawtypes warnings showing up in the test; you can probably suppress them as well.
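
The close() versus stop() point can be sketched with a stand-in class. ClosableServiceDemo below is hypothetical; it only assumes what the discussion states, namely that close() delegates to stop(). Declaring the object in a try-with-resources statement is one common way to guarantee the close, and it typically also avoids a resource-leak warning.

```java
/** Hypothetical stand-in for a service whose close() delegates to stop(). */
class ClosableServiceDemo implements AutoCloseable {
  private boolean stopped = false;

  void stop() {
    stopped = true;
  }

  @Override
  public void close() {
    stop();                       // same effect as calling stop() directly
  }

  boolean isStopped() {
    return stopped;
  }

  /** Uses the service in try-with-resources and reports whether it was stopped. */
  static boolean useAndClose() {
    ClosableServiceDemo ref;
    try (ClosableServiceDemo disp = new ClosableServiceDemo()) {
      ref = disp;                 // the test body would use disp here
    }                             // close() runs automatically at this point
    return ref.isStopped();
  }
}
```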

          varun_saxena Varun Saxena added a comment -

          I agree both do the same, but we can avoid the javac warning for disp, i.e. Resource leak: 'disp' is never closed.

          OK, will make the change.

          Will go ahead and remove the isDrained function as well.

          varun_saxena Varun Saxena added a comment -

          Devaraj K, addressed your comments

          hadoopqa Hadoop QA added a comment -



          -1 overall

          Vote  Subsystem        Runtime  Comment
          0     pre-patch        16m 2s   Pre-patch trunk compilation is healthy.
          +1    @author          0m 0s    The patch does not contain any @author tags.
          +1    tests included   0m 0s    The patch appears to include 2 new or modified test files.
          +1    javac            7m 36s   There were no new javac warning messages.
          +1    javadoc          9m 38s   There were no new javadoc warning messages.
          +1    release audit    0m 23s   The applied patch does not increase the total number of release audit warnings.
          -1    checkstyle       0m 53s   The applied patch generated 2 new checkstyle issues (total was 6, now 8).
          +1    whitespace       0m 0s    The patch has no lines that end in whitespace.
          +1    install          1m 34s   mvn install still works.
          +1    eclipse:eclipse  0m 33s   The patch built with eclipse:eclipse.
          +1    findbugs         1m 34s   The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1    yarn tests       1m 58s   Tests passed in hadoop-yarn-common.
                                 40m 15s

          Subsystem                    Report/Notes
          Patch URL                    http://issues.apache.org/jira/secure/attachment/12743549/YARN-3878.03.patch
          Optional Tests               javadoc javac unit findbugs checkstyle
          git revision                 trunk / 2eae130
          checkstyle                   https://builds.apache.org/job/PreCommit-YARN-Build/8424/artifact/patchprocess/diffcheckstylehadoop-yarn-common.txt
          hadoop-yarn-common test log  https://builds.apache.org/job/PreCommit-YARN-Build/8424/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          Test Results                 https://builds.apache.org/job/PreCommit-YARN-Build/8424/testReport/
          Java                         1.7.0_55
          uname                        Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output               https://builds.apache.org/job/PreCommit-YARN-Build/8424/console

          This message was automatically generated.

          devaraj.k Devaraj K added a comment -

          Thanks Varun Saxena for the updated patch.

          -  private final BlockingQueue<Event> eventQueue;
          +  protected final BlockingQueue<Event> eventQueue;
          

          In my previous comment, I did not mean to increase the eventQueue visibility to protected for access in the tests. I was thinking of checking the eventQueue in DrainDispatcher the same way you check it in TestAsyncDispatcher.testDispatcherOnCloseIfQueueEmpty(). I now realize that this is not possible because of the explicit call to the parameterized constructor, this(new LinkedBlockingQueue<Event>());, in DrainDispatcher. I am sorry for not checking this previously. I think you can keep AsyncDispatcher.isDrained(), under the same name or a better one, for access in DrainDispatcher. That should fix the induced checkstyle issue as well.

          varun_saxena Varun Saxena added a comment -

          Yeah, I had seen that in the checkstyle report. I will fix both checkstyle issues and post a patch. I will add a getter for the event queue instead of isDrained.

          hadoopqa Hadoop QA added a comment -



          +1 overall

          Vote  Subsystem        Runtime  Comment
          0     pre-patch        15m 54s  Pre-patch trunk compilation is healthy.
          +1    @author          0m 0s    The patch does not contain any @author tags.
          +1    tests included   0m 0s    The patch appears to include 2 new or modified test files.
          +1    javac            7m 34s   There were no new javac warning messages.
          +1    javadoc          9m 37s   There were no new javadoc warning messages.
          +1    release audit    0m 22s   The applied patch does not increase the total number of release audit warnings.
          +1    checkstyle       1m 1s    There were no new checkstyle issues.
          +1    whitespace       0m 0s    The patch has no lines that end in whitespace.
          +1    install          1m 35s   mvn install still works.
          +1    eclipse:eclipse  0m 34s   The patch built with eclipse:eclipse.
          +1    findbugs         1m 35s   The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1    yarn tests       1m 57s   Tests passed in hadoop-yarn-common.
                                 40m 13s

          Subsystem                    Report/Notes
          Patch URL                    http://issues.apache.org/jira/secure/attachment/12743604/YARN-3878.04.patch
          Optional Tests               javadoc javac unit findbugs checkstyle
          git revision                 trunk / 2eae130
          hadoop-yarn-common test log  https://builds.apache.org/job/PreCommit-YARN-Build/8426/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          Test Results                 https://builds.apache.org/job/PreCommit-YARN-Build/8426/testReport/
          Java                         1.7.0_55
          uname                        Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output               https://builds.apache.org/job/PreCommit-YARN-Build/8426/console

          This message was automatically generated.

          kasha Karthik Kambatla added a comment -

          Thanks for reporting and working on this issue, Varun Saxena. The latest patch looks mostly good, but for a few nits:

          1. I am not sure exposing the event queue is warranted here. Exposing numPendingEvents or leaving isDrained as is or with another name seems better.
          2. The test uses a sleep to wait for the dispatcher to start. The sleep could be too small on some of our Jenkins machines. We could sleep for short durations in a loop, but that is not particularly clean either. How about blocking on start() until it actually starts, or adding an awaitStart method that uses some form of wait-notify? If the latter is too complicated, I am okay with punting it to another JIRA.

          Otherwise, it looks good to me.
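The awaitStart idea in point 2 could be sketched roughly as follows. This is a minimal illustration and not code from any patch on this issue: the class AwaitStartSketch and its members are hypothetical, and a CountDownLatch stands in for hand-written wait-notify.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a dispatcher; not the real AsyncDispatcher. */
class AwaitStartSketch {
  // Released exactly once, by the handler thread itself.
  private final CountDownLatch started = new CountDownLatch(1);
  private Thread eventHandlingThread;

  /** Launches the handler thread; may return before that thread runs. */
  public void start() {
    eventHandlingThread = new Thread(() -> {
      started.countDown();  // signal that the handler thread is running
      // ... the event loop would go here ...
    }, "sketch event handler");
    eventHandlingThread.setDaemon(true);
    eventHandlingThread.start();
  }

  /** Blocks until the handler thread has run, or the timeout elapses. */
  public boolean awaitStart(long timeout, TimeUnit unit)
      throws InterruptedException {
    return started.await(timeout, unit);
  }
}
```

A test would call start() and then awaitStart(5, TimeUnit.SECONDS), failing if it returns false. The cost is a new method and field in production code, which is the trade-off discussed in this thread.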

          varun_saxena Varun Saxena added a comment -

          Thanks for the review, Karthik Kambatla.
          I will add a method hasNoPendingEvents to check whether any events exist in the queue, since that is all the current code requires. That will address both your concern and Devaraj K's.
          I will handle the other comment as well.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 16m 10s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 2 new or modified test files.
          +1 javac 7m 38s There were no new javac warning messages.
          +1 javadoc 9m 36s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 52s There were no new checkstyle issues.
          +1 whitespace 0m 1s The patch has no lines that end in whitespace.
          +1 install 1m 34s mvn install still works.
          +1 eclipse:eclipse 0m 38s The patch built with eclipse:eclipse.
          -1 findbugs 1m 35s The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 1m 59s Tests passed in hadoop-yarn-common.
              40m 30s  



          Reason Tests
          FindBugs module:hadoop-yarn-common



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12743636/YARN-3878.05.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 688617d
          Findbugs warnings https://builds.apache.org/job/PreCommit-YARN-Build/8428/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
          hadoop-yarn-common test log https://builds.apache.org/job/PreCommit-YARN-Build/8428/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8428/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8428/console

          This message was automatically generated.

          varun_saxena Varun Saxena added a comment -

          Attached patch to fix findbugs warning

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 pre-patch 19m 20s Findbugs (version ) appears to be broken on trunk.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 2 new or modified test files.
          +1 javac 9m 34s There were no new javac warning messages.
          +1 javadoc 12m 31s There were no new javadoc warning messages.
          +1 release audit 0m 34s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 43s There were no new checkstyle issues.
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 install 1m 49s mvn install still works.
          +1 eclipse:eclipse 0m 49s The patch built with eclipse:eclipse.
          +1 findbugs 1m 57s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 2m 15s Tests passed in hadoop-yarn-common.
              49m 37s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12743638/YARN-3878.06.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 688617d
          hadoop-yarn-common test log https://builds.apache.org/job/PreCommit-YARN-Build/8429/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8429/testReport/
          Java 1.7.0_55
          uname Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8429/console

          This message was automatically generated.

          kasha Karthik Kambatla added a comment -

          hasNoPendingEvents seems a little backwards. Can we do hasPendingEvents instead and check for the negation of it?

          varun_saxena Varun Saxena added a comment -

          Ok...Changed method name to hasPendingEvents

          devaraj.k Devaraj K added a comment -

          Adding to Karthik Kambatla's comment,

          1. Can you add @VisibleForTesting annotation to this method?

          +  protected boolean hasPendingEvents() {
          +    return !eventQueue.isEmpty();
          +  }
          +
          

          2. I am not convinced that modifying the source code and adding a public method purely for test purposes is the right approach. To wait for the AsyncDispatcher to start, can't we do something like the following, similar to the way other tests wait for a service state change?

              int waitCount = 0;
              while (disp.getServiceState() != STATE.STARTED
                  && waitCount++ < 60) {
                Thread.sleep(1500);
              }
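
To illustrate how the accessor from item 1 keeps the queue itself private while still letting a test subclass check for drained state, here is a small sketch. The classes PendingEventsSketch and DrainCheckSketch, the String event type, and the post/pollOne helpers are all hypothetical; only hasPendingEvents mirrors the method quoted above.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical miniature; the real AsyncDispatcher has far more state. */
class PendingEventsSketch {
  // The queue stays private; subclasses only learn whether it is empty.
  private final BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();

  public void post(String event) {
    eventQueue.add(event);
  }

  /** Removes and returns the next event, or null if none is queued. */
  public String pollOne() {
    return eventQueue.poll();
  }

  // In Hadoop this would carry Guava's @VisibleForTesting annotation;
  // it is omitted here so the sketch compiles without dependencies.
  protected boolean hasPendingEvents() {
    return !eventQueue.isEmpty();
  }
}

/** Test-only subclass that turns the accessor into a drained check. */
class DrainCheckSketch extends PendingEventsSketch {
  public boolean isDrained() {
    return !hasPendingEvents();
  }
}
```

Exposing a boolean rather than the queue means a test cannot add or remove events behind the dispatcher's back.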
          
          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 18m 2s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 2 new or modified test files.
          +1 javac 9m 41s There were no new javac warning messages.
          +1 javadoc 11m 52s There were no new javadoc warning messages.
          +1 release audit 0m 25s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 1m 2s There were no new checkstyle issues.
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 install 1m 51s mvn install still works.
          +1 eclipse:eclipse 0m 43s The patch built with eclipse:eclipse.
          +1 findbugs 2m 29s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 2m 36s Tests passed in hadoop-yarn-common.
              48m 45s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12743662/YARN-3878.07.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 688617d
          hadoop-yarn-common test log https://builds.apache.org/job/PreCommit-YARN-Build/8432/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8432/testReport/
          Java 1.7.0_55
          uname Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8432/console

          This message was automatically generated.

          jianhe Jian He added a comment -

          I am not convinced that modifying the source code and adding a public method purely for test purposes is the right approach.

          Agree.

          devaraj.k Devaraj K added a comment -

          Cancelling patch to address the comments.

          ozawa Tsuyoshi Ozawa added a comment -

          FYI, I found a news item that may have an impact on this JIRA, so I am sharing it: http://www.infoq.com/news/2015/05/redhat-futex

          Quoting from the URL:

          “The impact of this kernel bug is very simple: user processes can
          deadlock and hang in seemingly impossible situations. A futex wait
          call (and anything using a futex wait) can stay blocked forever, even
          though it had been properly woken up by someone. Thread.park() in Java
          may stay parked. Etc. If you are lucky you may also find soft lockup
          messages in your dmesg logs. If you are not that lucky (like us, for
          example), you'll spend a couple of months of someone's time trying to
          find the fault in your code, when there is nothing there to find.”

          > RHEL 6 (and CentOS 6, and SL 6): 6.0-6.5 are good. 6.6 is BAD. 6.6.z is good.
          > RHEL 7 (and CentOS 7, and SL 7): 7.1 is BAD. As of yesterday. there does not yet appear to be a 7.x fix. [May 13, 2015]
          > RHEL 5 (and CentOS 5, and SL 5): All versions are good (including 5.11).

          varun_saxena Varun Saxena added a comment -

          Thanks a lot for sharing the info Tsuyoshi Ozawa.

          varun_saxena Varun Saxena added a comment -

          Jian He / Devaraj K / Karthik Kambatla,

          Checking the service state won't work, as AsyncDispatcher's service state becomes STARTED even before the event handler thread actually starts running.
          So should I go back to putting a hardcoded sleep in the test? I think 2 seconds should be more than enough.

          Other solutions would require some code to be added to AsyncDispatcher.

          jianhe Jian He added a comment -

          You may try creating a new test subclass which extends AsyncDispatcher and overrides the serviceStart method. The test can wait until the serviceStart method finishes.

          varun_saxena Varun Saxena added a comment -

          Jian He, we have a DrainDispatcher class which is used only for tests. I think I can add it there, then.
          But I would still need to change the visibility of the eventHandlingThread variable (in AsyncDispatcher) to protected to check whether this thread has started. We can wait there (by calling Thread#yield) until the event handling thread has started. Thoughts?

          varun_saxena Varun Saxena added a comment -

          Or have a protected function returning reference to event handling thread

          jianhe Jian He added a comment -

          Or have a protected function returning reference to event handling thread

          Sounds ok to me.

          jianhe Jian He added a comment -

          Given that the dispatcher is a sensitive piece of code, waiting 2-3 seconds in the test is OK with me too. I don't have a strong preference.

          devaraj.k Devaraj K added a comment -

          Given that the dispatcher is a sensitive piece of code, waiting 2-3 seconds in the test is OK with me too.

          It is OK with me too.

          If we want a more reliable way to know whether the thread has started, we can add a protected, test-scoped method that just returns the eventHandlingThread state (or tells whether the thread has started), and then wait in the test until the eventHandlingThread has started. I think this may not be required for the dispatcher's eventHandlingThread, and waiting for 2-3 seconds would be OK.
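
One way such a protected, test-scoped method might look is sketched below. It assumes the handler thread blocks on a queue when idle, so that Thread.State.WAITING indicates it has reached its event loop. MiniDispatcher, WaitingDispatcher, and both method names are illustrative and are not claimed to match the committed patch.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical miniature of a dispatcher; not the real AsyncDispatcher. */
class MiniDispatcher {
  private final BlockingQueue<Runnable> eventQueue = new LinkedBlockingQueue<>();
  private Thread eventHandlingThread;

  public void start() {
    eventHandlingThread = new Thread(() -> {
      try {
        while (true) {
          eventQueue.take().run();  // parks here while the queue is empty
        }
      } catch (InterruptedException e) {
        // interrupted by stop(): leave the loop
      }
    }, "mini event handler");
    eventHandlingThread.setDaemon(true);
    eventHandlingThread.start();
  }

  public void stop() {
    if (eventHandlingThread != null) {
      eventHandlingThread.interrupt();
    }
  }

  /** Test-scoped: true once the handler thread is parked waiting for work. */
  protected boolean isEventThreadWaiting() {
    Thread t = eventHandlingThread;
    return t != null && t.getState() == Thread.State.WAITING;
  }
}

/** Test-only subclass, in the spirit of DrainDispatcher. */
class WaitingDispatcher extends MiniDispatcher {
  /** Polls until the handler thread is waiting, or gives up after timeoutMs. */
  public boolean waitForEventThreadToWait(long timeoutMs)
      throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!isEventThreadWaiting()) {
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      Thread.sleep(10);
    }
    return true;
  }
}
```

Thread.getState() is documented as a monitoring aid rather than a synchronization primitive, so this is only suitable for tests. Checking for WAITING is stricter than checking isAlive(), which is already true as soon as Thread.start() returns.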

          varun_saxena Varun Saxena added a comment -

          OK. I will upload a patch with a 2-second sleep by this evening.

          kasha Karthik Kambatla added a comment -

          If we want a more reliable way to know whether the thread has started, we can add a protected, test-scoped method that just returns the eventHandlingThread state (or tells whether the thread has started), and then wait in the test until the eventHandlingThread has started.

          I like this approach, but I am not against the sleep approach.

          With the sleep approach, my concern is relying on a single sleep. Instead, we could do the following:

          for (int i = 0; i < numRetries; i++) {
            // check 
            Thread.sleep(100);
          }
          // check
          

          Given all this, Devaraj's suggestion of adding a protected, test-scoped method is probably equally simple.
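
The retry loop above leaves the check itself as a comment. A filled-in version could look like the sketch below, where RetryCheck and waitFor are hypothetical names and the condition is passed in as a BooleanSupplier (which assumes Java 8 or later).

```java
import java.util.function.BooleanSupplier;

/** Hypothetical test helper: several short sleeps instead of one long one. */
class RetryCheck {
  /**
   * Evaluates the check up to numRetries times, sleeping sleepMs after each
   * failed attempt, then evaluates it one final time after the last sleep.
   */
  static boolean waitFor(BooleanSupplier check, int numRetries, long sleepMs)
      throws InterruptedException {
    for (int i = 0; i < numRetries; i++) {
      if (check.getAsBoolean()) {
        return true;  // condition met; stop early
      }
      Thread.sleep(sleepMs);
    }
    return check.getAsBoolean();  // final check after the last sleep
  }
}
```

With numRetries = 20 and sleepMs = 100, the worst case is about 2 seconds, the same as the single sleep discussed earlier, but a fast machine returns as soon as the condition holds.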

          varun_saxena Varun Saxena added a comment -

          Yeah, but we still need to check the thread state. Then let me add something in DrainDispatcher (a subclass of AsyncDispatcher) to return the thread state and wait on it.

          kasha Karthik Kambatla added a comment -

          +1, pending Jenkins.

          Will go ahead and commit this once Jenkins is okay.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          -1 pre-patch 15m 33s Findbugs (version ) appears to be broken on trunk.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 2 new or modified test files.
          +1 javac 7m 50s There were no new javac warning messages.
          +1 javadoc 9m 50s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 28s There were no new checkstyle issues.
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 install 1m 23s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 35s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 1m 58s Tests passed in hadoop-yarn-common.
              39m 37s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12744343/YARN-3878.08.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 2e3d83f
          hadoop-yarn-common test log https://builds.apache.org/job/PreCommit-YARN-Build/8462/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8462/testReport/
          Java 1.7.0_55
          uname Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8462/console

          This message was automatically generated.

          kasha Karthik Kambatla added a comment -

          Thanks Varun Saxena for the contribution, Jian He and Devaraj K for your reviews.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8140 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8140/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha) (kasha: rev aa067c6aa47b4c79577096817acc00ad6421180c)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          varun_saxena Varun Saxena added a comment -

          Thanks Karthik Kambatla for the commit and review.
          Thanks to Jian He and Devaraj K for the reviews as well.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #252 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/252/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha) (kasha: rev aa067c6aa47b4c79577096817acc00ad6421180c)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #982 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/982/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha) (kasha: rev aa067c6aa47b4c79577096817acc00ad6421180c)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2179 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2179/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha) (kasha: rev aa067c6aa47b4c79577096817acc00ad6421180c)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2198 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2198/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha) (kasha: rev aa067c6aa47b4c79577096817acc00ad6421180c)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #240 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/240/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha) (kasha: rev aa067c6aa47b4c79577096817acc00ad6421180c)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #250 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/250/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha) (kasha: rev aa067c6aa47b4c79577096817acc00ad6421180c)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          varun_saxena Varun Saxena added a comment -

          Jian He / Karthik Kambatla, added an addendum patch.

          varun_saxena Varun Saxena added a comment -

          Should I reopen the issue?

          jianhe Jian He added a comment -

          Re-opened this and reverted the previous patch.
          Varun Saxena, could you upload a clean delta patch for this? Thanks!

          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #8157 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8157/)
          Revert "YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha)" (jianhe: rev 2466460d4cd13ad5837c044476b26e63082c1d37)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          varun_saxena Varun Saxena added a comment -

          Jian He, I will upload an updated patch soon.

          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote | Subsystem | Runtime | Comment
          0 | pre-patch | 16m 11s | Pre-patch trunk compilation is healthy.
          +1 | @author | 0m 0s | The patch does not contain any @author tags.
          +1 | tests included | 0m 0s | The patch appears to include 2 new or modified test files.
          +1 | javac | 7m 41s | There were no new javac warning messages.
          +1 | javadoc | 9m 30s | There were no new javadoc warning messages.
          +1 | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings.
          +1 | checkstyle | 0m 54s | There were no new checkstyle issues.
          +1 | whitespace | 0m 0s | The patch has no lines that end in whitespace.
          +1 | install | 1m 22s | mvn install still works.
          +1 | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse.
          +1 | findbugs | 1m 35s | The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 | yarn tests | 1m 56s | Tests passed in hadoop-yarn-common.
          | | 40m 8s |

          Subsystem | Report/Notes
          Patch URL | http://issues.apache.org/jira/secure/attachment/12745193/YARN-3878.09.patch
          Optional Tests | javadoc javac unit findbugs checkstyle
          git revision | trunk / a431ed9
          hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8528/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8528/testReport/
          Java | 1.7.0_55
          uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8528/console

          This message was automatically generated.

          varun_saxena Varun Saxena added a comment -

          Jian He, Karthik Kambatla, Devaraj K, kindly review.
          I have retained the drained flag now.

          I just added a check that resets the drained flag when an InterruptedException is caught, if the event queue is empty.
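
A minimal sketch of the idea in this comment, under stated assumptions. It is a simplified model, not the patch itself: the class name `DrainFlagSketch`, the `String` events, and the `IllegalStateException` wrapper are hypothetical and chosen only for illustration. Only the `drained` flag, the event queue, and the reset-on-interrupt idea come from the discussion, and `drained = eventQueue.isEmpty()` is one reading of "resetting the drained flag if the queue is empty".

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical, simplified model of the dispatcher's enqueue path. */
class DrainFlagSketch {
  private final BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();
  // True when there is nothing left for the event-handling thread to process.
  private volatile boolean drained = true;

  /** Enqueues an event; mirrors the shape of a handler that only enqueues. */
  void handle(String event) {
    // Mark the queue as not drained before the event is actually enqueued.
    drained = false;
    try {
      eventQueue.put(event);
    } catch (InterruptedException e) {
      // The put never happened. If nothing else is queued, restore the flag so
      // that a stop that waits for the flag does not wait forever.
      drained = eventQueue.isEmpty();
      Thread.currentThread().interrupt(); // preserve the interrupt status
      throw new IllegalStateException(e);
    }
  }

  boolean isDrained() {
    return drained;
  }

  int queueSize() {
    return eventQueue.size();
  }
}
```

Without the assignment in the `catch` block, an interrupted `put` would leave `drained` set to `false` with an empty queue, which matches the hang described in this issue: a stop that waits for `drained` would never see it become `true`.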

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #256 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/256/)
          Revert "YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha)" (jianhe: rev 2466460d4cd13ad5837c044476b26e63082c1d37)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #986 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/986/)
          Revert "YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha)" (jianhe: rev 2466460d4cd13ad5837c044476b26e63082c1d37)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2183 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2183/)
          Revert "YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha)" (jianhe: rev 2466460d4cd13ad5837c044476b26e63082c1d37)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #244 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/244/)
          Revert "YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha)" (jianhe: rev 2466460d4cd13ad5837c044476b26e63082c1d37)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2202 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2202/)
          Revert "YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha)" (jianhe: rev 2466460d4cd13ad5837c044476b26e63082c1d37)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #254 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/254/)
          Revert "YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. (Varun Saxena via kasha)" (jianhe: rev 2466460d4cd13ad5837c044476b26e63082c1d37)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          jianhe Jian He added a comment -

          The latest patch looks good to me; I will commit it if there are no comments from others.

          adhoot Anubhav Dhoot added a comment -

          LGTM. Unrelated to this patch, there is an existing tiny race between GenericEventHandler checking blockNewEvents and setting drained to false. If serviceStop runs between these two steps, it can set blockNewEvents while drained is still true, which can cause eventHandlingThread to finish before that one last event is added to the queue. I am not sure it's worth changing the product code for this, since shutdown cannot guarantee that all events will be processed.

          adhoot Anubhav Dhoot added a comment -

          Attaching a patch just to demonstrate the race. Since it's only meant to demonstrate the race, it injects an artificial delay, so I am not making it an official patch. Run the test testBlockNewEvents to show that an event can be in the queue while serviceStop happens.
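
The interleaving described in the two comments above can be sketched as follows. This is not the attached repro patch or the Hadoop source: `StopRaceSketch`, its latches, and `strandedEventsAfterStop` are hypothetical names, and the two `CountDownLatch` hooks are an assumption standing in for the artificial delay so that the unlucky ordering happens on every run. Only the `blockNewEvents` and `drained` flags and the check-then-set ordering come from the discussion.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical model that forces the check-then-set race between handle and stop. */
class StopRaceSketch {
  private final BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();
  private volatile boolean blockNewEvents = false;
  private volatile boolean drained = true;

  // Hooks that play the role of the artificial delay.
  private final CountDownLatch passedCheck = new CountDownLatch(1);
  private final CountDownLatch stopDecided = new CountDownLatch(1);

  void handle(String event) throws InterruptedException {
    if (blockNewEvents) {
      return; // stop already in progress, drop the event
    }
    passedCheck.countDown(); // the check has passed ...
    stopDecided.await();     // ... but the handler stalls before touching drained
    drained = false;
    eventQueue.put(event);
  }

  /** Returns what stop observed: true means "nothing left to drain". */
  boolean stop() throws InterruptedException {
    passedCheck.await();          // run only after the handler passed its check
    blockNewEvents = true;
    boolean sawDrained = drained; // still true, so stop would not wait
    stopDecided.countDown();
    return sawDrained;
  }

  /** Returns the number of events left in the queue after stop saw it as drained, or -1. */
  static int strandedEventsAfterStop() {
    final StopRaceSketch d = new StopRaceSketch();
    Thread producer = new Thread(() -> {
      try {
        d.handle("last-event");
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    producer.start();
    try {
      boolean sawDrained = d.stop();
      producer.join();
      return sawDrained ? d.eventQueue.size() : -1;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }
}
```

In this model, `StopRaceSketch.strandedEventsAfterStop()` returns 1: stop observes `drained == true` and moves on, and the one event is enqueued afterwards with nothing left to process it.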

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote | Subsystem | Runtime | Comment
          0 | patch | 0m 1s | The patch file was not named according to hadoop's naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute for instructions.
          -1 | pre-patch | 15m 1s | Findbugs (version ) appears to be broken on trunk.
          +1 | @author | 0m 0s | The patch does not contain any @author tags.
          +1 | tests included | 0m 0s | The patch appears to include 2 new or modified test files.
          -1 | javac | 7m 37s | The applied patch generated 1 additional warning messages.
          +1 | javadoc | 9m 42s | There were no new javadoc warning messages.
          +1 | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings.
          +1 | checkstyle | 0m 25s | There were no new checkstyle issues.
          +1 | whitespace | 0m 0s | The patch has no lines that end in whitespace.
          +1 | install | 1m 21s | mvn install still works.
          +1 | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse.
          +1 | findbugs | 1m 33s | The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          -1 | yarn tests | 1m 57s | Tests failed in hadoop-yarn-common.
          | | 38m 34s |

          Reason | Tests
          Failed unit tests | hadoop.yarn.event.TestAsyncDispatcher

          Subsystem | Report/Notes
          Patch URL | http://issues.apache.org/jira/secure/attachment/12745865/YARN-3878.09_reprorace.pat_h
          Optional Tests | javadoc javac unit findbugs checkstyle
          git revision | trunk / 419c51d
          javac | https://builds.apache.org/job/PreCommit-YARN-Build/8576/artifact/patchprocess/diffJavacWarnings.txt
          hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8576/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8576/testReport/
          Java | 1.7.0_55
          uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8576/console

          This message was automatically generated.

          jianhe Jian He added a comment -

          Anubhav, thanks for reviewing the patch. We cannot guarantee that shutdown will process all events: the main dispatcher may also have pending events that are not drained, and we are going to lose those events in any case. So, to keep it simple, I think it's OK not to handle this rare condition.

          varun_saxena Varun Saxena added a comment -

          Thanks Anubhav Dhoot for the review. I agree there will be a race.

          To fix this, we would need to introduce some sort of synchronization in the frequently called path of GenericEventHandler#handle.
          But as the race is very rare and impacts only one event, and other events may also go unhandled because of the stop, I think it would be fine to ignore this.
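
          As a rough illustration of the kind of synchronization meant here, the sketch below uses a hypothetical, simplified dispatcher (SketchDispatcher, handle, and stopAndDrain are illustrative names, not the YARN AsyncDispatcher API and not the committed patch). It assumes a single lock shared by the producer path and the stop path, so that no event can be enqueued after the drain has started.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical, simplified stand-in for a draining dispatcher.
// This is NOT the YARN AsyncDispatcher; all names are assumptions.
class SketchDispatcher {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Object lock = new Object();
    private boolean stopped = false;

    /** Producer path: returns false if the event is rejected because stop has begun. */
    boolean handle(String event) {
        synchronized (lock) { // the extra lock that every producer call would pay for
            if (stopped) {
                return false;
            }
            queue.add(event);
            return true;
        }
    }

    /** Stop path: closes the gate under the lock, then drains what was accepted. */
    int stopAndDrain() {
        synchronized (lock) {
            stopped = true;
        }
        int drained = 0;
        while (queue.poll() != null) {
            drained++;
        }
        return drained;
    }
}
```

          In this sketch every call to handle takes the lock, including during normal operation, which is the cost on the frequently called path that makes ignoring the rare race attractive.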

          adhoot Anubhav Dhoot added a comment -

          Agreed, this is OK to ignore.

          jianhe Jian He added a comment -

          Thanks! Committing this.

          jianhe Jian He added a comment -

          Committed to trunk, branch-2, and branch-2.7. Thanks, Varun!
          Thanks, Anubhav, for reviewing the patch!

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8197 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8197/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. Contributed by Varun Saxena (jianhe: rev 393fe71771e3ac6bc0efe59d9aaf19d3576411b3)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #994 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/994/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. Contributed by Varun Saxena (jianhe: rev 393fe71771e3ac6bc0efe59d9aaf19d3576411b3)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #264 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/264/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. Contributed by Varun Saxena (jianhe: rev 393fe71771e3ac6bc0efe59d9aaf19d3576411b3)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          varun_saxena Varun Saxena added a comment -

          Thanks Jian He for the commit, and several others for the review.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2191 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2191/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. Contributed by Varun Saxena (jianhe: rev 393fe71771e3ac6bc0efe59d9aaf19d3576411b3)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #253 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/253/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. Contributed by Varun Saxena (jianhe: rev 393fe71771e3ac6bc0efe59d9aaf19d3576411b3)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #261 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/261/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. Contributed by Varun Saxena (jianhe: rev 393fe71771e3ac6bc0efe59d9aaf19d3576411b3)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2210 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2210/)
          YARN-3878. AsyncDispatcher can hang while stopping if it is configured for draining events on stop. Contributed by Varun Saxena (jianhe: rev 393fe71771e3ac6bc0efe59d9aaf19d3576411b3)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/TestAsyncDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/event/DrainDispatcher.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
          • hadoop-yarn-project/CHANGES.txt
          sjlee0 Sangjin Lee added a comment -

          Does this issue exist in 2.6.x? Should this be backported to branch-2.6?

          varun_saxena Varun Saxena added a comment -

          Sangjin Lee, yes, this exists in branch-2.6 too.
          I will help with the rebase.

          varun_saxena Varun Saxena added a comment -

          Sangjin Lee, I have backported the changes and uploaded a branch-2.6 patch.

          sjlee0 Sangjin Lee added a comment -

          +1. Committed it to branch-2.6. Thanks Varun Saxena!


            People

            • Assignee:
              varun_saxena Varun Saxena
              Reporter:
              varun_saxena Varun Saxena
            • Votes:
              0
              Watchers:
              13


                Development