Uploaded image for project: 'Hadoop YARN'
  1. Hadoop YARN
  2. YARN-3242

Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client

    Details

    • Hadoop Flags:
      Reviewed

      Description

      Old ZK client session watcher event messed up new ZK client session due to ZooKeeper asynchronously closing client session.
      The watcher event from old ZK client session can still be sent to ZKRMStateStore after the old ZK client session is closed.
      This will cause seriously problem:ZKRMStateStore out of sync with ZooKeeper session.
      We only have one ZKRMStateStore but we can have multiple ZK client sessions.
      Currently ZKRMStateStore#processWatchEvent doesn't check whether this watcher event is from current session. So the watcher event from old ZK client session which just is closed will still be processed.
      For example, If a Disconnected event received from old session after new session is connected, the zkClient will be set to null

              case Disconnected:
                LOG.info("ZKRMStateStore Session disconnected");
                oldZkClient = zkClient;
                zkClient = null;
                break;
      

      Then ZKRMStateStore won't receive SyncConnected event from new session because new session is already in SyncConnected state and it won't send SyncConnected event until it is disconnected and connected again.
      Then we will see all the ZKRMStateStore operations fail with IOException "Wait for ZKClient creation timed out" until RM shutdown.

      The following code from zookeeper(ClientCnxn#EventThread) show even after receive eventOfDeath, EventThread will still process all the events until waitingEvents queue is empty.

                    while (true) {
                       Object event = waitingEvents.take();
                       if (event == eventOfDeath) {
                          wasKilled = true;
                       } else {
                          processEvent(event);
                       }
                       if (wasKilled)
                          synchronized (waitingEvents) {
                             if (waitingEvents.isEmpty()) {
                                isRunning = false;
                                break;
                             }
                          }
                    }
      
            private void processEvent(Object event) {
                try {
                    if (event instanceof WatcherSetEventPair) {
                        // each watcher will process the event
                        WatcherSetEventPair pair = (WatcherSetEventPair) event;
                        for (Watcher watcher : pair.watchers) {
                            try {
                                watcher.process(pair.event);
                            } catch (Throwable t) {
                                LOG.error("Error while calling watcher ", t);
                            }
                        }
                    } else {
      
          public void disconnect() {
              if (LOG.isDebugEnabled()) {
                  LOG.debug("Disconnecting client for session: 0x"
                            + Long.toHexString(getSessionId()));
              }
      
              sendThread.close();
              eventThread.queueEventOfDeath();
          }
      
          public void close() throws IOException {
              if (LOG.isDebugEnabled()) {
                  LOG.debug("Closing client for session: 0x"
                            + Long.toHexString(getSessionId()));
              }
      
              try {
                  RequestHeader h = new RequestHeader();
                  h.setType(ZooDefs.OpCode.closeSession);
      
                  submitRequest(h, null, null, null);
              } catch (InterruptedException e) {
                  // ignore, close the send/event threads
              } finally {
                  disconnect();
              }
          }
      
      1. YARN-3242.000.patch
        5 kB
        zhihai xu
      2. YARN-3242.001.patch
        7 kB
        zhihai xu
      3. YARN-3242.002.patch
        9 kB
        zhihai xu
      4. YARN-3242.003.patch
        10 kB
        zhihai xu
      5. YARN-3242.004.patch
        10 kB
        zhihai xu

        Activity

        Hide
        zxu zhihai xu added a comment -

        The following ZooKeeper client logs in RM show this error:

        // old session closed
        2015-02-16 06:01:12,985 INFO org.apache.zookeeper.ZooKeeper: Session: 0x24b8df4044005d4 closed
        
        
        // new session created and connected
        2015-02-16 06:01:12,991 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete sessionid = 0x24b8df4044005d8, negotiated timeout = 10000
        2015-02-16 06:01:12,994 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Watcher event type: None with state:SyncConnected for path:null for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
        
        
        // old session disconnected and EventThread shutdown
        2015-02-16 06:01:12,995 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Watcher event type: None with state:Disconnected for path:null for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
        2015-02-16 06:01:12,995 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
        
        // Error: Wait for ZKClient creation timed out and RM shutdown
        2015-02-16 06:01:13,095 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1424095053378_0010
        2015-02-16 06:01:33,100 ERROR org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error storing app: application_1424095053378_0010
        java.io.IOException: Wait for ZKClient creation timed out
        2015-02-16 06:01:33,107 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
        

        The following ZooKeeper server logs show the new session 0x24b8df4044005d8 connected until RM shutdown at 2015-02-16 06:01:33.

        2015-02-16 06:01:12,991 INFO org.apache.zookeeper.server.ZooKeeperServer: Established session 0x24b8df4044005d8 with negotiated timeout 10000 for client 
        
        2015-02-16 06:01:33,886 WARN org.apache.zookeeper.server.NIOServerCnxn: caught end of stream exception
        EndOfStreamException: Unable to read additional data from client sessionid 0x24b8df4044005d8, likely client has closed socket
        	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
        	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        	at java.lang.Thread.run(Thread.java:744)
        2015-02-16 06:01:33,888 INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client which had sessionid 0x24b8df4044005d8
        
        Show
        zxu zhihai xu added a comment - The following ZooKeeper client logs in RM show this error: // old session closed 2015-02-16 06:01:12,985 INFO org.apache.zookeeper.ZooKeeper: Session: 0x24b8df4044005d4 closed // new session created and connected 2015-02-16 06:01:12,991 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete sessionid = 0x24b8df4044005d8, negotiated timeout = 10000 2015-02-16 06:01:12,994 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Watcher event type: None with state:SyncConnected for path: null for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED // old session disconnected and EventThread shutdown 2015-02-16 06:01:12,995 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Watcher event type: None with state:Disconnected for path: null for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED 2015-02-16 06:01:12,995 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down // Error: Wait for ZKClient creation timed out and RM shutdown 2015-02-16 06:01:13,095 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1424095053378_0010 2015-02-16 06:01:33,100 ERROR org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error storing app: application_1424095053378_0010 java.io.IOException: Wait for ZKClient creation timed out 2015-02-16 06:01:33,107 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1 The following ZooKeeper server logs show the new session 0x24b8df4044005d8 connected until RM shutdown at 2015-02-16 06:01:33. 2015-02-16 06:01:12,991 INFO org.apache.zookeeper.server.ZooKeeperServer: Established session 0x24b8df4044005d8 with negotiated timeout 10000 for client 2015-02-16 06:01:33,886 WARN org.apache.zookeeper.server.NIOServerCnxn: caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x24b8df4044005d8, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang. Thread .run( Thread .java:744) 2015-02-16 06:01:33,888 INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client which had sessionid 0x24b8df4044005d8
        Hide
        zxu zhihai xu added a comment -

        I uploaded a draft patch which will only process watcher event from current ZooKeeper Client session.

        Show
        zxu zhihai xu added a comment - I uploaded a draft patch which will only process watcher event from current ZooKeeper Client session.
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12700076/YARN-3242.000.patch
        against trunk revision fe7a302.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

        org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6690//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6690//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6690//console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700076/YARN-3242.000.patch against trunk revision fe7a302. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. -1 findbugs . The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6690//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6690//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6690//console This message is automatically generated.
        Hide
        zxu zhihai xu added a comment -

        I find out the oldZkClient is not useful any more, the added activeZkClient can replace it.
        uploaded a new patch YARN-3242.001.patch which remove oldZkClient.

        Show
        zxu zhihai xu added a comment - I find out the oldZkClient is not useful any more, the added activeZkClient can replace it. uploaded a new patch YARN-3242 .001.patch which remove oldZkClient.
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12700085/YARN-3242.001.patch
        against trunk revision fe7a302.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6691//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6691//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6691//console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700085/YARN-3242.001.patch against trunk revision fe7a302. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. -1 findbugs . The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6691//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6691//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6691//console This message is automatically generated.
        Hide
        zxu zhihai xu added a comment -

        I uploaded a new patch YARN-3242.002.patch which add a test case: old client session Disconnected event won't affect the current client session.

        Show
        zxu zhihai xu added a comment - I uploaded a new patch YARN-3242 .002.patch which add a test case: old client session Disconnected event won't affect the current client session.
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12700107/YARN-3242.002.patch
        against trunk revision fe7a302.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6693//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6693//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6693//console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700107/YARN-3242.002.patch against trunk revision fe7a302. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 2 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. -1 findbugs . The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6693//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6693//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6693//console This message is automatically generated.
        Hide
        zxu zhihai xu added a comment -

        I checked the findbugs warning message, all these findbugs are related to my change.

        Show
        zxu zhihai xu added a comment - I checked the findbugs warning message, all these findbugs are related to my change.
        Hide
        zxu zhihai xu added a comment -

        I uploaded a new patch YARN-3242.003.patch which add more test cases to send watcher event to both previous and current client session.

        Show
        zxu zhihai xu added a comment - I uploaded a new patch YARN-3242 .003.patch which add more test cases to send watcher event to both previous and current client session.
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12700122/YARN-3242.003.patch
        against trunk revision fe7a302.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6694//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6694//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6694//console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700122/YARN-3242.003.patch against trunk revision fe7a302. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 2 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. -1 findbugs . The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6694//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6694//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6694//console This message is automatically generated.
        Hide
        adhoot Anubhav Dhoot added a comment -

        zhihai xu patch looks good overall.
        Instead of blindly switching in zkClient on a connect and removing it on a disconnect, we verify is activeZkClient is the one receiving the event
        Makes sense then that we get rid of oldZkClient logic and just have one zk client activeZkCLient that can get events, and on connection event is activated for use as zkClient to actually do processing.

        Verified that the updated unit test fails if i remove the check if (zk != activeZkClient) {

        The only minor nits
        a) is if we could add comments that activeZkClient is not used to do actual processing (thats still zkClient) but only to process watched events and on connection event it gets activated into zkClient.
        b) Also will CountdownWatcher#setWatchedClient be ever more than once? If not rename it to initializeWatchedClient and let it throw if client is already not null.

        LGTM otherwise

        Show
        adhoot Anubhav Dhoot added a comment - zhihai xu patch looks good overall. Instead of blindly switching in zkClient on a connect and removing it on a disconnect, we verify is activeZkClient is the one receiving the event Makes sense then that we get rid of oldZkClient logic and just have one zk client activeZkCLient that can get events, and on connection event is activated for use as zkClient to actually do processing. Verified that the updated unit test fails if i remove the check if (zk != activeZkClient) { The only minor nits a) is if we could add comments that activeZkClient is not used to do actual processing (thats still zkClient) but only to process watched events and on connection event it gets activated into zkClient. b) Also will CountdownWatcher#setWatchedClient be ever more than once? If not rename it to initializeWatchedClient and let it throw if client is already not null. LGTM otherwise
        Hide
        zxu zhihai xu added a comment -

        Anubhav Dhoot, thanks for the review, both suggestions are good to me. I uploaded a new patch YARN-3242.004.patch, which addressed both comments. Please review it.
        thanks zhihai

        Show
        zxu zhihai xu added a comment - Anubhav Dhoot , thanks for the review, both suggestions are good to me. I uploaded a new patch YARN-3242 .004.patch, which addressed both comments. Please review it. thanks zhihai
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12702245/YARN-3242.004.patch
        against trunk revision e17e5ba.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        -1 javac. The applied patch generated 1236 javac compiler warnings (more than the trunk's current 1199 warnings).

        -1 javadoc. The javadoc tool appears to have generated 34 warning messages.
        See https://builds.apache.org/job/PreCommit-YARN-Build/6824//artifact/patchprocess/diffJavadocWarnings.txt for details.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws:

        org.apache.hadoop.ha.TestZKFailoverController
        org.apache.hadoop.ha.TestZKFailoverControllerStress

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6824//testReport/
        Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6824//artifact/patchprocess/diffJavacWarnings.txt
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6824//console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12702245/YARN-3242.004.patch against trunk revision e17e5ba. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. -1 javac . The applied patch generated 1236 javac compiler warnings (more than the trunk's current 1199 warnings). -1 javadoc . The javadoc tool appears to have generated 34 warning messages. See https://builds.apache.org/job/PreCommit-YARN-Build/6824//artifact/patchprocess/diffJavadocWarnings.txt for details. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws: org.apache.hadoop.ha.TestZKFailoverController org.apache.hadoop.ha.TestZKFailoverControllerStress Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6824//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6824//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6824//console This message is automatically generated.
        Hide
        adhoot Anubhav Dhoot added a comment -

        LGTM

        Show
        adhoot Anubhav Dhoot added a comment - LGTM
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12702269/YARN-3242.004.patch
        against trunk revision e17e5ba.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 7 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

        org.apache.hadoop.yarn.server.resourcemanager.security.TestAMRMTokens
        org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStoreZKClientConnections

        The following test timeouts occurred in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

        org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6827//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6827//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6827//console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12702269/YARN-3242.004.patch against trunk revision e17e5ba. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 2 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. -1 findbugs . The patch appears to introduce 7 new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.security.TestAMRMTokens org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStoreZKClientConnections The following test timeouts occurred in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6827//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6827//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6827//console This message is automatically generated.
        Hide
        rohithsharma Rohith Sharma K S added a comment -

        zhihai xu thanks for reporting the issue .
        I have one dobut ,

        The watcher event from old ZK client session can still be sent to ZKRMStateStore after the old ZK client session is closed.

        This should be bug in Zookeeper right?

        Show
        rohithsharma Rohith Sharma K S added a comment - zhihai xu thanks for reporting the issue . I have one dobut , The watcher event from old ZK client session can still be sent to ZKRMStateStore after the old ZK client session is closed. This should be bug in Zookeeper right?
        Hide
        zxu zhihai xu added a comment -

        Hi Rohith,
        We can't say it is a bug at Zookeeper. It is just the way ZooKeeper is implemented. This implementation need the client side to maintain each client session separately even after ZooKeeper close is called. In this way, you can close the ZooKeeper quickly without waiting for all these events processed.
        The ZooKeeper implementation can be improved definitely. Even suppose no more watched events after ZooKeeper close, there may be still some race condition to solve if process watched event is called at the same time as ZooKeeper close.
        I also talked with some zookeeper guy, he said the ZooKeeper code base is pretty stable, there may not be any big change like this one.
        thanks
        zhihai

        Show
        zxu zhihai xu added a comment - Hi Rohith, We can't say it is a bug at Zookeeper. It is just the way ZooKeeper is implemented. This implementation need the client side to maintain each client session separately even after ZooKeeper close is called. In this way, you can close the ZooKeeper quickly without waiting for all these events processed. The ZooKeeper implementation can be improved definitely. Even suppose no more watched events after ZooKeeper close, there may be still some race condition to solve if process watched event is called at the same time as ZooKeeper close. I also talked with some zookeeper guy, he said the ZooKeeper code base is pretty stable, there may not be any big change like this one. thanks zhihai
        Hide
        rohithsharma Rohith Sharma K S added a comment -

        Thanks for detailed explanation. I was able to reproduce this issue frequently on low end machine. I deployed the patch in cluster and verified. It is working fine & RM is able to continue without shutdown. From the below log, I see that event Disconnected is from old session.

        2015-03-04 11:42:06,445 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Watcher event type: None with state:SyncConnected for path:null for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
        2015-03-04 11:42:06,445 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: ZKRMStateStore Session connected
        2015-03-04 11:42:06,445 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Ignore watcher event type: None with state:Disconnected for path:null from old session
        2015-03-04 11:42:06,460 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
        2015-03-04 11:42:06,460 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced...
        2015-03-04 11:42:06,987 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old node exists: 0a0c7961726e2d636c75737465721203726d31
        
        Show
        rohithsharma Rohith Sharma K S added a comment - Thanks for detailed explanation. I was able to reproduce this issue frequently on low end machine. I deployed the patch in cluster and verified. It is working fine & RM is able to continue without shutdown. From the below log, I see that event Disconnected is from old session. 2015-03-04 11:42:06,445 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Watcher event type: None with state:SyncConnected for path:null for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED 2015-03-04 11:42:06,445 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: ZKRMStateStore Session connected 2015-03-04 11:42:06,445 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Ignore watcher event type: None with state:Disconnected for path:null from old session 2015-03-04 11:42:06,460 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down 2015-03-04 11:42:06,460 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced... 2015-03-04 11:42:06,987 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old node exists: 0a0c7961726e2d636c75737465721203726d31
        Hide
        rohithsharma Rohith Sharma K S added a comment -

        Are the test failures are related?

        Show
        rohithsharma Rohith Sharma K S added a comment - Are the test failures are related?
        Hide
        zxu zhihai xu added a comment -

        All these test failures are not related to my changes, which is due to java.net.BindException.
        I verified all these tests are passed in my local build.

        Show
        zxu zhihai xu added a comment - All these test failures are not related to my changes, which is due to java.net.BindException. I verified all these tests are passed in my local build.
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12702592/YARN-3242.004.patch
        against trunk revision ed70fa1.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 7 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The following test timeouts occurred in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

        org.apache.hadoop.http.TestHttpServerLifecycle

        The test build failed in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6844//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6844//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6844//console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12702592/YARN-3242.004.patch against trunk revision ed70fa1. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 2 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. -1 findbugs . The patch appears to introduce 7 new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The following test timeouts occurred in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.http.TestHttpServerLifecycle The test build failed in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6844//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6844//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6844//console This message is automatically generated.
        Hide
        zxu zhihai xu added a comment -

        Hi Rohith, thanks for the review and verifying the patch.
        I restarted the test, TestHttpServerLifecycle failure is not related to my change. it passed in my local latest test.

        -------------------------------------------------------
         T E S T S
        -------------------------------------------------------
        Running org.apache.hadoop.http.TestHttpServerLifecycle
        Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.24 sec - in org.apache.hadoop.http.TestHttpServerLifecycle
        Results :
        Tests run: 7, Failures: 0, Errors: 0, Skipped: 0
        

        Also findbugs warning is not related to my change.
        many thanks
        zhihai

        Show
        zxu zhihai xu added a comment - Hi Rohith, thanks for the review and verifying the patch. I restarted the test, TestHttpServerLifecycle failure is not related to my change. it passed in my local latest test. ------------------------------------------------------- T E S T S ------------------------------------------------------- Running org.apache.hadoop.http.TestHttpServerLifecycle Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.24 sec - in org.apache.hadoop.http.TestHttpServerLifecycle Results : Tests run: 7, Failures: 0, Errors: 0, Skipped: 0 Also findbugs warning is not related to my change. many thanks zhihai
        Hide
        kasha Karthik Kambatla added a comment -

        The patch looks good to me. The findbugs warnings look unrelated.

        +1. Checking this in.

        Note: Moving to Curator will likely take care of it, but that effort has hit some road bumps and is taking longer than anticipated.

        Show
        kasha Karthik Kambatla added a comment - The patch looks good to me. The findbugs warnings look unrelated. +1. Checking this in. Note: Moving to Curator will likely take care of it, but that effort has hit some road bumps and is taking longer than anticipated.
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #7263 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7263/)
        YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java
        • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
        • hadoop-yarn-project/CHANGES.txt
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-trunk-Commit #7263 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7263/ ) YARN-3242 . Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java hadoop-yarn-project/CHANGES.txt
        Hide
        kasha Karthik Kambatla added a comment -

        Thanks for working on this, zhihai xu. Nice debugging and I like the test too.

        Thanks for the reviews, Anubhav Dhoot.

        Just committed this to trunk and branch-2.

        Show
        kasha Karthik Kambatla added a comment - Thanks for working on this, zhihai xu . Nice debugging and I like the test too. Thanks for the reviews, Anubhav Dhoot . Just committed this to trunk and branch-2.
        Hide
        zxu zhihai xu added a comment -

        thanks Anubhav Dhoot, Rohith for the review and verification.
        Karthik Kambatla, many thanks for the commit and review.

        Show
        zxu zhihai xu added a comment - thanks Anubhav Dhoot , Rohith for the review and verification. Karthik Kambatla , many thanks for the commit and review.
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #123 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/123/)
        YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #123 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/123/ ) YARN-3242 . Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java hadoop-yarn-project/CHANGES.txt hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk #857 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/857/)
        YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk #857 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/857/ ) YARN-3242 . Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f) hadoop-yarn-project/CHANGES.txt hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk #2055 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2055/)
        YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
        • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java
        • hadoop-yarn-project/CHANGES.txt
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk #2055 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2055/ ) YARN-3242 . Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java hadoop-yarn-project/CHANGES.txt
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #114 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/114/)
        YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f)

        • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #114 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/114/ ) YARN-3242 . Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java hadoop-yarn-project/CHANGES.txt hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #123 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/123/)
        YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #123 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/123/ ) YARN-3242 . Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java hadoop-yarn-project/CHANGES.txt hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk #2073 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2073/)
        YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java
        • hadoop-yarn-project/CHANGES.txt
        • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk #2073 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2073/ ) YARN-3242 . Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha) (kasha: rev 8d88691d162f87f95c9ed7e0a569ef08e8385d4f) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java hadoop-yarn-project/CHANGES.txt hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
        Hide
        vinodkv Vinod Kumar Vavilapalli added a comment -

        Pulled this into 2.6.1. Ran compilation and TestZKRMStateStoreZKClientConnections before the push. Patch applied cleanly.

        Show
        vinodkv Vinod Kumar Vavilapalli added a comment - Pulled this into 2.6.1. Ran compilation and TestZKRMStateStoreZKClientConnections before the push. Patch applied cleanly.

          People

          • Assignee:
            zxu zhihai xu
            Reporter:
            zxu zhihai xu
          • Votes:
            0 Vote for this issue
            Watchers:
            12 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development