  1. Apache Gobblin
  2. GOBBLIN-1382

Gobblin YARN sometimes fails to clean up old ZK node

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.15.0
    • Fix Version/s: 0.17.0
    • Component/s: gobblin-yarn
    • Labels: None

    Description

      The logs below are from a localhost setup. The failure is intermittent: when I stop Gobblin on YARN and then start it again, the launch sometimes fails because a znode from the previous run still exists.

       This shows the failed start:

      ==> logs/yarn.out <==
      2021-02-07 16:58:44 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Using ZooKeeper connection string: localhost:2181
      2021-02-07 16:58:45 PST WARN  [main] org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      2021-02-07 16:58:46 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Creating Helix cluster GobblinYarn-2 with overwrite: true
      2021-02-07 16:58:46 PST ERROR [main] org.apache.helix.manager.zk.ZKHelixAdmin  - Error creating cluster:GobblinYarn-2
      org.I0Itec.zkclient.exception.ZkNodeExistsException: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /GobblinYarn-2/CONTROLLER
          at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:55)
          at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1161)
          at org.apache.helix.manager.zk.zookeeper.ZkClient.create(ZkClient.java:535)
          at org.apache.helix.manager.zk.client.SharedZkClient.create(SharedZkClient.java:85)
          at org.apache.helix.manager.zk.zookeeper.ZkClient.createPersistent(ZkClient.java:362)
          at org.apache.helix.manager.zk.zookeeper.ZkClient.createPersistent(ZkClient.java:338)
          at org.apache.helix.manager.zk.zookeeper.ZkClient.createPersistent(ZkClient.java:317)
          at org.apache.helix.manager.zk.ZKHelixAdmin.createZKPaths(ZKHelixAdmin.java:750)
          at org.apache.helix.manager.zk.ZKHelixAdmin.addCluster(ZKHelixAdmin.java:715)
          at org.apache.helix.tools.ClusterSetup.addCluster(ClusterSetup.java:162)
          at org.apache.gobblin.cluster.HelixUtils.createGobblinHelixCluster(HelixUtils.java:91)
          at org.apache.gobblin.yarn.GobblinYarnAppLauncher.launch(GobblinYarnAppLauncher.java:346)
          at org.apache.gobblin.yarn.GobblinYarnAppLauncher.main(GobblinYarnAppLauncher.java:1120)
      Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /GobblinYarn-2/CONTROLLER
          at org.apache.zookeeper.KeeperException.create(KeeperException.java:122)
          at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
          at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:792)
          at org.apache.helix.manager.zk.zookeeper.ZkConnection.create(ZkConnection.java:114)
          at org.apache.helix.manager.zk.zookeeper.ZkClient$1.call(ZkClient.java:538)
          at org.apache.helix.manager.zk.zookeeper.ZkClient$1.call(ZkClient.java:535)
          at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1151)
          ... 11 more

      ==> logs/yarn.err <==
      Exception in thread "main" org.apache.helix.HelixException: cluster GobblinYarn-2 is not setup yet
          at org.apache.helix.manager.zk.ZKHelixAdmin.addStateModelDef(ZKHelixAdmin.java:989)
          at org.apache.helix.tools.ClusterSetup.addStateModelDef(ClusterSetup.java:361)
          at org.apache.helix.tools.ClusterSetup.addCluster(ClusterSetup.java:165)
          at org.apache.gobblin.cluster.HelixUtils.createGobblinHelixCluster(HelixUtils.java:91)
          at org.apache.gobblin.yarn.GobblinYarnAppLauncher.launch(GobblinYarnAppLauncher.java:346)
          at org.apache.gobblin.yarn.GobblinYarnAppLauncher.main(GobblinYarnAppLauncher.java:1120)

      ==> logs/yarn.out <==
      2021-02-07 16:58:46 PST INFO  [Thread-5] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Stopping the GobblinYarnAppLauncher
      2021-02-07 16:58:46 PST INFO  [Thread-5] org.apache.gobblin.util.ExecutorsUtils  - Attempting to shutdown ExecutorService: java.util.concurrent.Executors$DelegatedScheduledExecutorService@1a5b4edc
      2021-02-07 16:58:46 PST INFO  [Thread-5] org.apache.gobblin.util.ExecutorsUtils  - Successfully shutdown ExecutorService: java.util.concurrent.Executors$DelegatedScheduledExecutorService@1a5b4edc
      2021-02-07 16:58:46 PST INFO  [Thread-5] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Disabling all live Helix instances..

      ==> logs/yarn.err <==
      Exception in thread "Thread-5" org.apache.helix.HelixException: HelixManager (ZkClient) is not connected. Call HelixManager#connect()
          at org.apache.helix.manager.zk.ZKHelixManager.checkConnected(ZKHelixManager.java:363)
          at org.apache.helix.manager.zk.ZKHelixManager.getClusterManagmentTool(ZKHelixManager.java:908)
          at org.apache.gobblin.yarn.GobblinYarnAppLauncher.disableLiveHelixInstances(GobblinYarnAppLauncher.java:544)
          at org.apache.gobblin.yarn.GobblinYarnAppLauncher.stop(GobblinYarnAppLauncher.java:447)
          at org.apache.gobblin.yarn.GobblinYarnAppLauncher$2.run(GobblinYarnAppLauncher.java:1107)
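
The Thread-5 trace above shows the shutdown hook reaching disableLiveHelixInstances even though the launcher never connected to Helix, which produces a second, unrelated-looking exception. One possible way to keep the shutdown path quiet in that case is to check the connection before the disable step. The following is only a minimal sketch of that idea: `ClusterConnection` and `SafeShutdown` are hypothetical stand-ins written for illustration, not Gobblin or Helix classes. In real code, Helix's `HelixManager#isConnected()` could probably serve as the check.

```java
/** Hypothetical stand-in for the part of a cluster manager used during shutdown. */
interface ClusterConnection {
    boolean isConnected();
    void disableLiveInstances();
}

final class SafeShutdown {
    private SafeShutdown() {}

    /**
     * Runs the disable step only when the connection is usable.
     * Returns true if the step ran, false if it was skipped.
     */
    static boolean disableLiveInstancesIfConnected(ClusterConnection connection) {
        if (connection == null || !connection.isConnected()) {
            // The launch failed before connect(), so there is nothing to disable.
            return false;
        }
        connection.disableLiveInstances();
        return true;
    }

    public static void main(String[] args) {
        // Simulates the state in the failed start: the manager never connected.
        ClusterConnection neverConnected = new ClusterConnection() {
            @Override public boolean isConnected() { return false; }
            @Override public void disableLiveInstances() {
                throw new IllegalStateException("not connected");
            }
        };
        System.out.println("disable step ran: "
                + disableLiveInstancesIfConnected(neverConnected));
    }
}
```

This does not address the leftover znode itself; it only keeps the shutdown hook from masking the original error with a second stack trace.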
      

       

      This shows the next run, which succeeds:

      ==> logs/yarn.out <==
      2021-02-07 17:02:30 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Using ZooKeeper connection string: localhost:2181
      2021-02-07 17:02:31 PST WARN  [main] org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      2021-02-07 17:02:32 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Creating Helix cluster GobblinYarn-2 with overwrite: true
      2021-02-07 17:02:32 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Created Helix cluster GobblinYarn-2
      2021-02-07 17:02:32 PST INFO  [main] org.apache.hadoop.yarn.client.RMProxy  - Connecting to ResourceManager at /0.0.0.0:8032
      2021-02-07 17:02:33 PST WARN  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Found 0 live instances in the cluster.
      2021-02-07 17:02:33 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - No reconnectable application found so submitting a new application
      2021-02-07 17:02:33 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - creating new yarn application
      2021-02-07 17:02:33 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - created new yarn application: 3
      2021-02-07 17:02:33 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Configured GobblinApplicationMaster work directory to: hdfs://localhost:8020/tmp/gobblin-yarn/GobblinYarn-2/application_1612697249959_0003/appmaster
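
The second start, a few minutes later and with the same settings, succeeded. That suggests, but does not prove, that the leftover znode is transient. Under that assumption, a bounded retry around cluster creation might work as a stopgap. The sketch below is illustrative only: `RetryingClusterSetup`, `ClusterCreator`, and the attempt and backoff parameters are hypothetical names and values, not Gobblin or Helix APIs, and the root cause of the stale znode is not established in this report.

```java
final class RetryingClusterSetup {
    private RetryingClusterSetup() {}

    /** Hypothetical stand-in for the step that creates the Helix cluster. */
    interface ClusterCreator {
        void create(String clusterName);
    }

    /**
     * Calls the creator until it succeeds or maxAttempts is reached.
     * Returns the number of attempts used; rethrows the last failure otherwise.
     */
    static int createWithRetry(ClusterCreator creator, String clusterName,
                               int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                creator.create(clusterName);
                return attempt;
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Linear backoff gives a leftover znode time to disappear.
                    pause(backoffMillis * attempt);
                }
            }
        }
        throw last;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails once, then succeeds, mimicking the two runs in the logs above.
        ClusterCreator flaky = name -> {
            if (++calls[0] == 1) {
                throw new IllegalStateException("cluster " + name + " is not setup yet");
            }
        };
        int attempts = createWithRetry(flaky, "GobblinYarn-2", 3, 10L);
        System.out.println("created after " + attempts + " attempt(s)");
    }
}
```

A retry only hides the symptom. If the znode turns out to be persistent rather than transient, the retries would all fail and the stale node would still need to be removed explicitly.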
      

       

          People

            abti Abhishek Tiwari
            jaysen Jay Sen