  1. Hadoop HDFS
  2. HDFS-16947

RBF NamenodeHeartbeatService to report error for not being able to register namenode in state store


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: rbf
    • Hadoop Flags: Reviewed

    Description

      The Namenode heartbeat service should log an error with the full stack trace when it cannot register a namenode in the State Store. As of today, it only logs a message at INFO level, without the underlying exception.

      For the ZooKeeper-based implementation, this failure can mean either that a) the Curator manager is not initialized, or b) the write to the znode failed after exhausting retries. In both cases an INFO log alone is not enough to diagnose the problem, and we have to look for the actual error elsewhere.

       

      Sample log output as of today:

      2023-02-20 23:10:33,714 DEBUG [NamenodeHeartbeatService {ns} nn0-0] router.NamenodeHeartbeatService - Received service state: ACTIVE from HA namenode: {ns}-nn0:nn-0-{ns}.{cluster}:9000
      2023-02-20 23:10:33,731 INFO  [NamenodeHeartbeatService {ns} nn0-0] impl.MembershipStoreImpl - Inserting new NN registration: nn-0.namenode.{cluster}:8888->{ns}:nn0:nn-0-{ns}.{cluster}:9000-ACTIVE
      2023-02-20 23:10:33,731 INFO  [NamenodeHeartbeatService {ns} nn0-0] router.NamenodeHeartbeatService - Cannot register namenode in the State Store
       

      If we logged the failure at ERROR level with the full stack trace, the same case would look like this:

      2023-02-21 00:20:24,691 ERROR [NamenodeHeartbeatService {ns} nn0-0] router.NamenodeHeartbeatService - Cannot register namenode in the State Store
      org.apache.hadoop.hdfs.server.federation.store.StateStoreUnavailableException: State Store driver StateStoreZooKeeperImpl in nn-0.namenode.{cluster} is not ready.
              at org.apache.hadoop.hdfs.server.federation.store.driver.StateStoreDriver.verifyDriverReady(StateStoreDriver.java:158)
              at org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.putAll(StateStoreZooKeeperImpl.java:235)
              at org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreBaseImpl.put(StateStoreBaseImpl.java:74)
              at org.apache.hadoop.hdfs.server.federation.store.impl.MembershipStoreImpl.namenodeHeartbeat(MembershipStoreImpl.java:179)
              at org.apache.hadoop.hdfs.server.federation.resolver.MembershipNamenodeResolver.registerNamenode(MembershipNamenodeResolver.java:381)
              at org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateState(NamenodeHeartbeatService.java:317)
              at org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.lambda$periodicInvoke$0(NamenodeHeartbeatService.java:244)
      ...
      ... 
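
      The difference between the two log outputs above can be illustrated with a minimal, self-contained sketch. This is not the actual patch: Hadoop logs through SLF4J, while this example uses java.util.logging as a stand-in so that it runs without extra dependencies. The Registrar interface, the updateStateBefore/updateStateAfter methods and the CapturingHandler class are hypothetical names introduced only for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class HeartbeatLoggingSketch {

  // Stand-in logger; the real service uses SLF4J (assumption for this sketch).
  private static final Logger LOG = Logger.getLogger("HeartbeatLoggingSketch");

  /** Hypothetical stand-in for the State Store registration call. */
  interface Registrar {
    boolean register(String namenodeDesc) throws IOException;
  }

  /** Behaviour described in the issue: message only, at INFO level. */
  static void updateStateBefore(Registrar registrar, String namenodeDesc) {
    try {
      registrar.register(namenodeDesc);
    } catch (IOException e) {
      // The exception is dropped, so the root cause never reaches the log.
      LOG.info("Cannot register namenode in the State Store");
    }
  }

  /** Proposed behaviour: error level, with the exception attached. */
  static void updateStateAfter(Registrar registrar, String namenodeDesc) {
    try {
      registrar.register(namenodeDesc);
    } catch (IOException e) {
      // Passing the Throwable lets the logging backend print the stack trace.
      LOG.log(Level.SEVERE, "Cannot register namenode in the State Store", e);
    }
  }

  /** Collects log records in memory so the two variants can be compared. */
  static final class CapturingHandler extends Handler {
    final List<LogRecord> records = new ArrayList<>();
    @Override public void publish(LogRecord record) { records.add(record); }
    @Override public void flush() { }
    @Override public void close() { }
  }

  /** Runs the action and returns the records it logged. */
  static List<LogRecord> capture(Runnable action) {
    CapturingHandler handler = new CapturingHandler();
    LOG.addHandler(handler);
    try {
      action.run();
    } finally {
      LOG.removeHandler(handler);
    }
    return handler.records;
  }

  public static void main(String[] args) {
    // Simulates a State Store driver that is not ready.
    Registrar failing = desc -> {
      throw new IOException("State Store driver is not ready.");
    };
    updateStateBefore(failing, "ns-nn0"); // message only
    updateStateAfter(failing, "ns-nn0");  // message plus stack trace
  }
}
```

      In the first variant the log record carries no Throwable, which matches the INFO line in the first sample. In the second variant the record carries the exception, so the backend can print the stack trace as in the second sample.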


            People

              Assignee: Viraj Jasani (vjasani)
              Reporter: Viraj Jasani (vjasani)
