AMBARI-22257

Metrics collector fails to stop after Datanode is stopped in distributed mode

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.6.0
    • Component/s: ambari-metrics
    • Labels: None

    Description

      Stopping the AMS collector failed because the ams-hbase RegionServer stop timed out. The log contains many exceptions caused by DataNode (DN) connection failures during the stop. The root cause is that the DNs were stopped before the collector, so in distributed mode the RegionServer could no longer write to or close its files on HDFS.

      2017-10-17 14:29:10,689 ERROR [Thread-274] hdfs.DFSClient: Failed to close inode 17762
      org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/ams/hbase/WALs/ctr-e134-1499953498516-230429-01-000007.hwx.site,61320,1508248489809/ctr-e134-1499953498516-230429-01-000007.hwx.site%2C61320%2C1508248489809.default.1508250548392 could only be replicated to 0 nodes instead of minReplication (=1).  There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
              at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1719)
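
The ordering problem described above can be illustrated with a minimal sketch. It is not the content of the attached patch: the `STOP_AFTER` mapping and the `stop_order` helper are hypothetical names chosen for illustration, and the only assumption taken from the report is that the collector must be stopped before the DataNodes.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical constraint table (an assumption, not Ambari's actual config):
# each key may be stopped only after every component in its set has stopped.
STOP_AFTER = {
    "DATANODE": {"METRICS_COLLECTOR"},
}


def stop_order(constraints):
    """Return a stop sequence that respects the given constraints."""
    # TopologicalSorter treats each mapped set as predecessors, so components
    # that must stop first appear earlier in the resulting order.
    return list(TopologicalSorter(constraints).static_order())


order = stop_order(STOP_AFTER)
print(order)
```

With this constraint in place the collector is always scheduled ahead of the DataNodes, so its embedded HBase can still reach HDFS while it shuts down.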
      

      Attachments

        1. AMBARI-22257.patch
          0.8 kB
          Siddharth Wagle

          People

            Assignee: Siddharth Wagle (swagle)
            Reporter: Siddharth Wagle (swagle)
            Votes: 0
            Watchers: 3
