Hadoop HDFS
  HDFS-5529

{ Disk Fail } Can we shut down the DN when it meets a disk failure condition?

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Scenario :
      ========

      Two directories were configured for the DataNode.
      One directory does not have the required permissions, so the DataNode throws the following exception and then hits an NPE while sending the heartbeat.

      2013-11-19 17:35:26,599 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-994471486-10.18.40.21-1384754500555 (storage id DS-1184111760-10.18.40.38-50010-1384862726499) service to HOST-10-18-91-26/10.18.40.21:8020
      org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 1, volumes configured: 2, volumes failed: 1, volume failures tolerated: 0
              at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.<init>(FsDatasetImpl.java:202)
              at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:34)
              at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:30)
              at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:966)
              at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:928)
              at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:285)
              at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:222)
              at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
              at java.lang.Thread.run(Thread.java:662)
      2013-11-19 17:35:26,602 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-994471486-10.18.40.21-1384754500555 (storage id DS-1184111760-10.18.40.38-50010-1384862726499) service to HOST-10-18-91-26/10.18.40.21:8020
      2013-11-19 17:35:26,602 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-994471486-10.18.40.21-1384754500555 (storage id DS-1184111760-10.18.40.38-50010-1384862726499) service to linux-hadoop/10.18.40.14:8020 beginning handshake with NN
      2013-11-19 17:35:26,648 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-994471486-10.18.40.21-1384754500555 (storage id DS-1184111760-10.18.40.38-50010-1384862726499) service to linux-hadoop/10.18.40.14:8020 successfully registered with NN
      2013-11-19 17:35:26,648 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: For namenode linux-hadoop/10.18.40.14:8020 using DELETEREPORT_INTERVAL of 300000 msec  BLOCKREPORT_INTERVAL of 21600000msec Initial delay: 0msec; heartBeatInterval=3000
      2013-11-19 17:35:26,649 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService for Block pool BP-994471486-10.18.40.21-1384754500555 (storage id DS-1184111760-10.18.40.38-50010-1384862726499) service to linux-hadoop/10.18.40.14:8020
      java.lang.NullPointerException
              at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:439)
              at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:525)
              at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:676)
              at java.lang.Thread.run(Thread.java:662)
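
The first exception in the log is the result of simple arithmetic: 2 volumes configured, 1 valid, so 1 failed, which exceeds the 0 failures tolerated. The following is a minimal sketch of that check, not the actual FsDatasetImpl code. The class name VolumeToleranceSketch and the method checkVolumes are hypothetical, and a plain IOException stands in for DiskChecker.DiskErrorException.

```java
import java.io.IOException;

// Hypothetical illustration only; this is not Hadoop source code.
final class VolumeToleranceSketch {

    /**
     * Throws if more volumes have failed than the configured tolerance allows.
     *
     * @param configured number of data directories configured for the DataNode
     * @param valid      number of those directories that passed the disk check
     * @param tolerated  number of volume failures tolerated before giving up
     */
    static void checkVolumes(int configured, int valid, int tolerated) throws IOException {
        int failed = configured - valid;
        if (failed > tolerated) {
            // Message shape copied from the log excerpt in this report.
            throw new IOException("Too many failed volumes - current valid volumes: " + valid
                    + ", volumes configured: " + configured
                    + ", volumes failed: " + failed
                    + ", volume failures tolerated: " + tolerated);
        }
    }

    public static void main(String[] args) {
        try {
            // Values taken from the reported scenario: 2 configured, 1 valid, 0 tolerated.
            checkVolumes(2, 1, 0);
            System.out.println("Volumes OK");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a tolerance of 1 instead of 0, the same call would pass, because one failed volume would no longer exceed the limit.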
      

        Issue Links

          Activity

          Brahma Reddy Battula made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Duplicate [ 3 ]
          Brahma Reddy Battula added a comment -

          Closing, since this will be handled as part of HDFS-2882.

          Vinayakumar B made changes -
          Field Original Value New Value
          Link This issue is duplicated by HDFS-2882 [ HDFS-2882 ]
          Vinayakumar B added a comment -

          This issue is the same as HDFS-2882. It is another reason for block pool initialization to fail, as mentioned in HDFS-2882.
          The latest patch attached to HDFS-2882 can make the DataNode shut down in this case.

          Brahma Reddy Battula created issue -

            People

            • Assignee:
              Unassigned
              Reporter:
              Brahma Reddy Battula
            • Votes:
              0
              Watchers:
              2

              Dates

              • Created:
                Updated:
                Resolved:
