Hadoop HDFS / HDFS-6753

Initialize checkDisk when DirectoryScanner is not able to get the file list for scanning


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.7.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Env Details :
      =============
      The cluster has 3 DataNodes
      The cluster is installed with the "Rex" user
      dfs.datanode.failed.volumes.tolerated = 3
      dfs.blockreport.intervalMsec = 18000
      dfs.datanode.directoryscan.interval = 120
      DN_XX1.XX1.XX1.XX1 data dir = /mnt/tmp_Datanode,/home/REX/data/dfs1/data,/home/REX/data/dfs2/data,/opt/REX/dfs/data

      Permission is denied on /home/REX/data/dfs1/data, /home/REX/data/dfs2/data and /opt/REX/dfs/data (hence the DN considers these volumes failed)

      Expected behavior is observed when the disk is not full:
      ========================================================

      Step 1: Change the permissions of /mnt/tmp_Datanode to root

      Step 2: Perform write operations (the DN detects that all configured volumes have failed and shuts down)
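      The expected behavior above can be sketched as a simple threshold check. This is a simplified model and not Hadoop's actual code: the class name VolumeFailureCheck and the method shouldShutDown are hypothetical, and the sketch assumes the DN shuts down once the number of failed volumes exceeds dfs.datanode.failed.volumes.tolerated.

```java
// Simplified model of the volume-failure tolerance check (hypothetical names,
// not taken from the Hadoop source).
final class VolumeFailureCheck {

    /** Returns true when more volumes have failed than the tolerance allows. */
    static boolean shouldShutDown(int failedVolumes, int toleratedFailures) {
        return failedVolumes > toleratedFailures;
    }

    public static void main(String[] args) {
        int tolerated = 3; // dfs.datanode.failed.volumes.tolerated from the report

        // Three permission-denied volumes, /mnt/tmp_Datanode still usable.
        System.out.println(shouldShutDown(3, tolerated)); // prints "false"

        // All four configured volumes failed.
        System.out.println(shouldShutDown(4, tolerated)); // prints "true"
    }
}
```

      With four configured data directories and a tolerance of 3, the DN in this model keeps running until the last volume is also detected as failed, which matches the expected behavior described above.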

      Scenario 1:
      ===========

      Step 1: Fill the /mnt/tmp_Datanode disk and change its permissions to root
      Step 2: Perform client write operations (a disk-full exception is thrown, but the DataNode does not shut down, even though all configured volumes have failed)

       
      2014-07-21 14:10:52,814 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: XX1.XX1.XX1.XX1:50010:DataXceiver error processing WRITE_BLOCK operation  src: /XX2.XX2.XX2.XX2:10106 dst: /XX1.XX1.XX1.XX1:50010
      org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Out of space: The volume with the most available space (=4096 B) is less than the block size (=134217728 B).
          at org.apache.hadoop.hdfs.server.datanode.fsdataset.RoundRobinVolumeChoosingPolicy.chooseVolume(RoundRobinVolumeChoosingPolicy.java:60)
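      One possible reading of the trace above is that the write fails during volume selection, on free space alone, before anything touches the inaccessible directory. The sketch below is a simplified, self-contained model of a round-robin chooser, not the Hadoop implementation: RoundRobinChooserSketch, OutOfSpaceException and the long[] representation of volumes are all assumptions made for illustration.

```java
import java.io.IOException;

// Simplified model of round-robin volume selection (hypothetical names).
final class RoundRobinChooserSketch {

    /** Stand-in for an out-of-space error; not Hadoop's exception class. */
    static final class OutOfSpaceException extends IOException {
        OutOfSpaceException(String message) {
            super(message);
        }
    }

    private int next = 0; // index of the volume to try first on the next call

    /**
     * Returns the index of the first volume, starting from the round-robin
     * position, whose free space exceeds blockSize. Only free space is
     * examined; directory permissions are never checked here.
     */
    int chooseVolume(long[] availableBytes, long blockSize) throws OutOfSpaceException {
        if (availableBytes.length == 0) {
            throw new OutOfSpaceException("No volumes available");
        }
        int start = next % availableBytes.length;
        int i = start;
        long maxAvailable = 0;
        do {
            int candidate = i;
            long free = availableBytes[candidate];
            i = (i + 1) % availableBytes.length;
            if (free > blockSize) {
                next = i; // resume after the chosen volume next time
                return candidate;
            }
            maxAvailable = Math.max(maxAvailable, free);
        } while (i != start);
        throw new OutOfSpaceException("Out of space: The volume with the most available space (="
                + maxAvailable + " B) is less than the block size (=" + blockSize + " B).");
    }
}
```

      In this model, a single remaining volume with 4096 B free and a 134217728 B block size produces the same kind of message as the log, and no disk check is triggered because the failure is classified as a space problem rather than a volume failure.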
       
      

      Observations :
      ==============
      1. Write operations do not shut down the DataNode, even though all configured volumes have failed (one disk is full and permission is denied on every disk)

      2. Directory scanning fails, yet the DN still does not shut down

       
      2014-07-21 14:13:00,180 WARN org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: Exception occured while compiling report: 
      java.io.IOException: Invalid directory or I/O error occurred for dir: /mnt/tmp_Datanode/current/BP-1384489961-XX2.XX2.XX2.XX2-845784615183/current/finalized
          at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1164)
          at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner$ReportCompiler.compileReport(DirectoryScanner.java:596)
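      The issue title suggests starting a disk check when the scanner cannot list a directory. A minimal sketch of that idea follows; it is not the committed patch. ScanWithDiskCheckSketch, compileReport and the diskCheck callback are hypothetical names, and only java.io.File is used so the example stays self-contained.

```java
import java.io.File;
import java.io.IOException;

// Sketch: request a disk check when a directory listing fails (hypothetical names).
final class ScanWithDiskCheckSketch {

    /** File.listFiles() returns null on an invalid directory or I/O error; surface that as an IOException. */
    static File[] listFiles(File dir) throws IOException {
        File[] files = dir.listFiles();
        if (files == null) {
            throw new IOException("Invalid directory or I/O error occurred for dir: " + dir);
        }
        return files;
    }

    /**
     * Returns the number of entries found in dir, or -1 if the listing failed.
     * On failure the supplied callback is run so that the caller can
     * re-evaluate volume health instead of only logging a warning.
     */
    static int compileReport(File dir, Runnable diskCheck) {
        try {
            return listFiles(dir).length;
        } catch (IOException e) {
            diskCheck.run(); // assumption: the callback stands in for a DataNode disk check
            return -1;
        }
    }
}
```

      In this sketch the callback is where a real DataNode would re-check its volumes, which would let the failed /mnt/tmp_Datanode volume be counted against the tolerated-failures limit.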
       
      

    People

      Assignee: J.Andreina
      Reporter: J.Andreina
      Votes: 0
      Watchers: 7
