Hadoop HDFS / HDFS-13476

HDFS (Hadoop/HDP 2.7.3.2.6.4.0-91) reports CORRUPT files


Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 2.7.4
    • Fix Version/s: None
    • Component/s: datanode
    • Labels: None

    Description

      We have security software that runs on the local file system (ext4) and denies particular users access to particular HDFS folders based on a security policy. For example, the policy always gives the user hdfs full permission, but denies the user yarn access to /dir1. If the user yarn tries to access a file under the HDFS folder /dir1, the security software denies the access and returns EACCES from the file system call through errno. This used to work, because data corruption was determined by the block scanner (https://blog.cloudera.com/blog/2016/12/hdfs-datanode-scanners-and-disk-checker-explained/).

      On HDP 2.7.3.2.6.4.0-91, HDFS reports a lot of data corruption because the security policy denies file access on the local file system. We debugged HDFS and found that BlockSender() directly calls the following statements, which may cause the problem:

      datanode.notifyNamenodeDeletedBlock(block, replica.getStorageUuid());
      datanode.data.invalidate(block.getBlockPoolId(), new Block[]{block.getLocalBlock()});
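These two calls tell the NameNode the replica is gone and delete it locally, regardless of why the open failed. A minimal sketch of the distinction at issue (hypothetical helper, not HDFS code): with java.nio.file, a local-FS permission denial (EACCES) surfaces as AccessDeniedException and can be told apart from a genuinely missing replica file, so only the latter would justify invalidating the block:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.AccessDeniedException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class BlockOpenCheck {
    /** Outcome of attempting to open a replica file on the local file system. */
    enum OpenResult { OK, PERMISSION_DENIED, MISSING }

    // Hypothetical helper: classify the failure instead of treating every
    // IOException as a missing/corrupt replica.
    static OpenResult tryOpen(Path blockFile) {
        try (InputStream in = Files.newInputStream(blockFile)) {
            return OpenResult.OK;
        } catch (AccessDeniedException e) {
            // EACCES from the local FS: the replica exists but access was
            // denied by policy -- invalidating it here would be wrong.
            return OpenResult.PERMISSION_DENIED;
        } catch (NoSuchFileException e) {
            // The replica file is genuinely gone: safe to report to the NameNode.
            return OpenResult.MISSING;
        } catch (IOException e) {
            // Other I/O errors (read failures etc.) -- treated as missing here
            // for simplicity.
            return OpenResult.MISSING;
        }
    }

    public static void main(String[] args) throws IOException {
        Path ok = Files.createTempFile("blk_", ".data");
        System.out.println(tryOpen(ok));                                // OK
        System.out.println(tryOpen(ok.resolveSibling("blk_missing")));  // MISSING
        Files.delete(ok);
    }
}
```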

      In the meantime, the block scanner is not triggered because of the undocumented property dfs.datanode.disk.check.min.gap. However, the problem persists even if we disable dfs.datanode.disk.check.min.gap by setting it to 0.
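For reference, disabling that minimum gap is a plain hdfs-site.xml override, as described above (a 0 value removes the minimum interval between disk checks; the default is a non-zero interval):

```xml
<!-- hdfs-site.xml on the datanode -->
<property>
  <name>dfs.datanode.disk.check.min.gap</name>
  <value>0</value>
</property>
```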


People

    Assignee: Unassigned
    Reporter: feng xu (fxu_36@hotmail.com)
    Votes: 0
    Watchers: 3
