  1. Hadoop HDFS
  2. HDFS-12618

fsck -includeSnapshots reports wrong amount of total blocks

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0-alpha3
    • Fix Version/s: None
    • Component/s: tools
    • Labels: None

    Description

      When snapshots are enabled, if a file is deleted but is still contained in a snapshot, fsck will not report the blocks for that file, so it shows a different total block count than the Web UI.

      This would be fine, since fsck provides the -includeSnapshots option. The problem is that -includeSnapshots makes fsck count a file's blocks once for every occurrence of that file in a snapshot, which is wrong because each block should be counted only once (for instance, a 100MB file present in 3 snapshots still maps to a single block in HDFS). As a result, fsck reports many more blocks than actually exist in HDFS and than the Web UI reports.
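
The double counting described above can be modeled by treating each path as a list of block IDs and comparing a per-path count with a count of distinct IDs. This is only an illustrative sketch, not the logic of fsck or of the attached patches: the `count_blocks` helper, the path names, and the block ID `blk_1` are hypothetical, and it assumes every snapshot copy of an unmodified file references the same underlying block.

```python
from typing import Dict, List, Tuple


def count_blocks(paths: Dict[str, List[str]]) -> Tuple[int, int]:
    """Return (per-path total, distinct total) for a path -> block IDs map."""
    # Per-path counting: every occurrence of a file contributes its blocks again.
    per_path_total = sum(len(blocks) for blocks in paths.values())
    # Distinct counting: each block ID is counted once, however many paths share it.
    distinct_total = len({blk for blocks in paths.values() for blk in blocks})
    return per_path_total, distinct_total


# Hypothetical layout: one single-block file that survives only in 3 snapshots.
snapshot_paths = {
    "/dir/.snapshot/s1/file": ["blk_1"],
    "/dir/.snapshot/s2/file": ["blk_1"],
    "/dir/.snapshot/s3/file": ["blk_1"],
}

per_path, distinct = count_blocks(snapshot_paths)
print(per_path, distinct)
```

Under these assumptions the per-path total is 3 while only 1 distinct block exists, which is the gap the description refers to.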

      Here's an example:

      1) HDFS has two files of 2 blocks each:

      $ hdfs dfs -ls -R /
      drwxr-xr-x   - root supergroup          0 2017-10-07 21:21 /snap-test
      -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:16 /snap-test/file1
      -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:17 /snap-test/file2
      drwxr-xr-x   - root supergroup          0 2017-05-13 13:03 /test
      

      2) There are two snapshots, with the two files present on each of the snapshots:

      $ hdfs dfs -ls -R /snap-test/.snapshot
      drwxr-xr-x   - root supergroup          0 2017-10-07 21:21 /snap-test/.snapshot/snap1
      -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:16 /snap-test/.snapshot/snap1/file1
      -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:17 /snap-test/.snapshot/snap1/file2
      drwxr-xr-x   - root supergroup          0 2017-10-07 21:21 /snap-test/.snapshot/snap2
      -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:16 /snap-test/.snapshot/snap2/file1
      -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:17 /snap-test/.snapshot/snap2/file2
      

      3) fsck -includeSnapshots reports 12 blocks in total (4 blocks for the normal file path, plus 4 blocks for each snapshot path):

      $ hdfs fsck / -includeSnapshots
      FSCK started by root (auth:SIMPLE) from /127.0.0.1 for path / at Mon Oct 09 15:15:36 BST 2017
      
      Status: HEALTHY
       Number of data-nodes:	1
       Number of racks:		1
       Total dirs:			6
       Total symlinks:		0
      
      Replicated Blocks:
       Total size:	1258291200 B
       Total files:	6
       Total blocks (validated):	12 (avg. block size 104857600 B)
       Minimally replicated blocks:	12 (100.0 %)
       Over-replicated blocks:	0 (0.0 %)
       Under-replicated blocks:	0 (0.0 %)
       Mis-replicated blocks:		0 (0.0 %)
       Default replication factor:	1
       Average block replication:	1.0
       Missing blocks:		0
       Corrupt blocks:		0
       Missing replicas:		0 (0.0 %)
      

      4) Web UI shows the correct number (4 blocks only):

      Security is off.
      Safemode is off.
      5 files and directories, 4 blocks = 9 total filesystem object(s).
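
The figures in steps 3 and 4 can be reproduced with simple arithmetic. The sketch below assumes a block size of 104857600 B (inferred from the average block size fsck reports) and that fsck visits each file once under its live path and once per snapshot; the variable names are illustrative only.

```python
FILE_SIZE = 209715200    # bytes, from the `hdfs dfs -ls -R /` listing
BLOCK_SIZE = 104857600   # bytes, assumed from the average block size in the fsck output
LIVE_FILES = 2           # file1 and file2
SNAPSHOTS = 2            # snap1 and snap2

# Ceiling division without floats: blocks needed to hold one file.
blocks_per_file = -(-FILE_SIZE // BLOCK_SIZE)

# Each file is seen under its live path plus one path per snapshot.
paths_per_file = 1 + SNAPSHOTS

fsck_files = LIVE_FILES * paths_per_file       # "Total files" in fsck
fsck_blocks = fsck_files * blocks_per_file     # "Total blocks (validated)" in fsck
fsck_size = fsck_files * FILE_SIZE             # "Total size" in fsck
actual_blocks = LIVE_FILES * blocks_per_file   # block count shown by the Web UI

print(fsck_files, fsck_blocks, fsck_size, actual_blocks)
```

With these inputs the script yields 6 files, 12 blocks, and 1258291200 B for the fsck view, against 4 blocks actually stored, matching the outputs shown above.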
      

      I would like to work on this and will propose an initial solution shortly.

      Attachments

        1. HDFS-12618.006.patch
          36 kB
          Wellington Chevreuil
        2. HDFS-12618.005.patch
          33 kB
          Wellington Chevreuil
        3. HDFS-12618.004.patch
          32 kB
          Wellington Chevreuil
        4. HDFS-12618.003.patch
          11 kB
          Wellington Chevreuil
        5. HDFS-12618.002.patch
          12 kB
          Wellington Chevreuil
        6. HDFS-12618.001.patch
          9 kB
          Wellington Chevreuil
        7. HDFS-121618.initial
          4 kB
          Wellington Chevreuil

          People

            Assignee: Wellington Chevreuil (wchevreuil)
            Reporter: Wellington Chevreuil (wchevreuil)