Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
Reviewed
-
Add a -x option for "hdfs -du" and "hdfs -count" commands to exclude snapshots from being calculated.
Description
When running hadoop fs -du on a snapshotted directory (or one of its children), the report includes space consumed by blocks that are only present in the snapshots. This is confusing for end users.
$ hadoop fs -du -h -s /tmp/parent /tmp/parent/* 799.7 M 2.3 G /tmp/parent 799.7 M 2.3 G /tmp/parent/sub1 $ hdfs dfs -createSnapshot /tmp/parent snap1 Created snapshot /tmp/parent/.snapshot/snap1 $ hadoop fs -rm -skipTrash /tmp/parent/sub1/* ... $ hadoop fs -du -h -s /tmp/parent /tmp/parent/* 799.7 M 2.3 G /tmp/parent 799.7 M 2.3 G /tmp/parent/sub1 $ hdfs dfs -deleteSnapshot /tmp/parent snap1 $ hadoop fs -du -h -s /tmp/parent /tmp/parent/* 0 0 /tmp/parent 0 0 /tmp/parent/sub1
It would be helpful if we had a flag, say -X, to exclude any snapshot related disk usage in the output
Attachments
Attachments
Issue Links
- is related to
-
HDFS-10875 Optimize du -x to cache intermediate result
- Resolved