Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
Currently, when you run the hdfs dfs -du command against a path on Ozone, it uses the default implementation from the FileSystem class in the Hadoop project, which does not take the replication factor into account. DistributedFileSystem and a couple of other FileSystem implementations override it to calculate the full replicated size properly.
Currently the output looks something like this for a folder whose files have a replication factor of 3:
hdfs dfs -du -s -h o3fs://perfbucket.volume.ozone1/terasort/datagen
931.3 G  931.3 G  o3fs://perfbucket.volume.ozone1/terasort/datagen
In Ozone's case as well, the command should report the replicated size as the second number, so something around 2.7 T in this case (931.3 G x 3).
To get there, we should implement getContentSummary and calculate the replicated size properly in its response.
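A minimal sketch of the calculation, not the actual Ozone change. It avoids any Hadoop dependency so it runs on its own: FileEntry, Summary and summarize are hypothetical names standing in for the per-file length and replication (FileStatus#getLen / FileStatus#getReplication in Hadoop) and for the length and spaceConsumed fields of ContentSummary. The file sizes in main are made up so that they add up to roughly the folder size above.

```java
import java.util.Arrays;
import java.util.List;

public class ReplicatedSizeSketch {

    /** Hypothetical stand-in for the length and replication a FileStatus carries. */
    static final class FileEntry {
        final long length;
        final short replication;

        FileEntry(long length, int replication) {
            this.length = length;
            this.replication = (short) replication;
        }
    }

    /** Hypothetical stand-in for the two sizes that -du prints. */
    static final class Summary {
        final long length;         // logical size (first column)
        final long spaceConsumed;  // replicated size (second column)

        Summary(long length, long spaceConsumed) {
            this.length = length;
            this.spaceConsumed = spaceConsumed;
        }
    }

    /** Sums the logical size and the replicated size over all files. */
    static Summary summarize(List<FileEntry> files) {
        long length = 0;
        long spaceConsumed = 0;
        for (FileEntry f : files) {
            length += f.length;
            // The default FileSystem implementation effectively adds only
            // f.length here, which is why both columns are equal today.
            spaceConsumed += f.length * f.replication;
        }
        return new Summary(length, spaceConsumed);
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024 * 1024;
        // Made-up file sizes, both with replication factor 3.
        List<FileEntry> files = Arrays.asList(
            new FileEntry(600 * gib, 3),
            new FileEntry(331 * gib, 3));
        Summary s = summarize(files);
        // prints "931 G logical, 2793 G replicated"
        System.out.println(s.length / gib + " G logical, "
            + s.spaceConsumed / gib + " G replicated");
    }
}
```

In a real override, the two sums would presumably be passed to ContentSummary.Builder through length(...) and spaceConsumed(...), alongside the file and directory counts.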