Apache Ozone / HDDS-3721

Implement getContentSummary to provide replicated size properly to dfs -du command

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      Currently, when you run the hdfs dfs -du command against a path on Ozone, it uses the default getContentSummary implementation from the FileSystem class in the Hadoop project, which does not take the replication factor into account. DistributedFileSystem and a few other FileSystem implementations override it to calculate the full replicated size properly.

      Currently, the output for a folder whose files have a replication factor of 3 looks something like this:

      hdfs dfs -du -s -h o3fs://perfbucket.volume.ozone1/terasort/datagen
      931.3 G  931.3 G  o3fs://perfbucket.volume.ozone1/terasort/datagen
      

      In Ozone's case as well, the command should report the replicated size as the second number, so something around 2.7 TB in this case (3 × 931.3 GB).
      To get there, we should implement getContentSummary and calculate the replicated size properly in its response.
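
      The calculation being asked for can be sketched as follows. This is a minimal, self-contained illustration and not the actual Ozone change: FileEntry and Summary are hypothetical stand-ins for the length and replication fields of Hadoop's FileStatus and for the length and spaceConsumed values of ContentSummary, which are the two numbers du prints. A real fix would presumably override getContentSummary in the Ozone FileSystem implementation and fill in those values while walking the directory tree.

```java
import java.util.List;

public class ReplicatedSizeSketch {

    /** Hypothetical stand-in for the FileStatus fields that matter here. */
    static final class FileEntry {
        final String path;
        final long length;        // logical file size in bytes
        final short replication;  // replication factor of the file

        FileEntry(String path, long length, short replication) {
            this.path = path;
            this.length = length;
            this.replication = replication;
        }
    }

    /** Hypothetical stand-in for the two ContentSummary values shown by du. */
    static final class Summary {
        final long length;         // first column: sum of logical sizes
        final long spaceConsumed;  // second column: sum of replicated sizes

        Summary(long length, long spaceConsumed) {
            this.length = length;
            this.spaceConsumed = spaceConsumed;
        }
    }

    /** Sums logical size and size * replication over all files. */
    static Summary summarize(List<FileEntry> files) {
        long length = 0;
        long spaceConsumed = 0;
        for (FileEntry f : files) {
            length += f.length;
            spaceConsumed += f.length * f.replication;
        }
        return new Summary(length, spaceConsumed);
    }

    public static void main(String[] args) {
        // Made-up file names and sizes, for illustration only.
        List<FileEntry> files = List.of(
            new FileEntry("part-0", 1_000L, (short) 3),
            new FileEntry("part-1", 2_500L, (short) 3));
        Summary s = summarize(files);
        System.out.println(s.length + "  " + s.spaceConsumed); // prints "3500  10500"
    }
}
```

      Applied to the example above, the first column would stay at 931.3 G while the second would become roughly three times that, about 2.7 T.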

            People

            • Assignee: István Fajth (pifta)
            • Reporter: István Fajth (pifta)
            • Votes: 0
            • Watchers: 3
