  1. Apache Ozone
  2. HDDS-7196

Disk space used by a failed job (teragen here) is not reclaimable

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Duplicate
    • None
    • 1.3.0
    • Ozone Datanode
    • None
    • Apache Ozone 1.0.0

    Description

      On a fresh Ozone cluster, I ran a teragen job and killed it at around 25% completion. This left about 74.4GB of Ozone space in use, but none of the written files are listed.

      The issue can be reproduced with the steps below (snapshots from the Recon UI are attached for usage reference).

       ozone sh volume create  o3://ozonefrankserviceid/testvol/
       ozone sh bucket create o3://ozonefrankserviceid/testvol/testbucket
       
       yarn jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar teragen -Dmapreduce.job.maps=2 1000000000 ofs://ozonefrankserviceid/testvol/testbucket/teragentest1
       
       ozone fs -ls ofs://ozonefrankserviceid/testvol/testbucket/teragentest1
       ozone fs -ls ofs://ozonefrankserviceid/testvol/testbucket/teragentest1/_temporary
       ozone fs -ls ofs://ozonefrankserviceid/testvol/testbucket/teragentest1/_temporary/1
       ozone fs -ls ofs://ozonefrankserviceid/testvol/testbucket/teragentest1/_temporary/1/_temporary
       ozone fs -du -s -h ofs://ozonefrankserviceid/testvol/testbucket/teragentest1/_temporary/1/_temporary
       ozone fs -ls ofs://ozonefrankserviceid/testvol/testbucket/teragentest1/_temporary/1/_temporary/attempt_1661777485132_0001_m_000000_2 --> no files/objects
       ozone fs -ls ofs://ozonefrankserviceid/testvol/testbucket/teragentest1/_temporary/1/_temporary/attempt_1661777485132_0001_m_000001_2 --> no files/objects
       
       Ozone usage shown in the Recon UI increased to 75GB
       
       hdfs dfs -rm -r -skipTrash ofs://ozonefrankserviceid/testvol/testbucket/teragentest1
       ozone sh bucket delete o3://ozonefrankserviceid/testvol/testbucket
       
       [root@DNHOST1 ozone-conf]# grep -A1 'hdds.datanode.dir' ozone-site.xml
          <name>hdds.datanode.dir</name>
          <value>/var/lib/hadoop-ozone/datanode/data</value>
      [root@DNHOST1 ozone-conf]#
      [root@DNHOST1 containerDir0]# du -sh /var/lib/hadoop-ozone/datanode/data/hdds/a9461a7f-ef81-4942-a278-15ff7602df14/current/containerDir0/
      26G    /var/lib/hadoop-ozone/datanode/data/hdds/a9461a7f-ef81-4942-a278-15ff7602df14/current/containerDir0/
      [root@DNHOST1 containerDir0]#
      [root@DNHOST1 chunks]# ozone sh volume list  o3://ozonefrankserviceid/ -a |egrep 'name|usedNamespace'
        "name" : "s3v",
        "usedNamespace" : 0,
          "name" : "om",
        "name" : "testvol",
        "usedNamespace" : 0,
          "name" : "hive/HMSHOST.example.com@SUPPORT.COM",
          "name" : "hive",
      [root@DNHOST1 chunks]# 
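
The `du` check above can be turned into a before/after comparison to put a number on the space that stays allocated on a datanode. The following is only a minimal sketch: the helper names `usage_kb` and `unreclaimed_kb` are hypothetical and not part of Ozone, and it assumes the baseline is recorded on the fresh cluster before the teragen job is started.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Ozone): quantify the space that stays
# allocated on a datanode after the job output has been deleted.

# Print the disk usage of a directory in KiB (du -sk is POSIX).
usage_kb() {
    du -sk "$1" | awk '{print $1}'
}

# Print how many KiB are in use above a baseline measurement.
# $1 = baseline in KiB (taken before the job), $2 = directory to check
unreclaimed_kb() {
    current=$(usage_kb "$2")
    echo $((current - $1))
}
```

A possible usage, with the `hdds.datanode.dir` value from this report: run `baseline=$(usage_kb /var/lib/hadoop-ozone/datanode/data)` on the fresh cluster, then, after killing the job and deleting the output directory and bucket, run `unreclaimed_kb "$baseline" /var/lib/hadoop-ozone/datanode/data`. A result that stays large after the cleanup corresponds to the behaviour described here.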

      Attachments

        1. Ozone usage_ after_failing_cleanup.png
          217 kB
          Franklinsam Paul
        2. Ozone usage_ fresh_install.png
          368 kB
          Franklinsam Paul

          People

            Assignee: Unassigned
            Reporter: Franklinsam Paul (frnklnsm)