Hadoop HDFS / HDFS-7982

huge non dfs space used


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.7.0
    • Component/s: datanode
    • Labels: None

      Description

      Hi...

      I'm trying to load an external textfile table into an internal ORC table using Hive. My process failed with the following error:
      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hive/blablabla.... could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and no node(s) are excluded in this operation.
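
      For context, the failing step is essentially an INSERT from a text-backed external table into a managed ORC table. A minimal sketch of the kind of statement involved (table names and columns below are made up for illustration, not the real ones from the failing job):

      # Hypothetical reproduction of the pattern: external textfile table -> managed ORC table
      hive -e "
        CREATE TABLE target_orc (id BIGINT, payload STRING) STORED AS ORC;
        INSERT OVERWRITE TABLE target_orc SELECT id, payload FROM source_text_ext;
      "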

      After investigating, I saw that the amount of "non DFS used" space grows steadily until the job fails.
      Just before the failure, "non DFS used" reaches 54 GB on each datanode, while I still have space left in "DFS Remaining".

      Here is the dfsadmin report taken just before the issue:

      [hdfs@hadoop-01 data]$ hadoop dfsadmin -report
      DEPRECATED: Use of this script to execute hdfs command is deprecated.
      Instead use the hdfs command for it.

      Configured Capacity: 475193597952 (442.56 GB)
      Present Capacity: 290358095182 (270.42 GB)
      DFS Remaining: 228619903369 (212.92 GB)
      DFS Used: 61738191813 (57.50 GB)
      DFS Used%: 21.26%
      Under replicated blocks: 38
      Blocks with corrupt replicas: 0
      Missing blocks: 0

      -------------------------------------------------
      Live datanodes (3):

      Name: 192.168.3.36:50010 (hadoop-04.XXXXX.local)
      Hostname: hadoop-04.XXXXX.local
      Decommission Status : Normal
      Configured Capacity: 158397865984 (147.52 GB)
      DFS Used: 20591481196 (19.18 GB)
      Non DFS Used: 61522602976 (57.30 GB)
      DFS Remaining: 76283781812 (71.04 GB)
      DFS Used%: 13.00%
      DFS Remaining%: 48.16%
      Configured Cache Capacity: 0 (0 B)
      Cache Used: 0 (0 B)
      Cache Remaining: 0 (0 B)
      Cache Used%: 100.00%
      Cache Remaining%: 0.00%
      Xceivers: 182
      Last contact: Tue Mar 24 10:56:05 CET 2015

      Name: 192.168.3.35:50010 (hadoop-03.XXXXX.local)
      Hostname: hadoop-03.XXXXX.local
      Decommission Status : Normal
      Configured Capacity: 158397865984 (147.52 GB)
      DFS Used: 20555853589 (19.14 GB)
      Non DFS Used: 61790296136 (57.55 GB)
      DFS Remaining: 76051716259 (70.83 GB)
      DFS Used%: 12.98%
      DFS Remaining%: 48.01%
      Configured Cache Capacity: 0 (0 B)
      Cache Used: 0 (0 B)
      Cache Remaining: 0 (0 B)
      Cache Used%: 100.00%
      Cache Remaining%: 0.00%
      Xceivers: 184
      Last contact: Tue Mar 24 10:56:05 CET 2015

      Name: 192.168.3.37:50010 (hadoop-05.XXXXX.local)
      Hostname: hadoop-05.XXXXX.local
      Decommission Status : Normal
      Configured Capacity: 158397865984 (147.52 GB)
      DFS Used: 20590857028 (19.18 GB)
      Non DFS Used: 61522603658 (57.30 GB)
      DFS Remaining: 76284405298 (71.05 GB)
      DFS Used%: 13.00%
      DFS Remaining%: 48.16%
      Configured Cache Capacity: 0 (0 B)
      Cache Used: 0 (0 B)
      Cache Remaining: 0 (0 B)
      Cache Used%: 100.00%
      Cache Remaining%: 0.00%
      Xceivers: 182
      Last contact: Tue Mar 24 10:56:05 CET 2015
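
      As far as I can tell (an assumption from reading the 2.6 sources, so please correct me), "Non DFS Used" is never measured on the local filesystem: the report derives it as configured capacity minus DFS used minus DFS remaining. The hadoop-04 figures above check out exactly:

      # Derived, not measured: nonDfsUsed = capacity - dfsUsed - remaining
      # (values taken from the hadoop-04 entry of the report above)
      capacity=158397865984
      dfs_used=20591481196
      dfs_remaining=76283781812
      echo $(( capacity - dfs_used - dfs_remaining ))
      # -> 61522602976, i.e. the reported "Non DFS Used: 61522602976 (57.30 GB)"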

      I expected to find this temporary space usage somewhere within my filesystem (i.e. /data).
      I found the DFS usage under /data/hadoop/hdfs/data (19 GB), but no trace of the 57 GB of non-DFS usage...

      [root@hadoop-05 hadoop]# df -h /data
      Filesystem Size Used Avail Use% Mounted on
      /dev/sdb1 148G 20G 121G 14% /data
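
      A quick sanity check with the rounded df figures confirms the local disk cannot account for it: size minus used minus available leaves only a few GB (presumably the filesystem's own reserved blocks), nowhere near the 57 GB the report calls non-DFS:

      # What the disk itself accounts for, using the rounded GiB values from df above
      size=148; used=20; avail=121
      echo $(( size - used - avail ))   # -> 7 GiB of fs overhead, not 57 GiB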

      I also checked dfs.datanode.du.reserved, which is set to zero:
      [root@hadoop-05 hadoop]# hdfs getconf -confkey dfs.datanode.du.reserved
      0
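
      One plausible explanation (speculation on my part): since HDFS-6898, the 2.6 DataNode reserves a full block's worth of "remaining" space for every replica being written, and that reservation lives only in the DataNode's memory, so it would surface as phantom non-DFS used without any bytes on disk. A rough way to test this is to count the replicas sitting in the rbw directories while the job runs (paths assume the default dfs.datanode.data.dir layout; 134217728 is the 128 MB default block size):

      # Count replica-being-written files and estimate the worst-case reservation
      rbw_count=$(find /data/hadoop/hdfs/data -path '*/current/rbw/*' -name 'blk_*' ! -name '*.meta' | wc -l)
      echo "rbw replicas: $rbw_count"
      echo "max reserved: $(( rbw_count * 134217728 )) bytes"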

      Did I miss something? Where does this non-DFS space live on Linux? And why did I get the message "could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and no node(s) are excluded in this operation." when all three datanodes were up and running, with DFS space still remaining?

      This error is blocking us.
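
      In the meantime, the growth is easy to watch from the shell while the Hive job runs:

      # Poll the per-node "Non DFS Used" figure every 30 seconds
      while true; do
        date
        hdfs dfsadmin -report | grep 'Non DFS Used'
        sleep 30
      done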

        Attachments

          Activity

            People

            • Assignee: Unassigned
            • Reporter: easyoups regis le bretonnic
            • Votes: 0
            • Watchers: 6
