  Hadoop HDFS
  HDFS-12113

`hadoop fs -setrep` requires a huge amount of memory on the client side


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.6.0, 2.6.5
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: Java 7

      Description

      $ hadoop fs -setrep -w 3 /
      

      was failing with

      Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
      at java.util.Arrays.copyOf(Arrays.java:2367)
      at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
      at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
      at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
      at java.lang.StringBuilder.append(StringBuilder.java:132)
      at org.apache.hadoop.fs.shell.PathData.getStringForChildPath(PathData.java:305)
      at org.apache.hadoop.fs.shell.PathData.getDirectoryContents(PathData.java:272)
      at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:373)
      at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:319)
      at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:373)
      at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:319)
      at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:373)
      at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:319)
      at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:373)
      at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:319)
      at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
      at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
      at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
      at org.apache.hadoop.fs.shell.SetReplication.processArguments(SetReplication.java:76)
      at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118)
      at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
      at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
      at org.apache.hadoop.fs.FsShell.main(FsShell.java:372)
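
      The repeated recursePath/processPaths frames show the client itself walking the namespace and materializing child-path strings. The following sketch (plain Java, not Hadoop source; class and method names are made up for illustration) shows why that kind of buffering makes heap usage grow linearly with the number of files:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch (not Hadoop code): holding one String per child path in client
// memory, as the recursive listing in the stack trace effectively does,
// ties heap usage to the total file count rather than to a constant.
public class PathBufferSketch {
    // Build the list of child-path strings a directory listing would hold.
    static List<String> childPaths(String dir, int count) {
        List<String> paths = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            paths.add(dir + "/file-" + i); // one String object per file
        }
        return paths;
    }

    public static void main(String[] args) {
        // Even a hundred thousand entries costs megabytes of heap; a whole
        // filesystem's worth can exhaust the default client heap.
        System.out.println(childPaths("/user/data", 100_000).size()); // prints 100000
    }
}
```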
      

      The command only succeeded once the `hadoop fs` CLI's Java heap was allowed to grow to 5 GB:

      HADOOP_HEAPSIZE=5000 hadoop fs -setrep -w 3 /
      

      Note that this setrep change was applied to the whole HDFS filesystem.

      So it looks like the amount of memory used by the `hadoop fs -setrep` command depends on how many files HDFS has in total. This is not a huge HDFS filesystem; I would call it "small" by current standards.
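
      One possible mitigation, besides raising HADOOP_HEAPSIZE as above, is to bound each client invocation's recursion by applying setrep per top-level directory. This is a hedged sketch, not a verified fix from this issue; it assumes the standard `hadoop fs -ls` output format ("Found N items" header, path in the last column):

```shell
#!/bin/sh
# Hypothetical workaround sketch: run setrep per top-level directory so a
# single client JVM never walks the entire namespace in one invocation.

# Extract the path column from `hadoop fs -ls` output, skipping the
# "Found N items" header (the path is the last whitespace-separated field).
paths_from_ls() {
    awk 'NR > 1 { print $NF }'
}

# Only attempt the real commands when the hadoop CLI is available.
if command -v hadoop >/dev/null 2>&1; then
    for dir in $(hadoop fs -ls / | paths_from_ls); do
        hadoop fs -setrep -w 3 "$dir"
    done
fi
```

      Each invocation still buffers one subtree's worth of paths, so this only helps when no single top-level directory dominates the file count.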

    People

    • Assignee: Unassigned
    • Reporter: Ruslan Dautkhanov
    • Votes: 0
    • Watchers: 4