Hadoop Common / HADOOP-12990

lz4 incompatibility between OS and Hadoop

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.6.0
    • Fix Version/s: None
    • Component/s: io, native
    • Labels: None

    Description

      hdfs dfs -text fails with java.lang.OutOfMemoryError when trying to view a compressed file created by the Linux lz4 tool.

      The Hadoop version includes HADOOP-11184 ("update lz4 to r123"), so it uses release r123 of the LZ4 library.

      Linux lz4 version:

      $ /tmp/lz4 -h 2>&1 | head -1
      *** LZ4 Compression CLI 64-bits r123, by Yann Collet (Apr  1 2016) ***
      

      Test steps:

      $ cat 10rows.txt
      001|c1|c2|c3|c4|c5|c6|c7|c8|c9
      002|c1|c2|c3|c4|c5|c6|c7|c8|c9
      003|c1|c2|c3|c4|c5|c6|c7|c8|c9
      004|c1|c2|c3|c4|c5|c6|c7|c8|c9
      005|c1|c2|c3|c4|c5|c6|c7|c8|c9
      006|c1|c2|c3|c4|c5|c6|c7|c8|c9
      007|c1|c2|c3|c4|c5|c6|c7|c8|c9
      008|c1|c2|c3|c4|c5|c6|c7|c8|c9
      009|c1|c2|c3|c4|c5|c6|c7|c8|c9
      010|c1|c2|c3|c4|c5|c6|c7|c8|c9
      $ /tmp/lz4 10rows.txt 10rows.txt.r123.lz4
      Compressed 310 bytes into 105 bytes ==> 33.87%
      $ hdfs dfs -put 10rows.txt.r123.lz4 /tmp
      $ hdfs dfs -text /tmp/10rows.txt.r123.lz4
      16/04/01 08:19:07 INFO compress.CodecPool: Got brand-new decompressor [.lz4]
      Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
          at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:123)
          at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:98)
          at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
          at java.io.InputStream.read(InputStream.java:101)
          at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
          at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
          at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
          at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:106)
          at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:101)
          at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
          at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
          at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
          at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
          at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118)
          at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
          at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
          at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
          at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
          at org.apache.hadoop.fs.FsShell.main(FsShell.java:372)
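
      The report does not identify the root cause. One plausible explanation, stated here as an assumption rather than a confirmed diagnosis, is a container-format mismatch: the lz4 command-line tool writes the LZ4 frame format (magic number 0x184D2204, stored little-endian), whereas Hadoop's BlockDecompressorStream reads its own framing of big-endian 4-byte length prefixes. Under that assumption, the frame header would be misread as a very large chunk length, which is consistent with the OutOfMemoryError raised in getCompressedData. The Python sketch below interprets the first 8 bytes of a file both ways; inspect_header is a hypothetical helper, not part of Hadoop or lz4.

```python
import struct

# Magic numbers from the LZ4 format documentation (stored little-endian on disk).
LZ4_FRAME_MAGIC = 0x184D2204   # LZ4 frame format
LZ4_LEGACY_MAGIC = 0x184C2102  # legacy lz4 CLI format


def inspect_header(header: bytes) -> dict:
    """Interpret the first 8 bytes of a .lz4 file in two ways.

    1. As an LZ4 magic number (little-endian 32-bit value).
    2. As two big-endian 32-bit lengths, which is how a length-prefixed
       block stream would read them (assumed here to match what
       BlockDecompressorStream does with its input).
    """
    if len(header) < 8:
        raise ValueError("need at least 8 header bytes")

    (magic_le,) = struct.unpack("<I", header[:4])
    first_be, second_be = struct.unpack(">II", header[:8])

    return {
        "is_lz4_frame": magic_le == LZ4_FRAME_MAGIC,
        "is_lz4_legacy": magic_le == LZ4_LEGACY_MAGIC,
        # Values a big-endian length-prefixed reader would see:
        "as_block_stream_original_len": first_be,
        "as_block_stream_chunk_len": second_be,
    }
```

      To try it on the file from the test steps, read the first 8 bytes of 10rows.txt.r123.lz4 in binary mode and pass them to inspect_header. If is_lz4_frame is true and as_block_stream_chunk_len is in the hundreds of megabytes or more, the format-mismatch explanation fits; if not, the cause lies elsewhere.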
      

          People

            Assignee: Unassigned
            Reporter: John Zhuge (jzhuge)
            Votes: 5
            Watchers: 21
