Hadoop Common / HADOOP-8900

BuiltInGzipDecompressor throws IOException - stored gzip size doesn't match decompressed size

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1-win, 2.0.1-alpha
    • Fix Version/s: 1.2.0, 2.0.3-alpha
    • Reviewed

    Description

      Encountered a failure when processing a large GZIP file:
      • Gz: Failed in 1hrs, 13mins, 57sec with the error:
      java.io.IOException: IO error in map input file hdfs://localhost:9000/Halo4/json_m/gz/NewFileCat.txt.gz
      at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:242)
      at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
      at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:415)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
      at org.apache.hadoop.mapred.Child.main(Child.java:260)
      Caused by: java.io.IOException: stored gzip size doesn't match decompressed size
      at org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.executeTrailerState(BuiltInGzipDecompressor.java:389)
      at org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.decompress(BuiltInGzipDecompressor.java:224)
      at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:82)
      at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
      at java.io.InputStream.read(InputStream.java:102)
      at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
      at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:136)
      at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:40)
      at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
      at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
      at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
      at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
      ... 9 more
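
      The trace shows only that the size stored in the gzip trailer did not match the number of bytes the decompressor produced. RFC 1952 defines the trailer's ISIZE field as the uncompressed size modulo 2^32, so any check against a 64-bit byte counter has to reduce the counter to its low 32 bits first. The sketch below illustrates one way such a check can go wrong for large files in Java: masking with the int literal 0xffffffff instead of the long literal 0xffffffffL. The class and method names are hypothetical, and this is an assumption about the kind of defect involved, not a copy of the Hadoop source or of the attached patches.

```java
public class GzipTrailerCheck {

    // Hypothetical faulty check: 0xffffffff is an int literal equal to -1.
    // When widened to long it becomes all ones, so the mask changes nothing
    // and the full 64-bit count is compared with the 32-bit ISIZE value.
    public static boolean sizeMatchesIntMask(long storedISize, long bytesWritten) {
        return storedISize == (bytesWritten & 0xffffffff);
    }

    // Check with a long mask: keeps only the low 32 bits of the byte count,
    // which is what ISIZE holds according to RFC 1952.
    public static boolean sizeMatchesLongMask(long storedISize, long bytesWritten) {
        return storedISize == (bytesWritten & 0xffffffffL);
    }

    public static void main(String[] args) {
        long bytesWritten = 5L * 1024 * 1024 * 1024;   // 5 GiB of decompressed data
        long storedISize = bytesWritten % (1L << 32);  // value a gzip writer would store

        System.out.println("int mask:  " + sizeMatchesIntMask(storedISize, bytesWritten));
        System.out.println("long mask: " + sizeMatchesLongMask(storedISize, bytesWritten));
    }
}
```

      For inputs smaller than 4 GiB both checks agree, which would be consistent with the failure showing up only on a large file.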

      Attachments

        1. hadoop-8900.branch-1.patch
          4 kB
          Suresh Srinivas
        2. hadoop8900-2.txt
          5 kB
          Andy Isaacson
        3. BuiltInGzipDecompressor2.patch
          1 kB
          Slavik Krassovsky
        4. hadoop8900.txt
          4 kB
          Andy Isaacson

            People

              Assignee: Andy Isaacson (adi2)
              Reporter: Slavik Krassovsky (slavik_krassovsky)