  1. Hadoop Common
  2. HADOOP-8522

ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.3, 2.0.0-alpha
    • Fix Version/s: 3.0.0
    • Component/s: io
    • Hadoop Flags: Reviewed

    Description

      ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used. The issue is that finish() flushes the compressor buffer and writes the gzip CRC32 + data length trailer. After that, resetState() does not repeat the gzip header, but simply starts writing more deflate-compressed data. The resultant files are not readable by the Linux "gunzip" tool. ResetableGzipOutputStream should write valid multi-member gzip files.

      The gzip format is specified in RFC 1952.
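
For illustration, here is a minimal sketch of what a valid multi-member gzip file looks like: each member is a complete header + deflate data + CRC32/length trailer, and the members are simply concatenated. It uses only the JDK's java.util.zip classes, not Hadoop's ResetableGzipOutputStream, and the class and method names (MultiMemberGzipSketch, oneMember, twoMembers, readAll) are hypothetical names chosen for this example. It is not the code from the attached patches.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration only.
final class MultiMemberGzipSketch {

    /** Builds one complete gzip member: header, deflate data, CRC32 + length trailer. */
    static byte[] oneMember(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) { // constructor writes the header
            gz.write(text.getBytes(StandardCharsets.UTF_8));
            gz.finish(); // flushes the deflater and writes the trailer
        }
        return out.toByteArray();
    }

    /** Concatenates two members; every member starts with its own gzip header. */
    static byte[] twoMembers(String first, String second) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(oneMember(first));
        out.write(oneMember(second));
        return out.toByteArray();
    }

    /** Decompresses all members in sequence and returns the combined text. */
    static String readAll(byte[] gzipBytes) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipBytes))) {
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                plain.write(buf, 0, n);
            }
            return new String(plain.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = twoMembers("hello ", "world");
        System.out.println(readAll(data));
    }
}
```

The bug described above corresponds to skipping the header at the start of the second member: the bytes after the first trailer would then be raw deflate data, which a gzip reader cannot interpret as a new member.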

      Attachments

        1. HADOOP-8522.05.patch
          9 kB
          Christopher Douglas
        2. HADOOP-8522.06.patch
          9 kB
          Christopher Douglas
        3. HADOOP-8522.07.patch
          8 kB
          Christopher Douglas
        4. HADOOP-8522-4.patch
          7 kB
          Mike Percy

            People

              Assignee: mpercy Mike Percy
              Reporter: mpercy Mike Percy
              Votes: 0
              Watchers: 10