  1. Hadoop Common
  2. HADOOP-8522

ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.3, 2.0.0-alpha
    • Fix Version/s: 3.0.0
    • Component/s: io
    • Hadoop Flags:
      Reviewed

      Description

      ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used. finish() flushes the compressor buffer and writes the gzip trailer (CRC32 plus uncompressed data length). A subsequent resetState() does not write a new gzip header; it simply starts writing more deflate-compressed data. The resulting files cannot be read by the Linux "gunzip" tool. ResetableGzipOutputStream should instead write valid multi-member gzip files.

      The gzip format is specified in RFC 1952.
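
      For illustration only, here is a minimal sketch of what a valid multi-member gzip stream looks like, using only the JDK's java.util.zip classes. It is not the Hadoop fix and does not touch ResetableGzipOutputStream; the class and method names (MultiMemberGzipSketch, writeMembers, readAll) are hypothetical. Each member gets its own header and trailer, which is the behavior the description asks for after finish() and resetState().

      ```java
      import java.io.ByteArrayInputStream;
      import java.io.ByteArrayOutputStream;
      import java.io.IOException;
      import java.nio.charset.StandardCharsets;
      import java.util.zip.GZIPInputStream;
      import java.util.zip.GZIPOutputStream;

      // Hypothetical demo class; not part of Hadoop.
      public class MultiMemberGzipSketch {

          /** Writes each chunk as a separate gzip member: header, deflate data, trailer. */
          public static byte[] writeMembers(String... chunks) throws IOException {
              ByteArrayOutputStream sink = new ByteArrayOutputStream();
              for (String chunk : chunks) {
                  // A new GZIPOutputStream writes a fresh gzip header to the sink.
                  // Closing it writes the CRC32 + length trailer. Closing a
                  // ByteArrayOutputStream has no effect, so the sink stays usable;
                  // with a file stream, call finish() instead of close().
                  try (GZIPOutputStream member = new GZIPOutputStream(sink)) {
                      member.write(chunk.getBytes(StandardCharsets.UTF_8));
                  }
              }
              return sink.toByteArray();
          }

          /** Decompresses all concatenated members into one string. */
          public static String readAll(byte[] gz) throws IOException {
              try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
                  ByteArrayOutputStream out = new ByteArrayOutputStream();
                  byte[] buf = new byte[4096];
                  int n;
                  while ((n = in.read(buf)) != -1) {
                      out.write(buf, 0, n);
                  }
                  return new String(out.toByteArray(), StandardCharsets.UTF_8);
              }
          }

          public static void main(String[] args) throws IOException {
              byte[] gz = writeMembers("hello ", "world");
              System.out.println(readAll(gz));
          }
      }
      ```

      A stream produced this way begins every member with the gzip magic bytes 0x1f 0x8b, so tools that follow RFC 1952 can decode the members back to back.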

        Attachments

        1. HADOOP-8522.05.patch
          9 kB
          Chris Douglas
        2. HADOOP-8522.06.patch
          9 kB
          Chris Douglas
        3. HADOOP-8522.07.patch
          8 kB
          Chris Douglas
        4. HADOOP-8522-4.patch
          7 kB
          Mike Percy

              People

              • Assignee:
                mpercy Mike Percy
                Reporter:
                mpercy Mike Percy
              • Votes:
                0
                Watchers:
                10
