Hadoop Common / HADOOP-8522

ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.3, 2.0.0-alpha
    • Fix Version/s: 3.0.0
    • Component/s: io
    • Hadoop Flags: Reviewed

    Description

      ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used. finish() flushes the compressor buffer and writes the gzip trailer (CRC32 and uncompressed data length). After that, resetState() does not write a new gzip header; it simply starts writing more deflate-compressed data. The resulting files are not readable by the Linux "gunzip" tool. ResetableGzipOutputStream should write valid multi-member gzip files.

      The gzip format is specified in RFC 1952.
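
      To illustrate what a valid multi-member gzip stream looks like, here is a minimal sketch using only the JDK's java.util.zip classes. It is not the code from the attached patches, and the class and method names (MultiMemberGzipSketch, writeMembers, readAll) are hypothetical. Each member gets its own header, deflate data, and trailer, which is the step the description says resetState() skips.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration class; not part of Hadoop.
public class MultiMemberGzipSketch {

    // Writes each chunk as its own gzip member (header + deflate data + trailer)
    // into one shared byte sink, producing a concatenated multi-member stream.
    public static byte[] writeMembers(String... chunks) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            for (String chunk : chunks) {
                // The constructor writes a fresh gzip header for this member.
                // Closing the member also closes the sink, which is harmless
                // here only because ByteArrayOutputStream.close() is a no-op.
                try (GZIPOutputStream member = new GZIPOutputStream(sink)) {
                    member.write(chunk.getBytes(StandardCharsets.UTF_8));
                    // finish() flushes the deflater and writes the CRC32 and
                    // length trailer, ending this member.
                    member.finish();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Decompresses a (possibly multi-member) gzip stream into one string.
    // GZIPInputStream continues into the next member after each trailer.
    public static String readAll(byte[] gzipBytes) {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(gzipBytes))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                plain.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(plain.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] gz = writeMembers("hello ", "world");
        System.out.println(readAll(gz)); // prints "hello world"
    }
}
```

      With a real file or socket stream, one would call finish() without closing the member, so the underlying stream stays open for the next header. Output written this way should also be accepted by gunzip, since every trailer is followed by a new header rather than by bare deflate data.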

      Attachments

        1. HADOOP-8522.05.patch (9 kB, Christopher Douglas)
        2. HADOOP-8522.06.patch (9 kB, Christopher Douglas)
        3. HADOOP-8522.07.patch (8 kB, Christopher Douglas)
        4. HADOOP-8522-4.patch (7 kB, Mike Percy)


          People

            Assignee: Mike Percy (mpercy)
            Reporter: Mike Percy (mpercy)
            Votes: 0
            Watchers: 10

            Dates

              Created:
              Updated:
              Resolved:
