  1. Pig
  2. PIG-42

Pig should be able to split Gzip files like it can split Bzip files

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: impl
    • Labels:
      None

      Description

      It would be nice to be able to split gzip files the way we can split bzip files. Unfortunately, the gzip format has no sync point at which a split could begin.

      The gzip file format supports concatenated gzip files: when gzipped files are concatenated, they are treated as a single file. So to make a gzipped file splittable, we can use an empty compressed file with some salt in its header as a sync signature, and place this signature between the compressed segments of the file.

        Attachments

        1. gzip.patch
          15 kB
          Benjamin Reed

              People

              • Assignee:
                breed Benjamin Reed
                Reporter:
                breed Benjamin Reed
              • Votes:
                1
                Watchers:
                3

                Dates

                • Created:
                  Updated:
                  Resolved: