PIG-42

Pig should be able to split Gzip files like it can split Bzip files

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: impl
    • Labels: None

      Description

      It would be nice to be able to split gzip files the way we can split bzip files. Unfortunately, the gzip format has no sync point at which a split could start.

      The gzip file format does support concatenation: when gzipped files are concatenated, they are treated as a single file. So to make a gzipped file splittable, we can use an empty compressed file with some salt in its header as a sync signature, and place this signature between the compressed segments of the file.
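
The idea described above can be sketched roughly as follows. This is only an illustration, not the contents of gzip.patch: the salt value, the choice of the gzip FNAME header field to carry it, and the helper names (`make_sync_marker`, `write_splittable`, `read_split`) are all assumptions made for the example.

```python
import gzip
import struct
import zlib

# Hypothetical salt; a real one would be long and random so that it is
# very unlikely to occur by chance inside compressed data.
SALT = b"pig-sync-0123456789abcdef"


def make_sync_marker(salt=SALT):
    """Build an empty gzip member that carries the salt in its FNAME field."""
    # ID1 ID2, CM=8 (deflate), FLG=0x08 (FNAME), MTIME=0, XFL=0, OS=255
    header = b"\x1f\x8b\x08\x08" + struct.pack("<I", 0) + b"\x00\xff"
    raw = zlib.compressobj(wbits=-15)           # raw deflate, no zlib wrapper
    body = raw.compress(b"") + raw.flush()      # deflate stream for no data
    trailer = struct.pack("<II", 0, 0)          # CRC32 and ISIZE of empty input
    return header + salt + b"\x00" + body + trailer


def write_splittable(segments, marker):
    """Concatenate gzip members, each preceded by the sync marker."""
    out = bytearray()
    for seg in segments:
        out += marker
        out += gzip.compress(seg)
    return bytes(out)


def read_split(data, start, end, marker):
    """Decompress every segment whose marker begins in [start, end)."""
    chunks = []
    pos = data.find(marker, start)
    while pos != -1 and pos < end:
        nxt = data.find(marker, pos + len(marker))
        member_end = len(data) if nxt == -1 else nxt
        # The slice is marker + compressed segment: two concatenated members.
        chunks.append(gzip.decompress(data[pos:member_end]))
        pos = nxt
    return b"".join(chunks)
```

Because the marker is itself a valid (empty) gzip member, an ordinary gzip reader still sees the whole file as one stream, while a split-aware reader can scan forward from an arbitrary byte offset to the next marker and start decompressing there.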

      1. gzip.patch (15 kB, Benjamin Reed)

            People

            • Assignee: Benjamin Reed
            • Reporter: Benjamin Reed
            • Votes: 1
            • Watchers: 3
