Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.2
    • Fix Version/s: 0.22.0
    • Component/s: io
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Processing of concatenated gzip files formerly stopped (quietly) at the end of the first substream/"member"; now processing will continue to the end of the concatenated stream, like gzip(1) does. (bzip2 support is unaffected by this patch.)

      Description

      When running MapReduce with concatenated gzip files as input, only the first member is read and the rest of the input is silently dropped, which is confusing, to say the least. Concatenated gzip is described in http://www.gnu.org/software/gzip/manual/gzip.html#Advanced-usage and in http://www.ietf.org/rfc/rfc1952.txt. (See the original report at http://www.nabble.com/Problem-with-Hadoop-and-concatenated-gzip-files-to21383097.html)
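The behavior described above can be reproduced outside Hadoop. The following is a minimal sketch in Python, not the Hadoop code or the patch itself: it uses the standard `zlib` module to contrast a reader that stops after the first gzip member with one that keeps going until the input is exhausted, which is what gzip(1) does. The function names `read_first_member` and `read_all_members` are hypothetical and chosen only for illustration.

```python
import gzip
import zlib


def read_first_member(data: bytes) -> bytes:
    """Decode only the first gzip member; anything after it is ignored."""
    decoder = zlib.decompressobj(wbits=31)  # 31 = expect a gzip header and trailer
    return decoder.decompress(data)


def read_all_members(data: bytes) -> bytes:
    """Decode every gzip member in sequence, as gzip(1) does."""
    out = []
    while data:
        decoder = zlib.decompressobj(wbits=31)
        out.append(decoder.decompress(data))
        out.append(decoder.flush())
        # Bytes left over after the end of this member belong to the next one.
        data = decoder.unused_data
    return b"".join(out)


# Two independently compressed members, concatenated byte for byte
# (equivalent to `cat a.gz b.gz > both.gz`).
part1 = gzip.compress(b"first line\n")
part2 = gzip.compress(b"second line\n")
concatenated = part1 + part2

print(read_first_member(concatenated))  # b'first line\n'
print(read_all_members(concatenated))   # b'first line\nsecond line\n'
```

The first function shows the pre-fix symptom (the second member is lost without any error), and the second shows the behavior the release note describes. Python's own `gzip.decompress` already handles multiple members, so the loop is spelled out here only to make the member boundary visible.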

      Attachments

      1. C6835-9.patch
        49 kB
        Chris Douglas
      2. grr-hadoop-common.dif.20100614c
        37 kB
        Greg Roelofs
      3. grr-hadoop-mapreduce.dif.20100614c
        25 kB
        Greg Roelofs
      4. HADOOP-6835.v3.yahoo-0.20.2xx-branch.patch
        82 kB
        Greg Roelofs
      5. HADOOP-6835.v4.trunk-hadoop-common.patch
        18 kB
        Greg Roelofs
      6. HADOOP-6835.v4.trunk-hadoop-mapreduce.patch
        47 kB
        Greg Roelofs
      7. HADOOP-6835.v4.yahoo-0.20.2xx-branch.patch
        83 kB
        Greg Roelofs
      8. HADOOP-6835.v5.trunk-hadoop-common.patch
        20 kB
        Greg Roelofs
      9. HADOOP-6835.v6.trunk-hadoop-common.patch
        41 kB
        Greg Roelofs
      10. HADOOP-6835.v7.trunk-hadoop-common.patch
        41 kB
        Greg Roelofs
      11. HADOOP-6835.v8.trunk-hadoop-common.patch
        47 kB
        Greg Roelofs
      12. HADOOP-6835.v9.yahoo-0.20.2xx-branch.patch
        46 kB
        Greg Roelofs
      13. MR-469.v2.yahoo-0.20.2xx-branch.patch
        69 kB
        Greg Roelofs


            People

            • Assignee:
              Greg Roelofs
              Reporter:
              Tom White
            • Votes:
              2
              Watchers:
              14
