Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Later
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: io
    • Labels:
      None
    • Release Note:
      Make gzipped input splittable by offering a tradeoff between "spent resources" and "wall clock time".

      Description

      Files compressed with the gzip codec are not splittable because of the nature of the codec: a gzip stream can only be decompressed from its beginning.
      This limits how far you can scale out when reading large gzipped input files.

      Gunzipping a 1 GiB file usually takes only about 2 minutes, so for some use cases spending extra resources may result in a shorter job time under certain conditions.
      Reading the entire input file from the start for each split wastes resources, but it lets several tasks work on one file in parallel and may therefore lead to additional scalability.
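
The idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the attached patches: the function names `read_split` and `demo`, the use of uncompressed byte offsets as split boundaries, and the rule that a line belongs to the split in which its first byte falls are all assumptions made for the example.

```python
import gzip
import os
import tempfile


def read_split(path, start, end):
    """Yield the lines whose first byte lies at an uncompressed offset in [start, end).

    Every split decompresses the file from the very beginning and throws away
    everything before `start`; that discarded work is the "wasted" resource.
    """
    offset = 0
    with gzip.open(path, "rb") as stream:
        for line in stream:
            if offset >= end:
                break  # past this split: stop decompressing early
            if offset >= start:
                yield line  # the line starts inside this split
            offset += len(line)


def demo(n_splits=4):
    """Write a small gzipped file and read it back as independent splits."""
    lines = [("record-%04d\n" % i).encode("ascii") for i in range(1000)]
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    try:
        with gzip.open(path, "wb") as out:
            out.writelines(lines)
        total = sum(len(line) for line in lines)
        # Hypothetical split boundaries, expressed as uncompressed offsets.
        bounds = [total * i // n_splits for i in range(n_splits + 1)]
        parts = [list(read_split(path, bounds[i], bounds[i + 1]))
                 for i in range(n_splits)]
        return lines, parts
    finally:
        os.remove(path)
```

In this sketch each call to `read_split` could run in a separate task. The task for the last split still decompresses the whole file, so decompression itself does not get faster; any gain would come from spreading the per-record processing over several tasks, at the price of decompressing the early part of the file multiple times.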

      1. HADOOP-7076.patch
        40 kB
        Niels Basjes
      2. HADOOP-7076-2011-01-26.patch
        40 kB
        Niels Basjes
      3. HADOOP-7076-2011-01-29.patch
        41 kB
        Niels Basjes
      4. HADOOP-7076-2011-02-05.patch
        42 kB
        Niels Basjes
      5. HADOOP-7076-2011-02-06.patch
        42 kB
        Niels Basjes
      6. HADOOP-7076-2011-05-18.patch
        43 kB
        Niels Basjes
      7. HADOOP-7076-2011-08-05-2255.patch
        6 kB
        Niels Basjes
      8. HADOOP-7076-2011-08-05-2315.patch
        43 kB
        Niels Basjes
      9. HADOOP-7076-2011-12-04-2332.patch
        40 kB
        Niels Basjes
      10. HADOOP-7076-branch-0.22.patch
        40 kB
        Niels Basjes
      11. HADOOP-7076-2011-12-09.patch
        41 kB
        Niels Basjes
      12. HADOOP-7076-2011-12-09-branch-0.22.patch
        40 kB
        Niels Basjes


            People

            • Assignee:
              Niels Basjes
            • Reporter:
              Niels Basjes
            • Votes:
              0
            • Watchers:
              23
