Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Later
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: io
    • Labels:
      None
    • Target Version/s:
    • Release Note:
      Make gzipped input splittable by offering a tradeoff between resources spent and wall-clock time.

      Description

      Files compressed with the gzip codec are not splittable due to the nature of the codec.
      This limits your options for scaling out when reading large gzipped input files.

      Gunzipping a 1 GiB file usually takes only 2 minutes, so I figured that for some use cases, under certain conditions, wasting some resources may result in a shorter job time.
      Reading the entire input file from the start for each split wastes resources, but it may allow the job to scale out further.
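
      The idea can be illustrated with a minimal sketch. This is not the code from the attached patches and does not use any Hadoop API; it only shows the principle with Python's standard gzip module. The names read_split and split_ranges are hypothetical, and the sketch assumes that split boundaries are offsets into the uncompressed stream and that the total uncompressed size is known. Each split decompresses the file from the beginning, discards everything before its own range, and keeps only the lines whose first byte falls inside that range.

```python
import gzip
import io


def split_ranges(total, n):
    """Cut [0, total) into at most n contiguous ranges (hypothetical helper)."""
    size = -(-total // n)  # ceiling division
    return [(s, min(s + size, total)) for s in range(0, total, size)]


def read_split(gz_bytes, start, end):
    """Return the lines whose first byte lies in [start, end).

    Offsets are positions in the UNCOMPRESSED stream (an assumption of
    this sketch). The stream is always decompressed from the beginning;
    the bytes before `start` are read and thrown away, which is the
    "wasted" work that buys the extra parallelism.
    """
    records = []
    with gzip.GzipFile(fileobj=io.BytesIO(gz_bytes)) as stream:
        pos = 0  # uncompressed offset of the first byte of the current line
        for line in stream:
            if pos >= end:
                break  # everything from here on belongs to later splits
            if pos >= start:
                records.append(line)  # line starts inside this split
            pos += len(line)
    return records
```

      In this sketch a line that straddles a boundary belongs to the split in which it starts, so every line is produced exactly once across all splits. The last split decompresses the whole file while the first stops early, so the wall-clock time is bounded by one full decompression rather than by processing the whole file in a single task.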

        Attachments

        1. HADOOP-7076-branch-0.22.patch
          40 kB
          Niels Basjes
        2. HADOOP-7076-2011-12-09-branch-0.22.patch
          40 kB
          Niels Basjes
        3. HADOOP-7076-2011-12-09.patch
          41 kB
          Niels Basjes
        4. HADOOP-7076-2011-12-04-2332.patch
          40 kB
          Niels Basjes
        5. HADOOP-7076-2011-08-05-2315.patch
          43 kB
          Niels Basjes
        6. HADOOP-7076-2011-08-05-2255.patch
          6 kB
          Niels Basjes
        7. HADOOP-7076-2011-05-18.patch
          43 kB
          Niels Basjes
        8. HADOOP-7076-2011-02-06.patch
          42 kB
          Niels Basjes
        9. HADOOP-7076-2011-02-05.patch
          42 kB
          Niels Basjes
        10. HADOOP-7076-2011-01-29.patch
          41 kB
          Niels Basjes
        11. HADOOP-7076-2011-01-26.patch
          40 kB
          Niels Basjes
        12. HADOOP-7076.patch
          40 kB
          Niels Basjes

              People

              • Assignee:
                Niels Basjes (nielsbasjes)
              • Reporter:
                Niels Basjes (nielsbasjes)
              • Votes:
                0
              • Watchers:
                26
