Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Later
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: io
    • Labels: None
    • Make Gzipped input splittable by offering a tradeoff between "Spent resources" and "Wall clock time"

    Description

      Files compressed with the gzip codec are not splittable due to the nature of the codec.
      This limits how far you can scale out when reading large gzipped input files.

      Given that gunzipping a 1GiB file usually takes only 2 minutes, I figured that for some use cases spending extra resources may result in a shorter job time under certain conditions.
      So letting each split read the entire input file from the start and discard everything before its own range (wasting resources!) may lead to additional scalability.
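
The idea can be illustrated with a minimal Python sketch. It is not the code from the attached patches and does not use any Hadoop API. The function name `read_split` is hypothetical, and for simplicity it assumes that split boundaries are byte offsets in the uncompressed stream and that records are newline-terminated lines. Each split decompresses from the start of the file, throws away everything before its own range, and keeps only the lines that begin inside that range.

```python
import gzip


def read_split(path, start, end, chunk_size=64 * 1024):
    """Return the lines whose first byte lies in [start, end) of the
    uncompressed stream, decompressing from the start of the file.

    Hypothetical helper for illustration only; offsets are assumed to be
    uncompressed byte positions.
    """
    lines = []
    with gzip.open(path, "rb") as stream:
        pos = 0
        if start > 0:
            # Wasted work: decompress and discard everything before the split.
            # Stop one byte early so we can tell whether `start` begins a line.
            remaining = start - 1
            while remaining > 0:
                chunk = stream.read(min(chunk_size, remaining))
                if not chunk:
                    return lines  # file is shorter than the split start
                remaining -= len(chunk)
            pos = start - 1
            # Discard the rest of the line that belongs to the previous split.
            pos += len(stream.readline())
        # Keep every line that starts before `end`, even if it runs past it.
        while pos < end:
            line = stream.readline()
            if not line:
                break
            lines.append(line)
            pos += len(line)
    return lines
```

Calling `read_split` once per split with adjacent ranges returns every line exactly once across all splits. Note that the split covering the end of the file still decompresses the whole file, so in this sketch any wall clock gain comes from spreading the per-record processing, not the decompression itself.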

      Attachments

        1. HADOOP-7076-branch-0.22.patch
          40 kB
          Niels Basjes
        2. HADOOP-7076-2011-12-09-branch-0.22.patch
          40 kB
          Niels Basjes
        3. HADOOP-7076-2011-12-09.patch
          41 kB
          Niels Basjes
        4. HADOOP-7076-2011-12-04-2332.patch
          40 kB
          Niels Basjes
        5. HADOOP-7076-2011-08-05-2315.patch
          43 kB
          Niels Basjes
        6. HADOOP-7076-2011-08-05-2255.patch
          6 kB
          Niels Basjes
        7. HADOOP-7076-2011-05-18.patch
          43 kB
          Niels Basjes
        8. HADOOP-7076-2011-02-06.patch
          42 kB
          Niels Basjes
        9. HADOOP-7076-2011-02-05.patch
          42 kB
          Niels Basjes
        10. HADOOP-7076-2011-01-29.patch
          41 kB
          Niels Basjes
        11. HADOOP-7076-2011-01-26.patch
          40 kB
          Niels Basjes
        12. HADOOP-7076.patch
          40 kB
          Niels Basjes

          People

            Assignee: Niels Basjes (nielsbasjes)
            Reporter: Niels Basjes (nielsbasjes)
            Votes: 0
            Watchers: 24