Hadoop Common
  1. Hadoop Common
  2. HADOOP-6153

RAgzip: multiple map tasks for a large gzipped file

    Details

    • Type: Improvement Improvement
    • Status: Patch Available
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.21.0
    • Fix Version/s: None
    • Component/s: io, native
    • Labels:
      None
    • Environment:

      It requires zlib 1.2.2.4 or higher. (We tested on zlib 1.2.3)

      Description

      It support to enable multiple map tasks for one large gzipped file.

      1. HADOOP-6153.patch
        40 kB
        Daehyun Kim
      2. hadoop-6153.txt
        41 kB
        Todd Lipcon

        Issue Links

          Activity

          Hide
          Daehyun Kim added a comment -

          The hadoop project is splited.
          So I split my patch.

          Show
          Daehyun Kim added a comment - The hadoop project is splited. So I split my patch.
          Hide
          Daehyun Kim added a comment -

          1. To merge RAGzip with GzipCodec class, I add a method 'getDecompressorForRandomAccess(...)' to GzipCodec class. This method return the Decompressor for random access.

          2. I add the '-makeap' option that is FS Shell Command. This option supports to make the access point of the gzip.

          Show
          Daehyun Kim added a comment - 1. To merge RAGzip with GzipCodec class, I add a method 'getDecompressorForRandomAccess(...)' to GzipCodec class. This method return the Decompressor for random access. 2. I add the '-makeap' option that is FS Shell Command. This option supports to make the access point of the gzip.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12414595/HADOOP-6153.patch
          against trunk revision 798093.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/589/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12414595/HADOOP-6153.patch against trunk revision 798093. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/589/console This message is automatically generated.
          Hide
          Daehyun Kim added a comment -

          My patch brings the hunk error.
          I think the reason I copied the text of the patch in my linux terminal and pasted the text to notepad.

          Show
          Daehyun Kim added a comment - My patch brings the hunk error. I think the reason I copied the text of the patch in my linux terminal and pasted the text to notepad.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12414706/HADOOP-6153.patch
          against trunk revision 798247.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/590/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/590/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/590/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/590/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12414706/HADOOP-6153.patch against trunk revision 798247. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/590/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/590/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/590/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/590/console This message is automatically generated.
          Hide
          Todd Lipcon added a comment -

          This patch fell out of date. I'm regenerating a new patch against trunk, will resubmit to Hudson soon.

          Show
          Todd Lipcon added a comment - This patch fell out of date. I'm regenerating a new patch against trunk, will resubmit to Hudson soon.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12428245/hadoop-6153.txt
          against trunk revision 891511.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/214/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/214/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/214/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/214/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12428245/hadoop-6153.txt against trunk revision 891511. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/214/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/214/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/214/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/214/console This message is automatically generated.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12428245/hadoop-6153.txt
          against trunk revision 1031422.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/72//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12428245/hadoop-6153.txt against trunk revision 1031422. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/72//console This message is automatically generated.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12428245/hadoop-6153.txt
          against trunk revision 1071364.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/291//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12428245/hadoop-6153.txt against trunk revision 1071364. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/291//console This message is automatically generated.
          Hide
          Harsh J added a comment -

          Sorry for coming in very late. The patch has gone quite stale for trunk today. Would you be willing to rebase?

          I've also linked in two related issues. Perhaps you may have comments on these alternative approaches, or they may on this?

          Show
          Harsh J added a comment - Sorry for coming in very late. The patch has gone quite stale for trunk today. Would you be willing to rebase? I've also linked in two related issues. Perhaps you may have comments on these alternative approaches, or they may on this?

            People

            • Assignee:
              Daehyun Kim
              Reporter:
              Daehyun Kim
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:

                Development