Hadoop Map/Reduce / MAPREDUCE-1428

Make block size and the size of archive created files configurable.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: harchive
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently, archives use the default block size of the HDFS filesystem. We need to make this configurable so that a larger block size can be used for the part files that archives create.
      We also need to make the size of the part files in archives configurable, so that the part files can be made larger and fewer of them are created.
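As a rough illustration of why a larger part-file size yields fewer part files, here is a minimal sketch of the arithmetic. The class name, method name, and the sizes used are hypothetical and are not taken from the patch. The sketch assumes data is packed tightly into part files; the real archiver may split work differently, so treat the result as a lower bound only.

```java
public class PartFileEstimate {
    // Hypothetical helper (not part of Hadoop): minimum number of part files
    // needed to hold totalBytes when each part file holds at most partSizeBytes.
    static long minPartFiles(long totalBytes, long partSizeBytes) {
        if (totalBytes < 0 || partSizeBytes <= 0) {
            throw new IllegalArgumentException("invalid sizes");
        }
        // Ceiling division written to avoid overflow on large inputs.
        return totalBytes / partSizeBytes + (totalBytes % partSizeBytes == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        long total = 10L * 1024 * 1024 * 1024; // 10 GiB of archived data (illustrative)
        long small = 512L * 1024 * 1024;       // 512 MiB per part file (illustrative)
        long large = 2L * 1024 * 1024 * 1024;  // 2 GiB per part file (illustrative)
        System.out.println("small parts: " + minPartFiles(total, small));
        System.out.println("large parts: " + minPartFiles(total, large));
    }
}
```

With these illustrative numbers, quadrupling the part-file size cuts the lower bound on the number of part files by a factor of four.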

      1. BinaryFileGenerator.java
        7 kB
        Tsz Wo Nicholas Sze
      2. BinaryFileGenerator.java
        10 kB
        Tsz Wo Nicholas Sze
      3. BinaryFileGenerator.java
        18 kB
        Tsz Wo Nicholas Sze
      4. MAPREDUCE-1428.patch
        9 kB
        Mahadev konar
      5. MAPREDUCE-1428.patch
        9 kB
        Mahadev konar

          Activity

          Tsz Wo Nicholas Sze added a comment -

          Here is a program to generate files with different sizes. I used it to test archive with max_k=32.

          BinaryFileGenerator.java: generate files with sizes 0, 2^k-1, 2^k and 2^k+1 for k=1,.., max_k.
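The size pattern described above can be sketched as follows. This is not the attached BinaryFileGenerator.java; it only enumerates the sizes and writes no files. The class and method names are hypothetical, and collapsing repeated sizes (3 arises from both k=1 and k=2) into a sorted set is an assumption made for illustration.

```java
import java.util.SortedSet;
import java.util.TreeSet;

public class TestFileSizes {
    // Hypothetical helper: the distinct sizes 0, 2^k-1, 2^k and 2^k+1
    // for k = 1..maxK, in ascending order.
    static SortedSet<Long> sizes(int maxK) {
        if (maxK < 1 || maxK > 61) {
            throw new IllegalArgumentException("maxK must be in [1, 61]");
        }
        SortedSet<Long> result = new TreeSet<>();
        result.add(0L);
        for (int k = 1; k <= maxK; k++) {
            long p = 1L << k; // 2^k; long is required because max_k=32 exceeds int range
            result.add(p - 1);
            result.add(p);
            result.add(p + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        // Print the sizes for a small maxK to show the pattern.
        System.out.println(sizes(3));
    }
}
```

Sizes of the form 2^k-1, 2^k and 2^k+1 sit just below, at, and just above power-of-two boundaries, which is where off-by-one errors in block and buffer handling tend to surface.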

          Tsz Wo Nicholas Sze added a comment -

          BinaryFileGenerator.java: also generate random size files in many directories.

          Tsz Wo Nicholas Sze added a comment -

          BinaryFileGenerator.java: added a verifier.

          dhruba borthakur added a comment -

          Waiting expectantly for this patch to get committed...

          Rodrigo Schmidt added a comment -

          Mahadev, are you working on this? If not, I can take a stab at it during the weekend. Just let me know.

          Mahadev konar added a comment -

          Rodrigo, I already have a patch in place for it. I just need to add some unit tests. Hopefully I'll upload a patch by this weekend, given that our ZooKeeper release candidate is out ...

          Rodrigo Schmidt added a comment -

          Great! I'll be happy to review it then.

          Mahadev konar added a comment -

          This patch fixes the issue and adds a test case.

          Mahadev konar added a comment -

          Also, I will be adding documentation in MAPREDUCE-1514 for the improvements in this JIRA.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12440165/MAPREDUCE-1428.patch
          against trunk revision 928104.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 2220 javac compiler warnings (more than the trunk's current 2219 warnings).

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/70/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/70/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/70/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/70/console

          This message is automatically generated.

          Mahadev konar added a comment -

          Trying Hudson again!

          Tsz Wo Nicholas Sze added a comment -

          +1 patch looks good.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12440165/MAPREDUCE-1428.patch
          against trunk revision 929712.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 2220 javac compiler warnings (more than the trunk's current 2219 warnings).

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/82/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/82/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/82/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/82/console

          This message is automatically generated.

          Mahadev konar added a comment -

          The javac warning is caused by:

          [deprecation] create(org.apache.hadoop.fs.Path,org.apache.hadoop.fs.permission.FsPermission,boolean,int,short,long,org.apache.hadoop.util.Progressable) in org.apache.hadoop.fs.FileSystem has been deprecated
          [javac]         partStream = destFs.create(tmpOutput, new FsPermission((short)0700),

          I don't see a solution around it.

          Mahadev konar added a comment -

          This patch fixes the javac warning!

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12440531/MAPREDUCE-1428.patch
          against trunk revision 930088.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/88/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/88/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/88/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/88/console

          This message is automatically generated.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12440531/MAPREDUCE-1428.patch
          against trunk revision 930088.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/89/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/89/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/89/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/89/console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          I have committed this. Thanks, Mahadev!

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #300 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/300/)

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #278 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/278/)
          Make block size and the size of archive created files configurable. Contributed by mahadev


            People

            • Assignee:
              Mahadev konar
              Reporter:
              Mahadev konar
            • Votes:
              0
              Watchers:
              5

              Dates

              • Created:
                Updated:
                Resolved:
