Hadoop Map/Reduce
MAPREDUCE-1832

Support for file sizes less than 1MB in DFSIO benchmark.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.2
    • Fix Version/s: 0.20.3, 0.21.0, 0.22.0
    • Component/s: benchmarks
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently the DFSIO benchmark only allows file sizes to be specified in 1MB increments. It would be useful to be able to specify smaller sizes.

      1. TestDFSIO-fsize.patch
        19 kB
        Konstantin Shvachko
      2. TestDFSIO-fsize-0.20.patch
        43 kB
        Konstantin Shvachko
      3. TestDFSIO-fsize.patch
        19 kB
        Konstantin Shvachko

          Activity

          Konstantin Shvachko added a comment -
          • This patch lets you specify file sizes in bytes, KB, MB, GB, and TB. For example, one can say -fileSize 37B.
            The default unit is still MB, so -fileSize 37 sets the file size to 37MB.
          • TestDFSIO now implements the Tool interface, which lets it accept generic options, including -D.
            In particular, this solves the problem reported in MAPREDUCE-1614. To specify the output directory, do this:
            TestDFSIO -Dtest.build.data=/user/me/DFSIO -write -nrFiles 12 -fileSize 1024KB

          • Usage is updated accordingly.
          • Implementing Tool places all the main startup logic in run(), so the methods no longer need to be static. I removed static.
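          The -fileSize handling described above can be sketched roughly as follows. This is an illustrative sketch, not the code from the attached patch: the class name FileSizeSketch, the Unit enum, and the parseSize method are hypothetical, and it assumes binary multiples (1KB = 1024 bytes) and case-insensitive suffixes, which the comment does not state explicitly.

```java
import java.util.Locale;

// Hypothetical sketch of parsing a -fileSize argument such as "37B", "37" or "1024KB".
// Not the patch's actual code; names and the binary-multiple assumption are illustrative.
class FileSizeSketch {

    /** Accepted unit suffixes, assumed here to be binary multiples of a byte. */
    enum Unit {
        B(1L), KB(1L << 10), MB(1L << 20), GB(1L << 30), TB(1L << 40);

        final long bytes;

        Unit(long bytes) {
            this.bytes = bytes;
        }
    }

    /** Returns the size in bytes; a bare number is interpreted as megabytes. */
    static long parseSize(String arg) {
        String s = arg.trim().toUpperCase(Locale.ROOT);

        // Split the argument into its leading digits and the trailing unit suffix.
        int i = 0;
        while (i < s.length() && Character.isDigit(s.charAt(i))) {
            i++;
        }
        if (i == 0) {
            throw new IllegalArgumentException("No number in size: " + arg);
        }
        long count = Long.parseLong(s.substring(0, i));
        String suffix = s.substring(i).trim();

        // No suffix keeps the old behaviour: the default unit is MB.
        // Unit.valueOf throws IllegalArgumentException for an unknown suffix.
        Unit unit = suffix.isEmpty() ? Unit.MB : Unit.valueOf(suffix);

        // multiplyExact fails loudly instead of silently overflowing a long.
        return Math.multiplyExact(count, unit.bytes);
    }

    public static void main(String[] args) {
        System.out.println(parseSize("37B"));     // 37
        System.out.println(parseSize("37"));      // 38797312 (37 MB)
        System.out.println(parseSize("1024KB"));  // 1048576 (1 MB)
    }
}
```

          Keeping the no-suffix case mapped to MB means existing invocations such as -fileSize 37 keep their previous meaning, while sizes below 1MB become expressible with the B and KB suffixes.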
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12446093/TestDFSIO-fsize.patch
          against trunk revision 950286.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 8 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/553/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/553/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/553/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/553/console

          This message is automatically generated.

          Konstantin Shvachko added a comment -

          This is a backport for branch 0.20. I had to apply several other patches to get it in sync with the current version of DFSIO, which is why this patch is larger than the patch for trunk.
          Applying to branch 0.21 should not be a problem.

          Konstantin Boudnik added a comment -

          +1 patch looks good. A nit (for a couple of places):

          • use DFSConfigKeys.DFS_SUPPORT_APPEND_KEY instead of "dfs.support.append"

          Other than that looking real good.

          Konstantin Boudnik added a comment -

          BTW, I think it is worth closing MAPREDUCE-1614 as a dup of this JIRA.

          Konstantin Shvachko added a comment -

          Using DFSConfigKeys.DFS_SUPPORT_APPEND_KEY, as Cos pointed out.

          Konstantin Shvachko added a comment -

          I just committed this.


            People

            • Assignee:
              Konstantin Shvachko
              Reporter:
              Konstantin Shvachko