Hadoop Map/Reduce / MAPREDUCE-1832

Support for file sizes less than 1MB in DFSIO benchmark.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.2
    • Fix Version/s: 0.20.3, 0.21.0, 0.22.0
    • Component/s: benchmarks
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently the DFSIO benchmark only allows file sizes to be specified in 1MB increments. It would be useful to be able to specify smaller sizes.

      1. TestDFSIO-fsize.patch
        19 kB
        Konstantin Shvachko
      2. TestDFSIO-fsize-0.20.patch
        43 kB
        Konstantin Shvachko
      3. TestDFSIO-fsize.patch
        19 kB
        Konstantin Shvachko

        Issue Links

          Activity

          Transition                     Time In Source Status   Execution Times   Last Executer         Last Execution Date
          Open -> Patch Available        1h 22m                  1                 Konstantin Shvachko   02/Jun/10 03:15
          Patch Available -> Resolved    1d 23h 53m              1                 Konstantin Shvachko   04/Jun/10 03:09
          Resolved -> Closed             81d 19h 12m             1                 Tom White             24/Aug/10 22:21
          Gavin made changes -
          Link This issue is depended upon by WHIRR-662 [ WHIRR-662 ]
          Gavin made changes -
          Link This issue blocks WHIRR-662 [ WHIRR-662 ]
          Steve Loughran made changes -
          Link This issue blocks WHIRR-662 [ WHIRR-662 ]
          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Konstantin Shvachko made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Fix Version/s 0.21.0 [ 12314045 ]
          Fix Version/s 0.22.0 [ 12314184 ]
          Resolution Fixed [ 1 ]
          Konstantin Shvachko added a comment -

          I just committed this.

          Konstantin Shvachko made changes -
          Attachment TestDFSIO-fsize.patch [ 12446291 ]
          Konstantin Shvachko added a comment -

          Using DFSConfigKeys.DFS_SUPPORT_APPEND_KEY, as Cos pointed out.

          Konstantin Boudnik added a comment -

          BTW, I think it is worth closing MAPREDUCE-1614 as a dup of this JIRA.

          Konstantin Boudnik added a comment -

          +1 patch looks good. A nit (for a couple of places):

          • use DFSConfigKeys.DFS_SUPPORT_APPEND_KEY instead of "dfs.support.append"

          Other than that looking real good.

          Konstantin Shvachko made changes -
          Attachment TestDFSIO-fsize-0.20.patch [ 12446209 ]
          Konstantin Shvachko added a comment -

          This is a backport for branch 0.20. I had to apply several other patches in order to get it in sync with the current version of DFSIO. That is why the patch is larger than the patch for trunk.
          Applying to branch 0.21 should not be a problem.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12446093/TestDFSIO-fsize.patch
          against trunk revision 950286.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 8 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/553/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/553/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/553/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/553/console

          This message is automatically generated.

          Konstantin Shvachko made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Konstantin Shvachko made changes -
          Attachment TestDFSIO-fsize.patch [ 12446093 ]
          Konstantin Shvachko added a comment -
          • This patch lets you specify file sizes in bytes, KB, MB, GB, and TB. That is, one can say -fileSize 37B.
            The default unit is still MB. That is, -fileSize 37 sets the file size to 37MB.
          • TestDFSIO now implements the Tool interface, which lets it accept generic options, including -D.
            In particular, this solves the problem reported in MAPREDUCE-1614. To specify the output directory, do this:
            TestDFSIO -Dtest.build.data=/user/me/DFSIO -write -nrFiles 12 -fileSize 1024KB
            
          • Usage is updated accordingly.
          • Implementing Tool places all the main startup logic in run(), which eliminates the need for methods to remain static. I removed static.
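
The -fileSize handling described in the comment above can be illustrated with a small standalone sketch. This is not the code from the attached patch: the class name FileSizeParser, the method parseSize, and the choice of binary multipliers (1KB = 1024 bytes) are assumptions made for illustration only. It shows one way to accept B, KB, MB, GB and TB suffixes while treating a bare number as MB.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not taken from TestDFSIO-fsize.patch.
class FileSizeParser {
    // Assumption: binary units, i.e. 1KB = 1024 bytes.
    private static final long KB = 1024L;
    private static final long MB = KB * 1024L;
    private static final long GB = MB * 1024L;
    private static final long TB = GB * 1024L;

    /** Converts strings such as "37B", "1024KB" or "37" (no suffix means MB) to bytes. */
    static long parseSize(String arg) {
        String s = arg.trim().toUpperCase(Locale.ROOT);
        long multiplier = MB; // a bare number keeps the old meaning: megabytes
        int suffixLen = 0;
        // Two-letter suffixes are checked before the single "B" suffix,
        // because every one of them also ends with "B".
        if (s.endsWith("KB")) {
            multiplier = KB; suffixLen = 2;
        } else if (s.endsWith("MB")) {
            multiplier = MB; suffixLen = 2;
        } else if (s.endsWith("GB")) {
            multiplier = GB; suffixLen = 2;
        } else if (s.endsWith("TB")) {
            multiplier = TB; suffixLen = 2;
        } else if (s.endsWith("B")) {
            multiplier = 1L; suffixLen = 1;
        }
        String digits = s.substring(0, s.length() - suffixLen).trim();
        long value = Long.parseLong(digits); // NumberFormatException on malformed input
        if (value < 0) {
            throw new IllegalArgumentException("File size must not be negative: " + arg);
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(value, multiplier);
    }

    public static void main(String[] args) {
        System.out.println(parseSize("37B"));    // 37
        System.out.println(parseSize("37"));     // 38797312
        System.out.println(parseSize("1024KB")); // 1048576
    }
}
```

Under these assumptions, -fileSize 1024KB and -fileSize 1 resolve to the same number of bytes, so existing invocations that pass a bare number keep their old meaning.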
          Konstantin Shvachko made changes -
          Link This issue is duplicated by HDFS-1182 [ HDFS-1182 ]
          Konstantin Shvachko made changes -
          Field Original Value New Value
          Link This issue incorporates MAPREDUCE-1614 [ MAPREDUCE-1614 ]
          Konstantin Shvachko created issue -

            People

            • Assignee:
              Konstantin Shvachko
              Reporter:
              Konstantin Shvachko
            • Votes:
              0
              Watchers:
              0
