Hadoop Map/Reduce / MAPREDUCE-5125

TestDFSIO should write less compressible data

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.0.3-alpha, 1.1.2
    • Fix Version/s: None
    • Component/s: test
    • Labels: None

    Description

      Currently, TestDFSIO fills its output with a short repeating sequence of bytes, (byte)0 through (byte)50. This makes the output highly compressible (I measured 250:1 by LZO-compressing the resulting file). As a result, TestDFSIO numbers are very hard to compare between HDFS and other file systems that may apply compression on the network, on disk, or both: what is ostensibly a benchmark of IO throughput ends up heavily skewed in favor of the system with compression.
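
To make the compressibility gap concrete, here is a minimal sketch (not the TestDFSIO code and not a proposed patch) that compares the repeating pattern described above with seeded pseudo-random bytes. The class and method names are hypothetical, the pattern is reproduced from the description rather than from the TestDFSIO source, and DEFLATE from `java.util.zip` stands in for LZO, so the exact ratios will differ from the 250:1 measured above.

```java
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical helper class; not part of Hadoop.
public class CompressibilitySketch {

    // Mimics the pattern from the description: bytes 0 through 50, repeated.
    static byte[] repeatingPattern(int size) {
        byte[] buf = new byte[size];
        for (int i = 0; i < size; i++) {
            buf[i] = (byte) (i % 51);
        }
        return buf;
    }

    // One possible alternative (an assumption, not the project's fix):
    // seeded pseudo-random bytes, reproducible from run to run.
    static byte[] pseudoRandom(int size, long seed) {
        byte[] buf = new byte[size];
        new Random(seed).nextBytes(buf);
        return buf;
    }

    // Returns uncompressed size divided by DEFLATE-compressed size.
    static double compressionRatio(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        byte[] out = new byte[64 * 1024];
        long compressed = 0;
        while (!deflater.finished()) {
            compressed += deflater.deflate(out);
        }
        deflater.end();
        return (double) data.length / compressed;
    }

    public static void main(String[] args) {
        int size = 1 << 20; // 1 MiB sample buffer
        System.out.printf("repeating: %.1f:1%n", compressionRatio(repeatingPattern(size)));
        System.out.printf("random:    %.1f:1%n", compressionRatio(pseudoRandom(size, 42L)));
    }
}
```

A fixed seed keeps the generated data identical across runs, which matters for a benchmark whose results are compared between systems.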

            People

              Assignee: Unassigned
              Reporter: Todd Lipcon (tlipcon)
              Votes: 0
              Watchers: 8

              Dates

                Created:
                Updated: