Hadoop Common / HADOOP-72

Hadoop doesn't take advantage of distributed computing in TestDFSIO

Details

    • Type: Test
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: 0.2.0
    • Component/s: fs
    • Labels: None
    • Environment: 200 node cluster

    Description

      TestDFSIO runs N map tasks, each either writing to or reading from a separate file of the same size,
      and collects statistical information on each task's performance.
      The reducer then calculates the overall statistics across all maps.
      It outputs the following data:

      • read or write test
      • date and time the test finished
      • number of files
      • total number of bytes processed
      • overall throughput in mb/sec
      • average IO rate in mb/sec per file
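
      The last two figures are easy to confuse, so here is a minimal sketch of how they could be computed from per-map measurements. It assumes that the overall throughput is total megabytes divided by the summed I/O time of all maps, and that the average IO rate is the mean of the per-file rates. The class and method names are hypothetical, and the attached TestDFSIO.java may define these quantities differently.

```java
import java.util.Locale;

public class DfsIoStatsSketch {
    static final double MEGA = 1024.0 * 1024.0;

    // Overall throughput (assumed definition): total megabytes processed
    // divided by the sum of the per-map I/O times, in seconds.
    public static double overallThroughput(long[] bytes, long[] millis) {
        long totalBytes = 0;
        long totalMillis = 0;
        for (int i = 0; i < bytes.length; i++) {
            totalBytes += bytes[i];
            totalMillis += millis[i];
        }
        return (totalBytes / MEGA) / (totalMillis / 1000.0);
    }

    // Average IO rate (assumed definition): mean of the individual
    // per-file rates, each in megabytes per second.
    public static double averageIoRate(long[] bytes, long[] millis) {
        double sumOfRates = 0.0;
        for (int i = 0; i < bytes.length; i++) {
            sumOfRates += (bytes[i] / MEGA) / (millis[i] / 1000.0);
        }
        return sumOfRates / bytes.length;
    }

    public static void main(String[] args) {
        // Two illustrative 320 MB files; the timings are made up for the example.
        long[] bytes = {320L * 1024 * 1024, 320L * 1024 * 1024};
        long[] millis = {40_000L, 80_000L};
        System.out.printf(Locale.ROOT, "throughput mb/sec: %.2f%n",
                overallThroughput(bytes, millis));
        System.out.printf(Locale.ROOT, "average IO rate mb/sec: %.2f%n",
                averageIoRate(bytes, millis));
    }
}
```

      Under these assumptions the two numbers differ whenever the maps run at different speeds: the throughput is weighted by time, while the average IO rate treats every file equally.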

      Results
      I ran 7 iterations of the test, one after another, on a cluster of ~200 nodes.
      The file size is the same in all cases: 320 MB.
      The numbers of files tried were 1, 2, 4, 8, 16, 32, and 64.
      The log file with statistics is attached.
      It looks like we don't have any distributed computing here at all.
      The total execution time increases proportionally to the total size of the data, both for writes and for reads.
      Another thing is that the I/O rate for reads is only marginally higher than the rate for writes.
      For comparison, I attach timing measurements for the same I/O operations performed on the same cluster, but sequentially in a simple loop.
      This is the summary:

      Files   map/red time   sequential time
      1       49             34
      2       86             69
      4       158            131
      8       299            266
      16      569            532
      32      1131
      64      2218

      This doesn't look good, unless there is something wrong with my test (attached) or the cluster settings.
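
      To make the scaling claim easier to check, here is a small sketch that takes the numbers from the summary table and computes the map/red time per file and the ratio of map/red time to sequential time. The class and method names are hypothetical; the data is copied from the table as is, in whatever time unit the logs report, and sequential times exist only for 1 to 16 files.

```java
import java.util.Locale;

public class ScalingCheck {
    // Values copied from the summary table in this issue.
    static final int[] FILES = {1, 2, 4, 8, 16, 32, 64};
    static final int[] MAPRED_TIME = {49, 86, 158, 299, 569, 1131, 2218};
    // Sequential times were reported only for the first five runs.
    static final int[] SEQUENTIAL_TIME = {34, 69, 131, 266, 532};

    // Map/red time divided by the number of files in run i.
    public static double timePerFile(int i) {
        return (double) MAPRED_TIME[i] / FILES[i];
    }

    // How many times slower the map/red run was than the sequential loop.
    public static double mapRedVsSequential(int i) {
        return (double) MAPRED_TIME[i] / SEQUENTIAL_TIME[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < FILES.length; i++) {
            String ratio = i < SEQUENTIAL_TIME.length
                    ? String.format(Locale.ROOT, "%.2f", mapRedVsSequential(i))
                    : "n/a";
            System.out.printf(Locale.ROOT,
                    "%2d files: %.1f per file, map/red vs sequential = %s%n",
                    FILES[i], timePerFile(i), ratio);
        }
    }
}
```

      With ideal parallelism on ~200 nodes the total time would stay roughly constant, so the time per file would fall roughly as 1/N. With these numbers it stays between roughly 35 and 49 per file, and the map/red runs are never faster than the sequential loop.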

      Attachments

        1. TestDFSIO_results_200_node_cluster.log (3 kB, Konstantin Shvachko)
        2. TestDFSIO_results_sequential.log (0.6 kB, Konstantin Shvachko)
        3. TestDFSIO_results.log (4 kB, Konstantin Shvachko)
        4. TestDFSIO.java (13 kB, Konstantin Shvachko)

          People

            Assignee: Unassigned
            Reporter: Konstantin Shvachko (shv)