Hadoop Common / HADOOP-3285

map tasks with node local splits do not always read from local nodes

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.17.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      I ran a simple map/reduce job that counts the number of records in the input data.
      The number of reducers was set to 1.
      I did not set the number of mappers, so by default every split except the last split of a file contains one dfs block (128MB in my case).
      The web GUI indicated that 99% of the map tasks had local splits.
      Thus I expected most of the dfs reads to come from the local data nodes.
      However, when I examined the traffic on the network interfaces,
      I found that about 50% of each node's traffic went through the loopback interface and the other 50% went through the ethernet card!
      Also, the switch monitoring indicated that a lot of traffic went through the links and across racks!
      This indicates that the data locality feature does not work as expected.
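      The report does not say how the per-interface traffic was measured. One possible way to reproduce the comparison on a Linux node is to snapshot the received-byte counters in /proc/net/dev before and after the job and compute the loopback share of the difference. The sketch below assumes that file format; the class and method names are hypothetical and not part of Hadoop.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: compares loopback vs. other
// interfaces using received-byte counters in /proc/net/dev format.
final class LoopbackShareSketch {

    /** Parses /proc/net/dev-style text into interface name -> received bytes. */
    static Map<String, Long> receivedBytes(String procNetDev) {
        Map<String, Long> totals = new HashMap<>();
        for (String line : procNetDev.split("\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // the two header lines contain no colon
            }
            String name = line.substring(0, colon).trim();
            String[] fields = line.substring(colon + 1).trim().split("\\s+");
            if (fields.length < 1 || fields[0].isEmpty()) {
                continue; // malformed line, skip it
            }
            // The first counter after the colon is the received byte count.
            totals.put(name, Long.parseLong(fields[0]));
        }
        return totals;
    }

    /** Fraction of bytes received between two snapshots that arrived on "lo". */
    static double loopbackShare(Map<String, Long> before, Map<String, Long> after) {
        long loopback = 0;
        long all = 0;
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long delta = e.getValue() - before.getOrDefault(e.getKey(), 0L);
            all += delta;
            if (e.getKey().equals("lo")) {
                loopback += delta;
            }
        }
        return all == 0 ? 0.0 : (double) loopback / all;
    }
}
```

      A share near 1.0 would match the expectation for node-local reads; a share near 0.5 would match the behavior reported above.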

      To confirm that, I set the number of map tasks to a very high number, which forced the split size down to about 27MB.
      The web GUI indicated that 99% of the map tasks had local splits, as expected.
      The ethernet interface monitor showed that almost 100% of the traffic went through the loopback interface, as it should.
      Also, the switch monitoring indicated that there was very little traffic through the links and across racks.
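      For context, the link between the requested number of map tasks and the split size can be sketched as follows. This is a minimal standalone sketch modeled on the split-size rule used by the old org.apache.hadoop.mapred.FileInputFormat as I understand it (split size = max(minSize, min(goalSize, blockSize)), with goalSize = total input bytes / requested maps). The input size and map counts below are hypothetical and are not taken from the report.

```java
// Standalone sketch; it does not depend on Hadoop. The class name and the
// numbers in main() are illustrative assumptions.
final class SplitSizeSketch {

    /** Target bytes per split when the job asks for a given number of maps. */
    static long goalSize(long totalBytes, int requestedMaps) {
        return totalBytes / Math.max(requestedMaps, 1);
    }

    /** A split never exceeds one block and never drops below minSize. */
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 128 * mb;          // dfs block size from the report
        long totalBytes = 1000 * blockSize; // hypothetical input size
        long minSize = 1;

        // Few requested maps: the goal exceeds a block, so the block size wins.
        long defaultSplit = computeSplitSize(goalSize(totalBytes, 2), minSize, blockSize);
        System.out.println("split with few maps (MB): " + defaultSplit / (double) mb);

        // Many requested maps: the goal drops below a block, so splits shrink
        // (to roughly 27MB for this hypothetical input and map count).
        long smallSplit = computeSplitSize(goalSize(totalBytes, 4740), minSize, blockSize);
        System.out.println("split with many maps (MB): " + smallSplit / (double) mb);
    }
}
```

      Under this rule a split can shrink below the block size but cannot grow beyond it, which is why leaving the map count unset yields one-block splits while a very large map count yields much smaller ones.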

      This implies that some corner cases are not handled properly.

      Attachments

        1. 3285-5.patch
          5 kB
          Owen O'Malley
        2. 3285-4.patch
          5 kB
          Owen O'Malley
        3. 3285-3.patch
          5 kB
          Owen O'Malley
        4. 3285.patch
          1 kB
          Owen O'Malley
        5. 3285.patch
          5 kB
          Owen O'Malley

          People

            Assignee: Owen O'Malley (omalley)
            Reporter: Runping Qi (runping)