  1. Hadoop Common
  2. HADOOP-4780

Task Tracker burns a lot of cpu in calling getLocalCache

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.19.0
    • Fix Version/s: 0.19.2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: Make DistributedCache remember the size of each cache directory.

      Description

      I have noticed that a task tracker frequently maxes out up to 6 CPUs.
      During that time, iostat showed that most of this was system CPU time.
      This situation can last for quite a long time.
      While it lasted, a number of threads were in the following state:

      java.lang.Thread.State: RUNNABLE
      at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
      at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:228)
      at java.io.File.exists(File.java:733)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:399)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.filecache.DistributedCache.getLocalCache(DistributedCache.java:176)
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:140)

      I suspect that getLocalCache is too expensive: as the trace shows, it recursively walks the cache directory tree through FileUtil.getDU.
      Calling it for every task initialization seems far too wasteful.
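
      The cost pattern in the trace, and the remedy named in the release note (remembering the size of each cache directory), can be illustrated with a minimal sketch. This is not the code from the attached patches. The class CacheSizeSketch, the methods du, sizeOf and invalidate, and the cachedSizes map are all hypothetical names chosen for illustration. For simplicity, du counts only the bytes of regular files, which may differ from what FileUtil.getDU reports.

```java
import java.io.File;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CacheSizeSketch {
    // Hypothetical table of remembered sizes, keyed by absolute directory path.
    // Illustrative only; not a field of the real DistributedCache.
    private static final Map<String, Long> cachedSizes = new ConcurrentHashMap<>();

    /**
     * Recursive disk-usage walk with the same shape as the getDU frames in the
     * trace: every file and directory costs at least one file-system call.
     * Assumption: only regular-file bytes are counted.
     */
    static long du(File path) {
        if (!path.exists()) {
            return 0L;
        }
        if (!path.isDirectory()) {
            return path.length();
        }
        long size = 0L;
        File[] children = path.listFiles();
        if (children != null) {
            for (File child : children) {
                size += du(child);
            }
        }
        return size;
    }

    /** Walks the tree only on the first call; later calls reuse the stored value. */
    static long sizeOf(File dir) {
        return cachedSizes.computeIfAbsent(dir.getAbsolutePath(), p -> du(new File(p)));
    }

    /** Forgets a remembered size, e.g. after the directory is changed or deleted. */
    static void invalidate(File dir) {
        cachedSizes.remove(dir.getAbsolutePath());
    }
}
```

      With something like this, task initialization would pay for the full walk once per cache directory rather than once per task. The trade-off is that a remembered size can go stale, so whatever code adds to or deletes from a cache directory would need to update or invalidate the entry.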

        Attachments

        1. Hadoop-4780-2.patch
          7 kB
          He Yongqiang
        2. 4780-2v19.patch
          7 kB
          Chris Douglas

              People

              • Assignee: He Yongqiang
              • Reporter: Runping Qi
              • Votes: 2
              • Watchers: 9
