  1. Hadoop Common
  2. HADOOP-4780

Task Tracker burns a lot of cpu in calling getLocalCache

Details

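    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.19.0
    • Fix Version/s: 0.19.2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: Make DistributedCache remember the size of each cache directory.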

    Description

      I have noticed many times that a task tracker maxes out up to 6 CPUs.
      During that time, iostat shows that the majority of the load is system CPU time.
      That situation can last for quite a long time.
      While it lasted, I saw a number of threads in the following state:

      java.lang.Thread.State: RUNNABLE
      at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
      at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:228)
      at java.io.File.exists(File.java:733)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:399)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
      at org.apache.hadoop.filecache.DistributedCache.getLocalCache(DistributedCache.java:176)
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:140)

      I suspect that getLocalCache is too expensive: the trace shows it walking the
      cache directory tree recursively through FileUtil.getDU.
      Calling it for every task initialization seems wasteful.
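
A minimal sketch of the direction the release note describes (remembering the size of each cache directory instead of re-walking it for every task). This is not the code from the attached patches: the class `CacheSizeTracker` and its methods are hypothetical names chosen for illustration, and the walk sums regular-file lengths only, which is a simplification of what `FileUtil.getDU` may count.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the code from the attached patches.
class CacheSizeTracker {
    // Remembered size in bytes of each cache directory.
    private final Map<File, Long> sizes = new HashMap<>();
    // Number of full directory walks performed, kept only to show the saving.
    private int walks = 0;

    // Recursive walk, similar in spirit to the repeated getDU frames in the trace.
    // Simplification (assumption): sums regular-file lengths only.
    static long diskUsage(File f) {
        if (!f.exists()) {
            return 0L;
        }
        if (!f.isDirectory()) {
            return f.length();
        }
        long total = 0L;
        File[] children = f.listFiles();
        if (children != null) {
            for (File child : children) {
                total += diskUsage(child);
            }
        }
        return total;
    }

    // Returns the remembered size, walking the directory only on first use.
    synchronized long sizeOf(File cacheDir) {
        Long known = sizes.get(cacheDir);
        if (known == null) {
            walks++;
            known = diskUsage(cacheDir);
            sizes.put(cacheDir, known);
        }
        return known;
    }

    // Total of all remembered sizes; no file system access is needed.
    synchronized long totalSize() {
        long total = 0L;
        for (long s : sizes.values()) {
            total += s;
        }
        return total;
    }

    // Drops the entry when a cache directory is deleted.
    synchronized void forget(File cacheDir) {
        sizes.remove(cacheDir);
    }

    synchronized int walkCount() {
        return walks;
    }
}
```

With this shape, per-task initialization would consult `totalSize()` and pay for a directory walk only when a new cache directory is first localized. A real fix would also need to update the remembered size whenever the directory contents change.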

      Attachments

        1. 4780-2v19.patch
          7 kB
          Christopher Douglas
        2. Hadoop-4780-2.patch
          7 kB
          He Yongqiang

            People

              Assignee: He Yongqiang
              Reporter: Runping Qi
              Votes: 2
              Watchers: 9

              Dates

                Created:
                Updated:
                Resolved: