  1. Hadoop YARN
  2. YARN-8326

Yarn 3.0 seems to run slower than Yarn 2.6



    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.2.0, 3.1.1
    • Component/s: yarn
    • Labels: None


      Hi, I am running test cases on Yarn 2.6 and Yarn 3.0 and found that performance is roughly twice as slow on Yarn 3.0, and it gets even slower as we acquire more containers. I compared the node manager logs on 2.6 and 3.0. Here is what I found.

      On 2.6, the full life cycle of one specific container, from beginning to end, takes about 8 seconds (9:53:50 to 9:53:58).

      On 3.0, the life cycle of a specific container takes 20 seconds to finish the same job (9:51:44 to 9:52:04).

      It seems that 3.0 spends an extra 5 seconds in monitor.ContainersMonitorImpl (marked in red), which does not happen on 2.6. Also, after the job is done and the container is exiting, 3.0 takes 5 seconds (9:51:59 to 9:52:04), whereas 2.6 takes less than half of that time (9:53:56 to 9:53:58).
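      The durations quoted above can be checked with a few lines of arithmetic. The following is only a sketch: it uses the timestamps exactly as written in this report, assumes each pair falls on the same day, and the helper name `seconds_between` is made up for illustration.

```python
from datetime import datetime

FMT = "%H:%M:%S"

def seconds_between(start: str, end: str) -> int:
    """Elapsed whole seconds between two same-day H:M:S timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())

# Timestamps copied from the report (hypothetical variable names).
lifetime_26 = seconds_between("9:53:50", "9:53:58")  # whole container, Yarn 2.6
lifetime_30 = seconds_between("9:51:44", "9:52:04")  # whole container, Yarn 3.0
exit_26 = seconds_between("9:53:56", "9:53:58")      # exit phase, Yarn 2.6
exit_30 = seconds_between("9:51:59", "9:52:04")      # exit phase, Yarn 3.0

print(f"container lifetime: {lifetime_26}s vs {lifetime_30}s "
      f"({lifetime_30 / lifetime_26:.1f}x)")
print(f"exit phase: {exit_26}s vs {exit_30}s")
```

      For this one container the gap is 8 seconds versus 20 seconds, and the exit phase alone accounts for 3 of the 12 extra seconds.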

         Since we run the same unit test cases and usually acquire more than 4 containers, these extra seconds add up to a huge performance issue. On 2.6 the unit tests run for 7 hours, while on 3.0 the same unit tests run for 11 hours. I was told this delay might be caused by Hadoop's new monitoring system, Timeline Service v2. Could someone take a look at this? Thanks for any help on this!
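         To put the suite-level numbers next to the per-container ones, here is a rough calculation. It assumes the 7-hour and 11-hour figures are rounded wall-clock totals for the same test suite; the variable names are illustrative only.

```python
# Total unit test run times quoted in the report, in hours.
hours_26 = 7
hours_30 = 11

slowdown = hours_30 / hours_26                       # ratio of total run times
extra_pct = (hours_30 - hours_26) / hours_26 * 100   # extra time relative to 2.6

print(f"suite slowdown: {slowdown:.2f}x ({extra_pct:.0f}% longer)")
```

         The whole suite is about 1.6 times slower, which is less than the 2.5 times seen for the single container above, presumably because not all of the test time is spent in container start-up and tear-down.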


        1. image-2018-05-18-15-20-33-839.png
          387 kB
          Hsin-Liang Huang
        2. image-2018-05-18-15-22-30-948.png
          585 kB
          Hsin-Liang Huang
        3. YARN-8326.001.patch
          2 kB
          Shane Kumpf

              Assignee: shanekumpf@gmail.com Shane Kumpf
              Reporter: hlhuang@us.ibm.com Hsin-Liang Huang