  1. Hadoop Map/Reduce
  2. MAPREDUCE-6530

JobTracker is slow when there are many JT UI requests


Details

    • Type: Bug
    • Status: Open
    • Priority: Blocker
    • Resolution: Unresolved
    • Affects Version/s: 1.2.1
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      JobTracker is slow when a huge number of jobs are running and 30
      connections are established to the info port to view job status and counters.

      hadoop job -list took 4m22.412s

      We took jstack traces and found most of the server threads waiting on the JobTracker object, while the thread that holds the JobTracker lock is itself waiting for the lock on the ResourceBundles class.

      "retireJobs" prio=10 tid=0x00007f2345200800 nid=0x11c1 waiting for monitor entry [0x00007f22e3499000]
         java.lang.Thread.State: BLOCKED (on object monitor)
          at org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56)
          - waiting to lock <0x0000000197cc6218> (a java.lang.Class for org.apache.hadoop.mapreduce.util.ResourceBundles)
          at org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterName(ResourceBundles.java:89)
          at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.localizeCounterName(FrameworkCounterGroup.java:135)
          at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.access$000(FrameworkCounterGroup.java:47)
          at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup$FrameworkCounter.getDisplayName(FrameworkCounterGroup.java:75)
          at org.apache.hadoop.mapred.Counters$Counter.getDisplayName(Counters.java:130)
          at org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:534)
          - locked <0x00000007f8411608> (a org.apache.hadoop.mapred.Counters)
          at org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgress.java:1728)
          at org.apache.hadoop.mapred.JobInProgress.getMapCounters(JobInProgress.java:1669)
          at org.apache.hadoop.mapred.JobTracker$RetireJobs.addToCache(JobTracker.java:657)
          - locked <0x000000009644ae08> (a org.apache.hadoop.mapred.JobTracker$RetireJobs)
          at org.apache.hadoop.mapred.JobTracker$RetireJobs.run(JobTracker.java:769)
          - locked <0x00000000964c5550> (a org.apache.hadoop.mapred.FairScheduler)
          - locked <0x000000009644a9d0> (a java.util.Collections$SynchronizedMap)
          - locked <0x00000000962ac660> (a org.apache.hadoop.mapred.JobTracker)
          at java.lang.Thread.run(Thread.java:745)

      The lock on the ResourceBundles class is held most of the time by JT UI threads serving jobtracker_jsp, which call getMapCounters().

      "926410165@qtp-1732070199-56" daemon prio=10 tid=0x00007f232c4df000 nid=0x27c0 runnable [0x00007f22db7bf000]
         java.lang.Thread.State: RUNNABLE
          at java.lang.Throwable.fillInStackTrace(Native Method)
          at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
          - locked <0x000000061a49ede0> (a java.util.MissingResourceException)
          at java.lang.Throwable.<init>(Throwable.java:287)
          at java.lang.Exception.<init>(Exception.java:84)
          at java.lang.RuntimeException.<init>(RuntimeException.java:80)
          at java.util.MissingResourceException.<init>(MissingResourceException.java:85)
          at java.util.ResourceBundle.throwMissingResourceException(ResourceBundle.java:1499)
          at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1322)
          at java.util.ResourceBundle.getBundle(ResourceBundle.java:1028)
          at org.apache.hadoop.mapreduce.util.ResourceBundles.getBundle(ResourceBundles.java:37)
          at org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56)
          - locked <0x0000000197cc6218> (a java.lang.Class for org.apache.hadoop.mapreduce.util.ResourceBundles)
          at org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterName(ResourceBundles.java:89)
          at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.localizeCounterName(FrameworkCounterGroup.java:135)
          at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.access$000(FrameworkCounterGroup.java:47)
          at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup$FrameworkCounter.getDisplayName(FrameworkCounterGroup.java:75)
          at org.apache.hadoop.mapred.Counters$Counter.getDisplayName(Counters.java:130)
          at org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:534)
          - locked <0x00000007ed1024b8> (a org.apache.hadoop.mapred.Counters)
          at org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgress.java:1728)
          at org.apache.hadoop.mapred.JobInProgress.getMapCounters(JobInProgress.java:1669)
          at org.apache.hadoop.mapred.JSPUtil.generateJobTable(JSPUtil.java:436)
          at org.apache.hadoop.mapred.jobtracker_jsp._jspService(jobtracker_jsp.java:202)
          at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)

      Every job updates its counters, and all 30 UI clients read these frequently updated counters, which leads to the JT slowness.
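      The traces show ResourceBundles.getValue holding the class monitor while ResourceBundle.getBundle builds a MissingResourceException, and the stack-trace fill happens on each such lookup. One possible direction, sketched here purely as an illustration, is to remember the outcome of a bundle lookup (including a miss) so that the exception is constructed at most once per bundle name and no global monitor is held on later calls. The class CachedBundleLookup and its LOADS counter are hypothetical names, not part of Hadoop, and the sketch assumes Java 8 or later.

```java
import java.util.Map;
import java.util.MissingResourceException;
import java.util.Optional;
import java.util.ResourceBundle;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper for illustration only; not an existing Hadoop class. */
final class CachedBundleLookup {
    // Remembers hits AND misses, keyed by bundle name, so a missing bundle
    // triggers ResourceBundle.getBundle (and its exception) only once.
    private static final Map<String, Optional<ResourceBundle>> BUNDLES =
            new ConcurrentHashMap<>();

    // Counts real lookups; exists only so the effect of the cache is observable.
    static final AtomicInteger LOADS = new AtomicInteger();

    private CachedBundleLookup() {}

    private static Optional<ResourceBundle> load(String bundleName) {
        LOADS.incrementAndGet();
        try {
            return Optional.of(ResourceBundle.getBundle(bundleName));
        } catch (MissingResourceException e) {
            return Optional.empty(); // cache the miss instead of retrying
        }
    }

    /** Returns the bundle value for key, or defaultValue if bundle or key is missing. */
    static String getValue(String bundleName, String key, String defaultValue) {
        Optional<ResourceBundle> bundle =
                BUNDLES.computeIfAbsent(bundleName, CachedBundleLookup::load);
        if (!bundle.isPresent()) {
            return defaultValue;
        }
        try {
            return bundle.get().getString(key);
        } catch (MissingResourceException e) {
            return defaultValue; // key not present in an existing bundle
        }
    }
}
```

      With this shape, repeated calls for the same missing bundle return the default from the map without creating another exception. Whether such a change is acceptable inside Hadoop's own ResourceBundles class is a separate question that this sketch does not answer.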

      With no JT UI requests, hadoop job -list completes in seconds.

      How can the JT slowness be fixed when 30 sessions want to view the job status and counters of a huge number of jobs running at the same time?

      Is there any workaround, such as caching in the JT UI or offloading part of the JT UI front page when load is heavy?
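      To make the caching idea concrete, here is a minimal sketch of a time-bounded snapshot cache: the expensive value (for example, the rendered job table) is rebuilt at most once per interval, and all other requests within that interval reuse the last result. SnapshotCache, its constructor parameters, and the injected clock are hypothetical and do not exist in Hadoop. The sketch assumes Java 8 or later and that slightly stale counters on the front page are acceptable.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical time-bounded cache for illustration; not an existing Hadoop class. */
final class SnapshotCache<T> {
    private final Supplier<T> loader;     // expensive computation, e.g. building the job table
    private final long ttlNanos;          // how long a snapshot may be reused
    private final LongSupplier nanoClock; // injected so the expiry logic can be tested

    private T value;            // guarded by this
    private long loadedAtNanos; // guarded by this
    private boolean loaded;     // guarded by this

    SnapshotCache(Supplier<T> loader, long ttl, TimeUnit unit, LongSupplier nanoClock) {
        this.loader = loader;
        this.ttlNanos = unit.toNanos(ttl);
        this.nanoClock = nanoClock;
    }

    /** Returns the cached snapshot, rebuilding it only when it is older than the TTL. */
    synchronized T get() {
        long now = nanoClock.getAsLong();
        if (!loaded || now - loadedAtNanos >= ttlNanos) {
            value = loader.get(); // only this call touches the expensive path
            loadedAtNanos = now;
            loaded = true;
        }
        return value;
    }
}
```

      A caller might construct it as new SnapshotCache<>(() -> renderJobTable(), 10, TimeUnit.SECONDS, System::nanoTime), where renderJobTable() is a placeholder for whatever produces the page content. Requests contend only on the cache's own monitor, and the loader runs once per interval regardless of how many UI sessions are open.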

          People

            Assignee: Unassigned
            Reporter: Prabhu Joseph (prabhujoseph)
            Votes: 0
            Watchers: 2
