  1. Hadoop Map/Reduce
  2. MAPREDUCE-7191

JobHistoryServer should log exception when loading/parsing history file failed

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.4, 3.3.0, 3.2.1, 3.1.3
    • Component/s: mrv2
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      I'm testing a rolling upgrade from 2.7.2 to 3.2.0.
      The RM and NMs have been upgraded to 3.2.0; the JobHistoryServer is still on 2.7.2.
      When I submitted an MR job using the 3.2.0 client, the JobHistory URL could not be opened, and the web page showed "Could not load history file hdfs://NameNode:8020/tmp/hadoop-yarn/staging/history3/done/2019/03/06/000000/job_1551697798944_0020****.jhist"

      The JobHistoryServer log file contains only the loading messages below and no exception information.

      2019-03-06 16:24:19,489 INFO org.apache.hadoop.mapreduce.v2.hs.CompletedJob: Loading job: job_1551697798944_0020 from file: hdfs://NameNode:8020/tmp/hadoop-yarn/staging/history3/done/2019/03/06/000000/job_1551697798944_0020****.jhist
      2019-03-06 16:24:19,489 INFO org.apache.hadoop.mapreduce.v2.hs.CompletedJob: Loading history file: [hdfs://NameNode:8020/tmp/hadoop-yarn/staging/history3/done/2019/03/06/000000/job_1551697798944_0020****.jhist]
      

      After I added logging for the case where loading the history file fails, I got the exception shown further below. 3.2.0 writes jhist files in binary format, while 2.7.2 uses JSON format. After I set mapreduce.jobhistory.jhist.format=json in the 3.2.0 client configuration, I could get job info from the JHS.

      Hadoop 3.2.0 still does not log this exception. I think adding such logging would be very helpful for debugging.
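The kind of logging requested here can be sketched as follows. This is a minimal, self-contained illustration and not the actual patch: `HistoryLoaderSketch`, its `Parser` interface, and `load` are hypothetical stand-ins for the JobHistoryServer loading code, and `java.util.logging` is used only so the sketch runs without Hadoop on the classpath.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the history loading code; not the real CompletedJob class.
class HistoryLoaderSketch {
    private static final Logger LOG =
        Logger.getLogger(HistoryLoaderSketch.class.getName());

    // Hypothetical parser abstraction so the sketch runs without Hadoop.
    interface Parser {
        String parse(String path) throws IOException;
    }

    // Parse the file; on failure, log the path together with the full
    // exception (including its stack trace) before rethrowing.
    static String load(String path, Parser parser) {
        try {
            return parser.parse(path);
        } catch (IOException e) {
            LOG.log(Level.WARNING, "Could not load history file " + path, e);
            throw new RuntimeException("Could not load history file " + path, e);
        }
    }

    public static void main(String[] args) {
        try {
            // Simulate a parser that rejects the file format.
            load("example.jhist", p -> {
                throw new IOException("Incompatible event log version: Avro-Binary");
            });
        } catch (RuntimeException e) {
            System.out.println("cause: " + e.getCause().getMessage());
        }
    }
}
```

Passing the exception object to the logger, rather than only the message, is what makes the root cause visible in the server log instead of just the generic "Could not load history file" text.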

      The exception thrown while loading the jhist file is as follows:

      2019-03-06 16:51:55,664 WARN org.apache.hadoop.mapreduce.v2.hs.CompletedJob: Could not load history file hdfs://NameNode:8020/tmp/hadoop-yarn/staging/history3/done/2019/03/06/000000/job_1551697798944_0020****.jhist
      java.io.IOException: Incompatible event log version: Avro-Binary
              at org.apache.hadoop.mapreduce.jobhistory.EventReader.<init>(EventReader.java:71)
              at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:139)
              at org.apache.hadoop.mapreduce.v2.hs.CompletedJob.loadFullHistoryData(CompletedJob.java:347)
              at org.apache.hadoop.mapreduce.v2.hs.CompletedJob.<init>(CompletedJob.java:101)
              at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$HistoryFileInfo.loadJob(HistoryFileManager.java:450)
              at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.loadJob(CachedHistoryStorage.java:180)
              at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.access$000(CachedHistoryStorage.java:52)
              at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage$1.load(CachedHistoryStorage.java:103)
              at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage$1.load(CachedHistoryStorage.java:100)
              at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
              at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
              at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
              at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
              at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
              at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3969)
              at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4829)
              at com.google.common.cache.LocalCache$LocalManualCache.getUnchecked(LocalCache.java:4834)
              at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.getFullJob(CachedHistoryStorage.java:193)
              at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getJob(JobHistory.java:217)
              at org.apache.hadoop.mapreduce.v2.app.webapp.AppController.requireJob(AppController.java:381)
              at org.apache.hadoop.mapreduce.v2.app.webapp.AppController.job(AppController.java:108)
              at org.apache.hadoop.mapreduce.v2.hs.webapp.HsController.job(HsController.java:104)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
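The top frame of the trace suggests the reader rejects the file based on its version header. A rough sketch of such a check is shown below. It is an illustration only and not the real `EventReader`: the class and method names are hypothetical, and it assumes the file begins with a version line and that a JSON-only reader expects `Avro-Json` there.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Illustrative only; not the real org.apache.hadoop.mapreduce.jobhistory.EventReader.
class VersionCheckSketch {
    // Assumption: a JSON-only reader expects this exact header line.
    static final String EXPECTED_VERSION = "Avro-Json";

    // Read the first line of the stream and reject any other format marker.
    static void checkVersion(InputStream in) throws IOException {
        BufferedReader reader =
            new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        String version = reader.readLine();
        if (!EXPECTED_VERSION.equals(version)) {
            throw new IOException("Incompatible event log version: " + version);
        }
    }

    // Helper that turns a string into an input stream for the demo.
    static InputStream streamOf(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        try {
            // A file written with a binary-format header fails the check.
            checkVersion(streamOf("Avro-Binary\n"));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions, a file written by a newer client in binary format fails at the first line, which matches the workaround of setting mapreduce.jobhistory.jhist.format=json on the client.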
      

      Attachments

        1. MAPREDUCE-7191.002.patch
          2 kB
          Jiandan Yang
        2. MAPREDUCE-7191.001.patch
          2 kB
          Jiandan Yang

          People

            Assignee: Jiandan Yang (yangjiandan)
            Reporter: Jiandan Yang (yangjiandan)
            Votes: 0
            Watchers: 4
