Hadoop Map/Reduce / MAPREDUCE-3908

jobhistory server trying to load job conf file from wrong location

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.23.0
    • Fix Version/s: None
    • Component/s: mrv1
    • Labels:
      None

      Description

      I have seen a few instances where clicking the job configuration link in the job history server web UI returns a 500 error. The job history server log file shows an exception like:

      2012-02-23 22:16:32,519 ERROR org.apache.hadoop.yarn.webapp.View: Error while reading hdfs://host.com:9000/home/hadoop/mapred/history/done_intermediate/user/job_1330033607650_0001_conf.xml
      java.io.FileNotFoundException: File does not exist: /home/hadoop/mapred/history/done_intermediate/user/job_1330033607650_0001_conf.xml
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:746)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:709)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:681)
      at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:302)
      at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:2

      If I look in HDFS, the file no longer exists in the done_intermediate directory; it now exists in the done directory structure: hdfs://host.com:9000/home/hadoop/mapred/history/done/2012/02/23/000000/job_1330033607650_0001_conf.xml

      I'm not exactly sure how to reproduce this, but I definitely see it every once in a while.

        Activity

        Thomas Graves added a comment -

        I should also note that restarting the job history server makes the issue go away: after the restart it loads the file from the right location in the done directory.

        Siddharth Seth added a comment -

        This happens when the job history file is initially read from the done_intermediate directory, and later moved over to the done directory. The cached CompletedJob object continues to hold a reference to the conf file in the intermediate directory.
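        The failure mode described above can be reproduced in miniature outside Hadoop. The following is only a sketch using local temporary directories and java.nio instead of HDFS; CachedJob, StaleConfPathSketch, and both read methods are hypothetical names for illustration, not the real CompletedJob code and not the change made in MAPREDUCE-3972. The done directory is also flattened here (no date subdirectories). It shows a cached entry that keeps the path it was first loaded from, and one possible way to recover by re-resolving the path when the cached location no longer exists.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class StaleConfPathSketch {

    /** Hypothetical stand-in for a cached job entry; not the real CompletedJob class. */
    static final class CachedJob {
        final String jobId;
        Path confFile; // location recorded when the entry was first loaded

        CachedJob(String jobId, Path confFile) {
            this.jobId = jobId;
            this.confFile = confFile;
        }
    }

    private final Map<String, CachedJob> cache = new HashMap<>();
    private final Path intermediateDir;
    private final Path doneDir;

    StaleConfPathSketch(Path intermediateDir, Path doneDir) {
        this.intermediateDir = intermediateDir;
        this.doneDir = doneDir;
    }

    /** Finds the conf file, preferring the done directory over the intermediate one. */
    Path resolve(String jobId) throws NoSuchFileException {
        String name = jobId + "_conf.xml";
        Path inDone = doneDir.resolve(name);
        if (Files.exists(inDone)) {
            return inDone;
        }
        Path inIntermediate = intermediateDir.resolve(name);
        if (Files.exists(inIntermediate)) {
            return inIntermediate;
        }
        throw new NoSuchFileException(name);
    }

    /** Returns the cached entry, creating it (and fixing its conf path) on first access. */
    CachedJob load(String jobId) throws IOException {
        CachedJob cached = cache.get(jobId);
        if (cached == null) {
            cached = new CachedJob(jobId, resolve(jobId));
            cache.put(jobId, cached);
        }
        return cached;
    }

    /** Trusts the cached path, so it fails once the file has been moved. */
    String readConfNaive(String jobId) throws IOException {
        return Files.readString(load(jobId).confFile);
    }

    /** Re-resolves the path when the cached location no longer exists. */
    String readConfRevalidating(String jobId) throws IOException {
        CachedJob job = load(jobId);
        if (!Files.exists(job.confFile)) {
            job.confFile = resolve(jobId);
        }
        return Files.readString(job.confFile);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("history");
        Path intermediate = Files.createDirectories(root.resolve("done_intermediate"));
        Path done = Files.createDirectories(root.resolve("done"));
        String jobId = "job_1330033607650_0001";
        Path conf = intermediate.resolve(jobId + "_conf.xml");
        Files.writeString(conf, "<configuration/>");

        StaleConfPathSketch server = new StaleConfPathSketch(intermediate, done);
        server.load(jobId); // cached while the file is still in done_intermediate

        // Simulate the history file being moved to the done directory.
        Files.move(conf, done.resolve(conf.getFileName()));

        try {
            server.readConfNaive(jobId);
            System.out.println("naive read succeeded");
        } catch (NoSuchFileException e) {
            System.out.println("naive read failed: cached path is stale");
        }
        System.out.println("revalidating read: " + server.readConfRevalidating(jobId));
    }
}
```

        In this sketch, a freshly constructed StaleConfPathSketch starts with an empty cache and resolves the file from the done directory, which parallels the observation that restarting the job history server makes the problem go away.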

        Siddharth Seth added a comment -

        Fixed by MAPREDUCE-3972


          People

          • Assignee: Unassigned
          • Reporter: Thomas Graves
          • Votes: 0
          • Watchers: 5
