Hadoop Map/Reduce / MAPREDUCE-6091

YARNRunner.getJobStatus() fails with ApplicationNotFoundException if the job rolled off the RM view

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.0-beta
    • Fix Version/s: 2.6.0
    • Component/s: client
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Calling YARNRunner.getJobStatus() for a job that has rolled off the RM view fails with an ApplicationNotFoundException. For example:

      2014-09-15 07:09:51,084 ERROR org.apache.pig.tools.grunt.Grunt: ERROR 6017: JobID: job_1410289045532_90542 Reason: java.io.IOException: org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1410289045532_90542' doesn't exist in RM.
      	at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:288)
      	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:150)
      	at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:337)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
      	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
      	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2058)
      	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2054)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1547)
      	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2052)
      
      	at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:348)
      	at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:419)
      	at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:559)
      	at org.apache.hadoop.mapreduce.Job$1.run(Job.java:314)
      	at org.apache.hadoop.mapreduce.Job$1.run(Job.java:311)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:396)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1547)
      	at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311)
      	at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:599)
      	at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.checkRunningState(ControlledJob.java:257)
      	at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.checkState(ControlledJob.java:282)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.pig.backend.hadoop23.PigJobControl.checkState(PigJobControl.java:120)
      	at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:180)
      	at java.lang.Thread.run(Thread.java:662)
      	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:279)
      

      Prior to 2.1.0, the client could fall back to the job history server and get the status from there.

      This appears to have been introduced by YARN-873, which changed ClientRMService to throw an ApplicationNotFoundException on an unknown app id instead of returning null. MR's ClientServiceDelegate was never updated to handle the new behavior.
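
      The fallback described above can be sketched as follows. This is a minimal illustration, not the attached patch: every type in it (AppNotFoundException, ResourceManagerClient, HistoryServerClient, StatusFallbackSketch) is a hypothetical stand-in and not a Hadoop class. The only detail taken from the report is the job id to application id mapping visible in the log line. The idea is that a "not found" exception from the RM is handled the same way the old null return was, by asking the job history server.

```java
import java.io.IOException;

// Sketch only: every type here is a hypothetical stand-in, not a Hadoop class.
class StatusFallbackSketch {

    /** Stand-in for YARN's ApplicationNotFoundException (local, hypothetical type). */
    static class AppNotFoundException extends Exception {
        AppNotFoundException(String message) {
            super(message);
        }
    }

    /** Stand-in for the RM client: throws on an unknown id instead of returning null. */
    interface ResourceManagerClient {
        String getApplicationState(String appId) throws AppNotFoundException, IOException;
    }

    /** Stand-in for the job history server client. */
    interface HistoryServerClient {
        String getJobState(String jobId) throws IOException;
    }

    /**
     * Asks the RM first. If the RM no longer knows the application,
     * falls back to the history server instead of propagating the error.
     */
    static String getJobStatus(String jobId,
                               ResourceManagerClient rm,
                               HistoryServerClient history) throws IOException {
        // job_<cluster>_<seq> corresponds to application_<cluster>_<seq>, as in the log above.
        String appId = jobId.replaceFirst("^job_", "application_");
        try {
            return rm.getApplicationState(appId);
        } catch (AppNotFoundException e) {
            // The job rolled off the RM view: treat this like the old null result.
            return history.getJobState(jobId);
        }
    }

    public static void main(String[] args) throws IOException {
        // An RM that has already forgotten every application.
        ResourceManagerClient rm = appId -> {
            throw new AppNotFoundException(
                "Application with id '" + appId + "' doesn't exist in RM.");
        };
        // A history server that still has a record of the job (made-up state value).
        HistoryServerClient history = jobId -> "SUCCEEDED";

        System.out.println(getJobStatus("job_1410289045532_90542", rm, history));
    }
}
```

      Only the "not found" case triggers the fallback. Other IOExceptions from the RM still propagate, so genuine connectivity problems are not masked by a history server lookup.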

        Attachments

        1. MAPREDUCE-6091.patch
          5 kB
          Sangjin Lee
        2. MAPREDUCE-6091.patch
          3 kB
          Sangjin Lee

              People

              • Assignee:
                sjlee0 Sangjin Lee
                Reporter:
                sjlee0 Sangjin Lee