  1. Hadoop Map/Reduce
  2. MAPREDUCE-6950

Error Launching job : java.io.IOException: Unknown Job job_xxx_xxx

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.7.1
    • Fix Version/s: None
    • Component/s: mr-am
    • Labels:
      None
    • Target Version/s:

      Description

      Some jobs report an error like this:

      hadoop.mapreduce.Job.monitorAndPrintJob(Job.java 1367) [main] :  map 100% reduce 100%
      [2017-08-31T20:27:12.591+08:00] [INFO] hadoop.mapred.ClientServiceDelegate.getProxy(ClientServiceDelegate.java 277) [main] : Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
      [2017-08-31T20:27:12.821+08:00] [INFO] hadoop.mapred.ClientServiceDelegate.getProxy(ClientServiceDelegate.java 277) [main] : Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
      [2017-08-31T20:27:13.039+08:00] [INFO] hadoop.mapred.ClientServiceDelegate.getProxy(ClientServiceDelegate.java 277) [main] : Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
      [2017-08-31T20:27:13.256+08:00] [ERROR] hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java 1034) [main] : Error Launching job : java.io.IOException: Unknown Job job_xxx_xxx
      

      I found the AM container log, shown below. It shows that the error happened in the HDFS write pipeline, possibly because of a DataNode error. I also found other causes that close the JobHistoryEventHandler. In all of these cases the MR AM can't write the job information for the JobHistory server, so the client can't tell whether the application has finished.

      2017-08-31 20:27:10,813 INFO [Thread-1968] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In stop, writing event MAP_ATTEMPT_STARTED
      2017-08-31 20:27:10,814 ERROR [Thread-1968] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Error writing History Event: org.apache.hadoop.mapreduce.jobhistory.TaskAttemptStartedEvent@2055ea0a
      java.io.EOFException: Premature EOF: no length prefix available
              at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2292)
              at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1317)
              at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
              at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
      2017-08-31 20:27:10,814 INFO [Thread-1968] org.apache.hadoop.service.AbstractService: Service JobHistoryEventHandler failed in state STOPPED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.EOFException: Premature EOF: no length prefix available
      org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.EOFException: Premature EOF: no length prefix available
              at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:580)
              at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:374) 
              at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
              at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
              at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
              at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
              at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
      

      This problem is serious, especially for Hive: the job has to be rerun even though it already succeeded. So I think we need to retry the operation of writing the history event.
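A minimal sketch of what such a retry could look like, written as a standalone helper rather than as a patch to JobHistoryEventHandler. The names `HistoryWriteRetry`, `IoAction` and `withRetry`, the attempt count and the backoff values are all hypothetical and are not part of Hadoop. A real fix would probably also need to reopen the failed HDFS output stream before retrying, which this sketch does not attempt.

```java
import java.io.EOFException;
import java.io.IOException;

class HistoryWriteRetry {

    /** Hypothetical stand-in for "write one history event"; may fail with an I/O error. */
    @FunctionalInterface
    public interface IoAction {
        void run() throws IOException;
    }

    /**
     * Runs the action, retrying on IOException with exponential backoff.
     * Rethrows the last IOException once maxAttempts attempts have failed.
     */
    public static void withRetry(IoAction action, int maxAttempts, long initialBackoffMs)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoff = initialBackoffMs;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.run();
                return; // write succeeded, stop retrying
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoff); // wait before the next attempt
                    backoff *= 2;          // double the wait each time
                }
            }
        }
        throw last; // every attempt failed
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated write: fails twice with the error from the AM log, then succeeds.
        withRetry(() -> {
            if (++calls[0] < 3) {
                throw new EOFException("Premature EOF: no length prefix available");
            }
        }, 5, 10L);
        System.out.println("succeeded after " + calls[0] + " attempts");
    }
}
```

Rethrowing the last exception keeps the current failure behaviour when the error is not transient, so the change would only affect cases where a later attempt succeeds.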

            People

            • Assignee:
              Unassigned
              Reporter:
              zhengchenyu
            • Votes:
              1
              Watchers:
              4

                Time Tracking

                Estimated: 1m
                Remaining: 1m
                Logged: Not Specified