Hadoop Map/Reduce / MAPREDUCE-4457

mr job invalid transition TA_TOO_MANY_FETCH_FAILURE at FAILED

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.23.3
    • Fix Version/s: 0.23.3, 2.0.2-alpha
    • Component/s: mrv2
    • Labels:
      None

      Description

      We saw a job go into the ERROR state because of an invalid state transition.

      3,600 INFO [AsyncDispatcher event handler]
      org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
      attempt_1342238829791_2501_m_007743_0 TaskAttempt Transitioned from SUCCEEDED
      to FAILED
      2012-07-16 08:49:53,600 INFO [AsyncDispatcher event handler]
      org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
      attempt_1342238829791_2501_m_008850_0 TaskAttempt Transitioned from SUCCEEDED
      to FAILED
      2012-07-16 08:49:53,600 INFO [AsyncDispatcher event handler]
      org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
      attempt_1342238829791_2501_m_017344_1000 TaskAttempt Transitioned from RUNNING
      to SUCCESS_CONTAINER_CLEANUP
      2012-07-16 08:49:53,601 ERROR [AsyncDispatcher event handler]
      org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this
      event at current state for attempt_1342238829791_2501_m_000027_0
      org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event:
      TA_TOO_MANY_FETCH_FAILURE at FAILED
      at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
      at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
      at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:954)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:133)
      at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:913)
      at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:905)
      at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
      at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.realDispatch(RecoveryService.java:285)
      at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.dispatch(RecoveryService.java:281)
      at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
      at java.lang.Thread.run(Thread.java:619)
      2012-07-16 08:49:53,601 INFO [AsyncDispatcher event handler]
      org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
      attempt_1342238829791_2501_m_029091_1000 TaskAttempt Transitioned from RUNNING
      to SUCCESS_CONTAINER_CLEANUP
      2012-07-16 08:49:53,601 INFO [IPC Server handler 17 on 47153]
      org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from
      attempt_1342238829791_2501_r_000461_1000

      It looks like we possibly got two TA_TOO_MANY_FETCH_FAILURE events. The first one moved the attempt to FAILED, and the second one then failed because the FAILED state has no valid transition for that event.
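
To make the suspected sequence concrete, here is a minimal, self-contained sketch of a table-driven state machine that fails the same way. It is a hypothetical model, not the real TaskAttemptImpl or StateMachineFactory code, and it is not the contents of MR-4457.txt. The class name AttemptStateSketch, the tolerateDuplicates flag, and the idea of handling the repeated event as a self-transition at FAILED are all assumptions made for illustration.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified model of the failure described in this issue.
// It does not use or reproduce the real Hadoop classes.
final class AttemptStateSketch {
    enum State { SUCCEEDED, FAILED }
    enum Event { TA_TOO_MANY_FETCH_FAILURE }

    // Transition table: current state -> (event -> next state).
    private final Map<State, Map<Event, State>> transitions =
            new EnumMap<State, Map<Event, State>>(State.class);
    private State current = State.SUCCEEDED;

    AttemptStateSketch(boolean tolerateDuplicates) {
        // Matches the log above: a SUCCEEDED attempt is moved to FAILED.
        register(State.SUCCEEDED, Event.TA_TOO_MANY_FETCH_FAILURE, State.FAILED);
        if (tolerateDuplicates) {
            // Assumed remedy: a repeated event at FAILED becomes a no-op.
            register(State.FAILED, Event.TA_TOO_MANY_FETCH_FAILURE, State.FAILED);
        }
    }

    private void register(State from, Event on, State to) {
        transitions
            .computeIfAbsent(from, k -> new EnumMap<Event, State>(Event.class))
            .put(on, to);
    }

    // Applies the event, or throws if the current state has no entry for it.
    State handle(Event event) {
        Map<Event, State> row = transitions.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException(
                    "Invalid event: " + event + " at " + current);
        }
        current = next;
        return current;
    }

    State state() {
        return current;
    }

    public static void main(String[] args) {
        AttemptStateSketch strict = new AttemptStateSketch(false);
        strict.handle(Event.TA_TOO_MANY_FETCH_FAILURE); // SUCCEEDED -> FAILED
        try {
            strict.handle(Event.TA_TOO_MANY_FETCH_FAILURE); // duplicate event
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        AttemptStateSketch tolerant = new AttemptStateSketch(true);
        tolerant.handle(Event.TA_TOO_MANY_FETCH_FAILURE);
        tolerant.handle(Event.TA_TOO_MANY_FETCH_FAILURE); // absorbed at FAILED
        System.out.println(tolerant.state());
    }
}
```

In the strict variant the second event has no table entry at FAILED, which is the condition reported in the stack trace. In the tolerant variant the duplicate is absorbed and the attempt stays in FAILED. Whether a duplicate should be ignored or prevented at the sender is a design choice this sketch does not settle.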

      1. MR-4457.txt
        6 kB
        Robert Joseph Evans

          Activity

          Thomas Graves created issue
          Robert Joseph Evans made changes
            Field Original Value New Value
            Assignee Robert Joseph Evans [ revans2 ]
          Robert Joseph Evans made changes
            Attachment MR-4457.txt [ 12538583 ]
          Robert Joseph Evans made changes
            Status Open [ 1 ] Patch Available [ 10002 ]
          Thomas Graves made changes
            Status Patch Available [ 10002 ] Resolved [ 5 ]
            Fix Version/s 0.23.3 [ 12320060 ]
            Fix Version/s 3.0.0 [ 12320355 ]
            Fix Version/s 2.2.0-alpha [ 12322471 ]
            Resolution Fixed [ 1 ]
          Arun C Murthy made changes
            Fix Version/s 3.0.0 [ 12320355 ]
          Arun C Murthy made changes
            Status Resolved [ 5 ] Closed [ 6 ]
          Gera Shegalov made changes
            Link This issue is related to MAPREDUCE-5409 [ MAPREDUCE-5409 ]

            People

            • Assignee:
              Robert Joseph Evans
              Reporter:
              Thomas Graves
            • Votes:
              0
              Watchers:
              6
