Apache Tez / TEZ-1493

Tez examples sometimes fail in cases where AM recovery kicks in

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • None
    • 0.5.0
    • None
    • None

    Description

      14/08/25 17:37:03 INFO client.TezClient: Submitting DAG to YARN, applicationId=application_1408499461970_0053, dagName=WordCount
      14/08/25 17:37:03 INFO impl.YarnClientImpl: Submitted application application_1408499461970_0053
      14/08/25 17:37:03 INFO client.TezClient: The url to track the Tez AM: http://jzhangMBPr.local:8088/proxy/application_1408499461970_0053/
      14/08/25 17:37:03 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
      14/08/25 17:37:03 INFO client.AHSProxy: Connecting to Application History server at /0.0.0.0:10200
      14/08/25 17:37:03 INFO rpc.DAGClientRPCImpl: Waiting for DAG to start running
      14/08/25 17:37:07 INFO rpc.DAGClientRPCImpl: DAG: State: RUNNING Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 0 Failed: 0 Killed: 0
      14/08/25 17:37:15 INFO rpc.DAGClientRPCImpl: DAG: State: RUNNING Progress: 50% TotalTasks: 2 Succeeded: 1 Running: 0 Failed: 0 Killed: 0
      14/08/25 17:37:17 INFO rpc.DAGClientRPCImpl: DAG completed. FinalState=SUBMITTED
      WordCount failed with diagnostics: []
      

      The client reports that the job failed, but the logs show that AM recovery worked on the server side and the job eventually finished successfully.
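
      The symptom above (the client stops at FinalState=SUBMITTED while the recovered AM is still working) can be illustrated with a minimal sketch of a polling loop that only stops on terminal states. This is not the attached patch and does not use the Tez client API; the class, the State enum, and the helper names are hypothetical stand-ins chosen for illustration.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: not the Tez client code and not the attached patch.
class DagStatePolling {

    /** Hypothetical stand-in for the DAG states a client may observe. */
    enum State { SUBMITTED, INITING, RUNNING, SUCCEEDED, KILLED, FAILED, ERROR }

    /** States after which no further progress is expected. */
    private static final Set<State> TERMINAL =
        EnumSet.of(State.SUCCEEDED, State.KILLED, State.FAILED, State.ERROR);

    static boolean isTerminal(State s) {
        return TERMINAL.contains(s);
    }

    /**
     * Consumes reported states until a terminal one is seen and returns it.
     * Returns null if the reports run out before a terminal state arrives.
     */
    static State waitForCompletion(Iterator<State> reports) {
        while (reports.hasNext()) {
            State s = reports.next();
            if (isTerminal(s)) {
                return s;
            }
            // SUBMITTED, INITING and RUNNING are not final: a recovering AM
            // may report SUBMITTED again, so keep polling.
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulated sequence: the DAG runs, the AM restarts and reports
        // SUBMITTED again, then the recovered DAG finishes.
        List<State> observed = Arrays.asList(
            State.RUNNING, State.RUNNING, State.SUBMITTED,
            State.RUNNING, State.SUCCEEDED);
        System.out.println(waitForCompletion(observed.iterator()));
    }
}
```

      With the simulated sequence in main, the loop skips the intermediate SUBMITTED report and prints SUCCEEDED, which matches what the server-side logs showed for this job.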

      Attachments

        1. Tez-1493-2.patch
          0.7 kB
          Jeff Zhang
        2. Tez-1493.patch
          0.7 kB
          Jeff Zhang

          People

            Assignee: Jeff Zhang (zjffdu)
            Reporter: Jeff Zhang (zjffdu)