Apache Tez / TEZ-832

Reduce task stalls due to a task failure and resumes on retry

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.3.0
    • Fix Version/s: 0.3.0
    • Component/s: None
    • Labels: None

    Description

      The reduce task for Query27 occasionally stalls with the following failure:

      2014-02-12 13:50:40,829 INFO [AsyncDispatcher event handler] org.apache.tez.dag.history.HistoryEventHandler: [HISTORY][DAG:dag_1392067467498_0279_3][Event:TASK_ATTEMPT_FINISHED]: vertexName=Reducer 3, taskAttemptId=attempt_1392067467498_0279_3_02_000102_0, startTime=1392241839503, finishTime=1392241840829, timeTaken=1326, status=FAILED, diagnostics=Error: java.lang.IllegalStateException: All inputs are expected to ask for memory
          at com.google.common.base.Preconditions.checkState(Preconditions.java:150)
          at org.apache.tez.runtime.common.resources.MemoryDistributor.makeInitialAllocations(MemoryDistributor.java:122)
          at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.initialize(LogicalIOProcessorRuntimeTask.java:195)
          at org.apache.hadoop.mapred.YarnTezDagChild$4.run(YarnTezDagChild.java:503)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:415)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
          at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:496)
      , counters=Counters: 17, org.apache.tez.common.counters.TaskCounter, MAP_OUTPUT_RECORDS=0, MAP_OUTPUT_BYTES=0, MAP_OUTPUT_MATERIALIZED_BYTES=0, COMBINE_INPUT_RECORDS=0, REDUCE_INPUT_GROUPS=0, REDUCE_SHUFFLE_BYTES=0, REDUCE_INPUT_RECORDS=0, SPILLED_RECORDS=0, SHUFFLED_MAPS=0, FAILED_SHUFFLE=0, MERGED_MAP_OUTPUTS=0, Shuffle Errors, BAD_ID=0, CONNECTION=0, IO_ERROR=0, WRONG_LENGTH=0, WRONG_MAP=0, WRONG_REDUCE=0
      2014-02-12 13:50:40,829 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.TaskAttemptImpl: attempt_1392067467498_0279_3_02_000102_0 TaskAttempt Transitioned from RUNNING to FAIL_IN_PROGRESS due to event TA_FAILED
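
      The stack trace above shows the attempt failing inside a state check in MemoryDistributor.makeInitialAllocations, reached from task initialization. As a rough illustration of how a guard of that kind can fail an attempt before any data is read, here is a minimal sketch. It is not the Tez implementation: the class MemoryRequestSketch, its method signatures, the "count of requests must equal count of inputs" rule, and the proportional scaling are all assumptions made for illustration. Only the exception type and message are taken from the log.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a memory distributor; NOT the Tez class.
final class MemoryRequestSketch {
    private final int expectedInputs;                            // inputs the task declares (assumed)
    private final Map<String, Long> requests = new HashMap<>();  // input name -> requested bytes

    MemoryRequestSketch(int expectedInputs) {
        this.expectedInputs = expectedInputs;
    }

    // Each input is assumed to call this once while it initializes.
    void requestMemory(String inputName, long bytes) {
        requests.put(inputName, bytes);
    }

    // Fails fast if any input never asked for memory, in the style of a
    // Guava Preconditions.checkState guard (written by hand here so the
    // sketch has no dependencies).
    Map<String, Long> makeInitialAllocations(long availableBytes) {
        if (requests.size() != expectedInputs) {
            throw new IllegalStateException("All inputs are expected to ask for memory");
        }
        long totalRequested = 0;
        for (long bytes : requests.values()) {
            totalRequested += bytes;
        }
        Map<String, Long> allocations = new HashMap<>();
        for (Map.Entry<String, Long> e : requests.entrySet()) {
            // Grant in full when everything fits; otherwise scale each request
            // down proportionally (an assumed policy, chosen for simplicity).
            long granted = totalRequested <= availableBytes
                    ? e.getValue()
                    : e.getValue() * availableBytes / totalRequested;
            allocations.put(e.getKey(), granted);
        }
        return allocations;
    }
}
```

      In this sketch, constructing the object for two inputs and calling makeInitialAllocations after only one of them has called requestMemory throws the same IllegalStateException message seen in the diagnostics. Under that assumption, a failure of this kind comes from initialization ordering rather than from the data, which would be consistent with the task succeeding on retry.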
      

      Attachments

        1. TEZ-832.3.txt
          3 kB
          Siddharth Seth
        2. TEZ-832.2.txt
          3 kB
          Siddharth Seth
        3. TEZ-832.1.txt
          0.8 kB
          Siddharth Seth

          People

            sseth Siddharth Seth
            gopalv Gopal Vijayaraghavan
            Votes: 0
            Watchers: 3
