Apache Tez / TEZ-878

doAssignAll() in TaskScheduler ignores delayedContainers being out of sync with heldContainers


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.3.0
    • Fix Version/s: 0.3.0
    • Component/s: None
    • Labels: None

    Description

      I have a single-node cluster. I ran a Tez DAG similar to the testInputFailureCausesRerunOfTwoVerticesWithoutExit unit test added in TEZ-823. The DAG does not exit, and the Tez AM log shows an NPE thrown on the DelayedContainerManager thread:

      2014-02-21 23:42:58,123 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.TaskImpl: task_1393025990757_0002_1_00_000000 Task Transitioned from SUCCEEDED to SCHEDULED
      2014-02-21 23:42:58,124 ERROR [DelayedContainerManager] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[DelayedContainerManager,5,main] threw an Exception.
      java.lang.NullPointerException
      	at org.apache.tez.dag.app.rm.TaskScheduler.assignContainer(TaskScheduler.java:1084)
      	at org.apache.tez.dag.app.rm.TaskScheduler.access$500(TaskScheduler.java:83)
      	at org.apache.tez.dag.app.rm.TaskScheduler$ContainerAssigner.doBookKeepingForAssignedContainer(TaskScheduler.java:1326)
      	at org.apache.tez.dag.app.rm.TaskScheduler$NodeLocalContainerAssigner.assignReUsedContainer(TaskScheduler.java:1351)
      	at org.apache.tez.dag.app.rm.TaskScheduler.assignReUsedContainerWithLocation(TaskScheduler.java:1231)
      	at org.apache.tez.dag.app.rm.TaskScheduler.assignReUsedContainersWithLocation(TaskScheduler.java:1193)
      	at org.apache.tez.dag.app.rm.TaskScheduler.tryAssignReUsedContainers(TaskScheduler.java:505)
      	at org.apache.tez.dag.app.rm.TaskScheduler.access$1000(TaskScheduler.java:83)
      	at org.apache.tez.dag.app.rm.TaskScheduler$DelayedContainerManager.doAssignAll(TaskScheduler.java:1554)
      	at org.apache.tez.dag.app.rm.TaskScheduler$DelayedContainerManager.run(TaskScheduler.java:1456)
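
The issue title points at delayedContainers drifting out of sync with heldContainers. The following is only a minimal sketch of a defensive pattern for that kind of mismatch, not the actual TaskScheduler code or the committed fix. The class DelayedContainerSketch, the nested HeldContainer type, the String container ids, and the assignAll() method are all hypothetical; only the two collection names are taken from the issue title.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class DelayedContainerSketch {
    // Hypothetical stand-in for the scheduler's per-container bookkeeping.
    static final class HeldContainer {
        final String id;
        HeldContainer(String id) { this.id = id; }
    }

    // Assumed shape: containers the scheduler still holds, keyed by id.
    final Map<String, HeldContainer> heldContainers = new HashMap<>();
    // Assumed shape: ids of containers waiting to be reused.
    final List<String> delayedContainers = new ArrayList<>();

    // Walks the delayed queue and returns the ids that were assigned.
    // An entry with no matching held container is treated as stale and
    // dropped instead of being dereferenced.
    List<String> assignAll() {
        List<String> assigned = new ArrayList<>();
        Iterator<String> it = delayedContainers.iterator();
        while (it.hasNext()) {
            String id = it.next();
            it.remove();
            HeldContainer held = heldContainers.get(id);
            if (held == null) {
                continue; // the two collections are out of sync for this id
            }
            assigned.add(held.id);
        }
        return assigned;
    }

    public static void main(String[] args) {
        DelayedContainerSketch s = new DelayedContainerSketch();
        s.heldContainers.put("c1", new HeldContainer("c1"));
        s.delayedContainers.add("c1");
        s.delayedContainers.add("c2"); // no longer held: stale entry
        System.out.println(s.assignAll()); // prints [c1]
    }
}
```

Skipping a stale entry silently is one possible choice. Logging it, or removing the entry from the delayed queue at the point where the container leaves heldContainers, may be preferable depending on how the real scheduler manages the two collections.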
      

          People

            Assignee: Bikas Saha (bikassaha)
            Reporter: Tassapol Athiapinya (tassapola)