TEZ-1587 (Apache Tez)

Some tez-examples fail in local mode

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.1
    • Component/s: None
    • Labels: None

    Description

      JoinExample runs indefinitely and never finishes:

      19:13:58,703 - Thread(Fetcher [hashSide] #1) - (HttpConnection.java:273) - Closing connection on fetcher [hashSide] 114
      19:13:58,703 - Thread(ShuffleRunner [hashSide]) - (ShuffleManager.java:270) - Scheduling fetch for inputHost: jzhangMBPr.local:0
      19:13:58,704 - Thread(ShuffleRunner [hashSide]) - (ShuffleManager.java:333) - Created Fetcher for host: jzhangMBPr.local, with inputs: []
      19:14:03,599 - Thread( main) - (DAGClientRPCImpl.java:444) - DAG: State: RUNNING Progress: 0% TotalTasks: 6 Succeeded: 0 Running: 1 Failed: 0 Killed: 0
      19:14:03,601 - Thread( main) - (DAGClientRPCImpl.java:444) - 	VertexStatus: VertexName: hashSide Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 0 Failed: 0 Killed: 0
      19:14:03,602 - Thread( main) - (DAGClientRPCImpl.java:444) - 	VertexStatus: VertexName: streamingSide Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 0 Failed: 0 Killed: 0
      19:14:03,604 - Thread( main) - (DAGClientRPCImpl.java:444) - 	VertexStatus: VertexName: joiner Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 1 Failed: 0 Killed: 0
      19:14:08,629 - Thread( main) - (DAGClientRPCImpl.java:444) - DAG: State: RUNNING Progress: 0% TotalTasks: 6 Succeeded: 0 Running: 1 Failed: 0 Killed: 0
      19:14:08,631 - Thread( main) - (DAGClientRPCImpl.java:444) - 	VertexStatus: VertexName: hashSide Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 0 Failed: 0 Killed: 0
      19:14:08,632 - Thread( main) - (DAGClientRPCImpl.java:444) - 	VertexStatus: VertexName: streamingSide Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 0 Failed: 0 Killed: 0
      19:14:08,633 - Thread( main) - (DAGClientRPCImpl.java:444) - 	VertexStatus: VertexName: joiner Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 1 Failed: 0 Killed: 0
      19:14:13,658 - Thread( main) - (DAGClientRPCImpl.java:444) - DAG: State: RUNNING Progress: 0% TotalTasks: 6 Succeeded: 0 Running: 1 Failed: 0 Killed: 0
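
      The stall is visible in the DAG-level status lines above: the state stays RUNNING while progress and the succeeded-task count never change between polls. A minimal sketch of checking this automatically is shown below. It assumes the client log format seen in this report; the helper names (`dag_snapshots`, `looks_stalled`) and the three-poll threshold are hypothetical and not part of Tez.

```python
import re

# Matches only the DAG-level status lines from the log format shown above.
# VertexStatus lines do not contain "- DAG: State:" and are skipped.
DAG_STATUS = re.compile(
    r"^\s*(?P<time>\d{2}:\d{2}:\d{2}),\d{3} .*? - DAG: State: (?P<state>\w+) "
    r"Progress: (?P<progress>\d+(?:\.\d+)?)% TotalTasks: (?P<total>\d+) "
    r"Succeeded: (?P<succeeded>\d+)"
)


def dag_snapshots(log_text):
    """Return (time, state, progress, succeeded) for each DAG status line."""
    snapshots = []
    for line in log_text.splitlines():
        match = DAG_STATUS.search(line)
        if match:
            snapshots.append((
                match.group("time"),
                match.group("state"),
                float(match.group("progress")),
                int(match.group("succeeded")),
            ))
    return snapshots


def looks_stalled(snapshots, min_polls=3):
    """True if the last min_polls snapshots are RUNNING with identical progress.

    min_polls is an arbitrary threshold chosen for illustration.
    """
    if len(snapshots) < min_polls:
        return False
    recent = snapshots[-min_polls:]
    all_running = all(state == "RUNNING" for _, state, _, _ in recent)
    unchanged = len({(progress, done) for _, _, progress, done in recent}) == 1
    return all_running and unchanged


# The three DAG-level lines copied from the log in this report.
SAMPLE = """\
19:14:03,599 - Thread( main) - (DAGClientRPCImpl.java:444) - DAG: State: RUNNING Progress: 0% TotalTasks: 6 Succeeded: 0 Running: 1 Failed: 0 Killed: 0
19:14:03,601 - Thread( main) - (DAGClientRPCImpl.java:444) - \tVertexStatus: VertexName: hashSide Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 0 Failed: 0 Killed: 0
19:14:08,629 - Thread( main) - (DAGClientRPCImpl.java:444) - DAG: State: RUNNING Progress: 0% TotalTasks: 6 Succeeded: 0 Running: 1 Failed: 0 Killed: 0
19:14:13,658 - Thread( main) - (DAGClientRPCImpl.java:444) - DAG: State: RUNNING Progress: 0% TotalTasks: 6 Succeeded: 0 Running: 1 Failed: 0 Killed: 0
"""

if __name__ == "__main__":
    print(looks_stalled(dag_snapshots(SAMPLE)))
```

      Run against the log excerpt in this report, the check finds three consecutive RUNNING polls at 0% with no succeeded tasks and reports the DAG as stalled. It only flags the symptom; it says nothing about the cause.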
      

      WordCount and OrderedWordCount fail with the following exception:

      19:16:47,499 - Thread( main) - (DAGClientRPCImpl.java:444) - DAG completed. FinalState=FAILED
      WordCount failed with diagnostics: [Vertex re-running, vertexName=Tokenizer, vertexId=vertex_1410779802886_0001_1_00, Vertex failed, vertexName=Summation, vertexId=vertex_1410779802886_0001_1_01, diagnostics=[Task failed, taskId=task_1410779802886_0001_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$ShuffleError: error in shuffle in fetcher [Tokenizer] #1
      	at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$RunShuffleCallable.call(Shuffle.java:335)
      	at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$RunShuffleCallable.call(Shuffle.java:1)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
      	at java.lang.Thread.run(Thread.java:695)
      Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
      	at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:375)
      	at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler.copyFailed(ShuffleScheduler.java:292)
      	at org.apache.tez.runtime.library.common.shuffle.impl.Fetcher.copyFromHost(Fetcher.java:274)
      	at org.apache.tez.runtime.library.common.shuffle.impl.Fetcher.run(Fetcher.java:160)
      , Container container_1410779802886_0001_00_000002 finished with diagnostics set to [TaskExecutionFailure: error in shuffle in fetcher [Tokenizer] #1]], TaskAttempt 1 failed, info=[Error: Failure while running task:org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$ShuffleError: error in shuffle in fetcher [Tokenizer] #2
      	at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$RunShuffleCallable.call(Shuffle.java:335)
      	at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$RunShuffleCallable.call(Shuffle.java:1)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
      	at java.lang.Thread.run(Thread.java:695)
      
      
      

      Attachments

        1. tez-1587.1.patch
          9 kB
          Prakash Ramachandran

          People

            Assignee: Prakash Ramachandran (pramachandran)
            Reporter: Jeff Zhang (zjffdu)
            Votes: 0
            Watchers: 3
