  1. Tajo (Retired)
  2. TAJO-1560

HashShuffle report should be ignored when it includes no succeeded tasks

Details

    Description

      Currently, a hash shuffle report is always sent to the stage. If one worker runs all of the tasks very quickly, the other workers receive a shouldDie message without having executed any task, but they still send a report.
      Additionally, range shuffle does not need a hash shuffle report, so waiting for one is just unnecessary.
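
The two conditions described above can be expressed as a single worker-side check. The following is only a minimal sketch of that idea: `ShuffleType`, `ShuffleReportPolicy`, and `shouldSendReport` are hypothetical names chosen for illustration, not Tajo's actual classes, and the attached patch may implement the rule differently.

```java
// Hypothetical, simplified shuffle kinds (not Tajo's actual enum).
enum ShuffleType { NONE, HASH, RANGE }

final class ShuffleReportPolicy {
    private ShuffleReportPolicy() {}

    /**
     * Decides whether a worker should send a hash shuffle report for an
     * execution block. A report is only useful when the block uses hash
     * shuffle and the worker actually completed at least one task.
     */
    static boolean shouldSendReport(ShuffleType type, int succeededTasks) {
        if (type != ShuffleType.HASH) {
            return false; // e.g. range shuffle: the stage has nothing to wait for
        }
        return succeededTasks > 0; // idle worker that only received shouldDie
    }
}
```

With this check, a worker that was stopped before running any task stays silent, so the stage only waits for reports that can contribute output.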

      2015-04-16 02:05:49,063 INFO org.apache.tajo.querymaster.Stage: Stage finalize - eb_1429088098190_1356_000001 (total=3, success=3, killed=0)
      2015-04-16 02:05:49,063 INFO org.apache.tajo.querymaster.DefaultTaskScheduler: TaskScheduler schedulingThread stopped
      2015-04-16 02:05:49,064 INFO org.apache.tajo.querymaster.DefaultTaskScheduler: Task Scheduler stopped
      2015-04-16 02:05:49,064 INFO org.apache.tajo.querymaster.QueryMaster: cleanup executionBlocks: 
      2015-04-16 02:05:49,064 INFO org.apache.tajo.worker.TaskRunner: Received ShouldDie flag:eb_1429088098190_1356_000001,container_1429088098190_1356_01_058889
      2015-04-16 02:05:49,064 INFO org.apache.tajo.worker.TaskRunner: Stop TaskRunner: eb_1429088098190_1356_000001,container_1429088098190_1356_01_058889
      2015-04-16 02:05:49,064 INFO org.apache.tajo.worker.TaskRunnerManager: Stop Task:eb_1429088098190_1356_000001,container_1429088098190_1356_01_058889
      2015-04-16 02:05:49,065 INFO org.apache.tajo.querymaster.Stage: eb_1429088098190_1356_000001, waiting for shuffle reports. expected Tasks:3
      2015-04-16 02:05:49,066 INFO org.apache.tajo.worker.TaskRunnerManager: ======================== Processing eb_1429088098190_1356_000001 of type STOP
      2015-04-16 02:05:49,066 INFO org.apache.tajo.storage.HashShuffleAppenderManager: Close HashShuffleAppender:eb_1429088098190_1356_000001, not a hash shuffle
      2015-04-16 02:05:49,066 INFO org.apache.tajo.storage.HashShuffleAppenderManager: Close HashShuffleAppender:eb_1429088098190_1356_000001, not a hash shuffle
      2015-04-16 02:05:49,066 INFO org.apache.tajo.worker.TaskRunnerManager: Stopped execution block:eb_1429088098190_1356_000001
      2015-04-16 02:05:49,066 INFO org.apache.tajo.querymaster.Stage: eb_1429088098190_1356_000001, Received shuffle report: 2/3
      2015-04-16 02:05:49,066 INFO org.apache.tajo.querymaster.Stage: eb_1429088098190_1356_000001, Finalized shuffle reports: 3
      2015-04-16 02:05:49,066 INFO org.apache.tajo.querymaster.Stage: Stage completed - eb_1429088098190_1356_000001 (total=3, success=3, killed=0)
      2015-04-16 02:05:49,066 INFO org.apache.tajo.querymaster.Query: Processing q_1429088098190_1356 of type STAGE_COMPLETED
      2015-04-16 02:05:49,066 INFO org.apache.tajo.querymaster.Stage: eb_1429088098190_1356_000002, Outer volume: 0.0MB, Inner volume: 1.0MB
      2015-04-16 02:05:49,066 INFO org.apache.tajo.querymaster.Stage: eb_1429088098190_1356_000002, Bigger Table's volume is approximately 1 MB
      2015-04-16 02:05:49,066 INFO org.apache.tajo.querymaster.Stage: eb_1429088098190_1356_000002, The determined number of join partitions is 1
      2015-04-16 02:05:49,066 INFO org.apache.tajo.querymaster.Stage: org.apache.tajo.querymaster.DefaultTaskScheduler is chosen for the task scheduling for eb_1429088098190_1356_000002
      2015-04-16 02:05:49,066 INFO org.apache.tajo.querymaster.Query: Scheduling Stage:eb_1429088098190_1356_000002
      2015-04-16 02:05:49,068 INFO org.apache.tajo.storage.FileStorageManager: Total input paths to process : 11
      2015-04-16 02:05:49,068 ERROR org.apache.tajo.querymaster.Stage: Can't handle this event at current state, eventType:SQ_SHUFFLE_REPORT, oldState:SUCCEEDED, nextState:SUCCEEDED
      org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: SQ_SHUFFLE_REPORT at SUCCEEDED
      	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
      	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
      	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
      	at org.apache.tajo.querymaster.Stage.handle(Stage.java:743)
      	at org.apache.tajo.querymaster.QueryMasterTask$StageEventDispatcher.handle(QueryMasterTask.java:226)
      	at org.apache.tajo.querymaster.QueryMasterTask$StageEventDispatcher.handle(QueryMasterTask.java:220)
      	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
      	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
      	at java.lang.Thread.run(Thread.java:745)
      2015-04-16 02:05:49,068 INFO org.apache.tajo.querymaster.QueryMaster: cleanup executionBlocks: 
      2015-04-16 02:05:49,069 INFO org.apache.tajo.querymaster.Query: Processing q_1429088098190_1356 of type STAGE_COMPLETED
      2015-04-16 02:05:49,069 INFO org.apache.tajo.querymaster.Query: Processing q_1429088098190_1356 of type QUERY_COMPLETED
      2015-04-16 02:05:49,069 INFO org.apache.tajo.querymaster.Query: q_1429088098190_1356 Query Transitioned from QUERY_RUNNING to QUERY_ERROR
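
In the log above, the stage finalizes its shuffle reports and reaches SUCCEEDED, and then one more SQ_SHUFFLE_REPORT arrives and is rejected as an invalid transition, which ends the query in QUERY_ERROR. One defensive option on the stage side is to drop reports that are late or that carry no succeeded tasks instead of treating them as errors. The sketch below models that with a hypothetical `StageReportTracker`; it assumes each report carries the number of succeeded tasks it covers, and it is a simplification, not Tajo's `Stage` state machine or the attached patch.

```java
// Hypothetical, simplified model of a stage waiting for shuffle reports.
final class StageReportTracker {
    enum State { WAITING_FOR_REPORTS, SUCCEEDED }

    private final int expectedTasks;
    private int reportedTasks;
    private State state;

    StageReportTracker(int expectedTasks) {
        this.expectedTasks = expectedTasks;
        // Nothing to wait for if no task output is expected.
        this.state = expectedTasks == 0 ? State.SUCCEEDED : State.WAITING_FOR_REPORTS;
    }

    /**
     * Handles one shuffle report.
     * Returns true if the report was counted, false if it was ignored.
     */
    synchronized boolean onShuffleReport(int succeededTasksInReport) {
        if (state == State.SUCCEEDED) {
            return false; // late report: ignore rather than fail the query
        }
        if (succeededTasksInReport <= 0) {
            return false; // report from a worker that ran no task
        }
        reportedTasks += succeededTasksInReport;
        if (reportedTasks >= expectedTasks) {
            state = State.SUCCEEDED;
        }
        return true;
    }

    synchronized State state() {
        return state;
    }
}
```

Ignoring the event on the stage side complements suppressing it on the worker side: even if an empty report is still sent, it can no longer push a finished stage into an error.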
      

      Attachments

        1. TAJO-1560.patch
          25 kB
          Jinho Kim
        2. TAJO-1560-branch-0.10.1.patch
          26 kB
          Jinho Kim

            People

              jhkim Jinho Kim
              jhkim Jinho Kim