  1. Tajo
  2. TAJO-1251

Query is hanging occasionally by shuffle report

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.10.0
    • Component/s: Data Shuffle, QueryMaster
    • Labels:
      None

      Description

      Currently, a query fails when a race condition occurs in SubQuery.waitingIntermediateReport(). If one event does not complete, all other events are blocked in the AsyncDispatcher.
      We should remove the thread lock from the event handler.

      2014-12-12 14:05:28,211 INFO org.apache.tajo.master.querymaster.SubQuery: eb_1412172843714_0001_000003, receiveExecutionBlockReport:3
      2014-12-12 14:05:28,227 ERROR org.apache.tajo.master.querymaster.SubQuery: eb_1412172843714_0001_000003, Timeout while receiving intermediate reports: 121471 ms
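      The blocking problem described above can be illustrated with a minimal sketch. It is not Tajo's code: the classes ShuffleReportSketch, Stage and Dispatcher, the method names, and the timeout values are all hypothetical. It assumes a single-threaded dispatcher that runs events one at a time, and shows a handler that records each shuffle report and returns immediately instead of waiting on a lock, so later events (including the remaining reports) are never stuck behind it.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class ShuffleReportSketch {
    // Hypothetical states; names chosen for illustration only.
    public enum StageState { RUNNING, FINALIZING, SUCCEEDED, FAILED }

    /** Hypothetical stage that counts shuffle reports without blocking. */
    static final class Stage {
        private final int expectedReports;
        private final long timeoutMs;
        private int receivedReports = 0;
        // A plain long is enough here: only the dispatcher thread touches it.
        private long lastContactTime;
        private StageState state = StageState.RUNNING;

        Stage(int expectedReports, long timeoutMs) {
            this.expectedReports = expectedReports;
            this.timeoutMs = timeoutMs;
        }

        void startFinalizing(long now) {
            state = StageState.FINALIZING;
            lastContactTime = now;
        }

        // Records one report and returns at once; no wait(), no lock.
        void onShuffleReport(long now) {
            if (state != StageState.FINALIZING) {
                return;
            }
            receivedReports++;
            lastContactTime = now;
            if (receivedReports == expectedReports) {
                state = StageState.SUCCEEDED;
            }
        }

        // The timeout is checked by a separate event, not by blocking a handler.
        void onTimerTick(long now) {
            if (state == StageState.FINALIZING && now - lastContactTime > timeoutMs) {
                state = StageState.FAILED;
            }
        }

        StageState state() {
            return state;
        }
    }

    /** Assumed single-threaded dispatcher: events run in arrival order. */
    static final class Dispatcher {
        private final Queue<Runnable> queue = new ArrayDeque<>();

        void post(Runnable event) {
            queue.add(event);
        }

        void drain() {
            Runnable event;
            while ((event = queue.poll()) != null) {
                event.run();
            }
        }
    }

    /** Posts `delivered` reports for a stage expecting `expected`, then a late timer tick. */
    public static StageState simulate(int expected, int delivered) {
        Dispatcher dispatcher = new Dispatcher();
        Stage stage = new Stage(expected, 1000L);
        dispatcher.post(() -> stage.startFinalizing(0L));
        for (int i = 1; i <= delivered; i++) {
            final long timestamp = i;
            dispatcher.post(() -> stage.onShuffleReport(timestamp));
        }
        dispatcher.post(() -> stage.onTimerTick(5000L));
        dispatcher.drain();
        return stage.state();
    }

    public static void main(String[] args) {
        System.out.println(simulate(3, 3)); // all reports arrive
        System.out.println(simulate(3, 2)); // one report is missing
    }
}
```

      If a handler in this sketch blocked until every report had arrived, the reports queued behind it could never be processed, and the stage would only leave that state through the timeout, which matches the symptom in the log above.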
      
      1. querymaster.png
        476 kB
        Jinho Kim
      2. Tajo.gv
        9 kB
        Jinho Kim

          Activity

          githubbot ASF GitHub Bot added a comment -

          GitHub user jinossy opened a pull request:

          https://github.com/apache/tajo/pull/339

          TAJO-1251: Query is hanging occasionally by shuffle report.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/jinossy/tajo TAJO-1251

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/tajo/pull/339.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #339


          commit 2a0b095bb5a1f9b551a417028cc4ca3c06766501
          Author: jhkim <jhkim@apache.org>
          Date: 2015-01-08T16:04:39Z

          TAJO-1251: Query is hanging occasionally by shuffle report.

          commit 9a201d3d0ffaa36453ae665fe6855a33aeff1f46
          Author: jhkim <jhkim@apache.org>
          Date: 2015-01-08T16:12:02Z

          TAJO-1251: Query is hanging occasionally by shuffle report.


          jhkim Jinho Kim added a comment -

          I’ve added a FINALIZE event to the stage.

          githubbot ASF GitHub Bot added a comment -

          Github user hyunsik commented on the pull request:

          https://github.com/apache/tajo/pull/339#issuecomment-69296809

          I have some suggestions:

          • StageState.FINALIZE should be StageState.FINALIZING because it's a continuing state while receiving shuffle reports.
          • lastContactTime does not need to be an AtomicLong because updates to lastContactTime will be serialized by the event handler.
          githubbot ASF GitHub Bot added a comment -

          Github user hyunsik commented on the pull request:

          https://github.com/apache/tajo/pull/339#issuecomment-69296837

          Everything else looks good to me.

          githubbot ASF GitHub Bot added a comment -

          Github user hyunsik commented on the pull request:

          https://github.com/apache/tajo/pull/339#issuecomment-69298254

          The event name ``SQ_STAGE_FINALIZE`` is not intuitive regarding what this event does. How about ``SQ_SHUFFLE_REPORT`` instead of ``SQ_STAGE_FINALIZE``?

          githubbot ASF GitHub Bot added a comment -

          Github user jinossy commented on the pull request:

          https://github.com/apache/tajo/pull/339#issuecomment-69300636

          Thank you for your suggestion.
          I will update the patch.

          githubbot ASF GitHub Bot added a comment -

          Github user hyunsik commented on the pull request:

          https://github.com/apache/tajo/pull/339#issuecomment-69312533

          +1 ship it. The patch looks nice to me.

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/tajo/pull/339

          jhkim Jinho Kim added a comment -

          Committed it.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Tajo-master-build #546 (See https://builds.apache.org/job/Tajo-master-build/546/)
          TAJO-1251: Query is hanging occasionally by shuffle report. (jinho) (jhkim: rev 50a8a663c2c95f14ca59f3b01ffd79b2578f7f09)

          • tajo-core/src/main/java/org/apache/tajo/master/event/StageShuffleReportEvent.java
          • tajo-dist/pom.xml
          • tajo-core/src/main/java/org/apache/tajo/querymaster/StageState.java
          • tajo-core/src/main/java/org/apache/tajo/querymaster/QueryMasterManagerService.java
          • tajo-core/src/main/java/org/apache/tajo/querymaster/Query.java
          • CHANGES
          • tajo-core/src/main/java/org/apache/tajo/querymaster/Stage.java
          • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
          • tajo-core/src/main/java/org/apache/tajo/master/event/StageEventType.java
          hudson Hudson added a comment -

          ABORTED: Integrated in Tajo-master-CODEGEN-build #185 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/185/)
          TAJO-1251: Query is hanging occasionally by shuffle report. (jinho) (jhkim: rev 50a8a663c2c95f14ca59f3b01ffd79b2578f7f09)

          • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
          • tajo-core/src/main/java/org/apache/tajo/master/event/StageShuffleReportEvent.java
          • tajo-core/src/main/java/org/apache/tajo/querymaster/Query.java
          • tajo-dist/pom.xml
          • tajo-core/src/main/java/org/apache/tajo/master/event/StageEventType.java
          • tajo-core/src/main/java/org/apache/tajo/querymaster/StageState.java
          • tajo-core/src/main/java/org/apache/tajo/querymaster/Stage.java
          • CHANGES
          • tajo-core/src/main/java/org/apache/tajo/querymaster/QueryMasterManagerService.java

            People

            • Assignee:
              jhkim Jinho Kim
              Reporter:
              jhkim Jinho Kim
            • Votes:
              0
              Watchers:
              3
