  1. Tajo
  2. TAJO-1056

Wrong resource release or wrong task scheduling

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.9.0, block_iteration
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels: None

      Description

      The worker receives a shouldDie message and then removes the ExecutionBlockContext along with its resources. However, subsequent tasks that belong to this execution block are still scheduled to this worker, and they cause an NPE.

      Please take a look at the following log:

      2014-09-20 07:05:21,894 INFO org.apache.tajo.worker.Task: ==================================
      2014-09-20 07:05:21,894 INFO org.apache.tajo.worker.Task: * Subquery ta_1411164263773_0003_000001_000013_00 is initialized
      2014-09-20 07:05:21,894 INFO org.apache.tajo.worker.Task: * InterQuery: true, Use RANGE_SHUFFLE shuffle, Fragments (num: 1), Fetches (total:0) :
      2014-09-20 07:05:21,894 INFO org.apache.tajo.worker.Task: * Local task dir: file:/data01/tajo/data/q_1411164263773_0003/output/1/13_0
      2014-09-20 07:05:21,894 INFO org.apache.tajo.worker.Task: ==================================
      2014-09-20 07:05:21,895 INFO org.apache.tajo.worker.TaskAttemptContext: Query status of ta_1411164263773_0003_000001_000013_00 is changed to TA_RUNNING
      2014-09-20 07:05:21,895 INFO org.apache.tajo.worker.TaskRunner: Accumulated Received Task: 1
      2014-09-20 07:05:21,895 INFO org.apache.tajo.worker.TaskRunner: Initializing: ta_1411164263773_0003_000001_000017_00
      2014-09-20 07:05:21,895 INFO org.apache.tajo.worker.TaskAttemptContext: Query status of ta_1411164263773_0003_000001_000016_00 is changed to TA_PENDING
      2014-09-20 07:05:21,895 INFO org.apache.tajo.worker.Task: ==================================
      2014-09-20 07:05:21,895 INFO org.apache.tajo.worker.Task: * Subquery ta_1411164263773_0003_000001_000016_00 is initialized
      2014-09-20 07:05:21,895 INFO org.apache.tajo.worker.Task: * InterQuery: true, Use RANGE_SHUFFLE shuffle, Fragments (num: 1), Fetches (total:0) :
      2014-09-20 07:05:21,895 INFO org.apache.tajo.worker.Task: * Local task dir: file:/data09/tajo/data/q_1411164263773_0003/output/1/16_0
      2014-09-20 07:05:21,895 INFO org.apache.tajo.worker.Task: ==================================
      2014-09-20 07:05:21,896 INFO org.apache.tajo.worker.TaskAttemptContext: Query status of ta_1411164263773_0003_000001_000016_00 is changed to TA_RUNNING
      2014-09-20 07:05:21,898 INFO org.apache.tajo.worker.TaskRunner: Received ShouldDie flag:eb_1411164263773_0003_000001,container_1411164263773_0003_01_000063
      2014-09-20 07:05:21,898 INFO org.apache.tajo.worker.TaskRunner: Stop TaskRunner: eb_1411164263773_0003_000001,container_1411164263773_0003_01_000063
      2014-09-20 07:05:21,898 INFO org.apache.tajo.worker.TaskRunnerManager: Stop Task:eb_1411164263773_0003_000001,container_1411164263773_0003_01_000063
      2014-09-20 07:05:21,899 INFO org.apache.tajo.worker.TaskRunnerManager: ======================== Processing eb_1411164263773_0003_000001 of type STOP
      2014-09-20 07:05:21,899 INFO org.apache.tajo.storage.HashShuffleAppenderManager: Close HashShuffleAppender:eb_1411164263773_0003_000001, not a hash shuffle
      2014-09-20 07:05:21,899 INFO org.apache.tajo.worker.TaskAttemptContext: Query status of ta_1411164263773_0003_000001_000006_00 is changed to TA_FAILED
      2014-09-20 07:05:21,899 INFO org.apache.tajo.worker.TaskAttemptContext: Query status of ta_1411164263773_0003_000001_000016_00 is changed to TA_FAILED
      2014-09-20 07:05:21,899 INFO org.apache.tajo.worker.TaskAttemptContext: Query status of ta_1411164263773_0003_000001_000009_00 is changed to TA_FAILED
      2014-09-20 07:05:21,899 INFO org.apache.tajo.worker.TaskAttemptContext: Query status of ta_1411164263773_0003_000001_000011_00 is changed to TA_FAILED
      2014-09-20 07:05:21,899 INFO org.apache.tajo.worker.TaskAttemptContext: Query status of ta_1411164263773_0003_000001_000013_00 is changed to TA_FAILED
      2014-09-20 07:05:21,899 INFO org.apache.tajo.storage.HashShuffleAppenderManager: Close HashShuffleAppender:eb_1411164263773_0003_000001, not a hash shuffle
      2014-09-20 07:05:21,899 INFO org.apache.tajo.worker.TaskRunnerManager: Stopped execution block:eb_1411164263773_0003_000001
      2014-09-20 07:05:21,900 INFO org.apache.tajo.worker.TaskAttemptContext: Query status of ta_1411164263773_0003_000001_000017_00 is changed to TA_PENDING
      2014-09-20 07:05:21,900 INFO org.apache.tajo.worker.Task: ==================================
      2014-09-20 07:05:21,900 INFO org.apache.tajo.worker.Task: * Subquery ta_1411164263773_0003_000001_000017_00 is initialized
      2014-09-20 07:05:21,901 INFO org.apache.tajo.worker.Task: * InterQuery: true, Use RANGE_SHUFFLE shuffle, Fragments (num: 1), Fetches (total:0) :
      2014-09-20 07:05:21,901 INFO org.apache.tajo.worker.Task: * Local task dir: file:/data07/tajo/data/q_1411164263773_0003/output/1/17_0
      2014-09-20 07:05:21,901 ERROR org.apache.tajo.worker.Task: >>>>>>>>> compilationContext is NULL
      java.lang.NullPointerException: >>>>>>>>> compilationContext is NULL
              at org.apache.tajo.worker.ExecutionBlockSharedResource.getCompiledComparator(ExecutionBlockSharedResource.java:121)
              at org.apache.tajo.engine.planner.physical.SortExec.<init>(SortExec.java:48)
              at org.apache.tajo.engine.planner.physical.ExternalSortExec.<init>(ExternalSortExec.java:104)
              at org.apache.tajo.engine.planner.physical.ExternalSortExec.<init>(ExternalSortExec.java:139)
              at org.apache.tajo.engine.planner.PhysicalPlannerImpl.createBestSortPlan(PhysicalPlannerImpl.java:1122)
              at org.apache.tajo.engine.planner.PhysicalPlannerImpl.createSortPlan(PhysicalPlannerImpl.java:1117)
              at org.apache.tajo.engine.planner.PhysicalPlannerImpl.createPlanRecursive(PhysicalPlannerImpl.java:206)
              at org.apache.tajo.engine.planner.PhysicalPlannerImpl.createPlan(PhysicalPlannerImpl.java:87)
              at org.apache.tajo.worker.TajoQueryEngine.createPlan(TajoQueryEngine.java:44)
              at org.apache.tajo.worker.Task.run(Task.java:434)
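
      In the log above, task 000017 is initialized at 07:05:21,900, after "Stopped execution block" was logged at 07:05:21,899, and it then fails inside getCompiledComparator because the shared resource is already gone. The sketch below illustrates that ordering with a worker-side guard that rejects a late task with a clear message instead of dereferencing a released context. It is only an illustration: the class, method names, and IDs are hypothetical and are not Tajo's actual code, and the committed fix for this issue was made in DefaultTaskScheduler.java, not in a guard like this one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; none of these names are Tajo's real classes. */
final class ExecutionBlockRegistrySketch {

    /** Stand-in for per-execution-block shared state such as compiled comparators. */
    static final class SharedResource {
        final String executionBlockId;

        SharedResource(String executionBlockId) {
            this.executionBlockId = executionBlockId;
        }
    }

    private final Map<String, SharedResource> contexts = new ConcurrentHashMap<>();

    /** Registers the shared resource when the execution block starts on this worker. */
    void start(String executionBlockId) {
        contexts.putIfAbsent(executionBlockId, new SharedResource(executionBlockId));
    }

    /** Releases the shared resource; returns true only for the call that removed it. */
    boolean stop(String executionBlockId) {
        return contexts.remove(executionBlockId) != null;
    }

    /** Runs a task only if its execution block is still registered. */
    String runTask(String executionBlockId, String taskId) {
        SharedResource resource = contexts.get(executionBlockId);
        if (resource == null) {
            // The block was already stopped: reject the task instead of hitting an NPE later.
            return "REJECTED " + taskId + ": execution block " + executionBlockId + " already stopped";
        }
        return "RUNNING " + taskId;
    }

    public static void main(String[] args) {
        ExecutionBlockRegistrySketch worker = new ExecutionBlockRegistrySketch();
        worker.start("eb_0003_000001");
        System.out.println(worker.runTask("eb_0003_000001", "ta_000013"));
        worker.stop("eb_0003_000001");
        // A task scheduled after the stop, like task 000017 in the log.
        System.out.println(worker.runTask("eb_0003_000001", "ta_000017"));
    }
}
```

      A guard like this only makes the failure explicit on the worker. The scheduler still has to stop assigning tasks to a worker whose execution block has been released, or avoid sending the stop while tasks are pending.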
      

        Activity

        jhkim Jinho Kim added a comment -

        Thank you for reporting this.
        It was caused by TAJO-1015. I forgot to remove an additional cleanup event.

        githubbot ASF GitHub Bot added a comment -

        GitHub user jinossy opened a pull request:

        https://github.com/apache/tajo/pull/151

        TAJO-1056: Wrong resource release or wrong task scheduling.

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/jinossy/tajo TAJO-1056

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/151.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #151


        commit 2575420f9ebf44c621b5237127c787e3b2dffeb2
        Author: jhkim <jhkim@apache.org>
        Date: 2014-09-20T06:13:13Z

        TAJO-1056: Wrong resource release or wrong task scheduling.


        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/151#issuecomment-56259290

        +1
        The bug fix is straightforward.

        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/151

        jhkim Jinho Kim added a comment -

        I committed it.
        Thanks for the quick review.

        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #367 (See https://builds.apache.org/job/Tajo-master-build/367/)
        TAJO-1056: Wrong resource release or wrong task scheduling. (jinho) (jhkim: rev 621d9145ff5f2d2551ebd3fce11a87f23413201e)

        • tajo-core/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java
        • CHANGES
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-CODEGEN-build #9 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/9/)
        TAJO-1056: Wrong resource release or wrong task scheduling. (jinho) (jhkim: rev 621d9145ff5f2d2551ebd3fce11a87f23413201e)

        • tajo-core/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java
        • CHANGES
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-block_iteration-branch-build #4 (See https://builds.apache.org/job/Tajo-block_iteration-branch-build/4/)
        TAJO-1056: Wrong resource release or wrong task scheduling. (jinho) (jhkim: rev 621d9145ff5f2d2551ebd3fce11a87f23413201e)

        • tajo-core/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java
        • CHANGES

          People

          • Assignee: jhkim Jinho Kim
          • Reporter: hyunsik Hyunsik Choi
          • Votes: 0
          • Watchers: 3
