
Aggregation on a derived table which includes union can cause incorrect result

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.11.0, 0.11.1
    • Fix Version/s: 0.12.0, 0.11.2
    • Component/s: distributed query plan
    • Labels: None

      Description

      This problem can be reproduced by running the following query on a 10 GB TPC-H data set.

      select 
        sum(t.cnt) as cnt, o_orderkey, o_custkey 
      from 
        (
          select 
            o_orderkey, o_custkey, CAST(COUNT(1) AS INT4) as cnt 
          from 
            orders 
          group by 
            o_orderkey, o_custkey 
          union all 
          select 
            o_orderkey, o_custkey, o_shippriority 
          from 
            orders
        ) as t 
      group by 
        o_orderkey, o_custkey
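
To make the expected semantics concrete, here is a minimal sketch that runs the same query shape against a tiny in-memory SQLite table. It is not Tajo code: the sample rows, the `expected_result` helper, and the use of SQLite as a reference engine are all assumptions made for illustration, and `INT4` is written as `INTEGER` for SQLite. The ticket does not say what the incorrect output looked like, so the sketch only shows what a correct engine should return.

```python
import sqlite3

# Hypothetical miniature of the TPC-H orders table; only the three columns
# that the query touches are modeled.
ROWS = [
    (1, 10, 0),
    (2, 20, 0),
    (3, 10, 5),
]

# Same shape as the query in the description, with INT4 replaced by INTEGER
# and an ORDER BY added so the output order is deterministic.
QUERY = """
select sum(t.cnt) as cnt, o_orderkey, o_custkey
from (
  select o_orderkey, o_custkey, cast(count(1) as integer) as cnt
  from orders
  group by o_orderkey, o_custkey
  union all
  select o_orderkey, o_custkey, o_shippriority
  from orders
) as t
group by o_orderkey, o_custkey
order by o_orderkey
"""


def expected_result(rows):
    """Run the query on an in-memory SQLite table and return all result rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table orders ("
        "o_orderkey integer, o_custkey integer, o_shippriority integer)"
    )
    conn.executemany("insert into orders values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in expected_result(ROWS):
        print(row)
```

When o_orderkey is unique, each group receives exactly one row from each side of the union, so cnt should equal 1 + o_shippriority for every group. A result that deviates from this on real data would point to the problem described here.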
      

        Activity

        githubbot ASF GitHub Bot added a comment -

        GitHub user jihoonson opened a pull request:

        https://github.com/apache/tajo/pull/969

        TAJO-2082: Aggregation on a derived table which includes union can cause incorrect result

        I didn't add a unit test because this bug is difficult to reproduce in a unit test.
        IMO, it is better to add a verifier that tests the global plan. I created a JIRA ticket for a global plan verifier (https://issues.apache.org/jira/browse/TAJO-2084).

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/jihoonson/tajo-2 TAJO-2082

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/969.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #969


        commit b47ba9a9016c1bb7f763157c61915491af58c12f
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2016-02-25T08:06:18Z

        Add shuffle info to master plan

        commit db17a92b91fc648dbd5523cfc2b08df19958f448
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2016-02-25T15:19:55Z

        Add PlanContext.

        commit 00943976a1ba6172082f9b1f10e2875aec440145
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2016-02-26T09:26:42Z

        TAJO-2082

        commit c1af7a88663d551d99da4081e8b02e7b99914861
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2016-02-29T07:37:50Z

        Refactoring stage

        commit 3ee7bbbc464eaacd92b4c86fde62c9adb98ff32a
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2016-03-01T05:32:48Z

        Test finished.

        commit 5ac73773f867bf234ec86c49adc116b99792b695
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2016-03-01T05:40:47Z

        fix comment


        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/969#discussion_r54996504

        --- Diff: tajo-core/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java ---
        @@ -23,8 +23,10 @@

         import org.apache.tajo.ExecutionBlockId;
         import org.apache.tajo.QueryId;
        +import org.apache.tajo.annotation.NotNull;
        --- End diff --

        unused import

        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/969#issuecomment-192146036

        +1 LGTM! The change looks straightforward.

        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/969

        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/969#issuecomment-192150416

        Thank you for the review.

        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-CODEGEN-build #689 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/689/)
        TAJO-2082: Aggregation on a derived table which includes union can cause (jihoonson: rev 7b0af74483521615f302d2a3376556dad325297f)

        • tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java
        • tajo-common/src/main/java/org/apache/tajo/util/graph/SimpleDirectedGraph.java
        • tajo-common/src/main/java/org/apache/tajo/exception/TooLargeInputForCrossJoinException.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/builder/DistinctGroupbyBuilder.java
        • tajo-common/src/main/java/org/apache/tajo/util/graph/DirectedGraphVisitor.java
        • tajo-core/src/main/java/org/apache/tajo/querymaster/Repartitioner.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java
        • CHANGES
        • tajo-common/src/main/java/org/apache/tajo/util/graph/DirectedGraph.java
        • tajo-common/src/test/java/org/apache/tajo/util/graph/TestSimpleDirectedGraph.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java
        • tajo-core/src/main/java/org/apache/tajo/querymaster/Stage.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/SelfDescSchemaBuildPhase.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/DataChannel.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/ExecutionBlock.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/rewriter/rules/BroadcastJoinRule.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-build #1098 (See https://builds.apache.org/job/Tajo-master-build/1098/)
        TAJO-2082: Aggregation on a derived table which includes union can cause (jihoonson: rev 7b0af74483521615f302d2a3376556dad325297f)

        • tajo-core/src/main/java/org/apache/tajo/querymaster/Stage.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/rewriter/rules/BroadcastJoinRule.java
        • tajo-common/src/main/java/org/apache/tajo/exception/TooLargeInputForCrossJoinException.java
        • tajo-common/src/main/java/org/apache/tajo/util/graph/DirectedGraph.java
        • tajo-core/src/main/java/org/apache/tajo/querymaster/Repartitioner.java
        • CHANGES
        • tajo-common/src/main/java/org/apache/tajo/util/graph/DirectedGraphVisitor.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/builder/DistinctGroupbyBuilder.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/DataChannel.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java
        • tajo-common/src/test/java/org/apache/tajo/util/graph/TestSimpleDirectedGraph.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/ExecutionBlock.java
        • tajo-common/src/main/java/org/apache/tajo/util/graph/SimpleDirectedGraph.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/SelfDescSchemaBuildPhase.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-0.11.2-build #177 (See https://builds.apache.org/job/Tajo-0.11.2-build/177/)
        TAJO-2082: Aggregation on a derived table which includes union can cause (jihoonson: rev 3e083b16413ba719767cb2ff5fc96567c53550d8)

        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/rewriter/rules/BroadcastJoinRule.java
        • tajo-common/src/main/java/org/apache/tajo/util/graph/SimpleDirectedGraph.java
        • tajo-common/src/main/java/org/apache/tajo/util/graph/DirectedGraphVisitor.java
        • tajo-common/src/test/java/org/apache/tajo/util/graph/TestSimpleDirectedGraph.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java
        • tajo-core/src/main/java/org/apache/tajo/querymaster/Repartitioner.java
        • tajo-common/src/main/java/org/apache/tajo/util/graph/DirectedGraph.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/ExecutionBlock.java
        • tajo-core/src/main/java/org/apache/tajo/querymaster/Stage.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/SelfDescSchemaBuildPhase.java
        • CHANGES
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/builder/DistinctGroupbyBuilder.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/DataChannel.java
        • tajo-common/src/main/java/org/apache/tajo/exception/TooLargeInputForCrossJoinException.java

          People

          • Assignee: Jihoon Son (jihoonson)
          • Reporter: Jihoon Son (jihoonson)
          • Votes: 0
          • Watchers: 3
