Uploaded image for project: 'Tajo'
  1. Tajo
  2. TAJO-1800

WHERE clause is ignored with UNION

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • Component/s: Planner/Optimizer
    • Labels:
      None

      Description

      You can reproduce as follows:

      tpch> select c_custkey, ret
      > from (select c_custkey, ROUND(sum(c_acctbal*15000000)/sum(15000000),4) as ret from customer group by c_custkey
      > union
      > select c_custkey, ROUND(sum(c_acctbal*15000000)/sum(15000000),4) as ret from customer group by c_custkey) a
      > where ret > 0.02;
      c_custkey,  ret
      -------------------------------
      64568,  3631.24
      26627,  9070.85
      105296,  2863.05
      51704,  7404.84
      111424,  -977.6
      32824,  3718.26
      138521,  -510.01
      ...
      

        Activity

        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-0.11.0-build #13 (See https://builds.apache.org/job/Tajo-0.11.0-build/13/)
        TAJO-1800: WHERE clause is ignored with UNION. (hyunsik: rev 0f5d97a0004e91d614fc8093b3f81b0bd8c93899)

        • tajo-core-tests/src/test/resources/results/TestUnionQuery/testUnionAndFilter.plan
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestUnionQuery.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/rules/FilterPushDownRule.java
        • CHANGES
        • tajo-core-tests/src/test/resources/queries/TestUnionQuery/testUnionAndFilter.sql
        • tajo-core-tests/src/test/resources/results/TestUnionQuery/testUnionAndFilter.result
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Tajo-0.11.0-build #13 (See https://builds.apache.org/job/Tajo-0.11.0-build/13/ ) TAJO-1800 : WHERE clause is ignored with UNION. (hyunsik: rev 0f5d97a0004e91d614fc8093b3f81b0bd8c93899) tajo-core-tests/src/test/resources/results/TestUnionQuery/testUnionAndFilter.plan tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestUnionQuery.java tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/rules/FilterPushDownRule.java CHANGES tajo-core-tests/src/test/resources/queries/TestUnionQuery/testUnionAndFilter.sql tajo-core-tests/src/test/resources/results/TestUnionQuery/testUnionAndFilter.result
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #834 (See https://builds.apache.org/job/Tajo-master-build/834/)
        TAJO-1800: WHERE clause is ignored with UNION. (hyunsik: rev fd4a3f80c8adb65233ff3fbb1b99dd8585d44a7b)

        • tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/rules/FilterPushDownRule.java
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestUnionQuery.java
        • tajo-core-tests/src/test/resources/queries/TestUnionQuery/testUnionAndFilter.sql
        • tajo-core-tests/src/test/resources/results/TestUnionQuery/testUnionAndFilter.plan
        • CHANGES
        • tajo-core-tests/src/test/resources/results/TestUnionQuery/testUnionAndFilter.result
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Tajo-master-build #834 (See https://builds.apache.org/job/Tajo-master-build/834/ ) TAJO-1800 : WHERE clause is ignored with UNION. (hyunsik: rev fd4a3f80c8adb65233ff3fbb1b99dd8585d44a7b) tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/rules/FilterPushDownRule.java tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestUnionQuery.java tajo-core-tests/src/test/resources/queries/TestUnionQuery/testUnionAndFilter.sql tajo-core-tests/src/test/resources/results/TestUnionQuery/testUnionAndFilter.plan CHANGES tajo-core-tests/src/test/resources/results/TestUnionQuery/testUnionAndFilter.result
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-CODEGEN-build #478 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/478/)
        TAJO-1800: WHERE clause is ignored with UNION. (hyunsik: rev fd4a3f80c8adb65233ff3fbb1b99dd8585d44a7b)

        • tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/rules/FilterPushDownRule.java
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestUnionQuery.java
        • CHANGES
        • tajo-core-tests/src/test/resources/results/TestUnionQuery/testUnionAndFilter.plan
        • tajo-core-tests/src/test/resources/queries/TestUnionQuery/testUnionAndFilter.sql
        • tajo-core-tests/src/test/resources/results/TestUnionQuery/testUnionAndFilter.result
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Tajo-master-CODEGEN-build #478 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/478/ ) TAJO-1800 : WHERE clause is ignored with UNION. (hyunsik: rev fd4a3f80c8adb65233ff3fbb1b99dd8585d44a7b) tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/rules/FilterPushDownRule.java tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestUnionQuery.java CHANGES tajo-core-tests/src/test/resources/results/TestUnionQuery/testUnionAndFilter.plan tajo-core-tests/src/test/resources/queries/TestUnionQuery/testUnionAndFilter.sql tajo-core-tests/src/test/resources/results/TestUnionQuery/testUnionAndFilter.result
        Hide
        hyunsik Hyunsik Choi added a comment -

        I committed it to master and branch-0.11.0 branches.

        Show
        hyunsik Hyunsik Choi added a comment - I committed it to master and branch-0.11.0 branches.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/727

        Show
        githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/tajo/pull/727
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/727#issuecomment-137686381

        +1 LGTM!

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on the pull request: https://github.com/apache/tajo/pull/727#issuecomment-137686381 +1 LGTM!
        Hide
        githubbot ASF GitHub Bot added a comment -

        GitHub user hyunsik opened a pull request:

        https://github.com/apache/tajo/pull/727

        TAJO-1800: WHERE clause is ignored with UNION.

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/hyunsik/tajo TAJO-1800

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/727.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #727


        commit 9bb6f58e2ebd3ee280d6804239c65fcc4e69a10a
        Author: Hyunsik Choi <hyunsik@apache.org>
        Date: 2015-09-04T06:30:36Z

        TAJO-1800: WHERE clause is ignored with UNION.


        Show
        githubbot ASF GitHub Bot added a comment - GitHub user hyunsik opened a pull request: https://github.com/apache/tajo/pull/727 TAJO-1800 : WHERE clause is ignored with UNION. You can merge this pull request into a Git repository by running: $ git pull https://github.com/hyunsik/tajo TAJO-1800 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tajo/pull/727.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #727 commit 9bb6f58e2ebd3ee280d6804239c65fcc4e69a10a Author: Hyunsik Choi <hyunsik@apache.org> Date: 2015-09-04T06:30:36Z TAJO-1800 : WHERE clause is ignored with UNION.
        Hide
        hyunsik Hyunsik Choi added a comment -

        I misunderstood this query. This query is UNION DISTINCT instead of UNION ALL. So, the plan is correct.

        Show
        hyunsik Hyunsik Choi added a comment - I misunderstood this query. This query is UNION DISTINCT instead of UNION ALL. So, the plan is correct.
        Hide
        hyunsik Hyunsik Choi added a comment -

        Aside from this bug, the query statement causes a wrong query plan as follows:

        explain
        -------------------------------
        TABLE_SUBQUERY(7) as default.a
          => Targets: default.a.c_custkey (INT4), default.a.ret (FLOAT8)
          => out schema: {(2) default.a.c_custkey (INT4), default.a.ret (FLOAT8)}
          => in  schema: {(2) default.a.c_custkey (INT4), default.a.ret (FLOAT8)}
           GROUP_BY(11)(c_custkey,ret)
             => target list: default.#SQ_0.c_custkey (INT4), default.#SQ_0.ret (FLOAT8)
             => out schema:{(2) default.#SQ_0.c_custkey (INT4), default.#SQ_0.ret (FLOAT8)}
             => in schema:{(2) default.#SQ_0.c_custkey (INT4), default.#SQ_0.ret (FLOAT8)}
              TABLE_SUBQUERY(10) as default.#SQ_0
                => Targets: default.#SQ_0.c_custkey (INT4), default.#SQ_0.ret (FLOAT8)
                => out schema: {(2) default.#SQ_0.c_custkey (INT4), default.#SQ_0.ret (FLOAT8)}
                => in  schema: {(2) default.#SQ_0.c_custkey (INT4), default.#SQ_0.ret (FLOAT8)}
                 UNION(6)
                    PROJECTION(5)
                      => Targets: default.customer.c_custkey (INT4), round(?sum_5 (FLOAT8) / CAST (?sum_6 (INT8) AS FLOAT8),4) as ret
                      => out schema: {(2) default.customer.c_custkey (INT4), ret (FLOAT8)}
                      => in  schema: {(3) ?sum_5 (FLOAT8), ?sum_6 (INT8), default.customer.c_custkey (INT4)}
                       GROUP_BY(4)(c_custkey)
                         => exprs: (sum(?multiply_4 (FLOAT8)),sum(15000000))
                         => target list: default.customer.c_custkey (INT4), ?sum_5 (FLOAT8), ?sum_6 (INT8)
                         => out schema:{(3) ?sum_5 (FLOAT8), ?sum_6 (INT8), default.customer.c_custkey (INT4)}
                         => in schema:{(3) ?multiply_4 (FLOAT8), default.customer.c_acctbal (FLOAT8), default.customer.c_custkey (INT4)}
                          SCAN(3) on default.customer
                            => target list: default.customer.c_acctbal (FLOAT8), default.customer.c_custkey (INT4), default.customer.c_acctbal (FLOAT8) * 1.5E7 as ?multiply_4
                            => out schema: {(3) ?multiply_4 (FLOAT8), default.customer.c_acctbal (FLOAT8), default.customer.c_custkey (INT4)}
                            => in schema: {(8) default.customer.c_acctbal (FLOAT8), default.customer.c_address (TEXT), default.customer.c_comment (TEXT), default.customer.c_custkey (INT4), default.customer.c_mktsegment (TEXT), default.customer.c_name (TEXT), default.customer.c_nationkey (INT4), default.customer.c_phone (TEXT)}
                    PROJECTION(2)
                      => Targets: default.customer.c_custkey (INT4), round(?sum_2 (FLOAT8) / CAST (?sum_3 (INT8) AS FLOAT8),4) as ret
                      => out schema: {(2) default.customer.c_custkey (INT4), ret (FLOAT8)}
                      => in  schema: {(3) ?sum_2 (FLOAT8), ?sum_3 (INT8), default.customer.c_custkey (INT4)}
                       GROUP_BY(1)(c_custkey)
                         => exprs: (sum(?multiply_1 (FLOAT8)),sum(15000000))
                         => target list: default.customer.c_custkey (INT4), ?sum_2 (FLOAT8), ?sum_3 (INT8)
                         => out schema:{(3) ?sum_2 (FLOAT8), ?sum_3 (INT8), default.customer.c_custkey (INT4)}
                         => in schema:{(3) ?multiply_1 (FLOAT8), default.customer.c_acctbal (FLOAT8), default.customer.c_custkey (INT4)}
                          SCAN(0) on default.customer
                            => target list: default.customer.c_acctbal (FLOAT8), default.customer.c_custkey (INT4), default.customer.c_acctbal (FLOAT8) * 1.5E7 as ?multiply_1
                            => out schema: {(3) ?multiply_1 (FLOAT8), default.customer.c_acctbal (FLOAT8), default.customer.c_custkey (INT4)}
                            => in schema: {(8) default.customer.c_acctbal (FLOAT8), default.customer.c_address (TEXT), default.customer.c_comment (TEXT), default.customer.c_custkey (INT4), default.customer.c_mktsegment (TEXT), default.customer.c_name (TEXT), default.customer.c_nationkey (INT4), default.customer.c_phone (TEXT)}
        
        Show
        hyunsik Hyunsik Choi added a comment - Aside from this bug, the query statement causes a wrong query plan as follows: explain ------------------------------- TABLE_SUBQUERY(7) as default.a => Targets: default.a.c_custkey (INT4), default.a.ret (FLOAT8) => out schema: {(2) default.a.c_custkey (INT4), default.a.ret (FLOAT8)} => in schema: {(2) default.a.c_custkey (INT4), default.a.ret (FLOAT8)} GROUP_BY(11)(c_custkey,ret) => target list: default.#SQ_0.c_custkey (INT4), default.#SQ_0.ret (FLOAT8) => out schema:{(2) default.#SQ_0.c_custkey (INT4), default.#SQ_0.ret (FLOAT8)} => in schema:{(2) default.#SQ_0.c_custkey (INT4), default.#SQ_0.ret (FLOAT8)} TABLE_SUBQUERY(10) as default.#SQ_0 => Targets: default.#SQ_0.c_custkey (INT4), default.#SQ_0.ret (FLOAT8) => out schema: {(2) default.#SQ_0.c_custkey (INT4), default.#SQ_0.ret (FLOAT8)} => in schema: {(2) default.#SQ_0.c_custkey (INT4), default.#SQ_0.ret (FLOAT8)} UNION(6) PROJECTION(5) => Targets: default.customer.c_custkey (INT4), round(?sum_5 (FLOAT8) / CAST (?sum_6 (INT8) AS FLOAT8),4) as ret => out schema: {(2) default.customer.c_custkey (INT4), ret (FLOAT8)} => in schema: {(3) ?sum_5 (FLOAT8), ?sum_6 (INT8), default.customer.c_custkey (INT4)} GROUP_BY(4)(c_custkey) => exprs: (sum(?multiply_4 (FLOAT8)),sum(15000000)) => target list: default.customer.c_custkey (INT4), ?sum_5 (FLOAT8), ?sum_6 (INT8) => out schema:{(3) ?sum_5 (FLOAT8), ?sum_6 (INT8), default.customer.c_custkey (INT4)} => in schema:{(3) ?multiply_4 (FLOAT8), default.customer.c_acctbal (FLOAT8), default.customer.c_custkey (INT4)} SCAN(3) on default.customer => target list: default.customer.c_acctbal (FLOAT8), default.customer.c_custkey (INT4), default.customer.c_acctbal (FLOAT8) * 1.5E7 as ?multiply_4 => out schema: {(3) ?multiply_4 (FLOAT8), default.customer.c_acctbal (FLOAT8), default.customer.c_custkey (INT4)} => in schema: {(8) default.customer.c_acctbal (FLOAT8), default.customer.c_address (TEXT), default.customer.c_comment (TEXT), default.customer.c_custkey (INT4), default.customer.c_mktsegment (TEXT), default.customer.c_name (TEXT), default.customer.c_nationkey (INT4), default.customer.c_phone (TEXT)} PROJECTION(2) => Targets: default.customer.c_custkey (INT4), round(?sum_2 (FLOAT8) / CAST (?sum_3 (INT8) AS FLOAT8),4) as ret => out schema: {(2) default.customer.c_custkey (INT4), ret (FLOAT8)} => in schema: {(3) ?sum_2 (FLOAT8), ?sum_3 (INT8), default.customer.c_custkey (INT4)} GROUP_BY(1)(c_custkey) => exprs: (sum(?multiply_1 (FLOAT8)),sum(15000000)) => target list: default.customer.c_custkey (INT4), ?sum_2 (FLOAT8), ?sum_3 (INT8) => out schema:{(3) ?sum_2 (FLOAT8), ?sum_3 (INT8), default.customer.c_custkey (INT4)} => in schema:{(3) ?multiply_1 (FLOAT8), default.customer.c_acctbal (FLOAT8), default.customer.c_custkey (INT4)} SCAN(0) on default.customer => target list: default.customer.c_acctbal (FLOAT8), default.customer.c_custkey (INT4), default.customer.c_acctbal (FLOAT8) * 1.5E7 as ?multiply_1 => out schema: {(3) ?multiply_1 (FLOAT8), default.customer.c_acctbal (FLOAT8), default.customer.c_custkey (INT4)} => in schema: {(8) default.customer.c_acctbal (FLOAT8), default.customer.c_address (TEXT), default.customer.c_comment (TEXT), default.customer.c_custkey (INT4), default.customer.c_mktsegment (TEXT), default.customer.c_name (TEXT), default.customer.c_nationkey (INT4), default.customer.c_phone (TEXT)}

          People

          • Assignee:
            hyunsik Hyunsik Choi
            Reporter:
            jihoonson Jihoon Son
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development