Tajo
  1. Tajo
  2. TAJO-316

Improve GreedyHeuristicJoinOrderAlgorithm to deal with non-commutative joins

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8
    • Component/s: planner/optimizer
    • Labels:

      Description

      GreedyHeuristicJoinOrderAlgorithm is a default cost-based join order algorithm. It is designed only for inner joins, and it cannot deal with non-commutative joins, such as left/right/full outer join and semi/anti join. The main goal of this issue is to improve this algorithm to deal with them.

      1. TAJO-316_2.patch
        93 kB
        Hyunsik Choi
      2. TAJO-316.patch
        92 kB
        Hyunsik Choi

        Issue Links

          Activity

          Hide
          Hudson added a comment -

          SUCCESS: Integrated in Tajo-trunk-postcommit #589 (See https://builds.apache.org/job/Tajo-trunk-postcommit/589/)
          TAJO-316: Improve GreedyHeuristicJoinOrderAlgorithm to deal with non-commutative joins. (hyunsik) (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=39fe4d765f84af5094387d3a93399c1a0a9fef10)

          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java
          • CHANGES.txt
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/join/GreedyHeuristicJoinOrderAlgorithm.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/join/JoinOrderAlgorithm.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java
          • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestInsertQuery.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/join/FoundJoinOrder.java
          • tajo-core/tajo-core-backend/src/test/tpch/customer.tbl
          • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestTableSubQuery.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java
          • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/join/JoinEdge.java
          • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestJoinQuery.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/join/JoinGraph.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/benchmark/TPCH.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/ExecutionBlockCursor.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/TableSubQueryNode.java
          • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestSelectQuery.java
          • tajo-core/tajo-core-backend/benchmark/tpch/customer.schema
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/BasicLogicalPlanVisitor.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/FilterPushDownRule.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/LimitNode.java
          • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalOptimizer.java
          Show
          Hudson added a comment - SUCCESS: Integrated in Tajo-trunk-postcommit #589 (See https://builds.apache.org/job/Tajo-trunk-postcommit/589/ ) TAJO-316 : Improve GreedyHeuristicJoinOrderAlgorithm to deal with non-commutative joins. (hyunsik) (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=39fe4d765f84af5094387d3a93399c1a0a9fef10 ) tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java CHANGES.txt tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/join/GreedyHeuristicJoinOrderAlgorithm.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/join/JoinOrderAlgorithm.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestInsertQuery.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/join/FoundJoinOrder.java tajo-core/tajo-core-backend/src/test/tpch/customer.tbl tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestTableSubQuery.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/join/JoinEdge.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestJoinQuery.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/join/JoinGraph.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/benchmark/TPCH.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/ExecutionBlockCursor.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/TableSubQueryNode.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestSelectQuery.java tajo-core/tajo-core-backend/benchmark/tpch/customer.schema tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/BasicLogicalPlanVisitor.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/FilterPushDownRule.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/LimitNode.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalOptimizer.java
          Hide
          Hyunsik Choi added a comment -

          committed this patch to master. Thank you for the review.

          Show
          Hyunsik Choi added a comment - committed this patch to master. Thank you for the review.
          Hide
          Hyunsik Choi added a comment -

          Thank you for your review. You are right. They cannot be separate because the current global planner is based on the assumption that join tree is left-deep. So, the current global planner couldn't deal with new join plans. And, I'll remove unused imports.

          If there is no objection until tomorrow, I'll commit it. Thanks.

          Show
          Hyunsik Choi added a comment - Thank you for your review. You are right. They cannot be separate because the current global planner is based on the assumption that join tree is left-deep. So, the current global planner couldn't deal with new join plans. And, I'll remove unused imports. If there is no objection until tomorrow, I'll commit it. Thanks.
          Hide
          Jihoon Son added a comment -

          This work looks great as a prototype of the join order optimization for the distributed join.

          This patch contains some changes of global planning of other operators such as sort as well as the join ordering.
          Are these changes just a refactoring of the global planner?
          I think that it would be better that the refactoring is proceeded in a new issue, but it looks hard to be separated.

          Other parts of the patch looks good to me.
          Here is my +1.
          Please remove unused imports from some tests before you commit.

          Show
          Jihoon Son added a comment - This work looks great as a prototype of the join order optimization for the distributed join. This patch contains some changes of global planning of other operators such as sort as well as the join ordering. Are these changes just a refactoring of the global planner? I think that it would be better that the refactoring is proceeded in a new issue, but it looks hard to be separated. Other parts of the patch looks good to me. Here is my +1. Please remove unused imports from some tests before you commit.
          Hide
          Jihoon Son added a comment -

          I'll review this patch at tonight.

          Show
          Jihoon Son added a comment - I'll review this patch at tonight.
          Hide
          Hyunsik Choi added a comment -

          I attach the second patch. This patch fixes additionally TAJO-340.

          Show
          Hyunsik Choi added a comment - I attach the second patch. This patch fixes additionally TAJO-340 .
          Hide
          Jihoon Son added a comment -

          Great work!
          When you post the next patch, I'll review it.

          Show
          Jihoon Son added a comment - Great work! When you post the next patch, I'll review it.
          Hide
          Hyunsik Choi added a comment -

          I've uploaded the first patch.

          This patch does as follows:

          • Improve GreedyHeuristicJoinOrder to handle non-commutative joins, such as left/right/outer and anti/semi joins
          • Improve GreedyHeuristicJoinOrder to generate a bushy join tree
            • This class is based on the greedy operator order algorithm, whose complexity is O(n^3).
            • It has more possibility to generate more optimal join orders and still is fast.
          • Improve GlobalPlanner to handle a bushy join tree
          • Improve ExecutionBlockCursor to handle a bushy tree distributed execution plan

          This work almost done, but this patch still is in progress for more refactors, test, and documentations. I'll post the completed patch tomorrow.

          Show
          Hyunsik Choi added a comment - I've uploaded the first patch. This patch does as follows: Improve GreedyHeuristicJoinOrder to handle non-commutative joins, such as left/right/outer and anti/semi joins Improve GreedyHeuristicJoinOrder to generate a bushy join tree This class is based on the greedy operator order algorithm, whose complexity is O(n^3). It has more possibility to generate more optimal join orders and still is fast. Improve GlobalPlanner to handle a bushy join tree Improve ExecutionBlockCursor to handle a bushy tree distributed execution plan This work almost done, but this patch still is in progress for more refactors, test, and documentations. I'll post the completed patch tomorrow.

            People

            • Assignee:
              Hyunsik Choi
              Reporter:
              Hyunsik Choi
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development