  1. Tajo
  2. TAJO-574

Add a sort-based physical executor for column partition store

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: Physical Operator
    • Labels:
      None

      Description

      ColumnPartitionStoreExec keeps numerous files open while it stores data. In addition, its random writes place a burden on the HDFS NameNode.

      To solve this problem, I propose a sort-based physical executor for the column partition store. It assumes that input tuples are sorted in ascending or descending order of the partition keys, which means it requires an extra sort operation. In return, it keeps only one file open at a time and writes all data sequentially. In many cases, it would be the best choice for the column partition store.
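
      The idea above can be illustrated with a minimal sketch. It is not the code from TAJO-574.patch: the class name SortedPartitionWriterSketch, the method writeSorted, the "key=<value>/part-0" layout, and the use of the local file system instead of HDFS are all assumptions made for illustration. It only shows why sorted input lets the writer keep a single file open: whenever the partition key changes, the previous partition is complete and its file can be closed before the next one is opened.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical sketch of a sort-based partition writer; not Tajo's actual executor. */
final class SortedPartitionWriterSketch {

  /**
   * Writes rows of the form {partitionKey, value}. The rows are ASSUMED to be
   * already sorted by partitionKey (ascending or descending). If they are not,
   * a key that reappears later would truncate its earlier file.
   *
   * @return the number of partition files opened
   */
  static int writeSorted(List<String[]> sortedRows, Path baseDir) throws IOException {
    String currentKey = null;
    BufferedWriter writer = null;
    int filesOpened = 0;
    try {
      for (String[] row : sortedRows) {
        String key = row[0];
        if (!key.equals(currentKey)) {
          // The key changed, so the previous partition is complete: close it
          // before opening the next file. At most one file is open at a time.
          if (writer != null) {
            writer.close();
            writer = null;
          }
          Path dir = Files.createDirectories(baseDir.resolve("key=" + key));
          writer = Files.newBufferedWriter(dir.resolve("part-0"), StandardCharsets.UTF_8);
          currentKey = key;
          filesOpened++;
        }
        // Rows of the same partition are appended sequentially to the open file.
        writer.write(row[1]);
        writer.newLine();
      }
    } finally {
      if (writer != null) {
        writer.close();
      }
    }
    return filesOpened;
  }
}
```

      With sorted input, the number of files opened equals the number of distinct partition keys, and each file is written front to back. A hash-based variant would instead need one open writer per distinct key for the whole run.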

      1. TAJO-574.patch
        51 kB
        Hyunsik Choi

        Activity

        hyunsik Hyunsik Choi added a comment -

        Created a review request against branch master in reviewboard
        https://reviews.apache.org/r/17633/

        tajoqa Tajo QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12626448/TAJO-574.patch
        against master revision d516fc4.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The applied patch does not increase the total number of javadoc warnings.

        +1 checkstyle. The patch generated 0 code style errors.

        -1 findbugs. The patch appears to introduce 204 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in tajo-catalog/tajo-catalog-common tajo-core/tajo-core-backend.

        Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/95//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/95//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-catalog-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/95//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-core-backend.html
        Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/95//console

        This message is automatically generated.

        hyunsik Hyunsik Choi added a comment -

        This issue got +1 on RB. I committed it to the master branch.

        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #48 (See https://builds.apache.org/job/Tajo-master-build/48/)
        TAJO-574: Add a sort-based physical executor for column partition store. (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=10c599f4b057308eca7ac8d5d7cc2542a69f0524)

        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/HashBasedColPartitionStoreExec.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/ColumnPartitionedTableStoreExec.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/ColPartitionStoreExec.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/enforce/Enforcer.java
        • CHANGES.txt
        • tajo-core/tajo-core-backend/src/main/proto/TajoWorkerProtocol.proto
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/PhysicalExecutorVisitor.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/BasicPhysicalExecutorVisitor.java
        • tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/statistics/StatisticsUtil.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/SortBasedColPartitionStoreExec.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestPhysicalPlanner.java

          People

          • Assignee:
            hyunsik Hyunsik Choi
            Reporter:
            hyunsik Hyunsik Choi