Uploaded image for project: 'Tajo'
  1. Tajo
  2. TAJO-1403

Improve 'Simple Query' with only partition columns and constant values

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • Component/s: None
    • Labels:
      None

      Description

      Tajo shows a very fast response for a simple query ( https://cwiki.apache.org/confluence/display/TAJO/Simple+Query+and+Forwarded+Query) like the followings.

      select * from t1 limit 10;
      

      However, in many cases, tables have partitions.

      create external table t1(id int) using csv with ('csvfile.delimiter'='|') partition by column(dt text) location '/data';
      select * from t1 where dt='2015-03-15' limit 10;
      

      If all predicates in WHERE consist of partition columns and 'EQUAL' predicates with constant values, I think Tajo can handle these cases very fast.

      This kind of queries is very popular for DevOps users and simple ETL apps.

      1. TAJO-1403_jihoon.patch
        13 kB
        Jihoon Son
      2. TAJO-1403.patch
        12 kB
        Dongjoon Hyun

        Activity

        Hide
        hyunsik Hyunsik Choi added a comment -

        It's a good idea. I think that we can put this feature to Tajo soon.

        Show
        hyunsik Hyunsik Choi added a comment - It's a good idea. I think that we can put this feature to Tajo soon.
        Hide
        dongjoon Dongjoon Hyun added a comment -

        Thank you! I will resolve this issue soon.

        Show
        dongjoon Dongjoon Hyun added a comment - Thank you! I will resolve this issue soon.
        Hide
        githubbot ASF GitHub Bot added a comment -

        GitHub user dongjoon-hyun opened a pull request:

        https://github.com/apache/tajo/pull/431

        TAJO-1403: Improve 'Simple Query' with only partition columns and consta...

        Please see https://issues.apache.org/jira/browse/TAJO-1403.

        Tested by the following command.

        ```
        mvn clean install -Pparallel-test,hcatalog-0.12.0 -DLOG_LEVEL=ERROR -Dmaven.fork.count=2


        Tests run: 1376, Failures: 0, Errors: 0, Skipped: 0
        ```

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/dongjoon-hyun/tajo TAJO-1403

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/431.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #431


        commit 50f1751b97dacec30c98b7c3c3cd74f9f3151a09
        Author: Dongjoon Hyun <dongjoon@apache.org>
        Date: 2015-03-18T06:41:51Z

        TAJO-1403: Improve 'Simple Query' with only partition columns and constant values


        Show
        githubbot ASF GitHub Bot added a comment - GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/tajo/pull/431 TAJO-1403 : Improve 'Simple Query' with only partition columns and consta... Please see https://issues.apache.org/jira/browse/TAJO-1403 . Tested by the following command. ``` mvn clean install -Pparallel-test,hcatalog-0.12.0 -DLOG_LEVEL=ERROR -Dmaven.fork.count=2 Tests run: 1376, Failures: 0, Errors: 0, Skipped: 0 ``` You can merge this pull request into a Git repository by running: $ git pull https://github.com/dongjoon-hyun/tajo TAJO-1403 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tajo/pull/431.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #431 commit 50f1751b97dacec30c98b7c3c3cd74f9f3151a09 Author: Dongjoon Hyun <dongjoon@apache.org> Date: 2015-03-18T06:41:51Z TAJO-1403 : Improve 'Simple Query' with only partition columns and constant values
        Hide
        tajoqa Tajo QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12705288/TAJO-1403.patch
        against master revision release-0.9.0-rc0-206-g82d44af.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The applied patch does not increase the total number of javadoc warnings.

        +1 checkstyle. The patch generated 0 code style errors.

        -1 findbugs. The patch appears to introduce 58 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in tajo-core tajo-plan tajo-storage/tajo-storage-hdfs:
        org.apache.tajo.engine.query.TestCaseByCases

        Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/618//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/618//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-core.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/618//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-plan.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/618//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-storage-hdfs.html
        Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/618//console

        This message is automatically generated.

        Show
        tajoqa Tajo QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705288/TAJO-1403.patch against master revision release-0.9.0-rc0-206-g82d44af. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The applied patch does not increase the total number of javadoc warnings. +1 checkstyle. The patch generated 0 code style errors. -1 findbugs. The patch appears to introduce 58 new Findbugs (version 2.0.3) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in tajo-core tajo-plan tajo-storage/tajo-storage-hdfs: org.apache.tajo.engine.query.TestCaseByCases Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/618//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/618//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-core.html Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/618//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-plan.html Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/618//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-storage-hdfs.html Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/618//console This message is automatically generated.
        Hide
        jihoonson Jihoon Son added a comment -

        Jenkins test is failed.

        Show
        jihoonson Jihoon Son added a comment - Jenkins test is failed.
        Hide
        dongjoon Dongjoon Hyun added a comment -

        Thank you for your review, Jihoon Son.

        As I mentioned in my pull request. I tested like this. (Travis CI maven command)

        mvn clean install -Pparallel-test,hcatalog-0.12.0 -DLOG_LEVEL=ERROR -Dmaven.fork.count=2
        -----
        Tests run: 1376, Failures: 0, Errors: 0, Skipped: 0
        

        After your review, I checked again. But, I can not regenerate the failure. At this time, I used the Jenkins maven command to get the same result with Jenkins.

        $ mvn clean test -Phcatalog-0.12.0 -DLOG_LEVEL=ERROR > mvn_clean_test_Phcatalog-0.12.0
        ------
        Tests run: 1376, Failures: 0, Errors: 0, Skipped: 0
        

        Here is my new test result text file.

        https://app.box.com/s/zf589fb3o03rhmchk1y70kvu1fdeawoz

        I'm not sure about the reason. Have you ever experienced something like this?
        In my opinion, could you pull and check in your environment, please?

        Regards,
        Dongjoon.

        Show
        dongjoon Dongjoon Hyun added a comment - Thank you for your review, Jihoon Son . As I mentioned in my pull request. I tested like this. (Travis CI maven command) mvn clean install -Pparallel-test,hcatalog-0.12.0 -DLOG_LEVEL=ERROR -Dmaven.fork.count=2 ----- Tests run: 1376, Failures: 0, Errors: 0, Skipped: 0 After your review, I checked again. But, I can not regenerate the failure. At this time, I used the Jenkins maven command to get the same result with Jenkins. $ mvn clean test -Phcatalog-0.12.0 -DLOG_LEVEL=ERROR > mvn_clean_test_Phcatalog-0.12.0 ------ Tests run: 1376, Failures: 0, Errors: 0, Skipped: 0 Here is my new test result text file. https://app.box.com/s/zf589fb3o03rhmchk1y70kvu1fdeawoz I'm not sure about the reason. Have you ever experienced something like this? In my opinion, could you pull and check in your environment, please? Regards, Dongjoon.
        Hide
        jihoonson Jihoon Son added a comment -

        Dongjoon Hyun, thanks for sharing your test result.
        I also tested your patch with my laptop. And, as you said, all unit tests were successfully passed.
        I think that the test failure is caused by the Jenkins' misbehavior, but would like to check again.
        Would you mind triggering the test again?

        Show
        jihoonson Jihoon Son added a comment - Dongjoon Hyun , thanks for sharing your test result. I also tested your patch with my laptop. And, as you said, all unit tests were successfully passed. I think that the test failure is caused by the Jenkins' misbehavior, but would like to check again. Would you mind triggering the test again?
        Hide
        dongjoon Dongjoon Hyun added a comment -

        Thank you very much for spending your time. Then, I will trigger again and keep my fingers crossed.

        Show
        dongjoon Dongjoon Hyun added a comment - Thank you very much for spending your time. Then, I will trigger again and keep my fingers crossed.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user dongjoon-hyun closed the pull request at:

        https://github.com/apache/tajo/pull/431

        Show
        githubbot ASF GitHub Bot added a comment - Github user dongjoon-hyun closed the pull request at: https://github.com/apache/tajo/pull/431
        Hide
        dongjoon Dongjoon Hyun added a comment -

        To ensure everything, I created a new branch from the current master and am testing again. I will create a new pull request again.
        For archival purpose, old patch is archived still there.
        https://github.com/dongjoon-hyun/tajo/commit/50f1751b97dacec30c98b7c3c3cd74f9f3151a09

        Show
        dongjoon Dongjoon Hyun added a comment - To ensure everything, I created a new branch from the current master and am testing again. I will create a new pull request again. For archival purpose, old patch is archived still there. https://github.com/dongjoon-hyun/tajo/commit/50f1751b97dacec30c98b7c3c3cd74f9f3151a09
        Hide
        githubbot ASF GitHub Bot added a comment -

        GitHub user dongjoon-hyun opened a pull request:

        https://github.com/apache/tajo/pull/434

        TAJO-1403: Improve 'Simple Query' with only partition columns and constant values

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/dongjoon-hyun/tajo TAJO-1403

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/434.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #434


        commit ea2e948dcd838e42a5f92ce4e954d69265fcfa8d
        Author: Dongjoon Hyun <dongjoon@apache.org>
        Date: 2015-03-18T23:11:05Z

        TAJO-1403: Improve 'Simple Query' with only partition columns and constant values


        Show
        githubbot ASF GitHub Bot added a comment - GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/tajo/pull/434 TAJO-1403 : Improve 'Simple Query' with only partition columns and constant values You can merge this pull request into a Git repository by running: $ git pull https://github.com/dongjoon-hyun/tajo TAJO-1403 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tajo/pull/434.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #434 commit ea2e948dcd838e42a5f92ce4e954d69265fcfa8d Author: Dongjoon Hyun <dongjoon@apache.org> Date: 2015-03-18T23:11:05Z TAJO-1403 : Improve 'Simple Query' with only partition columns and constant values
        Hide
        dongjoon Dongjoon Hyun added a comment -

        I sent a new pull request with the same patch, Jihoon Son.

        Show
        dongjoon Dongjoon Hyun added a comment - I sent a new pull request with the same patch, Jihoon Son .
        Hide
        dongjoon Dongjoon Hyun added a comment -

        The same patch based on master "a9ae3ca TAJO-1337: Implements common modules to handle RESTful API".

        Show
        dongjoon Dongjoon Hyun added a comment - The same patch based on master "a9ae3ca TAJO-1337 : Implements common modules to handle RESTful API".
        Hide
        tajoqa Tajo QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12705476/TAJO-1403.patch
        against master revision release-0.9.0-rc0-207-ga9ae3ca.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The applied patch does not increase the total number of javadoc warnings.

        +1 checkstyle. The patch generated 0 code style errors.

        -1 findbugs. The patch appears to introduce 58 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in tajo-core tajo-plan tajo-storage/tajo-storage-hdfs.

        Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/621//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/621//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-plan.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/621//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-core.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/621//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-storage-hdfs.html
        Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/621//console

        This message is automatically generated.

        Show
        tajoqa Tajo QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705476/TAJO-1403.patch against master revision release-0.9.0-rc0-207-ga9ae3ca. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The applied patch does not increase the total number of javadoc warnings. +1 checkstyle. The patch generated 0 code style errors. -1 findbugs. The patch appears to introduce 58 new Findbugs (version 2.0.3) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in tajo-core tajo-plan tajo-storage/tajo-storage-hdfs. Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/621//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/621//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-plan.html Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/621//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-core.html Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/621//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-storage-hdfs.html Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/621//console This message is automatically generated.
        Hide
        jihoonson Jihoon Son added a comment -

        Jenkins test seems to be back to normal. I'll review tonight.

        Show
        jihoonson Jihoon Son added a comment - Jenkins test seems to be back to normal. I'll review tonight.
        Hide
        dongjoon Dongjoon Hyun added a comment -

        Thank you for your keeping attention!

        Show
        dongjoon Dongjoon Hyun added a comment - Thank you for your keeping attention!
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/434#issuecomment-83973176

        I'll review this weekend.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on the pull request: https://github.com/apache/tajo/pull/434#issuecomment-83973176 I'll review this weekend.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user dongjoon-hyun commented on the pull request:

        https://github.com/apache/tajo/pull/434#issuecomment-83982054

        Finally! Thank you, @jihoonson . I'm working this weekend also, Please ask anything if you need.

        Show
        githubbot ASF GitHub Bot added a comment - Github user dongjoon-hyun commented on the pull request: https://github.com/apache/tajo/pull/434#issuecomment-83982054 Finally! Thank you, @jihoonson . I'm working this weekend also, Please ask anything if you need.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/434#discussion_r26903451

        — Diff: tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileStorageManager.java —
        @@ -722,8 +737,18 @@ public void purgeTable(TableDesc tableDesc) throws IOException {

        List<FileStatus> nonZeroLengthFiles = new ArrayList<FileStatus>();
        if (fs.exists(tablePath)) {

        • getNonZeroLengthDataFiles(fs, tablePath, nonZeroLengthFiles, currentPage, numResultFragments,
        • new AtomicInteger(0), tableDesc.hasPartition(), 0, partitionDepth);
          + if (!partitionPath.isEmpty())
          + {
            • End diff –

        Would you move this parenthesis according to our code convention?

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/434#discussion_r26903451 — Diff: tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileStorageManager.java — @@ -722,8 +737,18 @@ public void purgeTable(TableDesc tableDesc) throws IOException { List<FileStatus> nonZeroLengthFiles = new ArrayList<FileStatus>(); if (fs.exists(tablePath)) { getNonZeroLengthDataFiles(fs, tablePath, nonZeroLengthFiles, currentPage, numResultFragments, new AtomicInteger(0), tableDesc.hasPartition(), 0, partitionDepth); + if (!partitionPath.isEmpty()) + { End diff – Would you move this parenthesis according to our code convention?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user dongjoon-hyun commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/434#discussion_r26903467

        — Diff: tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileStorageManager.java —
        @@ -722,8 +737,18 @@ public void purgeTable(TableDesc tableDesc) throws IOException {

        List<FileStatus> nonZeroLengthFiles = new ArrayList<FileStatus>();
        if (fs.exists(tablePath)) {

        • getNonZeroLengthDataFiles(fs, tablePath, nonZeroLengthFiles, currentPage, numResultFragments,
        • new AtomicInteger(0), tableDesc.hasPartition(), 0, partitionDepth);
          + if (!partitionPath.isEmpty())
          + {
            • End diff –

        Oh, Sure.

        Show
        githubbot ASF GitHub Bot added a comment - Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/tajo/pull/434#discussion_r26903467 — Diff: tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileStorageManager.java — @@ -722,8 +737,18 @@ public void purgeTable(TableDesc tableDesc) throws IOException { List<FileStatus> nonZeroLengthFiles = new ArrayList<FileStatus>(); if (fs.exists(tablePath)) { getNonZeroLengthDataFiles(fs, tablePath, nonZeroLengthFiles, currentPage, numResultFragments, new AtomicInteger(0), tableDesc.hasPartition(), 0, partitionDepth); + if (!partitionPath.isEmpty()) + { End diff – Oh, Sure.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/434#discussion_r26903480

        — Diff: tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java —
        @@ -133,11 +134,90 @@ public static boolean checkIfSimpleQuery(LogicalPlan plan) {
        }
        }
        }
        +
        + if (!noWhere && scanNode.getTableDesc().isExternal() && scanNode.getTableDesc().getPartitionMethod() != null) {
        — End diff –

        A managed table can also be partitioned.
        Would you explain why you added the isExternal() condition?

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/434#discussion_r26903480 — Diff: tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java — @@ -133,11 +134,90 @@ public static boolean checkIfSimpleQuery(LogicalPlan plan) { } } } + + if (!noWhere && scanNode.getTableDesc().isExternal() && scanNode.getTableDesc().getPartitionMethod() != null) { — End diff – A managed table can also be partitioned. Would you explain why you added the isExternal() condition?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user dongjoon-hyun commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/434#discussion_r26903503

        — Diff: tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java —
        @@ -133,11 +134,90 @@ public static boolean checkIfSimpleQuery(LogicalPlan plan) {
        }
        }
        }
        +
        + if (!noWhere && scanNode.getTableDesc().isExternal() && scanNode.getTableDesc().getPartitionMethod() != null) {
        — End diff –

        Yes, I developed handle both, but I found that Tajo does not support partitioned insert well.
        So, I created the following issue, and change to handle external table only.
        https://issues.apache.org/jira/browse/TAJO-1416

        Show
        githubbot ASF GitHub Bot added a comment - Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/tajo/pull/434#discussion_r26903503 — Diff: tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java — @@ -133,11 +134,90 @@ public static boolean checkIfSimpleQuery(LogicalPlan plan) { } } } + + if (!noWhere && scanNode.getTableDesc().isExternal() && scanNode.getTableDesc().getPartitionMethod() != null) { — End diff – Yes, I developed handle both, but I found that Tajo does not support partitioned insert well. So, I created the following issue, and change to handle external table only. https://issues.apache.org/jira/browse/TAJO-1416
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user dongjoon-hyun commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/434#discussion_r26903529

        — Diff: tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java —
        @@ -133,11 +134,90 @@ public static boolean checkIfSimpleQuery(LogicalPlan plan) {
        }
        }
        }
        +
        + if (!noWhere && scanNode.getTableDesc().isExternal() && scanNode.getTableDesc().getPartitionMethod() != null) {
        — End diff –

        After fixing that issue, I planned to move to managed table creation.
        Actually, I don't know Tajo internal managed table file structure.
        I'm familiar with Hive only.

        Show
        githubbot ASF GitHub Bot added a comment - Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/tajo/pull/434#discussion_r26903529 — Diff: tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java — @@ -133,11 +134,90 @@ public static boolean checkIfSimpleQuery(LogicalPlan plan) { } } } + + if (!noWhere && scanNode.getTableDesc().isExternal() && scanNode.getTableDesc().getPartitionMethod() != null) { — End diff – After fixing that issue, I planned to move to managed table creation. Actually, I don't know Tajo internal managed table file structure. I'm familiar with Hive only.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/434#issuecomment-84623003

        @dongjoon-hyun, thanks for your nice work.
        In overall, the patch looks good, but I want to check my understanding.
        If the ```WHERE``` clause involves only a subset of partition columns, it seems that every partition directory which are not specified in the ```where``` clause will be read.
        Please correct me if I misunderstand something.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on the pull request: https://github.com/apache/tajo/pull/434#issuecomment-84623003 @dongjoon-hyun, thanks for your nice work. In overall, the patch looks good, but I want to check my understanding. If the ```WHERE``` clause involves only a subset of partition columns, it seems that every partition directory which are not specified in the ```where``` clause will be read. Please correct me if I misunderstand something.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/434#discussion_r26903732

        — Diff: tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java —
        @@ -133,11 +134,90 @@ public static boolean checkIfSimpleQuery(LogicalPlan plan) {
        }
        }
        }
        +
        + if (!noWhere && scanNode.getTableDesc().isExternal() && scanNode.getTableDesc().getPartitionMethod() != null) {
        — End diff –

        Ok. If so, would you add a TODO comment?
        It will be nice if it contains the specific issue number and description.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/434#discussion_r26903732 — Diff: tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java — @@ -133,11 +134,90 @@ public static boolean checkIfSimpleQuery(LogicalPlan plan) { } } } + + if (!noWhere && scanNode.getTableDesc().isExternal() && scanNode.getTableDesc().getPartitionMethod() != null) { — End diff – Ok. If so, would you add a TODO comment? It will be nice if it contains the specific issue number and description.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user dongjoon-hyun commented on the pull request:

        https://github.com/apache/tajo/pull/434#issuecomment-84625233

        You're right, but when I can construct prefix from the subset of partition column, this patch handle that as a simple query.
        /table_name/year=x/day=y/hour=z (full path is simple query)
        /table_name/year=x/day=y (prefix path is constructed, so simple query)
        /table_name/ /day=y/hour=z (year is unknown, this is not simple query. It will be handled in an old fashion)
        /table_name/year=x/ /hour=z (day is unknown, this is not simple query, too.)

        Show
        githubbot ASF GitHub Bot added a comment - Github user dongjoon-hyun commented on the pull request: https://github.com/apache/tajo/pull/434#issuecomment-84625233 You're right, but when I can construct prefix from the subset of partition column, this patch handle that as a simple query. /table_name/year=x/day=y/hour=z (full path is simple query) /table_name/year=x/day=y (prefix path is constructed, so simple query) /table_name/ /day=y/hour=z (year is unknown, this is not simple query. It will be handled in an old fashion) /table_name/year=x/ /hour=z (day is unknown, this is not simple query, too.)
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user dongjoon-hyun commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/434#discussion_r26903782

        — Diff: tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java —
        @@ -133,11 +134,90 @@ public static boolean checkIfSimpleQuery(LogicalPlan plan) {
        }
        }
        }
        +
        + if (!noWhere && scanNode.getTableDesc().isExternal() && scanNode.getTableDesc().getPartitionMethod() != null) {
        — End diff –

        No problem.

        Show
        githubbot ASF GitHub Bot added a comment - Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/tajo/pull/434#discussion_r26903782 — Diff: tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java — @@ -133,11 +134,90 @@ public static boolean checkIfSimpleQuery(LogicalPlan plan) { } } } + + if (!noWhere && scanNode.getTableDesc().isExternal() && scanNode.getTableDesc().getPartitionMethod() != null) { — End diff – No problem.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/434#issuecomment-84626314

        Thanks. It's clear to me.
        I left only trivial comments. When you reflect all of them, I'll give +1.
        Anyway, I'm not pushing you, but do you have any plans to support the last two cases?

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on the pull request: https://github.com/apache/tajo/pull/434#issuecomment-84626314 Thanks. It's clear to me. I left only trivial comments. When you reflect all of them, I'll give +1. Anyway, I'm not pushing you, but do you have any plans to support the last two cases?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user dongjoon-hyun commented on the pull request:

        https://github.com/apache/tajo/pull/434#issuecomment-84631294

        For the last two cases, I thought it will need to eval on each tuple, so I'm not sure it's compatible with Tajo philosophy. Do you have some idea in mind?

        Show
        githubbot ASF GitHub Bot added a comment - Github user dongjoon-hyun commented on the pull request: https://github.com/apache/tajo/pull/434#issuecomment-84631294 For the last two cases, I thought it will need to eval on each tuple, so I'm not sure it's compatible with Tajo philosophy. Do you have some idea in mind?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user dongjoon-hyun commented on the pull request:

        https://github.com/apache/tajo/pull/434#issuecomment-84631757

        I changed the code according your comment. (Indentation and TODO documentation with issues)

        Show
        githubbot ASF GitHub Bot added a comment - Github user dongjoon-hyun commented on the pull request: https://github.com/apache/tajo/pull/434#issuecomment-84631757 I changed the code according your comment. (Indentation and TODO documentation with issues)
        Hide
        dongjoon Dongjoon Hyun added a comment -

        Here, I uploaded new patch, too.

        Show
        dongjoon Dongjoon Hyun added a comment - Here, I uploaded new patch, too.
        Hide
        tajoqa Tajo QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12706397/TAJO-1403.patch
        against master revision release-0.9.0-rc0-213-g3aaff38.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The applied patch does not increase the total number of javadoc warnings.

        +1 checkstyle. The patch generated 0 code style errors.

        -1 findbugs. The patch appears to introduce 4 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in tajo-core tajo-plan tajo-storage/tajo-storage-hdfs.

        Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/636//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/636//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-core.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/636//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-storage-hdfs.html
        Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/636//console

        This message is automatically generated.

        Show
        tajoqa Tajo QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706397/TAJO-1403.patch against master revision release-0.9.0-rc0-213-g3aaff38. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The applied patch does not increase the total number of javadoc warnings. +1 checkstyle. The patch generated 0 code style errors. -1 findbugs. The patch appears to introduce 4 new Findbugs (version 2.0.3) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in tajo-core tajo-plan tajo-storage/tajo-storage-hdfs. Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/636//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/636//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-core.html Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/636//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-storage-hdfs.html Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/636//console This message is automatically generated.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/434#issuecomment-84746926

        I think that it is possible to prune unnecessary partitions for the last two cases without hurting our philosophy as follows.
        When visiting a partition directory,

        • if the current directory is leaf, read all files.
        • otherwise,
        • if some conditions are given for the current partition, visit only qualified sub-directory.
        • otherwise, visit all sub-directory.

        Anyway, here is my +1. I'll commit it.
        However, I missed one more comment. EvalTreeUtil is the more proper class for checkIfPartitionSelection() and getPartitionValue() functions rather than PlannerUtil because every function related to EvalNode is located in that class.
        If you don't mind, I'll move those functions to EvalTreeUtil before commit.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on the pull request: https://github.com/apache/tajo/pull/434#issuecomment-84746926 I think that it is possible to prune unnecessary partitions for the last two cases without hurting our philosophy as follows. When visiting a partition directory, if the current directory is leaf, read all files. otherwise, if some conditions are given for the current partition, visit only qualified sub-directory. otherwise, visit all sub-directory. Anyway, here is my +1. I'll commit it. However, I missed one more comment. EvalTreeUtil is the more proper class for checkIfPartitionSelection() and getPartitionValue() functions rather than PlannerUtil because every function related to EvalNode is located in that class. If you don't mind, I'll move those functions to EvalTreeUtil before commit.
        Hide
        jihoonson Jihoon Son added a comment -

        I just moved checkIfPartitionSelection() and getPartitionValue() functions from PlannerUtil to EvalTreeUtil.

        Show
        jihoonson Jihoon Son added a comment - I just moved checkIfPartitionSelection() and getPartitionValue() functions from PlannerUtil to EvalTreeUtil.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/434

        Show
        githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/tajo/pull/434
        Hide
        tajoqa Tajo QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12706435/TAJO-1403_jihoon.patch
        against master revision release-0.9.0-rc0-214-g8d0146b.

        -1 patch. The patch command could not apply the patch.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/640//console

        This message is automatically generated.

        Show
        tajoqa Tajo QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706435/TAJO-1403_jihoon.patch against master revision release-0.9.0-rc0-214-g8d0146b. -1 patch. The patch command could not apply the patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/640//console This message is automatically generated.
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #625 (See https://builds.apache.org/job/Tajo-master-build/625/)
        TAJO-1403: Improve 'Simple Query' with only partition columns and constant values. (jihoonson: rev 8d0146b8d8f5eeac37fe3f531ce1362af3b20c2f)

        • tajo-core/src/main/java/org/apache/tajo/master/exec/QueryExecutor.java
        • tajo-core/src/main/java/org/apache/tajo/master/exec/NonForwardQueryResultFileScanner.java
        • tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileStorageManager.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/expr/EvalTreeUtil.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java
        • CHANGES
        • tajo-core/src/test/java/org/apache/tajo/master/TestGlobalPlanner.java
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Tajo-master-build #625 (See https://builds.apache.org/job/Tajo-master-build/625/ ) TAJO-1403 : Improve 'Simple Query' with only partition columns and constant values. (jihoonson: rev 8d0146b8d8f5eeac37fe3f531ce1362af3b20c2f) tajo-core/src/main/java/org/apache/tajo/master/exec/QueryExecutor.java tajo-core/src/main/java/org/apache/tajo/master/exec/NonForwardQueryResultFileScanner.java tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileStorageManager.java tajo-plan/src/main/java/org/apache/tajo/plan/expr/EvalTreeUtil.java tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java CHANGES tajo-core/src/test/java/org/apache/tajo/master/TestGlobalPlanner.java
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-CODEGEN-build #263 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/263/)
        TAJO-1403: Improve 'Simple Query' with only partition columns and constant values. (jihoonson: rev 8d0146b8d8f5eeac37fe3f531ce1362af3b20c2f)

        • tajo-core/src/test/java/org/apache/tajo/master/TestGlobalPlanner.java
        • tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileStorageManager.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/expr/EvalTreeUtil.java
        • tajo-core/src/main/java/org/apache/tajo/master/exec/QueryExecutor.java
        • tajo-core/src/main/java/org/apache/tajo/master/exec/NonForwardQueryResultFileScanner.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java
        • CHANGES
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Tajo-master-CODEGEN-build #263 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/263/ ) TAJO-1403 : Improve 'Simple Query' with only partition columns and constant values. (jihoonson: rev 8d0146b8d8f5eeac37fe3f531ce1362af3b20c2f) tajo-core/src/test/java/org/apache/tajo/master/TestGlobalPlanner.java tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileStorageManager.java tajo-plan/src/main/java/org/apache/tajo/plan/expr/EvalTreeUtil.java tajo-core/src/main/java/org/apache/tajo/master/exec/QueryExecutor.java tajo-core/src/main/java/org/apache/tajo/master/exec/NonForwardQueryResultFileScanner.java tajo-plan/src/main/java/org/apache/tajo/plan/util/PlannerUtil.java CHANGES
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user dongjoon-hyun commented on the pull request:

        https://github.com/apache/tajo/pull/434#issuecomment-84809864

        Sure, please do that. I will learn from that. And when improving this further, I will start from there.
        Thank you for your kind leading, @jihoonson

        Show
        githubbot ASF GitHub Bot added a comment - Github user dongjoon-hyun commented on the pull request: https://github.com/apache/tajo/pull/434#issuecomment-84809864 Sure, please do that. I will learn from that. And when improving this further, I will start from there. Thank you for your kind leading, @jihoonson
        Hide
        jihoonson Jihoon Son added a comment -

        I put my patch on this Jira issue.
        Thanks for your contribution!

        Show
        jihoonson Jihoon Son added a comment - I put my patch on this Jira issue. Thanks for your contribution!
        Hide
        jihoonson Jihoon Son added a comment -

        Hi Dongjoon Hyun.
        I made a mistake during merging your patch into master.
        You can see that the author of the commit is recorded as my name in the git log.
        However, your contribution is recorded at the change log.
        Truly appologize my mistake.

        Show
        jihoonson Jihoon Son added a comment - Hi Dongjoon Hyun . I made a mistake during merging your patch into master. You can see that the author of the commit is recorded as my name in the git log. However, your contribution is recorded at the change log. Truly appologize my mistake.
        Hide
        dongjoon Dongjoon Hyun added a comment -

        Jihoon Son, that's all right. No problem at all for me.

        Show
        dongjoon Dongjoon Hyun added a comment - Jihoon Son , that's all right. No problem at all for me.

          People

          • Assignee:
            dongjoon Dongjoon Hyun
            Reporter:
            dongjoon Dongjoon Hyun
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development