  Tajo / TAJO-897

PartitionedTableRewriter is repeated several times with same table.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels: None

      Description

      See the title.
      If a query has a block that contains a partitioned table, PartitionedTableRewriter runs several times on that table. On the first run, after finding the partition paths, PartitionedTableRewriter removes the partition filter condition. On the next run no filter is left, so all partitions are selected for scanning.
      I ran the following query. The customer_parts table is partitioned by c_nationkey.

      select a.c_custkey, b.c_custkey from 
       (select c_custkey, c_nationkey from customer_parts where c_nationkey < 0 
       union all 
        select c_custkey, c_nationkey from customer_parts where c_nationkey < 0 
      ) a
      left outer join customer_parts b
      on a.c_custkey = b.c_custkey 
      and a.c_nationkey > 0
      
      =======================================================
      Block Id: eb_1404224996147_0002_000001 [LEAF]
      =======================================================
      
      [Outgoing]
      [q_1404224996147_0002] 1 => 3 (type=HASH_SHUFFLE, key=default.a.c_custkey (INT4), num=32)
      
      TABLE_SUBQUERY(19) as default.a
        => Targets: default.a.c_custkey (INT4) as default.a.c_custkey
        => out schema: {(1) default.a.c_custkey (INT4)}
        => in  schema: {(2) default.a.c_custkey (INT4),default.a.c_nationkey (INT4)}
         PARTITIONS_SCAN(16) on default.customer_parts
           => target list: default.customer_parts.c_custkey (INT4), default.customer_parts.c_nationkey (INT4)
           => num of filtered paths: 5
           => out schema: {(2) default.customer_parts.c_custkey (INT4),default.customer_parts.c_nationkey (INT4)}
           => in schema: {(7) default.customer_parts.c_custkey (INT4),default.customer_parts.c_name (TEXT),default.customer_parts.c_address (TEXT),default.customer_parts.c_phone (TEXT),default.customer_parts.c_acctbal (FLOAT8),default.customer_parts.c_mktsegment (TEXT),default.customer_parts.c_comment (TEXT)}
           => 0: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=1
           => 1: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=13
           => 2: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=15
           => 3: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=3
           => 4: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=4
      
      =======================================================
      Block Id: eb_1404224996147_0002_000002 [LEAF]
      =======================================================
      
      [Outgoing]
      [q_1404224996147_0002] 2 => 3 (type=HASH_SHUFFLE, key=default.a.c_custkey (INT4), num=32)
      
      TABLE_SUBQUERY(20) as default.a
        => Targets: default.a.c_custkey (INT4)
        => out schema: {(1) default.a.c_custkey (INT4)}
        => in  schema: {(2) default.a.c_custkey (INT4),default.a.c_nationkey (INT4)}
         PARTITIONS_SCAN(17) on default.customer_parts
           => target list: default.customer_parts.c_custkey (INT4), default.customer_parts.c_nationkey (INT4)
           => num of filtered paths: 5
           => out schema: {(2) default.customer_parts.c_custkey (INT4),default.customer_parts.c_nationkey (INT4)}
           => in schema: {(7) default.customer_parts.c_custkey (INT4),default.customer_parts.c_name (TEXT),default.customer_parts.c_address (TEXT),default.customer_parts.c_phone (TEXT),default.customer_parts.c_acctbal (FLOAT8),default.customer_parts.c_mktsegment (TEXT),default.customer_parts.c_comment (TEXT)}
           => 0: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=1
           => 1: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=13
           => 2: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=15
           => 3: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=3
           => 4: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=4
      
      =======================================================
      Block Id: eb_1404224996147_0002_000004 [LEAF]
      =======================================================
      
      [Outgoing]
      [q_1404224996147_0002] 4 => 3 (type=HASH_SHUFFLE, key=default.b.c_custkey (INT4), num=32)
      
      PARTITIONS_SCAN(15) on default.customer_parts
        => target list: default.b.c_custkey (INT4)
        => num of filtered paths: 5
        => out schema: {(1) default.b.c_custkey (INT4)}
        => in schema: {(7) default.b.c_custkey (INT4),default.b.c_name (TEXT),default.b.c_address (TEXT),default.b.c_phone (TEXT),default.b.c_acctbal (FLOAT8),default.b.c_mktsegment (TEXT),default.b.c_comment (TEXT)}
        => 0: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=1
        => 1: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=13
        => 2: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=15
        => 3: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=3
        => 4: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=4
      
      =======================================================
      Block Id: eb_1404224996147_0002_000003 [ROOT]
      =======================================================
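
The failure mode described above can be illustrated with a small, self-contained Java sketch. This is not the Tajo code and not the committed patch: `RewriterSketch`, `ScanNode`, `rewriteUnguarded`, and `rewriteGuarded` are hypothetical names, and the "skip a scan that was already pruned" guard is only one possible way to make the pass safe to repeat. The partition keys (1, 13, 15, 3, 4) and the `c_nationkey < 0` filter are taken from the query and plan in this report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

public class RewriterSketch {

    /** Hypothetical scan node over a table partitioned by one integer key. */
    static final class ScanNode {
        IntPredicate partitionFilter;      // set to null once pruning has consumed it
        List<Integer> selectedPartitions;  // null until the scan has been pruned

        ScanNode(IntPredicate partitionFilter) {
            this.partitionFilter = partitionFilter;
        }
    }

    /** Unguarded pass: prunes on every call, then drops the filter. */
    static void rewriteUnguarded(ScanNode scan, List<Integer> allPartitions) {
        List<Integer> selected = new ArrayList<>();
        for (int key : allPartitions) {
            // With no filter left, every partition passes.
            if (scan.partitionFilter == null || scan.partitionFilter.test(key)) {
                selected.add(key);
            }
        }
        scan.selectedPartitions = selected;
        scan.partitionFilter = null;
    }

    /** Guarded pass (assumed fix): a scan that was already pruned is left alone. */
    static void rewriteGuarded(ScanNode scan, List<Integer> allPartitions) {
        if (scan.selectedPartitions != null) {
            return;
        }
        rewriteUnguarded(scan, allPartitions);
    }

    public static void main(String[] args) {
        List<Integer> allPartitions = Arrays.asList(1, 13, 15, 3, 4);

        ScanNode unguarded = new ScanNode(key -> key < 0);
        rewriteUnguarded(unguarded, allPartitions);
        int afterFirstPass = unguarded.selectedPartitions.size();
        rewriteUnguarded(unguarded, allPartitions);
        int afterSecondPass = unguarded.selectedPartitions.size();
        System.out.println("unguarded: " + afterFirstPass + " then " + afterSecondPass);

        ScanNode guarded = new ScanNode(key -> key < 0);
        rewriteGuarded(guarded, allPartitions);
        rewriteGuarded(guarded, allPartitions);
        System.out.println("guarded: " + guarded.selectedPartitions.size());
    }
}
```

In this sketch the unguarded pass selects no partitions on the first run and all five on the second, which mirrors the "num of filtered paths: 5" lines in the plan for the two scans filtered by `c_nationkey < 0`.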
      

        Activity

        githubbot ASF GitHub Bot added a comment -

        GitHub user babokim opened a pull request:

        https://github.com/apache/tajo/pull/54

        TAJO-897: PartitionedTableRewriter is repeated several times with same table.

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/babokim/tajo TAJO-897

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/54.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #54


        commit e676c8fabe69caf62112d8a23f2325de8c14f1d4
        Author: 김형준 <babokim@babokim-macbook-pro.local>
        Date: 2014-07-03T18:09:36Z

        TAJO-897: PartitionedTableRewriter is repeated several times with same table.


        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/54#issuecomment-48136430

        +1

        The patch looks good to me. It's a nice finding.

        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/54

        hyunsik Hyunsik Choi added a comment -

        Committed to the master branch.

        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #275 (See https://builds.apache.org/job/Tajo-master-build/275/)
        TAJO-897: PartitionedTableRewriter is repeated several times with same table. (Hyoungjun Kim via hyunsik) (hyunsik: rev 844ffd7d2428209292e41dadbfed19ee03c37deb)

        • CHANGES
        • tajo-core/src/test/java/org/apache/tajo/engine/query/TestJoinOnPartitionedTables.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/rewrite/PartitionedTableRewriter.java

          People

          • Assignee: Hyoungjun Kim (hjkim)
          • Reporter: Hyoungjun Kim (hjkim)
          • Votes: 0
          • Watchers: 3
