Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: Catalog
    • Labels:
      None

      Description

      To support table partitioning, the Tajo catalog should store table partitioning metadata. Each partition entry should include the partition table id, partition key ids, partition type (i.e., hash, range, list, or key), partition number, min, max, and hash id.

      1. TAJO-284_10.patch
        116 kB
        Jaehwa Jung
      2. TAJO-284_2.patch
        79 kB
        Jaehwa Jung
      3. TAJO-284_3.patch
        80 kB
        Jaehwa Jung
      4. TAJO-284_4.patch
        81 kB
        Jaehwa Jung
      5. TAJO-284_5.patch
        89 kB
        Jaehwa Jung
      6. TAJO-284_6.patch
        91 kB
        Jaehwa Jung
      7. TAJO-284_7.patch
        112 kB
        Jaehwa Jung
      8. TAJO-284_8.patch
        112 kB
        Jaehwa Jung
      9. TAJO-284_9.patch
        116 kB
        Jaehwa Jung
      10. TAJO-284.patch
        79 kB
        Jaehwa Jung

        Activity

        blrunner Jaehwa Jung added a comment -

        If no one else wants to take this issue, I would like to work on it.

        hyunsik Hyunsik Choi added a comment -

        Feel free to assign yourself

        blrunner Jaehwa Jung added a comment -

        Thanks, Hyunsik. I'll start working on it.

        blrunner Jaehwa Jung added a comment -

        I designed two candidate table schemas for this issue:

        • Master table only
          In this case, the master table holds all the information for a table partition, and the partition expressions are stored as JSON.
          • PARTITION
            Column Name    Column Type    Remark
            TABLE_ID       varchar(255)
            PARTITION_ID   int(11)
            TYPE           char(1)        0:HASH, 1:RANGE, 2:LIST, 3:COLUMN
            EXPRESSIONS    text           written as JSON
        • Master table and expressions table
          The master table holds everything except the expressions. The expressions table holds the partition key expressions.
          • PARTITION
            Column Name    Column Type    Remark
            TABLE_ID       varchar(255)
            PARTITION_ID   int(11)
            TYPE           char(1)        0:HASH, 1:RANGE, 2:LIST, 3:COLUMN
          • PARTITION_EXPRESSION
            Column Name         Column Type    Remark
            PARTITION_ID        int(11)
            EXPRESSION_ID       int(11)
            COLUMN_NAME         varchar(255)
            PARTITION_NUMBERS   int(11)
            OPERAND             char(1)        0:less than, 1:in
            VALUES              varchar(255)

        What do you think of these designs? Other opinions are welcome.

        hyunsik Hyunsik Choi added a comment -

        Here are comments for both designs:

        • The name PARTITIONS would be better than PARTITION.
        • TID is the primary key of TABLES. So, TID would be better than TABLE_ID.
        • The partition name is missing. Each hash and range partition can have its own partition name.

        I prefer the first design because it looks more scalable. As you know, we need to consider more than 10 million partitions. The second design involves a join operation, so it may be less scalable than the first one.

        Additionally, we need to consider the access pattern of a given query. Given a query with some filter conditions, the catalog will have to find the partitions that match those conditions. In particular, the EXPRESSIONS field should be efficiently searchable by range filters.
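As a rough illustration of the access pattern described above, the following Python sketch prunes range partitions for a filter. It is only a sketch under assumptions: the bounds (2, 5, MAXVALUE), the partition names, and the helper names `partition_for` and `prune` are hypothetical, and the data lives in memory rather than in the catalog database.

```python
from bisect import bisect_right

# Hypothetical range partitions: (exclusive upper bound, partition name),
# sorted by bound. float("inf") stands in for MAXVALUE.
RANGE_PARTITIONS = [
    (2, "p0"),
    (5, "p1"),
    (float("inf"), "p2"),
]
BOUNDS = [bound for bound, _ in RANGE_PARTITIONS]


def partition_for(value):
    """Return the partition holding value under VALUES LESS THAN semantics."""
    # bisect_right finds the first bound strictly greater than value.
    return RANGE_PARTITIONS[bisect_right(BOUNDS, value)][1]


def prune(low, high):
    """Return the partitions that may hold rows with low <= col <= high."""
    first = bisect_right(BOUNDS, low)
    last = bisect_right(BOUNDS, high)
    return [name for _, name in RANGE_PARTITIONS[first:last + 1]]


if __name__ == "__main__":
    print(partition_for(3))
    print(prune(1, 4))
```

Because the bounds are sorted, each lookup is a binary search. A sorted index on the stored bound in the underlying RDBMS would support the same kind of search.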

        blrunner Jaehwa Jung added a comment -

        Thanks Hyunsik.

        I agree with your opinion. I think that storing the clause from the given query will be more useful than structured columns.
        So, I modified the design as follows:

        PARTITION
        Column Name      Column Type    Remark
        PARTITION_ID     int(11)
        PARTITION_NAME   varchar(255)
        TID              int(11)
        TYPE             char(1)        0:HASH, 1:RANGE, 2:LIST, 3:COLUMN, and so on
        EXPRESSIONS      text           the partition definition clause from the given query


        A range partition sample is as follows:

        Given query
        CREATE TABLE sales ( col1 int, col2 int)
        PARTITION BY RANGE (col1)
         (
          PARTITION col1 VALUES LESS THAN (2),
          PARTITION col1 VALUES LESS THAN (5),
          PARTITION col1 VALUES LESS THAN (MAXVALUE)
         );
        
        PARTITION
        Column Name      Value
        PARTITION_ID     10
        PARTITION_NAME   RANGE_PARTITION
        TID              1
        TYPE             1
        EXPRESSIONS      PARTITION BY RANGE (col1)
                         (
                         PARTITION col1 VALUES LESS THAN (2),
                         PARTITION col1 VALUES LESS THAN (5),
                         PARTITION col1 VALUES LESS THAN (MAXVALUE)
                         );
        hyunsik Hyunsik Choi added a comment -

        If EXPRESSIONS contains all partitions as you described, it cannot hold a large number of partitions and cannot take advantage of the indexing techniques of the underlying RDBMS. In my opinion, each partition should be one row.

        blrunner Jaehwa Jung added a comment -

        Thanks Hyunsik.

        I modified the design as follows:

        PARTITION
        Column Name      Column Type    Remark
        PARTITION_ID     int(11)
        PARTITION_NAME   varchar(255)   if no partition name is given, Tajo needs to generate one automatically
        TID              int(11)
        TYPE             char(1)        0:HASH, 1:RANGE, 2:LIST, 3:COLUMN, and so on
        COLUMNS          varchar(255)   comma-separated list of partition column ids
        EXPRESSIONS      text           each partition's definition clause from the given query


        A range partition sample is as follows:

        Given query
        CREATE TABLE sales ( member_id int, sale_amt int)
        PARTITION BY RANGE (member_id)
         (
          PARTITION member_q1 VALUES LESS THAN (2),
          PARTITION member_q2 VALUES LESS THAN (5),
          PARTITION member_q3 VALUES LESS THAN (MAXVALUE)
         );
        
        
        PARTITION #1
        Column Name      Value
        PARTITION_ID     10
        PARTITION_NAME   member_q1
        TID              1
        TYPE             1
        COLUMNS          20
        EXPRESSIONS      PARTITION member_q1 VALUES LESS THAN (2)

        PARTITION #2
        Column Name      Value
        PARTITION_ID     11
        PARTITION_NAME   member_q2
        TID              1
        TYPE             1
        COLUMNS          20
        EXPRESSIONS      PARTITION member_q2 VALUES LESS THAN (5)

        PARTITION #3
        Column Name      Value
        PARTITION_ID     12
        PARTITION_NAME   member_q3
        TID              1
        TYPE             1
        COLUMNS          20
        EXPRESSIONS      PARTITION member_q3 VALUES LESS THAN (MAXVALUE)
        hyunsik Hyunsik Choi added a comment -

        The EXPRESSIONS field seems to contain redundant data. PARTITION, VALUES, LESS, THAN, and the partition name may all be unnecessary. I think that the EXPRESSIONS field should contain a single value for range and hash partitions. For list partitions, it should contain only the value list. This approach would make it easy to use the indexes that the RDBMS provides.

        blrunner Jaehwa Jung added a comment -

        OK, I finally understand your point.
        I modified the design again as follows:

        PARTITION
        Column Name      Column Type    Remark
        PARTITION_ID     int(11)
        PARTITION_NAME   varchar(255)   if no partition name is given, Tajo needs to generate one automatically
        TID              int(11)
        TYPE             char(1)        0:HASH, 1:RANGE, 2:LIST, 3:COLUMN, and so on
        COLUMNS          varchar(255)   comma-separated list of partition column ids
        EXPRESSIONS      text           each partition's value from the given query


        A range partition sample is as follows:

        Given query
        CREATE TABLE sales ( member_id int, sale_amt int)
        PARTITION BY RANGE (member_id)
         (
          PARTITION member_q1 VALUES LESS THAN (2),
          PARTITION member_q2 VALUES LESS THAN (5),
          PARTITION member_q3 VALUES LESS THAN (MAXVALUE)
         );
        
        
        PARTITION #1
        Column Name      Value
        PARTITION_ID     10
        PARTITION_NAME   member_q1
        TID              1
        TYPE             1
        COLUMNS          20
        EXPRESSIONS      2

        PARTITION #2
        Column Name      Value
        PARTITION_ID     11
        PARTITION_NAME   member_q2
        TID              1
        TYPE             1
        COLUMNS          20
        EXPRESSIONS      5

        PARTITION #3
        Column Name      Value
        PARTITION_ID     12
        PARTITION_NAME   member_q3
        TID              1
        TYPE             1
        COLUMNS          20
        EXPRESSIONS      MAXVALUE


        A list partition sample is as follows:

        Given query
        CREATE TABLE areas ( area_name text,  area_id int)
        PARTITION BY LIST (area_name)
         (
          PARTITION partition_area_q1 VALUES ('Seoul', '서울'),
          PARTITION partition_area_q2 VALUES ('Busan', '부산')
         );
        
        
        PARTITION #1
        Column Name      Value
        PARTITION_ID     20
        PARTITION_NAME   partition_area_q1
        TID              2
        TYPE             2
        COLUMNS          22
        EXPRESSIONS      'Seoul', '서울'

        PARTITION #2
        Column Name      Value
        PARTITION_ID     21
        PARTITION_NAME   partition_area_q2
        TID              2
        TYPE             2
        COLUMNS          22
        EXPRESSIONS      'Busan', '부산'
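To make the one-row-per-partition layout concrete, here is a small Python sketch that loads the sample rows from this comment into an in-memory SQLite database and looks up the matching partition. It is an illustration under assumptions, not the patch's code: SQLite stands in for the real catalog store, the helper names `find_range_partition` and `find_list_partition` are hypothetical, and the list values are parsed with a naive comma split.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PARTITIONS (
        PARTITION_ID   INTEGER PRIMARY KEY,
        PARTITION_NAME TEXT,
        TID            INTEGER NOT NULL,
        TYPE           INTEGER NOT NULL,  -- 0:HASH, 1:RANGE, 2:LIST, 3:COLUMN
        COLUMNS        TEXT,              -- comma-separated column ids
        EXPRESSIONS    TEXT               -- one bound, or one value list
    )
""")
conn.executemany(
    "INSERT INTO PARTITIONS VALUES (?, ?, ?, ?, ?, ?)",
    [
        (10, "member_q1", 1, 1, "20", "2"),
        (11, "member_q2", 1, 1, "20", "5"),
        (12, "member_q3", 1, 1, "20", "MAXVALUE"),
        (20, "partition_area_q1", 2, 2, "22", "'Seoul', '서울'"),
        (21, "partition_area_q2", 2, 2, "22", "'Busan', '부산'"),
    ],
)


def find_range_partition(tid, value):
    """Pick the range partition with the smallest bound greater than value."""
    row = conn.execute(
        """
        SELECT PARTITION_NAME FROM PARTITIONS
        WHERE TID = ? AND TYPE = 1
          AND (EXPRESSIONS = 'MAXVALUE' OR CAST(EXPRESSIONS AS INTEGER) > ?)
        ORDER BY EXPRESSIONS = 'MAXVALUE', CAST(EXPRESSIONS AS INTEGER)
        LIMIT 1
        """,
        (tid, value),
    ).fetchone()
    return row[0] if row else None


def find_list_partition(tid, value):
    """Pick the list partition whose value list contains value."""
    rows = conn.execute(
        "SELECT PARTITION_NAME, EXPRESSIONS FROM PARTITIONS "
        "WHERE TID = ? AND TYPE = 2 ORDER BY PARTITION_ID",
        (tid,),
    )
    for name, expr in rows:
        # Naive parsing: breaks if a listed value itself contains a comma.
        values = [v.strip().strip("'") for v in expr.split(",")]
        if value in values:
            return name
    return None
```

Note that casting a text column at query time prevents the database from using a plain index on EXPRESSIONS. A typed bound column or an expression index would be one way around that, though the thread does not settle this point.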
        hyunsik Hyunsik Choi added a comment -

        +1 for the schema design. The design looks good to me.

        jihoonson Jihoon Son added a comment -

        +1. This design looks good to me, too.

        blrunner Jaehwa Jung added a comment -

        Thanks, guys.
        I'll start working on it now.

        blrunner Jaehwa Jung added a comment -

        In this patch, I added the following features:

        • Added Protocol Buffers messages: PartitionPro, SpecifierProto, ValueProto.
        • Updated addTable and deleteTable in the classes that implement CatalogStore.
        • Added a parameter carrying partition information to GlobalEngine, LogicalPlan, and LogicalPlanner.

        I also modified the partition table schema as follows:

        Column Name   Type           Remark
        PID           INT            partition id
        name          VARCHAR(255)   partition name
        TID           int            table id
        type          varchar(10)    partition type (e.g., HASH, LIST)
        quantity      int            number of partitions for a hash-partitioned table
        columns       varchar(255)   comma-separated list of column ids
        expressions   text           comma-separated list of subpartition value expressions

        I used the following queries to verify the patch:

        • CREATE TABLE sales ( col1 int, col2 int) ;
        • CREATE TABLE sales_list ( col1 text, col2 text) PARTITION BY LIST (col1) ( PARTITION part_list_1 VALUES ('Seoul', 'Keongkido'), PARTITION part_list_2 VALUES ('Busan', 'Daeku') );
        • CREATE TABLE sales_range ( col1 int, col2 int) PARTITION BY RANGE (col1) ( PARTITION part_range_1 VALUES LESS THAN (2), PARTITION part_range_2 VALUES LESS THAN (5), PARTITION part_range_3 VALUES LESS THAN (MAXVALUE) );
        • CREATE TABLE sales_hash1 ( col1 int, col2 int) PARTITION BY HASH (col1) ( PARTITION part1, PARTITION part2, PARTITION part3 );
        • CREATE TABLE sales_hash2 ( col1 int, col2 int) PARTITION BY HASH (col1) PARTITIONS 2;
        • CREATE TABLE sales_column ( col1 int, col2 int) PARTITION BY COLUMN (col1, col2, col3);
        blrunner Jaehwa Jung added a comment -

        I attached the second patch file.

        hyunsik Hyunsik Choi added a comment -

        I'll take a look at the patch tomorrow.

        blrunner Jaehwa Jung added a comment -

        I attached the third patch. I added a module that verifies the partitioning columns, and I synchronized the patch with the latest repository.

        blrunner Jaehwa Jung added a comment -

        I added code in LogicalPlanner to set the partition on StoreNode.

        blrunner Jaehwa Jung added a comment -

        Could anyone review the patch?

        hyunsik Hyunsik Choi added a comment -

        I'll review this patch by tonight.

        hyunsik Hyunsik Choi added a comment -

        I'm very sorry for the late review.

        For convenience, I moved the patch to the reviewboard.

        hyunsik Hyunsik Choi added a comment - - edited

        I left comments on the reviewboard.
        https://reviews.apache.org/r/15917/

        blrunner Jaehwa Jung added a comment -

        I updated the patch according to the review board comments as follows:

        • Renamed the table from PARTITION to PARTITIONS.
        • Rolled back the type of fieldsByQualifiedName.
        • Deleted ValueProto and modified SpecifierProto.
        • Applied the PartitionType enum type.
        • Added TestCatalog cases.
        • Added TestTajoClient cases.

        Please check the patch again.

        hyunsik Hyunsik Choi added a comment -

        I left some comments on the second patch on the reviewboard.
        https://reviews.apache.org/r/15917/

        blrunner Jaehwa Jung added a comment -

        I modified the patch again as follows:

        • Renamed the Partition class.
        • Removed an unnecessary method.
        • Modified the PARTITIONS column type.
        hyunsik Hyunsik Choi added a comment -

        +1

        The patch looks great to me.

        After this patch, could you please update the documentation to reflect it?
        http://tajo.incubator.apache.org/tajo-0.8.0-doc.html

        To bring the tajo-0.8.0-doc.html file up to date, please just update the markdown file (tajo-project/src/site/markdown/tajo-0.8.0-doc.md).

        hyunsik Hyunsik Choi added a comment -

        I have a question. Should we update the catalog database after this patch?

        blrunner Jaehwa Jung added a comment -

        I modified the patch so that it updates an existing catalog database.
        Please check it again.

        blrunner Jaehwa Jung added a comment -

        Sorry, I renamed the patch.

        blrunner Jaehwa Jung added a comment -

        I updated the patch against the latest repository.

        hyunsik Hyunsik Choi added a comment -

        The patch looks great to me. Additionally, it would be really great if this patch enabled TajoDump to print out the PARTITION clause DDL.

        blrunner Jaehwa Jung added a comment -

        Thanks Hyunsik.
        I updated the patch as follows:

        • Added the partition clause to the TajoDump output.
        • Made the partition name column of the PARTITIONS table nullable.
        • If the user specifies the number of partitions for a hash partition, Tajo creates that many sub-partitions.
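The last point can be pictured with a short Python sketch. This is a hypothetical illustration, not the code in the patch: the function names, the `ordinal` field, and the CRC32-based bucket assignment are assumptions made for the example. Only the general idea, one unnamed sub-partition per requested partition, comes from the list above.

```python
import zlib


def expand_hash_partitions(tid, column_id, quantity):
    """Build one catalog entry per hash sub-partition (names left as None)."""
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    return [
        {
            "name": None,            # nullable partition name
            "tid": tid,
            "type": "HASH",
            "quantity": quantity,
            "columns": str(column_id),
            "ordinal": i,            # hypothetical position of the sub-partition
        }
        for i in range(quantity)
    ]


def hash_bucket(value, quantity):
    """Map a column value to a sub-partition ordinal using a stable hash."""
    # zlib.crc32 is used because Python's built-in hash() of str is salted
    # per process and would not give repeatable buckets.
    return zlib.crc32(str(value).encode("utf-8")) % quantity


if __name__ == "__main__":
    # Corresponds to: PARTITION BY HASH (col1) PARTITIONS 2
    for entry in expand_hash_partitions(tid=3, column_id=30, quantity=2):
        print(entry)
    print(hash_bucket(42, 2))
```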
        blrunner Jaehwa Jung added a comment -

        I updated the patch to synchronize it with the latest repository.

        hyunsik Hyunsik Choi added a comment -

        +1
        Great job. Ship it!

        blrunner Jaehwa Jung added a comment -

        Thanks Hyunsik Choi.
        I've just committed now.

        jihoonson Jihoon Son added a comment -

        Great work!

        blrunner Jaehwa Jung added a comment -

        Thanks Jihoon Son.
        I'm glad to finish this issue.

        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-trunk-postcommit #584 (See https://builds.apache.org/job/Tajo-trunk-postcommit/584/)
        TAJO-284: Add table partitioning entry to Catalog. (jaehwa) (jhjung: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=0b0de13b2444f9e75ad3e1f42cba51ddb1f86dc2)

        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalPlan.java
        • tajo-catalog/tajo-catalog-server/src/test/java/org/apache/tajo/catalog/TestCatalog.java
        • tajo-catalog/tajo-catalog-server/src/main/java/org/apache/tajo/catalog/store/MySQLStore.java
        • tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/partition/Partitions.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/CreateTableNode.java
        • tajo-catalog/tajo-catalog-server/src/main/java/org/apache/tajo/catalog/CatalogServer.java
        • tajo-catalog/tajo-catalog-server/src/main/java/org/apache/tajo/catalog/store/AbstractDBStore.java
        • tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/CatalogUtil.java
        • tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/DDLBuilder.java
        • tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/Schema.java
        • tajo-catalog/tajo-catalog-server/src/test/java/org/apache/tajo/catalog/TestDBStore.java
        • tajo-catalog/tajo-catalog-common/src/main/proto/CatalogProtos.proto
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/client/TestTajoClient.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java
        • tajo-catalog/tajo-catalog-server/pom.xml
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/StoreTableNode.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/TajoMasterClientService.java
        • tajo-algebra/src/main/java/org/apache/tajo/algebra/CreateTable.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/GlobalEngine.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/cli/TajoCli.java
        • tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/CatalogConstants.java
        • tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/partition/Specifier.java
        • tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/TableDesc.java
        • tajo-catalog/tajo-catalog-server/src/main/java/org/apache/tajo/catalog/store/DerbyStore.java
        • CHANGES.txt
        • tajo-core/tajo-core-backend/src/main/proto/ClientProtos.proto
        coderplay Min Zhou added a comment -

        One question, what is the meaning of isOmitValues in PartitionDesc?

        hyunsik Hyunsik Choi added a comment -

        isOmitValues is an obsolete variable. Initially, we considered both a basic Hive-style column partition and an extended Hive-style column partition that also stores the partition column values. isOmitValues was a flag to distinguish them.

        However, the extended Hive-style partition was dropped, so we should remove the isOmitValues variable.


          People

          • Assignee:
            blrunner Jaehwa Jung
            Reporter:
            hyunsik Hyunsik Choi
          • Votes:
            0
            Watchers:
            4
