Details

      Description

      The synopsis is as follows.

      CREATE [ UNIQUE ] INDEX [ name ] ON table [ USING method ]
          ( { column | ( expression ) } [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
          [ WHERE predicate ]
      
      1. TAJO-836_2.patch
        45 kB
        Jihoon Son
      2. TAJO-836_3.patch
        45 kB
        Jihoon Son
      3. TAJO-836.patch
        44 kB
        Jihoon Son

        Activity

        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/74#issuecomment-49548724

        Thank you for your work.

        BTW, while I'm reviewing this patch, I found that this patch still includes TAJO-836 which was committed to index_support branch before 10 days. Could you check this again?

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on the pull request: https://github.com/apache/tajo/pull/74#issuecomment-49548724 Thank you for your work. BTW, while I'm reviewing this patch, I found that this patch still includes TAJO-836 which was committed to index_support branch before 10 days. Could you check this again?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson closed the pull request at:

        https://github.com/apache/tajo/pull/15

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson closed the pull request at: https://github.com/apache/tajo/pull/15
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/15#issuecomment-48582570

        I reflected Hyunsik's comment.
        If there isn't any objection, I'll commit to the separate branch shortly.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on the pull request: https://github.com/apache/tajo/pull/15#issuecomment-48582570 I reflected Hyunsik's comment. If there isn't any objection, I'll commit to the separate branch shortly.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/15#issuecomment-48072122

        Thanks for your review!
        I'll commit after reflect your comment.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on the pull request: https://github.com/apache/tajo/pull/15#issuecomment-48072122 Thanks for your review! I'll commit after reflect your comment.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/15#issuecomment-48003561

        While this feature is incomplete, it would be nice if we work it in a separate branch.

        If so, here is my +1. The patch looks ready to be committed to the branch.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on the pull request: https://github.com/apache/tajo/pull/15#issuecomment-48003561 While this feature is incomplete, it would be nice if we work it in a separate branch. If so, here is my +1. The patch looks ready to be committed to the branch.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/15#issuecomment-48001823

        Thanks for your comment. In this patch, I just intended supporting the index creation. I hope that IndexScan works well because it is already used for range partition. In my opinion, we can test index scan for the various case after https://issues.apache.org/jira/browse/TAJO-838.

        However, as you said, this patch lacks unit tests. So, it would be better to progress in another branch instead of master. What do you think about it?

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on the pull request: https://github.com/apache/tajo/pull/15#issuecomment-48001823 Thanks for your comment. In this patch, I just intended supporting the index creation. I hope that IndexScan works well because it is already used for range partition. In my opinion, we can test index scan for the various case after https://issues.apache.org/jira/browse/TAJO-838 . However, as you said, this patch lacks unit tests. So, it would be better to progress in another branch instead of master. What do you think about it?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/15#discussion_r14544570

        — Diff: tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/StoreIndexExec.java —
        @@ -0,0 +1,110 @@
        +/*
        + * Licensed to the Apache Software Foundation (ASF) under one
        + * or more contributor license agreements. See the NOTICE file
        + * distributed with this work for additional information
        + * regarding copyright ownership. The ASF licenses this file
        + * to you under the Apache License, Version 2.0 (the
        + * "License"); you may not use this file except in compliance
        + * with the License. You may obtain a copy of the License at
        + *
        + * http://www.apache.org/licenses/LICENSE-2.0
        + *
        + * Unless required by applicable law or agreed to in writing, software
        + * distributed under the License is distributed on an "AS IS" BASIS,
        + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        + * See the License for the specific language governing permissions and
        + * limitations under the License.
        + */
        +
        +package org.apache.tajo.engine.planner.physical;
        +
        +import org.apache.commons.logging.Log;
        +import org.apache.commons.logging.LogFactory;
        +import org.apache.hadoop.fs.Path;
        +import org.apache.hadoop.io.IOUtils;
        +import org.apache.tajo.catalog.Column;
        +import org.apache.tajo.catalog.Schema;
        +import org.apache.tajo.catalog.SortSpec;
        +import org.apache.tajo.conf.TajoConf;
        +import org.apache.tajo.conf.TajoConf.ConfVars;
        +import org.apache.tajo.engine.planner.PlannerUtil;
        +import org.apache.tajo.engine.planner.logical.CreateIndexNode;
        +import org.apache.tajo.storage.RowStoreUtil;
        +import org.apache.tajo.storage.Tuple;
        +import org.apache.tajo.storage.TupleComparator;
        +import org.apache.tajo.storage.VTuple;
        +import org.apache.tajo.storage.index.bst.BSTIndex;
        +import org.apache.tajo.storage.index.bst.BSTIndex.BSTIndexWriter;
        +import org.apache.tajo.worker.TaskAttemptContext;
        +
        +import java.io.IOException;
        +
        +public class StoreIndexExec extends UnaryPhysicalExec {
        + private static final Log LOG = LogFactory.getLog(StoreIndexExec.class);
        + private BSTIndexWriter indexWriter;
        + private final CreateIndexNode logicalPlan;
        + private int[] indexKeys = null;
        + private Schema keySchema;
        + private TupleComparator comparator;
        +
        + public StoreIndexExec(final TaskAttemptContext context, final CreateIndexNode logicalPlan,
        + final PhysicalExec child)

        { + super(context, logicalPlan.getInSchema(), logicalPlan.getOutSchema(), child); + this.logicalPlan = logicalPlan; + }

        +
        + @Override
        + public void init() throws IOException {
        + super.init();
        +
        + SortSpec[] sortSpecs = logicalPlan.getSortSpecs();
        + indexKeys = new int[sortSpecs.length];
        + keySchema = PlannerUtil.sortSpecsToSchema(sortSpecs);
        +
        + Column col;
        + for (int i = 0 ; i < sortSpecs.length; i++)

        { + col = sortSpecs[i].getSortKey(); + indexKeys[i] = inSchema.getColumnId(col.getQualifiedName()); + }

        +
        + TajoConf conf = new TajoConf();
        +
        + Path indexPath = new Path(conf.getVar(ConfVars.INDEX_DIR), logicalPlan.getIndexName() + "/" +
        — End diff –

        Hi Hyunsik,
        I also deeply thought about the directory structure.
        In MHO, your suggestion might cause unnecessary operations while reading table data. That is, the scanner should check that the file(or directory) is index or table data.

        Also, in PostgreSQL, catalog maintains indexes as well as tables as relations. In other words, indexes are treated a kind of relations. This policy prevents any indexes in a database from having the same name.

        So, how about the directory structure of ```$

        {DATABASE_NAME}

        /$

        {INDEX_NAME}

        ```? I think that this structure can help the index management and also dose not disturb other operations.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/15#discussion_r14544570 — Diff: tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/StoreIndexExec.java — @@ -0,0 +1,110 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.tajo.engine.planner.physical; + +import org.apache.commons.logging.Log; +import org.apache.commons.logging.LogFactory; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.io.IOUtils; +import org.apache.tajo.catalog.Column; +import org.apache.tajo.catalog.Schema; +import org.apache.tajo.catalog.SortSpec; +import org.apache.tajo.conf.TajoConf; +import org.apache.tajo.conf.TajoConf.ConfVars; +import org.apache.tajo.engine.planner.PlannerUtil; +import org.apache.tajo.engine.planner.logical.CreateIndexNode; +import org.apache.tajo.storage.RowStoreUtil; +import org.apache.tajo.storage.Tuple; +import org.apache.tajo.storage.TupleComparator; +import org.apache.tajo.storage.VTuple; +import org.apache.tajo.storage.index.bst.BSTIndex; +import org.apache.tajo.storage.index.bst.BSTIndex.BSTIndexWriter; +import org.apache.tajo.worker.TaskAttemptContext; + +import java.io.IOException; + +public class StoreIndexExec extends UnaryPhysicalExec { + private static final Log LOG = LogFactory.getLog(StoreIndexExec.class); + private BSTIndexWriter indexWriter; + private final CreateIndexNode logicalPlan; + private int[] indexKeys = null; + private Schema keySchema; + private TupleComparator comparator; + + public StoreIndexExec(final TaskAttemptContext context, final CreateIndexNode logicalPlan, + final PhysicalExec child) { + super(context, logicalPlan.getInSchema(), logicalPlan.getOutSchema(), child); + this.logicalPlan = logicalPlan; + } + + @Override + public void init() throws IOException { + super.init(); + + SortSpec[] sortSpecs = logicalPlan.getSortSpecs(); + indexKeys = new int [sortSpecs.length] ; + keySchema = PlannerUtil.sortSpecsToSchema(sortSpecs); + + Column col; + for (int i = 0 ; i < sortSpecs.length; i++) { + col = sortSpecs[i].getSortKey(); + indexKeys[i] = inSchema.getColumnId(col.getQualifiedName()); + } + + TajoConf conf = new TajoConf(); + + Path indexPath = new Path(conf.getVar(ConfVars.INDEX_DIR), logicalPlan.getIndexName() + "/" + — End diff – Hi Hyunsik, I also deeply thought about the directory structure. In MHO, your suggestion might cause unnecessary operations while reading table data. That is, the scanner should check that the file(or directory) is index or table data. Also, in PostgreSQL, catalog maintains indexes as well as tables as relations. In other words, indexes are treated a kind of relations. This policy prevents any indexes in a database from having the same name. So, how about the directory structure of ```$ {DATABASE_NAME} /$ {INDEX_NAME} ```? I think that this structure can help the index management and also dose not disturb other operations.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/15#issuecomment-47933752

        The patch looks good to me. I have one question. After this patch is applied, can we use IndexScan right now? If so, we need to add more unit tests for scan using BST index.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on the pull request: https://github.com/apache/tajo/pull/15#issuecomment-47933752 The patch looks good to me. I have one question. After this patch is applied, can we use IndexScan right now? If so, we need to add more unit tests for scan using BST index.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/15#discussion_r14514922

        — Diff: tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/StoreIndexExec.java —
        @@ -0,0 +1,110 @@
        +/*
        + * Licensed to the Apache Software Foundation (ASF) under one
        + * or more contributor license agreements. See the NOTICE file
        + * distributed with this work for additional information
        + * regarding copyright ownership. The ASF licenses this file
        + * to you under the Apache License, Version 2.0 (the
        + * "License"); you may not use this file except in compliance
        + * with the License. You may obtain a copy of the License at
        + *
        + * http://www.apache.org/licenses/LICENSE-2.0
        + *
        + * Unless required by applicable law or agreed to in writing, software
        + * distributed under the License is distributed on an "AS IS" BASIS,
        + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        + * See the License for the specific language governing permissions and
        + * limitations under the License.
        + */
        +
        +package org.apache.tajo.engine.planner.physical;
        +
        +import org.apache.commons.logging.Log;
        +import org.apache.commons.logging.LogFactory;
        +import org.apache.hadoop.fs.Path;
        +import org.apache.hadoop.io.IOUtils;
        +import org.apache.tajo.catalog.Column;
        +import org.apache.tajo.catalog.Schema;
        +import org.apache.tajo.catalog.SortSpec;
        +import org.apache.tajo.conf.TajoConf;
        +import org.apache.tajo.conf.TajoConf.ConfVars;
        +import org.apache.tajo.engine.planner.PlannerUtil;
        +import org.apache.tajo.engine.planner.logical.CreateIndexNode;
        +import org.apache.tajo.storage.RowStoreUtil;
        +import org.apache.tajo.storage.Tuple;
        +import org.apache.tajo.storage.TupleComparator;
        +import org.apache.tajo.storage.VTuple;
        +import org.apache.tajo.storage.index.bst.BSTIndex;
        +import org.apache.tajo.storage.index.bst.BSTIndex.BSTIndexWriter;
        +import org.apache.tajo.worker.TaskAttemptContext;
        +
        +import java.io.IOException;
        +
        +public class StoreIndexExec extends UnaryPhysicalExec {
        + private static final Log LOG = LogFactory.getLog(StoreIndexExec.class);
        + private BSTIndexWriter indexWriter;
        + private final CreateIndexNode logicalPlan;
        + private int[] indexKeys = null;
        + private Schema keySchema;
        + private TupleComparator comparator;
        +
        + public StoreIndexExec(final TaskAttemptContext context, final CreateIndexNode logicalPlan,
        + final PhysicalExec child)

        { + super(context, logicalPlan.getInSchema(), logicalPlan.getOutSchema(), child); + this.logicalPlan = logicalPlan; + }

        +
        + @Override
        + public void init() throws IOException {
        + super.init();
        +
        + SortSpec[] sortSpecs = logicalPlan.getSortSpecs();
        + indexKeys = new int[sortSpecs.length];
        + keySchema = PlannerUtil.sortSpecsToSchema(sortSpecs);
        +
        + Column col;
        + for (int i = 0 ; i < sortSpecs.length; i++)

        { + col = sortSpecs[i].getSortKey(); + indexKeys[i] = inSchema.getColumnId(col.getQualifiedName()); + }

        +
        + TajoConf conf = new TajoConf();
        +
        + Path indexPath = new Path(conf.getVar(ConfVars.INDEX_DIR), logicalPlan.getIndexName() + "/" +
        — End diff –

        It seems to write index on ```$

        {INDEX_DIR}

        /$

        {INDEX NAME}

        ```. It would be better if index is located as the following directory hierarchy: ```$

        {DATABASE_NAME}

        /$

        {TABLE_NAME}

        /$

        {INDEX_NAME}

        ```.

        There are one or more indices against one table. The proposed directory hierarchy will enable Tajo to manage indices more effectively.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/15#discussion_r14514922 — Diff: tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/StoreIndexExec.java — @@ -0,0 +1,110 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.tajo.engine.planner.physical; + +import org.apache.commons.logging.Log; +import org.apache.commons.logging.LogFactory; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.io.IOUtils; +import org.apache.tajo.catalog.Column; +import org.apache.tajo.catalog.Schema; +import org.apache.tajo.catalog.SortSpec; +import org.apache.tajo.conf.TajoConf; +import org.apache.tajo.conf.TajoConf.ConfVars; +import org.apache.tajo.engine.planner.PlannerUtil; +import org.apache.tajo.engine.planner.logical.CreateIndexNode; +import org.apache.tajo.storage.RowStoreUtil; +import org.apache.tajo.storage.Tuple; +import org.apache.tajo.storage.TupleComparator; +import org.apache.tajo.storage.VTuple; +import org.apache.tajo.storage.index.bst.BSTIndex; +import org.apache.tajo.storage.index.bst.BSTIndex.BSTIndexWriter; +import org.apache.tajo.worker.TaskAttemptContext; + +import java.io.IOException; + +public class StoreIndexExec extends UnaryPhysicalExec { + private static final Log LOG = LogFactory.getLog(StoreIndexExec.class); + private BSTIndexWriter indexWriter; + private final CreateIndexNode logicalPlan; + private int[] indexKeys = null; + private Schema keySchema; + private TupleComparator comparator; + + public StoreIndexExec(final TaskAttemptContext context, final CreateIndexNode logicalPlan, + final PhysicalExec child) { + super(context, logicalPlan.getInSchema(), logicalPlan.getOutSchema(), child); + this.logicalPlan = logicalPlan; + } + + @Override + public void init() throws IOException { + super.init(); + + SortSpec[] sortSpecs = logicalPlan.getSortSpecs(); + indexKeys = new int [sortSpecs.length] ; + keySchema = PlannerUtil.sortSpecsToSchema(sortSpecs); + + Column col; + for (int i = 0 ; i < sortSpecs.length; i++) { + col = sortSpecs[i].getSortKey(); + indexKeys[i] = inSchema.getColumnId(col.getQualifiedName()); + } + + TajoConf conf = new TajoConf(); + + Path indexPath = new Path(conf.getVar(ConfVars.INDEX_DIR), logicalPlan.getIndexName() + "/" + — End diff – It seems to write index on ```$ {INDEX_DIR} /$ {INDEX NAME} ```. It would be better if index is located as the following directory hierarchy: ```$ {DATABASE_NAME} /$ {TABLE_NAME} /$ {INDEX_NAME} ```. There are one or more indices against one table. The proposed directory hierarchy will enable Tajo to manage indices more effectively.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/15#issuecomment-47857053

        I updated the patch against the master branch.
        Please review this patch.
        Thanks.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on the pull request: https://github.com/apache/tajo/pull/15#issuecomment-47857053 I updated the patch against the master branch. Please review this patch. Thanks.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/15#discussion_r14493505

        — Diff: tajo-core/src/main/antlr4/org/apache/tajo/engine/parser/SQLParser.g4 —
        @@ -68,8 +68,8 @@ schema_statement
        ;

        index_statement

        • : CREATE (u=UNIQUE)? INDEX n=identifier ON t=table_name (m=method_specifier)?
        • LEFT_PAREN s=sort_specifier_list RIGHT_PAREN p=param_clause?
          + : CREATE (u=UNIQUE)? INDEX identifier ON table_name (method_specifier)?
          + LEFT_PAREN sort_specifier_list RIGHT_PAREN param_clause? (where_clause)?
            • End diff –

        ```where_clause``` is already supported in this patch.
        CreateIndexNode can have ScanNode as its child.
        The ```where_clause``` is evaluated in Scan.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/15#discussion_r14493505 — Diff: tajo-core/src/main/antlr4/org/apache/tajo/engine/parser/SQLParser.g4 — @@ -68,8 +68,8 @@ schema_statement ; index_statement : CREATE (u=UNIQUE)? INDEX n=identifier ON t=table_name (m=method_specifier)? LEFT_PAREN s=sort_specifier_list RIGHT_PAREN p=param_clause? + : CREATE (u=UNIQUE)? INDEX identifier ON table_name (method_specifier)? + LEFT_PAREN sort_specifier_list RIGHT_PAREN param_clause? (where_clause)? End diff – ```where_clause``` is already supported in this patch. CreateIndexNode can have ScanNode as its child. The ```where_clause``` is evaluated in Scan.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/15#discussion_r14493424

        — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java —
        @@ -150,6 +150,8 @@ public static int setDateOrder(int dateOrder) {
        SYSTEM_CONF_PATH("tajo.system-conf.path", EMPTY_VALUE),
        SYSTEM_CONF_REPLICA_COUNT("tajo.system-conf.replica-count", 20),

        + INDEX_DIR("tajo.index.rootdir", "file:///tmp/tajo-$

        {user.name}

        /index"),
        — End diff –

        @hyunsik sorry for late reply.
        I also intended that INDEX_DIR will be the directory structure.
        Actually, several index files for each HDFS block are created in the specified index directory.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/15#discussion_r14493424 — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java — @@ -150,6 +150,8 @@ public static int setDateOrder(int dateOrder) { SYSTEM_CONF_PATH("tajo.system-conf.path", EMPTY_VALUE), SYSTEM_CONF_REPLICA_COUNT("tajo.system-conf.replica-count", 20), + INDEX_DIR("tajo.index.rootdir", "file:///tmp/tajo-$ {user.name} /index"), — End diff – @hyunsik sorry for late reply. I also intended that INDEX_DIR will be the directory structure. Actually, several index files for each HDFS block are created in the specified index directory.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/15#discussion_r14166576

        — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java —
        @@ -150,6 +150,8 @@ public static int setDateOrder(int dateOrder) {
        SYSTEM_CONF_PATH("tajo.system-conf.path", EMPTY_VALUE),
        SYSTEM_CONF_REPLICA_COUNT("tajo.system-conf.replica-count", 20),

        + INDEX_DIR("tajo.index.rootdir", "file:///tmp/tajo-$

        {user.name}

        /index"),
        — End diff –

        What do you think the directory structure? Will INDEX_DIR have a number of sub directories, each of which corresponds to database and tables?

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/15#discussion_r14166576 — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java — @@ -150,6 +150,8 @@ public static int setDateOrder(int dateOrder) { SYSTEM_CONF_PATH("tajo.system-conf.path", EMPTY_VALUE), SYSTEM_CONF_REPLICA_COUNT("tajo.system-conf.replica-count", 20), + INDEX_DIR("tajo.index.rootdir", "file:///tmp/tajo-$ {user.name} /index"), — End diff – What do you think the directory structure? Will INDEX_DIR have a number of sub directories, each of which corresponds to database and tables?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/15#discussion_r14166548

        — Diff: tajo-core/src/main/antlr4/org/apache/tajo/engine/parser/SQLParser.g4 —
        @@ -68,8 +68,8 @@ schema_statement
        ;

        index_statement

        • : CREATE (u=UNIQUE)? INDEX n=identifier ON t=table_name (m=method_specifier)?
        • LEFT_PAREN s=sort_specifier_list RIGHT_PAREN p=param_clause?
          + : CREATE (u=UNIQUE)? INDEX identifier ON table_name (method_specifier)?
          + LEFT_PAREN sort_specifier_list RIGHT_PAREN param_clause? (where_clause)?
            • End diff –

        If this patch does not support ```(where_clause)?``` now, how about adding it when it needed?

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/15#discussion_r14166548 — Diff: tajo-core/src/main/antlr4/org/apache/tajo/engine/parser/SQLParser.g4 — @@ -68,8 +68,8 @@ schema_statement ; index_statement : CREATE (u=UNIQUE)? INDEX n=identifier ON t=table_name (m=method_specifier)? LEFT_PAREN s=sort_specifier_list RIGHT_PAREN p=param_clause? + : CREATE (u=UNIQUE)? INDEX identifier ON table_name (method_specifier)? + LEFT_PAREN sort_specifier_list RIGHT_PAREN param_clause? (where_clause)? End diff – If this patch does not support ```(where_clause)?``` now, how about adding it when it needed?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/15#discussion_r13635248

        — Diff: tajo-core/src/main/antlr4/org/apache/tajo/engine/parser/SQLParser.g4 —
        @@ -68,8 +68,8 @@ schema_statement
        ;

        index_statement

        • : CREATE (u=UNIQUE)? INDEX n=identifier ON t=table_name (m=method_specifier)?
        • LEFT_PAREN s=sort_specifier_list RIGHT_PAREN p=param_clause?
          + : CREATE (u=UNIQUE)? INDEX identifier ON table_name (method_specifier)?
          + LEFT_PAREN sort_specifier_list RIGHT_PAREN param_clause? (where_clause)?
            • End diff –

        Right. That's what I mean.
        Here is the document of the create index statement in Postgresql.
        http://www.postgresql.org/docs/9.1/static/sql-createindex.html
        I think that we will add the tablespace parameter later.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/15#discussion_r13635248 — Diff: tajo-core/src/main/antlr4/org/apache/tajo/engine/parser/SQLParser.g4 — @@ -68,8 +68,8 @@ schema_statement ; index_statement : CREATE (u=UNIQUE)? INDEX n=identifier ON t=table_name (m=method_specifier)? LEFT_PAREN s=sort_specifier_list RIGHT_PAREN p=param_clause? + : CREATE (u=UNIQUE)? INDEX identifier ON table_name (method_specifier)? + LEFT_PAREN sort_specifier_list RIGHT_PAREN param_clause? (where_clause)? End diff – Right. That's what I mean. Here is the document of the create index statement in Postgresql. http://www.postgresql.org/docs/9.1/static/sql-createindex.html I think that we will add the tablespace parameter later.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/15#discussion_r13635208

        — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java —
        @@ -150,6 +150,8 @@ public static int setDateOrder(int dateOrder) {
        SYSTEM_CONF_PATH("tajo.system-conf.path", EMPTY_VALUE),
        SYSTEM_CONF_REPLICA_COUNT("tajo.system-conf.replica-count", 20),

        + INDEX_DIR("tajo.index.rootdir", "file:///tmp/tajo-$

        {user.name}

        /index"),
        — End diff –

        It is used for BSTIndex, but I think that every disk-based index structure can use this property.
        What do you think about it?

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/15#discussion_r13635208 — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java — @@ -150,6 +150,8 @@ public static int setDateOrder(int dateOrder) { SYSTEM_CONF_PATH("tajo.system-conf.path", EMPTY_VALUE), SYSTEM_CONF_REPLICA_COUNT("tajo.system-conf.replica-count", 20), + INDEX_DIR("tajo.index.rootdir", "file:///tmp/tajo-$ {user.name} /index"), — End diff – It is used for BSTIndex, but I think that every disk-based index structure can use this property. What do you think about it?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/15#discussion_r13635178

        — Diff: tajo-algebra/src/main/java/org/apache/tajo/algebra/CreateIndex.java —
        @@ -0,0 +1,129 @@
        +/*
        + * Licensed to the Apache Software Foundation (ASF) under one
        + * or more contributor license agreements. See the NOTICE file
        + * distributed with this work for additional information
        + * regarding copyright ownership. The ASF licenses this file
        + * to you under the Apache License, Version 2.0 (the
        + * "License"); you may not use this file except in compliance
        + * with the License. You may obtain a copy of the License at
        + *
        + * http://www.apache.org/licenses/LICENSE-2.0
        + *
        + * Unless required by applicable law or agreed to in writing, software
        + * distributed under the License is distributed on an "AS IS" BASIS,
        + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        + * See the License for the specific language governing permissions and
        + * limitations under the License.
        + */
        +
        +package org.apache.tajo.algebra;
        +
        +import com.google.common.base.Objects;
        +import com.google.gson.annotations.Expose;
        +import com.google.gson.annotations.SerializedName;
        +import org.apache.tajo.algebra.Sort.SortSpec;
        +import org.apache.tajo.util.TUtil;
        +
        +import java.util.Map;
        +
        +public class CreateIndex extends UnaryOperator {
        + @Expose @SerializedName("IsUnique")
        + private boolean unique = false;
        + @Expose @SerializedName("IndexName")
        + private String indexName;
        + @Expose @SerializedName("SortSpecs")
        + private SortSpec[] sortSpecs;
        + @Expose @SerializedName("IndexProperties")
        + private Map<String, String> params;
        — End diff –

        At first, I worried about the duplicated serialized name of different operators. As you know, gson does not allow the same serialized name of different objects.
        However, only some DDL operators such as CreateDatabase, CreateTable, Insert, and CreateIndex have the 'params' value, and these operators are never connected in a query plan. So, it will be free from the gson restriction for the same name.
        I'll change it. Thanks!

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/15#discussion_r13635178 — Diff: tajo-algebra/src/main/java/org/apache/tajo/algebra/CreateIndex.java — @@ -0,0 +1,129 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.tajo.algebra; + +import com.google.common.base.Objects; +import com.google.gson.annotations.Expose; +import com.google.gson.annotations.SerializedName; +import org.apache.tajo.algebra.Sort.SortSpec; +import org.apache.tajo.util.TUtil; + +import java.util.Map; + +public class CreateIndex extends UnaryOperator { + @Expose @SerializedName("IsUnique") + private boolean unique = false; + @Expose @SerializedName("IndexName") + private String indexName; + @Expose @SerializedName("SortSpecs") + private SortSpec[] sortSpecs; + @Expose @SerializedName("IndexProperties") + private Map<String, String> params; — End diff – At first, I worried about the duplicated serialized name of different operators. As you know, gson does not allow the same serialized name of different objects. However, only some DDL operators such as CreateDatabase, CreateTable, Insert, and CreateIndex have the 'params' value, and these operators are never connected in a query plan. So, it will be free from the gson restriction for the same name. I'll change it. Thanks!
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/15#issuecomment-45692810

        The patch itself looks nice to me. In my point of view, we need to discuss the long term plan of indexing feature in this time. So, I leave some comments in the umbrella issue (https://issues.apache.org/jira/browse/TAJO-835). We can continue the discussion in the Jira.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on the pull request: https://github.com/apache/tajo/pull/15#issuecomment-45692810 The patch itself looks nice to me. In my point of view, we need to discuss the long term plan of indexing feature in this time. So, I leave some comments in the umbrella issue ( https://issues.apache.org/jira/browse/TAJO-835 ). We can continue the discussion in the Jira.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/15#discussion_r13629901

        — Diff: tajo-core/src/main/antlr4/org/apache/tajo/engine/parser/SQLParser.g4 —
        @@ -68,8 +68,8 @@ schema_statement
        ;

        index_statement

        • : CREATE (u=UNIQUE)? INDEX n=identifier ON t=table_name (m=method_specifier)?
        • LEFT_PAREN s=sort_specifier_list RIGHT_PAREN p=param_clause?
          + : CREATE (u=UNIQUE)? INDEX identifier ON table_name (method_specifier)?
          + LEFT_PAREN sort_specifier_list RIGHT_PAREN param_clause? (where_clause)?
            • End diff –

        I'm expecting that the main purpose of ```(where_clause)?``` is to allow index data only for the subset of table data. If I'm wrong, please let me know that.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/15#discussion_r13629901 — Diff: tajo-core/src/main/antlr4/org/apache/tajo/engine/parser/SQLParser.g4 — @@ -68,8 +68,8 @@ schema_statement ; index_statement : CREATE (u=UNIQUE)? INDEX n=identifier ON t=table_name (m=method_specifier)? LEFT_PAREN s=sort_specifier_list RIGHT_PAREN p=param_clause? + : CREATE (u=UNIQUE)? INDEX identifier ON table_name (method_specifier)? + LEFT_PAREN sort_specifier_list RIGHT_PAREN param_clause? (where_clause)? End diff – I'm expecting that the main purpose of ```(where_clause)?``` is to allow index data only for the subset of table data. If I'm wrong, please let me know that.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/15#discussion_r13629856

        — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java —
        @@ -150,6 +150,8 @@ public static int setDateOrder(int dateOrder) {
        SYSTEM_CONF_PATH("tajo.system-conf.path", EMPTY_VALUE),
        SYSTEM_CONF_REPLICA_COUNT("tajo.system-conf.replica-count", 20),

        + INDEX_DIR("tajo.index.rootdir", "file:///tmp/tajo-$

        {user.name}

        /index"),
        — End diff –

        Will it be used only for BSTIndex?

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/15#discussion_r13629856 — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java — @@ -150,6 +150,8 @@ public static int setDateOrder(int dateOrder) { SYSTEM_CONF_PATH("tajo.system-conf.path", EMPTY_VALUE), SYSTEM_CONF_REPLICA_COUNT("tajo.system-conf.replica-count", 20), + INDEX_DIR("tajo.index.rootdir", "file:///tmp/tajo-$ {user.name} /index"), — End diff – Will it be used only for BSTIndex?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/15#discussion_r13629771

        — Diff: tajo-algebra/src/main/java/org/apache/tajo/algebra/CreateIndex.java —
        @@ -0,0 +1,129 @@
        +/*
        + * Licensed to the Apache Software Foundation (ASF) under one
        + * or more contributor license agreements. See the NOTICE file
        + * distributed with this work for additional information
        + * regarding copyright ownership. The ASF licenses this file
        + * to you under the Apache License, Version 2.0 (the
        + * "License"); you may not use this file except in compliance
        + * with the License. You may obtain a copy of the License at
        + *
        + * http://www.apache.org/licenses/LICENSE-2.0
        + *
        + * Unless required by applicable law or agreed to in writing, software
        + * distributed under the License is distributed on an "AS IS" BASIS,
        + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
        + * See the License for the specific language governing permissions and
        + * limitations under the License.
        + */
        +
        +package org.apache.tajo.algebra;
        +
        +import com.google.common.base.Objects;
        +import com.google.gson.annotations.Expose;
        +import com.google.gson.annotations.SerializedName;
        +import org.apache.tajo.algebra.Sort.SortSpec;
        +import org.apache.tajo.util.TUtil;
        +
        +import java.util.Map;
        +
        +public class CreateIndex extends UnaryOperator {
        + @Expose @SerializedName("IsUnique")
        + private boolean unique = false;
        + @Expose @SerializedName("IndexName")
        + private String indexName;
        + @Expose @SerializedName("SortSpecs")
        + private SortSpec[] sortSpecs;
        + @Expose @SerializedName("IndexProperties")
        + private Map<String, String> params;
        — End diff –

        Recently, I'm thinking that ```properties``` is more proper name for the variable name that we have used as ``param``` in TableMeta. So, I'd like to suggest to rename it to ```properties```.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/15#discussion_r13629771 — Diff: tajo-algebra/src/main/java/org/apache/tajo/algebra/CreateIndex.java — @@ -0,0 +1,129 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.tajo.algebra; + +import com.google.common.base.Objects; +import com.google.gson.annotations.Expose; +import com.google.gson.annotations.SerializedName; +import org.apache.tajo.algebra.Sort.SortSpec; +import org.apache.tajo.util.TUtil; + +import java.util.Map; + +public class CreateIndex extends UnaryOperator { + @Expose @SerializedName("IsUnique") + private boolean unique = false; + @Expose @SerializedName("IndexName") + private String indexName; + @Expose @SerializedName("SortSpecs") + private SortSpec[] sortSpecs; + @Expose @SerializedName("IndexProperties") + private Map<String, String> params; — End diff – Recently, I'm thinking that ```properties``` is more proper name for the variable name that we have used as ``param``` in TableMeta. So, I'd like to suggest to rename it to ```properties```.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/15#issuecomment-45569828

        I'll review it soon. I'm sorry for late.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on the pull request: https://github.com/apache/tajo/pull/15#issuecomment-45569828 I'll review it soon. I'm sorry for late.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/15#issuecomment-45463166

        Would anyone review this patch, please?

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on the pull request: https://github.com/apache/tajo/pull/15#issuecomment-45463166 Would anyone review this patch, please?
        Hide
        tajoqa Tajo QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12646202/TAJO-836_3.patch
        against master revision e2d0464.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 8 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The applied patch does not increase the total number of javadoc warnings.

        +1 checkstyle. The patch generated 0 code style errors.

        -1 findbugs. The patch appears to introduce 202 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in tajo-algebra tajo-common tajo-core tajo-storage:
        org.apache.tajo.engine.eval.TestSQLExpression

        Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/436//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/436//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/436//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-core.html
        Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/436//console

        This message is automatically generated.

        Show
        tajoqa Tajo QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646202/TAJO-836_3.patch against master revision e2d0464. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 8 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The applied patch does not increase the total number of javadoc warnings. +1 checkstyle. The patch generated 0 code style errors. -1 findbugs. The patch appears to introduce 202 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in tajo-algebra tajo-common tajo-core tajo-storage: org.apache.tajo.engine.eval.TestSQLExpression Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/436//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/436//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/436//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-core.html Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/436//console This message is automatically generated.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/15#issuecomment-44211119

        I'll review this tomorrow. I'm sorry for late review.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on the pull request: https://github.com/apache/tajo/pull/15#issuecomment-44211119 I'll review this tomorrow. I'm sorry for late review.
        Hide
        jihoonson Jihoon Son added a comment -

        Compilation error fixed.

        Show
        jihoonson Jihoon Son added a comment - Compilation error fixed.
        Hide
        jihoonson Jihoon Son added a comment -

        The above bug fixed.

        Show
        jihoonson Jihoon Son added a comment - The above bug fixed.
        Hide
        jihoonson Jihoon Son added a comment -

        In this patch, index creation is failed on a real cluster because every task attempts to write indexes at the same path.
        I'll submit a new patch after fixing it.

        Show
        jihoonson Jihoon Son added a comment - In this patch, index creation is failed on a real cluster because every task attempts to write indexes at the same path. I'll submit a new patch after fixing it.
        Hide
        jihoonson Jihoon Son added a comment -

        Currently, index result or query processing using index are hard to be validated.
        Thus, this patch just checks whether the index is created or not.
        After TAJO-837 and TAJO-838, I'll improve unit tests.

        Show
        jihoonson Jihoon Son added a comment - Currently, index result or query processing using index are hard to be validated. Thus, this patch just checks whether the index is created or not. After TAJO-837 and TAJO-838 , I'll improve unit tests.
        Hide
        githubbot ASF GitHub Bot added a comment -

        GitHub user jihoonson opened a pull request:

        https://github.com/apache/tajo/pull/15

        TAJO-836

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/jihoonson/tajo-2 TAJO-836

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/15.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #15


        commit eb8300d62bc388ea70f2119d389c416987c1c876
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2014-05-22T06:42:46Z

        TAJO-836


        Show
        githubbot ASF GitHub Bot added a comment - GitHub user jihoonson opened a pull request: https://github.com/apache/tajo/pull/15 TAJO-836 You can merge this pull request into a Git repository by running: $ git pull https://github.com/jihoonson/tajo-2 TAJO-836 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tajo/pull/15.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15 commit eb8300d62bc388ea70f2119d389c416987c1c876 Author: Jihoon Son <jihoonson@apache.org> Date: 2014-05-22T06:42:46Z TAJO-836

          People

          • Assignee:
            jihoonson Jihoon Son
            Reporter:
            jihoonson Jihoon Son
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development