Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0, 0.12.0
    • Component/s: Catalog
    • Labels:
      None

      Description

      Tajo stores a list of partitions for each table in its CatalogStore. If, however, new partitions are directly added to HDFS, the CatalogStore will not be aware of these partitions unless the user runs ALTER TABLE table_name ADD PARTITION commands on each of the newly added partitions.

      However, users can run a CatalogStore check command with the repair table option:

      MSCK REPAIR TABLE <table_name>
      

      For the reference, I've referenced hive msck command.

      1. TAJO-1673_2.patch
        31 kB
        Jaehwa Jung
      2. TAJO-1673.patch
        31 kB
        Jaehwa Jung

        Activity

        Hide
        githubbot ASF GitHub Bot added a comment -

        GitHub user blrunner opened a pull request:

        https://github.com/apache/tajo/pull/626

        TAJO-1673: Implement recover partitions

        This is still ongoing work. I'll do more tests and update DDLExecutor after committing TAJO-1345.

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/blrunner/tajo TAJO-1673

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/626.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #626


        commit c66ab61ae0cf66f42fc44fbdd4394f87ec5b5700
        Author: JaeHwa Jung <blrunner@apache.org>
        Date: 2015-07-09T02:55:30Z

        Add msck repair table statment.

        commit e3bd88f661722e38667f83b8d6cdafda03934a3b
        Author: JaeHwa Jung <blrunner@apache.org>
        Date: 2015-07-09T15:39:55Z

        Implement msck repair table on DDLExecutor.


        Show
        githubbot ASF GitHub Bot added a comment - GitHub user blrunner opened a pull request: https://github.com/apache/tajo/pull/626 TAJO-1673 : Implement recover partitions This is still ongoing work. I'll do more tests and update DDLExecutor after committing TAJO-1345 . You can merge this pull request into a Git repository by running: $ git pull https://github.com/blrunner/tajo TAJO-1673 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tajo/pull/626.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #626 commit c66ab61ae0cf66f42fc44fbdd4394f87ec5b5700 Author: JaeHwa Jung <blrunner@apache.org> Date: 2015-07-09T02:55:30Z Add msck repair table statment. commit e3bd88f661722e38667f83b8d6cdafda03934a3b Author: JaeHwa Jung <blrunner@apache.org> Date: 2015-07-09T15:39:55Z Implement msck repair table on DDLExecutor.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/626#issuecomment-120840158

        Could you let me know the full name of msck?

        The below expression seems to be more consistent form for a full command, I think.
        ```
        ALTER TABLE [table name] REPAIR PARTITION
        ```

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on the pull request: https://github.com/apache/tajo/pull/626#issuecomment-120840158 Could you let me know the full name of msck? The below expression seems to be more consistent form for a full command, I think. ``` ALTER TABLE [table name] REPAIR PARTITION ```
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/626#issuecomment-120847030

        Hi @hyunsik

        I'just referred hive's policy. But your suggestion looks more reasonable now.
        I'll update the patch tonight.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/626#issuecomment-120847030 Hi @hyunsik I'just referred hive's policy. But your suggestion looks more reasonable now. I'll update the patch tonight.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/626#issuecomment-121148222

        I've updated MSCK TABLE statement to ALTER TABLE REPAIR TABLE statement.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/626#issuecomment-121148222 I've updated MSCK TABLE statement to ALTER TABLE REPAIR TABLE statement.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/626#issuecomment-127478487

        I applied CatalogStore::addPartitions to DDLExecutor and implemented some unit test cases.
        When I implementing unit test cases, I found that drop partition operate on different way than expected. I think that drop partition need to delete data of table with `PURGE` option. But currently, when dropping partition on managed table without `PURGE` option, all data had been deleted. So, I've fixed this bug.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/626#issuecomment-127478487 I applied CatalogStore::addPartitions to DDLExecutor and implemented some unit test cases. When I implementing unit test cases, I found that drop partition operate on different way than expected. I think that drop partition need to delete data of table with `PURGE` option. But currently, when dropping partition on managed table without `PURGE` option, all data had been deleted. So, I've fixed this bug.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/626#issuecomment-137636805

        I added codes to check existing partitions from CatalogStore before adding missing partitions.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/626#issuecomment-137636805 I added codes to check existing partitions from CatalogStore before adding missing partitions.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/626#issuecomment-137649891

        I've finished partition repair test successfully with MySQLStore and HiveCatalogStore and fixed a few bugs.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/626#issuecomment-137649891 I've finished partition repair test successfully with MySQLStore and HiveCatalogStore and fixed a few bugs.
        Hide
        tajoqa Tajo QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12754141/TAJO-1673.patch
        against master revision release-0.9.0-rc0-442-gd794c1d.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 7 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The applied patch does not increase the total number of javadoc warnings.

        +1 checkstyle. The patch generated 0 code style errors.

        -1 findbugs. The patch appears to cause Findbugs (version 2.0.3) to fail.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in tajo-algebra tajo-catalog/tajo-catalog-drivers/tajo-hive tajo-core tajo-core-tests tajo-plan.

        Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/849//testReport/
        Findbugs results: https://builds.apache.org/job/PreCommit-TAJO-Build/849//findbugsResult
        Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/849//console

        This message is automatically generated.

        Show
        tajoqa Tajo QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12754141/TAJO-1673.patch against master revision release-0.9.0-rc0-442-gd794c1d. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 7 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The applied patch does not increase the total number of javadoc warnings. +1 checkstyle. The patch generated 0 code style errors. -1 findbugs. The patch appears to cause Findbugs (version 2.0.3) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in tajo-algebra tajo-catalog/tajo-catalog-drivers/tajo-hive tajo-core tajo-core-tests tajo-plan. Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/849//testReport/ Findbugs results: https://builds.apache.org/job/PreCommit-TAJO-Build/849//findbugsResult Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/849//console This message is automatically generated.
        Hide
        blrunner Jaehwa Jung added a comment -

        Rebased

        Show
        blrunner Jaehwa Jung added a comment - Rebased
        Hide
        tajoqa Tajo QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12754455/TAJO-1673_2.patch
        against master revision release-0.9.0-rc0-447-g33b8893.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 7 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The applied patch does not increase the total number of javadoc warnings.

        +1 checkstyle. The patch generated 0 code style errors.

        -1 findbugs. The patch appears to cause Findbugs (version 2.0.3) to fail.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in tajo-algebra tajo-catalog/tajo-catalog-drivers/tajo-hive tajo-core tajo-core-tests tajo-plan tajo-sql-parser.

        Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/858//testReport/
        Findbugs results: https://builds.apache.org/job/PreCommit-TAJO-Build/858//findbugsResult
        Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/858//console

        This message is automatically generated.

        Show
        tajoqa Tajo QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12754455/TAJO-1673_2.patch against master revision release-0.9.0-rc0-447-g33b8893. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 7 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The applied patch does not increase the total number of javadoc warnings. +1 checkstyle. The patch generated 0 code style errors. -1 findbugs. The patch appears to cause Findbugs (version 2.0.3) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in tajo-algebra tajo-catalog/tajo-catalog-drivers/tajo-hive tajo-core tajo-core-tests tajo-plan tajo-sql-parser. Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/858//testReport/ Findbugs results: https://builds.apache.org/job/PreCommit-TAJO-Build/858//findbugsResult Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/858//console This message is automatically generated.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/626#issuecomment-140945055

        Rebased.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/626#issuecomment-140945055 Rebased.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/626#discussion_r40110721

        — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java —
        @@ -23,29 +23,36 @@
        import org.apache.hadoop.fs.FileStatus;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        +import org.apache.hadoop.fs.PathFilter;
        import org.apache.tajo.algebra.AlterTableOpType;
        import org.apache.tajo.algebra.AlterTablespaceSetType;
        import org.apache.tajo.annotation.Nullable;
        import org.apache.tajo.catalog.*;
        +import org.apache.tajo.catalog.partition.PartitionMethodDesc;
        import org.apache.tajo.catalog.proto.CatalogProtos;
        import org.apache.tajo.catalog.proto.CatalogProtos.AlterTablespaceProto;
        import org.apache.tajo.catalog.proto.CatalogProtos.PartitionKeyProto;
        +import org.apache.tajo.catalog.proto.CatalogProtos.PartitionDescProto;
        import org.apache.tajo.conf.TajoConf;
        import org.apache.tajo.engine.query.QueryContext;
        import org.apache.tajo.exception.*;
        import org.apache.tajo.master.TajoMaster;
        import org.apache.tajo.plan.LogicalPlan;
        import org.apache.tajo.plan.logical.*;
        +import org.apache.tajo.plan.rewrite.rules.PartitionedTableRewriter;
        import org.apache.tajo.plan.util.PlannerUtil;
        import org.apache.tajo.storage.FileTablespace;
        import org.apache.tajo.storage.StorageUtil;
        import org.apache.tajo.storage.Tablespace;
        import org.apache.tajo.storage.TablespaceManager;
        import org.apache.tajo.util.Pair;
        +import org.apache.tajo.util.StringUtils;
        +import org.apache.tajo.util.TUtil;

        import java.io.IOException;
        import java.util.ArrayList;
        import java.util.List;
        +import java.util.Set;
        — End diff –

        Unused import.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/626#discussion_r40110721 — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java — @@ -23,29 +23,36 @@ import org.apache.hadoop.fs.FileStatus; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.PathFilter; import org.apache.tajo.algebra.AlterTableOpType; import org.apache.tajo.algebra.AlterTablespaceSetType; import org.apache.tajo.annotation.Nullable; import org.apache.tajo.catalog.*; +import org.apache.tajo.catalog.partition.PartitionMethodDesc; import org.apache.tajo.catalog.proto.CatalogProtos; import org.apache.tajo.catalog.proto.CatalogProtos.AlterTablespaceProto; import org.apache.tajo.catalog.proto.CatalogProtos.PartitionKeyProto; +import org.apache.tajo.catalog.proto.CatalogProtos.PartitionDescProto; import org.apache.tajo.conf.TajoConf; import org.apache.tajo.engine.query.QueryContext; import org.apache.tajo.exception.*; import org.apache.tajo.master.TajoMaster; import org.apache.tajo.plan.LogicalPlan; import org.apache.tajo.plan.logical.*; +import org.apache.tajo.plan.rewrite.rules.PartitionedTableRewriter; import org.apache.tajo.plan.util.PlannerUtil; import org.apache.tajo.storage.FileTablespace; import org.apache.tajo.storage.StorageUtil; import org.apache.tajo.storage.Tablespace; import org.apache.tajo.storage.TablespaceManager; import org.apache.tajo.util.Pair; +import org.apache.tajo.util.StringUtils; +import org.apache.tajo.util.TUtil; import java.io.IOException; import java.util.ArrayList; import java.util.List; +import java.util.Set; — End diff – Unused import.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/626#discussion_r40111028

        — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java —
        @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer
        catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(),
        alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION));

        • // When dropping partition on an managed table, the data will be delete from file system.
        • if (!desc.isExternal()) {
          + // When dropping partition on a table, the data in the table will NOT be deleted from the file system.
          + // But if PURGE is specified, the partition data will be deleted.
            • End diff –

        How about changing the comment to as follows?
        ```When dropping a partition on a table, its data will NOT be deleted if the 'PURGE' option is not specified.```

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/626#discussion_r40111028 — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java — @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(), alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION)); // When dropping partition on an managed table, the data will be delete from file system. if (!desc.isExternal()) { + // When dropping partition on a table, the data in the table will NOT be deleted from the file system. + // But if PURGE is specified, the partition data will be deleted. End diff – How about changing the comment to as follows? ```When dropping a partition on a table, its data will NOT be deleted if the 'PURGE' option is not specified.```
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/626#discussion_r40112090

        — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java —
        @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer
        catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(),
        alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION));

        • // When dropping partition on an managed table, the data will be delete from file system.
        • if (!desc.isExternal()) {
          + // When dropping partition on a table, the data in the table will NOT be deleted from the file system.
          + // But if PURGE is specified, the partition data will be deleted.
          + if (alterTable.isPurge()) { deletePartitionPath(partitionDescProto); - }

          else {

        • // When dropping partition on an external table, the data in the table will NOT be deleted from the file
        • // system. But if PURGE is specified, the partition data will be deleted.
        • if (alterTable.isPurge()) { - deletePartitionPath(partitionDescProto); - }

          }
          }
          -
          + break;
          + case REPAIR_PARTITION:
          + repairPartition(context, queryContext, alterTable);
          break;
          default:
          throw new InternalError("alterTable cannot handle such query: \n" + alterTable.toJson());
          }
          }

        + /**
        + * Run ALTER TABLE table_name REPAIR TABLE statement.
        + * This will recovery all partitions which exists on table directory.
        + *
        + *
        + * @param context
        + * @param queryContext
        + * @param alterTable
        + * @throws IOException
        + */
        + public void repairPartition(TajoMaster.MasterContext context, final QueryContext queryContext,
        + final AlterTableNode alterTable) throws IOException, TajoException {
        + final CatalogService catalog = context.getCatalog();
        + final String tableName = alterTable.getTableName();
        +
        + String databaseName;
        + String simpleTableName;
        + if (CatalogUtil.isFQTableName(tableName))

        { + String[] split = CatalogUtil.splitFQTableName(tableName); + databaseName = split[0]; + simpleTableName = split[1]; + }

        else

        { + databaseName = queryContext.getCurrentDatabase(); + simpleTableName = tableName; + }

        +
        + if (!catalog.existsTable(databaseName, simpleTableName))

        { + throw new UndefinedTableException(alterTable.getTableName()); + }

        +
        + TableDesc tableDesc = catalog.getTableDesc(databaseName, simpleTableName);
        +
        + if(tableDesc.getPartitionMethod() == null)

        { + throw new UndefinedPartitionMethodException(simpleTableName); + }

        +
        + Path tablePath = new Path(tableDesc.getUri());
        + FileSystem fs = tablePath.getFileSystem(context.getConf());
        +
        + PartitionMethodDesc partitionDesc = tableDesc.getPartitionMethod();
        + Schema partitionColumns = new Schema();
        + for (Column column : partitionDesc.getExpressionSchema().getRootColumns())

        { + partitionColumns.addColumn(column); + }

        +
        + // Get the array of path filter, accepting all partition paths.
        + PathFilter[] filters = PartitionedTableRewriter.buildAllAcceptingPathFilters(partitionColumns);
        +
        + // loop from one to the number of partition columns
        + Path [] filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(tablePath, filters[0]));
        +
        + // Get all file status matched to a ith level path filter.
        + for (int i = 1; i < partitionColumns.size(); i++)

        { + filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(filteredPaths, filters[i])); + }

        +
        + // Find missing partitions from filesystem
        + List<PartitionDescProto> existingPartitions = catalog.getPartitions(databaseName, simpleTableName);
        + List<String> existingPartitionNames = TUtil.newList();
        + Path existingPartitionPath = null;
        +
        + for(PartitionDescProto existingPartition : existingPartitions) {
        + existingPartitionPath = new Path(existingPartition.getPath());
        + existingPartitionNames.add(existingPartition.getPartitionName());
        + if (!fs.exists(existingPartitionPath)) {
        + LOG.info("Partitions missing from Filesystem:" + existingPartition.getPartitionName());
        — End diff –

        I think ```warn``` is the more appropriate logging level.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/626#discussion_r40112090 — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java — @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(), alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION)); // When dropping partition on an managed table, the data will be delete from file system. if (!desc.isExternal()) { + // When dropping partition on a table, the data in the table will NOT be deleted from the file system. + // But if PURGE is specified, the partition data will be deleted. + if (alterTable.isPurge()) { deletePartitionPath(partitionDescProto); - } else { // When dropping partition on an external table, the data in the table will NOT be deleted from the file // system. But if PURGE is specified, the partition data will be deleted. if (alterTable.isPurge()) { - deletePartitionPath(partitionDescProto); - } } } - + break; + case REPAIR_PARTITION: + repairPartition(context, queryContext, alterTable); break; default: throw new InternalError("alterTable cannot handle such query: \n" + alterTable.toJson()); } } + /** + * Run ALTER TABLE table_name REPAIR TABLE statement. + * This will recovery all partitions which exists on table directory. + * + * + * @param context + * @param queryContext + * @param alterTable + * @throws IOException + */ + public void repairPartition(TajoMaster.MasterContext context, final QueryContext queryContext, + final AlterTableNode alterTable) throws IOException, TajoException { + final CatalogService catalog = context.getCatalog(); + final String tableName = alterTable.getTableName(); + + String databaseName; + String simpleTableName; + if (CatalogUtil.isFQTableName(tableName)) { + String[] split = CatalogUtil.splitFQTableName(tableName); + databaseName = split[0]; + simpleTableName = split[1]; + } else { + databaseName = queryContext.getCurrentDatabase(); + simpleTableName = tableName; + } + + if (!catalog.existsTable(databaseName, simpleTableName)) { + throw new UndefinedTableException(alterTable.getTableName()); + } + + TableDesc tableDesc = catalog.getTableDesc(databaseName, simpleTableName); + + if(tableDesc.getPartitionMethod() == null) { + throw new UndefinedPartitionMethodException(simpleTableName); + } + + Path tablePath = new Path(tableDesc.getUri()); + FileSystem fs = tablePath.getFileSystem(context.getConf()); + + PartitionMethodDesc partitionDesc = tableDesc.getPartitionMethod(); + Schema partitionColumns = new Schema(); + for (Column column : partitionDesc.getExpressionSchema().getRootColumns()) { + partitionColumns.addColumn(column); + } + + // Get the array of path filter, accepting all partition paths. + PathFilter[] filters = PartitionedTableRewriter.buildAllAcceptingPathFilters(partitionColumns); + + // loop from one to the number of partition columns + Path [] filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(tablePath, filters [0] )); + + // Get all file status matched to a ith level path filter. + for (int i = 1; i < partitionColumns.size(); i++) { + filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(filteredPaths, filters[i])); + } + + // Find missing partitions from filesystem + List<PartitionDescProto> existingPartitions = catalog.getPartitions(databaseName, simpleTableName); + List<String> existingPartitionNames = TUtil.newList(); + Path existingPartitionPath = null; + + for(PartitionDescProto existingPartition : existingPartitions) { + existingPartitionPath = new Path(existingPartition.getPath()); + existingPartitionNames.add(existingPartition.getPartitionName()); + if (!fs.exists(existingPartitionPath)) { + LOG.info("Partitions missing from Filesystem:" + existingPartition.getPartitionName()); — End diff – I think ```warn``` is the more appropriate logging level.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/626#discussion_r40112331

        — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java —
        @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer
        catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(),
        alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION));

        • // When dropping partition on an managed table, the data will be delete from file system.
        • if (!desc.isExternal()) {
          + // When dropping partition on a table, the data in the table will NOT be deleted from the file system.
          + // But if PURGE is specified, the partition data will be deleted.
          + if (alterTable.isPurge()) { deletePartitionPath(partitionDescProto); - }

          else {

        • // When dropping partition on an external table, the data in the table will NOT be deleted from the file
        • // system. But if PURGE is specified, the partition data will be deleted.
        • if (alterTable.isPurge()) { - deletePartitionPath(partitionDescProto); - }

          }
          }
          -
          + break;
          + case REPAIR_PARTITION:
          + repairPartition(context, queryContext, alterTable);
          break;
          default:
          throw new InternalError("alterTable cannot handle such query: \n" + alterTable.toJson());
          }
          }

        + /**
        + * Run ALTER TABLE table_name REPAIR TABLE statement.
        + * This will recovery all partitions which exists on table directory.
        + *
        + *
        + * @param context
        + * @param queryContext
        + * @param alterTable
        + * @throws IOException
        + */
        + public void repairPartition(TajoMaster.MasterContext context, final QueryContext queryContext,
        + final AlterTableNode alterTable) throws IOException, TajoException {
        + final CatalogService catalog = context.getCatalog();
        + final String tableName = alterTable.getTableName();
        +
        + String databaseName;
        + String simpleTableName;
        + if (CatalogUtil.isFQTableName(tableName))

        { + String[] split = CatalogUtil.splitFQTableName(tableName); + databaseName = split[0]; + simpleTableName = split[1]; + }

        else

        { + databaseName = queryContext.getCurrentDatabase(); + simpleTableName = tableName; + }

        +
        + if (!catalog.existsTable(databaseName, simpleTableName))

        { + throw new UndefinedTableException(alterTable.getTableName()); + }

        +
        + TableDesc tableDesc = catalog.getTableDesc(databaseName, simpleTableName);
        +
        + if(tableDesc.getPartitionMethod() == null)

        { + throw new UndefinedPartitionMethodException(simpleTableName); + }

        +
        + Path tablePath = new Path(tableDesc.getUri());
        + FileSystem fs = tablePath.getFileSystem(context.getConf());
        +
        + PartitionMethodDesc partitionDesc = tableDesc.getPartitionMethod();
        + Schema partitionColumns = new Schema();
        — End diff –

        Why did you create a copy of the partition schema?

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/626#discussion_r40112331 — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java — @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(), alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION)); // When dropping partition on an managed table, the data will be delete from file system. if (!desc.isExternal()) { + // When dropping partition on a table, the data in the table will NOT be deleted from the file system. + // But if PURGE is specified, the partition data will be deleted. + if (alterTable.isPurge()) { deletePartitionPath(partitionDescProto); - } else { // When dropping partition on an external table, the data in the table will NOT be deleted from the file // system. But if PURGE is specified, the partition data will be deleted. if (alterTable.isPurge()) { - deletePartitionPath(partitionDescProto); - } } } - + break; + case REPAIR_PARTITION: + repairPartition(context, queryContext, alterTable); break; default: throw new InternalError("alterTable cannot handle such query: \n" + alterTable.toJson()); } } + /** + * Run ALTER TABLE table_name REPAIR TABLE statement. + * This will recovery all partitions which exists on table directory. + * + * + * @param context + * @param queryContext + * @param alterTable + * @throws IOException + */ + public void repairPartition(TajoMaster.MasterContext context, final QueryContext queryContext, + final AlterTableNode alterTable) throws IOException, TajoException { + final CatalogService catalog = context.getCatalog(); + final String tableName = alterTable.getTableName(); + + String databaseName; + String simpleTableName; + if (CatalogUtil.isFQTableName(tableName)) { + String[] split = CatalogUtil.splitFQTableName(tableName); + databaseName = split[0]; + simpleTableName = split[1]; + } else { + databaseName = queryContext.getCurrentDatabase(); + simpleTableName = tableName; + } + + if (!catalog.existsTable(databaseName, simpleTableName)) { + throw new UndefinedTableException(alterTable.getTableName()); + } + + TableDesc tableDesc = catalog.getTableDesc(databaseName, simpleTableName); + + if(tableDesc.getPartitionMethod() == null) { + throw new UndefinedPartitionMethodException(simpleTableName); + } + + Path tablePath = new Path(tableDesc.getUri()); + FileSystem fs = tablePath.getFileSystem(context.getConf()); + + PartitionMethodDesc partitionDesc = tableDesc.getPartitionMethod(); + Schema partitionColumns = new Schema(); — End diff – Why did you create a copy of the partition schema?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/626#discussion_r40113668

        — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java —
        @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer
        catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(),
        alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION));

        • // When dropping partition on an managed table, the data will be delete from file system.
        • if (!desc.isExternal()) {
          + // When dropping partition on a table, the data in the table will NOT be deleted from the file system.
          + // But if PURGE is specified, the partition data will be deleted.
          + if (alterTable.isPurge()) { deletePartitionPath(partitionDescProto); - }

          else {

        • // When dropping partition on an external table, the data in the table will NOT be deleted from the file
        • // system. But if PURGE is specified, the partition data will be deleted.
        • if (alterTable.isPurge()) { - deletePartitionPath(partitionDescProto); - }

          }
          }
          -
          + break;
          + case REPAIR_PARTITION:
          + repairPartition(context, queryContext, alterTable);
          break;
          default:
          throw new InternalError("alterTable cannot handle such query: \n" + alterTable.toJson());
          }
          }

        + /**
        + * Run ALTER TABLE table_name REPAIR TABLE statement.
        + * This will recovery all partitions which exists on table directory.
        + *
        + *
        + * @param context
        + * @param queryContext
        + * @param alterTable
        + * @throws IOException
        + */
        + public void repairPartition(TajoMaster.MasterContext context, final QueryContext queryContext,
        + final AlterTableNode alterTable) throws IOException, TajoException {
        + final CatalogService catalog = context.getCatalog();
        + final String tableName = alterTable.getTableName();
        +
        + String databaseName;
        + String simpleTableName;
        + if (CatalogUtil.isFQTableName(tableName))

        { + String[] split = CatalogUtil.splitFQTableName(tableName); + databaseName = split[0]; + simpleTableName = split[1]; + }

        else

        { + databaseName = queryContext.getCurrentDatabase(); + simpleTableName = tableName; + }

        +
        + if (!catalog.existsTable(databaseName, simpleTableName))

        { + throw new UndefinedTableException(alterTable.getTableName()); + }

        +
        + TableDesc tableDesc = catalog.getTableDesc(databaseName, simpleTableName);
        +
        + if(tableDesc.getPartitionMethod() == null)

        { + throw new UndefinedPartitionMethodException(simpleTableName); + }

        +
        + Path tablePath = new Path(tableDesc.getUri());
        + FileSystem fs = tablePath.getFileSystem(context.getConf());
        +
        + PartitionMethodDesc partitionDesc = tableDesc.getPartitionMethod();
        + Schema partitionColumns = new Schema();
        + for (Column column : partitionDesc.getExpressionSchema().getRootColumns())

        { + partitionColumns.addColumn(column); + }

        +
        + // Get the array of path filter, accepting all partition paths.
        + PathFilter[] filters = PartitionedTableRewriter.buildAllAcceptingPathFilters(partitionColumns);
        +
        + // loop from one to the number of partition columns
        + Path [] filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(tablePath, filters[0]));
        +
        + // Get all file status matched to a ith level path filter.
        + for (int i = 1; i < partitionColumns.size(); i++)

        { + filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(filteredPaths, filters[i])); + }

        +
        + // Find missing partitions from filesystem
        + List<PartitionDescProto> existingPartitions = catalog.getPartitions(databaseName, simpleTableName);
        + List<String> existingPartitionNames = TUtil.newList();
        + Path existingPartitionPath = null;
        +
        + for(PartitionDescProto existingPartition : existingPartitions) {
        + existingPartitionPath = new Path(existingPartition.getPath());
        + existingPartitionNames.add(existingPartition.getPartitionName());
        + if (!fs.exists(existingPartitionPath))

        { + LOG.info("Partitions missing from Filesystem:" + existingPartition.getPartitionName()); + }

        + }
        +
        + // Find missing partitions from CatalogStore
        + List<PartitionDescProto> targetPartitions = TUtil.newList();
        + for(Path filteredPath : filteredPaths) {
        + PartitionDescProto targetPartition = getPartitionDesc(simpleTableName, filteredPath);
        + if (!existingPartitionNames.contains(targetPartition.getPartitionName())) {
        + LOG.info("Partitions not in CatalogStore:" + targetPartition.getPartitionName());
        — End diff –

        This will emit too many logs.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/626#discussion_r40113668 — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java — @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(), alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION)); // When dropping partition on an managed table, the data will be delete from file system. if (!desc.isExternal()) { + // When dropping partition on a table, the data in the table will NOT be deleted from the file system. + // But if PURGE is specified, the partition data will be deleted. + if (alterTable.isPurge()) { deletePartitionPath(partitionDescProto); - } else { // When dropping partition on an external table, the data in the table will NOT be deleted from the file // system. But if PURGE is specified, the partition data will be deleted. if (alterTable.isPurge()) { - deletePartitionPath(partitionDescProto); - } } } - + break; + case REPAIR_PARTITION: + repairPartition(context, queryContext, alterTable); break; default: throw new InternalError("alterTable cannot handle such query: \n" + alterTable.toJson()); } } + /** + * Run ALTER TABLE table_name REPAIR TABLE statement. + * This will recovery all partitions which exists on table directory. + * + * + * @param context + * @param queryContext + * @param alterTable + * @throws IOException + */ + public void repairPartition(TajoMaster.MasterContext context, final QueryContext queryContext, + final AlterTableNode alterTable) throws IOException, TajoException { + final CatalogService catalog = context.getCatalog(); + final String tableName = alterTable.getTableName(); + + String databaseName; + String simpleTableName; + if (CatalogUtil.isFQTableName(tableName)) { + String[] split = CatalogUtil.splitFQTableName(tableName); + databaseName = split[0]; + simpleTableName = split[1]; + } else { + databaseName = queryContext.getCurrentDatabase(); + simpleTableName = tableName; + } + + if (!catalog.existsTable(databaseName, simpleTableName)) { + throw new UndefinedTableException(alterTable.getTableName()); + } + + TableDesc tableDesc = catalog.getTableDesc(databaseName, simpleTableName); + + if(tableDesc.getPartitionMethod() == null) { + throw new UndefinedPartitionMethodException(simpleTableName); + } + + Path tablePath = new Path(tableDesc.getUri()); + FileSystem fs = tablePath.getFileSystem(context.getConf()); + + PartitionMethodDesc partitionDesc = tableDesc.getPartitionMethod(); + Schema partitionColumns = new Schema(); + for (Column column : partitionDesc.getExpressionSchema().getRootColumns()) { + partitionColumns.addColumn(column); + } + + // Get the array of path filter, accepting all partition paths. + PathFilter[] filters = PartitionedTableRewriter.buildAllAcceptingPathFilters(partitionColumns); + + // loop from one to the number of partition columns + Path [] filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(tablePath, filters [0] )); + + // Get all file status matched to a ith level path filter. + for (int i = 1; i < partitionColumns.size(); i++) { + filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(filteredPaths, filters[i])); + } + + // Find missing partitions from filesystem + List<PartitionDescProto> existingPartitions = catalog.getPartitions(databaseName, simpleTableName); + List<String> existingPartitionNames = TUtil.newList(); + Path existingPartitionPath = null; + + for(PartitionDescProto existingPartition : existingPartitions) { + existingPartitionPath = new Path(existingPartition.getPath()); + existingPartitionNames.add(existingPartition.getPartitionName()); + if (!fs.exists(existingPartitionPath)) { + LOG.info("Partitions missing from Filesystem:" + existingPartition.getPartitionName()); + } + } + + // Find missing partitions from CatalogStore + List<PartitionDescProto> targetPartitions = TUtil.newList(); + for(Path filteredPath : filteredPaths) { + PartitionDescProto targetPartition = getPartitionDesc(simpleTableName, filteredPath); + if (!existingPartitionNames.contains(targetPartition.getPartitionName())) { + LOG.info("Partitions not in CatalogStore:" + targetPartition.getPartitionName()); — End diff – This will emit too many logs.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/626#discussion_r40113680

        — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java —
        @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer
        catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(),
        alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION));

        • // When dropping partition on an managed table, the data will be delete from file system.
        • if (!desc.isExternal()) {
          + // When dropping partition on a table, the data in the table will NOT be deleted from the file system.
          + // But if PURGE is specified, the partition data will be deleted.
          + if (alterTable.isPurge()) { deletePartitionPath(partitionDescProto); - }

          else {

        • // When dropping partition on an external table, the data in the table will NOT be deleted from the file
        • // system. But if PURGE is specified, the partition data will be deleted.
        • if (alterTable.isPurge()) { - deletePartitionPath(partitionDescProto); - }

          }
          }
          -
          + break;
          + case REPAIR_PARTITION:
          + repairPartition(context, queryContext, alterTable);
          break;
          default:
          throw new InternalError("alterTable cannot handle such query: \n" + alterTable.toJson());
          }
          }

        + /**
        + * Run ALTER TABLE table_name REPAIR TABLE statement.
        + * This will recovery all partitions which exists on table directory.
        + *
        + *
        + * @param context
        + * @param queryContext
        + * @param alterTable
        + * @throws IOException
        + */
        + public void repairPartition(TajoMaster.MasterContext context, final QueryContext queryContext,
        + final AlterTableNode alterTable) throws IOException, TajoException {
        + final CatalogService catalog = context.getCatalog();
        + final String tableName = alterTable.getTableName();
        +
        + String databaseName;
        + String simpleTableName;
        + if (CatalogUtil.isFQTableName(tableName))

        { + String[] split = CatalogUtil.splitFQTableName(tableName); + databaseName = split[0]; + simpleTableName = split[1]; + }

        else

        { + databaseName = queryContext.getCurrentDatabase(); + simpleTableName = tableName; + }

        +
        + if (!catalog.existsTable(databaseName, simpleTableName))

        { + throw new UndefinedTableException(alterTable.getTableName()); + }

        +
        + TableDesc tableDesc = catalog.getTableDesc(databaseName, simpleTableName);
        +
        + if(tableDesc.getPartitionMethod() == null)

        { + throw new UndefinedPartitionMethodException(simpleTableName); + }

        +
        + Path tablePath = new Path(tableDesc.getUri());
        + FileSystem fs = tablePath.getFileSystem(context.getConf());
        +
        + PartitionMethodDesc partitionDesc = tableDesc.getPartitionMethod();
        + Schema partitionColumns = new Schema();
        + for (Column column : partitionDesc.getExpressionSchema().getRootColumns())

        { + partitionColumns.addColumn(column); + }

        +
        + // Get the array of path filter, accepting all partition paths.
        + PathFilter[] filters = PartitionedTableRewriter.buildAllAcceptingPathFilters(partitionColumns);
        +
        + // loop from one to the number of partition columns
        + Path [] filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(tablePath, filters[0]));
        +
        + // Get all file status matched to a ith level path filter.
        + for (int i = 1; i < partitionColumns.size(); i++)

        { + filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(filteredPaths, filters[i])); + }

        +
        + // Find missing partitions from filesystem
        + List<PartitionDescProto> existingPartitions = catalog.getPartitions(databaseName, simpleTableName);
        + List<String> existingPartitionNames = TUtil.newList();
        + Path existingPartitionPath = null;
        +
        + for(PartitionDescProto existingPartition : existingPartitions) {
        + existingPartitionPath = new Path(existingPartition.getPath());
        + existingPartitionNames.add(existingPartition.getPartitionName());
        + if (!fs.exists(existingPartitionPath))

        { + LOG.info("Partitions missing from Filesystem:" + existingPartition.getPartitionName()); + }

        + }
        +
        + // Find missing partitions from CatalogStore
        + List<PartitionDescProto> targetPartitions = TUtil.newList();
        + for(Path filteredPath : filteredPaths) {
        + PartitionDescProto targetPartition = getPartitionDesc(simpleTableName, filteredPath);
        + if (!existingPartitionNames.contains(targetPartition.getPartitionName()))

        { + LOG.info("Partitions not in CatalogStore:" + targetPartition.getPartitionName()); + targetPartitions.add(targetPartition); + }

        + }
        +
        + catalog.addPartitions(databaseName, simpleTableName, targetPartitions, true);
        +
        + for(PartitionDescProto targetPartition: targetPartitions) {
        + LOG.info("Repair: Added partition to CatalogStore " + tableName + ":" + targetPartition.getPartitionName());
        — End diff –

        This will emit too many logs.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/626#discussion_r40113680 — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java — @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(), alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION)); // When dropping partition on an managed table, the data will be delete from file system. if (!desc.isExternal()) { + // When dropping partition on a table, the data in the table will NOT be deleted from the file system. + // But if PURGE is specified, the partition data will be deleted. + if (alterTable.isPurge()) { deletePartitionPath(partitionDescProto); - } else { // When dropping partition on an external table, the data in the table will NOT be deleted from the file // system. But if PURGE is specified, the partition data will be deleted. if (alterTable.isPurge()) { - deletePartitionPath(partitionDescProto); - } } } - + break; + case REPAIR_PARTITION: + repairPartition(context, queryContext, alterTable); break; default: throw new InternalError("alterTable cannot handle such query: \n" + alterTable.toJson()); } } + /** + * Run ALTER TABLE table_name REPAIR TABLE statement. + * This will recovery all partitions which exists on table directory. + * + * + * @param context + * @param queryContext + * @param alterTable + * @throws IOException + */ + public void repairPartition(TajoMaster.MasterContext context, final QueryContext queryContext, + final AlterTableNode alterTable) throws IOException, TajoException { + final CatalogService catalog = context.getCatalog(); + final String tableName = alterTable.getTableName(); + + String databaseName; + String simpleTableName; + if (CatalogUtil.isFQTableName(tableName)) { + String[] split = CatalogUtil.splitFQTableName(tableName); + databaseName = split[0]; + simpleTableName = split[1]; + } else { + databaseName = queryContext.getCurrentDatabase(); + simpleTableName = tableName; + } + + if (!catalog.existsTable(databaseName, simpleTableName)) { + throw new UndefinedTableException(alterTable.getTableName()); + } + + TableDesc tableDesc = catalog.getTableDesc(databaseName, simpleTableName); + + if(tableDesc.getPartitionMethod() == null) { + throw new UndefinedPartitionMethodException(simpleTableName); + } + + Path tablePath = new Path(tableDesc.getUri()); + FileSystem fs = tablePath.getFileSystem(context.getConf()); + + PartitionMethodDesc partitionDesc = tableDesc.getPartitionMethod(); + Schema partitionColumns = new Schema(); + for (Column column : partitionDesc.getExpressionSchema().getRootColumns()) { + partitionColumns.addColumn(column); + } + + // Get the array of path filter, accepting all partition paths. + PathFilter[] filters = PartitionedTableRewriter.buildAllAcceptingPathFilters(partitionColumns); + + // loop from one to the number of partition columns + Path [] filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(tablePath, filters [0] )); + + // Get all file status matched to a ith level path filter. + for (int i = 1; i < partitionColumns.size(); i++) { + filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(filteredPaths, filters[i])); + } + + // Find missing partitions from filesystem + List<PartitionDescProto> existingPartitions = catalog.getPartitions(databaseName, simpleTableName); + List<String> existingPartitionNames = TUtil.newList(); + Path existingPartitionPath = null; + + for(PartitionDescProto existingPartition : existingPartitions) { + existingPartitionPath = new Path(existingPartition.getPath()); + existingPartitionNames.add(existingPartition.getPartitionName()); + if (!fs.exists(existingPartitionPath)) { + LOG.info("Partitions missing from Filesystem:" + existingPartition.getPartitionName()); + } + } + + // Find missing partitions from CatalogStore + List<PartitionDescProto> targetPartitions = TUtil.newList(); + for(Path filteredPath : filteredPaths) { + PartitionDescProto targetPartition = getPartitionDesc(simpleTableName, filteredPath); + if (!existingPartitionNames.contains(targetPartition.getPartitionName())) { + LOG.info("Partitions not in CatalogStore:" + targetPartition.getPartitionName()); + targetPartitions.add(targetPartition); + } + } + + catalog.addPartitions(databaseName, simpleTableName, targetPartitions, true); + + for(PartitionDescProto targetPartition: targetPartitions) { + LOG.info("Repair: Added partition to CatalogStore " + tableName + ":" + targetPartition.getPartitionName()); — End diff – This will emit too many logs.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/626#discussion_r40114180

        — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java —
        @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer
        catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(),
        alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION));

        • // When dropping partition on an managed table, the data will be delete from file system.
        • if (!desc.isExternal()) {
          + // When dropping partition on a table, the data in the table will NOT be deleted from the file system.
          + // But if PURGE is specified, the partition data will be deleted.
          + if (alterTable.isPurge()) { deletePartitionPath(partitionDescProto); - }

          else {

        • // When dropping partition on an external table, the data in the table will NOT be deleted from the file
        • // system. But if PURGE is specified, the partition data will be deleted.
        • if (alterTable.isPurge()) { - deletePartitionPath(partitionDescProto); - }

          }
          }
          -
          + break;
          + case REPAIR_PARTITION:
          + repairPartition(context, queryContext, alterTable);
          break;
          default:
          throw new InternalError("alterTable cannot handle such query: \n" + alterTable.toJson());
          }
          }

        + /**
        + * Run ALTER TABLE table_name REPAIR TABLE statement.
        + * This will recovery all partitions which exists on table directory.
        + *
        + *
        + * @param context
        + * @param queryContext
        + * @param alterTable
        + * @throws IOException
        + */
        + public void repairPartition(TajoMaster.MasterContext context, final QueryContext queryContext,
        + final AlterTableNode alterTable) throws IOException, TajoException {
        + final CatalogService catalog = context.getCatalog();
        + final String tableName = alterTable.getTableName();
        +
        + String databaseName;
        + String simpleTableName;
        + if (CatalogUtil.isFQTableName(tableName))

        { + String[] split = CatalogUtil.splitFQTableName(tableName); + databaseName = split[0]; + simpleTableName = split[1]; + }

        else

        { + databaseName = queryContext.getCurrentDatabase(); + simpleTableName = tableName; + }

        +
        + if (!catalog.existsTable(databaseName, simpleTableName))

        { + throw new UndefinedTableException(alterTable.getTableName()); + }

        +
        + TableDesc tableDesc = catalog.getTableDesc(databaseName, simpleTableName);
        +
        + if(tableDesc.getPartitionMethod() == null)

        { + throw new UndefinedPartitionMethodException(simpleTableName); + }

        +
        + Path tablePath = new Path(tableDesc.getUri());
        + FileSystem fs = tablePath.getFileSystem(context.getConf());
        +
        + PartitionMethodDesc partitionDesc = tableDesc.getPartitionMethod();
        + Schema partitionColumns = new Schema();
        + for (Column column : partitionDesc.getExpressionSchema().getRootColumns())

        { + partitionColumns.addColumn(column); + }

        +
        + // Get the array of path filter, accepting all partition paths.
        + PathFilter[] filters = PartitionedTableRewriter.buildAllAcceptingPathFilters(partitionColumns);
        +
        + // loop from one to the number of partition columns
        + Path [] filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(tablePath, filters[0]));
        +
        + // Get all file status matched to a ith level path filter.
        + for (int i = 1; i < partitionColumns.size(); i++)

        { + filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(filteredPaths, filters[i])); + }

        +
        + // Find missing partitions from filesystem
        — End diff –

        The below routine is too complex and includes unnecessary codes, I think.
        IMHO, the algorithm will be simple as follows.
        1. Getting PartitionDescProtos from the catalog
        2. While iterating PartitionDescProtos, call catalog.addPartition() if the partition directory exists. Maybe catalog.addPartitions() will be called instead.

        If I missed any points, please let me know.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/626#discussion_r40114180 — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java — @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(), alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION)); // When dropping partition on an managed table, the data will be delete from file system. if (!desc.isExternal()) { + // When dropping partition on a table, the data in the table will NOT be deleted from the file system. + // But if PURGE is specified, the partition data will be deleted. + if (alterTable.isPurge()) { deletePartitionPath(partitionDescProto); - } else { // When dropping partition on an external table, the data in the table will NOT be deleted from the file // system. But if PURGE is specified, the partition data will be deleted. if (alterTable.isPurge()) { - deletePartitionPath(partitionDescProto); - } } } - + break; + case REPAIR_PARTITION: + repairPartition(context, queryContext, alterTable); break; default: throw new InternalError("alterTable cannot handle such query: \n" + alterTable.toJson()); } } + /** + * Run ALTER TABLE table_name REPAIR TABLE statement. + * This will recovery all partitions which exists on table directory. + * + * + * @param context + * @param queryContext + * @param alterTable + * @throws IOException + */ + public void repairPartition(TajoMaster.MasterContext context, final QueryContext queryContext, + final AlterTableNode alterTable) throws IOException, TajoException { + final CatalogService catalog = context.getCatalog(); + final String tableName = alterTable.getTableName(); + + String databaseName; + String simpleTableName; + if (CatalogUtil.isFQTableName(tableName)) { + String[] split = CatalogUtil.splitFQTableName(tableName); + databaseName = split[0]; + simpleTableName = split[1]; + } else { + databaseName = queryContext.getCurrentDatabase(); + simpleTableName = tableName; + } + + if (!catalog.existsTable(databaseName, simpleTableName)) { + throw new UndefinedTableException(alterTable.getTableName()); + } + + TableDesc tableDesc = catalog.getTableDesc(databaseName, simpleTableName); + + if(tableDesc.getPartitionMethod() == null) { + throw new UndefinedPartitionMethodException(simpleTableName); + } + + Path tablePath = new Path(tableDesc.getUri()); + FileSystem fs = tablePath.getFileSystem(context.getConf()); + + PartitionMethodDesc partitionDesc = tableDesc.getPartitionMethod(); + Schema partitionColumns = new Schema(); + for (Column column : partitionDesc.getExpressionSchema().getRootColumns()) { + partitionColumns.addColumn(column); + } + + // Get the array of path filter, accepting all partition paths. + PathFilter[] filters = PartitionedTableRewriter.buildAllAcceptingPathFilters(partitionColumns); + + // loop from one to the number of partition columns + Path [] filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(tablePath, filters [0] )); + + // Get all file status matched to a ith level path filter. + for (int i = 1; i < partitionColumns.size(); i++) { + filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(filteredPaths, filters[i])); + } + + // Find missing partitions from filesystem — End diff – The below routine is too complex and includes unnecessary codes, I think. IMHO, the algorithm will be simple as follows. 1. Getting PartitionDescProtos from the catalog 2. While iterating PartitionDescProtos, call catalog.addPartition() if the partition directory exists. Maybe catalog.addPartitions() will be called instead. If I missed any points, please let me know.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/626#discussion_r40114567

        — Diff: tajo-docs/src/main/sphinx/sql_language/alter_table.rst —
        @@ -96,4 +96,31 @@ You can use ``ALTER TABLE ADD PARTITION`` to add partitions to a table. The loca
        ALTER TABLE table1 DROP PARTITION (col1 = '2015' , col2 = '01', col3 = '11' )
        ALTER TABLE table1 DROP PARTITION (col1 = 'TAJO' ) PURGE

        -You can use ``ALTER TABLE DROP PARTITION`` to drop a partition for a table. This removes the data for a managed table and this doesn't remove the data for an external table. But if ``PURGE`` is specified for an external table, the partition data will be removed. The metadata is completely lost in all cases. An error is thrown if the partition for the table doesn't exists. You can use ``IF EXISTS`` to skip the error.
        +You can use ``ALTER TABLE DROP PARTITION`` to drop a partition for a table. This doesn't remove the data for a table. But if ``PURGE`` is specified, the partition data will be removed. The metadata is completely lost in all cases. An error is thrown if the partition for the table doesn't exists. You can use ``IF EXISTS`` to skip the error.
        — End diff –

        In ```if the partition for the table doesn't exists```, ```exists``` should be ```exist```.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/626#discussion_r40114567 — Diff: tajo-docs/src/main/sphinx/sql_language/alter_table.rst — @@ -96,4 +96,31 @@ You can use ``ALTER TABLE ADD PARTITION`` to add partitions to a table. The loca ALTER TABLE table1 DROP PARTITION (col1 = '2015' , col2 = '01', col3 = '11' ) ALTER TABLE table1 DROP PARTITION (col1 = 'TAJO' ) PURGE -You can use ``ALTER TABLE DROP PARTITION`` to drop a partition for a table. This removes the data for a managed table and this doesn't remove the data for an external table. But if ``PURGE`` is specified for an external table, the partition data will be removed. The metadata is completely lost in all cases. An error is thrown if the partition for the table doesn't exists. You can use ``IF EXISTS`` to skip the error. +You can use ``ALTER TABLE DROP PARTITION`` to drop a partition for a table. This doesn't remove the data for a table. But if ``PURGE`` is specified, the partition data will be removed. The metadata is completely lost in all cases. An error is thrown if the partition for the table doesn't exists. You can use ``IF EXISTS`` to skip the error. — End diff – In ```if the partition for the table doesn't exists```, ```exists``` should be ```exist```.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/626#discussion_r40114921

        — Diff: tajo-docs/src/main/sphinx/sql_language/alter_table.rst —
        @@ -96,4 +96,31 @@ You can use ``ALTER TABLE ADD PARTITION`` to add partitions to a table. The loca
        ALTER TABLE table1 DROP PARTITION (col1 = '2015' , col2 = '01', col3 = '11' )
        ALTER TABLE table1 DROP PARTITION (col1 = 'TAJO' ) PURGE

        -You can use ``ALTER TABLE DROP PARTITION`` to drop a partition for a table. This removes the data for a managed table and this doesn't remove the data for an external table. But if ``PURGE`` is specified for an external table, the partition data will be removed. The metadata is completely lost in all cases. An error is thrown if the partition for the table doesn't exists. You can use ``IF EXISTS`` to skip the error.
        +You can use ``ALTER TABLE DROP PARTITION`` to drop a partition for a table. This doesn't remove the data for a table. But if ``PURGE`` is specified, the partition data will be removed. The metadata is completely lost in all cases. An error is thrown if the partition for the table doesn't exists. You can use ``IF EXISTS`` to skip the error.
        +
        +========================
        +REPAIR PARTITION
        +========================
        +
        +Tajo stores a list of partitions for each table in its catalog. If partitions are manually added to the distributed file system, the metastore is not aware of these partitions. Running the ``ALTER TABLE REPAIR PARTITION`` statement ensures that the tables are properly populated.
        +
        +Synopsis
        +
        +.. code-block:: sql
        +
        + ALTER TABLE <table_name> REPAIR PARTITION
        +
        +Examples
        +
        +.. code-block:: sql
        +
        + ALTER TABLE student REPAIR PARTITION;
        — End diff –

        I think this example is so obvious, and thus unnecessary.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/626#discussion_r40114921 — Diff: tajo-docs/src/main/sphinx/sql_language/alter_table.rst — @@ -96,4 +96,31 @@ You can use ``ALTER TABLE ADD PARTITION`` to add partitions to a table. The loca ALTER TABLE table1 DROP PARTITION (col1 = '2015' , col2 = '01', col3 = '11' ) ALTER TABLE table1 DROP PARTITION (col1 = 'TAJO' ) PURGE -You can use ``ALTER TABLE DROP PARTITION`` to drop a partition for a table. This removes the data for a managed table and this doesn't remove the data for an external table. But if ``PURGE`` is specified for an external table, the partition data will be removed. The metadata is completely lost in all cases. An error is thrown if the partition for the table doesn't exists. You can use ``IF EXISTS`` to skip the error. +You can use ``ALTER TABLE DROP PARTITION`` to drop a partition for a table. This doesn't remove the data for a table. But if ``PURGE`` is specified, the partition data will be removed. The metadata is completely lost in all cases. An error is thrown if the partition for the table doesn't exists. You can use ``IF EXISTS`` to skip the error. + +======================== +REPAIR PARTITION +======================== + +Tajo stores a list of partitions for each table in its catalog. If partitions are manually added to the distributed file system, the metastore is not aware of these partitions. Running the ``ALTER TABLE REPAIR PARTITION`` statement ensures that the tables are properly populated. + + Synopsis + +.. code-block:: sql + + ALTER TABLE <table_name> REPAIR PARTITION + + Examples + +.. code-block:: sql + + ALTER TABLE student REPAIR PARTITION; — End diff – I think this example is so obvious, and thus unnecessary.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/626#discussion_r40115745

        — Diff: tajo-docs/src/main/sphinx/sql_language/alter_table.rst —
        @@ -96,4 +96,31 @@ You can use ``ALTER TABLE ADD PARTITION`` to add partitions to a table. The loca
        ALTER TABLE table1 DROP PARTITION (col1 = '2015' , col2 = '01', col3 = '11' )
        ALTER TABLE table1 DROP PARTITION (col1 = 'TAJO' ) PURGE

        -You can use ``ALTER TABLE DROP PARTITION`` to drop a partition for a table. This removes the data for a managed table and this doesn't remove the data for an external table. But if ``PURGE`` is specified for an external table, the partition data will be removed. The metadata is completely lost in all cases. An error is thrown if the partition for the table doesn't exists. You can use ``IF EXISTS`` to skip the error.
        +You can use ``ALTER TABLE DROP PARTITION`` to drop a partition for a table. This doesn't remove the data for a table. But if ``PURGE`` is specified, the partition data will be removed. The metadata is completely lost in all cases. An error is thrown if the partition for the table doesn't exists. You can use ``IF EXISTS`` to skip the error.
        +
        +========================
        +REPAIR PARTITION
        +========================
        +
        +Tajo stores a list of partitions for each table in its catalog. If partitions are manually added to the distributed file system, the metastore is not aware of these partitions. Running the ``ALTER TABLE REPAIR PARTITION`` statement ensures that the tables are properly populated.
        +
        +Synopsis
        +
        +.. code-block:: sql
        +
        + ALTER TABLE <table_name> REPAIR PARTITION
        +
        +Examples
        +
        +.. code-block:: sql
        +
        + ALTER TABLE student REPAIR PARTITION;
        +
        +.. note::
        +
        + If you recover partitions in the situation that partitions just exists on catalog and it doesn't exist on file system, Tajo would not make directories on the file system and would print messages to TajoMaster log as following:
        — End diff –

        How about changing this paragraph to as follows?
        ```
        Even though an information of a partition is stored in the catalog, Tajo does not recover it when its partition directory doesn't exist in the file system.
        ```

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/626#discussion_r40115745 — Diff: tajo-docs/src/main/sphinx/sql_language/alter_table.rst — @@ -96,4 +96,31 @@ You can use ``ALTER TABLE ADD PARTITION`` to add partitions to a table. The loca ALTER TABLE table1 DROP PARTITION (col1 = '2015' , col2 = '01', col3 = '11' ) ALTER TABLE table1 DROP PARTITION (col1 = 'TAJO' ) PURGE -You can use ``ALTER TABLE DROP PARTITION`` to drop a partition for a table. This removes the data for a managed table and this doesn't remove the data for an external table. But if ``PURGE`` is specified for an external table, the partition data will be removed. The metadata is completely lost in all cases. An error is thrown if the partition for the table doesn't exists. You can use ``IF EXISTS`` to skip the error. +You can use ``ALTER TABLE DROP PARTITION`` to drop a partition for a table. This doesn't remove the data for a table. But if ``PURGE`` is specified, the partition data will be removed. The metadata is completely lost in all cases. An error is thrown if the partition for the table doesn't exists. You can use ``IF EXISTS`` to skip the error. + +======================== +REPAIR PARTITION +======================== + +Tajo stores a list of partitions for each table in its catalog. If partitions are manually added to the distributed file system, the metastore is not aware of these partitions. Running the ``ALTER TABLE REPAIR PARTITION`` statement ensures that the tables are properly populated. + + Synopsis + +.. code-block:: sql + + ALTER TABLE <table_name> REPAIR PARTITION + + Examples + +.. code-block:: sql + + ALTER TABLE student REPAIR PARTITION; + +.. note:: + + If you recover partitions in the situation that partitions just exists on catalog and it doesn't exist on file system, Tajo would not make directories on the file system and would print messages to TajoMaster log as following: — End diff – How about changing this paragraph to as follows? ``` Even though an information of a partition is stored in the catalog, Tajo does not recover it when its partition directory doesn't exist in the file system. ```
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/626#discussion_r40204949

        — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java —
        @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer
        catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(),
        alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION));

        • // When dropping partition on an managed table, the data will be delete from file system.
        • if (!desc.isExternal()) {
          + // When dropping partition on a table, the data in the table will NOT be deleted from the file system.
          + // But if PURGE is specified, the partition data will be deleted.
          + if (alterTable.isPurge()) { deletePartitionPath(partitionDescProto); - }

          else {

        • // When dropping partition on an external table, the data in the table will NOT be deleted from the file
        • // system. But if PURGE is specified, the partition data will be deleted.
        • if (alterTable.isPurge()) { - deletePartitionPath(partitionDescProto); - }

          }
          }
          -
          + break;
          + case REPAIR_PARTITION:
          + repairPartition(context, queryContext, alterTable);
          break;
          default:
          throw new InternalError("alterTable cannot handle such query: \n" + alterTable.toJson());
          }
          }

        + /**
        + * Run ALTER TABLE table_name REPAIR TABLE statement.
        + * This will recovery all partitions which exists on table directory.
        + *
        + *
        + * @param context
        + * @param queryContext
        + * @param alterTable
        + * @throws IOException
        + */
        + public void repairPartition(TajoMaster.MasterContext context, final QueryContext queryContext,
        + final AlterTableNode alterTable) throws IOException, TajoException {
        + final CatalogService catalog = context.getCatalog();
        + final String tableName = alterTable.getTableName();
        +
        + String databaseName;
        + String simpleTableName;
        + if (CatalogUtil.isFQTableName(tableName))

        { + String[] split = CatalogUtil.splitFQTableName(tableName); + databaseName = split[0]; + simpleTableName = split[1]; + }

        else

        { + databaseName = queryContext.getCurrentDatabase(); + simpleTableName = tableName; + }

        +
        + if (!catalog.existsTable(databaseName, simpleTableName))

        { + throw new UndefinedTableException(alterTable.getTableName()); + }

        +
        + TableDesc tableDesc = catalog.getTableDesc(databaseName, simpleTableName);
        +
        + if(tableDesc.getPartitionMethod() == null)

        { + throw new UndefinedPartitionMethodException(simpleTableName); + }

        +
        + Path tablePath = new Path(tableDesc.getUri());
        + FileSystem fs = tablePath.getFileSystem(context.getConf());
        +
        + PartitionMethodDesc partitionDesc = tableDesc.getPartitionMethod();
        + Schema partitionColumns = new Schema();
        + for (Column column : partitionDesc.getExpressionSchema().getRootColumns())

        { + partitionColumns.addColumn(column); + }

        +
        + // Get the array of path filter, accepting all partition paths.
        + PathFilter[] filters = PartitionedTableRewriter.buildAllAcceptingPathFilters(partitionColumns);
        +
        + // loop from one to the number of partition columns
        + Path [] filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(tablePath, filters[0]));
        +
        + // Get all file status matched to a ith level path filter.
        + for (int i = 1; i < partitionColumns.size(); i++)

        { + filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(filteredPaths, filters[i])); + }

        +
        + // Find missing partitions from filesystem
        — End diff –

        I misunderstood the below codes. Sorry.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on a diff in the pull request: https://github.com/apache/tajo/pull/626#discussion_r40204949 — Diff: tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java — @@ -529,24 +536,137 @@ public void alterTable(TajoMaster.MasterContext context, final QueryContext quer catalog.alterTable(CatalogUtil.addOrDropPartition(qualifiedName, alterTable.getPartitionColumns(), alterTable.getPartitionValues(), alterTable.getLocation(), AlterTableType.DROP_PARTITION)); // When dropping partition on an managed table, the data will be delete from file system. if (!desc.isExternal()) { + // When dropping partition on a table, the data in the table will NOT be deleted from the file system. + // But if PURGE is specified, the partition data will be deleted. + if (alterTable.isPurge()) { deletePartitionPath(partitionDescProto); - } else { // When dropping partition on an external table, the data in the table will NOT be deleted from the file // system. But if PURGE is specified, the partition data will be deleted. if (alterTable.isPurge()) { - deletePartitionPath(partitionDescProto); - } } } - + break; + case REPAIR_PARTITION: + repairPartition(context, queryContext, alterTable); break; default: throw new InternalError("alterTable cannot handle such query: \n" + alterTable.toJson()); } } + /** + * Run ALTER TABLE table_name REPAIR TABLE statement. + * This will recovery all partitions which exists on table directory. + * + * + * @param context + * @param queryContext + * @param alterTable + * @throws IOException + */ + public void repairPartition(TajoMaster.MasterContext context, final QueryContext queryContext, + final AlterTableNode alterTable) throws IOException, TajoException { + final CatalogService catalog = context.getCatalog(); + final String tableName = alterTable.getTableName(); + + String databaseName; + String simpleTableName; + if (CatalogUtil.isFQTableName(tableName)) { + String[] split = CatalogUtil.splitFQTableName(tableName); + databaseName = split[0]; + simpleTableName = split[1]; + } else { + databaseName = queryContext.getCurrentDatabase(); + simpleTableName = tableName; + } + + if (!catalog.existsTable(databaseName, simpleTableName)) { + throw new UndefinedTableException(alterTable.getTableName()); + } + + TableDesc tableDesc = catalog.getTableDesc(databaseName, simpleTableName); + + if(tableDesc.getPartitionMethod() == null) { + throw new UndefinedPartitionMethodException(simpleTableName); + } + + Path tablePath = new Path(tableDesc.getUri()); + FileSystem fs = tablePath.getFileSystem(context.getConf()); + + PartitionMethodDesc partitionDesc = tableDesc.getPartitionMethod(); + Schema partitionColumns = new Schema(); + for (Column column : partitionDesc.getExpressionSchema().getRootColumns()) { + partitionColumns.addColumn(column); + } + + // Get the array of path filter, accepting all partition paths. + PathFilter[] filters = PartitionedTableRewriter.buildAllAcceptingPathFilters(partitionColumns); + + // loop from one to the number of partition columns + Path [] filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(tablePath, filters [0] )); + + // Get all file status matched to a ith level path filter. + for (int i = 1; i < partitionColumns.size(); i++) { + filteredPaths = PartitionedTableRewriter.toPathArray(fs.listStatus(filteredPaths, filters[i])); + } + + // Find missing partitions from filesystem — End diff – I misunderstood the below codes. Sorry.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/626#issuecomment-142608568

        +1. Thank you for your work!

        Show
        githubbot ASF GitHub Bot added a comment - Github user jihoonson commented on the pull request: https://github.com/apache/tajo/pull/626#issuecomment-142608568 +1. Thank you for your work!
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/626#issuecomment-142637917

        Thanks @jihoonson

        I'll ship it at once.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/626#issuecomment-142637917 Thanks @jihoonson I'll ship it at once.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/626

        Show
        githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/tajo/pull/626
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-CODEGEN-build #536 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/536/)
        TAJO-1673: Implement recover partitions. (blrunner: rev 5af330d228991b9a2a7a9ce49979fd6feb84d464)

        • tajo-sql-parser/src/main/antlr4/org/apache/tajo/parser/sql/SQLParser.g4
        • tajo-plan/src/main/java/org/apache/tajo/plan/serder/LogicalNodeDeserializer.java
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/TestLogicalPlanner.java
        • tajo-core/src/main/java/org/apache/tajo/parser/sql/SQLAnalyzer.java
        • tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java
        • tajo-core-tests/src/test/resources/queries/TestSQLAnalyzer/alter_table_repair_partition_1.sql
        • tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/rules/PartitionedTableRewriter.java
        • CHANGES
        • tajo-core-tests/src/test/resources/queries/TestAlterTable/alter_table_drop_partition1.sql
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestAlterTable.java
        • tajo-core-tests/src/test/resources/queries/TestAlterTable/create_partitioned_table2.sql
        • tajo-algebra/src/main/java/org/apache/tajo/algebra/AlterTableOpType.java
        • tajo-core-tests/src/test/resources/results/TestSQLAnalyzer/alter_table_repair_partition_1.result
        • tajo-sql-parser/src/main/antlr4/org/apache/tajo/parser/sql/SQLLexer.g4
        • tajo-plan/src/main/proto/Plan.proto
        • tajo-plan/src/main/java/org/apache/tajo/plan/serder/LogicalNodeSerializer.java
        • tajo-docs/src/main/sphinx/sql_language/alter_table.rst
        • tajo-core-tests/src/test/resources/queries/TestAlterTable/alter_table_drop_partition2.sql
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Tajo-master-CODEGEN-build #536 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/536/ ) TAJO-1673 : Implement recover partitions. (blrunner: rev 5af330d228991b9a2a7a9ce49979fd6feb84d464) tajo-sql-parser/src/main/antlr4/org/apache/tajo/parser/sql/SQLParser.g4 tajo-plan/src/main/java/org/apache/tajo/plan/serder/LogicalNodeDeserializer.java tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/TestLogicalPlanner.java tajo-core/src/main/java/org/apache/tajo/parser/sql/SQLAnalyzer.java tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java tajo-core-tests/src/test/resources/queries/TestSQLAnalyzer/alter_table_repair_partition_1.sql tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/rules/PartitionedTableRewriter.java CHANGES tajo-core-tests/src/test/resources/queries/TestAlterTable/alter_table_drop_partition1.sql tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestAlterTable.java tajo-core-tests/src/test/resources/queries/TestAlterTable/create_partitioned_table2.sql tajo-algebra/src/main/java/org/apache/tajo/algebra/AlterTableOpType.java tajo-core-tests/src/test/resources/results/TestSQLAnalyzer/alter_table_repair_partition_1.result tajo-sql-parser/src/main/antlr4/org/apache/tajo/parser/sql/SQLLexer.g4 tajo-plan/src/main/proto/Plan.proto tajo-plan/src/main/java/org/apache/tajo/plan/serder/LogicalNodeSerializer.java tajo-docs/src/main/sphinx/sql_language/alter_table.rst tajo-core-tests/src/test/resources/queries/TestAlterTable/alter_table_drop_partition2.sql
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-0.11.0-build #70 (See https://builds.apache.org/job/Tajo-0.11.0-build/70/)
        TAJO-1673: Implement recover partitions. (blrunner: rev fcc0c030c5fcd3f72d46b225decd75ee668bfd7c)

        • tajo-core/src/main/java/org/apache/tajo/parser/sql/SQLAnalyzer.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/serder/LogicalNodeSerializer.java
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestAlterTable.java
        • tajo-core-tests/src/test/resources/queries/TestAlterTable/alter_table_drop_partition1.sql
        • tajo-docs/src/main/sphinx/sql_language/alter_table.rst
        • tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/rules/PartitionedTableRewriter.java
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/TestLogicalPlanner.java
        • tajo-algebra/src/main/java/org/apache/tajo/algebra/AlterTableOpType.java
        • tajo-sql-parser/src/main/antlr4/org/apache/tajo/parser/sql/SQLParser.g4
        • tajo-core-tests/src/test/resources/results/TestSQLAnalyzer/alter_table_repair_partition_1.result
        • CHANGES
        • tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java
        • tajo-sql-parser/src/main/antlr4/org/apache/tajo/parser/sql/SQLLexer.g4
        • tajo-plan/src/main/proto/Plan.proto
        • tajo-core-tests/src/test/resources/queries/TestAlterTable/alter_table_drop_partition2.sql
        • tajo-core-tests/src/test/resources/queries/TestAlterTable/create_partitioned_table2.sql
        • tajo-core-tests/src/test/resources/queries/TestSQLAnalyzer/alter_table_repair_partition_1.sql
        • tajo-plan/src/main/java/org/apache/tajo/plan/serder/LogicalNodeDeserializer.java
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Tajo-0.11.0-build #70 (See https://builds.apache.org/job/Tajo-0.11.0-build/70/ ) TAJO-1673 : Implement recover partitions. (blrunner: rev fcc0c030c5fcd3f72d46b225decd75ee668bfd7c) tajo-core/src/main/java/org/apache/tajo/parser/sql/SQLAnalyzer.java tajo-plan/src/main/java/org/apache/tajo/plan/serder/LogicalNodeSerializer.java tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestAlterTable.java tajo-core-tests/src/test/resources/queries/TestAlterTable/alter_table_drop_partition1.sql tajo-docs/src/main/sphinx/sql_language/alter_table.rst tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/rules/PartitionedTableRewriter.java tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/TestLogicalPlanner.java tajo-algebra/src/main/java/org/apache/tajo/algebra/AlterTableOpType.java tajo-sql-parser/src/main/antlr4/org/apache/tajo/parser/sql/SQLParser.g4 tajo-core-tests/src/test/resources/results/TestSQLAnalyzer/alter_table_repair_partition_1.result CHANGES tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java tajo-sql-parser/src/main/antlr4/org/apache/tajo/parser/sql/SQLLexer.g4 tajo-plan/src/main/proto/Plan.proto tajo-core-tests/src/test/resources/queries/TestAlterTable/alter_table_drop_partition2.sql tajo-core-tests/src/test/resources/queries/TestAlterTable/create_partitioned_table2.sql tajo-core-tests/src/test/resources/queries/TestSQLAnalyzer/alter_table_repair_partition_1.sql tajo-plan/src/main/java/org/apache/tajo/plan/serder/LogicalNodeDeserializer.java
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #904 (See https://builds.apache.org/job/Tajo-master-build/904/)
        TAJO-1673: Implement recover partitions. (blrunner: rev 5af330d228991b9a2a7a9ce49979fd6feb84d464)

        • tajo-core/src/main/java/org/apache/tajo/parser/sql/SQLAnalyzer.java
        • tajo-sql-parser/src/main/antlr4/org/apache/tajo/parser/sql/SQLParser.g4
        • tajo-core-tests/src/test/resources/queries/TestSQLAnalyzer/alter_table_repair_partition_1.sql
        • tajo-core-tests/src/test/resources/results/TestSQLAnalyzer/alter_table_repair_partition_1.result
        • tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java
        • tajo-plan/src/main/proto/Plan.proto
        • tajo-plan/src/main/java/org/apache/tajo/plan/serder/LogicalNodeDeserializer.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/rules/PartitionedTableRewriter.java
        • tajo-core-tests/src/test/resources/queries/TestAlterTable/alter_table_drop_partition1.sql
        • CHANGES
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestAlterTable.java
        • tajo-algebra/src/main/java/org/apache/tajo/algebra/AlterTableOpType.java
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/TestLogicalPlanner.java
        • tajo-sql-parser/src/main/antlr4/org/apache/tajo/parser/sql/SQLLexer.g4
        • tajo-plan/src/main/java/org/apache/tajo/plan/serder/LogicalNodeSerializer.java
        • tajo-core-tests/src/test/resources/queries/TestAlterTable/create_partitioned_table2.sql
        • tajo-docs/src/main/sphinx/sql_language/alter_table.rst
        • tajo-core-tests/src/test/resources/queries/TestAlterTable/alter_table_drop_partition2.sql
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Tajo-master-build #904 (See https://builds.apache.org/job/Tajo-master-build/904/ ) TAJO-1673 : Implement recover partitions. (blrunner: rev 5af330d228991b9a2a7a9ce49979fd6feb84d464) tajo-core/src/main/java/org/apache/tajo/parser/sql/SQLAnalyzer.java tajo-sql-parser/src/main/antlr4/org/apache/tajo/parser/sql/SQLParser.g4 tajo-core-tests/src/test/resources/queries/TestSQLAnalyzer/alter_table_repair_partition_1.sql tajo-core-tests/src/test/resources/results/TestSQLAnalyzer/alter_table_repair_partition_1.result tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java tajo-plan/src/main/proto/Plan.proto tajo-plan/src/main/java/org/apache/tajo/plan/serder/LogicalNodeDeserializer.java tajo-plan/src/main/java/org/apache/tajo/plan/rewrite/rules/PartitionedTableRewriter.java tajo-core-tests/src/test/resources/queries/TestAlterTable/alter_table_drop_partition1.sql CHANGES tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestAlterTable.java tajo-algebra/src/main/java/org/apache/tajo/algebra/AlterTableOpType.java tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/TestLogicalPlanner.java tajo-sql-parser/src/main/antlr4/org/apache/tajo/parser/sql/SQLLexer.g4 tajo-plan/src/main/java/org/apache/tajo/plan/serder/LogicalNodeSerializer.java tajo-core-tests/src/test/resources/queries/TestAlterTable/create_partitioned_table2.sql tajo-docs/src/main/sphinx/sql_language/alter_table.rst tajo-core-tests/src/test/resources/queries/TestAlterTable/alter_table_drop_partition2.sql

          People

          • Assignee:
            blrunner Jaehwa Jung
            Reporter:
            blrunner Jaehwa Jung
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development