Uploaded image for project: 'Tajo'
  1. Tajo
  2. TAJO-1644

When inserting empty data into a partitioned table, existing data would be removed.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0, 0.10.2
    • Component/s: QueryMaster
    • Labels:
      None

      Description

      When inserting empty data into a partitioned table, existing data would be removed. Tajo provides column value partition which is hive-style partition. In hive, when inserting empty data into a partition, there are two cases. If you use dynamic partitions, existing data never would be removed. But if you don't use dynamic partitions, existing data would be removed. When inserting a data to partition, tajo user don't specify each column and each column value. So, it is similar to dynamic partition of hive. It seems to update deletion logic of partitioned table.

      1. TAJO-1644_2.patch
        13 kB
        Jaehwa Jung
      2. TAJO-1644.patch
        9 kB
        Jaehwa Jung

        Activity

        Hide
        githubbot ASF GitHub Bot added a comment -

        GitHub user blrunner opened a pull request:

        https://github.com/apache/tajo/pull/601

        TAJO-1644: When inserting empty data into a partitioned table, existing data would be removed.

        When inserting empty data into a partitioned table, existing data would be removed. Tajo provides column value partition which is hive-style partition. In hive, when inserting empty data into a partition, there are two cases. If you use dynamic partitions, existing data never would be removed. But if you don't use dynamic partitions, existing data would be removed. When inserting a data to partition, tajo user don't specify each column and each column value. So, it is similar to dynamic partition of hive. It seems to update deletion logic of partitioned table.

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/blrunner/tajo TAJO-1644

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/601.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #601


        commit 67f818e23e82099a4dd84f8deb21f9008c912faf
        Author: JaeHwa Jung <blrunner@apache.org>
        Date: 2015-06-15T22:47:53Z

        TAJO-1644: When inserting empty data into a partitioned table, existing data would be removed.

        commit 74c3e29bae6cba30dec33bf19f61af8141ab2dd4
        Author: JaeHwa Jung <blrunner@apache.org>
        Date: 2015-06-15T22:49:14Z

        Remove unnecessary packages.


        Show
        githubbot ASF GitHub Bot added a comment - GitHub user blrunner opened a pull request: https://github.com/apache/tajo/pull/601 TAJO-1644 : When inserting empty data into a partitioned table, existing data would be removed. When inserting empty data into a partitioned table, existing data would be removed. Tajo provides column value partition which is hive-style partition. In hive, when inserting empty data into a partition, there are two cases. If you use dynamic partitions, existing data never would be removed. But if you don't use dynamic partitions, existing data would be removed. When inserting a data to partition, tajo user don't specify each column and each column value. So, it is similar to dynamic partition of hive. It seems to update deletion logic of partitioned table. You can merge this pull request into a Git repository by running: $ git pull https://github.com/blrunner/tajo TAJO-1644 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tajo/pull/601.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #601 commit 67f818e23e82099a4dd84f8deb21f9008c912faf Author: JaeHwa Jung <blrunner@apache.org> Date: 2015-06-15T22:47:53Z TAJO-1644 : When inserting empty data into a partitioned table, existing data would be removed. commit 74c3e29bae6cba30dec33bf19f61af8141ab2dd4 Author: JaeHwa Jung <blrunner@apache.org> Date: 2015-06-15T22:49:14Z Remove unnecessary packages.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/601#discussion_r32494015

        — Diff: tajo-core/src/test/java/org/apache/tajo/engine/query/TestTablePartitions.java —
        @@ -548,54 +526,61 @@ public final void testInsertIntoColumnPartitionedTableByThreeColumns() throws Ex
        + " where l_orderkey = 1 and l_partkey = 1 and l_linenumber = 1");
        res.close();

        • desc = catalog.getTableDesc(DEFAULT_DATABASE_NAME, tableName);
        • assertTrue(fs.isDirectory(path));
        • assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=1")));
        • assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=1/col2=1")));
        • assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=1/col2=1/col3=17.0")));
        • assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=1/col2=1/col3=30.0")));
        • assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=2")));
        • assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=2/col2=2")));
        • assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=2/col2=2/col3=38.0")));
        • assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=3")));
        • assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=3/col2=2")));
        • assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=3/col2=3")));
        • assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=3/col2=2/col3=45.0")));
        • assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=3/col2=3/col3=49.0")));
          -
          + verifyDirectoriesForThreeColumns(fs, path, 3);
          if (!testingCluster.isHiveCatalogStoreRunning()) { // TODO: If there is existing another partition directory, we must add its rows number to result row numbers. + // desc = catalog.getTableDesc(DEFAULT_DATABASE_NAME, tableName); // assertEquals(6, desc.getStats().getNumRows().intValue()); }
        • res = executeString("select * from " + tableName + " where col2 = 1");
        • resultSetData = resultSetToString(res);
        • res.close();
        • expected = "col4,col1,col2,col3\n" +
        • "-------------------------------\n" +
        • "N,1,1,17.0\n" +
        • "N,1,1,17.0\n" +
        • "N,1,1,30.0\n" +
        • "N,1,1,36.0\n" +
        • "N,1,1,36.0\n";
          -
        • assertEquals(expected, resultSetData);
          + verifyMaintainExistingData(res, tableName);

        // insert overwrite empty result to partitioned table
        res = executeString("insert overwrite into " + tableName

        • + " select l_returnflag, l_orderkey, l_partkey, l_quantity from lineitem where l_orderkey" +
        • " > 100");
          + + " select l_returnflag, l_orderkey, l_partkey, l_quantity from lineitem where l_orderkey > 100");
          res.close();
        • desc = catalog.getTableDesc(DEFAULT_DATABASE_NAME, tableName);
          + verifyDirectoriesForThreeColumns(fs, path, 4);
          + verifyMaintainExistingData(res, tableName);
        • ContentSummary summary = fs.getContentSummary(new Path(desc.getPath()));
          + executeString("DROP TABLE " + tableName + " PURGE").close();
          + }
        • assertEquals(summary.getDirectoryCount(), 1L);
        • assertEquals(summary.getFileCount(), 0L);
        • assertEquals(summary.getLength(), 0L);
          + private final void verifyMaintainExistingData(ResultSet res, String tableName) throws Exception {
            • End diff –

        It would be better if you rename verifyMaintainExistingData to verifyKeptExistingData.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/601#discussion_r32494015 — Diff: tajo-core/src/test/java/org/apache/tajo/engine/query/TestTablePartitions.java — @@ -548,54 +526,61 @@ public final void testInsertIntoColumnPartitionedTableByThreeColumns() throws Ex + " where l_orderkey = 1 and l_partkey = 1 and l_linenumber = 1"); res.close(); desc = catalog.getTableDesc(DEFAULT_DATABASE_NAME, tableName); assertTrue(fs.isDirectory(path)); assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=1"))); assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=1/col2=1"))); assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=1/col2=1/col3=17.0"))); assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=1/col2=1/col3=30.0"))); assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=2"))); assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=2/col2=2"))); assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=2/col2=2/col3=38.0"))); assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=3"))); assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=3/col2=2"))); assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=3/col2=3"))); assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=3/col2=2/col3=45.0"))); assertTrue(fs.isDirectory(new Path(path.toUri() + "/col1=3/col2=3/col3=49.0"))); - + verifyDirectoriesForThreeColumns(fs, path, 3); if (!testingCluster.isHiveCatalogStoreRunning()) { // TODO: If there is existing another partition directory, we must add its rows number to result row numbers. + // desc = catalog.getTableDesc(DEFAULT_DATABASE_NAME, tableName); // assertEquals(6, desc.getStats().getNumRows().intValue()); } res = executeString("select * from " + tableName + " where col2 = 1"); resultSetData = resultSetToString(res); res.close(); expected = "col4,col1,col2,col3\n" + "-------------------------------\n" + "N,1,1,17.0\n" + "N,1,1,17.0\n" + "N,1,1,30.0\n" + "N,1,1,36.0\n" + "N,1,1,36.0\n"; - assertEquals(expected, resultSetData); + verifyMaintainExistingData(res, tableName); // insert overwrite empty result to partitioned table res = executeString("insert overwrite into " + tableName + " select l_returnflag, l_orderkey, l_partkey, l_quantity from lineitem where l_orderkey" + " > 100"); + + " select l_returnflag, l_orderkey, l_partkey, l_quantity from lineitem where l_orderkey > 100"); res.close(); desc = catalog.getTableDesc(DEFAULT_DATABASE_NAME, tableName); + verifyDirectoriesForThreeColumns(fs, path, 4); + verifyMaintainExistingData(res, tableName); ContentSummary summary = fs.getContentSummary(new Path(desc.getPath())); + executeString("DROP TABLE " + tableName + " PURGE").close(); + } assertEquals(summary.getDirectoryCount(), 1L); assertEquals(summary.getFileCount(), 0L); assertEquals(summary.getLength(), 0L); + private final void verifyMaintainExistingData(ResultSet res, String tableName) throws Exception { End diff – It would be better if you rename verifyMaintainExistingData to verifyKeptExistingData.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/601#discussion_r32494311

        — Diff: tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileTablespace.java —
        @@ -892,9 +892,8 @@ protected Path commitOutputData(OverridableConf queryContext, boolean changeFile
        boolean movedToOldTable = false;
        boolean committed = false;
        Path oldTableDir = new Path(stagingDir, TajoConstants.INSERT_OVERWIRTE_OLD_TABLE_NAME);

        • ContentSummary summary = fs.getContentSummary(stagingResultDir);
        • if (!queryContext.get(QueryVars.OUTPUT_PARTITIONS, "").isEmpty() && summary.getFileCount() > 0L) {
          + if (!queryContext.get(QueryVars.OUTPUT_PARTITIONS, "").isEmpty()) {
            • End diff –

        It would be better if it works according to session variable to allow users to decide whether original data set is removed or not when inserting rows is empty.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/601#discussion_r32494311 — Diff: tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileTablespace.java — @@ -892,9 +892,8 @@ protected Path commitOutputData(OverridableConf queryContext, boolean changeFile boolean movedToOldTable = false; boolean committed = false; Path oldTableDir = new Path(stagingDir, TajoConstants.INSERT_OVERWIRTE_OLD_TABLE_NAME); ContentSummary summary = fs.getContentSummary(stagingResultDir); if (!queryContext.get(QueryVars.OUTPUT_PARTITIONS, "").isEmpty() && summary.getFileCount() > 0L) { + if (!queryContext.get(QueryVars.OUTPUT_PARTITIONS, "").isEmpty()) { End diff – It would be better if it works according to session variable to allow users to decide whether original data set is removed or not when inserting rows is empty.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/601#issuecomment-112447587

        Thanks @hyunsik

        I've just updated the patch using your comments.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/601#issuecomment-112447587 Thanks @hyunsik I've just updated the patch using your comments.
        Hide
        blrunner Jaehwa Jung added a comment -

        I've fixed a test case bug.

        Show
        blrunner Jaehwa Jung added a comment - I've fixed a test case bug.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/601#issuecomment-113364944

        Hi @blrunner,

        Thank you for your work. How about the insert overwrite behavior for not partitioned table? Does that the same semantic?

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on the pull request: https://github.com/apache/tajo/pull/601#issuecomment-113364944 Hi @blrunner, Thank you for your work. How about the insert overwrite behavior for not partitioned table? Does that the same semantic?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/601#issuecomment-113871681

        Hi @hyunsik

        Thank you for your review. I also agree with you. But when inserting empty data into a non partitioned table in hive, existing data would be removed always. I'm a bit anxious about user's confusion between hive and tajo.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/601#issuecomment-113871681 Hi @hyunsik Thank you for your review. I also agree with you. But when inserting empty data into a non partitioned table in hive, existing data would be removed always. I'm a bit anxious about user's confusion between hive and tajo.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/601#discussion_r33110575

        — Diff: tajo-common/src/main/java/org/apache/tajo/SessionVars.java —
        @@ -126,6 +126,9 @@
        NULL_CHAR(ConfVars.$TEXT_NULL, "null char of text file output", DEFAULT),
        CODEGEN(ConfVars.$CODEGEN, "Runtime code generation enabled (experiment)", DEFAULT),

        + TABLE_PARTITION_KEPT_EXISTING_DATA_ENABLED(ConfVars.$TABLE_PARTITION_KEPT_EXISTING_DATA_ENABLED,
        — End diff –

        How about TABLE_PARTITION_NO_RESULT_OVERWRITE_ENABLED for the config name? It seems to be more intuitive for me.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/601#discussion_r33110575 — Diff: tajo-common/src/main/java/org/apache/tajo/SessionVars.java — @@ -126,6 +126,9 @@ NULL_CHAR(ConfVars.$TEXT_NULL, "null char of text file output", DEFAULT), CODEGEN(ConfVars.$CODEGEN, "Runtime code generation enabled (experiment)", DEFAULT), + TABLE_PARTITION_KEPT_EXISTING_DATA_ENABLED(ConfVars.$TABLE_PARTITION_KEPT_EXISTING_DATA_ENABLED, — End diff – How about TABLE_PARTITION_NO_RESULT_OVERWRITE_ENABLED for the config name? It seems to be more intuitive for me.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/601#discussion_r33110710

        — Diff: tajo-common/src/main/java/org/apache/tajo/SessionVars.java —
        @@ -126,6 +126,9 @@
        NULL_CHAR(ConfVars.$TEXT_NULL, "null char of text file output", DEFAULT),
        CODEGEN(ConfVars.$CODEGEN, "Runtime code generation enabled (experiment)", DEFAULT),

        + TABLE_PARTITION_KEPT_EXISTING_DATA_ENABLED(ConfVars.$TABLE_PARTITION_KEPT_EXISTING_DATA_ENABLED,
        + "When inserting empty data into a partitioned table, set to keep existing data.", DEFAULT),
        — End diff –

        It would be better if the description is 'If True, a partitioned table is overwritten even if a sub query leads to no result. Otherwise, the table data will be kept if there is no result.'.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/601#discussion_r33110710 — Diff: tajo-common/src/main/java/org/apache/tajo/SessionVars.java — @@ -126,6 +126,9 @@ NULL_CHAR(ConfVars.$TEXT_NULL, "null char of text file output", DEFAULT), CODEGEN(ConfVars.$CODEGEN, "Runtime code generation enabled (experiment)", DEFAULT), + TABLE_PARTITION_KEPT_EXISTING_DATA_ENABLED(ConfVars.$TABLE_PARTITION_KEPT_EXISTING_DATA_ENABLED, + "When inserting empty data into a partitioned table, set to keep existing data.", DEFAULT), — End diff – It would be better if the description is 'If True, a partitioned table is overwritten even if a sub query leads to no result. Otherwise, the table data will be kept if there is no result.'.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/601#discussion_r33110732

        — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java —
        @@ -376,7 +375,10 @@ public static int setDateOrder(int dateOrder) {
        // Behavior Control ---------------------------------------------------------
        $BEHAVIOR_ARITHMETIC_ABORT("tajo.behavior.arithmetic-abort", false),

        • // ResultSet ---------------------------------------------------------
          + // When inserting empty data into a partitioned table, set to keep existing data.
            • End diff –

        This comment needs to be updated by my above comment.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/601#discussion_r33110732 — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java — @@ -376,7 +375,10 @@ public static int setDateOrder(int dateOrder) { // Behavior Control --------------------------------------------------------- $BEHAVIOR_ARITHMETIC_ABORT("tajo.behavior.arithmetic-abort", false), // ResultSet --------------------------------------------------------- + // When inserting empty data into a partitioned table, set to keep existing data. End diff – This comment needs to be updated by my above comment.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/601#discussion_r33110852

        — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java —
        @@ -376,7 +375,10 @@ public static int setDateOrder(int dateOrder) {
        // Behavior Control ---------------------------------------------------------
        $BEHAVIOR_ARITHMETIC_ABORT("tajo.behavior.arithmetic-abort", false),

        • // ResultSet ---------------------------------------------------------
          + // When inserting empty data into a partitioned table, set to keep existing data.
          + $TABLE_PARTITION_KEPT_EXISTING_DATA_ENABLED("tajo.table-partition.kept-existing-data.enabled", true),
            • End diff –

        This also should be updated by my above comment. Also, it would be nice if the config key is changed to 'tajo.partition.overwrite.even-if-no-result', and its default config should be ``false``.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/601#discussion_r33110852 — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java — @@ -376,7 +375,10 @@ public static int setDateOrder(int dateOrder) { // Behavior Control --------------------------------------------------------- $BEHAVIOR_ARITHMETIC_ABORT("tajo.behavior.arithmetic-abort", false), // ResultSet --------------------------------------------------------- + // When inserting empty data into a partitioned table, set to keep existing data. + $TABLE_PARTITION_KEPT_EXISTING_DATA_ENABLED("tajo.table-partition.kept-existing-data.enabled", true), End diff – This also should be updated by my above comment. Also, it would be nice if the config key is changed to 'tajo.partition.overwrite.even-if-no-result', and its default config should be ``false``.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/601#discussion_r33110868

        — Diff: tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileTablespace.java —
        @@ -894,7 +894,12 @@ protected Path commitOutputData(OverridableConf queryContext, boolean changeFile
        Path oldTableDir = new Path(stagingDir, TajoConstants.INSERT_OVERWIRTE_OLD_TABLE_NAME);
        ContentSummary summary = fs.getContentSummary(stagingResultDir);

        • if (!queryContext.get(QueryVars.OUTPUT_PARTITIONS, "").isEmpty() && summary.getFileCount() > 0L) {
          + // When inserting empty data into a partitioned table, check if keep existing data need to be remove or not.
          + boolean keptEnabled = queryContext.getBool(SessionVars.TABLE_PARTITION_KEPT_EXISTING_DATA_ENABLED);
          +
          + // If existing data doesn't need to keep, check if there are some files.
          + if ( (!queryContext.get(QueryVars.OUTPUT_PARTITIONS, "").isEmpty())
            • End diff –

        This condition should be changed depending on the changed default value.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/601#discussion_r33110868 — Diff: tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileTablespace.java — @@ -894,7 +894,12 @@ protected Path commitOutputData(OverridableConf queryContext, boolean changeFile Path oldTableDir = new Path(stagingDir, TajoConstants.INSERT_OVERWIRTE_OLD_TABLE_NAME); ContentSummary summary = fs.getContentSummary(stagingResultDir); if (!queryContext.get(QueryVars.OUTPUT_PARTITIONS, "").isEmpty() && summary.getFileCount() > 0L) { + // When inserting empty data into a partitioned table, check if keep existing data need to be remove or not. + boolean keptEnabled = queryContext.getBool(SessionVars.TABLE_PARTITION_KEPT_EXISTING_DATA_ENABLED); + + // If existing data doesn't need to keep, check if there are some files. + if ( (!queryContext.get(QueryVars.OUTPUT_PARTITIONS, "").isEmpty()) End diff – This condition should be changed depending on the changed default value.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/601#issuecomment-114696318

        @hyunsik

        Thank you for your detailed review.
        I've just updated the patch using your comments.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/601#issuecomment-114696318 @hyunsik Thank you for your detailed review. I've just updated the patch using your comments.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/601#discussion_r33194715

        — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java —
        @@ -376,7 +375,11 @@ public static int setDateOrder(int dateOrder) {
        // Behavior Control ---------------------------------------------------------
        $BEHAVIOR_ARITHMETIC_ABORT("tajo.behavior.arithmetic-abort", false),

        • // ResultSet ---------------------------------------------------------
          + // If True, a partitioned table is overwritten even if a sub query leads to no result.
          + // Otherwise, the table data will be kept if there is no result
          + $TABLE_PARTITION_NO_RESULT_OVERWRITE_ENABLED("tajo.partition.overwrite.even-if-no-result", false),
            • End diff –

        I have one more suggestion. It would be great if you use PARTITION instead of TABLE_PARTITION. It is still clear because we only use the term 'partition' to indicate table partition.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/601#discussion_r33194715 — Diff: tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java — @@ -376,7 +375,11 @@ public static int setDateOrder(int dateOrder) { // Behavior Control --------------------------------------------------------- $BEHAVIOR_ARITHMETIC_ABORT("tajo.behavior.arithmetic-abort", false), // ResultSet --------------------------------------------------------- + // If True, a partitioned table is overwritten even if a sub query leads to no result. + // Otherwise, the table data will be kept if there is no result + $TABLE_PARTITION_NO_RESULT_OVERWRITE_ENABLED("tajo.partition.overwrite.even-if-no-result", false), End diff – I have one more suggestion. It would be great if you use PARTITION instead of TABLE_PARTITION. It is still clear because we only use the term 'partition' to indicate table partition.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/601#discussion_r33194745

        — Diff: tajo-common/src/main/java/org/apache/tajo/SessionVars.java —
        @@ -126,6 +126,10 @@
        NULL_CHAR(ConfVars.$TEXT_NULL, "null char of text file output", DEFAULT),
        CODEGEN(ConfVars.$CODEGEN, "Runtime code generation enabled (experiment)", DEFAULT),

        + TABLE_PARTITION_NO_RESULT_OVERWRITE_ENABLED(ConfVars.$TABLE_PARTITION_NO_RESULT_OVERWRITE_ENABLED,
        — End diff –

        I have one more suggestion. It would be great if you use PARTITION instead of TABLE_PARTITION. It is still clear because we only use the term 'partition' to indicate table partition.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/601#discussion_r33194745 — Diff: tajo-common/src/main/java/org/apache/tajo/SessionVars.java — @@ -126,6 +126,10 @@ NULL_CHAR(ConfVars.$TEXT_NULL, "null char of text file output", DEFAULT), CODEGEN(ConfVars.$CODEGEN, "Runtime code generation enabled (experiment)", DEFAULT), + TABLE_PARTITION_NO_RESULT_OVERWRITE_ENABLED(ConfVars.$TABLE_PARTITION_NO_RESULT_OVERWRITE_ENABLED, — End diff – I have one more suggestion. It would be great if you use PARTITION instead of TABLE_PARTITION. It is still clear because we only use the term 'partition' to indicate table partition.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/601#issuecomment-115009452

        +1
        Looks good to me. I leaved one trivial comment. If you agree, you can commit the patch after you reflect my comment.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on the pull request: https://github.com/apache/tajo/pull/601#issuecomment-115009452 +1 Looks good to me. I leaved one trivial comment. If you agree, you can commit the patch after you reflect my comment.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/601

        Show
        githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/tajo/pull/601
        Hide
        blrunner Jaehwa Jung added a comment -

        Thanks Hyunsik Choi

        I've just committed it to master branch and 0.10.2 branch.

        Show
        blrunner Jaehwa Jung added a comment - Thanks Hyunsik Choi I've just committed it to master branch and 0.10.2 branch.
        Hide
        hudson Hudson added a comment -

        ABORTED: Integrated in Tajo-master-CODEGEN-build #382 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/382/)
        TAJO-1644: When inserting empty data into a partitioned table, existing data would be removed. (jaehwa) (blrunner: rev f57d6c43fd201326fef2a695b1d1e798d0f814e3)

        • CHANGES
        • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
        • tajo-core/src/test/java/org/apache/tajo/engine/query/TestTablePartitions.java
        • tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileTablespace.java
        • tajo-core/src/test/java/org/apache/tajo/engine/query/TestOuterJoinQuery.java
        • tajo-core/src/test/resources/results/TestTajoCli/testHelpSessionVars.result
        • tajo-common/src/main/java/org/apache/tajo/SessionVars.java
        Show
        hudson Hudson added a comment - ABORTED: Integrated in Tajo-master-CODEGEN-build #382 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/382/ ) TAJO-1644 : When inserting empty data into a partitioned table, existing data would be removed. (jaehwa) (blrunner: rev f57d6c43fd201326fef2a695b1d1e798d0f814e3) CHANGES tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java tajo-core/src/test/java/org/apache/tajo/engine/query/TestTablePartitions.java tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileTablespace.java tajo-core/src/test/java/org/apache/tajo/engine/query/TestOuterJoinQuery.java tajo-core/src/test/resources/results/TestTajoCli/testHelpSessionVars.result tajo-common/src/main/java/org/apache/tajo/SessionVars.java
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #742 (See https://builds.apache.org/job/Tajo-master-build/742/)
        TAJO-1644: When inserting empty data into a partitioned table, existing data would be removed. (jaehwa) (blrunner: rev f57d6c43fd201326fef2a695b1d1e798d0f814e3)

        • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
        • CHANGES
        • tajo-core/src/test/java/org/apache/tajo/engine/query/TestOuterJoinQuery.java
        • tajo-core/src/test/resources/results/TestTajoCli/testHelpSessionVars.result
        • tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileTablespace.java
        • tajo-core/src/test/java/org/apache/tajo/engine/query/TestTablePartitions.java
        • tajo-common/src/main/java/org/apache/tajo/SessionVars.java
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Tajo-master-build #742 (See https://builds.apache.org/job/Tajo-master-build/742/ ) TAJO-1644 : When inserting empty data into a partitioned table, existing data would be removed. (jaehwa) (blrunner: rev f57d6c43fd201326fef2a695b1d1e798d0f814e3) tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java CHANGES tajo-core/src/test/java/org/apache/tajo/engine/query/TestOuterJoinQuery.java tajo-core/src/test/resources/results/TestTajoCli/testHelpSessionVars.result tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/FileTablespace.java tajo-core/src/test/java/org/apache/tajo/engine/query/TestTablePartitions.java tajo-common/src/main/java/org/apache/tajo/SessionVars.java

          People

          • Assignee:
            blrunner Jaehwa Jung
            Reporter:
            blrunner Jaehwa Jung
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development