Tajo / TAJO-1798

Dynamic partitioning occasionally fails.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0, 0.12.0
    • Component/s: None
    • Labels:
      None

      Description

      The following query keeps running even though its progress has reached 100%.

      tpch1> create table partitioned_lineitem (l_orderkey int8, l_partkey int8, l_suppkey int8, l_linenumber int8, l_quantity float8, l_extendedprice float8, l_discount float8, l_tax float8, l_linestatus text, l_shipdate text, l_commitdate text, l_receiptdate text, l_shipinstruct text, l_shipmode text, l_comment text) partition by column (l_returnflag text) as select l_orderkey, l_partkey, l_suppkey, l_linenumber , l_quantity , l_extendedprice , l_discount , l_tax , l_linestatus , l_shipdate , l_commitdate , l_receiptdate , l_shipinstruct , l_shipmode , l_comment, l_returnflag from lineitem;
      ...
      Progress: 100%, response time: 328.515 sec
      Progress: 100%, response time: 329.517 sec
      Progress: 100%, response time: 330.524 sec
      Progress: 100%, response time: 331.527 sec
      Progress: 100%, response time: 332.529 sec
      ...
      

      When I checked the query status through the web UI, all of its stages had already completed in the 'SUCCEEDED' state, so I killed the running query. Surprisingly, however, the result table of the above query had been successfully registered in the catalog, as follows.

      tpch1> \d
      customer
      lineitem
      nation
      orders
      partitioned_lineitem
      partitioned_nation
      partitioned_region
      partsupp
      region
      supplier
      tpch1> \d partitioned_lineitem
      
      table name: tpch1.partitioned_lineitem
      table uri: hdfs://localhost:7020/tajo/warehouse/tpch1/partitioned_lineitem
      store type: TEXT
      number of rows: 6001215
      volume: 750.5 MB
      Options: 
      	'text.delimiter'='|'
      
      schema: 
      l_orderkey	INT8
      l_partkey	INT8
      l_suppkey	INT8
      l_linenumber	INT8
      l_quantity	FLOAT8
      l_extendedprice	FLOAT8
      l_discount	FLOAT8
      l_tax	FLOAT8
      l_linestatus	TEXT
      l_shipdate	TEXT
      l_commitdate	TEXT
      l_receiptdate	TEXT
      l_shipinstruct	TEXT
      l_shipmode	TEXT
      l_comment	TEXT
      
      Partitions: 
      type:COLUMN
      columns::l_returnflag (TEXT)
      
      tpch1> select count(*) from partitioned_lineitem;
      Progress: 6%, response time: 1.164 sec
      Progress: 6%, response time: 1.165 sec
      Progress: 6%, response time: 1.568 sec
      Progress: 100%, response time: 1.793 sec
      ?count
      -------------------------------
      6001215
      (1 rows, 1.793 sec, 8 B selected)
      

        Activity

        hyunsik Hyunsik Choi added a comment -

        Any error log in query master?

        jihoonson Jihoon Son added a comment -

        Thanks for asking. I forgot to include it.
        Here is the log.

        2015-08-23 13:44:11,170 INFO org.apache.tajo.querymaster.Query: Processing q_1440304970100_0001 of type QUERY_COMPLETED
        2015-08-23 13:44:11,170 INFO org.apache.tajo.worker.TaskManager: Stopped execution block:eb_1440304970100_0001_000002
        2015-08-23 13:44:11,183 INFO org.apache.tajo.storage.FileTablespace: Moved from the staging dir to the output directory 'hdfs://localhost:7020/tajo/warehouse/tpch1/partitioned_lineitem
        2015-08-23 13:44:11,376 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
        org.apache.tajo.exception.TajoInternalError: internal error: The statement was aborted because it would have caused a duplicate key value in a unique or primary key constraint or unique index identified by 'C_PARTITIONS_UNIQ' defined on 'PARTITIONS'.
                at org.apache.tajo.exception.ExceptionUtil.toTajoExceptionCommon(ExceptionUtil.java:124)
                at org.apache.tajo.exception.ExceptionUtil.toTajoException(ExceptionUtil.java:151)
                at org.apache.tajo.exception.ExceptionUtil.throwsIfThisError(ExceptionUtil.java:103)
                at org.apache.tajo.catalog.AbstractCatalogClient.addPartitions(AbstractCatalogClient.java:483)
                at org.apache.tajo.querymaster.Query$QueryCompletedTransition.finalizeQuery(Query.java:526)
                at org.apache.tajo.querymaster.Query$QueryCompletedTransition.transition(Query.java:447)
                at org.apache.tajo.querymaster.Query$QueryCompletedTransition.transition(Query.java:436)
                at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
                at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
                at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
                at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
                at org.apache.tajo.querymaster.Query.handle(Query.java:860)
                at org.apache.tajo.querymaster.Query.handle(Query.java:67)
                at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
                at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
                at java.lang.Thread.run(Thread.java:745)
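
The error message above reports a duplicate key on the unique constraint 'C_PARTITIONS_UNIQ', raised from the addPartitions call. The following is a minimal sketch of that kind of failure, assuming the batch passed to the catalog contains the same partition twice. The class, the in-memory "index", and the partition name strings are hypothetical stand-ins for illustration, not Tajo's catalog code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UniquePartitionIndexSketch {
    // Toy stand-in for a unique index on partition names (hypothetical).
    private final Set<String> uniqueIndex = new HashSet<>();

    // Throws as soon as a name is already indexed. Earlier names in the
    // batch stay indexed here; a real database would abort the statement.
    public void addPartitions(List<String> names) {
        for (String name : names) {
            if (!uniqueIndex.add(name)) {
                throw new IllegalStateException("duplicate key value: " + name);
            }
        }
    }

    public static void main(String[] args) {
        UniquePartitionIndexSketch catalog = new UniquePartitionIndexSketch();
        try {
            // The same partition is reported twice in a single batch.
            catalog.addPartitions(
                Arrays.asList("l_returnflag=A", "l_returnflag=N", "l_returnflag=A"));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this toy version the caller catches the exception; in the log above it surfaces as a FATAL error in the dispatcher thread.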
        
        githubbot ASF GitHub Bot added a comment -

        GitHub user blrunner opened a pull request:

        https://github.com/apache/tajo/pull/709

        TAJO-1798: Query execution is not finished even though it actually is done.

        I verified this patch with TPC-H 1G.

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/blrunner/tajo TAJO-1798

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/709.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #709


        commit cb69c69a278fe23e54631986ddb7a8cf75db95ac
        Author: JaeHwa Jung <blrunner@apache.org>
        Date: 2015-08-25T09:07:15Z

        TAJO-1798: Query execution is not finished even though it actually is done.


        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/709#issuecomment-134612791

        I converted List<PartitionDescProto> to Set<PartitionDescProto> to avoid duplicated partitions. After storing the partitions in the catalog, I clear all Set<PartitionDescProto> instances.
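
The approach described in this comment can be sketched roughly as follows. This is an illustrative toy, not the code from the pull request: plain strings stand in for PartitionDescProto, and the class and method names are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class PartitionDedupSketch {
    // Hypothetical stand-in for the query's partition bookkeeping; a Set
    // keeps each partition name at most once (insertion order preserved).
    private final Set<String> partitions = new LinkedHashSet<>();

    // Called once per task report; names already present are ignored.
    public void addPartitions(List<String> reported) {
        partitions.addAll(reported);
    }

    // Copies the Set into a List for a catalog call, then clears the Set
    // so that nothing is handed over a second time.
    public List<String> drainForCatalog() {
        List<String> batch = new ArrayList<>(partitions);
        partitions.clear();
        return batch;
    }

    public static void main(String[] args) {
        PartitionDedupSketch query = new PartitionDedupSketch();
        // Two tasks report overlapping partitions.
        query.addPartitions(Arrays.asList("l_returnflag=A", "l_returnflag=N"));
        query.addPartitions(Arrays.asList("l_returnflag=N", "l_returnflag=R"));
        System.out.println(query.drainForCatalog()); // [l_returnflag=A, l_returnflag=N, l_returnflag=R]
        System.out.println(query.drainForCatalog()); // []
    }
}
```

Whether the real PartitionDescProto instances compare equal for the same partition depends on their equals/hashCode behavior, which this sketch does not model.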

        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/709#discussion_r37939197

        --- Diff: tajo-core/src/main/java/org/apache/tajo/querymaster/Query.java ---
        @@ -505,30 +510,31 @@ private QueryState finalizeQuery(Query query, QueryCompletedEvent event) {
               QueryHookExecutor hookExecutor = new QueryHookExecutor(query.context.getQueryMasterContext());
               hookExecutor.execute(query.context.getQueryContext(), query, event.getExecutionBlockId(), finalOutputDir);

        -      TableDesc desc = query.getResultDesc();
        -
        -      // If there is partitions
        -      List<PartitionDescProto> partitions = query.getPartitions();
        -      if (partitions!= null && !partitions.isEmpty()) {
        -
        -        String databaseName, simpleTableName;
        -
        -        if (CatalogUtil.isFQTableName(desc.getName())) {
        -          String[] split = CatalogUtil.splitFQTableName(desc.getName());
        -          databaseName = split[0];
        -          simpleTableName = split[1];
        +      // Add dynamic partitions to catalog for partition table.
        +      if (queryContext.hasOutputTableUri() && queryContext.hasPartition()) {
        +        Set<PartitionDescProto> partitions = query.getPartitions();
        +        if (partitions != null) {
        +          String databaseName, simpleTableName;
        +
        +          if (CatalogUtil.isFQTableName(tableDesc.getName())) {
        +            String[] split = CatalogUtil.splitFQTableName(tableDesc.getName());
        +            databaseName = split[0];
        +            simpleTableName = split[1];
        +          } else {
        +            databaseName = queryContext.getCurrentDatabase();
        +            simpleTableName = tableDesc.getName();
        +          }
        +
        +          // Store partitions to CatalogStore using alter table statement.
        +          catalog.addPartitions(databaseName, simpleTableName, TUtil.newList(partitions), true);
        +          LOG.info("Added partitions to catalog (total=" + partitions.size() + ")");
                 } else {
        -          databaseName = queryContext.getCurrentDatabase();
        -          simpleTableName = desc.getName();
        +          LOG.info("Can't find partitions for adding.");
                 }
        -
        -        // Store partitions to CatalogStore using alter table statement.
        -        catalog.addPartitions(databaseName, simpleTableName, partitions, true);
        -      } else {
        -        LOG.info("Can't find partitions for adding.");
        +        query.clearPartitions();
        +        partitions.clear();
        --- End diff ---

        Please remove the unnecessary code.

        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on the pull request:

        https://github.com/apache/tajo/pull/709#issuecomment-134785868

        Could you add tests?

        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/709#issuecomment-134855789

        Thanks @jinossy
        I've just updated the patch according to your comments.

        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/709#issuecomment-134871803

        Your change appears to focus mainly on partition bugs. Only this line (https://github.com/apache/tajo/pull/709/files#diff-f6cacc8f5447d717968f06d0b133f171L530) is related to the issue title. So, this issue would be easier to search for if its title were changed to a better one.

        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/709#issuecomment-134874699

        Thanks @hyunsik
        I've just changed the title.

        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on the pull request:

        https://github.com/apache/tajo/pull/709#issuecomment-134889515

        +1 LGTM

        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/709

        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/709#issuecomment-134905612

        Thanks @jinossy and @hyunsik
        I've just committed it to the master and 0.11.0 branches.

        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-CODEGEN-build #464 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/464/)
        TAJO-1798: Dynamic partitioning occasionally fails. (blrunner: rev 4be6746102d026dcf47b12a0548ddb15f33bde3e)

        • tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestTablePartitions.java
        • tajo-core/src/main/java/org/apache/tajo/querymaster/Query.java
        • tajo-core/src/main/java/org/apache/tajo/querymaster/TaskAttempt.java
        • CHANGES
        • tajo-core/src/main/java/org/apache/tajo/querymaster/Stage.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #818 (See https://builds.apache.org/job/Tajo-master-build/818/)
        TAJO-1798: Dynamic partitioning occasionally fails. (blrunner: rev 4be6746102d026dcf47b12a0548ddb15f33bde3e)

        • tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestTablePartitions.java
        • CHANGES
        • tajo-core/src/main/java/org/apache/tajo/querymaster/Stage.java
        • tajo-core/src/main/java/org/apache/tajo/querymaster/TaskAttempt.java
        • tajo-core/src/main/java/org/apache/tajo/querymaster/Query.java

          People

          • Assignee: blrunner Jaehwa Jung
          • Reporter: jihoonson Jihoon Son