TAJO-1211

Staging directory for CTAS and INSERT should be in the output dir.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.0
    • Component/s: QueryMaster
    • Labels: None

      Description

      Background

      A staging directory temporarily holds the final output data. The data are moved to the final output dir only if the query finishes successfully. It is important to keep the output directory consistent even if the query fails.
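
The commit-or-discard behaviour described above can be sketched as follows. This is a minimal illustration, not Tajo's actual implementation: it uses the local file system through java.nio.file instead of the Hadoop FileSystem API, and the class name, method names, and file names (StagingCommitSketch, commit, abort, part-0, part-1) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StagingCommitSketch {

    // On success: move every staged entry into the final output directory.
    static void commit(Path stagingDir, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        List<Path> staged;
        try (Stream<Path> files = Files.list(stagingDir)) {
            staged = files.collect(Collectors.toList());
        }
        for (Path file : staged) {
            Files.move(file, outputDir.resolve(file.getFileName()));
        }
        Files.delete(stagingDir); // the staging directory is empty at this point
    }

    // On failure: discard the staged data; the output directory is never touched.
    static void abort(Path stagingDir) throws IOException {
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(stagingDir)) {
            // Reverse order lists children before their parent directories.
            entries = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path entry : entries) {
            Files.delete(entry);
        }
    }

    // Exercises both outcomes inside a temporary directory.
    static boolean demo() {
        try {
            Path root = Files.createTempDirectory("staging-sketch");
            Path output = root.resolve("table1");
            Files.createDirectories(output);

            // Successful query: staged data becomes visible in the output dir.
            Path okStaging = root.resolve("staging-ok");
            Files.createDirectories(okStaging);
            Files.write(okStaging.resolve("part-0"), "row\n".getBytes());
            commit(okStaging, output);
            boolean committed = Files.exists(output.resolve("part-0"))
                    && Files.notExists(okStaging);

            // Failed query: staged data is dropped and the output dir is unchanged.
            Path failedStaging = root.resolve("staging-failed");
            Files.createDirectories(failedStaging);
            Files.write(failedStaging.resolve("part-1"), "partial\n".getBytes());
            abort(failedStaging);
            boolean aborted = Files.notExists(failedStaging)
                    && Files.notExists(output.resolve("part-1"));

            return committed && aborted;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because readers of the output directory only ever see data that was moved in by commit, a query that fails part-way leaves the directory in its previous state.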

      Problem

      Currently, the staging directory is located under /tmp/tajo-${user.name}/ on the HDFS instance that ${tajo.root} uses. The final output directory and the staging directory can therefore be on different file systems. In that case, the move incurs unnecessary copy overhead. In addition, on S3, such a move operation may be even more problematic.
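
One way to picture the problem is to check whether two locations share a file system. The sketch below is an assumption-laden illustration using only java.net.URI: it treats two URIs as being on the same file system when their scheme and authority match, and the host name, bucket name, and paths are made up for the example. It is not how Tajo or Hadoop decide this.

```java
import java.net.URI;

class SameFileSystemSketch {

    // Treats two locations as the same file system when scheme and authority match.
    static boolean onSameFileSystem(URI a, URI b) {
        return sameIgnoringCase(a.getScheme(), b.getScheme())
                && sameIgnoringCase(a.getAuthority(), b.getAuthority());
    }

    // Null-safe, case-insensitive comparison (either part may be absent in a URI).
    private static boolean sameIgnoringCase(String x, String y) {
        return x == null ? y == null : x.equalsIgnoreCase(y);
    }

    public static void main(String[] args) {
        // Hypothetical locations: staging under /tmp on HDFS, output tables elsewhere.
        URI staging = URI.create("hdfs://namenode:8020/tmp/tajo-alice/staging");
        URI hdfsTable = URI.create("hdfs://namenode:8020/warehouse/table1");
        URI s3Table = URI.create("s3://bucket/table1");

        System.out.println(onSameFileSystem(staging, hdfsTable));
        System.out.println(onSameFileSystem(staging, s3Table));
    }
}
```

When the check is false, the final "move" cannot be a cheap rename and has to copy the data between file systems, which is the overhead the issue describes.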

      Solution
      CTAS and INSERT (OVERWRITE) INTO should place the staging dir in a hidden subdirectory of the final output dir, so that the final move stays within one file system. For example, if the output dir is /table1, the corresponding staging dir should be /table1/.staging.
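
The path rule in the example above can be sketched as below. The class and method names (StagingPathSketch, stagingDirFor) are hypothetical, and java.nio.file.Path stands in for Hadoop's Path class; only the ".staging" name and the /table1 example come from the issue text.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class StagingPathSketch {

    // Name of the hidden subdirectory, taken from the example in the issue text.
    static final String STAGING_DIR_NAME = ".staging";

    // Returns the staging directory nested directly inside the final output directory.
    static Path stagingDirFor(Path outputDir) {
        return outputDir.resolve(STAGING_DIR_NAME);
    }

    public static void main(String[] args) {
        Path staging = stagingDirFor(Paths.get("/table1"));
        System.out.println(staging);
    }
}
```

Since the staging directory is a child of the output directory, both always live on the same file system, and the leading dot keeps it hidden by convention from tools that skip dot-prefixed names.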

        Activity

        githubbot ASF GitHub Bot added a comment -

        GitHub user hyunsik opened a pull request:

        https://github.com/apache/tajo/pull/274

        TAJO-1211: Staging directory for CTAS and INSERT should be in the output dir.

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/hyunsik/tajo TAJO-1211

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/274.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #274


        commit 779be0cc948d79863f1b812c1f3c47f9363bd2e5
        Author: Hyunsik Choi <hyunsik@apache.org>
        Date: 2014-11-30T14:14:03Z

        TAJO-1211: Staging directory for CTAS and INSERT should be in the output dir.


        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/274#issuecomment-65019661

        I updated the patch. This patch also fixes a bug that misses exceptions occurring in Query::commitOuputData.

        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on the pull request:

        https://github.com/apache/tajo/pull/274#issuecomment-65031538

        +1
        I verified "insert overwrite" and "create table"

        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/274

        hyunsik Hyunsik Choi added a comment -

        I just committed it to master branch.

        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-CODEGEN-build #114 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/114/)
        TAJO-1211: Staging directory for CTAS and INSERT should be in the output dir. (hyunsik: rev b4adc18cd25de550fe04a43ef69d715c146976db)

        • tajo-core/src/test/java/org/apache/tajo/engine/query/TestInsertQuery.java
        • tajo-core/src/main/resources/webapps/admin/index.jsp
        • tajo-core/src/main/java/org/apache/tajo/master/querymaster/Query.java
        • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
        • tajo-core/src/main/java/org/apache/tajo/master/GlobalEngine.java
        • tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryMasterTask.java
        • tajo-core/src/main/java/org/apache/tajo/master/TajoMaster.java
        • CHANGES
        • tajo-core/src/test/java/org/apache/tajo/engine/query/TestCTASQuery.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #473 (See https://builds.apache.org/job/Tajo-master-build/473/)
        TAJO-1211: Staging directory for CTAS and INSERT should be in the output dir. (hyunsik: rev b4adc18cd25de550fe04a43ef69d715c146976db)

        • tajo-core/src/test/java/org/apache/tajo/engine/query/TestCTASQuery.java
        • tajo-core/src/main/java/org/apache/tajo/master/querymaster/Query.java
        • tajo-core/src/main/java/org/apache/tajo/master/TajoMaster.java
        • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
        • tajo-core/src/main/resources/webapps/admin/index.jsp
        • tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryMasterTask.java
        • tajo-core/src/main/java/org/apache/tajo/master/GlobalEngine.java
        • CHANGES
        • tajo-core/src/test/java/org/apache/tajo/engine/query/TestInsertQuery.java

          People

          • Assignee: Hyunsik Choi
          • Reporter: Hyunsik Choi