Tajo / TAJO-806

CreateTableNode in CTAS uses a wrong schema as output schema and table schema.

    Details

      Description

      In the case below, TajoWriteSupport currently just takes the schema of the source table orders. In other words, each column qualifier is default.orders instead of default.parquet_test. This is a bug. As a result, we hit the following error when we read the Parquet files.

      default> create table parquet_test using parquet as select * from orders;
      Progress: 0%, response time: 1.119 sec
      Progress: 0%, response time: 2.121 sec
      Progress: 0%, response time: 3.123 sec
      Progress: 83%, response time: 4.126 sec
      Progress: 100%, response time: 4.709 sec
      (1500000 rows, 4.709 sec, 109.9 MiB inserted)
      
      default> select * from parquet_test;
      SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
      SLF4J: Defaulting to no-operation (NOP) logger implementation
      SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
      Exception in thread "main" java.lang.NullPointerException
      	at parquet.hadoop.InternalParquetRecordReader.close(InternalParquetRecordReader.java:118)
      	at parquet.hadoop.ParquetReader.close(ParquetReader.java:144)
      	at org.apache.tajo.storage.parquet.ParquetScanner.close(ParquetScanner.java:87)
      	at org.apache.tajo.storage.MergeScanner.close(MergeScanner.java:137)
      	at org.apache.tajo.jdbc.TajoResultSet.close(TajoResultSet.java:153)
      	at org.apache.tajo.cli.TajoCli.localQueryCompleted(TajoCli.java:387)
      	at org.apache.tajo.cli.TajoCli.executeQuery(TajoCli.java:365)
      	at org.apache.tajo.cli.TajoCli.executeParsedResults(TajoCli.java:322)
      	at org.apache.tajo.cli.TajoCli.runShell(TajoCli.java:311)
      	at org.apache.tajo.cli.TajoCli.main(TajoCli.java:490)
      Apr 30, 2014 11:04:01 AM INFO: parquet.hadoop.ParquetFileReader: reading another 1 footers
      

      The patch fixes the bug where CreateTableNode takes the wrong schema.
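
      To illustrate what using the target table's schema means, here is a minimal sketch. It is not the code in the patch and does not use Tajo's Schema or Column classes: QualifierSketch, simpleName, and requalify are hypothetical names, columns are modeled as plain strings, and the column names o_orderkey and o_custkey are illustrative only. It shows columns inherited from default.orders being re-qualified as default.parquet_test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: columns are plain strings, NOT Tajo's Schema/Column API.
final class QualifierSketch {

    /** Returns the part of a qualified column name after the last '.'. */
    static String simpleName(String qualifiedColumn) {
        int lastDot = qualifiedColumn.lastIndexOf('.');
        return lastDot < 0 ? qualifiedColumn : qualifiedColumn.substring(lastDot + 1);
    }

    /** Re-qualifies every column with the name of the table being created. */
    static List<String> requalify(List<String> sourceColumns, String targetTable) {
        List<String> result = new ArrayList<>();
        for (String column : sourceColumns) {
            result.add(targetTable + "." + simpleName(column));
        }
        return result;
    }

    public static void main(String[] args) {
        // Columns as they come out of "select * from orders" (illustrative names).
        List<String> fromOrders =
            Arrays.asList("default.orders.o_orderkey", "default.orders.o_custkey");
        // The CTAS target is default.parquet_test, so the qualifier must change.
        System.out.println(requalify(fromOrders, "default.parquet_test"));
    }
}
```

      In this model the buggy behavior corresponds to passing fromOrders through unchanged, which leaves the default.orders qualifier on a table named default.parquet_test.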

      In addition, I found a potential problem: ParquetFile stores the Tajo schema in its extra metadata. I think this will be a problem when users rename a database or a table. So, I removed the code that inserts the Tajo schema into the extra metadata, and I changed Parquet reading so that it does not use the extra metadata.

      Tajo mainly uses the catalog system to manage schemas, and reading Parquet files in Tajo depends on the Tajo catalog, so this will work well. Also, other systems can access the Parquet files by directly reading Parquet's native schema.
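
      The rename concern can be sketched with a toy model. This is an assumption-laden illustration, not Tajo's catalog or Parquet code: CatalogReadSketch and its Catalog class are hypothetical, and the table and column names are made up. A qualified schema captured at write time goes stale after a rename, while a schema resolved from the catalog at read time follows the new name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model; NOT Tajo's catalog or Parquet integration.
final class CatalogReadSketch {

    /** Stand-in for a catalog: table name -> unqualified column names. */
    static final class Catalog {
        private final Map<String, List<String>> tables = new HashMap<>();

        void create(String table, List<String> columns) {
            tables.put(table, columns);
        }

        void rename(String oldName, String newName) {
            List<String> columns = tables.remove(oldName);
            if (columns == null) {
                throw new IllegalArgumentException("no such table: " + oldName);
            }
            tables.put(newName, columns);
        }

        /** Builds qualified column names from the table's current name. */
        List<String> qualifiedColumns(String table) {
            List<String> columns = tables.get(table);
            if (columns == null) {
                throw new IllegalArgumentException("no such table: " + table);
            }
            List<String> result = new ArrayList<>();
            for (String column : columns) {
                result.add(table + "." + column);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Catalog catalog = new Catalog();
        catalog.create("default.parquet_test", Arrays.asList("o_orderkey", "o_custkey"));

        // What a writer would embed in the file if it stored the qualified schema.
        List<String> embeddedAtWriteTime = catalog.qualifiedColumns("default.parquet_test");

        catalog.rename("default.parquet_test", "default.parquet_renamed");

        System.out.println("embedded (stale): " + embeddedAtWriteTime);
        System.out.println("from catalog:     " + catalog.qualifiedColumns("default.parquet_renamed"));
    }
}
```

      The file itself only needs the unqualified column names, which is also what other systems see when they read Parquet's native schema.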

      1. TAJO-806.patch
        28 kB
        Hyunsik Choi

        Activity

        hyunsik Hyunsik Choi added a comment -

        I also fixed some trivial bugs of QueryTestCase in this patch.

        hyunsik Hyunsik Choi added a comment - - edited

        Created a review request against branch master in reviewboard
        https://reviews.apache.org/r/20877/

        tajoqa Tajo QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12642592/TAJO-806.patch
        against master revision 6200311.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 15 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The applied patch does not increase the total number of javadoc warnings.

        +1 checkstyle. The patch generated 0 code style errors.

        -1 findbugs. The patch appears to introduce 189 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in tajo-client tajo-core tajo-storage:
        org.apache.tajo.engine.query.TestInsertQuery

        Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/404//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-TAJO-Build/404//artifact/incubator-tajo/patchprocess/newPatchFindbugsWarningstajo-core.html
        Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/404//console

        This message is automatically generated.

        jhkim Jinho Kim added a comment -

        +1 for the patch.

        hyunsik Hyunsik Choi added a comment -

        Thanks for the quick review. I'll commit it shortly.

        hyunsik Hyunsik Choi added a comment -

        Committed it to master and branch-0.8.1.

        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #211 (See https://builds.apache.org/job/Tajo-master-build/211/)
        TAJO-806: CreateTableNode in CTAS uses a wrong schema as output schema and table schema. (hyunsik) (hyunsik: rev cde1bcaf563befb8889cafc23e72850782725488)

        • tajo-core/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java
        • tajo-core/src/test/java/org/apache/tajo/engine/query/TestInsertQuery.java
        • tajo-core/src/test/resources/queries/TestInsertQuery/full_table_csv_ddl.sql
        • tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteWithTargetColumns.sql
        • tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwrite.sql
        • tajo-core/src/test/resources/results/TestInsertQuery/testInsertOverwriteWithAsteriskUsingParquet2.result
        • tajo-client/src/main/java/org/apache/tajo/cli/ConnectDatabaseCommand.java
        • tajo-core/src/test/resources/queries/TestInsertQuery/table1_ddl.sql
        • tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteLocationWithCompression.sql
        • tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteWithCompression_ddl.sql
        • tajo-storage/src/main/java/org/apache/tajo/storage/parquet/TajoReadSupport.java
        • tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteWithAsterisk.sql
        • tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteWithCompression.sql
        • tajo-storage/src/main/java/org/apache/tajo/storage/parquet/TajoWriteSupport.java
        • CHANGES
        • tajo-storage/src/main/java/org/apache/tajo/storage/parquet/ParquetScanner.java
        • tajo-core/src/test/resources/queries/TestInsertQuery/full_table_parquet_ddl.sql
        • tajo-core/src/test/resources/results/TestInsertQuery/testInsertOverwriteWithAsteriskUsingParquet.result
        • tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteSmallerColumns.sql
        • tajo-core/src/test/java/org/apache/tajo/QueryTestCaseBase.java
        • tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteLocation.sql

          People

          • Assignee: Hyunsik Choi
          • Reporter: Hyunsik Choi
          • Votes: 0
          • Watchers: 2
