Tajo / TAJO-1796

Count all query for Parquet is crashed

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.10.1
    • Fix Version/s: 0.11.0
    • Component/s: Storage
    • Labels:
      None

      Description

      When 'select count( * )' is executed, it fails with the following error log:

      2015-08-21 17:17:31,371 ERROR org.apache.tajo.engine.planner.physical.HashShuffleFileWriteExec: A group type can not be empty. Parquet does not support empty group without leaves. Empty group: table_schema
      org.apache.parquet.schema.InvalidSchemaException: A group type can not be empty. Parquet does not support empty group without leaves. Empty group: table_schema
      at org.apache.parquet.schema.GroupType.<init>(GroupType.java:92)
      at org.apache.parquet.schema.GroupType.<init>(GroupType.java:48)
      at org.apache.parquet.schema.MessageType.<init>(MessageType.java:50)
      at org.apache.tajo.storage.parquet.TajoSchemaConverter.convert(TajoSchemaConverter.java:152)
      at org.apache.tajo.storage.parquet.TajoReadSupport.init(TajoReadSupport.java:76)
      at org.apache.parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:172)
      at org.apache.parquet.hadoop.ParquetReader.initReader(ParquetReader.java:152)
      at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:128)
      at org.apache.tajo.storage.parquet.ParquetScanner.next(ParquetScanner.java:73)
      ...

          Activity

          eminency Jongyoung Park added a comment -

          When the Parquet version was updated, verification code for empty schemas was added.
          This exception comes from that check, because Tajo doesn't assign any projection schema when a count-all query runs.
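
          The failure mode described above can be sketched in isolation. The class below is a hypothetical stand-in, not Parquet's or Tajo's actual code: buildGroup mimics a validator that rejects a group with no fields, and projectionOrFallback shows one possible guard (an assumption for illustration, not necessarily the fix that was adopted) that substitutes the first table column when a query such as count(*) projects nothing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; names and behavior are illustrative, not Parquet or Tajo APIs. */
final class EmptyProjectionSketch {

    /** Mimics a schema check that refuses to build a group without any fields. */
    static List<String> buildGroup(String name, List<String> fields) {
        if (fields.isEmpty()) {
            throw new IllegalArgumentException(
                "A group type can not be empty. Empty group: " + name);
        }
        return Collections.unmodifiableList(new ArrayList<>(fields));
    }

    /**
     * One possible guard: if the query projects no columns (as count(*) does),
     * fall back to the first table column so the group is never empty.
     */
    static List<String> projectionOrFallback(List<String> projected,
                                             List<String> tableColumns) {
        if (!projected.isEmpty()) {
            return projected;
        }
        if (tableColumns.isEmpty()) {
            throw new IllegalArgumentException("table has no columns");
        }
        return Collections.singletonList(tableColumns.get(0));
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("id", "name");
        List<String> noProjection = Collections.emptyList();

        try {
            buildGroup("table_schema", noProjection);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        // With the guard, the same request yields a one-column group.
        System.out.println(buildGroup("table_schema",
            projectionOrFallback(noProjection, table)));
    }
}
```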

          githubbot ASF GitHub Bot added a comment -

          GitHub user eminency opened a pull request:

          https://github.com/apache/tajo/pull/729

          TAJO-1796: Count all query for Parquet is crashed

          A new PhysicalExec is added to handle simple count queries on Parquet.

          This is only a workaround.
          It should be improved later to cover other types that support statistics.
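
          As a rough illustration of answering a simple count query from statistics rather than by scanning columns, the sketch below sums per-row-group row counts. RowGroupMeta and countAll are hypothetical names invented for this example; they are not the classes added by the pull request, and the sketch only assumes that the file metadata records a row count for each row group.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not the PhysicalExec from the pull request. */
final class CountFromMetadataSketch {

    /** Stand-in for per-row-group metadata; only the row count is modeled. */
    static final class RowGroupMeta {
        final long rowCount;

        RowGroupMeta(long rowCount) {
            this.rowCount = rowCount;
        }
    }

    /** Answers count(*) by summing recorded row counts; no column data is read. */
    static long countAll(List<RowGroupMeta> rowGroups) {
        long total = 0;
        for (RowGroupMeta rowGroup : rowGroups) {
            // addExact throws ArithmeticException instead of silently overflowing.
            total = Math.addExact(total, rowGroup.rowCount);
        }
        return total;
    }

    public static void main(String[] args) {
        List<RowGroupMeta> footer =
            Arrays.asList(new RowGroupMeta(1000), new RowGroupMeta(250));
        System.out.println(countAll(footer));
    }
}
```

          Because no column is requested, a path like this never needs a projection schema at all, which is why it sidesteps the empty-group check.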

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/eminency/tajo TAJO-1796

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/tajo/pull/729.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #729


          commit 51e97f7f8012365c7996287a05bc1fb784b31d75
          Author: Jongyoung Park <eminency@gmail.com>
          Date: 2015-08-26T03:23:53Z

          workaround

          commit f41305c68b06706e3d3861038415855b39a1cbc9
          Author: Jongyoung Park <eminency@gmail.com>
          Date: 2015-09-04T12:38:32Z

          Parquet format property added

          commit 76680d51180b4d063079b4c686ee1fc19c183c61
          Author: Jongyoung Park <eminency@gmail.com>
          Date: 2015-09-04T12:39:56Z

          isCountQuery is added in SimScanProto

          commit 8e6ad2a49c0ab0b68ecea03541515dfd18276325
          Author: Jongyoung Park <eminency@gmail.com>
          Date: 2015-09-04T12:43:59Z

          Logical/physical plan implementation logic is changed

          commit dd5cd57f753e1a8386db38d82691af67e88b997c
          Author: Jongyoung Park <eminency@gmail.com>
          Date: 2015-09-04T12:44:11Z

          Test added

          commit 77611d420e4d8cbae175c653c11c1096620939ac
          Author: Jongyoung Park <eminency@gmail.com>
          Date: 2015-09-04T13:40:13Z

          Fix compilation error after rebasing


          tajoqa Tajo QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12754206/TAJO-1796.patch
          against master revision release-0.9.0-rc0-443-g7e0a4a1.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The applied patch does not increase the total number of javadoc warnings.

          +1 checkstyle. The patch generated 0 code style errors.

          -1 findbugs. The patch appears to cause Findbugs (version 2.0.3) to fail.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in tajo-core tajo-core-tests tajo-plan tajo-storage/tajo-storage-common tajo-storage/tajo-storage-hbase tajo-storage/tajo-storage-hdfs:
          org.apache.tajo.engine.query.TestHBaseTable

          Test results: https://builds.apache.org/job/PreCommit-TAJO-Build/852//testReport/
          Findbugs results: https://builds.apache.org/job/PreCommit-TAJO-Build/852//findbugsResult
          Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/852//console

          This message is automatically generated.

          githubbot ASF GitHub Bot added a comment -

          Github user eminency closed the pull request at:

          https://github.com/apache/tajo/pull/729

          eminency Jongyoung Park added a comment -

          This is resolved by TAJO-1826.


            People

            • Assignee:
              eminency Jongyoung Park
              Reporter:
              eminency Jongyoung Park
            • Votes:
              0
              Watchers:
              2
