Hive / HIVE-17261

Hive uses the deprecated ParquetInputSplit constructor, which blocks the Parquet dictionary filter

    Details

    • Type: Improvement
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0
    • Fix Version/s: 3.0.0
    • Component/s: Database/Schema
    • Labels:
      None
    • Target Version/s:

      Description

      Hive uses the deprecated ParquetInputSplit constructor in https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128

      Please see the constructor definition in https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80

      The old constructor sets the row group offset values, which causes Parquet to skip the dictionary filter.
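
The behavior described above can be illustrated with a minimal, self-contained Java sketch. It assumes that the reader trusts any row group offsets carried by the split and runs its own filter only when the split carries none. The names below (Split, rowGroupsToRead, readerFilter) are hypothetical stand-ins for illustration, not the parquet-mr API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

public class SplitOffsetsSketch {

    /** Hypothetical, simplified split: offsets are null unless the client pre-selected row groups. */
    static final class Split {
        final long[] rowGroupOffsets;

        Split(long[] rowGroupOffsets) {
            this.rowGroupOffsets = rowGroupOffsets;
        }
    }

    /**
     * Models the reader-side decision (an assumption of this sketch):
     * pre-selected offsets are used as-is, and the reader-side filter
     * only runs when the split supplies no offsets.
     */
    static List<Long> rowGroupsToRead(Split split, long[] allOffsets, LongPredicate readerFilter) {
        List<Long> selected = new ArrayList<>();
        if (split.rowGroupOffsets != null) {
            // Old-style split: the reader-side filter is never consulted.
            for (long offset : split.rowGroupOffsets) {
                selected.add(offset);
            }
        } else {
            // New-style split: the reader applies its own filter.
            for (long offset : allOffsets) {
                if (readerFilter.test(offset)) {
                    selected.add(offset);
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        long[] all = {4L, 1000L, 2000L};
        // Pretend only the row group at offset 1000 can match the predicate.
        LongPredicate filter = offset -> offset == 1000L;

        System.out.println(rowGroupsToRead(new Split(all), all, filter));  // prints [4, 1000, 2000]
        System.out.println(rowGroupsToRead(new Split(null), all, filter)); // prints [1000]
    }
}
```

In this model, a split built with explicit offsets reads every listed row group, while a split without offsets leaves the selection to the reader-side filter.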

      1. HIVE-17261.10.patch
        14 kB
        Junjie Chen
      2. HIVE-17261.11.patch
        14 kB
        Junjie Chen
      3. HIVE-17261.2.patch
        11 kB
        Junjie Chen
      4. HIVE-17261.3.patch
        8 kB
        Junjie Chen
      5. HIVE-17261.4.patch
        9 kB
        Junjie Chen
      6. HIVE-17261.5.patch
        11 kB
        Junjie Chen
      7. HIVE-17261.6.patch
        11 kB
        Junjie Chen
      8. HIVE-17261.7.patch
        12 kB
        Junjie Chen
      9. HIVE-17261.8.patch
        12 kB
        Junjie Chen
      10. HIVE-17261.diff
        1 kB
        Junjie Chen
      11. HIVE-17261.patch
        1 kB
        Junjie Chen

          Activity

          junjie Junjie Chen added a comment - - edited

          This only updates one function for Parquet, so I did not add a unit test.

          junjie Junjie Chen added a comment -

          Hi liyunzhang_intel
          Could you please have a look?

          kellyzly liyunzhang_intel added a comment -

          Junjie Chen: LGTM from my side.
          Ferdinand Xu and Chao Sun: could you help review, since you know this area better?

          Ferd Ferdinand Xu added a comment - - edited

          Can you rename the patch to HIVE-17261.patch? I see the new API doesn't require filtedBlocks as a parameter. So can Parquet handle the filtering on its side using the search argument?

          junjie Junjie Chen added a comment -

          Hive converts the search argument to a FilterPredicate and pushes it down to Parquet. Please see here: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L151

          Ferd Ferdinand Xu added a comment -

          Sorry for the confusion. I have updated my previous question.

          junjie Junjie Chen added a comment -

          Yes, ParquetFileReader@filterRowGroups does the job.

          hiveqa Hive QA added a comment -

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12881138/HIVE-17261.patch

          ERROR: -1 due to no test(s) being added or modified.

          ERROR: -1 due to 9 failed/errored test(s), 11000 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move] (batchId=243)
          org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only] (batchId=243)
          org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only] (batchId=243)
          org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only] (batchId=170)
          org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=169)
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99)
          org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=180)
          org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=180)
          org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=180)
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6332/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6332/console
          Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6332/

          Messages:

          Executing org.apache.hive.ptest.execution.TestCheckPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 9 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12881138 - PreCommit-HIVE-Build

          hiveqa Hive QA added a comment -

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12881138/HIVE-17261.patch

          ERROR: -1 due to no test(s) being added or modified.

          ERROR: -1 due to 10 failed/errored test(s), 11000 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move] (batchId=243)
          org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only] (batchId=243)
          org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only] (batchId=243)
          org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only] (batchId=170)
          org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=169)
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100)
          org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=235)
          org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=180)
          org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=180)
          org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=180)
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6333/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6333/console
          Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6333/

          Messages:

          Executing org.apache.hive.ptest.execution.TestCheckPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 10 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12881138 - PreCommit-HIVE-Build

          Ferd Ferdinand Xu added a comment -

          Junjie Chen, ParquetInputSplit doesn't implicitly call the row group filter. If you want to move away from the existing deprecated constructor, please use the end() method provided on the Parquet-MR side.

          junjie Junjie Chen added a comment -

          Ferdinand Xu, I don't understand what you mean. end() is a private member function used by the deprecated constructor, so why should I use it with the new one?

          Ferd Ferdinand Xu added a comment -

          As discussed offline, we can clean up the row group filtering code if it is no longer needed.

          junjie Junjie Chen added a comment -

          Actually, Hive uses two deprecated Parquet APIs: one is ParquetInputSplit, the other is filterRowGroup. They are deprecated because Parquet introduced a new dictionary filter. The key point here is how to leverage both the statistics filter and the dictionary filter; in the existing code, Hive explicitly applies the statistics filter on the Hive side.

          To apply both the statistics and dictionary filters, we can either explicitly change the filterRowGroup API call or pass the predicate through the job configuration to Parquet and filter on the Parquet side. The patch I provided passes the predicate and skips the explicit filtering on the Hive side.
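
As a rough illustration of why applying both filters matters, here is a self-contained Java toy model. It assumes a statistics filter that only checks a column's min/max range and a dictionary filter that checks the column's distinct values. All class and method names (RowGroup, statsMightContain, dictionaryMightContain, survivors, sampleGroups) are hypothetical and do not correspond to Hive or Parquet classes; the real filters operate on column chunk metadata and dictionary pages.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class RowGroupFilterSketch {

    /** Hypothetical stand-in for one integer column's metadata in a row group. */
    static final class RowGroup {
        final String name;
        final int min;                 // column statistics: minimum value
        final int max;                 // column statistics: maximum value
        final Set<Integer> dictionary; // distinct values recorded in the dictionary

        RowGroup(String name, int min, int max, Set<Integer> dictionary) {
            this.name = name;
            this.min = min;
            this.max = max;
            this.dictionary = dictionary;
        }
    }

    /** Statistics filter: the value might be present if it lies within [min, max]. */
    static boolean statsMightContain(RowGroup group, int value) {
        return group.min <= value && value <= group.max;
    }

    /** Dictionary filter: the value can only be present if the dictionary lists it. */
    static boolean dictionaryMightContain(RowGroup group, int value) {
        return group.dictionary.contains(value);
    }

    /** Names of the row groups that must still be read for the predicate "column == value". */
    static List<String> survivors(List<RowGroup> groups, int value, boolean useDictionary) {
        return groups.stream()
                .filter(g -> statsMightContain(g, value))
                .filter(g -> !useDictionary || dictionaryMightContain(g, value))
                .map(g -> g.name)
                .collect(Collectors.toList());
    }

    /** Sample data: rg0 and rg1 share the same min/max but have different dictionaries. */
    static List<RowGroup> sampleGroups() {
        return List.of(
                new RowGroup("rg0", 1, 10, Set.of(1, 5, 10)),
                new RowGroup("rg1", 1, 10, Set.of(1, 2, 10)),
                new RowGroup("rg2", 20, 30, Set.of(20, 30)));
    }

    public static void main(String[] args) {
        List<RowGroup> groups = sampleGroups();
        System.out.println(survivors(groups, 5, false)); // statistics only: prints [rg0, rg1]
        System.out.println(survivors(groups, 5, true));  // statistics + dictionary: prints [rg0]
    }
}
```

In this model the statistics filter alone cannot rule out rg1, because 5 falls inside its min/max range, while the dictionary check shows that the value never occurs there.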

          Ferd Ferdinand Xu added a comment -

          Thanks, Junjie Chen, for the further analysis. Is there any way to add some unit tests for this ticket?

          junjie Junjie Chen added a comment - - edited

          Ferdinand Xu, I updated the original unit tests to apply the filter using the new APIs.

          Ferd Ferdinand Xu added a comment -

          Can we set the conf in the base class? Otherwise there is no pushdown in the vectorized reader path.

          hiveqa Hive QA added a comment -

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12881376/HIVE-17261.2.patch

          ERROR: -1 due to no test(s) being added or modified.

          ERROR: -1 due to 18 failed/errored test(s), 11002 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=240)
          org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move] (batchId=243)
          org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only] (batchId=243)
          org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only] (batchId=243)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_char] (batchId=9)
          org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=159)
          org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only] (batchId=170)
          org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=169)
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99)
          org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=235)
          org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=235)
          org.apache.hadoop.hive.ql.io.parquet.TestParquetRowGroupFilter.testRowGroupFilterTakeEffect (batchId=263)
          org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=180)
          org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=180)
          org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=180)
          org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDate2 (batchId=183)
          org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDecimalXY (batchId=183)
          org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteTimestamp (batchId=183)
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6351/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6351/console
          Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6351/

          Messages:

          Executing org.apache.hive.ptest.execution.TestCheckPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 18 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12881376 - PreCommit-HIVE-Build

          hiveqa Hive QA added a comment -

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12881406/HIVE-17261.3.patch

          SUCCESS: +1 due to 1 test(s) being added or modified.

          ERROR: -1 due to 15 failed/errored test(s), 11002 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=240)
          org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=240)
          org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move] (batchId=243)
          org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only] (batchId=243)
          org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only] (batchId=243)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_char] (batchId=9)
          org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=159)
          org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only] (batchId=170)
          org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=169)
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99)
          org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=235)
          org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=180)
          org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=180)
          org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=180)
          org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout (batchId=228)
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6353/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6353/console
          Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6353/

          Messages:

          Executing org.apache.hive.ptest.execution.TestCheckPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 15 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12881406 - PreCommit-HIVE-Build

          hiveqa Hive QA added a comment -

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12885305/HIVE-17261.4.patch

          SUCCESS: +1 due to 1 test(s) being added or modified.

          ERROR: -1 due to 4 failed/errored test(s), 11033 tests executed
          Failed tests:

          TestTxnCommandsBase - did not produce a TEST-*.xml file (likely timed out) (batchId=280)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_char] (batchId=9)
          org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6669/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6669/console
          Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6669/

          Messages:

          Executing org.apache.hive.ptest.execution.TestCheckPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 4 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12885305 - PreCommit-HIVE-Build

          Ferd Ferdinand Xu added a comment - - edited

          Thanks, Junjie Chen, for the patch. Some comments below, all in ParquetRecordReaderBase.java:

          1. Please remove the @Deprecated annotation, since we are no longer using the deprecated constructor in L65.
          2. In L103 - L107, use two-space indents.
          3. Please update the setFilter method, since its return value is no longer needed.
          4. The searchArg is passed to setFilter as a final variable. So is the converted filter property not passed to the Parquet reader?

          junjie Junjie Chen added a comment -

          Thanks, Ferdinand Xu.
          As for 4, since jobconf is a member variable, it doesn't need to be passed explicitly.

          hiveqa Hive QA added a comment -

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12885720/HIVE-17261.5.patch

          SUCCESS: +1 due to 1 test(s) being added or modified.

          ERROR: -1 due to 6 failed/errored test(s), 11028 tests executed
          Failed tests:

          TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
          TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
          TestTxnCommandsBase - did not produce a TEST-*.xml file (likely timed out) (batchId=280)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_char] (batchId=9)
          org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6710/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6710/console
          Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6710/

          Messages:

          Executing org.apache.hive.ptest.execution.TestCheckPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 6 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12885720 - PreCommit-HIVE-Build

          Ferd Ferdinand Xu added a comment - - edited

          Thanks Junjie Chen for the patch.
          One comment has not been addressed:
          In ParquetRecordReaderBase.java

          • Please remove the @Deprecated annotation, since we are no longer using the deprecated constructor in L65

          A few more comments:
          In ParquetRecordReaderBase.java

          • Remove the unnecessary return in L131

          In TestParquetRowGroupFilter.java

          • Since the filter now takes effect automatically inside the Parquet reader, we should add test cases that verify it at the reader level; the current tests only cover RowGroupFilter.filterRowGroups.

          Could you create a review board next time for review? Thank you!
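
          The difference between a row-group-level check and a reader-level check can be sketched in plain Java. This is a simplified model, not Parquet code: RowGroupSketch and DictionaryFilterSketch are hypothetical names, a row group is reduced to a list of string values, and its "dictionary" is just the set of distinct values. Real Parquet filtering also consults column statistics, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a row group: its rows plus the set of distinct values (the dictionary).
final class RowGroupSketch {
    final List<String> rows;
    final Set<String> dictionary;

    RowGroupSketch(List<String> rows) {
        this.rows = rows;
        this.dictionary = new HashSet<>(rows);
    }
}

final class DictionaryFilterSketch {
    // Row-group level: an equality predicate can skip a group whose dictionary lacks the value.
    static boolean canDrop(RowGroupSketch group, String wanted) {
        return !group.dictionary.contains(wanted);
    }

    // Reader level: skip droppable groups, then filter the rows of the remaining groups.
    static List<String> read(List<RowGroupSketch> groups, String wanted) {
        List<String> out = new ArrayList<>();
        for (RowGroupSketch group : groups) {
            if (canDrop(group, wanted)) {
                continue;
            }
            for (String row : group.rows) {
                if (row.equals(wanted)) {
                    out.add(row);
                }
            }
        }
        return out;
    }
}
```

          A test of canDrop alone corresponds to the row-group-level tests mentioned above; a test of read checks what the reader actually returns, which is the level the review asks to cover.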

          hiveqa Hive QA added a comment -

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12886325/HIVE-17261.6.patch

          SUCCESS: +1 due to 1 test(s) being added or modified.

          ERROR: -1 due to 7 failed/errored test(s), 11033 tests executed
          Failed tests:

          TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
          TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_char] (batchId=9)
          org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=143)
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100)
          org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6761/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6761/console
          Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6761/

          Messages:

          Executing org.apache.hive.ptest.execution.TestCheckPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 7 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12886325 - PreCommit-HIVE-Build

          Ferd Ferdinand Xu added a comment -

          Junjie Chen, I took another look. One more minor comment:
          In ParquetRecordReaderBase

          • L69: the split variable is not needed. You can just return new ParquetInputSplit at the end.
          hiveqa Hive QA added a comment -

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12886407/HIVE-17261.7.patch

          SUCCESS: +1 due to 1 test(s) being added or modified.

          ERROR: -1 due to 5 failed/errored test(s), 11033 tests executed
          Failed tests:

          TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
          TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_char] (batchId=9)
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100)
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6766/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6766/console
          Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6766/

          Messages:

          Executing org.apache.hive.ptest.execution.TestCheckPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 5 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12886407 - PreCommit-HIVE-Build

          hiveqa Hive QA added a comment -

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12886550/HIVE-17261.8.patch

          SUCCESS: +1 due to 1 test(s) being added or modified.

          ERROR: -1 due to 10 failed/errored test(s), 11036 tests executed
          Failed tests:

          TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
          TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_char] (batchId=9)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat6] (batchId=7)
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=99)
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2] (batchId=89)
          org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
          org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=234)
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6783/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6783/console
          Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6783/

          Messages:

          Executing org.apache.hive.ptest.execution.TestCheckPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 10 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12886550 - PreCommit-HIVE-Build

          Ferd Ferdinand Xu added a comment -

          LGTM +1 pending on the test

          hiveqa Hive QA added a comment -

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12886803/HIVE-17261.11.patch

          SUCCESS: +1 due to 1 test(s) being added or modified.

          ERROR: -1 due to 11 failed/errored test(s), 11040 tests executed
          Failed tests:

          TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
          TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_char] (batchId=9)
          org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=143)
          org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156)
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2] (batchId=89)
          org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
          org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=215)
          org.apache.hadoop.hive.ql.io.orc.TestNewInputOutputFormat.testNewOutputFormat (batchId=262)
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6808/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6808/console
          Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6808/

          Messages:

          Executing org.apache.hive.ptest.execution.TestCheckPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 11 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12886803 - PreCommit-HIVE-Build

          hiveqa Hive QA added a comment -

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12886803/HIVE-17261.11.patch

          SUCCESS: +1 due to 1 test(s) being added or modified.

          ERROR: -1 due to 11 failed/errored test(s), 11040 tests executed
          Failed tests:

          TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
          TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=47)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61)
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_char] (batchId=9)
          org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156)
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2] (batchId=89)
          org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
          org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=215)
          org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=215)
          

          Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6809/testReport
          Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6809/console
          Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6809/

          Messages:

          Executing org.apache.hive.ptest.execution.TestCheckPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 11 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12886803 - PreCommit-HIVE-Build

          Ferd Ferdinand Xu added a comment -

          Failed test cases are not related. Committed to the upstream.

          kgyrtkirk Zoltan Haindrich added a comment -

          I'm afraid that this patch has caused a regression in parquet_ppd_char.q

          At first glance, it seems that some results are missing:
          https://builds.apache.org/job/PreCommit-HIVE-Build/6809/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_parquet_ppd_char_/

          According to the Jenkins history, this test failed when the HIVE-17261 patch was tested, and since the patch went in it has failed in every build:
          https://builds.apache.org/job/PreCommit-HIVE-Build/6813/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_parquet_ppd_char_/history/

          junjie Junjie Chen added a comment - - edited

          The following insert statement stores values in Parquet without trailing spaces:
          insert overwrite table newtypestbl select * from (select cast("apple" as char(10)), cast("bee" as varchar(10)), 0.22, cast("1970-02-20" as date) from src src1 union all select cast("hello" as char(10)), cast("world" as varchar(10)), 11.22, cast("1970-02-27" as date) from src src2 limit 10) uniontbl;

          However, Hive passes the predicate

          "eq(c, Binary{"apple     "})"

          to Parquet, so the records are filtered out in RecordReader#nextKeyValue().

          So Hive should also strip the trailing spaces from the predicate constant.
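
          The mismatch can be reproduced outside Hive with a minimal sketch. It assumes, as described above, that the stored value is "apple" and the predicate constant is "apple" padded to the char(10) length, and that the equality check is byte-wise. CharPredicateSketch and its methods are hypothetical helpers written for this illustration, not Hive or Parquet APIs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of Hive or Parquet.
final class CharPredicateSketch {
    // Pads a value with spaces up to the declared char(n) length.
    static String padTo(String value, int length) {
        StringBuilder sb = new StringBuilder(value);
        while (sb.length() < length) {
            sb.append(' ');
        }
        return sb.toString();
    }

    // Removes trailing spaces only; leading and inner spaces are kept.
    static String stripTrailingSpaces(String value) {
        int end = value.length();
        while (end > 0 && value.charAt(end - 1) == ' ') {
            end--;
        }
        return value.substring(0, end);
    }

    // Byte-wise equality, modelling how a binary equality predicate compares values.
    static boolean binaryEq(String stored, String constant) {
        return Arrays.equals(stored.getBytes(StandardCharsets.UTF_8),
                             constant.getBytes(StandardCharsets.UTF_8));
    }
}
```

          With these helpers, binaryEq("apple", padTo("apple", 10)) is false, which matches the missing rows, while binaryEq("apple", stripTrailingSpaces(padTo("apple", 10))) is true.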

          Ferd Ferdinand Xu added a comment -

          Thanks Junjie Chen for the investigation. Hive should cast the constant in the filter to char(10) in this case. See the following schema for table newtypestbl.

          create table newtypestbl(c char(10), v varchar(10), d decimal(5,3), da date) stored as parquet;
          
          junjie Junjie Chen added a comment -

          I think the length in create table specifies the maximum length of the column. It looks like Hive does not write the cast values to Parquet.
          The following Parquet file dump shows no trailing spaces:
          c = hello
          v = world
          d = ACvU
          da = 57

          c = apple
          v = bee
          d = AADc
          da = 50

          c = hello
          v = world
          d = ACvU
          da = 57

          c = apple
          v = bee
          d = AADc
          da = 50

          csun Chao Sun added a comment -

          Ferdinand Xu: does HIVE-14836 depend on this JIRA?

          Ferd Ferdinand Xu added a comment -

          Hi Chao Sun, the vectorized reader can't leverage the existing predicate pushdown mechanism in ParquetRecordReader, so I have removed the blocking link. Thank you for pointing this out.


            People

            • Assignee:
              junjie Junjie Chen
              Reporter:
              junjie Junjie Chen
            • Votes:
              0
              Watchers:
              7

              Dates

              • Created:
                Updated:

                Development