  1. Hive
  2. HIVE-17261

Hive uses the deprecated ParquetInputSplit constructor, which blocks the Parquet dictionary filter


Details

    • Type: Improvement
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0
    • Fix Version/s: 3.2.0
    • Component/s: Database/Schema
    • Labels: None

    Description

      Hive uses the deprecated ParquetInputSplit constructor in https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128

      See the constructor definition in https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80

      The deprecated constructor sets the rowGroupOffsets values, which causes Parquet to skip the dictionary filter.
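
      The effect described above can be illustrated with a simplified model. This is only a sketch under stated assumptions, not parquet-mr or Hive code: the names Split, selectRowGroups, and mayMatch are hypothetical, and mayMatch merely stands in for a check such as the dictionary filter. It assumes that a split which already carries row-group offsets is read as-is, while a split without offsets lets the reader choose and filter row groups itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

// Simplified, hypothetical model of the behavior described in this issue.
// None of these types are the real parquet-mr or Hive classes.
public class SplitFilterSketch {

    /** Stand-in for an input split; rowGroupOffsets may be null. */
    static final class Split {
        final long start;
        final long end;
        final long[] rowGroupOffsets; // null: the reader picks and filters row groups

        Split(long start, long end, long[] rowGroupOffsets) {
            this.start = start;
            this.end = end;
            this.rowGroupOffsets = rowGroupOffsets;
        }
    }

    /**
     * Returns the offsets of the row groups the reader would scan.
     * allOffsets holds the start offset of every row group in the file;
     * mayMatch is a placeholder for a row-group filter such as a dictionary check.
     */
    static List<Long> selectRowGroups(Split split, long[] allOffsets, LongPredicate mayMatch) {
        List<Long> selected = new ArrayList<>();
        if (split.rowGroupOffsets != null) {
            // Row groups were fixed when the split was built: the filter never runs.
            for (long offset : split.rowGroupOffsets) {
                selected.add(offset);
            }
            return selected;
        }
        // No offsets given: keep row groups inside the split range that pass the filter.
        for (long offset : allOffsets) {
            if (offset >= split.start && offset < split.end && mayMatch.test(offset)) {
                selected.add(offset);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        long[] allOffsets = {4L, 1000L, 2000L};
        // Pretend only the row group at offset 1000 can contain matching values.
        LongPredicate mayMatch = offset -> offset == 1000L;

        Split preselected = new Split(0L, 3000L, new long[] {4L, 1000L, 2000L});
        Split deferred = new Split(0L, 3000L, null);

        System.out.println(selectRowGroups(preselected, allOffsets, mayMatch)); // [4, 1000, 2000]
        System.out.println(selectRowGroups(deferred, allOffsets, mayMatch));    // [1000]
    }
}
```

      In this model, the split built with explicit offsets scans all three row groups even though the filter could have eliminated two of them, which is the lost optimization this issue reports.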

      Attachments

        1. HIVE-17261.patch
          1 kB
          Junjie Chen
        2. HIVE-17261.diff
          1 kB
          Junjie Chen
        3. HIVE-17261.8.patch
          12 kB
          Junjie Chen
        4. HIVE-17261.7.patch
          12 kB
          Junjie Chen
        5. HIVE-17261.6.patch
          11 kB
          Junjie Chen
        6. HIVE-17261.5.patch
          11 kB
          Junjie Chen
        7. HIVE-17261.4.patch
          9 kB
          Junjie Chen
        8. HIVE-17261.3.patch
          8 kB
          Junjie Chen
        9. HIVE-17261.2.patch
          11 kB
          Junjie Chen
        10. HIVE-17261.11.patch
          14 kB
          Junjie Chen
        11. HIVE-17261.10.patch
          14 kB
          Junjie Chen

            People

              Assignee: Junjie Chen
              Reporter: Junjie Chen
              Votes: 0
              Watchers: 8
