Hive / HIVE-14826

Support vectorization for Parquet

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      A vectorized Parquet reader can improve read throughput and take advantage of Hive's existing vectorized execution engine. This is an umbrella ticket to track the feature.
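
      As a rough illustration of why a vectorized read path can help, the sketch below fills a fixed-size column batch and then processes it in a tight loop, instead of handling one row at a time. It is not Hive or Parquet code: `VectorizedReadSketch`, `LongBatch`, `readBatch`, and `sumBatch` are hypothetical names, and the batch size of 1024 is an assumption chosen to match the default size of Hive's `VectorizedRowBatch`.

```java
public class VectorizedReadSketch {
    // Assumed batch size; mirrors the default size of Hive's VectorizedRowBatch.
    static final int BATCH_SIZE = 1024;

    /** Hypothetical column batch: primitive values plus a null mask. */
    static final class LongBatch {
        final long[] values = new long[BATCH_SIZE];
        final boolean[] isNull = new boolean[BATCH_SIZE];
        int size; // number of valid entries in this batch
    }

    /** Copies up to BATCH_SIZE values from the column into the batch, starting at offset. */
    static int readBatch(Long[] column, int offset, LongBatch batch) {
        int n = Math.min(BATCH_SIZE, column.length - offset);
        for (int i = 0; i < n; i++) {
            Long v = column[offset + i];
            batch.isNull[i] = (v == null);
            batch.values[i] = (v == null) ? 0L : v;
        }
        batch.size = n;
        return n;
    }

    /** Sums the non-null values of one batch in a single loop over primitives. */
    static long sumBatch(LongBatch batch) {
        long sum = 0;
        for (int i = 0; i < batch.size; i++) {
            if (!batch.isNull[i]) {
                sum += batch.values[i];
            }
        }
        return sum;
    }

    /** Reads the whole column batch by batch, reusing one batch object. */
    static long sumColumn(Long[] column) {
        LongBatch batch = new LongBatch();
        long total = 0;
        for (int offset = 0; offset < column.length; offset += BATCH_SIZE) {
            readBatch(column, offset, batch);
            total += sumBatch(batch);
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for a decoded Parquet column with some nulls.
        Long[] column = new Long[2500];
        for (int i = 0; i < column.length; i++) {
            column[i] = (i % 100 == 0) ? null : Long.valueOf(i);
        }
        System.out.println(sumColumn(column));
    }
}
```

      The batch object is reused across iterations, so the per-row cost is reduced to an array access and a null check; this is the general idea behind feeding a columnar reader into a vectorized execution engine, not a description of the actual implementation tracked here.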

          Issue Links

          1. Implement Parquet vectorization reader for Primitive types (Sub-task, Resolved, Ferdinand Xu)
          2. Micro benchmark for Parquet vectorized reader (Sub-task, Resolved, Colin)
          3. Test the predicate pushing down support for Parquet vectorization read path (Sub-task, Patch Available, Ferdinand Xu)
          4. Implement Parquet vectorization reader for Struct type (Sub-task, Resolved, Ferdinand Xu)
          5. Support Nested Column Field Pruning for Parquet Vectorized Reader (Sub-task, Open, Chao Sun)
          6. Fix the NullPointer problem caused by split phase (Sub-task, Resolved, Colin)
          7. Parquet vectorization doesn't work for tables with partition info (Sub-task, Closed, Colin)
          8. ParquetFileReader should be closed to avoid resource leak (Sub-task, Closed, Colin)
          9. Measure Performance for Parquet Vectorization Reader (Sub-task, Open, Colin)
          10. When we enable Parquet Writer Version V2, hive throws an exception: Unsupported encoding: DELTA_BYTE_ARRAY. (Sub-task, Closed, Colin)
          11. Add more q-tests for Hive-on-Spark with Parquet vectorized reader (Sub-task, Closed, Ferdinand Xu)
          12. Add a config to turn off parquet vectorization (Sub-task, Closed, Vihang Karajgaonkar)
          13. Vectorized reader does not seem to be pushing down projection columns in certain code paths (Sub-task, Closed, Ferdinand Xu)
          14. Remove Parquet specific code from VectorizedColumnReader (Sub-task, Open, Unassigned)
          15. Parquet vectorization fails on tables with complex columns when there are no projected columns (Sub-task, Closed, Vihang Karajgaonkar)
          16. Vectorized reader does push down projection columns for index access schema (Sub-task, Resolved, Unassigned)
          17. Implement Parquet vectorization reader for Array type (Sub-task, Closed, Colin)
          18. Support column projection for index access when using Parquet Vectorization (Sub-task, Closed, Ferdinand Xu)
          19. NPE during initialization of VectorizedParquetRecordReader when input split is null (Sub-task, Closed, Vihang Karajgaonkar)
          20. Implement Parquet vectorization reader for Map type (Sub-task, Closed, Colin)
          21. Fix API call in VectorizedListColumnReader to get value from BytesColumnVector (Sub-task, Closed, Colin)
          22. Support to read multiple level definition for Map type in Parquet file (Sub-task, Closed, Colin)
          23. Support vectorization for INTERVAL_DAY_TIME type (Sub-task, Open, Unassigned)
          24. Vectorization: add the support of timestamp in VectorizedPrimitiveColumnReader for parquet (Sub-task, Closed, Vihang Karajgaonkar)
          25. Fix ArrayIndexOutOfBoundsException for VectorizedListColumnReader (Sub-task, Closed, Colin)
          26. Support schema evolution in Parquet Vectorization reader (Sub-task, Closed, Ferdinand Xu)
          27. Support to read nested complex type with Parquet in vectorization mode (Sub-task, Open, Haifeng Chen)

              People

              • Assignee: Ferdinand Xu
              • Reporter: Ferdinand Xu
              • Votes: 0
              • Watchers: 13