
Details

    • Type: Sub-task
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Tests
    • Labels: None

    Description

      Parquet statistics record a boolean value, hasNonNullValue, for each column chunk. Hive could use this value to skip a column, avoid null-checking logic, and speed up vectorization as in HIVE-4478 (a future step, since Parquet vectorization is not complete yet).

      In this Jira we check whether this null optimization works and make changes if any are needed.
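
      A minimal sketch of the idea, not the code in the attached patches. ChunkStats, canSkipColumn and canSkipNullChecks are hypothetical names used only for illustration. The numNulls and valueCount fields are assumptions added for the example; only hasNonNullValue is named in this issue.

```java
// Illustrative sketch only: ChunkStats and the helper methods are hypothetical
// stand-ins and are not Hive or Parquet classes.
public class NullSkipSketch {

    /** Hypothetical per-column-chunk statistics. */
    public static final class ChunkStats {
        final boolean hasNonNullValue; // true if the chunk holds at least one non-null value
        final long numNulls;           // assumed: number of nulls in the chunk
        final long valueCount;         // assumed: total number of values in the chunk

        public ChunkStats(boolean hasNonNullValue, long numNulls, long valueCount) {
            this.hasNonNullValue = hasNonNullValue;
            this.numNulls = numNulls;
            this.valueCount = valueCount;
        }
    }

    /** A non-empty chunk with no non-null value holds only nulls, so a reader could skip decoding it. */
    public static boolean canSkipColumn(ChunkStats s) {
        return s.valueCount > 0 && !s.hasNonNullValue;
    }

    /** A chunk with no nulls at all would let a reader drop per-value null checks. */
    public static boolean canSkipNullChecks(ChunkStats s) {
        return s.hasNonNullValue && s.numNulls == 0;
    }

    public static void main(String[] args) {
        ChunkStats allNull = new ChunkStats(false, 100, 100);
        ChunkStats noNull = new ChunkStats(true, 0, 100);
        ChunkStats mixed = new ChunkStats(true, 10, 100);

        System.out.println("allNull skip column: " + canSkipColumn(allNull));
        System.out.println("noNull skip null checks: " + canSkipNullChecks(noNull));
        System.out.println("mixed skip column: " + canSkipColumn(mixed));
    }
}
```

      In this sketch hasNonNullValue alone decides whether a chunk can be skipped; dropping null checks additionally relies on the assumed null count being zero.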

      Attachments

        1. HIVE-10257-parquet.1.patch
          6 kB
          Dong Chen
        2. HIVE-10257-parquet.2.patch
          6 kB
          Dong Chen
        3. HIVE-10257-parquet.patch
          4 kB
          Dong Chen

          People

            Assignee: Dong Chen (dongc)
            Reporter: Dong Chen (dongc)
            Votes: 0
            Watchers: 2
