Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: None
    • Labels: None

    Description

      In Hive, predicate pushdown extracts the search condition from the HQL query, serializes it, and pushes it down to the file format. ORC can use the predicate to filter stripes. Similarly, Parquet should use the statistics stored in each row group to skip row groups that cannot match. But this does not work.

      ParquetRecordReaderWrapper gets splits containing all row groups (client side) and pushes the filter to Parquet for further processing (Parquet side). But in ParquetRecordReader.initializeInternalReader(), if the splits have already been selected on the client side, the filter is not applied again.

      We should make the behavior consistent in Hive. One option is to get the splits, filter them, and then pass them to Parquet, which means using the client-side strategy.
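
      To illustrate the idea of filtering row groups by their statistics before handing them to the reader, here is a minimal, self-contained sketch. It does not use the real Hive or Parquet APIs: the RowGroup class, its min/max fields, and the "column > threshold" predicate are hypothetical stand-ins chosen for illustration, and the attached patch may do this differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowGroupFilterSketch {

    /** Hypothetical stand-in for one row group and the min/max statistics of a single column. */
    static final class RowGroup {
        final int id;
        final long min;
        final long max;

        RowGroup(int id, long min, long max) {
            this.id = id;
            this.min = min;
            this.max = max;
        }
    }

    /**
     * Hypothetical predicate "column > threshold" evaluated against statistics only.
     * Returns false only when no row in the group can possibly match.
     */
    static boolean mightMatchGreaterThan(RowGroup group, long threshold) {
        return group.max > threshold;
    }

    /** Client-side strategy: drop row groups that cannot match, then pass the rest on. */
    static List<RowGroup> filterGreaterThan(List<RowGroup> groups, long threshold) {
        List<RowGroup> kept = new ArrayList<>();
        for (RowGroup group : groups) {
            if (mightMatchGreaterThan(group, threshold)) {
                kept.add(group);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<RowGroup> groups = Arrays.asList(
                new RowGroup(0, 1, 10),
                new RowGroup(1, 11, 20),
                new RowGroup(2, 21, 30));

        // Predicate: column > 20. Only row group 2 has max > 20.
        for (RowGroup group : filterGreaterThan(groups, 20)) {
            System.out.println("keep row group " + group.id);
        }
    }
}
```

      The statistics check is conservative: a kept row group may still contain no matching rows, so the predicate must still be evaluated per row afterwards.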

      Attachments

        1. HIVE-10252.patch
          11 kB
          Dong Chen

            People

              dongc Dong Chen
              dongc Dong Chen
              Votes: 0
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved: