IMPALA-12631

Improve count star performance for parquet scans

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 4.4.0
    • Component/s: Backend
    • Labels: None

    Description

      The count star path in the backend function HdfsParquetScanner::GetNextInternal() is inefficient. It currently derives the row count from per-row-group statistics rather than from the file-level metadata, which leads to unnecessary materialization overhead.
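
      The difference between the two approaches can be illustrated with a minimal sketch. This is not Impala's actual code: the structs and function names below are hypothetical, simplified stand-ins for the Parquet footer, which records a num_rows value both for the whole file and for each row group. The sketch assumes one scanner is responsible for the whole file; how the count is split across scan ranges is not covered here.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the Parquet footer structures.
// The real footer stores a row count per row group and one for the file.
struct RowGroupMeta {
  int64_t num_rows;  // rows in this row group
};

struct FileMeta {
  int64_t num_rows;                      // total rows in the file
  std::vector<RowGroupMeta> row_groups;  // per-row-group metadata
};

// Per-row-group approach: visit every row group and accumulate its count.
// Work grows with the number of row groups in the file.
int64_t CountStarFromRowGroups(const FileMeta& file) {
  int64_t total = 0;
  for (const RowGroupMeta& rg : file.row_groups) {
    total += rg.num_rows;
  }
  return total;
}

// File-level approach: read the total directly from the file metadata.
// A single lookup, independent of the number of row groups.
int64_t CountStarFromFileMeta(const FileMeta& file) {
  return file.num_rows;
}
```

      For a well-formed file both functions return the same value; the file-level version simply avoids the per-row-group work.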

          People

            Assignee: YifanZhang (zhangyifan27)
            Reporter: YifanZhang (zhangyifan27)
            Votes: 0
            Watchers: 3
