IMPALA-4986

Use Parquet statistics when evaluating min/max/count aggregates

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 2.9.0
    • Fix Version/s: None
    • Component/s: Frontend

    Description

      Parquet statistics, such as num_rows and parquet::Statistics, can be used in several ways to speed up aggregation queries that use min/max/count. Some of the improvements can be made at execution time only; others also require query-plan modifications. The subtasks illustrate the individual optimization opportunities and can be tackled separately.
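
      As a rough illustration of the idea, the sketch below answers count(*), count(col), and min/max from per-row-group metadata alone, and signals a fallback to a regular scan when the needed statistics are absent. It is not Impala code: ColumnChunkStats and the three helper functions are hypothetical names, the column is assumed to hold integers, and the query is assumed to have no predicates or grouping.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class ColumnChunkStats:
    """Hypothetical stand-in for one column's metadata in one row group."""
    num_rows: int                      # row count of the row group
    null_count: Optional[int] = None   # None means the writer did not record it
    min_value: Optional[int] = None
    max_value: Optional[int] = None


def count_star(chunks: Sequence[ColumnChunkStats]) -> int:
    # count(*) needs only the row counts, never the column data.
    return sum(c.num_rows for c in chunks)


def count_column(chunks: Sequence[ColumnChunkStats]) -> Optional[int]:
    # count(col) excludes NULLs, so every row group must report null_count.
    # Returning None tells the caller to fall back to scanning the data.
    if any(c.null_count is None for c in chunks):
        return None
    return sum(c.num_rows - c.null_count for c in chunks)


def min_max(
    chunks: Sequence[ColumnChunkStats],
) -> Optional[Tuple[Optional[int], Optional[int]]]:
    # Returns None if any row group lacks usable statistics (fall back to a
    # scan), or (None, None) if there are no non-NULL values at all.
    mins, maxs = [], []
    for c in chunks:
        if c.null_count is not None and c.null_count == c.num_rows:
            continue  # empty or all-NULL row group contributes nothing
        if c.min_value is None or c.max_value is None:
            return None
        mins.append(c.min_value)
        maxs.append(c.max_value)
    if not mins:
        return (None, None)
    return (min(mins), max(maxs))
```

      Returning None rather than guessing keeps the sketch conservative: statistics are optional in Parquet files, so any optimization of this kind needs a path that reads the data when they are missing.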

            People

              Assignee: Unassigned
              Reporter: Lars Volker
              Votes: 0
              Watchers: 8