IMPALA-6411

Compute stats for a subset of columns

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: Impala 2.11.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      For tables with a large number of columns (more than 400), the metadata size and processing time for computing stats on all columns become prohibitively expensive. Being able to compute stats on only the subset of columns that are frequently accessed would improve performance and reduce catalog memory usage.

            People

              Assignee: Unassigned
              Reporter: Peter Ebert
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:
                Resolved: