  1. IMPALA
  2. IMPALA-2649 improve incremental stats scalability
  3. IMPALA-7425

Add option to load incremental statistics from catalog


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 3.1.0
    • Fix Version/s: Impala 3.1.0
    • Component/s: Catalog
    • Labels: None

    Description

      Incremental statistics currently keep all of their required data in catalogd and in every impalad coordinator, even though that data is needed only when incremental statistics are computed. When incremental statistics are used on tables with many columns, many partitions, or both, this data can dominate the overall memory footprint, which can lead to OOMs, increased network usage, and instability.

      Add an option that stops propagating incremental stats to all coordinators and instead pulls them from the catalog on demand, only when the compute incremental statistics statement needs them.
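
      The proposal can be illustrated with a small, self-contained model. This is only a sketch of the idea, not Impala's implementation: the `Catalog` and `Coordinator` classes, the `pull_on_demand` option name, and the blob sizes are all hypothetical and chosen for illustration. It shows that with the option enabled a coordinator caches no incremental-stats payload, yet still obtains the same data when a compute incremental statistics statement runs.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Catalog:
    """Hypothetical stand-in for catalogd; not Impala's real API."""
    # table name -> {partition name -> opaque incremental-stats blob}
    incremental_stats: Dict[str, Dict[str, bytes]] = field(default_factory=dict)
    fetch_count: int = 0  # number of on-demand pulls served

    def broadcast_payload(self, table: str, include_stats: bool) -> dict:
        """Metadata pushed to every coordinator on a catalog update."""
        partitions = self.incremental_stats.get(table, {})
        payload = {"table": table, "partitions": sorted(partitions)}
        if include_stats:
            payload["stats"] = dict(partitions)
        return payload

    def fetch_incremental_stats(self, table: str) -> Dict[str, bytes]:
        """On-demand pull, used only while computing incremental stats."""
        self.fetch_count += 1
        return dict(self.incremental_stats.get(table, {}))


@dataclass
class Coordinator:
    """Hypothetical stand-in for an impalad coordinator."""
    catalog: Catalog
    pull_on_demand: bool  # assumed option name, for illustration only
    cache: Dict[str, dict] = field(default_factory=dict)

    def on_catalog_update(self, table: str) -> None:
        # With the option on, the stats blobs are left out of the broadcast.
        self.cache[table] = self.catalog.broadcast_payload(
            table, include_stats=not self.pull_on_demand)

    def cached_stats_bytes(self) -> int:
        return sum(len(blob)
                   for payload in self.cache.values()
                   for blob in payload.get("stats", {}).values())

    def stats_for_compute_incremental(self, table: str) -> Dict[str, bytes]:
        if self.pull_on_demand:
            return self.catalog.fetch_incremental_stats(table)
        return self.cache[table]["stats"]


# Two partitions with arbitrary 400-byte blobs (sizes are illustrative).
catalog = Catalog({"sales": {"day=1": b"x" * 400, "day=2": b"y" * 400}})
push = Coordinator(catalog, pull_on_demand=False)
pull = Coordinator(catalog, pull_on_demand=True)
for coordinator in (push, pull):
    coordinator.on_catalog_update("sales")

print(push.cached_stats_bytes(), pull.cached_stats_bytes())
```

      The trade-off in this model is an extra round trip to the catalog per compute incremental statistics statement in exchange for coordinators that no longer hold the stats payload between statements.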


            People

              Assignee: Vuk Ercegovac (vukercegovac)
              Reporter: Vuk Ercegovac (vukercegovac)
