IMPALA-7533 (sub-task of IMPALA-7127: Fetch-on-demand metadata for the impalad-side catalog)

Optimize fetch-from-catalog by caching partitions across table versions

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 4.0.0
    • Component/s: None

      Description

      Currently, the partition-level information cached in CatalogdMetaProvider is tied to a particular version number of its containing table. This means that if the table is modified in any way (e.g., even if only a comment changes), all of its partitions are effectively invalidated and must be reloaded from catalogd.
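
      To make the problem concrete, here is a minimal sketch of a cache whose key embeds the table version. It is not the actual CatalogdMetaProvider code: the class name, the key fields, and the fetch counter are all hypothetical, and the "fetch" is simulated with a string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch; none of these names come from the Impala code base.
final class TableVersionKeyedCache {
  // Cache key that embeds the version of the containing table.
  static final class Key {
    final String table;
    final long tableVersion;
    final long partitionId;

    Key(String table, long tableVersion, long partitionId) {
      this.table = table;
      this.tableVersion = tableVersion;
      this.partitionId = partitionId;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Key)) return false;
      Key k = (Key) o;
      return table.equals(k.table) && tableVersion == k.tableVersion
          && partitionId == k.partitionId;
    }

    @Override
    public int hashCode() {
      return Objects.hash(table, tableVersion, partitionId);
    }
  }

  private final Map<Key, String> cache = new HashMap<>();
  // Counts simulated round trips to catalogd.
  int fetchesFromCatalogd = 0;

  String getPartition(String table, long tableVersion, long partitionId) {
    Key key = new Key(table, tableVersion, partitionId);
    String meta = cache.get(key);
    if (meta == null) {
      // Cache miss: stand-in for a real fetch from catalogd.
      fetchesFromCatalogd++;
      meta = "metadata for partition " + partitionId;
      cache.put(key, meta);
    }
    return meta;
  }
}
```

      Because the table version is part of the key, asking for the same partition after any table-level change misses the cache, even though the partition itself is unchanged.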

      We could avoid this invalidation-and-refetch in a couple of ways:
      1) Make partitions immutable given an ID. Instead of modifying partitions in place, we could drop the partition and add a new one with a new ID. This is already done in several code paths, but not all. If we did this, we'd only need to invalidate the partition list for a table; when we fetched the new list, we'd see which partitions changed and need to be reloaded.
      2) Add a partition-level version/sequence number that is modified whenever the partition is mutated in place. If we fetched it as part of the partition list and used it as part of the cache key, we could avoid invalidating partitions when nothing changed. This would cost 4 or 8 bytes per partition (perhaps manageable considering the hundreds of bytes saved by recent patches).
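
      Option 2 could look roughly like the sketch below. It assumes the partition list returned by catalogd carries an (id, version) pair per partition; the names PartitionRef, loadPartitions, and fetchesFromCatalogd are invented for illustration, and the fetch is again simulated.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch; none of these names come from the Impala code base.
final class PartitionVersionedCache {
  // One entry of the partition list: partition ID plus a per-partition version
  // (a long here, i.e. the 8-byte variant mentioned above).
  static final class PartitionRef {
    final long id;
    final long version;

    PartitionRef(long id, long version) {
      this.id = id;
      this.version = version;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof PartitionRef)) return false;
      PartitionRef r = (PartitionRef) o;
      return id == r.id && version == r.version;
    }

    @Override
    public int hashCode() {
      return Objects.hash(id, version);
    }
  }

  // Keyed by (id, version); the table version plays no part in the key.
  private final Map<PartitionRef, String> cache = new HashMap<>();
  // Counts simulated round trips to catalogd.
  int fetchesFromCatalogd = 0;

  // Resolves every partition named in a freshly fetched partition list,
  // fetching only those whose (id, version) pair is not cached yet.
  Map<Long, String> loadPartitions(List<PartitionRef> partitionList) {
    Map<Long, String> result = new HashMap<>();
    for (PartitionRef ref : partitionList) {
      String meta = cache.get(ref);
      if (meta == null) {
        // Cache miss: stand-in for a real fetch from catalogd.
        fetchesFromCatalogd++;
        meta = "metadata for partition " + ref.id + " v" + ref.version;
        cache.put(ref, meta);
      }
      result.put(ref.id, meta);
    }
    // Entries for superseded versions stay in the map in this sketch;
    // a real cache would need to evict them.
    return result;
  }
}
```

      Under option 1 the same structure would apply with the ID alone as the key, since a mutated partition would reappear in the list under a new ID.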

              People

              • Assignee: Quanlong Huang (stigahuang)
              • Reporter: Todd Lipcon (tlipcon)