IMPALA-9703

Skip loading partition meta and file meta for PB scale tables

Details

    • New Feature
    • Status: Open
    • Critical
    • Resolution: Unresolved
    • Catalog

    Description

      Petabyte-scale tables with more than 100K partitions may hit catalog limitations. Caching all of their partitions is also wasteful, since queries usually need only a few of them. Queries that scan all partitions will probably fail with resource limit errors anyway, so that case is out of scope here.

      This JIRA tracks the work to skip caching the partition metadata of a table. Catalogd will cache only the HmsTable object and the partition list (partition names, e.g. "p1=a/p2=b", and the internal partition ids generated by Impala). Coordinators fetch the partition metadata on demand when compiling queries.
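
The proposed split can be illustrated with a small sketch. This is not Impala code: `LightweightTable`, `Coordinator`, `PartitionMeta`, and the `loader` callback are hypothetical names chosen for illustration, and the real catalogd and coordinator interfaces may look quite different. It only shows the idea that the catalog side holds partition names and ids, while full partition metadata is fetched per query for the partitions that query needs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class PartitionMeta:
    """Full per-partition metadata (hypothetical shape, for illustration)."""
    name: str
    location: str
    files: Tuple[str, ...]


class LightweightTable:
    """Catalog-side entry: the table object plus partition names and ids only."""

    def __init__(self, hms_table: dict, partition_names: List[str]):
        self.hms_table = hms_table
        # Internal ids are generated locally, one per partition name.
        self.partition_ids: Dict[str, int] = {
            name: i for i, name in enumerate(partition_names)
        }


class Coordinator:
    """Fetches full partition metadata only for the partitions a query needs."""

    def __init__(self, table: LightweightTable,
                 loader: Callable[[str], PartitionMeta]):
        self.table = table
        self.loader = loader  # assumed to stand in for a metadata fetch call

    def partitions_for_query(self, wanted: List[str]) -> List[PartitionMeta]:
        result = []
        for name in wanted:
            if name not in self.table.partition_ids:
                raise KeyError(f"unknown partition: {name}")
            # On-demand fetch: nothing is loaded for partitions not requested.
            result.append(self.loader(name))
        return result
```

With 100K partitions, the catalog-side entry in this sketch grows only with the number of partition names, and a query touching a handful of partitions triggers a handful of loader calls.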

          People

            tangzhi Zhi Tang
            stigahuang Quanlong Huang