IMPALA-3127

Decouple partitions from tables

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 2.2.4
    • Fix Version/s: None
    • Component/s: Catalog

      Description

      Currently, partitions are tightly integrated into HdfsTable objects, which makes incremental metadata updates difficult to perform. Furthermore, the catalog transmits the entire table metadata even when only a few partitions change. This introduces significant latency and wastes network bandwidth, as well as CPU cycles on the receiving impalads while they update table metadata. As a first step, we should decouple partitions from tables and add them as a separate level in the hierarchy of catalog entities (server-db-table-partition). Subsequently, the catalog should transmit only the entities that have changed after DDL/DML statements.
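
To illustrate the proposal, here is a minimal sketch of partitions as first-class, individually versioned catalog entities, with a delta that contains only what changed. It is a toy model in Python, not Impala's catalog code: `CatalogEntity`, `Catalog`, `upsert`, and `delta_since` are hypothetical names, and the single monotonically increasing version counter is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CatalogEntity:
    """Hypothetical versioned catalog object (db, table, or partition)."""
    key: Tuple[str, ...]   # e.g. ("db1", "t1", "year=2016")
    version: int = 0       # catalog version at the entity's last change


@dataclass
class Catalog:
    """Toy catalog that stamps every changed entity with a new version."""
    version: int = 0
    entities: Dict[Tuple[str, ...], CatalogEntity] = field(default_factory=dict)

    def upsert(self, *key: str) -> CatalogEntity:
        """Add or modify one entity and bump only that entity's version."""
        self.version += 1
        entity = self.entities.setdefault(key, CatalogEntity(key))
        entity.version = self.version
        return entity

    def delta_since(self, last_seen: int) -> List[CatalogEntity]:
        """Return only the entities changed after `last_seen`, oldest first."""
        changed = [e for e in self.entities.values() if e.version > last_seen]
        return sorted(changed, key=lambda e: e.version)


catalog = Catalog()
catalog.upsert("db1")
catalog.upsert("db1", "t1")
catalog.upsert("db1", "t1", "year=2015")
catalog.upsert("db1", "t1", "year=2016")
synced = catalog.version                    # a receiver is now up to date

catalog.upsert("db1", "t1", "year=2016")    # e.g. an INSERT touches one partition
delta = catalog.delta_since(synced)
print([e.key for e in delta])               # [('db1', 't1', 'year=2016')]
```

Because the partition carries its own version, the update after the last statement contains one partition entity rather than the whole table with all of its partitions.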

              People

              • Assignee: Vihang Karajgaonkar (vihangk1)
              • Reporter: Dimitris Tsirogiannis (dtsirogiannis)
              • Votes: 1
              • Watchers: 11
