IMPALA / IMPALA-12809

Iceberg metadata table scanner should always be scheduled to the coordinator

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: Impala 4.4.0
    • Affects Version/s: None
    • Component/s: Backend
    • Labels: ghx-label-14

    Description

      On larger clusters the Iceberg metadata scanner can be scheduled to executors, for example during a join. The fragment in that case fails a precondition check, because either the 'frontend_' object or the table will not be present. Setting exec_at_coord to true is not enough; these fragments should be scheduled to the coord_only_executor_group instead.

      Additionally, setting NUM_NODES=1 should be a viable workaround.

      Reproducible with the following local dev Impala cluster:
      ./bin/start-impala-cluster.py --cluster_size=3 --num_coordinators=1 --use_exclusive_coordinators
      and query:
      select count(b.parent_id) from functional_parquet.iceberg_query_metadata.history a
      join functional_parquet.iceberg_query_metadata.history b on a.snapshot_id = b.snapshot_id;
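
      The NUM_NODES=1 workaround mentioned above could be applied per session like the following sketch (assuming an impala-shell session; NUM_NODES is a standard Impala query option that restricts execution to a single node, here the coordinator):

```sql
-- Workaround sketch: force single-node execution so the metadata scanner
-- fragment runs on the coordinator rather than an executor.
SET NUM_NODES=1;
select count(b.parent_id) from functional_parquet.iceberg_query_metadata.history a
join functional_parquet.iceberg_query_metadata.history b on a.snapshot_id = b.snapshot_id;
```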

      Activity

            Commit 9071030f7fc272520c26ddb793551987226a5693 in impala's branch refs/heads/master from Daniel Becker
            [ https://gitbox.apache.org/repos/asf?p=impala.git;h=9071030f7 ]

            IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator

            On clusters with dedicated coordinators and executors the Iceberg
            metadata scanner fragment(s) can be scheduled to executors, for example
            during a join. The fragment in this case will fail a precondition check,
            because either the 'frontend_' object or the table will not be present.

            This change forces Iceberg metadata scanner fragments to be scheduled on
            the coordinator. It is not enough to set the DataPartition type to
            UNPARTITIONED, because unpartitioned fragments can still be scheduled on
            executors. This change introduces a new flag in the TPlanFragment thrift
            struct - if it is true, the fragment is always scheduled on the
            coordinator.

            Testing:

            • Added a regression test in test_coordinators.py.
            • Added a new planner test with two metadata tables and a regular table
              joined together.

            Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
            Reviewed-on: http://gerrit.cloudera.org:8080/21138
            Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
            Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
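
            As the commit message notes, the fix adds a boolean flag to the TPlanFragment Thrift struct that the scheduler honors unconditionally. A hedged sketch of what such a field could look like follows; the field name and id here are hypothetical illustrations, not taken from the actual patch (see the Gerrit review linked above for the real definition):

```thrift
// Hypothetical sketch only -- the real field name and id are in the patch
// reviewed at gerrit.cloudera.org (Change-Id above). A new optional field
// keeps Thrift wire compatibility with fragments serialized by older nodes.
struct TPlanFragment {
  // ... existing fields elided ...

  // If true, the scheduler always places this fragment on the coordinator,
  // regardless of the fragment's DataPartition type (even UNPARTITIONED
  // fragments can otherwise land on executors).
  optional bool is_coordinator_only
}
```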

            jira-bot (ASF subversion and git services) added the comment above with the commit message for 9071030f7.

            People

              Assignee: Daniel Becker (daniel.becker)
              Reporter: Tamas Mate (tmate)
              Votes: 0
              Watchers: 3
